Database Modules
ClinVar
ClinVar Database Integration
Real-time integration with the NCBI ClinVar database for clinical significance annotations.
Uses both the REST API and local VCF files for comprehensive coverage.
-
class varannote.databases.clinvar.ClinVarDatabase(cache_dir=None, use_cache=True)[source]
Bases: object
ClinVar database integration for clinical variant significance
Provides access to:
- Clinical significance classifications
- Review status information
- Condition information
- Submission details
-
__init__(cache_dir=None, use_cache=True)[source]
Initialize ClinVar database connection
- Parameters:
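cache_dir (Optional[str]) – Directory for caching results
use_cache (bool) – Whether to use local caching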
-
get_variant_annotation(chrom, pos, ref, alt)[source]
Get ClinVar annotation for a specific variant
- Parameters:
chrom (str) – Chromosome (e.g., “17”, “X”)
pos (int) – Position (1-based)
ref (str) – Reference allele
alt (str) – Alternative allele
- Return type:
Dict
- Returns:
Dictionary with ClinVar annotations
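The parameter conventions above (chromosome names such as "17" or "X", and 1-based positions) can be checked before a lookup. A minimal sketch, assuming callers may pass chr-prefixed names or string positions; normalize_variant is a hypothetical helper and not part of varannote, and whether the library itself accepts a "chr" prefix is not stated here.

```python
from typing import Tuple


def normalize_variant(chrom: str, pos, ref: str, alt: str) -> Tuple[str, int, str, str]:
    """Coerce inputs to the documented conventions (hypothetical helper)."""
    chrom = str(chrom).strip()
    # Assumption: names are passed as "17" rather than "chr17", as in the examples above.
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    pos = int(pos)
    if pos < 1:
        raise ValueError("pos is 1-based and must be >= 1")
    ref, alt = ref.strip().upper(), alt.strip().upper()
    if not ref or not alt:
        raise ValueError("ref and alt must be non-empty")
    return chrom, pos, ref, alt
```

The result can then be unpacked into the call, for example db.get_variant_annotation(*normalize_variant("chr17", "43044295", "g", "a")), where the coordinates are purely illustrative.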
-
batch_annotate(variants)[source]
Annotate multiple variants with ClinVar data
- Parameters:
variants (List[Dict]) – List of variant dictionaries
- Return type:
List[Dict]
- Returns:
List of variants with ClinVar annotations added
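The layout of each variant dictionary is not specified above. The sketch below assumes the keys chrom, pos, ref and alt, mirroring the parameters of get_variant_annotation; both that key set and the parse_variant and build_variants helpers are assumptions for illustration.

```python
from typing import Dict, List


def parse_variant(spec: str) -> Dict:
    """Turn 'chrom-pos-ref-alt' into a variant dictionary (key names are assumed)."""
    chrom, pos, ref, alt = spec.split("-")
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt}


def build_variants(specs: List[str]) -> List[Dict]:
    """Parse several variant strings into the list shape batch_annotate takes."""
    return [parse_variant(s) for s in specs]


# Illustrative coordinates only.
variants = build_variants(["17-43044295-G-A", "X-100-C-T"])
```

A list built this way could be passed as db.batch_annotate(variants), provided the real key names match.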
-
get_database_info()[source]
Get information about ClinVar database
- Return type:
Dict
gnomAD
gnomAD Database Integration
Real-time integration with gnomAD (Genome Aggregation Database) for population frequency data.
Provides comprehensive allele frequency information across diverse populations.
-
class varannote.databases.gnomad.GnomADDatabase(cache_dir=None, use_cache=True, version='v4.1')[source]
Bases: object
gnomAD database integration for population allele frequencies
Provides access to:
- Global allele frequencies
- Population-specific frequencies (AFR, AMR, EAS, EUR, SAS)
- Allele counts and numbers
- Quality metrics
-
__init__(cache_dir=None, use_cache=True, version='v4.1')[source]
Initialize gnomAD database connection
- Parameters:
cache_dir (Optional[str]) – Directory for caching results
use_cache (bool) – Whether to use local caching
version (str) – gnomAD version (v4.1, v3.1.2, v2.1.1)
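The version strings listed above can be checked before constructing the object. A minimal sketch: check_version is a hypothetical helper, and whether GnomADDatabase validates the value itself is not documented here.

```python
SUPPORTED_VERSIONS = ("v4.1", "v3.1.2", "v2.1.1")  # copied from the parameter description


def check_version(version: str = "v4.1") -> str:
    """Return the version unchanged, or raise if it is not one of the listed values."""
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(
            f"unsupported gnomAD version {version!r}; expected one of {SUPPORTED_VERSIONS}"
        )
    return version
```

The checked value would then be passed as GnomADDatabase(version=check_version("v3.1.2")).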
-
get_variant_annotation(chrom, pos, ref, alt)[source]
Get gnomAD annotation for a specific variant
- Parameters:
chrom (str) – Chromosome (e.g., “17”, “X”)
pos (int) – Position (1-based)
ref (str) – Reference allele
alt (str) – Alternative allele
- Return type:
Dict
- Returns:
Dictionary with gnomAD annotations
-
batch_annotate(variants)[source]
Annotate multiple variants with gnomAD data
- Parameters:
variants (List[Dict]) – List of variant dictionaries
- Return type:
List[Dict]
- Returns:
List of variants with gnomAD annotations added
-
get_population_frequencies(chrom, pos, ref, alt)[source]
Get detailed population frequency breakdown
- Return type:
Dict
- Returns:
Dictionary with population-specific frequency data
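A common follow-up is to find the population with the highest frequency. The structure of the returned dictionary is not documented above, so this sketch assumes a flat mapping from population code to a float or None; max_population_af is a hypothetical helper.

```python
from typing import Dict, Optional, Tuple


def max_population_af(freqs: Dict[str, Optional[float]]) -> Optional[Tuple[str, float]]:
    """Return (population, frequency) for the highest non-missing frequency."""
    present = {pop: af for pop, af in freqs.items() if af is not None}
    if not present:
        return None
    pop = max(present, key=present.get)
    return pop, present[pop]


# Illustrative values only; not real gnomAD data.
example = {"AFR": 0.012, "AMR": 0.003, "EAS": None, "EUR": 0.0007, "SAS": 0.004}
```

If the real return value nests the frequencies under another key, that sub-dictionary would be passed instead.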
-
get_database_info()[source]
Get information about gnomAD database
- Return type:
Dict
dbSNP
dbSNP Database Integration
Integration with NCBI dbSNP database for variant identifiers and basic information.
-
class varannote.databases.dbsnp.DbSNPDatabase(cache_dir=None, use_cache=True)[source]
Bases: object
dbSNP database integration for variant identifiers
Provides access to:
- rsID identifiers
- Variant validation status
- Allele frequencies (when available)
- Clinical significance flags
-
__init__(cache_dir=None, use_cache=True)[source]
Initialize dbSNP database connection
- Parameters:
-
cache_dir (Optional[str]) – Directory for caching results
use_cache (bool) – Whether to use local caching
-
get_variant_annotation(chrom, pos, ref, alt)[source]
Get dbSNP annotation for a specific variant
- Parameters:
chrom (str) – Chromosome (e.g., “17”, “X”)
pos (int) – Position (1-based)
ref (str) – Reference allele
alt (str) – Alternative allele
- Return type:
Dict
- Returns:
Dictionary with dbSNP annotations
-
batch_annotate(variants)[source]
Annotate multiple variants with dbSNP data
- Parameters:
variants (List[Dict]) – List of variant dictionaries
- Return type:
List[Dict]
- Returns:
List of variants with dbSNP annotations added
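For long variant lists it can help to submit work in smaller pieces. The sketch below assumes nothing about how batch_annotate batches internally (that is not stated above); chunked and annotate_in_chunks are hypothetical helpers, and the default chunk size of 100 is arbitrary.

```python
from typing import Dict, Iterator, List


def chunked(variants: List[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` variants."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(variants), size):
        yield variants[start:start + size]


def annotate_in_chunks(db, variants: List[Dict], size: int = 100) -> List[Dict]:
    """Call db.batch_annotate once per chunk and concatenate the results in order."""
    annotated: List[Dict] = []
    for chunk in chunked(variants, size):
        annotated.extend(db.batch_annotate(chunk))
    return annotated
```

Any object with a batch_annotate(variants) method works as db, so the same helper applies to the other database classes.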
-
get_database_info()[source]
Get information about dbSNP database
- Return type:
Dict
COSMIC
COSMIC Database Integration
Integration with the COSMIC (Catalogue of Somatic Mutations in Cancer) database.
Note: COSMIC requires authentication for full access; this module provides basic functionality.
-
class varannote.databases.cosmic.COSMICDatabase(cache_dir=None, use_cache=True, api_key=None)[source]
Bases: object
COSMIC database integration for cancer mutation data
Provides access to:
- COSMIC mutation IDs
- Cancer type associations
- Mutation frequencies in cancer
- Tissue-specific data
Note: Full COSMIC access requires authentication and licensing.
This implementation provides basic public data access.
-
__init__(cache_dir=None, use_cache=True, api_key=None)[source]
Initialize COSMIC database connection
- Parameters:
cache_dir (Optional[str]) – Directory for caching results
use_cache (bool) – Whether to use local caching
api_key (Optional[str]) – COSMIC API key (optional, for enhanced access)
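Because api_key is a credential, it is usually kept out of source code. A minimal sketch that reads it from the environment: the variable name COSMIC_API_KEY and the cosmic_kwargs helper are assumptions, not something varannote defines.

```python
import os
from typing import Dict, Optional


def cosmic_kwargs(cache_dir: Optional[str] = None,
                  env_var: str = "COSMIC_API_KEY") -> Dict:
    """Build constructor keyword arguments, reading the key from the environment."""
    # An unset or empty variable is treated as "no key", matching the api_key=None default.
    api_key = os.environ.get(env_var) or None
    return {"cache_dir": cache_dir, "use_cache": True, "api_key": api_key}
```

The result would be expanded into the constructor as COSMICDatabase(**cosmic_kwargs()).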
-
get_variant_annotation(chrom, pos, ref, alt)[source]
Get COSMIC annotation for a specific variant
- Parameters:
chrom (str) – Chromosome (e.g., “17”, “X”)
pos (int) – Position (1-based)
ref (str) – Reference allele
alt (str) – Alternative allele
- Return type:
Dict
- Returns:
Dictionary with COSMIC annotations
-
batch_annotate(variants)[source]
Annotate multiple variants with COSMIC data
- Parameters:
variants (List[Dict]) – List of variant dictionaries
- Return type:
List[Dict]
- Returns:
List of variants with COSMIC annotations added
-
get_database_info()[source]
Get information about COSMIC database
- Return type:
Dict
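All four classes above document the same three methods, so they can be treated interchangeably. The sketch below captures that shared shape as a typing.Protocol and queries several databases for one variant. VariantDatabase and annotate_with_all are hypothetical names, and results are kept under separate labels because the annotation key names are not documented and might collide.

```python
from typing import Dict, List, Protocol


class VariantDatabase(Protocol):
    """Structural type for the methods documented on every database class."""

    def get_variant_annotation(self, chrom: str, pos: int, ref: str, alt: str) -> Dict: ...

    def batch_annotate(self, variants: List[Dict]) -> List[Dict]: ...

    def get_database_info(self) -> Dict: ...


def annotate_with_all(databases: Dict[str, VariantDatabase],
                      chrom: str, pos: int, ref: str, alt: str) -> Dict[str, Dict]:
    """Query each database and keep its result under the caller's label."""
    return {
        label: db.get_variant_annotation(chrom, pos, ref, alt)
        for label, db in databases.items()
    }
```

A call might look like annotate_with_all({"clinvar": ClinVarDatabase(), "gnomad": GnomADDatabase()}, "17", 43044295, "G", "A"), with illustrative coordinates.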