Databases API

Database Modules

ClinVar

ClinVar Database Integration

Real-time integration with the NCBI ClinVar database for clinical significance annotations. Uses both the REST API and local VCF files for broader coverage.

class varannote.databases.clinvar.ClinVarDatabase(cache_dir=None, use_cache=True)

Bases: object

ClinVar database integration for clinical variant significance

Provides access to:
  • Clinical significance classifications

  • Review status information

  • Condition information

  • Submission details

__init__(cache_dir=None, use_cache=True)

Initialize ClinVar database connection

Parameters:
  • cache_dir (Optional[str]) – Directory for caching results

  • use_cache (bool) – Whether to use local caching
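The reference does not say how cached results are stored. As a rough illustration of what `cache_dir` and `use_cache` could control, the sketch below keeps one JSON file per variant. `SimpleVariantCache` and its file layout are hypothetical and are not taken from varannote.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


class SimpleVariantCache:
    """Hypothetical per-variant JSON cache (not varannote's implementation)."""

    def __init__(self, cache_dir: Optional[str] = None, use_cache: bool = True):
        self.use_cache = use_cache
        # Assumption: fall back to a temporary directory when cache_dir is None.
        self.cache_dir = Path(cache_dir) if cache_dir else Path(tempfile.mkdtemp())
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, chrom: str, pos: int, ref: str, alt: str) -> Path:
        # Hash the key so long or symbolic alleles never produce invalid file names.
        key = f"{chrom}:{pos}:{ref}:{alt}"
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get(self, chrom: str, pos: int, ref: str, alt: str) -> Optional[Dict]:
        """Return the cached annotation, or None on a miss or when caching is off."""
        if not self.use_cache:
            return None
        path = self._path(chrom, pos, ref, alt)
        if path.exists():
            return json.loads(path.read_text())
        return None

    def put(self, chrom: str, pos: int, ref: str, alt: str, annotation: Dict) -> None:
        """Store an annotation; a no-op when caching is disabled."""
        if self.use_cache:
            self._path(chrom, pos, ref, alt).write_text(json.dumps(annotation))
```

With `use_cache=False` both methods become no-ops, so every lookup would go back to the remote source.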

get_variant_annotation(chrom, pos, ref, alt)

Get ClinVar annotation for a specific variant

Parameters:
  • chrom (str) – Chromosome (e.g., “17”, “X”)

  • pos (int) – Position (1-based)

  • ref (str) – Reference allele

  • alt (str) – Alternative allele

Return type:

Dict

Returns:

Dictionary with ClinVar annotations
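The parameter descriptions imply bare chromosome names ("17", "X") and 1-based positions. Whether the library accepts other spellings such as "chr17" is not stated, so a caller may want to normalize inputs first. The helper below is a hypothetical sketch of that step, not part of varannote.

```python
from typing import Tuple


def normalize_variant(chrom: str, pos: int, ref: str, alt: str) -> Tuple[str, int, str, str]:
    """Hypothetical helper: coerce inputs to the documented argument style."""
    chrom = str(chrom)
    # The documented examples use bare names ("17", "X"), so drop a "chr" prefix.
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    pos = int(pos)
    if pos < 1:
        raise ValueError("pos is 1-based and must be >= 1")
    ref, alt = ref.upper(), alt.upper()
    if not ref or not alt:
        raise ValueError("ref and alt must be non-empty")
    return chrom, pos, ref, alt
```

The returned tuple can be unpacked straight into the documented call, for example `db.get_variant_annotation(*normalize_variant("chr17", 100, "g", "a"))`, where `db` is a `ClinVarDatabase` instance.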

batch_annotate(variants)

Annotate multiple variants with ClinVar data

Parameters:

variants (List[Dict]) – List of variant dictionaries

Return type:

List[Dict]

Returns:

List of variants with ClinVar annotations added
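The keys expected in each variant dictionary are not listed here. The sketch below assumes `chrom`, `pos`, `ref` and `alt`, mirroring the arguments of `get_variant_annotation`, and shows a per-variant loop that produces the same kind of result: each input dictionary with annotations merged in. `annotate_each` and `FakeDatabase` are illustrative names only; the real `batch_annotate` may work differently.

```python
from typing import Dict, List, Protocol


class VariantAnnotator(Protocol):
    """Any object exposing the documented get_variant_annotation method."""

    def get_variant_annotation(self, chrom: str, pos: int, ref: str, alt: str) -> Dict: ...


def annotate_each(db: VariantAnnotator, variants: List[Dict]) -> List[Dict]:
    """Hypothetical loop; the keys 'chrom', 'pos', 'ref', 'alt' are assumed."""
    annotated = []
    for variant in variants:
        result = dict(variant)  # copy so the caller's input is left untouched
        result.update(
            db.get_variant_annotation(
                variant["chrom"], variant["pos"], variant["ref"], variant["alt"]
            )
        )
        annotated.append(result)
    return annotated


class FakeDatabase:
    """Stand-in used only to make this sketch runnable without network access."""

    def get_variant_annotation(self, chrom, pos, ref, alt):
        return {"example_annotation": f"{chrom}:{pos}{ref}>{alt}"}


variants = [
    {"chrom": "17", "pos": 100, "ref": "G", "alt": "A"},
    {"chrom": "X", "pos": 200, "ref": "C", "alt": "T"},
]
annotated = annotate_each(FakeDatabase(), variants)
```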

get_database_info()

Get information about ClinVar database

Return type:

Dict

gnomAD

gnomAD Database Integration

Real-time integration with gnomAD (Genome Aggregation Database) for population frequency data. Provides comprehensive allele frequency information across diverse populations.

class varannote.databases.gnomad.GnomADDatabase(cache_dir=None, use_cache=True, version='v4.1')

Bases: object

gnomAD database integration for population allele frequencies

Provides access to:
  • Global allele frequencies

  • Population-specific frequencies (AFR, AMR, EAS, EUR, SAS)

  • Allele counts and numbers

  • Quality metrics

__init__(cache_dir=None, use_cache=True, version='v4.1')

Initialize gnomAD database connection

Parameters:
  • cache_dir (Optional[str]) – Directory for caching results

  • use_cache (bool) – Whether to use local caching

  • version (str) – gnomAD version (v4.1, v3.1.2, v2.1.1)
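The choice of `version` also determines the reference genome the coordinates refer to: gnomAD v2.1.1 was called against GRCh37, while v3.1.2 and v4.1 use GRCh38. How varannote handles this is not documented here, so the lookup below is only a hypothetical guard a caller could apply before constructing the class.

```python
# Reference build used by each gnomAD release listed for the `version` parameter.
GNOMAD_REFERENCE_BUILD = {
    "v4.1": "GRCh38",
    "v3.1.2": "GRCh38",
    "v2.1.1": "GRCh37",
}


def reference_build_for(version: str = "v4.1") -> str:
    """Hypothetical helper: return the genome build for a listed gnomAD version."""
    try:
        return GNOMAD_REFERENCE_BUILD[version]
    except KeyError:
        supported = ", ".join(GNOMAD_REFERENCE_BUILD)
        raise ValueError(f"unsupported gnomAD version {version!r}; expected one of: {supported}")
```

Checking the build before querying helps avoid looking up GRCh37 positions against a GRCh38 release, or the reverse.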

get_variant_annotation(chrom, pos, ref, alt)

Get gnomAD annotation for a specific variant

Parameters:
  • chrom (str) – Chromosome (e.g., “17”, “X”)

  • pos (int) – Position (1-based)

  • ref (str) – Reference allele

  • alt (str) – Alternative allele

Return type:

Dict

Returns:

Dictionary with gnomAD annotations

batch_annotate(variants)

Annotate multiple variants with gnomAD data

Parameters:

variants (List[Dict]) – List of variant dictionaries

Return type:

List[Dict]

Returns:

List of variants with gnomAD annotations added

get_population_frequencies(chrom, pos, ref, alt)

Get detailed population frequency breakdown

Return type:

Dict

Returns:

Dictionary with population-specific frequency data
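The layout of the returned dictionary is not specified. As an illustration of how per-population counts relate to frequencies, the sketch below assumes a mapping from population code to allele count (`ac`) and allele number (`an`), and derives AF = AC / AN. The key names and the numbers are invented for the example.

```python
from typing import Dict, Optional, Tuple


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """AF = AC / AN; undefined (None) when no alleles were genotyped."""
    return ac / an if an > 0 else None


def max_population(freqs: Dict[str, Dict[str, int]]) -> Optional[Tuple[str, float]]:
    """Return (population, AF) for the population with the highest defined AF."""
    best: Optional[Tuple[str, float]] = None
    for pop, counts in freqs.items():
        af = allele_frequency(counts["ac"], counts["an"])
        if af is not None and (best is None or af > best[1]):
            best = (pop, af)
    return best


# Made-up counts in an assumed shape; real gnomAD output will differ.
example = {
    "AFR": {"ac": 5, "an": 1000},
    "EAS": {"ac": 0, "an": 800},
    "SAS": {"ac": 0, "an": 0},
}
top = max_population(example)
```

Populations with an allele number of zero are skipped rather than treated as frequency 0, since no genotypes were observed there.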

get_database_info()

Get information about gnomAD database

Return type:

Dict

dbSNP

dbSNP Database Integration

Integration with the NCBI dbSNP database for variant identifiers and basic information.

class varannote.databases.dbsnp.DbSNPDatabase(cache_dir=None, use_cache=True)

Bases: object

dbSNP database integration for variant identifiers

Provides access to:
  • rsID identifiers

  • Variant validation status

  • Allele frequencies (when available)

  • Clinical significance flags

__init__(cache_dir=None, use_cache=True)

Initialize dbSNP database connection

Parameters:
  • cache_dir (Optional[str]) – Directory for caching results

  • use_cache (bool) – Whether to use local caching

get_variant_annotation(chrom, pos, ref, alt)

Get dbSNP annotation for a specific variant

Parameters:
  • chrom (str) – Chromosome (e.g., “17”, “X”)

  • pos (int) – Position (1-based)

  • ref (str) – Reference allele

  • alt (str) – Alternative allele

Return type:

Dict

Returns:

Dictionary with dbSNP annotations

batch_annotate(variants)

Annotate multiple variants with dbSNP data

Parameters:

variants (List[Dict]) – List of variant dictionaries

Return type:

List[Dict]

Returns:

List of variants with dbSNP annotations added

get_database_info()

Get information about dbSNP database

Return type:

Dict

COSMIC

COSMIC Database Integration

Integration with the COSMIC (Catalogue of Somatic Mutations in Cancer) database. Note: COSMIC requires authentication for full access; this module provides basic functionality only.

class varannote.databases.cosmic.COSMICDatabase(cache_dir=None, use_cache=True, api_key=None)

Bases: object

COSMIC database integration for cancer mutation data

Provides access to:
  • COSMIC mutation IDs

  • Cancer type associations

  • Mutation frequencies in cancer

  • Tissue-specific data

Note: Full COSMIC access requires authentication and licensing. This implementation provides basic public data access.

__init__(cache_dir=None, use_cache=True, api_key=None)

Initialize COSMIC database connection

Parameters:
  • cache_dir (Optional[str]) – Directory for caching results

  • use_cache (bool) – Whether to use local caching

  • api_key (Optional[str]) – COSMIC API key (optional, for enhanced access)
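Because `api_key` is optional, one common pattern is to read it from the environment so the key never appears in source code. The sketch below illustrates that pattern; the variable name `COSMIC_API_KEY` is an assumption for the example and is not something varannote is documented to read.

```python
import os
from typing import Optional


def cosmic_api_key_from_env(var_name: str = "COSMIC_API_KEY") -> Optional[str]:
    """Hypothetical helper: return the key from the environment, or None if unset/blank."""
    value = os.environ.get(var_name, "").strip()
    return value or None
```

The result can be passed directly as `COSMICDatabase(api_key=cosmic_api_key_from_env())`; when the variable is unset this yields `None`, which matches the documented default.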

get_variant_annotation(chrom, pos, ref, alt)

Get COSMIC annotation for a specific variant

Parameters:
  • chrom (str) – Chromosome (e.g., “17”, “X”)

  • pos (int) – Position (1-based)

  • ref (str) – Reference allele

  • alt (str) – Alternative allele

Return type:

Dict

Returns:

Dictionary with COSMIC annotations

batch_annotate(variants)

Annotate multiple variants with COSMIC data

Parameters:

variants (List[Dict]) – List of variant dictionaries

Return type:

List[Dict]

Returns:

List of variants with COSMIC annotations added

get_database_info()

Get information about COSMIC database

Return type:

Dict
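All four classes expose the same `batch_annotate(variants)` method, so they can in principle be chained. The sketch below assumes that each call returns the input variants with its own fields added and earlier fields preserved, which this reference implies but does not guarantee. `annotate_with_all` and `StubDatabase` are illustrative names, not part of varannote.

```python
from typing import Dict, Iterable, List


def annotate_with_all(databases: Iterable, variants: List[Dict]) -> List[Dict]:
    """Hypothetical pipeline: feed each database's output into the next one."""
    for db in databases:
        variants = db.batch_annotate(variants)
    return variants


class StubDatabase:
    """Stand-in with the documented batch_annotate signature, for demonstration only."""

    def __init__(self, field: str):
        self.field = field

    def batch_annotate(self, variants: List[Dict]) -> List[Dict]:
        # Add one placeholder field per variant without mutating the input.
        return [{**variant, self.field: None} for variant in variants]


stubs = [StubDatabase("clinvar"), StubDatabase("gnomad"), StubDatabase("dbsnp")]
combined = annotate_with_all(stubs, [{"chrom": "17", "pos": 100, "ref": "G", "alt": "A"}])
```

With the real classes the list would hold instances such as `ClinVarDatabase()` and `GnomADDatabase()` in place of the stubs.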