gget
Overview
gget is a command-line bioinformatics tool and Python package providing unified access to 20+ genomic databases and analysis methods. It covers gene information, sequence analysis, protein structures, expression data, and disease associations through a consistent interface. All gget modules work both as command-line tools and as Python functions.
Important: The databases queried by gget are continuously updated, which sometimes changes their structure. gget modules are tested automatically on a biweekly basis and updated to match new database structures when necessary.
Installation
Install gget in a clean virtual environment to avoid conflicts:
Using uv (recommended)
uv pip install gget
Or using pip
pip install --upgrade gget
In Python/Jupyter
import gget
Quick Start
Basic usage pattern for all modules:
Command-line
gget <module> [arguments] [options]
Python
gget.module(arguments, options)
Most modules return:
-
Command-line: JSON (default) or CSV with -csv flag
-
Python: DataFrame or dictionary
Common flags across modules:
-
-o/--out : Save results to file
-
-q/--quiet : Suppress progress information
-
-csv : Return CSV format (command-line only)
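When gget is driven from a script, the command-line pattern above can be assembled programmatically. The following is a minimal sketch: `build_gget_command` is a hypothetical helper (not part of gget) that only builds the argument list, using the -o, -q, and -csv flags listed above.

```python
import shlex


def build_gget_command(module, *arguments, out=None, quiet=False, csv=False):
    """Return an argv list for `gget <module> [arguments] [options]`.

    Hypothetical helper (not part of gget); it only assembles the command.
    """
    command = ["gget", module, *map(str, arguments)]
    if out is not None:
        command += ["-o", str(out)]  # save results to file
    if quiet:
        command.append("-q")  # suppress progress information
    if csv:
        command.append("-csv")  # CSV instead of the default JSON
    return command


argv = build_gget_command("search", "gaba", "-s", "human", out="gaba.csv", csv=True)
print(shlex.join(argv))
```

With gget installed, the resulting list could be passed to `subprocess.run(argv, check=True)`.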
Module Categories
- Reference & Gene Information
gget ref - Reference Genome Downloads
Retrieve download links and metadata for Ensembl reference genomes.
Parameters:
-
species : Genus_species format (e.g., 'homo_sapiens', 'mus_musculus'). Shortcuts: 'human', 'mouse'
-
-w/--which : Specify return types (gtf, cdna, dna, cds, cdrna, pep). Default: all
-
-r/--release : Ensembl release number (default: latest)
-
-l/--list_species : List available vertebrate species
-
-liv/--list_iv_species : List available invertebrate species
-
-ftp : Return only FTP links
-
-d/--download : Download files (requires curl)
Examples:
List available species
gget ref --list_species
Get all reference files for human
gget ref homo_sapiens
Download only GTF annotation for mouse
gget ref -w gtf -d mouse
Python
gget.ref("homo_sapiens")
gget.ref("mus_musculus", which="gtf", download=True)
gget search - Gene Search
Locate genes by name or description across species.
Parameters:
-
searchwords : One or more search terms (case-insensitive)
-
-s/--species : Target species (e.g., 'homo_sapiens', 'mouse')
-
-r/--release : Ensembl release number
-
-t/--id_type : Return 'gene' (default) or 'transcript'
-
-ao/--andor : 'or' (default) finds ANY searchword; 'and' requires ALL
-
-l/--limit : Maximum results to return
Returns: ensembl_id, gene_name, ensembl_description, ext_ref_description, biotype, URL
Examples:
Search for GABA-related genes in human
gget search -s human gaba gamma-aminobutyric
Find specific gene, require all terms
gget search -s mouse -ao and pax7 transcription
Python
gget.search(["gaba", "gamma-aminobutyric"], species="homo_sapiens")
gget info - Gene/Transcript Information
Retrieve comprehensive gene and transcript metadata from Ensembl, UniProt, and NCBI.
Parameters:
-
ens_ids : One or more Ensembl IDs (also supports WormBase, Flybase IDs). Limit: ~1000 IDs
-
-n/--ncbi : Disable NCBI data retrieval
-
-u/--uniprot : Disable UniProt data retrieval
-
-pdb : Include PDB identifiers (increases runtime)
Returns: UniProt ID, NCBI gene ID, primary gene name, synonyms, protein names, descriptions, biotype, canonical transcript
Examples:
Get info for multiple genes
gget info ENSG00000034713 ENSG00000104853 ENSG00000170296
Include PDB IDs
gget info ENSG00000034713 -pdb
Python
gget.info(["ENSG00000034713", "ENSG00000104853"], pdb=True)
gget seq - Sequence Retrieval
Fetch nucleotide or amino acid sequences for genes and transcripts.
Parameters:
-
ens_ids : One or more Ensembl identifiers
-
-t/--translate : Fetch amino acid sequences instead of nucleotide
-
-iso/--isoforms : Return all transcript variants (gene IDs only)
Returns: FASTA format sequences
Examples:
Get nucleotide sequences
gget seq ENSG00000034713 ENSG00000104853
Get all protein isoforms
gget seq -t -iso ENSG00000034713
Python
gget.seq(["ENSG00000034713"], translate=True, isoforms=True)
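FASTA-style results are easier to work with once headers and sequences are paired. The sketch below assumes the input is a flat list of strings in which header lines start with ">" and are followed by their sequence; that layout is an assumption about the Python return value, and `fasta_list_to_dict` is a hypothetical helper rather than a gget function.

```python
def fasta_list_to_dict(fasta_lines):
    """Pair FASTA headers with their sequences.

    Assumes `fasta_lines` is a flat list in which each '>header' string
    is followed by one or more sequence strings.
    """
    records = {}
    header = None
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line.strip()
    return records


# Illustrative input only; these are not real gget results.
example = [">seq_A example", "ATGGCC", ">seq_B example", "ATGAAATGA"]
records = fasta_list_to_dict(example)
```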
- Sequence Analysis & Alignment
gget blast - BLAST Searches
BLAST nucleotide or amino acid sequences against standard databases.
Parameters:
-
sequence : Sequence string or path to FASTA/.txt file
-
-p/--program : blastn, blastp, blastx, tblastn, tblastx (auto-detected)
-
-db/--database :
-
Nucleotide: nt, refseq_rna, pdbnt
-
Protein: nr, swissprot, pdbaa, refseq_protein
-
-l/--limit : Max hits (default: 50)
-
-e/--expect : E-value cutoff (default: 10.0)
-
-lcf/--low_comp_filt : Enable low complexity filtering
-
-mbo/--megablast_off : Disable MegaBLAST (blastn only)
Examples:
BLAST protein sequence
gget blast MKWMFKEDHSLEHRCVESAKIRAKYPDRVPVIVEKVSGSQIVDIDKRKYLVPSDITVAQFMWIIRKRIQLPSEKAIFLFVDKTVPQSR
BLAST from file with specific database
gget blast sequence.fasta -db swissprot -l 10
Python
gget.blast("MKWMFK...", database="swissprot", limit=10)
gget blat - BLAT Searches
Locate genomic positions of sequences using UCSC BLAT.
Parameters:
-
sequence : Sequence string or path to FASTA/.txt file
-
-st/--seqtype : 'DNA', 'protein', 'translated%20RNA', 'translated%20DNA' (auto-detected)
-
-a/--assembly : Target assembly (default: 'human'/hg38; options: 'mouse'/mm39, 'zebrafinch'/taeGut2, etc.)
Returns: genome, query size, alignment positions, matches, mismatches, alignment percentage
Examples:
Find genomic location in human
gget blat ATCGATCGATCGATCG
Search in different assembly
gget blat -a mm39 ATCGATCGATCGATCG
Python
gget.blat("ATCGATCGATCGATCG", assembly="mouse")
gget muscle - Multiple Sequence Alignment
Align multiple nucleotide or amino acid sequences using Muscle5.
Parameters:
-
fasta : Sequences or path to FASTA/.txt file
-
-s5/--super5 : Use Super5 algorithm for faster processing (large datasets)
Returns: Aligned sequences in ClustalW format or aligned FASTA (.afa)
Examples:
Align sequences from file
gget muscle sequences.fasta -o aligned.afa
Use Super5 for large dataset
gget muscle large_dataset.fasta -s5
Python
gget.muscle("sequences.fasta", save=True)
gget diamond - Local Sequence Alignment
Perform fast local protein or translated DNA alignment using DIAMOND.
Parameters:
-
query : Sequences (string/list) or FASTA file path
-
-ref/--reference : Reference sequences (string/list) or FASTA file path (required)
-
--sensitivity : fast, mid-sensitive, sensitive, more-sensitive, very-sensitive (default), ultra-sensitive
-
--threads : CPU threads (default: 1)
-
--diamond_db : Save database for reuse
-
--translated : Enable nucleotide-to-amino acid alignment
Returns: Identity percentage, sequence lengths, match positions, gap openings, E-values, bit scores
Examples:
Align against reference
gget diamond GGETISAWESQME -ref reference.fasta --threads 4
Save database for reuse
gget diamond query.fasta -ref ref.fasta --diamond_db my_db.dmnd
Python
gget.diamond("GGETISAWESQME", reference="reference.fasta", threads=4)
- Structural & Protein Analysis
gget pdb - Protein Structures
Query RCSB Protein Data Bank for structure and metadata.
Parameters:
-
pdb_id : PDB identifier (e.g., '7S7U')
-
-r/--resource : Data type (pdb, entry, pubmed, assembly, entity types)
-
-i/--identifier : Assembly, entity, or chain ID
Returns: PDB format (structures) or JSON (metadata)
Examples:
Download PDB structure
gget pdb 7S7U -o 7S7U.pdb
Get metadata
gget pdb 7S7U -r entry
Python
gget.pdb("7S7U", save=True)
gget alphafold - Protein Structure Prediction
Predict 3D protein structures using simplified AlphaFold2.
Setup Required:
Install OpenMM first (version depends on Python version)
Python < 3.10:
conda install -qy conda==4.13.0 && conda install -qy -c conda-forge openmm=7.5.1
Python 3.10:
conda install -qy conda==24.1.2 && conda install -qy -c conda-forge openmm=7.7.0
Python 3.11:
conda install -qy conda==24.11.1 && conda install -qy -c conda-forge openmm=8.0.0
Then setup AlphaFold
gget setup alphafold
Parameters:
-
sequence : Amino acid sequence (string), multiple sequences (list), or FASTA file. Multiple sequences trigger multimer modeling
-
-mr/--multimer_recycles : Recycling iterations (default: 3; recommend 20 for accuracy)
-
-mfm/--multimer_for_monomer : Apply multimer model to single proteins
-
-r/--relax : AMBER relaxation for top-ranked model
-
plot : Python-only; generate interactive 3D visualization (default: True)
-
show_sidechains : Python-only; include side chains (default: True)
Returns: PDB structure file, JSON alignment error data, optional 3D visualization
Examples:
Predict single protein structure
gget alphafold MKWMFKEDHSLEHRCVESAKIRAKYPDRVPVIVEKVSGSQIVDIDKRKYLVPSDITVAQFMWIIRKRIQLPSEKAIFLFVDKTVPQSR
Predict multimer with higher accuracy
gget alphafold sequence1.fasta -mr 20 -r
Python with visualization
gget.alphafold("MKWMFK...", plot=True, show_sidechains=True)
Multimer prediction
gget.alphafold(["sequence1", "sequence2"], multimer_recycles=20)
gget elm - Eukaryotic Linear Motifs
Predict Eukaryotic Linear Motifs in protein sequences.
Setup Required:
gget setup elm
Parameters:
-
sequence : Amino acid sequence or UniProt Acc
-
-u/--uniprot : Indicates sequence is UniProt Acc
-
-e/--expand : Include protein names, organisms, references
-
-s/--sensitivity : DIAMOND alignment sensitivity (default: "very-sensitive")
-
-t/--threads : Number of threads (default: 1)
Returns two outputs:
-
ortholog_df: Linear motifs from orthologous proteins
-
regex_df: Motifs directly matched in input sequence
Examples:
Predict motifs from sequence
gget elm LIAQSIGQASFV -o results
Use UniProt accession with expanded info
gget elm --uniprot Q02410 -e
Python
ortholog_df, regex_df = gget.elm("LIAQSIGQASFV")
- Expression & Disease Data
gget archs4 - Gene Correlation & Tissue Expression
Query ARCHS4 database for correlated genes or tissue expression data.
Parameters:
-
gene : Gene symbol or Ensembl ID (with --ensembl flag)
-
-w/--which : 'correlation' (default, returns 100 most correlated genes) or 'tissue' (expression atlas)
-
-s/--species : 'human' (default) or 'mouse'; applies only when --which is 'tissue'
-
-e/--ensembl : Input is Ensembl ID
Returns:
-
Correlation mode: Gene symbols, Pearson correlation coefficients
-
Tissue mode: Tissue identifiers, min/Q1/median/Q3/max expression values
Examples:
Get correlated genes
gget archs4 ACE2
Get tissue expression
gget archs4 -w tissue ACE2
Python
gget.archs4("ACE2", which="tissue")
gget cellxgene - Single-Cell RNA-seq Data
Query CZ CELLxGENE Discover Census for single-cell data.
Setup Required:
gget setup cellxgene
Parameters:
-
-g/--gene : Gene names or Ensembl IDs (case-sensitive: 'PAX7' for human, 'Pax7' for mouse)
-
--tissue : Tissue type(s)
-
--cell_type : Specific cell type(s)
-
-s/--species : 'homo_sapiens' (default) or 'mus_musculus'
-
-cv/--census_version : Version ("stable", "latest", or dated)
-
-e/--ensembl : Use Ensembl IDs
-
-mo/--meta_only : Return metadata only
-
Additional filters: disease, development_stage, sex, assay, dataset_id, donor_id, ethnicity, suspension_type
Returns: AnnData object with count matrices and metadata (or metadata-only dataframes)
Examples:
Get single-cell data for specific genes and cell types
gget cellxgene --gene ACE2 ABCA1 --tissue lung --cell_type "mucus secreting cell" -o lung_data.h5ad
Metadata only
gget cellxgene --gene PAX7 --tissue muscle --meta_only -o metadata.csv
Python
adata = gget.cellxgene(gene=["ACE2", "ABCA1"], tissue="lung", cell_type="mucus secreting cell")
gget enrichr - Enrichment Analysis
Perform ontology enrichment analysis on gene lists using Enrichr.
Parameters:
-
genes : Gene symbols or Ensembl IDs
-
-db/--database : Reference database (supports shortcuts: 'pathway', 'transcription', 'ontology', 'diseases_drugs', 'celltypes')
-
-s/--species : human (default), mouse, fly, yeast, worm, fish
-
-bkg_l/--background_list : Background genes for comparison
-
-ko/--kegg_out : Save KEGG pathway images with highlighted genes
-
plot : Python-only; generate graphical results
Database Shortcuts:
-
'pathway' → KEGG_2021_Human
-
'transcription' → ChEA_2016
-
'ontology' → GO_Biological_Process_2021
-
'diseases_drugs' → GWAS_Catalog_2019
-
'celltypes' → PanglaoDB_Augmented_2021
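The shortcut list above amounts to a simple lookup table. As a small illustration (a sketch, not gget's internal code), the mapping can be written as a dictionary; `resolve_database` is a hypothetical helper that passes through any name that is not a shortcut.

```python
# Shortcut-to-library mapping copied from the list above.
ENRICHR_SHORTCUTS = {
    "pathway": "KEGG_2021_Human",
    "transcription": "ChEA_2016",
    "ontology": "GO_Biological_Process_2021",
    "diseases_drugs": "GWAS_Catalog_2019",
    "celltypes": "PanglaoDB_Augmented_2021",
}


def resolve_database(name):
    """Return the full library name for a shortcut, or `name` unchanged."""
    return ENRICHR_SHORTCUTS.get(name, name)


print(resolve_database("ontology"))
```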
Examples:
Enrichment analysis for ontology
gget enrichr -db ontology ACE2 AGT AGTR1
Save KEGG pathways
gget enrichr -db pathway ACE2 AGT AGTR1 -ko ./kegg_images/
Python with plot
gget.enrichr(["ACE2", "AGT", "AGTR1"], database="ontology", plot=True)
gget bgee - Orthology & Expression
Retrieve orthology and gene expression data from Bgee database.
Parameters:
-
ens_id : Ensembl gene ID or NCBI gene ID (for non-Ensembl species). Multiple IDs supported when type=expression
-
-t/--type : 'orthologs' (default) or 'expression'
Returns:
-
Orthologs mode: Matching genes across species with IDs, names, taxonomic info
-
Expression mode: Anatomical entities, confidence scores, expression status
Examples:
Get orthologs
gget bgee ENSG00000169194
Get expression data
gget bgee ENSG00000169194 -t expression
Multiple genes
gget bgee ENSBTAG00000047356 ENSBTAG00000018317 -t expression
Python
gget.bgee("ENSG00000169194", type="orthologs")
gget opentargets - Disease & Drug Associations
Retrieve disease and drug associations from OpenTargets.
Parameters:
-
ens_id : Ensembl gene ID (required)
-
-r/--resource : diseases (default), drugs, tractability, pharmacogenetics, expression, depmap, interactions
-
-l/--limit : Cap results count
-
Filter arguments (vary by resource):
-
drugs: --filter_disease
-
pharmacogenetics: --filter_drug
-
expression/depmap: --filter_tissue , --filter_anat_sys , --filter_organ
-
interactions: --filter_protein_a , --filter_protein_b , --filter_gene_b
Examples:
Get associated diseases
gget opentargets ENSG00000169194 -r diseases -l 5
Get associated drugs
gget opentargets ENSG00000169194 -r drugs -l 10
Get tissue expression
gget opentargets ENSG00000169194 -r expression --filter_tissue brain
Python
gget.opentargets("ENSG00000169194", resource="diseases", limit=5)
gget cbio - cBioPortal Cancer Genomics
Plot cancer genomics heatmaps using cBioPortal data.
Two subcommands:
search - Find study IDs:
gget cbio search breast lung
plot - Generate heatmaps:
Parameters:
-
-s/--study_ids : Space-separated cBioPortal study IDs (required)
-
-g/--genes : Space-separated gene names or Ensembl IDs (required)
-
-st/--stratification : Column to organize data (tissue, cancer_type, cancer_type_detailed, study_id, sample)
-
-vt/--variation_type : Data type (mutation_occurrences, cna_nonbinary, sv_occurrences, cna_occurrences, Consequence)
-
-f/--filter : Filter by column value (e.g., 'study_id:msk_impact_2017')
-
-dd/--data_dir : Cache directory (default: ./gget_cbio_cache)
-
-fd/--figure_dir : Output directory (default: ./gget_cbio_figures)
-
-dpi : Resolution (default: 100)
-
-sh/--show : Display plot in window
-
-nc/--no_confirm : Skip download confirmations
Examples:
Search for studies
gget cbio search esophag ovary
Create heatmap
gget cbio plot -s msk_impact_2017 -g AKT1 ALK BRAF -st tissue -vt mutation_occurrences
Python
gget.cbio_search(["esophag", "ovary"])
gget.cbio_plot(["msk_impact_2017"], ["AKT1", "ALK"], stratification="tissue")
gget cosmic - COSMIC Database
Search COSMIC (Catalogue Of Somatic Mutations In Cancer) database.
Important: License fees apply for commercial use. Requires COSMIC account credentials.
Parameters:
-
searchterm : Gene name, Ensembl ID, mutation notation, or sample ID
-
-ctp/--cosmic_tsv_path : Path to downloaded COSMIC TSV file (required for querying)
-
-l/--limit : Maximum results (default: 100)
Database download flags:
-
-d/--download_cosmic : Activate download mode
-
-gm/--gget_mutate : Create version for gget mutate
-
-cp/--cosmic_project : Database type (cancer, census, cell_line, resistance, genome_screen, targeted_screen)
-
-cv/--cosmic_version : COSMIC version
-
-gv/--grch_version : Human reference genome (37 or 38)
-
--email , --password : COSMIC credentials
Examples:
First download database
gget cosmic -d --email user@example.com --password xxx -cp cancer
Then query
gget cosmic EGFR -ctp cosmic_data.tsv -l 10
Python
gget.cosmic("EGFR", cosmic_tsv_path="cosmic_data.tsv", limit=10)
- Additional Tools
gget mutate - Generate Mutated Sequences
Generate mutated nucleotide sequences from mutation annotations.
Parameters:
-
sequences : FASTA file path or direct sequence input (string/list)
-
-m/--mutations : CSV/TSV file or DataFrame with mutation data (required)
-
-mc/--mut_column : Mutation column name (default: 'mutation')
-
-sic/--seq_id_column : Sequence ID column (default: 'seq_ID')
-
-mic/--mut_id_column : Mutation ID column
-
-k/--k : Length of flanking sequences (default: 30 nucleotides)
Returns: Mutated sequences in FASTA format
Examples:
Single mutation
gget mutate ATCGCTAAGCT -m "c.4G>T"
Multiple sequences with mutations from file
gget mutate sequences.fasta -m mutations.csv -o mutated.fasta
Python
import pandas as pd
mutations_df = pd.DataFrame({"seq_ID": ["seq1"], "mutation": ["c.4G>T"]})
gget.mutate(["ATCGCTAAGCT"], mutations=mutations_df)
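To make the mutation notation concrete, the sketch below applies a single-base substitution such as "c.4G>T" to a sequence by hand. It is an illustration of the notation only, not gget mutate's implementation: it handles substitutions alone and ignores the flanking-length option. `apply_substitution` is a hypothetical helper.

```python
import re

# Matches substitutions of the form c.<position><ref>><alt>, e.g. "c.4G>T".
_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def apply_substitution(sequence, mutation):
    """Return `sequence` with one substitution applied (1-based position)."""
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"unsupported mutation notation: {mutation!r}")
    position = int(match.group(1))
    ref, alt = match.group(2), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1
    if sequence[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + alt + sequence[index + 1:]


mutated = apply_substitution("ATCGCTAAGCT", "c.4G>T")
print(mutated)
```

Checking the reference base before substituting catches off-by-one errors and mismatched sequence IDs early.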
gget gpt - OpenAI Text Generation
Generate natural language text using OpenAI's API.
Setup Required:
gget setup gpt
Important: Free tier limited to 3 months after account creation. Set monthly billing limits.
Parameters:
-
prompt : Text input for generation (required)
-
api_key : OpenAI authentication (required)
-
Model configuration: temperature, top_p, max_tokens, frequency_penalty, presence_penalty
-
Default model: gpt-3.5-turbo (configurable)
Examples:
gget gpt "Explain CRISPR" --api_key your_key_here
Python
gget.gpt("Explain CRISPR", api_key="your_key_here")
gget setup - Install Dependencies
Install/download third-party dependencies for specific modules.
Parameters:
-
module : Module name requiring dependency installation
-
-o/--out : Output folder path (elm module only)
Modules requiring setup:
-
alphafold
-
Downloads ~4GB of model parameters
-
cellxgene
-
Installs cellxgene-census (may not support latest Python)
-
elm
-
Downloads local ELM database
-
gpt
-
Configures OpenAI integration
Examples:
Setup AlphaFold
gget setup alphafold
Setup ELM with custom directory
gget setup elm -o /path/to/elm_data
Python
gget.setup("alphafold")
Common Workflows
Workflow 1: Gene Discovery to Sequence Analysis
Find and analyze genes of interest:
1. Search for genes
results = gget.search(["GABA", "receptor"], species="homo_sapiens")
2. Get detailed information
gene_ids = results["ensembl_id"].tolist()
info = gget.info(gene_ids[:5])
3. Retrieve sequences
sequences = gget.seq(gene_ids[:5], translate=True)
Workflow 2: Sequence Alignment and Structure
Align sequences and predict structures:
1. Align multiple sequences
alignment = gget.muscle("sequences.fasta")
2. Find similar sequences
blast_results = gget.blast(my_sequence, database="swissprot", limit=10)
3. Predict structure
structure = gget.alphafold(my_sequence, plot=True)
4. Find linear motifs
ortholog_df, regex_df = gget.elm(my_sequence)
Workflow 3: Gene Expression and Enrichment
Analyze expression patterns and functional enrichment:
1. Get tissue expression
tissue_expr = gget.archs4("ACE2", which="tissue")
2. Find correlated genes
correlated = gget.archs4("ACE2", which="correlation")
3. Get single-cell data
adata = gget.cellxgene(gene=["ACE2"], tissue="lung", cell_type="epithelial cell")
4. Perform enrichment analysis
gene_list = correlated["gene_symbol"].tolist()[:50]
enrichment = gget.enrichr(gene_list, database="ontology", plot=True)
Workflow 4: Disease and Drug Analysis
Investigate disease associations and therapeutic targets:
1. Search for genes
genes = gget.search(["breast cancer"], species="homo_sapiens")
2. Get disease associations
diseases = gget.opentargets("ENSG00000169194", resource="diseases")
3. Get drug associations
drugs = gget.opentargets("ENSG00000169194", resource="drugs")
4. Query cancer genomics data
study_ids = gget.cbio_search(["breast"])
gget.cbio_plot(study_ids[:2], ["BRCA1", "BRCA2"], stratification="cancer_type")
5. Search COSMIC for mutations
cosmic_results = gget.cosmic("BRCA1", cosmic_tsv_path="cosmic.tsv")
Workflow 5: Comparative Genomics
Compare proteins across species:
1. Get orthologs
orthologs = gget.bgee("ENSG00000169194", type="orthologs")
2. Get sequences for comparison
human_seq = gget.seq("ENSG00000169194", translate=True)
mouse_seq = gget.seq("ENSMUSG00000026091", translate=True)
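3. Align sequences (assuming gget.seq returns a FASTA-style list of header and sequence, index 1 holds the sequence string)
alignment = gget.muscle([human_seq[1], mouse_seq[1]])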
4. Compare structures
human_structure = gget.pdb("7S7U")
mouse_structure = gget.alphafold(mouse_seq[1])
Workflow 6: Building Reference Indices
Prepare reference data for downstream analysis (e.g., kallisto|bustools):
1. List available species
gget ref --list_species
2. Download reference files
gget ref -w gtf,cdna -d homo_sapiens
3. Build kallisto index
kallisto index -i transcriptome.idx transcriptome.fasta
4. Download genome for alignment
gget ref -w dna -d homo_sapiens
Best Practices
Data Retrieval
-
Use --limit to control result sizes for large queries
-
Save results with -o/--out for reproducibility
-
Check database versions/releases for consistency across analyses
-
Use --quiet in production scripts to reduce output
Sequence Analysis
-
For BLAST/BLAT, start with default parameters, then adjust sensitivity
-
Use gget diamond with --threads for faster local alignment
-
Save DIAMOND databases with --diamond_db for repeated queries
-
For multiple sequence alignment, use -s5/--super5 for large datasets
Expression and Disease Data
-
Gene symbols are case-sensitive in cellxgene (e.g., 'PAX7' vs 'Pax7')
-
Run gget setup before first use of alphafold, cellxgene, elm, gpt
-
For enrichment analysis, use database shortcuts for convenience
-
Cache cBioPortal data with -dd to avoid repeated downloads
Structure Prediction
-
AlphaFold multimer predictions: use -mr 20 for higher accuracy
-
Use -r flag for AMBER relaxation of final structures
-
Visualize results in Python with plot=True
-
Check PDB database first before running AlphaFold predictions
Error Handling
-
Database structures change; update gget regularly: pip install --upgrade gget
-
Process at most ~1000 Ensembl IDs per gget info call
-
For large-scale analyses, implement rate limiting for API queries
-
Use virtual environments to avoid dependency conflicts
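The ID limit and rate-limiting advice above can be combined in one small helper. This is a minimal sketch under stated assumptions: `batched_query` is hypothetical (not part of gget), the one-second default pause is an arbitrary choice, and the query function is passed in as `fetch` so the example runs without network access.

```python
import time


def batched_query(ids, fetch, batch_size=1000, pause_seconds=1.0):
    """Call `fetch` on chunks of `ids`, pausing between calls.

    `fetch` is any callable that accepts a list of IDs (for example
    `gget.info` when gget is installed). Hypothetical helper, not part of gget.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results = []
    for start in range(0, len(ids), batch_size):
        if start > 0:
            time.sleep(pause_seconds)  # crude rate limiting between requests
        results.append(fetch(ids[start:start + batch_size]))
    return results


# Stand-in for a real query so the sketch runs offline: count IDs per batch.
fake_ids = [f"ID{i}" for i in range(2500)]
batch_sizes = batched_query(fake_ids, fetch=len, pause_seconds=0)
print(batch_sizes)
```

When `fetch` returns DataFrames, the per-batch results can be combined afterwards with `pandas.concat`.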
Output Formats
Command-line
-
Default: JSON
-
CSV: Add -csv flag
-
FASTA: gget seq, gget mutate
-
PDB: gget pdb, gget alphafold
-
PNG: gget cbio plot
Python
-
Default: DataFrame or dictionary
-
JSON: Add json=True parameter
-
Save to file: Add save=True or specify out="filename"
-
AnnData: gget cellxgene
Resources
This skill includes reference documentation for detailed module information:
references/
-
module_reference.md
-
Comprehensive parameter reference for all modules
-
database_info.md
-
Information about queried databases and their update frequencies
-
workflows.md
-
Extended workflow examples and use cases
For additional help:
-
Official documentation: https://pachterlab.github.io/gget/
-
GitHub issues: https://github.com/pachterlab/gget/issues
-
Citation: Luebbert, L. & Pachter, L. (2023). Efficient querying of genomic reference databases with gget. Bioinformatics. https://doi.org/10.1093/bioinformatics/btac836