Download a gene data package
Download an NCBI Datasets gene data package, including FASTA sequences and metadata
Using NCBI gene IDs
Download a gene data package by providing one or more gene IDs (space-delimited). If you use the --inputfile option instead, put each gene ID on a separate line.
datasets download gene gene-id 1 2 3 9 10 11 12 13 14 15 16 17
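For a long list of IDs, the --inputfile option is usually more convenient than typing them on the command line. The following is a minimal sketch rather than an official recipe: the file name gene_ids.txt is an arbitrary choice, and the download step is skipped when the datasets CLI is not installed.

```shell
# Write one gene ID per line (IDs taken from the example above).
# gene_ids.txt is a hypothetical file name chosen for this sketch.
printf '%s\n' 1 2 3 > gene_ids.txt

# Run the download only if the datasets CLI is on PATH.
if command -v datasets >/dev/null 2>&1; then
    datasets download gene gene-id --inputfile gene_ids.txt
fi
```

Keeping the IDs in a file also makes it easy to rerun the same download later or to share the list.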
Using gene symbols
Download a gene data package by gene symbol, using --taxon to specify the species.
datasets download gene symbol ACRV1 A2M --taxon human
Using transcript or protein accessions
Download a gene data package by RefSeq nucleotide or protein accession.
datasets download gene accession NM_020107.5 NP_001334352.2
Using species name
Download a gene data package by species name or Taxonomy ID. Run the following command to download a gene data package for all human genes.
datasets download gene taxon human
Choosing which data files to include in the data package
Eukaryotic gene data packages contain transcript and protein sequences and metadata by default, while prokaryotic data packages (WP_ accessions only) contain gene and protein sequences, plus metadata. You can add data files, or include only metadata, by passing one or more comma-separated terms to --include.
Here are a few examples of using the --include flag to choose which data files to include in the data package.
Get gene and protein sequences for the human BRCA1 gene (Gene ID: 672):
datasets download gene gene-id 672 --include gene,protein
Get gene, transcript, CDS and protein sequences for the human BRCA1 gene (Gene ID: 672):
datasets download gene gene-id 672 --include gene,rna,cds,protein
Get a data package with only the gene data report (metadata):
datasets download gene gene-id 672 --include none
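When scripting downloads, it can help to build the --include value from a list of terms. The sketch below uses the terms shown in the examples above. The output name brca1.zip is an arbitrary choice passed with the --filename option, and unzip is assumed to be available for listing the package contents.

```shell
# Terms to request; these are the ones used in the examples above.
terms="gene rna cds protein"

# --include takes a comma-separated list with no spaces.
include_arg=$(printf '%s' "$terms" | tr ' ' ',')
echo "$include_arg"

# Run the download only if the datasets CLI is on PATH.
if command -v datasets >/dev/null 2>&1; then
    # brca1.zip is a hypothetical output name chosen for this sketch.
    datasets download gene gene-id 672 --include "$include_arg" --filename brca1.zip
    # List the files that ended up in the data package.
    unzip -l brca1.zip
fi
```

The data package is a zip archive, so any zip tool can be used in place of unzip to inspect or extract it.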