EDUCATIONAL RESOURCES

UMUC: Bioinformatics Studies
The Master of Science program in biotechnology prepares candidates for entry-, mid-, or high-level positions, depending on their prior level of experience.
A specialization in Bioinformatics covers a broad range of subjects (for example, biostatistics, databases and data structures, algorithms, gene expression analysis, and Perl) at the interface of molecular biology and computational science. [read more]

A guide to bioinformatics resources on the internet
The following list contains links to various bioinformatics-related resources available on the internet. This list is by no means exhaustive, but it does cover a large share of bioinformatics material and subject matter. [read more]

Online Lectures on Bioinformatics
In the current context we can only give an extremely brief introduction to the basic notions of molecular biology. An overview can be found in any modern textbook on biology, biochemistry or molecular biology. [read more]

Centre for Molecular and Biomolecular Informatics
The CMBI offers a number of Courses on Bioinformatics and Cheminformatics topics, which are described elsewhere, as well as a number of Interactive Web Tutorials.
These online tutorials are available for anyone to browse, but we know from years of experience teaching computational biology and chemistry courses at the CMBI that doing carefully designed exercises with an experienced instructor close at hand is the best way to learn. [read more]

Databases

BioProject (formerly Genome Project)

A collection of genomics, functional genomics, and genetics studies and links to their resulting datasets. This resource describes project scope, material, and objectives and provides a mechanism to retrieve datasets that are often difficult to find due to inconsistent annotation, multiple independent submissions, and the varied nature of diverse data types which are often stored in different databases.

BioSample

The BioSample database contains descriptions of biological source materials used in experimental assays.

BioSystems

Database that groups biomedical literature, small molecules, and sequence data in terms of biological relationships.

Bookshelf

A collection of biomedical books that can be searched online and that are linked to PubMed records through research paper citations within the text. The collection includes biomedical textbooks, other scientific titles, some genetic resources, such as GeneReviews, and NCBI help manuals.

Cancer Chromosomes

Integrates data from three sources: the NCI/NCBI SKY/M-FISH and CGH Database, the NCI Mitelman Database of Chromosome Aberrations in Cancer, and the NCI Recurrent Aberrations in Cancer. The integrated databases can be searched for cytogenetic, clinical, and/or reference information.

ClinVar

The ClinVar resource is currently being developed to provide a public, tracked record of reported relationships among human variation and observed health status. It has a projected launch date for the latter part of 2011 - 2014.

CloneDB (formerly Clone Registry)

A database that integrates information about clones and libraries, including sequence data, map positions and distributor information.

Computational Resources from NCBI's Structure Group

A centralized page providing access and links to resources developed by the Structure Group of the NCBI Computational Biology Branch (CBB). These resources cover databases and tools to help in the study of macromolecular structures, conserved domains and protein classification, small molecules and their biological activity, and biological pathways and systems.

Consensus CDS (CCDS)

A collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality.

Conserved Domain Database (CDD)

A collection of sequence alignments and profiles representing protein domains conserved in molecular evolution. It also includes alignments of the domains to known 3-dimensional protein structures in the MMDB database.

Database of Expressed Sequence Tags (dbEST)

A division of GenBank that contains short single-pass reads of cDNA (transcript) sequences. dbEST can be searched directly through the Nucleotide EST Database.

Database of Genome Survey Sequences (dbGSS)

A division of GenBank that contains short single-pass reads of genomic DNA. dbGSS can be searched directly through the Nucleotide GSS Database.

Database of Genomic Structural Variation (dbVar)

The dbVar database has been developed to archive information associated with large scale genomic variation, including large insertions, deletions, translocations and inversions. In addition to archiving variation discovery, dbVar also stores associations of defined variants with phenotype information.

Database of Genotypes and Phenotypes (dbGaP)

Archives and distributes the results of studies that have investigated the interaction of genotypes and phenotypes. Such studies include those assessing genome-wide association, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits.

Database of Major Histocompatibility Complex (dbMHC)

Provides an open, publicly accessible platform where the HLA community can submit, edit, view, and exchange data related to the human Major Histocompatibility Complex. It consists of an interactive Alignment Viewer for HLA and related genes, an MHC microsatellite database, a sequence interpretation site for Sequencing Based Typing (SBT), and a Primer/Probe database.

Database of Single Nucleotide Polymorphisms (dbSNP)

Includes single nucleotide polymorphisms, microsatellites, and small-scale insertions and deletions. dbSNP contains population-specific frequency and genotype data, experimental conditions, molecular context, and mapping information for both neutral polymorphisms and clinical mutations.

Epigenomics

This resource enables users to explore and visualize richly-annotated epigenomics datasets. It provides a unique interface to search and navigate epigenomic data in the context of biological sample information, as well as tools to select, download and view multiple sets of epigenomic data as tracks on genome browsers.

GenBank

The NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data on a daily basis. GenBank consists of several divisions, most of which can be accessed through the Nucleotide database. The exceptions are the EST and GSS divisions, which are accessed through the Nucleotide EST and Nucleotide GSS databases, respectively.

Gene

A searchable database of genes, focusing on genomes that have been completely sequenced and that have an active research community to contribute gene-specific data. Information includes nomenclature, chromosomal localization, gene products and their attributes (e.g., protein interactions), associated markers, phenotypes, interactions, and links to citations, sequences, variation details, maps, expression reports, homologs, protein domain content, and external databases.

Gene Expression Nervous System Atlas (GENSAT)

Maps the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques. The GENSAT database contains a series of images related to gene expression experiments.

Gene Expression Omnibus (GEO) Database

A public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted and tools are provided to help users query and download experiments and curated gene expression profiles.

Gene Expression Omnibus (GEO) Datasets

Stores curated gene expression and molecular abundance DataSets assembled from the Gene Expression Omnibus (GEO) repository. DataSet records contain additional resources, including cluster tools and differential expression queries.

Gene Expression Omnibus (GEO) Profiles

Stores individual gene expression and molecular abundance Profiles assembled from the Gene Expression Omnibus (GEO) repository. Search for specific profiles of interest based on gene annotation or pre-computed profile characteristics.

GeneTests

A publicly funded medical genetics information resource developed for physicians, other healthcare providers, and researchers, available at no cost to all interested persons.

Genes and Disease

Summary information for more than 80 genetic disorders with discussions of the underlying mutation(s) and clinical features, as well as links to related databases and organizations. The database is accessed through NCBI's Bookshelf.

Genetic Testing Registry (GTR)

The GTR is currently being developed to provide access to information about genetic tests for inherited and somatic genetic variations, including tests based on arrays and multiplex panels. Information in GTR will be based primarily on voluntary data submissions by test developers and manufacturers. It has a projected launch date in the latter part of 2011 - 2014.

Genome

Contains sequence and map data from the whole genomes of over 1000 organisms. The genomes represent both completely sequenced organisms and those for which sequencing is in progress. All three main domains of life (bacteria, archaea, and eukaryota) are represented, as well as many viruses, phages, viroids, plasmids, and organelles.

Genome Reference Consortium (GRC)

The Genome Reference Consortium (GRC) maintains responsibility for the human and mouse reference genomes. Members consist of The Genome Center at Washington University, the Wellcome Trust Sanger Institute, the European Bioinformatics Institute (EBI) and the National Center for Biotechnology Information (NCBI). The GRC works to correct misrepresented loci and to close remaining assembly gaps. In addition, the GRC seeks to provide alternate assemblies for complex or structurally variant genomic loci. At the GRC website (http://www.genomereference.org), the public can view genomic regions currently under review, report genome-related problems and contact the GRC.

HIV-1, Human Protein Interaction Database

The HIV-1, Human Protein Interaction Database contains information about known interactions of HIV-1 proteins with proteins from human hosts. It provides annotated bibliographies of published reports of protein interactions, with links to the corresponding PubMed records and sequence data.

HomoloGene

A gene homology tool that compares nucleotide sequences between pairs of organisms in order to identify putative orthologs. Curated orthologs are incorporated from a variety of sources via the Gene database.

Influenza Virus

Presents data from the NIAID Influenza Genome Sequencing Project and from GenBank, and provides tools for flu sequence analysis, annotation and submission to GenBank. It also provides links to other flu sequence resources, and publications and general information about flu viruses.

Journals in NCBI Databases

Subset of the NLM Catalog database providing information on journals that are referenced in NCBI database records, including PubMed abstracts. This subset can be searched using the journal title, MEDLINE or ISO abbreviation, ISSN, or the NLM Catalog ID.

MeSH Database

MeSH (Medical Subject Headings) is the U.S. National Library of Medicine's controlled vocabulary for indexing articles for MEDLINE/PubMed. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts.

NCBI C++ Toolkit Manual

A comprehensive manual on the NCBI C++ toolkit, including its design and development framework, a C++ library reference, software examples and demos, FAQs and release notes. The manual is searchable online and can be downloaded as a series of PDF documents.

NCBI Education Page

Provides links to tutorials and training materials, including PowerPoint slides and print handouts.

NCBI Glossary

Part of the NCBI Handbook, this glossary contains descriptions of NCBI tools and acronyms, bioinformatics terms and data representation formats.

NCBI Handbook

An extensive collection of articles about NCBI databases and software. Designed for a novice user, each article presents a general overview of the resource and its design, along with tips for searching and using available analysis tools. All articles can be searched online and downloaded in PDF format; the handbook can be accessed through the NCBI Bookshelf.

NCBI Help Manual

Accessed through the NCBI Bookshelf, the Help Manual contains documentation for many NCBI resources, including PubMed, PubMed Central, the Entrez system, Gene, SNP and LinkOut. All chapters can be downloaded in PDF format.

NCBI Website Search

A database of static NCBI web pages, documentation, and online tools. These pages include such content as specialized online sequence analysis tools, back issues of newsletters, legacy resource description pages, sample code, and other miscellaneous resources. Searching this database is equivalent to a site search tool for the whole NCBI web site. FTP site is not covered.

National Library of Medicine (NLM) Catalog

Bibliographic data for all the journals, books, audiovisuals, computer software, electronic resources and other materials that are in the library's holdings.

Nucleotide Database

A collection of nucleotide sequences from several sources, including GenBank, RefSeq, the Third Party Annotation (TPA) database, and PDB. Searching the Nucleotide Database will yield available results from each of its component databases.

Online Mendelian Inheritance in Animals (OMIA)

Database of genes, inherited disorders and traits in animal species (other than human and mouse), with textual information and references, as well as links to relevant records from other NCBI databases, such as PubMed and Gene.

Online Mendelian Inheritance in Man (OMIM)

OMIM catalogs human genes and genetic disorders. NCBI maintains current content and continues to support its searching and integration with other NCBI databases. However, OMIM now has a new home at omim.org, and users are directed to this site for full record displays.

PopSet

Database of related DNA sequences that originate from comparative studies: phylogenetic, population, environmental and, to a lesser degree, mutational. Each record in the database is a set of DNA sequences. For example, a population set provides information on genetic variation within an organism, while a phylogenetic set may contain sequences, and their alignment, of a single gene obtained from several related organisms.

Probe

A public registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities.

Protein Clusters

A collection of related protein sequences (clusters), consisting of Reference Sequence proteins encoded by complete prokaryotic and organelle plasmids and genomes. The database provides easy access to annotation information, publications, domains, structures, external links, and analysis tools.

Protein Database

A database that includes protein sequence records from a variety of sources, including GenPept, RefSeq, Swiss-Prot, PIR, PRF, and PDB.

PubChem BioAssay

Consists of deposited bioactivity data and descriptions of bioactivity assays used to screen the chemical substances contained in the PubChem Substance database, including descriptions of the conditions and the readouts (bioactivity levels) specific to the screening procedure.

PubChem Compound

Contains unique, validated chemical structures (small molecules) that can be searched using names, synonyms or keywords. The compound records may link to more than one PubChem Substance record if different depositors supplied the same structure. These Compound records reflect validated chemical depiction information provided to describe substances in PubChem Substance. Structures stored within PubChem Compounds are pre-clustered and cross-referenced by identity and similarity groups. Additionally, calculated properties and descriptors are available for searching and filtering of chemical structures.

PubChem Substance

PubChem Substance records contain substance information electronically submitted to PubChem by depositors. This includes any chemical structure information submitted, as well as chemical names, comments, and links to the depositor's web site.

PubMed

A database of citations and abstracts for biomedical literature from MEDLINE and additional life science journals. Links are provided when full text versions of the articles are available via PubMed Central (described below) or other websites.

PubMed Central (PMC)

A digital archive of full-text biomedical and life sciences journal literature, including clinical medicine and public health.

RefSeqGene

RefSeqGene sequences are a collection of gene-specific reference genomic sequences from the NCBI RefSeq collection. They form a stable foundation for reporting mutations, for establishing consistent intron and exon numbering conventions, and for defining the coordinates of other biologically significant variation.

Reference Sequence (RefSeq)

A collection of curated, non-redundant genomic DNA, transcript (RNA), and protein sequences produced by NCBI. RefSeqs provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. The RefSeq collection is accessed through the Nucleotide and Protein databases.

Retrovirus Resources

A collection of resources specifically designed to support the research of retroviruses, including a genotyping tool that uses the BLAST algorithm to identify the genotype of a query sequence; an alignment tool for global alignment of multiple sequences; an HIV-1 automatic sequence annotation tool; and annotated maps of numerous retroviruses viewable in GenBank, FASTA, and graphic formats, with links to associated sequence records.

SARS CoV

A summary of data for the SARS coronavirus (CoV), including: links to the most recent sequence data and publications, links to other SARS related resources, and a pre-computed alignment of genome sequences from various isolates.

Sequence Read Archive (SRA)

The Sequence Read Archive (SRA) stores sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Life Technologies AB SOLiD System®, Helicos Biosciences Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®.

Structure (Molecular Modeling Database)

Contains macromolecular 3D structures derived from the Protein Data Bank, as well as tools for their visualization and comparative analysis.

Taxonomy

Contains the names and phylogenetic lineages of more than 160,000 organisms that have molecular data in the NCBI databases. New taxa are added to the Taxonomy database as data are deposited for them.

Third Party Annotation (TPA) Database

A database that contains sequences built from the existing primary sequence data in GenBank. The sequences and corresponding annotations are experimentally supported and have been published in a peer-reviewed scientific journal. TPA records are retrieved through the Nucleotide Database.

Trace Archive

A repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects.

UniGene

A database that provides sets of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.

UniGene Library Browser

This database contains libraries of Expressed Sequence Tags (ESTs) organized by organism, tissue type and developmental stage.

UniSTS

A comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information, such as genomic position, genes, and sequences.
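
As a rough illustration of what it means for an STS to be defined by a PCR primer pair, the sketch below locates the region of a template sequence bounded by a forward primer and the reverse complement of a reverse primer. It is a simplified assumption-laden example: matching is exact, only the first hit is reported, and the function names and sequences are hypothetical rather than taken from UniSTS.

```python
from typing import Optional, Tuple

# Watson-Crick complements for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sts(template: str, forward: str, reverse: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first region bounded by the primer pair.

    The forward primer is searched on the given strand; the reverse primer
    anneals to the opposite strand, so its reverse complement is searched
    downstream of the forward primer. Only exact matches are considered.
    """
    start = template.find(forward)
    if start == -1:
        return None
    target = reverse_complement(reverse)
    hit = template.find(target, start + len(forward))
    if hit == -1:
        return None
    return start, hit + len(target)


# Made-up template and primers, for illustration only.
template = "TTGACCATGGCATTACGGATCCAAGT"
region = find_sts(template, forward="ACCATG", reverse="GATCCG")
if region is not None:
    start, end = region
    print(template[start:end])
```

Real primer searches usually tolerate mismatches and constrain the product size, which this sketch deliberately leaves out.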

Viral Genomes

A wide range of resources, including a brief summary of the biology of viruses, links to viral genome sequences in Entrez Genome, and information about viral Reference Sequences, a collection of reference sequences for thousands of viral genomes.

Virus Variation

An extension of the Influenza Virus Resource to other organisms, providing an interface to download sequence sets of selected viruses; analysis tools, including virus-specific BLAST pages; and genome annotation pipelines (in progress).

Downloads

BLAST (Stand-alone)

BLAST executables for local use are provided for Solaris, Linux, Windows, and Mac OS X systems. See the README file in the FTP directory for more information. Pre-formatted databases for BLAST nucleotide, protein, and translated searches are also available for download under the db subdirectory.

FTP: BLAST Databases

Sequence databases for use with the stand-alone BLAST programs. The files in this directory are pre-formatted databases that are ready to use with BLAST.

FTP: CDD

This site provides full data records for CDD, along with individual Position Specific Scoring Matrices (PSSMs), mFASTA sequences and annotation data for each conserved domain. See the README file for full details.

FTP: FASTA BLAST Databases

Sequence databases in FASTA format for use with the stand-alone BLAST programs. These databases must be formatted using formatdb before they can be used with BLAST.
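
For readers unfamiliar with the FASTA layout mentioned above, the following is a minimal sketch of a reader, assuming the common convention that each record starts with a ">" header line followed by one or more sequence lines. The function name and the example records are hypothetical, and the code does not perform the formatdb step; it only shows how the raw text is organized.

```python
from typing import Dict, Iterable, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect sequences keyed by the first word of each '>' header line."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the text up to the first whitespace as the ID.
            parts = line[1:].split(None, 1)
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: append to the record currently being read.
            records[current_id] += line
    return records


# Made-up two-record example, not real database content.
example = """>seq1 first made-up record
ACGTAC
GTTA
>seq2 second made-up record
MKVL
""".splitlines()

sequences = read_fasta(example)
for identifier, sequence in sequences.items():
    print(identifier, len(sequence))
```

Keying on the first word of the header is a design choice for this sketch; real headers often carry several identifiers that a production parser would need to handle.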

FTP: GenBank

This site contains files for all sequence records in GenBank in the default flat file format. The files are organized by GenBank division, and the full contents are described in the README.genbank file.

FTP: GenPept

The protein sequences corresponding to the translations of coding sequences (CDS) in GenBank are collected for each GenBank release. Please see the README file in the directory for more information.

FTP: Gene

This site contains three directories: DATA, GeneRIF and tools. The DATA directory contains files listing all data linked to GeneIDs along with subdirectories containing ASN.1 data for the Gene records. The GeneRIF (Gene References into Function) directory contains PubMed identifiers for articles describing the function of a single gene or interactions between products of two genes. Sample programs for manipulating gene data are provided in the tools directory. Please see the README file for details.

FTP: Gene Expression Nervous System Atlas (GENSAT)

This site contains GENSAT image data organized by gene and contributing institution.

FTP: Gene Expression Omnibus (GEO) Profiles and Datasets

This site contains GEO data in two formats: SOFT (Simple Omnibus in Text Format) and MINiML (MIAME Notation in Markup Language). Summary text files and supplementary data are also available. Please see the README.TXT file for more information.
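
The SOFT format is line-oriented text, so a short sketch can show its general shape. The code below assumes the usual convention that lines beginning with "^" introduce an entity, lines beginning with "!" carry "name = value" attributes, and lines beginning with "#" describe table columns. The entity and attribute names in the example are made up, and the README.TXT file remains the authority on the actual layout.

```python
from typing import Dict, Iterable, List, Optional


def parse_soft(lines: Iterable[str]) -> Dict[str, Dict[str, List[str]]]:
    """Group '!' attribute lines under the most recent '^' entity line."""
    entities: Dict[str, Dict[str, List[str]]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("^"):
            # Entity line: start a new group of attributes.
            current = line[1:].strip()
            entities[current] = {}
        elif line.startswith("!") and current is not None and "=" in line:
            # Attribute line: a name may repeat, so values are kept in a list.
            key, value = line[1:].split("=", 1)
            entities[current].setdefault(key.strip(), []).append(value.strip())
        # Column descriptions ('#') and table rows are ignored in this sketch.
    return entities


# Hypothetical SOFT-like text, for illustration only.
example = [
    "^SAMPLE = example_sample_1",
    "!Sample_title = made-up title",
    "!Sample_characteristics = tissue: liver",
    "!Sample_characteristics = age: adult",
    "#ID_REF = probe identifier",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "probe1\t1.5",
    "!sample_table_end",
]

parsed = parse_soft(example)
print(parsed["SAMPLE = example_sample_1"]["Sample_title"])
```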

FTP: Genome

This site contains genome sequence and mapping data for organisms in Entrez Genome. The data are organized in directories for single species or groups of species. Mapping data are collected in the directory MapView and are organized by species. See the README file in the root directory and the README files in the species subdirectories for detailed information.

FTP: Genome Mapping Data

Contains directories for each genome that include available mapping data for current and previous builds of that genome.

FTP: Genome Markers (UniSTS)

This directory contains text and XML files for UniSTS records along with mapping data.

FTP: HomoloGene

This site contains data for each build of HomoloGene, beginning with build 35. Complete data for each build are provided in XML, and a data summary is provided in tab-delimited text format.

FTP: NCBI Field Guide Manual

Downloadable material for NCBI's previously offered Field Guide training course.

FTP: NCBI Structure Course Materials

PowerPoint slides, handouts and exercises for the previously offered NCBI course "Exploring 3D Molecular Structures."

FTP: NCBI Taxonomy

This site contains the full taxonomy database along with files associating nucleotide and protein sequence records with their taxonomy IDs. See the taxdump_readme.txt and gi_taxid.readme files for more information.
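
The taxonomy dump files are delimited text, and a small sketch may help when reading them. It assumes the layout commonly described for these files, in which fields are separated by a tab, a vertical bar, and a tab, and each row ends with a tab and a vertical bar; it also assumes a names file whose columns are tax ID, name, unique name, and name class. Both assumptions should be checked against taxdump_readme.txt, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List


def split_dmp_row(line: str) -> List[str]:
    """Split one dump row on the tab-bar-tab delimiter."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]  # drop the row terminator
    return [field.strip() for field in line.split("\t|\t")]


def scientific_names(rows: Iterable[str]) -> Dict[int, str]:
    """Map tax ID to name for rows whose name class is 'scientific name'."""
    names: Dict[int, str] = {}
    for row in rows:
        fields = split_dmp_row(row)
        # Assumed column order: tax ID, name, unique name, name class.
        if len(fields) >= 4 and fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names


# Two rows written by hand in the assumed layout.
example_rows = [
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n",
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n",
]

names = scientific_names(example_rows)
print(names)
```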

FTP: Protein Clusters

This site contains data from the Protein Clusters database arranged by release date. See the README files for more information.

FTP: PubChem

This site provides data from the PubChem Substance, Compound and BioAssay databases for download via FTP. Full downloads of the databases are available along with daily, weekly and monthly updates for Substance and Compound. Substance and Compound data are provided in ASN.1, SDF and XML formats. See the README files for more information.

FTP: RefSeq

This site contains all nucleotide and protein sequence records in the Reference Sequence (RefSeq) collection. The "release" directory contains the most current release of the complete collection, while data for selected organisms (such as human, mouse and rat) are available in separate directories. Data are available in FASTA and flat file formats. See the README file for details.

FTP: SKY/M-FISH and CGH Data

This site contains SKY-CGH data in ASN.1, XML and EasySKYCGH formats. See the skycghreadme.txt file for more information.

FTP: SNP

Downloadable data for SNP.

FTP: Sequence Read Archive (SRA) Download Facility

This site contains next-generation sequencing data organized by the submitted sequencing project.

FTP: Site

FTP download site for NCBI databases, tools, and utilities.

FTP: Structure (MMDB)

This site contains ASN.1 data for all records in MMDB along with VAST alignment data and the non-redundant PDB (nr-PDB) data sets. See the README file for more information.

FTP: Trace Archive

This site contains the trace chromatogram data organized by species. Data include chromatogram, quality scores, FASTA sequences from automatic base calls, and other ancillary information in tab-delimited text as well as XML formats. See the README file for details.

FTP: UniGene

This site contains individual directories for each organism with data in UniGene. The data for each species includes the unique sequence for each UniGene cluster, all sequences in each cluster in FASTA format and library information for the cluster. See the README file for further details.

FTP: UniVec

This site contains the UniVec and UniVec_Core databases in FASTA format. See the README.uv file for details.

FTP: Whole Genome Shotgun Sequences

This site contains whole genome shotgun sequence data organized by the 4-digit project code. Data include GenBank and GenPept flat files, quality scores and summary statistics. See the README.genbank.wgs file for more information.

FTP: dbGaP Open-Access Data

Open-access data generally include summaries of genotype/phenotype association studies, descriptions of the measured variables, and study documents, such as the protocol and questionnaires. Access to individual-level data, including phenotypic data tables and genotypes, requires varying levels of authorization.

FTP: dbMHC Data

This site contains data in separate directories for the various projects and resources within the database of the human major histocompatibility complex (dbMHC).

MEDLINE (Leasing)

NLM leases MEDLINE/PubMed to U.S. individuals or organizations.

NCBI Data Specifications

Specifications for NCBI data in ASN.1 or DTD format are available on the Index of data_specs page. The "NCBI_data_conversion.html" page links to the conversion tool.

National Library of Medicine (NLM) DTDs

A suite of tag sets for authoring and archiving journal articles as well as transferring journal articles from publishers to archives and between archives. There are four tag sets: Archiving and Interchange Tag Set - Created to enable an archive to capture as many of the structural and semantic components of existing printed and tagged journal material as conveniently as possible; Journal Publishing Tag Set - Optimized for archives that wish to regularize and control their content, not to accept the sequence and arrangement presented to them by any particular publisher; Article Authoring Tag Set - Designed for authoring new journal articles; NCBI Book Tag Set - Written specifically to describe volumes for the NCBI online libraries.

PubChem Download Service

This service allows users to download compound or substance records corresponding to a set of PubChem identifiers, which can be supplied manually or through a text file. Numerous download formats are available, including SDF, XML and SMILES.

PubMed Central (PMC) Open-Access Subset

The PMC Open-Access Subset is a relatively small part of the total collection of articles in PMC. Whereas the majority of articles in PMC are subject to traditional copyright restrictions, the articles in this subset are still protected by copyright but are made available under a Creative Commons or similar license that generally allows more liberal redistribution and reuse than traditional copyright. Please refer to the license statement in each article for specific terms of use.

RSS Feeds

Subscribe to Web/RSS feeds for updates about NCBI resources.

Submissions

BioProject Submission

An online form that provides an interface for researchers, consortia and organizations to register their BioProjects. This serves as the starting point for the submission of genomic and genetic data for the study. The data does not need to be submitted at the time of BioProject registration.

Database of Genotype and Phenotype (dbGaP) Data Submission Policies

Guidelines and requirements for submitting genotype and phenotype association data to dbGaP.

Database of Major Histocompatibility Complex (dbMHC) Microsatellite Markers Submission Template

Guidelines and template for submitting MHC region microsatellite data to dbMHC.

GenBank: BankIt

A web-based sequence submission tool for one or a few submissions to the GenBank database, designed to make the submission process quick and easy.

GenBank: Barcode

Tool for submission to the GenBank database of Barcode short nucleotide sequences from a standard genetic locus for use in species identification.

GenBank: Sequin

A stand-alone software tool developed by the NCBI for submitting and updating entries to public sequence databases (GenBank, EMBL, or DDBJ). It is capable of handling simple submissions that contain a single short mRNA sequence, as well as complex submissions containing long sequences, multiple annotations, segmented sets of DNA, or sequences from phylogenetic and population studies with alignments. For simple submissions, use the online submission tool BankIt instead.

GenBank: tbl2asn

A command-line program that automates the creation of sequence records for submission to GenBank using many of the same functions as Sequin. It is used primarily for submission of complete genomes and large batches of sequences.

Gene Expression Omnibus (GEO) Web Deposit

Submit expression data, such as microarray, SAGE or mass spectrometry datasets, to the NCBI Gene Expression Omnibus (GEO) database.

GeneRIF

GeneRIF provides a simple mechanism to allow scientists to add to the functional annotation of genes in the Gene database.

NIH Manuscript Submissions (NIHMS)

The NIH Manuscript Submission (NIHMS) System is used to submit manuscripts that arise from NIH funding to the PubMed Central digital archive, in accordance with the NIH Public Access Policy and the law it implements. The law and Public Access Policy are intended to ensure that the public has access to the published results of NIH-funded research.

PubChem Deposition Gateway

This site is for users who want to contribute new data to PubChem and/or test data upload procedures. Contributors first create a test account where they can test the deposition procedures and preview their data records. They may then proceed to a deposition account whereby their data will be uploaded into the public database.

SNP Submission Tool

The SNP database tools page provides links to the general submission guidelines and to the submission handle request. The page also has two specific links for single or batch submissions of human variation data using Human Genome Variation Society nomenclature.

Sequence Read Archive Submission

This link describes how submitters of SRA data can obtain a secure NCBI FTP site for their data, and also describes the allowed data formats and directory structures.

Trace Archive Submission

This link describes how submitters of trace data can obtain a secure NCBI FTP site for their data, and also describes the allowed data formats and directory structures.

How To


The BioINFO Project

"Merging science & technology to create innovative solutions."