Abstract
The GENCODE project provides comprehensive, high-accuracy annotation of the functional elements in the human and mouse genomes. The annotations are released for the benefit of the biomedical and genomic research domains. In this chapter, we provide a basic user manual, or roadmap, to facilitate the exploration of GENCODE annotation. We give a brief history of GENCODE and the general working principles that GENCODE adopts for its annotation, and then introduce a few workflows to guide users in extracting and exploring GENCODE resources for downstream analysis. The structure of this chapter is as follows. We start by introducing GENCODE from a historical perspective: the needs and objectives that led to its creation, and its standing as one of the most reliable sources for human and mouse genome functional elements. We then provide an overview of the GENCODE database, mainly the different types of annotated genes, their descriptions, basic statistics, and how they were created, with emphasis on the latest four releases. Following this database overview, we describe the annotation methods adopted by the GENCODE consortium for both the human and mouse genomes, along with the validation methods. In addition to the annotation methods, the user can find the GENCODE annotation data format fields and definitions as they appear in the GTF and GFF3 files. We then describe three ways to access GENCODE annotations: via the GENCODE portal, the Ensembl Genome Browser, and the UCSC Genome Browser. We conclude with three use cases showcasing how to explore the GENCODE annotation to answer research questions. Source code, an interactive user guide, and other files are available to users at https://github.com/smusleh/BookChapterGENCODE.
Original language | English |
---|---|
Title of host publication | Practical Guide to Life Science Databases |
Publisher | Springer Nature |
Pages | 1-25 |
Number of pages | 25 |
ISBN (Electronic) | 9789811658129 |
ISBN (Print) | 9789811658112 |
DOIs | |
Publication status | Published - 1 Jan 2022 |
Keywords
- Bioinformatics tool
- DNA elements
- Database
- GENCODE
- Genes
- Genomics
- Identification
- Long noncoding
- Protein
- RNA