GENCODE Annotation for the Human and Mouse Genome: A User Perspective

Saleh Musleh, Meshari Alazmi, Tanvir Alam*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

1 Citation (Scopus)

Abstract

The GENCODE project provides comprehensive, high-accuracy annotation of the functional elements in the human and mouse genomes. The annotations are released for the benefit of biomedical and genomic research. In this chapter, we provide a basic user manual, or roadmap, to facilitate the exploration of GENCODE annotation. We give a brief history of GENCODE and the general working principles that GENCODE adopts for its annotation, and then introduce a few workflows to guide users in the extraction and exploration of GENCODE resources for downstream analysis. The chapter is structured as follows. We start by introducing GENCODE from a historical perspective: the needs and objectives that led to its creation, and its standing as one of the most reliable sources of functional elements for the human and mouse genomes. Afterward, we provide an overview of the GENCODE database, covering the different types of annotated genes, their descriptions, basic statistics, and how they were created, with emphasis on the latest four releases. Following this overview, we describe the annotation methods adopted by the GENCODE consortium for both the human and mouse genomes, along with the validation methods. In addition to the annotation methods, we present the GENCODE annotation data format fields and their definitions as they appear in the GTF and GFF3 files. We then describe three ways to access GENCODE annotations: the GENCODE portal, the Ensembl Genome Browser, and the UCSC Genome Browser. We conclude with three use cases showcasing how to explore the GENCODE annotation to answer research questions. Source code, an interactive user guide, and other files are available at https://github.com/smusleh/BookChapterGENCODE.

Original language: English
Title of host publication: Practical Guide to Life Science Databases
Publisher: Springer Nature
Pages: 1-25
Number of pages: 25
ISBN (Electronic): 9789811658129
ISBN (Print): 9789811658112
Publication status: Published - 1 Jan 2022

Keywords

  • Bioinformatics tool
  • DNA elements
  • Database
  • GENCODE
  • Genes
  • Genomics
  • Identification
  • Long noncoding
  • Protein
  • RNA
