TY - JOUR
T1 - Comparison of Capture Hi-C Analytical Pipelines
AU - Aljogol, Dina
AU - Thompson, I. Richard
AU - Osborne, Cameron S.
AU - Mifsud, Borbala
N1 - Publisher Copyright:
Copyright © 2022 Aljogol, Thompson, Osborne and Mifsud.
PY - 2022/1/28
Y1 - 2022/1/28
N2 - It is now evident that DNA forms an organized nuclear architecture, which is essential to maintain the structural and functional integrity of the genome. Chromatin organization can be systematically studied due to the recent boom in chromosome conformation capture technologies (e.g., 3C and its successors 4C, 5C and Hi-C), which is accompanied by the development of computational pipelines to identify biologically meaningful chromatin contacts in such data. However, not all tools are applicable to all experimental designs and all structural features. Capture Hi-C (CHi-C) is a method that uses an intermediate hybridization step to target and select predefined regions of interest in a Hi-C library, thereby increasing effective sequencing depth for those regions. It allows researchers to investigate fine chromatin structures at high resolution, for instance promoter-enhancer loops, but it introduces additional biases with the capture step, and therefore requires specialized pipelines. Here, we compare multiple analytical pipelines for CHi-C data analysis. We consider the effect of retaining multi-mapping reads and compare the efficiency of different statistical approaches in both identifying reproducible interactions and determining biologically significant interactions. At restriction fragment level resolution, the number of multi-mapping reads that could be rescued was negligible. The number of identified interactions varied widely, depending on the analytical method, indicating large differences in type I and type II error rates. The optimal pipeline depends on the project-specific tolerance level of false positive and false negative chromatin contacts.
AB - It is now evident that DNA forms an organized nuclear architecture, which is essential to maintain the structural and functional integrity of the genome. Chromatin organization can be systematically studied due to the recent boom in chromosome conformation capture technologies (e.g., 3C and its successors 4C, 5C and Hi-C), which is accompanied by the development of computational pipelines to identify biologically meaningful chromatin contacts in such data. However, not all tools are applicable to all experimental designs and all structural features. Capture Hi-C (CHi-C) is a method that uses an intermediate hybridization step to target and select predefined regions of interest in a Hi-C library, thereby increasing effective sequencing depth for those regions. It allows researchers to investigate fine chromatin structures at high resolution, for instance promoter-enhancer loops, but it introduces additional biases with the capture step, and therefore requires specialized pipelines. Here, we compare multiple analytical pipelines for CHi-C data analysis. We consider the effect of retaining multi-mapping reads and compare the efficiency of different statistical approaches in both identifying reproducible interactions and determining biologically significant interactions. At restriction fragment level resolution, the number of multi-mapping reads that could be rescued was negligible. The number of identified interactions varied widely, depending on the analytical method, indicating large differences in type I and type II error rates. The optimal pipeline depends on the project-specific tolerance level of false positive and false negative chromatin contacts.
KW - capture Hi-C
KW - chromatin organization
KW - computational pipeline
KW - epigenetics
KW - gene regulation
UR - http://www.scopus.com/inward/record.url?scp=85125036677&partnerID=8YFLogxK
U2 - 10.3389/fgene.2022.786501
DO - 10.3389/fgene.2022.786501
M3 - Article
AN - SCOPUS:85125036677
SN - 1664-8021
VL - 13
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 786501
ER -