Stong N, Deng Z, Gupta R, Hu S, Paul S, Weiner AK, Eichler EE, Graves T, Fronick CC, Courtney L, Wilson RK, Lieberman P, Davuluri RV, Riethman H
Mapping genome-wide data to human subtelomeres has been problematic because of incomplete assemblies and the challenges posed by low-copy repetitive DNA elements. Here, we provide updated human subtelomere sequence assemblies that were extended by filling telomere-adjacent gaps using clone-based resources. We developed and implemented a bioinformatic pipeline that incorporates multi-read mapping to annotate the updated assemblies with short-read datasets. Annotation of subtelomeric sequence features, together with mapping of CTCF and cohesin binding sites using ChIP-seq datasets from multiple human cell types, confirmed that CTCF and cohesin bind within 3 kb of the start of terminal repeat tracts at many, but not all, subtelomeres. CTCF and cohesin co-occupancy was also enriched near Internal Telomere-like Sequence (ITS) islands and the non-terminal boundaries of subtelomere repeat elements (SREs) in transformed lymphoblastoid cell lines (LCLs) and human embryonic stem (ES) cell lines, but this enrichment was not significant in the primary fibroblast cell line IMR90. Subtelomeric ITS islands were found to be frequent sites of artifactual mappings of short-read datasets because their sequences are similar to those in terminal repeat tracts; TERF1 and TERF2 ChIP-seq peaks called at ITS sites could not be confirmed by ChIP-qPCR analysis of those sites. By contrast, subtelomeric CTCF and cohesin sites predicted by ChIP-seq using our bioinformatic pipeline (but not predicted when only uniquely mapping reads were considered) were consistently validated by ChIP-qPCR. The co-localized CTCF and cohesin sites in SRE regions are candidates for mediating long-range chromatin interactions in the transcript-rich SRE region. We established a public browser for the integrated display of short-read sequence-based annotations relative to key subtelomere features such as the start of each terminal repeat tract, SRE identity and organization, and subtelomeric gene models (vader.wistar.upenn.edu/humansubtel).
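The contrast between multi-read mapping and unique-only mapping can be illustrated with a minimal sketch. It assumes one common weighting scheme, in which a read that aligns equally well to n loci contributes 1/n to each; the abstract does not specify how the pipeline assigns multi-reads, so this is an illustration rather than the authors' method. The function name `bin_coverage`, the read identifiers, the contig names, and the 100 bp bin size are all hypothetical.

```python
from collections import defaultdict


def bin_coverage(alignments, bin_size=100, unique_only=False):
    """Accumulate per-bin read coverage.

    alignments: dict mapping read_id -> list of (contig, position) tuples,
                one tuple per equally good candidate locus (hypothetical input).
    unique_only: if True, discard every read with more than one candidate
                 locus, mimicking a unique-mapping filter.
    Returns a dict mapping (contig, bin_index) -> weighted read count.
    """
    coverage = defaultdict(float)
    for read_id, loci in alignments.items():
        if not loci:
            continue  # unaligned read
        if unique_only and len(loci) != 1:
            continue  # multi-read dropped under the unique-only filter
        # Assumed scheme: split the read's weight evenly across its loci.
        weight = 1.0 / len(loci)
        for contig, pos in loci:
            coverage[(contig, pos // bin_size)] += weight
    return dict(coverage)


# Toy data: r1 maps uniquely; r2 and r3 map to duplicated sequence
# shared by two hypothetical subtelomere contigs.
alignments = {
    "r1": [("chr1p", 120)],
    "r2": [("chr1p", 150), ("chr9q", 150)],
    "r3": [("chr9q", 130), ("chr1p", 170)],
}

multi = bin_coverage(alignments)
unique = bin_coverage(alignments, unique_only=True)
```

In this toy case the unique-only filter leaves no signal at the `chr9q` bin, whereas the weighted scheme retains it. This mirrors, in a simplified way, how sites in duplicated subtelomeric sequence can be missed when only uniquely mapping reads are considered.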