Najafabadi HS, Mnaimneh S, Schmitges FW, Garton M, Lam KN, Yang A, Albu M, Weirauch MT, Radovani E, Kim PM, Greenblatt J, Frey BJ, Hughes TR
Cys2-His2 zinc finger (C2H2-ZF) proteins represent the largest class of putative human transcription factors. However, for most C2H2-ZF proteins it is unknown whether they even bind DNA or, if they do, to which sequences. Here, by combining data from a modified bacterial one-hybrid system with protein-binding microarray and chromatin immunoprecipitation analyses, we show that natural C2H2-ZFs encoded in the human genome bind DNA both in vitro and in vivo, and we infer the DNA recognition code using DNA-binding data for thousands of natural C2H2-ZF domains. In vivo binding data are generally consistent with our recognition code and indicate that C2H2-ZF proteins recognize more motifs than all other human transcription factors combined. We provide direct evidence that most KRAB-containing C2H2-ZF proteins bind specific endogenous retroelements (EREs), ranging from currently active to ancient families. The majority of C2H2-ZF proteins, including KRAB proteins, also show widespread binding to regulatory regions, indicating that the human genome contains an extensive and largely unstudied adaptive C2H2-ZF regulatory network that targets a diverse range of genes and pathways.