Incarnato D, Neri F
Mouse E14 embryonic stem cells (ESCs) are the most widely used ESC line and are often employed in genome-wide studies involving next-generation sequencing analysis [1–5]. More than 2 × 10^9 sequences generated on the Illumina platform, derived from the genome of E14 embryonic stem cells cultured in our laboratory, were used to build a database of about 2.7 × 10^6 single nucleotide variants [6]. The database was validated against two other sequencing datasets from another laboratory, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. Data were deposited in GEO datasets under reference GSM1283021 and are also available here: http://epigenetics.hugef-research.org/data.php
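The idea of building a variant-aware assembly can be illustrated with a toy sketch. This is not the authors' pipeline: the function names `apply_snvs` and `cpg_fraction`, the 0-based `(position, ref_base, alt_base)` tuple format, and the short example sequence are all assumptions made for illustration. A real workflow would read variants from a VCF file and the reference from a FASTA file.

```python
def apply_snvs(reference, snvs):
    """Return a copy of `reference` with each SNV substituted in.

    `snvs` is a list of (position, ref_base, alt_base) tuples with
    0-based positions (a hypothetical format chosen for this sketch).
    """
    seq = list(reference)
    for pos, ref_base, alt_base in snvs:
        # Guard against coordinates that do not match the reference.
        if seq[pos] != ref_base:
            raise ValueError(
                f"reference mismatch at {pos}: "
                f"expected {ref_base}, found {seq[pos]}"
            )
        seq[pos] = alt_base
    return "".join(seq)


def cpg_fraction(reference, snvs):
    """Fraction of SNVs that fall on the C or the G of a reference CpG."""
    if not snvs:
        return 0.0
    hits = 0
    for pos, _ref_base, _alt_base in snvs:
        on_c = reference[pos:pos + 2] == "CG"
        on_g = pos > 0 and reference[pos - 1:pos + 1] == "CG"
        if on_c or on_g:
            hits += 1
    return hits / len(snvs)


if __name__ == "__main__":
    toy_reference = "ACGTTACGA"            # illustrative sequence only
    toy_snvs = [(1, "C", "T"), (4, "T", "G")]
    print(apply_snvs(toy_reference, toy_snvs))
    print(cpg_fraction(toy_reference, toy_snvs))
```

Reads would then be aligned against the substituted sequence instead of the original reference. Because only single-base substitutions are applied, coordinates are unchanged between the two sequences.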