Jørgensen TS, Xu Z, Hansen MA, Sørensen SJ, Hansen LH
Metagenomic approaches are widespread in microbiological research, but so far, knowledge of extrachromosomal DNA diversity and composition has largely remained dependent on cultivating host organisms. Even with the emergence of metagenomics, complete circular sequences are rarely identified and have required manual curation. We propose a robust in silico procedure for identifying complete small plasmids in metagenomic datasets from whole-genome shotgun sequencing. From one very pure and exhaustively sequenced metamobilome from rat cecum, we identified a total of 616 circular sequences, 160 of which carried a gene with a plasmid replication domain. Further homology analyses indicated that the majority of these plasmid sequences are novel. We verified the circularity of the complete plasmid candidates using an inverse-type PCR approach on a subset of sequences, with a 95% success rate, confirming the existence and length of discrete sequences. These findings broaden the understanding of the traits of circular elements in nature and open the possibility of large-scale data mining of existing metagenomic datasets to discover novel pools of complete plasmids, thus vastly expanding the current plasmid database.
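The abstract does not spell out the in silico procedure, so the following is only an illustrative sketch of one common heuristic for flagging circular candidates in assembled contigs: a circular molecule assembled as a linear contig often carries the same bases at both ends. The function names, the `min_overlap` threshold of 20 nt, and the rule of trimming the duplicated end are all assumptions made for this example and should not be read as the authors' actual method.

```python
def terminal_overlap(seq, min_overlap=20):
    """Length of the longest prefix that is also a suffix of seq.

    Only overlaps between min_overlap and half the contig length are
    considered (both bounds are assumptions of this sketch).
    Returns 0 when no such overlap exists.
    """
    seq = seq.upper()
    limit = len(seq) // 2
    for k in range(limit, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def circular_candidates(contigs, min_overlap=20):
    """Map contig name -> sequence with the duplicated end trimmed.

    contigs is a dict of name -> nucleotide string. Contigs without a
    terminal overlap are treated as linear and left out.
    """
    candidates = {}
    for name, seq in contigs.items():
        k = terminal_overlap(seq, min_overlap)
        if k:
            # Drop the repeated suffix so each base appears once.
            candidates[name] = seq.upper()[: len(seq) - k]
    return candidates


if __name__ == "__main__":
    # Hypothetical 40-nt "plasmid" and a contig that wraps around it.
    unit = "ATGCGTACGTTAGCCTAGGATCCGATTACAGGCTTGGCCT"
    contigs = {"wrapped": unit + unit[:25], "linear": unit}
    for name, seq in circular_candidates(contigs).items():
        print(name, len(seq))
```

A terminal overlap alone can also arise from direct repeats within a linear element, which is one reason an independent check, such as the inverse-type PCR described above, is still needed.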