Full-length sequencing of circular DNA viruses and extrachromosomal circular DNA using CIDER-Seq
Devang Mehta, Luc Cornet, Matthias Hirsch-Hoffmann, Syed Shan-e-Ali Zaidi and Hervé Vanderschuren
Circular DNA is ubiquitous in nature in the form of plasmids, circular DNA viruses, and extrachromosomal circular DNA (eccDNA) in eukaryotes. Sequencing of such molecules is essential to profiling virus distributions, discovering new viruses and understanding the roles of eccDNAs in eukaryotic cells. Circular DNA enrichment sequencing (CIDER-Seq) is a technique to enrich and accurately sequence circular DNA without the need for polymerase chain reaction amplification, cloning, and computational sequence assembly. The approach is based on randomly primed circular DNA amplification, which is followed by several enzymatic DNA repair steps and then by long-read sequencing. CIDER-Seq includes a custom data analysis package (CIDER-Seq Data Analysis Software 2) that implements the DeConcat algorithm to deconcatenate the long sequencing products of random circular DNA amplification into the intact sequences of the input circular DNA. The CIDER-Seq data analysis package can generate full-length annotated virus genomes, as well as circular DNA sequences of novel viruses. Applications of CIDER-Seq also include profiling of eccDNA molecules such as transposable elements (TEs) from biological samples. The method takes ~2 weeks to complete, depending on the computational resources available. Owing to the present constraints of long-read single-molecule sequencing, the accuracy of circular virus and eccDNA sequences generated by the CIDER-Seq method scales with sequence length, and the greatest accuracy is obtained for molecules <10 kb long.
https://www.nature.com/articles/s41596-020-0301-0
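To illustrate the deconcatenation idea mentioned in the abstract above, here is a minimal sketch in Python. It is not the published DeConcat algorithm or any part of the CIDER-Seq software. It assumes an error-free read made of tandem copies of one circular unit, and the function names (`smallest_period`, `canonical_rotation`, `deconcatenate`) are hypothetical. Real long reads contain sequencing errors, so a practical tool would need alignment-based comparison rather than exact matching.

```python
def smallest_period(read: str) -> int:
    """Return the smallest p with read[i] == read[i + p] for every valid i."""
    n = len(read)
    for p in range(1, n + 1):
        if all(read[i] == read[i + p] for i in range(n - p)):
            return p
    return n  # only reached for an empty read


def canonical_rotation(unit: str) -> str:
    """Pick the lexicographically smallest rotation of a circular sequence.

    A circular molecule has no fixed start, so reads that begin at different
    positions are mapped to the same representative string.
    """
    if not unit:
        return unit
    return min(unit[i:] + unit[:i] for i in range(len(unit)))


def deconcatenate(read: str) -> str:
    """Collapse an error-free tandem-repeat read to one copy of its unit."""
    period = smallest_period(read)
    return canonical_rotation(read[:period])


# Toy example: a read covering about two copies of an 8-base circular unit,
# starting partway through the unit.
unit = "ACGTTGCA"
read = (unit * 3)[3:20]
recovered = deconcatenate(read)
```

In this toy case `recovered` is a rotation of `unit`, so comparing it with `canonical_rotation(unit)` confirms that the same circular sequence was recovered regardless of where the read started.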
Palantir: a springboard for the analysis of secondary metabolite gene clusters in large-scale genome mining projects
Loïc Meunier, Pierre Tocquin, Luc Cornet, Damien Sirjacobs, Valérie Leclère, Maude Pupin, Philippe Jacques and Denis Baurain
To support small and large-scale genome mining projects, we present Post-processing Analysis tooLbox for ANTIsmash Reports (Palantir), a dedicated software suite for handling and refining secondary metabolite biosynthetic gene cluster (BGC) data annotated with the popular antiSMASH pipeline. Palantir provides new functionalities building on NRPS/PKS predictions from antiSMASH, such as improved BGC annotation, module delineation and easy access to sub-sequences at different levels (cluster, gene, module and domain). Moreover, it can parse user-provided antiSMASH reports and reformat them for direct use or storage in a relational database.
https://doi.org/10.1093/bioinformatics/btaa517