Background

A critical aspect of insertional mutagenesis experiments performed on model organisms is mapping the hits of artificial transposons (ATs) with nucleotide-level accuracy. Mapping errors may occur when sequencing artifacts or mutations such as single nucleotide polymorphisms (SNPs) and small indels are present very close to the junction between a genomic sequence and a transposon inverted repeat (TIR). Another particular aspect of insertional mutagenesis is the mapping of transposon self-insertions and, to the best of our knowledge, there is no publicly available mapping tool designed to analyze such molecular events.

Results

We developed Genome ARTIST, a pairwise gapped aligner tool that addresses both issues by means of an original, robust mapping strategy. Genome ARTIST is not designed to use next-generation sequencing (NGS) data but to analyze AT insertions obtained in small- to medium-scale mutagenesis experiments. Genome ARTIST employs a heuristic approach to find DNA sequence similarities and harnesses a multi-step implementation of an adapted Smith-Waterman algorithm to compute the mapping alignments. The user experience is enhanced by easily customizable parameters and a user-friendly interface that describes the genomic landscape surrounding the insertion. Genome ARTIST works with many bacterial and eukaryotic genomes available in the Ensembl and GenBank repositories. Our tool specifically harnesses the sequence annotation data provided by FlyBase for Drosophila melanogaster (the fruit fly), which enables mapping of insertions relative to various genomic features such as natural transposons. Genome ARTIST was tested against other alignment tools using relevant query sequences derived from the D. melanogaster and Mus musculus (mouse) genomes. Real and simulated query sequences were also comparatively analyzed, revealing that Genome ARTIST is a very robust solution for mapping transposon insertions.

Conclusions

Genome ARTIST is a stand-alone, user-friendly application designed for high-accuracy mapping of transposon insertions and self-insertions. The tool is also useful for routine alignment tasks such as detecting SNPs or checking the specificity of primers and probes. Genome ARTIST is open source software and is available for download at genomeartist.ro/download.html and on GitHub.

Genome ARTIST (ARtificial Transposon Insertion Site Tracker) is a new bioinformatics tool originally developed to allow rapid detection of insertional mutations generated in the genome of Drosophila melanogaster by means of artificial P element derivatives.

Aside from the large gene disruption projects (BDGP, http://www.fruitfly.org; FlyBase, http://www.flybase.org), many fly laboratories run small-scale transposon mutagenesis screens. Basically, mobilization of artificial molecular constructs (derived from a natural P mobile element or from other transposons) with a transposase source induces insertional mutations in the germline. Many different mutant strains are derived from the affected parents using classical genetic crosses and, in the end, their putatively useful mutations are analyzed by inverse PCR and sequencing. The sequencing product is a mixture of information: part of it pertains to the canonical fruit fly genome and the rest belongs to a specific artificial element. The most critical aspect of sequence analysis is detecting the exact border between the genomic and transposon DNA, which is equivalent to identifying the insertion site at the nucleotide level. Sequencing products are not always perfect, and a few artifactual base mismatches may impair straightforward insertion mapping. Most commonly, the sequences of interest are aligned with BLAST (http://blast.ncbi.nlm.nih.gov) or BLAT (http://www.genome.ucsc.edu) against the official D. melanogaster genome, which contains neither natural nor artificial P transposons. Often, additional manual sequence annotation is needed to complete an accurate insertion mapping, and this is where Genome ARTIST enters the scene and offers some help. The query sequence is compared offline against both the D. melanogaster genome and the specific transposon sequence simultaneously; the partial sequence alignments are matched to each other, relative alignment scores are calculated, and the best mixed sequence, with its genomic and transposon coordinates, is offered to the user. Different colors are used for genomic versus transposon fragments, and an intuitive list of results and details is also displayed.
One can easily observe the absolute site of insertion, the gene affected by the transposon insertion, and the genes located in the close vicinity of the insertion. Special biological conditions occurring during mutagenesis experiments, such as transposon reinsertions into the original mobile element copy, are not usually detected by other search algorithms; Genome ARTIST is therefore designed to reveal and interpret such events.
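The idea of aligning one query against both the transposon and the genome, and then reading the junction from the two partial alignments, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Genome ARTIST's actual implementation: it uses the textbook Smith-Waterman recurrence with a linear gap penalty rather than the tool's multi-step adapted version, the scoring values are arbitrary, only the forward strand is considered, and the names `Hit`, `smith_waterman`, `map_junction`, `TIR_END` and `GENOME` are hypothetical. The toy sequences are deliberately built with disjoint base compositions so that the border is unambiguous.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int  # query interval, 0-based, end-exclusive
    q_end: int
    t_start: int  # target interval, 0-based, end-exclusive
    t_end: int


def smith_waterman(query, target, match=2, mismatch=-3, gap=-4):
    """Best local alignment of query vs target (textbook version, linear gaps).

    The scoring values are illustrative assumptions, not the tool's defaults.
    """
    rows, cols = len(query) + 1, len(target) + 1
    H = [[0] * cols for _ in range(rows)]
    best, end = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, end = H[i][j], (i, j)
    # Trace back from the best cell until the running score drops to zero.
    i, j = end
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return Hit(best, i, end[0], j, end[1])


def map_junction(query, transposon_end, genome):
    """Align the query to both references and report the border between them."""
    t = smith_waterman(query, transposon_end)
    g = smith_waterman(query, genome)
    if t.q_start <= g.q_start:
        # Transposon fragment comes first: the junction is where the genomic
        # alignment starts, and the adjacent genomic base is its first base.
        junction, site = g.q_start, g.t_start
    else:
        # Genomic fragment comes first: the adjacent genomic base is its last.
        junction, site = g.q_end, g.t_end - 1
    return {"transposon": t, "genomic": g,
            "query_junction": junction, "genome_site": site}


# Toy data (hypothetical): a transposon end and an A/T-only "genome".
TIR_END = "ATTATGCCGGCGCGC"
GENOME = "TTATAAATAT" "TTAATTTAAA" "TATATTATTT" "AATAAATTAT"
query = TIR_END[-10:] + GENOME[12:27]  # 10 transposon bases + 15 genomic bases

result = map_junction(query, TIR_END, GENOME)
print(result["query_junction"], result["genome_site"])  # prints: 10 12
```

In this toy case the transposon alignment covers query positions 0 to 9 and the genomic alignment covers positions 10 to 24, so the two partial alignments abut exactly. With real reads the two intervals may overlap or leave a few unassigned bases near the border, which is the situation where scoring and merging of partial alignments becomes important.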

To some extent, Genome ARTIST is an alternative to the classical alignment algorithms and may be used for checking the specificity of short sequences such as primers or probes. Last but not least, researchers working with other model organisms may use Genome ARTIST by loading other genomes and/or specific transposons. The performance of Genome ARTIST was tested on the genomes of Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila pseudoobscura, Ciona intestinalis, Danio rerio and Arabidopsis thaliana, and the results were similar to those obtained on D. melanogaster. Additionally, pairwise comparative alignments may be performed among sequences pertaining to various species aside from D. melanogaster, allowing the identification of structural orthologs.
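As a rough illustration of what checking primer specificity means, the sketch below scans a reference sequence on both strands and lists every site that matches a primer within a mismatch budget; a primer is specific when exactly one site is reported. This is a brute-force ungapped comparison (Hamming distance) written for clarity, and it is an assumption of this example rather than the way Genome ARTIST performs the check. The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(primer, reference, max_mismatches=0):
    """List (position, strand, mismatches) for each window within the budget.

    Positions are 0-based on the forward strand of the reference. A "-" hit
    means the reverse complement of the primer matches at that position.
    """
    sites = []
    for strand, probe in (("+", primer), ("-", reverse_complement(primer))):
        for pos in range(len(reference) - len(probe) + 1):
            window = reference[pos:pos + len(probe)]
            mismatches = sum(a != b for a, b in zip(probe, window))
            if mismatches <= max_mismatches:
                sites.append((pos, strand, mismatches))
    return sites


# Toy reference (hypothetical), not a real genomic sequence.
toy_reference = "ACGTTGCAAGCTTGACCGTATCGGATTACAGG"
print(find_sites("GACCGTAT", toy_reference))  # prints: [(13, '+', 0)]
```

Raising `max_mismatches` shows near-matches that could act as off-target binding sites. The scan is quadratic in the worst case, so it is only suitable for short references; an indexed aligner is the practical choice for whole genomes.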

Genome ARTIST is the result of an interdisciplinary collaboration between researchers from the Department of Genetics, University of Bucharest, and computer engineers who graduated from the "Politehnica" University of Bucharest (PUB). The authors of the software are Alexandru Al. Ecovoiu, Iulian Constantin Ghionoiu, Andrei Mihai Ciuca and Attila Cristian Ratiu.