# HG changeset patch
# User bgruening
# Date 1761642942 0
# Node ID b86f3e5626f4b1abeca47f289212199555313276
# Parent 358f58401cd6d1abbc4c15b0fe49943e00619657
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/rna_tools/trna_prediction commit 1b3f92275bb60e606cd4fdc394fb9df95232a5aa

diff -r 358f58401cd6 -r b86f3e5626f4 aragorn.xml
--- a/aragorn.xml Wed Jul 26 10:14:05 2017 -0400
+++ b/aragorn.xml Tue Oct 28 09:15:42 2025 +0000
@@ -1,7 +1,13 @@
- + prediction (Aragorn) + + 1.2.41 + + + aragorn + - aragorn + aragorn python Assume that each sequence has a circular topology - - - - - - - - + + + + + + + + - + - - + + - + gff3_output - +
@@ -95,12 +95,36 @@
- - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - +
@@ -109,11 +133,17 @@
- - - + + + + + + + + + - +
@@ -123,9 +153,13 @@
- + + + + + - +
@@ -136,27 +170,37 @@
- + + + + + + + + + + +
-'):
+    if line.startswith(">"):
         genome_id = line[1:].strip()
-        if ' ' in genome_id:
-            genome_id = genome_id[0:genome_id.index(' ')]
+        if " " in genome_id:
+            genome_id = genome_id[0: genome_id.index(" ")]
     else:
         data = line.split()
         if len(data) == 5:
             # Parse data
-            strand = '-' if data[2].startswith('c') else '+'
-            start, end = data[2][data[2].index('[') + 1:-1].split(',')
+            strand = "-" if data[2].startswith("c") else "+"
+            start, end = data[2][data[2].index("[") + 1: -1].split(",")
             gff3 = {
-                'parent': genome_id,
-                'source': 'aragorn',
-                'start': int(start),
-                'end': int(end),
-                'strand': strand,
-                'score': '.',
-                '8': '.',
+                "parent": genome_id,
+                "source": "aragorn",
+                "start": int(start),
+                "end": int(end),
+                "strand": strand,
+                "score": ".",
+                "8": ".",
             }
             aa_long = data[1][5:]
             aa_short = aa_table[aa_long]
-            anticodon = data[4][1:data[4].index(")")].upper().replace("T", "U")
-            name = 'trn{}-{}'.format(aa_short, anticodon)
+            anticodon = data[4][1: data[4].index(")")].upper().replace("T", "U")
+            name = "trn{}-{}".format(aa_short, anticodon)
             if not full_gene_model:
-                gff3.update({
-                    'type': 'tRNA',
-                    'quals': 'ID=tRNA{0}.{1};Name={name};product={2}'.format(genome_id, *data, name = name),
-                })
+                gff3.update(
+                    {
+                        "type": "tRNA",
+                        "quals": "ID=tRNA{0}.{1};Name={name};product={2}".format(
+                            genome_id, *data, name=name
+                        ),
+                    }
+                )
                 output_line(gff3)
             else:
-                gff3.update({
-                    'type': 'gene',
-                    'quals': 'ID=gene{0}.{1};Name={name};product={2}'.format(genome_id, *data, name = name),
-                })
+                gff3.update(
+                    {
+                        "type": "gene",
+                        "quals": "ID=gene{0}.{1};Name={name};product={2}".format(
+                            genome_id, *data, name=name
+                        ),
+                    }
+                )
                 output_line(gff3)
-                gff3.update({
-                    'type': 'tRNA',
-                    'quals': 'ID=tRNA{0}.{1};Parent=gene{0}.{1};Name={name};product={2}'.format(genome_id, *data, name = name),
-                })
+                gff3.update(
+                    {
+                        "type": "tRNA",
+                        "quals": "ID=tRNA{0}.{1};Parent=gene{0}.{1};Name={name};product={2}".format(
+                            genome_id, *data, name=name
+                        ),
+                    }
+                )
                 output_line(gff3)
                 # If no introns
-                if ')i(' not in data[4]:
-                    gff3['type'] = 'exon'
-                    gff3['quals'] = 'Parent=tRNA{0}.{1}'.format(genome_id, *data)
+                if ")i(" not in data[4]:
+                    gff3["type"] = "exon"
+                    gff3["quals"] = "Parent=tRNA{0}.{1}".format(genome_id, *data)
                     output_line(gff3)
                 else:
-                    intron_location = data[4][data[4].rindex('(') + 1:-1].split(',')
+                    intron_location = data[4][data[4].rindex("(") + 1: -1].split(",")
                     intron_start, intron_length = map(int, intron_location)
-                    if strand == '+':
-                        original_end = gff3['end']
+                    if strand == "+":
+                        original_end = gff3["end"]
                     else:
-                        original_end = gff3['start']
+                        original_end = gff3["start"]
                     # EXON
-                    gff3.update({
-                        'type': 'exon',
-                        'quals': 'Parent=tRNA{0}.{1}'.format(genome_id, *data),
-                    })
-                    if strand == '+':
-                        gff3['end'] = gff3['start'] + intron_start - 2
+                    gff3.update(
+                        {
+                            "type": "exon",
+                            "quals": "Parent=tRNA{0}.{1}".format(genome_id, *data),
+                        }
+                    )
+                    if strand == "+":
+                        gff3["end"] = gff3["start"] + intron_start - 2
                     else:
-                        gff3['start'] = gff3['end'] - intron_start + 2
+                        gff3["start"] = gff3["end"] - intron_start + 2
                     output_line(gff3)
                     # INTRON
-                    gff3.update({
-                        'type': 'intron',
-                        'quals': 'Parent=tRNA{0}.{1}'.format(genome_id, *data),
-                    })
-                    if strand == '+':
-                        gff3['start'] = gff3['end'] + 1
-                        gff3['end'] = gff3['start'] + intron_length + 2
+                    gff3.update(
+                        {
+                            "type": "intron",
+                            "quals": "Parent=tRNA{0}.{1}".format(genome_id, *data),
+                        }
+                    )
+                    if strand == "+":
+                        gff3["start"] = gff3["end"] + 1
+                        gff3["end"] = gff3["start"] + intron_length + 2
                     else:
-                        gff3['end'] = gff3['start'] - 1
-                        gff3['start'] = gff3['end'] - intron_length + 1
+                        gff3["end"] = gff3["start"] - 1
+                        gff3["start"] = gff3["end"] - intron_length + 1
                     output_line(gff3)
                     # EXON
-                    gff3.update({
-                        'type': 'exon',
-                        'quals': 'Parent=tRNA{0}.{1}'.format(genome_id, *data),
-                    })
-                    if strand == '+':
-                        gff3['start'] = gff3['end'] + 1
-                        gff3['end'] = original_end
+                    gff3.update(
+                        {
+                            "type": "exon",
+                            "quals": "Parent=tRNA{0}.{1}".format(genome_id, *data),
+                        }
+                    )
+                    if strand == "+":
+                        gff3["start"] = gff3["end"] + 1
+                        gff3["end"] = original_end
                     else:
-                        gff3['end'] = gff3['start'] - 1
-                        gff3['start'] = original_end
+                        gff3["end"] = gff3["start"] - 1
+                        gff3["start"] = original_end
                     output_line(gff3)
diff -r 358f58401cd6 -r b86f3e5626f4 tRNAscan.py
--- a/tRNAscan.py Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,70 +0,0 @@
-#!/usr/bin/env python
-
-"""
-    Converts tRNAScan output back to fasta-sequences.
-"""
-import sys
-from Bio import SeqIO
-from Bio.SeqRecord import SeqRecord
-import subprocess
-
-def main(args):
-    """
-        Call from galaxy:
-        tRNAscan.py $organism $mode $showPrimSecondOpt $disablePseudo $showCodons $tabular_output $inputfile $fasta_output
-
-            tRNAscan-SE $organism $mode $showPrimSecondOpt $disablePseudo $showCodons -Q -y -q -b -o $tabular_output $inputfile;
-    """
-    cmd = """tRNAscan-SE -Q -y -q -b %s""" % ' '.join( args[:-1] )
-    child = subprocess.Popen(cmd.split(),
-        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
-    stdout, stderr = child.communicate()
-    return_code = child.returncode
-    if return_code:
-        sys.stdout.write(stdout)
-        sys.stderr.write(stderr)
-        sys.stderr.write("Return error code %i from command:\n" % return_code)
-        sys.stderr.write("%s\n" % cmd)
-    else:
-        sys.stdout.write(stdout)
-        sys.stdout.write(stderr)
-
-    outfile = args[-1]
-    sequence_file = args[-2]
-    tRNAScan_file = args[-3]
-
-    with open( sequence_file ) as sequences:
-        sequence_recs = SeqIO.to_dict(SeqIO.parse(sequences, "fasta"))
-
-    tRNAs = []
-    with open(tRNAScan_file) as tRNA_handle:
-        for line in tRNA_handle:
-            line = line.strip()
-            if not line or line.startswith('#'):
-                continue
-            cols = line.split()
-            iid = cols[0].strip()
-            start = int(cols[2])
-            end = int(cols[3])
-            aa = cols[4]
-            codon = cols[5]
-            rec = sequence_recs[ iid ]
-            if start > end:
-                new_rec = rec[end:start]
-                new_rec.seq = new_rec.seq.reverse_complement()
-                new_rec.description = "%s %s %s %s %s" % (rec.description, aa, codon, start, end)
-                new_rec.id = rec.id
-                new_rec.name = rec.name
-                tRNAs.append( new_rec )
-            else:
-                new_rec = rec[start:end]
-                new_rec.id = rec.id
-                new_rec.name = rec.name
-                new_rec.description = "%s %s %s %s %s" % (rec.description, aa, codon, start, end)
-                tRNAs.append( new_rec )
-
-    SeqIO.write(tRNAs, open(outfile, 'w+'), "fasta")
-
-
-if __name__ == '__main__':
-    main(sys.argv[1:])
diff -r 358f58401cd6 -r b86f3e5626f4 tRNAscan.xml
--- a/tRNAscan.xml Wed Jul 26 10:14:05 2017 -0400
+++ b/tRNAscan.xml Tue Oct 28 09:15:42 2025 +0000
@@ -1,74 +1,112 @@
- - (tRNAscan) + + using tRNAscan, EufindtRNA & tRNA covariance models + + 2.0.12 + + + trnascan-se + - trnascan-se - infernal - biopython - python + trnascan-se - - - - + + - + + + + - - - - - + + + + + + - - - + + + + + - + + + missed is True + + + stats is True + - + - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - 10.1093/nar/25.5.0955 + 10.1093/nar/gkab688
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/aragorn_tansl-table-11_introns.gff3
--- a/test-data/aragorn_tansl-table-11_introns.gff3 Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,32 +0,0 @@
-##gff-version 3
-genome_with_introns aragorn gene 1533 4118 . - . ID=genegenome_with_introns.1;Name=trnK-UUU;product=tRNA-Lys
-genome_with_introns aragorn tRNA 1533 4118 . - . ID=tRNAgenome_with_introns.1;Parent=genegenome_with_introns.1;Name=trnK-UUU;product=tRNA-Lys
-genome_with_introns aragorn exon 4081 4118 . - . Parent=tRNAgenome_with_introns.1
-genome_with_introns aragorn intron 1569 4080 . - . Parent=tRNAgenome_with_introns.1
-genome_with_introns aragorn exon 1533 1568 . - . Parent=tRNAgenome_with_introns.1
-genome_with_introns aragorn gene 6453 6524 . - . ID=genegenome_with_introns.2;Name=trnQ-UUG;product=tRNA-Gln
-genome_with_introns aragorn tRNA 6453 6524 . - . ID=tRNAgenome_with_introns.2;Parent=genegenome_with_introns.2;Name=trnQ-UUG;product=tRNA-Gln
-genome_with_introns aragorn exon 6453 6524 . - . Parent=tRNAgenome_with_introns.2
-genome_with_introns aragorn gene 7730 7819 . - . ID=genegenome_with_introns.3;Name=trnS-GCU;product=tRNA-Ser
-genome_with_introns aragorn tRNA 7730 7819 . - . ID=tRNAgenome_with_introns.3;Parent=genegenome_with_introns.3;Name=trnS-GCU;product=tRNA-Ser
-genome_with_introns aragorn exon 7730 7819 . - . Parent=tRNAgenome_with_introns.3
-genome_with_introns aragorn gene 12356 12431 . + . ID=genegenome_with_introns.4;Name=trnP-UGG;product=tRNA-Pro
-genome_with_introns aragorn tRNA 12356 12431 . + . ID=tRNAgenome_with_introns.4;Parent=genegenome_with_introns.4;Name=trnP-UGG;product=tRNA-Pro
-genome_with_introns aragorn exon 12356 12431 . + . Parent=tRNAgenome_with_introns.4
-genome_with_introns aragorn gene 12597 12670 . + . ID=genegenome_with_introns.5;Name=trnW-CCA;product=tRNA-Trp
-genome_with_introns aragorn tRNA 12597 12670 . + . ID=tRNAgenome_with_introns.5;Parent=genegenome_with_introns.5;Name=trnW-CCA;product=tRNA-Trp
-genome_with_introns aragorn exon 12597 12670 . + . Parent=tRNAgenome_with_introns.5
-genome_with_introns aragorn gene 22050 22123 . - . ID=genegenome_with_introns.6;Name=trnR-CCG;product=tRNA-Arg
-genome_with_introns aragorn tRNA 22050 22123 . - . ID=tRNAgenome_with_introns.6;Parent=genegenome_with_introns.6;Name=trnR-CCG;product=tRNA-Arg
-genome_with_introns aragorn exon 22050 22123 . - . Parent=tRNAgenome_with_introns.6
-genome_with_introns aragorn gene 26588 26660 . - . ID=genegenome_with_introns.7;Name=trnM-CAU;product=tRNA-Met
-genome_with_introns aragorn tRNA 26588 26660 . - . ID=tRNAgenome_with_introns.7;Parent=genegenome_with_introns.7;Name=trnM-CAU;product=tRNA-Met
-genome_with_introns aragorn exon 26588 26660 . - . Parent=tRNAgenome_with_introns.7
-genome_with_introns aragorn gene 26850 27468 . + . ID=genegenome_with_introns.8;Name=trnY-AUA;product=tRNA-Tyr
-genome_with_introns aragorn tRNA 26850 27468 . + . ID=tRNAgenome_with_introns.8;Parent=genegenome_with_introns.8;Name=trnY-AUA;product=tRNA-Tyr
-genome_with_introns aragorn exon 26850 26884 . + . Parent=tRNAgenome_with_introns.8
-genome_with_introns aragorn intron 26885 27410 . + . Parent=tRNAgenome_with_introns.8
-genome_with_introns aragorn exon 27411 27468 . + . Parent=tRNAgenome_with_introns.8
-genome_with_introns aragorn gene 29759 29833 . + . ID=genegenome_with_introns.9;Name=trnH-GUG;product=tRNA-His
-genome_with_introns aragorn tRNA 29759 29833 . + . ID=tRNAgenome_with_introns.9;Parent=genegenome_with_introns.9;Name=trnH-GUG;product=tRNA-His
-genome_with_introns aragorn exon 29759 29833 . + . Parent=tRNAgenome_with_introns.9
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/aragorn_tansl-table-1_tmRNA_tRNA.fasta
--- a/test-data/aragorn_tansl-table-1_tmRNA_tRNA.fasta Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,3 +0,0 @@
->1-1 tRNA-Ala(tgc) [381,453]
-ggggatgtagctcatatggtagagcgctcgctttgcatgcgagaggcaca
-gggttcgattccctgcatctcca
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/aragorn_tansl-table-1_tmRNA_tRNA.gff3
--- a/test-data/aragorn_tansl-table-1_tmRNA_tRNA.gff3 Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,2 +0,0 @@
-##gff-version 3
-gi|240255695:23036500-23037000 aragorn tRNA 381 453 . + . ID=tRNAgi|240255695:23036500-23037000.1;Name=trnA-UGC;product=tRNA-Ala
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/aragorn_tansl-table-1_tmRNA_tRNA.txt
--- a/test-data/aragorn_tansl-table-1_tmRNA_tRNA.txt Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,70 +0,0 @@
-------------------------------
-ARAGORN v1.2.36 Dean Laslett
-------------------------------
-
-Please reference the following paper if you use this
-program as part of any published research.
-
-Laslett, D. and Canback, B. (2004) ARAGORN, a
-program for the detection of transfer RNA and
-transfer-messenger RNA genes in nucleotide sequences.
-Nucleic Acids Research, 32;11-16.
-
-
-Searching for tRNA genes with no introns
-Searching for tmRNA genes
-Assuming circular topology, search wraps around ends
-Searching both strands
-Using standard genetic code
-
-
-gi|240255695:23036500-23037000 Arabidopsis thaliana chromosome 3, complete sequence
-501 nucleotides in sequence
-Mean G+C content = 43.1%
-
-1.
-
-
-
- a
- g-c
- g-c
- g+t
- g-c
- a-t
- t-a
- g-c tt
- t gtccc a
- ta a !!!!! g
- a ctcg caggg c
- t !!!! a tt
- g gagc c
- gta g g
- c-gag
- t-a
- c-g
- g-c
- c-g
- t t
- t a
- tgc
-
-
-
- tRNA-Ala(tgc)
- 73 bases, %GC = 56.2
- Sequence [381,453]
-
-
-
->tRNA-Ala(tgc) [381,453]
-ggggatgtagctcatatggtagagcgctcgctttgcatgcgagaggcaca
-gggttcgattccctgcatctcca
-
-
-
-
-Number of tmRNA genes = 0
-
-
-Configuration: aragorn /tmp/tmpx1qAPk/files/000/dataset_3.dat -gc1 -m -t -c -o /tmp/tmpx1qAPk/files/000/dataset_4.dat -fasta
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/tRNAscan_eukaryotic_infernal.fasta
--- a/test-data/tRNAscan_eukaryotic_infernal.fasta Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,3 +0,0 @@
->gi|240255695:23036500-23037000 Arabidopsis thaliana chromosome 3, complete sequence Ala TGC 381 453
-GGGATGTAGCTCATATGGTAGAGCGCTCGCTTTGCATGCGAGAGGCACAGGGTTCGATTC
-CCTGCATCTCCA
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/tRNAscan_eukaryotic_infernal.tabular
--- a/test-data/tRNAscan_eukaryotic_infernal.tabular Wed Jul 26 10:14:05 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,1 +0,0 @@
-gi|240255695:23036500-23037000 1 381 453 Ala TGC 0 0 67.36
diff -r 358f58401cd6 -r b86f3e5626f4 test-data/trna_arabidopsis.fasta
--- a/test-data/trna_arabidopsis.fasta Wed Jul 26 10:14:05 2017 -0400
+++ b/test-data/trna_arabidopsis.fasta Tue Oct 28 09:15:42 2025 +0000
@@ -7,4 +7,5 @@
 ATTTAGTTAAAAAAGTGTTGAGAATCATTCGGGGATGTAGCTCATATGGTAGAGCGCTCGCTTTGCATGC
 GAGAGGCACAGGGTTCGATTCCCTGCATCTCCATTTTTATTTTCTTTTTTTTATAACTTTTGGTGAGCTT
 AATGGCCCAAT
-
+>dummy_seq
+ACTGACTGATCGTAACTAGTAGCGACTGATCGATCGTACTGCATGGCATGTGACTGTCCGTGACTGTACG