Mercurial > repos > peterjc > tmhmm_and_signalp
annotate tools/protein_analysis/signalp3.py @ 23:e1996f0f4e85 draft default tip
"v0.2.13 - Python 3 fix for raising StopIteration"
| author | peterjc | 
|---|---|
| date | Thu, 17 Jun 2021 17:59:33 +0000 | 
| parents | 238eae32483c | 
| children | 
| rev | changeset | description |
|---|---|---|
| 0 | bca9bc7fdaef | Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository |
| 5 | 0f1c61998b22 | Migrated tool version 0.0.8 from old tool shed archive to new tool shed repository |
| 7 | 9b45a8743100 | Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM & Promoter wrappers. |
| 11 | 99b82a2b1272 | Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz) |
| 20 | a19b3ded8f33 | v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes |
| 21 | 238eae32483c | "Check this is up to date with all 2020 changes (black etc)" |

#!/usr/bin/env python
"""Wrapper for SignalP v3.0 for use in Galaxy.

This script takes exactly five command line arguments:
* the organism type (euk, gram+ or gram-)
* length to truncate sequences to (integer)
* number of threads to use (integer, defaults to one)
* an input protein FASTA filename
* output tabular filename.

There are two further optional arguments
* cut type (NN_Cmax, NN_Ymax, NN_Smax or HMM_Cmax)
* output GFF3 filename

It then calls the standalone SignalP v3.0 program (not the webservice)
requesting the short output (one line per protein) using both NN and HMM
for predictions.

First major feature is cleaning up the output. The raw output from SignalP
v3.0 looks like this (21 columns space separated):

# SignalP-NN euk predictions # SignalP-HMM euk predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ? D ? # name ! Cmax pos ? Sprob ?
gi|2781234|pdb|1JLY| 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N gi|2781234|pdb|1JLY|B Q 0.000 17 N 0.000 N
gi|4959044|gb|AAD342 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N gi|4959044|gb|AAD34209.1|AF069992_1 Q 0.000 0 N 0.000 N
gi|671626|emb|CAA856 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N gi|671626|emb|CAA85685.1| Q 0.000 0 N 0.000 N
gi|3298468|dbj|BAA31 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N gi|3298468|dbj|BAA31520.1| Q 0.066 24 N 0.139 N

In order to make it easier to use in Galaxy, this wrapper script reformats
this to use tab separators. Also it removes the redundant truncated name
column, and assigns unique column names in the header:

#ID NN_Cmax_score NN_Cmax_pos NN_Cmax_pred NN_Ymax_score NN_Ymax_pos NN_Ymax_pred NN_Smax_score NN_Smax_pos NN_Smax_pred NN_Smean_score NN_Smean_pred NN_D_score NN_D_pred HMM_bang HMM_Cmax_score HMM_Cmax_pos HMM_Cmax_pred HMM_Sprob_score HMM_Sprob_pred
gi|2781234|pdb|1JLY|B 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N Q 0.000 17 N 0.000 N
gi|4959044|gb|AAD34209.1|AF069992_1 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N Q 0.000 0 N 0.000 N
gi|671626|emb|CAA85685.1| 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N Q 0.000 0 N 0.000 N
gi|3298468|dbj|BAA31520.1| 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N Q 0.066 24 N 0.139 N

The second major feature is overcoming SignalP's built in limit of 4000
sequences by breaking up the input FASTA file into chunks. This also allows
us to pre-trim the sequences since SignalP only needs their starts.

The third major feature is taking advantage of multiple cores (since SignalP
v3.0 itself is single threaded) by using the individual FASTA input files to
run multiple copies of SignalP in parallel. I would normally use Python's
multiprocessing library in this situation but it requires at least Python 2.6
and at the time of writing Galaxy still supports Python 2.4.

Note that this is somewhat redundant with job-splitting available in Galaxy
itself (see the SignalP XML file for settings).

Finally, you can opt to have a GFF3 file produced which will describe the
predicted signal peptide and mature peptide for each protein (using one of
the predictors which gives a cleavage site). *WORK IN PROGRESS*
"""  # noqa: E501
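The chunking and pre-trimming described in the docstring can be sketched as follows. This is a minimal illustration, not the wrapper's actual implementation: `read_fasta` and `chunk_records` are hypothetical helpers written for this example, and they do not reproduce the signatures of `fasta_iterator` or `split_fasta` from `seq_analysis_utils`, which are not shown on this page.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (title, sequence) pairs from an iterable of FASTA lines."""
    title, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(seq)
            title, seq = line[1:], []
        elif line:
            seq.append(line)
    if title is not None:
        yield title, "".join(seq)


def chunk_records(records, chunk_size, truncate=0):
    """Yield lists of at most chunk_size records.

    Each sequence is trimmed to its first `truncate` residues;
    a value of zero means no trimming (an assumption of this sketch).
    """
    iterator = iter(records)
    while True:
        chunk = [
            (title, seq[:truncate] if truncate else seq)
            for title, seq in islice(iterator, chunk_size)
        ]
        if not chunk:
            # Plain return ends the generator; raising StopIteration
            # inside a generator is an error on modern Python 3.
            return
        yield chunk


if __name__ == "__main__":
    fasta = [">a", "MKT", "LLV", ">b", "MA", ">c", "MKKKKK"]
    for chunk in chunk_records(read_fasta(fasta), chunk_size=2, truncate=4):
        print(chunk)
```

Each chunk could then be written to its own temporary FASTA file and passed to a separate SignalP process, which keeps every run under the 4000 sequence limit.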

from __future__ import print_function

import os
import sys
import tempfile

from seq_analysis_utils import fasta_iterator, split_fasta
from seq_analysis_utils import run_jobs, thread_count

FASTA_CHUNK = 500
MAX_LEN = 6000  # Found by trial and error

if "-v" in sys.argv or "--version" in sys.argv:
    print("SignalP Galaxy wrapper version 0.0.20")
    sys.exit(os.system("signalp -version"))

if len(sys.argv) not in [6, 8]:
    sys.exit(
        "Require five (or 7) arguments, organism, truncate, threads, "
        "input protein FASTA file & output tabular file (plus "
        "optionally cut method and GFF3 output file). "
        "Got %i arguments." % (len(sys.argv) - 1)
    )

organism = sys.argv[1]
if organism not in ["euk", "gram+", "gram-"]:
    sys.exit("Organism argument %s is not one of euk, gram+ or gram-" % organism)

try:
    truncate = int(sys.argv[2])
except ValueError:
    truncate = 0
if truncate < 0:
    sys.exit("Truncate argument %s is not a positive integer (or zero)" % sys.argv[2])

num_threads = thread_count(sys.argv[3], default=4)
fasta_file = sys.argv[4]
tabular_file = sys.argv[5]

if len(sys.argv) == 8:
    cut_method = sys.argv[6]
    if cut_method not in ["NN_Cmax", "NN_Ymax", "NN_Smax", "HMM_Cmax"]:
        sys.exit("Invalid cut method %r" % cut_method)
    gff3_file = sys.argv[7]
else:
    cut_method = None
    gff3_file = None


tmp_dir = tempfile.mkdtemp()


def clean_tabular(raw_handle, out_handle, gff_handle=None):
    """Clean up SignalP output to make it tabular."""
    for line in raw_handle:
        if not line or line.startswith("#"):
            continue
        parts = line.rstrip("\r\n").split()
        assert len(parts) == 21, repr(line)
        assert parts[14].startswith(parts[0]), (
            "Bad entry in SignalP output, ID mismatch:\n%r" % line
        )
        # Remove redundant truncated name column (col 0)
        # and put full name at start (col 14)
        parts = parts[14:15] + parts[1:14] + parts[15:]
        out_handle.write("\t".join(parts) + "\n")


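The column reordering in `clean_tabular` can be checked on one of the raw sample rows quoted in the module docstring. The sketch below is illustrative only: `reorder_columns` is a hypothetical standalone helper, and it raises `ValueError` where the wrapper uses `assert`, on the assumption that checks should survive running Python with `-O`.

```python
def reorder_columns(line):
    """Turn one raw 21-column SignalP short-format line into 20 columns.

    The truncated name (column 0) is dropped and the full name
    (column 14) is moved to the front.
    """
    parts = line.rstrip("\r\n").split()
    if len(parts) != 21:
        raise ValueError("Expected 21 columns, got %i: %r" % (len(parts), line))
    if not parts[14].startswith(parts[0]):
        raise ValueError("ID mismatch in SignalP output: %r" % line)
    return parts[14:15] + parts[1:14] + parts[15:]


if __name__ == "__main__":
    raw = (
        "gi|3298468|dbj|BAA31 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y "
        "0.398 N gi|3298468|dbj|BAA31520.1| Q 0.066 24 N 0.139 N"
    )
    print("\t".join(reorder_columns(raw)))
```

After reordering, index 0 holds the full identifier and index 14 holds the HMM flag column, which is the layout that the column numbers in `make_gff` rely on.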
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 125 def make_gff(fasta_file, tabular_file, gff_file, cut_method): | 
| 20 
a19b3ded8f33
v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
 peterjc parents: 
19diff
changeset | 126 """Make a GFF file.""" | 
| 21 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 127 cut_col, score_col = { | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 128 "NN_Cmax": (2, 1), | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 129 "NN_Ymax": (5, 4), | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 130 "NN_Smax": (8, 7), | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 131 "HMM_Cmax": (16, 15), | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 132 }[cut_method] | 
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 133 | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 134 source = "SignalP" | 
| 19 | 135 strand = "." # not stranded | 
| 136 phase = "." # not phased | |
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 137 tags = "Note=%s" % cut_method | 
| 19 | 138 | 
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 139 tab_handle = open(tabular_file) | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 140 line = tab_handle.readline() | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 141 assert line.startswith("#ID\t"), line | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 142 | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 143 gff_handle = open(gff_file, "w") | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 144 gff_handle.write("##gff-version 3\n") | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 145 | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 146 for (title, seq), line in zip(fasta_iterator(fasta_file), tab_handle): | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 147 parts = line.rstrip("\n").split("\t") | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 148 seqid = parts[0] | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 149 assert title.startswith(seqid), "%s vs %s" % (seqid, title) | 
| 19 | 150 if not seq: | 
| 151 # Is it possible to have a zero length reference in GFF3? | |
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 152 continue | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 153 cut = int(parts[cut_col]) | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 154 if cut == 0: | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 155 assert cut_method == "HMM_Cmax", cut_method | 
| 19 | 156 # TODO - Why does it do this? | 
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 157 cut = 1 | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 158 assert 1 <= cut <= len(seq), "%i for %s len %i" % (cut, seqid, len(seq)) | 
| 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 159 score = parts[score_col] | 
| 21 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 160 gff_handle.write("##sequence-region %s %i %i\n" % (seqid, 1, len(seq))) | 
| 19 | 161 # If the cut is at the very begining, there is no signal peptide! | 
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 162 if cut > 1: | 
| 19 | 163 # signal_peptide = SO:0000418 | 
| 21 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 164 gff_handle.write( | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 165 "%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n" | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 166 % ( | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 167 seqid, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 168 source, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 169 "signal_peptide", | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 170 1, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 171 cut - 1, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 172 score, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 173 strand, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 174 phase, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 175 tags, | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 176 ) | 
| 
238eae32483c
"Check this is up to date with all 2020 changes (black etc)"
 peterjc parents: 
20diff
changeset | 177 ) | 
| 19 | 178 # mature_protein_region = SO:0000419 | 
| 21 | 179 gff_handle.write( | 
| | 180 "%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n" | 
| | 181 % ( | 
| | 182 seqid, | 
| | 183 source, | 
| | 184 "mature_protein_region", | 
| | 185 cut, | 
| | 186 len(seq), | 
| | 187 score, | 
| | 188 strand, | 
| | 189 phase, | 
| | 190 tags, | 
| | 191 ) | 
| | 192 ) | 
| 19 | 193 tab_handle.close() | 
| 7 
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
 peterjc parents: 
5diff
changeset | 194 gff_handle.close() | 
| | 195 | 
| | 196 | 
| 21 | 197 if num_threads == 1: | 
| | 198 # Still want to call split_fasta to apply truncation, but | 
| | 199 # no reason to make multiple files - and more chance of | 
| | 200 # hitting file system glitches if we do. So, | 
| | 201 FASTA_CHUNK = sys.maxsize | 
| | 202 | 
| | 203 fasta_files = split_fasta( | 
| | 204 fasta_file, | 
| | 205 os.path.join(tmp_dir, "signalp"), | 
| | 206 n=FASTA_CHUNK, | 
| | 207 truncate=truncate, | 
| | 208 max_len=MAX_LEN, | 
| | 209 ) | 
| 19 | 210 temp_files = [f + ".out" for f in fasta_files] | 
| 0 
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
 peterjc parents: diff
changeset | 211 assert len(fasta_files) == len(temp_files) | 
| 21 | 212 jobs = [ | 
| | 213 "signalp -short -t %s %s > %s" % (organism, fasta, temp) | 
| | 214 for (fasta, temp) in zip(fasta_files, temp_files) | 
| | 215 ] | 
| 0 | 216 assert len(fasta_files) == len(temp_files) == len(jobs) | 
| | 217 | 
| 19 | 218 | 
| 0 | 219 def clean_up(file_list): | 
| 19 | 220 """Remove temp files, and if possible the temp directory.""" | 
| 0 | 221 for f in file_list: | 
| | 222 if os.path.isfile(f): | 
| | 223 os.remove(f) | 
| 7 | 224 try: | 
| | 225 os.rmdir(tmp_dir) | 
| 19 | 226 except Exception: | 
| 7 | 227 pass | 
| 0 | 228 | 
| 20 
a19b3ded8f33
v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
 peterjc parents: 
19diff
changeset | 229 | 
| 0 | 230 if len(jobs) > 1 and num_threads > 1: | 
| 19 | 231 # A small "info" message for Galaxy to show the user. | 
| 20 | 232 print("Using %i threads for %i tasks" % (min(num_threads, len(jobs)), len(jobs))) | 
| 0 | 233 results = run_jobs(jobs, num_threads) | 
| | 234 assert len(fasta_files) == len(temp_files) == len(jobs) | 
| | 235 for fasta, temp, cmd in zip(fasta_files, temp_files, jobs): | 
| | 236 error_level = results[cmd] | 
| | 237 try: | 
| | 238 output = open(temp).readline() | 
| | 239 except IOError: | 
| 7 | 240 output = "(no output)" | 
| 0 | 241 if error_level or output.lower().startswith("error running"): | 
| 7 | 242 clean_up(fasta_files + temp_files) | 
| 20 | 243 if output: | 
| 21 | 244 sys.stderr.write( | 
| | 245 "One or more tasks failed, e.g. %i from %r gave:\n%s" | 
| | 246 % (error_level, cmd, output) | 
| | 247 ) | 
| 20 | 248 else: | 
| 21 | 249 sys.stderr.write( | 
| | 250 "One or more tasks failed, e.g. %i from %r with no output\n" | 
| | 251 % (error_level, cmd) | 
| | 252 ) | 
| 20 | 253 sys.exit(error_level) | 
| 0 | 254 del results | 
| | 255 | 
| | 256 out_handle = open(tabular_file, "w") | 
| | 257 fields = ["ID"] | 
| 19 | 258 # NN results: | 
| 0 | 259 for name in ["Cmax", "Ymax", "Smax"]: | 
| 19 | 260 fields.extend(["NN_%s_score" % name, "NN_%s_pos" % name, "NN_%s_pred" % name]) | 
| 0 | 261 fields.extend(["NN_Smean_score", "NN_Smean_pred", "NN_D_score", "NN_D_pred"]) | 
| 19 | 262 # HMM results: | 
| 21 | 263 fields.extend( | 
| | 264 [ | 
| | 265 "HMM_type", | 
| | 266 "HMM_Cmax_score", | 
| | 267 "HMM_Cmax_pos", | 
| | 268 "HMM_Cmax_pred", | 
| | 269 "HMM_Sprob_score", | 
| | 270 "HMM_Sprob_pred", | 
| | 271 ] | 
| | 272 ) | 
| 0 | 273 out_handle.write("#" + "\t".join(fields) + "\n") | 
| | 274 for temp in temp_files: | 
| | 275 data_handle = open(temp) | 
| | 276 clean_tabular(data_handle, out_handle) | 
| | 277 data_handle.close() | 
| | 278 out_handle.close() | 
| | 279 | 
| 19 | 280 # GFF3: | 
| 7 | 281 if cut_method: | 
| 11 
99b82a2b1272
Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
 peterjc parents: 
9diff
changeset | 282 make_gff(fasta_file, tabular_file, gff3_file, cut_method) | 
| 7 | 283 | 
| | 284 clean_up(fasta_files + temp_files) | 
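
The two output formats assembled above (the `#`-prefixed tabular header at source lines 257-273, and the paired `signal_peptide` / `mature_protein_region` GFF3 lines at source lines 164-192) can be sketched in isolation as follows. This is a minimal illustration, not the wrapper's own code. The function names `tabular_header` and `gff3_feature_lines`, the `"example_source"` value, the `"."` placeholders for score, strand, phase and tags, and the range check on `cut` are all assumptions made for the example. The values the script actually uses for those columns are set outside this part of the listing.

```python
# Same nine-column, tab-separated template as source line 180.
GFF3_TEMPLATE = "%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n"


def tabular_header():
    """Return the '#'-prefixed header line, built as at source lines 257-273."""
    fields = ["ID"]
    # NN results: score, position and prediction for each of Cmax, Ymax, Smax.
    for name in ["Cmax", "Ymax", "Smax"]:
        fields.extend(["NN_%s_score" % name, "NN_%s_pos" % name, "NN_%s_pred" % name])
    fields.extend(["NN_Smean_score", "NN_Smean_pred", "NN_D_score", "NN_D_pred"])
    # HMM results.
    fields.extend(
        [
            "HMM_type",
            "HMM_Cmax_score",
            "HMM_Cmax_pos",
            "HMM_Cmax_pred",
            "HMM_Sprob_score",
            "HMM_Sprob_pred",
        ]
    )
    return "#" + "\t".join(fields) + "\n"


def gff3_feature_lines(
    seqid,
    seq_len,
    cut,
    source="example_source",  # hypothetical placeholder, not the script's value
    score=".",  # "." is GFF3's marker for an undefined column (assumed here)
    strand=".",
    phase=".",
    tags=".",
):
    """Return the two GFF3 lines for a protein whose mature region starts at `cut`.

    The signal peptide spans 1..cut-1 and the mature protein spans cut..seq_len,
    matching the coordinates written at source lines 170-171 and 185-186.
    """
    # Sanity check added for this sketch only; the original has no such guard here.
    if not 1 < cut <= seq_len:
        raise ValueError("cut must satisfy 1 < cut <= seq_len")
    signal = GFF3_TEMPLATE % (
        seqid, source, "signal_peptide", 1, cut - 1, score, strand, phase, tags
    )
    mature = GFF3_TEMPLATE % (
        seqid, source, "mature_protein_region", cut, seq_len, score, strand, phase, tags
    )
    return [signal, mature]


if __name__ == "__main__":
    print(tabular_header(), end="")
    for line in gff3_feature_lines("prot1", 100, 23):
        print(line, end="")
```

Keeping the coordinate arithmetic in one helper makes the boundary explicit: the two features meet at the cleavage site without overlap, because the first ends at `cut - 1` and the second begins at `cut`.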
