annotate tools/protein_analysis/promoter2.py @ 23:e1996f0f4e85 draft default tip
"v0.2.13 - Python 3 fix for raising StopIteration"
author | peterjc |
---|---|
date | Thu, 17 Jun 2021 17:59:33 +0000 |
parents | 238eae32483c |
children |
rev | changeset | parents | description |
---|---|---|---|
7 | 9b45a8743100 | | Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers. |
8 | 976a5f2833cd | 7 | Uploaded v0.1.1 of the bundle, which fixes an error in the header of the tabular output produced for Promoter 2.0 |
18 | eb6ac44d4b8e | 9 | Suite v0.2.8, record Promoter 2 verion + misc internal updates |
20 | a19b3ded8f33 | 19 | v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes |
21 | 238eae32483c | 20 | "Check this is up to date with all 2020 changes (black etc)" |

rev | line source
---|---
  7 | #!/usr/bin/env python
  7 | """Wrapper for Promoter 2.0 for use in Galaxy.
  7 |
  7 | This script takes exactly three command line arguments:
  7 |  * number of threads
  7 |  * an input DNA FASTA filename
  7 |  * output tabular filename.
  7 |
  7 | It calls the Promoter 2.0 binary (e.g. .../promoter-2.0/bin/promoter_Linux),
  7 | bypassing the Perl wrapper script 'promoter' which imposes a significant
  7 | performance overhead for no benefit here (we don't need HTML output for
  7 | example).
  7 |
  7 | The main feature is that this Python wrapper script parses the bespoke
  7 | tabular output from Promoter 2.0 and reformats it into a Galaxy-friendly
  7 | tab-separated table.
  7 |
  7 | Additionally, in order to take advantage of multiple cores the input FASTA
  7 | file is broken into chunks and multiple copies of promoter run at once.
  7 | This can be used in combination with the job-splitting available in Galaxy.
  7 | Note that rewriting the FASTA input file allows us to avoid a bug in
  7 | promoter 2 with long descriptions in the FASTA header line (over 200
  7 | characters) which produces stray fragments of the description in the
  7 | output file, making parsing non-trivial.
  7 |
  7 | TODO - Automatically extract the sequence containing a promoter prediction?
  7 | """
 20 |
 20 | from __future__ import print_function
 20 |
  7 | import commands  # Python 2 only; subprocess.getoutput is the Python 3 equivalent
 20 | import os
 20 | import sys
  7 | import tempfile
 20 |
 20 | from seq_analysis_utils import run_jobs, split_fasta, thread_count
  7 |
  7 | FASTA_CHUNK = 500
  7 |
 18 | if "-v" in sys.argv or "--version" in sys.argv:
 18 |     sys.exit(os.system("promoter -V"))
 18 |
  7 | if len(sys.argv) != 4:
 21 |     sys.exit(
 21 |         "Require three arguments, number of threads (int), input DNA FASTA "
 21 |         "file & output tabular file. Got %i arguments." % (len(sys.argv) - 1)
 21 |     )
  7 |
 19 | num_threads = thread_count(sys.argv[1], default=4)
  7 | fasta_file = os.path.abspath(sys.argv[2])
  7 | tabular_file = os.path.abspath(sys.argv[3])
  7 |
  7 | tmp_dir = tempfile.mkdtemp()
  7 |
 19 |
  7 | def get_path_and_binary():
 20 |     """Determine path and binary names for promoter tool."""
 19 |     platform = commands.getoutput("uname")  # e.g. Linux
  7 |     shell_script = commands.getoutput("which promoter")
  7 |     if not os.path.isfile(shell_script):
 19 |         sys.exit("ERROR: Missing promoter executable shell script")
  7 |     path = None
  7 |     for line in open(shell_script):
 19 |         if line.startswith("setenv"):  # could then be tab or space!
  7 |             parts = line.rstrip().split(None, 2)
  7 |             if parts[0] == "setenv" and parts[1] == "PROM":
  7 |                 path = parts[2]
  7 |     if not path:
 19 |         sys.exit("ERROR: Could not find promoter path (PROM) in %r" % shell_script)
  7 |     if not os.path.isdir(path):
 19 |         sys.exit("ERROR: %r is not a directory" % path)
  7 |     bin = "%s/bin/promoter_%s" % (path, platform)
  7 |     if not os.path.isfile(bin):
 19 |         sys.exit("ERROR: Missing promoter binary %r" % bin)
  7 |     return path, bin
  7 |
 19 |
  7 | def make_tabular(raw_handle, out_handle):
  7 |     """Parse text output into tabular, return query count."""
  7 |     identifier = None
  7 |     queries = 0
  7 |     for line in raw_handle:
 20 |         # print(repr(line))
  7 |         if not line.strip() or line == "Promoter prediction:\n":
  7 |             pass
  7 |         elif line[0] != " ":
 19 |             identifier = line.strip().replace("\t", " ").split(None, 1)[0]
  7 |             queries += 1
  7 |         elif line == " No promoter predicted\n":
 19 |             # End of a record
  7 |             identifier = None
  7 |         elif line == " Position Score Likelihood\n":
  7 |             assert identifier
  7 |         else:
  7 |             try:
 19 |                 position, score, likelihood = line.strip().split(None, 2)
  7 |             except ValueError:
 20 |                 print("WARNING: Problem with line: %r" % line)
  7 |                 continue
 19 |                 # sys.exit("ERROR: Problem with line: %r" % line)
 21 |             if likelihood not in [
 21 |                 "ignored",
 21 |                 "Marginal prediction",
 21 |                 "Medium likely prediction",
 21 |                 "Highly likely prediction",
 21 |             ]:
 19 |                 sys.exit("ERROR: Problem with line: %r" % line)
 21 |             out_handle.write(
 21 |                 "%s\t%s\t%s\t%s\n" % (identifier, position, score, likelihood)
 21 |             )
  7 |     return queries
 19 |
 20 |
  7 | working_dir, bin = get_path_and_binary()
  7 |
  7 | if not os.path.isfile(fasta_file):
 19 |     sys.exit("ERROR: Missing input FASTA file %r" % fasta_file)
  7 |
 19 | # Note that if the input FASTA file contains no sequences,
 19 | # split_fasta returns an empty list (i.e. zero temp files).
 19 | # We deliberately omit the FASTA descriptions to avoid a
 19 | # bug in promoter2 with descriptions over 200 characters.
 21 | fasta_files = split_fasta(
 21 |     fasta_file, os.path.join(tmp_dir, "promoter"), FASTA_CHUNK, keep_descr=False
 21 | )
 19 | temp_files = [f + ".out" for f in fasta_files]
 21 | jobs = [
 21 |     "%s %s > %s" % (bin, fasta, temp) for fasta, temp in zip(fasta_files, temp_files)
 21 | ]
  7 |
 19 |
  7 | def clean_up(file_list):
  7 |     for f in file_list:
  7 |         if os.path.isfile(f):
  7 |             os.remove(f)
  7 |     try:
  7 |         os.rmdir(tmp_dir)
 19 |     except Exception:
  7 |         pass
  7 |
 20 |
  7 | if len(jobs) > 1 and num_threads > 1:
 19 |     # A small "info" message for Galaxy to show the user.
 20 |     print("Using %i threads for %i tasks" % (min(num_threads, len(jobs)), len(jobs)))
  7 | cur_dir = os.path.abspath(os.curdir)
  7 | os.chdir(working_dir)
  7 | results = run_jobs(jobs, num_threads)
  7 | os.chdir(cur_dir)
  7 | for fasta, temp, cmd in zip(fasta_files, temp_files, jobs):
  7 |     error_level = results[cmd]
  7 |     if error_level:
  7 |         try:
  7 |             output = open(temp).readline()
  7 |         except IOError:
  7 |             output = ""
  7 |         clean_up(fasta_files + temp_files)
 21 |         sys.exit(
 21 |             "One or more tasks failed, e.g. %i from %r gave:\n%s"
 21 |             % (error_level, cmd, output)
 21 |         )
  7 |
  7 | del results
  7 | del jobs
  7 |
  7 | out_handle = open(tabular_file, "w")
  8 | out_handle.write("#Identifier\tPosition\tScore\tLikelihood\n")
  7 | queries = 0
  7 | for temp in temp_files:
  7 |     data_handle = open(temp)
  7 |     count = make_tabular(data_handle, out_handle)
  7 |     data_handle.close()
  7 |     if not count:
  7 |         clean_up(fasta_files + temp_files)
 19 |         sys.exit("No output from promoter2")
  7 |     queries += count
  7 | out_handle.close()
  7 |
  7 | clean_up(fasta_files + temp_files)
 20 | print("Results for %i queries" % queries)
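
The record-by-record parsing done by make_tabular above can be tried in isolation. The following is a minimal sketch, not the wrapper's actual code: the generator name promoter_to_rows and the SAMPLE text are hypothetical, and the sample is hand-written to match the layout the parser expects rather than captured from a real Promoter 2.0 run. Unlike make_tabular, it matches the header and "No promoter predicted" lines after stripping whitespace, so it does not depend on the exact column spacing of the tool's output.

```python
import io

# Likelihood labels accepted by the wrapper's make_tabular function.
LIKELIHOODS = {
    "ignored",
    "Marginal prediction",
    "Medium likely prediction",
    "Highly likely prediction",
}


def promoter_to_rows(raw_handle):
    """Yield (identifier, position, score, likelihood) string tuples.

    Hypothetical helper for illustration only; it mirrors the branching
    in make_tabular but compares whitespace-stripped lines.
    """
    identifier = None
    for line in raw_handle:
        stripped = line.strip()
        if not stripped or stripped == "Promoter prediction:":
            continue  # blank line or banner
        if not line.startswith(" "):
            # An unindented line starts a new record; keep the first word.
            identifier = stripped.split(None, 1)[0]
        elif stripped == "No promoter predicted":
            identifier = None  # end of a record with no hits
        elif stripped.split() == ["Position", "Score", "Likelihood"]:
            continue  # column header of a record with hits
        else:
            position, score, likelihood = stripped.split(None, 2)
            if likelihood not in LIKELIHOODS:
                raise ValueError("Unexpected likelihood in line: %r" % line)
            yield identifier, position, score, likelihood


# Hand-written sample in the assumed layout (values are made up).
SAMPLE = (
    "Promoter prediction:\n"
    "\n"
    "seq1\n"
    "\n"
    "  Position  Score  Likelihood\n"
    "      600  1.063  Highly likely prediction\n"
    "     1200  0.548  Marginal prediction\n"
    "\n"
    "seq2\n"
    "\n"
    "  No promoter predicted\n"
)

rows = list(promoter_to_rows(io.StringIO(SAMPLE)))
for row in rows:
    print("\t".join(row))
```

Yielding tuples instead of writing to a handle keeps the sketch easy to test. The wrapper itself writes each row straight to the output file and counts queries, which this sketch does not do.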