annotate tools/protein_analysis/promoter2.py @ 18:eb6ac44d4b8e draft
Suite v0.2.8, record Promoter 2 verion + misc internal updates
author | peterjc |
---|---|
date | Tue, 01 Sep 2015 09:56:36 -0400 |
parents | e52220a9ddad |
children | f3ecd80850e2 |
rev | line source |
---|---|
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
#!/usr/bin/env python
"""Wrapper for Promoter 2.0 for use in Galaxy.

This script takes exactly three command line arguments:
 * number of threads
 * an input DNA FASTA filename
 * output tabular filename.

It calls the Promoter 2.0 binary (e.g. .../promoter-2.0/bin/promoter_Linux),
bypassing the Perl wrapper script 'promoter' which imposes a significant
performance overhead for no benefit here (we don't need HTML output for
example).

The main feature is that this Python wrapper script parses the bespoke
tabular output from Promoter 2.0 and reformats it into a Galaxy friendly
tab separated table.

Additionally, in order to take advantage of multiple cores the input FASTA
file is broken into chunks and multiple copies of promoter run at once.
This can be used in combination with the job-splitting available in Galaxy.

Note that rewriting the FASTA input file allows us to avoid a bug in
promoter 2 with long descriptions in the FASTA header line (over 200
characters) which produces stray fragments of the description in the
output file, making parsing non-trivial.

TODO - Automatically extract the sequence containing a promoter prediction?
"""
import sys
import os
import commands
import tempfile
from seq_analysis_utils import sys_exit, split_fasta, run_jobs, thread_count

FASTA_CHUNK = 500

if "-v" in sys.argv or "--version" in sys.argv:
    sys.exit(os.system("promoter -V"))

if len(sys.argv) != 4:
    sys_exit("Require three arguments, number of threads (int), input DNA FASTA file & output tabular file. "
             "Got %i arguments." % (len(sys.argv)-1))

#Argument order: number of threads, input FASTA filename, output tabular filename.
num_threads = thread_count(sys.argv[1], default=4)
fasta_file = os.path.abspath(sys.argv[2])
tabular_file = os.path.abspath(sys.argv[3])

tmp_dir = tempfile.mkdtemp()

def get_path_and_binary():
    """Return the Promoter 2.0 directory (PROM) and the platform specific binary."""
    platform = commands.getoutput("uname") #e.g. Linux
    shell_script = commands.getoutput("which promoter")
    if not os.path.isfile(shell_script):
        sys_exit("ERROR: Missing promoter executable shell script")
    path = None
    #Look for the PROM setting inside the 'promoter' wrapper script.
    for line in open(shell_script):
        if line.startswith("setenv"): #could then be tab or space!
            parts = line.rstrip().split(None, 2)
            if parts[0] == "setenv" and parts[1] == "PROM":
                path = parts[2]
    if not path:
        sys_exit("ERROR: Could not find promoter path (PROM) in %r" % shell_script)
    if not os.path.isdir(path):
        sys_exit("ERROR: %r is not a directory" % path)
    bin = "%s/bin/promoter_%s" % (path, platform)
    if not os.path.isfile(bin):
        sys_exit("ERROR: Missing promoter binary %r" % bin)
    return path, bin

def make_tabular(raw_handle, out_handle):
    """Parse text output into tabular, return query count."""
    identifier = None
    queries = 0
    for line in raw_handle:
        #print repr(line)
        if not line.strip() or line == "Promoter prediction:\n":
            pass
        elif line[0] != " ":
            identifier = line.strip().replace("\t", " ").split(None,1)[0]
            queries += 1
        elif line == " No promoter predicted\n":
            #End of a record
            identifier = None
        elif line == " Position Score Likelihood\n":
            assert identifier
        else:
            try:
                position, score, likelihood = line.strip().split(None,2)
            except ValueError:
                print "WARNING: Problem with line: %r" % line
                continue
                #sys_exit("ERROR: Problem with line: %r" % line)
            if likelihood not in ["ignored",
                                  "Marginal prediction",
                                  "Medium likely prediction",
                                  "Highly likely prediction"]:
                sys_exit("ERROR: Problem with line: %r" % line)
            out_handle.write("%s\t%s\t%s\t%s\n" % (identifier, position, score, likelihood))
    return queries

working_dir, bin = get_path_and_binary()

if not os.path.isfile(fasta_file):
    sys_exit("ERROR: Missing input FASTA file %r" % fasta_file)

#Note that if the input FASTA file contains no sequences,
#split_fasta returns an empty list (i.e. zero temp files).
#We deliberately omit the FASTA descriptions to avoid a
#bug in promoter2 with descriptions over 200 characters.
fasta_files = split_fasta(fasta_file, os.path.join(tmp_dir, "promoter"), FASTA_CHUNK, keep_descr=False)
temp_files = [f+".out" for f in fasta_files]
jobs = ["%s %s > %s" % (bin, fasta, temp)
        for fasta, temp in zip(fasta_files, temp_files)]

def clean_up(file_list):
    for f in file_list:
        if os.path.isfile(f):
            os.remove(f)
    try:
        os.rmdir(tmp_dir)
    except OSError:
        #Ignore failure to remove the temp directory (e.g. if not empty).
        pass

9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
124 if len(jobs) > 1 and num_threads > 1: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
125 #A small "info" message for Galaxy to show the user. |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
126 print "Using %i threads for %i tasks" % (min(num_threads, len(jobs)), len(jobs)) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
127 cur_dir = os.path.abspath(os.curdir) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
128 os.chdir(working_dir) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
129 results = run_jobs(jobs, num_threads) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
130 os.chdir(cur_dir) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
131 for fasta, temp, cmd in zip(fasta_files, temp_files, jobs): |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
132 error_level = results[cmd] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
133 if error_level: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
134 try: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
135 output = open(temp).readline() |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
136 except IOError: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
137 output = "" |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
diff
changeset
|
138 clean_up(fasta_files + temp_files) |
18
eb6ac44d4b8e
Suite v0.2.8, record Promoter 2 verion + misc internal updates
peterjc
parents:
9
diff
changeset
|
139 sys_exit("One or more tasks failed, e.g. %i from %r gave:\n%s" % (error_level, cmd, output), |
7
9b45a8743100
|
140 error_level) |
141 |
142 del results |
143 del jobs |
144 |
145 out_handle = open(tabular_file, "w") |
8
976a5f2833cd
Uploaded v0.1.1 of the bundle, which fixes an error in the header of the tabular output produced for Promoter 2.0
peterjc
parents:
7
diff
changeset
|
146 out_handle.write("#Identifier\tPosition\tScore\tLikelihood\n") |
7
9b45a8743100
|
147 queries = 0 |
148 for temp in temp_files: |
149 data_handle = open(temp) |
150 count = make_tabular(data_handle, out_handle) |
151 data_handle.close() |
152 if not count: |
153 clean_up(fasta_files + temp_files) |
18
eb6ac44d4b8e
|
154 sys_exit("No output from promoter2") |
7
9b45a8743100
|
155 queries += count |
156 out_handle.close() |
157 |
158 clean_up(fasta_files + temp_files) |
159 print "Results for %i queries" % queries |