diff tools/protein_analysis/tmhmm2.py @ 2:6901298ac16c
Migrated tool version 0.0.5 from old tool shed archive to new tool shed repository
author: peterjc
date: Tue, 07 Jun 2011 18:04:39 -0400
parents: 3ff1dcbb9440
children: 9b45a8743100
--- a/tools/protein_analysis/tmhmm2.py	Tue Jun 07 18:04:05 2011 -0400
+++ b/tools/protein_analysis/tmhmm2.py	Tue Jun 07 18:04:39 2011 -0400
@@ -6,14 +6,17 @@
 v2.0 program (not the webservice) requesting the short output (one line per
 protein).
 
-First major feature is cleaning up the tabular output. The raw output from
-TMHMM v2.0 looks like this (six columns tab separated):
+The first major feature is cleaning up the tabular output. The short form raw
+output from TMHMM v2.0 looks like this (six columns tab separated):
 
  gi|2781234|pdb|1JLY|B	len=304	ExpAA=0.01	First60=0.00	PredHel=0	Topology=o
  gi|4959044|gb|AAD34209.1|AF069992_1	len=600	ExpAA=0.00	First60=0.00	PredHel=0	Topology=o
  gi|671626|emb|CAA85685.1|	len=473	ExpAA=0.19	First60=0.00	PredHel=0	Topology=o
  gi|3298468|dbj|BAA31520.1|	len=107	ExpAA=59.37	First60=31.17	PredHel=3	Topology=o23-45i52-74o89-106i
 
+If there are any additional 'comment' lines starting with the hash (#)
+character these are ignored by this script.
+
 In order to make it easier to use in Galaxy, this wrapper script simplifies
 this to remove the redundant tags, and instead adds a comment line at the
 top with the column names:
@@ -55,7 +58,8 @@
     """Clean up tabular TMHMM output, returns output line count."""
     count = 0
     for line in raw_handle:
-        if not line:
+        if not line.strip() or line.startswith("#"):
+            #Ignore any blank lines or comment lines
             continue
         parts = line.rstrip("\r\n").split("\t")
         try:
@@ -82,7 +86,7 @@
 #split_fasta returns an empty list (i.e. zero temp files).
 fasta_files = split_fasta(fasta_file, tabular_file, FASTA_CHUNK)
 temp_files = [f+".out" for f in fasta_files]
-jobs = ["tmhmm %s > %s" % (fasta, temp)
+jobs = ["tmhmm -short %s > %s" % (fasta, temp)
         for fasta, temp in zip(fasta_files, temp_files)]
 
 def clean_up(file_list):
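
For readers who want to see the effect of the comment-skipping change outside the diff, here is a minimal standalone sketch of the cleaning step. It is not the wrapper's actual code: the function name `clean_short_output` and the `EXPECTED_TAGS` tuple are hypothetical, the tag order is taken only from the sample output quoted in the docstring, and the sketch targets Python 3 rather than the Python version the tool used in 2011. It leaves out the header line of column names, since this diff does not show its exact text.

```python
import io

# Assumption: tag order as it appears in the sample output in the docstring.
EXPECTED_TAGS = ("len", "ExpAA", "First60", "PredHel", "Topology")


def clean_short_output(raw_handle):
    """Yield one list of untagged values per protein (hypothetical helper)."""
    for line in raw_handle:
        # Same test as the patched line: skip blank lines and '#' comments.
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.rstrip("\r\n").split("\t")
        if len(parts) != len(EXPECTED_TAGS) + 1:
            raise ValueError("Unexpected column count in line: %r" % line)
        row = [parts[0]]  # first column is the sequence identifier
        for tag, field in zip(EXPECTED_TAGS, parts[1:]):
            prefix = tag + "="
            if not field.startswith(prefix):
                raise ValueError("Expected %s tag in line: %r" % (tag, line))
            row.append(field[len(prefix):])  # drop the redundant 'tag=' part
        yield row


# Usage: one comment line, one blank line, one record from the docstring sample.
sample = io.StringIO(
    "# an illustrative comment line\n"
    "\n"
    "gi|3298468|dbj|BAA31520.1|\tlen=107\tExpAA=59.37\tFirst60=31.17"
    "\tPredHel=3\tTopology=o23-45i52-74o89-106i\n"
)
rows = list(clean_short_output(sample))
print(rows)
```

With the old `if not line:` test, the comment line and the whitespace-only line would both have reached the split step, because a line read from a file handle still carries its newline and so is never an empty string.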