comparison tools/protein_analysis/tmhmm2.xml @ 0:bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
author | peterjc |
---|---|
date | Tue, 07 Jun 2011 18:03:34 -0400 |
parents | |
children | 3ff1dcbb9440 |
-1:000000000000 | 0:bca9bc7fdaef |
---|---|
1 <tool id="tmhmm2" name="TMHMM 2.0" version="0.0.1"> | |
2 <description>Find transmembrane domains in protein sequences</description> | |
3 <command interpreter="python"> | |
4 tmhmm2.py 8 $fasta_file $tabular_file | |
5 ##I want the number of threads to be a Galaxy config option... | |
6 </command> | |
7 <inputs> | |
8 <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/> | |
9 <!-- | |
10 <param name="version" type="select" display="radio" label="Model version"> | |
11 <option value="">Version 1 (old)</option> | |
12 <option value="" selected="True">Version 2 (default)</option> | |
13 </param> | |
14 --> | |
15 </inputs> | |
16 <outputs> | |
17 <data name="tabular_file" format="tabular" label="TMHMM results" /> | |
18 </outputs> | |
19 <requirements> | |
20 <requirement type="binary">tmhmm</requirement> | |
21 </requirements> | |
22 <tests> | |
23 <test> | |
24 <param name="fasta_file" value="four_human_proteins.fasta" ftype="fasta"/> | |
25 <output name="tabular_file" file="four_human_proteins.tmhmm2.tsv" ftype="tabular"/> | |
26 </test> | |
27 </tests> | |
28 <help> | |
29 | |
30 **What it does** | |
31 | |
32 This calls the TMHMM v2.0 tool for prediction of transmembrane (TM) helices in proteins using a hidden Markov model (HMM). | |
33 | |
34 The input is a FASTA file of protein sequences, and the output is tabular with six columns (one row per protein): | |
35 | |
36 1. Sequence identifier | |
37 2. Sequence length | |
38 3. Expected number of amino acids in TM helices (ExpAA). If this number is larger than 18, the protein is very likely to be a transmembrane protein (or to have a signal peptide). | |
39 4. Expected number of amino acids in TM helices in the first 60 amino acids of the protein (Exp60, labeled First60 in the output). If this number is more than a few, be aware that a predicted transmembrane helix in the N-terminal region could be a signal peptide. | |
40 5. Number of transmembrane helices predicted by N-best. | |
41 6. Topology predicted by N-best (encoded as a string using o for outside and i for inside). | |
42 | |
43 Predicted TM segments in the N-terminal region sometimes turn out to be signal peptides. | |
44 | |
45 One of the most common mistakes by the program is to reverse the direction of proteins with one TM segment. | |
46 | |
47 Do not use the program to predict whether a non-membrane protein is cytoplasmic or not. | |
48 | |
49 **Notes** | |
50 | |
51 The raw output from TMHMM v2.0 looks like this (six tab-separated columns): | |
52 | |
53 =================================== ======= =========== ============= ========= ============================= | |
54 gi|2781234|pdb|1JLY|B len=304 ExpAA=0.01 First60=0.00 PredHel=0 Topology=o | |
55 gi|4959044|gb|AAD34209.1|AF069992_1 len=600 ExpAA=0.00 First60=0.00 PredHel=0 Topology=o | |
56 gi|671626|emb|CAA85685.1| len=473 ExpAA=0.19 First60=0.00 PredHel=0 Topology=o | |
57 gi|3298468|dbj|BAA31520.1| len=107 ExpAA=59.37 First60=31.17 PredHel=3 Topology=o23-45i52-74o89-106i | |
58 =================================== ======= =========== ============= ========= ============================= | |
59 | |
60 To make this easier to use in Galaxy, the wrapper script removes the redundant tags and instead adds a comment line at the top with the column names: | |
61 | |
62 =================================== === ===== ======= ======= ==================== | |
63 #ID len ExpAA First60 PredHel Topology | |
64 gi|2781234|pdb|1JLY|B 304 0.01 0.00 0 o | |
65 gi|4959044|gb|AAD34209.1|AF069992_1 600 0.00 0.00 0 o | |
66 gi|671626|emb|CAA85685.1| 473 0.19 0.00 0 o | |
67 gi|3298468|dbj|BAA31520.1| 107 59.37 31.17 3 o23-45i52-74o89-106i | |
68 =================================== === ===== ======= ======= ==================== | |
69 | |
70 **References** | |
71 | |
72 Krogh, Larsson, von Heijne, and Sonnhammer. | |
73 Predicting Transmembrane Protein Topology with a Hidden Markov Model: Application to Complete Genomes. | |
74 J. Mol. Biol. 305:567-580, 2001. | |
75 | |
76 Sonnhammer, von Heijne, and Krogh. | |
77 A hidden Markov model for predicting transmembrane helices in protein sequences. | |
78 In J. Glasgow et al., eds.: Proc. Sixth Int. Conf. on Intelligent Systems for Molecular Biology, pages 175-182. AAAI Press, 1998. | |
79 | |
80 </help> | |
81 </tool> |
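The help text above describes how the wrapper strips the redundant `len=`, `ExpAA=`, `First60=`, `PredHel=` and `Topology=` tags from the raw TMHMM output and adds a header line. The following is a minimal sketch of that reformatting, not the actual contents of `tmhmm2.py`: the names `HEADER`, `TAGS`, `simplify_line` and `parse_topology` are hypothetical and chosen only for illustration. It assumes each raw line has exactly six tab-separated fields in the order shown in the help text. The `parse_topology` helper is an extra illustration of how the topology string might be split into helix coordinates.

```python
import re

# Column header the wrapper is described as adding (hypothetical constant name).
HEADER = "#ID\tlen\tExpAA\tFirst60\tPredHel\tTopology"

# Tag prefixes expected on columns 2 to 6 of the raw TMHMM output.
TAGS = ("len=", "ExpAA=", "First60=", "PredHel=", "Topology=")


def simplify_line(raw_line):
    """Strip the 'tag=' prefixes from one raw six-column TMHMM line."""
    fields = raw_line.rstrip("\n").split("\t")
    if len(fields) != len(TAGS) + 1:
        raise ValueError("Expected 6 tab-separated columns: %r" % raw_line)
    values = [fields[0]]  # the sequence identifier carries no tag
    for tag, field in zip(TAGS, fields[1:]):
        if not field.startswith(tag):
            raise ValueError("Expected prefix %r in %r" % (tag, field))
        values.append(field[len(tag):])
    return "\t".join(values)


def parse_topology(topology):
    """Return (start, end) residue pairs for each helix in a topology string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)-(\d+)", topology)]


if __name__ == "__main__":
    raw = ("gi|3298468|dbj|BAA31520.1|\tlen=107\tExpAA=59.37\t"
           "First60=31.17\tPredHel=3\tTopology=o23-45i52-74o89-106i")
    print(HEADER)
    print(simplify_line(raw))
    print(parse_topology("o23-45i52-74o89-106i"))
```

Raising an error on an unexpected tag, rather than passing the field through unchanged, is a design choice in this sketch: it makes a change in the upstream output format visible instead of silently producing a misaligned table.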