
<!--
# =====================================================
# $Id: ProteinDigestor.xml 90 2011-01-19 13:20:31Z pieter.neerincx@gmail.com $
# $URL: https://trac.nbic.nl/svn/galaxytools/trunk/tools/general/FastaTools/ProteinDigestor.xml $
# $LastChangedDate: 2011-01-19 07:20:31 -0600 (Wed, 19 Jan 2011) $
# $LastChangedRevision: 90 $
# $LastChangedBy: pieter.neerincx@gmail.com $
# =====================================================
-->
<tool id="ProteinDigestor1" name="ProteinDigestor" version="2.0">
  <description>In silico digestion of proteins into peptides</description>
  <command interpreter="perl">ProteinDigestor.pl -i $input -o $output -r '$rule' -l ERROR</command>
  <inputs>
    <param format="fasta" name="input" type="data" label="Proteins to digest" help="in FASTA format."/>
    <param name="rule" type="select" label="Protease">
      <!--<label>Protease</label>-->
      <option value="c:K,R:P=1">Trypsin</option>
      <option value="c:K,R">Trypsin/P</option>
      <option value="c:F,L,W,Y:P=1">Chymotrypsin</option>
      <option value="c:F,L,W,Y">Chymotrypsin/P</option>
      <option value="c:E,Z:P=1">V8_E</option>
      <option value="c:E,D,B,Z:P=1">V8_DE</option>
      <option value="c:K:P=1">LysC</option>
      <option value="c:K">LysC/P</option>
      <option value="n:K">LysN/P</option>
    </param>
  </inputs>
  <outputs>
    <data format="tabular" name="output" label="${rule.value_label} digest on ${input.name}"/>
  </outputs>
  <!--
  <tests>
    <test>
      <param name="input"       value="" />
      <output name="output"     file="" />
    </test>
  </tests>
  -->
  <help>

.. class:: infomark

**What it does**

This tool digests protein sequences from a FASTA file into peptides.
In addition, the following statistics are calculated for each peptide:

 * Molecular weight in Da (MW).
 * Isoelectric point (pI).
 * Net charge, assuming K, R and H each add +1 and D and E each add -1 to the net charge of the peptide.
 * The number of potential phosphorylation sites (S, T or Y amino acids).
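
The net charge and phosphorylation site counts described above can be sketched in a few lines of Python. This is only an illustration of the stated counting rules, not the code used by ProteinDigestor.pl; the function names ``net_charge`` and ``phospho_sites`` are hypothetical.

```python
def net_charge(peptide):
    """Net charge under the simple rule: K, R, H count +1; D, E count -1."""
    positive = sum(peptide.count(aa) for aa in "KRH")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


def phospho_sites(peptide):
    """Count the potential phosphorylation sites (S, T and Y) separately."""
    return {aa: peptide.count(aa) for aa in "STY"}


if __name__ == "__main__":
    print(net_charge("ASHLLCSSTR"))     # H and R each add +1
    print(phospho_sites("ASHLLCSSTR"))  # three S, one T, no Y
```

Under these rules the peptide ASHLLCSSTR has a net charge of 2 and contains three S, one T and no Y residues.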

**Getting input data**

This tool requires protein sequences in FASTA format.
If your protein sequences are not in FASTA format, you will have to convert them first.
For sequences in tab-delimited files, for example, you can use the *FASTA manipulation* -> *Tabular-to-FASTA* tool.

-----

**Example**

If, for example, the protein sequences in FASTA format are::

   >FamousDatabase:Q42593Z L-ascorbate peroxidase T, chloroplastic;
   MSVSLSAKASHLLCSSTRVSLSPAVTSSSSSPVVRPALSSSTS*
   >FamousDatabase:A0MQ79Z Ascorbate peroxidase;
   MVKPNYPVVSEEYLIAVDKAKKKLRGFIAEKNCAPLMLRL*

and you perform an *in silico* digest with Trypsin, the result will look like this::

   Protein ID               Peptide                     MW         pI                  Charge   number S   number T   number Y
   FamousDatabase:Q42593Z   ASHLLCSSTR                  1074.225   8.30219268798828    2        3          1          0
   FamousDatabase:Q42593Z   VSLSPAVTSSSSSPVVRPALSSSTS   2390.61    9.72000885009766    1        11         2          0
   FamousDatabase:Q42593Z   MSVSLSAK                    821.995    8.50011444091797    1        3          0          0
   FamousDatabase:A0MQ79Z   NCAPLMLR                    917.175    8.24974822998047    1        0          0          0
   FamousDatabase:A0MQ79Z   K                           146.19     8.75005340576172    1        0          0          0
   FamousDatabase:A0MQ79Z   K                           146.19     8.75005340576172    1        0          0          0
   FamousDatabase:A0MQ79Z   LR                          287.375    9.75002288818359    1        0          0          0
   FamousDatabase:A0MQ79Z   GFIAEK                      663.775    6.00136566162109    0        0          0          0
   FamousDatabase:A0MQ79Z   L                           131.18     5.52498626708984    0        0          0          0
   FamousDatabase:A0MQ79Z   AK                          217.265    8.79502105712891    1        0          0          0
   FamousDatabase:A0MQ79Z   MVKPNYPVVSEEYLIAVDK         2194.59    4.67711639404297   -1        1          0          2
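
The peptides in the example above are consistent with cleaving on the C-terminal side of K or R unless the next residue is P. A minimal Python sketch of that rule follows. It is not the implementation in ProteinDigestor.pl; the function ``digest`` and its parameters ``cut_after`` and ``blocked_by`` are hypothetical names chosen for illustration.

```python
def digest(sequence, cut_after="KR", blocked_by="P"):
    """Cleave C-terminal of any residue in cut_after, except when the
    following residue is in blocked_by (assumed Trypsin behaviour)."""
    sequence = sequence.rstrip("*")  # drop the trailing stop marker, if any
    peptides = []
    start = 0
    # The last residue has no successor, so it never creates a cut site.
    for i in range(len(sequence) - 1):
        if sequence[i] in cut_after and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if sequence[start:]:
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    for peptide in digest("MVKPNYPVVSEEYLIAVDKAKKKLRGFIAEKNCAPLMLRL*"):
        print(peptide)
```

The sketch returns peptides in sequence order, so the order differs from the tool output shown above. Passing ``blocked_by=""`` would mimic the /P variants, which cut regardless of the following residue.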


========================================================
*Need to digest with another protease?*
========================================================

Contact your local bioinformaticians to add other proteases...

  </help>
</tool>
author	galaxyp
date	Fri, 10 May 2013 17:15:08 -0400