comparison SMART/galaxy/clusterizeBySlidingWindows.xml @ 15:440ceca58672

Uploaded
author m-zytnicki
date Mon, 22 Apr 2013 11:08:07 -0400
parents 769e306b7933
children 94ab73e8a190
14:c79b9ae3f65f 15:440ceca58672
1 <tool id="clusterizeBySlidingWindows" name="clusterize By SlidingWindows"> 1 <tool id="clusterizeBySlidingWindows" name="clusterize By SlidingWindows">
2 <description>Produces a GFF3 file that clusters a list of transcripts using a sliding window. Cluster the data into regions (defined by size and overlap with next region) and keep only highest peaks.</description> 2 <description>Produces a GFF3 file that clusters a list of transcripts using a sliding window. Cluster the data into regions (defined by size and overlap with next region).</description>
3 <command interpreter="python"> 3 <command interpreter="python">
4 ../Java/Python/clusterizeBySlidingWindows.py -i $formatType.inputFileName 4 ../Java/Python/clusterizeBySlidingWindows.py -i $formatType.inputFileName
5 #if $formatType.FormatInputFileName == 'bed': 5 #if $formatType.FormatInputFileName == 'bed':
6 -f bed 6 -f bed
7 #elif $formatType.FormatInputFileName == 'gff': 7 #elif $formatType.FormatInputFileName == 'gff':
71 </conditional> 71 </conditional>
72 72
73 73
74 <param name="size" type="text" value="50000" label="Size option" help="Size of the regions."/> 74 <param name="size" type="text" value="50000" label="Size option" help="Size of the regions."/>
75 <param name="overlap" type="text" value="50" label="Overlap option" help="Overlap between two consecutive regions."/> 75 <param name="overlap" type="text" value="50" label="Overlap option" help="Overlap between two consecutive regions."/>
76 <param name="normalize" type="boolean" truevalue="-m" falsevalue="" checked="false" label="Normalize option for only GFF3 file format" help="This option normalizes (Warning!! Only for GFF3 file!)"/> 76 <param name="normalize" type="boolean" truevalue="-m" falsevalue="" checked="false" label="Normalize option for only GFF3 file format" help="(only work if the tag nbOccurrences is set)"/>
77 <param name="strands" type="boolean" truevalue="-2" falsevalue="" checked="false" label="strands option" help="Consider the two strands separately."/> 77 <param name="strands" type="boolean" truevalue="-2" falsevalue="" checked="false" label="Consider the two strands separately"/>
78 78
79 <conditional name="OptionTag"> 79 <conditional name="OptionTag">
80 <param name="tag" type="select" label="use a given tag as input (instead of summing number of features)"> 80 <param name="tag" type="select" label="Use a given tag as input (instead of summing number of features)">
81 <option value="Yes">Yes</option> 81 <option value="Yes">Yes</option>
82 <option value="No" selected="true">No</option> 82 <option value="No" selected="true">No</option>
83 </param> 83 </param>
84 <when value="Yes"> 84 <when value="Yes">
85 <param name="value" type="text" value="None" label="tag option" help="write a tag name you want to observe."/> 85 <param name="value" type="select" label="tag name"/>
86 </when> 86 </when>
87 <when value="No"> 87 <when value="No">
88 </when> 88 </when>
89 </conditional> 89 </conditional>
90 90
91 91
92 <conditional name="OptionsOperation"> 92 <conditional name="OptionsOperation">
93 <param name="operation" type="select" label="combine tag value with given operation [choice (sum, avg, med, min, max)]"> 93 <param name="operation" type="select" label="combine tag value with given operation">
94 <option value="Yes">Yes</option> 94 <option value="Yes">Yes</option>
95 <option value="No" selected="true">No</option> 95 <option value="No" selected="true">No</option>
96 </param> 96 </param>
97 <when value="Yes"> 97 <when value="Yes">
98 <param name="value" type="text" value="None" label="operation option" help="You can ONLY choose one of fowlling operation : sum, avg, med, min, max."/> 98 <param name="value" type="select" label="operation" help="You can ONLY choose one of following operation : sum, avg, med, min, max.">
99 <option value="sum">sum</option>
100 <option value="avg">average</option>
101 <option value="med">median</option>
102 <option value="min">minimum</option>
103 <option value="max">maximum</option>
104 </param>
99 </when> 105 </when>
100 <when value="No"> 106 <when value="No">
101 </when> 107 </when>
102 </conditional> 108 </conditional>
103 109
112 </when> 118 </when>
113 <when value="No"> 119 <when value="No">
114 </when> 120 </when>
115 </conditional> 121 </conditional>
116 122
117 <param name="strand" type="boolean" truevalue="-2" falsevalue="" checked="false" label="strand option" help="This option considers the two strands separately."/>
118 <param name="plot" type="boolean" truevalue="-p" falsevalue="" checked="false" label="plot option" help="This option creates a png file."/>
119 <param name="excel" type="boolean" truevalue="-x" falsevalue="" checked="false" label="excel option" help="This option creates a csv file."/>
120
121 </inputs> 123 </inputs>
122 124
123 <outputs> 125 <outputs>
124 <data name="outputFileGff" format="gff3"/> 126 <data name="outputFileGff" format="gff3"/>
125 <data name="excelOutput" format="csv">
126 <filter>excel</filter>
127 </data>
128 <data name="plotPng" format="png">
129 <filter>plot</filter>
130 </data>
131 </outputs> 127 </outputs>
128
129 <help>
130 Sliding windows are a convenient way to clusterize data mapped on the genome. There are two important parameters of a sliding window: the size of the window and the size of the overlap.
131
132 By default, sliding windows count the number of reads in each window. However, you can basically merge any information which is contained in the tags. You can compute the average, sum, median, max or min of the tags for each window. For instance, every window can contain the average cluster size, if you merge clusters instead of reads.
133
134 The output file is a GFF3 file, where each element is a window. There is a special tag for each window, whose name is **nbElements** if you counted the number of transcripts per sliding window. However, if you performed a **min** (resp. **max**, **sum**, **median**, **average**) operation on the tags **value** of the transcripts, then the tag of the window will be **minValue** (resp. **maxValue**, **sumValue**, **medValue**, **avgValue**). You can also specify the name of your tag (which is actually advised: **nbReadsInSample1** will always be more informative than **nbElements**).
135
136 You also have different options, which can select the *n* % highest regions, or the regions with at least *n* features in them, or even the regions with at least *n* unique features. This last option is useful when you want to cluster the reads which have mapped only once, for instance.
137 </help>
132 </tool> 138 </tool>
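
The help text added in this revision describes windows defined by a size and an overlap, counting features per window by default or combining a tag value with sum, avg, med, min or max. The sketch below illustrates that idea only; it is not taken from clusterizeBySlidingWindows.py. The function names, the 1-based inclusive coordinates, the reading of the overlap as a number of bases (so that consecutive windows start `size - overlap` apart), and the rule that builds the output tag name from the operation and the input tag are all assumptions made for illustration.

```python
from statistics import mean, median

# Hypothetical mapping from the option values shown in the tool form
# (sum, avg, med, min, max) to Python functions.
OPERATIONS = {"sum": sum, "avg": mean, "med": median, "min": min, "max": max}


def sliding_windows(length, size, overlap):
    """Yield 1-based inclusive (start, end) windows covering 1..length.

    Assumption: overlap is a number of bases shared by two consecutive
    windows, so each window starts (size - overlap) after the previous one.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    start = 1
    while start <= length:
        yield start, min(start + size - 1, length)
        start += step


def summarize(features, length, size, overlap, operation=None, tag="value"):
    """Summarize features per window.

    features: list of (start, end, tag_value) tuples on one sequence.
    Returns a list of (window_start, window_end, tag_name, result).
    Without an operation, features overlapping the window are counted
    (tag nbElements); otherwise their tag values are combined.
    """
    results = []
    for w_start, w_end in sliding_windows(length, size, overlap):
        # A feature belongs to a window if the two intervals intersect.
        values = [v for (s, e, v) in features if s <= w_end and e >= w_start]
        if operation is None:
            results.append((w_start, w_end, "nbElements", len(values)))
        else:
            # Assumed naming rule: "min" on tag "value" gives "minValue".
            name = operation + tag[0].upper() + tag[1:]
            result = OPERATIONS[operation](values) if values else None
            results.append((w_start, w_end, name, result))
    return results
```

Calling `summarize(features, length, size, overlap)` counts features per window, while passing `operation="avg"` combines the tag values instead; empty windows get `None` in this sketch, which may differ from what the actual script reports.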