<tool id="categorize_elements_satisfying_criteria" name="Categorize Elements" version="1.0.0">
  <description>satisfying criteria</description>

  <command interpreter="perl">
    categorize_elements_satisfying_criteria.pl $inputFile1 $inputFile2 $outputFile1
  </command>

  <inputs>
    <param format="tabular" name="inputFile1" type="data" label="Select file containing categories and their elements"/>
    <param format="tabular" name="inputFile2" type="data" label="Select file containing criteria and elements data"/>
  </inputs>

  <outputs>
    <data format="tabular" name="outputFile1"/>
  </outputs>

  <tests>
    <test>
      <param name="inputFile1" value="categories.tabular" ftype="tabular" />
      <param name="inputFile2" value="criteria_elements_data.tabular" ftype="tabular" />
      <output name="outputFile1" file="categorized_elements.tabular" />
    </test>
  </tests>

  <help>

.. class:: infomark

**What it does**

The program takes as input a set of categories, each containing one or more elements. It also takes a table relating elements to criteria, in which each element is assigned a number representing the number of times that element satisfies a given criterion.

- The first input is a TABULAR format file in which the left column holds the names of the categories and all other columns hold the names of the elements in each category.
- The second input is a TABULAR format file relating elements to criteria, in which the first line holds the names of the criteria and the left column holds the names of the elements.
- The output is a TABULAR format file relating categories to criteria, in which each category is assigned a number representing the total number of times its elements satisfy a given criterion. Each category is assigned as many numbers as there are criteria.


**Example**

Let the first input file be a group of motif categories as follows::

    Deletion_Hotspots deletionHoptspot1 deletionHoptspot2 deletionHoptspot3
    Dna_Pol_Pause_Frameshift dnaPolPauseFrameshift1 dnaPolPauseFrameshift2 dnaPolPauseFrameshift3 dnaPolPauseFrameshift4
    Indel_Hotspots indelHotspot1
    Insertion_Hotspots insertionHotspot1 insertionHotspot2
    Topoisomerase_Cleavage_Sites topoisomeraseCleavageSite1 topoisomeraseCleavageSite2 topoisomeraseCleavageSite3


And let the second input file represent the number of times each motif occurs in a certain window size of indel flanking regions, as follows::

                                10bp  20bp  40bp
    deletionHoptspot1           1     1     2
    deletionHoptspot2           1     1     1
    deletionHoptspot3           0     0     0
    dnaPolPauseFrameshift1      1     1     1
    dnaPolPauseFrameshift2      0     2     1
    dnaPolPauseFrameshift3      0     0     0
    dnaPolPauseFrameshift4      0     1     2
    indelHotspot1               0     0     0
    insertionHotspot1           0     0     1
    insertionHotspot2           1     1     1
    topoisomeraseCleavageSite1  1     1     1
    topoisomeraseCleavageSite2  1     2     1
    topoisomeraseCleavageSite3  0     0     2

Running the program will give the total number of times the motifs of each category occur in every window size of indel flanking regions::

                                  10bp  20bp  40bp
    Deletion_Hotspots             2     2     3
    Dna_Pol_Pause_Frameshift      1     4     4
    Indel_Hotspots                0     0     0
    Insertion_Hotspots            1     1     2
    Topoisomerase_Cleavage_Sites  2     3     4

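The aggregation shown in the example can be sketched in Python as follows. This is only a minimal illustration of the summing step, not the Perl script's actual implementation: the function name ``categorize`` and the dictionary layout are hypothetical, and treating an element that is missing from the second file as a row of zeros is an assumption.

```python
# Hypothetical sketch; the real tool is a Perl script whose internals are not shown here.

def categorize(categories, counts, criteria):
    """Sum per-element counts into per-category totals, one total per criterion."""
    totals = {}
    for category, elements in categories.items():
        sums = [0] * len(criteria)
        for element in elements:
            # Assumption: an element absent from the counts table contributes zeros.
            row = counts.get(element, [0] * len(criteria))
            sums = [s + c for s, c in zip(sums, row)]
        totals[category] = sums
    return totals


# A subset of the example data from the two input files.
criteria = ["10bp", "20bp", "40bp"]
categories = {
    "Indel_Hotspots": ["indelHotspot1"],
    "Insertion_Hotspots": ["insertionHotspot1", "insertionHotspot2"],
}
counts = {
    "indelHotspot1": [0, 0, 0],
    "insertionHotspot1": [0, 0, 1],
    "insertionHotspot2": [1, 1, 1],
}

totals = categorize(categories, counts, criteria)
for category, sums in totals.items():
    # One tab-separated line per category, as in the tabular output.
    print(category, *sums, sep="\t")
```

For the two categories used here, the totals match the corresponding rows of the example output: 0, 0, 0 for Indel_Hotspots and 1, 1, 2 for Insertion_Hotspots.
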
  </help>

</tool>