<tool id="pyCalculateMutationFrequencies" name="pyCalculateMutationFrequencies">
  <requirements>
    <requirement type="package">pyCRAC</requirement>
  </requirements>
  <command interpreter="python">
    /usr/local/bin/pyCalculateMutationFrequencies.py
    -r $readdatafile
    -i $intervaldatafile
    -c $addChr.chr
    -o $output
    --mutsfreq $mutsfreq
  </command>
  <version_command>/usr/local/bin/pyCalculateMutationFrequencies.py --version</version_command>
  <inputs>
    <param format="gff" name="readdatafile" type="data" label="GFF Reads File --readdatafile" help="GFF file containing read data" />
    <param format="gtf" name="intervaldatafile" type="data" label="GTF Interval File --intervaldatafile" help="GTF file containing interval coordinates"/>
    <conditional name="addChr">
      <param name="chrfile" type="select" label="Choose chromosome length file from">
        <option value="default" selected="true">Defaults</option>
        <option value="other">History</option>
      </param>
      <when value="default">
        <param name="chr" type="select" label="Chromosome length file -c" help="This file should have two columns: the first column holds the chromosome names, the second column the chromosome lengths. Use the pyCRAC utility pyCalculateChromosomeLengths to create it.">
          <options from_data_table="pycrac_chr"/>
        </param>
      </when>
      <when value="other">
        <param format="tabular" name="chr" type="data" label="Chromosome length file -c" help="This file should have two columns: the first column holds the chromosome names, the second column the chromosome lengths."/>
      </when>
    </conditional>

    <param format="integer" name="mutsfreq" type="integer" label="Minimum mutation frequency --mutsfreq" value="0" size="10" help="Minimum mutation frequency (in percent) that a mutation must reach to be written to the output file">
      <validator type="in_range" min="0" message="Please enter a value >= 0"/>
    </param>
    <param name="label" type="text" format="txt" size="30" value="pyCalculateMutationFrequencies" label="Enter output file label -o" />
  </inputs>
  <outputs>
    <data format="gtf" name="output" label="${label.value}.gtf"/>
  </outputs>
  <help>

.. class:: infomark

**pyCalculateMutationFrequencies**

pyCalculateMutationFrequencies is part of the pyCRAC_ package. It takes an interval file and a pyReadCounters GTF file and calculates (cross-linking induced) mutation frequencies for each interval.
This tool can be used to calculate mutation frequencies for significant intervals (pyCalculateFDRs output file) or over-represented motifs (pyMotif GTF output file).
It expects a pyCRAC GTF count_output_reads.gtf file and a GTF file with the intervals.

For example, this pyCalculateFDRs GTF output file::

    ##gff-version 2
    # generated by pyCalculateFDRs version 0.0.3, Sat Jun 1 21:16:23 2013
    # pyCalculateFDRs.py -f test_count_output_reads.gtf -r 200 -o test_count_output_FDRs_005.gtf -v -m 0.05
    # chromosome feature source start end minimal_coverage strand . attributes
    chrII protein_coding exon 203838 203887 3 + . gene_id "YBL011W"; gene_name "SCT1";
    chrII intergenic_region exon 407669 407708 3 + . gene_id "INT_0_445"; gene_name "INT_0_445";
    chrII intergenic_region exon 585158 585195 2 + . gene_id "INT_0_562"; gene_name "INT_0_562";
    chrII protein_coding exon 372390 372433 4 - . gene_id "YBR067C"; gene_name "TIP1";
    chrII intergenic_region exon 380754 380815 6 - . gene_id "INT_0_431"; gene_name "INT_0_431";
    chrIII protein_coding exon 138001 138044 5 + . gene_id "YCR012W"; gene_name "PGK1";
    chrIII intergenic_region exon 227997 228036 5 + . gene_id "INT_0_885"; gene_name "INT_0_885";
    chrIII intergenic_region exon 227997 228037 4 + . gene_id "INT_0_887"; gene_name "INT_0_887";
    chrIII tRNA exon 227997 228037 4 + . gene_id "tS(CGA)C"; gene_name "SUP61";

will be converted into::

    ##gff-version 2
    # generated by pyCalculateFDRs version 0.0.3, Sat Jun 1 21:16:23 2013
    # /Library/Frameworks/EPD64.framework/Versions/Current/bin/pyCalculateFDRs.py -f test_count_output_reads.gtf -r 200 -o test_count_output_FDRs_005.gtf -v -m 0.05
    # chromosome feature source start end minimal_coverage strand . attributes
    chrII protein_coding exon 203838 203887 3 + . gene_id "YBL011W"; gene_name "SCT1"; # 203882D33.3,203883D33.3,203884D33.3;
    chrII intergenic_region exon 407669 407708 3 + . gene_id "INT_0_445"; gene_name "INT_0_445"; # 407680D33.3,407681D33.3;
    chrII intergenic_region exon 585158 585195 2 + . gene_id "INT_0_562"; gene_name "INT_0_562"; # 585171D100.0,585172D100.0,585173D100.0;
    chrII protein_coding exon 372390 372433 4 - . gene_id "YBR067C"; gene_name "TIP1"; # 372412D50.0,372413D50.0;
    chrII intergenic_region exon 380754 380815 6 - . gene_id "INT_0_431"; gene_name "INT_0_431"; # 380786D90.2,380787D90.2;
    chrIII protein_coding exon 138001 138044 5 + . gene_id "YCR012W"; gene_name "PGK1"; # 138025D40.0,138026D30.0,138027D40.0;
    chrIII intergenic_region exon 227997 228036 5 + . gene_id "INT_0_885"; gene_name "INT_0_885"; # 228006D85.7,228007D100.0;
    chrIII intergenic_region exon 227997 228037 4 + . gene_id "INT_0_887"; gene_name "INT_0_887"; # 228006D85.7,228007D100.0;
    chrIII tRNA exon 227997 228037 4 + . gene_id "tS(CGA)C"; gene_name "SUP61"; # 228006D85.7,228007D100.0;

The comment after the hash character (#) at the end of each line lists the chromosomal coordinates of the mutated nucleotides within the interval, together with their mutation frequencies.

For example::

    # 228007D100.0

indicates that, within the interval, the nucleotide at position 228007 was deleted in 100% of the reads.

The --mutsfreq flag sets the lowest mutation frequency that will be reported.
This makes it relatively easy to select those significant regions that contain nucleotides with high mutation frequencies.

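The annotation format and the frequency threshold described above can be sketched in a few lines of Python. This is only an illustration for post-processing the output file and is not part of pyCRAC: the names ``parse_mutations`` and ``filter_by_frequency`` are hypothetical, and the pattern assumes every annotation has the form position, one-letter mutation code, percentage, as in the example output (only the deletion code ``D`` appears there).

```python
import re

# One annotation looks like "228007D100.0": a position, a one-letter
# mutation code ("D" marks a deletion in the example output) and a
# percentage. Other codes are assumed to follow the same layout.
ANNOTATION = re.compile(r"(\d+)([A-Za-z])(\d+(?:\.\d+)?)")


def parse_mutations(gtf_line):
    """Return (position, code, frequency) tuples from the trailing '#' comment."""
    # Everything after the last '#' is treated as the mutation annotation.
    _, sep, comment = gtf_line.rpartition("#")
    if not sep:
        return []
    return [(int(pos), code, float(freq))
            for pos, code, freq in ANNOTATION.findall(comment)]


def filter_by_frequency(mutations, min_freq=0.0):
    """Keep mutations at or above min_freq percent (the idea behind --mutsfreq)."""
    return [m for m in mutations if m[2] >= min_freq]


line = ('chrIII protein_coding exon 138001 138044 5 + . '
        'gene_id "YCR012W"; gene_name "PGK1"; '
        '# 138025D40.0,138026D30.0,138027D40.0;')
mutations = parse_mutations(line)
print(filter_by_frequency(mutations, min_freq=40))
```

With a threshold of 40, the sketch keeps positions 138025 and 138027 and drops position 138026, whose frequency is 30.0.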

.. _pyCRAC: http://sandergranneman.bio.ed.ac.uk/Granneman_Lab/pyCRAC_software.html

------

**Parameter list**

Options::

    -i intervals.gtf, --intervaldatafile=intervals.gtf
                        Provide the path to your GTF interval data file.
    -r reads.gtf, --readdatafile=reads.gtf
                        Provide the path to your GTF read data file.
    -c yeast.txt, --chromfile=yeast.txt
                        Location of the chromosome info file. This file should
                        have two columns: the first column holds the names of
                        the chromosomes, the second column the lengths of the
                        chromosomes. Default is yeast.
    -o intervals_with_muts.gtf, --output_file=intervals_with_muts.gtf
                        Provide a name for an output file. By default it
                        writes to the standard output.
    --mutsfreq=10, --mutationfrequency=10
                        Sets the minimum mutation frequency that a mutation
                        must reach to be written to the output file.
                        Default = 0%. Example: if mutsfreq is set to 10 and
                        an interval position is mutated in fewer than 10% of
                        the reads, then the mutation will not be reported.

  </help>
</tool>