comparison fetchflank.xml @ 2:d5ed5c2e25c3 draft

Uploaded
author arkarachai-fungtammasan
date Wed, 22 Apr 2015 12:48:40 -0400
parents 07588b899c13
children
1:f2bab38e3cbd 2:d5ed5c2e25c3
1 <tool id="fetchflank" name="Fetch flanking bases" version="1.0.0"> 1 <tool id="fetchflank" name="Fetch bases flanking" version="1.0.0">
2 <description> of microsatellites and output as two fastq files in forward-forward orientation</description> 2 <description> the STRs in the reads and output two fastq files in forward-forward orientation</description>
3 <command interpreter="python">pair_fetch_DNA_ff.py $microsat_in_read $Leftflanking $Rightflanking $qualitycutoff $lengthofbasetocheckquality </command> 3 <command interpreter="python">pair_fetch_DNA_ff.py $microsat_in_read $Leftflanking $Rightflanking $qualitycutoff $lengthofbasetocheckquality </command>
4 4
5 <inputs> 5 <inputs>
6 <param name="microsat_in_read" type="data" label="Select data of microsatellites in reads" /> 6 <param name="microsat_in_read" type="data" label="Select data of microsatellites in reads" />
7 <param name="qualitycutoff" type="integer" value="20" label="Minimum quality score (Phred+33) for microsatellites and flanking regions" /> 7 <param name="qualitycutoff" type="integer" value="20" label="Minimum quality score (Phred+33) for microsatellites and flanking regions" />
27 27
28 .. class:: infomark 28 .. class:: infomark
29 29
30 **What it does** 30 **What it does**
31 31
32 This tool will fetch flanking regions around microsatellites, screen for quality score at microsatellites and adjacent flanking regions, and output two fastq files containing flanking regions in forward-forward direction. 32 This tool will fetch flanking regions around STRs from the reads output by "STR detection" step, screen for quality score at STRs and adjacent flanking regions, and output two fastq files containing flanking regions in forward-forward direction.
33 33
34 - This tool assumes that the quality score is Phred+33, such as Sanger fastq. 34 - This tool assumes that the quality score is Phred+33, such as Sanger fastq.
35 - Reads that have either left or right flanking regions shorter than the length of flanking regions that require quality screening will be removed. 35 - Reads that have either left or right flanking regions shorter than the length of flanking regions that require quality screening will be removed.
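
The two notes above can be illustrated with a minimal Python sketch. This is not the code of ``pair_fetch_DNA_ff.py``; the function names are hypothetical, and it assumes that every base in the screened window (the repeat plus ``check_len`` bases on each side) must reach the cutoff, which the help text does not spell out.

```python
PHRED_OFFSET = 33  # Phred+33 (Sanger) encoding, as stated in the help text


def phred_scores(quality):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - PHRED_OFFSET for ch in quality]


def passes_screen(left_len, repeat_len, right_len, quality, cutoff, check_len):
    """Return True if a read survives the (assumed) quality screen."""
    # Reads whose flanks are shorter than the screened length are removed.
    if left_len < check_len or right_len < check_len:
        return False
    # Window: check_len bases left of the repeat, the repeat itself,
    # and check_len bases right of the repeat.
    start = left_len - check_len
    end = left_len + repeat_len + check_len
    # Assumption: every base in the window must reach the cutoff.
    return all(score >= cutoff for score in phred_scores(quality[start:end]))
```

With a cutoff of 20 (the default of ``qualitycutoff``), a base encoded as ``5`` (score 20) would pass under this sketch and a base encoded as ``*`` (score 9) would fail, but only if it falls inside the screened window.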
36 36
37 **Citation** 37 **Citation**
38 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research** 38 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
39 39
40 **Input** 40 **Input**
41 41
42 The input files need to be in the same format as output from **microsatellite detection program**. This format contains **length of repeat**, **length of left flanking region**, **length of right flanking region**, **repeat motif**, **hamming (editing) distance**, **read name**, **read sequence**, **read quality score** 42 The input file needs to be in the same format as output from **STR detection** step. This format contains **length of repeat**, **length of left flanking region**, **length of right flanking region**, **repeat motif**, **hamming (editing) distance**, **read name**, **read sequence**, **read quality score**
43 43
44 **Output** 44 **Output**
45 45
46 The output will be the two fastq files. The first file contains left flank regions. The second file contains right flanking regions. 46 The output will be two fastq files. The first file contains left flanking bases. The second file contains right flanking bases.
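
As a rough sketch of how one input record maps to the two output records, the following Python function splits a record into a left-flank and a right-flank FASTQ entry. It is an illustration, not the tool's implementation: ``split_flanks`` is a hypothetical name, the input is assumed to be tab-separated with the eight columns in the order listed under **Input**, and both flanks are assumed to be written as they appear in the read, without reverse-complementing.

```python
def split_flanks(line):
    """Split one detection record into (left_fastq, right_fastq) strings."""
    # Assumed column order: repeat length, left flank length, right flank
    # length, motif, hamming distance, read name, sequence, quality.
    fields = line.rstrip("\n").split("\t")
    repeat_len, left_len, right_len = (int(x) for x in fields[:3])
    name, seq, qual = fields[5], fields[6], fields[7]

    # The right flank starts immediately after the repeat.
    right_start = left_len + repeat_len
    right_end = right_start + right_len

    # FASTQ record layout follows the example output: the read name is
    # repeated on the "+" line.
    template = "@{0}\n{1}\n+{0}\n{2}\n"
    left = template.format(name, seq[:left_len], qual[:left_len])
    right = template.format(name, seq[right_start:right_end],
                            qual[right_start:right_end])
    return left, right
```

Each returned string would go to its own file, so that the n-th record in the left file and the n-th record in the right file come from the same read.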
47 47
48 **Example** 48 **Example**
49 49
50 - Suppose we detected the microsatellites from short reads :: 50 - Starting with this test input ::
51 51
52 6 40 54 G 0 SRR345592.75000006 HS2000-192_107:1:63:5822:176818_1_per1_1 TACCCTCCTGTCTTCCCAGACTGATTTCTGTTCCTGCCCTggggggTTCTTGACTCCTCTGAATGGGTACGGGAGTGTGGACCTCAGGGAGGCCCCCTTG GGGGGGGGGGGGGGGGGFGGGGGGGGGFEGGGGGGGGGGG?FFDFGGGGGG?FFFGGGGGDEGGEFFBEFCEEBD@BACB*?=99(/=5'6=4:CCC*AA 52 6 40 54 G 0 SRR345592.75000006 HS2000-192_107:1:63:5822:176818_1_per1_1 TACCCTCCTGTCTTCCCAGACTGATTTCTGTTCCTGCCCTggggggTTCTTGACTCCTCTGAATGGGTACGGGAGTGTGGACCTCAGGGAGGCCCCCTTG GGGGGGGGGGGGGGGGGFGGGGGGGGGFEGGGGGGGGGGG?FFDFGGGGGG?FFFGGGGGDEGGEFFBEFCEEBD@BACB*?=99(/=5'6=4:CCC*AA
53 53
54 54
55 - We want to get fastq files of flanking regions around microsatellite with quality score at least 20 on Phred +33 55 - If we want to get fastq files of flanking regions around the detected STRs with quality score of at least 20, the program will report these two fastq files ::
56
57 - Then the program will report these two fastq files ::
58 56
59 @SRR345592.75000006 HS2000-192_107:1:63:5822:176818_1_per1_1 57 @SRR345592.75000006 HS2000-192_107:1:63:5822:176818_1_per1_1
60 TACCCTCCTGTCTTCCCAGACTGATTTCTGTTCCTGCCCT 58 TACCCTCCTGTCTTCCCAGACTGATTTCTGTTCCTGCCCT
61 +SRR345592.75000006 HS2000-192_107:1:63:5822:176818_1_per1_1 59 +SRR345592.75000006 HS2000-192_107:1:63:5822:176818_1_per1_1
62 GGGGGGGGGGGGGGGGGFGGGGGGGGGFEGGGGGGGGGGG 60 GGGGGGGGGGGGGGGGGFGGGGGGGGGFEGGGGGGGGGGG