comparison PEsortedSAM2readprofile.xml @ 2:d5ed5c2e25c3 draft

Uploaded
author arkarachai-fungtammasan
date Wed, 22 Apr 2015 12:48:40 -0400
parents 07588b899c13
children 35aedbe548b9
comparison
1:f2bab38e3cbd 2:d5ed5c2e25c3
1 <tool id="PEsortedSAM2readprofile" name="Combine mapped flaked bases" version="1.0.0"> 1 <tool id="PEsortedSAM2readprofile" name="Combine mapped faux paired-end reads" version="1.0.0">
2 <description> from SAM file sorted by readname </description> 2 <description> and get the reference STR allele from the reference genome </description>
3 <command interpreter="python2.7">PEsortedSAM2readprofile.py $flankedbasesSAM $twobitref $maxTRlength $maxoriginalreadlength $output </command> 3 <command interpreter="python2.7">PEsortedSAM2readprofile.py $flankedbasesSAM $twobitref $maxTRlength $maxoriginalreadlength $output </command>
4 4
5 <inputs> 5 <inputs>
6 <param name="flankedbasesSAM" type="data" format="sam" label="Select sorted SAM file (by readname) of flaked bases" /> 6 <param name="flankedbasesSAM" type="data" format="sam" label="Select sorted SAM file (by readname) of flaked bases" />
7 <param name="twobitref" type="data" label="Select twobit file reference genome" /> 7 <param name="twobitref" type="data" label="Select twobit file reference genome" />
28 28
29 .. class:: infomark 29 .. class:: infomark
30 30
31 **What it does** 31 **What it does**
32 32
33 - This tool will take SAM file sorted by read name, remove unpaired reads, report microsatellites sequences in the reference genome that correspond to the space between paired end reads. Coordinate of start and stop for left and right flanking regions of microsatellites and microsatellite itself as inferred from paired end reads will also be reported. 33 - This tool will take SAM file (sorted by read name), remove unpaired reads, and combine paired faux read-pairs into a single row. It also reports Short Tandem Repeats (STRs) sequences in the reference genome that correspond to the space between the faux paired end reads and the coordinate of start and stop for left and right flanking regions of STRs.
34 - These microsatellites in reference can be used to filter out reads that do not contain microsatellites that concur with microsatellites in reference where the reads mapped to.
35 34
36 **Citation** 35 **Citation**
37 36
38 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research** 37 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
39 38
40 **Input** 39 **Input**
41 40
42 - Sorted SAM files by read name 41 - Sorted SAM files by read name.
43 42
44 **Output** 43 **Output**
45 44
46 The output will combined two lines of input which are paired. The output format is as follow. 45 The output will combine the two faux paired-end read lines of input into the following single line format:
47 46
48 - Column 1 = read name 47 - Column 1 = read name
49 - Column 2 = chromosome 48 - Column 2 = chromosome
50 - Column 3 = left flanking region start 49 - Column 3 = left flanking region start
51 - Column 4 = left flanking region stop 50 - Column 4 = left flanking region stop
52 - Column 5 = microsatellite start 51 - Column 5 = STR start
53 - Column 6 = microsatellite stop 52 - Column 6 = STR stop
54 - Column 7 = right flanking region start 53 - Column 7 = right flanking region start
55 - Column 8 = right flanking region stop 54 - Column 8 = right flanking region stop
56 - Column 9 = microsatellite length in reference 55 - Column 9 = STR length in reference
57 - Column 10= microsatellite sequence in reference 56 - Column 10= STR sequence in reference
58 57
59 58
60 59
61 </help> 60 </help>
62 </tool> 61 </tool>
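
For readers who want to consume the ten-column output described in the help text of revision 2, the following is a minimal sketch of a row parser. It is not part of the tool. The names `ReadProfile` and `parse_profile_line` are hypothetical, and the sketch assumes the output is tab-delimited with integer coordinates, which the help text does not state.

```python
from typing import NamedTuple


class ReadProfile(NamedTuple):
    """One output row, following the column order listed in the help text."""
    read_name: str            # Column 1
    chromosome: str           # Column 2
    left_flank_start: int     # Column 3
    left_flank_stop: int      # Column 4
    str_start: int            # Column 5
    str_stop: int             # Column 6
    right_flank_start: int    # Column 7
    right_flank_stop: int     # Column 8
    str_length_in_ref: int    # Column 9
    str_sequence_in_ref: str  # Column 10


def parse_profile_line(line: str, sep: str = "\t") -> ReadProfile:
    """Parse a single output row.

    Assumption: fields are separated by tabs and columns 3-9 are integers.
    Adjust `sep` if the real output uses a different delimiter.
    """
    fields = line.rstrip("\r\n").split(sep)
    if len(fields) != 10:
        raise ValueError("expected 10 columns, got %d" % len(fields))
    numeric = [int(value) for value in fields[2:9]]  # columns 3 through 9
    return ReadProfile(fields[0], fields[1], *numeric, fields[9])
```

A row could then be read with `record = parse_profile_line(line)` and its fields accessed by name, for example `record.str_start` or `record.str_sequence_in_ref`.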