comparison PEsortedSAM2readprofile.xml @ 0:07588b899c13 draft

Uploaded
author arkarachai-fungtammasan
date Wed, 01 Apr 2015 17:05:51 -0400
parents
children d5ed5c2e25c3
comparison
-1:000000000000 0:07588b899c13
1 <tool id="PEsortedSAM2readprofile" name="Combine mapped flanked bases" version="1.0.0">
2 <description> from SAM file sorted by readname </description>
3 <command interpreter="python2.7">PEsortedSAM2readprofile.py $flankedbasesSAM $twobitref $maxTRlength $maxoriginalreadlength $output </command>
4
5 <inputs>
6 <param name="flankedbasesSAM" type="data" format="sam" label="Select sorted SAM file (by readname) of flanked bases" />
7 <param name="twobitref" type="data" label="Select twobit file reference genome" />
8 <param name="maxTRlength" type="integer" value="100" label="Maximum expected microsatellite length (bp)" />
9 <param name="maxoriginalreadlength" type="integer" value="101" label="Maximum original read length" />
10
11 </inputs>
12 <outputs>
13 <data name="output" format="tabular" />
14 </outputs>
15 <tests>
16 <!-- Test data with valid values -->
17 <test>
18 <param name="flankedbasesSAM" value="samplesortedPESAM_C.sam"/>
19 <param name="twobitref" value="shifted.2bit"/>
20 <param name="maxTRlength" value="100"/>
21 <param name="maxoriginalreadlength" value="250"/>
22 <output name="output" file="samplePESAM_2_profile_C.txt"/>
23 </test>
24
25 </tests>
26 <help>
27
28
29 .. class:: infomark
30
31 **What it does**
32
33 - This tool takes a SAM file sorted by read name, removes unpaired reads, and reports the microsatellite sequences in the reference genome that correspond to the space between paired-end reads. It also reports the start and stop coordinates of the left and right flanking regions and of the microsatellite itself, as inferred from the paired-end reads.
34 - These reference microsatellites can be used to filter out reads whose microsatellites do not agree with the microsatellite in the reference at the position where the reads mapped.
35
36 **Citation**
37
38 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
39
40 **Input**
41
42 - SAM file sorted by read name
43
44 **Output**
45
46 The output combines each pair of input lines (the two mates of a read pair) into a single line. The output format is as follows.
47
48 - Column 1 = read name
49 - Column 2 = chromosome
50 - Column 3 = left flanking region start
51 - Column 4 = left flanking region stop
52 - Column 5 = microsatellite start
53 - Column 6 = microsatellite stop
54 - Column 7 = right flanking region start
55 - Column 8 = right flanking region stop
56 - Column 9 = microsatellite length in reference
57 - Column 10 = microsatellite sequence in reference
58
59
60
61 </help>
62 </tool>
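
The ten-column output described in the help text above could be read downstream roughly as follows. This is only a minimal sketch, not part of the tool: it assumes the file is tab-separated (the output is declared as `tabular`) and that columns 3 to 9 are integers. The names `ProfileRow` and `parse_profile_line`, and the sample line in the usage note, are hypothetical and introduced here for illustration.

```python
from collections import namedtuple

# Hypothetical record type mirroring the ten documented output columns.
ProfileRow = namedtuple("ProfileRow", [
    "read_name",          # Column 1
    "chromosome",         # Column 2
    "left_flank_start",   # Column 3
    "left_flank_stop",    # Column 4
    "ms_start",           # Column 5
    "ms_stop",            # Column 6
    "right_flank_start",  # Column 7
    "right_flank_stop",   # Column 8
    "ms_length_ref",      # Column 9
    "ms_sequence_ref",    # Column 10
])

# Zero-based indices of the columns assumed to hold integers (columns 3 to 9).
INT_COLUMNS = range(2, 9)


def parse_profile_line(line):
    """Split one tab-separated output line into a ProfileRow.

    Raises ValueError if the line does not have exactly ten fields
    or if a coordinate/length field is not an integer.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError("expected 10 columns, got %d" % len(fields))
    for i in INT_COLUMNS:
        fields[i] = int(fields[i])
    return ProfileRow(*fields)
```

For example, a made-up line such as `"read1\tchr1\t100\t120\t121\t124\t125\t150\t4\tACAC"` would yield a row whose `ms_length_ref` is the integer `4` and whose `ms_sequence_ref` is `"ACAC"`. Whether coordinates are 0-based or 1-based is not stated in the help text, so the sketch leaves them uninterpreted.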