comparison getalleleseq.xml @ 0:c542b3075f29 draft
Uploaded repo.tar.gz
author | boris |
---|---|
date | Mon, 03 Feb 2014 13:07:13 -0500 |
parents | |
children |
-1:000000000000 | 0:c542b3075f29 |
---|---|
1 <tool id="getalleleseq" name="FASTA from allele counts" version="0.0.1" force_history_refresh="True"> | |
2 <description>Generate major and minor allele sequences from alleles table</description> | |
3 <command interpreter="python">getalleleseq.py | |
4 $alleles | |
5 -l $seq_length | |
6 -j $major_seq | |
7 -d $__new_file_path__ | |
8 -p $major_seq.id | |
9 </command> | |
10 <inputs> | |
11 <param format="tabular" name="alleles" type="data" label="Table containing major and minor allele bases per position" help="Must be tabular and follow the Variant Annotator tool output format"/> | |
12 <param name="seq_length" type="integer" value="16569" label="Background sequence length" help="e.g. 16569 for mitochondrial variants"/> | |
13 </inputs> | |
14 <outputs> | |
15 <data format="fasta" name="major_seq"/> | |
16 </outputs> | |
17 <tests> | |
18 <test> | |
19 <param name="alleles" value="test-table-getalleleseq.tab"/> | |
20 <param name="seq_length" value="16569"/> | |
21 <output name="major_seq" file="test-major-allele-out-getalleleseq.fa"/> | |
22 </test> | |
23 </tests> | |
24 | |
25 <help> | |
26 | |
27 | |
28 The major allele sequence of a sample is simply the sequence consisting of the most frequent nucleotide per position. | |
29 Replacing the major allele with the second most frequent allele at diploid positions generates the minor allele sequence. | |
30 | |
31 ----- | |
32 | |
33 .. class:: infomark | |
34 | |
35 **What it does** | |
36 | |
37 It takes the table generated by the Variant Annotator tool and derives a major and a minor allele sequence per sample. | |
38 Since all sequences share the same length, all major allele sequences are written to a single file (with a proper header per sample) | |
39 to create a multiple sequence alignment in FASTA format that can be used for downstream phylogenetic analyses. | |
40 In contrast, the minor allele sequences are reported as one FASTA file per sample to ease their downstream manipulation. | |
41 | |
42 ----- | |
43 | |
44 .. class:: warningmark | |
45 | |
46 **Note** | |
47 | |
48 Please follow the format described below for the input file: | |
49 | |
50 ----- | |
51 | |
52 .. class:: infomark | |
53 | |
54 **Formats** | |
55 | |
56 **Variant Annotator tool output format** | |
57 | |
58 Columns:: | |
59 | |
60 1. sample id | |
61 2. chromosome | |
62 3. position | |
63 4. counts for A's | |
64 5. counts for C's | |
65 6. counts for G's | |
66 7. counts for T's | |
67 8. Coverage | |
68 9. Number of alleles passing frequency threshold | |
69 10. Major allele | |
70 11. Minor allele | |
71 12. Minor allele frequency in position | |
72 | |
73 | |
74 **FASTA multiple alignment** | |
75 | |
76 See http://www.bioperl.org/wiki/FASTA_multiple_alignment_format | |
77 | |
78 ----- | |
79 | |
80 **Example** | |
81 | |
82 - For the following dataset:: | |
83 | |
84 S9 chrM 3 3 0 2 214 219 0 T A 0.013698630137 | |
85 S9 chrM 4 3 249 3 0 255 0 C N 0.0 | |
86 S9 chrM 5 245 1 1 0 247 1 A N 0.0 | |
87 S11 chrM 6 0 292 0 0 292 1 C . 0.0 | |
88 S7 chrM 6 0 254 0 0 254 1 C . 0.0 | |
89 S9 chrM 6 2 306 2 0 310 0 C N 0.0 | |
90 S11 chrM 7 281 0 3 0 284 0 A G 0.0105633802817 | |
91 S7 chrM 7 249 0 2 0 251 1 A G 0.00796812749004 | |
92 etc. for all covered positions per sample... | |
93 | |
94 - Running this tool with background sequence length 16569 will produce 4 files:: | |
95 | |
96 1. Multiple alignment FASTA file containing the major allele sequences of samples S7, S9 and S11 | |
97 2. minor allele sequence of sample S7 | |
98 3. minor allele sequence of sample S9 | |
99 4. minor allele sequence of sample S11 | |
100 | |
101 ----- | |
102 | |
103 **Citation** | |
104 | |
105 If you use this tool, please cite Dickins B, Rebolledo-Jaramillo B, et al. *In preparation.* | |
106 (boris-at-bx.psu.edu) | |
107 | |
108 </help> | |
109 </tool> |
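
The help text above describes how the major and minor allele sequences are derived from the 12-column Variant Annotator table. The following is a minimal Python sketch of that idea, not the actual `getalleleseq.py` script, which is not shown in this changeset. The function names `build_allele_sequences` and `to_fasta` are hypothetical. The sketch also makes several assumptions that the help text does not confirm: positions are 1-based, uncovered positions are filled with `N`, and the minor allele replaces the major allele only when column 11 holds one of A, C, G or T (values such as `N` or `.` are treated as "no minor allele").

```python
from collections import defaultdict

BASES = set("ACGT")


def build_allele_sequences(rows, seq_length, fill="N"):
    """Return (major, minor) dicts mapping sample id -> sequence string.

    Assumptions (not confirmed by the tool's help text): positions are
    1-based, uncovered positions are filled with `fill`, and the minor
    allele is used only when column 11 is one of A, C, G or T.
    """
    major = defaultdict(lambda: [fill] * seq_length)
    minor = defaultdict(lambda: [fill] * seq_length)
    for line in rows:
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        sample, position = fields[0], int(fields[2])
        major_allele, minor_allele = fields[9], fields[10]
        if not 1 <= position <= seq_length:
            continue  # position falls outside the background sequence
        index = position - 1
        major[sample][index] = major_allele
        # Fall back to the major allele when no usable minor allele is reported.
        minor[sample][index] = minor_allele if minor_allele in BASES else major_allele
    return (
        {s: "".join(seq) for s, seq in major.items()},
        {s: "".join(seq) for s, seq in minor.items()},
    )


def to_fasta(sequences):
    """Format a sample -> sequence dict as FASTA text, one record per sample."""
    return "".join(">%s\n%s\n" % (s, seq) for s, seq in sorted(sequences.items()))


# Three rows taken from the example dataset in the help text; a short
# background length of 8 is used here only to keep the output readable.
rows = [
    "S9 chrM 3 3 0 2 214 219 0 T A 0.013698630137",
    "S9 chrM 4 3 249 3 0 255 0 C N 0.0",
    "S11 chrM 7 281 0 3 0 284 0 A G 0.0105633802817",
]
major, minor = build_allele_sequences(rows, seq_length=8)
print(to_fasta(major))        # all major allele sequences in one alignment
print(to_fasta({"S9": minor["S9"]}))  # one minor allele file per sample
```

Because every sequence is padded to the same background length, the major allele records can be concatenated directly into a single FASTA multiple alignment, while each minor allele sequence can be written to its own file.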