annotate tools/seq_filter_by_mapping/seq_filter_by_mapping.xml @ 3:481b0a925e66 draft
v0.0.6 Python 3 compatible print function
author | peterjc |
---|---|
date | Wed, 17 May 2017 09:24:01 -0400 |
parents | 48e71dfd51b3 |
children | f82868a026ea |
rev | line source |
---|---|
3 | 1 <tool id="seq_filter_by_mapping" name="Filter sequences by mapping" version="0.0.6"> |
0 | 2 <description>from SAM/BAM file</description> |
3 <requirements> | |
2
48e71dfd51b3
v0.0.5 Depend on Biopython 1.67 from Tool Shed or (Bio)conda
peterjc
parents:
1
diff
changeset
|
4 <requirement type="package" version="1.67">biopython</requirement> |
0 | 5 <requirement type="package" version="0.1.19">samtools</requirement> |
6 </requirements> | |
2 | 7 <version_command> |
8 python $__tool_directory__/seq_filter_by_mapping.py --version | |
9 </version_command> | |
10 <command detect_errors="aggressive"> | |
11 python $__tool_directory__/seq_filter_by_mapping.py -i '$input_file' -f '$input_file.ext' -m $pair_mode | |
0 | 12 #if $output_choice_cond.output_choice=="both" |
2 | 13 -p '$output_pos' -n '$output_neg' |
0 | 14 #elif $output_choice_cond.output_choice=="pos" |
2 | 15 -p '$output_pos' |
0 | 16 #elif $output_choice_cond.output_choice=="neg" |
2 | 17 -n '$output_neg' |
0 | 18 #end if |
19 ## Now loop over all the mapping files | |
2 | 20 #for i in $mapping_file#'${i}' #end for# |
0 | 21 </command> |
22 <inputs> | |
23 <param name="input_file" type="data" format="fasta,fastq,sff" label="Sequence file to be filtered" help="FASTA, FASTQ, or SFF format." /> | |
24 <param name="mapping_file" type="data" format="sam,bam" multiple="true" label="SAM/BAM mapping of those sequences" help="SAM or BAM format." /> | |
25 <conditional name="output_choice_cond"> | |
26 <param name="output_choice" type="select" label="Output mapped reads, unmapped reads, or both?"> | |
27 <option value="both">Both mapped and unmapped reads, as two files</option> | |
28 <option value="pos">Just mapped reads, as a single file</option> | |
29 <option value="neg">Just unmapped reads, as a single file</option> | |
30 </param> | |
31 <!-- These dummy entries seem to be needed here; compare with indels/indel_sam2interval.xml --> | |
32 <when value="both" /> | |
33 <when value="pos" /> | |
34 <when value="neg" /> | |
35 </conditional> | |
36 <param name="pair_mode" type="select" label="Paired read treatment"> | |
37 <option value="lax" selected="true">Treat as a pair, allow either read to be mapped</option> | |
38 <option value="strict">Treat as a pair, require both reads to be mapped</option> | |
39 <!-- The following would actually be more work, as we would have to store qname/1 and qname/2 separately for the filter... | |
40 <option value="solo">Treat independently (will split partners when only one maps)</option> | |
41 --> | |
42 </param> | |
43 </inputs> | |
44 <outputs> | |
1 | 45 <data name="output_pos" format_source="input_file" metadata_source="input_file" label="$input_file.name (mapped)"> |
0 | 46 <filter>output_choice_cond["output_choice"] != "neg"</filter> |
47 </data> | |
1 | 48 <data name="output_neg" format_source="input_file" metadata_source="input_file" label="$input_file.name (unmapped)"> |
0 | 49 <filter>output_choice_cond["output_choice"] != "pos"</filter> |
50 </data> | |
51 </outputs> | |
52 <tests> | |
53 <test> | |
54 <param name="input_file" value="SRR639755_mito_pairs.fastq.gz" ftype="fastqsanger" /> | |
55 <param name="mapping_file" value="SRR639755_sample_by_coord.sam" ftype="sam" /> | |
56 <param name="pair_mode" value="lax" /> | |
57 <param name="output_choice" value="pos" /> | |
58 <output name="output_pos" file="SRR639755_sample_lax.fastq" ftype="fastqsanger" /> | |
59 </test> | |
60 <test> | |
61 <param name="input_file" value="SRR639755_mito_pairs.fastq.gz" ftype="fastqsanger" /> | |
62 <param name="mapping_file" value="SRR639755_sample_by_coord.sam" ftype="sam" /> | |
63 <param name="pair_mode" value="strict" /> | |
64 <param name="output_choice" value="pos" /> | |
65 <output name="output_pos" file="SRR639755_sample_strict.fastq" ftype="fastqsanger" /> | |
66 </test> | |
67 </tests> | |
68 <help> | |
69 **What it does** | |
70 | |
71 By default it divides a FASTA, FASTQ or Standard Flowgram Format (SFF) file in | |
72 two: those sequences (or read pairs) which map in the provided SAM/BAM file, | |
73 and those which do not. You can opt for a single output file of just the mapping | |
74 reads, or just the non-mapping ones. | |
75 | |
76 **Example Usage** | |
77 | |
78 You might wish to perform a contamination screen by mapping your reads against | |
79 known contaminant reference sequences, then use this tool to select only the | |
80 unmapped reads for further analysis (e.g. *de novo* assembly). | |
81 | |
82 Similarly you might wish to map your reads against a known bacterial reference, | |
83 then take the non-mapping sequences forward for analysis if looking for novel | |
84 plasmids. | |
85 | |
86 | |
87 **References** | |
88 | |
89 If you use this Galaxy tool in work leading to a scientific publication please | |
90 cite: | |
91 | |
92 Peter J.A. Cock (2014), Galaxy tool for filtering reads by mapping | |
93 http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_mapping | |
94 | |
95 This tool uses Biopython to read and write SFF files, so you may also wish to | |
96 cite the Biopython application note (and Galaxy too of course): | |
97 | |
98 Cock et al (2009). Biopython: freely available Python tools for computational | |
99 molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3. | |
100 http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878. | |
101 | |
102 This tool is available to install into other Galaxy Instances via the Galaxy | |
103 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_mapping | |
104 </help> | |
105 <citations> | |
106 <citation type="doi">10.1093/bioinformatics/btp163</citation> | |
107 </citations> | |
108 </tool> |
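
The "lax" and "strict" pair modes offered by the wrapper above can be illustrated with a minimal Python sketch. This is not the code in seq_filter_by_mapping.py, which is not shown on this page; the function name `mapped_names` is hypothetical. It assumes plain-text SAM input, that both mates of a pair share one QNAME, and that the SAM FLAG bits 0x4 (unmapped), 0x40 (first in pair) and 0x80 (last in pair) are set as the SAM specification describes.

```python
from collections import defaultdict

UNMAPPED = 0x4  # SAM FLAG bit: this segment is unmapped
READ2 = 0x80    # SAM FLAG bit: last segment in the template (mate 2)


def mapped_names(sam_lines, pair_mode="lax"):
    """Return the set of read (pair) names that count as mapped.

    Hypothetical helper for illustration only.
    lax    -> keep a name if either mate has a mapped record
    strict -> keep a name only if both mates have a mapped record
    """
    mapped_mates = defaultdict(set)  # qname -> mapped mates seen, e.g. {1, 2}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & UNMAPPED:
            continue  # unmapped records never contribute
        # Assumption: anything not flagged as mate 2 is treated as mate 1
        mapped_mates[qname].add(2 if flag & READ2 else 1)

    if pair_mode == "lax":
        return set(mapped_mates)
    if pair_mode == "strict":
        return {name for name, mates in mapped_mates.items() if mates == {1, 2}}
    raise ValueError("pair_mode must be 'lax' or 'strict'")
```

The names returned would go to the mapped output (`-p`), and every other name in the input sequence file to the unmapped output (`-n`). Note that under this sketch's assumptions single-end reads never pass "strict" mode, because no second mate is ever seen; the real script may behave differently.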