comparison contains.xml @ 1:2c1cb37a3ffe draft
planemo upload for repository https://github.com/phac-nml/refseq_masher commit e83730c5df84a960af58ff09db7eb09d11df2398
author | nml |
---|---|
date | Wed, 08 May 2019 09:15:56 -0400 |
parents | 26df66c32861 |
children | 1ec42f033bb4 |
0:26df66c32861 | 1:2c1cb37a3ffe |
---|---|
1 <tool id="refseq_masher_contains" name="RefSeq Masher Contains" version="0.1.1"> | 1 <tool id="refseq_masher_contains" name="RefSeq Masher Contains" version="0.1.2"> |
2 <description> | 2 <description> |
3 Find NCBI RefSeq Genomes contained in your sequences | 3 Find NCBI RefSeq Genomes contained in your sequences |
4 </description> | 4 </description> |
5 <requirements> | 5 <requirements> |
6 <requirement type="package" version="0.1.1">refseq_masher</requirement> | 6 <requirement type="package" version="0.1.1">refseq_masher</requirement> |
7 </requirements> | 7 </requirements> |
8 <command detect_errors="exit_code"> | 8 <command detect_errors="exit_code"> |
9 <![CDATA[ | 9 <![CDATA[ |
10 | 10 |
11 #import re | 11 #import re |
 | 12 #import os |
12 | 13 |
13 #if $input.type == 'fasta' | 14 #if $input.type == 'fasta' |
14 #set $input_files = '"{}"'.format($input.fasta.name) | 15 #set $input_files = '"{}.fasta"'.format(os.path.splitext($input.fasta.name)[0]) |
15 ln -s "$input.fasta" $input_files && | 16 ln -s "$input.fasta" $input_files && |
16 #elif $input.type == 'paired' | 17 #elif $input.type == 'paired' |
17 #set $_forward_ext = '.fastq.gz' if $re.match(r'.*\.gz$', $input.forward.name) else '.fastq' | 18 #set $_forward_ext = '.fastq.gz' if $re.match(r'.*\.gz$', $input.forward.name) else '.fastq' |
18 #set $_forward = '"{}_1{}"'.format($re.sub(r'_[12]\..+$', '', $input.forward.name), $_forward_ext) | 19 #set $_forward = '"{}_1{}"'.format($re.sub(r'_[12]\..+$', '', $input.forward.name), $_forward_ext) |
19 #set $_reverse_ext = '.fastq.gz' if $re.match(r'.*\.gz$', $input.reverse.name) else '.fastq' | 20 #set $_reverse_ext = '.fastq.gz' if $re.match(r'.*\.gz$', $input.reverse.name) else '.fastq' |
169 =========================== | 170 =========================== |
170 | 171 |
171 Find what NCBI RefSeq genomes are contained within your sequence data using Mash_ with a Mash sketch database of 54,925 NCBI RefSeq Genomes. | 172 Find what NCBI RefSeq genomes are contained within your sequence data using Mash_ with a Mash sketch database of 54,925 NCBI RefSeq Genomes. |
172 | 173 |
173 | 174 |
174 Source code available on Github at https://github.com/phac-nml/refseq_masher | 175 Source code available on Github at https://github.com/phac-nml/refseq_masher |
175 | 176 |
176 | 177 |
177 `contains` - find what NCBI RefSeq Genomes are contained in your input sequences | 178 `contains` - find what NCBI RefSeq Genomes are contained in your input sequences |
178 -------------------------------------------------------------------------------- | 179 -------------------------------------------------------------------------------- |
179 | 180 |
215 Design, Setting and Patients Forty-five samples were selected from a set of fecal specimens obtained from patients with diarrhea during the 2011 outbreak of STEC O104:H4 in Germany. Samples were chosen to represent STEC-positive patients with a range of clinical conditions and colony counts together with a small number of patients with other infections (Campylobacter jejnuni, Clostridium difficile and Salmonella enterica). Samples were subjected to high-throughput sequencing on the Illumina MiSeq and HiSeq 2500, followed by bioinformatics analysis. | 216 Design, Setting and Patients Forty-five samples were selected from a set of fecal specimens obtained from patients with diarrhea during the 2011 outbreak of STEC O104:H4 in Germany. Samples were chosen to represent STEC-positive patients with a range of clinical conditions and colony counts together with a small number of patients with other infections (Campylobacter jejnuni, Clostridium difficile and Salmonella enterica). Samples were subjected to high-throughput sequencing on the Illumina MiSeq and HiSeq 2500, followed by bioinformatics analysis. |
216 | 217 |
217 | 218 |
218 We're going to download the FASTQ files for ERR260489_:: | 219 We're going to download the FASTQ files for ERR260489_:: |
219 | 220 |
220 wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR260/ERR260489/ERR260489_1.fastq.gz | 221 wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR260/ERR260489/ERR260489_1.fastq.gz |
221 wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR260/ERR260489/ERR260489_2.fastq.gz | 222 wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR260/ERR260489/ERR260489_2.fastq.gz |
222 | 223 |
223 | 224 |
224 We're going to run `refseq_masher` against these FASTQ files:: | 225 We're going to run `refseq_masher` against these FASTQ files:: |
225 | 226 |
226 refseq_masher -vv contains --top-n-results 50 -p 12 -o containment-ERR260489.tab ERR260489_1.fastq.gz ERR260489_2.fastq.gz | 227 refseq_masher -vv contains --top-n-results 50 -p 12 -o containment-ERR260489.tab ERR260489_1.fastq.gz ERR260489_2.fastq.gz |
293 | 294 |
294 Licensed under the Apache License, Version 2.0 (the "License"); you may not use | 295 Licensed under the Apache License, Version 2.0 (the "License"); you may not use |
295 this work except in compliance with the License. You may obtain a copy of the | 296 this work except in compliance with the License. You may obtain a copy of the |
296 License at: | 297 License at: |
297 | 298 |
298 http://www.apache.org/licenses/LICENSE-2.0 | 299 http://www.apache.org/licenses/LICENSE-2.0 |
299 | 300 |
300 Unless required by applicable law or agreed to in writing, software distributed | 301 Unless required by applicable law or agreed to in writing, software distributed |
301 under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR | 302 under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR |
302 CONDITIONS OF ANY KIND, either express or implied. See the License for the | 303 CONDITIONS OF ANY KIND, either express or implied. See the License for the |
303 specific language governing permissions and limitations under the License. | 304 specific language governing permissions and limitations under the License. |
307 | 308 |
308 **Gary van Domselaar**: gary.vandomselaar@phac-aspc.gc.ca | 309 **Gary van Domselaar**: gary.vandomselaar@phac-aspc.gc.ca |
309 | 310 |
310 | 311 |
311 | 312 |
312 .. _Mash: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x | 313 .. _Mash: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x |
313 .. _SAMEA1877340: https://www.ebi.ac.uk/ena/data/view/SAMEA1877340 | 314 .. _SAMEA1877340: https://www.ebi.ac.uk/ena/data/view/SAMEA1877340 |
314 .. _PRJEB1775: https://www.ebi.ac.uk/ena/data/view/PRJEB1775 | 315 .. _PRJEB1775: https://www.ebi.ac.uk/ena/data/view/PRJEB1775 |
315 .. _ERR260489: https://www.ebi.ac.uk/ena/data/view/ERR260489&display=html | 316 .. _ERR260489: https://www.ebi.ac.uk/ena/data/view/ERR260489&display=html |
316 | 317 |
317 ]]> | 318 ]]> |
318 </help> | 319 </help> |
319 <citations> | 320 <citations> |
320 <!-- Citation for Mash paper --> | 321 <!-- Citation for Mash paper --> |