Galaxy tool to filter FASTA, FASTQ or SFF sequences by SAM/BAM mapping
======================================================================

This tool is copyright 2014-2015 by Peter Cock, The James Hutton Institute
(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
See the licence text below.

This tool is a short Python script (using Biopython library functions) which
divides a FASTA, FASTQ, or SFF file in two: those sequences which map, and
those which do not, according to the given SAM/BAM file(s).

Example uses include mapping FASTQ reads against a known contaminant
in order to remove those reads prior to a de novo assembly.

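The idea of splitting sequences by mapping status can be illustrated with a
minimal standard-library sketch. This is not the code of
``seq_filter_by_mapping.py``: the function names are hypothetical, only plain
text SAM and FASTA are handled, and a read is treated as mapped when SAM flag
bit ``0x4`` is unset, which is an assumption about the simplest possible rule.

```python
import io


def mapped_read_names(sam_handle):
    """Return the names of reads with at least one mapped alignment."""
    mapped = set()
    for line in sam_handle:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment line
        name, flag = fields[0], int(fields[1])
        if not flag & 0x4:  # bit 0x4 set means "segment unmapped"
            mapped.add(name)
    return mapped


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split(None, 1)
            name, chunks = (parts[0] if parts else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_by_mapping(fasta_handle, mapped):
    """Divide FASTA records into (mapped, unmapped) lists."""
    positive, negative = [], []
    for name, seq in read_fasta(fasta_handle):
        (positive if name in mapped else negative).append((name, seq))
    return positive, negative


# Tiny made-up example: read1 maps, read2 is flagged unmapped,
# and read3 does not appear in the SAM data at all.
sam = io.StringIO(
    "@HD\tVN:1.0\tSO:coordinate\n"
    "read1\t0\tchrM\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII\n"
)
fasta = io.StringIO(">read1\nACGT\n>read2\nGGCC\n>read3\nTTAA\n")
mapped = mapped_read_names(sam)
positive, negative = split_by_mapping(fasta, mapped)
```

In this sketch a read that is absent from the SAM data is grouped with the
unmapped reads, which suits the contaminant-removal use case.
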
This tool is available from the Galaxy Tool Shed at:

* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_mapping

See also related tools:

* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id
* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename


Automated Installation
======================

This should be straightforward using the Galaxy Tool Shed, which should be
able to automatically install the dependencies on Biopython and samtools,
then install this tool and run its unit tests.


Manual Installation
===================

There are just two files to install to use this tool from within Galaxy:

* ``seq_filter_by_mapping.py`` (the Python script)
* ``seq_filter_by_mapping.xml`` (the Galaxy tool definition)

The suggested location is a dedicated ``tools/seq_filter_by_mapping/`` folder.

You will also need to modify the ``tool_conf.xml`` file to tell Galaxy to offer the
tool. One suggested location is in the filters section. Simply add the line::

    <tool file="seq_filter_by_mapping/seq_filter_by_mapping.xml" />

If you wish to run the unit tests, also move/copy the ``test-data/`` files
under Galaxy's ``test-data/`` folder. Then::

    $ ./run_tests.sh -id seq_filter_by_mapping

You will also need to install Biopython 1.54 or later. That's it.


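The Biopython requirement can be checked from Python before running the tool.
The following is only a sketch: the helper names are hypothetical, and it
assumes the installed version is exposed as ``Bio.__version__`` in a dotted
form such as ``1.54`` (any trailing letters, as in ``1.54b``, are ignored).

```python
def version_tuple(text):
    """Turn a version string such as '1.54' or '1.54b' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break  # stop at the first non-digit, e.g. the 'b' in '54b'
            digits += char
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def biopython_is_recent_enough(minimum="1.54"):
    """Return True if Biopython is importable and at least the given version."""
    try:
        import Bio
    except ImportError:
        return False  # Biopython is not installed at all
    return version_tuple(Bio.__version__) >= version_tuple(minimum)


if not biopython_is_recent_enough():
    print("Biopython 1.54 or later does not appear to be installed")
```

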
History
=======

======= ======================================================================
Version Changes
------- ----------------------------------------------------------------------
v0.0.1  - Initial version.
v0.0.2  - Fixed some error messages.
v0.0.3  - Report counts for FASTQ as done for FASTA and SFF files.
v0.0.4  - Use the ``format_source=...`` tag.
        - Reorder XML elements (internal change only).
        - Planemo for Tool Shed upload (``.shed.yml``, internal change only).
======= ======================================================================


Developers
==========

Development is on this GitHub repository:
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_mapping

Much of the code was copied from my older tool:
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_id

For pushing a release to the test or main "Galaxy Tool Shed", use the following
Planemo commands (which require that you have set your Tool Shed access details
in ``~/.planemo.yml`` and that you have access rights on the Tool Shed)::

    $ planemo shed_upload --shed_target testtoolshed --check_diff ~/repositories/pico_galaxy/tools/seq_filter_by_mapping/
    ...

or::

    $ planemo shed_upload --shed_target toolshed --check_diff ~/repositories/pico_galaxy/tools/seq_filter_by_mapping/
    ...

To just build and check the tarball, use::

    $ planemo shed_upload --tar_only ~/repositories/pico_galaxy/tools/seq_filter_by_mapping/
    ...
    $ tar -tzf shed_upload.tar.gz
    test-data/SRR639755_mito_pairs.fastq.gz
    test-data/SRR639755_sample_by_coord.sam
    test-data/SRR639755_sample_lax.fastq
    test-data/SRR639755_sample_strict.fastq
    tools/seq_filter_by_mapping/README.rst
    tools/seq_filter_by_mapping/seq_filter_by_mapping.py
    tools/seq_filter_by_mapping/seq_filter_by_mapping.xml
    tools/seq_filter_by_mapping/tool_dependencies.xml


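The tarball check can also be scripted rather than read by eye. The sketch
below uses Python's standard ``tarfile`` module; the helper name is
hypothetical, and the expected file list is copied from the ``tar -tzf``
listing in this section, so it would need updating if the tool's files change.

```python
import tarfile

# File names copied from the tarball listing shown in this README.
EXPECTED_FILES = [
    "test-data/SRR639755_mito_pairs.fastq.gz",
    "test-data/SRR639755_sample_by_coord.sam",
    "test-data/SRR639755_sample_lax.fastq",
    "test-data/SRR639755_sample_strict.fastq",
    "tools/seq_filter_by_mapping/README.rst",
    "tools/seq_filter_by_mapping/seq_filter_by_mapping.py",
    "tools/seq_filter_by_mapping/seq_filter_by_mapping.xml",
    "tools/seq_filter_by_mapping/tool_dependencies.xml",
]


def missing_from_tarball(path, expected):
    """Return the expected member names that are absent from a .tar.gz file."""
    with tarfile.open(path, "r:gz") as archive:
        present = set(archive.getnames())
    return [name for name in expected if name not in present]


# Example call, once Planemo has written the tarball:
#   missing_from_tarball("shed_upload.tar.gz", EXPECTED_FILES)
```

An empty list means every expected file was found in the archive.

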
Licence (MIT)
=============

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.