changeset 5:ee4a90760848 draft
planemo upload for repository https://github.com/COMBAT-TB/tb_variant_filter commit e064fb07acad057d3df849a6f153ed6ef90837f1
author | iuc |
---|---|
date | Sun, 18 Dec 2022 17:39:30 +0000 |
parents | e7aff4a85df5 |
children | 32f14a2723ec |
files | tb_variant_filter.xml |
diffstat | 1 files changed, 17 insertions(+), 14 deletions(-) |
--- a/tb_variant_filter.xml	Sun Aug 29 21:46:17 2021 +0000
+++ b/tb_variant_filter.xml	Sun Dec 18 17:39:30 2022 +0000
@@ -1,7 +1,7 @@
-<tool id="tb_variant_filter" name="TB Variant Filter" version="@TOOL_VERSION@+galaxy2" profile="20.09">
+<tool id="tb_variant_filter" name="TB Variant Filter" version="@TOOL_VERSION@+galaxy0" profile="20.09">
     <description>M. tuberculosis H37Rv VCF filter</description>
     <macros>
-        <token name="@TOOL_VERSION@">0.3.5</token>
+        <token name="@TOOL_VERSION@">0.3.6</token>
     </macros>
     <requirements>
         <requirement type="package" version="@TOOL_VERSION@">tb_variant_filter</requirement>
@@ -12,7 +12,7 @@
             #if str($filter_options.show_filter_options) == "yes":
                 --region_filter $filter_options.region_filter
             #else
-                --region_filter pe_ppe,uvp
+                --region_filter farhat_rlc
             #end if
         #end if
         #if "close_to_indel_filter" in str($filters).split(',')
@@ -57,11 +57,12 @@
             <when value="yes">
                 <param argument="--region_filter" type="select" multiple="true" label="Region filters to enable">
                     <!-- if these are changed the code above needs to change to keep the defaults in line with those that are default here -->
-                    <option value="farhat_rlc">Refined Low Confidence regions from Farhat lab</option>
-                    <option value="pe_ppe" selected="true">PE/PPE</option>
+                    <option value="farhat_rlc" selected="true">Refined Low Confidence regions from Farhat lab</option>
+                    <option value="farhat_rlc_lowmap">Refined Low Confidence and Low Mappability regions from Farhat lab (for &lt; 100bp or single ended reads)</option>
+                    <option value="pe_ppe">PE/PPE</option>
                     <option value="tbprofiler">TBProfiler antibiotic resistant genes</option>
                     <option value="mtbseq">MTBseq antibiotic resistant genes</option>
-                    <option value="uvp" selected="true">UVP repeat / insertion sequence sites</option>
+                    <option value="uvp">UVP repeat / insertion sequence sites</option>
                 </param>
                 <param argument="--indel_window_size" type="integer" value="5" label="Window to mask around indels"/>
                 <param argument="--min_percentage_alt" type="float" value="90"
@@ -145,19 +146,21 @@
 It currently has 5 main modes:
 
 1. Filter by region. Mask out variants in certain regions. Region lists available as:
-    1. Refined Low Confidence (RLC) regions from `Marin et al 2021 <https://www.biorxiv.org/content/10.1101/2021.04.08.438862v1.full>`_
-    2. PE/PPE genes from `Fishbein et al 2015 <https://onlinelibrary.wiley.com/doi/full/10.1111/mmi.12981>`_
-    3. `TBProfiler <http://tbdr.lshtm.ac.uk/>`_ list of antibiotic resistant genes
-    4. `MTBseq <https://github.com/ngs-fzb/MTBseq_source>`_ list of antibiotic resistant genes
-    5. `UVP <https://github.com/CPTR-ReSeqTB/UVP>`_ list of repetitive loci in M. tuberculosis genome
+    1. Refined Low Confidence (RLC) regions from `Marin et al 2022 <https://doi.org/10.1093/bioinformatics/btac023>`_
+    2. Refined Low Confidence (RLC) and Low Mappability regions from `Marin et al 2022 <https://doi.org/10.1093/bioinformatics/btac023>`_
+    3. PE/PPE genes from `Fishbein et al 2015 <https://onlinelibrary.wiley.com/doi/full/10.1111/mmi.12981>`_
+    4. `TBProfiler <http://tbdr.lshtm.ac.uk/>`_ list of antibiotic resistant genes
+    5. `MTBseq <https://github.com/ngs-fzb/MTBseq_source>`_ list of antibiotic resistant genes
+    6. `UVP <https://github.com/CPTR-ReSeqTB/UVP>`_ list of repetitive loci in M. tuberculosis genome
 2. Filter by window around indels. Masks out variants within a certain distance (by default 5 bases) of an insertion or deletion site.
 3. Filter by percentage of alternate allele bases. Mask out variants with less than a minimum percentage (by default 90%) alternative alleles.
 4. Filter by depth of aligned reads.
 5. Filter out all variants that are not SNV (single nucleotide variants).
 
-For region filtering, the default choice is to use the PE/PPE and UVP regions to mask out variants. `Marin et al 2021 <https://www.biorxiv.org/content/10.1101/2021.04.08.438862v1.full>`_
-from Prof Maha Farhat's lab make a persuasive argument that their smaller list of Refined Low Confidence (RLC) regions is a better argument but this work has not yet been peer
-review so it is included as an option that is not currently the default.
+For region filtering, the default choice is to use the RLC regions. These are based on `Marin et al 2022 <https://doi.org/10.1093/bioinformatics/btac023>`_,
+a study of regions of the M. tuberculosis H37Rv genome where Illumina reads don't map well. If you are using reads shower than 100 base pairs
+or single-ended reads, you should use the RLC and Low Mappability region filter. The PE/PPE and UVP region filters are retained for backward compatibility
+but the afore-mentioned paper has shown that they exclude too much of the genome from analysis.
 
 When used together the effects of the filters are added (i.e. a variant is masked out if it is masked by any of the filters).
 ]]></help>
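
The region-filter behaviour described in the updated help text can be sketched in Python. This is a minimal illustration under stated assumptions, not tb_variant_filter's actual implementation: the function names `default_region_filters` and `is_masked`, the `EXAMPLE_REGIONS` table, and all coordinates in it are hypothetical. Only the filter names (`farhat_rlc`, `farhat_rlc_lowmap`, `pe_ppe`, `uvp`) come from the wrapper. The sketch shows the two rules stated in the help: which region list the text recommends for a given read type, and that combining filters masks a variant if any one of them masks it.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: regions are 1-based, inclusive (start, end) pairs.
Region = Tuple[int, int]

# Made-up coordinates for illustration only; these are NOT the real region lists.
EXAMPLE_REGIONS: Dict[str, List[Region]] = {
    "farhat_rlc": [(100, 200)],
    "farhat_rlc_lowmap": [(100, 200), (500, 650)],
    "pe_ppe": [(150, 400)],
    "uvp": [(900, 950)],
}


def default_region_filters(read_length: int, paired_end: bool) -> List[str]:
    """Pick a region filter following the guidance in the help text.

    Reads shorter than 100 bp, or single-ended reads, get the RLC plus
    Low Mappability list; everything else gets the plain RLC list.
    """
    if read_length < 100 or not paired_end:
        return ["farhat_rlc_lowmap"]
    return ["farhat_rlc"]


def is_masked(position: int, filters: Iterable[str]) -> bool:
    """Return True if ANY enabled region list contains the position."""
    return any(
        start <= position <= end
        for name in filters
        for start, end in EXAMPLE_REGIONS[name]
    )
```

With these made-up regions, position 300 is not masked by `farhat_rlc` alone but is masked once `pe_ppe` is also enabled, which is the sense in which the effects of the filters are added.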