Mercurial > repos > bgruening > hicexplorer_hiccorrectmatrix
view hicCorrectMatrix.xml @ 8:f7d344dacfeb draft
planemo upload for repository https://github.com/maxplanck-ie/HiCExplorer/tree/master/galaxy/wrapper/ commit 0456f085bac2c88b8cbddfcf12b02776d2a0d457
author | iuc |
---|---|
date | Wed, 07 Mar 2018 03:24:24 -0500 |
parents | 1612e9458940 |
children | ac80bd0a96ca |
line wrap: on
line source
<tool id="hicexplorer_hiccorrectmatrix" name="@BINARY@" version="@WRAPPER_VERSION@.0"> <description>Runs Dekker's iterative correction over a hic matrix.</description> <macros> <token name="@BINARY@">hicCorrectMatrix</token> <import>macros.xml</import> </macros> <expand macro="requirements" /> <command detect_errors="exit_code"><![CDATA[ hicCorrectMatrix $mode.mode_selector --matrix '$matrix_h5_cooler' ## special: --chromosomes is optional, but if given needs at least one argument #set chroms = '" "'.join([ str($var.chromosome) for $var in $chromosomes ]) #if chroms --chromosomes "$chroms" #end if #if $mode.mode_selector == 'correct': --iterNum $mode.iterNum --outFileName matrix.$mode.outputFormat #if $mode.filterThreshold_low and $mode.filterThreshold_large: --filterThreshold $mode.filterThreshold_low $mode.filterThreshold_large #end if #if $mode.inflationCutoff: --inflationCutoff $mode.inflationCutoff #end if #if $mode.transCutoff: --transCutoff $mode.transCutoff #end if #if $mode.sequencedCountCutoff: --sequencedCountCutoff $mode.sequencedCountCutoff #end if $mode.skipDiagonal $mode.perchr #elif $mode.mode_selector == 'merge_failed': --plotName diagnostic_plot.png --outMatrixFile corrected_matrix.npz.h5 #if $mode.xMax: --xMax $mode.xMax #end if #if $mode.filterThreshold_low and $mode.filterThreshold_large: --filterThreshold '$mode.filterThreshold_low' '$mode.filterThreshold_large' #end if #else: --plotName diagnostic_plot.png #if $mode.xMax: --xMax $mode.xMax #end if #end if #if $mode.mode_selector == 'correct': && mv matrix.$mode.outputFormat matrix #end if ]]> </command> <inputs> <expand macro='matrix_h5_cooler_macro' /> <conditional name="mode"> <param name="mode_selector" type="select" label="Range restriction (in bp)" argument="--range"> <option value="diagnostic_plot">Diagnostic plot</option> <option value="correct">Correct matrix</option> </param> <when value="diagnostic_plot"> <expand macro="xMax" /> </when> <when value="correct"> <param argument="--iterNum" name="iterNum" type="integer" optional="true" value="500" label="Number of iterations" /> <param argument="--inflationCutoff" name="inflationCutoff" type="float" optional="true" label="Inflation cutoff" value="" help="Value corresponding to the maximum number of times a bin can be scaled up during the iterative correction. For example, a inflationCutoff of 3 will filter out all bins that were expanded 3 times or more during the iterative correction."/> <param argument="--transCutoff" name="transCutoff" type="float" optional="true" label="Trans region cutoff" value="" help="Clip high counts in the top -transcut trans regions (i.e. between chromosomes). A usual value is 0.05."/> <param argument="--sequencedCountCutoff" name="sequencedCountCutoff" optional="true" type="float" label="Sequenced count cutoff" help="Each bin receives a value indicating the fraction that is covered by reads. A cutoff of 0.5 will discard all those bins that have less than half of the bin covered."/> <param argument="--skipDiagonal" name="skipDiagonal" type="boolean" truevalue="--skipDiagonal" falsevalue="" checked="false" label="Skip diagonal counts"/> <param argument="--perchr" name="perchr" type="boolean" truevalue="--perchr" falsevalue="" checked="false" label="Normalize each chromosome separately" /> <expand macro="filterThreshold" /> <param name='outputFormat' type='select' label="Output file format"> <option value='h5'>HiCExplorer format</option> <option value="cool">cool</option> </param> </when> </conditional> <repeat name="chromosomes" min="0" title="Include chromosomes" help="List of chromosomes to be included in the iterative correction. The order of the given chromosomes will be kept for the resulting corrected matrix"> <param name="chromosome" type="text" value="" > <validator type="empty_field" /> </param> </repeat> </inputs> <outputs> <data name="outFileName" from_work_dir="matrix" format="h5"> <change_format> <when input="mode.outputFormat" value="cool" format="cool" /> </change_format> <filter>mode['mode_selector'] == "correct"</filter> </data> <data name="diagnostic_plot" from_work_dir="diagnostic_plot.png" format="png"> <filter>mode['mode_selector'] == "diagnostic_plot"</filter> </data> </outputs> <tests> <test> <param name="matrix_h5_cooler" value="small_test_matrix.h5"/> <param name="mode_selector" value="correct"/> <repeat name="chromosomes"> <param name="chromosome" value="chrUextra"/> </repeat> <repeat name="chromosomes"> <param name="chromosome" value="chr3LHet"/> </repeat> <param name='outputFormat' value='h5'/> <output name="outFileName" file="hicCorrectMatrix_result1.npz.h5" ftype="h5" compare="sim_size"/> </test> <test> <param name="matrix_h5_cooler" value="small_test_matrix.h5"/> <param name="mode_selector" value="diagnostic_plot"/> <repeat name="chromosomes"> <param name="chromosome" value="chrUextra"/> </repeat> <repeat name="chromosomes"> <param name="chromosome" value="chr3LHet"/> </repeat> <output name="diagnostic_plot" file="diagnostic_plot.png" ftype="png" compare="sim_size"/> </test> </tests> <help><![CDATA[ Matrix correction ================== ``hicCorrectMatrix`` runs Dekker's iterative correction over a Hi-C matrix (`Imakaev 2012`_.). For correcting the matrix, it is important to remove the unassembled scaffolds (e.g. `NT_`), mitochondrial DNA and Y chromosome and keep only chromosomes, as scaffolds create problems with matrix correction. Therefore we use the chromosome names (1-19, X, Y) here. **Important**: Use ‘chr1 chr2 chr3 etc.’ if your genome index uses chromosome names with the ‘chr’ prefix. Matrix correction works in two steps: first a histogram containing the sum of contact per bin (row sum) is produced. This plot needs to be inspected to decide the best threshold for removing bins with lower number of reads. The second steps removes the low scoring bins and does the correction. Input ----- Diagnostic plot ~~~~~~~~~~~~~~~~ Plots a histogram of the coverage per bin together with the modified z-score based on the median absolute deviation method. See Boris Iglewicz and David Hoaglin 1993, Volume 16: How to Detect and Handle Outliers The ASQC Basic References in Quality Control: Statistical Techniques, Edward F. Mykytka, Ph.D., Editor. Parameters __________ - the contact matrix - Max value for the x-axis in counts per bin - include chromosomes Correct ~~~~~~~ Run the iterative correction. Parameters __________ - number of iterations - inflation cutoff - trans region cutoff - sequenced count cutoff - skip diagonal counts - normalize each chromosome separately - remove bins of low coverage - remove bins of large coverage - include chromosomes Output ------ Diagnostic plot: .. image:: $PATH_TO_IMAGES/diagnostic_plot.png :width: 70% Correct: - the corrected contact matrix | For more information about HiCExplorer please consider our documentation on readthedocs.io_ .. _readthedocs.io: http://hicexplorer.readthedocs.io/en/latest/index.html .. _`Imakaev 2012`: http://doi.org/doi:10.1038/nmeth.2148 ]]></help> <expand macro="citations" /> </tool>