comparison hicDetectLoops.xml @ 2:ce8d154fc6b5 draft
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/hicexplorer commit 4b602d427e0fc0da5353a4510798349de98e4ae4"
author | iuc |
---|---|
date | Wed, 11 Mar 2020 17:08:20 -0400 |
parents | 2d1f9759fa0e |
children | 1119cdd14ddb |
1:5759cd7db0d1 | 2:ce8d154fc6b5 |
---|---|
45 #set $chromosome = ' '.join([ '\'%s\'' % $chrom for $chrom in str($chromosomes).split(' ') ]) | 45 #set $chromosome = ' '.join([ '\'%s\'' % $chrom for $chrom in str($chromosomes).split(' ') ]) |
46 --chromosomes $chromosome | 46 --chromosomes $chromosome |
47 #end if | 47 #end if |
48 | 48 |
49 --statisticalTest $statisticalTest_selector | 49 --statisticalTest $statisticalTest_selector |
50 | 50 |
51 --outFileName output_loop.bedgraph | 51 --outFileName output_loop.bedgraph |
52 | 52 |
53 --threads @THREADS@ | 53 --threads @THREADS@ -tpc @THREADS@ |
54 ]]> | 54 ]]> |
55 </command> | 55 </command> |
56 <inputs> | 56 <inputs> |
57 <expand macro="matrix_h5_cooler_macro" /> | 57 <expand macro="matrix_h5_cooler_macro" /> |
58 <param argument="--peakWidth" type="integer" optional='true' label="Peak width" help= "The width of the peak region in bins. The square around the peak will include (2 * peakWidth)^2 bins." /> | 58 <param argument="--peakWidth" type="integer" optional='true' label="Peak width" help= "The width of the peak region in bins. The square around the peak will include (2 * peakWidth)^2 bins." /> |
59 <param argument="--windowSize" type="integer" optional='true' label="Window size" help= "The window size for the neighborhood region the peak is located in. All values from this region (exclude the values from the peak | 59 <param argument="--windowSize" type="integer" optional='true' label="Window size" help= "The window size for the neighborhood region the peak is located in. All values from this region (exclude the values from the peak |
60 region) are tested against the peak region for significant difference. The square will have the size of (2 * windowSize)^2 bins" /> | 60 region) are tested against the peak region for significant difference. The square will have the size of (2 * windowSize)^2 bins" /> |
61 <param argument="--pValuePreselection" type="float" label="P-value preselection" help= "Only candidates with p-values less than the given threshold will be considered. | 61 <param argument="--pValuePreselection" type="float" label="P-value preselection" help= "Only candidates with p-values less than the given threshold will be considered. |
62 For each genomic distance a negative binomial distribution is fitted and for each pixel a p-value given by the cumulative density function is given. | 62 For each genomic distance a negative binomial distribution is fitted and for each pixel a p-value given by the cumulative density function is given. |
63 This does NOT influence the p-value for the neighborhood testing." value='0.05'/> | 63 This does NOT influence the p-value for the neighborhood testing." value='0.05'/> |
64 <param argument="--peakInteractionsThreshold" type="integer" label="Minimum interaction number" help= "The minimum number of interactions a detected peak needs to have to be considered." value='5' /> | 64 <param argument="--peakInteractionsThreshold" type="integer" label="Minimum interaction number" help= "The minimum number of interactions a detected peak needs to have to be considered." value='5' /> |
65 <param argument="--maximumInteractionPercentageThreshold" type="float" value='0.1' label="Maximum interaction share" help= "For each genomic distance the maximum value is considered and all candidates need to have at least \'max_value * maximumInteractionPercentageThreshold\' interactions." /> | 65 <param argument="--maximumInteractionPercentageThreshold" type="float" value='0.1' label="Maximum interaction share" help= "For each genomic distance the maximum value is considered and all candidates need to have at least \'max_value * maximumInteractionPercentageThreshold\' interactions." /> |
66 <param argument="--pValue" type="float" label="P-value" help= "Rejection level for the statistical test for H0. H0 is peak region and background have the same distribution." value='0.05'/> | 66 <param argument="--pValue" type="float" label="P-value" help= "Rejection level for the statistical test for H0. H0 is peak region and background have the same distribution." value='0.05'/> |
67 <param argument="--maxLoopDistance" optional='true' type="integer" label="Maximal loop distance" help= "Maximum genomic distance of a loop, usually loops are within a distance of ~2MB." value='2000000'/> | 67 <param argument="--maxLoopDistance" optional='true' type="integer" label="Maximal loop distance" help= "Maximum genomic distance of a loop, usually loops are within a distance of ~2MB." value='2000000'/> |
71 <param name="statisticalTest_selector" type="select" label="Statistical test"> | 71 <param name="statisticalTest_selector" type="select" label="Statistical test"> |
72 <option value="wilcoxon-rank-sum" selected="True">Wilcoxon rank-sum</option> | 72 <option value="wilcoxon-rank-sum" selected="True">Wilcoxon rank-sum</option> |
73 <option value="anderson-darling">Anderson-Darling</option> | 73 <option value="anderson-darling">Anderson-Darling</option> |
74 </param> | 74 </param> |
75 </inputs> | 75 </inputs> |
76 <outputs> | 76 <outputs> |
77 <data name='output_loops' from_work_dir='output_loop.bedgraph' format='bedgraph' label='Computed loops'/> | 77 <data name='output_loops' from_work_dir='output_loop.bedgraph' format='bedgraph' label='Computed loops'/> |
78 </outputs> | 78 </outputs> |
79 <tests> | 79 <tests> |
80 <test> | 80 <test> |
81 <param name="matrix_h5_cooler" value="small_test_matrix.cool"/> | 81 <param name="matrix_h5_cooler" value="small_test_matrix.cool"/> |
93 Loop detection | 93 Loop detection |
94 ============== | 94 ============== |
95 | 95 |
96 Computes enriched regions (peaks) or long range contacts on the given contact matrix. | 96 Computes enriched regions (peaks) or long range contacts on the given contact matrix. |
97 | 97 |
98 hicDetectLoops can detect enriched interaction regions (peaks / loops) based on a strict candidate selection, negative binomial distributions and Anderson-Darling / Wilcoxon rank-sum tests. | |
99 | |
100 The algorithm was mainly developed on GM12878 cells from Rao 2014 at 10kb and 5kb fixed bin size resolution. | |
101 | |
102 _________________ | |
103 | |
104 Usage | |
105 ----- | |
106 | |
107 A command line example is given below; each option corresponds to a field of the Galaxy form: | |
108 | |
109 `$ hicDetectLoops -m matrix.cool -o loops.bedgraph --maxLoopDistance 2000000 --windowSize 10 --peakWidth 6 --pValuePreselection 0.05 --pValue 0.05 --peakInteractionsThreshold 20 --maximumInteractionPercentageThreshold 0.1 --statisticalTest anderson-darling` | |
110 | |
111 Candidate selection is first restricted by the maximum genomic distance, here 2MB, a value taken from Rao 2014. For each genomic distance a negative binomial distribution is fitted, and only interaction pairs with a p-value less than ``--pValuePreselection`` are accepted. Candidates also need an interaction count of at least ``--maximumInteractionPercentageThreshold`` times the maximum value for their genomic distance. Please note that ``--maximumInteractionPercentageThreshold`` was introduced with HiCExplorer release 3.2; earlier versions do not have this parameter, so their outputs may differ. In a second step, each candidate is compared to its neighborhood, which is defined by the ``--windowSize`` parameter in the x and y dimension. Only one candidate is considered per neighborhood: the one with the highest peak value. As a last step, the neighborhood is split into a peak region and a background region (parameter ``--peakWidth``). The peakWidth can never be larger than the windowSize. For 10kb matrices we recommend a windowSize of 10 and a peakWidth of 6. | |
112 | |
113 The output file (``-o loops.bedgraph``) contains the x and y position of each loop and its corresponding p-value of the Anderson-Darling test. | |
114 | |
115 `1 120000000 122500000 1 145000000 147500000 0.001` | |
116 | |
117 The results can be visualized via hicPlotMatrix: | |
118 | |
119 `$ hicPlotMatrix -m matrix.cool -o plot.png --log1p --region 1:18000000-22000000 --loops loops.bedgraph` | |
120 | |
121 .. image:: $PATH_TO_IMAGES/hicDetectLoops.png | |
122 :width: 50% | |
123 | |
124 | |
98 For more information about HiCExplorer please consider our documentation on readthedocs.io_. | 125 For more information about HiCExplorer please consider our documentation on readthedocs.io_. |
99 | 126 |
100 .. _readthedocs.io: http://hicexplorer.readthedocs.io/en/latest/index.html | 127 .. _readthedocs.io: http://hicexplorer.readthedocs.io/en/latest/index.html |
101 | 128 |
102 ]]></help> | 129 ]]></help> |
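The help text added in this revision describes a preselection rule: for each genomic distance, a candidate needs at least `max_value * maximumInteractionPercentageThreshold` interactions. The sketch below illustrates only that rule on a small dense matrix, combined with a minimum-count check in the spirit of `--peakInteractionsThreshold`. It is not HiCExplorer's implementation; the function name `preselect_by_interaction_share`, its parameters, and the use of bin offsets (`j - i`) as the genomic distance are assumptions made for illustration.

```python
from collections import defaultdict


def preselect_by_interaction_share(matrix, share_threshold=0.1, min_interactions=5):
    """Return upper-triangle pixels (i, j) that pass two illustrative filters.

    Hypothetical helper, not part of HiCExplorer. A pixel is kept when its
    count is at least `min_interactions` and at least `share_threshold` times
    the largest count found at the same bin distance (j - i).
    """
    n = len(matrix)

    # First pass: largest interaction count on each diagonal (bin distance).
    max_per_distance = defaultdict(float)
    for i in range(n):
        for j in range(i, n):
            distance = j - i
            max_per_distance[distance] = max(max_per_distance[distance], matrix[i][j])

    # Second pass: keep pixels that satisfy both thresholds.
    candidates = []
    for i in range(n):
        for j in range(i, n):
            value = matrix[i][j]
            share_cutoff = share_threshold * max_per_distance[j - i]
            if value >= min_interactions and value >= share_cutoff:
                candidates.append((i, j))
    return candidates


# Toy symmetric contact matrix with three bins (made-up counts).
toy_matrix = [
    [100, 20, 3],
    [20, 80, 1],
    [3, 1, 50],
]
candidates = preselect_by_interaction_share(toy_matrix)
```

In the real tool this filter sits alongside the negative binomial p-value preselection and the later neighborhood test, neither of which is modelled here.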