comparison hicCorrectMatrix.xml @ 10:bfa1c014f64a draft
planemo upload for repository https://github.com/maxplanck-ie/HiCExplorer/tree/master/galaxy/wrapper/ commit dddc0b9035b8edadfd45d74b01aeca245c2725d7
author | iuc |
---|---|
date | Fri, 27 Apr 2018 08:38:17 -0400 |
parents | ac80bd0a96ca |
children | 92fc291ceb1a |
comparison
9:ac80bd0a96ca | 10:bfa1c014f64a |
---|---|
187 ------ | 187 ------ |
188 | 188 |
189 Diagnostic plot | 189 Diagnostic plot |
190 _______________ | 190 _______________ |
191 | 191 |
192 The diagnostic plot consists of a bar plot of the contacts coverage per bins size together with the | 192 |
193 modified z-score based on the Median Absolute Deviation (MAD) method. | 193 The goal of the diagnostic plot is to help the user decide on a cutoff threshold that will ignore Hi-C matrix |
194 | 194 bins with few reads assigned to them. The plot is a histogram of the total number of Hi-C reads per matrix bin. |
195 See Boris Iglewicz and David Hoaglin 1993, Volume 16: | 195 A secondary scale based on the mean absolute deviation score, is shown on top of the figure. |
196 How to Detect and Handle Outliers The ASQC Basic References in Quality Control: Statistical Techniques, | 196 This secondary scale aims to offer 'normalized' values that are comparable across samples |
197 Edward F. Mykytka, Ph.D., Editor. | 197 independently of the sequencing depth and the fraction of usable Hi-C reads. In all samples that we have studied, |
198 | 198 the histogram follows a bimodal distribution where the first peak is for bins with zero reads which usually occur |
199 Using this diagnostic plot, a user can decide if values | 199 at repetitive regions. Other low scoring bins tend to be close to repetitive regions. |
200 with a too low (and/or too high) number of contacts in respect to their genomic distance should | 200 Also, low scoring bins can be caused by absence of a restriction site in the bin or because the restriction |
201 be removed from the data before the correction applies. | 201 site is present but the restriction enzyme did not cut. The valley between the two peaks in the |
202 | 202 histogram is set by default as cutoff threshold. |
203 Moreover, the shown distribution should be a Gaussian bell. If it doesn’t follow a Gaussian distribution | 203 However, it is important to revise this as in some cases the selected value could not be correct. |
204 this is an indicator that the used data is of bad quality or that the used contact matrix | |
205 is maybe not the one that should be used. It can happen that users select for example a merge | |
206 matrix with a lower resolution that was previously needed for plotting. In such cases the | |
207 diagnostic plot helps to detect this and prevent the user from running the analysis on a wrong dataset. | |
208 | 204 |
209 | 205 |
210 .. image:: $PATH_TO_IMAGES/diagnostic_plot.png | 206 .. image:: $PATH_TO_IMAGES/diagnostic_plot.png |
211 :width: 50% | 207 :width: 50% |
212 | 208 |
213 On the example plot above, a user can then use the lower threshold defined by the MAD method (black bold bar), or define its own threshold based on the contacts distribution. | 209 On the example plot above, a user can then use the lower threshold defined by the Median Absolute Deviation (MAD) method (black bold bar), or define its own threshold based on the contacts distribution. |
214 | 210 |
215 Correct | 211 Correct |
216 _______ | 212 _______ |
217 | 213 |
218 Run the iterative correction and outputs the corrected matrix. This matrix can then be used with all downstream analysis tools such as ``hicPlotMatrix``, ``hicPlotTADs``, ``hicPlotViewpoint``, ``hicAggregateContacts`` for **visualization of Hi-C data**, ``hicCorrelate``, ``hicPlotDistVsCounts``, ``hicTransform``, ``hicFindTADs``, ``hicPCA`` **for data and scores computation on Hi-C data**. | 214 Run the iterative correction and outputs the corrected matrix. This matrix can then be used with all downstream analysis tools such as ``hicPlotMatrix``, ``hicPlotTADs``, ``hicPlotViewpoint``, ``hicAggregateContacts`` for **visualization of Hi-C data**, ``hicCorrelate``, ``hicPlotDistVsCounts``, ``hicTransform``, ``hicFindTADs``, ``hicPCA`` **for data and scores computation on Hi-C data**. |
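
The help text in this comparison refers to a lower threshold derived from the Median Absolute Deviation (MAD). As a rough illustration only, the modified z-score of Iglewicz and Hoaglin, 0.6745 * (x - median) / MAD, can be computed from per-bin read totals as sketched below. The function names, the toy coverage values and the cutoff of -2.0 are assumptions made for this example; this is not HiCExplorer's own code, and the tool may choose its default threshold differently.

```python
from statistics import median


def modified_z_scores(coverage):
    """Return the modified z-score of each bin: 0.6745 * (x - median) / MAD.

    `coverage` is a list with the total number of reads per matrix bin.
    (Hypothetical helper for illustration, not part of HiCExplorer.)
    """
    med = median(coverage)
    # MAD: median of the absolute deviations from the median.
    mad = median(abs(x - med) for x in coverage)
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return [0.6745 * (x - med) / mad for x in coverage]


def bins_below_threshold(coverage, lower_threshold):
    """Return the indices of bins whose modified z-score is below the cutoff."""
    scores = modified_z_scores(coverage)
    return [i for i, score in enumerate(scores) if score < lower_threshold]


# Toy per-bin read totals (made up): two nearly empty bins, one very deep bin.
coverage = [0, 2, 95, 100, 102, 105, 110, 98, 101, 300]

# The cutoff of -2.0 is an arbitrary value chosen for this example.
low_bins = bins_below_threshold(coverage, lower_threshold=-2.0)
print(low_bins)
```

Because the score is scaled by the median and the MAD rather than by raw counts, the same cutoff can be compared across samples with different sequencing depth, which is the purpose the help text gives for the secondary scale.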