diff README.md @ 0:2fbbc1be6655 draft
planemo upload for repository https://github.com/jackh726/bigtools commit ce6b9f638ebcebcad5a5b10219f252962f30e5cc-dirty
| author | fubar |
|---|---|
| date | Mon, 01 Jul 2024 00:53:01 +0000 |
| parents | |
| children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Mon Jul 01 00:53:01 2024 +0000
@@ -0,0 +1,31 @@
+## bigwig peak outlier to bed
+
+### July 30 2024 for the VGP
+
+This code will soon become a Galaxy tool, for building some of the [NIH MARBL T2T assembly polishing](https://github.com/marbl/training) tools as Galaxy workflows.
+
+The next JBrowse2 tool release will include a plugin for optional colours to distinguish bed features, shown being tested in the screenshots below.
+
+### Find and mark BigWig peaks to a bed file for display
+
+In the spirit of DeepTools, this finds contiguous regions where the bigwig value is either above or below a given quantile,
+0.99 and 0.01 for example. These quantile cut points are found and applied over each chromosome using some [cunning numpy code](http://gregoryzynda.com/python/numpy/contiguous/interval/2019/11/29/contiguous-regions.html)
+
+[image]
+
+[image]
+
+Big differences between chromosomes 14, 15, 21, 22 and Y in this "all contigs" view - explanations welcomed:
+
+[image]
+
+
+[pybigtools](https://github.com/jackh726/bigtools) is used for the bigwig interface. Optionally,
+multiple bigwigs can be processed into a single bed - the bed features have the bigwig name in the label for viewing.
+
+### Note on quantiles per chromosome rather than quantiles for the whole bigwig
+
+It is just not feasible to hold all contigs of the entire decoded bigwig in RAM to estimate quantiles. It may be
+better to sample across all chromosomes so as not to lose any systematic differences between them - the current
+per-chromosome method will unfortunately hide those differences. Looking at the actual quantile values across a couple of test bigwigs suggests that
+there is not much variation between chromosomes, but there is now a tabular report to check them for each input bigwig.
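The per-chromosome approach described in the README can be sketched roughly as follows. This is a minimal illustration and not the tool's actual code: the function names `contiguous_regions` and `outlier_bed_rows`, the `label_high`/`label_low` naming of features, and the synthetic signal are all assumptions made for the example. A real run would read each chromosome's values through the bigwig interface instead. Runs are found with a pad-and-diff trick on a boolean mask, and the cut points come from `np.nanquantile` so that missing (NaN) positions are ignored.

```python
import numpy as np


def contiguous_regions(mask):
    """Return half-open (start, end) index pairs for each run of True in mask."""
    # Pad with False on both sides so runs touching either edge are detected.
    padded = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    # +1 marks a False->True step (run start), -1 a True->False step (run end).
    steps = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(steps == 1)
    ends = np.flatnonzero(steps == -1)
    return list(zip(starts.tolist(), ends.tolist()))


def outlier_bed_rows(chrom, values, q_low=0.01, q_high=0.99, label="bigwig"):
    """Build BED-like rows for runs strictly above/below this chromosome's cut points."""
    values = np.asarray(values, dtype=float)
    # Cut points are estimated from this chromosome only; NaN positions are skipped.
    low_cut, high_cut = np.nanquantile(values, [q_low, q_high])
    rows = []
    for name, mask in (("high", values > high_cut), ("low", values < low_cut)):
        for start, end in contiguous_regions(mask):
            rows.append((chrom, start, end, f"{label}_{name}"))
    # Sort by start position so the rows are in BED order within the chromosome.
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Synthetic per-base signal standing in for one decoded chromosome.
    signal = np.ones(1000)
    signal[100:105] = 10.0   # a short peak
    signal[500:503] = -10.0  # a short trough
    for row in outlier_bed_rows("chr1", signal):
        print("\t".join(map(str, row)))
```

Because the comparison is strict, positions exactly equal to a cut point are not flagged. Calling `outlier_bed_rows` once per chromosome keeps only one chromosome's values in memory at a time, which is the trade-off the note on per-chromosome quantiles describes.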