Mercurial > repos > petr-novak > re_utils
diff ChipSeqRatioDef.xml @ 9:c2c69c6090f0 draft
Uploaded
author | petr-novak |
---|---|
date | Fri, 31 Jan 2020 06:55:23 -0500 |
parents | 89c5ba120b21 |
children | 5376e1c9adec |
--- a/ChipSeqRatioDef.xml Mon Dec 09 04:14:48 2019 -0500
+++ b/ChipSeqRatioDef.xml Fri Jan 31 06:55:23 2020 -0500
@@ -22,12 +22,12 @@
 
 </command>
 <inputs>
-<param name="ChipFile" label="Chip Sequences" type="data" format="fasta" help="NGS data in fasta format"/>
-<param name="InputFile" label="Input Sequences" type="data" format="fasta" help="NGS data in fasta format"/>
-<param name="ContigFile" label="Reference - Contig Sequences" type="data" format="fasta"
-help="Contigs obtained from RepeatExplorer clustering pipeline in fasta file"/>
-<param name="MaxCl" label="Number of clusters to be shown in graph" type="integer" value="200"/>
-<param name="bitscore" label="Minimum bit score threshold" type="integer" value="50" help="All similarity hits with lower bit score will not be considered for ChIP/Input ratio calculation"/>
+<param name="ChipFile" label="Chip reads" type="data" format="fasta" help="Reads in FASTA format"/>
+<param name="InputFile" label="Input reads" type="data" format="fasta" help="Reads in FASTA format"/>
+<param name="ContigFile" label="Reference - contig sequences" type="data" format="fasta"
+help="Contigs from RepeatExplorer clustering (the file "contigs.fasta")"/>
+<param name="MaxCl" label="Number of top clusters to be shown in graph" type="integer" value="200"/>
+<param name="bitscore" label="Bit score threshold" type="integer" value="50" help="Similarity hits with lower bit score will not be used for ChIP/Input ratio calculation"/>
 </inputs>
 <outputs>
 <data name="OutputFile" format="tabular"
@@ -40,20 +40,22 @@
 <help>
 **What it does**
 
-Analysis of NGS sequences from Chromatin Imunoprecipitation. ChiP
-and Input reads are mapped to contigs obtained from graph based
-repetitive sequence clustering(`Novak et al. 2013`__) to enriched repeats. Reads from input
-and ChIP should be ideally short illumina reads with uniform length
-above 80 nt. It is sufficiant to use about 1 milion of reads for both Input and Chip.
+The ChIP-seq Mapper evaluates the enrichment of repetitive sequences in sequencing data from chromatin
+immunoprecipitation experiments, using repeats identified by RepeatExplorer as the reference. The tool
+performs BLASTN similarity search of the read sequences to the reference,
+and the reads producing hits that passed the user-specified similarity threshold are assigned to the
+repeat clusters. The assignment is made to the cluster that produced the best similarity hit, and every
+read is assigned to only a single cluster. Following read mapping, the numbers of reads from the
+INPUT and ChIP samples are evaluated, and ChIP/INPUT ratios of the normalized read counts are reported
+for individual clusters.
+ChIP and INPUT reads should be of uniform lengths of at least 40 nt. The bit score threshold value should be
+adjusted based on the length of the analyzed reads (the value equal to the read length is recommended for a start).
 
 This method was first used in (`Neumann et al. 2012`__) for
-identification of repetitive sequences associated with cetromeric
-region. If you use this method, reference:
+identification of repetitive sequences associated with centromeres:
 
 `PLoS Genet. Epub 2012 Jun 21. Stretching the rules: monocentric chromosomes with multiple centromere domains. Neumann P, Navrátilová A, Schroeder-Reiter E, Koblížková A, Steinbauerová V, Chocholová E, Novák P, Wanner G, Macas J.`__.
 
-.. __: http://bioinformatics.oxfordjournals.org/content/29/6/792.full
-
 .. __: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002777
 
 .. __: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002777
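
The read-assignment and ratio steps described in the new help text can be illustrated with a minimal Python sketch. It is not the tool's actual code. It assumes that BLASTN hits are already available as (read, contig, bit score) tuples, that contig names follow a hypothetical "CL<n>Contig<m>" pattern from which the cluster is taken, and that read counts are normalized by the total number of reads in each sample (the help text does not say how normalization is done). All function names are made up for illustration.

```python
from collections import Counter


def cluster_of(contig_id):
    # Assumption: contig names look like "CL12Contig3"; the cluster is the
    # part before "Contig". The real naming scheme may differ.
    return contig_id.split("Contig")[0]


def count_reads_per_cluster(hits, min_bitscore):
    """Assign each read to the cluster of its best hit and count reads.

    hits: iterable of (read_id, contig_id, bitscore) tuples.
    Hits with a bit score below min_bitscore are ignored, and every read
    is counted for a single cluster only.
    """
    best = {}  # read_id -> (bitscore, cluster) of the best hit seen so far
    for read_id, contig_id, bitscore in hits:
        if bitscore < min_bitscore:
            continue
        if read_id not in best or bitscore > best[read_id][0]:
            best[read_id] = (bitscore, cluster_of(contig_id))
    return Counter(cluster for _, cluster in best.values())


def chip_input_ratios(chip_counts, input_counts, n_chip, n_input):
    """ChIP/INPUT ratio of read counts normalized by sample size.

    Assumption: normalization divides by the total reads per sample.
    Clusters without INPUT reads are skipped to avoid division by zero.
    """
    ratios = {}
    for cluster, n_in in input_counts.items():
        if n_in == 0:
            continue
        chip_norm = chip_counts.get(cluster, 0) / n_chip
        input_norm = n_in / n_input
        ratios[cluster] = chip_norm / input_norm
    return ratios


# Toy data, invented for illustration only.
chip_hits = [
    ("r1", "CL1Contig1", 60.0),
    ("r1", "CL2Contig4", 80.0),  # better hit, so r1 goes to CL2
    ("r2", "CL1Contig2", 40.0),  # below threshold, ignored
    ("r3", "CL1Contig1", 55.0),
]
input_hits = [
    ("i1", "CL1Contig1", 70.0),
    ("i2", "CL1Contig3", 90.0),
    ("i3", "CL2Contig4", 50.0),
]
chip_counts = count_reads_per_cluster(chip_hits, min_bitscore=50)
input_counts = count_reads_per_cluster(input_hits, min_bitscore=50)
ratios = chip_input_ratios(chip_counts, input_counts, n_chip=4, n_input=4)
```

A hit whose bit score equals the threshold is kept here, matching the wording "hits with lower bit score will not be used".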