diff ChipSeqRatioDef.xml @ 9:c2c69c6090f0 draft
Uploaded
author: petr-novak
date: Fri, 31 Jan 2020 06:55:23 -0500
parents: 89c5ba120b21
children: 5376e1c9adec
--- a/ChipSeqRatioDef.xml	Mon Dec 09 04:14:48 2019 -0500
+++ b/ChipSeqRatioDef.xml	Fri Jan 31 06:55:23 2020 -0500
@@ -22,12 +22,12 @@
     </command>
 
     <inputs>
-        <param name="ChipFile" label="Chip Sequences" type="data" format="fasta" help="NGS data in fasta format"/> 
-	<param name="InputFile" label="Input Sequences" type="data" format="fasta" help="NGS data in fasta format"/>
-	<param name="ContigFile" label="Reference - Contig Sequences" type="data" format="fasta"
-	       help="Contigs obtained from RepeatExplorer clustering pipeline in fasta file"/> 
-	<param name="MaxCl" label="Number of clusters to be shown in graph" type="integer" value="200"/>   
-	<param name="bitscore" label="Minimum bit score threshold" type="integer" value="50" help="All similarity hits with lower bit score will not be considered for ChIP/Input ratio calculation"/>   
+        <param name="ChipFile" label="Chip reads" type="data" format="fasta" help="Reads in FASTA format"/> 
+	<param name="InputFile" label="Input reads" type="data" format="fasta" help="Reads in FASTA format"/>
+	<param name="ContigFile" label="Reference - contig sequences" type="data" format="fasta"
+	       help="Contigs from RepeatExplorer clustering (the file &quot;contigs.fasta&quot;)"/> 
+	<param name="MaxCl" label="Number of top clusters to be shown in graph" type="integer" value="200"/>   
+	<param name="bitscore" label="Bit score threshold" type="integer" value="50" help="Similarity hits with lower bit score will not be used for ChIP/Input ratio calculation"/>   
     </inputs>
     <outputs>
     	<data name="OutputFile" format="tabular"
@@ -40,20 +40,22 @@
     <help>
 **What it does**
 
-Analysis of NGS sequences from Chromatin Imunoprecipitation. ChiP
-and Input reads are mapped to contigs obtained from graph based
-repetitive sequence clustering(`Novak et al. 2013`__) to enriched repeats. Reads from input
-and ChIP should be ideally short illumina reads with uniform length
-above 80 nt. It is sufficiant to use about 1 milion of reads for both Input and Chip.
+The ChIP-seq Mapper evaluates the enrichment of repetitive sequences in sequencing data from chromatin 
+immunoprecipitation experiments, using repeats identified by RepeatExplorer as the reference. The tool 
+performs BLASTN similarity search of the read sequences to the reference, 
+and the reads producing hits that passed the user-specified similarity threshold are assigned to the 
+repeat clusters. The assignment is made to the cluster that produced the best similarity hit, and every 
+read is assigned to only a single cluster. Following read mapping, the numbers of reads from the 
+INPUT and ChIP samples are evaluated, and ChIP/INPUT ratios of the normalized read counts are reported 
+for individual clusters.
+ChIP and INPUT reads should be of uniform lengths of at least 40 nt. The bit score threshold value should be
+adjusted based on the length of the analyzed reads (the value equal to the read length is recommended for a start). 
 This method was first used in (`Neumann et al. 2012`__) for
-identification of repetitive sequences associated with cetromeric
-region. If you use this method, reference:
+identification of repetitive sequences associated with centromeres:
 
 
 `PLoS Genet. Epub 2012 Jun 21. Stretching the rules: monocentric chromosomes with multiple centromere domains. Neumann P, Navrátilová A, Schroeder-Reiter E, Koblížková A, Steinbauerová V, Chocholová E, Novák P, Wanner G, Macas J.`__.
 
-.. __: http://bioinformatics.oxfordjournals.org/content/29/6/792.full
- 
 .. __: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002777
 .. __: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1002777
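
Note on the revised help text: the mapping and ratio steps it describes can be illustrated with a minimal Python sketch. This is not the tool's actual code. The function names, the (read_id, cluster_id, bitscore) tuple layout, and the tie-breaking rule are hypothetical. In particular, the help text says only that read counts are "normalized"; dividing by the total number of reads in each sample is an assumption made here for illustration.

```python
from collections import Counter


def best_hit_per_read(hits, min_bitscore):
    """Assign each read to the single cluster with its best-scoring hit.

    hits: iterable of (read_id, cluster_id, bitscore) tuples, e.g. parsed
    from tabular BLASTN output (hypothetical layout).
    Hits with a bit score below min_bitscore are ignored.
    On ties, the first hit seen wins (an arbitrary choice for this sketch).
    """
    best = {}
    for read_id, cluster_id, bitscore in hits:
        if bitscore < min_bitscore:
            continue
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (cluster_id, bitscore)
    return {read_id: cluster for read_id, (cluster, _) in best.items()}


def cluster_ratios(chip_hits, input_hits, n_chip, n_input, min_bitscore=50):
    """Return {cluster_id: ChIP/Input ratio of normalized read counts}.

    n_chip and n_input are the total read counts of the two samples;
    normalizing by them is an assumption, not taken from the tool.
    """
    chip_counts = Counter(best_hit_per_read(chip_hits, min_bitscore).values())
    input_counts = Counter(best_hit_per_read(input_hits, min_bitscore).values())
    ratios = {}
    for cluster in set(chip_counts) | set(input_counts):
        chip_norm = chip_counts[cluster] / n_chip
        input_norm = input_counts[cluster] / n_input
        # A cluster with no Input reads has an undefined ratio; report inf.
        ratios[cluster] = chip_norm / input_norm if input_norm > 0 else float("inf")
    return ratios
```

With four ChIP reads and two Input reads, a call such as cluster_ratios(chip_hits, input_hits, n_chip=4, n_input=2) returns one ratio per cluster that received at least one read from either sample.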