# HG changeset patch
# User rnateam
# Date 1404839070 14400
# Node ID f973ec6e5192ab06f5a12696ce1f3d0874e11e60
# Parent a92d8689aa30be7572d476ebac5f8c0b254b7da9
Uploaded

diff -r a92d8689aa30 -r f973ec6e5192 blockclust.xml
--- a/blockclust.xml Tue Jul 08 11:59:49 2014 -0400
+++ b/blockclust.xml Tue Jul 08 13:04:30 2014 -0400
@@ -1,5 +1,5 @@
- Non-coding RNA clustering from deep sequencing read profiles
+ efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles
 blockclust
 eden
@@ -10,7 +10,6 @@
 echo '1.0'
-
 #if str($tool_mode.operation) == "pre":
 BlockClustPipeLine.pl -m PRE -bam $tool_mode.reads_bam -tbed $tags_bed
 #elif str($tool_mode.operation) == "clust":
@@ -73,7 +72,7 @@
-
+
@@ -119,10 +118,10 @@
 tool_mode["operation"]=="clust"
-
+
 tool_mode["operation"]=="clust"
-
+
 (( tool_mode["operation"] == 'clust' and
@@ -131,7 +130,7 @@
 ))
-
+
 (( tool_mode["operation"] == 'clust' and
@@ -156,26 +155,57 @@
 **What it does**
-Clusters the read profiles (i.e., blockgroups) from the blockbuster tool.
+BlockClust is an efficient approach to detect transcripts with similar
+processing patterns. We propose a novel way to encode expression profiles
+in compact discrete structures, which can then be processed using
+fast graph-kernel techniques. BlockClust allows both clustering and
+classification of small non-coding RNAs.
+
+BlockClust runs in three modes:
+1) Pre-processing - converts given mapped reads (BAM) into BED file of tags
+2) Clustering and classification - of given input block groups (from blockbuster tool) as explained in [1]_
+3) Post-processing - extracts distribution of clusters searched against Rfam database and plots hierarchical clustering made out of centroids of each BlockClust predicted cluster.
+
+For a thorough analysis of your data, we suggest you to use complete blockclust workflow, which contains all three modes of operation.
 **Inputs**
-BlockClust needs output of tool blockbuster as input
+BlockClust input files are dependent on the mode of operation:
+1) Pre-processing mode:
+Binary Sequence Alignment Map (BAM) file
+
+2) Clustering and classification:
+A blockgroups file generated by blockbuster tool
+Select reference genome
+
+3) Post-processing:
+Output of cmsearch, searched clusters generated by BlockClust against Rfam
+BED file containing clusters generated by BlockClust
+Pairwise similarities of blockgroups generated by BlockClust
 **Output**
+1) Pre-processing mode:
+BED file of tags with expressions
-BlockClust produces a fasta file containing clusters.
+2) Clustering and classification:
+Hierarchical clustering plot of all input blockgroups by their similarity
+Pairwise similarities of all input blockgroups
+BED file containing predicted clusters
+BED file containing prediction of blockgroups by pre-compiled SVM binary classification model.
+
+3) Post-processing:
+Distribution of clusters with annotations searched against Rfam database
+hierarchical clustering made out of centroids of each BlockClust predicted cluster
 ------
 **Licenses**
-If **BlockClust** is used to obtain results for scientific publications it
-should be cited as [1]_.
+If **BlockClust** is used to obtain results for scientific publications it should be cited as [1]_.
 **References**
-.. [1]
+[1] Pavankumar Videm, Dominic Rose, Fabrizio Costa, and Rolf Backofen. "BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles." Bioinformatics 30, no. 12 (2014): i274-i282.