Mercurial > repos > rnateam > blockclust
view blockclust.xml @ 2:f973ec6e5192 draft
Uploaded
author | rnateam |
---|---|
date | Tue, 08 Jul 2014 13:04:30 -0400 |
parents | a92d8689aa30 |
children | 27dde42069e0 |
line wrap: on
line source
<tool id="blockclust" name="BlockClust" version="1.0.0"> <description>efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles</description> <requirements> <requirement type="package" version="1.0">blockclust</requirement> <requirement type="package" version="1.1">eden</requirement> <requirement type="package" version="3.0.1">R_3_0_1</requirement> <requirement type="package" version="0.1.19">samtools</requirement> <requirement type="package" version="12.135">mcl</requirement> <requirement type="package" version="1.0">blockclust_rlibs</requirement> </requirements> <version_command>echo '1.0'</version_command> <command> #if str($tool_mode.operation) == "pre": BlockClustPipeLine.pl -m PRE -bam $tool_mode.reads_bam -tbed $tags_bed #elif str($tool_mode.operation) == "clust": #set $outputdir = $clusters.files_path #set $accept_bed=list() #set $reject_bed=list() ## prepare annotations #if str($tool_mode.reference) == "hg19": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/hg19/hg19.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/hg19/hg19.reject.bed") #elif str($tool_mode.reference) == "mm10": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/mm10/mm10.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/mm10/mm10.reject.bed") #elif str($tool_mode.reference) == "dm3": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/dm3/dm3.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/dm3/dm3.reject.bed") #elif str($tool_mode.reference) == "rheMac3": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/rheMac3/rheMac3.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/rheMac3/rheMac3.reject.bed") #elif str($tool_mode.reference) == "panTro4": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/panTro3/panTro4.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/panTro3/panTro4.reject.bed") #elif str($tool_mode.reference) == "xenTro3": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/xenTro3/xenTro3.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/xenTro3/xenTro3.reject.bed") #elif str($tool_mode.reference) == "celWS235": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/celWS235/celWS235.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/celWS235/celWS235.reject.bed") #elif str($tool_mode.reference) == "tair10": $accept_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/tair10/tair10.accept.bed") $reject_bed.append("\$BLOCKCLUST_DATA_PATH/annotations/tair10/tair10.reject.bed") #end if BlockClustPipeLine.pl -m TEST -f SEQUENCE -c \$BLOCKCLUST_DATA_PATH/blockclust.config -t $tool_mode.input_bbo -a #echo ''.join( $accept_bed ) -r #echo ''.join( $reject_bed ) -o $outputdir #if str($tool_mode.pred.enable_pred) == "yes": -p -pm $tool_mode.pred.pred_mode -md \$BLOCKCLUST_DATA_PATH/models -x 3; #if str($tool_mode.pred.pred_mode) == "nearest_neighbour": cp #echo os.path.join($outputdir,'nearest_neighbour_predictions.txt')# $nearest_neighbour_pred_bed; #elif str($tool_mode.pred.pred_mode) == "model_based": cp #echo os.path.join($outputdir,'model_based_predictions.txt')# $model_based_pred_bed; #end if #else: ; #end if cp #echo os.path.join($outputdir, 'mcl_clusters','all_clusters.bed')# $clusters; cp #echo os.path.join($outputdir, 'hclust_tree.pdf')# $hclust_plot; cp #echo os.path.join($outputdir, 'discretized.gspan.tab')# $sim_tab_out #elif str($tool_mode.operation) == "post": BlockClustPipeLine.pl -m POST -cbed $tool_mode.clusters_bed -cm $tool_mode.cmsearch_out -tab $tool_mode.sim_tab_in -o ./; #end if </command> <inputs> <conditional name="tool_mode"> <param name="operation" type="select" label="Select mode of operation"> <option value="pre">Pre-processing </option> <option value="clust">Clustering and classification</option> <option value="post">Post-processing</option> </param> <when value="pre"> <param name="reads_bam" type="data" format="bam" label="BAM file containing alignments" /> </when> <when value="clust"> <param name="input_bbo" type="data" format="tabular" label="Input blockgroups file" /> <param name="reference" type="select" label="Select reference genome"> <option value="hg19">Human (hg19)</option> <option value="mm10">Mouse (mm10)</option> <option value="dm3">Fly (dm3)</option> <option value="rheMac3">Monkey (rheMac3)</option> <option value="panTro4">Chimp (panTro4)</option> <option value="xenTro3">Frog (xenTro3)</option> <option value="celWS235">C. elegans (celWS235)</option> <option value="tair10">Arabidopsis thaliana (tair10)</option> </param> <conditional name="pred"> <param name="enable_pred" type="select" label="Would you like to perform classification?"> <option value="no">No</option> <option value="yes">Yes</option> </param> <when value="yes"> <param name="pred_mode" type="select" label="Mode of classification"> <option value="model_based">Model based</option> <option value="nearest_neighbour">Nearest neighbour</option> </param> </when> </conditional> </when> <when value="post"> <param name="cmsearch_out" type="data" format="tabular" label="Output of cmsearch tool" /> <param name="clusters_bed" type="data" format="bed" label="BED file containing clusters (output of BlockClust)" /> <param name="sim_tab_in" type="data" format="tabular" label="Pairwise similarities file" /> </when> </conditional> </inputs> <outputs> <data format="bed" name="tags_bed" label="BlockClust: BAM to BED on ${on_string}"> <filter> tool_mode["operation"]=="pre"</filter> </data> <data format="pdf" name="hclust_plot" label="BlockClust: Hierarchical clustering plot on ${on_string}" > <filter> tool_mode["operation"]=="clust"</filter> </data> <data format="bed" name="clusters" label="BlockClust: BED of predicted clusters on ${on_string}"> <filter> tool_mode["operation"]=="clust"</filter> </data> <data format="bed" name="model_based_pred_bed" label="BlockClust: Model based predictions BED on ${on_string}"> <filter> (( tool_mode["operation"] == 'clust' and tool_mode["pred"]["enable_pred"] == "yes" and tool_mode["pred"]["pred_mode"] == "model_based" )) </filter> </data> <data format="bed" name="nearest_neighbour_pred_bed" label="BlockClust: Nearest neighbor predictions BED on ${on_string}"> <filter> (( tool_mode["operation"] == 'clust' and tool_mode["pred"]["enable_pred"] == "yes" and tool_mode["pred"]["pred_mode"] == "nearest_neighbour" )) </filter> </data> <data format="tabular" name="sim_tab_out" label="BlockClust: Pairwise similarities on ${on_string}"> <filter> tool_mode["operation"]=="clust"</filter> </data> <data format="pdf" name="cluster_dist" from_work_dir="cluster_distribution.pdf" label="BlockClust: Cluster distribution on ${on_string}" > <filter> tool_mode["operation"]=="post"</filter> </data> <data format="pdf" name="cluster_hclust" from_work_dir="hclust_tree_clusters.pdf" label="BlockClust: Hierarchical clustering plot of cluster centroids on ${on_string}" > <filter> tool_mode["operation"]=="post"</filter> </data> </outputs> <help> .. class:: infomark **What it does** BlockClust is an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. BlockClust allows both clustering and classification of small non-coding RNAs. BlockClust runs in three modes: 1) Pre-processing - converts given mapped reads (BAM) into BED file of tags 2) Clustering and classification - of given input block groups (from blockbuster tool) as explained in [1]_ 3) Post-processing - extracts distribution of clusters searched against Rfam database and plots hierarchical clustering made out of centroids of each BlockClust predicted cluster. For a thorough analysis of your data, we suggest you to use complete blockclust workflow, which contains all three modes of operation. **Inputs** BlockClust input files are dependent on the mode of operation: 1) Pre-processing mode: Binary Sequence Alignment Map (BAM) file 2) Clustering and classification: A blockgroups file generated by blockbuster tool Select reference genome 3) Post-processing: Output of cmsearch, searched clusters generated by BlockClust against Rfam BED file containing clusters generated by BlockClust Pairwise similarities of blockgroups generated by BlockClust **Output** 1) Pre-processing mode: BED file of tags with expressions 2) Clustering and classification: Hierarchical clustering plot of all input blockgroups by their similarity Pairwise similarities of all input blockgroups BED file containing predicted clusters BED file containing prediction of blockgroups by pre-compiled SVM binary classification model. 3) Post-processing: Distribution of clusters with annotations searched against Rfam database hierarchical clustering made out of centroids of each BlockClust predicted cluster ------ **Licenses** If **BlockClust** is used to obtain results for scientific publications it should be cited as [1]_. **References** [1] Pavankumar Videm, Dominic Rose, Fabrizio Costa, and Rolf Backofen. "BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles." Bioinformatics 30, no. 12 (2014): i274-i282. </help> </tool>