ballgown_wrapper: ballgown.xml comparison

comparison ballgown.xml @ 17:05977e96375b draft default tip

Uploaded

author	theo.collard
date	Tue, 03 Oct 2017 09:25:51 -0400
parents	fa62657e9b57
children

-:4290f0f3d908
+:05977e96375b
-<tool id="ballgown" name="Ballgown" version="0.5.0" workflow_compatible="true">
+<tool id="ballgown" name="Ballgown" version="2.2.0" workflow_compatible="true">
 <description>Flexible, isoform-level differential expression analysis</description>
 <requirements>
 <requirement type="package" version="2.2.0">bioconductor-ballgown</requirement>
 <requirement type="package" version="0.5.0">r-dplyr</requirement>
 <requirement type="package" version="1.3.2">r-optparse</requirement>
+</requirements>
-</requirements>
+<command detect_errors="aggressive"><![CDATA[
-<command interpreter="Rscript" detect_errors="aggressive">
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+## This function reads the input file with the mapping between samples and files
-		## This function reads the input file with the mapping between samples and files
+## E.g. of result:
-		## E.g. of result:
+## mapping = {
-		## mapping = {
+##     "e2t.ctab"   : "sample1",
-		##     "e2t.ctab"   : "sample1",
+##     "other.ctab" : "sample2",
-		##     "other.ctab" : "sample2",
+##     "i2t.ctab"   : "sample1",
-		##     "i2t.ctab"   : "sample1",
+##     "t_data.ctab": "sample1"
-		##     "t_data.ctab": "sample1"
+##      ...
-		##      ...
+## }
-		## }
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+#def read_sample_mapping_file(sample_mapping_file):
-		#def read_sample_mapping_file(sample_mapping_file):
+#try
-			#try
+#set mapping = {}
-				#set mapping = {}
+#set file = open($sample_mapping_file.dataset.dataset.get_file_name(),'r')
-				#set file = open($sample_mapping_file.dataset.dataset.get_file_name(),'r')
+#for $line in $file:
-				#for $line in $file:
+#set content= $line.strip().split('\t')
-					#set content= $line.strip().split('\t')
+#for $map in $content:
-					#for $map in $content:
+#set mapping[$map]= $content[0]
-						#set mapping[$map]= $content[0]
+#end for
-					#end for
+#end for
-				#end for
+#return $mapping
-				#return $mapping
+#except
-			#except
+#return None
-				#return None
+#end try
-			#end try
+#end def
-		#end def
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+## This function returns the name of the sample associated to a given file
-		## This function returns the name of the sample associated to a given file
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+#def get_sample_name($dataset, $sample_mapping):
-		#def get_sample_name($dataset, $sample_mapping):
+##If the file with samples mapping was provided
-			##If the file with samples mapping was provided
+#if $sample_mapping != None:
-			#if $sample_mapping != None:
+#return $sample_mapping.get($dataset.name, None)
-				#return $sample_mapping.get($dataset.name, None)
+##Otherwise we extract the sample name from the filename
-			##Otherwise we extract the sample name from the filename
+#else:
-			#else:
+#return str($dataset.element_identifier)
-				#return str($dataset.element_identifier)
+#end if
-			#end if
+#end def
-		#end def
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+## This function reads a dataset or list of datasets and sets the corresponding value
-		## This function reads a dataset or list of datasets and sets the corresponding value
+## in the $result variable
-		## in the $result variable
+## e.g. of result
-		## e.g. of result
+##'sample1' : {
-		##'sample1' : {
+##         'e_data': '/export/galaxy-central/database/files/000/dataset_13.dat'
-		##         'e_data': '/export/galaxy-central/database/files/000/dataset_13.dat'
+##         'i_data': '/export/galaxy-central/database/files/000/dataset_10.dat',
-		##         'i_data': '/export/galaxy-central/database/files/000/dataset_10.dat',
+##         't_data': '/export/galaxy-central/database/files/000/dataset_12.dat',
-		##         't_data': '/export/galaxy-central/database/files/000/dataset_12.dat',
+##         'e2t': '/export/galaxy-central/database/files/000/dataset_9.dat',
-		##         'e2t': '/export/galaxy-central/database/files/000/dataset_9.dat',
+##         'i2t': '/export/galaxy-central/database/files/000/dataset_11.dat'
-		##         'i2t': '/export/galaxy-central/database/files/000/dataset_11.dat'
+##      },
-		##      },
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+#def read_input_files($param_name, $param_value, $result, $sample_mapping, $create_if_empty):
-		#def read_input_files($param_name, $param_value, $result, $sample_mapping, $create_if_empty):
+## If input is a data collection
-			## If input is a data collection
+#if isinstance($param_value, list):
-			#if isinstance($param_value, list):
+## For each dataset
-				## For each dataset
+#for $dataset in $param_value:
-				#for $dataset in $param_value:
+## Get the sample name
-					## Get the sample name
+#set sample_name = $get_sample_name($dataset, $sample_mapping)
-					#set sample_name = $get_sample_name($dataset, $sample_mapping)
+## Check if sample is already registered
-					## Check if sample is already registered
+#if not($result.has_key($sample_name)):
-					#if not($result.has_key($sample_name)):
+#if ($create_if_empty == True):
-						#if ($create_if_empty == True):
+#set result[$sample_name] = {}
-							#set result[$sample_name] = {}
+#else:
-						#else:
+#raise ValueError("Error in input. Please check that input contains all the required files for sample " + $sample_name)
-							#raise ValueError("Error in input. Please check that input contains all the required files for sample " + $sample_name)
+#end if
-						#end if
+#end if
-					#end if
+## Register the file to the sample
-					## Register the file to the sample
+#set result[$sample_name][$param_name] = str($dataset.dataset.dataset.get_file_name())
-					#set result[$sample_name][$param_name] = str($dataset.dataset.dataset.get_file_name())
+#end for
-				#end for
+#else:
-			#else:
+#if not($result.has_key("sample_1")):
-				#if not($result.has_key("sample_1")):
+#set result["sample_1"] = {}
-					#set result["sample_1"] = {}
+#end if
-				#end if
+#set result["sample_1"][$param_name] = str($param_value.dataset.dataset.get_file_name())
-				#set result["sample_1"][$param_name] = str($param_value.dataset.dataset.get_file_name())
+#end if
-			#end if
+#return $result
-			#return $result
+#end def
-		#end def
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+## Main body of the tool
-		## Main body of the tool
+##------------------------------------------------------------------------------------
-		##------------------------------------------------------------------------------------
+## Set the params for the next R script
-		## Set the params for the next R script
+#set result={}
-		#set result={}
+#set sample_mapping=None
-		#set sample_mapping=None
+## If the samples mapping file was provided, parse the content
-		## If the samples mapping file was provided, parse the content
+#if $samples_names != None and not(isinstance($samples_names, list) and (None in $samples_names)):
-		#if $samples_names != None and not(isinstance($samples_names, list) and (None in $samples_names)):
+#set sample_mapping = $read_sample_mapping_file($samples_names)
-			#set sample_mapping = $read_sample_mapping_file($samples_names)
+#end if
-		#end if
+## READ THE CONTENT FOR e_data AND STORE THE FILES
-		## READ THE CONTENT FOR e_data AND STORE THE FILES
+## INDEXED BY THEIR SAMPLE NAME
-		## INDEXED BY THEIR SAMPLE NAME
+## e.g. 'HBR_Rep1' : {
-		## e.g. 'HBR_Rep1' : {
+##         'e_data': '/export/galaxy-central/database/files/000/dataset_13.dat'
-		##         'e_data': '/export/galaxy-central/database/files/000/dataset_13.dat'
+##         'i_data': '/export/galaxy-central/database/files/000/dataset_10.dat',
-		##         'i_data': '/export/galaxy-central/database/files/000/dataset_10.dat',
+##         't_data': '/export/galaxy-central/database/files/000/dataset_12.dat',
-		##         't_data': '/export/galaxy-central/database/files/000/dataset_12.dat',
+##         'e2t': '/export/galaxy-central/database/files/000/dataset_9.dat',
-		##         'e2t': '/export/galaxy-central/database/files/000/dataset_9.dat',
+##         'i2t': '/export/galaxy-central/database/files/000/dataset_11.dat'
-		##         'i2t': '/export/galaxy-central/database/files/000/dataset_11.dat'
+##      },
-		##      },
+##      'HBR_Rep2' : {...}
-		##      'HBR_Rep2' : {...}
+#set $result = $read_input_files("e_data.ctab", $e_data, $result, $sample_mapping, True)
-		#set $result = $read_input_files("e_data.ctab", $e_data, $result, $sample_mapping, True)
+#set $result = $read_input_files("i_data.ctab", $i_data, $result, $sample_mapping, False)
-		#set $result = $read_input_files("i_data.ctab", $i_data, $result, $sample_mapping, False)
+#set $result = $read_input_files("t_data.ctab", $t_data, $result, $sample_mapping, False)
-		#set $result = $read_input_files("t_data.ctab", $t_data, $result, $sample_mapping, False)
+#set $result = $read_input_files("e2t.ctab", $e2t, $result, $sample_mapping, False)
-		#set $result = $read_input_files("e2t.ctab", $e2t, $result, $sample_mapping, False)
+#set $result = $read_input_files("i2t.ctab", $i2t, $result, $sample_mapping, False)
-		#set $result = $read_input_files("i2t.ctab", $i2t, $result, $sample_mapping, False)
+## For each input sample, create a directory and link the input files for ballgown
-		## For each input sample, create a directory and link the input files for ballgown
+#import os
-		#import os
+#set n_sample = 1
-		#set n_sample = 1
+#for $key, $value in $result.iteritems():
-		#for $key, $value in $result.iteritems():
+#if str($file_format.format) == 'tsv':
-			#set dir_name = str($output.files_path) + "/" + $key + "/"
+#set dir_name = str($toutput.files_path) + '/' + $key + '/'
-			$os.makedirs($dir_name)
+#else:
-			#for $file_name, $file_path in $value.iteritems():
+#set dir_name = str($output.files_path) + '/' + $key + '/'
-				$os.symlink($file_path, $dir_name + $file_name)
+#end if
-			#end for
+$os.makedirs($dir_name)
-			#set n_sample = $n_sample + 1
+#for $file_name, $file_path in $value.iteritems():
-		#end for
+$os.symlink($file_path, $dir_name + $file_name)
+#end for
-		## Run the R script with the location of the linked files and the name for output file
+#set n_sample = $n_sample + 1
-		ballgown.R --directory $output.files_path --outputtranscript $output --outputgenes $outputgn --texpression $trexpression --phendat $phendata --bgout $bgo
+#end for
-	</command>
-<inputs>
+## Run the R script with the location of the linked files and the name for output file
-<param name="e_data" type="data" multiple="true" format="tabular" label="Exon-level expression measurements" help="One row per exon. See below for more details."/>
-		<param name="i_data" type="data" multiple="true" format="tabular" label="Intron- (i.e., junction-) level expression measurements" help="One row per intron. See below for more details."/>
+Rscript '$__tool_directory__/ballgown.R' --texpression $trexpression --phendat '$phendata' --bgout '$bgo' -f '$file_format.format'
-		<param name="t_data" type="data" multiple="true" format="tabular" label="Transcript-level expression measurements" help="One row per transcript. See below for more details."/>
+#if str($file_format.format) == 'tsv':
-		<param name="e2t" type="data" multiple="true" format="tabular" label="Exons-transcripts mapping" help="Table with two columns, e_id and t_id, denoting which exons belong to which transcripts. See below for more details."/>
+--tsvoutputtranscript $toutputtranscript
-		<param name="i2t" type="data" multiple="true" format="tabular" label="Introns-transcripts mapping" help="Table with two columns, i_id and t_id, denoting which introns belong to which transcripts. See below for more details."/>
+--tsvoutputgenes $toutput
-		<param name="samples_names" type="data" optional="true" multiple="false" format="tabular" label="File names for samples" help="Optional. Use in case that the names for the analysed samples cannot be extracted from the filenames."/>
+--directory $toutput.files_path
-<param argument="--phendat" name="phendata" type="data" format="csv" label="phenotype data" />
+#else:
-<param argument="--texpression" name="trexpression" type="float" value="0.5" label="minimal transcript expression to appear in the results"/>
+--outputtranscript $output
-</inputs>
+--outputgenes $outputgn
-<outputs>
+--directory $output.files_path
-<data name="bgo" format="rda" file="ballgown_object.rda" label="${tool.name} on ${on_string}: ballgown object (R data file)"/>
+#end if
-<data name="output" format="csv" file="output_transcript.csv" label="${tool.name} on ${on_string}: transcripts expression (tabular)"/>
+]]></command>
-<data name="outputgn" format="csv" file="output_genes.csv" label="${tool.name} on ${on_string}: genes expression (tabular)"/>
+<inputs>
-</outputs>
+<param name="e_data" type="data_collection" collection_type="list" format="tabular" label="Exon-level expression measurements"
-<tests>
+help="One row per exon. See below for more details."/>
-</tests>
+<param name="i_data" type="data_collection" collection_type="list" format="tabular"
-<help>
+label="Intron- (i.e., junction-) level expression measurements"
+help="One row per intron. See below for more details."/>
+<param name="t_data" type="data_collection" collection_type="list" format="tabular"
+label="Transcript-level expression measurements" help="One row per transcript. See below for more details."/>
+<param name="e2t" type="data_collection" collection_type="list" format="tabular"
+label="Exons-transcripts mapping"
+help="Table with two columns, e_id and t_id, denoting which exons belong to which transcripts. See below for more details."/>
+<param name="i2t" type="data_collection" collection_type="list" format="tabular"
+label="Introns-transcripts mapping"
+help="Table with two columns, i_id and t_id, denoting which introns belong to which transcripts. See below for more details."/>
+<param name="samples_names" type="data" optional="true" multiple="false" format="tabular"
+label="File names for samples"
+help="Optional. Use in case that the names for the analysed samples cannot be extracted from the filenames."/>
+<param argument="--phendat" name="phendata" type="data" format="csv" label="phenotype data" />
+<param argument="--texpression" name="trexpression" type="float" value="0.5" label="minimal transcript expression to appear in the results"/>
+<conditional name="file_format">
+<param argument='--format' type="select" label="Output format">
+<option value="tsv" selected="true">tsv</option>
+<option value="csv">csv</option>
+</param>
+<when value="tsv"/>
+<when value="csv"/>
+</conditional>
+</inputs>
+<outputs>
+<data name="bgo" format="rdata" from_work_dir="ballgown_object.rda" label="${tool.name} on ${on_string}: ballgown_object_R_data_file"/>
+<data name="output" format="csv" from_work_dir="output_transcript.csv" label="${tool.name} on ${on_string}: transcripts_expression_tabular">
+<filter>file_format['format']=="csv"</filter>
+</data>
+<data name="outputgn" format="csv" from_work_dir="output_genes.csv" label="${tool.name} on ${on_string}: genes_expression_tabular">
+<filter>file_format['format']=="csv"</filter>
+</data>
+<data name="toutputtranscript" format="tabular" from_work_dir="output_transcript.tsv" label="${tool.name} on ${on_string}: transcripts_expression_tabular">
+<filter>file_format['format']=="tsv"</filter>
+</data>
+<data name="toutput" format="tabular" from_work_dir="output_genes.tsv" label="${tool.name} on ${on_string}: genes_expression_tabular">
+<filter>file_format['format']=="tsv"</filter>
+</data>
+</outputs>
+<tests>
+<test>
+<param name="e_data">
+<collection type="list">
+<element name="HBR_Rep1" value="HBR_Rep1/e_data.ctab"/>
+<element name="HBR_Rep2" value="HBR_Rep2/e_data.ctab"/>
+<element name="HBR_Rep3" value="HBR_Rep3/e_data.ctab"/>
+<element name="UHR_Rep1" value="UHR_Rep1/e_data.ctab"/>
+<element name="UHR_Rep2" value="UHR_Rep2/e_data.ctab"/>
+<element name="UHR_Rep3" value="UHR_Rep3/e_data.ctab"/>
+</collection>
+</param>
+<param name="i_data">
+<collection type="list">
+<element name="HBR_Rep1" value="HBR_Rep1/i_data.ctab"/>
+<element name="HBR_Rep2" value="HBR_Rep2/i_data.ctab"/>
+<element name="HBR_Rep3" value="HBR_Rep3/i_data.ctab"/>
+<element name="UHR_Rep1" value="UHR_Rep1/i_data.ctab"/>
+<element name="UHR_Rep2" value="UHR_Rep2/i_data.ctab"/>
+<element name="UHR_Rep3" value="UHR_Rep3/i_data.ctab"/>
+</collection>
+</param>
+<param name="t_data">
+<collection type="list">
+<element name="HBR_Rep1" value="HBR_Rep1/t_data.ctab"/>
+<element name="HBR_Rep2" value="HBR_Rep2/t_data.ctab"/>
+<element name="HBR_Rep3" value="HBR_Rep3/t_data.ctab"/>
+<element name="UHR_Rep1" value="UHR_Rep1/t_data.ctab"/>
+<element name="UHR_Rep2" value="UHR_Rep2/t_data.ctab"/>
+<element name="UHR_Rep3" value="UHR_Rep3/t_data.ctab"/>
+</collection>
+</param>
+<param name="e2t">
+<collection type="list">
+<element name="HBR_Rep1" value="HBR_Rep1/e2t.ctab"/>
+<element name="HBR_Rep2" value="HBR_Rep2/e2t.ctab"/>
+<element name="HBR_Rep3" value="HBR_Rep3/e2t.ctab"/>
+<element name="UHR_Rep1" value="UHR_Rep1/e2t.ctab"/>
+<element name="UHR_Rep2" value="UHR_Rep2/e2t.ctab"/>
+<element name="UHR_Rep3" value="UHR_Rep3/e2t.ctab"/>
+</collection>
+</param>
+<param name="i2t">
+<collection type="list">
+<element name="HBR_Rep1" value="HBR_Rep1/i2t.ctab"/>
+<element name="HBR_Rep2" value="HBR_Rep2/i2t.ctab"/>
+<element name="HBR_Rep3" value="HBR_Rep3/i2t.ctab"/>
+<element name="UHR_Rep1" value="UHR_Rep1/i2t.ctab"/>
+<element name="UHR_Rep2" value="UHR_Rep2/i2t.ctab"/>
+<element name="UHR_Rep3" value="UHR_Rep3/i2t.ctab"/>
+</collection>
+</param>
+<param name="phendata" value="phendata.csv"/>
+<output name="outputgn" file="genes_expression_tabular.csv"/>
+<output name="output" file="transcripts_expression_tabular.csv"/>
+<output name="bgo" file="ballgown_object_R_data_file.rda"/>
+</test>
+</tests>
+<help><![CDATA[
 =======================
 Ballgown
 =======================
 -----------------------
 **What it does**
 - **i2t**: Tab file or collection of tab files. Table with two columns, i_id and t_id, denoting which introns belong to which transcripts. These ids match the ids in the i_data and t_data tables.
 - **samples_names**: (optional) Tab file. Table with five columns, one row per sample. Defines which files from the input belong to each sample in the experiment.
 .. class:: infomark
-'''TIP''' *Note* Here's an example of a good phenotype data file for your expirement.
+'''TIP''' *Note* Here's an example of a good phenotype data file for your experiment.
 +--------------+-------------------------+-------------------------+---+
 |ids           |experimental variable 1  |experimental variable 2  |...|
 +==============+=========================+=========================+===+
 |sample 1      |value 1                  |value 2                  |...|
 - **Ballgown object** : this is the ballgown object created during the process. This file can be re-used later for further analysis in an R console.
 ----
 **Authors**: Théo Collard [SLU Global Bioinformatics Centre], Rafael Hernández de Diego [SLU Global Bioinformatics Centre], and Tomas Klingström [SLU Global Bioinformatics Centre]
+]]></help>
-Sources are available at https://github.com/CollardT/Ballgown-Wrapper
+<citations>
+<citation type="doi">doi:10.1038/nprot.2016.095</citation>
-</help>
+</citations>
 </tool>
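
Note on the template logic: the Cheetah block in the command section reads an optional sample mapping file, groups the five .ctab inputs by sample name, and symlinks them into one directory per sample before calling ballgown.R. A minimal Python 3 sketch of that flow follows. It is not the wrapper's code: the function names and the (identifier, path) input shape are hypothetical, and the create_if_empty check is simplified to a plain lookup. The template itself uses has_key() and iteritems(), which exist only in Python 2, so the sketch uses the Python 3 equivalents.

```python
import os


def read_sample_mapping(path):
    """Map every tab-separated field of each row to that row's first field."""
    mapping = {}
    with open(path) as handle:
        for line in handle:
            fields = line.strip().split("\t")
            for field in fields:
                mapping[field] = fields[0]
    return mapping


def group_by_sample(inputs, mapping=None):
    """Build {sample: {ctab_name: path}}.

    `inputs` is assumed to look like {ctab_name: [(identifier, path), ...]}.
    Without a mapping, the dataset identifier is used as the sample name.
    """
    result = {}
    for ctab_name, datasets in inputs.items():
        for identifier, path in datasets:
            sample = mapping.get(identifier) if mapping is not None else identifier
            if sample is None:
                raise ValueError("No sample found for " + identifier)
            result.setdefault(sample, {})[ctab_name] = path
    return result


def link_samples(result, root):
    """Create one directory per sample and symlink its files into it."""
    for sample, files in result.items():
        sample_dir = os.path.join(root, sample)
        os.makedirs(sample_dir)
        for ctab_name, path in files.items():
            os.symlink(path, os.path.join(sample_dir, ctab_name))
```

With this layout, the directory passed to the R script through --directory would contain one subdirectory per sample, each holding e_data.ctab, i_data.ctab, t_data.ctab, e2t.ctab and i2t.ctab.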
