view selectionExtracted.xml @ 6:7076911e5c64 draft default tip

"planemo upload for repository https://github.com/sblanck/MPAgenomics4Galaxy/tree/master/mpagenomics_wrappers commit a644ed69951bcc1ac46426c5e6c9a0af1003a9a8-dirty"
author sblanck
date Tue, 20 Apr 2021 15:00:42 +0000
parents 4f753bb8681e
children
line wrap: on
line source

<tool id="markersSelection" name="Markers selection" version="1.3.0">
   <requirements>
        <container type="docker">sblanck/mpagenomicsdependencies</container>
  </requirements>
  <description> of previously extracted signal</description>
  <command>
  	<![CDATA[ 
        Rscript 
        ${__tool_directory__}/selectionExtracted.R 
    --input '$input' 
  	--response '$response' 
  	--new_file_path '$output.extra_files_path' 
  	--folds '$folds' 
  	--loss '$loss' 
  	--outputlog '$outputlog' 
  	--output '$output' 
  	--log '$log'
  	]]>
  </command>
  <inputs>
    <param name="input" type="data" format="sef,saf" label="Input Signal" help="see below for more information on file format"/>
	<param name="response" type="data" format="csv" label="Data response" help="Data response csv file. See below for more information on file format" />    
	<param name="folds" type="integer" min="1" value="10" label ="Number of folds for cross validation" help="Integer between 1 and number of file in the .cel file dataset"/>
    <param name="loss" type="select" multiple="false" label="Response type">
      	<option value="linear">Linear</option>
    	<option value="logistic">Logistic</option>
    </param>
   
	<!--param name="outputgraph" type="select" label="Output figures">
        <option value="TRUE">Yes</option>
        <option value="FALSE">No</option>
     </param-->    
    <param name="outputlog" type="select" label="Output log">
        <option value="TRUE">Yes</option>
        <option value="FALSE">No</option>
    </param>    
  </inputs>       
  <outputs>
    <data format="tabular" name="output" label="selection of ${input.name}" />
    <data format="log" name="log" label="log of selection of ${input.name}" >
    	<filter>outputlog == "TRUE"</filter>
    </data>
  </outputs>
  <stdio>
    <exit_code range="1:"   level="fatal"   description="See logs for more details" />
   </stdio>
  <help>
  **What it does**
   	    	
This tool selects some relevant markers according to a response using penalized regressions.
  
Input:

*A tabular text file containing 3 fixed columns and 1 column per sample:*
	
	- chr: Chromosome.
	- position: Genomic position (in bp).
  	- probeNames: Names of the probes.
  	- One column per sample which contain the copy number signal for each sample.

Output:
  	
*A tabular text file containing 5 columns which describe all the selected SNPs (1 line per SNP):*
	
	- chr: Chromosome containing the selected SNP.
  	- position: Position of the selected SNP.
	- index: Index of the selected SNP.
	- names: Name of the selected SNP.
	- coefficient: Regression coefficient of the selected SNP.

-----

**Data Response csv file**
     	
Data response csv file format:
	
	- The first column contains the names of the different files of the dataset.
     	 
	- The second column is the response associated with each file. 
     	
	- Column names of these two columns are respectively files and response.

	- Columns are separated by a comma
     	
	- *Extensions of the files (.CEL for example) should be removed*


     	
**Example** 

Let 3 .cel files in the studied dataset ::
     	
     	patient1.cel
     	patient2.cel
     	patient3.cel 
     	
The csv file should look like this ::
     	
     	files,response
     	patient1,1.92145
     	patient2,2.12481
     	patient3,1.23545


-----
  	   	
**Citation**
	
If you use this tool please cite : 

`Q. Grimonprez, A. Celisse, M. Cheok, M. Figeac, and G. Marot. MPAgenomics : An R package for multi-patients analysis of genomic markers, 2014. Preprint &lt;http://fr.arxiv.org/abs/1401.5035&gt;`_
  
 </help>
</tool>