genome_diversity: rank_pathways.xml comparison

comparison rank_pathways.xml @ 14:8ae67e9fb6ff

Uploaded Miller Lab Devshed version a51c894f5bed again [possible toolshed.g2 bug]

author	miller-lab
date	Fri, 28 Sep 2012 11:35:56 -0400
parents
children	d6b961721037


-:fdb4240fb565
+:8ae67e9fb6ff
+<tool id="gd_calc_freq" name="Rank Pathways" version="1.0.0">
+<description>: Assess the impact of gene sets on pathways</description>
+<command interpreter="python">
+#if str($output_format) == 'a'
+calctfreq.py
+#else if str($output_format) == 'b'
+calclenchange.py
+#end if
+"--loc_file=${GALAXY_DATA_INDEX_DIR}/gd.rank.loc"
+"--species=${input.metadata.dbkey}"
+"--input=${input}"
+"--output=${output}"
+"--posKEGGclmn=${input.metadata.kegg_path}"
+"--KEGGgeneposcolmn=${input.metadata.kegg_gene}"
+</command>
+<inputs>
+<param name="input" type="data" format="gd_sap" label="Table">
+<validator type="metadata" check="kegg_gene,kegg_path" message="Missing KEGG gene code column and/or KEGG pathway code/name column metadata.  Click the pencil icon in the history item to edit/save the metadata attributes" />
+</param>
+<param name="output_format" type="select" label="Output format">
+<option value="a" selected="true">ranked by percentage of genes affected</option>
+<option value="b">ranked by change in length and number of paths</option>
+</param>
+</inputs>
+<outputs>
+<data name="output" format="tabular" />
+</outputs>
+<tests>
+<test>
+<param name="input" value="test_in/sample.gd_sap" ftype="gd_sap" />
+<param name="output_format" value="a" />
+<output name="output" file="test_out/rank_pathways/rank_pathways.tabular" />
+</test>
+</tests>
+<help>
+**What it does**
+This tool produces a table that ranks pathways by the percentage of each
+pathway's genes that are present in the input dataset.
+Alternatively, the tool ranks pathways by the change in the length and
+number of paths connecting sources and sinks.  This change is calculated
+between two graphs of each pathway: one that includes the nodes representing
+the genes in the input dataset and one that excludes them.  Sources are all
+the nodes representing the initial reactants/products in the pathway.
+Sinks are all the nodes representing the final reactants/products in
+the pathway.
+If pathways are ranked by percentage of genes affected, the output is
+a tabular dataset with the following columns:
+1. number of genes in the pathway present in the input dataset
+2. percentage of the total genes in the pathway included in the input dataset
+3. rank of the frequency (from highest to lowest frequency)
+4. name of the pathway
+If pathways are ranked by change in length and number of paths, the
+output is a tabular dataset with the following columns:
+1. change in the mean length of paths between sources and sinks
+2. mean length of paths between sources and sinks in the pathway including the genes in the input dataset.  If the pathway does not have sources/sinks, the length is assumed to be infinite (I)
+3. mean length of paths between sources and sinks in the pathway excluding the genes in the input dataset.  If the pathway does not have sources/sinks, the length is assumed to be infinite (I)
+4. rank of the change in the mean length of paths between sources and sinks (from high change to low change)
+5. change in the number of paths between sources and sinks
+6. number of paths between sources and sinks in the pathway including the genes in the input dataset.  If the pathway does not have sources/sinks, it is assumed to be a circuit (C)
+7. number of paths between sources and sinks in the pathway excluding the genes in the input dataset.  If the pathway does not have sources/sinks, it is assumed to be a circuit (C)
+8. rank of the change in the number of paths between sources and sinks (from high change to low change)
+9. name of the pathway
+</help>
+</tool>
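
The two rankings described in the help text can be illustrated with a small Python sketch. This is not the code in calctfreq.py or calclenchange.py, which this changeset does not show. The function names, the input shapes (a dict of pathway name to gene set, and a list of directed edges), the tie handling, and the choice to measure path length in edges are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_by_percentage(pathway_genes, input_genes):
    """Return (hits, percentage, rank, pathway) rows, highest percentage first.

    pathway_genes: dict of pathway name -> set of gene codes (hypothetical shape)
    input_genes:   set of gene codes taken from the input table
    """
    rows = []
    for name, genes in pathway_genes.items():
        hits = len(genes & input_genes)
        pct = 100.0 * hits / len(genes) if genes else 0.0
        rows.append((hits, pct, name))
    rows.sort(key=lambda r: (-r[1], r[2]))

    ranked, rank, prev = [], 0, None
    for i, (hits, pct, name) in enumerate(rows, start=1):
        if pct != prev:  # assumption: tied percentages share a rank
            rank, prev = i, pct
        ranked.append((hits, pct, rank, name))
    return ranked


def path_stats(edges, removed=frozenset()):
    """Return (number of paths, mean path length) between sources and sinks.

    edges:   iterable of (from_node, to_node) pairs for one pathway graph
    removed: nodes to exclude, e.g. those representing genes in the input
    Returns ("C", "I") when the remaining graph has no sources or no sinks,
    mirroring the circuit/infinite convention in the help text.
    """
    graph = defaultdict(set)
    nodes = set()
    for u, v in edges:
        if u in removed or v in removed:
            continue
        graph[u].add(v)
        nodes.update((u, v))

    with_incoming = {v for targets in graph.values() for v in targets}
    sources = sorted(n for n in nodes if n not in with_incoming)
    sinks = {n for n in nodes if n not in graph}  # no outgoing edges
    if not sources or not sinks:
        return "C", "I"

    lengths = []

    def walk(node, seen, depth):
        # Depth-first enumeration of simple paths; length is counted in edges.
        if node in sinks:
            lengths.append(depth)
            return
        for nxt in graph[node]:
            if nxt not in seen:
                walk(nxt, seen | {nxt}, depth + 1)

    for source in sources:
        walk(source, {source}, 0)
    mean = sum(lengths) / len(lengths) if lengths else None
    return len(lengths), mean
```

Calling `path_stats(edges)` and `path_stats(edges, removed=genes)` on the same pathway gives the "including" and "excluding" values; the change columns are the difference between the two. Enumerating simple paths is exponential in the worst case, so this is only suitable for small graphs.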
