comparison tools/regVariation/best_regression_subsets.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
parents | |
children |
-1:000000000000 | 0:9071e359b9a3 |
---|---|
1 <tool id="BestSubsetsRegression1" name="Perform Best-subsets Regression"> | |
2 <description> </description> | |
3 <command interpreter="python"> | |
4 best_regression_subsets.py | |
5 $input1 | |
6 $response_col | |
7 $predictor_cols | |
8 $out_file1 | |
9 $out_file2 | |
10 1>/dev/null | |
11 2>/dev/null | |
12 </command> | |
13 <inputs> | |
14 <param format="tabular" name="input1" type="data" label="Select data" help="Dataset missing? See TIP below."/> | |
15 <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" /> | |
16 <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" multiple="true" > | |
17 <validator type="no_options" message="Please select at least one column."/> | |
18 </param> | |
19 </inputs> | |
20 <outputs> | |
21 <data format="input" name="out_file1" metadata_source="input1" /> | |
22 <data format="pdf" name="out_file2" /> | |
23 </outputs> | |
24 <requirements> | |
25 <requirement type="python-module">rpy</requirement> | |
26 </requirements> | |
27 <tests> | |
28 <!-- This tool cannot be tested automatically because it produces a PDF output file. | |
29 --> | |
30 </tests> | |
31 <help> | |
32 | |
33 .. class:: infomark | |
34 | |
35 **TIP:** If your data is not TAB delimited, use *Edit Datasets->Convert characters* | |
36 | |
37 ----- | |
38 | |
39 .. class:: infomark | |
40 | |
41 **What it does** | |
42 | |
43 This tool uses the 'regsubsets' function from the R statistical environment (provided by the 'leaps' package) for regression subset selection. It outputs two files: a table of the best subsets with their summary statistics, and a PDF with a graphical representation of the results. | |
44 | |
45 ----- | |
46 | |
47 .. class:: warningmark | |
48 | |
49 **Note** | |
50 | |
51 - This tool currently treats all predictor and response variables as continuous variables. | |
52 | |
53 - Rows containing non-numeric (or missing) data in any of the chosen columns are excluded from the analysis. | |
54 | |
55 - The 6 columns in the output are described below: | |
56 | |
57 - Column 1 (Vars): denotes the number of variables in the model | |
58 - Column 2 ([c2 c3 c4...]): lists the user-selected predictor variables (full model). An asterisk denotes the presence of the corresponding predictor variable in the selected model. | |
59 - Column 3 (R-sq): the fraction of variance explained by the model | |
60 - Column 4 (Adj. R-sq): the R-squared statistic above, adjusted to penalize for a larger number of predictors (p) | |
61 - Column 5 (Cp): Mallows' Cp statistic | |
62 - Column 6 (bic): Bayesian Information Criterion. | |
63 | |
64 | |
65 </help> | |
66 </tool> |
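
The help text says that rows with non-numeric or missing values in the chosen columns are skipped, and that the Adj. R-sq column penalizes R-squared for the number of predictors. The sketch below illustrates both ideas in plain Python. It is not taken from `best_regression_subsets.py`: the function names `parse_numeric_rows` and `adjusted_r_squared` are hypothetical, treating `nan`/`inf` as missing is an assumption, and the adjustment uses the standard textbook formula rather than anything confirmed from the tool's source.

```python
import math


def parse_numeric_rows(lines, response_col, predictor_cols):
    """Return (y, X) from tab-delimited lines, skipping unusable rows.

    Hypothetical helper, not the tool's own code. Columns are 1-based.
    A row is skipped if any chosen column is missing or is not a finite
    number (treating "nan"/"inf" as missing is an assumption).
    """
    wanted = [response_col] + list(predictor_cols)
    if any(c < 1 for c in wanted):
        raise ValueError("column numbers are 1-based")
    y, X = [], []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        try:
            values = [float(fields[c - 1]) for c in wanted]
        except (ValueError, IndexError):
            continue  # non-numeric text, empty field, or too few columns
        if not all(math.isfinite(v) for v in values):
            continue  # "nan" and "inf" parse as floats but are unusable
        y.append(values[0])
        X.append(values[1:])
    return y, X


def adjusted_r_squared(r_squared, n_rows, n_predictors):
    """Textbook adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_rows - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more rows than predictors + 1")
    return 1.0 - (1.0 - r_squared) * (n_rows - 1) / dof
```

Calling `parse_numeric_rows(open(path), 1, [2, 3])` on a tabular file would yield the response values and predictor rows that survive filtering; `len(y)` then gives the `n_rows` to pass to `adjusted_r_squared` along with the number of predictors in a candidate subset.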