comparison best_regression_subsets.xml @ 0:54c7a01a2cc7
Imported from capsule None
| author | devteam |
|---|---|
| date | Tue, 01 Apr 2014 10:50:17 -0400 |
| parents | |
| children | e769cde223a5 |
| -1:000000000000 | 0:54c7a01a2cc7 |
|---|---|
| 1 <tool id="BestSubsetsRegression1" name="Perform Best-subsets Regression" version="0.0.1"> | |
| 2 <description> </description> | |
| 3 <command interpreter="python"> | |
| 4 best_regression_subsets.py | |
| 5 $input1 | |
| 6 $response_col | |
| 7 $predictor_cols | |
| 8 $out_file1 | |
| 9 $out_file2 | |
| 10 1>/dev/null | |
| 11 2>/dev/null | |
| 12 </command> | |
| 13 <inputs> | |
| 14 <param format="tabular" name="input1" type="data" label="Select data" help="Dataset missing? See TIP below."/> | |
| 15 <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" /> | |
| 16 <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" multiple="true" > | |
| 17 <validator type="no_options" message="Please select at least one column."/> | |
| 18 </param> | |
| 19 </inputs> | |
| 20 <outputs> | |
| 21 <data format="input" name="out_file1" metadata_source="input1" /> | |
| 22 <data format="pdf" name="out_file2" /> | |
| 23 </outputs> | |
| 24 <requirements> | |
| 25 <requirement type="python-module">rpy</requirement> | |
| 26 </requirements> | |
| 27 <tests> | |
| 28 <!-- Testing this tool will not be possible because this tool produces a pdf output file. | |
| 29 --> | |
| 30 </tests> | |
| 31 <help> | |
| 32 | |
| 33 .. class:: infomark | |
| 34 | |
| 35 **TIP:** If your data is not TAB delimited, use *Edit Datasets->Convert characters* | |
| 36 | |
| 37 ----- | |
| 38 | |
| 39 .. class:: infomark | |
| 40 | |
| 41 **What it does** | |
| 42 | |
| 43 This tool uses the 'regsubsets' function from the R statistical package to perform best-subsets regression. It outputs two files: a table of the best subsets with their summary statistics, and a PDF containing a graphical representation of the results. | |
| 44 | |
| 45 ----- | |
| 46 | |
| 47 .. class:: warningmark | |
| 48 | |
| 49 **Note** | |
| 50 | |
| 51 - This tool currently treats all predictor and response variables as continuous variables. | |
| 52 | |
| 53 - Rows containing non-numeric (or missing) data in any of the chosen columns are excluded from the analysis. | |
| 54 | |
| 55 - The 6 columns in the output are described below: | |
| 56 | |
| 57 - Column 1 (Vars): denotes the number of variables in the model | |
| 58 - Column 2 ([c2 c3 c4...]): represents a list of the user-selected predictor variables (full model). An asterisk denotes the presence of the corresponding predictor variable in the selected model. | |
| 59 - Column 3 (R-sq): the fraction of variance explained by the model | |
| 60 - Column 4 (Adj. R-sq): the R-squared statistic above, adjusted to penalize a larger number of predictors (p) | |
| 61 - Column 5 (Cp): Mallows' Cp statistic | |
| 62 - Column 6 (bic): Bayesian Information Criterion. | |
| 63 | |
| 64 | |
| 65 </help> | |
| 66 </tool> |
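
The help text above describes skipping rows with non-numeric values and reporting R-sq, Adj. R-sq, and Cp for the best model of each size. The following is a minimal sketch of that idea in pure Python, not the code of `best_regression_subsets.py` and not the `regsubsets` implementation. The function names (`parse_rows`, `best_subsets`), the 0-based column indices, the exhaustive search, and the textbook formulas for adjusted R-squared and Mallows' Cp are all assumptions made for illustration. BIC is omitted because its definition varies between implementations.

```python
from itertools import combinations


def parse_rows(lines, cols):
    """Keep rows whose chosen columns (0-based, response first) parse as floats."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        try:
            rows.append([float(fields[c]) for c in cols])
        except (ValueError, IndexError):
            continue  # non-numeric or missing value: skip the row
    return rows


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(y, xs):
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = [[1.0] + list(row) for row in xs]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(design, y)
    )


def best_subsets(rows):
    """rows: [y, x1, x2, ...]. Return (size, subset, r2, adj_r2, cp) per size."""
    n = len(rows)
    p = len(rows[0]) - 1
    y = [r[0] for r in rows]
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    # Error variance estimated from the full model (textbook Cp convention).
    sigma2 = rss(y, [r[1:] for r in rows]) / (n - p - 1)
    results = []
    for k in range(1, p + 1):
        best = None
        for subset in combinations(range(1, p + 1), k):
            r = rss(y, [[row[j] for j in subset] for row in rows])
            if best is None or r < best[1]:
                best = (subset, r)
        subset, r = best
        r2 = 1 - r / tss
        adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
        cp = r / sigma2 - n + 2 * (k + 1)
        results.append((k, subset, r2, adj_r2, cp))
    return results


if __name__ == "__main__":
    # Made-up tab-delimited data: response, then two predictors.
    demo = ["y\tx1\tx2", "2.1\t1\t3", "3.9\t2\t1", "6.2\t3\t4",
            "7.8\t4\t1", "NA\t5\t2", "10.1\t5\t5", "11.9\t6\t9"]
    for row in best_subsets(parse_rows(demo, [0, 1, 2])):
        print(row)
```

The subset indices in each result refer to positions in the filtered rows (1 is the first predictor), not to Galaxy column numbers. Exhaustive enumeration is only practical for a small number of predictors, and this sketch assumes the filtered data has more rows than predictors plus one and no perfectly collinear columns.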
