annotate analyze_covariates.xml @ 2:738bc749f9a3 draft default tip
Added a dependency on R, since the Analyze Covariates tool generates an R script for plotting residual error.
| author | devteam |
|---|---|
| date | Wed, 02 Apr 2014 16:04:37 -0400 |
| parents | 1d427770e7ec |
| children |
| rev | line source |
|---|---|
| 0 | 1 <tool id="gatk_analyze_covariates" name="Analyze Covariates" version="0.0.5"> |
| 2 <description>- draw plots</description> | |
| 3 <requirements> | |
| 4 <requirement type="package" version="1.4">gatk</requirement> | |
| 2 | 5 <requirement type="package" version="2.11.0">R</requirement> |
| 0 | 6 </requirements> |
| 7 <macros> | |
| 8 <import>gatk_macros.xml</import> | |
| 9 </macros> | |
| 10 <command interpreter="python">gatk_wrapper.py | |
| 11 --max_jvm_heap_fraction "1" | |
| 12 --stdout "${output_log}" | |
| 13 --html_report_from_directory "${output_html}" "${output_html.files_path}" | |
| 14 -p 'java | |
| 1 | 15 -jar "\$JAVA_JAR_PATH/AnalyzeCovariates.jar" |
| 0 | 16 -recalFile "${input_recal}" |
| 17 -outputDir "${output_html.files_path}" | |
| 18 ##--num_threads 4 ##hard coded, for now | |
| 19 ##-log "${output_log}" | |
| 20 ##-Rscript,--path_to_Rscript path_to_Rscript; on path is good enough | |
| 21 #if $analysis_param_type.analysis_param_type_selector == "advanced": | |
| 22 --ignoreQ "${analysis_param_type.ignore_q}" | |
| 23 --numRG "${analysis_param_type.num_read_groups}" | |
| 24 --max_quality_score "${analysis_param_type.max_quality_score}" | |
| 25 --max_histogram_value "${analysis_param_type.max_histogram_value}" | |
| 26 ${analysis_param_type.do_indel_quality} | |
| 27 #end if | |
| 28 ' | |
| 29 </command> | |
| 30 <inputs> | |
| 31 <param name="input_recal" type="data" format="csv" label="Covariates table recalibration file" help="-recalFile,--recal_file &lt;recal_file&gt;" /> | |
| 32 <conditional name="analysis_param_type"> | |
| 33 <param name="analysis_param_type_selector" type="select" label="Basic or Advanced options"> | |
| 34 <option value="basic" selected="True">Basic</option> | |
| 35 <option value="advanced">Advanced</option> | |
| 36 </param> | |
| 37 <when value="basic"> | |
| 38 <!-- Do nothing here --> | |
| 39 </when> | |
| 40 <when value="advanced"> | |
| 41 <param name="ignore_q" type="integer" value="5" label="Ignore bases with reported quality less than this number." help="-ignoreQ,--ignoreQ &lt;ignoreQ&gt; "/> | |
| 42 <param name="num_read_groups" type="integer" value="-1" label="Only process N read groups." help="-numRG,--numRG &lt;numRG&gt;"/> | |
| 43 <param name="max_quality_score" type="integer" value="50" label="Max quality score" help="-maxQ,--max_quality_score &lt;max_quality_score&gt;"/> | |
| 44 <param name="max_histogram_value" type="integer" value="0" label="Max histogram value" help="-maxHist,--max_histogram_value &lt;max_histogram_value&gt;"/> | |
| 45 <param name="do_indel_quality" type="boolean" truevalue="--do_indel_quality" falsevalue="" label="Do indel quality" help="--do_indel_quality"/> | |
| 46 </when> | |
| 47 </conditional> | |
| 48 </inputs> | |
| 49 <outputs> | |
| 50 <data format="html" name="output_html" label="${tool.name} on ${on_string} (HTML)" /> | |
| 51 <data format="txt" name="output_log" label="${tool.name} on ${on_string} (log)" /> | |
| 52 </outputs> | |
| 53 <tests> | |
| 54 <test> | |
| 55 <param name="input_recal" value="gatk/gatk_count_covariates/gatk_count_covariates_out_1.csv" ftype="csv" /> | |
| 56 <param name="analysis_param_type_selector" value="basic" /> | |
| 57 <output name="output_html" file="gatk/gatk_analyze_covariates/gatk_analyze_covariates_out_1.html" /> | |
| 58 <output name="output_log" file="gatk/gatk_analyze_covariates/gatk_analyze_covariates_out_1.log.contains" compare="contains" /> | |
| 59 </test> | |
| 60 </tests> | |
| 61 <help> | |
| 62 **What it does** | |
| 63 | |
| 64 Create collapsed versions of the recal csv file and call R scripts to plot residual error versus the various covariates. | |
| 65 | |
| 66 For more information on base quality score recalibration using the GATK, see this `tool-specific page <http://www.broadinstitute.org/gsa/wiki/index.php/Base_quality_score_recalibration>`_. | |
| 67 | |
| 68 To learn about best practices for variant detection using GATK, see this `overview <http://www.broadinstitute.org/gsa/wiki/index.php/Best_Practice_Variant_Detection_with_the_GATK_v3>`_. | |
| 69 | |
| 70 If you encounter errors, please view the `GATK FAQ <http://www.broadinstitute.org/gsa/wiki/index.php/Frequently_Asked_Questions>`_. | |
| 71 | |
| 72 ------ | |
| 73 | |
| 74 **Inputs** | |
| 75 | |
| 76 GenomeAnalysisTK: AnalyzeCovariates accepts a recal CSV file. | |
| 77 | |
| 78 | |
| 79 **Outputs** | |
| 80 | |
| 81 The output consists of CSV and HTML files with links to PDF graphs and data files. | |
| 82 | |
| 83 | |
| 84 Go `here <http://www.broadinstitute.org/gsa/wiki/index.php/Input_files_for_the_GATK>`_ for details on GATK file formats. | |
| 85 | |
| 86 ------- | |
| 87 | |
| 88 **Settings**:: | |
| 89 | |
| 90 recal_file The input recal csv file to analyze | |
| 91 output_dir The directory in which to output all the plots and intermediate data files | |
| 92 path_to_Rscript The path to your implementation of Rscript. For Broad users this may be /broad/tools/apps/R-2.6.0/bin/Rscript | |
| 93 path_to_resources Path to resources folder holding the Sting R scripts. | |
| 94 ignoreQ Ignore bases with reported quality less than this number. | |
| 95 numRG Only process N read groups. Default value: -1 (process all read groups) | |
| 96 max_quality_score The integer value at which to cap the quality scores, default is 50 | |
| 97 max_histogram_value If supplied, this value will be the max value of the histogram plots | |
| 98 do_indel_quality If supplied, do indel quality (passes --do_indel_quality) | |
| 99 | |
| 100 @CITATION_SECTION@ | |
| 101 </help> | |
| 102 </tool> |
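
The `<command>` template above adds the four numeric options and the optional `--do_indel_quality` flag only when the "Advanced" selector is chosen. As a rough illustration of that branching, here is a minimal Python sketch. It is not part of the wrapper or of `gatk_wrapper.py`; the function name `build_analyze_covariates_args` and the dictionary layout are hypothetical and exist only for this example.

```python
def build_analyze_covariates_args(input_recal, output_dir, advanced=None):
    """Return the argument list that the <command> template would produce.

    Hypothetical helper for illustration only. `advanced` is None for the
    "basic" branch, or a dict keyed by the <param> names of the "advanced"
    branch of the tool XML.
    """
    # Arguments emitted in both the basic and the advanced branch.
    args = [
        "java", "-jar", "$JAVA_JAR_PATH/AnalyzeCovariates.jar",
        "-recalFile", input_recal,
        "-outputDir", output_dir,
    ]
    if advanced is not None:
        # Mirrors the `#if ... == "advanced":` block of the template.
        args += [
            "--ignoreQ", str(advanced["ignore_q"]),
            "--numRG", str(advanced["num_read_groups"]),
            "--max_quality_score", str(advanced["max_quality_score"]),
            "--max_histogram_value", str(advanced["max_histogram_value"]),
        ]
        # The boolean param expands to its truevalue or to an empty string.
        if advanced.get("do_indel_quality"):
            args.append("--do_indel_quality")
    return args


# Default values taken from the <param> definitions in the tool XML.
ADVANCED_DEFAULTS = {
    "ignore_q": 5,
    "num_read_groups": -1,
    "max_quality_score": 50,
    "max_histogram_value": 0,
    "do_indel_quality": False,
}
```

Calling `build_analyze_covariates_args("recal.csv", "out_dir")` corresponds to the "basic" branch; passing `ADVANCED_DEFAULTS` as the third argument corresponds to the "advanced" branch with every parameter left at its default.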
