# HG changeset patch # User iuc # Date 1409002991 14400 # Node ID f244b8209eb85e4ca3fbb9a48c077466716d5174 # Parent 2553f84b817465343b44639f9a652bd2d847f006 bug fix release diff -r 2553f84b8174 -r f244b8209eb8 base_recalibrator.xml --- a/base_recalibrator.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/base_recalibrator.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + calculates covariates used to recalibrate base quality scores of reads @@ -302,4 +302,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 depth_of_coverage.xml --- a/depth_of_coverage.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/depth_of_coverage.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + on BAM files @@ -692,7 +692,7 @@ DepthOfCoverage processes a set of bam files to determine coverage at different levels of partitioning and aggregation. Coverage can be analyzed per locus, per interval, per gene, or in total; can be partitioned by sample, by read group, by technology, by center, or by library; and can be summarized by mean, median, quartiles, and/or percentage of bases covered to or beyond a threshold. Additionally, reads and bases can be filtered by mapping or base quality score. -For more information on the GATK Depth of Coverage, see this `tool specific page <http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_annotator_DepthOfCoverage.html>`_. +For more information on the GATK Depth of Coverage, see this `tool specific page <http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_coverage_DepthOfCoverage.html>`_. To learn about best practices for variant detection using GATK, see this `overview <http://www.broadinstitute.org/gatk/guide/topic?name=best-practices>`_. 
@@ -738,4 +738,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 gatk2_annotations.txt.sample --- a/gatk2_annotations.txt.sample Wed Feb 19 04:39:38 2014 -0500 +++ b/gatk2_annotations.txt.sample Mon Aug 25 17:43:11 2014 -0400 @@ -1,30 +1,26 @@ #unique_id name gatk_value tools_valid_for -AlleleBalance AlleleBalance AlleleBalance UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -AlleleBalanceBySample AlleleBalanceBySample AlleleBalanceBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -BaseCounts BaseCounts BaseCounts UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -BaseQualityRankSumTest BaseQualityRankSumTest BaseQualityRankSumTest UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -ChromosomeCounts ChromosomeCounts ChromosomeCounts UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -DepthOfCoverage DepthOfCoverage DepthOfCoverage UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -DepthPerAlleleBySample DepthPerAlleleBySample DepthPerAlleleBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -FisherStrand FisherStrand FisherStrand UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -GCContent GCContent GCContent UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -HaplotypeScore HaplotypeScore HaplotypeScore UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -HardyWeinberg HardyWeinberg HardyWeinberg UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -HomopolymerRun HomopolymerRun HomopolymerRun UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -InbreedingCoeff InbreedingCoeff InbreedingCoeff UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -IndelType IndelType IndelType UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -LowMQ LowMQ LowMQ UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -MVLikelihoodRatio MVLikelihoodRatio MVLikelihoodRatio UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -MappingQualityRankSumTest MappingQualityRankSumTest MappingQualityRankSumTest 
UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -MappingQualityZero MappingQualityZero MappingQualityZero UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -MappingQualityZeroBySample MappingQualityZeroBySample MappingQualityZeroBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -MappingQualityZeroFraction MappingQualityZeroFraction MappingQualityZeroFraction UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -NBaseCount NBaseCount NBaseCount UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -QualByDepth QualByDepth QualByDepth UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -RMSMappingQuality RMSMappingQuality RMSMappingQuality UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -ReadDepthAndAllelicFractionBySample ReadDepthAndAllelicFractionBySample ReadDepthAndAllelicFractionBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -ReadPosRankSumTest ReadPosRankSumTest ReadPosRankSumTest UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -SampleList SampleList SampleList UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -SnpEff SnpEff SnpEff VariantAnnotator,VariantRecalibrator -SpanningDeletions SpanningDeletions SpanningDeletions UnifiedGenotyper,VariantAnnotator,VariantRecalibrator -TechnologyComposition TechnologyComposition TechnologyComposition UnifiedGenotyper,VariantAnnotator,VariantRecalibrator +AlleleBalance AlleleBalance AlleleBalance UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +AlleleBalanceBySample AlleleBalanceBySample AlleleBalanceBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +BaseCounts BaseCounts BaseCounts UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +BaseQualityRankSumTest BaseQualityRankSumTest BaseQualityRankSumTest UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +ChromosomeCounts ChromosomeCounts ChromosomeCounts UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller 
+Coverage Coverage Coverage UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +DepthPerAlleleBySample DepthPerAlleleBySample DepthPerAlleleBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +FisherStrand FisherStrand FisherStrand UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +GCContent GCContent GCContent UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +HaplotypeScore HaplotypeScore HaplotypeScore UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +HardyWeinberg HardyWeinberg HardyWeinberg UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +HomopolymerRun HomopolymerRun HomopolymerRun UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +InbreedingCoeff InbreedingCoeff InbreedingCoeff UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +LowMQ LowMQ LowMQ UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +MVLikelihoodRatio MVLikelihoodRatio MVLikelihoodRatio VariantAnnotator,VariantRecalibrator,HaplotypeCaller +MappingQualityRankSumTest MappingQualityRankSumTest MappingQualityRankSumTest UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +MappingQualityZero MappingQualityZero MappingQualityZero UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +MappingQualityZeroBySample MappingQualityZeroBySample MappingQualityZeroBySample UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +NBaseCount NBaseCount NBaseCount UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +QualByDepth QualByDepth QualByDepth UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +RMSMappingQuality RMSMappingQuality RMSMappingQuality UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +ReadPosRankSumTest ReadPosRankSumTest ReadPosRankSumTest 
UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +SampleList SampleList SampleList UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller +SnpEff SnpEff SnpEff VariantAnnotator,VariantRecalibrator,HaplotypeCaller +SpanningDeletions SpanningDeletions SpanningDeletions UnifiedGenotyper,VariantAnnotator,VariantRecalibrator,HaplotypeCaller diff -r 2553f84b8174 -r f244b8209eb8 gatk2_macros.xml --- a/gatk2_macros.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/gatk2_macros.xml Mon Aug 25 17:43:11 2014 -0400 @@ -3,13 +3,16 @@ gatk2 samtools + picard GATK2_PATH GATK2_SITE_OPTIONS + --num_threads \${GALAXY_SLOTS:-4} + 2.8 java -jar "\$GATK2_PATH/GenomeAnalysisTK.jar" @@ -54,7 +57,7 @@ #end for -p '--interval_set_rule "${gatk_param_type.interval_set_rule}"' - + -p '--interval_padding "${gatk_param_type.interval_padding}"' -p '--downsampling_type "${gatk_param_type.downsampling_type.downsampling_type_selector}"' #if str( $gatk_param_type.downsampling_type.downsampling_type_selector ) != "NONE": -p '--${gatk_param_type.downsampling_type.downsample_to_type.downsample_to_type_selector} "${gatk_param_type.downsampling_type.downsample_to_type.downsample_to_value}"' @@ -217,7 +220,9 @@ - + + @@ -295,7 +300,7 @@ - + @@ -341,4 +346,11 @@ If you use this tool in Galaxy, please cite Blankenberg D, et al. 
*In preparation.* + + + 10.1038/ng.806 + 10.1101/gr.107524.110 + 10.1002/0471250953.bi1110s43 + + diff -r 2553f84b8174 -r f244b8209eb8 gatk2_wrapper.py --- a/gatk2_wrapper.py Wed Feb 19 04:39:38 2014 -0500 +++ b/gatk2_wrapper.py Mon Aug 25 17:43:11 2014 -0400 @@ -7,7 +7,6 @@ import sys, optparse, os, tempfile, subprocess, shutil from binascii import unhexlify -from string import Template GALAXY_EXT_TO_GATK_EXT = { 'gatk_interval':'intervals', 'bam_index':'bam.bai', 'gatk_dbsnp':'dbSNP', 'picard_interval_list':'interval_list' } #items not listed here will use the galaxy extension as-is GALAXY_EXT_TO_GATK_FILE_TYPE = GALAXY_EXT_TO_GATK_EXT #for now, these are the same, but could be different if needed @@ -19,6 +18,7 @@ if tmp_dir and os.path.exists( tmp_dir ): shutil.rmtree( tmp_dir ) + def gatk_filename_from_galaxy( galaxy_filename, galaxy_ext, target_dir = None, prefix = None ): suffix = GALAXY_EXT_TO_GATK_EXT.get( galaxy_ext, galaxy_ext ) if prefix is None: @@ -29,36 +29,39 @@ os.symlink( galaxy_filename, gatk_filename ) return gatk_filename + def gatk_filetype_argument_substitution( argument, galaxy_ext ): return argument % dict( file_type = GALAXY_EXT_TO_GATK_FILE_TYPE.get( galaxy_ext, galaxy_ext ) ) + def open_file_from_option( filename, mode = 'rb' ): if filename: return open( filename, mode = mode ) return None + def html_report_from_directory( html_out, dir ): html_out.write( '\n\nGalaxy - GATK Output\n\n\n

\n<ul>\n' ) for fname in sorted( os.listdir( dir ) ): html_out.write( '<li><a href="%s">%s</a></li>\n' % ( fname, fname ) ) html_out.write( '</ul>
\n\n\n' ) -def index_bam_files( bam_filenames, tmp_dir ): + +def index_bam_files( bam_filenames ): for bam_filename in bam_filenames: bam_index_filename = "%s.bai" % bam_filename if not os.path.exists( bam_index_filename ): #need to index this bam file stderr_name = tempfile.NamedTemporaryFile( prefix = "bam_index_stderr" ).name command = 'samtools index %s %s' % ( bam_filename, bam_index_filename ) - proc = subprocess.Popen( args=command, shell=True, stderr=open( stderr_name, 'wb' ) ) - return_code = proc.wait() - if return_code: + try: + subprocess.check_call( args=command, shell=True, stderr=open( stderr_name, 'wb' ) ) + except: for line in open( stderr_name ): print >> sys.stderr, line - os.unlink( stderr_name ) #clean up - cleanup_before_exit( tmp_dir ) raise Exception( "Error indexing BAM file" ) - os.unlink( stderr_name ) #clean up + finally: + os.unlink( stderr_name ) def __main__(): #Parse Command Line @@ -74,8 +77,7 @@ parser.add_option( '-e', '--phone_home', dest='phone_home', action='store', type="string", default='STANDARD', help='What kind of GATK run report should we generate(NO_ET|STANDARD|STDOUT)' ) parser.add_option( '-K', '--gatk_key', dest='gatk_key', action='store', type="string", default=None, help='What kind of GATK run report should we generate(NO_ET|STANDARD|STDOUT)' ) (options, args) = parser.parse_args() - - tmp_dir = tempfile.mkdtemp( prefix='tmp-gatk-' ) + if options.pass_through_options: cmd = ' '.join( options.pass_through_options ) else: @@ -87,42 +89,50 @@ elif options.max_jvm_heap_fraction is not None: cmd = cmd.replace( 'java ', 'java -XX:DefaultMaxRAMFraction=%s -XX:+UseParallelGC ' % ( options.max_jvm_heap_fraction ), 1 ) bam_filenames = [] - if options.datasets: - for ( dataset_arg, filename, galaxy_ext, prefix ) in options.datasets: - gatk_filename = gatk_filename_from_galaxy( filename, galaxy_ext, target_dir = tmp_dir, prefix = prefix ) - if dataset_arg: - cmd = '%s %s "%s"' % ( cmd, gatk_filetype_argument_substitution( 
dataset_arg, galaxy_ext ), gatk_filename ) - if galaxy_ext == "bam": - bam_filenames.append( gatk_filename ) - index_bam_files( bam_filenames, tmp_dir ) - #set up stdout and stderr output options - stdout = open_file_from_option( options.stdout, mode = 'wb' ) - stderr = open_file_from_option( options.stderr, mode = 'wb' ) - #if no stderr file is specified, we'll use our own - if stderr is None: - stderr = tempfile.NamedTemporaryFile( prefix="gatk-stderr-", dir=tmp_dir ) - - proc = subprocess.Popen( args=cmd, stdout=stdout, stderr=stderr, shell=True, cwd=tmp_dir ) - return_code = proc.wait() - - if return_code: - stderr_target = sys.stderr - else: - stderr_target = sys.stdout - stderr.flush() - stderr.seek(0) - while True: - chunk = stderr.read( CHUNK_SIZE ) - if chunk: - stderr_target.write( chunk ) + tmp_dir = tempfile.mkdtemp( prefix='tmp-gatk-' ) + try: + if options.datasets: + for ( dataset_arg, filename, galaxy_ext, prefix ) in options.datasets: + gatk_filename = gatk_filename_from_galaxy( filename, galaxy_ext, target_dir = tmp_dir, prefix = prefix ) + if dataset_arg: + cmd = '%s %s "%s"' % ( cmd, gatk_filetype_argument_substitution( dataset_arg, galaxy_ext ), gatk_filename ) + if galaxy_ext == "bam": + bam_filenames.append( gatk_filename ) + if galaxy_ext == 'fasta': + subprocess.check_call( 'samtools faidx "%s"' % gatk_filename, shell=True ) + subprocess.check_call( 'java -jar %s R=%s O=%s QUIET=true' % ( os.path.join(os.environ['JAVA_JAR_PATH'], 'CreateSequenceDictionary.jar'), gatk_filename, os.path.splitext(gatk_filename)[0] + '.dict' ), shell=True ) + index_bam_files( bam_filenames ) + #set up stdout and stderr output options + stdout = open_file_from_option( options.stdout, mode = 'wb' ) + stderr = open_file_from_option( options.stderr, mode = 'wb' ) + #if no stderr file is specified, we'll use our own + if stderr is None: + stderr = tempfile.NamedTemporaryFile( prefix="gatk-stderr-", dir=tmp_dir ) + + proc = subprocess.Popen( args=cmd, stdout=stdout, 
stderr=stderr, shell=True, cwd=tmp_dir ) + return_code = proc.wait() + + if return_code: + stderr_target = sys.stderr else: - break - stderr.close() + stderr_target = sys.stdout + stderr.flush() + stderr.seek(0) + while True: + chunk = stderr.read( CHUNK_SIZE ) + if chunk: + stderr_target.write( chunk ) + else: + break + stderr.close() + finally: + cleanup_before_exit( tmp_dir ) + #generate html reports if options.html_report_from_directory: for ( html_filename, html_dir ) in options.html_report_from_directory: html_report_from_directory( open( html_filename, 'wb' ), html_dir ) - - cleanup_before_exit( tmp_dir ) + -if __name__=="__main__": __main__() +if __name__ == "__main__": + __main__() diff -r 2553f84b8174 -r f244b8209eb8 haplotype_caller.xml --- a/haplotype_caller.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/haplotype_caller.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + Call SNPs and indels simultaneously via local de-novo assembly of haplotypes in an active region @@ -158,7 +158,7 @@ - + @@ -191,7 +191,7 @@ - + @@ -320,4 +320,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 indel_realigner.xml --- a/indel_realigner.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/indel_realigner.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + - perform local realignment @@ -206,4 +206,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 print_reads.xml --- a/print_reads.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/print_reads.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + on BAM files @@ -32,11 +32,6 @@ #include source=$standard_gatk_options# - #if str( $reference_source.reference_source_selector ) == "history": - -d "-R" "${reference_source.ref_file}" "${reference_source.ref_file.ext}" "gatk_input" - #end if - ##end standard gatk options - ##start analysis specific options #if $analysis_param_type.analysis_param_type_selector == "advanced": -p ' @@ -202,7 +197,7 @@ This walker is designed to work as the second pass in a two-pass processing 
step, doing a by-read traversal. For each base in each read this walker calculates various user-specified covariates (such as read group, reported quality score, cycle, and dinuc) Using these values as a key in a large hashmap the walker calculates an empirical base quality score and overwrites the quality score currently in the read. This walker then outputs a new bam file with these updated (recalibrated) reads. Note: This walker expects as input the recalibration table file generated previously by CovariateCounterWalker. Note: This walker is designed to be used in conjunction with CovariateCounterWalker. -For more information on base quality score recalibration using the GATK, see this `tool specific page <http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_PrintReads.html>`_. +For more information on base quality score recalibration using the GATK, see this `tool specific page <http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_readutils_PrintReads.html>`_. To learn about best practices for variant detection using GATK, see this `overview <http://www.broadinstitute.org/gatk/guide/topic?name=best-practices>`_. 
@@ -247,4 +242,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 readme.rst --- a/readme.rst Wed Feb 19 04:39:38 2014 -0500 +++ b/readme.rst Mon Aug 25 17:43:11 2014 -0400 @@ -63,7 +63,8 @@ History ======= -v0.1 - Initial public release +* v0.1 - Initial public release +* v2.8.0 - Bugfix release, increase version number to reflect the underlying GATK version Licence (MIT) diff -r 2553f84b8174 -r f244b8209eb8 realigner_target_creator.xml --- a/realigner_target_creator.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/realigner_target_creator.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + for use in local realignment @@ -164,4 +164,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 reduce_reads.xml --- a/reduce_reads.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/reduce_reads.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + in BAM files @@ -154,7 +154,7 @@ This walker will generated reduced versions of the BAM files that still follow the BAM spec and contain all the information necessary for the GSA variant calling pipeline. Some options allow you to tune in how much compression you want to achieve. The default values have been shown to reduce a typical whole exome BAM file 100x. The higher the coverage, the bigger the savings in file size and performance of the downstream tools. -For more information on using read based compression in the GATK, see this `tool specific page <http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_compression_reducereads_ReduceReads.html>`_. +.. For more information on using read based compression in the GATK, see this `tool specific page <http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_compression_reducereads_ReduceReads.html>`_. To learn about best practices for variant detection using GATK, see this `overview <http://www.broadinstitute.org/gatk/guide/topic?name=best-practices>`_. 
@@ -223,4 +223,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 tool_dependencies.xml --- a/tool_dependencies.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/tool_dependencies.xml Mon Aug 25 17:43:11 2014 -0400 @@ -15,6 +15,12 @@ - + + + + + + + diff -r 2553f84b8174 -r f244b8209eb8 unified_genotyper.xml --- a/unified_genotyper.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/unified_genotyper.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + SNP and indel caller @@ -72,7 +72,6 @@ --excludeAnnotation "${annotation}" #end for #end if - ${analysis_param_type.multiallelic} #if str( $analysis_param_type.sample_ploidy ) != '': --sample_ploidy "$analysis_param_type.sample_ploidy" #end if @@ -199,7 +198,7 @@ - + @@ -294,4 +293,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 variant_annotator.xml --- a/variant_annotator.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_annotator.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + @@ -244,4 +244,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 variant_apply_recalibration.xml --- a/variant_apply_recalibration.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_apply_recalibration.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + @@ -135,4 +135,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 variant_combine.xml --- a/variant_combine.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_combine.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + @@ -167,4 +167,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 variant_eval.xml --- a/variant_eval.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_eval.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + @@ -80,10 +80,6 @@ --minPhaseQuality "${analysis_param_type.min_phase_quality}" - #if str( $analysis_param_type.family ): - --family_structure "${analysis_param_type.family}" - #end if - --mendelianViolationQualThreshold "${analysis_param_type.mendelian_violation_qual_threshold}" #if str( 
$analysis_param_type.ancestral_alignments ) != "None": @@ -165,9 +161,8 @@ - + diff -r 2553f84b8174 -r f244b8209eb8 variant_filtration.xml --- a/variant_filtration.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_filtration.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + on VCF files @@ -177,4 +177,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 variant_recalibrator.xml --- a/variant_recalibrator.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_recalibrator.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,6 +1,8 @@ - + - + + ggplot + gatk2_macros.xml @@ -63,15 +65,12 @@ --maxIterations "${analysis_param_type.max_iterations}" --numKMeans "${analysis_param_type.num_k_means}" --stdThreshold "${analysis_param_type.std_threshold}" - --qualThreshold "${analysis_param_type.qual_threshold}" --shrinkage "${analysis_param_type.shrinkage}" --dirichlet "${analysis_param_type.dirichlet}" --priorCounts "${analysis_param_type.prior_counts}" - #if str( $analysis_param_type.bad_variant_selector.bad_variant_selector_type ) == 'percent': - --percentBadVariants "${analysis_param_type.bad_variant_selector.percent_bad_variants}" - #else: - --minNumBadVariants "${analysis_param_type.bad_variant_selector.min_num_bad_variants}" - #end if + + --minNumBadVariants "${analysis_param_type.min_num_bad_variants}" + --target_titv "${analysis_param_type.target_titv}" #for $tranche in [ $tranche.strip() for $tranche in str( $analysis_param_type.ts_tranche ).split( ',' ) if $tranche.strip() ] --TStranche "${tranche}" @@ -83,7 +82,6 @@ #end if --ignore_filter "${ignore_filter_name}" #end for - --ts_filter_level "${analysis_param_type.ts_filter_level}" ' #end if @@ -100,7 +98,7 @@ - + @@ -114,7 +112,7 @@
- + @@ -324,26 +322,16 @@ - - - - - + + + + - - - - - - - - - - - - + + @@ -360,7 +348,6 @@ - @@ -410,12 +397,10 @@ maxIterations The maximum number of VBEM iterations to be performed in variational Bayes algorithm. Procedure will normally end when convergence is detected. numKMeans The number of k-means iterations to perform in order to initialize the means of the Gaussians in the Gaussian mixture model. stdThreshold If a variant has annotations more than -std standard deviations away from mean then don't use it for building the Gaussian mixture model. - qualThreshold If a known variant has raw QUAL value less than -qual then don't use it for building the Gaussian mixture model. shrinkage The shrinkage parameter in variational Bayes algorithm. dirichlet The dirichlet parameter in variational Bayes algorithm. priorCounts The number of prior counts to use in variational Bayes algorithm. - percentBadVariants What percentage of the worst scoring variants to use when building the Gaussian mixture model of bad variants. 0.07 means bottom 7 percent. - minNumBadVariants The minimum amount of worst scoring variants to use when building the Gaussian mixture model of bad variants. Will override -percentBad arugment if necessary. + minNumBadVariants The minimum amount of worst scoring variants to use when building the Gaussian mixture model of bad variants. recal_file The output recal file used by ApplyRecalibration target_titv The expected novel Ti/Tv ratio to use when calculating FDR tranches and for display on optimization curve output figures. (approx 2.15 for whole genome experiments). ONLY USED FOR PLOTTING PURPOSES! TStranche The levels of novel false discovery rate (FDR, implied by ti/tv) at which to slice the data. (in percent, that is 1.0 for 1 percent) @@ -423,8 +408,8 @@ path_to_Rscript The path to your implementation of Rscript. 
For Broad users this is maybe /broad/tools/apps/R-2.6.0/bin/Rscript rscript_file The output rscript file generated by the VQSR to aid in visualization of the input data and learned model path_to_resources Path to resources folder holding the Sting R scripts. - ts_filter_level The truth sensitivity level at which to start filtering, used here to indicate filtered variants in plots @CITATION_SECTION@ +
diff -r 2553f84b8174 -r f244b8209eb8 variant_select.xml --- a/variant_select.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_select.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + from VCF files @@ -285,4 +285,5 @@ @CITATION_SECTION@ + diff -r 2553f84b8174 -r f244b8209eb8 variant_validate.xml --- a/variant_validate.xml Wed Feb 19 04:39:38 2014 -0500 +++ b/variant_validate.xml Mon Aug 25 17:43:11 2014 -0400 @@ -1,4 +1,4 @@ - + @@ -101,4 +101,5 @@ @CITATION_SECTION@ +