picard: picard_EstimateLibraryComplexity.xml comparison

comparison picard_EstimateLibraryComplexity.xml @ 13:7e6fd3d0f16e draft

planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tools/picard commit bf94a1505c131fb3f67c867b6e1d886780efa42e

author	devteam
date	Tue, 06 Dec 2016 10:04:41 -0500
parents	05087b27692a
children	465cbb0cf2eb

-:05087b27692a
+:7e6fd3d0f16e
 <import>picard_macros.xml</import>
 </macros>
 <expand macro="requirements" />
 <command detect_errors="exit_code"><![CDATA[
 @java_options@
+@symlink_element_identifier@
 picard
 EstimateLibraryComplexity
-INPUT="${inputFile}"
+INPUT='$inputFile.element_identifier'
 OUTPUT="${outFile}"
 MIN_IDENTICAL_BASES="${min_identical_bases}"
 MAX_DIFF_RATE="${max_diff_rate}"
 MIN_MEAN_QUALITY="${min_mean_quality}"
 MAX_GROUP_RATIO="${max_group_ratio}"
 #import pipes
 READ_NAME_REGEX=${ pipes.quote( str( $read_name_regex ) ) or "''" }
 OPTICAL_DUPLICATE_PIXEL_DISTANCE="${optical_duplicate_pixel_distance}"
 VALIDATION_STRINGENCY="${validation_stringency}"
 QUIET=true
 VERBOSITY=ERROR
 ]]></command>
 <inputs>
 <param format="bam" name="inputFile" type="data" label="Select SAM/BAM dataset or dataset collection" help="If empty, upload or import a SAM/BAM dataset" />
 <param name="min_identical_bases" type="integer" value="5" label="The minimum number of bases at the starts of reads that must be identical for reads to be grouped together for duplicate detection" help="MIN_IDENTICAL_BASES; In effect total_reads / 4^max_id_bases reads will be compared at a time, so lower numbers will produce more accurate results but consume exponentially more memory and CPU; default=5"/>
 <param name="max_diff_rate" type="float" value="0.03" label="The maximum rate of differences between two reads to call them identical" help="MAX_DIFF_RATE; default=0.03"/>
 </sanitizer>
 </param>
 <param name="optical_duplicate_pixel_distance" type="integer" value="100" min="0" max="500" label="The maximum offset between two duplicate clusters in order to consider them optical duplicates" help="OPTICAL_DUPLICATE_PIXEL_DISTANCE; default=100"/>
 <expand macro="VS" />
 </inputs>
 <outputs>
 <data format="tabular" name="outFile" label="${tool.name} on ${on_string}: Library complexity report"/>
 </outputs>
 <tests>
 <test>
 <param name="inputFile" value="picard_EstimateLibraryComplexity.bam" ftype="bam"/>
 <param name="min_identical_bases" value="5"/>
 <param name="max_diff_rate" value="0.03"/>
 <param name="max_group_ratio" value="500"/>
 <param name="validation_stringency" value="LENIENT"/>
 <output name="outFile" file="picard_EstimateLibraryComplexity_test1.tab" ftype="tabular" lines_diff="4"/>
 </test>
 </tests>
 <help>
 **Purpose**
 Attempts to estimate library complexity from sequence of read pairs alone. Does so by sorting all reads by the first N bases (5 by default)
 @dataset_collections@
 @description@
 MIN_IDENTICAL_BASES=Integer   The minimum number of bases at the starts of reads that must be identical for reads to be
 grouped together for duplicate detection.  In effect total_reads / 4^max_id_bases reads
 will be compared at a time, so lower numbers will produce more accurate results but
 consume exponentially more memory and CPU.  Default value: 5.
 MAX_DIFF_RATE=Double          The maximum rate of differences between two reads to call them identical.  Default value:
 0.03.
 MIN_MEAN_QUALITY=Integer      The minimum mean quality of the bases in a read pair for the read to be analyzed. Reads
 with lower average quality are filtered out and not considered in any calculations.
 Default value: 20.
 MAX_GROUP_RATIO=Integer       Do not process self-similar groups that are this many times over the mean expected group
 size. I.e. if the input contains 10m read pairs and MIN_IDENTICAL_BASES is set to 5, then
 the mean expected group size would be approximately 10 reads.  Default value: 500.
 READ_NAME_REGEX=String        Regular expression that can be used to parse read names in the incoming SAM file. Read
 names are parsed to extract three variables: tile/region, x coordinate and y coordinate.
 These values are used to estimate the rate of optical duplication in order to give a more
 accurate estimated library size. Set this option to null to disable optical duplicate
 detection. The regular expression should contain three capture groups for the three
 variables, in order. It must match the entire read name. Note that if the default regex
 is specified, a regex match is not actually done, but instead the read name is split on the
 colon character. For 5 element names, the 3rd, 4th and 5th elements are assumed to be
 tile, x and y values. For 7 element names (CASAVA 1.8), the 5th, 6th, and 7th elements
 are assumed to be tile, x and y values.  Default value:
 [a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).*.
 OPTICAL_DUPLICATE_PIXEL_DISTANCE=Integer
 The maximum offset between two duplicate clusters in order to consider them optical
 duplicates. This should usually be set to some fairly small number (e.g. 5-10 pixels)
 unless using later versions of the Illumina pipeline that multiply pixel values by 10, in
 which case 50-100 is more normal.  Default value: 100.
 @more_info@
 </help>
 </tool>
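
Note on MAX_GROUP_RATIO: the help text says that 10m read pairs with MIN_IDENTICAL_BASES=5 give a mean expected group size of about 10 reads. That figure works out if the prefix of both reads in a pair is used for grouping, giving 4^(2*5) possible groups. The sketch below only checks that arithmetic. The doubling of the prefix length is an assumption made to reproduce the figure in the help text, and the function names are hypothetical, not part of Picard or of this wrapper.

```python
def mean_expected_group_size(read_pairs: int, min_identical_bases: int) -> float:
    """Mean number of read pairs per prefix group.

    Assumption (not stated in the diff): the first min_identical_bases bases
    of BOTH reads in a pair form the grouping key, so there are
    4 ** (2 * min_identical_bases) possible keys.
    """
    possible_keys = 4 ** (2 * min_identical_bases)
    return read_pairs / possible_keys


def group_size_cutoff(read_pairs: int, min_identical_bases: int,
                      max_group_ratio: int) -> float:
    """Group size above which a self-similar group would be skipped,
    reading MAX_GROUP_RATIO as a multiple of the mean expected size."""
    mean_size = mean_expected_group_size(read_pairs, min_identical_bases)
    return max_group_ratio * mean_size


if __name__ == "__main__":
    mean_size = mean_expected_group_size(10_000_000, 5)
    cutoff = group_size_cutoff(10_000_000, 5, 500)
    print(f"mean expected group size: {mean_size:.2f}")
    print(f"cutoff with MAX_GROUP_RATIO=500: {cutoff:.2f}")
```

With the wrapper defaults (MIN_IDENTICAL_BASES=5, MAX_GROUP_RATIO=500), the mean is a little under 10, which matches the "approximately 10 reads" in the help text.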

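Note on READ_NAME_REGEX and OPTICAL_DUPLICATE_PIXEL_DISTANCE: the help text describes how tile, x and y are taken from read names (colon splitting for the default regex, three capture groups otherwise) and how a pixel distance decides whether two clusters count as optical duplicates. The following is a minimal sketch of that description, not Picard's implementation. The function names are hypothetical, and both the handling of trailing non-digit characters and the per-axis distance check are assumptions.

```python
import re
from typing import Optional, Tuple

DEFAULT_REGEX = r"[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).*"


def _leading_int(text: str) -> Optional[int]:
    # Assumption: ignore trailing suffixes such as "#0/1" after the digits.
    match = re.match(r"[0-9]+", text)
    return int(match.group()) if match else None


def parse_location(read_name: str,
                   regex: str = DEFAULT_REGEX) -> Optional[Tuple[int, int, int]]:
    """Return (tile, x, y) for a read name, or None if it cannot be parsed."""
    if regex == DEFAULT_REGEX:
        # Default regex: split on ':' instead of running a regex match.
        fields = read_name.split(":")
        if len(fields) == 5:
            raw = fields[2:5]      # 3rd, 4th and 5th elements
        elif len(fields) == 7:
            raw = fields[4:7]      # 5th, 6th and 7th elements (CASAVA 1.8)
        else:
            return None
    else:
        # Custom regex: must match the entire name, with three capture groups.
        match = re.fullmatch(regex, read_name)
        if match is None or len(match.groups()) < 3:
            return None
        raw = list(match.groups()[:3])
    values = [_leading_int(value) for value in raw]
    if any(value is None for value in values):
        return None
    return values[0], values[1], values[2]


def within_pixel_distance(a: Tuple[int, int, int], b: Tuple[int, int, int],
                          max_distance: int = 100) -> bool:
    """Assumed check: same tile, and x and y offsets both within max_distance."""
    return (a[0] == b[0]
            and abs(a[1] - b[1]) <= max_distance
            and abs(a[2] - b[2]) <= max_distance)


if __name__ == "__main__":
    first = parse_location("HWUSI:6:73:941:1973")
    second = parse_location("HWUSI:6:73:990:2000")
    print(first, second, within_pixel_distance(first, second))
```

The 100-pixel default in the sketch mirrors the wrapper's default for OPTICAL_DUPLICATE_PIXEL_DISTANCE. The help text suggests 5-10 for pipelines that do not multiply pixel values by 10.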