view mytools/sampline.xml @ 7:f0dc65e7f6c0
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:59:07 -0500 |
<tool id="sampline" name="sample">
  <description>records from a file</description>
  <command interpreter="python">sampline.py --input=$input --output=$out_file1 --nSample=$nSample --recSize=$recSize --nSkip=$nSkip $replacement</command>
  <inputs>
    <param name="input" format="txt" type="data" label="Original file"/>
    <param name="nSample" size="10" type="integer" value="100" label="Number of records to sample"/>
    <param name="recSize" size="10" type="integer" value="1" label="Number of lines per record"/>
    <param name="nSkip" size="10" type="integer" value="0" label="Number of top lines to output directly (without sampling)"/>
    <param name="replacement" label="Sampling with replacement" type="boolean" truevalue="--replacement" falsevalue="" checked="False"/>
  </inputs>
  <outputs>
    <data format="input" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <output name="out_file1" file="testmap.sampled"/>
      <param name="input" value="test.map" ftype="txt"/>
      <param name="nSample" value="100"/>
      <param name="recSize" value="1" />
      <param name="nSkip" value="0" />
      <param name="replacement" value=""/>
    </test>
  </tests>
  <help>

**What it does**

This tool selects random records from a file. Each record is defined by a fixed number of lines.

- When over-sampling (requesting more records than the file contains), the --replacement option is enforced automatically.

-----

**Example 1: sampling from a BED file**

parameters::

    1 line per record, sampling 5 lines, without replacement, output line 1 (track name) directly

Input::

    track name=test.bed
    chr1 148078400 148078582 CCDS993.1_cds_0_0_chr1_148078401_r 0 -
    chr11 116124407 116124501 CCDS8374.1_cds_0_0_chr11_116124408_r 0 -
    chr15 41826029 41826196 CCDS10101.1_cds_0_0_chr15_41826030_f 0 +
    chr16 142908 143003 CCDS10397.1_cds_0_0_chr16_142909_f 0 +
    chr2 220229609 220230869 CCDS2443.1_cds_0_0_chr2_220229610_r 0 -
    chr20 33579500 33579527 CCDS13256.1_cds_0_0_chr20_33579501_r 0 -
    chr20 33593260 33593348 CCDS13257.1_cds_0_0_chr20_33593261_f 0 +
    chr5 131621326 131621419 CCDS4152.1_cds_0_0_chr5_131621327_f 0 +
    chr7 113660517 113660685 CCDS5760.1_cds_0_0_chr7_113660518_f 0 +
    chrX 152648964 152649196 CCDS14733.1_cds_0_0_chrX_152648965_r 0 -

Output::

    track name=test.bed
    chr11 116124407 116124501 CCDS8374.1_cds_0_0_chr11_116124408_r 0 -
    chr16 142908 143003 CCDS10397.1_cds_0_0_chr16_142909_f 0 +
    chr20 33579500 33579527 CCDS13256.1_cds_0_0_chr20_33579501_r 0 -
    chr20 33593260 33593348 CCDS13257.1_cds_0_0_chr20_33593261_f 0 +
    chr5 131621326 131621419 CCDS4152.1_cds_0_0_chr5_131621327_f 0 +

**Example 2: sampling reads from a fastq file**

parameters::

    4 lines per record, sampling 3 records, without replacement

Input::

    @SRR066787.2496 WICMT-SOLEXA:8:1:28:2047 length=36
    NNANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    +SRR066787.2496 WICMT-SOLEXA:8:1:28:2047 length=36
    !!%!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    @SRR066787.2497 WICMT-SOLEXA:8:1:28:463 length=36
    GTGATTAAGAAGAGACTGGCATCACTAAGGTGACAT
    +SRR066787.2497 WICMT-SOLEXA:8:1:28:463 length=36
    @A=BBCBBAA@:@:@@@:,?AB:B?BB=*2:@=?AA
    @SRR066787.2498 WICMT-SOLEXA:8:1:28:704 length=36
    GAACCCAATTTTCAAAGAAGTGTGACTGCTTGTTTC
    +SRR066787.2498 WICMT-SOLEXA:8:1:28:704 length=36
    =?BAABBACCCCAA9>>A=>A?A;;@A>ABBABBB:
    @SRR066787.2499 WICMT-SOLEXA:8:1:28:997 length=36
    CGACTTCAGGCTCTCGCTAGCCTTCGCTTGACTGAC
    +SRR066787.2499 WICMT-SOLEXA:8:1:28:997 length=36
    BCCBCCB?A1ACAC>;@CCAAABB?8=BA>@?B?@:
    @SRR066787.2500 WICMT-SOLEXA:8:1:28:582 length=36
    TCTCTCTCTTTCTCTCTCTCTCTCTCTCTCTCTCTC
    +SRR066787.2500 WICMT-SOLEXA:8:1:28:582 length=36
    ?.?.=9C8CCC:BACBCBC?CCC@CBBBCBBACAC8

Output::

    @SRR066787.2497 WICMT-SOLEXA:8:1:28:463 length=36
    GTGATTAAGAAGAGACTGGCATCACTAAGGTGACAT
    +SRR066787.2497 WICMT-SOLEXA:8:1:28:463 length=36
    @A=BBCBBAA@:@:@@@:,?AB:B?BB=*2:@=?AA
    @SRR066787.2499 WICMT-SOLEXA:8:1:28:997 length=36
    CGACTTCAGGCTCTCGCTAGCCTTCGCTTGACTGAC
    +SRR066787.2499 WICMT-SOLEXA:8:1:28:997 length=36
    BCCBCCB?A1ACAC>;@CCAAABB?8=BA>@?B?@:
    @SRR066787.2500 WICMT-SOLEXA:8:1:28:582 length=36
    TCTCTCTCTTTCTCTCTCTCTCTCTCTCTCTCTCTC
    +SRR066787.2500 WICMT-SOLEXA:8:1:28:582 length=36
    ?.?.=9C8CCC:BACBCBC?CCC@CBBBCBBACAC8

  </help>
</tool>
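
The wrapper calls sampline.py, which is not part of this listing. As a rough illustration of the behavior the help text describes (records of recSize lines, the first nSkip lines copied through unchanged, replacement forced when over-sampling), here is a minimal Python sketch. The function name `sample_records`, its parameters, the dropping of a trailing partial record, and the choice to emit sampled records in file order are all assumptions made for illustration; the actual script may differ.

```python
import random


def sample_records(lines, n_sample, rec_size=1, n_skip=0,
                   replacement=False, rng=None):
    """Hypothetical sketch of record sampling; not the real sampline.py."""
    if rng is None:
        rng = random.Random()

    # The first n_skip lines (e.g. a track header) are passed through as-is.
    header = list(lines[:n_skip])
    body = lines[n_skip:]

    # Group the remaining lines into fixed-size records.
    # Assumption: a trailing incomplete record is ignored.
    n_records = len(body) // rec_size
    if n_records == 0:
        return header
    records = [body[i * rec_size:(i + 1) * rec_size]
               for i in range(n_records)]

    # Over-sampling is only possible with replacement, so force it.
    if n_sample > n_records:
        replacement = True

    if replacement:
        picked = [rng.randrange(n_records) for _ in range(n_sample)]
    else:
        picked = rng.sample(range(n_records), n_sample)

    # Assumption: output keeps the input order, as in the help examples.
    out = header
    for idx in sorted(picked):
        out.extend(records[idx])
    return out
```

With a list of FASTQ lines, `sample_records(fastq_lines, 3, rec_size=4)` would return 12 lines made of three intact four-line reads, which is the shape of Example 2 in the help text.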