# HG changeset patch # User iuc # Date 1537365229 14400 # Node ID f56bdb93ae582bd61524e034b475f9f181273df1 # Parent cab3f8d3598975faa54d99fe53c46b768b4b005d planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tool_collections/samtools/samtools_sort commit 9f6dd28ae31897068c9f8b5d842750d5d7cd600c diff -r cab3f8d35989 -r f56bdb93ae58 macros.xml --- a/macros.xml Tue May 09 11:18:12 2017 -0400 +++ b/macros.xml Wed Sep 19 09:53:49 2018 -0400 @@ -1,11 +1,93 @@ - samtools + samtools - 1.3.1 + 1.9 + #set $flags = sum(map(int, str($filter).split(','))) + + + + + + + + + + + + + + + + + + + + @@ -49,21 +131,4 @@ - ------ - -.. class:: warningmark - -**No options available? How to re-detect metadata** - -If you see a "No options available" within the "**Select references (chromosomes and contigs) you would like to restrict bam to**" drop down, you need to re-detect metadata for the dataset you are trying to process. To do this follow these steps: - -1. Click on the **pencil** icon adjacent to the dataset in the history -2. A new menu will appear in the center pane of the interface -3. Click **Datatype** tab -4. Set **New Type** to **BAM** -5. Click **Save** - -The medatada will be re-detected and you will be able to see the list of reference sequences in the "**Select references (chromosomes and contigs) you would like to restrict bam to**" drop-down. - diff -r cab3f8d35989 -r f56bdb93ae58 samtools_sort.xml --- a/samtools_sort.xml Tue May 09 11:18:12 2017 -0400 +++ b/samtools_sort.xml Wed Sep 19 09:53:49 2018 -0400 @@ -1,4 +1,4 @@ - + order of storing aligned sequences macros.xml @@ -7,39 +7,163 @@ '${output1}' ]]> - - - - - + + + + + + + + + + + + + + + + + + - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - + + + + **What it does** -This tool uses ``samtools sort`` command to sort BAM datasets in coordinate or read name order. 
+Sort alignments by leftmost coordinates, or by read name when -n is used.
+An appropriate @HD-SO sort order header tag will be added or an existing
+one updated if necessary.
+
+**Ordering Rules**
+
+The following rules are used for ordering records.
+
+If option -t is in use, records are first sorted by the value of the given
+alignment tag, and then by position or name (if using -n). For example, “-t RG”
+will make read group the primary sort key. The rules for ordering by tag are:
+
+- Records that do not have the tag are sorted before ones that do.
+- If the types of the tags are different, they will be sorted so that single
+  character tags (type A) come before array tags (type B), then string tags
+  (types H and Z), then numeric tags (types f and i).
+- Numeric tags (types f and i) are compared by value. Note that comparisons of
+  floating-point values are subject to issues of rounding and precision.
+- String tags (types H and Z) are compared based on the binary contents of the
+  tag using the C strcmp(3) function.
+- Character tags (type A) are compared by binary character value.
+- No attempt is made to compare tags of other types; notably, type B array values will not be compared.
+
+When the -n option is present, records are sorted by name. Names are compared so as to give a “natural” ordering, i.e. sections consisting of digits are compared numerically while all other sections are compared based on their binary representation. This means “a1” will come before “b1” and “a9” will come before “a10”. Records with the same name will be ordered according to the values of the READ1 and READ2 flags (see flags).
+
+When the -n option is not present, reads are sorted by reference (according to the order of the @SQ header records), then by position in the reference, and then by the REVERSE flag.
+
+This has now been removed. The previous out.prefix argument (and -f option, if any) should be changed to an appropriate combination of -T PREFIX and -o FILE. The previous -o option should be removed, as output defaults to standard output.
+
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/name.sort.expected.bam
Binary file test-data/name.sort.expected.bam has changed
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/name.sort.expected.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/name.sort.expected.sam Wed Sep 19 09:53:49 2018 -0400
@@ -0,0 +1,28 @@
+@HD VN:1.4 SO:queryname
+@SQ SN:insert LN:599
+@SQ SN:ref1 LN:45
+@SQ SN:ref2 LN:40
+@SQ SN:ref3 LN:4
+@RG ID:fish PG:donkey
+@RG ID:cow PU:13_&^&&*(:332
+@RG PU:*9u8jkjjkjd: ID:colt
+@PG ID:bull PP:donkey
+@PG ID:donkey
+@PG ID:moose
+@PG PP:moose ID:cow
+@CO
+r000 99 insert 50 30 10M = 80 30 ATTTAGCTAC AAAAAAAAAA RG:Z:cow PG:Z:bull
+r000 211 insert 80 30 10M = 50 -30 CCCAATCATT AAAAAAAAAA RG:Z:cow PG:Z:bull
+r001 83 ref1 37 30 9M = 7 -39 CAGCGCCAT * RG:Z:fish PG:Z:colt
+r001 163 ref1 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112 YY:i:100 RG:Z:fish PG:Z:colt
+r002 0 ref1 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA * XA:Z:abc XB:i:-10 PG:Z:colt
+r003 0 ref1 9 30 5H6M * 0 0 AGCTAA * RG:Z:cow
+r003 16 ref1 29 30 6H5M * 0 0 TAGGC * RG:Z:cow PG:Z:colt
+r004 0 ref1 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC * RG:Z:colt PG:Z:colt
+u1 4 * 0 30 23M * 0 0 TAATTAAGTCTACAGAAAAAAAA ???????????????????????
+x1 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA * RG:Z:colt PG:Z:bull
+x2 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ????????????????????? RG:Z:colt PG:Z:bull
+x3 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ?????????????????????????? RG:Z:fish PG:Z:bull
+x4 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ????????????????????????? RG:Z:fish PG:Z:bull
+x5 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ???????????????????????? RG:Z:fish PG:Z:bull
+x6 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ??????????????????????? RG:Z:cow
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/pos.sort.expected.bam
Binary file test-data/pos.sort.expected.bam has changed
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/pos.sort.expected.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/pos.sort.expected.sam Wed Sep 19 09:53:49 2018 -0400
@@ -0,0 +1,28 @@
+@HD VN:1.4 SO:coordinate
+@SQ SN:insert LN:599
+@SQ SN:ref1 LN:45
+@SQ SN:ref2 LN:40
+@SQ SN:ref3 LN:4
+@RG ID:fish PG:donkey
+@RG ID:cow PU:13_&^&&*(:332
+@RG PU:*9u8jkjjkjd: ID:colt
+@PG ID:bull PP:donkey
+@PG ID:donkey
+@PG ID:moose
+@PG PP:moose ID:cow
+@CO
+r000 99 insert 50 30 10M = 80 30 ATTTAGCTAC AAAAAAAAAA RG:Z:cow PG:Z:bull
+r000 211 insert 80 30 10M = 50 -30 CCCAATCATT AAAAAAAAAA RG:Z:cow PG:Z:bull
+r001 163 ref1 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112 YY:i:100 RG:Z:fish PG:Z:colt
+r002 0 ref1 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA * XA:Z:abc XB:i:-10 PG:Z:colt
+r003 0 ref1 9 30 5H6M * 0 0 AGCTAA * RG:Z:cow
+r004 0 ref1 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC * RG:Z:colt PG:Z:colt
+r003 16 ref1 29 30 6H5M * 0 0 TAGGC * RG:Z:cow PG:Z:colt
+r001 83 ref1 37 30 9M = 7 -39 CAGCGCCAT * RG:Z:fish PG:Z:colt
+x1 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA * RG:Z:colt PG:Z:bull
+x2 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ????????????????????? RG:Z:colt PG:Z:bull
+x3 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ?????????????????????????? RG:Z:fish PG:Z:bull
+x4 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ????????????????????????? RG:Z:fish PG:Z:bull
+x5 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ???????????????????????? RG:Z:fish PG:Z:bull
+x6 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ??????????????????????? RG:Z:cow
+u1 4 * 0 30 23M * 0 0 TAATTAAGTCTACAGAAAAAAAA ???????????????????????
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.as.sort.expected.bam
Binary file test-data/tag.as.sort.expected.bam has changed
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.as.sort.expected.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/tag.as.sort.expected.sam Wed Sep 19 09:53:49 2018 -0400
@@ -0,0 +1,24 @@
+@HD VN:1.4 SO:unknown
+@SQ SN:insert LN:599
+@SQ SN:ref1 LN:45
+@SQ SN:ref2 LN:40
+@SQ SN:ref3 LN:4
+@PG ID:llama
+@RG ID:fish PG:llama
+@RG ID:cow PU:13_&^&&*(:332 PG:donkey
+@RG PU:*9u8jkjjkjd: ID:colt
+@PG ID:bull PP:donkey
+@PG ID:donkey
+@CO Do you know?
+r006 16 ref1 29 30 6H5M * 0 0 TAGGC * RG:Z:colt PG:Z:donkey FI:i:3
+x11 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ???????????????????????? RG:Z:cow PG:Z:bull FI:Z:a
+r007 0 ref1 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC * RG:Z:colt PG:Z:donkey AS:i:-5 FI:f:3.5
+x10 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ????????????????????????? RG:Z:cow PG:Z:bull AS:i:0 FI:A:b
+r007 0 ref1 9 30 5H6M * 0 0 AGCTAA * RG:Z:colt PG:Z:donkey AS:i:1 FI:i:4
+r005 163 ref1 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112 YY:i:100 RG:Z:colt PG:Z:donkey AS:i:10 FI:i:5
+x8 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ????????????????????? RG:Z:cow PG:Z:bull AS:i:10 FI:f:1.5
+r006 0 ref1 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA * XA:Z:abc XB:i:-10 RG:Z:colt PG:Z:donkey AS:i:20 FI:f:4.5
+x9 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ?????????????????????????? RG:Z:cow PG:Z:bull AS:i:20 FI:i:1
+x7 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA * RG:Z:cow PG:Z:bull AS:i:50 FI:i:2
+r005 83 ref1 37 30 9M = 7 -39 CAGCGCCAT * RG:Z:colt PG:Z:donkey AS:i:100 FI:f:2.5
+x12 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ??????????????????????? RG:Z:cow PG:Z:bull AS:i:65100
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.fi.sort.expected.bam
Binary file test-data/tag.fi.sort.expected.bam has changed
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.fi.sort.expected.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/tag.fi.sort.expected.sam Wed Sep 19 09:53:49 2018 -0400
@@ -0,0 +1,24 @@
+@HD VN:1.4 SO:unknown
+@SQ SN:insert LN:599
+@SQ SN:ref1 LN:45
+@SQ SN:ref2 LN:40
+@SQ SN:ref3 LN:4
+@PG ID:llama
+@RG ID:fish PG:llama
+@RG ID:cow PU:13_&^&&*(:332 PG:donkey
+@RG PU:*9u8jkjjkjd: ID:colt
+@PG ID:bull PP:donkey
+@PG ID:donkey
+@CO Do you know?
+x12 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ??????????????????????? RG:Z:cow PG:Z:bull AS:i:65100
+x10 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ????????????????????????? RG:Z:cow PG:Z:bull AS:i:0 FI:A:b
+x11 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ???????????????????????? RG:Z:cow PG:Z:bull FI:Z:a
+x9 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ?????????????????????????? RG:Z:cow PG:Z:bull AS:i:20 FI:i:1
+x8 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ????????????????????? RG:Z:cow PG:Z:bull AS:i:10 FI:f:1.5
+x7 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA * RG:Z:cow PG:Z:bull AS:i:50 FI:i:2
+r005 83 ref1 37 30 9M = 7 -39 CAGCGCCAT * RG:Z:colt PG:Z:donkey AS:i:100 FI:f:2.5
+r006 16 ref1 29 30 6H5M * 0 0 TAGGC * RG:Z:colt PG:Z:donkey FI:i:3
+r007 0 ref1 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC * RG:Z:colt PG:Z:donkey AS:i:-5 FI:f:3.5
+r007 0 ref1 9 30 5H6M * 0 0 AGCTAA * RG:Z:colt PG:Z:donkey AS:i:1 FI:i:4
+r006 0 ref1 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA * XA:Z:abc XB:i:-10 RG:Z:colt PG:Z:donkey AS:i:20 FI:f:4.5
+r005 163 ref1 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112 YY:i:100 RG:Z:colt PG:Z:donkey AS:i:10 FI:i:5
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.rg.n.sort.expected.bam
Binary file test-data/tag.rg.n.sort.expected.bam has changed
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.rg.n.sort.expected.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/tag.rg.n.sort.expected.sam Wed Sep 19 09:53:49 2018 -0400
@@ -0,0 +1,28 @@
+@HD VN:1.4 SO:unknown
+@SQ SN:insert LN:599
+@SQ SN:ref1 LN:45
+@SQ SN:ref2 LN:40
+@SQ SN:ref3 LN:4
+@RG ID:fish PG:donkey
+@RG ID:cow PU:13_&^&&*(:332
+@RG PU:*9u8jkjjkjd: ID:colt
+@PG ID:bull PP:donkey
+@PG ID:donkey
+@PG ID:moose
+@PG PP:moose ID:cow
+@CO
+r002 0 ref1 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA * XA:Z:abc XB:i:-10 PG:Z:colt
+u1 4 * 0 30 23M * 0 0 TAATTAAGTCTACAGAAAAAAAA ???????????????????????
+r004 0 ref1 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC * RG:Z:colt PG:Z:colt
+x1 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA * RG:Z:colt PG:Z:bull
+x2 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ????????????????????? RG:Z:colt PG:Z:bull
+r000 99 insert 50 30 10M = 80 30 ATTTAGCTAC AAAAAAAAAA RG:Z:cow PG:Z:bull
+r000 211 insert 80 30 10M = 50 -30 CCCAATCATT AAAAAAAAAA RG:Z:cow PG:Z:bull
+r003 0 ref1 9 30 5H6M * 0 0 AGCTAA * RG:Z:cow
+r003 16 ref1 29 30 6H5M * 0 0 TAGGC * RG:Z:cow PG:Z:colt
+x6 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ??????????????????????? RG:Z:cow
+r001 83 ref1 37 30 9M = 7 -39 CAGCGCCAT * RG:Z:fish PG:Z:colt
+r001 163 ref1 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112 YY:i:100 RG:Z:fish PG:Z:colt
+x3 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ?????????????????????????? RG:Z:fish PG:Z:bull
+x4 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ????????????????????????? RG:Z:fish PG:Z:bull
+x5 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ???????????????????????? RG:Z:fish PG:Z:bull
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.rg.sort.expected.bam
Binary file test-data/tag.rg.sort.expected.bam has changed
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/tag.rg.sort.expected.sam
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/tag.rg.sort.expected.sam Wed Sep 19 09:53:49 2018 -0400
@@ -0,0 +1,28 @@
+@HD VN:1.4 SO:unknown
+@SQ SN:insert LN:599
+@SQ SN:ref1 LN:45
+@SQ SN:ref2 LN:40
+@SQ SN:ref3 LN:4
+@RG ID:fish PG:donkey
+@RG ID:cow PU:13_&^&&*(:332
+@RG PU:*9u8jkjjkjd: ID:colt
+@PG ID:bull PP:donkey
+@PG ID:donkey
+@PG ID:moose
+@PG PP:moose ID:cow
+@CO
+r002 0 ref1 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA * XA:Z:abc XB:i:-10 PG:Z:colt
+u1 4 * 0 30 23M * 0 0 TAATTAAGTCTACAGAAAAAAAA ???????????????????????
+r004 0 ref1 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC * RG:Z:colt PG:Z:colt
+x1 0 ref2 1 30 20M * 0 0 AGGTTTTATAAAACAAATAA * RG:Z:colt PG:Z:bull
+x2 0 ref2 2 30 21M * 0 0 GGTTTTATAAAACAAATAATT ????????????????????? RG:Z:colt PG:Z:bull
+r000 99 insert 50 30 10M = 80 30 ATTTAGCTAC AAAAAAAAAA RG:Z:cow PG:Z:bull
+r000 211 insert 80 30 10M = 50 -30 CCCAATCATT AAAAAAAAAA RG:Z:cow PG:Z:bull
+r003 0 ref1 9 30 5H6M * 0 0 AGCTAA * RG:Z:cow
+r003 16 ref1 29 30 6H5M * 0 0 TAGGC * RG:Z:cow PG:Z:colt
+x6 0 ref2 14 30 23M * 0 0 TAATTAAGTCTACAGAGCAACTA ??????????????????????? RG:Z:cow
+r001 163 ref1 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112 YY:i:100 RG:Z:fish PG:Z:colt
+r001 83 ref1 37 30 9M = 7 -39 CAGCGCCAT * RG:Z:fish PG:Z:colt
+x3 0 ref2 6 30 9M4I13M * 0 0 TTATAAAACAAATAATTAAGTCTACA ?????????????????????????? RG:Z:fish PG:Z:bull
+x4 0 ref2 10 30 25M * 0 0 CAAATAATTAAGTCTACAGAGCAAC ????????????????????????? RG:Z:fish PG:Z:bull
+x5 0 ref2 12 30 24M * 0 0 AATAATTAAGTCTACAGAGCAACT ???????????????????????? RG:Z:fish PG:Z:bull
diff -r cab3f8d35989 -r f56bdb93ae58 test-data/test_input_1_a.bam
Binary file test-data/test_input_1_a.bam has changed
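
Note on the -n ordering described in the samtools_sort.xml help text of this changeset: the "natural" name comparison can be illustrated with a small Python sketch. This is not the samtools implementation and is not part of the patch. The names natural_cmp and _TOKEN are hypothetical, and the sketch assumes that a read name is split into runs of ASCII digits and runs of other bytes, with digit runs compared as integers and everything else compared bytewise. Corner cases such as leading zeros or punctuation next to digits may be ordered differently by samtools itself.

```python
import re
from functools import cmp_to_key

# Hypothetical helper: split a name into runs of ASCII digits and runs of other bytes.
_TOKEN = re.compile(rb"\d+|\D+")


def natural_cmp(a: str, b: str) -> int:
    """Compare two read names so that digit runs are ordered numerically.

    Illustrative only; assumes the tokenisation described above and does not
    reproduce samtools' own comparison function.
    """
    tokens_a = _TOKEN.findall(a.encode())
    tokens_b = _TOKEN.findall(b.encode())
    for x, y in zip(tokens_a, tokens_b):
        if x.isdigit() and y.isdigit():
            # Both sections are digits: compare by numeric value.
            nx, ny = int(x), int(y)
            if nx != ny:
                return -1 if nx < ny else 1
        elif x != y:
            # Otherwise compare the binary representation.
            return -1 if x < y else 1
    # All shared sections are equal: the name with fewer sections sorts first.
    return (len(tokens_a) > len(tokens_b)) - (len(tokens_a) < len(tokens_b))


names = ["a10", "b1", "a9", "a1"]
print(sorted(names, key=cmp_to_key(natural_cmp)))  # ['a1', 'a9', 'a10', 'b1']
```

A plain string sort would place "a10" before "a9". The sketch keeps "a9" first, matching the example given in the help text.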