diff test-data/hd_output.tab @ 25:9e384b0741f1 draft
planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit b8a2f7b7615b2bcd3b602027af31f4e677da94f6-dirty
author | mheinzl |
---|---|
date | Tue, 14 May 2019 03:29:37 -0400 |
parents | |
children | 6b15b3b6405c |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/hd_output.tab Tue May 14 03:29:37 2019 -0400
@@ -0,0 +1,77 @@
+hd_data.tab
+number of tags per file 20 (from 20) against 20
+
+Hamming distance separated by family size
+      FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+HD=1  5     1     1     1     1        0      9
+HD=6  3     0     0     0     0        0      3
+HD=7  4     0     0     0     1        0      5
+HD=8  2     0     0     1     0        0      3
+sum   14    1     1     2     2        0      20
+
+Family size distribution separated by Hamming distance
+      HD=1  HD=2  HD=3  HD=4  HD=5-8  HD>8  sum
+FS=1  5     0     0     0     9       0     14
+FS=2  1     0     0     0     0       0     1
+FS=3  1     0     0     0     0       0     1
+FS=4  1     0     0     0     1       0     2
+FS=6  1     0     0     0     0       0     1
+FS=7  0     0     0     0     1       0     1
+sum   9     0     0     0     11      0     20
+
+
+max. family size in sample: 7
+absolute frequency: 1
+relative frequency: 0.05
+
+The Hamming distances were calculated by comparing the first halve against all halves and selected the minimum value (HD a).
+For the second half of the tag, we compared them against all tags which resulted in the minimum HD of the previous step and selected the maximum value (HD b').
+Finally, it was possible to calculate the absolute and relative differences between the HDs (absolute and relative delta HD).
+These calculations were repeated, but starting with the second half in the first step to find all possible chimeras in the data (HD b and HD For simplicity we used the maximum value between the delta values in the end.
+When only tags that can form DCS were allowed in the analysis, family sizes for the forward and reverse (ab and ba) will be included in the plots.
+length of one part of the tag = 12
+
+Hamming distance of each half in the tag
+       HD a  HD b'  HD b  HD a'  HD a+b  sum
+HD=0   20    0      8     1      0       29
+HD=1   0     0      1     19     8       28
+HD=2   0     0      0     0      1       1
+HD=5   0     0      3     0      0       3
+HD=6   0     0      2     0      3       5
+HD=7   0     1      6     0      4       11
+HD=8   0     2      0     0      7       9
+HD=9   0     1      0     0      1       2
+HD=10  0     2      0     0      2       4
+HD=11  0     7      0     0      7       14
+HD=12  0     7      0     0      7       14
+sum    20    20     20    20     40      120
+
+Absolute delta Hamming distances within the tag
+         FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+diff=7   1     0     0     0     0        0      1
+diff=8   1     0     0     0     1        0      2
+diff=9   1     0     0     0     0        0      1
+diff=10  2     0     0     0     0        0      2
+diff=11  4     0     1     1     1        0      7
+diff=12  5     1     0     1     0        0      7
+sum      14    1     1     2     2        0      20
+
+Chimera analysis: relative delta Hamming distances
+          FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+diff=1.0  14    1     1     2     2        0      20
+sum       14    1     1     2     2        0      20
+
+Chimeras:
+All tags were filtered: only those tags where at least one half was identical (HD=0) and therefore, had a relative delta of 1 were kept. These tags are considered as chimeric.
+So the Hamming distances of the chimeric tags are shown.
+Hamming distances of chimeras
+       FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+HD=7   1     0     0     0     0        0      1
+HD=8   1     0     0     0     1        0      2
+HD=9   1     0     0     0     0        0      1
+HD=10  2     0     0     0     0        0      2
+HD=11  4     0     1     1     1        0      7
+HD=12  5     1     0     1     0        0      7
+sum    14    1     1     2     2        0      20
+
+
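The procedure described in the test output (minimum HD for one half, maximum HD of the other half among the best-matching tags, then absolute and relative delta, repeated in the opposite direction) can be illustrated with a minimal Python sketch. This is not the tool's code. The function names `hamming`, `delta_for_direction` and `analyse_tag` are hypothetical. Two points are assumptions: that each half is compared against the corresponding half of the other tags, and that the relative delta is the absolute delta divided by the sum of the two HDs. The second assumption is at least consistent with the statement that a tag with one identical half (HD=0) has a relative delta of 1.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def delta_for_direction(first, second, partners):
    """Run the two-step comparison for one direction.

    partners: list of (first_half, second_half) tuples of the other tags.
    Step 1: minimum HD of `first` against every partner's first half.
    Step 2: among partners reaching that minimum, maximum HD of `second`.
    """
    if not partners:
        raise ValueError("need at least one other tag to compare against")
    dists = [hamming(first, p_first) for p_first, _ in partners]
    hd_min = min(dists)
    hd_max = max(
        hamming(second, p_second)
        for (_, p_second), d in zip(partners, dists)
        if d == hd_min
    )
    abs_delta = abs(hd_max - hd_min)
    total = hd_min + hd_max
    # Assumed definition: relative delta = absolute delta / (sum of both HDs).
    rel_delta = abs_delta / total if total else 0.0
    return hd_min, hd_max, abs_delta, rel_delta


def analyse_tag(tag, others, half_len):
    """Compute both directions and keep the larger relative delta."""
    a, b = tag[:half_len], tag[half_len:]
    partners = [(t[:half_len], t[half_len:]) for t in others]
    forward = delta_for_direction(a, b, partners)        # HD a, HD b'
    swapped = [(p_b, p_a) for p_a, p_b in partners]
    backward = delta_for_direction(b, a, swapped)        # HD b, HD a'
    rel_delta = max(forward[3], backward[3])
    return {
        "forward": forward,
        "backward": backward,
        "rel_delta": rel_delta,
        # A relative delta of 1 means one half matched exactly (HD=0).
        "is_chimera": rel_delta == 1.0,
    }


# Toy tags with 4-base halves (the test data above uses 12-base halves).
result = analyse_tag("AAAACCCC", ["AAAAGGGG", "TTTTCCCA"], half_len=4)
print(result)
```

In the toy call, the first half of the query tag matches another tag exactly while the second half differs in every position, so the sketch flags it as a chimera candidate under the assumed definition.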