diff test-data/output_file.tabular @ 19:2e9f7ea7ae93 draft
planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit dfaab79252a858e8df16bbea3607ebf1b6962e5a-dirty
author | mheinzl |
---|---|
date | Mon, 08 Oct 2018 05:56:04 -0400 |
parents | |
children | 7e570ba56b83 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/output_file.tabular Mon Oct 08 05:56:04 2018 -0400
@@ -0,0 +1,85 @@
+Test_data
+number of tags per file 20 (from 20) against 20
+
+Hamming distance separated by family size
+       FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+HD=1   5     1     1     1     1        0      9
+HD=6   3     0     0     0     0        0      3
+HD=7   4     0     0     0     1        0      5
+HD=8   2     0     0     1     0        0      3
+sum    14    1     1     2     2        0      20
+
+Family size distribution separated by Hamming distance
+       HD=1  HD=2  HD=3  HD=4  HD=5-8  HD>8  sum
+FS=1   5     0     0     0     9       0     14
+FS=2   1     0     0     0     0       0     1
+FS=3   1     0     0     0     0       0     1
+FS=4   1     0     0     0     1       0     2
+FS=6   1     0     0     0     0       0     1
+FS=7   0     0     0     0     1       0     1
+sum    9     0     0     0     11      0     20
+
+
+max. family size: 7
+absolute frequency: 1
+relative frequency: 0.05
+
+The hamming distances were calculated by comparing each half of all tags against the tag(s) with the minimum Hamming distance per half.
+It is possible that one tag can have the minimum HD from multiple tags, so the sample size in this calculation differs from the sample size entered by the user.
+actual number of tags with min HD = 171 (sample size by user = 20)
+length of one part of the tag = 12
+
+Hamming distance of each half in the tag
+       HD a  HD b'  HD b  HD a'  HD a+b  sum
+HD=0   146   0      8     4      0       158
+HD=1   0     2      2     21     11      36
+HD=2   0     0      0     0      1       1
+HD=5   0     0      4     0      0       4
+HD=6   0     2      2     0      6       10
+HD=7   0     16     9     0      21      46
+HD=8   0     20     0     0      26      46
+HD=9   0     50     0     0      50      100
+HD=10  0     30     0     0      30      60
+HD=11  0     18     0     0      18      36
+HD=12  0     8      0     0      8       16
+sum    146   146    25    25     171     513
+
+Absolute delta Hamming distances within the tag
+         FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+diff=0   1     0     0     0     0        0      1
+diff=1   6     1     2     1     1        0      11
+diff=4   4     0     0     0     0        0      4
+diff=5   2     0     0     0     0        0      2
+diff=6   6     0     0     1     1        0      8
+diff=7   15    0     1     0     3        0      19
+diff=8   15    2     0     1     2        0      20
+diff=9   37    4     1     4     4        0      50
+diff=10  22    2     1     4     1        0      30
+diff=11  8     1     1     5     3        0      18
+diff=12  6     1     0     1     0        0      8
+sum      122   11    6     17    15       0      171
+
+Chimera analysis: relative delta Hamming distances
+          FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+diff=0.0  1     0     0     0     0        0      1
+diff=0.7  6     0     0     0     0        0      6
+diff=0.8  4     0     0     1     1        0      6
+diff=1.0  111   11    6     16    14       0      158
+sum       122   11    6     17    15       0      171
+
+Chimeras:
+All tags were filtered: only those tags where at least one half is identical with the half of the min. tag are kept.
+So the hamming distance of the non-identical half is compared.
+Hamming distances of non-zero half
+       FS=1  FS=2  FS=3  FS=4  FS=5-10  FS>10  sum
+HD=1   6     1     2     1     1        0      11
+HD=6   2     0     0     0     0        0      2
+HD=7   15    0     1     0     3        0      19
+HD=8   15    2     0     1     2        0      20
+HD=9   37    4     1     4     4        0      50
+HD=10  22    2     1     4     1        0      30
+HD=11  8     1     1     5     3        0      18
+HD=12  6     1     0     1     0        0      8
+sum    111   11    6     16    14       0      158
+
+