view test-data/hd_output.tab @ 26:15d5da04ef70 draft
planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit b8a2f7b7615b2bcd3b602027af31f4e677da94f6-dirty
author | mheinzl |
---|---|
date | Tue, 14 May 2019 03:48:39 -0400 |
parents | 9e384b0741f1 |
children | 6b15b3b6405c |
hd_data.tab
number of tags per file 20 (from 20) against 20

Hamming distance separated by family size

| | FS=1 | FS=2 | FS=3 | FS=4 | FS=5-10 | FS>10 | sum |
|---|---|---|---|---|---|---|---|
| HD=1 | 5 | 1 | 1 | 1 | 1 | 0 | 9 |
| HD=6 | 3 | 0 | 0 | 0 | 0 | 0 | 3 |
| HD=7 | 4 | 0 | 0 | 0 | 1 | 0 | 5 |
| HD=8 | 2 | 0 | 0 | 1 | 0 | 0 | 3 |
| sum | 14 | 1 | 1 | 2 | 2 | 0 | 20 |

Family size distribution separated by Hamming distance

| | HD=1 | HD=2 | HD=3 | HD=4 | HD=5-8 | HD>8 | sum |
|---|---|---|---|---|---|---|---|
| FS=1 | 5 | 0 | 0 | 0 | 9 | 0 | 14 |
| FS=2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| FS=3 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| FS=4 | 1 | 0 | 0 | 0 | 1 | 0 | 2 |
| FS=6 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| FS=7 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| sum | 9 | 0 | 0 | 0 | 11 | 0 | 20 |

max. family size in sample: 7
absolute frequency: 1
relative frequency: 0.05

The Hamming distances were calculated by comparing the first half against all halves and selecting the minimum value (HD a). The second half of the tag was then compared against all tags that gave the minimum HD in the previous step, and the maximum value was selected (HD b'). Finally, the absolute and relative differences between the HDs were calculated (absolute and relative delta HD). These calculations were repeated, but starting with the second half in the first step, to find all possible chimeras in the data (HD b and HD a'). For simplicity, we used the maximum of the two delta values in the end. When only tags that can form a DCS are allowed in the analysis, family sizes for the forward and reverse (ab and ba) are included in the plots.
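The half-wise comparison described above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the function names `hamming` and `half_distances` are hypothetical, "all halves" is assumed to mean the first halves of all other tags, and the relative delta is assumed to be the absolute delta divided by the sum of both HDs (which is consistent with a relative delta of 1 whenever exactly one half is identical).

```python
def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def half_distances(tag, others):
    """Return (HD a, HD b', absolute delta, relative delta) for one tag.

    `others` holds the tags to compare against (the tag itself excluded).
    Hypothetical helper; the names are not taken from the tool.
    """
    mid = len(tag) // 2
    first, second = tag[:mid], tag[mid:]

    # Step 1: minimum HD of the first half against the other first halves.
    dists_first = [hamming(first, other[:mid]) for other in others]
    hd_a = min(dists_first)

    # Step 2: among the tags that reached this minimum, take the maximum
    # HD of the second half.
    hd_b_prime = max(
        hamming(second, other[mid:])
        for other, dist in zip(others, dists_first)
        if dist == hd_a
    )

    abs_delta = abs(hd_a - hd_b_prime)
    total = hd_a + hd_b_prime
    # Assumption: relative delta = absolute delta / (HD a + HD b'),
    # defined as 0.0 when both halves are identical.
    rel_delta = abs_delta / total if total else 0.0
    return hd_a, hd_b_prime, abs_delta, rel_delta
```

For example, `half_distances("AAAACCCC", ["AAAAGGGG", "TTTTCCCC"])` returns `(0, 4, 4, 1.0)`: the first half matches `AAAAGGGG` exactly, while the second half differs from it at all four positions. Running the same steps with the halves swapped would give the second pair of values (HD b and the corresponding HD of the first half).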
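length of one part of the tag = 12

Hamming distance of each half in the tag

| | HD a | HD b' | HD b | HD a' | HD a+b | sum |
|---|---|---|---|---|---|---|
| HD=0 | 20 | 0 | 8 | 1 | 0 | 29 |
| HD=1 | 0 | 0 | 1 | 19 | 8 | 28 |
| HD=2 | 0 | 0 | 0 | 0 | 1 | 1 |
| HD=5 | 0 | 0 | 3 | 0 | 0 | 3 |
| HD=6 | 0 | 0 | 2 | 0 | 3 | 5 |
| HD=7 | 0 | 1 | 6 | 0 | 4 | 11 |
| HD=8 | 0 | 2 | 0 | 0 | 7 | 9 |
| HD=9 | 0 | 1 | 0 | 0 | 1 | 2 |
| HD=10 | 0 | 2 | 0 | 0 | 2 | 4 |
| HD=11 | 0 | 7 | 0 | 0 | 7 | 14 |
| HD=12 | 0 | 7 | 0 | 0 | 7 | 14 |
| sum | 20 | 20 | 20 | 20 | 40 | 120 |

Absolute delta Hamming distances within the tag

| | FS=1 | FS=2 | FS=3 | FS=4 | FS=5-10 | FS>10 | sum |
|---|---|---|---|---|---|---|---|
| diff=7 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| diff=8 | 1 | 0 | 0 | 0 | 1 | 0 | 2 |
| diff=9 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| diff=10 | 2 | 0 | 0 | 0 | 0 | 0 | 2 |
| diff=11 | 4 | 0 | 1 | 1 | 1 | 0 | 7 |
| diff=12 | 5 | 1 | 0 | 1 | 0 | 0 | 7 |
| sum | 14 | 1 | 1 | 2 | 2 | 0 | 20 |

Chimera analysis: relative delta Hamming distances

| | FS=1 | FS=2 | FS=3 | FS=4 | FS=5-10 | FS>10 | sum |
|---|---|---|---|---|---|---|---|
| diff=1.0 | 14 | 1 | 1 | 2 | 2 | 0 | 20 |
| sum | 14 | 1 | 1 | 2 | 2 | 0 | 20 |

Chimeras: All tags were filtered: only those tags where at least one half was identical (HD=0), and which therefore had a relative delta of 1, were kept. These tags are considered chimeric. The Hamming distances of the chimeric tags are shown in the following table.

Hamming distances of chimeras

| | FS=1 | FS=2 | FS=3 | FS=4 | FS=5-10 | FS>10 | sum |
|---|---|---|---|---|---|---|---|
| HD=7 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| HD=8 | 1 | 0 | 0 | 0 | 1 | 0 | 2 |
| HD=9 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| HD=10 | 2 | 0 | 0 | 0 | 0 | 0 | 2 |
| HD=11 | 4 | 0 | 1 | 1 | 1 | 0 | 7 |
| HD=12 | 5 | 1 | 0 | 1 | 0 | 0 | 7 |
| sum | 14 | 1 | 1 | 2 | 2 | 0 | 20 |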
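The chimera filter and the family-size binning used in the tables above could be sketched like this. It is an illustration under stated assumptions rather than the tool's implementation: `fs_bin` and `chimera_table` are hypothetical names, each record is assumed to be a `(family_size, hd_first_half, hd_second_half)` tuple, and a tag is treated as chimeric when exactly one half has HD=0 (the case in which the relative delta equals 1). The Hamming distance reported for a chimera is taken to be that of the non-identical half.

```python
from collections import Counter


def fs_bin(family_size):
    """Map a family size to the column labels used in the tables."""
    if family_size <= 4:
        return "FS=%d" % family_size
    return "FS=5-10" if family_size <= 10 else "FS>10"


def chimera_table(records):
    """Count chimeric tags per (Hamming distance, family-size bin).

    `records` is an iterable of (family_size, hd_first, hd_second)
    tuples; this input layout is an assumption made for illustration.
    """
    counts = Counter()
    for family_size, hd_first, hd_second in records:
        # Relative delta is 1 only if one half is identical (HD=0)
        # and the other half differs (HD>0).
        if min(hd_first, hd_second) == 0 and max(hd_first, hd_second) > 0:
            counts[(max(hd_first, hd_second), fs_bin(family_size))] += 1
    return counts
```

With the records `[(1, 0, 7), (6, 0, 8), (2, 3, 4), (1, 0, 0)]`, only the first two tags pass the filter, giving one count for `(7, "FS=1")` and one for `(8, "FS=5-10")`.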