view test-data/output_file.tabular @ 21:9919024d7778 draft

planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit b8a2f7b7615b2bcd3b602027af31f4e677da94f6
author mheinzl
date Fri, 14 Dec 2018 05:03:24 -0500
parents 2e9f7ea7ae93
children 7e570ba56b83

Test_data
number of tags per file	20 (from 20) against 20

Hamming distance separated by family size
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
HD=1	5	1	1	1	1	0	9	
HD=6	3	0	0	0	0	0	3	
HD=7	4	0	0	0	1	0	5	
HD=8	2	0	0	1	0	0	3	
sum	14	1	1	2	2	0	20	

Family size distribution separated by Hamming distance
	HD=1	HD=2	HD=3	HD=4	HD=5-8	HD>8	sum	
FS=1	5	0	0	0	9	0	14	
FS=2	1	0	0	0	0	0	1	
FS=3	1	0	0	0	0	0	1	
FS=4	1	0	0	0	1	0	2	
FS=6	1	0	0	0	0	0	1	
FS=7	0	0	0	0	1	0	1	
sum	9	0	0	0	11	0	20	


max. family size:	7
absolute frequency:	1
relative frequency:	0.05

The Hamming distances were calculated by comparing each half of every tag against the tag(s) with the minimum Hamming distance (HD) for that half.
One tag can have the minimum HD to several other tags, so the sample size in this calculation can differ from the sample size entered by the user.
actual number of tags with min HD = 171 (sample size by user = 20)
length of one part of the tag = 12
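
The comparison described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function names `hamming` and `min_hd_partners` and the toy 8-character tags are hypothetical, and the split into two equal halves is assumed from the "length of one part of the tag" line.

```python
def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("strings must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def min_hd_partners(tag, others, half_len):
    """Compare the first half of `tag` against the first half of every tag
    in `others`. Return the minimum HD of that half and, for every partner
    reaching the minimum, the HD of the second half (hypothetical helper)."""
    a, b = tag[:half_len], tag[half_len:]
    dists = [(hamming(a, o[:half_len]), hamming(b, o[half_len:])) for o in others]
    min_first = min(d[0] for d in dists)
    return min_first, [d[1] for d in dists if d[0] == min_first]


# Toy example: two partners share the minimum HD of 0 in the first half,
# so this single tag contributes two entries to the statistics.
result = min_hd_partners("AAAACCCC", ["AAAAGGGG", "AAAACCGT", "TTTTCCCC"], 4)
print(result)  # (0, [4, 2])
```

Ties at the minimum are kept rather than broken, which is one way the number of compared pairs (171 here) can exceed the sample size (20).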

Hamming distance of each half in the tag
	HD a	HD b'	HD b	HD a'	HD a+b	sum	
HD=0	146	0	8	4	0	158	
HD=1	0	2	2	21	11	36	
HD=2	0	0	0	0	1	1	
HD=5	0	0	4	0	0	4	
HD=6	0	2	2	0	6	10	
HD=7	0	16	9	0	21	46	
HD=8	0	20	0	0	26	46	
HD=9	0	50	0	0	50	100	
HD=10	0	30	0	0	30	60	
HD=11	0	18	0	0	18	36	
HD=12	0	8	0	0	8	16	
sum	146	146	25	25	171	513	

Absolute delta Hamming distances within the tag
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
diff=0	1	0	0	0	0	0	1	
diff=1	6	1	2	1	1	0	11	
diff=4	4	0	0	0	0	0	4	
diff=5	2	0	0	0	0	0	2	
diff=6	6	0	0	1	1	0	8	
diff=7	15	0	1	0	3	0	19	
diff=8	15	2	0	1	2	0	20	
diff=9	37	4	1	4	4	0	50	
diff=10	22	2	1	4	1	0	30	
diff=11	8	1	1	5	3	0	18	
diff=12	6	1	0	1	0	0	8	
sum	122	11	6	17	15	0	171	

Chimera analysis: relative delta Hamming distances
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
diff=0.0	1	0	0	0	0	0	1	
diff=0.7	6	0	0	0	0	0	6	
diff=0.8	4	0	0	1	1	0	6	
diff=1.0	111	11	6	16	14	0	158	
sum	122	11	6	17	15	0	171	
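
The counts in the two delta tables are consistent with the relative delta being the absolute delta divided by the sum of the two half distances, rounded to one decimal. That relationship is inferred from the numbers in this file and is not stated by the source. The sketch below uses a hypothetical function name, and the handling of a zero sum is an assumption.

```python
def delta_hd(hd_first, hd_second):
    """Return (absolute delta, relative delta) for the HDs of the two halves.
    Assumed formula: relative = |hd_first - hd_second| / (hd_first + hd_second)."""
    abs_delta = abs(hd_first - hd_second)
    total = hd_first + hd_second
    # Assumption: two identical tags (total == 0) get a relative delta of 0.0.
    rel_delta = abs_delta / total if total else 0.0
    return abs_delta, round(rel_delta, 1)


print(delta_hd(0, 9))  # (9, 1.0): one half identical, the other very different
print(delta_hd(5, 1))  # (4, 0.7)
print(delta_hd(1, 1))  # (0, 0.0): both halves differ equally
```

Under this reading, a relative delta of 1.0 means one half matches exactly while the other does not, which is the pattern the chimera filter looks for.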

Chimeras:
All tags were filtered: only tags where at least one half is identical to the corresponding half of the min. tag are kept.
The Hamming distance of the non-identical half is then compared.
Hamming distances of non-zero half
	FS=1	FS=2	FS=3	FS=4	FS=5-10	FS>10	sum	
HD=1	6	1	2	1	1	0	11	
HD=6	2	0	0	0	0	0	2	
HD=7	15	0	1	0	3	0	19	
HD=8	15	2	0	1	2	0	20	
HD=9	37	4	1	4	4	0	50	
HD=10	22	2	1	4	1	0	30	
HD=11	8	1	1	5	3	0	18	
HD=12	6	1	0	1	0	0	8	
sum	111	11	6	16	14	0	158
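
The chimera filter described above can be sketched like this. It is a hedged illustration only: the function name and the input format (a list of per-pair half distances) are hypothetical and not taken from the tool.

```python
def nonzero_half_hds(pairs):
    """Keep only pairs in which at least one half is identical (HD == 0)
    and return the HD of the other half for each kept pair."""
    return [max(hd_1, hd_2) for hd_1, hd_2 in pairs if 0 in (hd_1, hd_2)]


# Hypothetical (HD of one half, HD of the other half) values.
pairs = [(0, 9), (1, 0), (5, 1), (0, 12), (1, 1)]
print(nonzero_half_hds(pairs))  # [9, 1, 12]
```

In this file the filter keeps 158 of the 171 pairs, matching the count of pairs with a relative delta of 1.0 in the preceding table.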