hd: hd.py comparison

comparison hd.py @ 21:9919024d7778 draft

planemo upload for repository https://github.com/monikaheinzl/duplexanalysis_galaxy/tree/master/tools/hd commit b8a2f7b7615b2bcd3b602027af31f4e677da94f6

author	mheinzl
date	Fri, 14 Dec 2018 05:03:24 -0500
parents	b084b6a8e3ac
children	7e570ba56b83


-:b084b6a8e3ac
+:9919024d7778
 plt.xlim((0, maximumXFS + 1))
 if len(numpy.concatenate(familySizeList1)) != 0:
 plt.ylim((0, max(numpy.bincount(numpy.concatenate(familySizeList1))) * 1.1))
 plt.ylim((0, maximumY * 1.2))
-legend = "\nmax. family size: \nabsolute frequency: \nrelative frequency: "
+legend = "\nfamily size: \nabsolute frequency: \nrelative frequency: "
 plt.text(0.15, -0.08, legend, size=12, transform=plt.gcf().transFigure)
 count = numpy.bincount(originalCounts)  # original counts
-legend1 = "{}\n{}\n{:.5f}".format(max(originalCounts), count[len(count) - 1], float(count[len(count) - 1]) / sum(count))
+if max(originalCounts) >= 20:
+    max_count = ">= 20"
+else:
+    max_count = max(originalCounts)
+legend1 = "{}\n{}\n{:.5f}".format(max_count, count[len(count) - 1], float(count[len(count) - 1]) / sum(count))
 plt.text(0.5, -0.08, legend1, size=12, transform=plt.gcf().transFigure)
 legend3 = "singletons\n{:,}\n{:.5f}".format(int(counts[0][len(counts[0]) - 1][1]), float(counts[0][len(counts[0]) - 1][1]) / sum(counts[0][len(counts[0]) - 1]))
 plt.text(0.7, -0.08, legend3, transform=plt.gcf().transFigure, size=12)
 plt.grid(b=True, which='major', color='#424242', linestyle=':')
 # FSD
 createFileFSD2(summary5, sumCol5, overallSum5, output_file,
 "Family size distribution separated by Hamming distance", sep,
 diff=False)
-count = numpy.bincount(quant)
 # output_file.write("{}{}\n".format(sep, name1))
 output_file.write("\n")
-output_file.write("max. family size:{}{}\n".format(sep, max(quant)))
+max_fs = numpy.bincount(integers[result])
-output_file.write("absolute frequency:{}{}\n".format(sep, count[len(count) - 1]))
+output_file.write("max. family size in sample:{}{}\n".format(sep, max(integers[result])))
+output_file.write("absolute frequency:{}{}\n".format(sep, max_fs[len(max_fs) - 1]))
 output_file.write(
-"relative frequency:{}{}\n\n".format(sep, float(count[len(count) - 1]) / sum(count)))
+"relative frequency:{}{}\n\n".format(sep, float(max_fs[len(max_fs) - 1]) / sum(max_fs)))
 # HD within tags
 output_file.write(
 "The hamming distances were calculated by comparing each half of all tags against the tag(s) with the minimum Hamming distance per half.\n"
 "It is possible that one tag can have the minimum HD from multiple tags, so the sample size in this calculation differs from the sample size entered by the user.\n")
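
Note on the change: both hunks report the largest family size together with its absolute and relative frequency, taken from the last bin of numpy.bincount. The plot legend now prints ">= 20" once the largest value reaches 20, while the output file reports the true maximum in the sample. The following is a minimal standalone sketch of that logic, not the code in hd.py: the helper names max_family_size_label and max_family_size_stats and the cap parameter are hypothetical, and the input is assumed to be a 1-D sequence of non-negative integer family sizes.

```python
import numpy


def max_family_size_label(family_sizes, cap=20):
    """Return the legend text for the largest family size (hypothetical helper).

    Sizes at or above `cap` are shown as ">= cap", mirroring the new
    max_count branch in the diff.
    """
    largest = int(max(family_sizes))
    return ">= {}".format(cap) if largest >= cap else str(largest)


def max_family_size_stats(family_sizes):
    """Return (largest size, absolute frequency, relative frequency)."""
    # counts[k] is the number of families of size k, so the last bin
    # always belongs to the largest size present in the data.
    counts = numpy.bincount(family_sizes)
    absolute = int(counts[-1])
    relative = float(absolute) / float(counts.sum())
    return len(counts) - 1, absolute, relative


if __name__ == "__main__":
    sizes = numpy.array([1, 1, 2, 3, 3, 3])
    print(max_family_size_label(sizes))
    print(max_family_size_stats(sizes))
```

Indexing with counts[-1] is equivalent to the count[len(count) - 1] form used in the diff; the sketch only uses the shorter spelling.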
