comparison hcluster_sg_parser.py @ 3:f9e418125021 draft

planemo upload for repository https://github.com/TGAC/earlham-galaxytools/tree/master/tools/hcluster_sg_parser commit 66af14bc1642c1ca6ceb21f6018c8d665da890e8
author earlhaminst
date Fri, 28 Apr 2017 12:51:35 -0400
parents 17aa68582a05
children 02d73e6ca869
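Note on this revision: the comparison below replaces the fixed 3-column tuple unpacking with index-based access, taking the cluster ID from the first column, the ID count from the second-to-last column, and the ID list from the last column. The sketch below illustrates that logic in isolation. The function name `parse_cluster_line` and the sample lines are hypothetical, made up for illustration; they are not part of the tool or taken from real hcluster_sg output.

```python
def parse_cluster_line(line):
    """Split one tab-separated cluster line into (cluster_id, n_ids, id_list).

    Hypothetical helper mirroring the indexing used in revision 3:
    first column = cluster ID, second-to-last = number of IDs,
    last = comma-separated ID list.
    """
    line_cols = line.rstrip().split('\t')
    cluster_id = line_cols[0]
    n_ids = int(line_cols[-2])
    # As in the diff, commas are turned into newlines (one ID per line).
    id_list = line_cols[-1].replace(',', '\n')
    return cluster_id, n_ids, id_list


# Illustrative 3-column line (made-up data).
three_col = parse_cluster_line("0\t3\tA,B,C,\n")

# Illustrative line with extra middle columns (made-up data).
# Tuple unpacking into three names would raise ValueError here.
seven_col = parse_cluster_line("1\t0\t0\t1.000\t0\t2\tX,Y,\n")
```

Because the count and the ID list are addressed from the end of the row, the same code handles any number of intermediate columns, provided the row has at least two.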
2:0a33fd8ead70 3:f9e418125021
1 """ 1 """
2 A simple parser to convert the hcluster_sg 3-column output into lists of IDs, one list for each cluster. 2 A simple parser to convert the hcluster_sg output into lists of IDs, one list for each cluster.
3 3
4 When a minimum and/or maximum number of cluster elements are specified, the IDs contained in the filtered-out clusters are collected in the "discarded IDS" output dataset. 4 When a minimum and/or maximum number of cluster elements are specified, the IDs contained in the filtered-out clusters are collected in the "discarded IDS" output dataset.
5 5
6 Usage: 6 Usage:
7 7
19 19
20 with open(args[1], 'w') as discarded_out: 20 with open(args[1], 'w') as discarded_out:
21 with open(args[0]) as fh: 21 with open(args[0]) as fh:
22 for line in fh: 22 for line in fh:
23 line = line.rstrip() 23 line = line.rstrip()
24 (cluster_id, n_ids, id_list) = line.split('\t') 24 line_cols = line.split('\t')
25 n_ids = int(n_ids) 25 cluster_id = line_cols[0]
26 id_list = id_list.replace(',', '\n') 26 n_ids = int(line_cols[-2])
27 id_list = line_cols[-1].replace(',', '\n')
27 if n_ids >= options.min and n_ids <= options.max: 28 if n_ids >= options.min and n_ids <= options.max:
28 outfile = cluster_id + '_output.txt' 29 outfile = cluster_id + '_output.txt'
29 with open(outfile, 'w') as f: 30 with open(outfile, 'w') as f:
30 f.write(id_list) 31 f.write(id_list)
31 else: 32 else: