comparison hcluster_sg_parser.py @ 3:f9e418125021 draft
planemo upload for repository https://github.com/TGAC/earlham-galaxytools/tree/master/tools/hcluster_sg_parser commit 66af14bc1642c1ca6ceb21f6018c8d665da890e8
author | earlhaminst |
---|---|
date | Fri, 28 Apr 2017 12:51:35 -0400 |
parents | 17aa68582a05 |
children | 02d73e6ca869 |
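
The comparison below replaces a fixed three-way tuple unpacking with index-based column access: the cluster ID is read from the first column, the ID count from the second-to-last, and the ID list from the last. A minimal sketch of that parsing step follows. The function name `parse_cluster_line` and the sample lines are hypothetical, made up for illustration. They are not taken from the tool or from real hcluster_sg output.

```python
def parse_cluster_line(line):
    """Split one tab-separated cluster line into (cluster_id, n_ids, id_list).

    Mirrors the indexing used in the new revision: first column,
    second-to-last column, last column. Hypothetical helper, not part
    of the original script.
    """
    line_cols = line.rstrip().split('\t')
    cluster_id = line_cols[0]
    n_ids = int(line_cols[-2])
    # One ID per line, as in the original script.
    id_list = line_cols[-1].replace(',', '\n')
    return cluster_id, n_ids, id_list


# Made-up sample lines: one with 3 columns, one with extra middle columns.
three_cols = "0\t2\tgeneA,geneB,\n"
seven_cols = "0\t0\t0\t1.000\t0\t2\tgeneA,geneB,\n"

# Both layouts yield the same result with index-based access, whereas
# unpacking into exactly three names would raise ValueError on the second.
result_three = parse_cluster_line(three_cols)
result_seven = parse_cluster_line(seven_cols)
```

Indexing from both ends keeps the parser independent of how many columns sit between the cluster ID and the count, which appears to be the motivation for dropping "3-column" from the docstring.
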
2:0a33fd8ead70 | 3:f9e418125021 |
---|---|
1 """ | 1 """ |
2 A simple parser to convert the hcluster_sg 3-column output into lists of IDs, one list for each cluster. | 2 A simple parser to convert the hcluster_sg output into lists of IDs, one list for each cluster. |
3 | 3 |
4 When a minimum and/or maximum number of cluster elements are specified, the IDs contained in the filtered-out clusters are collected in the "discarded IDS" output dataset. | 4 When a minimum and/or maximum number of cluster elements are specified, the IDs contained in the filtered-out clusters are collected in the "discarded IDS" output dataset. |
5 | 5 |
6 Usage: | 6 Usage: |
7 | 7 |
19 | 19 |
20 with open(args[1], 'w') as discarded_out: | 20 with open(args[1], 'w') as discarded_out: |
21 with open(args[0]) as fh: | 21 with open(args[0]) as fh: |
22 for line in fh: | 22 for line in fh: |
23 line = line.rstrip() | 23 line = line.rstrip() |
24 (cluster_id, n_ids, id_list) = line.split('\t') | 24 line_cols = line.split('\t') |
25 n_ids = int(n_ids) | 25 cluster_id = line_cols[0] |
26 id_list = id_list.replace(',', '\n') | 26 n_ids = int(line_cols[-2]) |
| | 27 id_list = line_cols[-1].replace(',', '\n') |
27 if n_ids >= options.min and n_ids <= options.max: | 28 if n_ids >= options.min and n_ids <= options.max: |
28 outfile = cluster_id + '_output.txt' | 29 outfile = cluster_id + '_output.txt' |
29 with open(outfile, 'w') as f: | 30 with open(outfile, 'w') as f: |
30 f.write(id_list) | 31 f.write(id_list) |
31 else: | 32 else: |