annotate check_id_map.xml @ 4:51eb500d6526 draft

check_id_map minor fix
author Davide Albanese <davide.albanese@gmail.com>
date Sat, 09 Mar 2013 22:27:31 +0100
parents 3e5da170ffdd
children 60b88c61566f
<!-- Author: Davide Albanese - Fondazione Edmund Mach, 2013 -->

<tool id="check_id_map" name="Check ID Map" version="1.6.0-2">
  <description>
    Checks user's metadata mapping file for required data, valid
    format
  </description>
  <requirements>
    <requirement type="binary">check_id_map.py</requirement>
  </requirements>
  <command>
    check_id_map.py

    -m $mapping_fp

    #if str($char_replace):
      -c $char_replace
    #end if

    #if $not_barcoded:
      -b
    #end if

    #if $variable_len_barcodes:
      -B
    #end if

    #if $disable_primer_check:
      -p
    #end if

    #if str($added_demultiplex_field):
      -j $added_demultiplex_field
    #end if
    ;
    rm `basename $mapping_fp .txt`'.html'
    ;
    rm overlib.js
    ;
    mv `basename $mapping_fp .txt`'.log' $out_log
    ;
    mv `basename $mapping_fp .txt`'_corrected.txt' $out_txt
  </command>
  <inputs>
    <param name="mapping_fp" label="Metadata mapping file" optional="False" type="data" format="tabular"/>

    <param name="char_replace" value="_" label="Character used to replace invalid characters found in the mapping file. Must be a valid character (alphanumeric, period, or underscore)" optional="False" type="text"/>

    <param name="not_barcoded" label="Set if barcodes are not present. BarcodeSequence header still required" selected="False" type="boolean"/>

    <param name="variable_len_barcodes" label="Set if variable length barcodes are present to suppress warnings about barcodes of unequal length" selected="False" type="boolean"/>

    <param name="disable_primer_check" label="Set to disable checks for primers. LinkerPrimerSequence header still required" selected="False" type="boolean"/>

    <param name="added_demultiplex_field" label="Use to add a field to use in the mapping file as additional demultiplexing (can be used with or without barcodes). All combinations of barcodes/primers and these fields must be unique. The fields must contain values that can be parsed from the fasta labels such as 'plate=R_2008_12_09'. In this case, 'plate' would be the column header and 'R_2008_12_09' would be the field data (minus quotes) in the mapping file. To use the run prefix from the fasta label, such as '>FLP3FBN01ELBSX', where 'FLP3FBN01' is generated from the run ID, use '-j run_prefix' and set the run prefix to be used as the data under the column header 'run_prefix'" optional="False" type="text" value=""/>
  </inputs>
  <outputs>
    <data format="txt" name="out_log" label="${tool.name} on ${mapping_fp.name}"/>
    <data format="tabular" name="out_txt" label="Corrected ${mapping_fp.name}"/>
  </outputs>
  <help>
Check ID Map checks::

    1. The BarcodeSequence, LinkerPrimerSequence, and ReversePrimer fields
       have valid IUPAC DNA characters, and BarcodeSequence characters
       are non-degenerate (error)

    2. The SampleID, BarcodeSequence, LinkerPrimerSequence, and Description
       headers are present (error)

    3. There are no duplicate header fields (error)

    4. There are no duplicate barcodes (error)

    5. Barcodes are all the same length. Suppressed when the
       variable_len_barcodes flag is passed (warning)

    6. The headers contain only valid characters (alphanumeric and
       underscore) (warning)

    7. The data fields contain only valid characters (alphanumeric,
       underscore, space, and +-%./:,; characters) (warning)

    8. SampleID fields are MIENS compliant (only alphanumeric
       and . characters) (warning)

    9. There are no duplicates when the primer and variable length
       barcodes are appended (error)

    10. There are no duplicates when barcodes and added demultiplex
        fields (-j option) are combined (error)

    11. Data fields are not found beyond the Description column (warning)
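
As an illustration of checks 4 and 10 only, the uniqueness test can be sketched in Python as below. This is a minimal sketch, not the code check_id_map.py runs: the function name duplicated_combinations and the sample rows are hypothetical, and the mapping file is assumed to be already parsed into one dictionary per row.

```python
from collections import Counter


def duplicated_combinations(rows, fields):
    """Return each combination of `fields` values that occurs in more than one row."""
    counts = Counter(tuple(row[name] for name in fields) for row in rows)
    return sorted(combo for combo, n in counts.items() if n > 1)


# Made-up sample rows; 'run_prefix' stands in for a field added with -j.
rows = [
    {"SampleID": "S1", "BarcodeSequence": "ACGT", "run_prefix": "FLP3FBN01"},
    {"SampleID": "S2", "BarcodeSequence": "ACGT", "run_prefix": "FLP3FBN02"},
    {"SampleID": "S3", "BarcodeSequence": "ACGT", "run_prefix": "FLP3FBN01"},
]

# Barcodes alone collide across all three samples.
print(duplicated_combinations(rows, ["BarcodeSequence"]))
# [('ACGT',)]

# Adding the demultiplex field separates S2, but S1 and S3 still collide.
print(duplicated_combinations(rows, ["BarcodeSequence", "run_prefix"]))
# [('ACGT', 'FLP3FBN01')]
```

Any combination returned by such a check corresponds to an error in the log, because the affected samples could not be told apart during demultiplexing.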

Details about the metadata mapping file format can be found here:
http://www.qiime.org/documentation/file_formats.html#metadata-mapping-files

Errors and warnings are saved to a log file. Errors can be caused
by problems with the headers, invalid characters in barcodes or
primers, or by duplications in SampleIDs or barcodes.

Warnings can arise from invalid characters and from variable length
barcodes when the --variable_len_barcodes option is not specified.
Each warning contains a reference to the cell (row,column) that
it arose from.

In addition to the log file, a 'corrected_mapping' file will be
created. Invalid characters in the SampleID fields will be replaced
with '.' characters (to enforce MIENS compliance), and invalid
characters in other data fields will be replaced with the character
specified by the -c parameter, which is an underscore '_' by
default.

If pooled primers are used, separate with a comma. For instance,
a pooled set of three 27f primers (used to increase taxonomic
coverage) could be specified in the LinkerPrimerSequence fields as
such:
AGGGTTCGATTCTGGCTCAG,AGAGTTTGATCCTGGCTTAG,AGAATTTGATCTTGGTTCAG
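
The character checks on primers and barcodes can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than QIIME's implementation: the names IUPAC_DNA, NON_DEGENERATE, invalid_primer_chars, and is_valid_barcode are hypothetical, and the alphabet is the standard IUPAC DNA code.

```python
# Standard IUPAC DNA codes, including the degenerate ones (R, Y, N, ...).
IUPAC_DNA = set("ACGTRYMKSWHBVDN")
# Barcodes must be non-degenerate, so only the four plain bases pass.
NON_DEGENERATE = set("ACGT")


def invalid_primer_chars(field):
    """Return the non-IUPAC characters found in a comma-separated primer field."""
    bad = set()
    for primer in field.split(","):
        bad |= set(primer.strip().upper()) - IUPAC_DNA
    return bad


def is_valid_barcode(barcode):
    """A barcode passes only if it is non-empty and uses A, C, G, and T alone."""
    return bool(barcode) and set(barcode.upper()).issubset(NON_DEGENERATE)


pooled = "AGGGTTCGATTCTGGCTCAG,AGAGTTTGATCCTGGCTTAG,AGAATTTGATCTTGGTTCAG"
print(invalid_primer_chars(pooled))   # prints set(): all three primers are valid
print(is_valid_barcode("ACGTNACG"))   # prints False: N is a degenerate code
```

Splitting on the comma first means each pooled primer is checked on its own, so the separator itself is never reported as an invalid character.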
  </help>
</tool>