comparison awklike-filter.xml @ 0:20e8be1464f5 draft default tip
"planemo upload for repository https://github.com/shenwei356/csvtk commit 3a97e1b79bf0c6cdd37d5c8fb497b85531a563ab"
| author | nml |
|---|---|
| date | Tue, 19 May 2020 17:14:07 -0400 |
| -1:000000000000 | 0:20e8be1464f5 |
|---|---|
| 1 <tool id="csvtk_awklike_filter" name="csvtk-advanced-filter" version="@VERSION@+@GALAXY_VERSION@"> | |
| 2 <description> rows by awk-like arithmetic/string expressions</description> | |
| 3 <macros> | |
| 4 <import>macros.xml</import> | |
| 5 </macros> | |
| 6 <expand macro="requirements" /> | |
| 7 <expand macro="version_cmd" /> | |
| 8 <command detect_errors="exit_code"><![CDATA[ | |
| 9 | |
| 10 ################### | |
| 11 ## Start Command ## | |
| 12 ################### | |
| 13 csvtk filter2 --num-cpus "\${GALAXY_SLOTS:-1}" | |
| 14 | |
| 15 ## Add additional flags as specified ## | |
| 16 ####################################### | |
| 17 $global_param.illegal_rows | |
| 18 $global_param.empty_rows | |
| 19 $global_param.header | |
| 20 $global_param.lazy_quotes | |
| 21 | |
| 22 ## Set Tabular input/output flag if first input is tabular ## | |
| 23 ############################################################# | |
| 24 #if $in_1.is_of_type("tabular"): | |
| 25 -t -T | |
| 26 #end if | |
| 27 | |
| 28 ## Set input files ## | |
| 29 ##################### | |
| 30 $in_1 | |
| 31 | |
| 32 ## Specify fields to filter ## | |
| 33 ############################## | |
| 34 -f '$in_text' | |
| 35 | |
| 36 ## Specific inputs ## | |
| 37 ##################### | |
| 38 $line_number | |
| 39 | |
| 40 ## To output ## | |
| 41 ############### | |
| 42 > filtered | |
| 43 | |
| 44 ]]></command> | |
| 45 <inputs> | |
| 46 <expand macro="singular_input"/> | |
| 47 <param name="in_text" type="text" | |
| 48 optional="false" | |
| 49 argument="-f" | |
| 50 label="Awk-like arithmetic/string expression"> | |
| 51 <help> | |
| 52 <![CDATA[ | |
| 53 Examples: | |
| 54 - '$age>12' | |
| 55 - '$1 > $3' | |
| 56 - '$name=="abc"' | |
| 57 - '$1 % 2 == 0' | |
| 58 More info is available in the help section below. The ' character is invalid and will be replaced, so you must | |
| 59 surround strings with double quotes ("string") instead. | |
| 60 ]]> | |
| 61 </help> | |
| 62 <expand macro="text_sanitizer" /> | |
| 63 </param> | |
| 64 <param name="line_number" type="boolean" | |
| 65 checked="false" | |
| 66 truevalue="-n" | |
| 67 falsevalue="" | |
| 68 argument="-n" | |
| 69 label="Print initial line number as the first column" | |
| 70 /> | |
| 71 <expand macro="global_parameters" /> | |
| 72 </inputs> | |
| 73 <outputs> | |
| 74 <data format_source="in_1" from_work_dir="filtered" name="filtered" label="${in_1.name} filtered by ${in_text}" /> | |
| 75 </outputs> | |
| 76 <tests> | |
| 77 <test> | |
| 78 <param name="in_1" value="frequency.tsv" /> | |
| 79 <param name="in_text" value="$3>1" /> | |
| 80 <output name="filtered" file="filtered.tsv" ftype="tabular" /> | |
| 81 </test> | |
| 82 </tests> | |
| 83 <help><![CDATA[ | |
| 84 | |
| 85 Csvtk - Filter2 Help | |
| 86 -------------------- | |
| 87 | |
| 88 Info | |
| 89 #### | |
| 90 | |
| 91 Csvtk advanced filter (also called filter2) outputs rows that satisfy the input awk-like arithmetic/string expression. Please see the | |
| 92 `documentation <https://github.com/Knetic/govaluate/blob/master/MANUAL.md>`_ for further details and examples on how to write expressions. | |
| 93 | |
| 94 .. class:: warningmark | |
| 95 | |
| 96 Single quotes are not allowed in text inputs! | |
| 97 | |
| 98 .. class:: note | |
| 99 | |
| 100 If the header of the column you want contains a space, use the column number instead. Example: use $1 if column #1 is called "Colony Counts". | |
| 101 | |
| 102 Supported operators and types: | |
| 103 | |
| 104 - Modifiers: + - / * & | ^ ** % >> << | |
| 105 - Comparators: > >= < <= == != =~ !~ | |
| 106 - Logical ops: || && | |
| 107 - Numeric constants, as 64-bit floating point (12345.678) | |
| 108 - String constants (double quotes: "foobar") | |
| 109 - Date constants (double quotes) | |
| 110 - Boolean constants: true false | |
| 111 Parentheses to control order of evaluation ( ) | |
| 112 Arrays (anything separated by , within parentheses: (1, 2, "foo")) | |
| 113 - Prefixes: ! - ~ | |
| 114 - Ternary conditional: ? : | |
| 115 - Null coalescence: ?? | |
| 116 | |
| 117 ---- | |
| 118 | |
| 119 @HELP_INPUT_DATA@ | |
| 120 | |
| 121 | |
| 122 Usage | |
| 123 ##### | |
| 124 | |
| 125 **Ex. Filter2 on one column:** | |
| 126 | |
| 127 Suppose we had the following table: | |
| 128 | |
| 129 +---------------+------------+----------+ | |
| 130 | Culture Label | Cell Count | Dilution | | |
| 131 +===============+============+==========+ | |
| 132 | ECo-1 | 2523 | 1000 | | |
| 133 +---------------+------------+----------+ | |
| 134 | LPn-1 | 100 | 1000000 | | |
| 135 +---------------+------------+----------+ | |
| 136 | LPn-2 | 4 | 1000 | | |
| 137 +---------------+------------+----------+ | |
| 138 | |
| 139 If we wanted to find all samples with the label LPn, we could use the filter expression '$1 =~ "LPn*"' to get the following output: | |
| 140 | |
| 141 +---------------+------------+----------+ | |
| 142 | Culture Label | Cell Count | Dilution | | |
| 143 +===============+============+==========+ | |
| 144 | LPn-1 | 100 | 1000000 | | |
| 145 +---------------+------------+----------+ | |
| 146 | LPn-2 | 4 | 1000 | | |
| 147 +---------------+------------+----------+ | |
| 148 | |
| 149 Note that $1 was used to select column 1 because its header contains a space. | |
| 150 | |
| 151 ---- | |
| 152 | |
| 153 **Ex2. Filter2 with multiple conditions:** | |
| 154 | |
| 155 Using the same input table: | |
| 156 | |
| 157 +---------------+------------+----------+ | |
| 158 | Culture Label | Cell Count | Dilution | | |
| 159 +===============+============+==========+ | |
| 160 | ECo-1 | 2523 | 1000 | | |
| 161 +---------------+------------+----------+ | |
| 162 | LPn-1 | 100 | 1000000 | | |
| 163 +---------------+------------+----------+ | |
| 164 | LPn-2 | 4 | 1000 | | |
| 165 +---------------+------------+----------+ | |
| 166 | |
| 167 If we now filter on the expression '$1 =~ "LPn*" && $Dilution > 1000', only the one row that satisfies both conditions is returned: | |
| 168 | |
| 169 +---------------+------------+----------+ | |
| 170 | Culture Label | Cell Count | Dilution | | |
| 171 +===============+============+==========+ | |
| 172 | LPn-1 | 100 | 1000000 | | |
| 173 +---------------+------------+----------+ | |
| 174 | |
| 175 ---- | |
| 176 | |
| 177 @HELP_COLUMNS@ | |
| 178 | |
| 179 | |
| 180 @HELP_END_STATEMENT@ | |
| 181 | |
| 182 | |
| 183 ]]></help> | |
| 184 <expand macro="citations" /> | |
| 185 </tool> |
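
The two usage examples in the help text can be approximated outside of Galaxy. The following is a minimal Python sketch, not csvtk's implementation: it re-creates the example table and applies plain Python predicates that stand in for the expressions `$1 =~ "LPn*"` and `$1 =~ "LPn*" && $Dilution > 1000`. The names `TABLE`, `filter_rows`, `label_is_lpn`, and `lpn_and_high_dilution` are hypothetical and exist only for this illustration. It assumes `=~` behaves as an unanchored regular-expression match, so `n*` means "zero or more n" rather than a glob wildcard.

```python
import csv
import io
import re

# The example table from the help text, as tab-separated text.
TABLE = (
    "Culture Label\tCell Count\tDilution\n"
    "ECo-1\t2523\t1000\n"
    "LPn-1\t100\t1000000\n"
    "LPn-2\t4\t1000\n"
)


def filter_rows(tsv_text, predicate):
    """Return the data rows (as dicts) for which predicate(row) is true."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if predicate(row)]


def label_is_lpn(row):
    # Stand-in for '$1 =~ "LPn*"': an unanchored regex search on column 1.
    # Column 1 is addressed by position in the tool because its header
    # ("Culture Label") contains a space.
    return re.search(r"LPn*", row["Culture Label"]) is not None


def lpn_and_high_dilution(row):
    # Stand-in for '$1 =~ "LPn*" && $Dilution > 1000'.
    # Numeric values are compared as floating point.
    return label_is_lpn(row) and float(row["Dilution"]) > 1000


example_1 = [r["Culture Label"] for r in filter_rows(TABLE, label_is_lpn)]
example_2 = [r["Culture Label"] for r in filter_rows(TABLE, lpn_and_high_dilution)]

print(example_1)  # ['LPn-1', 'LPn-2']
print(example_2)  # ['LPn-1']
```

Because the pattern is a regular expression, `"LPn*"` would also match any label that merely contains `LP`; an anchored pattern such as `"^LPn"` would be a stricter choice if that matters for your data.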
