changeset 3:60ff16842fcd draft
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/table_compute commit 5c7c463baf40edda673a569e91b2c2a5e3b6b4f8"
author | iuc |
---|---|
date | Fri, 18 Oct 2019 06:22:51 -0400 |
parents | 02c3e335a695 |
children | 93a3ce78ce55 |
files | table_compute.xml |
diffstat | 1 files changed, 51 insertions(+), 42 deletions(-) |
--- a/table_compute.xml Fri Sep 13 14:54:41 2019 -0400
+++ b/table_compute.xml Fri Oct 18 06:22:51 2019 -0400
@@ -1297,8 +1297,14 @@
 </test>
 </tests>
 <help><![CDATA[
-This tool computes table expressions on the element, row, and column basis. It can sub-select,
-duplicate, as well as perform general and custom expressions on rows, columns or elements.
+Table Compute
+-------------
+
+This tool is a Galaxy wrapper for the `Pandas Data Analysis Library <https://pandas.pydata.org/>`_ in Python,
+for manipulating and computing expressions upon tabular data and matrices. It can perform functions on the
+element, row, and column basis, as well as sub-select, duplicate, replace, and perform general and custom
+expressions on rows, columns, and elements.
+
 
 .. class:: infomark
 
@@ -1307,6 +1313,12 @@
 provide a more transparent workflow for complex operations.
 
 
+Many of the examples given below relate to common research use-cases such as filtering large matrices for
+specific values, counting unique instances of elements, conditionally manipulating the data, and replacing
+unwanted values. Full table operations such as normalisation can be easily performed by scaling the data via
+mean/median/min/max (and many other) metrics, and general expressions can even be computed across multiple
+tables.
+
 Examples
 ========
 
@@ -1325,7 +1337,8 @@
 g4  81  6   3
 === === === ===
 
-and we want to duplicate c1 and remove c2. Also select g1 to g3 and add g2 at the end as well. This would result in the output table:
+and we want to duplicate c1 and remove c2. Also select g1 to g3 and add g2 at the end as well. This
+ would result in the output table:
 
 === === === ===
 .   c1  c1  c3
@@ -1341,10 +1354,10 @@
  * *Input Single or Multiple Tables* → **Single Table**
  * *Column names on first row?* → **Yes**
  * *Row names on first column?* → **Yes**
- * *Type of table operation* → **Drop, keep or duplicate rows and columns**
+ * *Type of table operation* → **Drop, keep or duplicate rows and columns**
 
- * *List of columns to select* → **1,1,3**
- * *List of rows to select* → **1:3,2**
+ * *List of columns to select* → ``1,1,3``
+ * *List of rows to select* → ``1:3,2``
  * *Keep duplicate columns* → **Yes**
  * *Keep duplicate rows* → **Yes**
 
@@ -1376,14 +1389,14 @@
  * *Input Single or Multiple Tables* → **Single Table**
  * *Column names on first row?* → **Yes**
  * *Row names on first column?* → **Yes**
- * *Type of table operation* → **Filter rows or columns by their properties**
+ * *Type of table operation* → **Filter rows or columns by their properties**
 
  * *Filter* → **Rows**
  * *Filter Criterion* → **Result of function applied to columns/rows**
 
  * *Keep column/row if its observed* → **Sum**
  * *is* → **< (Less Than)**
- * *this value* → **50**
+ * *this value* → ``50``
 
 
 Example 3: Count the number of values per row smaller than a specified value
@@ -1417,15 +1430,16 @@
  * *Input Single or Multiple Tables* → **Single Table**
  * *Column names on first row?* → **Yes**
  * *Row names on first column?* → **Yes**
- * *Type of table operation* → **Manipulate selected table elements**
+ * *Type of table operation* → **Manipulate selected table elements**
 
  * *Operation to perform* → **Custom**
 
- * *Custom Expression on 'elem'* → **elem < 10**
+ * *Custom Expression on 'elem'* → ``elem < 10``
 
  * *Operate on elements* → **All**
 
-**Note:** *There are actually simpler ways to achieve our purpose, but here we are demonstrating the use of a custom expression.*
+**Note:** *There are actually simpler ways to achieve our purpose, but here we are demonstrating
+the use of a custom expression.*
 
 After executing, we would then be presented with a table like so:
 
@@ -1443,18 +1457,20 @@
  * *Input Single or Multiple Tables* → **Single Table**
  * *Column names on first row?* → **Yes**
  * *Row names on first column?* → **Yes**
- * *Type of table operation* → **Compute Expression across Rows or Columns**
+ * *Type of table operation* → **Compute Expression across Rows or Columns**
 
  * *Calculate* → **Sum**
  * *For each* → **Row**
 
-Executing this will sum all the 'True' values in each row. Note that the values must have no extra whitespace in them for this to work (e.g. 'True ' or ' True' will not be parsed correctly).
+Executing this will sum all the 'True' values in each row. Note that the values must have no
+extra whitespace in them for this to work (e.g. 'True ' or ' True' will not be parsed correctly).
 
 
 Example 4: Perform a scaled log-transformation conditionally
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-We want to perform a scaled log transformation on all values greater than 5, and set all other values to 1.
+We want to perform a scaled log transformation on all values greater than 5, and set all
+other values to 1.
 
 We have the following table:
 
@@ -1483,13 +1499,11 @@
  * *Input Single or Multiple Tables* → **Single Table**
  * *Column names on first row?* → **Yes**
  * *Row names on first column?* → **Yes**
- * *Type of table operation* → **Manipulate selected table elements**
+ * *Type of table operation* → **Manipulate selected table elements**
 
  * *Operation to perform* → **Custom**
 
- * *Custom Expression* → ::
-
- (math.log(elem) / elem) if (elem > 5) else 1
+ * *Custom Expression* → ``(math.log(elem) / elem) if (elem > 5) else 1``
 
  * *Operate on elements* → **All**
 
@@ -1508,7 +1522,8 @@
 g4  81  10  10
 === === === ===
 
-and we want to subtract from each column the mean of that column divided by the standard deviation of it to yield:
+and we want to subtract from each column the mean of that column divided by the standard
+ deviation of it to yield:
 
 
 === ========= ========= =========
@@ -1528,10 +1543,7 @@
  * *Type of table operation* → **Perform a Full Table Operation**
 
  * *Operation* → **Custom**
-
- * *Custom Expression on 'table' along axis (0 or 1)* → ::
-
- table - table.mean(0)/table.std(0)
+ * *Custom Expression on 'table' along axis (0 or 1)* → ``table - table.mean(0)/table.std(0)``
 
 
 Example 6: Perform operations on multiple tables
@@ -1658,8 +1670,8 @@
  * *Type of table operation* → **Perform a Full Table Operation**
 
  * *Operation* → **Melt**
- * *Variable IDs* → "A"
- * *Unpivoted IDs* → "B,C"
+ * *Variable IDs* → ``A``
+ * *Unpivoted IDs* → ``B,C``
 
 This converts the "B" and "C" columns into variables.
 
@@ -1697,11 +1709,12 @@
  * *Type of table operation* → **Perform a Full Table Operation**
 
  * *Operation* → **Pivot**
- * *Index* → "foo"
- * *Column* → "bar"
- * *Values* → "baz"
+ * *Index* → ``foo``
+ * *Column* → ``bar``
+ * *Values* → ``baz``
 
-This splits the matrix using "foo" and "bar" using only the values from "baz". Header values may contain extra information.
+This splits the matrix using "foo" and "bar" using only the values from "baz". Header values
+ may contain extra information.
 
 
 Example 9: Replacing text in specific rows or columns
@@ -1739,23 +1752,21 @@
 
  * *Operation to perform* → **Replace values**
 
- * *Replacement value* → ::
-
- chr{elem:.0f}
+ * *Replacement value* → ``chr{elem:.0f}``
 
- Here, the placeholder ``{elem}`` lets us refer to each element's
- current value, while the ``:.0f`` part is a format specifier that makes
- sure numbers are printed without decimals (for a complete description of
- the available syntax see the
+ Here, the placeholder ``{elem}`` lets us refer to each element's current value,
+ while the ``:.0f`` part is a format specifier that makes sure numbers are printed
+ without decimals (for a complete description of the available syntax see the
  `Python Format Specification Mini-Language <https://docs.python.org/3/library/string.html#formatspec>`_).
 
  * *Operate on elements* → **Specific Rows and/or Columns**
 
- * *List of columns to select* → "2"
- * *List of rows to select* → "2,4"
- * *Inclusive Selection* → "No"
+ * *List of columns to select* → ``2``
+ * *List of rows to select* → ``2,4``
+ * *Inclusive Selection* → ``No``
 
-If we wanted to instead add "chr" to the ALL elements in column 2 and rows 2 and 4, we would repeat the steps above but set the *Inclusive Selection* to "Yes", to give:
+If we wanted to instead add "chr" to the ALL elements in column 2 and rows 2 and 4, we
+ would repeat the steps above but set the *Inclusive Selection* to "Yes", to give:
 
 === ===== ===== =====
 .   c1    c2    c3
@@ -1766,8 +1777,6 @@
 g4  chr81 chr6  chr3
 === ===== ===== =====
 
-
-
 ]]></help>
 <citations></citations>
 </tool>
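
The custom expressions touched by this changeset are ordinary Python, so they can be tried outside Galaxy. The following is a minimal sketch, not part of the changeset: it assumes each cell value is bound to the name `elem`, and that the replacement template is filled in the manner of `str.format` (an assumption about the tool's internals). The helper names `scaled_log` and `add_chr_prefix` are hypothetical; the row values are those of `g4` in the help text's table.

```python
import math


def scaled_log(elem):
    # Example 4 expression, copied verbatim from the help text.
    return (math.log(elem) / elem) if (elem > 5) else 1


def add_chr_prefix(elem):
    # Example 9 template; using str.format here is an assumption
    # about how the tool substitutes {elem}.
    return "chr{elem:.0f}".format(elem=elem)


# The g4 row from the help text's example table.
row = [81.0, 6.0, 3.0]
transformed = [scaled_log(v) for v in row]
labelled = [add_chr_prefix(v) for v in row]
print(transformed)
print(labelled)  # ['chr81', 'chr6', 'chr3']

# Example 5 expression reduced to scalars: "/" binds tighter than "-",
# so the expression as written is x - (mean / std), which differs from
# the z-score (x - mean) / std. The numbers below are hypothetical.
x, mean, std = 10.0, 4.0, 2.0
as_written = x - mean / std  # 8.0
zscore = (x - mean) / std    # 3.0
print(as_written, zscore)
```

If a conventional z-score were intended in Example 5, the subtraction would need its own parentheses, as in `(table - table.mean(0)) / table.std(0)`; the help text's prose describes the expression as written.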