comparison table_compute.xml @ 3:60ff16842fcd draft
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/table_compute commit 5c7c463baf40edda673a569e91b2c2a5e3b6b4f8"
author | iuc |
---|---|
date | Fri, 18 Oct 2019 06:22:51 -0400 |
parents | 02c3e335a695 |
children | 93a3ce78ce55 |
comparison
2:02c3e335a695 | 3:60ff16842fcd |
---|---|
1295 </conditional> | 1295 </conditional> |
1296 </conditional> | 1296 </conditional> |
1297 </test> | 1297 </test> |
1298 </tests> | 1298 </tests> |
1299 <help><![CDATA[ | 1299 <help><![CDATA[ |
1300 This tool computes table expressions on the element, row, and column basis. It can sub-select, | 1300 Table Compute |
1301 duplicate, as well as perform general and custom expressions on rows, columns or elements. | 1301 ------------- |
1302 | |
1303 This tool is a Galaxy wrapper for the `Pandas Data Analysis Library <https://pandas.pydata.org/>`_ in Python, | |
1304 for manipulating and computing expressions upon tabular data and matrices. It can perform functions on the | |
1305 element, row, and column basis, as well as sub-select, duplicate, replace, and perform general and custom | |
1306 expressions on rows, columns, and elements. | |
1307 | |
1302 | 1308 |
1303 .. class:: infomark | 1309 .. class:: infomark |
1304 | 1310 |
1305 Only a single operation can be performed on the data. Multiple operations | 1311 Only a single operation can be performed on the data. Multiple operations |
1306 can be performed by chaining successive runs of this tool. This is to | 1312 can be performed by chaining successive runs of this tool. This is to |
1307 provide a more transparent workflow for complex operations. | 1313 provide a more transparent workflow for complex operations. |
1308 | 1314 |
1315 | |
1316 Many of the examples given below relate to common research use-cases such as filtering large matrices for | |
1317 specific values, counting unique instances of elements, conditionally manipulating the data, and replacing | |
1318 unwanted values. Full table operations such as normalisation can be easily performed by scaling the data via | |
1319 mean/median/min/max (and many other) metrics, and general expressions can even be computed across multiple | |
1320 tables. | |
1309 | 1321 |
1310 | 1322 |
1311 Examples | 1323 Examples |
1312 ======== | 1324 ======== |
1313 | 1325 |
1323 g2 3 6 9 | 1335 g2 3 6 9 |
1324 g3 4 8 12 | 1336 g3 4 8 12 |
1325 g4 81 6 3 | 1337 g4 81 6 3 |
1326 === === === === | 1338 === === === === |
1327 | 1339 |
1328 and we want to duplicate c1 and remove c2. Also select g1 to g3 and add g2 at the end as well. This would result in the output table: | 1340 and we want to duplicate c1 and remove c2. Also select g1 to g3 and add g2 at the end as well. This |
1341 would result in the output table: | |
1329 | 1342 |
1330 === === === === | 1343 === === === === |
1331 . c1 c1 c3 | 1344 . c1 c1 c3 |
1332 === === === === | 1345 === === === === |
1333 g1 10 10 30 | 1346 g1 10 10 30 |
1339 In Galaxy we would select the following: | 1352 In Galaxy we would select the following: |
1340 | 1353 |
1341 * *Input Single or Multiple Tables* → **Single Table** | 1354 * *Input Single or Multiple Tables* → **Single Table** |
1342 * *Column names on first row?* → **Yes** | 1355 * *Column names on first row?* → **Yes** |
1343 * *Row names on first column?* → **Yes** | 1356 * *Row names on first column?* → **Yes** |
1344 * *Type of table operation* → **Drop, keep or duplicate rows and columns** | 1357 * *Type of table operation* → **Drop, keep or duplicate rows and columns** |
1345 | 1358 |
1346 * *List of columns to select* → **1,1,3** | 1359 * *List of columns to select* → ``1,1,3`` |
1347 * *List of rows to select* → **1:3,2** | 1360 * *List of rows to select* → ``1:3,2`` |
1348 * *Keep duplicate columns* → **Yes** | 1361 * *Keep duplicate columns* → **Yes** |
1349 * *Keep duplicate rows* → **Yes** | 1362 * *Keep duplicate rows* → **Yes** |
1350 | 1363 |
1351 Example 2: Filter for rows with row sums less than 50 | 1364 Example 2: Filter for rows with row sums less than 50 |
1352 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 1365 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1374 In Galaxy we would select the following: | 1387 In Galaxy we would select the following: |
1375 | 1388 |
1376 * *Input Single or Multiple Tables* → **Single Table** | 1389 * *Input Single or Multiple Tables* → **Single Table** |
1377 * *Column names on first row?* → **Yes** | 1390 * *Column names on first row?* → **Yes** |
1378 * *Row names on first column?* → **Yes** | 1391 * *Row names on first column?* → **Yes** |
1379 * *Type of table operation* → **Filter rows or columns by their properties** | 1392 * *Type of table operation* → **Filter rows or columns by their properties** |
1380 | 1393 |
1381 * *Filter* → **Rows** | 1394 * *Filter* → **Rows** |
1382 * *Filter Criterion* → **Result of function applied to columns/rows** | 1395 * *Filter Criterion* → **Result of function applied to columns/rows** |
1383 | 1396 |
1384 * *Keep column/row if its observed* → **Sum** | 1397 * *Keep column/row if its observed* → **Sum** |
1385 * *is* → **< (Less Than)** | 1398 * *is* → **< (Less Than)** |
1386 * *this value* → **50** | 1399 * *this value* → ``50`` |
1387 | 1400 |
1388 | 1401 |
1389 Example 3: Count the number of values per row smaller than a specified value | 1402 Example 3: Count the number of values per row smaller than a specified value |
1390 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 1403 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1391 | 1404 |
1415 In Galaxy we would select the following: | 1428 In Galaxy we would select the following: |
1416 | 1429 |
1417 * *Input Single or Multiple Tables* → **Single Table** | 1430 * *Input Single or Multiple Tables* → **Single Table** |
1418 * *Column names on first row?* → **Yes** | 1431 * *Column names on first row?* → **Yes** |
1419 * *Row names on first column?* → **Yes** | 1432 * *Row names on first column?* → **Yes** |
1420 * *Type of table operation* → **Manipulate selected table elements** | 1433 * *Type of table operation* → **Manipulate selected table elements** |
1421 | 1434 |
1422 * *Operation to perform* → **Custom** | 1435 * *Operation to perform* → **Custom** |
1423 | 1436 |
1424 * *Custom Expression on 'elem'* → **elem < 10** | 1437 * *Custom Expression on 'elem'* → ``elem < 10`` |
1425 | 1438 |
1426 * *Operate on elements* → **All** | 1439 * *Operate on elements* → **All** |
1427 | 1440 |
1428 **Note:** *There are actually simpler ways to achieve our purpose, but here we are demonstrating the use of a custom expression.* | 1441 **Note:** *There are actually simpler ways to achieve our purpose, but here we are demonstrating |
1442 the use of a custom expression.* | |
1429 | 1443 |
1430 After executing, we would then be presented with a table like so: | 1444 After executing, we would then be presented with a table like so: |
1431 | 1445 |
1432 === ===== ===== ===== | 1446 === ===== ===== ===== |
1433 . c1 c2 c3 | 1447 . c1 c2 c3 |
1441 To get to our desired table, we would then process this table with the tool again: | 1455 To get to our desired table, we would then process this table with the tool again: |
1442 | 1456 |
1443 * *Input Single or Multiple Tables* → **Single Table** | 1457 * *Input Single or Multiple Tables* → **Single Table** |
1444 * *Column names on first row?* → **Yes** | 1458 * *Column names on first row?* → **Yes** |
1445 * *Row names on first column?* → **Yes** | 1459 * *Row names on first column?* → **Yes** |
1446 * *Type of table operation* → **Compute Expression across Rows or Columns** | 1460 * *Type of table operation* → **Compute Expression across Rows or Columns** |
1447 | 1461 |
1448 * *Calculate* → **Sum** | 1462 * *Calculate* → **Sum** |
1449 * *For each* → **Row** | 1463 * *For each* → **Row** |
1450 | 1464 |
1451 Executing this will sum all the 'True' values in each row. Note that the values must have no extra whitespace in them for this to work (e.g. 'True ' or ' True' will not be parsed correctly). | 1465 Executing this will sum all the 'True' values in each row. Note that the values must have no |
1466 extra whitespace in them for this to work (e.g. 'True ' or ' True' will not be parsed correctly). | |
1452 | 1467 |
1453 | 1468 |
1454 Example 4: Perform a scaled log-transformation conditionally | 1469 Example 4: Perform a scaled log-transformation conditionally |
1455 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 1470 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1456 | 1471 |
1457 We want to perform a scaled log transformation on all values greater than 5, and set all other values to 1. | 1472 We want to perform a scaled log transformation on all values greater than 5, and set all |
1473 other values to 1. | |
1458 | 1474 |
1459 We have the following table: | 1475 We have the following table: |
1460 | 1476 |
1461 === === === === | 1477 === === === === |
1462 . c1 c2 c3 | 1478 . c1 c2 c3 |
1481 In Galaxy we would select the following: | 1497 In Galaxy we would select the following: |
1482 | 1498 |
1483 * *Input Single or Multiple Tables* → **Single Table** | 1499 * *Input Single or Multiple Tables* → **Single Table** |
1484 * *Column names on first row?* → **Yes** | 1500 * *Column names on first row?* → **Yes** |
1485 * *Row names on first column?* → **Yes** | 1501 * *Row names on first column?* → **Yes** |
1486 * *Type of table operation* → **Manipulate selected table elements** | 1502 * *Type of table operation* → **Manipulate selected table elements** |
1487 | 1503 |
1488 * *Operation to perform* → **Custom** | 1504 * *Operation to perform* → **Custom** |
1489 | 1505 |
1490 * *Custom Expression* → :: | 1506 * *Custom Expression* → ``(math.log(elem) / elem) if (elem > 5) else 1`` |
1491 | |
1492 (math.log(elem) / elem) if (elem > 5) else 1 | |
1493 | 1507 |
1494 * *Operate on elements* → **All** | 1508 * *Operate on elements* → **All** |
1495 | 1509 |
1496 | 1510 |
1497 Example 5: Perform a Full table operation | 1511 Example 5: Perform a Full table operation |
1506 g2 3 10 9 | 1520 g2 3 10 9 |
1507 g3 4 8 10 | 1521 g3 4 8 10 |
1508 g4 81 10 10 | 1522 g4 81 10 10 |
1509 === === === === | 1523 === === === === |
1510 | 1524 |
1511 and we want to subtract from each column the mean of that column divided by the standard deviation of it to yield: | 1525 and we want to subtract from each column the mean of that column divided by the standard |
1526 deviation of it to yield: | |
1512 | 1527 |
1513 | 1528 |
1514 === ========= ========= ========= | 1529 === ========= ========= ========= |
1515 . c1 c2 c3 | 1530 . c1 c2 c3 |
1516 === ========= ========= ========= | 1531 === ========= ========= ========= |
1526 * *Column names on first row?* → **Yes** | 1541 * *Column names on first row?* → **Yes** |
1527 * *Row names on first column?* → **Yes** | 1542 * *Row names on first column?* → **Yes** |
1528 * *Type of table operation* → **Perform a Full Table Operation** | 1543 * *Type of table operation* → **Perform a Full Table Operation** |
1529 | 1544 |
1530 * *Operation* → **Custom** | 1545 * *Operation* → **Custom** |
1531 | 1546 * *Custom Expression on 'table' along axis (0 or 1)* → ``table - table.mean(0)/table.std(0)`` |
1532 * *Custom Expression on 'table' along axis (0 or 1)* → :: | |
1533 | |
1534 table - table.mean(0)/table.std(0) | |
1535 | 1547 |
1536 | 1548 |
1537 Example 6: Perform operations on multiple tables | 1549 Example 6: Perform operations on multiple tables |
1538 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 1550 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1539 | 1551 |
1656 * *Column names on first row?* → **Yes** | 1668 * *Column names on first row?* → **Yes** |
1657 * *Row names on first column?* → **Yes** | 1669 * *Row names on first column?* → **Yes** |
1658 * *Type of table operation* → **Perform a Full Table Operation** | 1670 * *Type of table operation* → **Perform a Full Table Operation** |
1659 | 1671 |
1660 * *Operation* → **Melt** | 1672 * *Operation* → **Melt** |
1661 * *Variable IDs* → "A" | 1673 * *Variable IDs* → ``A`` |
1662 * *Unpivoted IDs* → "B,C" | 1674 * *Unpivoted IDs* → ``B,C`` |
1663 | 1675 |
1664 This converts the "B" and "C" columns into variables. | 1676 This converts the "B" and "C" columns into variables. |
1665 | 1677 |
1666 | 1678 |
1667 Example 8: Pivot | 1679 Example 8: Pivot |
1695 * *Column names on first row?* → **Yes** | 1707 * *Column names on first row?* → **Yes** |
1696 * *Row names on first column?* → **Yes** | 1708 * *Row names on first column?* → **Yes** |
1697 * *Type of table operation* → **Perform a Full Table Operation** | 1709 * *Type of table operation* → **Perform a Full Table Operation** |
1698 | 1710 |
1699 * *Operation* → **Pivot** | 1711 * *Operation* → **Pivot** |
1700 * *Index* → "foo" | 1712 * *Index* → ``foo`` |
1701 * *Column* → "bar" | 1713 * *Column* → ``bar`` |
1702 * *Values* → "baz" | 1714 * *Values* → ``baz`` |
1703 | 1715 |
1704 This splits the matrix using "foo" and "bar" using only the values from "baz". Header values may contain extra information. | 1716 This splits the matrix using "foo" and "bar" using only the values from "baz". Header values |
1717 may contain extra information. | |
1705 | 1718 |
1706 | 1719 |
1707 Example 9: Replacing text in specific rows or columns | 1720 Example 9: Replacing text in specific rows or columns |
1708 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 1721 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1709 | 1722 |
1737 | 1750 |
1738 * *Type of table operation* → **Manipulate selected table elements** | 1751 * *Type of table operation* → **Manipulate selected table elements** |
1739 | 1752 |
1740 * *Operation to perform* → **Replace values** | 1753 * *Operation to perform* → **Replace values** |
1741 | 1754 |
1742 * *Replacement value* → :: | 1755 * *Replacement value* → ``chr{elem:.0f}`` |
1743 | 1756 |
1744 chr{elem:.0f} | 1757 Here, the placeholder ``{elem}`` lets us refer to each element's current value, |
1745 | 1758 while the ``:.0f`` part is a format specifier that makes sure numbers are printed |
1746 Here, the placeholder ``{elem}`` lets us refer to each element's | 1759 without decimals (for a complete description of the available syntax see the |
1747 current value, while the ``:.0f`` part is a format specifier that makes | |
1748 sure numbers are printed without decimals (for a complete description of | |
1749 the available syntax see the | |
1750 `Python Format Specification Mini-Language <https://docs.python.org/3/library/string.html#formatspec>`_). | 1760 `Python Format Specification Mini-Language <https://docs.python.org/3/library/string.html#formatspec>`_). |
1751 | 1761 |
1752 * *Operate on elements* → **Specific Rows and/or Columns** | 1762 * *Operate on elements* → **Specific Rows and/or Columns** |
1753 * *List of columns to select* → "2" | 1763 * *List of columns to select* → ``2`` |
1754 * *List of rows to select* → "2,4" | 1764 * *List of rows to select* → ``2,4`` |
1755 * *Inclusive Selection* → "No" | 1765 * *Inclusive Selection* → ``No`` |
1756 | 1766 |
1757 | 1767 |
1758 If we wanted to instead add "chr" to ALL elements in column 2 and rows 2 and 4, we would repeat the steps above but set the *Inclusive Selection* to "Yes", to give: | 1768 If we wanted to instead add "chr" to ALL elements in column 2 and rows 2 and 4, we |
1769 would repeat the steps above but set the *Inclusive Selection* to "Yes", to give: | |
1759 | 1770 |
1760 === ===== ===== ===== | 1771 === ===== ===== ===== |
1761 . c1 c2 c3 | 1772 . c1 c2 c3 |
1762 === ===== ===== ===== | 1773 === ===== ===== ===== |
1763 g1 10 chr20 30 | 1774 g1 10 chr20 30 |
1764 g2 chr3 chr3 chr9 | 1775 g2 chr3 chr3 chr9 |
1765 g3 4 8 12 | 1776 g3 4 8 12 |
1766 g4 chr81 chr6 chr3 | 1777 g4 chr81 chr6 chr3 |
1767 === ===== ===== ===== | 1778 === ===== ===== ===== |
1768 | 1779 |
1769 | |
1770 | |
1771 ]]></help> | 1780 ]]></help> |
1772 <citations></citations> | 1781 <citations></citations> |
1773 </tool> | 1782 </tool> |
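
The custom expressions quoted in the help text above can be tried outside Galaxy. The following is a minimal sketch in plain pandas, not the tool's own code: it assumes that `elem` stands for a single cell value and `table` for a `pandas.DataFrame`, and that the Melt and Pivot options correspond to `DataFrame.melt` and `DataFrame.pivot`. All input values, the helper `scaled_log`, and the variable names are hypothetical and chosen only for illustration.

```python
import math

import pandas as pd

# Hypothetical 4x3 table in the same shape as the help examples.
table = pd.DataFrame(
    {"c1": [10, 3, 4, 81], "c2": [20, 3, 8, 6], "c3": [30, 9, 12, 3]},
    index=["g1", "g2", "g3", "g4"],
)

# Example 3: evaluate `elem < 10` on every element, then sum the True values per row.
counts = (table < 10).sum(axis=1)


# Example 4: the custom element expression, wrapped in a hypothetical helper.
def scaled_log(elem):
    return (math.log(elem) / elem) if (elem > 5) else 1


# Apply the helper to every element, one column at a time.
transformed = table.apply(lambda column: column.map(scaled_log))

# Example 5: the custom full-table expression; axis 0 computes one value per column.
shifted = table - table.mean(0) / table.std(0)

# Example 9: the replacement value is an ordinary Python format string.
label = "chr{elem:.0f}".format(elem=20.0)

# Example 7 (assumed mapping): Variable IDs -> id_vars, Unpivoted IDs -> value_vars.
wide = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})
melted = wide.melt(id_vars=["A"], value_vars=["B", "C"])

# Example 8 (assumed mapping): Index, Column, Values -> index, columns, values.
long_table = pd.DataFrame(
    {
        "foo": ["one", "one", "one", "two", "two", "two"],
        "bar": ["A", "B", "C", "A", "B", "C"],
        "baz": [1, 2, 3, 4, 5, 6],
    }
)
pivoted = long_table.pivot(index="foo", columns="bar", values="baz")
```

Note that `table.std(0)` uses the sample standard deviation (`ddof=1`) by default in pandas, and that the Example 5 expression subtracts the ratio mean/std from each column rather than computing a z-score.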