Mercurial > repos > iuc > mothur_cooccurrence
view cooccurrence.xml @ 1:f83d1ed69978 draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/mothur commit 721531d2e9fd1e208a3fba8cfbe5dcd572599ca2
author | iuc |
---|---|
date | Tue, 05 Sep 2017 17:12:25 -0400 |
parents | 6cb0335df5db |
children | c2cf32e92e83 |
line wrap: on
line source
<tool profile="16.07" id="mothur_cooccurrence" name="Cooccurrence" version="@WRAPPER_VERSION@.0"> <description>tests whether presence-absence patterns differ from chance</description> <macros> <import>macros.xml</import> </macros> <expand macro="requirements"/> <expand macro="stdio"/> <expand macro="version_command"/> <command><![CDATA[ @SHELL_OPTIONS@ ## create symlinks to input datasets ln -s "$otu" otu.dat && echo 'cooccurrence( shared=otu.dat, metric=$metric, matrixmodel=$matrixmodel, iters=$iters #if $label: ,label=${ str($label).replace(",","-") } #end if #if $groups: ,groups=${ str($groups).replace(",","-") } #end if )' | sed 's/ //g' ## mothur trips over whitespace | mothur | tee mothur.out.log ]]></command> <inputs> <param name="otu" type="data" format="mothur.shared" label="shared - OTU Shared file"/> <param name="iters" type="integer" value="1000" min="1" optional="true" label="iters - Number of iterations to try (default 1000)"/> <param name="metric" type="select" label="metric - test metric for scoring"> <option value="cscore" selected="true">cscore - species segregation</option> <option value="checker">checker - species segregation</option> <option value="combo">combo - unique species pairs</option> <option value="vratio">vratio - variance</option> </param> <param name="matrixmodel" type="select" label="matrixmodel - the scoring matrix" help="See the notes below on choosing a metric/matrixmodel combination"> <option value="sim1">sim1</option> <option value="sim2" selected="true">sim2</option> <option value="sim3">sim3</option> <option value="sim4">sim4</option> <option value="sim5">sim5</option> <option value="sim6">sim6</option> <option value="sim7">sim7</option> <option value="sim8">sim8</option> <option value="sim9">sim9</option> </param> <param name="groups" type="select" optional="true" label="groups - Groups to include" multiple="true" help="By default all are included if no selection is made."> <options> <filter type="data_meta" ref="otu" key="groups"/> </options> </param> <param name="label" type="select" optional="true" label="label - Select OTU Labels to include" multiple="true" help="By default all are included if no selection is made."> <expand macro="labeloptions"/> </param> </inputs> <outputs> <expand macro="logfile-output"/> <data name="out_summary" format="tabular" from_work_dir="otu*.summary" label="${tool.name} on ${on_string}: cooccurence.summary"/> </outputs> <tests> <test><!-- test with defaults --> <param name="otu" value="amazon.an.shared" ftype="mothur.shared"/> <output name="out_summary" ftype="tabular"> <assert_contents> <has_text text="zScore"/> <has_text text="cscore"/> <has_text text="unique"/> <has_text text="0.05"/> </assert_contents> </output> <expand macro="logfile-test"/> </test> <test><!-- test with label and group select --> <param name="otu" value="amazon.an.shared" ftype="mothur.shared"/> <param name="metric" value="vratio"/> <param name="label" value="0.05,0.22,0.41"/> <param name="groups" value="forest,pasture"/> <output name="out_summary" ftype="tabular"> <assert_contents> <has_text text="zScore"/> <has_text text="vratio"/> <not_has_text text="unique"/> <has_text text="0.05"/> </assert_contents> </output> <expand macro="logfile-test"/> </test> </tests> <help> <![CDATA[ @MOTHUR_OVERVIEW@ **Command Documentation** The cooccurrence_ command variance calculates four metrics and tests their significance to assess whether presence-absence patterns are different than what one would expect by chance. The input is a shared_ file. The output can be filtered by groups and labels. **metric** The metric parameter options are **cscore**, **checker**, **combo** and **vratio**. Default=cscore. The cscore or checkerboard score [1] is a metric that measures species segregation. It is the mean number of checkerboard units per species pair. The checker metric [2] counts the number of species pairs forming a perfect checkerboard. The combo metric [3] is the number of unique species pairs. The vratio or variance ratio [4] is a measure of the species association calculated by the ratio of the variance in total species number to the sum of the variances of the species. :: [1] Stone, L., and A. Roberts. 1990. The checkerboard score and species distributions. Ocelogia. 85:74-79. [2] Diamond, J. M. 1975. Assembly of species communities. Pages 342-444 in M. L. Cody and J. M. Diamond, editors. Ecology and evolution of communities. Harvard University Press, Cambridge, Massachusetts, USA. [3] Pielou, D. P., and E. C. Pielou. 1968 Association among species of infrequent occurrence: the insect and spider fauna of Polypours betulinus (Bulliard) Fries. Journal of Theoretical Biology 21:202-216. [4] Schluter, D. 1984. A variance test for detecting species associations, with some example applications. Ecology 65:998-1005. [5] Gotelli, Nicholas J. 2000. NULL MODEL ANALYSIS OF SPECIES CO-OCCURRENCE PATTERNS. Ecology 81:2606-2621. **matrixmodel** The matrixmodel parameter allows you to select the model you would like to use. Options are sim1, sim2, sim3, sim4, sim5, sim6, sim7, sim8 and sim9. Default=sim2. Each sim implements a different algorithm for generating null matrices with constraints on the rows (species) and columns (sites).:: ===================== ====================== ======================= ====================== Rows Columns equiprobable Columns proportional Column totals fixed ===================== ====================== ======================= ====================== Rows equiprobable sim1 sim6 sim3 Rows proportional sim7 sim8 sim5 Row totals fixed sim2 sim4 sim9 ===================== ====================== ======================= ====================== Equiprobable rows or columns means that each row, column or both is not dependent on the original co-occurrence matrix. Each species or site has an equal change of occurring in the null matrix. Proportional rows or columns means that the proportion of occurrences in rows, columns or both in the original co-occurrence matrix are preserved but the totals may differ. Each species or site's chances of occurring are proportional to their occurrence in the original co-occurrence matrix. Fixed row or column totals preserves the total number of occurrences in rows, columns or both in the original co-occurrence matrix. Sim9 is a special case that is not probabilistic. Since both the row and column totals are preserved the only way to randomize the matrix is with a checkerboard swap. When a checkerboard appears in the matrix the 1s and 0s are swapped to their mirror image to preserve the species and site totals. Checkerboard:: 10 01 Swap:: 01 10 suggested metric/matrixmodel combinations:: ======== ======== ======== ======== cscore checker combo vratio ======== ======== ======== ======== sim9 sim9 sim9 sim2 sim2 sim2 sim2 sim4 - - sim4 sim8 - - sim8 - ======== ======== ======== ======== Careful readers will note that none of the suggested matrixmodels have equiprobable rows (species). This is because tests of co-occurrence are quite sensitive to the frequency of species occurrence. As such, rowtotals should be maintained or at least kept proportional in the null models. Sim9 is well suited to co-occurrence matrices that have an "island list" structure. Island lists are often found in classical ecology datasets that contain species with well defined habitat patches and are rarely degenerate (matrices that contain empty rows or columns). Sim2 is well suited for co-occurrence matrices that have a "sample list" structure. Sample list structured data are found where species have relatively homogeneous habitats and degenerate matrices are not uncommon. In these matrices species will often occur in only one site. The default values of cscore and sim2 have been selected because the c-score is not very sensitive to noise in the data and when used with sim9 or sim2 is not particularly prone to false positives. Sim2 has been chosen because of the prevalence of degenerate matrices. These are just guidelines, however, be sure to select a metric and matrix model that is best suited to the type of data you are analyzing. It should be noted that sim9 cannot be used with vratio because in sim9 both the column and row totals are maintained, hence there will be no variance. Please see [5] for more details on metric/null model selection. .. _shared: https://www.mothur.org/wiki/Shared_file .. _cooccurrence: https://www.mothur.org/wiki/Cooccurrence v1.26.0: Updated to Mothur 1.33 ]]> </help> <expand macro="citations"/> </tool>