changeset 1:ba71e26d5bdc draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/codeml commit 43935edeb0abb95b08b191b379e160ec25cb33c0
author | iuc |
---|---|
date | Wed, 02 May 2018 05:44:08 -0400 |
parents | 961a712f9743 |
children | 66228e9c29d9 |
files | codeml.xml test-data/1_codeml.ctl test-data/2_codeml.ctl test-data/3_codeml.ctl |
diffstat | 4 files changed, 49 insertions(+), 47 deletions(-) |
--- a/codeml.xml Tue Aug 29 19:12:01 2017 -0400
+++ b/codeml.xml Wed May 02 05:44:08 2018 -0400
@@ -14,11 +14,9 @@
 <version_command><![CDATA[ codeml /dev/null 2>&1 | tail -1 ]]></version_command>
 <command><![CDATA[
-
 codeml '$codeml_ctl' && mv '$codeml_ctl' '$ctl'
-
 ]]></command>
 <configfiles>
@@ -65,12 +63,10 @@
 cleandata = $adv.cleandata * remove sites with ambiguity data (1:yes, 0:no)?
 fix_blength = $adv.fix_blength * 0: ignore, -1: random, 1: initial, 2: fixed
 method = $adv.method * 0: simultaneous; 1: one branch at a time
-
 ]]></configfile>
 </configfiles>
 <inputs>
-
 <param name="concat_nuc" type="data" format="fasta" label="Sequences file" help="The fasta file with the sequences to be analyzed" />
 <param name="tree" type="data" format="nhx" label="tree file" help="Tree file in Newick format" />
@@ -314,18 +310,17 @@
 </tests>
 <help><![CDATA[
-
 .. class:: infomark
 **Galaxy integration**
 Victor Mataigne and ABIMS TEAM.
- Contact support.abims@sb-roscoff.fr for any questions or concerns about the Galaxy implementation of this tool.
+Contact support.abims@sb-roscoff.fr for any questions or concerns about the Galaxy implementation of this tool.
 ----------
 **CompCodeML (from paml package)**
-A few help is detailed below ; full and detailed codeml readme can be found on the paml website_
+A few help is detailed below ; full and detailed codeml readme can be found on the paml website_.
 .. _website: http://abacus.gene.ucl.ac.uk/software/paml.html
@@ -334,6 +329,7 @@
 Due to their high number, some parameters incompatibility can remain.
 This Galaxy implementation :
+
 - handles incompatibilities between branch and sites models (the tool CANNOT be run with incompatible models).
 - warns the user in a help section when an advanced parameter has known incompatibilities (the tool CAN be run, but the output files will be empty).
@@ -342,11 +338,12 @@
 .. class:: infomark
 Known incompatibilities:
- - 'seqtype' = 3 : only compatible with 'FSsites' = 0.
- - 'clock' = 2 : needs branch labels in the tree.
- - fix_alpha !=1 combined with alpha !=0 are not compatible with NSsites !=0
- - 'aaDist' = 0 is the only one compatible with 'NSsites' different than 0 and 'seqtype' = 1.
- - 'method' = 1 : does not work with 'clock' different than 0.
+
+- 'seqtype' = 3 : only compatible with 'FSsites' = 0.
+- 'clock' = 2 : needs branch labels in the tree.
+- fix_alpha !=1 combined with alpha !=0 are not compatible with NSsites !=0
+- 'aaDist' = 0 is the only one compatible with 'NSsites' different than 0 and 'seqtype' = 1.
+- 'method' = 1 : does not work with 'clock' different than 0.
 ----------
@@ -368,10 +365,12 @@
 **Parameters**
 Several models are available.
+
 - branch models ("model" parameter).
 - sites models ("NSsites" parameter, model is left at 0).
 - branch-sites models (when model = 2 NSsites=2,3).
 - Clade models (when model=3 NSsites=2,3).
+
 Basically, this tool write a configfile called codeml.ctl with the specified parameters and then launches codeml.
 .. class:: infomark
@@ -384,10 +383,12 @@
 How to run the branch-site models (A & B in Yang & Nielsen 2002 MBE) ?
 The options are :
- Model A: (model=2, NSsites=2).
- Model B: (model=2, NSsites=3).
+
+- Model A: (model=2, NSsites=2).
+- Model B: (model=2, NSsites=3).
 How to run the M0 (one-ratio) model :
+
 model = 0, NSsites= = 0.
 ----------
@@ -396,27 +397,35 @@
 .. class:: infomark
-See paml complete manual and FAQ on paml website_
+See paml complete manual and FAQ on the paml website_.
 .. _website: http://abacus.gene.ucl.ac.uk/software/paml.html
 **Details of some parameters :**
- - 'kappa' denotes the transition/transversion rate ratio.
- - 'fix_kappa' specifies whether kappa in K80, F84, or HKY85 is given at a fixed value or is to be estimated by iteration from the data.
- -> If fix_kappa = 1 (fixed), the value of kappa is the given value
- -> If fix_kappa = 0 (estimated) the value of kappa is used as the initial estimate for iteration.
+- 'kappa' denotes the transition/transversion rate ratio.
+- 'fix_kappa' specifies whether kappa in K80, F84, or HKY85 is given at a fixed value or is to be estimated by iteration from the data.
+
+ -> If fix_kappa = 1 (fixed), the value of kappa is the given value
+
+ -> If fix_kappa = 0 (estimated) the value of kappa is used as the initial estimate for iteration.
+
+- 'alpha' refers to the shape parameter alpha of the gamma distribution for variable substitution rates across sites (Yang 1994a).
+- 'fix_alpha' works in a similar way that fix_kappa.
+
+ -> The model of a single rate for all sites is specified as fix_alpha = 1 and alpha = 0 (0 means infinity)
- - 'alpha' refers to the shape parameter alpha of the gamma distribution for variable substitution rates across sites (Yang 1994a).
- - 'fix_alpha' works in a similar way that fix_kappa.
- -> The model of a single rate for all sites is specified as fix_alpha = 1 and alpha = 0 (0 means infinity)
- -> The (discrete-) gamma model is specified by a positive value for alpha, and 'ncatG' is then the number of categories for the discrete-gamma model. Values such as 5, 4, 8, or 10 are reasonable.
+ -> The (discrete-) gamma model is specified by a positive value for alpha, and 'ncatG' is then the number of categories for the discrete-gamma model. Values such as 5, 4, 8, or 10 are reasonable.
+
+- fix_rho and rho work in a similar way and concern independence or correlation of rates at adjacent sites, where rho is the correlation parameter of the auto-discrete-gamma model (Yang 1995).
+
+ -> The model of independent rates for sites is specified as fix_rho = 1 and rho = 0; choosing alpha = 0 further means a constant rate for all sites.
- - fix_rho and rho work in a similar way and concern independence or correlation of rates at adjacent sites, where rho is the correlation parameter of the auto-discrete-gamma model (Yang 1995).
- -> The model of independent rates for sites is specified as fix_rho = 1 and rho = 0; choosing alpha = 0 further means a constant rate for all sites.
- -> The auto-discrete-gamma model is specified by positive values for both alpha and rho.
- -> The model of a constant rate for sites is a special case of the (discrete) gamma model with alpha = 0 (means infinity).
- -> The model of independent rates for sites is a special case of the auto-discrete-gamma model with rho = 0.
+ -> The auto-discrete-gamma model is specified by positive values for both alpha and rho.
+
+ -> The model of a constant rate for sites is a special case of the (discrete) gamma model with alpha = 0 (means infinity).
+
+ -> The model of independent rates for sites is a special case of the auto-discrete-gamma model with rho = 0.
 ----------
@@ -434,18 +443,16 @@
 Some models implemented in codeml allow several groups of branches on the tree, which are assigned different parameters of interest.
- - For example, in the local clock models (clock = 2 or 3), you can have, say, 3 branch rate groups, with low, medium, and high rates respectively.
-
- - Also the branch-specific codon models (model = 2 or 3 or codonml) allow different branch groups to have different ωs, leading to so called “two-ratios” and “three-ratios” models.
-
- - All those models require branches or nodes in the tree to be labeled. Branch labels are specified in the same way as branch lengths except that the symbol “#” is used rather than “:”. The branch labels are consecutive integers starting from 0, which is the default and does not have to be specified.
+- For example, in the local clock models (clock = 2 or 3), you can have, say, 3 branch rate groups, with low, medium, and high rates respectively.
+- Also the branch-specific codon models (model = 2 or 3 or codonml) allow different branch groups to have different ωs, leading to so called “two-ratios” and “three-ratios” models.
+- All those models require branches or nodes in the tree to be labeled. Branch labels are specified in the same way as branch lengths except that the symbol “#” is used rather than “:”. The branch labels are consecutive integers starting from 0, which is the default and does not have to be specified.
 In ((Hsa_Human, Hla_gibbon) #1, ((Cgu/Can_colobus, Pne_langur), Mmu_rhesus), (Ssc_squirrelM, Cja_marmoset)); :
 The internal branch ancestral to human and gibbon has the ratio ω1, while all other branches (with the default label #0) have the background ratio ω0.
-The following trees are equivalent :
- ((rabbit, rat) $1, human), goat_cow, marsupial);
- (((rabbit #1, rat #1) #1, human), goat_cow, marsupial);
+The following trees are equivalent:
+- ((rabbit, rat) $1, human), goat_cow, marsupial);
+- (((rabbit #1, rat #1) #1, human), goat_cow, marsupial);
 $ is the symbol for clade labels.
@@ -455,8 +462,6 @@
 $1 is first applied to the whole clade of placental mammals (except for the human lineage), and then $2 is applied to the rabbit-rat clade.
 Equivalent tree with only '#' :
 ((((rabbit #2, rat #2) #2, human #3) #1, goat_cow #1) #1, marsupial);
-
-
 ]]></help>
 <citations>
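Reviewer note: the "Known incompatibilities" list in the help text above could in principle be checked before codeml is launched. The following is a minimal Python sketch under stated assumptions, not part of the tool: `find_incompatibilities` is a hypothetical helper, parameters are assumed to arrive as a plain dict of integers, and "has branch labels" is approximated by looking for `#` or `$` in the tree string. Only the three rules whose wording is unambiguous are encoded.

```python
def find_incompatibilities(params, tree_text):
    """Return a list of problems found; an empty list means none detected.

    params    -- hypothetical dict of codeml settings, e.g. {"clock": 2}
    tree_text -- the Newick tree as a string
    """
    problems = []
    clock = params.get("clock", 0)

    # Rule from the help text: 'method' = 1 does not work with 'clock' != 0.
    if params.get("method", 0) == 1 and clock != 0:
        problems.append("'method' = 1 does not work with 'clock' different than 0")

    # Rule from the help text: 'clock' = 2 needs branch labels in the tree.
    # Assumption: any '#' or '$' in the tree counts as a label.
    if clock == 2 and "#" not in tree_text and "$" not in tree_text:
        problems.append("'clock' = 2 needs branch labels in the tree")

    # Rule from the help text: with 'seqtype' = 1 and 'NSsites' != 0,
    # only 'aaDist' = 0 is compatible.
    if (params.get("seqtype") == 1
            and params.get("NSsites", 0) != 0
            and params.get("aaDist", 0) != 0):
        problems.append("'aaDist' must be 0 when 'NSsites' != 0 and 'seqtype' = 1")

    return problems


if __name__ == "__main__":
    for problem in find_incompatibilities({"clock": 2, "method": 1}, "((a, b), c);"):
        print(problem)
```

Returning a list rather than raising on the first problem matches the help text's own distinction: these are warnings, and the tool can still be run.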
--- a/test-data/1_codeml.ctl Tue Aug 29 19:12:01 2017 -0400
+++ b/test-data/1_codeml.ctl Wed May 02 05:44:08 2018 -0400
@@ -1,7 +1,7 @@
- seqfile = /tmp/saskia/tmphfQhiQ/files/000/dataset_1.dat * sequence data file name
+ seqfile = /tmp/tmpr0j4bmyj/files/000/dataset_1.dat * sequence data file name
 outfile = run_codeml * main result file name
- treefile = /tmp/saskia/tmphfQhiQ/files/000/dataset_2.dat * tree structure file name
+ treefile = /tmp/tmpr0j4bmyj/files/000/dataset_2.dat * tree structure file name
 noisy = 9 * 0,1,2,3,9: how much rubbish on the screen
 verbose = 0 * 1: detailed output, 0: concise output
 runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
@@ -41,5 +41,4 @@
 cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
 fix_blength = 0 * 0: ignore, -1: random, 1: initial, 2: fixed
 method = 0 * 0: simultaneous; 1: one branch at a time
-
\ No newline at end of file
--- a/test-data/2_codeml.ctl Tue Aug 29 19:12:01 2017 -0400
+++ b/test-data/2_codeml.ctl Wed May 02 05:44:08 2018 -0400
@@ -1,7 +1,7 @@
- seqfile = /tmp/saskia/tmphfQhiQ/files/000/dataset_12.dat * sequence data file name
+ seqfile = /tmp/tmpr0j4bmyj/files/000/dataset_12.dat * sequence data file name
 outfile = run_codeml * main result file name
- treefile = /tmp/saskia/tmphfQhiQ/files/000/dataset_13.dat * tree structure file name
+ treefile = /tmp/tmpr0j4bmyj/files/000/dataset_13.dat * tree structure file name
 noisy = 9 * 0,1,2,3,9: how much rubbish on the screen
 verbose = 0 * 1: detailed output, 0: concise output
 runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
@@ -41,5 +41,4 @@
 cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
 fix_blength = 0 * 0: ignore, -1: random, 1: initial, 2: fixed
 method = 0 * 0: simultaneous; 1: one branch at a time
-
\ No newline at end of file
--- a/test-data/3_codeml.ctl Tue Aug 29 19:12:01 2017 -0400
+++ b/test-data/3_codeml.ctl Wed May 02 05:44:08 2018 -0400
@@ -1,7 +1,7 @@
- seqfile = /tmp/saskia/tmphfQhiQ/files/000/dataset_23.dat * sequence data file name
+ seqfile = /tmp/tmpr0j4bmyj/files/000/dataset_23.dat * sequence data file name
 outfile = run_codeml * main result file name
- treefile = /tmp/saskia/tmphfQhiQ/files/000/dataset_24.dat * tree structure file name
+ treefile = /tmp/tmpr0j4bmyj/files/000/dataset_24.dat * tree structure file name
 noisy = 9 * 0,1,2,3,9: how much rubbish on the screen
 verbose = 0 * 1: detailed output, 0: concise output
 runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
@@ -41,5 +41,4 @@
 cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
 fix_blength = 0 * 0: ignore, -1: random, 1: initial, 2: fixed
 method = 0 * 0: simultaneous; 1: one branch at a time
-
\ No newline at end of file
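Reviewer note: the three test control files differ only in their temporary dataset paths, which is easier to confirm by comparing parsed settings than raw text. Below is a minimal Python sketch, assuming the `key = value * comment` layout visible in the hunks above; `parse_ctl` is a hypothetical helper, not something shipped with the tool or with PAML, and it would truncate any value that itself contains `*`.

```python
def parse_ctl(text):
    """Parse codeml control-file text into a dict of string settings.

    Everything after the first '*' on a line is treated as a comment,
    matching the layout shown in the test-data files.
    """
    settings = {}
    for line in text.splitlines():
        # Drop the trailing comment, then surrounding whitespace.
        line = line.split("*", 1)[0].strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


SAMPLE = """\
 seqfile = /tmp/tmpr0j4bmyj/files/000/dataset_1.dat * sequence data file name
 outfile = run_codeml * main result file name
 fix_blength = 0 * 0: ignore, -1: random, 1: initial, 2: fixed
"""

if __name__ == "__main__":
    for key, value in parse_ctl(SAMPLE).items():
        print(key, "->", value)
```

Values are kept as strings on purpose, so that path-like and numeric settings can be compared without guessing their types.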