comparison codeml.xml @ 1:ba71e26d5bdc draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/codeml commit 43935edeb0abb95b08b191b379e160ec25cb33c0
author iuc
date Wed, 02 May 2018 05:44:08 -0400
parents 961a712f9743
children 66228e9c29d9
comparison
0:961a712f9743 1:ba71e26d5bdc
12 </requirements> 12 </requirements>
13 13
14 <version_command><![CDATA[ codeml /dev/null 2>&1 | tail -1 ]]></version_command> 14 <version_command><![CDATA[ codeml /dev/null 2>&1 | tail -1 ]]></version_command>
15 15
16 <command><![CDATA[ 16 <command><![CDATA[
17
18 codeml '$codeml_ctl' 17 codeml '$codeml_ctl'
19 && 18 &&
20 mv '$codeml_ctl' '$ctl' 19 mv '$codeml_ctl' '$ctl'
21
22 ]]></command> 20 ]]></command>
23 21
24 <configfiles> 22 <configfiles>
25 <configfile name="codeml_ctl"><![CDATA[ 23 <configfile name="codeml_ctl"><![CDATA[
26 seqfile = $concat_nuc * sequence data file name 24 seqfile = $concat_nuc * sequence data file name
63 RateAncestor = $adv.RateAncestor * (0,1,2): rates (alpha>0) or ancestral states (1 or 2) 61 RateAncestor = $adv.RateAncestor * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
64 Small_Diff = $adv.Small_Diff 62 Small_Diff = $adv.Small_Diff
65 cleandata = $adv.cleandata * remove sites with ambiguity data (1:yes, 0:no)? 63 cleandata = $adv.cleandata * remove sites with ambiguity data (1:yes, 0:no)?
66 fix_blength = $adv.fix_blength * 0: ignore, -1: random, 1: initial, 2: fixed 64 fix_blength = $adv.fix_blength * 0: ignore, -1: random, 1: initial, 2: fixed
67 method = $adv.method * 0: simultaneous; 1: one branch at a time 65 method = $adv.method * 0: simultaneous; 1: one branch at a time
68
69 ]]></configfile> 66 ]]></configfile>
70 </configfiles> 67 </configfiles>
71 68
72 <inputs> 69 <inputs>
73
74 <param name="concat_nuc" type="data" format="fasta" label="Sequences file" help="The fasta file with the sequences to be analyzed" /> 70 <param name="concat_nuc" type="data" format="fasta" label="Sequences file" help="The fasta file with the sequences to be analyzed" />
75 <param name="tree" type="data" format="nhx" label="tree file" help="Tree file in Newick format" /> 71 <param name="tree" type="data" format="nhx" label="tree file" help="Tree file in Newick format" />
76 72
77 <conditional name="compat_model" > 73 <conditional name="compat_model" >
78 <param argument="brmodel" type="select" label="Branch model ; for tree file editing in model 2 and 3, see paml manual (chap.3)" > 74 <param argument="brmodel" type="select" label="Branch model ; for tree file editing in model 2 and 3, see paml manual (chap.3)" >
312 <output name="ctl" value="3_codeml.ctl" lines_diff="4" /> 308 <output name="ctl" value="3_codeml.ctl" lines_diff="4" />
313 </test> 309 </test>
314 </tests> 310 </tests>
315 311
316 <help><![CDATA[ 312 <help><![CDATA[
317
318 .. class:: infomark 313 .. class:: infomark
319 314
320 **Galaxy integration** Victor Mataigne and ABIMS TEAM. 315 **Galaxy integration** Victor Mataigne and ABIMS TEAM.
321 316
322 Contact support.abims@sb-roscoff.fr for any questions or concerns about the Galaxy implementation of this tool. 317 Contact support.abims@sb-roscoff.fr for any questions or concerns about the Galaxy implementation of this tool.
323 318
324 ---------- 319 ----------
325 320
326 **CompCodeML (from paml package)** 321 **CompCodeML (from paml package)**
327 322
328 Brief help is given below; the full codeml documentation can be found on the paml website_ 323 Brief help is given below; the full codeml documentation can be found on the paml website_.
329 324
330 .. _website: http://abacus.gene.ucl.ac.uk/software/paml.html 325 .. _website: http://abacus.gene.ucl.ac.uk/software/paml.html
331 326
332 .. class:: warningmark 327 .. class:: warningmark
333 328
334 Because of the large number of parameters, some incompatibilities may remain. 329 Because of the large number of parameters, some incompatibilities may remain.
335 330
336 This Galaxy implementation : 331 This Galaxy implementation :
332
337 - handles incompatibilities between branch and sites models (the tool CANNOT be run with incompatible models). 333 - handles incompatibilities between branch and sites models (the tool CANNOT be run with incompatible models).
338 - warns the user in a help section when an advanced parameter has known incompatibilities (the tool CAN be run, but the output files will be empty). 334 - warns the user in a help section when an advanced parameter has known incompatibilities (the tool CAN be run, but the output files will be empty).
339 335
340 We recommend reading the full paml manual before changing the advanced parameters, in order to spot parameter incompatibilities and to understand what each model does. If you select incompatible parameters by mistake, the output files will be empty, except for the log file ("run_codeml" output), which will normally report the error. 336 We recommend reading the full paml manual before changing the advanced parameters, in order to spot parameter incompatibilities and to understand what each model does. If you select incompatible parameters by mistake, the output files will be empty, except for the log file ("run_codeml" output), which will normally report the error.
341 337
342 .. class:: infomark 338 .. class:: infomark
343 339
344 Known incompatibilities: 340 Known incompatibilities:
345 - 'seqtype' = 3 : only compatible with 'FSsites' = 0. 341
346 - 'clock' = 2 : needs branch labels in the tree. 342 - 'seqtype' = 3 : only compatible with 'FSsites' = 0.
347 - fix_alpha !=1 combined with alpha !=0 are not compatible with NSsites !=0 343 - 'clock' = 2 : needs branch labels in the tree.
348 - 'aaDist' = 0 is the only one compatible with 'NSsites' different than 0 and 'seqtype' = 1. 344 - fix_alpha !=1 combined with alpha !=0 are not compatible with NSsites !=0
349 - 'method' = 1 : does not work with 'clock' different than 0. 345 - 'aaDist' = 0 is the only one compatible with 'NSsites' different than 0 and 'seqtype' = 1.
346 - 'method' = 1 : does not work with 'clock' different than 0.
350 347
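The known incompatibilities listed above can be sketched as a small pre-flight check. This is only an illustration, not part of the tool: the function name `known_incompatibilities`, the default values, and the use of `#` or `$` in the tree text as a proxy for "labels are present" are all assumptions. The 'seqtype' = 3 rule is left out because the parameter it refers to is not defined in this excerpt.

```python
def known_incompatibilities(params, tree_text=""):
    """Return a list of messages for the combinations listed in the help text.

    `params` is a plain dict of codeml option names to values (hypothetical
    helper; the defaults used below are assumptions, not codeml defaults).
    """
    problems = []
    ns_sites = params.get("NSsites", 0)
    clock = params.get("clock", 0)

    # 'clock' = 2 needs labels in the tree.
    if clock == 2 and "#" not in tree_text and "$" not in tree_text:
        problems.append("clock = 2 needs branch labels in the tree")

    # fix_alpha != 1 combined with alpha != 0 is not compatible with NSsites != 0.
    if params.get("fix_alpha", 1) != 1 and params.get("alpha", 0) != 0 and ns_sites != 0:
        problems.append("fix_alpha != 1 with alpha != 0 requires NSsites = 0")

    # With seqtype = 1 and NSsites != 0, only aaDist = 0 is compatible.
    if params.get("seqtype") == 1 and ns_sites != 0 and params.get("aaDist", 0) != 0:
        problems.append("seqtype = 1 with NSsites != 0 requires aaDist = 0")

    # 'method' = 1 does not work with clock != 0.
    if params.get("method", 0) == 1 and clock != 0:
        problems.append("method = 1 requires clock = 0")

    return problems
```

An empty list means none of the listed combinations was detected; it does not guarantee that codeml will accept the settings.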
351 ---------- 348 ----------
352 349
353 **Description** 350 **Description**
354 351
366 ---------- 363 ----------
367 364
368 **Parameters** 365 **Parameters**
369 366
370 Several models are available. 367 Several models are available.
368
371 - branch models ("model" parameter). 369 - branch models ("model" parameter).
372 - sites models ("NSsites" parameter, model is left at 0). 370 - sites models ("NSsites" parameter, model is left at 0).
373 - branch-sites models (when model = 2 NSsites=2,3). 371 - branch-sites models (when model = 2 NSsites=2,3).
374 - Clade models (when model=3 NSsites=2,3). 372 - Clade models (when model=3 NSsites=2,3).
373
375 Basically, this tool writes a configfile called codeml.ctl with the specified parameters and then launches codeml. 374 Basically, this tool writes a configfile called codeml.ctl with the specified parameters and then launches codeml.
376 375
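As a rough illustration of the "write a control file, then launch codeml" flow described above, the sketch below renders `name = value` lines in the same shape as the configfile shown in this diff. The helper `render_ctl`, the file names, and the `treefile` key are assumptions made for the example; the real tool fills the file through Galaxy's configfile templating.

```python
def render_ctl(params):
    """Return control-file text with one 'name = value' line per parameter."""
    return "".join(f"{name} = {value}\n" for name, value in params.items())


# Hypothetical parameter values, for illustration only.
example_params = {
    "seqfile": "sequences.fasta",
    "treefile": "tree.nwk",
    "model": 0,
    "NSsites": 0,
}

ctl_text = render_ctl(example_params)

# The tool then calls codeml with the path of the control file as its argument,
# as in the <command> block: codeml '$codeml_ctl'
command = ["codeml", "codeml.ctl"]
```

Writing `ctl_text` to `codeml.ctl` and running `command` (for example with `subprocess.run`) would reproduce the two steps; that part is left out here so the sketch has no side effects.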
377 .. class:: infomark 376 .. class:: infomark
378 377
379 Branch models allow the omega ratio to vary among branches in the phylogeny and are useful for detecting positive selection acting on particular lineages. Sites models allow the omega ratio to vary among sites (codons or amino acids). 378 Branch models allow the omega ratio to vary among branches in the phylogeny and are useful for detecting positive selection acting on particular lineages. Sites models allow the omega ratio to vary among sites (codons or amino acids).
382 381
383 **Other examples of model** 382 **Other examples of model**
384 383
385 How to run the branch-site models (A &amp; B in Yang &amp; Nielsen 2002 MBE) ? 384 How to run the branch-site models (A &amp; B in Yang &amp; Nielsen 2002 MBE) ?
386 The options are : 385 The options are :
387 Model A: (model=2, NSsites=2). 386
388 Model B: (model=2, NSsites=3). 387 - Model A: (model=2, NSsites=2).
388 - Model B: (model=2, NSsites=3).
389 389
390 How to run the M0 (one-ratio) model : 390 How to run the M0 (one-ratio) model :
391
391 model = 0, NSsites = 0. 392 model = 0, NSsites = 0.
392 393
393 ---------- 394 ----------
394 395
395 **Advanced Parameters** 396 **Advanced Parameters**
396 397
397 .. class:: infomark 398 .. class:: infomark
398 399
399 See paml complete manual and FAQ on paml website_ 400 See paml complete manual and FAQ on the paml website_.
400 401
401 .. _website: http://abacus.gene.ucl.ac.uk/software/paml.html 402 .. _website: http://abacus.gene.ucl.ac.uk/software/paml.html
402 403
403 **Details of some parameters :** 404 **Details of some parameters :**
404 405
405 - 'kappa' denotes the transition/transversion rate ratio. 406 - 'kappa' denotes the transition/transversion rate ratio.
406 - 'fix_kappa' specifies whether kappa in K80, F84, or HKY85 is given at a fixed value or is to be estimated by iteration from the data. 407 - 'fix_kappa' specifies whether kappa in K80, F84, or HKY85 is given at a fixed value or is to be estimated by iteration from the data.
407 -> If fix_kappa = 1 (fixed), the value of kappa is the given value 408
408 -> If fix_kappa = 0 (estimated) the value of kappa is used as the initial estimate for iteration. 409 -> If fix_kappa = 1 (fixed), the value of kappa is the given value
409 410
410 - 'alpha' refers to the shape parameter alpha of the gamma distribution for variable substitution rates across sites (Yang 1994a). 411 -> If fix_kappa = 0 (estimated) the value of kappa is used as the initial estimate for iteration.
411 - 'fix_alpha' works in a similar way to fix_kappa. 412
412 -> The model of a single rate for all sites is specified as fix_alpha = 1 and alpha = 0 (0 means infinity) 413 - 'alpha' refers to the shape parameter alpha of the gamma distribution for variable substitution rates across sites (Yang 1994a).
413 -> The (discrete-) gamma model is specified by a positive value for alpha, and 'ncatG' is then the number of categories for the discrete-gamma model. Values such as 5, 4, 8, or 10 are reasonable. 414 - 'fix_alpha' works in a similar way to fix_kappa.
414 415
415 - fix_rho and rho work in a similar way and concern independence or correlation of rates at adjacent sites, where rho is the correlation parameter of the auto-discrete-gamma model (Yang 1995). 416 -> The model of a single rate for all sites is specified as fix_alpha = 1 and alpha = 0 (0 means infinity)
416 -> The model of independent rates for sites is specified as fix_rho = 1 and rho = 0; choosing alpha = 0 further means a constant rate for all sites. 417
417 -> The auto-discrete-gamma model is specified by positive values for both alpha and rho. 418 -> The (discrete-) gamma model is specified by a positive value for alpha, and 'ncatG' is then the number of categories for the discrete-gamma model. Values such as 5, 4, 8, or 10 are reasonable.
418 -> The model of a constant rate for sites is a special case of the (discrete) gamma model with alpha = 0 (means infinity). 419
419 -> The model of independent rates for sites is a special case of the auto-discrete-gamma model with rho = 0. 420 - fix_rho and rho work in a similar way and concern independence or correlation of rates at adjacent sites, where rho is the correlation parameter of the auto-discrete-gamma model (Yang 1995).
421
422 -> The model of independent rates for sites is specified as fix_rho = 1 and rho = 0; choosing alpha = 0 further means a constant rate for all sites.
423
424 -> The auto-discrete-gamma model is specified by positive values for both alpha and rho.
425
426 -> The model of a constant rate for sites is a special case of the (discrete) gamma model with alpha = 0 (means infinity).
427
428 -> The model of independent rates for sites is a special case of the auto-discrete-gamma model with rho = 0.
420 429
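The relationship between alpha, rho and the resulting among-site rate model can be summarised in a few lines. This is a minimal sketch of the rules stated above; the function name `rate_model` and the returned strings are illustrative and not part of codeml.

```python
def rate_model(alpha, rho=0.0):
    """Name the among-site rate model implied by alpha and rho."""
    if alpha < 0 or rho < 0:
        raise ValueError("alpha and rho are expected to be non-negative")
    if alpha == 0:
        # alpha = 0 stands for infinity: a constant rate for all sites.
        return "constant rate for all sites"
    if rho == 0:
        # Positive alpha with rho = 0: rates are independent across sites.
        return "discrete-gamma (independent rates)"
    # Positive alpha and positive rho: rates at adjacent sites are correlated.
    return "auto-discrete-gamma"
```

For instance, `rate_model(0.5)` falls in the discrete-gamma case, and 'ncatG' then sets the number of rate categories.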
421 ---------- 430 ----------
422 431
423 **Output files** 432 **Output files**
424 433
432 441
433 **How to manually edit the tree file : Branch or node labels** 442 **How to manually edit the tree file : Branch or node labels**
434 443
435 Some models implemented in codeml allow several groups of branches on the tree, which are assigned different parameters of interest. 444 Some models implemented in codeml allow several groups of branches on the tree, which are assigned different parameters of interest.
436 445
437 - For example, in the local clock models (clock = 2 or 3), you can have, say, 3 branch rate groups, with low, medium, and high rates respectively. 446 - For example, in the local clock models (clock = 2 or 3), you can have, say, 3 branch rate groups, with low, medium, and high rates respectively.
438 447 - Also the branch-specific codon models (model = 2 or 3 for codonml) allow different branch groups to have different ωs, leading to so-called “two-ratios” and “three-ratios” models.
439 - Also the branch-specific codon models (model = 2 or 3 for codonml) allow different branch groups to have different ωs, leading to so-called “two-ratios” and “three-ratios” models. 448 - All those models require branches or nodes in the tree to be labeled. Branch labels are specified in the same way as branch lengths except that the symbol “#” is used rather than “:”. The branch labels are consecutive integers starting from 0, which is the default and does not have to be specified.
440
441 - All those models require branches or nodes in the tree to be labeled. Branch labels are specified in the same way as branch lengths except that the symbol “#” is used rather than “:”. The branch labels are consecutive integers starting from 0, which is the default and does not have to be specified.
442 449
443 In ((Hsa_Human, Hla_gibbon) #1, ((Cgu/Can_colobus, Pne_langur), Mmu_rhesus), (Ssc_squirrelM, Cja_marmoset)); : 450 In ((Hsa_Human, Hla_gibbon) #1, ((Cgu/Can_colobus, Pne_langur), Mmu_rhesus), (Ssc_squirrelM, Cja_marmoset)); :
444 The internal branch ancestral to human and gibbon has the ratio ω1, while all other branches (with the default label #0) have the background ratio ω0. 451 The internal branch ancestral to human and gibbon has the ratio ω1, while all other branches (with the default label #0) have the background ratio ω0.
445 452
446 The following trees are equivalent : 453 The following trees are equivalent:
447 (((rabbit, rat) $1, human), goat_cow, marsupial); 454 - (((rabbit, rat) $1, human), goat_cow, marsupial);
448 (((rabbit #1, rat #1) #1, human), goat_cow, marsupial); 455 - (((rabbit #1, rat #1) #1, human), goat_cow, marsupial);
449 456
450 $ is the symbol for clade labels. 457 $ is the symbol for clade labels.
451 458
452 Rules concerning nested clade labels : The symbol # takes precedence over the symbol $, and clade labels close to the tips take precedence over clade labels for ancestral nodes close to the root. 459 Rules concerning nested clade labels : The symbol # takes precedence over the symbol $, and clade labels close to the tips take precedence over clade labels for ancestral nodes close to the root.
453 460
454 In the tree ((((rabbit, rat) $2, human #3), goat_cow) $1, marsupial); : 461 In the tree ((((rabbit, rat) $2, human #3), goat_cow) $1, marsupial); :
455 $1 is first applied to the whole clade of placental mammals (except for the human lineage), and then $2 is applied to the rabbit-rat clade. 462 $1 is first applied to the whole clade of placental mammals (except for the human lineage), and then $2 is applied to the rabbit-rat clade.
456 Equivalent tree with only '#' : 463 Equivalent tree with only '#' :
457 ((((rabbit #2, rat #2) #2, human #3) #1, goat_cow #1) #1, marsupial); 464 ((((rabbit #2, rat #2) #2, human #3) #1, goat_cow #1) #1, marsupial);
458
459
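The two precedence rules above (`#` beats `$`, and clade labels nearer the tips beat those nearer the root) can be checked with a small script that rewrites `$` clade labels as explicit `#` branch labels. This is a sketch under stated assumptions: the helper names are invented for the example, the parser only handles trees without branch lengths, and labels must be written without a space after `#` or `$`. It is not how codeml itself reads the tree.

```python
import re

TOKEN = re.compile(r"[(),;]|[#$]\d+|[^(),;#$\s]+")
STRUCTURAL = {"(", ")", ",", ";"}


class Node:
    def __init__(self):
        self.name = ""
        self.children = []
        self.branch = None  # integer given after '#'
        self.clade = None   # integer given after '$'


def parse_newick(text):
    """Parse a labelled Newick string into a tree of Node objects."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_node():
        nonlocal pos
        node = Node()
        if tokens[pos] == "(":
            pos += 1
            node.children.append(parse_node())
            while tokens[pos] == ",":
                pos += 1
                node.children.append(parse_node())
            if tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
        # Optional name and labels that follow the node.
        while pos < len(tokens) and tokens[pos] not in STRUCTURAL:
            tok = tokens[pos]
            if tok[0] == "#":
                node.branch = int(tok[1:])
            elif tok[0] == "$":
                node.clade = int(tok[1:])
            else:
                node.name = tok
            pos += 1
        return node

    return parse_node()


def expand_clade_labels(node, inherited=None):
    """Turn '$' clade labels into '#' labels on every branch of the clade."""
    if node.clade is not None:
        inherited = node.clade      # a label nearer the tips overrides the outer one
    if node.branch is None:
        node.branch = inherited     # an explicit '#' label is never overwritten
    for child in node.children:
        expand_clade_labels(child, inherited)
    node.clade = None


def to_newick(node):
    text = ""
    if node.children:
        text = "(" + ", ".join(to_newick(c) for c in node.children) + ")"
    text += node.name
    if node.branch is not None:
        text += f" #{node.branch}"
    return text


def dollar_to_hash(tree_text):
    root = parse_newick(tree_text)
    expand_clade_labels(root)
    return to_newick(root) + ";"
```

Calling `dollar_to_hash` on the nested example above, `((((rabbit, rat) $2, human #3), goat_cow) $1, marsupial);`, yields the `#`-only tree given in the text.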
460 ]]></help> 465 ]]></help>
461 466
462 <citations> 467 <citations>
463 <citation type="doi">10.1093/molbev/msm088</citation> 468 <citation type="doi">10.1093/molbev/msm088</citation>
464 </citations> 469 </citations>