comparison orthologs/evolmap.xml @ 0:5b9a38ec4a39 draft default tip

First commit of old repositories
author osiris_phylogenetics <ucsb_phylogenetics@lifesci.ucsb.edu>
date Tue, 11 Mar 2014 12:19:13 -0700
-1:000000000000 0:5b9a38ec4a39
1 <tool id="evolmap" name="EvolMAP">
2 <description>Runs EvolMAP.</description>
3 <command interpreter="perl">evolmap.pl '$tree' $protein $database_name $read_database $Blastall $read_blast_scores $alignments $bit_scores $read_scores $read_ancestors $sfa $ortholog_threshold $diverged_threshold $diverged_std $avg_of_paralogs #for $file in $fileList
4 ${file.tree_file}
5 #end for</command>
6
7 <inputs>
8 <repeat name="fileList" title="Species files (fasta) - please input in the same order as species tree">
9 <param name="tree_file" type="data" format="fasta" label="fasta file"/>
10 </repeat>
11
12 <param name="tree" type="text" area="true" size="3x25" label="tree (newick format)"/>
13 <param name="protein" type="boolean" truevalue="true" falsevalue="false" checked="true" label="protein" />
14 <param name="database_name" type="text" value="dataout" label="database_name" />
15 <param name="read_database" type="boolean" truevalue="true" falsevalue="false" checked="false" label="read_database" />
16 <param name="Blastall" type="boolean" truevalue="true" falsevalue="false" checked="true" label="Blastall" />
17 <param name="read_blast_scores" type="boolean" truevalue="true" falsevalue="false" checked="false" label="read_blast_scores" />
18 <param name="alignments" type="integer" value="0" label="alignments" />
19 <param name="bit_scores" type="boolean" truevalue="true" falsevalue="false" checked="false" label="bit_scores" />
20 <param name="read_scores" type="boolean" truevalue="true" falsevalue="false" checked="false" label="read_scores" />
21 <param name="read_ancestors" type="boolean" truevalue="true" falsevalue="false" checked="false" label="read_ancestors" />
22 <param name="sfa" type="boolean" truevalue="true" falsevalue="false" checked="false" label="sfa" />
23 <param name="ortholog_threshold" type="integer" value="250" label="ortholog_threshold" />
24 <param name="diverged_threshold" type="integer" value="250" label="diverged_threshold" />
25 <param name="diverged_std" type="integer" value="3" label="diverged_std" />
26 <param name="avg_of_paralogs" type="boolean" truevalue="true" falsevalue="false" checked="true" label="avg_of_paralogs" />
27 </inputs>
28
29 <outputs>
30 <data from_work_dir="dataout.ancestors_pass2.rn"/>
31 </outputs>
32
33 <help>
34 EvolMAP is an algorithm and software for estimating the composition of ancestral genomes and the timing of gene duplication and loss events.
35 The input is a species-tree and genes from its modern species.
36 The output is the inferred ancestral genes of the speciation nodes of the tree and the inferred gene duplication and loss events specific to each branch.
37
38 EvolMAP features include:
39 * Detection of orthologous groups from an ancestral gene perspective (i.e. descendants of an ancestral gene)
40 * Scalable and fast genome-level comparisons laying out timings of gene duplications and losses
41 * Generating a gene expansion (GE) tree, which is useful for tracking the evolution of a specific domain on the species tree
42 * Generating an average ortholog divergence (AOD) tree, which is a measure of the molecular clock
43 * Categorizing divergence of gene duplications into in-paralogs, diverged in-paralogs and ambiguous gains
44
45 Onur Sakarya, Kenneth S. Kosik and Todd H. Oakley. Reconstructing ancestral genome content based on symmetrical best alignments and Dollo parsimony. Bioinformatics 2008 24(5):606-612.
46
47 http://kosik-web.mcdb.ucsb.edu/evolmap/index.htm
48
49 Options overview
50
51 tree: input newick format species tree
52 Example input: (((human,chimp),(mouse,rat)),dog)
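The species files must be supplied in the same order as the species in the tree. A minimal sketch for listing that order from a Newick string is shown here; the helper name newick_leaf_order is hypothetical and not part of EvolMAP, and it assumes simple unquoted labels.

```python
import re

def newick_leaf_order(tree):
    """Return leaf names in left-to-right order from a simple Newick string.

    Assumption: labels are unquoted. Leaf names directly follow "(" or ",",
    while internal node labels follow ")" and are therefore skipped.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", tree)

# Upload the fasta files in this order.
print(newick_leaf_order("(((human,chimp),(mouse,rat)),dog)"))
```

For the example tree, the fasta files would be added as human, chimp, mouse, rat, dog.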
53
54 protein: whether the input files contain amino-acid (protein) or nucleotide sequences
55 Check for amino-acid sequences, uncheck for nucleotide sequences
56
57 database_name: analysis name
58 Example input: mammal_genomes
59 Output database name would be: mammal_genomes.gd.fa
60
61 read_database: if true, reads already created fasta from disk
62 Check if true, else false
63
64 Blastall: if true, runs blastall first; if false, generates all-to-all Needleman-Wunsch scores, which is slow for large datasets.
65 Check if true, else false
66
67 read_blast_scores: if true, reads already calculated blast scores
68 Check if true, else false
69
70 alignments: Number of top alignments for each gene to be calculated by blastall (if used)
71 Example input: 300
72
73 bit_scores: if true, uses blast bit scores; if false, calculates Needleman-Wunsch alignment scores for the blast hits.
74 Check if true, else false
75
76 read_scores: if true, reads scores from already calculated score file
77 Check if true, else false
78
79 read_ancestors: if true, reads already calculated ancestors from file and re-runs Dollo parsimony
80 Check if true, else false
81
82 ortholog_threshold: minimum similarity threshold for orthologs
83 Example input: 250
84
85 diverged_threshold: minimum similarity threshold for diverged paralogs
86 Example input: 250
87
88 diverged_std: diverged paralogs are allowed to be at most this many ortholog divergence standard deviations from the ancestor node's average sym-bet score.
89 Example input: 3
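One possible reading of the diverged_std rule is sketched here. It assumes the limit is measured as a drop below the node's average sym-bet score and uses the population standard deviation; the function name and the sample scores are hypothetical, and EvolMAP's actual calculation may differ.

```python
from statistics import mean, pstdev

def within_divergence_limit(score, symbet_scores, diverged_std=3):
    """Hypothetical check: is score no more than diverged_std standard
    deviations below the node's average sym-bet score?"""
    avg = mean(symbet_scores)
    spread = pstdev(symbet_scores)
    # Largest drop below the average that is still accepted.
    allowed_drop = diverged_std * spread
    return allowed_drop >= (avg - score)

# Made-up sym-bet scores for one ancestor node.
node_scores = [400, 500, 600]
print(within_divergence_limit(300, node_scores))
print(within_divergence_limit(200, node_scores))
```

Raising diverged_std widens the accepted range, so more distant paralogs would be kept under this reading.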
90
91 avg_of_paralogs: if true, the similarity between two ancestral genes is the average score of all members of one gene against all members of the other; if false, only the best score between the members is considered.
92 Check if true, else false
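The two avg_of_paralogs modes can be illustrated with a small sketch. It assumes the average is taken over every cross pair of members; the function name, gene names and scores are made up for illustration and do not come from EvolMAP.

```python
from itertools import product

def ancestral_similarity(members_a, members_b, score, avg_of_paralogs=True):
    """Similarity between two ancestral genes given pairwise member scores.

    score maps (member_of_a, member_of_b) to a similarity value.
    """
    pair_scores = [score[(a, b)] for a, b in product(members_a, members_b)]
    if avg_of_paralogs:
        # Average over all member-to-member comparisons.
        return sum(pair_scores) / len(pair_scores)
    # Otherwise keep only the best-scoring pair.
    return max(pair_scores)

# Hypothetical scores between members of two ancestral genes.
scores = {("h1", "m1"): 300, ("h1", "m2"): 200,
          ("h2", "m1"): 250, ("h2", "m2"): 150}
print(ancestral_similarity(["h1", "h2"], ["m1", "m2"], scores, True))
print(ancestral_similarity(["h1", "h2"], ["m1", "m2"], scores, False))
```

Averaging damps the effect of a single high-scoring pair, while the best-score mode lets one close pair decide the similarity.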
93 </help>
94
95 </tool>