view gecko/gecko.xml @ 2:ac1ecf12863a draft default tip

Uploaded
author bitlab
date Thu, 13 Dec 2018 08:57:14 -0500
parents 35af401890c0

<tool id="GECKO" name="GECKO" version="1.0.1">
  <description>Pairwise genome comparison software for the detection of High-scoring Segment Pairs (HSPs)</description>

  <command>/home/galaxy-bitlab/galaxy/tools/gecko/bin/workflow.sh '$genome1' '$genome2' $length $similarity $wl 1 '$frags_output0' '$csv_output1'</command>

  <inputs>
    <param name="genome1" type="data" format="fasta" label="Sequence X" help="Query sequence file in FASTA format" />
    <param name="genome2" type="data" format="fasta" label="Sequence Y" help="Reference sequence file in FASTA format" />
    <param name="length" type="integer" value="20" label="Minimum length to report a fragment" />
    <param name="similarity" type="integer" value="20" label="Minimum similarity to report a fragment" />
    <param name="wl" type="integer" value="16" label="Word length" />
  </inputs>

  <outputs>
    <data name="frags_output0" format="data" label="Frags comparison for ${genome1.name}_${genome2.name}" />
    <data name="csv_output1" format="csv" label="CSV comparison for ${genome1.name}_${genome2.name}" />
  </outputs>

  <help>

GECKO (GEnome Comparison with K-mers Out-of-core) is a fast, modular application designed to identify collections of HSPs (High-scoring Segment Pairs) in pairwise genome comparisons. By employing novel filtering and data storage strategies, it can compare genome-sized sequences in reduced time.

-----

**Manual**

To use GECKO, upload two FASTA files and select them as Sequence X and Sequence Y.
Then choose the parameters that best suit your comparison:


- Minimum length: The minimum length, in nucleotides, that an HSP (similarity fragment) must have to be kept. Any HSP shorter than this is filtered out of the comparison. Around 40 bp is recommended for small organisms (e.g. bacteria such as Mycoplasma or E. coli) and around 100 bp or more for larger organisms (e.g. human chromosomes).

- Minimum similarity: Analogous to the minimum length, but the threshold is applied to similarity instead of length. The similarity is the score attained by an HSP divided by the maximum possible score. Use values above 50 to filter out noise.

- Word length: The seed size used to find HSPs. A smaller seed increases sensitivity at the cost of speed, whereas a larger seed decreases sensitivity but runs faster. Recommended values are 12 or 16 for smaller organisms and 32 for larger organisms. The value must be a multiple of 4.
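
As a rough illustration of how the three parameters interact, the Python sketch below applies the two thresholds to a single HSP and checks a word length. It is not GECKO's own code: the function names are hypothetical, the similarity is assumed to be expressed on a 0 to 100 scale (suggested by the recommended threshold of 50), and both thresholds are assumed to be inclusive.

```python
def similarity_percent(score, max_score):
    """Score attained by an HSP divided by the maximum possible score.

    Assumption: reported on a 0-100 scale.
    """
    if not max_score > 0:
        raise ValueError("max_score must be positive")
    return 100.0 * score / max_score


def keep_hsp(length, score, max_score, min_length, min_similarity):
    """Return True when an HSP meets both thresholds (hypothetical helper).

    Assumption: an HSP exactly at a threshold is kept.
    """
    long_enough = length >= min_length
    similar_enough = similarity_percent(score, max_score) >= min_similarity
    return long_enough and similar_enough


def valid_word_length(wl):
    """Word length must be a positive multiple of 4."""
    return wl >= 4 and wl % 4 == 0


if __name__ == "__main__":
    # A 120 bp HSP scoring 80 out of a possible 100,
    # checked against a minimum length of 40 and a minimum similarity of 50.
    print(keep_hsp(120, 80, 100, min_length=40, min_similarity=50))
    print(valid_word_length(16), valid_word_length(10))
```

With these assumptions, an HSP is reported only if it passes both filters, so raising either threshold can only reduce the number of HSPs in the output.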

</help>

  <citations>
    <citation type="doi">10.1186/s12859-015-0679-9</citation>
  </citations>

</tool>