comparison salsa2.xml @ 3:f77f7a7f3b83 draft
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/salsa2 commit 4904594e8df7cbd6eeee4be24023c6bd15e162de"
author | iuc |
---|---|
date | Thu, 11 Nov 2021 15:03:17 +0000 |
parents | ab5b7f6b7198 |
children | 9a22227bb6d0 |
comparison
2:ab5b7f6b7198 | 3:f77f7a7f3b83 |
---|---|
3 <xrefs> | 3 <xrefs> |
4 <xref type="bio.tools">SALSA</xref> | 4 <xref type="bio.tools">SALSA</xref> |
5 </xrefs> | 5 </xrefs> |
6 <macros> | 6 <macros> |
7 <token name="@TOOL_VERSION@">2.3</token> | 7 <token name="@TOOL_VERSION@">2.3</token> |
8 <token name="@VERSION_SUFFIX@">1</token> | 8 <token name="@VERSION_SUFFIX@">2</token> |
9 </macros> | 9 </macros> |
10 <requirements> | 10 <requirements> |
11 <requirement type="package" version="@TOOL_VERSION@">salsa2</requirement> | 11 <requirement type="package" version="@TOOL_VERSION@">salsa2</requirement> |
12 <requirement type="package" version="1.11">samtools</requirement> | 12 <requirement type="package" version="1.11">samtools</requirement> |
13 </requirements> | 13 </requirements> |
41 #end if | 41 #end if |
42 -o ./out | 42 -o ./out |
43 ]]></command> | 43 ]]></command> |
44 <inputs> | 44 <inputs> |
45 <param name="fasta_in" type="data" format="fasta" label="Initial assembly file" help="Headers must not contain ':'."/> | 45 <param name="fasta_in" type="data" format="fasta" label="Initial assembly file" help="Headers must not contain ':'."/> |
46 <param name="bed_file" type="data" format="bed" label="Bed alignment" help="Sorted by read names"/> | 46 <param name="bed_file" type="data" format="bed" label="Bed alignment" help="To start scaffolding with SALSA, reads need to be mapped to the assembly. |
47 <param name="cutoff" argument="-c" type="integer" min="1" label="Cutoff" optional="true" help="Minimum contig length to scaffold."/> | 47 BWA or BOWTIE2 are recommended. SALSA requires a bed file as the input. The alignment bam file can be converted using the bamToBed command from |
48 <param name="gfa_file" argument="-g" type="data" format="gfa1,gfa2" optional="true" label="Sequence graphs" help="Sequence graphs encoded in GFA format."/> | 48 the Bedtools package."/> |
49 <param name="cutoff" argument="-c" type="integer" min="1" label="Cutoff" optional="true" help="Minimum contig length to scaffold"/> | |
50 <param name="gfa_file" argument="-g" type="data" format="gfa1,gfa2" optional="true" label="Sequence graphs" | |
51 help="An assembly graph can be optionally provided to guide the scaffolding, potentially reducing the scaffolding errors"/> | |
49 <conditional name="enzyme_conditional"> | 52 <conditional name="enzyme_conditional"> |
50 <param name="enzyme_options" type="select" label="Enzyme selection" help="TODO"> | 53 <param name="enzyme_options" type="select" label="Enzyme selection" help="Hi-C experiments can use different restriction enzymes. |
54 The enzyme frequency in contigs is used to normalize the Hi-C interaction frequency. Note that you need to specify the actual | |
55 sequence of the cutting site for a restriction enzyme and not the enzyme name. You can also specify DNASE as an enzyme if you | |
56 use an enzyme-free prep, e.g. Omin-C."> | |
51 <option value="preconfigured">Preconfigured restriction enzymes</option> | 57 <option value="preconfigured">Preconfigured restriction enzymes</option> |
52 <option value="specific">Enter a specific sequence</option> | 58 <option value="specific">Enter a specific sequence</option> |
53 </param> | 59 </param> |
54 <when value="preconfigured"> | 60 <when value="preconfigured"> |
55 <param name="preconfigured_enzymes" type="select" multiple="true" label="Preconfigured enzymes"> | 61 <param name="preconfigured_enzymes" type="select" multiple="true" label="Preconfigured enzymes"> |
63 help="Restriction enzyme sequence. If multiple were used, include all as a comma separated list without spaces (ex. 'GATC,AAGCTT')."> | 69 help="Restriction enzyme sequence. If multiple were used, include all as a comma separated list without spaces (ex. 'GATC,AAGCTT')."> |
64 <validator type="expression" message="Only alphabetical letters and the comma can be used in to define restriction enzym sequences.">value.replace(',', '').isalpha()</validator> | 70 <validator type="expression" message="Only alphabetical letters and the comma can be used in to define restriction enzym sequences.">value.replace(',', '').isalpha()</validator> |
65 </param> | 71 </param> |
66 </when> | 72 </when> |
67 </conditional> | 73 </conditional> |
68 <param name="iter" argument="-i" type="integer" min="0" label="Iterations" optional="true" help="Number of iterations to run"/> | 74 <param name="iter" argument="-i" type="integer" min="0" max="20" label="Iterations" optional="true" |
75 help="SALSA will scaffold through sequential iterations. The default number of iterations is 3. Increasing the number of iterations will | |
76 potentially increase the number of joins, however it could also introduce additional misjoins"/> | |
69 </inputs> | 77 </inputs> |
70 <outputs> | 78 <outputs> |
71 <data name="scaffolds_fasta" format="fasta" from_work_dir="out/scaffolds_FINAL.fasta" label="${tool.name} on ${on_string}: FASTA assembly"/> | 79 <data name="scaffolds_fasta" format="fasta" from_work_dir="out/scaffolds_FINAL.fasta" label="${tool.name} on ${on_string}: FASTA assembly"/> |
72 <data name="scaffolds_agp" format="tabular" from_work_dir="out/scaffolds_FINAL.agp" label="${tool.name} on ${on_string}: agp output"/> | 80 <data name="scaffolds_agp" format="tabular" from_work_dir="out/scaffolds_FINAL.agp" label="${tool.name} on ${on_string}: agp output"/> |
73 </outputs> | 81 </outputs> |
115 <output name="scaffolds_fasta" file="out.fasta"/> | 123 <output name="scaffolds_fasta" file="out.fasta"/> |
116 <output name="scaffolds_agp" file="out.agp"/> | 124 <output name="scaffolds_agp" file="out.agp"/> |
117 </test> | 125 </test> |
118 </tests> | 126 </tests> |
119 <help><![CDATA[ | 127 <help><![CDATA[ |
120 **What is does** | 128 .. class:: infomark |
129 | |
130 **Purpose** | |
121 | 131 |
122 SALSA (Simple AssembLy ScAffolder) is a scaffolding tool based on a computational method that exploits the genomic proximity | 132 SALSA (Simple AssembLy ScAffolder) is a scaffolding tool based on a computational method that exploits the genomic proximity |
123 information in Hi-C data sets for long range scaffolding of de novo genome assemblies. | 133 information in Hi-C data sets for long range scaffolding of de novo genome assemblies. |
134 | |
135 ---- | |
136 | |
137 .. class:: infomark | |
138 | |
139 **Mapping reads** | |
140 | |
141 To start the scaffolding, first step is to map reads to the assembly. We recommend using `BWA <https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.2>`_ | |
142 or `BOWTIE2 <https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.4.2+galaxy0>`_ aligner to map reads. The read mapping generates a bam file. SALSA requires | |
143 BED file as the input. This can be done using the bamToBed command from the `Bedtools package <http://bedtools.readthedocs.io/en/latest/>`_. Also, SALSA requires BED files to be sorted by the | |
144 read name, rather than the alignment coordinates. Once you have bam file, you can run following commands to get the bam file needed as an input to SALSA. | |
145 | |
146 Since Hi-C reads and alignments contain experimental artifacts, the alignments needs some postprocessing. To align and postprocess | |
147 the alignments, you can use the pipeline released by Arima Genomics which can be found in the `GitHub repository <https://github.com/ArimaGenomics>`_. | |
148 | |
149 Additional information on how to generate/filter the bam `here <https://github.com/marbl/SALSA#mapping-reads>`_. | |
124 | 150 |
125 ]]></help> | 151 ]]></help> |
126 <citations> | 152 <citations> |
127 <citation type="doi">10.1101/261149</citation> | 153 <citation type="doi">10.1101/261149</citation> |
128 <citation type="doi">10.1186/s12864-017-3879-z</citation> | 154 <citation type="doi">10.1186/s12864-017-3879-z</citation> |
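
The `validator` element in both revisions uses the Python expression `value.replace(',', '').isalpha()` to check the restriction enzyme sequences a user enters. The sketch below wraps that same expression in a function so its behavior can be tried outside Galaxy. The function name `is_valid_enzyme_value` and the sample inputs are hypothetical and exist only for illustration; how Galaxy itself evaluates the expression is not shown here.

```python
def is_valid_enzyme_value(value: str) -> bool:
    """Apply the expression from the tool's validator to a candidate value.

    Hypothetical helper name; the body is the expression copied from the diff.
    """
    # Strip the commas that separate sequences, then require that
    # everything left is alphabetic (and that something is left at all).
    return value.replace(',', '').isalpha()


if __name__ == "__main__":
    # Illustrative inputs only; none of these come from the tool's test data.
    samples = ["GATC,AAGCTT", "GATC, AAGCTT", "GATC1", "", "GATC,,AAGCTT"]
    for sample in samples:
        print(repr(sample), is_valid_enzyme_value(sample))
```

Run this way, the expression rejects values containing spaces or digits, and it rejects an empty string because `str.isalpha()` is false when there are no characters. It does accept an empty entry between two commas and any alphabetic characters, not only nucleotide codes, so it is a character-class check and not a full check that each entry is a plausible cutting-site sequence.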