comparison filter_assembly.xml @ 1:a83562c0719f draft
planemo upload for repository https://github.com/abims-sbr/adaptsearch commit 68979144b9949c27bcc3340a9e8375de1391526c
| author | abims-sbr |
|---|---|
| date | Mon, 03 Feb 2025 14:37:31 +0000 |
| parents | 7a813e633d1c |
| children | 000dbfafe31d |
| 0:7a813e633d1c | 1:a83562c0719f |
|---|---|
| 1 <tool name="Filter assemblies" id="filter_assemblies" version="2.0.3"> | 1 <tool name="Filter assemblies" id="filter_assemblies" version="2.0.4"> |
| 2 | 2 |
| 3 <description> | 3 <description> |
| 4 Filter the outputs of Velvet or Trinity assemblies | 4 Filter the outputs of Velvet or Trinity assemblies |
| 5 </description> | 5 </description> |
| 6 | 6 |
| 7 <macros> | 7 <macros> |
| 8 <import>macros.xml</import> | 8 <import>macros.xml</import> |
| 9 </macros> | 9 </macros> |
| 10 | 10 |
| 11 <requirements> | 11 <requirements> |
| 12 <expand macro="python_required" /> | 12 <expand macro="python3_required" /> |
| 13 <requirement type="package" version="0.0.14">fastx_toolkit</requirement> | |
| 14 <requirement type="package" version="10.2011">cap3</requirement> | 13 <requirement type="package" version="10.2011">cap3</requirement> |
| 15 </requirements> | 14 </requirements> |
| 16 | 15 |
| 17 <command> | 16 <command> |
| 18 <![CDATA[ | 17 <![CDATA[ |
| 21 ln -s '$input' '$input.element_identifier'; | 20 ln -s '$input' '$input.element_identifier'; |
| 22 #set $infiles = $infiles + $input.element_identifier + "," | 21 #set $infiles = $infiles + $input.element_identifier + "," |
| 23 #end for | 22 #end for |
| 24 #set $infiles = $infiles[:-1] | 23 #set $infiles = $infiles[:-1] |
| 25 | 24 |
| 26 ln -s '$__tool_directory__/scripts/S02a_remove_redondancy_from_velvet_oases.py' . && | |
| 27 ln -s '$__tool_directory__/scripts/S02b_format_fasta_name_trinity.py' . && | |
| 28 ln -s '$__tool_directory__/scripts/S03_choose_one_variants_per_locus_trinity.py' . && | |
| 29 ln -s '$__tool_directory__/scripts/S04_find_orf.py' . && | |
| 30 ln -s '$__tool_directory__/scripts/S05_filter.py' . && | |
| 31 | |
| 32 python '$__tool_directory__/scripts/S01_script_to_choose.py' | 25 python '$__tool_directory__/scripts/S01_script_to_choose.py' |
| 33 | 26 |
| 34 '$infiles' | 27 '$infiles' |
| 35 $length_seq_max | 28 $length_seq_max |
| 36 $percent_identity | 29 $percent_identity |
| 37 $overlap_length | 30 $overlap_length |
| 38 > ${log} | 31 > '${log}' |
| 39 ]]> | 32 ]]> |
| 40 </command> | 33 </command> |
| 41 | 34 |
| 42 <inputs> | 35 <inputs> |
| 43 <param name="inputs" type="data" format="fasta" multiple="true" label="Input files" /> | 36 <param name="inputs" type="data" format="fasta" multiple="true" label="Input files" /> |
| 104 | 97 |
| 105 <![CDATA[ | 98 <![CDATA[ |
| 106 | 99 |
| 107 **Description** | 100 **Description** |
| 108 | 101 |
| 109 This tool reformats Velvet Oases or Trinity assemblies for the AdaptSearch galaxy suite and selects only one variant per gene according to its length and quality check. | 102 This tool runs the CAP3 software on assembly FASTA data, merge singlets and contigs and then reformat headers to allow any assembly tools. |
| 110 | 103 |
| 111 --------- | 104 --------- |
| 112 | 105 |
| 113 **Input format** | 106 **Input format** |
| 114 | 107 |
| 115 (1) Sequences are in the sequential format: | 108 Sequences are in the FASTA format: |
| 116 | 109 |
| 117 | >seqname1 | 110 | >seqname1 |
| 118 | AAAGAGAGACCACATGTCAGTAGC -on one or several lines - | 111 | AAAGAGAGACCACATGTCAGTAGC -on one or several lines - |
| 119 | >seqname2 | 112 | >seqname2 |
| 120 | AAGGCCTGACCACATGAGTTAAGC -on one or several lines - | 113 | AAGGCCTGACCACATGAGTTAAGC -on one or several lines - |
| 121 | etc ... | 114 | etc ... |
| 122 | | 115 | |
| 123 | |
| 124 2) The file name should begin with a two letter abbreviation of the species name (for isntance, 'Ap' if the species is Alvinella pompejana). | |
| 125 | |
| 126 **For Velvet Oases assemblies input** | |
| 127 | |
| 128 The headers must be as follow : *>Locus_i_Transcript_i/j_Confidence_x.xxx_Length_N* where i is the locus number, j the transcript variant among all versions of the transcript, x.xxx the confidence value and N the length. | |
| 129 | |
| 130 **For Trinity assemblies inputs** | |
| 131 | |
| 132 The headers must be as follow : *>cj_gj_ij Len=j path=[j:0-j]* where all the j are integers (locus number, transcript variant, length, position...) | |
| 133 | |
| 134 **The tool handles the case if input files come from both assemblers (there is no need for input files to be exclusively from one or another assembler).** | |
| 135 | 116 |
| 136 --------- | 117 --------- |
| 137 | 118 |
| 138 **Parameters** | 119 **Parameters** |
| 139 | 120 |
| 148 --------- | 129 --------- |
| 149 | 130 |
| 150 **Steps**: | 131 **Steps**: |
| 151 | 132 |
| 152 The tool: | 133 The tool: |
| 153 1) Modifies the sequence name to add the species abbreviation using the 2 first letters of the name of the transcriptome file : note that each species abbreviation must be unique | 134 1) Performs a CAP3 from the full set of ORFs to minimize redundancy |
| 154 2) Selects one allelic sequence from each transcript (c or locus) using the length of the sequence and its level of confidence | 135 2) Merges singlets and contigs identified by CAP3 |
| 155 3) Selects the best ORF from the sequence between two stop codons | 136 3) Reformats headers of the FASTA records by adding a specified prefix (defined from the original filename) and ensures that sequences are on a single line |
| 156 4) Performs a CAP3 from the full set of ORFs to minimize redundancy | |
| 157 5) Retrieves the initial transcript sequences from the remaining set of proceeded ORF sequences | |
| 158 | 137 |
| 159 **Outputs** | 138 **Outputs** |
| 160 | 139 |
| 161 - 'Filter Assemblies Summary' : the log file. | 140 - 'Filter Assemblies Summary' : the log file. |
| 162 - 'Filter Assemblies outputs' : the main results. | 141 - 'Filter Assemblies outputs' : the main results. |
| 170 --------- | 149 --------- |
| 171 | 150 |
| 172 Changelog | 151 Changelog |
| 173 --------- | 152 --------- |
| 174 | 153 |
| 154 | |
| 155 **Version 2.2 - 07/10/2024** | |
| 156 | |
| 157 - Input files can be from any assembly tools | |
| 158 | |
| 175 **Version 2.1 - 15/01/2018** | 159 **Version 2.1 - 15/01/2018** |
| 176 | 160 |
| 177 - Input files can be a mix from files coming either from Trinity or Velvet Oases assemblers | 161 - Input files can be a mix from files coming either from Trinity or Velvet Oases assemblers |
| 178 | 162 |
| 179 **Version 2.0 - 14/04/2017** | 163 **Version 2.0 - 14/04/2017** |
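
The revised help text in revision 1 describes merging the singlets and contigs identified by CAP3, then reformatting the FASTA headers with a prefix and writing each sequence on a single line. The Python sketch below illustrates only that merge-and-reformat idea under stated assumptions; it is not the code of `S01_script_to_choose.py`. The function names, the `prefix_name` header layout, and the choice to keep only the first token of the old header are all hypothetical, and the prefix is passed in explicitly because the changeset does not show how it is derived from the filename.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_and_rename(
    contigs: Iterable[str], singlets: Iterable[str], prefix: str
) -> List[str]:
    """Concatenate contig and singlet records and prefix their headers.

    Assumption: the new header is '<prefix>_<first token of old header>'.
    The real tool may use a different layout.
    """
    records = list(parse_fasta(contigs)) + list(parse_fasta(singlets))
    out: List[str] = []
    for index, (old_header, sequence) in enumerate(records, start=1):
        # Fall back to a numbered name if the original header is empty.
        name = old_header.split()[0] if old_header.strip() else f"seq{index}"
        out.append(f">{prefix}_{name}")
        out.append(sequence)  # one line per sequence
    return out


if __name__ == "__main__":
    # Hypothetical inputs standing in for CAP3 contig and singlet files.
    contigs = [">Contig1", "AAAG", "AGAG"]
    singlets = [">seq7 extra", "AAGG", "", "CCTG"]
    print("\n".join(merge_and_rename(contigs, singlets, "Ap")))
```

Taking iterables of lines rather than file paths keeps the sketch independent of CAP3's output file naming, so it can be fed from open file handles or in-memory lists alike.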
