comparison gstf_preparation.xml @ 8:92f3966d5bc3 draft
planemo upload for repository https://github.com/TGAC/earlham-galaxytools/tree/master/tools/gstf_preparation commit 88ba62ae8c3d9587a0015c72209242ad0c1df0c2
author | earlhaminst |
---|---|
date | Wed, 16 May 2018 20:03:57 -0400 |
parents | 9ef7661e8e9c |
children | e8e75a79de59 |
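
The help text revised in this changeset describes the CDS filtering in four steps: drop coding sequences whose length is not a multiple of 3, keep only transcripts present in the gene feature information, optionally keep the longest CDS per gene, and optionally rewrite headers to the >TranscriptId_species format. The following Python sketch illustrates that logic only; it is not the tool's actual implementation. The names `parse_fasta`, `filter_cds`, `transcript_to_gene` and `transcript_to_species` are hypothetical, and the two mappings are assumed to have been built beforehand from the GFF3/JSON feature data and the genome names.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_cds(records, transcript_to_gene, transcript_to_species,
               longest_only=False, rename_headers=True):
    """Apply the filters described in the help text to (header, seq) pairs.

    transcript_to_gene and transcript_to_species are hypothetical lookups
    (transcript id -> gene id, transcript id -> species/genome name).
    """
    kept = {}  # key -> (transcript_id, sequence), in first-seen order
    for header, seq in records:
        parts = header.split()
        if not parts:
            continue
        # The header is expected to start with the transcript id.
        transcript_id = parts[0]
        # Remove coding sequences whose length is not a multiple of 3.
        if len(seq) % 3 != 0:
            continue
        # Keep only transcripts present in the gene feature information.
        if transcript_id not in transcript_to_gene:
            continue
        # Group by gene when only the longest CDS per gene is wanted;
        # otherwise group by transcript (repeated ids collapse to the longest).
        key = transcript_to_gene[transcript_id] if longest_only else transcript_id
        if key not in kept or len(seq) > len(kept[key][1]):
            kept[key] = (transcript_id, seq)

    result = []
    for transcript_id, seq in kept.values():
        if rename_headers:
            name = f"{transcript_id}_{transcript_to_species[transcript_id]}"
        else:
            name = transcript_id
        result.append((name, seq))
    return result
```

Calling `filter_cds(parse_fasta(open_file), t2g, t2s, longest_only=True)` on an open CDS FASTA file would return one `(header, sequence)` pair per gene, where `t2g` and `t2s` stand for the two assumed mappings.
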
comparison
7:9ef7661e8e9c | 8:92f3966d5bc3 |
---|---|
35 <param name="genome" type="text" label="Genome name" help="Genome name without whitespaces or special characters"> | 35 <param name="genome" type="text" label="Genome name" help="Genome name without whitespaces or special characters"> |
36 <validator type="empty_field" /> | 36 <validator type="empty_field" /> |
37 </param> | 37 </param> |
38 </repeat> | 38 </repeat> |
39 <param name="json" type="data" format="json" multiple="true" optional="true" label="Gene features in JSON format generated by 'Get features by Ensembl ID' tool" /> | 39 <param name="json" type="data" format="json" multiple="true" optional="true" label="Gene features in JSON format generated by 'Get features by Ensembl ID' tool" /> |
40 <param name="fasta_inputs" type="data" format="fasta" multiple="true" label="Corresponding FASTA datasets" help="Each FASTA header line should start with a transcript id" /> | 40 <param name="fasta_inputs" type="data" format="fasta" multiple="true" label="Corresponding CDS datasets in FASTA format" help="Each FASTA header line should start with a transcript id" /> |
41 <param name="longestCDS" type="boolean" checked="false" label="Keep only the longest CDS per gene" /> | 41 <param name="longestCDS" type="boolean" checked="false" label="Keep only the longest CDS per gene" /> |
42 <param name="headers" type="boolean" checked="true" label="Change the header line of the FASTA sequences to the >TranscriptId_species format" help="As required by TreeBest, part of the GeneSeqToFamily workflow" /> | 42 <param name="headers" type="boolean" checked="true" label="Change the header line of the FASTA sequences to the >TranscriptId_species format" help="As required by TreeBest, part of the GeneSeqToFamily workflow" /> |
43 <param name="regions" type="text" optional="true" label="Comma-separated list of region IDs (e.g. chromosomes or scaffolds) for which FASTA sequences should be filtered" help="Region IDs are in the `seqid` column for GFF3 and in the `seq_region_name` field in JSON. This is typically used to filter chromosomes with a non-standard genetic code, like mitochondria, to be analysed separately" /> | 43 <param name="regions" type="text" optional="true" label="Comma-separated list of region IDs (e.g. chromosomes or scaffolds) for which FASTA sequences should be filtered" help="Region IDs are in the `seqid` column for GFF3 and in the `seq_region_name` field in JSON. This is typically used to filter chromosomes with a non-standard genetic code, like mitochondria, to be analysed separately" /> |
44 </inputs> | 44 </inputs> |
45 | 45 |
109 <![CDATA[ | 109 <![CDATA[ |
110 **What it does** | 110 **What it does** |
111 | 111 |
112 This tool converts a set of GFF3 and/or JSON gene feature information datasets into SQLite format. | 112 This tool converts a set of GFF3 and/or JSON gene feature information datasets into SQLite format. |
113 | 113 |
114 It also filters a CDS FASTA dataset to keep only the transcripts present in the gene feature information. Optionally it can also keep only the longest CDS per gene and/or change the header line of the FASTA sequences to the >TranscriptId_species format (as required by TreeBest, part of the GeneSeqToFamily workflow). | 114 It also filters the CDS FASTA datasets to: |
| | 115 |
| | 116 - remove coding sequences whose length is not a multiple of 3 |
| | 117 - keep only the transcripts present in the gene feature information. |
| | 118 |
| | 119 Optionally it can also keep only the longest CDS per gene and/or change the header line of the FASTA sequences to the >TranscriptId_species format (as required by TreeBest, part of the GeneSeqToFamily workflow). |
115 | 120 |
116 Example GFF3 file:: | 121 Example GFF3 file:: |
117 | 122 |
118 scaffold_0 MYZPE13164_Clone_G006_v1.0 gene 44968 69413 . - . ID=MYZPE13164_G006_v1.0_000000030;Name=MYZPE13164_G006_v1.0_000000030;biotype=protein_coding | 123 scaffold_0 MYZPE13164_Clone_G006_v1.0 gene 44968 69413 . - . ID=MYZPE13164_G006_v1.0_000000030;Name=MYZPE13164_G006_v1.0_000000030;biotype=protein_coding |
119 scaffold_0 MYZPE13164_Clone_G006_v1.0 mRNA 44968 69413 . - . ID=MYZPE13164_G006_v1.0_000000030.1;Parent=MYZPE13164_G006_v1.0_000000030;Name=MYZPE13164_G006_v1.0_000000030.1;biotype=protein_coding;_AED=0.31 | 124 scaffold_0 MYZPE13164_Clone_G006_v1.0 mRNA 44968 69413 . - . ID=MYZPE13164_G006_v1.0_000000030.1;Parent=MYZPE13164_G006_v1.0_000000030;Name=MYZPE13164_G006_v1.0_000000030.1;biotype=protein_coding;_AED=0.31 |