changeset 0:816bedbb305c draft
"planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/get_reference_fasta commit ff0b4efcce46f70bb06dcb1b7200c348959d1911"
author | artbio |
---|---|
date | Tue, 07 Jan 2020 03:36:07 -0500 |
parents | |
children | 98211bfc53fc |
files | get_reference_fasta.xml test-data/EcR_USP_224.fa test-data/all_fasta.loc tool-data/all_fasta.loc.sample tool_data_table_conf.xml.sample tool_data_table_conf.xml.test |
diffstat | 6 files changed, 160 insertions(+), 0 deletions(-) |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/get_reference_fasta.xml Tue Jan 07 03:36:07 2020 -0500
@@ -0,0 +1,27 @@
+<tool id="get_fasta_reference" name="get fasta reference" version="0.2.0">
+<description>Obtain reference genome sequence</description>
+    <stdio>
+        <exit_code range="1:" />
+    </stdio>
+    <command><![CDATA[
+        cat "$pre_installed_fasta.fields.path" > "$output"
+    ]]></command>
+    <inputs>
+        <param help="if you wish to have your fasta sequence listed contact instance administrator" label="Select a fasta sequence" name="pre_installed_fasta" type="select">
+            <options from_data_table="all_fasta"/>
+        </param>
+    </inputs>
+    <outputs>
+        <data name="output" label="${pre_installed_fasta.value_label}" format="fasta" />
+    </outputs>
+    <tests>
+        <test>
+            <param name="pre_installed_fasta" value="EcR_USP_224.fa"/>
+            <output name="output" file="EcR_USP_224.fa"/>
+        </test>
+    </tests>
+    <help><![CDATA[
+Places the reference genome sequence in the current history.
+Useful for sharing purposes or tools that work directly on fasta files.
+    ]]></help>
+</tool>
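The `<command>` above amounts to looking up the selected entry's `path` column and copying that file to the output dataset. A rough Python equivalent is sketched below. It assumes the `all_fasta` table has already been reduced to a plain dict mapping each `value` to its `path`; the function name `get_reference_fasta` and the demo file names are hypothetical, and this is not how Galaxy itself resolves `$pre_installed_fasta.fields.path`.

```python
import shutil
import tempfile
from pathlib import Path


def get_reference_fasta(selected: str, table: dict, output: Path) -> None:
    """Copy the FASTA registered under `selected` to `output`.

    Mirrors `cat "<path>" > "<output>"`. `table` is a hypothetical
    value -> path mapping standing in for the all_fasta data table.
    """
    source = Path(table[selected])  # raises KeyError for an unknown value
    with source.open("rb") as src, output.open("wb") as dst:
        shutil.copyfileobj(src, dst)


# Demonstration with a throwaway FASTA file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "demo.fa"
    fasta.write_text(">demo\nACGT\n")
    out = Path(tmp) / "output.dat"
    get_reference_fasta("demo.fa", {"demo.fa": str(fasta)}, out)
    copied = out.read_text()
```

The copy is byte-for-byte, which matches the tool's test: the output is compared against the same `EcR_USP_224.fa` file that the data table points to.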
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/EcR_USP_224.fa Tue Jan 07 03:36:07 2020 -0500 @@ -0,0 +1,101 @@ +>3L:3245372,3251371 +AGGAGGCTATGTTTCTTAATGATGAACTGGTATATTATATTTTCGAAACTTTCATTTAAT +GTAAAAACACTGTTATTAGTAAATGGAATCTTCCATAACAGGGTCCACCCCAATCGATCA +TCCCACTCATTAGGTTTCTTCCTCGTCTGAGAGGACCAGAGGTCCCGAGGCTTGCGTTTT +TGCGGTGACTGTGGCGTCTGCTGGGTGTCAACATTTAGCCGCATGTTGTGCGGCTGCCAC +AACTGCAGTTCCCCAATTCTGGATAAAAAAAAAAAAACGCCGTCAAACAGAAGGGTGAAA +TTAACTTCGGAGTCGGACGAGGAGGAGTGCTTAGCACCCGCATCCAGGCACATGTGCTGT +TCTTCTCGACATGCGACATTCGTACAGTCGCCGAGTCAATCAGACCGGAAAATATGCACA +AATTCCAATGGCTTTTGTCATGTTGCAAGGCGATTGTGACTACATTTTTGCCATTTCTAT +TTCCCAAATAGCGGCTAAAACAGTTGTGGCCATTAACTTAGTGCTATGAATGTAGTTACT +GTTCTTGTGAATTGCTTAATTTCCCTTTCAATCATCTAAAATAAATAAATACAAATGAAT +GAACCAAAGTCATTACCCTGCCTAAAACACATTTTCTTCCCTTAGAACTTTGTACCCCTT +TTATATTTCGAATTGCAAATGGAAGGGTGAAATTAGCGGCAAGTGGAGCTATTTTCCGCG +CGGTGTATGGTTATCACAACGCTTATCTCACTACGCGATTGTCGGAACCTGTAAACGCGA +TTTTGTCAGCCAGGTTTTTTTGGCTGCCAGCACTTGACGTGTGTTAAATTAGCATTAAAC +AATAATTTTGTGCCTGTTTTTCTTTTTTTGGCTTTGCCGCTGTTGTTGTTGCTGTTACTG +TTGTGGTGAATGCAAAATAAATTGCGCATAAAGTTTAATCACTTTGACTTTGCACAACAC +ACACGCACGCACACACACTTGCAATGCCAAAAAATAAAAACGCACAACAAAAGATTCTCC +ACACACAGATACACAGATACTGAATACAATATCGGCAGCAGCAGCGCAAACCAAAACAAC +AACGACGAGCCCAGTGGACAGTGCAAAATAAGTATAAAAATAAATAAGAAAAATTAAAAA +AAAAGGAAATAAATAAAAAAATACACAAGGCGAAGGCGACGATGGCAACAGGCAGAGCGA +GCGGGATGCAAATAGAGTCAACAACCTCCAGTGCATTACTCACTTTTAAAGACCGTGTCT +GTCCTTCAACGAAAACTTTATCTCTGTCCCTCACTCGCTCGCTCTGCATCTGCATCTCAG +CTTTTGCTCCCTCTCTCTCTCGCTCGCTCTCACTCGAGCAGCCCGATTCGTTTTTTACTT +CTTATAATTAAACAATTTTGCAGCGGCACTCCCCCAGCGCCCCCATTTCGTACTCCCCCC +GCCGTCTTCATCTTTTTCCGGACAAACAGAACCCGAAAAAGTGATCTTGCTGCACGGAAA +GAAATCGAATCGTTGTCTAACAATAGAAATGCTGCTTTATAAGGCAACCAATTGAAGTCT +TTCGTTCTTCAAAACCTAATTCAAAGATTAACCACTTTTTTGCTATATCTATCCTTAAGC +TTTAAAATTCGAATAGTAATAGGATTTAACTCTATAGTAGTAGTATCTTATCATATTATC +AGTACGATTTTTCCCAAAGTGCCATTGTTTTTGGTCACTATGGTCTTGTTCTTGTTCCCG 
+GAGATTTGCACAAGTGTGCAGAAGACAGTCTTCTATCTCCACTTTATCGATGGGGCTTCG +GAAAGTTTGTCGATTCCGCTGCTGCGCATTTTGGCGAGATGCGAGATGAATAAACTGTTG +TACGCTTGCGTGACCTGTTCGCCATCTCGGTTGCTCCTCCTGCGCCCTCTTTTCCTCTCT +CTACCCCCCTCCCCCACGCAAAGGAAAGAGACGGAGCGATTGCCACATCCGCCCTGCATG +TTTGTGCGCTTTTTGTGCGACTAATGTCATTGACTGCAATTTGTTAGCAAAGTATTTCGA +CTGATAAACAAAATCTGCACGATGCCAATGAACCCGGCTCTCTAAAATTGCCATCCGAAA +GCCAAAACAACCCGAAATTGCAATTCGCGCCCCAATTGGAGAGCAACTAGGTAGGCGTGT +GTAACAGAGATAGCAATCGGGCCTTGCACTCACACTCCCTAAGGAGCAGAGGTGGATTCT +AACGGGGATTTGACGGAACGCGAATTCTTTAGCATTCTATATCTGCACCTTATAAGAATT +TCCACTGAGTTCTAAGTTGAGATTTCATAATATTTAGTATTTTAACTATGTTTTTTCGTA +TAAGTTTTGTTATTACTCTGCTATGACTTCCATAACCCCTTTTTAGAAGTGCTTTTCCTT +ATCCGCCTCTGCACATGAGCACAGGTTGGCAATGCCATAAACAAAGAATTCCTTTTGTTT +GTTGATTCGATTTTTGTTCTGCGATCTTTTTATTTCTTTGCAAATTGTATTTTTATTTTT +AAATAAACAAGCCGAGTTCATTGCATTCGCCAGAATAAGAATATAACAAATACGGCACGA +AAAGCACTCGACAACCGACAAAAGGCGCAGAAAAACAGGAACGTCGACTGACATACATGG +CGTATAATTAACGGCTGCGCGTGTAGAGAGAGTTCAAGTTACTTTATCAATTCTTTCTTT +TTCGGGACCTAACAATACTCATACTTGCACTTAAGTAGGCGGAGTGAAAGCCAAGTCATA +ATTTCGACGATGCGTATACATATATAGAATCAATCAACTGATTAATTGCAGCTGTGCAAC +GCTTGAGTTTTTGCCTCAGCCTTTCGTCTGGTGACATAGTTTACTCGATTAATTATGGTA +AGTAATAAGGGTTTAAAATAATTGAAAACCCCAGCACTTGCGTGTATTATATATAACGAT +GATTTAACAGCACCTCCTTTATAAATAAAACCAATCCCTTCAAGTGCGAACAGCTATGTT +TTCCGCTCATCTGGCGCATTTATCAGATGGTGCCATATTTCCTCGGAGAAGAAGGCATTG +AATGTCAGTGGTGTTCGGATTACACTAAAGTCGGTCAATAACTTCGGACCCCTGCACAAA +GCGTTTAAGTGACCACAAGTGATCGAGATGTTCCTCTTGTTGTTTACCCCCTTGCCAACT +GATCTTAAGTTTGGGATGCCACGCTAGTTTAGTTGACCGGTTTAATGACTCGAACTTAAT +TTGCGCCCTCGGAGAGAGGAAAGTAGCCAGCAAAAAATGCAAGCCGAAAAATATGCGAAA +CAACCAGGCAGACAACACCCAACGGCAAAAACTCGGCCTGGAGAGAAAGAGGCAGTGGCA +GCGACGCGTCTGGGGGCTTACAATGGCGGTCGCAACACTAGTGGCGCTTGTAAATAGAGA +CATAACGAAAAGGTATTAAAAATGATGCGGCAAAAGAAATAACTGCACTATTTTCCATAA +AAATATTTTGAAAAATAAACTGTTGCGCCGTTTTTAGAGATCTTAAAAACCTCTTTACGT +CAACTTTGATAACTAATTGAGTTCCTTTGCATAGTTATGATTTTTAAAAATAACAATTTA +AGATACATGATATTCCCTACATGAACAATTAGTGGTTTATAATAAATAAGCAACCTAATG 
+CGTAAGATCCCACAATCTTGACCGCTACTGTGAAAAGGGGGGGATCTGCGTGAGTGTGCG +TGTATGTGGGGAGAGGGCTGCAGTGGGCGGGGCAGGCTGCAGGAAGAGCCCCCAGGCGAG +CGTGTGTATGTGAGTGGGTCGCCAAAACAGACAAAAAACGAGGAGTGCATACGAGCAGAA +GCAGCAGTGCTGCCCCAAAAGAACGATGCTCGAACCGAAAGTAACTCATATTCGCGGCTG +CGAGAGTGTGTGCGTGTGCGGCGACGGATCAGGCAAATATATAAAGCGGCGATCGGGCAA +TGCAAACTGCTCATTCCGTCGCCGTTCGTTCCGTTCGCTTTACTTTTCGTACTTTTCTCG +CATATTAAAAAGTCAAACAAAATAATAAAATGAAGCGATCGCACACATACTCACGCACAT +ACGTGTATATGTTCGTACATATATATATGTACATATGCATACATATATATGCACGCAATG +GCCGCCATTGACGCCGACTGCGCTGCCGACTGCGCTGGCGAGAGTATAAAAGCATAAAAT +CACTTCGTACTCGGGTTTATTAAAACCAAAACTGTGCAAGTGTCAAGATCGGTTAGCAGC +AGCAAAAAGATAAATAAGAAATAGCCAAGGACCCATAAAATAAATAATCTCAATACCAAA +AAGTTCTAGTGAAATTCACAATTCTGACTTGGAAAGTGAAAGTTTGGCCATTAAACGTAC +GATTAAAATCCACAACAGCACGATCAAAAATAGTATCAAGCAATTAGCGCAGAAAAAGTA +ACAAAAAAAATTTTAAAAAACAGGAACGCGACGTGCCCGCAAAAGCGAAAAAAAATTAAA +ATCGAAAGTGTTCTACTGACATGGATTACTTTTTGCCGCCCAAACTACAAACACAACAAA +ACGCGCAGAGAAACTAGCACTACTTTTTTCTATTCACCCATTCGGAGAGTGAGAAGAATC +GGAGAGAAGGAAAGAGAGCAAAGGCCGGTCCGAAATAGAAAGCACTACACTGGAAAATCT +GTTAATAAGAATGCAATGAAAGTAATACGAAACACAGATATATTTAGCTTATATTACTCT +TAAAACTCTAGAAAAATCTAGATCGGTTAATATAACAATTTTAAAATAGCTTAGTTGGCA +TCGATGTTACGGAAAAAATTTTTCCAGTTGTACTTGAAAGGCAGAAATATTTCGAGATTT +AGATTTATGAGTCTTCCTAAAGAAATTAAATTGGATAGAAAATGTCTTTTAGATATAGAA +TACAGGGTGATGCCTAATCCATTAAAATCGAGCAATCTAAAAAGTGTCATACCAGTTAGA +TTGGGTGTAACCAACGCAACGTTCTCACTGTGTAGATAGCGATCTCTTCTATTCGGCGGT +GTACGTGCACTTGGCCATTTGTCTCTCCATTCTCCATTTTTCGCGCCTCGCTCTCAATTC +TCTGCGCTCAAATCCTCTAGCAATTCTAATTCGTATTCTCGCCGCCTCGCTTTGAACTTG +AACTTTAAATGCACAAACCATAATCGTGTATGTTATGTTGTTGCTGGCCGAGGGCGTGCT +CTCGCACTCTGGCAAACATGGGCTCTACGAGTTTGCTATATATACGCAGCGCAATCAGTT +GCGAGGCAGCACTCGTTCCATGTGGGCGCTCGACAATCGCCCGCTGATCAGTTTTCGACT +GGCTTGCAATTAATTCGGCTCTTGACGAGCCCCAAAAGTGAAAGTCGCGAGTGAAAGACG +TGGCAGTTTTATATTAAAGAAAAATACGAAAACGGGCAGCAGATCAAACATGAACAGTAC +GCAAAACACGAAATGCGAAACGGCGGCAACAAGTTAATAAATTAAGACGGCAAACGAAAA +AATCCAGATTCCGAGCACTGCAAAGAAAGTGGCACAAATGCTTTGCTTTTATCGTAGGAA 
+ATTCGCAAAAAATGTACAAATAAAACGAAAGAAAAGTTGCCACTATCAAATCCCACCGTT +CTTTAACTATAGTTTCCTTCTAAATCTAGCCTCTACTAGGCTTTGTCTGTGCATTCGAAA +GCCGATCAGACATAGCCTATAAGAGGTTAGGTGTACCAAGGCGAACAATCAGCGAAAACG +GAATCGATTACAGTTTTGGAGATCGTGAGAGGAGGAGAAGAGGCGACTGCTTGATAAGCC +CGGACCCTCCAGCGATCTCCAATCAATATTACTTTCCACCTACATATCTCCCCCTTTCAG +CTGGTTTAATTTTGGATTCCCCCATCTGGCTGGCCTATTTTCGCCTGGCCTGCGTTATTT +ATTAGTTAATAAACCATTAATATATACTTGAATAAAAAGGCGTTTCTCTGATTTTTGATG \ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/all_fasta.loc Tue Jan 07 03:36:07 2020 -0500
@@ -0,0 +1,1 @@
+EcR_USP_224.fa	EcR_USP_224.fa	EcR_USP_224.fa	${__HERE__}/EcR_USP_224.fa
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/all_fasta.loc.sample Tue Jan 07 03:36:07 2020 -0500
@@ -0,0 +1,18 @@
+#This file lists the locations and dbkeys of all the fasta files
+#under the "genome" directory (a directory that contains a directory
+#for each build). The script extract_fasta.py will generate the file
+#all_fasta.loc. This file has the format (white space characters are
+#TAB characters):
+#
+#<unique_build_id>	<dbkey>	<display_name>	<file_path>
+#
+#So, all_fasta.loc could look something like this:
+#
+#apiMel3	apiMel3	Honeybee (Apis mellifera): apiMel3	/path/to/genome/apiMel3/apiMel3.fa
+#hg19canon	hg19	Human (Homo sapiens): hg19 Canonical	/path/to/genome/hg19/hg19canon.fa
+#hg19full	hg19	Human (Homo sapiens): hg19 Full	/path/to/genome/hg19/hg19full.fa
+#
+#Your all_fasta.loc file should contain an entry for each individual
+#fasta file. So there will be multiple fasta files for each build,
+#such as with hg19 above.
+#
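The `.loc` format described in the sample file (four TAB-separated columns, `#` for comments) can be read with a few lines of Python. The sketch below is illustrative only: `parse_loc` is a hypothetical helper, not part of Galaxy, and the column names are taken from the `all_fasta` table definition in this changeset.

```python
from typing import Dict, List

# Column order as declared for the all_fasta data table in this changeset.
COLUMNS = ("value", "dbkey", "name", "path")


def parse_loc(text: str, comment_char: str = "#") -> List[Dict[str, str]]:
    """Parse TAB-separated .loc content into one dict per entry."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(comment_char):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} TAB-separated fields, "
                f"got {len(fields)}: {line!r}"
            )
        entries.append(dict(zip(COLUMNS, fields)))
    return entries


# Two uncommented entries modelled on the examples in the sample file.
sample = (
    "#<unique_build_id>\t<dbkey>\t<display_name>\t<file_path>\n"
    "hg19canon\thg19\tHuman (Homo sapiens): hg19 Canonical\t/path/to/genome/hg19/hg19canon.fa\n"
    "hg19full\thg19\tHuman (Homo sapiens): hg19 Full\t/path/to/genome/hg19/hg19full.fa\n"
)
entries = parse_loc(sample)
print([entry["value"] for entry in entries])
```

Splitting on TAB only (rather than on any whitespace) matters here because display names such as `Human (Homo sapiens): hg19 Full` contain spaces.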
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.sample Tue Jan 07 03:36:07 2020 -0500
@@ -0,0 +1,7 @@
+<tables>
+    <!-- Locations of all fasta files under genome directory -->
+    <table name="all_fasta" comment_char="#">
+        <columns>value, dbkey, name, path</columns>
+        <file path="tool-data/all_fasta.loc" />
+    </table>
+</tables>
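The table definition above ties a table name to a column list, a comment character and a `.loc` file. As a hedged illustration of that structure (not Galaxy's own loader), the sketch below extracts those three pieces with the standard library; `read_table_conf` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Same structure as tool_data_table_conf.xml.sample, minus the XML comment.
CONF = """<tables>
    <table name="all_fasta" comment_char="#">
        <columns>value, dbkey, name, path</columns>
        <file path="tool-data/all_fasta.loc" />
    </table>
</tables>"""


def read_table_conf(xml_text: str) -> dict:
    """Map each table name to its columns, comment character and .loc path."""
    tables = {}
    for table in ET.fromstring(xml_text).findall("table"):
        columns_text = table.findtext("columns", default="")
        tables[table.get("name")] = {
            "columns": [col.strip() for col in columns_text.split(",")],
            "comment_char": table.get("comment_char"),
            "loc_file": table.find("file").get("path"),
        }
    return tables


conf = read_table_conf(CONF)
print(conf["all_fasta"]["columns"])
```

The `columns` order is what gives meaning to each TAB-separated field in the `.loc` file, and `path` is the column the tool's `<command>` reads through `fields.path`.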
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.test Tue Jan 07 03:36:07 2020 -0500
@@ -0,0 +1,6 @@
+<tables>
+    <table name="all_fasta" comment_char="#">
+        <columns>value, dbkey, name, path</columns>
+        <file path="${__HERE__}/test-data/all_fasta.loc" />
+    </table>
+</tables>