# HG changeset patch
# User iracooke
# Date 1402784321 14400
# Node ID 04dc24d06ddb5e56ea54ec36a21dd20543196aba
Uploaded

diff -r 000000000000 -r 04dc24d06ddb README
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README Sat Jun 14 18:18:41 2014 -0400
@@ -0,0 +1,6 @@
+This tool takes a protxml file and a reference genome and produces a gff file with genomic coordinates of all peptides in the protxml file. The mapping process relies on the presence of genomic coordinates in the protxml file, encoded in the protein names. To generate such a file you should run all searches against a database that was generated with the companion tool "proteindb_from_gff3".
+
+Requirements:
+This package uses protk which must be installed separately.
+
+For instructions please see: https://github.com/iracooke/protk/#galaxy-integration
diff -r 000000000000 -r 04dc24d06ddb README.md
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Sat Jun 14 18:18:41 2014 -0400
@@ -0,0 +1,1 @@
+# This is my README
diff -r 000000000000 -r 04dc24d06ddb protxml_to_gff.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/protxml_to_gff.xml Sat Jun 14 18:18:41 2014 -0400
@@ -0,0 +1,81 @@
+
+
+ protk
+ blast+
+
+
+ Map peptides from a protXML file to genomic coordinates
+
+
+ protxml_to_gff.rb -p $protxml_file
+
+ -g $genome_fasta_file
+
+ -d $protein_fasta_file
+
+ -o $output
+
+ --threshold $peptide_threshold
+
+ --prot-threshold $protein_threshold
+
+ $stack_charges
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+Generates a gff file containing genomic coordinates for peptides present in a protXML file.
+
+In order for this tool to work the inputs must satisfy certain requirements.
+
+1. The genome fasta should encode the scaffold numbers as in the following example
+
+>scaffoldXXX
+
+or
+
+>scaffold_XXX
+
+where XXX represent digits encoding the scaffold number. Any number of digits are allowed
+
+2. The protXML should have been generated by searching a database generated using the protk Generate 6 frame translation tool and the extract proteins from gff3 tool. Both those tools should be run with the genomics coordinates included in the output file.
+
+
+
+----
+
+**References**
+
+
+
+
+
diff -r 000000000000 -r 04dc24d06ddb repository_dependencies.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/repository_dependencies.xml Sat Jun 14 18:18:41 2014 -0400
@@ -0,0 +1,4 @@
+
+
+
+
diff -r 000000000000 -r 04dc24d06ddb tool_dependencies.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_dependencies.xml Sat Jun 14 18:18:41 2014 -0400
@@ -0,0 +1,8 @@
+
+
+
+
+
+
+
+
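
Note on the first input requirement in the tool help: genome FASTA headers must look like `>scaffoldXXX` or `>scaffold_XXX`, where XXX is any number of digits. The sketch below is one way to check a genome file against that rule before running the tool. It is not part of protk. The function names are hypothetical, and it assumes that nothing but whitespace may follow the digits on the header line, which the help text does not state.

```python
import re

# Assumption: a valid header is ">scaffold", an optional "_", then digits only.
SCAFFOLD_HEADER = re.compile(r"^>scaffold_?(\d+)\s*$")


def scaffold_number(header_line):
    """Return the scaffold number encoded in a FASTA header line, or None."""
    match = SCAFFOLD_HEADER.match(header_line)
    return int(match.group(1)) if match else None


def nonconforming_headers(fasta_lines):
    """Collect FASTA header lines that do not follow the scaffold naming rule."""
    return [
        line.rstrip("\n")
        for line in fasta_lines
        if line.startswith(">") and scaffold_number(line) is None
    ]
```

Passing an open file handle for the genome FASTA to `nonconforming_headers` returns the headers that would need renaming. An empty list means every header matches the pattern.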