# HG changeset patch # User hyungrolee # Date 1402786814 14400 # Node ID 803c7c39993e48f4459ca9790e000a2ddfbd0897 Uploaded diff -r 000000000000 -r 803c7c39993e mgescan.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mgescan.xml Sat Jun 14 19:00:14 2014 -0400 @@ -0,0 +1,101 @@ + + + + + MGEScan + + + mgescan.sh $input '$input.name' 3 $output $program $clade $qvalue_en $qvalue_rt $ltr_gff3 $nonltr_gff3 + + + + + + + + + + + + + + program != "N" + + + program != "L" + + + program != "L" + + + program != "L" + + + program != "N" + + + program != "L" + + + + +Running the program +=================== + +To run MGEScan, select input genome data in From select box, and select program either LTR, nonLTR or both. + +Click 'Execute' button. + +If you like to have more options to run LTR or nonLTR progrma, use separated tools on the left panel. +In LTR > MGEScan-LTR, preprocessing by repeatmasker and setting other variables are available e.g. distance(bp) between LTRs. + +Output +============ +A. MGEScan_LTR: +Upon completion, MGEScan-LTR generates a file "ltr.out". This output file has information +about clusters and coordinates of LTR retrotransposons identified. Each cluster of LTR +retrotransposons starts with the head line of "[cluster_number]---------", followed by +the information of LTR retrotransposons in the cluster. The columns for LTR +retrotransposons are as follows. + +1. LTR_id: unique id of LTRs identified. It consist of two components, sequence file name and id in the file. For example, chr1_2 is the second LTR retrotransposon in the chr1 file. +2. start position of 5’ LTR. +3. end position of 5’ LTR. +4. start position of 3’ LTR. +5. end position of 3’ LTR. +6. strand: + or -. +7. length of 5’ LTR. +8. length of 3’ LTR. +9. length of the LTR retrotransposon. +10. TSD on the left side of the LTR retotransposons. +11. TSD on the right side of the LTR retrotransposons. +12. di(tri)nucleotide on the left side of 5’LTR +13. di(tri)nucleotide on the right side of 5’LTR +14. di(tri)nucleotide on the left side of 3’LTR +15. di(tri)nucleotide on the right side of 3’LTR + +B. MGEScan_nonLTR: + Upon completion, MGEScan-nonLTR generates the directory, "info" in the data directory you + specified. In this "info" directory, two sub-directories ("full" and "validation") are + generated. + + - The "full" directory is for storing sequences of elements. Each subdirectory in "full" + is the name of clade. In each directory of clade, the DNA sequences of nonLTRs identified + are listed. Each sequence is in fasta format. The header contains the position + information of TEs identified: + [genome_file_name]_[start position in the sequence] + + For example, >chr1_333 means that this element start at 333bp in the "chr1" file. + + - The "validation" directory is for storing Q values. In the files "en" and "rt", the first column corresponds to the element name and the last column Q value. + +License +============ +Copyright 2014 Mina Rho, Haixu Tang. +You may redistribute this software under the terms of the GNU General Public License. + + +