annotate README @ 17:c3167ccca38c draft default tip
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit d2317dff5a89016f18038b97d057f47d949e7808-dirty
author | jjohnson |
---|---|
date | Sat, 26 Jan 2019 12:53:08 -0500 |
parents | b22f8634ff84 |
The DeFuse Galaxy tool is based on DeFuse_Version_0.6.2
http://sourceforge.net/apps/mediawiki/defuse/index.php?title=Main_Page
https://bitbucket.org/dranew/defuse

DeFuse is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired-end alignments to inform a split-read alignment analysis for finding fusion boundaries. The software also employs a number of heuristic filters in an attempt to reduce the number of false positives and produces a fully annotated output for each predicted fusion.

Manual:
http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.2

The included tool_dependencies.xml will download and install the defuse code.
It will set the environment variable "DEFUSE_PATH" to the location of the defuse install.
The tool_dependencies.xml also includes the download for bowtie.
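
As a minimal sketch of how a wrapper could use this variable, the Python below reads DEFUSE_PATH and looks for defuse.pl inside the install. The helper name find_defuse_script is hypothetical, and the scripts/ subdirectory is only an assumption about the install layout, which is why the code falls back to a recursive search.

```python
import os
from pathlib import Path
from typing import Optional


def find_defuse_script(name: str = "defuse.pl", env=None) -> Optional[Path]:
    """Return the path to `name` under DEFUSE_PATH, or None if it is not found.

    Hypothetical helper for illustration; it is not part of the Galaxy tool.
    """
    env = os.environ if env is None else env
    root = env.get("DEFUSE_PATH")
    if not root:
        return None  # DEFUSE_PATH is not set in this environment
    base = Path(root)
    # Assumed layout: scripts live in <DEFUSE_PATH>/scripts
    candidate = base / "scripts" / name
    if candidate.is_file():
        return candidate
    # Otherwise search the whole install tree
    for found in sorted(base.rglob(name)):
        if found.is_file():
            return found
    return None
```

Calling find_defuse_script() with no arguments uses the current process environment; passing env makes the lookup easy to try out without a real install.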

The defuse.pl command relies on a configuration file to specify options, the location of reference data, and the other applications that it depends upon: bowtie, bowtie-build, samtools, gmap, blat, fatotwobit, R, and Rscript.

The DeFuse Galaxy tool can either construct the config.txt file that is mentioned in the defuse manual, or use an existing config.txt file selected from the user's history.
When constructing the config.txt file, the DeFuse tool uses the values selected in: tool-data/defuse.loc
The dictionary field in tool-data/defuse.loc can be used to set fields in the config.txt file, including the site-specific location of reference data and the paths to the other application binaries.
The "Defuse parameter settings" are used to alter options in the config.txt file.
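
To illustrate how a dictionary of site-specific values could be applied to config.txt, here is a small Python sketch. It is not the tool's own code. It assumes config.txt holds one "key = value" entry per line with "#" comment lines, and it makes no assumption about how the dictionary is stored in defuse.loc. The function name apply_overrides and the example paths are hypothetical.

```python
import re


def apply_overrides(config_text: str, overrides: dict) -> str:
    """Set `key = value` entries in the config text, appending keys not present.

    Assumes one entry per line; comment lines are left untouched.
    """
    remaining = dict(overrides)
    out = []
    for line in config_text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_]+)\s*=", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append(f"{key} = {remaining.pop(key)}")
        else:
            out.append(line)
    # Keys that were not already in the file are appended at the end
    out.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    config = "# paths\nsamtools_bin = \nbowtie_bin = /old/bowtie\n"
    # Hypothetical site-specific values
    site = {"bowtie_bin": "/opt/bowtie/bowtie", "samtools_bin": "/opt/samtools/samtools"}
    print(apply_overrides(config, site), end="")
```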

The DeFuse Galaxy tool also generates a bash script to run defuse.
That script will attempt to edit the config.txt file to specify any unset paths to applications that defuse relies upon:
bowtie, bowtie-build, samtools, blat, fatotwobit, R, and Rscript
The script uses the shell "which" command to discover each application path, so the required applications should be in directories listed in the PATH environment variable.
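
The same idea can be sketched in Python with shutil.which, which performs the lookup that the shell "which" command does. This is an illustration, not the generated bash script. It assumes that "unset" means a "key = " line with an empty value; the entry names are the ones listed in the External Tools section of this README, and the function name fill_unset_paths is hypothetical.

```python
import re
import shutil

# Config entry -> executable name (entry names as used in this README)
APPLICATIONS = {
    "bowtie_bin": "bowtie",
    "bowtie_build_bin": "bowtie-build",
    "samtools_bin": "samtools",
    "blat_bin": "blat",
    "fatotwobit_bin": "faToTwoBit",
    "r_bin": "R",
    "rscript_bin": "Rscript",
}


def fill_unset_paths(config_text, applications=APPLICATIONS, which=shutil.which):
    """Fill empty `key = ` entries with the path found on PATH, if any."""
    lines = []
    for line in config_text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_]+)\s*=\s*(.*)$", line)
        if match and match.group(1) in applications and not match.group(2).strip():
            path = which(applications[match.group(1)])
            if path:  # leave the entry unset when the program is not on PATH
                line = f"{match.group(1)} = {path}"
        lines.append(line)
    return "\n".join(lines) + "\n"
```

Entries that already have a value are never overwritten, so paths set through defuse.loc take precedence over whatever happens to be on PATH.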

Generate Reference Datasets as described in the Manual:

Reference Dataset
The reference dataset setup process has been simplified as of deFuse 0.6.0, and deFuse now automatically downloads all required files.
The create_reference_dataset.pl script will download the genome and other source files, and build any derivative files including bowtie indices, gmap indices, and 2bit files. Run the following command. Expect this step to take at least 12 hours.
create_reference_dataset.pl -c config.txt
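
Because the build runs for many hours, it can help to launch it from a small wrapper that checks the config file first and keeps a log. The Python below is one possible sketch. It uses only the -c option shown above; the function names and the log file are hypothetical, and the script is assumed to be executable and reachable at the path given.

```python
import subprocess
from pathlib import Path


def reference_dataset_command(script, config):
    """Build the argument list for `create_reference_dataset.pl -c config.txt`."""
    return [str(script), "-c", str(config)]


def build_reference_dataset(script, config, log_path):
    """Run the build, sending stdout and stderr to a log file; return the exit code."""
    if not Path(config).is_file():
        raise FileNotFoundError(f"config file not found: {config}")
    with open(log_path, "w") as log:
        result = subprocess.run(
            reference_dataset_command(script, config),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```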

These datasets should be referenced in the tool-data/defuse.loc file.

The create_reference_dataset tool will run the create_reference_dataset.pl script to generate deFuse genome reference data in a Galaxy dataset.
This should be made available in the future as a Galaxy Data Manager.

Galaxy will try to auto-install dependencies:

External Tools ( http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.2 )
deFuse relies on other publicly available tools as part of its pipeline. Some of these tools are not included with the deFuse download. Obtain these tools as detailed below.
Download samtools
The latest version of samtools can be downloaded from sourceforge: https://sourceforge.net/projects/samtools/files/samtools.
Set the samtools_bin entry in config.txt to the fully qualified path of the samtools binary.
Download bowtie
The latest version of bowtie can be downloaded from sourceforge: http://sourceforge.net/projects/bowtie-bio/files/bowtie/. deFuse has been tested on version 0.12.5.
Set the bowtie_bin and bowtie_build_bin entries in config.txt to the fully qualified paths of the bowtie and bowtie-build binaries.
Download blat and faToTwoBit
The latest blat tool suite can be downloaded from the UCSC website: http://hgdownload.cse.ucsc.edu/admin/exe/. Download blat and faToTwoBit and set the blat_bin and fatotwobit_bin entries in config.txt to the fully qualified paths of the blat and faToTwoBit binaries.
Download GMAP
The latest version of GMAP can be downloaded here: http://research-pub.gene.com/gmap/. Build with a default configuration. Do not worry about the `--with-gmapdb` build flag; deFuse will request a specific directory for the database anyway.
Download R
The latest version of R can be downloaded from the R project website: http://www.r-project.org/. Install R and then locate the R and Rscript executables, and set the r_bin and rscript_bin entries in config.txt to the paths of those executables.
Install the ada package. Run R, then at the prompt type install.packages("ada")
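
After setting these entries, a quick check that each one is a fully qualified path to an executable can catch mistakes before a long run. The Python below is a sketch of such a check, assuming config.txt uses one "key = value" entry per line. The function name check_tool_paths is hypothetical.

```python
import os
import re


def check_tool_paths(config_text, keys):
    """Return {key: problem} for entries that are unset, relative, or not executable."""
    values = {}
    for line in config_text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_]+)\s*=\s*(.*?)\s*$", line)
        if match:
            values[match.group(1)] = match.group(2)
    problems = {}
    for key in keys:
        path = values.get(key, "")
        if not path:
            problems[key] = "not set"
        elif not os.path.isabs(path):
            problems[key] = "not a fully qualified path"
        elif not (os.path.isfile(path) and os.access(path, os.X_OK)):
            problems[key] = "no executable at this path"
    return problems
```

An empty result means every listed entry points at an executable file; any other result names the entries that still need attention.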

defuse_trinity_analysis.py - Validating deFuse predictions using Trinity de novo assembled transcripts

DeFuse provides a total fusion sequence of 200-500 nucleotides (nts) around the fusion breakpoint. This may be insufficient to predict the effect of the fusion on protein production. To get a view of the full transcript containing the fusion, Trinity de novo transcripts from the RNA-seq data are compared with the deFuse fusion sequences using a subsequence around the deFuse-identified fusion breakpoint. The Trinity transcriptToOrfs output provides potential proteins from the projected fusion transcript.
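
The comparison can be pictured with the short Python sketch below. It is not taken from defuse_trinity_analysis.py and the script may work differently. The sketch assumes the fusion sequence marks the breakpoint with a "|" character, takes a fixed number of nucleotides on each side (the default of 12 is arbitrary), and reports Trinity transcripts that contain that window on either strand. All function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def breakpoint_window(fusion_seq, flank=12, marker="|"):
    """Return `flank` nts from each side of the breakpoint marker, joined."""
    if flank < 1:
        raise ValueError("flank must be at least 1")
    left, sep, right = fusion_seq.partition(marker)
    if not sep:
        raise ValueError("no breakpoint marker in fusion sequence")
    return left[-flank:] + right[:flank]


def transcripts_spanning_fusion(fusion_seq, transcripts, flank=12):
    """Return ids of transcripts that contain the breakpoint window on either strand."""
    window = breakpoint_window(fusion_seq, flank).upper()
    window_rc = reverse_complement(window)
    hits = []
    for name, seq in transcripts.items():
        seq = seq.upper()
        if window in seq or window_rc in seq:
            hits.append(name)
    return hits
```

Exact substring matching keeps the sketch simple; a transcript with a sequencing or assembly difference inside the window would be missed, so a real analysis may need a more tolerant comparison.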