annotate export_iprscan_to_Excel/readme.txt @ 0:a9762cd6e2e3 draft default tip

Uploaded
author basfplant
date Tue, 05 Mar 2013 04:00:19 -0500
Installation of iprscanToExcel
------------------------------

1) iprscanToExcel requires InterProScan and the corresponding Galaxy wrapper; it does not work without them.

2) If required, change the paths in the <command> part of the Galaxy wrapper "interproscan.xml" to the paths on your system:

${GALAXY_ROOT_DIR}/tools/iprscan/iprscanToExcel_v20.jar

3) Install iprscanToExcel_v20.jar, iprscanToExcel.props and the Galaxy XML wrapper iprscanToExcel.xml:

- Copy the wrapper file "iprscanToExcel.xml", the program "iprscanToExcel_v20.jar" and its properties file "iprscanToExcel.props" to the same directory, namely the Galaxy tools directory "iprscan": ${GALAXY_ROOT_DIR}/tools/iprscan
- Make Galaxy aware of the new tool. Galaxy reads the list of installed tools (and what to display in the left pane) from the file ${GALAXY_ROOT_DIR}/tool_conf.xml.
  Use a text editor to add a line for the iprscanToExcel.xml wrapper (next to the interproscan.xml entry), e.g. in the Sequence Annotation section:

    <label text="My Tools" id="My tools" />
    <section name="Sequence Annotation" id="sequence_annotation" >
        <tool file="iprscan/interproscan.xml" />
        <tool file="iprscan/iprscanToExcel.xml" />
    </section>

- Restart Galaxy, open it in the web browser and test the tool.
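
The installation steps above can be double-checked with a short script. This is only a sketch, not part of the distributed tool: it assumes that tool_conf.xml is well-formed XML and that the tool is registered under the path "iprscan/iprscanToExcel.xml", as in the example section above. The function names are made up for illustration.

```python
import os
import xml.etree.ElementTree as ET

# File names taken from installation step 3 above.
REQUIRED_FILES = (
    "iprscanToExcel.xml",
    "iprscanToExcel_v20.jar",
    "iprscanToExcel.props",
)


def missing_tool_files(galaxy_root):
    """Return the required files that are absent from <galaxy_root>/tools/iprscan."""
    tool_dir = os.path.join(galaxy_root, "tools", "iprscan")
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(tool_dir, name))]


def registered_tool_files(tool_conf_path):
    """Return the 'file' attribute of every <tool> element in tool_conf.xml."""
    root = ET.parse(tool_conf_path).getroot()
    return {tool.get("file") for tool in root.iter("tool")}


def check_installation(galaxy_root):
    """Collect readable problems; an empty list means both checks passed."""
    problems = ["missing file: " + name
                for name in missing_tool_files(galaxy_root)]
    tool_conf = os.path.join(galaxy_root, "tool_conf.xml")
    # Assumption: the wrapper is registered with this relative path.
    if "iprscan/iprscanToExcel.xml" not in registered_tool_files(tool_conf):
        problems.append("iprscan/iprscanToExcel.xml is not listed in tool_conf.xml")
    return problems
```

Calling check_installation("/path/to/galaxy") returns an empty list when the three files are in place and the wrapper is listed. If tool_conf.xml itself is missing, ET.parse raises an error instead.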


iprscanToExcel functionality
----------------------------

iprscanToExcel is a Java program that converts raw and/or XML output files from InterProScan to Excel format (xlsx). Three modes of operation are available: convert both the XML and the raw InterProScan output files to Excel, convert only the XML output file, or convert only the raw output file.

The XML output file of InterProScan contains the source data for the Excel sheet "summary tables". For each protein family, these summary tables list the detailed matches, the parent, the child_list, the found_in entries, the GO terms, ...

The raw output file of InterProScan contains the source data for the Excel sheet "iprscan results". This sheet holds an overview table with the columns proteinID, protein crc64, protein length, match dbname, classification id, classification description, start, end, score, status, date, interproID and interpro name, followed by n repetitions of (title, GO number, description). The columns can be sorted and filtered via the filters in the column headers.
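
To make the column list above concrete, here is a minimal Python sketch that splits one line of a raw result file into named fields. It is not the code used by iprscanToExcel (which is a Java program). The tab delimiter and the column order are assumptions; the function name and the "go" key are made up for illustration.

```python
# Column names follow the list in the paragraph above. That the raw file is
# tab-separated and uses this exact order is an assumption.
FIXED_COLUMNS = (
    "proteinID", "protein crc64", "protein length", "match dbname",
    "classification id", "classification description", "start", "end",
    "score", "status", "date", "interproID", "interpro name",
)


def parse_raw_line(line):
    """Split one raw result line into a dict keyed by column name."""
    fields = line.rstrip("\r\n").split("\t")
    fixed = fields[:len(FIXED_COLUMNS)]
    # Pad short lines so that every fixed column is present in the result.
    fixed += [""] * (len(FIXED_COLUMNS) - len(fixed))
    record = dict(zip(FIXED_COLUMNS, fixed))
    # Anything after the fixed columns is kept unparsed as GO-related text.
    record["go"] = fields[len(FIXED_COLUMNS):]
    return record
```

All values are returned as strings; converting start, end and score to numbers is left to the caller.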

The program requires raw and/or XML files to be available in the Galaxy history. These files can be generated with the tool "Interproscan functional predictions" (under the header Sequence Annotation).
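
The relation between the input files and the resulting Excel sheets can be summarised in a few lines of Python. This is only an illustration of the three modes described above. That a single-input run produces only the corresponding sheet is an assumption, and the function name is made up.

```python
def expected_sheets(xml_file=None, raw_file=None):
    """Return the Excel sheet names expected for the supplied inputs."""
    sheets = []
    if xml_file:
        # The XML output feeds the "summary tables" sheet.
        sheets.append("summary tables")
    if raw_file:
        # The raw output feeds the "iprscan results" sheet.
        sheets.append("iprscan results")
    if not sheets:
        raise ValueError("at least one of the XML or raw output files is required")
    return sheets
```

The order of the names in the returned list is arbitrary and says nothing about the order of the sheets in the workbook.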


Galaxy workflow example
-----------------------

The file "Galaxy-Workflow-Export_xml_and_raw_output_from_iprscan_to_Excel.ga" stores a workflow. In the first two sections, a sequence file (fasta) can be uploaded and all InterProScan applications are executed to generate an XML and a raw InterProScan output file. In the third section of the workflow, those two InterProScan output files are used as input for the iprscanToExcel program, resulting in an Excel file (.xlsx) with two sheets.
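
Before importing the workflow, its steps can be inspected with a few lines of Python. The sketch below assumes that the .ga file is a JSON document with a top-level "steps" object keyed by step number, where each step carries a "name" and a "tool_id"; the function name is made up for illustration.

```python
import json


def list_workflow_steps(ga_path):
    """Return (step number, step name, tool id) tuples in step order."""
    with open(ga_path) as handle:
        workflow = json.load(handle)
    # Assumption: steps are stored as {"0": {...}, "1": {...}, ...}.
    steps = workflow.get("steps", {})
    rows = []
    for key in sorted(steps, key=int):
        step = steps[key]
        rows.append((int(key), step.get("name"), step.get("tool_id")))
    return rows
```

Steps without a tool, such as input datasets, are reported with a tool id of None.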


Author and affiliation
----------------------

Katrien Bernaerts and Domantas Motiejunas
corresponding author: gb-ctk-open-source-support@basf.com
10/06/2012

CropDesign N.V., a BASF Plant Science Company - Technologiepark 3, 9052 Zwijnaarde - Belgium


Terms of use
------------
iprscanToExcel - Copyright (C) 2012 CropDesign N.V. - this software may be used, copied and redistributed, with or without modification freely, without advance permission, provided that the above Copyright statement is reproduced with each copy.
THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE (INCLUDING NEGLIGENCE OR OTHERWISE).

Excel(R) is a registered trademark of Microsoft Corporation in the United States and/or other countries.