annotate README.md @ 0:7db7ecc78ad6 draft

Uploaded
author damion
date Mon, 02 Mar 2015 20:46:00 -0500
# blast_reporting
NCBI BLAST+ searches can output in a range of formats, but in the past only the XML format included fields like sequence description. This tool converts the NCBI BLAST XML report into 12-, 24-, 26- or custom-column tabular and HTML reports. It can be used as a command-line tool or as a tool within the Galaxy bioinformatics platform.

The tool allows almost complete control over which fields are displayed and filtered, how columns are named, and how the HTML report on each query is sectioned. Search result records can be filtered out based on values in numeric or textual fields. Matches (by accession id) to a selection of reference databases can be shown, and this can include a description of the matched sequence.

Currently this tool only accepts as input the "Output format: BLAST XML" option of the NCBI BLAST+ search tool, triggered by (for example)

    blastn -outfmt 5 -query "...."

or via Galaxy by selecting the NCBI BLAST+ search tool's option towards the bottom of the form ...
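
To make the XML-to-tabular idea concrete, here is a minimal sketch of reading a BLAST XML (`-outfmt 5`) report with Python's standard library and flattening it into tab-separated rows that include the hit description. It is not this tool's implementation: the function names, the column selection in `COLUMNS`, the e-value cutoff and the tiny `SAMPLE` report are all made up for illustration. Only the XML element names (`Iteration`, `Hit_def`, `Hsp_evalue`, and so on) come from the BLAST XML format.

```python
import xml.etree.ElementTree as ET

# Hypothetical column selection, for illustration only.
COLUMNS = ["query", "accession", "description", "identity",
           "align_len", "evalue", "bit_score"]

# Made-up, heavily trimmed report; real reports carry many more fields.
SAMPLE = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>query1</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_def>Saccharomyces cerevisiae YJM789</Hit_def>
          <Hit_accession>AAFW02000169</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>180.5</Hsp_bit-score>
              <Hsp_evalue>1e-50</Hsp_evalue>
              <Hsp_identity>98</Hsp_identity>
              <Hsp_align-len>100</Hsp_align-len>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def blast_xml_to_rows(root):
    """Yield one dict per HSP found under a parsed BlastOutput element."""
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def", default="")
        for hit in iteration.iter("Hit"):
            accession = hit.findtext("Hit_accession", default="")
            description = hit.findtext("Hit_def", default="")
            for hsp in hit.iter("Hsp"):
                yield {
                    "query": query,
                    "accession": accession,
                    "description": description,
                    "identity": int(hsp.findtext("Hsp_identity", default="0")),
                    "align_len": int(hsp.findtext("Hsp_align-len", default="0")),
                    "evalue": float(hsp.findtext("Hsp_evalue", default="0")),
                    "bit_score": float(hsp.findtext("Hsp_bit-score", default="0")),
                }


def rows_to_tsv(rows):
    """Render rows as tab-separated text with a header line."""
    lines = ["\t".join(COLUMNS)]
    for row in rows:
        lines.append("\t".join(str(row[name]) for name in COLUMNS))
    return "\n".join(lines)


rows = list(blast_xml_to_rows(ET.fromstring(SAMPLE)))
# Example of filtering on a numeric field (the cutoff is arbitrary).
kept = [row for row in rows if row["evalue"] <= 1e-10]
print(rows_to_tsv(kept))
```

For a report saved to disk, `ET.parse(path).getroot()` would supply the `root` argument instead of `ET.fromstring`.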

## Documentation
A fairly comprehensive user guide is available in the doc/ folder.

## Installation
The tool can be installed from https://toolshed.g2.bx.psu.edu/. It draws upon the XML reports generated by the NCBI BLAST+ tools.

Setting up Reference Bins and the Selectable HTML Report is optional, as described below.

### Using "Reference Bins"
- A reference bin file is simply a text file in which each line is a record containing an accession id and a description. The accession id is cross-referenced with the accession id returned with each search hit. However, the BLAST reporting tool has to be told where these files are: their names and paths are listed in fasta_reference_dbs.loc.sample, which ends up as the Galaxy install's tool-data/fasta_reference_dbs.loc file.
Example:

```
AADS00000000.1 Phanerochaete chrysosporium RP-78
AAEW02000014.2 Desulfuromonas acetoxidans DSM 684
AAEY01000007.0 Cryptococcus neoformans var. neoformans B-3501A
AAFI01000166 Dictyostelium discoideum AX4
AAFW02000169.3 Saccharomyces cerevisiae YJM789
```

Both the search result hit and the reference file accession ids are stripped of any fractional component (the part after the ".", so AADS00000000.1 becomes AADS00000000) before being compared.
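
The comparison described above can be sketched roughly as follows. This is an illustration rather than the tool's own code: the function names are hypothetical, and it assumes that the accession id and the description are separated by the first run of whitespace on each line (the real files may use a specific delimiter such as a tab).

```python
# Records taken from the example reference bin file above.
EXAMPLE_BIN = """AADS00000000.1 Phanerochaete chrysosporium RP-78
AAEW02000014.2 Desulfuromonas acetoxidans DSM 684
AAEY01000007.0 Cryptococcus neoformans var. neoformans B-3501A
AAFI01000166 Dictyostelium discoideum AX4
AAFW02000169.3 Saccharomyces cerevisiae YJM789"""


def strip_version(accession):
    """Drop the fractional part: 'AADS00000000.1' -> 'AADS00000000'."""
    return accession.strip().split(".", 1)[0]


def load_reference_bin(lines):
    """Map each version-less accession id to its description."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the first whitespace run separates id from description.
        parts = line.split(None, 1)
        description = parts[1] if len(parts) > 1 else ""
        table[strip_version(parts[0])] = description
    return table


def lookup(hit_accession, table):
    """Return the description for a search hit, or None if it is not in the bin."""
    return table.get(strip_version(hit_accession))


reference_bin = load_reference_bin(EXAMPLE_BIN.splitlines())
# A hit reported with a different version suffix still matches its record.
print(lookup("AAFW02000169.1", reference_bin))
```

Stripping both sides before the lookup is what lets a record stored without a version, such as AAFI01000166, match a hit that carries one.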

### Using the "Selectable HTML Report"
- This is EXPERIMENTAL because it currently requires the "Select subsets" Galaxy tool, plus a bit of extra setup that might have to be redone as Galaxy evolves:
  - In Galaxy, install and run the "Select subsets" tool from https://toolshed.g2.bx.psu.edu/.
  - Use your browser's "View frame source" option while the mouse is over the "Select subsets" form.
  - Scroll down to the `<input type="hidden" name="tool_state" value="...">` element and copy the numeric value string into a new text file.
  - Save the text file with the name "html_selectable_report_tool_state" in the tool's templates/ subfolder. It should sit alongside the html_selectable_report.py script, which reads it.
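
If copying the value by hand proves error-prone, the last two steps could be scripted along these lines once the frame source has been saved to a file. This is only a sketch under assumptions: the helper names are hypothetical, the sample markup and its value are made up, and whether the tool expects the file to contain exactly the raw value (no trailing newline) has not been verified.

```python
from html.parser import HTMLParser
from pathlib import Path


class ToolStateFinder(HTMLParser):
    """Remember the value of the first hidden input named 'tool_state'."""

    def __init__(self):
        super().__init__()
        self.tool_state = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.tool_state is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("type") == "hidden"
                and attributes.get("name") == "tool_state"):
            self.tool_state = attributes.get("value")


def extract_tool_state(frame_source):
    """Return the tool_state value found in the HTML text, or None."""
    finder = ToolStateFinder()
    finder.feed(frame_source)
    return finder.tool_state


def save_tool_state(value, templates_dir):
    """Write the value to templates_dir/html_selectable_report_tool_state."""
    target = Path(templates_dir) / "html_selectable_report_tool_state"
    target.write_text(value)
    return target


# Made-up frame source; a real form contains much more markup.
SAMPLE_SOURCE = '<form><input type="hidden" name="tool_state" value="1234567890"></form>'
state = extract_tool_state(SAMPLE_SOURCE)
print(state)
```

Calling `save_tool_state(state, "templates")` from the tool's folder would then place the file next to html_selectable_report.py.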

## Development notes
- A few changes are in the works: a Galaxy form tool fix scheduled for the next month will make setting up reference databases much easier. You will only have to load each reference bin file into a Galaxy data library that you set up.