annotate fastx_barcode_splitter_galaxy_wrapper.sh @ 0:84bbf4fd24c3 draft

Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
author lparsons
date Fri, 08 Nov 2013 09:53:39 -0500
parents
children b7b3d008e2d3
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
1 #!/bin/bash
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
2
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
3 # FASTX-toolkit - FASTA/FASTQ preprocessing tools.
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
4 # Copyright (C) 2009 A. Gordon (gordon@cshl.edu)
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
5 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
6 # This program is free software: you can redistribute it and/or modify
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
7 # it under the terms of the GNU Affero General Public License as
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
8 # published by the Free Software Foundation, either version 3 of the
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
9 # License, or (at your option) any later version.
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
10 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful,
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
12 # but WITHOUT ANY WARRANTY; without even the implied warranty of
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
14 # GNU Affero General Public License for more details.
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
15 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
16 # You should have received a copy of the GNU Affero General Public License
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
17 # along with this program. If not, see <http://www.gnu.org/licenses/>.
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
18
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
19 # Modified by Lance Parsons (lparsons@princeton.edu)
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
20 # 2011-03-15 Adapted to allow galaxy to determine filetype
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
21 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
22 #This is a shell script wrapper for 'fastx_barcode_splitter.pl'
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
23 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
24 # 1. Output files are saved at the dataset's files_path directory.
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
25 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
26 # 2. 'fastx_barcode_splitter.pl' outputs a textual table.
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
27 # This script turns it into pretty HTML with working URL
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
28 # (so lazy users can just click on the URLs and get their files)
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
29
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
30 if [ "$1x" = "x" ]; then
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
31 echo "Usage: $0 [BARCODE FILE] [FASTQ FILE] [LIBRARY_NAME] [OUTPUT_PATH] [FILETYPE]" >&2
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
32 exit 1
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
33 fi
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
34
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
35 BARCODE_FILE="$1"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
36 FASTQ_FILE="$2"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
37 LIBNAME="$3"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
38 OUTPUT_PATH="$4"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
39 FILETYPE="$5"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
40 shift 5
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
41 # The rest of the parameters are passed to the split program
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
42
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
43 if [ "${OUTPUT_PATH}x" = "x" ]; then
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
44 echo "Usage: $0 [BARCODE FILE] [FASTQ FILE] [LIBRARY_NAME] [OUTPUT_PATH] [FILETYPE]" >&2
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
45 exit 1
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
46 fi
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
47
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
48 #Sanitize library name, make sure we can create a file with this name
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
49 LIBNAME=${LIBNAME%.gz}
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
50 LIBNAME=${LIBNAME%.txt}
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
51 LIBNAME=$(echo "$LIBNAME" | tr -cd '[:alnum:]_')
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
52
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
53 if [ ! -r "$FASTQ_FILE" ]; then
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
54 echo "Error: Input file ($FASTQ_FILE) not found!" >&2
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
55 exit 1
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
56 fi
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
57 if [ ! -r "$BARCODE_FILE" ]; then
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
58 echo "Error: barcode file ($BARCODE_FILE) not found!" >&2
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
59 exit 1
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
60 fi
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
61 mkdir -p "$OUTPUT_PATH"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
62 if [ ! -d "$OUTPUT_PATH" ]; then
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
63 echo "Error: failed to create output path '$OUTPUT_PATH'" >&2
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
64 exit 1
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
65 fi
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
66
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
67 PUBLICURL=""
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
68 BASEPATH="$OUTPUT_PATH/"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
69 #PREFIX="$BASEPATH"`date "+%Y-%m-%d_%H%M__"`"${LIBNAME}__"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
70 PREFIX="$BASEPATH""${LIBNAME}_"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
71 SUFFIX="_visible_$FILETYPE"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
72 DIRECTORY=$(cd `dirname $0` && pwd)
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
73
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
74 RESULTS=`gzip -cdf "$FASTQ_FILE" | $DIRECTORY/fastx_barcode_splitter.pl --bcfile "$BARCODE_FILE" --prefix "$PREFIX" --suffix "$SUFFIX" "$@"`
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
75 if [ $? != 0 ]; then
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
76 echo "error"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
77 fi
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
78
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
79 #
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
80 # Convert the textual tab-separated table into simple HTML table,
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
81 # with the local path replaces with a valid URL
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
82 #HTMLSUMMARY=${PREFIX}stats_visible_html
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
83 echo "<html><body><table border=1>"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
84 echo "$RESULTS" | sed -r "s|$BASEPATH(.*)|\\1|" | sed '
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
85 i<tr><td>
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
86 s|\t|</td><td>|g
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
87 a<\/td><\/tr>
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
88 '
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
89 echo "<p>"
84bbf4fd24c3 Initial toolshed version with support for separate index reads and automatic loading of results into Galaxy history.
lparsons
parents:
diff changeset
90 echo "</table></body></html>"