view README.rdoc @ 2:e16016635b2a

Uploaded
author timpalpant
date Mon, 13 Feb 2012 22:12:06 -0500

= Java Genomics Toolkit

This is a collection of applications for processing genomics data, primarily from high-throughput next-generation sequencing. The focus is on data in Wiggle format: many other tools already cover SAM, BAM, FastQ, etc., whereas the Wiggle/BigWig formats provide a compact way to store the numerical data that result from ChIP-seq and MNase-seq experiments. Common computations provided in this toolkit include adding, subtracting, dividing, multiplying, log-transforming, averaging, Z-scoring, and Gaussian smoothing of Wig files.
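
To make the Z-scoring operation concrete, here is a minimal sketch on a plain in-memory array rather than a Wig file. It does not use any of the toolkit's classes: the class +ZScoreSketch+ and its method are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, not a statement about how the toolkit computes it.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the toolkit. */
public class ZScoreSketch {

    /**
     * Returns (x - mean) / stdev for each value.
     * Assumption: stdev is the population standard deviation (divide by n).
     * A constant input has stdev 0, so every result is NaN.
     */
    public static double[] zScore(double[] values) {
        int n = values.length;

        // First pass: mean of all values.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;

        // Second pass: sum of squared deviations from the mean.
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - mean) * (v - mean);
        }
        double stdev = Math.sqrt(sumSq / n);

        // Standardize each value.
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            result[i] = (values[i] - mean) / stdev;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] z = zScore(new double[] {2.0, 4.0, 6.0});
        System.out.println(Arrays.toString(z));
    }
}
```

After this transformation the values have mean 0 and unit standard deviation, which is what makes Z-scored tracks from different experiments comparable on one scale.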

Tools may be run from the command line, from a simple Swing GUI, or from Galaxy (http://getgalaxy.org).

== Loading the Tools into Galaxy

TODO

== Using the ToolRunner GUI

TODO

== Command-Line Usage

Applications can be run from the command line; the toolRunner.sh script is provided for convenience. Running any tool without arguments displays its usage message, including the mandatory arguments it expects:

  $ ./toolRunner.sh wigmath.AddWig
  Usage: <main class> [options] Input files
    Options:
    * -o, --output   Output file

Mandatory arguments are marked with an asterisk (*).

Other tools require more input:

  $ ./toolRunner.sh ngs.Autocorrelation
  Usage: <main class> [options]
    Options:
    * -i, --input    Input file
    * -l, --loci     Genomic loci (Bed format)
    -m, --max        Autocorrelation limit (bp)
                     Default: 200
    * -o, --output   Output file
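
For readers unfamiliar with the computation, the sketch below shows one common definition of a normalized autocorrelation up to a maximum lag, which is the role the <tt>--max</tt> limit plays in the usage above. This is an assumption-laden illustration, not the tool's implementation: the class +AutocorrelationSketch+ is hypothetical, it works on a single in-memory array instead of Bed-defined loci, and the normalization actually used by ngs.Autocorrelation may differ.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the toolkit. */
public class AutocorrelationSketch {

    /**
     * Normalized autocorrelation for lags 0..maxLag (inclusive), capped at n - 1.
     * r[k] = sum_i (x[i] - mean) * (x[i+k] - mean) / sum_i (x[i] - mean)^2
     * Assumption: this estimator is chosen for illustration; r[0] is always 1
     * unless the signal is constant, in which case every entry is NaN.
     */
    public static double[] autocorrelation(double[] x, int maxLag) {
        int n = x.length;

        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;

        // Denominator: total squared deviation (the lag-0 sum).
        double variance = 0.0;
        for (double v : x) {
            variance += (v - mean) * (v - mean);
        }

        // A lag of n or more has no overlapping positions, so cap it.
        int lags = Math.min(maxLag, n - 1);
        double[] r = new double[lags + 1];
        for (int k = 0; k <= lags; k++) {
            double sum = 0.0;
            for (int i = 0; i + k < n; i++) {
                sum += (x[i] - mean) * (x[i + k] - mean);
            }
            r[k] = sum / variance;
        }
        return r;
    }

    public static void main(String[] args) {
        // An alternating signal is anti-correlated at odd lags.
        double[] r = autocorrelation(new double[] {1.0, -1.0, 1.0, -1.0}, 200);
        System.out.println(Arrays.toString(r));
    }
}
```

This direct double loop costs O(n * maxLag) operations; an FFT-based approach would be the usual choice for long signals, but the loop is easier to follow.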

=== Log-transform a Wig file with base 2

  $ ./toolRunner.sh wigmath.LogTransform --input input.wig --base 2 --output output.log2.wig
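
The arithmetic behind a log transform with an arbitrary base can be sketched with the change-of-base identity log_b(x) = ln(x) / ln(b). The class +LogTransformSketch+ below is hypothetical and operates on a plain array; it is not the toolkit's code, and how wigmath.LogTransform treats zero or negative values is not covered here.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the toolkit. */
public class LogTransformSketch {

    /**
     * Returns log_base(x) for each value via ln(x) / ln(base).
     * Following Math.log, a value of 0 yields -Infinity and a negative value yields NaN.
     */
    public static double[] logTransform(double[] values, double base) {
        double logBase = Math.log(base);
        double[] result = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = Math.log(values[i]) / logBase;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] out = logTransform(new double[] {1.0, 2.0, 8.0}, 2.0);
        System.out.println(Arrays.toString(out));
    }
}
```

Because the result is computed in floating point, exact powers of the base may come out a rounding error away from the integer exponent, so comparisons should use a tolerance.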

== Java Genomics IO

Those wishing to write their own scripts may be interested in https://github.com/timpalpant/java-genomics-io, the toolkit upon which these applications are built.