annotate NEUMA-1.2.1/README @ 0:c44c43d185ef draft default tip

NEUMA-1.2.1 Uploaded
author chawhwa
date Thu, 08 Aug 2013 00:46:13 -0400
###########################
# README for NEUMA v1.2.1 #
###########################

## Additional information and files can be obtained from the NEUMA website, http://neuma.kobic.re.kr.
## Inquiries can be sent to duplexa@gmail.com.

# This version has removed the extra tab in the iNIR file. (version 1.2.1)
# This version has fixed the problem of missing some reads at the 5' end in paired-end data. (version 1.2.1)

# This version has the following changes. (version 1.2.0)
## The intermediate cmbt and MA files are no longer produced; mapping stat, insertlendis and read counts are computed directly from the bowtie output files. This drastically reduces memory use, run time and hard disk space.
## A subdirectory named 'readcount' will be generated in which the gNIR, iNIR and gReadcount files will be placed.
## A new option generates all read counts (not only gene-wise or isoform-wise informative reads) for each gene (--gReadcount option). It is possible to get only these numbers and not compute NIR (--noNIR), EUMA, FVKM and LVKM (--noNEUMA).
## auto_NEUMA_PE.pl can be used for the initial run only, to determine the maximum insert length (--only_init), for the after-initial run only (--skip_init), or for both (default).
## Mismatches are allowed (--mm), but the alignments are filtered to retain all best-matching alignments.


# This version can handle the newer fastq format for the paired-end case. (version 1.1.5)
# This version can handle the newer fastq format with a space in the sequence ID. (version 1.1.4)
# This version includes two distinct strand-specificity options (S for forward and R for reverse strand). (version 1.1.3)
# This version includes a script that generates merged LVKM and NIR files that can be used for diffNEUMA, to identify differentially expressed genes/isoforms. (version 1.1.2)
# This version runs dos2unix for the gene2NM and gene2symbol files in the beginning. (version 1.1.1)
# This version handles strand-specific data. (version 1.1.1)
# This version allows a multi-thread option for bowtie. (version 1.1.0)
# This version uses a different way to take arguments (usage has changed). (version 1.1.0)
# This version fixed a bug in reading the mapping stat file in the Ensembl mode. (version 1.0.5)
# This version handles Ensembl data. (version 1.0.4)
# This version handles SOLiD colorspace data. (version 1.0.3)


**** Table of contents ****

1. Installation
2. Bowtie
3. Ingredients
4. How to run
5. Output files
6. Preliminary run
7. Merging output files

***************************


1. Installation

No installation is required. Simply create a directory (e.g. neumadir) and extract the .tar.gz file into it. Then make all the Perl scripts executable with the following command:

chmod a+x *.pl


2. Bowtie

Please download and install bowtie from the bowtie website, http://bowtie-bio.sourceforge.net/index.shtml, following the developers' instructions. The reference index file must be created using the same fasta file that was used for generating the gU and iU tables (gEUMA and iEUMA tables). The fasta file can be obtained from the NEUMA website.


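As a rough sketch, the index step described in the Bowtie section might look like the following. The helper name bowtie_build_cmd and the file names are illustrative assumptions, not part of NEUMA. The function only prints a bowtie-build command so that it can be checked before being run from the bowtie directory.

```shell
# Print (do not run) a bowtie-build command for a reference fasta.
# $1 = reference fasta, $2 = index prefix (the value later given to --bi),
# $3 = "c" for a colorspace (SOLiD) index, anything else for nucleotide.
bowtie_build_cmd() {
    if [ "$3" = "c" ]; then
        printf 'bowtie-build -C %s %s\n' "$1" "$2"
    else
        printf 'bowtie-build %s %s\n' "$1" "$2"
    fi
}

# Hypothetical file names; use the fasta obtained from the NEUMA website.
bowtie_build_cmd hg19.refMrna.fa ebwt/hg19.RefmRNA n
# prints: bowtie-build hg19.refMrna.fa ebwt/hg19.RefmRNA
```

The index prefix chosen here is the string that is later passed to NEUMA as --bi.

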
3. Ingredients

Before you run NEUMA, you need the following files.

* a fasta or fastq file containing raw sequences (single-end)
  or a pair of fasta or fastq files containing raw sequences, each mate pair having the same unique ID (paired-end).
* bowtie reference index file
* gEUMA and iEUMA tables (single-end)
  or gU table and iU table (paired-end)
* gene2NM file
* gene2symbol file


Note that the gene2NM file must match the reference fasta file from which the gU and iU tables (gEUMA and iEUMA tables) were created. The gU and iU tables (gEUMA and iEUMA tables), along with the gene2NM and gene2symbol files, can be obtained from the NEUMA website.


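Before starting a long run, it can help to confirm that all of the ingredients are in place. The following is a minimal sketch, not part of NEUMA: check_inputs is a hypothetical helper, and the paths are borrowed from the usage examples in this README and should be replaced with your own. It assumes a bowtie 1 index, whose files carry .ebwt suffixes.

```shell
# Report which of the given files are missing or empty.
# Prints one "missing: <path>" line per problem; returns non-zero if any.
check_inputs() {
    n_missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            printf 'missing: %s\n' "$f"
            n_missing=$((n_missing + 1))
        fi
    done
    [ "$n_missing" -eq 0 ]
}

# Paired-end setup with illustrative paths; adjust to your own files.
utable=U.tables/hg19.refMrna.L36.D250
check_inputs MKN28.1.fa MKN28.2.fa \
    "$utable.gU.table" "$utable.iU.table" \
    ebwt/hg19.RefmRNA.1.ebwt \
    gene2NM.human.fastafiltered gene2symbol.human \
    || echo "some input files are not in place yet"
```

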
4. How to run

Two scripts, auto_NEUMA_PE.pl (paired-end) and auto_NEUMA_SE.pl (single-end), are all you need; they run the other scripts automatically.


## paired-end case
usage: ./auto_NEUMA_PE.pl [options] -L=<read_length> -D=<maxdist> -1=<input_file1(mate1)> -2=<input_file2(mate2)> -U=<Utable_prefix(fullpath, before .gU.table or .iU.table)> --g2m=<gene2NM_file> --g2s=<gene2symbol_file> -b=<bowtie_dir(eg.bin/bowtie-0.12.7)> --bi=<bowtieindex> -o=<outputdir> -s=<sample_name>

The order of arguments and options can be arbitrary.

** required arguments **
* -L=<read_length> : read_length(eg.36) : sequenced length of a read mate (/not/ the insert length or the sum of the two mate lengths) (no default : L must be specified)
* -D=<maxdist> : maxdist(eg.400) : maximum outer distance between mates (insert size). This must be identical to the maxdist used for generating the gU and iU tables. (no default : D must be specified)
* -1=<input_file1(mate1)> : fasta or fastq file of a series of read mate 1
* -2=<input_file2(mate2)> : fasta or fastq file of a series of read mate 2. For fasta files, the ID of each mate 2 must match that of mate 1.
* -U=<Utable_prefix(fullpath, before .gU.table or .iU.table)> : the path to the gU.table and iU.table files, without their extensions. If you placed your gU.table and iU.table files in /home/paired-end/Utable/ and the file names are hg19.refMrna.L36.D250.gU.table and hg19.refMrna.L36.D250.iU.table, then the value for this argument is '/home/paired-end/Utable/hg19.refMrna.L36.D250'. The files must match the read length, maxdist and reference transcriptome sequence.
* --g2m=<gene2NM_file> : gene2NM file, matched to the reference transcriptome model used for generating the gU and iU tables.
* --g2s=<gene2symbol_file> : gene2symbol file, containing at least all of the genes in the reference transcriptome sequence.
* -b=<bowtie_dir(eg.bin/bowtie-0.12.7)> : directory in which the bowtie executable is installed. Either a full path or a relative path must be used; '~' must be avoided.
* --bi=<bowtieindex> : the index file prefix of the reference transcriptome sequence, created by bowtie (bowtie-build). It is the same string that is passed as the reference index argument to the bowtie program. (see bowtie manual)
* -o=<outputdir> : a directory that will contain all the output files. If the directory does not exist, the program will create it automatically.
* -s=<sample_name> : name of the sample that will be used as the prefix of all the output files.

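The -U value is simply the table path with its extension removed. A small sketch of that rule follows; utable_prefix is a hypothetical helper, not a NEUMA script, and it relies only on standard shell suffix removal.

```shell
# Derive the -U value from the path of a gU.table or iU.table file by
# dropping the .gU.table / .iU.table extension.
utable_prefix() {
    case "$1" in
        *.gU.table) printf '%s\n' "${1%.gU.table}" ;;
        *.iU.table) printf '%s\n' "${1%.iU.table}" ;;
        *) printf 'not a U table: %s\n' "$1" >&2; return 1 ;;
    esac
}

utable_prefix /home/paired-end/Utable/hg19.refMrna.L36.D250.gU.table
# prints: /home/paired-end/Utable/hg19.refMrna.L36.D250
```
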
** options **
* -f=<file_type> : fasta(f)_or_fastq(q) : f if input files are in fasta format; q if in fastq format. (default : q)
* -c=<coding_option> : nucleotide(n)_or_colorspace(c) : n if input files are in DNA sequence (A,C,G,T); c if in colorspace (SOLiD platform). Note that if colorspace is used, the colorspace version of the bowtie index file must be used. (default : n)
* -t=<euma_cut> : EUMAcut(eg.50) : the EUMA cutoff that determines measurable genes and transcripts. (default : 50)
* -d=<data_type> : RefSeq data (R) or Ensembl data (E). The R option uses the 'NM' and 'NR' prefixes in RefSeq data to discriminate between mRNA and ncRNA. If your reference doesn't have these prefixes, use the E option. (default : R)
* -p=<num_cpu> : number of CPUs to use for bowtie. (default : 1)
* --str=<strand_specificity> : Strand-specific(S) vs Non-Strand-specific(N). For strand-specific data, strand-specific U tables must be used. (default : N)
* --mm=<number_of_mismatches_allowed> : number of mismatches allowed. (default : 0)
* --gReadcount : compute the number of all reads (not just gene-wise and isoform-wise informative reads) for each gene.
* --noNIR : do not compute NIR. (This option must be used together with --noNEUMA.)
* --noNEUMA : do not compute EUMA, FVKM and LVKM (but compute NIR unless used with --noNIR).
* --only_init : run only the initial part, bowtie mapping and insert-lendis calculation, preferably with a relatively large -D value (well above the expected maximum, eg. 1000). A more detailed description can be found below in section 6.
* --skip_init : once the maximum insert length has been decided, skip the initial bowtie mapping and insert-lendis part and use the output files from the previous run. (There is no need to run bowtie again with the new -D value because length filtering is done in the downstream steps.)


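The one stated dependency between options, that --noNIR is only valid together with --noNEUMA, can be checked before launching a run. The following is a sketch of such a check in a wrapper script; check_flags is a hypothetical helper and does not reflect how auto_NEUMA_PE.pl itself parses its arguments.

```shell
# Return 0 if the flags respect the rule stated in the option list:
# --noNIR must be accompanied by --noNEUMA.
check_flags() {
    no_nir=0
    no_neuma=0
    for arg in "$@"; do
        case "$arg" in
            --noNIR)   no_nir=1 ;;
            --noNEUMA) no_neuma=1 ;;
        esac
    done
    if [ "$no_nir" -eq 1 ] && [ "$no_neuma" -eq 0 ]; then
        echo "--noNIR must be used together with --noNEUMA" >&2
        return 1
    fi
    return 0
}

check_flags --gReadcount --noNIR --noNEUMA && echo "flags ok"
# prints: flags ok
```

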
## single-end case
usage: ./auto_NEUMA_SE.pl [options] -L=<read_length> -i=<input_file> -U=<EUMA_prefix(fullpath, before .gEUMA or .iEUMA)> --g2m=<gene2NM_file> --g2s=<gene2symbol_file> -b=<bowtie_dir(eg.bin/bowtie-0.12.5)> --bi=<bowtieindex> -o=<outputdir> -s=<sample_name>

The order of arguments and options can be arbitrary.

** required arguments **
* -L=<read_length> : read_length(eg.36) : sequenced length of a read (no default : L must be specified)
* -i=<input_file> : fasta or fastq file of a series of sequenced reads
* -U=<EUMA_prefix(fullpath, before .gEUMA or .iEUMA)> : the path to the gEUMA and iEUMA files, without their extensions. If you placed your gEUMA and iEUMA files in /home/single-end/EUMA/ and the file names are hg19.refMrna.L36.single.gEUMA and hg19.refMrna.L36.single.iEUMA, then the value for this argument is '/home/single-end/EUMA/hg19.refMrna.L36.single'.
* --g2m=<gene2NM_file> : same as paired-end case
* --g2s=<gene2symbol_file> : same as paired-end case
* -b=<bowtie_dir(eg.bin/bowtie-0.12.5)> : same as paired-end case
* --bi=<bowtieindex> : same as paired-end case
* -o=<outputdir> : same as paired-end case
* -s=<sample_name> : same as paired-end case

** options **
* -f=<file_type> : fasta(f)_or_fastq(q) : f if input files are in fasta format; q if in fastq format. (default : q)
* -c=<coding_option> : nucleotide(n)_or_colorspace(c) : n if input files are in DNA sequence (A,C,G,T); c if in colorspace (SOLiD platform). Note that if colorspace is used, the colorspace version of the bowtie index file must be used. (default : n)
* -t=<euma_cut> : EUMAcut(eg.50) : the EUMA cutoff that determines measurable genes and transcripts. (default : 50)
* -d=<data_type> : RefSeq data (R) or Ensembl data (E). The R option uses the 'NM' and 'NR' prefixes in RefSeq data to discriminate between mRNA and ncRNA. If your reference doesn't have these prefixes, use the E option. (default : R)
* -p=<num_cpu> : number of CPUs to use for bowtie. (default : 1)
* --str=<strand_specificity> : Strand-specific, forward (S), strand-specific, reverse (R), or Non-Strand-specific (N). For strand-specific data, strand-specific EUMA tables must be used. (default : N)
* --mm=<number_of_mismatches_allowed> : number of mismatches allowed. (default : 0)
* --gReadcount : compute the number of all reads (not just gene-wise and isoform-wise informative reads) for each gene.
* --noNIR : do not compute NIR. (This option must be used together with --noNEUMA.)
* --noNEUMA : do not compute EUMA, FVKM and LVKM (but compute NIR unless used with --noNIR).


Example usages are as follows:

# paired-end
neumadir/auto_NEUMA_PE.pl --only_init -f=f -L=36 -D=1000 --mm=2 -1=MKN28.1.fa -2=MKN28.2.fa -U=U.tables/hg19.refMrna.L36.D250 --g2m=gene2NM.human.fastafiltered --g2s=gene2symbol.human -b=bin/bowtie-0.12.7 --bi=ebwt/hg19.RefmRNA -o=MKN28.hg19 -s=MKN28
# This command will run the initial bowtie mapping and compute the mapping stat and the insert length distribution.
neumadir/auto_NEUMA_PE.pl --skip_init --gReadcount --noNEUMA --mm=2 -f=f -L=36 -D=250 -1=MKN28.1.fa -2=MKN28.2.fa -U=U.tables/hg19.refMrna.L36.D250 --g2m=gene2NM.human.fastafiltered --g2s=gene2symbol.human -b=bin/bowtie-0.12.7 --bi=ebwt/hg19.RefmRNA -o=MKN28.hg19 -s=MKN28
# This command will skip the bowtie mapping and insert length distribution steps, use the files from the previous (eg. above) run, and report counts of all reads (gReadcount), gNIR and iNIR. It will not compute EUMA, FVKM and LVKM.
# Only reads with insert length up to 250 will be used, although the bowtie output from the previous run contains larger insert lengths.


# single-end
auto_NEUMA_SE.pl --noNEUMA --mm=2 -L=36 -t=30 -p=2 -i=MKN28.txt -U=hg19.refMrna.L36.single --g2m=gene2NM.human.fastafiltered --g2s=gene2symbol.human -b=../bin/bowtie-0.12.5 --bi=ebwt/hg19.RefmRNA -o=MKN28.hg19 -s=MKN28
# This command will only do the bowtie mapping and report the mapping stat, gNIR and iNIR.


5. Output files

## paired-end

For RefSeq, the final log2(x+1) transformed values of FVKM (LVKM) can be found in outputdir/LVKM/samplename.ebwtname.maxinsMAX_DIST.mm0.-EUMAcut.-NR.gLVKM (gene-wise) and outputdir/LVKM/samplename.ebwtname.maxinsMAX_DIST.mm0.-EUMAcut.-NR.iLVKM (isoform-wise).

Files containing '-NR' exclude genes with no mRNA (eg. rRNA, tRNA, snRNA, snoRNA and other small RNA genes). If your sample is poly-A-selected, these RNAs cannot be quantified along with mRNAs. If your sample did not use any enrichment step and mRNAs and ncRNAs are represented in the same proportions as in the cell, you can use the LVKM files without the '-NR' tag. The ncRNAs are removed only at the last step because reads mapping to both an mRNA and an ncRNA must be considered multi-reads. For Ensembl, '-NR' versions are not provided (ncRNAs are not removed).


The unadjusted and adjusted FVKM values (as 'FVK' and 'FVKM' for before and after sample-normalization, respectively) can be found in the LVKM files as well, along with the gEUMA and iEUMA values, the number of isoforms and the number of measurable isoforms.

The mapping stat can be found in outputdir/mapping_stat.samplename.

The NIR and/or readcount values can be found in outputdir/readcount/samplename.ebwtname.maxinsMAX_DIST.mm0.gNIR, outputdir/readcount/samplename.ebwtname.maxinsMAX_DIST.mm0.iNIR and/or outputdir/readcount/samplename.ebwtname.maxinsMAX_DIST.mm0.gReadcount.

The EUMA values can be found in outputdir/EUMA/samplename.ebwtname.maxinsMAX_DIST.mm0.gEUMA and outputdir/EUMA/samplename.ebwtname.maxinsMAX_DIST.mm0.iEUMA.

The insert length distribution can be found in outputdir/insertlendis/samplename.ebwtname.maxinsMAX_DIST.mm0.i.insertlendis.

If the --mm option was used, replace 'mm0' in the above file names with the corresponding value, eg. 'mm1' or 'mm2'.


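For scripting downstream steps, the paired-end readcount file names can be assembled from the run parameters. This is a sketch under two assumptions that the text above does not spell out: that 'ebwtname' is the last path component of the --bi prefix, and that MAX_DIST is the -D value. readcount_paths is a hypothetical helper.

```shell
# Build the paired-end readcount output paths from the run parameters.
# $1 = outputdir (-o), $2 = sample name (-s), $3 = ebwt name,
# $4 = maxdist (-D), $5 = mismatches (--mm, 0 if not given).
readcount_paths() {
    base="$1/readcount/$2.$3.maxins$4.mm$5"
    printf '%s\n' "$base.gNIR" "$base.iNIR" "$base.gReadcount"
}

# Assumption: ebwtname is the basename of the --bi prefix.
bi=ebwt/hg19.RefmRNA
readcount_paths MKN28.hg19 MKN28 "${bi##*/}" 250 2
# prints:
# MKN28.hg19/readcount/MKN28.hg19.RefmRNA.maxins250.mm2.gNIR
# MKN28.hg19/readcount/MKN28.hg19.RefmRNA.maxins250.mm2.iNIR
# MKN28.hg19/readcount/MKN28.hg19.RefmRNA.maxins250.mm2.gReadcount
```

If the names produced by your run differ, compare them with a listing of outputdir/readcount and adjust the pattern.

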
## single-end

The formats of the LVKM and NIR files (gMA & iMA) are the same as in the paired-end case. The LVKM file names are outputdir/LVKM/samplename.ebwtname.maxinsMAX_DIST.mm0.single.-EUMAcut.-NR.gLVKM (gene-wise) and outputdir/LVKM/samplename.ebwtname.maxinsMAX_DIST.mm0.single.-EUMAcut.-NR.iLVKM (isoform-wise). The NIR values can be found in outputdir/MA/samplename.ebwtname.maxinsMAX_DIST.mm0.single.all.gMA (the #uniq.common column) and outputdir/MA/samplename.ebwtname.maxinsMAX_DIST.mm0.single.all.iMA (the #uniq column).

The mapping stat can be found in outputdir/mapping_stat.samplename.single.



6. Preliminary run

Given a new data set, users can run the first part of auto_NEUMA_PE.pl, without the U.tables, to find out the insert length distribution and mapping stats. Based on the length distribution, the user can determine a safe MAXDIST and request us to build the U.tables (through the NEUMA website). Once the U.tables are ready, the user can run the latter part of auto_NEUMA_PE.pl. Note that -U, --g2m and --g2s are not required for the preliminary run.

* Usage:

auto_NEUMA_PE.pl --only_init [options] -L=<read_length> -D=<maxdist> -1=<input_file1(mate1)> -2=<input_file2(mate2)> -b=<bowtie_dir(eg.bin/bowtie-0.12.7)> --bi=<bowtieindex> -o=<outputdir> -s=<sample_name>

201 ** required arguments **
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
202 * -L=<read_length> : read_length(eg.36) : sequenced length of a read mate (/not/ the insert length or sum of the two mate lengths) (no default : L must be specified)
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
203 * -D=<maxdist> : maxdist(eg.400) : maximum outer distance between mates (insert size). This must be identical to the maxdist used for generating gU and iU tables. (no default : D must be specified)
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
204 * -1=<input_file1(mate1)> : fasta or fastq file of a series of read mate 1
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
205 * -2=<input_file2(mate2)> : fasta or fastq file of a series of read mate 2. The ID of each mate 2 must be matched to that of mate 1 in case of fasta file.
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
206 * -b=<bowtie_dir(eg.bin/bowtie-0.12.7)> : directory in which bowtie executable is installed. Either full path or relative path must be used and '~' must be avoided.
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
207 * --bi=<bowtieindex> : the index file prefix of the reference transcriptome sequence, created by bowtie (bowtie-build). It is the same string put as the reference index argument for the bowtie program. (see bowtie manual)
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
208 * -o=<outputdir> : a directory that will contain all the output files. If the directory does not exist, the program will create it automatically.
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
209 * -s=<sample_name> : name of the sample that will be used as the prefix of all the output files.
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
210
c44c43d185ef NEUMA-1.2.1 Uploaded
chawhwa
parents:
diff changeset
** options **
* -f=<file_type> : fasta(f)_or_fastq(q) : f if input files are in fasta format; q if in fastq format. (default : q)
* -c=<coding_option> : nucleotide(n)_or_colorspace(c) : n if input files are in DNA sequence (A,C,G,T); c if in colorspace (SOLiD platform). Note that if colorspace is used, the colorspace version of the bowtie index file must be used. (default : n)
* -p=<num_cpu> : number of CPUs to use for Bowtie. (default : 1)
* --str=<strand_specificity> : Strand-specific(S) vs Non-Strand-specific(N). (default : N)
* --mm=<number_of_mismatches_allowed> : number of mismatches allowed. (default : 0)

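The value constraints listed above can be checked before starting a long run. The following is only a sketch: check_opts is a hypothetical helper that is not shipped with NEUMA, and the requirement that -p be at least 1 and --mm be a non-negative integer is an assumption drawn from the defaults. It rejects any '~' in the bowtie directory, which is stricter than strictly necessary.

```shell
# Hypothetical pre-flight check (not part of NEUMA). Prints "ok" or the
# first problem found, using only the constraints stated in the list above.
check_opts() {
    local bowtie_dir="$1" file_type="$2" coding="$3"
    local num_cpu="$4" strand="$5" mismatches="$6"

    # -b : '~' must be avoided in the bowtie directory.
    case "$bowtie_dir" in
        *'~'*) echo "-b must not contain '~'"; return 1 ;;
    esac
    # -f : f (fasta) or q (fastq).
    case "$file_type" in
        f|q) ;;
        *) echo "-f must be f or q"; return 1 ;;
    esac
    # -c : n (nucleotide) or c (colorspace).
    case "$coding" in
        n|c) ;;
        *) echo "-c must be n or c"; return 1 ;;
    esac
    # -p : assumed to be a whole number of at least 1.
    case "$num_cpu" in
        ''|*[!0-9]*) echo "-p must be a positive integer"; return 1 ;;
    esac
    if [ "$num_cpu" -lt 1 ]; then
        echo "-p must be a positive integer"; return 1
    fi
    # --str : S (strand-specific) or N (non-strand-specific).
    case "$strand" in
        S|N) ;;
        *) echo "--str must be S or N"; return 1 ;;
    esac
    # --mm : assumed to be a non-negative integer.
    case "$mismatches" in
        ''|*[!0-9]*) echo "--mm must be a non-negative integer"; return 1 ;;
    esac
    echo "ok"
}

# The defaults from the list above, with the example bowtie directory.
check_opts bin/bowtie-0.12.7 q n 1 N 0
```

The arguments are given in the order -b, -f, -c, -p, --str, --mm. A non-zero return status means that at least one value does not fit the list above.
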
* The maxdist for this run can be set to something very large, e.g. 1000. Then, for the real runs using the --skip_init option, maxdist must be set to the same value that was used to build the U.tables.

* For single-end reads, the insert length distribution does not have to be pre-determined. Simply tell us the read length to request the EUMA tables for single-end data.

** output **
The quickest way to check the output insert length distribution is to look at the insertlendis directory under outputdir.




7. Merging output files


1) merge_LVKM.pl

The script produces a text file that contains gLVKM or iLVKM values for all samples for all genes/isoforms.

usage: ./merge_LVKM.pl genewise/isoformwise[g/i] EUMAcut LVKM_out_dir NR(1/0) > output.gLVKM(iLVKM).merged

Example usage:
./merge_LVKM.pl g 50 data/LVKM 0 > data/LVKM/all.gLVKM.merged
./merge_LVKM.pl i 50 data/LVKM 0 > data/LVKM/all.iLVKM.merged

* genewise/isoformwise[g/i] : put g for gLVKM and i for iLVKM.
* EUMAcut : put the same EUMA cutoff used to generate the LVKM files. This will be used to recognize the files.
* LVKM_out_dir : This is usually your_basedir/LVKM, which contains all the LVKM files generated. All the LVKM files in this directory that match the specified EUMAcut will be included in the merged table.
* NR(1/0) : 1 means the script must use the LVKM files generated after the noncoding RNAs starting with the NR prefix are removed. Use 0 otherwise.


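The four arguments of merge_LVKM.pl can be checked and the command line assembled before anything is run. This is only a sketch: merge_lvkm_cmd is a hypothetical helper that is not part of NEUMA, it assumes EUMAcut is a non-negative integer (as in the examples), and it prints the command rather than executing it.

```shell
# Hypothetical dry-run helper (not part of NEUMA): validates the arguments
# described above and prints the merge_LVKM.pl command line.
merge_lvkm_cmd() {
    local mode="$1" eumacut="$2" lvkm_dir="$3" nr="$4"

    case "$mode" in
        g|i) ;;
        *) echo "first argument must be g or i" >&2; return 1 ;;
    esac
    # Assumption: the EUMA cutoff is a non-negative integer such as 50.
    case "$eumacut" in
        ''|*[!0-9]*) echo "EUMAcut must be a non-negative integer" >&2; return 1 ;;
    esac
    case "$nr" in
        0|1) ;;
        *) echo "NR must be 1 or 0" >&2; return 1 ;;
    esac
    # The output name follows the example usage (all.gLVKM.merged, all.iLVKM.merged).
    echo "./merge_LVKM.pl $mode $eumacut $lvkm_dir $nr > $lvkm_dir/all.${mode}LVKM.merged"
}

# Print the gene-wise and isoform-wise commands for the same cutoff.
for m in g i; do
    merge_lvkm_cmd "$m" 50 data/LVKM 0
done
```

With the values shown, the loop prints the two commands given under "Example usage". A printed line can be run once it has been reviewed.
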
2) merge_LVKM_readcount.pl


This script produces a gNIR / iNIR file that contains the read counts for all samples for all genes/isoforms. The gNIR / iNIR files can be fed to diffNEUMA (http://neuma.kobic.re.kr) for identification of differentially expressed genes/isoforms. This script is identical to merge_NEUMA_readcount.pl in previous versions.

usage: ./merge_LVKM_readcount.pl genewise/isoformwise[g/i] EUMAcut LVKM_out_dir NR(1/0) > output.gNIR(iNIR)

Example usage:
./merge_LVKM_readcount.pl g 10 data/LVKM 1 > data/LVKM/all.gNIR
./merge_LVKM_readcount.pl i 10 data/LVKM 1 > data/LVKM/all.iNIR

* genewise/isoformwise[g/i] : put g for gLVKM and i for iLVKM.
* EUMAcut : put the same EUMA cutoff used to generate the LVKM files. This will be used to recognize the files.
* LVKM_out_dir : This is usually your_basedir/LVKM, which contains all the LVKM files generated. All the LVKM files in this directory that match the specified EUMAcut will be included in the NIR file.
* NR(1/0) : 1 means the script must use the LVKM files generated after the noncoding RNAs starting with the NR prefix are removed. Use 0 otherwise.



3) merge_readcount.pl


This script produces a gNIR.merged / iNIR.merged / gReadcount.merged file that contains the read counts for all samples. The gNIR / iNIR / gReadcount files can be fed to diffNEUMA (http://neuma.kobic.re.kr) for identification of differentially expressed genes/isoforms. Unlike the output of merge_LVKM_readcount.pl, the result is not filtered by EUMA cutoff or NR.

usage: ./merge_readcount.pl type[gNIR/iNIR/gReadcount] readcount_dir > output.gNIR(iNIR/gReadcount).merged

Example usage:
./merge_readcount.pl gNIR data/readcount > data/readcount/all.gNIR.merged
./merge_readcount.pl iNIR data/readcount > data/readcount/all.iNIR.merged

* type (gNIR/iNIR/gReadcount) : the type of the read counts to be merged
* readcount_dir : This is usually your_basedir/readcount, which contains all the readcount files generated.


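All three read-count types can be merged with one loop. This is a sketch only: readcount_merge_cmds is a hypothetical helper that is not part of NEUMA, and the output name all.gReadcount.merged is an assumption extrapolated from the usage line, since the examples show only gNIR and iNIR. The commands are printed, not run.

```shell
# Hypothetical dry-run helper (not part of NEUMA): prints one
# merge_readcount.pl command per read-count type for the given directory.
readcount_merge_cmds() {
    local dir="$1" t
    for t in gNIR iNIR gReadcount; do
        # Output name pattern all.<type>.merged follows the example usage.
        echo "./merge_readcount.pl $t $dir > $dir/all.$t.merged"
    done
}

readcount_merge_cmds data/readcount
```

The first two printed lines match the commands under "Example usage"; the third applies the same pattern to gReadcount.
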
//
