annotate README.md @ 51:e35b82f31ec7 draft

Uploaded
author davidvanzessen
date Thu, 08 Aug 2019 07:40:36 -0400
parents 4d2a8f98a502
children 124b7fd92a3e
# ARGalaxy Immune Repertoire
This is the GitHub repository for the ARGalaxy Immune Repertoire pipeline.
The Galaxy tool version can be found [here](https://toolshed.g2.bx.psu.edu/repository/browse_repositories_i_own?sort=name&operation=view_or_manage_repository&id=2e457d63170a4b1c).
The docker version can be found [here](https://github.com/ErasmusMC-Bioinformatics/ARGalaxy-docker).

## Overview

In execution order:

#### imgt_loader or igblast

###### imgt_loader (Recommended)
Start the analysis with [IMGT/HighV-QUEST](https://www.imgt.org/HighV-QUEST/) archives.
An IMGT archive file holds [multiple tabular files](http://www.imgt.org/IMGT_vquest/share/textes/imgtvquest.html#output3); this script extracts the columns relevant to the analysis from several of these files.

`Rscript imgt_loader.r 1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt 6_Junction.txt 4_IMGT-gapped-AA-sequences.txt /path/to/output.txt`
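
The five input files must be passed in the order shown in the command above. As a minimal sketch, assuming the IMGT archive has already been unpacked into a directory and that paths contain no spaces, the helpers below report missing files and print the matching command line. The names `IMGT_FILES`, `missing_imgt_files` and `imgt_loader_command` are hypothetical and not part of this repository.

```shell
# Tabular files in the order imgt_loader.r takes them in the command above.
IMGT_FILES="1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt 6_Junction.txt 4_IMGT-gapped-AA-sequences.txt"

# Print every required file that is missing from directory $1.
missing_imgt_files() {
    for f in $IMGT_FILES; do
        [ -f "$1/$f" ] || printf '%s\n' "$f"
    done
}

# Print the imgt_loader.r command line for directory $1 and output file $2.
# The result is meant for display; it does not quote paths.
imgt_loader_command() {
    cmd="Rscript imgt_loader.r"
    for f in $IMGT_FILES; do
        cmd="$cmd $1/$f"
    done
    printf '%s\n' "$cmd $2"
}
```

For example, `missing_imgt_files /path/to/extracted` prints nothing when all five files are present, and `imgt_loader_command /path/to/extracted /path/to/output.txt` prints the command to run.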

###### igblast
Start the analysis with FASTA files that are aligned with [igblast](https://www.ncbi.nlm.nih.gov/igblast/).
Note that this method provides less information than the IMGT archive.

`sh igblast.sh /path/to/sequences.fasta species locus /path/to/output.txt`

#### experimental_design
This script merges multiple result files (from the previous step) into a single file, adding ID and Replicate columns that distinguish the individual samples during the analysis and allow analysis across samples.

`Rscript experimental_design.r /path/to/input_1 id_1 [/path/to/input_2 id_2] [/path/to/input_n id_n] /path/to/output`
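
The argument list above is one or more `input id` pairs followed by a single output path, so a valid call always has an odd number of arguments, at least three. The sketch below checks that shape before the script is called; `check_design_args` is a hypothetical helper, not something shipped with the pipeline.

```shell
# Check an experimental_design.r argument list: one or more
# "input id" pairs followed by a single output path.
# Prints the number of samples on success, returns 1 otherwise.
check_design_args() {
    # Pairs plus one output path give an odd count of at least 3.
    if [ "$#" -lt 3 ] || [ $(( $# % 2 )) -ne 1 ]; then
        echo "expected: input_1 id_1 [input_n id_n ...] output" >&2
        return 1
    fi
    echo $(( ($# - 1) / 2 ))
}
```

A wrapper script could then run `check_design_args "$@" >/dev/null && Rscript experimental_design.r "$@"`.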

#### report_clonality
The R script that creates the analysis result.

`sh r_wrapper.sh /path/to/experimental_design/output.txt /path/to/output_dir/output.html /path/to/output_dir "clonaltype" "species" "locus" "filter_productive" "clonality_method"`

###### parameters
Clonaltype:
- none
- Top.V.Gene,CDR3.Seq
- Top.V.Gene,CDR3.Seq.DNA
- Top.V.Gene,Top.J.Gene,CDR3.Seq
- Top.V.Gene,Top.J.Gene,CDR3.Seq.DNA
- Top.V.Gene,Top.D.Gene,Top.J.Gene,CDR3.Seq.DNA

Species:
- Homo sapiens functional
- Homo sapiens
- Homo sapiens non-functional
- Bos taurus
- Bos taurus functional
- Bos taurus non-functional
- Camelus dromedarius
- Camelus dromedarius functional
- Camelus dromedarius non-functional
- Canis lupus familiaris
- Canis lupus familiaris functional
- Canis lupus familiaris non-functional
- Danio rerio
- Danio rerio functional
- Danio rerio non-functional
- Macaca mulatta
- Macaca mulatta functional
- Macaca mulatta non-functional
- Mus musculus
- Mus musculus functional
- Mus musculus non-functional
- Mus spretus
- Mus spretus functional
- Mus spretus non-functional
- Oncorhynchus mykiss
- Oncorhynchus mykiss functional
- Oncorhynchus mykiss non-functional
- Ornithorhynchus anatinus
- Ornithorhynchus anatinus functional
- Ornithorhynchus anatinus non-functional
- Oryctolagus cuniculus
- Oryctolagus cuniculus functional
- Oryctolagus cuniculus non-functional
- Rattus norvegicus
- Rattus norvegicus functional
- Rattus norvegicus non-functional
- Sus scrofa
- Sus scrofa functional
- Sus scrofa non-functional

Locus:
- TRA
- TRD
- TRG
- TRB
- IGH
- IGI
- IGK
- IGL

Filter productive:
- yes
- no

Clonality Method:
- none
- old
- boyd
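
Because every parameter is positional, a misspelled value is easy to miss. The following is a minimal sketch of a pre-flight check against the option lists above. All function names are hypothetical, and the species value is accepted without validation to keep the example short.

```shell
# Each helper returns 0 when $1 is one of the documented options.
valid_clonaltype() {
    case "$1" in
        none) ;;
        Top.V.Gene,CDR3.Seq|Top.V.Gene,CDR3.Seq.DNA) ;;
        Top.V.Gene,Top.J.Gene,CDR3.Seq|Top.V.Gene,Top.J.Gene,CDR3.Seq.DNA) ;;
        Top.V.Gene,Top.D.Gene,Top.J.Gene,CDR3.Seq.DNA) ;;
        *) return 1 ;;
    esac
}

valid_locus() {
    case "$1" in
        TRA|TRD|TRG|TRB|IGH|IGI|IGK|IGL) ;;
        *) return 1 ;;
    esac
}

valid_filter_productive() {
    case "$1" in
        yes|no) ;;
        *) return 1 ;;
    esac
}

valid_clonality_method() {
    case "$1" in
        none|old|boyd) ;;
        *) return 1 ;;
    esac
}

# Arguments in command order:
# clonaltype species locus filter_productive clonality_method
# The species ($2) is not checked in this sketch.
check_report_params() {
    valid_clonaltype "$1" || { echo "unknown clonaltype: $1" >&2; return 1; }
    valid_locus "$3" || { echo "unknown locus: $3" >&2; return 1; }
    valid_filter_productive "$4" || { echo "unknown filter_productive: $4" >&2; return 1; }
    valid_clonality_method "$5" || { echo "unknown clonality_method: $5" >&2; return 1; }
}
```

For example, `check_report_params "Top.V.Gene,CDR3.Seq" "Homo sapiens" IGH yes boyd` succeeds, while a locus of `IGX` is rejected with a message on standard error.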

## complete.sh
This script runs all of the steps above; it detects whether the inputs are FASTA files or IMGT archives and uses the appropriate tools.

`sh complete.sh /path/to/input_1 id_1 [/path/to/input_n id_n] /path/to/out_dir/out.html /path/to/out_dir clonaltype species locus filter_productive clonality_method`

See "report_clonality" for the parameter options.
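
How `complete.sh` tells the two input types apart is not described here. One plausible heuristic, shown purely as an assumption, is to look at the first byte of the file: FASTA records start with `>`, while a compressed archive does not. `input_type` is a hypothetical name, and the sketch does not handle FASTA files that begin with blank lines.

```shell
# Guess the type of input file $1 from its first byte.
# ASSUMPTION: illustrative heuristic only; complete.sh may work differently.
input_type() {
    first=$(head -c 1 "$1")
    if [ "$first" = ">" ]; then
        echo fasta
    else
        echo imgt
    fi
}
```

A caller could branch on the result, for example `[ "$(input_type "$file")" = "fasta" ]` to choose the igblast path.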

## Dependencies
- Linux
- R
  - gridExtra
  - ggplot2
  - plyr
  - data.table
  - reshape2
  - lymphclon

#### optional
- Circos
- IgBlast
- igblastwrp
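
A quick way to see what is missing before a run is sketched below. The helper names are hypothetical, and the executable names used in the usage note (`Rscript`, `circos`, `igblastn`) are assumptions about how these tools are installed on a given system.

```shell
# Print each command from the argument list that is not on PATH.
missing_commands() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || printf '%s\n' "$c"
    done
}

# Print each R package from the argument list that is not installed.
# Needs Rscript on PATH.
missing_r_packages() {
    for p in "$@"; do
        Rscript -e "if (!requireNamespace('$p', quietly = TRUE)) cat('$p\n')"
    done
}
```

For example, `missing_commands Rscript circos igblastn` lists the tools that cannot be found, and `missing_r_packages gridExtra ggplot2 plyr data.table reshape2 lymphclon` lists the R packages that still need to be installed.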