annotate README.md @ 52:124b7fd92a3e draft

Uploaded
author davidvanzessen
date Thu, 25 Feb 2021 13:36:15 +0000
parents e35b82f31ec7
children 81b3eb11ed2c
# ARGalaxy Immune Repertoire
This is the GitHub repository for the ARGalaxy Immune Repertoire pipeline.
The Galaxy tool version can be found [here](https://toolshed.g2.bx.psu.edu/repository/browse_repositories_i_own?sort=name&operation=view_or_manage_repository&id=2e457d63170a4b1c).
The Docker version can be found [here](https://github.com/ErasmusMC-Bioinformatics/ARGalaxy-docker).

## Overview

The pipeline steps, in execution order:

#### imgt_loader or igblast

###### imgt_loader (Recommended)
Start the analysis with [IMGT/HighV-QUEST](https://www.imgt.org/HighV-QUEST/) archives.
An IMGT archive file holds [multiple tabular files](http://www.imgt.org/IMGT_vquest/share/textes/imgtvquest.html#output3); this script extracts the columns relevant to the analysis from several of these files.

`Rscript imgt_loader.r 1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt 6_Junction.txt 4_IMGT-gapped-AA-sequences.txt /path/to/output.txt`

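As a rough sketch of how the five input files might be gathered once an archive has been unpacked, the helper below prints the `imgt_loader.r` command for a given directory. The function name `imgt_loader_cmd`, the directory `./sample_1` and the output name are hypothetical; the sketch assumes the extracted files carry the names used in the command above and that paths contain no whitespace.

```sh
#!/bin/sh
# Hypothetical helper (not part of the pipeline): print the imgt_loader.r
# command for an IMGT archive that has already been unpacked into a directory.
# Assumes the extracted files use the names shown in the command above.
imgt_loader_cmd() {
    archive_dir=$1   # directory holding the extracted IMGT files
    output=$2        # where imgt_loader.r should write its result
    printf 'Rscript imgt_loader.r'
    # Keep the same file order as in the command above.
    for f in 1_Summary.txt 3_Nt-sequences.txt 5_AA-sequences.txt \
             6_Junction.txt 4_IMGT-gapped-AA-sequences.txt; do
        printf ' %s/%s' "$archive_dir" "$f"
    done
    printf ' %s\n' "$output"
}

# Dry run: print the command instead of executing it.
imgt_loader_cmd ./sample_1 ./sample_1.txt
```

Printing the command first makes it easy to check the file order before running it.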

###### igblast
Start the analysis with FASTA files, which are aligned with [IgBLAST](https://www.ncbi.nlm.nih.gov/igblast/).
Note that this method provides less information than an IMGT archive.

`sh igblast.sh /path/to/sequences.fasta species locus /path/to/output.txt`

#### experimental_design
This script merges multiple result files (from the previous step) into a single file and adds ID and Replicate columns, which distinguish the individual samples during the analysis and allow for analysis across samples.

`Rscript experimental_design.r /path/to/input_1 id_1 [/path/to/input_2 id_2] [/path/to/input_n id_n] /path/to/output`

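With many samples, the path/ID pairs can be easier to manage in a small sample sheet. The sketch below assumes a tab-separated list of `path<TAB>id` lines on standard input and prints the matching `experimental_design.r` command. The helper name `experimental_design_cmd`, the file names and the IDs are all hypothetical, and paths or IDs containing whitespace are not handled.

```sh
#!/bin/sh
# Hypothetical helper (not part of the pipeline): read "path<TAB>id" pairs
# from standard input and print the experimental_design.r command.
experimental_design_cmd() {
    output=$1
    args=""
    # Split each input line on tabs into a path and an ID.
    while IFS="$(printf '\t')" read -r path id; do
        [ -n "$path" ] || continue   # skip blank lines
        args="$args $path $id"
    done
    printf 'Rscript experimental_design.r%s %s\n' "$args" "$output"
}

# Dry run with two made-up samples.
printf 'sample_1.txt\tpatient_A\nsample_2.txt\tpatient_B\n' |
    experimental_design_cmd merged.txt
```

The pairs appear in the printed command in the same order as in the sample sheet, followed by the output path.
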
#### report_clonality
This step creates the analysis result with an R script, invoked through `r_wrapper.sh`.

`sh r_wrapper.sh /path/to/experimental_design/output.txt /path/to/output_dir/output.html /path/to/output_dir "clonaltype" "species" "locus" "filter_productive" "clonality_method"`

###### parameters
Clonaltype:
- none
- Top.V.Gene,CDR3.Seq
- Top.V.Gene,CDR3.Seq.DNA
- Top.V.Gene,Top.J.Gene,CDR3.Seq
- Top.V.Gene,Top.J.Gene,CDR3.Seq.DNA
- Top.V.Gene,Top.D.Gene,Top.J.Gene,CDR3.Seq.DNA

Species:
- Homo sapiens functional
- Homo sapiens
- Homo sapiens non-functional
- Bos taurus
- Bos taurus functional
- Bos taurus non-functional
- Camelus dromedarius
- Camelus dromedarius functional
- Camelus dromedarius non-functional
- Canis lupus familiaris
- Canis lupus familiaris functional
- Canis lupus familiaris non-functional
- Danio rerio
- Danio rerio functional
- Danio rerio non-functional
- Macaca mulatta
- Macaca mulatta functional
- Macaca mulatta non-functional
- Mus musculus
- Mus musculus functional
- Mus musculus non-functional
- Mus spretus
- Mus spretus functional
- Mus spretus non-functional
- Oncorhynchus mykiss
- Oncorhynchus mykiss functional
- Oncorhynchus mykiss non-functional
- Ornithorhynchus anatinus
- Ornithorhynchus anatinus functional
- Ornithorhynchus anatinus non-functional
- Oryctolagus cuniculus
- Oryctolagus cuniculus functional
- Oryctolagus cuniculus non-functional
- Rattus norvegicus
- Rattus norvegicus functional
- Rattus norvegicus non-functional
- Sus scrofa
- Sus scrofa functional
- Sus scrofa non-functional

Locus:
- TRA
- TRD
- TRG
- TRB
- IGH
- IGI
- IGK
- IGL

Filter productive:
- yes
- no

Clonality Method:
- none
- old
- boyd

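Because the report step takes several positional parameters, a small pre-flight check can catch typos before a long run. The sketch below checks three of them against the lists above. The function name `check_report_params` is hypothetical, and it assumes the values are passed to `r_wrapper.sh` exactly as they are written in those lists.

```sh
#!/bin/sh
# Hypothetical pre-flight check (not part of the pipeline). The accepted
# values are copied from the Locus, Filter productive and Clonality Method
# lists above.
check_report_params() {
    locus=$1
    filter_productive=$2
    clonality_method=$3
    case $locus in
        TRA|TRD|TRG|TRB|IGH|IGI|IGK|IGL) ;;
        *) echo "unknown locus: $locus" >&2; return 1 ;;
    esac
    case $filter_productive in
        yes|no) ;;
        *) echo "filter_productive must be yes or no" >&2; return 1 ;;
    esac
    case $clonality_method in
        none|old|boyd) ;;
        *) echo "unknown clonality method: $clonality_method" >&2; return 1 ;;
    esac
}

if check_report_params IGH yes boyd; then
    echo "parameters look valid"
fi
```

Species values such as `Homo sapiens functional` contain spaces, so they need to be quoted on the command line, as in the `r_wrapper.sh` command above.
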
## complete.sh
This script runs all of the steps above. It detects whether the input consists of FASTA files or IMGT archives and uses the appropriate tools.

`sh complete.sh /path/to/input_1 id_1 [/path/to/input_n id_n] /path/to/out_dir/out.html /path/to/out_dir clonaltype species locus filter_productive clonality_method`

See "report_clonality" for the parameter options.

## Dependencies
- Linux
- R
    - gridExtra
    - ggplot2
    - plyr
    - data.table
    - reshape2
    - lymphclon

#### optional
- Circos
- IgBlast
- igblastwrp