annotate pmlst/README.md @ 0:140d4f9e1f20 draft default tip

Uploaded
author dcouvin
date Mon, 06 Sep 2021 16:00:46 +0000

pMLST
===================

Plasmid Multi-Locus Sequence Typing


Documentation
=============

The pMLST service consists of a single Python script, *pmlst.py*, which implements the latest
version of the service. The method enables investigators to determine the sequence type (ST) from whole-genome sequencing (WGS) data.

## Content of the repository
1. pmlst.py - the program
2. README.md
3. Dockerfile - Dockerfile for building the pmlst Docker container
4. test.fsa - test FASTA file


## Installation

Setting up the pMLST program
```bash
# Go to the directory where pmlst should be installed
cd /path/to/some/dir
# Clone and enter the pmlst directory
git clone https://bitbucket.org/genomicepidemiology/pmlst.git
cd pmlst
```

Build the Docker container
```bash
# Build container
docker build -t pmlst .
```

Download and install the pMLST database
```bash
# Go to the directory where you want to store the pmlst database
cd /path/to/some/dir
# Clone database from git repository (develop branch)
git clone https://bitbucket.org/genomicepidemiology/pmlst_db.git
cd pmlst_db
# Remember the database location for the docker run command
pMLST_DB=$(pwd)
# Install the pMLST database using the kma_index executable
python3 INSTALL.py kma_index
```

If kma_index is not installed, please install it from the KMA repository:
https://bitbucket.org/genomicepidemiology/kma

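Before running `INSTALL.py`, it can help to confirm that `kma_index` is actually on the `PATH`. The following is a minimal sketch and not part of pMLST itself; the helper name `check_tools` is made up for illustration.

```bash
# Report whether each named executable can be found on PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check for kma_index before installing the database
check_tools kma_index || echo "Install KMA first, then rerun INSTALL.py"
```

The same helper can be called with several names at once, for example `check_tools kma kma_index blastn`, to check the other external programs mentioned in this README.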
## Dependencies
To run the program without Docker, Python 3.5 (or newer) must be installed, along with the following modules at the listed versions (or newer).

#### Modules
- cgecore 1.5.5
- tabulate 0.7.7

Modules can be installed with pip. For example, to install cgecore:
```bash
pip3 install cgecore
```
#### KMA and BLAST
Additionally, KMA and BLAST version 2.8.1 or newer should be installed.
The newest versions of KMA and BLAST can be downloaded from:
```url
https://bitbucket.org/genomicepidemiology/kma
```

```url
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
```

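The minimum versions listed above can be checked from the shell. The sketch below is only an illustration: `version_ge` is a hypothetical helper, and it assumes purely numeric, dot-separated version strings, so a suffix such as the `+` that BLAST+ appends to its version number would have to be stripped first.

```bash
# Succeed when dotted numeric version $1 is greater than or equal to $2.
# Missing fields count as 0, so "3" is treated like "3.0.0".
version_ge() {
    have=$1
    want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        # Take the leading field of each version
        h=${have%%.*}
        w=${want%%.*}
        # Drop the field just read; clear the string when no dot remains
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
        [ "${h:-0}" -gt "${w:-0}" ] && return 0
        [ "${h:-0}" -lt "${w:-0}" ] && return 1
    done
    return 0
}

version_ge 2.9.0 2.8.1 && echo "2.9.0 satisfies the 2.8.1 minimum"
version_ge 2.8.0 2.8.1 || echo "2.8.0 is older than 2.8.1"
```

Fields are compared as integers rather than as text, so `2.10` is correctly treated as newer than `2.8.1`.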
## Usage

The program can be invoked with the `-h` option to get help and more information about the service.

Run the Docker container


```bash
# Run pmlst container
docker run --rm -it \
       -v $pMLST_DB:/database \
       -v $(pwd):/workdir \
       pmlst -i [INPUTFILE] -o . -s [SCHEME] [-x] [-mp] [-p] [-t]
```

When running the Docker container you have to mount two directories:
 1. pmlst_db (the pMLST database) downloaded from Bitbucket
 2. An input/output folder from which the input file can be reached and where the output files can be saved.
    Here we mount the current working directory (using `$(pwd)`) and use it as the output directory;
    the input file should be reachable from this directory as well.

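As an illustration of the two mounts, the sketch below prints a filled-in `docker run` command instead of executing it, so the paths can be inspected first. The helper name `pmlst_docker_cmd`, the database path `/data/pmlst_db`, and the scheme name `incf` are assumptions made for this example; take the real scheme names from the database config file.

```bash
# Print (do not run) a docker command for the pmlst image built earlier.
#   $1 = absolute path to the pmlst_db directory (mounted at /database)
#   $2 = input file, relative to the current directory (mounted at /workdir)
#   $3 = scheme name (assumed here; check the database config file)
# Paths containing spaces are not quoted in the printed command.
pmlst_docker_cmd() {
    printf 'docker run --rm -it -v %s:/database -v %s:/workdir pmlst -i %s -o . -s %s\n' \
        "$1" "$(pwd)" "$2" "$3"
}

# Example with the test file shipped in the repository
pmlst_docker_cmd /data/pmlst_db test.fsa incf
```

Because the current directory is what gets mounted at `/workdir`, the input file is given relative to it and `-o .` writes the results back into the same directory on the host.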
`-i INPUTFILE        Input file (fasta or fastq), relative to pwd`

`-s SCHEME           pMLST scheme to be used; details are in the config file`

`-o OUTDIR           Output directory, relative to pwd`

`-x                  Extended output. Will create an extended output`

`-mp METHOD_PATH     Path to the executable of the method to be used (kma or blast)`

`-p DATABASE         Path to the database directory`

`-t TMP_DIR          Temporary directory for storage of results from external software`


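The values accepted by `-s` come from the config file in the database directory. The sketch below shows one way to list them; it is an assumption that the file is tab-separated, uses `#` for comment lines, and keeps the scheme identifier in the first column, so check this against your copy of the database. `list_schemes` is a hypothetical helper name.

```bash
# Print the first column of every non-comment, non-empty line of a file.
# Assumed layout: tab-separated fields, lines starting with '#' are comments,
# and the first field is the value to pass to -s.
list_schemes() {
    awk -F '\t' '!/^#/ && NF { print $1 }' "$1"
}
```

If the config file is named `config` and sits in the directory stored in `$pMLST_DB`, it could be called as `list_schemes "$pMLST_DB/config"`.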
## Web-server

A webserver implementing the methods is available at the [CGE website](http://www.genomicepidemiology.org/) and can be found here:
https://cge.cbs.dtu.dk/services/pMLST/

Citation
=======

When using the method please cite:

PlasmidFinder and pMLST: in silico detection and typing of plasmids.
Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Aarestrup FM, Hasman H.
Antimicrob. Agents Chemother. 2014. April 28th.
[Epub ahead of print]

References
=======

1. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 2009; 10:421.
2. Clausen PTLC, Aarestrup FM, Lund O. Rapid and precise alignment of raw reads against redundant databases with KMA. BMC Bioinformatics 2018; 19:307.

License
=======

Copyright (c) 2014, Ole Lund, Technical University of Denmark
All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.