annotate pmlst/README.md @ 0:cfab64885f66 draft default tip

Uploaded
author dcouvin
date Mon, 06 Sep 2021 18:27:45 +0000
pMLST
===================

Plasmid Multi-Locus Sequence Typing


Documentation
=============

The pMLST service consists of a single Python script, *pmlst.py*, which implements the latest
version of the service. The method enables investigators to determine the plasmid sequence type (ST) from whole-genome sequencing (WGS) data.

## Content of the repository
1. pmlst.py - the program
2. README.md
3. Dockerfile - dockerfile for building the pmlst docker container
4. test.fsa - test fasta file


## Installation

Setting up pMLST program
```bash
# Go to wanted location for pmlst
cd /path/to/some/dir
# Clone and enter the pmlst directory
git clone https://bitbucket.org/genomicepidemiology/pmlst.git
cd pmlst
```

Build Docker container
```bash
# Build container
docker build -t pmlst .
```

Download and install pMLST database
```bash
# Go to the directory where you want to store the pmlst database
cd /path/to/some/dir
# Clone database from git repository (develop branch)
git clone https://bitbucket.org/genomicepidemiology/pmlst_db.git
cd pmlst_db
pMLST_DB=$(pwd)
# Install pMLST database with executable kma_index program
python3 INSTALL.py kma_index
```

If kma_index is not installed, please install it from the KMA repository:
https://bitbucket.org/genomicepidemiology/kma

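Because `INSTALL.py` is given the `kma_index` executable, it can help to confirm that the tool is on the `PATH` before starting the installation. The following is a minimal sketch; the helper name `require_tool` is hypothetical and not part of pMLST or KMA.

```bash
# Hypothetical helper (not part of pMLST): succeed only if the named
# executable can be found on the PATH, otherwise print a message to stderr.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "Missing executable on PATH: $1" >&2
    return 1
}

# Example use from inside the pmlst_db directory:
#   require_tool kma_index && python3 INSTALL.py kma_index
```

With this guard, the database installation is only attempted when `kma_index` can actually be found.
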
## Dependencies
To run the program without Docker, Python 3.5 (or newer) must be installed, along with the following modules at the listed versions (or newer).

#### Modules
- cgecore 1.5.5
- tabulate 0.7.7

Modules can be installed using the following command. Here, the installation of the module cgecore is used as an example:
```bash
pip3 install cgecore
```
#### KMA and BLAST
Additionally, KMA and BLAST version 2.8.1 or newer should be installed.
The newest versions of KMA and BLAST can be downloaded from:
```url
https://bitbucket.org/genomicepidemiology/kma
```

```url
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
```

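The requirements in this section are all of the form "version X or newer". Comparing version strings as plain text gets this wrong (for example, 2.10.0 sorts before 2.8.1 as text), so a version-aware comparison is needed. A minimal sketch, assuming a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions); the function name `version_ge` is hypothetical:

```bash
# Hypothetical helper: succeed if version $1 is equal to or newer than $2.
# Assumes `sort -V` (version sort) is available on this system.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly when
    # that smallest entry is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: check the installed cgecore version against the minimum above.
#   installed=$(pip3 show cgecore | awk '/^Version:/ {print $2}')
#   version_ge "$installed" 1.5.5 || echo "cgecore is too old" >&2
```

The same function could be applied to the tabulate and BLAST minimums, given a way to read the installed version of each.
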
## Usage

The program can be invoked with the -h option to get help and more information about the service.

Run Docker container

```bash
# Run pmlst container
docker run --rm -it \
       -v $pMLST_DB:/database \
       -v $(pwd):/workdir \
       pmlst -i [INPUTFILE] -o . -s [SCHEME] [-x] [-mp] [-p] [-t]
```
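
To see exactly what would be executed before starting a container, the command can be assembled and printed first. The sketch below only prints the command and does not run Docker; the function name `pmlst_docker_cmd` and the example arguments are hypothetical, and the optional flags are left out.

```bash
# Hypothetical dry-run helper: print the docker command for a given
# database directory, input file and scheme. Nothing is executed.
pmlst_docker_cmd() {
    db_dir=$1   # host path of the cloned pmlst_db directory
    input=$2    # input file, relative to the current directory
    scheme=$3   # scheme name as listed in the database config file
    printf 'docker run --rm -it -v "%s:/database" -v "%s:/workdir" pmlst -i "%s" -o . -s "%s"\n' \
        "$db_dir" "$(pwd)" "$input" "$scheme"
}

# Example (SCHEME is a placeholder, not a real scheme name):
#   pmlst_docker_cmd "$pMLST_DB" test.fsa SCHEME
```

The paths are quoted in the printed command so that directories containing spaces survive a copy and paste.
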

When running the Docker container you have to mount two directories:
1. pmlst_db (pMLST database) downloaded from Bitbucket
2. An input/output folder from which the input file can be reached and to which output files can be saved.

Here we mount the current working directory (using $(pwd)) and use it as the output directory;
the input file should be reachable from this directory as well.
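
Since only the mounted directories are visible inside the container, an input path that is absolute on the host, or that climbs out of the current directory with `..`, will generally not resolve there. A small pre-flight check along these lines can catch that early. This is a sketch only: `reachable_from_workdir` is a hypothetical name, and the check is deliberately conservative (it does not follow symlinks).

```bash
# Hypothetical pre-flight check: succeed only if the input file is given as a
# relative path that stays inside the current directory and exists there.
reachable_from_workdir() {
    case "$1" in
        # Absolute host paths and paths that step out via ".." are rejected,
        # because only the current directory is mounted into the container.
        /*|../*|*/../*) return 1 ;;
    esac
    [ -f "$1" ]
}

# Example:
#   reachable_from_workdir test.fsa || echo "input not visible in container" >&2
```
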

`-i INPUTFILE    input file (fasta or fastq) relative to pwd`

`-s SCHEME    pMLST scheme to be used, details are in config file`

`-o OUTDIR    output directory relative to pwd`

`-x    create an extended output`

`-mp METHOD_PATH    Path to executable of the method to be used (kma or blast)`

`-p DATABASE    Path to database directory`

`-t TMP_DIR    Temporary directory for storage of results from external software`


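The `-t` option expects a directory for intermediate results. One way to avoid leftover files is to create a fresh temporary directory for each run and delete it afterwards. This is a sketch under assumptions: `run_with_tmp` is a hypothetical wrapper, `mktemp -d` must be available, and whether pmlst.py cleans the directory itself is not stated here.

```bash
# Hypothetical wrapper: run the given command with "-t <fresh temp dir>"
# appended, then remove the directory and pass on the command's exit status.
run_with_tmp() {
    tmp_dir=$(mktemp -d) || return 1
    "$@" -t "$tmp_dir"
    status=$?
    rm -rf "$tmp_dir"
    return "$status"
}

# Example (SCHEME is a placeholder for a scheme name from the config file):
#   run_with_tmp python3 pmlst.py -i test.fsa -o . -s SCHEME -p "$pMLST_DB"
```
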
## Web-server

A webserver implementing the methods is available at the [CGE website](http://www.genomicepidemiology.org/) and can be found here:
https://cge.cbs.dtu.dk/services/pMLST/

Citation
=======

When using the method please cite:

PlasmidFinder and pMLST: in silico detection and typing of plasmids.
Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Aarestrup FM, Hasman H.
Antimicrob. Agents Chemother. 2014. April 28th.
[Epub ahead of print]

References
=======

1. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 2009; 10:421.
2. Clausen PTLC, Aarestrup FM, Lund O. Rapid and precise alignment of raw reads against redundant databases with KMA. BMC Bioinformatics 2018; 19:307.

License
=======

Copyright (c) 2014, Ole Lund, Technical University of Denmark
All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.