comparison README.md @ 0:7ae9d78b06f5 draft
"planemo upload for repository https://github.com/usegalaxy-au/galaxy-local-tools commit 7b79778448363aa8c9b14604337e81009e461bd2-dirty"
author | galaxy-australia |
---|---|
date | Fri, 28 Jan 2022 04:56:29 +0000 |
parents | |
children | 6c92e000d684 |
# Alphafold compute setup

## Overview

Alphafold requires a customised compute environment to run. The machine needs a GPU and access to a 2.2 TB reference data store.

This document describes the compute environment required to run Alphafold, and the Galaxy job destination settings needed to run the wrapper.

For full details on Alphafold requirements, see https://github.com/deepmind/alphafold.

<br>

### HARDWARE

We recommend a machine with the following specifications:
- 12 cores
- 80 GB RAM
- 2.5 TB storage
- A fast Nvidia GPU.

As a minimum, the Nvidia GPU must have 8 GB of memory. It also requires ***unified memory*** to be switched on. <br>
Unified memory is usually enabled by default, but some HPC systems turn it off so that the GPU can be shared between multiple concurrent jobs.
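
The recommendations above can be turned into a rough preflight check. The sketch below is only an illustration and is not part of the wrapper: the function names `check_min` and `hardware_preflight` are hypothetical, it assumes a Linux host with GNU coreutils, and it cannot tell whether unified memory is enabled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a measured integer against a recommended minimum.
# Prints one status line and returns non-zero when the value is too low.
check_min() {
    local label="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "OK   $label: $actual (recommended >= $required)"
    else
        echo "LOW  $label: $actual (recommended >= $required)"
        return 1
    fi
}

# Hypothetical helper: measure this host and check it against the figures above.
# $1 is the directory that will hold the reference data (assumed default: /data).
hardware_preflight() {
    local data_dir="${1:-/data}" status=0
    local cores mem_gb disk_gb gpu_mib

    cores=$(nproc)
    # MemTotal is reported in KiB; convert to decimal GB.
    mem_gb=$(awk '/^MemTotal:/ {print int($2 * 1024 / 1e9)}' /proc/meminfo)
    # Total size of the filesystem holding data_dir, in decimal GB.
    disk_gb=$(df --block-size=GB --output=size "$data_dir" | tail -n 1 | tr -dc '0-9')

    check_min "CPU cores" "$cores" 12 || status=1
    check_min "RAM (GB)" "$mem_gb" 80 || status=1
    check_min "Storage (GB)" "$disk_gb" 2500 || status=1

    if command -v nvidia-smi >/dev/null 2>&1; then
        # Total memory of the first GPU in MiB; 8 GB (8e9 bytes) is about 7629 MiB.
        gpu_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
        check_min "GPU memory (MiB)" "$gpu_mib" 7629 || status=1
    else
        echo "LOW  nvidia-smi not found: no Nvidia driver detected"
        status=1
    fi
    return "$status"
}
```

Calling `hardware_preflight /data` prints one line per check and returns a non-zero status if any value falls below the recommendation. Treat the result as advisory, since reported sizes vary slightly with how the system rounds them.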

<br>

### ENVIRONMENT

This wrapper runs Alphafold as a singularity container. The following software is required:

- [Singularity](https://sylabs.io/guides/3.0/user-guide/installation.html)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)

Because Alphafold uses an Nvidia GPU, the NVIDIA Container Toolkit is needed to make the GPU available inside the running singularity container.

To check that everything has been set up correctly, run the following:

```
singularity run --nv docker://nvidia/cuda:11.0-base nvidia-smi
```

If you see output similar to the following (details depend on your GPU), the environment has been set up correctly.

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:05.0 Off |                    0 |
| N/A   49C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
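
For unattended setups, the same check can be scripted. The sketch below is one possible approach rather than part of the wrapper: both function names are hypothetical, and it assumes that a working setup prints the `NVIDIA-SMI` banner line with a driver version, as in the sample output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if stdin looks like nvidia-smi output,
# i.e. it contains the banner line with the driver version.
looks_like_nvidia_smi() {
    grep -q 'NVIDIA-SMI.*Driver Version'
}

# Hypothetical helper: run the check command from this section inside a
# container and report whether the GPU was visible.
gpu_visible_in_container() {
    if singularity run --nv docker://nvidia/cuda:11.0-base nvidia-smi 2>/dev/null \
            | looks_like_nvidia_smi; then
        echo "GPU is visible inside the container"
    else
        echo "GPU is NOT visible inside the container" >&2
        return 1
    fi
}
```

Calling `gpu_visible_in_container` returns a non-zero status when the banner is missing, which makes it usable as a gate in a provisioning script.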


<br>

### REFERENCE DATA

Alphafold needs reference data to run. The wrapper expects this data to be present at `/data/alphafold_databases`. <br>
To download it, run the following commands from the tool directory.

```
# create the target folder if needed (-p also creates /data and does nothing if the folder exists)
mkdir -p /data/alphafold_databases

# download the reference data
bash scripts/download_all_data.sh /data/alphafold_databases
```

This downloads the reference data to `/data/alphafold_databases`. To check that it has worked, ensure the final folder structure is as follows:

```
data/alphafold_databases
├── bfd
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffindex
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffdata
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata
│   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex
├── mgnify
│   └── mgy_clusters_2018_12.fa
├── params
│   ├── LICENSE
│   ├── params_model_1.npz
│   ├── params_model_1_ptm.npz
│   ├── params_model_2.npz
│   ├── params_model_2_ptm.npz
│   ├── params_model_3.npz
│   ├── params_model_3_ptm.npz
│   ├── params_model_4.npz
│   ├── params_model_4_ptm.npz
│   ├── params_model_5.npz
│   └── params_model_5_ptm.npz
├── pdb70
│   ├── md5sum
│   ├── pdb70_a3m.ffdata
│   ├── pdb70_a3m.ffindex
│   ├── pdb70_clu.tsv
│   ├── pdb70_cs219.ffdata
│   ├── pdb70_cs219.ffindex
│   ├── pdb70_hhm.ffdata
│   ├── pdb70_hhm.ffindex
│   └── pdb_filter.dat
├── pdb_mmcif
│   ├── mmcif_files
│   └── obsolete.dat
├── uniclust30
│   └── uniclust30_2018_08
└── uniref90
    └── uniref90.fasta
```
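
Comparing the tree by eye is error-prone, so a small script can do it. The sketch below is illustrative only: the function name `missing_alphafold_dbs` is hypothetical, and it checks one representative entry per database from the listing above, not every file and not file integrity.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every expected entry that is absent under the
# given database root. Prints nothing when all entries are present.
missing_alphafold_dbs() {
    local root="$1" entry
    for entry in \
        bfd \
        mgnify/mgy_clusters_2018_12.fa \
        params \
        pdb70 \
        pdb_mmcif/mmcif_files \
        pdb_mmcif/obsolete.dat \
        uniclust30/uniclust30_2018_08 \
        uniref90/uniref90.fasta; do
        [ -e "$root/$entry" ] || echo "$entry"
    done
}
```

Running `missing_alphafold_dbs /data/alphafold_databases` after the download should print nothing. Any path it prints points to a database that is missing or incomplete.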


<br>

### JOB DESTINATION

Alphafold needs a custom singularity job destination to run.
The destination must have singularity enabled, with the
extra singularity parameters shown below.

First, specify the job runner. For example, a local runner:

```
<plugin id="alphafold_runner" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
```

Then customise the job destination with the required singularity settings. <br>
The settings below are mandatory, but you may include other settings as needed.

```
<destination id="alphafold" runner="alphafold_runner">
    <param id="dependency_resolution">'none'</param>
    <param id="singularity_enabled">true</param>
    <!-- --nv exposes the host's Nvidia GPU inside the container -->
    <param id="singularity_run_extra_arguments">--nv</param>
    <!-- the last entry mounts the reference data read-only at /data inside the container -->
    <param id="singularity_volumes">"$job_directory:ro,$tool_directory:ro,$job_directory/outputs:rw,$working_directory:rw,/data/alphafold_databases:/data:ro"</param>
</destination>
```
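
Because all of these settings are mandatory, it can help to confirm that none was dropped when editing the job configuration. The sketch below is a rough aid, not a validator: the function name `missing_destination_settings` is hypothetical, and it does a plain-text search rather than parsing the XML, so it will miss settings that are formatted differently from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each mandatory setting from the destination above
# that cannot be found in the given job configuration file.
# This is a plain-text search, so it does not check which destination a
# setting belongs to.
missing_destination_settings() {
    local conf="$1" needle
    for needle in \
        'id="dependency_resolution"' \
        'id="singularity_enabled">true<' \
        'id="singularity_run_extra_arguments">--nv<' \
        '/data/alphafold_databases:/data:ro'; do
        grep -qF -- "$needle" "$conf" || echo "$needle"
    done
}
```

For example, `missing_destination_settings config/job_conf.xml` prints nothing when every setting is present. The file path here is an assumption, so substitute the location of your own job configuration.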

<br>

### Closing

If you experience technical issues, feel free to write to help@genome.edu.au. We may be able to advise on setting up Alphafold in your compute environment.