planemo upload for repository https://github.com/usegalaxy-au/galaxy-local-tools commit 7b79778448363aa8c9b14604337e81009e461bd2-dirty

author galaxy-australia
date Fri, 28 Jan 2022 04:56:29 +0000

# Alphafold compute setup

## Overview

Alphafold requires a customised compute environment to run. The machine needs an Nvidia GPU and access to a 2.2 TB reference data store.

This document details the compute environment required to run Alphafold, and the Galaxy job destination settings needed to run the wrapper.

For full details on Alphafold requirements, see https://github.com/deepmind/alphafold.

<br>

### HARDWARE

The recommended machine specs are:
- 12 cores
- 80 GB RAM
- 2.5 TB storage
- A fast Nvidia GPU

As a minimum, the Nvidia GPU must have 8 GB of memory. It also requires ***unified memory*** to be switched on. <br>
Unified memory is usually enabled by default, but some HPC systems disable it so that the GPU can be shared between multiple concurrent jobs.

<br>

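Before going further, it can help to confirm that the GPU is visible and meets the 8 GB memory minimum. A minimal sketch, assuming the Nvidia driver (which provides `nvidia-smi`) is installed; the `gpu_check` helper name is illustrative:

```shell
# Sketch: report the GPU name and total memory, to confirm the 8 GB minimum.
# Assumes the Nvidia driver (which provides nvidia-smi) is installed.
gpu_check() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # prints one line per GPU: name and total memory
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
    else
        echo "nvidia-smi not found - install the Nvidia driver first"
    fi
}

gpu_check
```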
### ENVIRONMENT

This wrapper runs Alphafold as a Singularity container. The following software is required:

- [Singularity](https://sylabs.io/guides/3.0/user-guide/installation.html)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)

As Alphafold uses an Nvidia GPU, the NVIDIA Container Toolkit is needed to make the GPU available inside the running Singularity container.

To check that everything has been set up correctly, run the following:

```
singularity run --nv docker://nvidia/cuda:11.0-base nvidia-smi
```

If you can see something similar to this output (details depend on your GPU), it has been set up correctly.

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:05.0 Off |                    0 |
| N/A   49C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```


<br>

### REFERENCE DATA

Alphafold needs reference data to run. The wrapper expects this data to be present at `/data/alphafold_databases`. <br>
To download it, run the following shell commands from the tool directory.

```
# make folders if needed
mkdir -p /data/alphafold_databases

# download ref data
bash scripts/download_all_data.sh /data/alphafold_databases
```

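The full download is roughly 2.2 TB, so it is worth confirming the target volume has enough free space before starting. A minimal sketch; the `check_space` helper and the 2500 GB threshold are illustrative, not part of the wrapper:

```shell
# Sketch: warn if the download target lacks free space.
# Usage: check_space DIR REQUIRED_GB
check_space() {
    # df -P gives portable output; available space is field 4, in 1K blocks
    avail_kb=$(df -P "$1" | awk 'NR==2 {print $4}')
    avail_gb=$((avail_kb / 1024 / 1024))
    if [ "$avail_gb" -lt "$2" ]; then
        echo "WARNING: only ${avail_gb} GB free at $1 (need ~$2 GB)"
    else
        echo "OK: ${avail_gb} GB free at $1"
    fi
}

# e.g. check_space /data 2500
```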
This will install the reference data to `/data/alphafold_databases`. To check this has worked, ensure the final folder structure is as follows:

```
data/alphafold_databases
├── bfd
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffindex
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffdata
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata
│   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex
├── mgnify
│   └── mgy_clusters_2018_12.fa
├── params
│   ├── LICENSE
│   ├── params_model_1.npz
│   ├── params_model_1_ptm.npz
│   ├── params_model_2.npz
│   ├── params_model_2_ptm.npz
│   ├── params_model_3.npz
│   ├── params_model_3_ptm.npz
│   ├── params_model_4.npz
│   ├── params_model_4_ptm.npz
│   ├── params_model_5.npz
│   └── params_model_5_ptm.npz
├── pdb70
│   ├── md5sum
│   ├── pdb70_a3m.ffdata
│   ├── pdb70_a3m.ffindex
│   ├── pdb70_clu.tsv
│   ├── pdb70_cs219.ffdata
│   ├── pdb70_cs219.ffindex
│   ├── pdb70_hhm.ffdata
│   ├── pdb70_hhm.ffindex
│   └── pdb_filter.dat
├── pdb_mmcif
│   ├── mmcif_files
│   └── obsolete.dat
├── uniclust30
│   └── uniclust30_2018_08
└── uniref90
    └── uniref90.fasta
```
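The folder check above can be scripted. A minimal sketch; the `check_databases` helper is illustrative, not part of the wrapper:

```shell
# Sketch: verify the expected top-level database folders exist.
# Usage: check_databases DB_ROOT
check_databases() {
    for d in bfd mgnify params pdb70 pdb_mmcif uniclust30 uniref90; do
        if [ -d "$1/$d" ]; then
            echo "ok: $d"
        else
            echo "MISSING: $d"
        fi
    done
}

# e.g. check_databases /data/alphafold_databases
```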

<br>

### JOB DESTINATION

Alphafold needs a custom Singularity job destination to run. The destination must be configured for Singularity, with some extra Singularity parameters set as shown below.

First, specify the job runner. For example, a local runner:

```
<plugin id="alphafold_runner" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
```

Next, customise the job destination with the required Singularity settings. <br>
The settings below are mandatory, but you may include other settings as needed.

```
<destination id="alphafold" runner="alphafold_runner">
    <param id="dependency_resolution">'none'</param>
    <param id="singularity_enabled">true</param>
    <param id="singularity_run_extra_arguments">--nv</param>
    <param id="singularity_volumes">"$job_directory:ro,$tool_directory:ro,$job_directory/outputs:rw,$working_directory:rw,/data/alphafold_databases:/data:ro"</param>
</destination>
```
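Finally, jobs for the tool must be routed to this destination in `job_conf.xml`. A sketch; the tool `id` shown is an assumption and must match the id of the installed wrapper:

```xml
<tools>
    <!-- assumed tool id; check the wrapper's XML for the actual id -->
    <tool id="alphafold" destination="alphafold"/>
</tools>
```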

<br>

### Closing

If you are experiencing technical issues, feel free to write to help@genome.edu.au. We may be able to advise on setting up Alphafold in your compute environment.