Alphafold compute setup
=======================

Overview
--------

Alphafold requires a customised compute environment to run. The machine
needs a GPU and access to a 2.2 TB reference data store.

This document describes the compute environment required to run
Alphafold, and the Galaxy job destination settings needed to run the
wrapper.

For full details on Alphafold requirements, see
https://github.com/deepmind/alphafold.

HARDWARE
~~~~~~~~

The machine is recommended to have the following specs:

- 12 cores
- 80 GB RAM
- 2.5 TB storage
- a fast Nvidia GPU

As a minimum, the Nvidia GPU must have 8 GB of RAM. It also requires
**unified memory** to be switched on. Unified memory is usually enabled
by default, but some HPC systems turn it off so that the GPU can be
shared between multiple concurrent jobs.

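If you want to confirm the GPU memory before going further, ``nvidia-smi``
can report it directly. This is only a convenience check, and assumes the
Nvidia driver is already installed on the host:

::

   # total memory should be at least 8192 MiB (8 GB)
   nvidia-smi --query-gpu=name,memory.total --format=csv
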
ENVIRONMENT
~~~~~~~~~~~

This wrapper runs Alphafold as a Singularity container. The following
software is needed:

- `Singularity <https://sylabs.io/guides/3.0/user-guide/installation.html>`_
- `NVIDIA Container Toolkit <https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html>`_

As Alphafold uses an Nvidia GPU, the NVIDIA Container Toolkit is needed.
This makes the GPU available inside the running Singularity container.

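As a quick sanity check before testing the container integration below, you
can confirm that the host-side tools are installed and on ``PATH``:

::

   singularity --version   # Singularity itself
   nvidia-smi              # host Nvidia driver
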
To check that everything has been set up correctly, run the following:

::

   singularity run --nv docker://nvidia/cuda:11.0-base nvidia-smi

If you can see something similar to this output (details depend on your
GPU), it has been set up correctly.

::

   +-----------------------------------------------------------------------------+
   | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
   |-------------------------------+----------------------+----------------------+
   | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
   | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
   |                               |                      |               MIG M. |
   |===============================+======================+======================|
   |   0  Tesla T4            Off  | 00000000:00:05.0 Off |                    0 |
   | N/A   49C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
   |                               |                      |                  N/A |
   +-------------------------------+----------------------+----------------------+

   +-----------------------------------------------------------------------------+
   | Processes:                                                                  |
   |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
   |        ID   ID                                                   Usage      |
   |=============================================================================|
   |  No running processes found                                                 |
   +-----------------------------------------------------------------------------+

REFERENCE DATA
~~~~~~~~~~~~~~

Alphafold needs reference data to run. The wrapper expects this data to
be present at ``/data/alphafold_databases``. To download it, run the
following commands from the tool directory.

::

   # make the target folder if needed
   mkdir -p /data/alphafold_databases

   # download ref data
   bash scripts/download_all_data.sh /data/alphafold_databases

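The full database set occupies roughly 2.2 TB once extracted, so it is worth
confirming there is enough free space on the target filesystem before starting
the download. A minimal check:

::

   # free space on the filesystem holding /data should comfortably exceed
   # 2.5 TB (databases plus temporary download files)
   df -h /data
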
This will install the reference data to ``/data/alphafold_databases``.
To check this has worked, ensure the final folder structure is as
follows:

::

   data/alphafold_databases
   ├── bfd
   │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
   │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffindex
   │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffdata
   │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex
   │   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata
   │   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex
   ├── mgnify
   │   └── mgy_clusters_2018_12.fa
   ├── params
   │   ├── LICENSE
   │   ├── params_model_1.npz
   │   ├── params_model_1_ptm.npz
   │   ├── params_model_2.npz
   │   ├── params_model_2_ptm.npz
   │   ├── params_model_3.npz
   │   ├── params_model_3_ptm.npz
   │   ├── params_model_4.npz
   │   ├── params_model_4_ptm.npz
   │   ├── params_model_5.npz
   │   └── params_model_5_ptm.npz
   ├── pdb70
   │   ├── md5sum
   │   ├── pdb70_a3m.ffdata
   │   ├── pdb70_a3m.ffindex
   │   ├── pdb70_clu.tsv
   │   ├── pdb70_cs219.ffdata
   │   ├── pdb70_cs219.ffindex
   │   ├── pdb70_hhm.ffdata
   │   ├── pdb70_hhm.ffindex
   │   └── pdb_filter.dat
   ├── pdb_mmcif
   │   ├── mmcif_files
   │   └── obsolete.dat
   ├── uniclust30
   │   └── uniclust30_2018_08
   └── uniref90
       └── uniref90.fasta

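For a rough confirmation that each database downloaded and extracted fully,
you can compare the top-level folder sizes (the exact figures vary between
database releases):

::

   du -sh /data/alphafold_databases/*
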
JOB DESTINATION
~~~~~~~~~~~~~~~

Alphafold needs a custom Singularity job destination to run. The
destination needs to be configured for Singularity, and some extra
Singularity parameters need to be set, as shown below.

Specify the job runner. For example, a local runner:

::

   <plugin id="alphafold_runner" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>

Customise the job destination with the required Singularity settings.
The settings below are mandatory, but you may include other settings as
needed.

::

   <destination id="alphafold" runner="alphafold_runner">
       <param id="dependency_resolution">'none'</param>
       <param id="singularity_enabled">true</param>
       <param id="singularity_run_extra_arguments">--nv</param>
       <param id="singularity_volumes">"$job_directory:ro,$tool_directory:ro,$job_directory/outputs:rw,$working_directory:rw,/data/alphafold_databases:/data:ro"</param>
   </destination>

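Before wiring this into Galaxy, it can be useful to confirm that the two
non-standard parts of the destination (the ``--nv`` flag and the reference
data bind mount) work on the host. A minimal manual check, reusing the same
CUDA image as above:

::

   # should list the database folders (bfd, params, pdb70, ...) and then
   # print the GPU table, confirming the bind mount and GPU passthrough
   singularity exec --nv -B /data/alphafold_databases:/data:ro \
       docker://nvidia/cuda:11.0-base sh -c 'ls /data && nvidia-smi'
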
Closing
~~~~~~~

If you are experiencing technical issues, feel free to write to
help@genome.edu.au. We may be able to provide advice on setting up
Alphafold on your compute environment.