annotate scripts/outputs.py @ 23:2891385d6ace draft default tip

planemo upload for repository https://github.com/usegalaxy-au/tools-au commit b347c6ccc82b14fcbff360b3357050d1d43e3ef5-dirty
author galaxy-australia
date Wed, 16 Apr 2025 05:46:58 +0000
parents 3f188450ca4f
children
"""Generate additional output files not produced by AlphaFold.

Currently this includes:
- model confidence scores
- per-residue confidence scores (pLDDTs - optional output)
- model_*.pkl files renamed with rank order

N.B. This script has broken between AlphaFold versions because the output
directory structure changes slightly across minor versions. It will likely
need updating with future releases of AlphaFold.

This code is more complex than you might expect because the output files
'move around' considerably, depending on run parameters. You will see that
several output paths are determined dynamically.
"""

import argparse
import json
import numpy as np
import os
import pickle as pk
import shutil
import zipfile
from matplotlib import pyplot as plt
from pathlib import Path
from typing import Dict, List

OUTPUT_DIR = 'extra'
OUTPUTS = {
    'model_pkl': OUTPUT_DIR + '/ranked_{rank}.pkl',
    'model_pae': OUTPUT_DIR + '/pae_ranked_{rank}.csv',
    'model_plot': OUTPUT_DIR + '/ranked_{rank}.png',
    'model_confidence_scores': OUTPUT_DIR + '/model_confidence_scores.tsv',
    'plddts': OUTPUT_DIR + '/plddts.tsv',
    'relax': OUTPUT_DIR + '/relax_metrics_ranked.json',
    'msa': OUTPUT_DIR + '/msa_coverage.png',
}

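The `{rank}` placeholders in OUTPUTS are presumably filled with `str.format` when each ranked model is written. The following is a minimal sketch of that idea, not code from this script: the helper `output_path` and the directory names in the usage line are hypothetical.

```python
from pathlib import Path

# Copies of two templates from OUTPUTS, so the sketch is self-contained.
OUTPUT_DIR = 'extra'
OUTPUTS = {
    'model_pkl': OUTPUT_DIR + '/ranked_{rank}.pkl',
    'plddts': OUTPUT_DIR + '/plddts.tsv',
}


def output_path(workdir: Path, key: str, rank: int = 0) -> Path:
    """Hypothetical helper: resolve an OUTPUTS template under workdir."""
    # str.format ignores unused keyword arguments, so templates without
    # a {rank} placeholder pass through unchanged.
    return workdir / OUTPUTS[key].format(rank=rank)


# Usage with a made-up working directory.
print(output_path(Path('af_out'), 'model_pkl', rank=2))
```

Keeping the templates relative to OUTPUT_DIR means a single `workdir` argument is enough to relocate every extra output.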
HTML_PATH = Path(__file__).parent / "alphafold.html"
HTML_OUTPUT_FILENAME = 'alphafold.html'
HTML_BUTTON_ATTR = 'class="btn" id="btn-ranked_{rank}"'
HTML_BUTTON_ATTR_DISABLED = (
    'class="btn disabled" id="btn-ranked_{rank}" disabled')


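One plausible use of the two attribute templates is to enable a button only for ranks that have a model. How the script actually substitutes them into alphafold.html is not shown in this section, so the function `button_attr` and its `available` argument are hypothetical.

```python
# Copies of the two templates, so the sketch is self-contained.
HTML_BUTTON_ATTR = 'class="btn" id="btn-ranked_{rank}"'
HTML_BUTTON_ATTR_DISABLED = (
    'class="btn disabled" id="btn-ranked_{rank}" disabled')


def button_attr(rank: int, available: bool) -> str:
    """Hypothetical helper: pick the attribute string for one button."""
    template = HTML_BUTTON_ATTR if available else HTML_BUTTON_ATTR_DISABLED
    return template.format(rank=rank)


# Usage: ranks 0 and 1 exist, rank 2 does not (made-up situation).
for rank, available in [(0, True), (1, True), (2, False)]:
    print(button_attr(rank, available))
```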
class PLDDT_KEY:
    """Dict keys for accessing confidence data from JSON/pkl files.
    Changes depending on which model PRESET was used.
    """
    monomer = 'plddts'
    multimer = 'iptm+ptm'


class PRESETS:
    monomer = 'monomer'
    monomer_ptm = 'monomer_ptm'
    multimer = 'multimer'


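The two key names select which block of scores is read, depending on the preset. The sketch below illustrates that lookup on a plain dict. The helper `find_model_key`, the mapping `PLDDT_KEYS` and the sample data are all made up for illustration and only assume that the scores are keyed by names beginning with `model_<index>_`.

```python
from typing import Dict

# Preset name -> key of the score block (values taken from PLDDT_KEY).
PLDDT_KEYS = {
    'monomer': 'plddts',
    'monomer_ptm': 'plddts',
    'multimer': 'iptm+ptm',
}


def find_model_key(ranking: Dict, preset: str, ix: int) -> str:
    """Hypothetical helper: return the score key for model index ix."""
    scores = ranking[PLDDT_KEYS[preset]]
    for key in scores:
        # The trailing underscore keeps model_1_ from matching model_10_.
        if key.startswith(f'model_{ix}_'):
            return key
    raise KeyError(f'No key for model index {ix}')


# Made-up data shaped like a dict loaded from JSON.
ranking = {
    'iptm+ptm': {
        'model_1_multimer_v3_pred_0': 0.81,
        'model_2_multimer_v3_pred_0': 0.77,
    },
}
print(find_model_key(ranking, 'multimer', 2))
```

Matching on a prefix rather than an exact name is what lets the lookup tolerate suffix changes between releases.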
class Settings:
    """Parse and store settings/config."""
    def __init__(self):
        self.workdir = None
        self.output_confidence_scores = True
        self.output_residue_scores = False
        self.parse()

    def parse(self) -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument(
            "workdir",
            help="alphafold output directory",
            type=str,
        )
        parser.add_argument(
            "-s",
            "--confidence-scores",
            help="output per-residue confidence scores (pLDDTs)",
            action="store_true",
        )
        parser.add_argument(
            "--pkl",
            help="rename model pkl outputs with rank order",
            action="store_true",
        )
        parser.add_argument(
            "--pae",
            help="extract PAE from pkl files to CSV format",
            action="store_true",
        )
        parser.add_argument(
            "--plot",
            help="Plot pLDDT and PAE for each model",
            action="store_true",
        )
        parser.add_argument(
            "--plot-msa",
            help="Plot multiple-sequence alignment coverage as a heatmap",
            action="store_true",
        )
        parser.add_argument(
            "--msa",
            help="Collect multiple-sequence alignments as ZIP archives",
            action="store_true",
        )
        parser.add_argument(
            "--msa_only",
            help="Alphafold generated MSA files only - skip all other outputs",
            action="store_true",
        )
        args = parser.parse_args()
        self.workdir = Path(args.workdir.rstrip('/'))
        self.output_residue_scores = args.confidence_scores
        self.output_model_pkls = args.pkl
        self.output_model_plots = args.plot
        self.output_pae = args.pae
        self.plot_msa = args.plot_msa
        self.collect_msas = args.msa
        self.model_preset = self._sniff_model_preset()
        self.is_multimer = self.model_preset == PRESETS.multimer
        self.output_dir = self.workdir / OUTPUT_DIR
        self.msa_only = args.msa_only
        os.makedirs(self.output_dir, exist_ok=True)

    def _sniff_model_preset(self) -> str:
        """Check if the run was multimer or monomer."""
        for path in self.workdir.glob('*.pkl'):
            if 'feature' not in path.name:
                if '_multimer_' in path.name:
                    return PRESETS.multimer
                if '_ptm_' in path.name:
                    return PRESETS.monomer_ptm
                return PRESETS.monomer
        return PRESETS.monomer


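The preset detection in `_sniff_model_preset` depends only on file names, so it can be exercised without a real output directory. The sketch below restates the same rule as a standalone function. `sniff_preset` is a hypothetical name and the file names in the usage lines are illustrative, not taken from an actual run.

```python
from typing import Iterable


def sniff_preset(pkl_names: Iterable[str]) -> str:
    """Infer the model preset from the names of the .pkl files."""
    for name in pkl_names:
        if 'feature' in name:
            continue  # the features pickle carries no preset marker
        if '_multimer_' in name:
            return 'multimer'
        if '_ptm_' in name:
            return 'monomer_ptm'
        # The first non-feature file decides the outcome.
        return 'monomer'
    # No model pickles at all: fall back to monomer.
    return 'monomer'


# Illustrative file names only.
print(sniff_preset(['features.pkl', 'result_model_1_multimer_v3_pred_0.pkl']))
print(sniff_preset(['result_model_1_ptm_pred_0.pkl']))
print(sniff_preset(['features.pkl']))
```

Because the first non-feature file decides the result, the rule assumes every model pickle in a run carries the same preset marker.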
class ExecutionContext:
    """Collect file paths etc."""
    def __init__(self, settings: Settings):
        self.settings = settings
        if settings.is_multimer:
            self.plddt_key = PLDDT_KEY.multimer
        else:
            self.plddt_key = PLDDT_KEY.monomer

    def get_model_key(self, ix: int) -> str:
        """Return json key for model index.

        The key format changed between minor AlphaFold versions so this
        function determines the correct key.
        """
        with open(self.ranking_debug) as f:
            data = json.load(f)
        model_keys = list(data[self.plddt_key].keys())
        for k in model_keys:
            if k.startswith(f"model_{ix}_"):
                return k
        # Raise (rather than return) the error so callers never receive an
        # exception object in place of a key.
        raise KeyError(
            f'Could not find key for index={ix} in'
            ' ranking_debug.json')

    @property
    def ranking_debug(self) -> str:
        return self.settings.workdir / 'ranking_debug.json'

    @property
    def relax_metrics(self) -> str:
        return self.settings.workdir / 'relax_metrics.json'

    @property
    def relax_metrics_ranked(self) -> str:
        return self.settings.workdir / 'relax_metrics_ranked.json'

    @property
    def model_pkl_paths(self) -> List[str]:
        return sorted([
            self.settings.workdir / f
            for f in os.listdir(self.settings.workdir)
            if f.startswith('result_model_') and f.endswith('.pkl')
        ])


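The key lookup in get_model_key can be sketched in isolation as below. This is a minimal illustration only: the function name find_model_key, the sample keys and the scores are assumptions and are not taken from AlphaFold output. Only the prefix match on model_{ix}_ mirrors the method above.

```python
from typing import Dict


def find_model_key(scores: Dict[str, float], ix: int) -> str:
    """Return the first key that starts with 'model_<ix>_'."""
    for key in scores:
        if key.startswith(f"model_{ix}_"):
            return key
    raise KeyError(f"Could not find key for index={ix}")


# Hypothetical score tables using two different key naming styles.
old_style = {"model_1_ptm": 91.2, "model_2_ptm": 88.7}
new_style = {"model_1_pred_0": 91.2, "model_2_pred_0": 88.7}

key_old = find_model_key(old_style, 2)
key_new = find_model_key(new_style, 2)
```

Matching on a prefix rather than an exact name is what lets the same lookup work for both naming styles.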
class ResultModelPrediction:
    """Load and manipulate data from result_model_*.pkl files."""
    def __init__(self, path: str, context: ExecutionContext):
        self.context = context
        self.path = path
        self.name = os.path.basename(path).replace('result_', '').split('.')[0]
        # Use a separate name for the file handle so `path` is not shadowed.
        with open(path, 'rb') as f:
            self.data = pk.load(f)

    @property
    def plddts(self) -> List[float]:
        """Return pLDDT scores for each residue."""
        return list(self.data['plddt'])


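The model name is derived from the pickle file name. The sketch below repeats that string handling on its own; the helper name model_name_from_path and the example path are hypothetical.

```python
import os


def model_name_from_path(path: str) -> str:
    """Strip the directory, the 'result_' prefix and the file extension."""
    return os.path.basename(path).replace('result_', '').split('.')[0]


# Hypothetical path to a result pickle.
name = model_name_from_path('/tmp/workdir/result_model_3_pred_0.pkl')
```

Because the extension is removed with split('.'), a file name containing additional dots would be truncated at the first one.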
class ResultRanking:
    """Load and manipulate data from ranking_debug.json file."""

    def __init__(self, context: ExecutionContext):
        self.path = context.ranking_debug
        self.context = context
        with open(self.path, 'r') as f:
            self.data = json.load(f)

    @property
    def order(self) -> List[str]:
        """Return ordered list of model indexes."""
        return self.data['order']

    def get_plddt_for_rank(self, rank: int) -> float:
        """Get pLDDT score for the model at the given 1-indexed rank."""
        return self.data[self.context.plddt_key][self.data['order'][rank - 1]]

    def get_rank_for_model(self, model_name: str) -> int:
        """Return 0-indexed rank for given model name.

        Model names are expressed in result_model_*.pkl file names.
        """
        return self.data['order'].index(model_name)


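The two lookups in ResultRanking use different index bases: get_plddt_for_rank takes a 1-indexed rank, while get_rank_for_model returns a 0-indexed one. A small sketch of that difference follows, with made-up scores and model names standing in for the contents of ranking_debug.json.

```python
# Hypothetical stand-ins for the score table and 'order' list.
plddts = {
    "model_1_pred_0": 80.5,
    "model_2_pred_0": 92.1,
    "model_3_pred_0": 85.0,
}
order = ["model_2_pred_0", "model_3_pred_0", "model_1_pred_0"]


def plddt_for_rank(rank: int) -> float:
    """Look up the score of the model at a 1-indexed rank."""
    return plddts[order[rank - 1]]


def rank_for_model(model_name: str) -> int:
    """Return the 0-indexed position of a model in the ranking."""
    return order.index(model_name)


best_score = plddt_for_rank(1)
rank_of_model_1 = rank_for_model("model_1_pred_0")
```

Callers that combine the two need to add or subtract one when passing a value from one lookup to the other.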
def write_confidence_scores(ranking: ResultRanking, context: ExecutionContext):
    """Write per-model confidence scores."""
    outfile = context.settings.workdir / OUTPUTS['model_confidence_scores']
    scores: Dict[int, list] = {}
    header = ['model', context.plddt_key]

    for i, path in enumerate(context.model_pkl_paths):
        # Single digit that follows 'model_' in the file name.
        rank = int(path.name.split('model_')[-1][0])
        scores_ls = [ranking.get_plddt_for_rank(rank)]
        with open(path, 'rb') as f:
            data = pk.load(f)
        if 'ptm' in data:
            scores_ls.append(data['ptm'])
            if i == 0:
                header += ['ptm']
        if 'iptm' in data:
            scores_ls.append(data['iptm'])
            if i == 0:
                header += ['iptm']
        scores[rank] = scores_ls

    with open(outfile, 'w') as f:
        f.write('\t'.join(header) + '\n')
        for rank, score_ls in scores.items():
            row = [f"ranked_{rank - 1}"] + [str(x) for x in score_ls]
            f.write('\t'.join(row) + '\n')


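The table layout produced by write_confidence_scores, with the ptm and iptm columns added only when those keys are present, can be sketched as follows. The input dictionaries, the plddt_mean key and the "plddts" column name are assumptions for illustration, and the output goes to an in-memory buffer rather than a file.

```python
import io

# Hypothetical per-model data, already in rank order.
models = [
    {"plddt_mean": 92.1, "ptm": 0.81, "iptm": 0.77},
    {"plddt_mean": 85.0, "ptm": 0.74, "iptm": 0.69},
]

header = ["model", "plddts"]
rows = []
for i, data in enumerate(models):
    row = [f"ranked_{i}", str(data["plddt_mean"])]
    for key in ("ptm", "iptm"):
        if key in data:
            row.append(str(data[key]))
            # Extend the header once, based on the first model only.
            if i == 0:
                header.append(key)
    rows.append(row)

buffer = io.StringIO()
buffer.write("\t".join(header) + "\n")
for row in rows:
    buffer.write("\t".join(row) + "\n")
table = buffer.getvalue()
```

Deriving the header from the first model assumes every model carries the same set of optional keys.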
def write_per_residue_scores(
    ranking: ResultRanking,
    context: ExecutionContext,
):
    """Write per-residue plddts for each model.

    A row of plddt values is written for each model in tabular format.
    """
    model_plddts = {}
    for i, path in enumerate(context.model_pkl_paths):
        model = ResultModelPrediction(path, context)
        rank = ranking.get_rank_for_model(model.name)
        model_plddts[rank] = model.plddts

    path = context.settings.workdir / OUTPUTS['plddts']
    with open(path, 'w') as f:
        for i in sorted(list(model_plddts.keys())):
            row = [f'ranked_{i}'] + [
                str(x) for x in model_plddts[i]
            ]
            f.write('\t'.join(row) + '\n')


def rename_model_pkls(ranking: ResultRanking, context: ExecutionContext):
    """Copy result_model_*.pkl files to names that include their rank."""
    for path in context.model_pkl_paths:
        model = ResultModelPrediction(path, context)
        rank = ranking.get_rank_for_model(model.name)
        new_path = (
            context.settings.workdir
            / OUTPUTS['model_pkl'].format(rank=rank)
        )
        shutil.copyfile(path, new_path)


f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
287 def extract_pae_to_csv(ranking: ResultRanking, context: ExecutionContext):
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
288 """Extract predicted alignment error matrix from pickle files.
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
289
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
290 Creates a CSV file for each of five ranked models.
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
291 """
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
292 for path in context.model_pkl_paths:
        model = ResultModelPrediction(path, context)
        rank = ranking.get_rank_for_model(model.name)
        with open(path, 'rb') as f:
            data = pk.load(f)
        if 'predicted_aligned_error' not in data:
            print("Skipping PAE output"
                  f" - not found in {path}."
                  " Running with model_preset=monomer?")
            return
        pae = data['predicted_aligned_error']
        out_path = (
            context.settings.workdir
            / OUTPUTS['model_pae'].format(rank=rank)
        )
        with open(out_path, 'w') as f:
            for row in pae:
                f.write(','.join([str(x) for x in row]) + '\n')


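The loop above writes each PAE matrix as plain comma-separated rows. As a minimal sketch of that layout, assuming nothing beyond the standard library, the following round-trips a small made-up matrix; `write_matrix_csv` and `read_matrix_csv` are hypothetical helper names and are not part of this script.

```python
import csv
import tempfile
from pathlib import Path


def write_matrix_csv(matrix, out_path):
    """Write a 2D matrix as comma-separated rows, one matrix row per line."""
    with open(out_path, 'w') as f:
        for row in matrix:
            f.write(','.join(str(x) for x in row) + '\n')


def read_matrix_csv(path):
    """Read the CSV back into a list of lists of floats."""
    with open(path, newline='') as f:
        return [[float(x) for x in row] for row in csv.reader(f)]


# Made-up 2x2 values standing in for a PAE matrix.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / 'pae_demo.csv'
    demo = [[0.5, 12.25], [11.0, 0.75]]
    write_matrix_csv(demo, demo_path)
    roundtrip = read_matrix_csv(demo_path)
```

Because the file has no header row and no index column, a reader can treat row `i`, column `j` directly as the matrix entry `(i, j)`.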
def rekey_relax_metrics(ranking: ResultRanking, context: ExecutionContext):
    """Replace keys in relax_metrics.json with 0-indexed rank."""
    with open(context.relax_metrics) as f:
        data = json.load(f)
    for k in list(data.keys()):
        rank = ranking.get_rank_for_model(k)
        data[f'ranked_{rank}'] = data.pop(k)
    new_path = context.settings.workdir / OUTPUTS['relax']
    with open(new_path, 'w') as f:
        json.dump(data, f)


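The rekeying step can be illustrated in isolation. This is a sketch only: it assumes the ranking is available as a best-first list of model names, which stands in for `ResultRanking.get_rank_for_model`, and the model names and metric values are invented for the example. `rekey_by_rank` is a hypothetical helper, not a function from this script.

```python
def rekey_by_rank(metrics, order):
    """Return a copy of `metrics` keyed by 'ranked_<n>'.

    `order` lists model names best-first, so a model's rank is its index.
    """
    rank_of = {name: i for i, name in enumerate(order)}
    return {
        f'ranked_{rank_of[name]}': value
        for name, value in metrics.items()
    }


# Invented example data: two models and a best-first ordering.
metrics = {
    'model_a': {'violations': 0.0},
    'model_b': {'violations': 2.0},
}
order = ['model_b', 'model_a']
rekeyed = rekey_by_rank(metrics, order)
```

Building a new dictionary avoids mutating the mapping while iterating over it; the original function achieves the same safety by iterating over a `list(...)` snapshot of the keys.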
def plddt_pae_plots(ranking: ResultRanking, context: ExecutionContext):
    """Generate a pLDDT + PAE plot for each model."""
    for path in context.model_pkl_paths:
        num_plots = 2
        model = ResultModelPrediction(path, context)
        rank = ranking.get_rank_for_model(model.name)
        png_path = (
            context.settings.workdir
            / OUTPUTS['model_plot'].format(rank=rank)
        )
        plddts = model.data['plddt']
        if 'predicted_aligned_error' in model.data:
            pae = model.data['predicted_aligned_error']
            max_pae = model.data['max_predicted_aligned_error']
        else:
            num_plots = 1

        plt.figure(figsize=[8 * num_plots, 6])
        plt.subplot(1, num_plots, 1)
        plt.plot(plddts)
        plt.title('Predicted LDDT')
        plt.xlabel('Residue')
        plt.ylabel('pLDDT')

        if num_plots == 2:
            plt.subplot(1, 2, 2)
            plt.imshow(pae, vmin=0., vmax=max_pae, cmap='Greens_r')
            plt.colorbar(fraction=0.046, pad=0.04)
            plt.title('Predicted Aligned Error')
            plt.xlabel('Scored residue')
            plt.ylabel('Aligned residue')

        plt.savefig(png_path)
21
e7f1b552a695 planemo upload for repository https://github.com/usegalaxy-au/tools-au commit 628c9fdcb77489063145a2307b6bb6a450416dd6-dirty
galaxy-australia
parents: 20
diff changeset
        plt.close()


def plot_msa(wdir: Path, dpi: int = 150):
    """Plot MSA as a heatmap."""
    with open(wdir / 'features.pkl', 'rb') as f:
        features = pk.load(f)

    msa = features.get('msa')
    if msa is None:
        print("Could not plot MSA coverage - 'msa' key not found in"
              " features.pkl")
        return
    # Fraction of positions in each sequence identical to the query (row 0)
    seqid = np.array(msa[0] == msa).mean(-1)
    seqid_sort = seqid.argsort()
    # Token 21 is the gap symbol; mask gaps as NaN so they are left blank
    non_gaps = (msa != 21).astype(float)
    non_gaps[non_gaps == 0] = np.nan
    final = non_gaps[seqid_sort] * seqid[seqid_sort, None]

    plt.figure(figsize=(6, 4))
    plt.title("Sequence coverage")
    plt.imshow(final,
               interpolation='nearest', aspect='auto',
               cmap="rainbow_r", vmin=0, vmax=1, origin='lower')
    # Overlay the number of non-gap sequences at each position
    plt.plot((msa != 21).sum(0), color='black')
    plt.xlim(-0.5, msa.shape[1] - 0.5)
    plt.ylim(-0.5, msa.shape[0] - 0.5)
    plt.colorbar(label="Sequence identity to query")
    plt.xlabel("Positions")
    plt.ylabel("Sequences")
    plt.tight_layout()
    plt.savefig(wdir / OUTPUTS['msa'], dpi=dpi)
    plt.close()
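The two quantities that `plot_msa` draws, per-sequence identity to the query and per-position coverage, can be restated without NumPy. The following is a pure-Python sketch on a tiny invented alignment; it assumes, as the function above does, that token 21 marks a gap and that the first row is the query.

```python
GAP = 21  # gap token, as assumed by plot_msa above

# Invented integer-encoded alignment: 3 sequences x 4 positions.
msa = [
    [0, 1, 2, 3],      # query
    [0, 1, GAP, GAP],  # matches the query at 2 of 4 positions
    [5, 1, 2, GAP],    # matches the query at 2 of 4 positions
]
query = msa[0]

# Fraction of positions identical to the query, one value per sequence.
seqid = [
    sum(a == b for a, b in zip(row, query)) / len(query)
    for row in msa
]

# Number of sequences with a non-gap residue at each position.
coverage = [
    sum(row[j] != GAP for row in msa)
    for j in range(len(query))
]

# Row order used for the heatmap: least similar first, query last.
order = sorted(range(len(msa)), key=lambda i: seqid[i])
```

Sorting in ascending identity combined with `origin='lower'` in the original puts the least similar sequences at the bottom of the heatmap and the query at the top.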
23
2891385d6ace planemo upload for repository https://github.com/usegalaxy-au/tools-au commit b347c6ccc82b14fcbff360b3357050d1d43e3ef5-dirty
galaxy-australia
parents: 22
diff changeset
def collect_msas(settings: Settings):
    """Collect MSA files into ZIP archive(s)."""

    def zip_dir(directory: Path, is_multimer: bool, name: str):
        chain_id = directory.with_suffix('.zip').stem
        msa_dir = settings.output_dir / 'msas'
        msa_dir.mkdir(exist_ok=True)
        zip_name = (
            f"MSA-{chain_id}-{name}.zip"
            if is_multimer
            else f"MSA-{name}.zip")
        zip_path = msa_dir / zip_name
        # Store files flat in the archive, without parent directories
        with zipfile.ZipFile(zip_path, 'w') as z:
            for path in directory.glob('*'):
                z.write(path, path.name)

    print("Collecting MSA archives...")
    chain_names = get_input_sequence_ids(
        settings.workdir.parent.parent / 'alphafold.fasta')
    msa_dir = settings.workdir / 'msas'
    # Multimer runs keep one MSA subdirectory per chain (A, B, ...)
    is_multimer = (msa_dir / 'A').exists()
    if is_multimer:
        msa_dirs = sorted([
            path for path in msa_dir.glob('*')
            if path.is_dir()
        ])
        for i, path in enumerate(msa_dirs):
            zip_dir(path, is_multimer, chain_names[i])
    else:
        zip_dir(msa_dir, is_multimer, chain_names[0])


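The archiving step in `zip_dir` relies on the second argument of `ZipFile.write` (the archive name) to drop the directory prefix. A minimal, self-contained sketch of that behaviour follows; `zip_flat` and the file names are hypothetical and chosen only for the demonstration.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_flat(directory: Path, zip_path: Path) -> None:
    """Archive every entry directly inside `directory` under its bare name."""
    with zipfile.ZipFile(zip_path, 'w') as z:
        for path in sorted(directory.glob('*')):
            # arcname=path.name stores 'hits_a.a3m', not 'msas/hits_a.a3m'
            z.write(path, path.name)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'msas'
    src.mkdir()
    (src / 'hits_a.a3m').write_text('>query\nMKV\n')
    (src / 'hits_b.sto').write_text('# STOCKHOLM 1.0\n')
    archive = Path(tmp) / 'MSA-demo.zip'
    zip_flat(src, archive)
    with zipfile.ZipFile(archive) as z:
        names = z.namelist()
```

`ZipFile` stores entries uncompressed by default; passing `compression=zipfile.ZIP_DEFLATED` to the constructor would compress them, which may matter for large alignments.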
def get_input_sequence_ids(fasta_file: Path) -> List[str]:
    """Read headers from the input FASTA file.

    Take the text before the first space in each header as the sequence
    ID and truncate it to 20 characters, appending '...' when truncated.
    """
    headers = []
    for line in fasta_file.read_text().split('\n'):
        if line.startswith('>'):
            seq_id = line[1:].split(' ')[0]
            seq_id_trunc = seq_id[:20].strip()
            if len(seq_id) > 20:
                seq_id_trunc += '...'
            headers.append(seq_id_trunc)
    return headers


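To make the truncation rule concrete, here is a sketch of the same header parsing applied to an in-memory string rather than a file. `sequence_ids_from_text` is a hypothetical variant written for this example, and the FASTA records are invented.

```python
def sequence_ids_from_text(fasta_text, max_len=20):
    """Return one ID per FASTA header, truncated to `max_len` characters."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith('>'):
            seq_id = line[1:].split(' ')[0]
            short = seq_id[:max_len].strip()
            if len(seq_id) > max_len:
                short += '...'  # mark that the ID was cut
            ids.append(short)
    return ids


# Invented input: one short ID and one ID longer than 20 characters.
fasta = (
    ">seq1 demo protein\n"
    "MKTAYIAKQR\n"
    ">a_very_long_identifier_without_spaces extra text\n"
    "MVLSPADKTN\n"
)
ids = sequence_ids_from_text(fasta)
```

Note that a truncated ID ends up 23 characters long once the `'...'` marker is appended, which is worth keeping in mind because these IDs are used in the ZIP file names built by `collect_msas`.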
20
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
440 def template_html(context: ExecutionContext):
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
441 """Template HTML file.
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
442
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
443 Remove buttons that are redundant with limited model outputs.
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
444 """
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
445 print("Templating HTML file...")
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
446 with open(HTML_PATH) as f:
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
447 html = f.read()
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
448 for i in range(len(context.model_pkl_paths), 5):
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
449 btn_id = HTML_BUTTON_ATTR.format(rank=i)
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
450 btn_attr_disabled = HTML_BUTTON_ATTR_DISABLED.format(rank=i)
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
451 html = html.replace(btn_id, btn_attr_disabled)
6ab1a261520a planemo upload for repository https://github.com/usegalaxy-au/tools-au commit c3a90eb12ada44d477541baa4dd6182be29cd554-dirty
galaxy-australia
parents: 16
diff changeset
452 with open(context.settings.output_dir / HTML_OUTPUT_FILENAME, 'w') as f:
453 f.write(html)
454
455
16
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
456 def main():
457 """Parse output files and generate additional output files."""
458 settings = Settings()
23
2891385d6ace planemo upload for repository https://github.com/usegalaxy-au/tools-au commit b347c6ccc82b14fcbff360b3357050d1d43e3ef5-dirty
galaxy-australia
parents: 22
diff changeset
459 if not settings.msa_only:
460 context = ExecutionContext(settings)
461 ranking = ResultRanking(context)
462 write_confidence_scores(ranking, context)
463 rekey_relax_metrics(ranking, context)
464 template_html(context)
16
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
465
23
2891385d6ace planemo upload for repository https://github.com/usegalaxy-au/tools-au commit b347c6ccc82b14fcbff360b3357050d1d43e3ef5-dirty
galaxy-australia
parents: 22
diff changeset
466 # Optional outputs
467 if settings.output_model_pkls:
468 rename_model_pkls(ranking, context)
469 if settings.output_model_plots:
470 plddt_pae_plots(ranking, context)
471 if settings.output_pae:
472 # Only created by monomer_ptm and multimer models
473 extract_pae_to_csv(ranking, context)
474 if settings.output_residue_scores:
475 write_per_residue_scores(ranking, context)
476 if settings.plot_msa:
477 plot_msa(settings.workdir)
478 if settings.collect_msas or settings.msa_only:
479 collect_msas(settings)
16
f9eb041c518c planemo upload for repository https://github.com/usegalaxy-au/tools-au commit ee77734f1800350fa2a6ef28b2b8eade304a456f-dirty
galaxy-australia
parents:
diff changeset
480
481
482 if __name__ == '__main__':
483 main()
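
The button-disabling step in template_html can be illustrated in isolation. The sketch below is not taken from the repository: the two attribute patterns are hypothetical stand-ins for HTML_BUTTON_ATTR and HTML_BUTTON_ATTR_DISABLED (whose real values are defined elsewhere in outputs.py), and disable_unused_buttons and MAX_MODELS are names introduced here for illustration only. It assumes, as the loop above does, that the page carries at most five model buttons indexed from 0.

```python
# Hypothetical attribute patterns; the real constants live elsewhere in outputs.py.
HTML_BUTTON_ATTR = 'id="btn-ranked_{rank}"'
HTML_BUTTON_ATTR_DISABLED = 'id="btn-ranked_{rank}" disabled'
MAX_MODELS = 5  # same upper bound as the range() call in template_html


def disable_unused_buttons(html: str, n_models: int) -> str:
    """Return html with the buttons for ranks n_models..MAX_MODELS-1 disabled."""
    for rank in range(n_models, MAX_MODELS):
        html = html.replace(
            HTML_BUTTON_ATTR.format(rank=rank),
            HTML_BUTTON_ATTR_DISABLED.format(rank=rank),
        )
    return html


# Build a toy page with five buttons, then pretend only two models were produced.
page = "".join(
    f'<button id="btn-ranked_{i}">Model {i}</button>' for i in range(MAX_MODELS)
)
result = disable_unused_buttons(page, n_models=2)
print(result)
```

With two models available, the buttons for ranks 0 and 1 are left untouched and those for ranks 2 to 4 gain the disabled attribute. Plain string replacement keeps the step dependency-free, but it relies on each attribute string appearing exactly once in the template.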