Example Notebook: Atom Mappings#
In this example we want to showcase how to generate the Kartograf mappings on the RHFE Data set, which was used for our publication.
Get Data:#
In this cell we will load the molecules as components from openfe-benchmarks. Note, that openfe-benchmarks contains aligned molecules with 3D coordinates. It is a general assumption for Kartografs atom mapper to get input molecules with well aligned conformations.
[8]:
import tarfile
from gufe import SmallMoleculeComponent
from rdkit import Chem
components = []
with tarfile.open("benzenes.tar.gz", mode="r:gz") as tar:
for member in tar:
with tar.extractfile(member) as f:
mol = SmallMoleculeComponent.from_msgpack(content=f.read())
components.append(mol)
Chem.Draw.MolsToGridImage([c.to_rdkit() for c in components])
[8]:
Generate Atom Mappings with Kartograf:#
Next we will generate all possible atom mappings, for the given Ligand cA. These mappings will be visualized in 3D in the latter cell.
[9]:
from kartograf import KartografAtomMapper
atomMapper = KartografAtomMapper()
# Generate Mappings
mappings = []
cA = components[-5] # central ligand from Ries et al. 2022
for cB in components:
if cA != cB:
mapping = next(atomMapper.suggest_mappings(cA, cB))
mappings.append(mapping)
[10]:
from gufe.visualization.mapping_visualization import display_mappings_3d
display_mappings_3d(mappings)
[10]:
Scoring Metrics for Atom mappings:#
Finally we want to compare the different mappings with rule-based scoring metrics. This way for example we can try to estimate the complexity of the transformation from cA to cB.
[11]:
from kartograf.mapping_metrics import (
MappingRMSDScorer,
MappingShapeMismatchScorer,
MappingShapeOverlapScorer,
MappingVolumeRatioScorer,
)
scorer_dict = {
"volume_score": MappingVolumeRatioScorer(),
"rmsd_score": MappingRMSDScorer(),
"overlap_score": MappingShapeOverlapScorer(),
"mismatch_score": MappingShapeMismatchScorer(),
}
def apply_scorers(mapping) -> None:
for score_name, scorer in scorer_dict.items():
setattr(mapping, score_name, scorer(mapping))
# score mappings:
for mapping in mappings:
apply_scorers(mapping)
[12]:
from matplotlib import pyplot as plt
score_names = sorted(scorer_dict)
plt.boxplot([[getattr(m, score_name) for m in mappings] for score_name in score_names])
plt.xticks(range(1, len(score_names) + 1), score_names, rotation=45)
[12]:
([<matplotlib.axis.XTick at 0x73a320a19f90>,
<matplotlib.axis.XTick at 0x73a3209d65d0>,
<matplotlib.axis.XTick at 0x73a3209d6e90>,
<matplotlib.axis.XTick at 0x73a3209d7610>],
[Text(1, 0, 'mismatch_score'),
Text(2, 0, 'overlap_score'),
Text(3, 0, 'rmsd_score'),
Text(4, 0, 'volume_score')])
[13]:
from matplotlib import pyplot as plt
from scipy.stats import spearmanr
score_names = sorted(scorer_dict)
fig, axes = plt.subplots(nrows=len(score_names), ncols=len(score_names), figsize=[16, 9])
i = 0
for score_nameA in score_names:
j = 0
axes[i, j].set_ylabel(score_nameA)
for score_nameB in score_names:
ax = axes[i, j]
if i == len(score_names) - 1:
ax.set_xlabel(score_nameB)
else:
ax.set_xticklabels([])
if j > 0:
ax.set_yticklabels([])
x = [getattr(m, score_nameA) for m in mappings]
y = [getattr(m, score_nameB) for m in mappings]
r, _ = spearmanr(x, y)
ax.scatter(x, y)
ax.text(0.1, 0.85, "$r_{spearman}:~$" + str(round(r, 2)))
ax.set_xlim([0, 1])
ax.set_ylim([0, 1])
j += 1
i += 1
fig.tight_layout()
fig.subplots_adjust(wspace=0, hspace=0)
[ ]: