Molecule Alignment Algorithms

Kartograf provides two complementary algorithms for aligning small molecules in 3D space. Both functions accept a molecule to be moved and a reference molecule, and return a new, aligned copy of the input molecule without modifying the original.

Note

Both functions operate on SmallMoleculeComponent objects and return a deep copy of the input molecule with updated 3D coordinates. The reference molecule is never modified.


Skeleton-Based Alignment (MCS)

kartograf.atom_aligner.align_mol_skeletons(mol: SmallMoleculeComponent, ref_mol: SmallMoleculeComponent) → SmallMoleculeComponent

Aligns a molecule to a reference by superimposing their shared Maximum Common Substructure (MCS).

This function uses RDKit’s rdkit.Chem.rdFMCS module to identify the largest common subgraph between the two molecules, then calls AlignMol() to minimise the RMSD over the matched atom pairs. Because the atom-type comparator is set to CompareAny, the MCS search matches atoms regardless of their element, which makes it tolerant of heteroatom substitutions and of scaffold hops that preserve the overall topology.

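Parameters:

  • mol (SmallMoleculeComponent) – the molecule to be moved.

  • ref_mol (SmallMoleculeComponent) – the reference molecule that mol is aligned to.

Returns: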

An aligned copy of mol superimposed on ref_mol via the MCS.

Return type:

SmallMoleculeComponent
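To make "minimise the RMSD over the matched atom pairs" concrete, the following is a minimal NumPy sketch of a least-squares rigid superposition (the Kabsch method). It is an illustration only, not RDKit's or Kartograf's implementation; the function name kabsch_align and the toy coordinates are assumptions made for this example.

```python
import numpy as np


def kabsch_align(probe_xyz, ref_xyz, pairs):
    """Rigidly move probe_xyz so that the paired atoms best overlay ref_xyz.

    pairs is a list of (probe_index, ref_index) tuples. Returns a new
    coordinate array and the RMSD over the paired atoms; the input arrays
    are not modified.
    """
    p_idx = [p for p, _ in pairs]
    r_idx = [r for _, r in pairs]
    p = probe_xyz[p_idx]
    r = ref_xyz[r_idx]

    # Centre both matched subsets on their centroids.
    p_cen = p.mean(axis=0)
    r_cen = r.mean(axis=0)

    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    u, _, vt = np.linalg.svd((p - p_cen).T @ (r - r_cen))
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Apply the transform to every atom, not only the matched ones.
    moved = (probe_xyz - p_cen) @ rot.T + r_cen
    rmsd = float(np.sqrt(np.mean(np.sum((moved[p_idx] - r) ** 2, axis=1))))
    return moved, rmsd


# Toy data: the probe is the reference rotated 90 degrees about z and shifted.
ref = np.array([[0, 0, 0], [1.5, 0, 0], [0, 1.5, 0], [0, 0, 1.5]], dtype=float)
rot_z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
probe = ref @ rot_z.T + np.array([3.0, -2.0, 1.0])

moved, rmsd = kabsch_align(probe, ref, pairs=[(0, 0), (1, 1), (2, 2), (3, 3)])
```

Only the paired atoms determine the transform, but it is applied to the whole probe, which is why peripheral substituents follow the core when a molecule is aligned on its MCS.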

When to use this algorithm

Choose skeleton-based alignment when:

  • The two molecules share a recognisable common scaffold (e.g. an R-group optimisation series).

  • You are using a method that requires a well-mapped common core (e.g. openfe’s RelativeHybridTopologyProtocol).

  • You want the alignment to reflect chemical similarity rather than overall 3D shape.

  • The molecules differ primarily in peripheral substituents and the core geometry should be conserved.

Algorithm outline

  1. Find the MCS of the two molecules using CompareAny atom typing (topology-only matching).

  2. Convert the MCS SMARTS pattern to an atom-index mapping between the two molecules.

  3. Call rdkit.Chem.rdMolAlign.AlignMol() with the explicit atom map to minimise the RMSD over the MCS atoms only.
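The three steps above can be sketched directly in RDKit. This is a hedged illustration that works on plain RDKit molecules rather than SmallMoleculeComponent objects; the helper names embed and align_on_mcs, the test molecules, and the 10-second timeout are assumptions for this example, and Kartograf's own options for the MCS search may differ.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdFMCS, rdMolAlign


def embed(smiles, seed=42):
    """Build a heavy-atom molecule with one 3D conformer (illustrative helper)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return Chem.RemoveHs(mol)


def align_on_mcs(probe, ref):
    """Return an aligned copy of probe, the atom map used, and the RMSD."""
    # Step 1: MCS search in which any element may match any other.
    mcs = rdFMCS.FindMCS(
        [probe, ref],
        atomCompare=rdFMCS.AtomCompare.CompareAny,
        timeout=10,
    )
    if mcs.numAtoms == 0:
        raise ValueError("no common substructure found")

    # Step 2: turn the MCS SMARTS into (probe_index, ref_index) pairs.
    pattern = Chem.MolFromSmarts(mcs.smartsString)
    atom_map = list(
        zip(probe.GetSubstructMatch(pattern), ref.GetSubstructMatch(pattern))
    )

    # Step 3: fit over the mapped atoms only, on a copy of the probe.
    moved = Chem.Mol(probe)
    rmsd = rdMolAlign.AlignMol(moved, ref, atomMap=atom_map)
    return moved, atom_map, rmsd


probe = embed("Cc1ccccc1")  # toluene
ref = embed("Oc1ccccc1")    # phenol
moved, atom_map, rmsd = align_on_mcs(probe, ref)
```

Because the fit runs on a copy, the probe passed in keeps its original coordinates, mirroring the copy-on-return behaviour described for align_mol_skeletons.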

Example

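# Import from the documented module path
from kartograf.atom_aligner import align_mol_skeletons
from gufe import SmallMoleculeComponent

# Load the molecule to move and the reference (both need 3D coordinates)
mol = SmallMoleculeComponent.from_sdf_file("ligand.sdf")
ref_mol = SmallMoleculeComponent.from_sdf_file("reference.sdf")

# Returns an aligned copy; mol itself is left unchanged
aligned_mol = align_mol_skeletons(mol, ref_mol)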

Warning

If the two molecules share no common substructure the MCS will be empty and the alignment will be undefined. Pre-check your molecules if a shared scaffold cannot be assumed.
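One way to pre-check a pair is to measure the size of the MCS before aligning. The sketch below is an assumption-laden example, not part of Kartograf: the helper names and the min_atoms threshold of 6 are hypothetical, and it operates on plain RDKit molecules (a SmallMoleculeComponent can be converted with its to_rdkit() method).

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS


def shared_core_size(mol_a, mol_b, timeout=10):
    """Number of atoms in the MCS found with element-agnostic matching."""
    result = rdFMCS.FindMCS(
        [mol_a, mol_b],
        atomCompare=rdFMCS.AtomCompare.CompareAny,
        timeout=timeout,
    )
    return result.numAtoms


def has_usable_core(mol_a, mol_b, min_atoms=6):
    """True if the shared core has at least min_atoms atoms (hypothetical cutoff)."""
    return shared_core_size(mol_a, mol_b) >= min_atoms


toluene = Chem.MolFromSmiles("Cc1ccccc1")
phenol = Chem.MolFromSmiles("Oc1ccccc1")
ethanol = Chem.MolFromSmiles("CCO")

ok_pair = has_usable_core(toluene, phenol)    # ring plus substituent is shared
bad_pair = has_usable_core(toluene, ethanol)  # only a small fragment is shared
```

A pair that fails such a check is a candidate for shape-based alignment instead.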

Shape-Based Alignment (Open3DAlign)

kartograf.atom_aligner.align_mol_shape(mol: SmallMoleculeComponent, ref_mol: SmallMoleculeComponent) → SmallMoleculeComponent

Aligns a molecule to a reference by maximising the overlap of their 3D shapes using the Open3DAlign (O3A) algorithm.

This function wraps RDKit’s GetO3A(), which by default matches atoms between the two molecules using MMFF atom types and partial charges and then superimposes the matched atoms. The alignment does not require any shared substructure.

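Parameters:

  • mol (SmallMoleculeComponent) – the molecule to be moved.

  • ref_mol (SmallMoleculeComponent) – the reference molecule that mol is aligned to.

Returns: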

An aligned copy of mol whose 3D shape best overlaps ref_mol.

Return type:

SmallMoleculeComponent

When to use this algorithm

Choose shape-based alignment when:

  • You are using a method that does not require a well-mapped common core (e.g. openfe’s SepTopProtocol).

  • The molecules belong to different chemical series (scaffold hops, bioisosteric replacements) but are expected to occupy a similar binding volume.

  • No obvious MCS exists or the MCS is too small to anchor a reliable structural overlay.

  • You wish to compare or cluster molecules by 3D pharmacophoric shape.

Algorithm outline

  1. Build an Open3DAlign alignment object for the probe and reference molecules using GetO3A().

  2. Call Align() to apply the optimal rigid-body rotation and translation that maximises shape overlap.

  3. The alignment score (a float) is logged at DEBUG level for diagnostic purposes.
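The outline above can be sketched with RDKit's O3A interface. This is an illustrative example on plain RDKit molecules, not Kartograf's code; the helper names embed_with_hs and align_on_shape, the logger, and the two test molecules are assumptions.

```python
import logging

from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign

logger = logging.getLogger(__name__)


def embed_with_hs(smiles, seed=42):
    """Build a molecule with explicit hydrogens and one 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


def align_on_shape(probe, ref):
    """Return an O3A-aligned copy of probe, the RMSD, and the O3A score."""
    moved = Chem.Mol(probe)  # work on a copy so the input is untouched

    # Step 1: set up the alignment (MMFF properties are computed internally).
    o3a = rdMolAlign.GetO3A(moved, ref)

    # Step 2: apply the rigid-body transform to the copy.
    rmsd = o3a.Align()

    # Step 3: record the score for diagnostics.
    score = o3a.Score()
    logger.debug("O3A score: %.3f", score)
    return moved, rmsd, score


probe = embed_with_hs("NCCc1ccccc1")  # phenethylamine
ref = embed_with_hs("OCCc1cccnc1")    # a pyridine analogue
moved, rmsd, score = align_on_shape(probe, ref)
```

The score is only meaningful relative to other O3A scores, so it is most useful for comparing candidate alignments against the same reference.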

Example

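# Import from the documented module path
from kartograf.atom_aligner import align_mol_shape
from gufe import SmallMoleculeComponent

# Load the molecule to move and the reference (both need 3D coordinates)
mol = SmallMoleculeComponent.from_sdf_file("ligand.sdf")
ref_mol = SmallMoleculeComponent.from_sdf_file("reference.sdf")

# Returns an aligned copy; mol itself is left unchanged
aligned_mol = align_mol_shape(mol, ref_mol)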

Note

Open3DAlign requires that both molecules carry 3D coordinates and (optionally) partial charges for optimal scoring. Ensure that conformers have been generated before calling this function.
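A quick way to confirm that a molecule carries 3D coordinates is to inspect its conformers. The sketch below uses plain RDKit molecules; the helper name has_3d_conformer is an assumption for this example, and ETKDG embedding via EmbedMolecule() is just one possible way to generate a conformer.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def has_3d_conformer(mol):
    """True if the molecule has at least one conformer flagged as 3D."""
    return mol.GetNumConformers() > 0 and mol.GetConformer().Is3D()


# A molecule built from SMILES has no coordinates yet.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
before = has_3d_conformer(mol)

# Generate one 3D conformer, then check again.
AllChem.EmbedMolecule(mol, randomSeed=42)
after = has_3d_conformer(mol)
```

Molecules read from SDF files with real 3D coordinates should already pass this check.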

Choosing Between the Two Methods

Criterion                        Skeleton (MCS)        Shape (O3A)

Requires shared substructure     Yes                   No

Sensitive to scaffold changes    High                  Low

Best for                         Congeneric series     Diverse scaffolds

Alignment driven by              Atom topology         3D volume & charges

Underlying RDKit module          rdFMCS / rdMolAlign   rdMolAlign (O3A)

For a congeneric series, or when using a single or hybrid topology scheme that must respect the common core (such as openfe’s RelativeHybridTopologyProtocol), use align_mol_skeletons. For structurally diverse molecules where binding-volume overlap is the primary concern, or when applying a method that does not require common-core overlap (e.g. Separated Topologies), use align_mol_shape.