schrodinger.comparison.neighbors module

Methods for handling neighbors with ASE.

class schrodinger.comparison.neighbors.MoleculeNeighbor(molecule_number, ix, iy, iz)

Bases: tuple

ix

Alias for field number 1

iy

Alias for field number 2

iz

Alias for field number 3

molecule_number

Alias for field number 0
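
The field aliases above follow the usual named-tuple layout, so each value can be read by name or by position. A minimal sketch, using a stand-in built with collections.namedtuple rather than the real class (an assumption made so the snippet runs without the Schrodinger suite); the example values are arbitrary, and the page does not state what ix, iy, iz represent:

```python
from collections import namedtuple

# Stand-in with the same field order as the documented class (illustration only;
# the real class lives in schrodinger.comparison.neighbors).
MoleculeNeighbor = namedtuple(
    "MoleculeNeighbor", ["molecule_number", "ix", "iy", "iz"])

nb = MoleculeNeighbor(molecule_number=3, ix=1, iy=0, iz=-1)

# Fields can be read by name or by position (field numbers 0 to 3).
by_name = (nb.molecule_number, nb.ix, nb.iy, nb.iz)
by_index = (nb[0], nb[1], nb[2], nb[3])
```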

class schrodinger.comparison.neighbors.SphericalClusterRecipe(unitcell_molecule_dict: dict[int, Structure], center_molecule_number: int, molecule_neighbors: Tuple[MoleculeNeighbor], box_vectors: Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]], title: str)

Bases: object

A low memory representation of a spherical atomic cluster. Holds the required ingredients to construct the full spherical cluster structure on demand.

Variables:
  • unitcell_molecule_dict – A dictionary mapping molecule numbers to extracted Structure objects.

  • center_molecule_number – The integer index of the molecule that acts as the origin (center) of the spherical cluster.

  • molecule_neighbors – A tuple of MoleculeNeighbor objects representing all molecules within the cutoff radius.

  • box_vectors – The box vectors from st.pbc.getBoxVectors(), formatted as a 3x3 NumPy array where each row is a lattice vector.

  • title – The title string to assign to the generated cluster.

unitcell_molecule_dict: dict[int, Structure]
center_molecule_number: int
molecule_neighbors: Tuple[MoleculeNeighbor]
box_vectors: Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]
title: str
property mol_total: int

Total number of molecules in the spherical cluster, including the center molecule and all neighbors.

property atom_total: int

Total number of atoms in the spherical cluster

property cluster: Structure

Constructs and returns the full spherical cluster Structure. The cluster is computed afresh on each access, so the entire cluster is not held in memory when it is not being used.
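
The "recipe" idea (store the ingredients, build the cluster only when asked) might be sketched as below. This is not the class's actual implementation: RecipeSketch is a hypothetical stand-in that uses plain coordinate tuples instead of Structure objects, and it assumes that ix, iy, iz are integer multiples of the box-vector rows and that the neighbor tuple excludes the center molecule.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec = Tuple[float, float, float]


@dataclass(frozen=True)
class RecipeSketch:
    """Hypothetical stand-in: stores ingredients, builds coordinates on demand."""
    unitcell_molecules: Dict[int, List[Vec]]  # molecule number -> atom coordinates
    center_molecule_number: int
    neighbors: Tuple[Tuple[int, int, int, int], ...]  # (molecule_number, ix, iy, iz)
    box_vectors: Tuple[Vec, Vec, Vec]  # each row is a lattice vector

    def _shift(self, ix: int, iy: int, iz: int) -> Vec:
        # Translation ix*a + iy*b + iz*c (assumed meaning of ix, iy, iz).
        a, b, c = self.box_vectors
        return tuple(ix * a[k] + iy * b[k] + iz * c[k] for k in range(3))

    @property
    def mol_total(self) -> int:
        # Center molecule plus neighbors (assumes neighbors exclude the center).
        return 1 + len(self.neighbors)

    @property
    def cluster(self) -> List[Vec]:
        # Rebuilt on every access; nothing is cached on the instance.
        atoms = list(self.unitcell_molecules[self.center_molecule_number])
        for mol, ix, iy, iz in self.neighbors:
            s = self._shift(ix, iy, iz)
            atoms += [(x + s[0], y + s[1], z + s[2])
                      for x, y, z in self.unitcell_molecules[mol]]
        return atoms


recipe = RecipeSketch(
    unitcell_molecules={1: [(0.0, 0.0, 0.0)], 2: [(1.0, 1.0, 1.0)]},
    center_molecule_number=1,
    neighbors=((2, 0, 0, 0), (1, 1, 0, 0)),
    box_vectors=((10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)),
)
coords = recipe.cluster  # built here, not at construction time
```

Only the small neighbor tuple and the unit-cell molecules are kept alive between accesses, which is the memory trade-off the class description refers to.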

classmethod from_prebuilt_cluster(st: Structure, title=None)

Wrap a pre-built spherical cluster to support spherical_cluster_as_input in dedup_utils.deduplicate_crystals.

__init__(unitcell_molecule_dict: dict[int, Structure], center_molecule_number: int, molecule_neighbors: Tuple[MoleculeNeighbor], box_vectors: Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]], title: str) → None

schrodinger.comparison.neighbors.get_n_subunits_wrapper(st: Structure, ref: Structure, zprime2: bool) → int

Function wrapping common component ID validation for cluster alignment functions below.

Logic is equivalent to struc.get_n_subunits_in_asu(), but we perform extra validation and implement a backwards compatibility policy: because many test files were created before component IDs were implemented, we log a warning and default to assuming a Z’=1 single-molecule ASU only if zprime2 = False and all molecules have the same number of atoms. If those conditions aren’t met, an error is raised.

Parameters:
  • st – The input test structure

  • ref – The input reference structure

  • zprime2 – True for a Z’=2 search.

Returns:

Number of distinct molecules/ions in the ASU of st.
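
The backwards compatibility policy described above can be sketched as a small decision function. The function name, its arguments, and the choice of ValueError are all hypothetical; the page only says that "an error is raised" and does not show how component IDs are read from a Structure.

```python
import logging
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def n_subunits_with_fallback(n_from_component_ids: Optional[int],
                             atoms_per_molecule: Sequence[int],
                             zprime2: bool) -> int:
    """Hypothetical sketch of the documented fallback policy.

    n_from_component_ids is None when the structure has no component IDs.
    """
    if n_from_component_ids is not None:
        return n_from_component_ids
    # Component IDs missing: fall back only for the simple Z'=1 case.
    if not zprime2 and len(set(atoms_per_molecule)) == 1:
        logger.warning(
            "No component IDs found; assuming a Z'=1 single-molecule ASU")
        return 1
    # Exception type is an assumption; the source does not name it.
    raise ValueError("Component IDs are required for this structure")


legacy_result = n_subunits_with_fallback(None, [24, 24, 24, 24], zprime2=False)
```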

schrodinger.comparison.neighbors.com_cell_atoms(st, unit_cell, symbol='C')

Construct an Atoms instance where each atom represents the COM of a molecule in st

Parameters:
  • st (Structure) – The input crystal

  • unit_cell (3x3 np.array) – Lattice vectors are rows

  • symbol (str) – Symbol for each atom in the COM cell

Returns:

An Atoms instance that gives the COM cell
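
As a reminder of what "COM of a molecule" means here, the sketch below computes mass-weighted centers for two toy molecules in plain Python. The helper name and the data are hypothetical; the real function works on a Structure and returns an ASE Atoms object, which this snippet does not construct.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def center_of_mass(masses: Sequence[float], coords: Sequence[Vec]) -> Vec:
    """Mass-weighted average position of one molecule."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[k] for m, xyz in zip(masses, coords)) / total
        for k in range(3))


# Two toy "molecules", each given as (masses, coordinates); values are made up.
molecules = [
    ([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]),
    ([2.0, 2.0], [(0.0, 2.0, 0.0), (0.0, 4.0, 0.0)]),
]

com_positions: List[Vec] = [center_of_mass(m, c) for m, c in molecules]
symbols = ["C"] * len(com_positions)  # one placeholder atom per molecule
```

The positions and symbols are the kind of per-molecule data a COM cell would carry, together with the unit cell.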

schrodinger.comparison.neighbors.spherical_cluster(st, Rcut)

Construct a spherical cluster of radius Rcut about each molecule’s center of mass in st. The Nth value of the generator is centered about the Nth molecule.

Parameters:
  • st (Structure) – Represents the conventional cell

  • unit_cell (3x3 np.array) – Lattice vectors as rows

  • Rcut (float) – Cutoff for including molecules

Returns:

Generator of clusters

schrodinger.comparison.neighbors.spherical_atomic_cluster_recipe(st: Structure, Rcut: float, center_idx=1) → SphericalClusterRecipe

Construct a recipe describing a cluster of neighbors centered about the center_idx’th molecule. This implementation includes any molecule that has any atom inside the cutoff radius with respect to the central molecule.

Parameters:
  • st – Represents the conventional cell

  • Rcut – Cutoff for including molecules

  • center_idx – Molecule index for the spherical cluster center

Returns:

A lightweight spherical cluster recipe.
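
The "any atom inside the cutoff" rule differs from a center-of-mass test: a molecule is kept even when only one of its atoms reaches into the sphere. A minimal sketch of that selection, assuming distances are measured from a single reference point of the central molecule (the page does not say exactly how the distance to the central molecule is defined) and using a hypothetical helper on plain coordinates:

```python
import math
from typing import Dict, List, Sequence, Tuple

Vec = Tuple[float, float, float]


def molecules_within_cutoff(center_point: Vec,
                            molecules: Dict[str, Sequence[Vec]],
                            rcut: float) -> List[str]:
    """Keys of molecules with at least one atom within rcut of center_point."""
    return [
        key for key, atoms in molecules.items()
        if any(math.dist(center_point, atom) <= rcut for atom in atoms)
    ]


toy_molecules = {
    # Centroid at x = 6.0 lies outside rcut, but one atom is inside.
    "A": [(4.0, 0.0, 0.0), (8.0, 0.0, 0.0)],
    # Every atom is outside rcut.
    "B": [(7.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
}
selected = molecules_within_cutoff((0.0, 0.0, 0.0), toy_molecules, rcut=5.0)
```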

schrodinger.comparison.neighbors.spherical_atomic_cluster(st: Structure, Rcut: float, center_idx=1) → Structure

Convenience wrapper that constructs and returns a fully realized spherical cluster Structure.

For full parameter documentation and algorithmic details, please see spherical_atomic_cluster_recipe.

Returns:

The fully constructed spherical cluster Structure.

schrodinger.comparison.neighbors.search_cluster_radius_recipe(st: Structure, Ncluster_min: int = 25, Ncluster_max: int = 35, r: float = 4.0, dRthresh: float = 0.01, center_idx: int = 1) → Tuple[SphericalClusterRecipe, float]

Search for a cluster radius that gives a spherical cluster with a certain number of molecules. Returns a recipe for the cluster that can be used to generate the cluster on-the-fly.

Parameters:
  • st – the crystal

  • Ncluster_min – minimum number of molecules in cluster

  • Ncluster_max – maximum number of molecules in cluster

  • r – radius to start search

  • dRthresh – stop the search if no radius giving Ncluster_min < N < Ncluster_max can be found within precision dRthresh

  • center_idx – Molecule index for the spherical cluster center

Returns:

tuple of (SphericalClusterRecipe, radius)
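
One way such a search could work is to double the radius until the count overshoots and then bisect, giving up once the bracket is narrower than dRthresh. The sketch below is an assumption about the approach, not the library's actual algorithm; search_radius, count_at, and r_max are hypothetical names, and the molecule count is supplied by a toy function instead of a crystal.

```python
from typing import Callable, Optional, Tuple


def search_radius(count_at: Callable[[float], int],
                  n_min: int = 25, n_max: int = 35,
                  r: float = 4.0, dr_thresh: float = 0.01,
                  r_max: float = 1000.0) -> Tuple[float, int]:
    """Find r with n_min < count_at(r) < n_max by doubling, then bisection."""
    lo: float = 0.0
    hi: Optional[float] = None
    while True:
        n = count_at(r)
        if n_min < n < n_max:
            return r, n
        if n <= n_min:
            lo = r  # too few molecules: radius must grow
        else:
            hi = r  # too many molecules: radius must shrink
        if hi is None:
            r *= 2.0
            if r > r_max:  # safety guard, not part of the documented API
                raise RuntimeError("no upper bound found for the radius")
        else:
            if hi - lo < dr_thresh:
                raise RuntimeError("no radius satisfies the limits")
            r = 0.5 * (lo + hi)


def count_on_line(radius: float) -> int:
    # Toy count: molecules sit at distances 1, 2, ..., 100 from the center.
    return min(100, int(radius))


radius, n_found = search_radius(count_on_line)
```

The failure branch matters because the molecule count is a step function of the radius: symmetry-equivalent molecules can enter the sphere together, so the count may jump past the whole target window.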

schrodinger.comparison.neighbors.search_cluster_radius(st: Structure, Ncluster_min: int = 25, Ncluster_max: int = 35, r: float = 4.0, dRthresh: float = 0.01, center_idx: int = 1) → Tuple[Structure, float]

Convenience wrapper that constructs and returns a fully realized spherical cluster Structure and its radius.

For full parameter documentation and algorithmic details, please see search_cluster_radius_recipe.

Returns:

tuple of (spherical cluster Structure, radius)

Raises:

RuntimeError – If a cluster satisfying the limits is not found.

schrodinger.comparison.neighbors.preprocess_st(st: Structure, *, remove_hydrogens=True, remove_asl='', is_spherical_cluster=False, copy=False) → Structure

Preprocess a structure for spherical cluster RMSD-N calculation by optionally removing hydrogens and atoms matching a specific ASL expression.

Parameters:
  • st – The input structure.

  • remove_hydrogens – If True, removes all hydrogen atoms.

  • remove_asl – ASL expression defining additional atoms to remove.

  • is_spherical_cluster – If True, the input is assumed to be a spherical cluster, and an extra validation step checks that remove_asl does not completely remove the central molecule (molecule 1).

  • copy – If True, operations are performed on a copy of the structure. If False, the input structure is modified in place.

Returns:

The preprocessed Structure object.

schrodinger.comparison.neighbors.prepare_data_recipe(st: Structure, is_spherical_cluster: bool = False, remove_hydrogens: bool = True, remove_asl: str = '', min_molecules: int = 25, max_molecules: int = 35, zprime2: bool = False, regenerate_unitcell: bool = False) → List[SphericalClusterRecipe]

Top-level function for constructing spherical clusters for crystal alignment, deduplication and RMSD-N calculation.

Returns a list of 1 or 2 spherical cluster recipes centered on ASU subunits. If zprime2 = False, 1 cluster will be returned. If zprime2 = True, 2 clusters (each centered on a different subunit) will be returned, unless the 2nd subunit in the ASU is an excipient which has too few atoms to fix an alignment, in which case only the cluster centered on the conformer will be returned. The different subunits are assumed to be the first Z’ molecules of st.

Parameters:
  • st – Input structure. Assumed to be one of:

      - a pre-prepared spherical cluster, in which case is_spherical_cluster should be True and this is a no-op (not supported for Z’=2)
      - a unitcell with lattice properties set
      - an ASU with lattice properties set and regenerate_unitcell=True. The unitcell will be regenerated automatically for Z’>1 if ASU component IDs are not set.

  • remove_hydrogens – True to remove Hs from output.

  • remove_asl – ASL expression to remove additional atoms from output.

  • min_molecules – target minimum number of mols in each returned cluster

  • max_molecules – target max number of mols in each returned cluster

  • zprime2 – True if Z’=2

  • regenerate_unitcell – True to recreate the unit cell from the first Z’ molecules of st; otherwise input assumed to be a unit cell.

Returns:

List of length <= Z’ containing SphericalClusterRecipes. The list may be shorter than the number of subunits in the ASU due to skipping excipients that are too small to determine an alignment.

schrodinger.comparison.neighbors.prepare_data(st: Structure, is_spherical_cluster: bool = False, remove_hydrogens: bool = True, remove_asl: str = 'water', min_molecules: int = 25, max_molecules: int = 35, zprime2: bool = False, regenerate_unitcell: bool = False) → List[Structure]

Convenience wrapper that constructs and returns fully realized spherical cluster Structures.

For full parameter documentation and algorithmic details, please see prepare_data_recipe.

Returns:

List of fully constructed spherical cluster Structures.

schrodinger.comparison.neighbors.get_spherical_cluster_RMSDn(st: Structure, ref: Structure, matching_cutoff: float = 2, matched_cutoff: int = 15, align_cluster=True, include_H=False, renumber_rmsd_thresh=0.8, allow_reflection=True, pg_symmetry_ops=(), n_thresh: int = 20, zprime2: bool = False) → Tuple[int, float, Structure]

Return number of molecules matched, their RMSD, and the aligned st with respect to ref. The st and ref are assumed to be spherical clusters with ASU component IDs assigned.

Parameters:
  • st – test Structure to align

  • ref – reference Structure (unmoved)

  • matching_cutoff – If the centroid of a molecule is within the radius of another molecule from a different cluster, it is considered as matched.

  • matched_cutoff – If the number of matched centroids is less than this number, abort further computation

  • align_cluster – If not set, only align on the central molecules, i.e., the first molecules of the input structures; otherwise further align on all molecules

  • include_H – if False, use heavy-atom RMSD; otherwise use all-atom RMSD

  • renumber_rmsd_thresh – only attempt atom renumbering if the RMSD for the 1st molecule is greater than this

  • allow_reflection – whether or not to allow reflection when optimizing rmsd in scoring method

  • pg_symmetry_ops – iterable of SymmOp instances, each representing a point group symmetry operation. If the iterable is not empty, each operation is applied to the test cluster and RMSD-N analysis is performed; the best RMSD-N is returned.

  • n_thresh – Centroid threshold for comparing spherical cluster alignments on the basis of RMSD only.

  • zprime2 – Flag to align Z’=2 crystals; if False, assume all molecules are equivalent by space group symmetry operations

Returns:

Three-tuple of (N matched, RMSD of match, st after renumbering/alignment). A new structure is returned.

schrodinger.comparison.neighbors.get_centroid_RMSDn(st: Structure, ref: Structure, matching_cutoff: float = 2, matched_cutoff: int = 15, allow_reflection=True, n_maybe=5, n_nb=3, n_thresh: int = 20, parity_seen=None, zprime2: bool = False) → Iterator[Tuple[int, float, Structure]]

Return number of molecules matched, their RMSD, and the aligned st with respect to ref. The st and ref are assumed to be spherical clusters with ASU component IDs assigned.

Parameters:
  • matching_cutoff – If the centroid of a molecule is within the radius of another molecule from a different cluster, it is considered as matched.

  • matched_cutoff – If the number of matched centroids is less than this number, abort further computation

  • n_maybe – number of small radius centroids for maybe-inlier test

  • n_nb – number of neighboring centroids (besides the central one) for maybe-inlier test

  • n_thresh – Centroid threshold for comparing centroid alignments on the basis of RMSD only.

  • parity_seen – Only attempt centroid alignment if the corresponding parity has not been tried (proper or improper rotations)

  • zprime2 – Flag to align Z’=2 crystals; if False, assume all molecules are equivalent by space group symmetry operations
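
The roles of matching_cutoff and matched_cutoff can be illustrated with a toy centroid comparison on clusters that are assumed to be already superimposed. This is a simplified sketch, not the library's procedure: match_centroids is a hypothetical helper, it uses a greedy nearest-centroid pairing, and it performs no alignment, reflection handling, or inlier tests.

```python
import math
from typing import Optional, Sequence, Tuple

Vec = Tuple[float, float, float]


def match_centroids(test: Sequence[Vec], ref: Sequence[Vec],
                    matching_cutoff: float = 2.0,
                    matched_cutoff: int = 15) -> Tuple[int, Optional[float]]:
    """Count centroid pairs within matching_cutoff and report their RMSD.

    Returns (n_matched, None) when fewer than matched_cutoff pairs are found,
    mirroring the "abort further computation" rule.
    """
    used = set()
    distances = []
    for r in ref:
        best = None  # (index into test, distance)
        for i, t in enumerate(test):
            if i in used:
                continue
            d = math.dist(r, t)
            if d <= matching_cutoff and (best is None or d < best[1]):
                best = (i, d)
        if best is not None:
            used.add(best[0])
            distances.append(best[1])
    if not distances or len(distances) < matched_cutoff:
        return len(distances), None
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    return len(distances), rmsd


ref_pts = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
test_pts = [(0.0, 0.0, 1.0), (5.0, 0.0, 1.0), (0.0, 9.0, 0.0)]
n_matched, rmsd = match_centroids(test_pts, ref_pts, matched_cutoff=2)
```

In this toy case two centroids lie within the cutoff and the third is 4 units away, so it is left unmatched.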

schrodinger.comparison.neighbors.run_centroid_rmsdn_then_spherical_rmsdn(test_cluster, ref_cluster, matching_cutoff: float = 2, matched_cutoff: int = 15, allow_reflection=True, renumber_rmsd_thresh=0.5, align_cluster=True, pg_symmetry_ops=(), n_thresh: int = 20, zprime2: bool = False)

Return atomic spherical RMSDn and the corresponding aligned cluster from running centroid RMSDn and atomic RMSDn sequentially.

See get_spherical_cluster_RMSDn for full argument docs

Parameters:
  • matching_cutoff – If the centroid of a molecule is within the radius of another molecule from a different cluster, it is considered as matched.

  • matched_cutoff – If the number of matched centroids is less than this number, abort further computation

  • n_thresh – Centroid threshold for comparing spherical cluster alignments on the basis of RMSD only.

  • zprime2 – Flag to align Z’=2 crystals; if False, assume all molecules are equivalent by space group symmetry operations

schrodinger.comparison.neighbors.compute_rmsdn(test_data: List[Structure], ref_data: List[Structure], use_point_group_symmetry=False, skip_centroid_rmsdn=False, matching_cutoff: float = 2, matched_cutoff: Optional[int] = None, renumber_rmsd_thresh=0.5, align_cluster=False, allow_reflection=True, n_thresh: int = 20, zprime2: bool = False) → Tuple[int, float, Structure]

Here test_data and ref_data each correspond to one crystal structure. They don’t need to have the same Z’ value.

See get_spherical_cluster_RMSDn for full keyword argument docs