schrodinger.structutils.rmsd module

Functionality for calculating conformer RMSDs.

Copyright Schrodinger, LLC. All rights reserved.

exception schrodinger.structutils.rmsd.DisabledSymmetryWarning

Bases: RuntimeWarning

Warning class used by ConformerRmsd when use_symmetry is disabled.

__init__(*args, **kwargs)
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class schrodinger.structutils.rmsd.Alignment(alignment: Tuple[Tuple[schrodinger.structure._structure._Residue], Tuple[schrodinger.structure._structure._Residue]], length: int, rmsd: float, transform_matrix: numpy.array)

Bases: tuple

Container to store CEAlign alignment data.

alignment: Tuple[Tuple[schrodinger.structure._structure._Residue], Tuple[schrodinger.structure._structure._Residue]]

Alias for field number 0

length: int

Alias for field number 1

rmsd: float

Alias for field number 2

transform_matrix: numpy.array

Alias for field number 3

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

schrodinger.structutils.rmsd.calculate_in_place_rmsd(st1, at_list1, st2, at_list2, use_symmetry=False, weights=None)
Returns

Atomic coordinate rmsd between structures.

Return type

float

Parameters
  • st1 (structure.Structure) – Reference structure.

  • at_list1 (list) – List of atom index integers to consider. This must be the same length as at_list2. The coordinates from st1 and st2 are mapped in index order.

  • st2 (structure.Structure) – Test structure. Must be a conformer of st1 if use_symmetry is True.

  • at_list2 (list) – List of atom index integers to consider. This must be the same length as at_list1. The coordinates from st1 and st2 are mapped in index order.

  • use_symmetry (boolean) – Adjust at_list2 index order such that it is optimized with regard to molecular symmetry. The default is False for backwards compatibility; accounting for molecular symmetry is usually desirable.

  • weights – A list of weights, which must be the same length as at_list1. The weights in this list pertain to the relative importance to assign to the position deviations of atoms in at_list2 from atoms in at_list1. The weights list is assumed to contain positive values.

Raises

ValueError if the shape of the coordinate matrices is not the same, or if the length of the weights array is not the same as the length of the coordinate array. TypeError if the data type of the weights is not float.

Note

The input structures are expected to be conformers. See also ConformerRmsd which supports calculating the RMSD of a common conformer atom subset specified by ASL with non-conformer input structures.
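As a rough illustration of what an in-place (no superposition) weighted RMSD computes, here is a minimal pure-Python sketch. The function name weighted_in_place_rmsd is hypothetical, it works on plain coordinate tuples rather than Structure objects, and normalizing by the sum of the weights is an assumption; this documentation does not state the library's exact weighting convention.

```python
import math


def weighted_in_place_rmsd(coords1, coords2, weights=None):
    """RMSD between paired coordinates, without any superposition.

    Assumption: the weighted mean is normalized by the sum of the weights.
    """
    if len(coords1) != len(coords2):
        raise ValueError("coordinate lists must have the same length")
    if weights is None:
        weights = [1.0] * len(coords1)
    if len(weights) != len(coords1):
        raise ValueError("weights must match the number of atom pairs")
    total = 0.0
    # Atoms are paired in index order, as described for at_list1/at_list2.
    for (x1, y1, z1), (x2, y2, z2), w in zip(coords1, coords2, weights):
        total += w * ((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    return math.sqrt(total / sum(weights))


ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
test = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
unweighted = weighted_in_place_rmsd(ref, test)
weighted = weighted_in_place_rmsd(ref, test, [3.0, 1.0])
print(unweighted, weighted)
```

Giving the first (undisplaced) atom a larger weight lowers the result, since the displaced atom then contributes a smaller share of the average.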

schrodinger.structutils.rmsd.superimpose(st_fixed, atlist_fixed, st_move, atlist_move, use_symmetry=False, move_which=1)

Wrapper for _superimpose_single_atoms, _superimpose_atom_pairs, and _superimpose_many_atoms. This handles cases with 1 or 2 atoms, which fail in mm_superimpose, in addition to cases with more atoms. This function superposes the two molecules and returns the RMS.

move_which can be set to structutils.rmsd.X, where X is CT, ATOMS, or MOLECULES.

Parameters
  • st_fixed (Structure) – Structure being superposed on, will not be moved

  • atlist_fixed (list of ints) – list of atom indices of st_fixed to be considered for superposition

  • st_move (Structure) – Structure being superposed, some fraction will be moved

  • atlist_move (list of ints) – list of atom indices of st_move to be considered for superposition

  • use_symmetry (bool) – If True, symmetry will automatically be detected and the lowest RMS obtained by doing all symmetry-related comparisons will be returned. This option can only be used if the input structures are conformers. Only used for the many-atom case.

  • move_which (MOLECULES, ATOMS, CT) – moveable unit for superposing st_move onto st_fixed

Returns

RMS of st_fixed and st_move (over the atoms in the atom lists) after superposition

Return type

float

schrodinger.structutils.rmsd.superimpose_substructure_molecules(st1, atlist1, st2, atlist2)

Superpose st1 and st2 by moving each molecule independently.

Parameters
  • st1 (Structure) – first structure

  • atlist1 (list of ints) – list of atom indexes for structure 1

  • st2 (Structure) – second structure

  • atlist2 (list of ints) – list of atom indexes for structure 2

Returns

rmsd, st1 and st2 rotated and translated to align

schrodinger.structutils.rmsd.superimpose_bond(st1, at_pair1, st2, at_pair2)

Translate and rotate st2 in place, putting the first atom of at_pair2 on top of the first atom of at_pair1, and the second atom of at_pair2 as close as possible to the corresponding atom of at_pair1. This is commonly used for aligning bonds, but it is not a requirement that the two atoms in each pair be bonded.

st1 and st2 must be different structure objects.

Parameters
  • st1 (structure.Structure) – reference structure

  • at_pair1 (sequence of int) – pair of atom indexes from st1

  • st2 (structure.Structure) – structure to be rotated (modified in place)

  • at_pair2 (sequence of int) – pair of atom indexes from st2
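The geometry described above (translate the first atom onto its partner, then rotate about it so the second atom lies along the reference bond direction) can be sketched in pure Python. This is an illustration only: align_bond and its helpers are hypothetical names, the sketch operates on coordinate tuples with 0-based indices and returns a moved copy, and the use of Rodrigues' rotation formula is an assumption rather than the library's documented method.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v)


def align_bond(ref_a, ref_b, mov_coords, i, j):
    """Return a moved copy of mov_coords with atom i on ref_a and the
    i->j vector pointing along ref_a->ref_b (0-based indices)."""
    u = _unit(_sub(mov_coords[j], mov_coords[i]))  # moving bond direction
    v = _unit(_sub(ref_b, ref_a))                  # reference bond direction
    cos_t = _dot(u, v)
    axis = _cross(u, v)
    sin_t = math.sqrt(_dot(axis, axis))
    if sin_t < 1e-12:
        # Parallel or antiparallel: any axis perpendicular to u will do.
        helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _unit(_cross(u, helper))
        sin_t = 0.0
        cos_t = 1.0 if cos_t > 0 else -1.0
    else:
        axis = tuple(x / sin_t for x in axis)
    moved = []
    for p in mov_coords:
        r = _sub(p, mov_coords[i])  # position relative to the pivot atom
        k_cross_r = _cross(axis, r)
        k_dot_r = _dot(axis, r)
        # Rodrigues' rotation of r about the unit axis.
        rotated = tuple(r[n] * cos_t + k_cross_r[n] * sin_t
                        + axis[n] * k_dot_r * (1.0 - cos_t) for n in range(3))
        moved.append(_add(rotated, ref_a))
    return moved


mobile = [(1.0, 1.0, 1.0), (2.5, 1.0, 1.0), (1.0, 2.0, 1.0)]
moved = align_bond((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), mobile, 0, 1)
print(moved)
```

When the two bond lengths differ, as in this example, the second atom cannot coincide with its partner; it ends up on the reference bond axis, which is the closest position a rigid rotation about the first atom allows.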

schrodinger.structutils.rmsd.get_super_transformation_matrix(st1, at_list1, st2, at_list2)
Returns

NumPy matrix for the transformation that will best superimpose atoms of st2 onto atoms of st1.

Return type

numpy array

Parameters
  • st1 (structure.Structure) – Reference (non-moving) structure.

  • at_list1 (list) – Atom indexes from st1 to consider. Must be the same size as at_list2 and contain at least three atom indexes.

  • st2 (structure.Structure) – Test (moving) structure to be transformed onto st1.

  • at_list2 (list) – Atom indexes from st2 to consider. Must be the same size as at_list1 and contain at least three atom indexes.
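To show how a superposition matrix is typically applied to coordinates, here is a small pure-Python sketch. It assumes a 4x4 homogeneous matrix acting on column vectors [x, y, z, 1]; the shape and convention of the matrix returned by this function are not stated in this documentation, so treat both as assumptions. The helper name apply_transform is hypothetical.

```python
def apply_transform(matrix, coords):
    """Apply a 4x4 homogeneous transform to a list of (x, y, z) tuples.

    Assumption: column-vector convention, i.e. new = M @ [x, y, z, 1].
    """
    out = []
    for x, y, z in coords:
        p = (x, y, z, 1.0)
        # Only the first three rows produce coordinates; the last row is
        # (0, 0, 0, 1) for a rigid-body transform.
        out.append(tuple(sum(matrix[r][c] * p[c] for c in range(4))
                         for r in range(3)))
    return out


# Rotation by 90 degrees about z, followed by a shift of +1 along x.
matrix = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
moved = apply_transform(matrix, [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)])
print(moved)
```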

schrodinger.structutils.rmsd.get_optimal_atom_mapping(reference_structure, reference_atom_list, test_structure, test_atom_list, in_place=True)

FOR USE BY ConformerRmsd class only!

Returns a list of test_structure atom indexes for the optimal, symmetry-aware, pair-wise alignment with the reference_structure.

Parameters
  • reference_structure (structure.Structure)

  • reference_atom_list (list of int) – A list of atom indices.

  • test_structure (structure.Structure)

  • test_atom_list (list of int) – A list of atom indices.

  • in_place (bool) – If False, will superimpose the test_structure on top of the reference_structure in addition to calculating the mapping.

This method is semi-private because the ConformerRmsd class depends on it.
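The idea of a symmetry-aware mapping can be illustrated with a brute-force search: try every reordering of symmetry-equivalent test atoms and keep the order with the lowest in-place RMSD. This is a conceptual sketch only. The library relies on mmsym, whose algorithm is not described here; best_mapping, the equivalent_groups argument, and the 0-based indices are hypothetical choices made for the example.

```python
import itertools
import math


def rmsd(a, b):
    """Unweighted RMSD between two equally long lists of (x, y, z) tuples."""
    total = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(total / len(a))


def best_mapping(ref_coords, test_coords, equivalent_groups):
    """Return (order, rmsd) where order[i] is the test atom paired with
    reference atom i. equivalent_groups lists test indices that may be
    swapped among themselves (for example, two carboxylate oxygens)."""
    best_order, best_value = None, float("inf")
    per_group = [list(itertools.permutations(g)) for g in equivalent_groups]
    for combo in itertools.product(*per_group):
        order = list(range(len(test_coords)))
        for group, perm in zip(equivalent_groups, combo):
            for slot, src in zip(group, perm):
                order[slot] = src
        value = rmsd(ref_coords, [test_coords[i] for i in order])
        if value < best_value:
            best_order, best_value = order, value
    return best_order, best_value


# Same geometry, but the two equivalent atoms are numbered the other way round.
ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
test = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]
order, value = best_mapping(ref, test, [[1, 2]])
print(order, value, rmsd(ref, test))
```

The index-order RMSD of this pair is well above zero even though the geometries are identical, which is the artifact that symmetry handling is meant to remove. The number of candidate orderings grows factorially with group size, so exhaustive search like this is only practical for small groups.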

schrodinger.structutils.rmsd.renumber_conformer(ref_st, test_st, use_symmetry=False, in_place=True)

Renumbers atoms in <test_st> so that they match <ref_st>. Structures must have hydrogens, see CANVAS-5422.

Parameters
  • ref_st (structure.Structure) – Structure to use as a reference for renumbering.

  • test_st (structure.Structure) – Structure to renumber (renumbered copy will be returned)

  • use_symmetry (bool) – Whether to consider symmetry. Always set to True unless mmsym is going to be used on the output structure.

  • in_place (bool) – If True (default), the conformer is only renumbered; if False, it will also be superimposed on top of the ref_st based on the new atom numbering.

Returns

Renumbered version of test structure.

Return type

structure.Structure

class schrodinger.structutils.rmsd.ConformerRmsd(reference_structure, test_structure=None, asl_expr='NOT atom.element H', in_place=True)

Bases: object

A class to calculate the root mean square deviation between the atomic coordinates of two conformer structure.Structure objects. The inputs are expected to be conformers in the traditional sense.

Working copies of the input structures are modified instead of the originals. The superimpose transformation is applied to the entire test_structure. The transformation matrix is saved as the superposition_matrix property if in_place is set to False.

Renumbering is achieved by creating a list of SMARTS patterns, one for each molecule in the reference structure, evaluating the SMARTS pattern with both the reference and test structures to get a standard order of atom indexes, then passing that atom order to mm.mmct_ct_reorder. Renumbering can be slow with protein-sized molecules so you may want to disable that feature when working with large molecules.

API Example:

from schrodinger import structure
from schrodinger.structutils.rmsd import ConformerRmsd

# Calculate in place, heavy atom RMSD.
st1 = structure.Structure.read('file1.mae')
st2 = structure.Structure.read('file2.mae')
conf_rmsd = ConformerRmsd(st1, st2)  # in place, heavy atom RMSD calc.
if conf_rmsd.calculate() < 2.00:
    print("Good pose")

# Loop over structures in test.mae, comparing each to ref.mae.
st1 = structure.Structure.read('ref.mae')
conf_rmsd = ConformerRmsd(st1, st1)  # in place, heavy atom RMSD calc.
for st in structure.StructureReader('test.mae'):
    conf_rmsd.test_structure = st
    print(conf_rmsd.calculate())

Instance Attributes

Variables
  • use_symmetry – Boolean to control whether the test structure atom list should be determined with the mmsym library. Mmsym accounts for molecular symmetry and is recommended. This boolean just allows for more detailed testing. NOTE: Make sure use_symmetry is True if using renumber_structures.

  • renumber_structures – Boolean to control whether the reference and test structures should be renumbered by a SMARTS pattern before calculating the rmsd. For better performance, set to False when the inputs are sure to have the same atomic numbering schemes. NOTE: Make sure use_symmetry is True if using renumber_structures.

  • use_heavy_atom_graph – Boolean to control whether the reference and test structures should be treated as heavy-atom only, graph topologies. Default is False. Tautomers, and different ionization states are not true conformers, but often require RMSD analysis. If True, the test_structure and reference_structure are treated by deleting all hydrogens, setting all bond orders to 1, setting all formal charges to 0, then adjusting the atom types.

  • orig_index_prop – m2io dataname for the atomic property that stores the original atom index of the input structures. Default is ‘i_confrmsd_original_index’. This is needed so that we can extract/reduce/renumber and still present information about the original index which the end-user perceives.

  • rmsd (float) – Root mean square deviation of atomic coordinates

  • max_distance (float) – Greatest displacement between the atom pairs

  • max_distance_atom_1 (integer) – Reference atom index (in original atom scheme)

  • max_distance_atom_2 (integer) – Test atom index (in the original atom scheme)

  • rmsd_str – String of basic rmsd info == str(self)

  • precision – Precision of the RMSD stored to find the minimum if search_permutations=True. Defaults to 6 (meaning a precision of 10^-6).

  • max_permutations – Maximum number of permutations to search. Defaults to 10000000.

  • superposition_matrix – Matrix that was used for superimposing the test structure on top of the reference structure, if in_place=False.

Note

The result attributes listed above (such as rmsd and max_distance) are available after calculate().

Raise

exceptions if a preparation step can’t be completed, or if the input structures can’t be handled as conformers.

orig_index_prop = 'i_confrmsd_original_index'
__init__(reference_structure, test_structure=None, asl_expr='NOT atom.element H', in_place=True)
Parameters
  • reference_structure (structure.Structure) – Template structure

  • test_structure (structure.Structure) – The mobile structure.

  • asl_expr (string) – Atom Specification Language (ASL) expression to identify the atoms used to calculate the RMSD and to base the superimpose alignment on.

  • in_place (Boolean) – If True, calculate the RMSD without moving the test_structure. Otherwise, perform the optimal alignment then calculate the RMSD.

getRmsdDataname()
Returns

m2io property dataname string. The property name indicates the reference structure, the title, the ASL used to identify comparison atoms and if the structure is in-place or mobile.

Return type

string

writeStructures(file_name='rmsd.mae', mode='w')

Writes the reference and test structures to file.

Parameters
  • file_name (string) – Path of the structure file to write.

  • mode (string) – ‘w’ to write (clobbering as needed), ‘a’ to append.

writeCommand(file_name='rmsd.cmd')

Writes a Maestro command file and structures with the pair wise atom mapping in command file mode. The Maestro file has the same basename as the command file. Clobbers existing files.

Parameters

file_name (string) – Path to the maestro command file with the atom pairings.

Raise

ValueError if file_name does not have ‘.cmd’ extension.

evaluateAsl(st)

Return the atoms of the input structure that match <asl_expr>.

Parameters

st (structure.Structure) – Structure to evaluate

Returns

Atoms matching the ASL

Return type

list of ints

Raise

RuntimeError if ASL matching failed.

calculate()
Returns

Root-mean-squared difference of atom coordinates.

Type

float

Raise

ValueError if working versions of the reference and test structures don’t have the same shape (non-confs).

The order of operations:
  • prepare working copies of the reference and test structures.

    ** copy the structures.
    ** encode the original atom indexes as atom properties.
    ** extract substructure of the atoms matching the ASL.
    ** reduce to heavy atom graph (instance option, non-default).
    ** normalize numbering scheme (instance option, default).

  • determine molecular symmetry mapping (optional, default).

  • create a numpy coordinate array for the working structures.

  • numpy linear algebra SVD to superimpose (instance option).

    ** transform test_structure

  • numpy array used to calculate RMSD, and max_dist.

  • decode original indexes to identify atoms involved in max dist.
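The coordinate-array, SVD superposition, and RMSD/max-distance steps listed above can be sketched with plain NumPy. This is a minimal illustration that assumes the SVD step is a standard Kabsch fit, which the text does not confirm; superimpose_and_rmsd is a hypothetical name, and the structure preparation, renumbering, and symmetry steps are left out.

```python
import numpy as np


def superimpose_and_rmsd(ref, test):
    """Fit test onto ref (Kabsch), then report RMSD and the worst atom pair.

    Returns (rmsd, max_distance, index_of_max, moved_test_coords).
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    if ref.shape != test.shape:
        raise ValueError("coordinate arrays must have the same shape")
    ref_center = ref.mean(axis=0)
    p = test - test.mean(axis=0)   # centered moving coordinates
    q = ref - ref_center           # centered reference coordinates
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    moved = p @ rotation.T + ref_center
    dists = np.linalg.norm(moved - ref, axis=1)
    worst = int(np.argmax(dists))
    rmsd = float(np.sqrt(np.mean(dists ** 2)))
    return rmsd, float(dists[worst]), worst, moved


ref = [[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3]]
# The same four points rotated 90 degrees about z and shifted by (5, -1, 2).
test = [[5, -1, 2], [5, 0, 2], [3, -1, 2], [5, -1, 5]]
rmsd, max_dist, worst, moved = superimpose_and_rmsd(ref, test)
print(round(rmsd, 6), round(max_dist, 6))
```

Skipping the fit and measuring distances directly on the input arrays corresponds to the in_place=True behavior described for this class.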

renumberBySymmetry()

Renumber the atoms in _working_test_st based on the reference structure using mmsym.

renumberWorkingStructures()

Renumber the working structures to give them identical numbering. By default, the test structure is renumbered to match the reference using the renumber_conformer() function, but this behavior can be changed by overriding this method in a subclass.

class schrodinger.structutils.rmsd.ConformerRmsdX(*args, **kwargs)

Bases: schrodinger.structutils.rmsd.ConformerRmsd

__init__(*args, **kwargs)
Parameters
  • reference_structure (structure.Structure) – Template structure

  • test_structure (structure.Structure) – The mobile structure.

  • asl_expr (string) – Atom Specification Language (ASL) expression to identify the atoms used to calculate the RMSD and to base the superimpose alignment on.

  • in_place (Boolean) – If True, calculate the RMSD without moving the test_structure. Otherwise, perform the optimal alignment then calculate the RMSD.

calculate()
Returns

Root-mean-squared difference of atom coordinates.

Type

float

Raise

ValueError if working versions of the reference and test structures don’t have the same shape (non-confs).

The order of operations:
  • prepare working copies of the reference and test structures.

    ** copy the structures.
    ** encode the original atom indexes as atom properties.
    ** extract substructure of the atoms matching the ASL.
    ** reduce to heavy atom graph (instance option, non-default).
    ** normalize numbering scheme (instance option, default).

  • determine molecular symmetry mapping (optional, default).

  • create a numpy coordinate array for the working structures.

  • numpy linear algebra SVD to superimpose (instance option).

    ** transform test_structure

  • numpy array used to calculate RMSD, and max_dist.

  • decode original indexes to identify atoms involved in max dist.

evaluateAsl(st)

Return the atoms of the input structure that match <asl_expr>.

Parameters

st (structure.Structure) – Structure to evaluate

Returns

Atoms matching the ASL

Return type

list of ints

Raise

RuntimeError if ASL matching failed.

getRmsdDataname()
Returns

m2io property dataname string. The property name indicates the reference structure, the title, the ASL used to identify comparison atoms and if the structure is in-place or mobile.

Return type

string

orig_index_prop = 'i_confrmsd_original_index'
renumberBySymmetry()

Renumber the atoms in _working_test_st based on the reference structure using mmsym.

renumberWorkingStructures()

Renumber the working structures to give them identical numbering. By default, the test structure is renumbered to match the reference using the renumber_conformer() function, but this behavior can be changed by overriding this method in a subclass.

writeCommand(file_name='rmsd.cmd')

Writes a Maestro command file and structures with the pair wise atom mapping in command file mode. The Maestro file has the same basename as the command file. Clobbers existing files.

Parameters

file_name (string) – Path to the maestro command file with the atom pairings.

Raise

ValueError if file_name does not have ‘.cmd’ extension.

writeStructures(file_name='rmsd.mae', mode='w')

Writes the reference and test structures to file.

Parameters
  • file_name (string) – Path of the structure file to write.

  • mode (string) – ‘w’ to write (clobbering as needed), ‘a’ to append.

schrodinger.structutils.rmsd.cealign(st_refe, st_mobi, atlist_refe=None, atlist_mobi=None, transform_mobile=True, window_size=8, max_gap=30, use_guide=True)

Align two structures using the CEAlign method.

Parameters
  • st_refe (structure.Structure) – Reference structure, always remains fixed.

  • st_mobi (structure.Structure) – Mobile structure to be aligned onto the reference.

  • atlist_refe (list(int)) – List of atom indices from reference structure to be used for the alignment and superposition. If None, depending on the value of use_guide, includes either all guide atoms or all heavy atoms in the structure.

  • atlist_mobi (list(int)) – List of atom indices from mobile structure to be used for the alignment and superposition. If None, depending on the value of use_guide, includes either all guide atoms or all heavy atoms in the structure.

  • transform_mobile (bool) – If True (default), the coordinates of the mobile structure will be modified after calculating the optimal alignment.

  • window_size (float) – CE algorithm parameter. Defines the length of fragments to be used for the alignment. Smaller values may yield more accurate alignments, up to a certain point, at the expense of computation time. Default is 8, as defined in the original publication.

  • max_gap (float) – CE algorithm parameter. Size of maximum gap allowed between two aligned fragment pairs, in number of residues. Smaller values yield faster computation times, but might prevent similar fragments from being found by the algorithm. Default is 30, as defined in the original publication and found to be a reasonable compromise between computational cost and alignment quality.

  • use_guide (bool) – Use guide atoms to calculate the optimal alignment. Guide atoms depend on the molecule type: C-alpha for protein, C4’ for nucleic acid. If use_guide is False, will use all heavy atoms, which degrades performance of the algorithm without significant increase in accuracy.

Return type

Alignment

Returns

a named tuple with the RMSD (in Angstrom) between the two structures after superposition, the alignment length, a mapping of the aligned residues, and the transform matrix as a 4x4 numpy array.
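Because the result is a named tuple, its fields can be read either by name or by position. The sketch below uses a hypothetical stand-in class, AlignmentSketch, that mirrors the documented field order (alignment, length, rmsd, transform_matrix) so it runs without the library. The residue labels and numbers are made-up placeholders, and pairing the i-th reference residue with the i-th mobile residue is an assumption about how the two tuples relate.

```python
from typing import NamedTuple, Sequence, Tuple


class AlignmentSketch(NamedTuple):
    """Hypothetical stand-in with the same field order as Alignment."""
    alignment: Tuple[Sequence[str], Sequence[str]]
    length: int
    rmsd: float
    transform_matrix: Sequence[Sequence[float]]


# Placeholder values for illustration only, not real alignment output.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
result = AlignmentSketch(
    alignment=(("A:10", "A:11"), ("B:7", "B:8")),
    length=2,
    rmsd=1.25,
    transform_matrix=identity,
)

# Access by name or by field number (rmsd is field number 2).
assert result.rmsd == result[2]

# Assumed pairing: i-th reference residue with i-th mobile residue.
pairs = list(zip(*result.alignment))
ref_residues, mobile_residues, *_ = result.alignment
print(result.length, result.rmsd, pairs)
```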