schrodinger.structutils.rmsd module¶
Functionality for calculating conformer RMSDs.
Copyright Schrodinger, LLC. All rights reserved.
- exception schrodinger.structutils.rmsd.DisabledSymmetryWarning¶
Bases:
RuntimeWarning
Warning issued by ConformerRmsd when use_symmetry is disabled.
- __init__(*args, **kwargs)¶
- args¶
- with_traceback()¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
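Because the warning derives from RuntimeWarning, it can be recorded or promoted to an error with the standard warnings module. The sketch below uses a hypothetical stand-in class (DisabledSymmetryWarningSketch) and a hypothetical function (noisy_calculation) so that it runs without the Schrodinger package. When the real class is actually emitted is not stated here, so treat the trigger as an assumption.

```python
import warnings


class DisabledSymmetryWarningSketch(RuntimeWarning):
    """Hypothetical stand-in for rmsd.DisabledSymmetryWarning."""


def noisy_calculation():
    # Hypothetical calculation that falls back to use_symmetry=False.
    warnings.warn("use_symmetry was disabled", DisabledSymmetryWarningSketch)
    return 1.25


# Record warnings instead of printing them, then inspect what was raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = noisy_calculation()

symmetry_warnings = [
    w for w in caught if issubclass(w.category, DisabledSymmetryWarningSketch)
]

# Promote the warning to an exception when a silent fallback is unacceptable.
with warnings.catch_warnings():
    warnings.simplefilter("error", DisabledSymmetryWarningSketch)
    try:
        noisy_calculation()
        raised = False
    except DisabledSymmetryWarningSketch:
        raised = True
```

With the real module, the same filters would presumably be applied to schrodinger.structutils.rmsd.DisabledSymmetryWarning.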
- class schrodinger.structutils.rmsd.Alignment(alignment: Tuple[Tuple[schrodinger.structure._structure._Residue], Tuple[schrodinger.structure._structure._Residue]], length: int, rmsd: float, transform_matrix: numpy.array)¶
Bases:
tuple
Container to store CEAlign alignment data.
- alignment: Tuple[Tuple[schrodinger.structure._structure._Residue], Tuple[schrodinger.structure._structure._Residue]]¶
Alias for field number 0
- length: int¶
Alias for field number 1
- rmsd: float¶
Alias for field number 2
- transform_matrix: numpy.array¶
Alias for field number 3
- __contains__(key, /)¶
Return key in self.
- __len__()¶
Return len(self).
- count(value, /)¶
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)¶
Return first index of value.
Raises ValueError if the value is not present.
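Since Alignment is a named tuple, its fields can be read by name, by position, or by unpacking. The following is a minimal sketch using a hypothetical stand-in (AlignmentSketch) with the same four field names. Residue objects are replaced by plain strings and the matrix by nested lists, purely for illustration.

```python
from typing import NamedTuple, Tuple


class AlignmentSketch(NamedTuple):
    """Hypothetical stand-in mirroring the four documented fields."""
    alignment: Tuple[Tuple[str, ...], Tuple[str, ...]]
    length: int
    rmsd: float
    transform_matrix: list


# 4x4 identity matrix as nested lists (the real field is a numpy array).
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

result = AlignmentSketch(
    alignment=(("A:10", "A:11"), ("B:7", "B:8")),
    length=2,
    rmsd=0.85,
    transform_matrix=identity,
)

# Fields can be read by name or by position ("Alias for field number 2").
by_name = result.rmsd
by_index = result[2]

# Tuple unpacking follows the documented field order.
pairs, n_aligned, value, matrix = result
```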
- schrodinger.structutils.rmsd.calculate_in_place_rmsd(st1, at_list1, st2, at_list2, use_symmetry=False, weights=None)¶
- Returns
Atomic coordinate rmsd between structures.
- Return type
float
- Parameters
st1 (structure.Structure) – Reference structure.
at_list1 (list) – List of atom index integers to consider. This must be the same length as at_list2. The coordinates from st1 and st2 are mapped in index order.
st2 (structure.Structure) – Test structure. Must be a conformer of st1 if use_symmetry is True.
at_list2 (list) – List of atom index integers to consider. This must be the same length as at_list1. The coordinates from st1 and st2 are mapped in index order.
use_symmetry (boolean) – Adjust the at_list2 index order so that it is optimized with regard to molecular symmetry. The default is False for backwards compatibility, but accounting for molecular symmetry is usually desirable.
weights – A list of weights, which must be the same length as at_list1. The weights in this list pertain to the relative importance to assign to the position deviations of atoms in at_list2 from atoms in at_list1. The weights list is assumed to contain positive values.
- Raises
ValueError if the shape of the coordinate matrices is not the same, or if the length of the weights array is not the same as the length of the coordinate array. TypeError if the data type of the weights is not float.
- Note
The input structures are expected to be conformers. See also ConformerRmsd, which supports calculating the RMSD of a common conformer atom subset specified by ASL with non-conformer input structures.
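To illustrate what an in-place (no superposition) RMSD with optional weights involves, here is a minimal pure-Python sketch. The function name in_place_rmsd is hypothetical, it takes raw coordinates rather than structures and atom lists, and normalizing by the sum of the weights is an assumption, since the exact weighting formula is not given here.

```python
import math


def in_place_rmsd(coords1, coords2, weights=None):
    """RMSD between paired coordinates without any superposition."""
    if len(coords1) != len(coords2):
        raise ValueError("coordinate lists must have the same length")
    if weights is None:
        weights = [1.0] * len(coords1)
    if len(weights) != len(coords1):
        raise ValueError("weights must match the number of atom pairs")
    if not all(isinstance(w, float) for w in weights):
        raise TypeError("weights must be floats")

    weighted_sum = 0.0
    for (x1, y1, z1), (x2, y2, z2), w in zip(coords1, coords2, weights):
        weighted_sum += w * ((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    # Assumption: normalize by the sum of the weights.
    return math.sqrt(weighted_sum / sum(weights))


ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
test = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]

# Pairs are matched in index order, as with at_list1 and at_list2.
unweighted = in_place_rmsd(ref, test)
# Giving the undisplaced atom more weight lowers the result.
reweighted = in_place_rmsd(ref, test, weights=[1.0, 3.0])
```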
- schrodinger.structutils.rmsd.superimpose(st_fixed, atlist_fixed, st_move, atlist_move, use_symmetry=False, move_which=1)¶
Wrapper for _superimpose_single_atoms, _superimpose_atom_pairs, and _superimpose_many_atoms. In addition to the many-atom case, it handles the 1- and 2-atom cases, which fail in mm_superimpose. The function superposes the two structures and returns the RMS.
move_which can be set to structutils.rmsd.X, where X is CT, ATOMS, or MOLECULES.
- Parameters
st_fixed (Structure) – Structure being superposed on, will not be moved
atlist_fixed (list of ints) – list of atom indices of st_fixed to be considered for superposition
st_move (Structure) – Structure being superposed, some fraction will be moved
atlist_move (list of ints) – list of atom indices of st_move to be considered for superposition
use_symmetry (bool) – If True, symmetry will automatically be detected and the lowest RMS obtained by doing all symmetry-related comparisons will be returned. This option can only be used if the input structures are conformers. Only used for the many-atom case.
move_which (MOLECULES, ATOMS, CT) – moveable unit for superposing st_move onto st_fixed
- Returns
RMS of st_fixed and st_move (over the atoms in the atom lists) after superposition
- Return type
float
- schrodinger.structutils.rmsd.superimpose_substructure_molecules(st1, atlist1, st2, atlist2)¶
Superpose st1 and st2 by moving each molecule independently.
- schrodinger.structutils.rmsd.superimpose_bond(st1, at_pair1, st2, at_pair2)¶
Translate and rotate st2 in place, putting the first atom of at_pair2 on top of the first atom of at_pair1, and the second atom of at_pair2 as close as possible to the corresponding atom of at_pair1. This is commonly used for aligning bonds, but it is not a requirement that the two atoms in each pair be bonded.
st1 and st2 must be different structure objects.
- Parameters
st1 (structure.Structure) – reference structure
at_pair1 (sequence of int) – pair of atom indexes from st1
st2 (structure.Structure) – structure to be rotated (modified in place)
at_pair2 (sequence of int) – pair of atom indexes from st2
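The geometry described above (translate so the first atoms coincide, then rotate so the second atom lies along the reference bond direction) can be sketched in pure Python. This is an illustration under stated assumptions, not the library's implementation: superimpose_pair and its helpers are hypothetical names, atoms are plain coordinate tuples addressed by 0-based position, and a moved copy is returned instead of modifying a structure in place.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _add(u, v):
    return (u[0] + v[0], u[1] + v[1], u[2] + v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _unit(u):
    norm = math.sqrt(_dot(u, u))
    return (u[0] / norm, u[1] / norm, u[2] / norm)


def _rotate(v, axis, cos_t, sin_t):
    """Rodrigues' formula: rotate v about a unit axis."""
    kxv = _cross(axis, v)
    kdv = _dot(axis, v)
    return tuple(v[i] * cos_t + kxv[i] * sin_t + axis[i] * kdv * (1.0 - cos_t)
                 for i in range(3))


def superimpose_pair(ref_pair, mobile_coords, mobile_pair):
    """Return moved copies of mobile_coords.

    ref_pair: two reference points (a1, b1).
    mobile_pair: two 0-based positions (a2, b2) in mobile_coords.
    """
    a1, b1 = ref_pair
    a2 = mobile_coords[mobile_pair[0]]
    b2 = mobile_coords[mobile_pair[1]]
    target = _unit(_sub(b1, a1))
    current = _unit(_sub(b2, a2))

    cos_t = max(-1.0, min(1.0, _dot(current, target)))
    axis = _cross(current, target)
    sin_t = math.sqrt(_dot(axis, axis))
    if sin_t < 1e-12:
        # Parallel or antiparallel bonds: any perpendicular axis works.
        helper = (1.0, 0.0, 0.0) if abs(current[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _unit(_cross(current, helper))
        cos_t = 1.0 if cos_t > 0.0 else -1.0
        sin_t = 0.0
    else:
        axis = _unit(axis)

    # Rotate every atom about the first mobile atom, then place it on a1.
    return [_add(_rotate(_sub(p, a2), axis, cos_t, sin_t), a1)
            for p in mobile_coords]


reference_pair = ((0.0, 0.0, 0.0), (0.0, 0.0, 1.5))
mobile = [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0), (1.0, 2.0, 1.0)]
moved = superimpose_pair(reference_pair, mobile, (0, 1))
```

In this example the mobile bond is shorter than the reference bond, so the second atom ends up on the reference bond axis but not on the second reference atom, which is the "as close as possible" behavior described above.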
- schrodinger.structutils.rmsd.get_super_transformation_matrix(st1, at_list1, st2, at_list2)¶
- Returns
numpy matrix for the transformation that will best superimpose atoms of st2 onto atoms of st1.
- Return type
numpy array
- Parameters
st1 (structure.Structure) – Reference (non-moving) structure.
at_list1 (list) – Atom indexes from st1 to consider. Must be the same size as at_list2 and contain at least three atom indexes.
st2 (structure.Structure) – Test (moving) structure to be transformed onto st1.
at_list2 (list) – Atom indexes from st2 to consider. Must be the same size as at_list1 and contain at least three atom indexes.
- schrodinger.structutils.rmsd.get_optimal_atom_mapping(reference_structure, reference_atom_list, test_structure, test_atom_list, in_place=True)¶
FOR USE BY ConformerRmsd class only!
Returns a list of test_structure atom indexes for the optimal, symmetry-aware, pair-wise alignment with the reference_structure.
- Parameters
reference_structure (structure.Structure)
reference_atom_list (list of int) – A list of atom indices.
test_structure (structure.Structure)
test_atom_list (list of int) – A list of atom indices.
in_place (bool) – If False, will superimpose the test_structure on top of the reference_structure in addition to calculating the mapping.
This method is semi-private because the ConformerRmsd class depends on it.
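To show what a symmetry-aware mapping is meant to achieve, the sketch below searches permutations of interchangeable atoms by brute force and keeps the ordering with the lowest in-place RMSD. This is a swapped-in technique for illustration only and not how mmsym or this function works. best_mapping, rmsd_for_order, and the equivalent_groups input are hypothetical, and positions are 0-based, unlike structure atom indexes.

```python
import itertools
import math


def rmsd_for_order(ref_coords, test_coords, order):
    """In-place RMSD when ref atom i is paired with test atom order[i]."""
    total = 0.0
    for ref_point, test_index in zip(ref_coords, order):
        total += sum((a - b) ** 2
                     for a, b in zip(ref_point, test_coords[test_index]))
    return math.sqrt(total / len(ref_coords))


def best_mapping(ref_coords, test_coords, equivalent_groups):
    """Search permutations within each group of interchangeable atoms.

    equivalent_groups: lists of 0-based positions that may be swapped with
    each other (hypothetical input; a real tool derives this from topology).
    """
    base = list(range(len(test_coords)))
    best_order = base
    best_rmsd = rmsd_for_order(ref_coords, test_coords, base)
    per_group = [list(itertools.permutations(g)) for g in equivalent_groups]
    for combo in itertools.product(*per_group):
        order = list(base)
        for group, perm in zip(equivalent_groups, combo):
            for slot, source in zip(group, perm):
                order[slot] = source
        value = rmsd_for_order(ref_coords, test_coords, order)
        if value < best_rmsd:
            best_order, best_rmsd = order, value
    return best_order, best_rmsd


# A carboxylate-like fragment whose two equivalent oxygens are swapped.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
test = [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

naive = rmsd_for_order(ref, test, [0, 1, 2])
order, symmetric = best_mapping(ref, test, equivalent_groups=[[1, 2]])
```

The naive index-order comparison reports a large deviation for what is the same geometry, while the symmetry-aware ordering pairs each oxygen with its nearest equivalent.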
- schrodinger.structutils.rmsd.renumber_conformer(ref_st, test_st, use_symmetry=False, in_place=True)¶
Renumbers atoms in <test_st> so that they match <ref_st>. Structures must have hydrogens, see CANVAS-5422.
- Parameters
ref_st (structure.Structure) – Structure to use as a reference for renumbering.
test_st (structure.Structure) – Structure to renumber (a renumbered copy will be returned).
use_symmetry (bool) – Whether to consider symmetry. Always set to True unless mmsym is going to be used on the output structure.
in_place (bool) – If True (default), the conformer is only renumbered; if False, it will also be superimposed on top of the ref_st based on the new atom numbering.
- Returns
Renumbered version of the test structure.
- Return type
structure.Structure
- class schrodinger.structutils.rmsd.ConformerRmsd(reference_structure, test_structure=None, asl_expr='NOT atom.element H', in_place=True)¶
Bases:
object
A class to calculate the root mean square deviation between the atomic coordinates of two conformer structure.Structure objects. The inputs are expected to be conformers in the traditional sense.
Working copies of the input structures are modified instead of the originals. The superimpose transformation is applied to the entire test_structure. The transformation matrix is saved as the superposition_matrix property if in_place is set to False.
Renumbering is achieved by creating a list of SMARTS patterns, one for each molecule in the reference structure, evaluating the SMARTS pattern with both the reference and test structures to get a standard order of atom indexes, then passing that atom order to mm.mmct_ct_reorder. Renumbering can be slow with protein-sized molecules so you may want to disable that feature when working with large molecules.
API Example:
    from schrodinger import structure
    from schrodinger.structutils.rmsd import ConformerRmsd

    # Calculate in place, heavy atom RMSD.
    st1 = structure.Structure.read('file1.mae')
    st2 = structure.Structure.read('file2.mae')
    conf_rmsd = ConformerRmsd(st1, st2)  # in place, heavy atom RMSD calc.
    if conf_rmsd.calculate() < 2.00:
        print("Good pose")

    # Loop over structures in test.mae, comparing to ref.mae
    st1 = structure.Structure.read('ref.mae')
    conf_rmsd = ConformerRmsd(st1, st1)  # in place, heavy atom RMSD calc.
    for st in structure.StructureReader('test.mae'):
        conf_rmsd.test_structure = st
        print(conf_rmsd.calculate())
Instance Attributes
- Variables
use_symmetry – Boolean to control whether the test structure atom list should be determined with the mmsym library. Mmsym accounts for molecular symmetry and is recommended. This boolean just allows for more detailed testing. NOTE: Make sure use_symmetry is True if using renumber_structures.
renumber_structures – Boolean to control whether the reference and test structures should be renumbered by a SMARTS pattern before calculating the rmsd. For better performance, set to False when the inputs are sure to have the same atomic numbering schemes. NOTE: Make sure use_symmetry is True if using renumber_structures.
use_heavy_atom_graph – Boolean to control whether the reference and test structures should be treated as heavy-atom only, graph topologies. Default is False. Tautomers, and different ionization states are not true conformers, but often require RMSD analysis. If True, the test_structure and reference_structure are treated by deleting all hydrogens, setting all bond orders to 1, setting all formal charges to 0, then adjusting the atom types.
orig_index_prop – m2io dataname for the atomic property that stores the original atom index of the input structures. Default is ‘i_confrmsd_original_index’. This is needed so that we can extract, reduce, and renumber the structures and still present information about the original index that the end user perceives.
rmsd (float) – Root mean square deviation of atomic coordinates
max_distance (float) – Greatest displacement between the atom pairs
max_distance_atom_1 (integer) – Reference atom index (in original atom scheme)
max_distance_atom_2 (integer) – Test atom index (in the original atom scheme)
rmsd_str – String of basic rmsd info == str(self)
precision – Number of decimal places to which the rmsd is stored when searching for the minimum if search_permutations=True. Defaults to 6 (meaning a precision of 10^-6).
max_permutations – Maximum number of permutations to search. Defaults to 10000000.
superposition_matrix – Matrix that was used for superimposing the test structure on top of the reference structure, if in_place=False.
- Note
The result attributes listed above, such as rmsd and max_distance, are available after calculate()
- Raise
exceptions if a preparation step can’t be completed, or if the input structures can’t be handled as conformers.
- orig_index_prop = 'i_confrmsd_original_index'¶
- __init__(reference_structure, test_structure=None, asl_expr='NOT atom.element H', in_place=True)¶
- Parameters
reference_structure (structure.Structure) – Template structure
test_structure (structure.Structure) – The mobile structure.
asl_expr (string) – Atom Specification Language (ASL) expression identifying the atoms used to calculate the RMSD and to base the superimpose alignment on.
in_place (Boolean) – If True, calculate the RMSD without moving the test_structure. Otherwise, perform the optimal alignment then calculate the RMSD.
- getRmsdDataname()¶
- Returns
m2io property dataname string. The property name indicates the reference structure, the title, the ASL used to identify comparison atoms, and whether the structure is in-place or mobile.
- Return type
string
- writeStructures(file_name='rmsd.mae', mode='w')¶
Writes the reference and test structures to file.
- Parameters
file_name (string) – Path of the structure file to write.
mode (string) – ‘w’ to write (clobbering as needed) or ‘a’ to append.
- writeCommand(file_name='rmsd.cmd')¶
Writes a Maestro command file and structures with the pairwise atom mapping in command file mode. The Maestro file has the same basename as the command file. Clobbers existing files.
- Parameters
file_name (string) – Path to the maestro command file with the atom pairings.
- Raise
ValueError if file_name does not have ‘.cmd’ extension.
- evaluateAsl(st)¶
Return the atoms of the input structure that match <asl_expr>.
- Parameters
st (structure.Structure) – Structure to evaluate
- Returns
Atoms matching the ASL
- Return type
list of ints
- Raise
RuntimeError if ASL matching failed.
- calculate()¶
- Returns
Root-mean-squared difference of atom coordinates.
- Return type
float
- Raise
ValueError if working versions of the reference and test structures don’t have the same shape (non-confs).
- The order of operations:
- prepare working copies of the reference and test structures.
** copy the structures.
** encode the original atom indexes as atom properties.
** extract substructure of the atoms matching the ASL.
** reduce to heavy atom graph (instance option, non-default).
** normalize numbering scheme (instance option, default).
- determine molecular symmetry mapping (optional, default).
- create a numpy coordinate array for the working structures.
- numpy linear algebra SVD to superimpose (instance option).
** transform test_structure
- numpy array used to calculate RMSD, and max_dist.
- decode original indexes to identify atoms involved in max dist.
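The last two steps above (computing the RMSD and maximum displacement, then decoding the original atom indexes of the worst pair) can be sketched as follows. This is a pure-Python illustration under assumptions: rmsd_and_max_distance is a hypothetical helper, and each atom is represented as an (original_index, coordinates) pair standing in for the index stored under orig_index_prop.

```python
import math


def rmsd_and_max_distance(ref_atoms, test_atoms):
    """ref_atoms / test_atoms: lists of (original_index, (x, y, z)) pairs."""
    if len(ref_atoms) != len(test_atoms):
        raise ValueError("working structures do not have the same shape")
    total = 0.0
    max_distance = -1.0
    max_pair = None
    for (ref_index, ref_xyz), (test_index, test_xyz) in zip(ref_atoms, test_atoms):
        dist_sq = sum((a - b) ** 2 for a, b in zip(ref_xyz, test_xyz))
        total += dist_sq
        dist = math.sqrt(dist_sq)
        if dist > max_distance:
            # Remember the original indexes of the worst-matching pair.
            max_distance = dist
            max_pair = (ref_index, test_index)
    return math.sqrt(total / len(ref_atoms)), max_distance, max_pair


# Original indexes differ from list positions because hydrogens were removed.
ref_atoms = [(1, (0.0, 0.0, 0.0)), (2, (1.0, 0.0, 0.0)), (4, (0.0, 1.0, 0.0))]
test_atoms = [(1, (0.0, 0.0, 0.0)), (2, (1.0, 0.0, 0.0)), (5, (0.0, 1.0, 3.0))]

rmsd_value, max_dist, (atom_1, atom_2) = rmsd_and_max_distance(ref_atoms, test_atoms)
```

Here atom_1 and atom_2 play the role of max_distance_atom_1 and max_distance_atom_2, reported in the original numbering rather than the working-copy numbering.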
- renumberBySymmetry()¶
Renumber the atoms in _working_test_st based on the reference structure using mmsym.
- renumberWorkingStructures()¶
Renumber the working structures to give them identical numbering. By default, the test structure is renumbered to match the reference using the renumber_conformer() function, but this behavior can be changed by overriding this method in a subclass.
- class schrodinger.structutils.rmsd.ConformerRmsdX(*args, **kwargs)¶
Bases:
schrodinger.structutils.rmsd.ConformerRmsd
- __init__(*args, **kwargs)¶
- Parameters
reference_structure (structure.Structure) – Template structure
test_structure (structure.Structure) – The mobile structure.
asl_expr (string) – Atom Specification Language (ASL) expression identifying the atoms used to calculate the RMSD and to base the superimpose alignment on.
in_place (Boolean) – If True, calculate the RMSD without moving the test_structure. Otherwise, perform the optimal alignment then calculate the RMSD.
- calculate()¶
- Returns
Root-mean-squared difference of atom coordinates.
- Return type
float
- Raise
ValueError if working versions of the reference and test structures don’t have the same shape (non-confs).
- The order of operations:
- prepare working copies of the reference and test structures.
** copy the structures.
** encode the original atom indexes as atom properties.
** extract substructure of the atoms matching the ASL.
** reduce to heavy atom graph (instance option, non-default).
** normalize numbering scheme (instance option, default).
- determine molecular symmetry mapping (optional, default).
- create a numpy coordinate array for the working structures.
- numpy linear algebra SVD to superimpose (instance option).
** transform test_structure
- numpy array used to calculate RMSD, and max_dist.
- decode original indexes to identify atoms involved in max dist.
- evaluateAsl(st)¶
Return the atoms of the input structure that match <asl_expr>.
- Parameters
st (structure.Structure) – Structure to evaluate
- Returns
Atoms matching the ASL
- Return type
list of ints
- Raise
RuntimeError if ASL matching failed.
- getRmsdDataname()¶
- Returns
m2io property dataname string. The property name indicates the reference structure, the title, the ASL used to identify comparison atoms, and whether the structure is in-place or mobile.
- Return type
string
- orig_index_prop = 'i_confrmsd_original_index'¶
- renumberBySymmetry()¶
Renumber the atoms in _working_test_st based on the reference structure using mmsym.
- renumberWorkingStructures()¶
Renumber the working structures to give them identical numbering. By default, the test structure is renumbered to match the reference using the renumber_conformer() function, but this behavior can be changed by overriding this method in a subclass.
- writeCommand(file_name='rmsd.cmd')¶
Writes a Maestro command file and structures with the pairwise atom mapping in command file mode. The Maestro file has the same basename as the command file. Clobbers existing files.
- Parameters
file_name (string) – Path to the maestro command file with the atom pairings.
- Raise
ValueError if file_name does not have ‘.cmd’ extension.
- writeStructures(file_name='rmsd.mae', mode='w')¶
Writes the reference and test structures to file.
- Parameters
file_name (string) – Path of the structure file to write.
mode (string) – ‘w’ to write (clobbering as needed) or ‘a’ to append.
- schrodinger.structutils.rmsd.cealign(st_refe, st_mobi, atlist_refe=None, atlist_mobi=None, transform_mobile=True, window_size=8, max_gap=30, use_guide=True)¶
Align two structures using the CEAlign method.
- Parameters
st_refe (structure.Structure) – Reference structure, always remains fixed.
st_mobi (structure.Structure) – Mobile structure to be aligned onto the reference.
atlist_refe (list(int)) – List of atom indices from reference structure to be used for the alignment and superposition. If None, depending on the value of use_guide, includes either all guide atoms or all heavy atoms in the structure.
atlist_mobi (list(int)) – List of atom indices from mobile structure to be used for the alignment and superposition. If None, depending on the value of use_guide, includes either all guide atoms or all heavy atoms in the structure.
transform_mobile (bool) – If True (default), the coordinates of the mobile structure will be modified after calculating the optimal alignment.
window_size (float) – CE algorithm parameter. Defines the length of fragments to be used for the alignment. Smaller values may yield more accurate alignments, up to a certain point, at the expense of computation time. Default is 8, as defined in the original publication.
max_gap (float) – CE algorithm parameter. Size of maximum gap allowed between two aligned fragment pairs, in number of residues. Smaller values yield faster computation times, but might prevent similar fragments from being found by the algorithm. Default is 30, as defined in the original publication and found to be a reasonable compromise between computational cost and alignment quality.
use_guide (bool) – Use guide atoms to calculate the optimal alignment. Guide atoms depend on the molecule type: C-alpha for protein, C4’ for nucleic acid. If use_guide is False, will use all heavy atoms, which degrades performance of the algorithm without significant increase in accuracy.
- Return type
Alignment
- Returns
a named tuple with the RMSD (in Angstrom) between the two structures after superposition, the alignment length, a mapping of the aligned residues, and the transform matrix as a 4x4 numpy array.
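When transform_mobile is False, the returned 4x4 matrix could be applied to coordinates separately. The sketch below shows one way to apply a homogeneous transform in pure Python. It assumes the column-vector convention, with the rotation in the upper-left 3x3 block and the translation in the last column; the convention actually used by the returned matrix is not stated here and should be checked. apply_transform is a hypothetical helper and nested lists stand in for the numpy array.

```python
def apply_transform(matrix, coords):
    """Apply a 4x4 homogeneous transform (nested lists) to 3D points.

    Assumption: points are treated as column vectors (x, y, z, 1), so the
    translation is read from the last column of the matrix.
    """
    moved = []
    for x, y, z in coords:
        point = (x, y, z, 1.0)
        moved.append(tuple(
            sum(matrix[row][col] * point[col] for col in range(4))
            for row in range(3)
        ))
    return moved


# 90 degree rotation about z, followed by a translation of (1, 2, 3).
transform = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

moved = apply_transform(transform, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
```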