schrodinger.structutils.analyze module
Functions for analyzing Structure objects.
AslLigandSearcher is a class that identifies putative ligands in a structure. Each putative ligand found is contained in a Ligand instance.
There are also a number of functions for using SMARTS, ASL, and SMILES (e.g. evaluate_smarts_canvas or generate_smiles). Other functions return information about a structure (e.g. get_chiral_atoms or hydrogens_present). There are also several SASA (Solvent Accessible Surface Area) functions (e.g. calculate_sasa_by_atom, calculate_sasa_by_residue, and calculate_sasa).
See also the discussion in the Python API overview.
@copyright: Schrodinger, LLC. All rights reserved.
- schrodinger.structutils.analyze.get_chiral_atoms(structure)
Return a dictionary of chiral atoms, for which the key is the atom index and the value is one of the following strings: “R”, “S”, “ANR”, “ANS”, “undef”.
ANR and ANS designate “chiralities” of non-chiral atoms that are important for determining the structure of the molecule (e.g. cis/trans rings).
- Parameters
structure (Structure) – Chirality of atoms within this structure will be determined.
- Return type
dict
- Returns
Dictionary of chiralities keyed by atom index.
- schrodinger.structutils.analyze.get_chiralities(structure)
Return a dictionary of chiral atoms, for which the key is the atom index and the value is a tuple of one of the following strings: “R”, “S”, “ANR”, “ANS”, “undef” and the list of CIP ranked neighbors.
ANR and ANS designate “chiralities” of non-chiral atoms that are important for determining the structure of the molecule (e.g. cis/trans rings).
- Parameters
structure (Structure) – Chirality of atoms within this structure will be determined.
- Return type
dict
- Returns
Dictionary of chiralities keyed by atom index.
- schrodinger.structutils.analyze.enforce_rdkit_smarts(smarts_func)
- schrodinger.structutils.analyze.evaluate_smarts(structure, smarts_expression, verbose=False, first_match_only=False, unique_sets=False, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.evaluate_smarts instead.
- schrodinger.structutils.analyze.validate_smarts(smarts, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.validate_smarts instead.
- schrodinger.structutils.analyze.count_atoms_in_smarts(smarts, *, use_rdkit=False)
Return the number of atoms that the given SMARTS pattern has.
- Parameters
smarts (str) – SMARTS pattern
- Returns
Number of atoms in the pattern
- Return type
int
- Raises
ValueError – if the pattern is invalid.
- schrodinger.structutils.analyze.lazy_canvas_import()
Initialize the _canvas and smiles variables. Because the canvas modules may take some time to import, the import is deferred until this function is called rather than being done at module loading time.
- schrodinger.structutils.analyze.validate_smarts_canvas(smarts, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.validate_smarts instead.
- schrodinger.structutils.analyze.evaluate_smarts_canvas(structure, smarts, stereo='annotation_and_geom', start_index=1, uniqueFilter=True, allowRelativeStereo=False, rigorousValidationOfSource=False, hydrogensInterchangeable=False, multiple_smarts=False, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.evaluate_smarts instead.
- schrodinger.structutils.analyze.evaluate_smarts_by_molecule(structure, smarts, timing_data=None, canvas=True, use_rdkit=False, matches_by_mol=False, molecule_numbers=None, **kwargs)
Takes a structure and a SMARTS pattern and returns a list of all matching atom indices, where each element in the list is a group of atoms that match the SMARTS pattern.
The advantage of this function over evaluate_smarts_canvas is that it does a SMARTS match for each molecule in a structure rather than over the entire structure at once. SMARTS evaluation scales as N^2 with the size of the structure searched, so doing many SMARTS evaluations over small molecules gives a significant speedup over one SMARTS evaluation over a composite structure.
The return value of this function is identical to the return value of the evaluate_smarts_canvas function (or the evaluate_smarts function if canvas=False), with the possible exception of the order of the matches. Do not use this function if the SMARTS match can span molecules. Invalid SMARTS patterns simply fail to match, and any empty matches are discarded.
Additional keyword arguments are passed to the SMARTS matching function.
- Parameters
structure (structure.Structure) – the structure to search
smarts (str) – the SMARTS pattern to match
timing_data (dict or None) – If supplied this dict will be filled with timing data for the SMARTS finding. Data will be recorded for each molecule searched. Keys will be the number of atoms in a molecule, each value will be a list. Each item in the list will be the time in seconds it took to search a molecule with that many atoms.
canvas (bool) – If True, use Canvas SMARTS matching; if False, use mmpatty
use_rdkit (bool) – Whether to use RDKit. Cannot be used together with canvas
matches_by_mol (bool) – If True, return a dictionary of matches keyed by molecule number rather than a list of matches
molecule_numbers (set) – set of molecule numbers in the structure to be used instead of the entire structure
- Return type
list or dict
- Returns
For the list (if matches_by_mol is False) each value is a list of atom indices matching the SMARTS pattern, for the dict (if matches_by_mol is True) keys are molecule indices and values are lists of matches for that molecule
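The scaling argument above can be made concrete with a little arithmetic. The following is only a sketch of the idealized model: it assumes the cost is exactly proportional to N^2, and `quadratic_cost` and the 100-molecule system are hypothetical illustrations, not part of this module.

```python
def quadratic_cost(num_atoms):
    """Idealized SMARTS search cost for a structure of num_atoms atoms."""
    return num_atoms ** 2


# Hypothetical composite structure: 100 molecules of 10 atoms each.
molecule_sizes = [10] * 100

# One search over the whole 1000-atom structure.
whole_structure_cost = quadratic_cost(sum(molecule_sizes))

# One search per molecule, summed over all molecules.
per_molecule_cost = sum(quadratic_cost(n) for n in molecule_sizes)

speedup = whole_structure_cost / per_molecule_cost
print(speedup)
```

Under this model the per-molecule approach is cheaper by a factor equal to the number of equally sized molecules; real timings will differ and can be inspected through the timing_data argument.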
- schrodinger.structutils.analyze.evaluate_multiple_smarts(structure, smarts_list, verbose=False, first_match_only=False, unique_sets=False, keep_nested=False, *, use_rdkit=False)
Search for multiple SMARTS substructures in Structure structure.
Return a list of lists of ints. Each list of ints is a list of atom indices matching a SMARTS pattern. The matches for the multiple SMARTS patterns are combined into one list.
- Parameters
structure (Structure) – Structure to search for matching substructures.
smarts_list (list) – List of SMARTS patterns to look for.
verbose (bool) – If True, print additional progress reports from the C implementation.
first_match_only (bool) – If False, return all matches for a given starting atom - e.g. [1, 2, 3, 4, 5, 6] and [1, 6, 5, 4, 3, 2] from atom 1 of benzene with the smarts_expression ‘c1ccccc1’. If True, return only the first match found for a given starting atom. Note that setting first_match_only to True does not affect matches with different starting atoms - i.e. benzene will still return six lists of ints for ‘c1ccccc1’, one for each starting atom. To match only once per set of atoms, use unique_sets=True. Note also that setting first_match_only to True does not guarantee that all matching atoms will be found.
unique_sets (bool) – If True, the returned list of matches will contain a single (arbitrary) match for any given set of atoms. If False, return the uniquely ordered matches, subject to the behavior specified by the first_match_only parameter.
- Return type
list
- Returns
Each value is a list of atom indices matching the SMARTS patterns.
- schrodinger.structutils.analyze.evaluate_substructure(st, subs_expression, first_match_only=False, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.evaluate_smarts instead.
- schrodinger.structutils.analyze.generate_asl(st, atom_list)
Generate and return an atom expression for the atoms in Structure st which are listed in atom_list. The ASL expression will be as compact as possible using mol, res and atom expressions where appropriate.
- Parameters
st (Structure) – Structure holding the ASL atoms.
atom_list (list) – List of indices of atoms for which ASL is desired.
- Return type
str
- Returns
ASL compactly describing the atoms in atom_list.
- schrodinger.structutils.analyze.generate_residue_asl(residues)
Create an ASL representing the residues.
The insertion code (inscode) will only be included if at least one of the residues has a non-blank inscode.
- Parameters
residues (collections.abc.Iterable(structure._Residue)) – Residue objects to create the ASL for
- Return type
str
- schrodinger.structutils.analyze.validate_asl(asl)
Validate the given ASL expression. This is useful for validating an ASL when a structure object is not available - for example when validating a command line option. NOTE: A warning is also printed to stdout if the ASL is not valid.
- Parameters
asl (str) – ASL expression.
- Returns
True if ASL is valid, False otherwise.
- Return type
bool
- schrodinger.structutils.analyze.evaluate_asl(st, asl_expr)
Search for substructures matching the ASL (Atom Specification Language) string asl_expr in Structure st.
- Parameters
st (Structure) – Structure to search for matching substructures.
asl_expr (str) – ASL search string.
- Return type
list
- Returns
List containing indices of matching atoms.
- Raises
schrodinger.infra.mm.MmException – If the ASL expression is invalid.
- schrodinger.structutils.analyze.get_atoms_from_asl(st, asl_expr)
Return atoms matching the ASL string asl_expr in Structure st.
- Parameters
st (Structure) – Structure to search for matching substructures.
asl_expr (str) – ASL search string.
- Return type
generator
- Returns
Generator of matching StructureAtom objects.
- Raises
schrodinger.infra.mm.MmException – If the ASL expression is invalid.
- schrodinger.structutils.analyze.hydrogens_present(st)
Return True if all hydrogens are present in Structure st, False otherwise.
Since all modern force fields require hydrogens, this is a good check to make sure that a structure is ready for force field calculations. This function is implemented by checking to see if the structure can be used as-is in a calculation with OPLS2003.
- Warning
Requires atom types to be correct. Consider calling Structure.retype first.
- Parameters
st (Structure) – Structure to be tested.
- Return type
bool
- Returns
Whether all hydrogens are present.
- schrodinger.structutils.analyze.has_valid_lewis_structure(st: schrodinger.structure._structure.Structure) -> bool
Check whether a valid Lewis structure for the structure is possible. Possible causes of an invalid Lewis structure may be invalid bond orders, charges, or missing hydrogens or other atoms.
This check may be useful before attempting to run any backend calculations on the given structure.
- schrodinger.structutils.analyze.generate_tautomer_code(st, considerEZStereo=True, considerRSStereo=True, stereo='annotation_and_geom', strip=False)
- Deprecated
- schrodinger.structutils.analyze.create_chmmol_from_structure(structure, stereo='annotation_and_geom')
Creates a ChmMol object for a given structure and returns the ChmMol.
Deprecated; use schrodinger.adapter.to_rdkit() and other RDKit API instead. ChmMol is deprecated in favor of RDKit.
- Parameters
structure (schrodinger.structure.Structure) – The structure to create ChmMol from.
stereo (enum) – Specify how to determine the stereochemistry of a ChmMol from a Structure. Can be STEREO_FROM_GEOMETRY, STEREO_FROM_ANNOTATION, STEREO_FROM_ANNOTATION_AND_GEOM, or NO_STEREO. See schrodinger.structutils.smiles.SmilesGenerator.__init__ for descriptions of these options.
- Return type
schrodinger.application.canvas.base.ChmMol
- Returns
The created ChmMol.
- schrodinger.structutils.analyze.generate_smiles(st, unique=True, stereo='annotation_and_geom')
- Deprecated
Use schrodinger.adapter.to_smiles instead.
- schrodinger.structutils.analyze.generate_smarts(st, atom_subset=None, check_connectivity=True, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.to_smarts instead.
- schrodinger.structutils.analyze.generate_smarts_canvas(st, atom_subset=None, check_connectivity=True, include_hydrogens=False, honor_maestro_prefs=False, include_stereo=False, *, use_rdkit=False)
- Deprecated
Use schrodinger.adapter.to_smarts instead.
- schrodinger.structutils.analyze.can_atom_hydrogen_bond(atom)
Returns True if the given atom can be involved in a hydrogen bond.
- Parameters
atom (structure._StructureAtom) – Atom in question
- Returns
Whether atom can H-bond
- Return type
bool
- schrodinger.structutils.analyze.generate_crystal_mates(st, radius=10.0, space_group=None, a=None, b=None, c=None, alpha=None, beta=None, gamma=None, group_radius=14.0)
Generate crystal mates for the input Structure st.
Return a list of structures that represent the crystal mates. (Note that the first item in the list represents the identity transformation and as such will be identical to the input structure.)
All crystal mates within radius of the input structure are generated.
The crystal parameters can be specified as parameters to this function or can be standard PDB properties of the input structure. If the structure was read from a PDB file then these crystal properties will usually be present.
The group_radius is used in the crystal mates calculation to determine whether a symmetric element is in contact with the ASU. There should be little reason to change the default value of 14.0.
- Parameters
st (Structure) – Structure for which crystal mates will be generated.
radius (float) – Distance within which to generate crystal mates.
space_group (str) – Space group of the crystal. If None, uses st’s s_pdb_PDB_CRYST1_Space_Group.
a (float) – Crystal ‘a’ length. If None, uses st’s s_pdb_PDB_CRYST1_a.
b (float) – Crystal ‘b’ length. If None, uses st’s s_pdb_PDB_CRYST1_b.
c (float) – Crystal ‘c’ length. If None, uses st’s s_pdb_PDB_CRYST1_c.
alpha (float) – Crystal ‘alpha’ angle. If None, uses st’s s_pdb_PDB_CRYST1_alpha.
beta (float) – Crystal ‘beta’ angle. If None, uses st’s s_pdb_PDB_CRYST1_beta.
gamma (float) – Crystal ‘gamma’ angle. If None, uses st’s s_pdb_PDB_CRYST1_gamma.
group_radius (float) – Used to determine whether a symmetric element is in contact with the ASU. There should be little reason to change the default value of 14.0.
- Return type
list
- Returns
list of Structure objects that represent the crystal mates. (Note that the first item in the list represents the identity transformation and as such will be identical to the input structure.)
- schrodinger.structutils.analyze.find_overlapping_atoms(st, ignore_hydrogens=False, ignore_waters=False, dist_threshold=0.8)
Search the specified structure for overlapping atoms. Returns a list of (atom1index, atom2index) tuples.
- Parameters
st (Structure) – Structure to search for overlapping atoms.
ignore_hydrogens – Whether to ignore hydrogens.
ignore_waters – Whether to ignore waters.
dist_threshold – Atoms are considered overlapping if their centers are within this distance of each other.
- Return type
list
- Returns
Each value is a tuple containing the indices of overlapping atoms.
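The overlap criterion can be sketched in plain Python, independent of the Schrodinger API. This is an illustration only: `find_overlapping_pairs` and its `coords` dictionary are hypothetical, the strict less-than comparison is an assumption, and the brute-force pair loop says nothing about how the library performs the search.

```python
import math
from itertools import combinations


def find_overlapping_pairs(coords, dist_threshold=0.8):
    """Return (index1, index2) pairs whose centers are closer than the threshold.

    coords maps a 1-based atom index to an (x, y, z) tuple.
    """
    pairs = []
    # Visit every unordered pair of atoms once, in index order.
    for (i, a), (j, b) in combinations(sorted(coords.items()), 2):
        if math.dist(a, b) < dist_threshold:
            pairs.append((i, j))
    return pairs


coords = {1: (0.0, 0.0, 0.0), 2: (0.5, 0.0, 0.0), 3: (3.0, 0.0, 0.0)}
print(find_overlapping_pairs(coords))
```

Atoms 1 and 2 are 0.5 apart and are reported; atom 3 is well beyond the default threshold of 0.8 from both.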
- schrodinger.structutils.analyze.generate_molecular_formula(st)
Return a string for the molecular formula in Hill notation for the structure st. The structure must contain only one molecule.
- Parameters
st (Structure) – Find the molecular formula for this structure. Must contain only one molecule.
- Return type
str
- Returns
The molecular formula for st.
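For readers unfamiliar with Hill notation: if carbon is present, carbon is listed first, then hydrogen, then all other elements alphabetically; without carbon, all elements (including hydrogen) are listed alphabetically. A minimal sketch of that ordering follows. `hill_formula` is a hypothetical helper working on a plain list of element symbols, and its output formatting (no separators, counts of 1 omitted) is an assumption that may differ from the string this function returns.

```python
from collections import Counter


def hill_formula(elements):
    """Build a Hill-ordered formula string from a list of element symbols."""
    counts = Counter(elements)
    if "C" in counts:
        # Carbon first, then hydrogen, then everything else alphabetically.
        order = ["C"]
        if "H" in counts:
            order.append("H")
        order += sorted(e for e in counts if e not in ("C", "H"))
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        order = sorted(counts)
    return "".join(f"{e}{counts[e] if counts[e] > 1 else ''}" for e in order)


print(hill_formula(["C", "C", "H", "H", "H", "H", "H", "H", "O"]))  # ethanol
print(hill_formula(["H", "Cl"]))  # hydrogen chloride
```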
- schrodinger.structutils.analyze.is_bond_rotatable(bond, *, allow_methyl=False)
Return True if specified bond is rotatable, False otherwise.
A bond is considered rotatable if all of the following are true:
It is a single bond.
It is not adjacent to a triple bond.
It is not in a ring.
Neither atom is a hydrogen or other terminal atom.
Neither atom is a carbon or nitrogen with three hydrogens attached.
- Parameters
bond (structure.StructureBond) – bond to test for rotatability.
rings (list) – List of ring atom index lists. As an optimization, provide the (sorted) rings list from the find_rings function if you already have it. Otherwise, an SSSR calculation will be done.
allow_methyl (bool) – If True, treat bonds to -CH3 and -NH3 groups as rotatable
- Return type
bool
- Returns
Whether the bond is rotatable.
- schrodinger.structutils.analyze.rotatable_bonds_iterator(st, *, max_size=None, allow_methyl=False)
Return an iterator for rotatable bonds (atomnum1, atomnum2) in the structure.
See the is_bond_rotatable function description for which bonds are considered rotatable.
- Parameters
st (Structure) – The structure to search for rotatable bonds.
rings (list) – List of ring atom index lists. As an optimization, provide the (sorted) rings list from the find_rings function if you already have it. Otherwise, an SSSR calculation will be done.
max_size (int) – If specified, yield only rings that have up to this number of bonds. Use this option to exclude large rings; e.g. those in macrocycle-like molecules.
allow_methyl (bool) – If True, treat bonds to -CH3 and -NH3 groups as rotatable
- Return type
iterator of tuples
- Returns
yields tuples of atom index pairs, each describing a rotatable bond in st.
- schrodinger.structutils.analyze.get_num_rotatable_bonds(st, *, max_size=None)
Return the number of rotatable bonds in the Structure st. The count does not include trivial rotors such as terminal methyls, or rotors within rings.
- Parameters
st (Structure) – The structure to search for rotatable bonds.
rings (list of lists of ints) – List of ring atom index lists. As an optimization, provide the (sorted) rings list from the find_rings function if you already have it. Otherwise, an SSSR calculation will be done.
max_size (int) – If specified, yield only rings that have up to this number of bonds. Use this option to exclude large rings; e.g. those in macrocycle-like molecules.
- Return type
int
- Returns
The number of rotatable bonds in st.
- schrodinger.structutils.analyze.hbond_iterator(st, atoms=None)
Iterate over hydrogen bonds between the atoms specified by atoms and the other atoms in st. Each yielded item is a tuple of (atom-index-1, atom-index-2).
NOTE: This function has been updated to simply act as a wrapper to hbond.get_hydrogen_bonds to ensure that hbonds are determined consistently.
- Parameters
st (Structure) – The structure to search for H-bonds.
atoms (list of int or None) – A list of atom indices (or _StructureAtom objects) to analyze. If not specified, then all H-bonds present in the structure are returned.
- Return type
list of (_StructureAtom, _StructureAtom)
- Returns
list of (donor atom object, acceptor atom object) for each hydrogen bond identified.
- schrodinger.structutils.analyze.find_equivalent_atoms(st, span_molecules=True)
Find atoms in the structure that are equivalent. For example, all three hydrogens on a methyl group are equivalent.
Returns a list, each value of which is a list of atoms that are equivalent.
- Parameters
st (Structure) – The structure to search for equivalent atoms.
span_molecules (bool) – If True, don’t consider molecules to be separate entities. If False, constructs global equivalence classes for all atoms in a ct, but will never return an equivalence class across molecules.
- Return type
list
- Returns
Each value is a list of indices of equivalent atoms.
- schrodinger.structutils.analyze.get_approximate_sasa(st, atom_indexes=None, cutoff=7.0)
- Deprecated
This function only returns a rough approximation to the solvent accessible surface area. Please use the calculate_sasa function instead.
- schrodinger.structutils.analyze.get_approximate_atomic_sasa(st, iat, cutoff=7.0, sasa_probe_radius=1.4, hard_sphere_s=2.5, scale_factor=2.32)
- Deprecated
Deprecated in favor of calculate_sasa, which is more accurate.
- schrodinger.structutils.analyze.calculate_sasa_by_atom(st, atoms=None, cutoff=8.0, probe_radius=1.4, resolution=0.2, exclude_water=False)
Calculate the solvent-accessible surface area (SASA) for the whole structure, or an atom subset, and return a list of floats.
- Parameters
st (Structure) – Structure for which SASA is desired.
atoms (list) – List of atom indices or of _StructureAtom objects for the atoms to count. If None, calculates SASA for all atoms. (default: None)
cutoff (float) – Atoms within this distance of atoms will be considered for occlusion. Requires atoms to be specified. (default: 8.0 Å)
probe_radius (float) – Probe radius, in units of angstroms, of the solvent molecule. (default: 1.4 Å)
resolution (float) – Resolution to use. Decreasing this number will yield better results, increasing it will speed up the calculation. NOTE: This is NOT the same option as Maestro’s surface resolution, which uses a different algorithm to calculate the surface area. (default: 0.2)
exclude_water (bool) – If set to True then explicitly exclude waters in the method. This option only works when the ‘atoms’ argument is passed. (default: False)
- Return type
list
- Returns
A list of the solvent accessible surface areas of the selected atoms in st.
- schrodinger.structutils.analyze.calculate_sasa_by_residue(st, atoms=None, cutoff=8.0, probe_radius=1.4, resolution=0.2, exclude_water=False)
Calculate the solvent-accessible surface area (SASA) for the whole structure, or an atom subset, and then group the values by residue.
- Parameters
st (Structure) – Structure for which SASA is desired.
atoms (list) – List of atom indices or of _StructureAtom objects for the atoms to count. If None, calculates SASA for all atoms. (default: None)
cutoff (float) – Atoms within this distance of atoms will be considered for occlusion. Requires atoms to be specified. (default: 8.0 Å)
probe_radius (float) – Probe radius, in units of angstroms, of the solvent molecule. (default: 1.4 Å)
resolution (float) – Resolution to use. Decreasing this number will yield better results, increasing it will speed up the calculation. NOTE: This is NOT the same option as Maestro’s surface resolution, which uses a different algorithm to calculate the surface area. (default: 0.2)
exclude_water (bool) – If set to True then explicitly exclude waters in the method. This option only works when the ‘atoms’ argument is passed. (default: False)
- Return type
list
- Returns
A list of the solvent accessible surface areas of the residues (ordered by connectivity) within st.
- schrodinger.structutils.analyze.calculate_sasa(st, atoms=None, cutoff=8.0, probe_radius=1.4, resolution=0.2, exclude_water=False, exclude_atoms=None)
Calculate the solvent-accessible surface area (SASA) for the whole structure, or an atom subset.
- Parameters
st (Structure) – Structure for which SASA is desired.
atoms (list) – List of atom indices or of _StructureAtom objects for the atoms to count. If None, calculates SASA for all atoms. (default: None)
cutoff (float) – Atoms within this distance of atoms will be considered for occlusion. Requires atoms to be specified. (default: 8.0 Å)
probe_radius (float) – Probe radius, in units of angstroms, of the solvent molecule. (default: 1.4 Å)
resolution (float) – Resolution to use. Decreasing this number will yield better results, increasing it will speed up the calculation. NOTE: This is NOT the same option as Maestro’s surface resolution, which uses a different algorithm to calculate the surface area. (default: 0.2)
exclude_water (bool) – If set to True then explicitly exclude waters in the method. This option only works when the ‘atoms’ argument is passed. (default: False)
exclude_atoms (list) – Atom indices (aids) of atoms to leave out of the SASA calculation. Useful for FEP-type systems where a second ‘image’ of a molecule is present.
- Return type
float
- Returns
The solvent accessible surface area of the selected atoms within st.
- schrodinger.structutils.analyze.calc_buried_sasa_by_residue(st, group1_atoms, group2_atoms, resolution=0.5)
Calculate the buried SASA ratio (delta SASA upon binding) for each residue, which measures how much of the residue’s surface is interacting with the other binding partner. A value of 1.0 means that all of the residue’s area, as calculated in its subunit, is no longer accessible to solvent when in complex with the other group. A value of 0.0 means that all of that residue is still accessible in the complex (the residue is not on the interaction surface). A value of 0.0 is also used for residues that are fully buried in their subunits.
- Parameters
st (structure.Structure) – Structure object
group1_atoms (list of ints) – Atoms of the first binding partner group.
group2_atoms (list of ints) – Atoms of the second binding partner group.
resolution (float) – Resolution to use. See calculate_sasa_by_atom().
- Returns
Dictionary where keys are residue strings (e.g. “A:123”), and values are the buried SASA ratio for that residue (0.0-1.0).
- Return type
dict
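One way to read the ratio described above is as the fraction of a residue's subunit SASA that is lost in the complex. The sketch below is an interpretation of that description, not the library's implementation: `buried_sasa_ratio` is a hypothetical helper, and the exact formula and the clamping to the 0.0-1.0 range are assumptions.

```python
def buried_sasa_ratio(sasa_in_subunit, sasa_in_complex):
    """Fraction of a residue's subunit SASA that becomes buried on binding."""
    # Residues fully buried within their own subunit are assigned 0.0.
    if sasa_in_subunit <= 0.0:
        return 0.0
    ratio = (sasa_in_subunit - sasa_in_complex) / sasa_in_subunit
    # Keep the result within the documented 0.0-1.0 range.
    return min(1.0, max(0.0, ratio))


print(buried_sasa_ratio(40.0, 10.0))  # three quarters of the area is buried
print(buried_sasa_ratio(50.0, 50.0))  # residue is not on the interface
```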
- schrodinger.structutils.analyze.find_ligands(st, **kwargs) -> list
Simple function interface for the AslLigandSearcher class.
- Parameters
st (Structure) – Structure to search.
- Return type
list
- Returns
a list of Ligand instances for putative ligands within st.
- schrodinger.structutils.analyze.center_of_mass(st, atom_indices: Optional[list] = None)
Gets the structure’s center of mass. If specified, this can be limited to a subset of atoms.
NOTE: Periodic boundary conditions (PBC) are NOT honored.
- Parameters
st (structure.Structure) – structure
atom_indices (list(int)) – 1-based atom indices
- Returns
center of mass given as a 3-element array [x, y, z]
- Return type
numpy.array(float)
See schrodinger/geometry/centroid.h
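For reference, the center of mass is the mass-weighted average of the atomic positions. A minimal plain-Python sketch of that definition follows; `mass_weighted_center` is a hypothetical helper operating on bare lists, not the implementation behind this function.

```python
def mass_weighted_center(masses, coords):
    """Return [x, y, z] of the mass-weighted average position."""
    total_mass = sum(masses)
    return [
        sum(m * xyz[axis] for m, xyz in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]


# Two particles on the x axis; the heavier one pulls the center toward it.
print(mass_weighted_center([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]))
```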
- schrodinger.structutils.analyze.radius_of_gyration(st: structure.Structure, atom_indices: Optional[List[int]] = None, mass_weighted: bool = False) -> float
Calculate the radius of gyration (R_gyr or Rg), a measure of the size of an object of arbitrary shape.
NOTE: Periodic boundary conditions (PBC) are NOT honored.
- Returns
float value in Angstrom units
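The radius of gyration is conventionally the root of the (optionally mass-weighted) mean squared distance of the atoms from their center. The sketch below follows that textbook definition; `rg_sketch` is a hypothetical helper, and treating the unweighted case as all weights equal to 1 is an assumption about what mass_weighted=False means here.

```python
import math


def rg_sketch(coords, masses=None):
    """Radius of gyration of points; unweighted when masses is None."""
    weights = masses if masses is not None else [1.0] * len(coords)
    total = sum(weights)
    # Weighted center of the points.
    center = [
        sum(w * xyz[axis] for w, xyz in zip(weights, coords)) / total
        for axis in range(3)
    ]
    # Weighted mean of the squared distances from that center.
    mean_sq = sum(
        w * sum((xyz[axis] - center[axis]) ** 2 for axis in range(3))
        for w, xyz in zip(weights, coords)
    ) / total
    return math.sqrt(mean_sq)


print(rg_sketch([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```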
- schrodinger.structutils.analyze.calculate_principal_moments(struct=None, atoms=None, massless=False)
Calculate the principal moments of inertia for a list of atoms. This is calculated with respect to the x, y, and z coordinates of the atoms’ center of mass.
- Parameters
struct (Structure) – If given, the moments will be calculated for the entire structure. This overrides any atoms given with the atoms keyword. Either atoms or struct must be given.
atoms (list) – list of schrodinger.structure._StructureAtom objects. Atom objects to compute the tensor for. Either atoms or struct must be given.
massless (bool) – True if the calculations should be independent of the atomic masses (all mass=1), False (default) if atomic mass should be used.
- Return type
tuple
- Returns
A tuple of (eigenvalues, eigenvectors) of the inertial tensor. The eigenvalues are the principal moments of inertia and are a list of 3 floats. The eigenvectors are a list of lists; each inner list is a list of 3 floats.
- schrodinger.structutils.analyze.get_largest_moment_normalized_vector(**kwargs)
Return the normalized eigenvector of the largest moment of inertia. This will be the vector normal to the plane of the largest moment. See calculate_principal_moments for parameters.
- Return type
numpy.array
- Returns
The normalized vector for the largest moment of inertia
- schrodinger.structutils.analyze.find_shortest_bond_path(struct, index1, index2, atom_ids=None)
Find the shortest path of bonded atoms that connects atom1 to atom2.
The conversion of this routine to use networkx rather than scipy resulted in a dramatic reduction in both time and memory usage.
- Parameters
struct (schrodinger.structure.Structure) – The structure containing the atoms index1 and index2
index1 (int) – The index of the first atom in the path
index2 (int) – The index of the second atom in the path
atom_ids (list of int) – the atom ids to search for a path from
- Return type
list
- Returns
A list of indexes of atoms that connect atom index1 to atom index2 along the shortest bond path. Index1 will be the first item in the list and index2 will be the last. The second item in the list will be bonded to index1, the third will be bonded to the second, etc. If index1 == index2, a single item list is returned: [index1]
- Raises
ValueError – if index1 and index2 are not part of the same molecule
MemoryError – if the system is too large
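The return convention described above (index1 first, index2 last, consecutive entries bonded, a one-item list when both indexes are equal) can be illustrated with a breadth-first search over a plain bond list. This is a standalone sketch: `shortest_bond_path` is hypothetical, it does not use networkx as the routine itself does, and raising ValueError for unconnected atoms simply mirrors the documented behavior.

```python
from collections import deque


def shortest_bond_path(bonds, index1, index2):
    """Breadth-first search for the shortest bonded path between two atoms.

    bonds is an iterable of (atom_index, atom_index) pairs.
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    previous = {index1: None}  # also serves as the visited set
    queue = deque([index1])
    while queue:
        current = queue.popleft()
        if current == index2:
            # Walk back to index1, then reverse into path order.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in sorted(neighbors.get(current, ())):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    raise ValueError("atoms are not connected by bonds")


# Five-membered ring (atoms 1-5) with a substituent (atom 6) on atom 5.
ring_bonds = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5), (5, 6)]
print(shortest_bond_path(ring_bonds, 1, 6))
```

The search goes around the short side of the ring (1, 5, 6) rather than through atoms 2, 3 and 4.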
- schrodinger.structutils.analyze.create_nx_graph(struct, atoms=None)
Generate a networkx undirected graph of the structure based on bonds.
- Parameters
struct (schrodinger.structure.Structure) – the structure to create a graph of
atoms (iterable) – Optionally, graph edges will be restricted to this group of atoms - items are schrodinger.structure._StructureAtom objects or atom indexes
- Return type
networkx.Graph
- Returns
An undirected graph of the structure with edges in place of each bond. Edges are identical regardless of the bond order of the bond.
- schrodinger.structutils.analyze.improper_dihedral_iterator(struct=None, atoms=None, nx_graph=None, include_proper_improper=True, include_proper=True)
An iterator over all the improper dihedral angles in a structure or group of atoms.
- Parameters
struct (schrodinger.structure.Structure) – The structure to find improper dihedrals in. Either struct or nx_graph must be given.
atoms (iterable) – Optionally, improper dihedrals will be restricted to this group of atoms - items are schrodinger.structure._StructureAtom objects or atom indexes
nx_graph (networkx.Graph) – A networkx graph of the structure with edges representing bonds. If not supplied one will be generated. If nx_graph is supplied, struct and atoms are ignored.
include_proper_improper (bool) – whether to include improper dihedrals that define the same degree-of-freedom as a proper dihedral obtained from torsion_iterator. For example, in the diagrams below the neighboring atoms are not bound, so the defined impropers offer new degrees-of-freedom; but if they were bound (suppose (R’’,12) was bound to (R’,26) in the first diagram), the degree-of-freedom defined by the improper (7, 37, 12, 26) would be redundant with that defined by the proper (7, 37, 12, 26) obtained from torsion_iterator. This boolean controls whether such impropers can be returned.
include_proper (bool) – whether to include proper dihedrals obtained from torsion_iterator that define a new degree-of-freedom not defined by any improper dihedral. For example, in the diagram below suppose (R’’,12) was bound to (R’,26) and the bond between (X,37) and (R’,26) did not exist; this boolean controls whether such propers can be returned as impropers.
- Return type
tuple
- Returns
Each iteration yields a 4-integer tuple of atom indexes for an improper dihedral. For each given quadruple all unique topologies are enumerated. For example, for a standard quadruple ordering (i,j,k,l) all 6 of the following topologies are returned:
(1) (i,j,k,l)
(2) (j,i,k,l) # switch 1,2
(3) (i,j,l,k) # switch 3,4
(4) (j,i,l,k) # switch 1,2 and 3,4
(5) (k,j,i,l) # switch 1,3
(6) (i,l,k,j) # switch 2,4
The standard quadruple ordering (i,j,k,l) takes the first three indices to be the central atom plus its two lowest-index bonded atoms, in bond-path order: lowest bonded atom, central atom, other bonded atom. The last index in the quadruple is the remaining atom, which is bonded to the central atom but not to the last atom of that triple. If called with include_proper, those quadruples do not have their topologies enumerated, and the ordering will be such that the index of the first atom in the tuple is smaller than the index of the last atom in the tuple.
For example:
(R,7) (X,37)-(R',26) / (R'',12)
(7, 37, 12, 26) (standard)
(37, 7, 12, 26)
(7, 37, 26, 12)
(37, 7, 26, 12)
(12, 37, 7, 26)
(7, 26, 12, 37)
or:
(R,34) (X,1)-(R',4) / (R'',78)
(1) (4, 1, 34, 78) (standard) etc.
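The six topologies listed for a standard quadruple are fixed index permutations, so they can be written out directly. The helper below is a hypothetical illustration of that enumeration only; it does not find improper dihedrals in a structure.

```python
def improper_topologies(i, j, k, l):
    """Return the six orderings enumerated for a standard quadruple (i, j, k, l)."""
    return [
        (i, j, k, l),  # standard
        (j, i, k, l),  # switch 1,2
        (i, j, l, k),  # switch 3,4
        (j, i, l, k),  # switch 1,2 and 3,4
        (k, j, i, l),  # switch 1,3
        (i, l, k, j),  # switch 2,4
    ]


# Reproduces the first worked example, whose standard quadruple is (7, 37, 12, 26).
for topology in improper_topologies(7, 37, 12, 26):
    print(topology)
```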
- schrodinger.structutils.analyze.torsion_iterator(struct=None, atoms=None, nx_graph=None)
An iterator over all the bonded torsions in a structure or group of atoms.
- Parameters
struct (schrodinger.structure.Structure) – The structure to find torsions in. Either struct or nx_graph must be given.
atoms (iterable) – Optionally, torsions will be restricted to this group of atoms - items are schrodinger.structure._StructureAtom objects or atom indexes
nx_graph (networkx.Graph) – A networkx graph of the structure with edges representing bonds. If not supplied one will be generated. If nx_graph is supplied, struct and atoms are ignored.
- Return type
tuple
- Returns
Each iteration yields a 4-integer tuple of atom indexes for a dihedral formed by bonded atoms. The index of the first atom in the tuple will be smaller than the index of the last atom in the tuple.
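The ordering rule for torsions (first index smaller than last) can be illustrated on a plain bond list. The following is a sketch under stated assumptions: `list_torsions` is hypothetical, it works on tuples rather than a networkx graph, and skipping quadruples whose end atoms coincide (three-membered rings) as well as returning a sorted list are choices made here, not documented behavior.

```python
def list_torsions(bonds):
    """Return sorted (i, j, k, l) tuples with i bonded to j, j to k, k to l."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    torsions = set()
    for j, k in bonds:  # each bond is a candidate central bond
        for i in neighbors[j] - {k}:
            for l in neighbors[k] - {j}:
                if i == l:
                    continue  # end atoms must be distinct
                # Orient the tuple so the first index is the smaller end.
                torsions.add((i, j, k, l) if i < l else (l, k, j, i))
    return sorted(torsions)


# A four-atom chain has exactly one torsion.
print(list_torsions([(1, 2), (2, 3), (3, 4)]))
```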
- schrodinger.structutils.analyze.angle_iterator(struct=None, atoms=None, nx_graph=None)¶
An iterator over all the bonded angles in a structure or group of atoms.
- Parameters
struct (
schrodinger.structure.Structure
) – The structure to find angles in. Either struct or nx_graph must be given.
atoms (iterable) – Optionally, angles will be restricted to this group of atoms. Items are
schrodinger.structure._StructureAtom
objects or atom indexes.
nx_graph (
networkx.Graph
) – A networkx graph of the structure with edges representing bonds. If not supplied, one will be generated. If nx_graph is supplied, struct and atoms are ignored.
- Return type
tuple
- Returns
Each iteration yields a 3-integer tuple of atom indexes for an angle formed by bonded atoms. The index of the first atom in the tuple will be smaller than the index of the last atom in the tuple.
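The same idea for angles can be sketched with `itertools.combinations`: every pair of neighbors of a central atom forms one angle, and sorting the neighbors keeps the first index below the last. The name `iter_angles` and the example fragment are assumptions made for illustration only.

```python
from itertools import combinations


def iter_angles(adjacency):
    """Yield (first, center, last) for each pair of atoms bonded to center."""
    for center in sorted(adjacency):
        # Sorted neighbors guarantee first < last in every emitted tuple.
        for first, last in combinations(sorted(adjacency[center]), 2):
            yield (first, center, last)


# Hypothetical fragment: chain 1-2-3-4 with atom 5 branching off atom 2.
adjacency = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
angles = list(iter_angles(adjacency))
print(angles)
```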
- schrodinger.structutils.analyze.bond_iterator(struct=None, atoms=None, nx_graph=None)¶
An iterator over all the bonds in a structure or group of atoms.
Note: It may seem unnecessary to have this function as one must iterate over bonds to form the nx_graph that is then iterated over in this function. However, it may be the case that an nx_graph has already been created for other reasons such as when iterating over all bonds, angles and torsions.
- Parameters
struct (
schrodinger.structure.Structure
) – The structure to find bonds in. Either struct or nx_graph must be given.
atoms (iterable) – Optionally, bonds will be restricted to this group of atoms. Items are
schrodinger.structure._StructureAtom
objects or atom indexes.
nx_graph (
networkx.Graph
) – A networkx graph of the structure with edges representing bonds. If not supplied, one will be generated. If nx_graph is supplied, struct and atoms are ignored.
- Return type
tuple
- Returns
Each iteration yields a 2-integer tuple of atom indexes for a bond formed by bonded atoms. The index of the first atom in the tuple will be smaller than the index of the last atom in the tuple.
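The note about reusing a prebuilt graph can be illustrated with a small stand-in: build the bond graph once, then iterate over it as often as needed, optionally restricted to a subset of atoms. Here a dictionary of sets plays the role of the networkx graph; `build_adjacency` and `iter_bonds` are hypothetical helpers, not part of the module.

```python
def build_adjacency(bonds):
    """Build an undirected bond graph (dict of sets) from index pairs."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def iter_bonds(adjacency, atoms=None):
    """Yield (low, high) bonds, optionally restricted to the given atoms."""
    allowed = set(adjacency) if atoms is None else set(atoms)
    for a in sorted(allowed):
        for b in sorted(adjacency.get(a, ())):
            if a < b and b in allowed:
                yield (a, b)


# Build the graph once, then reuse it for more than one query.
graph = build_adjacency([(2, 1), (2, 3), (3, 4), (5, 2)])
all_bonds = list(iter_bonds(graph))
subset_bonds = list(iter_bonds(graph, atoms=[1, 2, 3]))
print(all_bonds, subset_bonds)
```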
- schrodinger.structutils.analyze.get_average_structure(sts)¶
Calculate the average structure between the given conformers.
- Parameters
sts (Iterable of
structure.Structure
objects) – Structures to average
- Return type
structure.Structure
- Returns
Average structure
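One plausible reading of "average structure" is a per-atom mean of the Cartesian coordinates across conformers. The sketch below shows that arithmetic on plain coordinate lists; it is an assumption about the concept, not a description of what the function does internally (for example, whether conformers are superimposed first is not stated here). `average_coordinates` is a hypothetical name.

```python
def average_coordinates(conformers):
    """Return the per-atom mean (x, y, z) over conformers with equal atom order."""
    if not conformers:
        raise ValueError("at least one conformer is required")
    n_atoms = len(conformers[0])
    if any(len(conf) != n_atoms for conf in conformers):
        raise ValueError("all conformers must have the same number of atoms")
    count = len(conformers)
    return [
        tuple(sum(conf[i][axis] for conf in conformers) / count for axis in range(3))
        for i in range(n_atoms)
    ]


# Two hypothetical two-atom conformers.
conf_a = [(0, 0, 0), (1, 0, 0)]
conf_b = [(0, 2, 0), (3, 0, 0)]
averaged = average_coordinates([conf_a, conf_b])
print(averaged)
```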
- schrodinger.structutils.analyze.find_common_substructure(sts, atomTyping=11, allow_broken_rings=True)¶
Find the maximum substructure that is common to all specified CTs. If any of the structures matches the substructure SMARTS more than once, then all matches are reported, which is why the output is a “triple” list: the outer list represents input structures, the middle list represents matches, and the inner list holds the atom indices for that match. It’s up to the calling code to decide which of the multiple matches to use (one method is to use the one whose center-of-mass is closest to the COM of the whole ligand). NOTE: This function becomes exponentially slower as the number of structures grows. The recommended maximum is around 30 structures.
NOTE: This function checks that CANVAS_SHARED exists and checks out CANVAS_FULL.
- Parameters
sts (Iterable of
structure.Structure
objects) – Structures to search for a common substructure
atomTyping (int) – Atom typing scheme to use. For a list of available schemes, see $SCHRODINGER/utilities/canvasMCS -h
allow_broken_rings (bool) – Whether to allow partial mapping of rings
- Return type
List of list of list of ints
- Returns
Substructure atoms from each structure. Outer list represents input structures - in order of input; middle list represents matches, inner list represents atom indices for that match.
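Choosing among multiple matches is left to the caller. A minimal sketch of the suggestion above follows, assuming each structure's coordinates are available as a dictionary keyed by atom index. It uses an unweighted centroid in place of a true center of mass, and `pick_central_matches` is a hypothetical helper.

```python
def centroid(coords, indices):
    """Unweighted centroid of the listed atoms (a stand-in for center of mass)."""
    n = len(indices)
    return tuple(sum(coords[i][axis] for i in indices) / n for axis in range(3))


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def pick_central_matches(all_matches, all_coords):
    """For each structure, keep the match nearest the whole-structure centroid."""
    chosen = []
    for matches, coords in zip(all_matches, all_coords):
        whole = centroid(coords, list(coords))
        best = min(
            matches,
            key=lambda match: squared_distance(centroid(coords, match), whole),
        )
        chosen.append(best)
    return chosen


# Hypothetical nested output: structure 1 has two matches, structure 2 has one.
all_matches = [[[1, 2], [2, 3]], [[1, 2]]]
all_coords = [
    {1: (0, 0, 0), 2: (1, 0, 0), 3: (2, 0, 0), 4: (5, 0, 0)},
    {1: (0, 0, 0), 2: (0, 1, 0)},
]
selected = pick_central_matches(all_matches, all_coords)
print(selected)
```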
- schrodinger.structutils.analyze.group_by_connectivity(st, atoms)¶
Groups the atoms by molecule connectivity. Returns a list of atom groups. Each group is a list of atoms that are in the same “molecule” - that are bonded to each other, counting only atoms in specified list. If multiple atoms are in the same molecule, but are separated by atoms that are not in the list (e.g. 2 covalent ligands bound to same protein), they will be grouped separately.
- Parameters
st (
structure.Structure
) – Structure that atoms are from.
atoms (list of ints) – List of atom indices that are to be grouped.
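The grouping rule (connectivity counted only through atoms in the given list) amounts to finding connected components of the bond graph restricted to those atoms. The sketch below shows this on a plain adjacency dictionary; `group_atoms` is a hypothetical helper and the ordering of the real function's output is not specified here.

```python
def group_atoms(adjacency, atoms):
    """Group atoms into sets connected through bonds between listed atoms only."""
    selected = set(atoms)
    seen = set()
    groups = []
    for start in atoms:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            atom = stack.pop()
            group.append(atom)
            for neighbor in adjacency.get(atom, ()):
                # Atoms outside the selection act as barriers.
                if neighbor in selected and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups


# Hypothetical chain 1-2-3-4-5 where atom 3 is not in the selection.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
separated = group_atoms(adjacency, [1, 2, 4, 5])
together = group_atoms(adjacency, [1, 2, 3, 4, 5])
print(separated, together)
```

Atoms 1-2 and 4-5 belong to the same molecule but land in separate groups because the linking atom 3 is not in the list.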
- schrodinger.structutils.analyze.find_common_properties(sts: Iterable[schrodinger.structure._structure.Structure]) → Set[str]¶
Return a set of property names that are common to all selected structures.
- Parameters
sts (Iterable of
structure.Structure
objects) – Structures to analyze
- Returns
set of property data names
- Return type
set of str
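Conceptually this is a set intersection over the property names of each structure. The sketch below uses plain dictionaries in place of structure property mappings; `common_property_names` is a hypothetical helper, and returning an empty set for empty input is an assumption of the sketch.

```python
def common_property_names(property_dicts):
    """Return the property names present in every mapping."""
    iterator = iter(property_dicts)
    try:
        common = set(next(iterator))
    except StopIteration:
        return set()  # assumption: no structures means no common properties
    for props in iterator:
        common &= set(props)
    return common


# Hypothetical property dictionaries for three structures.
props = [
    {"s_m_title": "lig1", "r_i_docking_score": -7.2, "i_user_rank": 1},
    {"s_m_title": "lig2", "r_i_docking_score": -6.8},
    {"s_m_title": "lig3", "r_i_docking_score": -5.1, "b_user_flag": True},
]
shared = common_property_names(props)
print(sorted(shared))
```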
- schrodinger.structutils.analyze.read_seqres_from_ct(st: schrodinger.structure._structure.Structure)¶
Read the SEQRES data from a ct and return it as a pair of lists of the same size. The first has the chain names and the second the sequences, i.e. [‘A’] and [‘ALA ALA ALA ‘].
- Parameters
st (
schrodinger.structure.Structure
) – Input ct to process
- schrodinger.structutils.analyze.seq_align_match(fullseq, fragseq, pdbnum, breaklist=None, allow_frag_gaps=False)¶
Restricted Needleman-Wunsch alignment.
- Parameters
fullseq (str) – Full sequence to work with. Positions in the alignment that match this and not fragseq are given light penalties. Positions in the alignment that match fragseq and not this are either not allowed (allow_frag_gaps=True) or have large penalties. This is intended to be the full protein sequence (from the SEQRES records) when aligning full protein sequences to those actually resolved in the experiment.
fragseq (str) – Fragment sequence to work with. This is intended to be the fragment of the protein sequence actually resolved in the experiment (ATOM records) when aligning full protein sequences to those actually resolved in the experiment.
pdbnum (list(tuple)) – List of tuples, each holding an integer and a character, with the same length as fragseq, e.g. [(1, ‘ ‘), (1, ‘A’), (2, ‘ ‘)]. These are the residue numbers and insertion codes of the residues in fragseq. This allows the gap penalties to be disregarded when the residue number suggests a gap.
breaklist (list(bool)) – List of Booleans with the same length as fragseq. True values in this list mean that there is a known break after that residue in fragseq, so gap penalties are disregarded.
allow_frag_gaps (bool) – see fullseq for a description
The return value is a string with the same length as the alignment. An M will be at any position that matches both fullseq and fragseq. A U will be at any position that matches fullseq but not fragseq. An R will be at any position that matches fragseq but not fullseq.
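To show how the M/U/R string might be consumed, the sketch below walks it alongside the full sequence and collects each run of U positions, i.e. residues present in the full sequence but not resolved. It assumes one character per residue in fullseq and that an R position consumes no residue of fullseq; `unresolved_segments` is a hypothetical helper, not part of the module.

```python
def unresolved_segments(fullseq, match_string):
    """Return (start_index, residues) for each run of 'U' in the match string."""
    segments = []
    full_pos = 0       # current position in fullseq
    run_start = None   # start of the current run of 'U', if any
    for code in match_string:
        if code == "U":
            if run_start is None:
                run_start = full_pos
            full_pos += 1
            continue
        if run_start is not None:
            segments.append((run_start, fullseq[run_start:full_pos]))
            run_start = None
        if code == "M":
            full_pos += 1
        elif code != "R":
            raise ValueError(f"unexpected alignment code: {code!r}")
    if run_start is not None:
        segments.append((run_start, fullseq[run_start:full_pos]))
    return segments


# Hypothetical 8-residue sequence with an internal gap, then with two tails.
internal = unresolved_segments("MKTAYIAK", "MMUUUMMM")
tails = unresolved_segments("MKTAYIAK", "UMMMMMUU")
print(internal, tails)
```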
- class schrodinger.structutils.analyze.AslLigandSearcher(copy_props=True, **kwargs)¶
Bases:
object
Search a
Structure
instance for putative ligands with an Atom Selection Language expression. Results are returned as a list of
Ligand
instances.
API example:
    from schrodinger import structure
    from schrodinger.structutils.analyze import AslLigandSearcher

    st = structure.Structure.read('file.mae')
    st_writer = structure.StructureWriter('out.mae')
    asl_searcher = AslLigandSearcher()
    ligands = asl_searcher.search(st)
    for lig in ligands:
        st_writer.append(lig.st)
    st_writer.close()
ASL evaluates molecules in a strict sense. Ligands with zero-order bonds to metal and covalently-attached ligands are difficult to find with this naive approach. See __init__ for options that work around these limitations.
‘sidechain’, ‘backbone’, and ‘ion’ aliases are used by this module. They are taken from the first mmasl.ini in the path, but are assumed to be defined as a list of PDB atom names that correspond to atoms of the protein side chains, protein backbone, and small ions respectively.
Since the precise definition of a ligand is context specific and impossible to generally formulate, this class attempts to provide customizable tools for identifying ligands within a structure. It is the caller’s responsibility to customize the search parameters and verify that the hits are appropriate.
For all keyword args, the configured value from Maestro will be used by default. Only specify a keyword arg to override this value.
All kwargs (except copy_props) are passed directly to LigandParameters, as defined in the LigandParameters class in mmasl.h
- Variables
copy_props (bool) – If True then copy the ct-level properties from the searched structure to all the found ligand substructures. If False, only the title will be copied.
min_heavy_atom_count (int) – Minimum number of heavy atoms required in each ligand molecule.
max_atom_count (int) – Maximum number of heavy atoms for a ligand molecule (does not include hydrogens).
allow_ion_only_molecules (bool) – Consider charged molecules to be ligands.
allow_amino_acid_only_molecules (bool) – If True, consider small molecules containing only amino acids to be ligands.
excluded_residue_names (set[str]) – Set of PDB residue names corresponding to atoms which will never be ligands.
included_residue_names (set[str]) – Set of PDB residue names corresponding to atoms which will always be considered ligands.
- See
find_ligands
for a simple functional interface to this class.
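To make the role of the search parameters concrete, here is a toy filter over pre-computed molecule summaries. It is not the searcher's actual logic: the `Candidate` record, the threshold values, and the choice to let included residue names take precedence over excluded ones are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one molecule in a complex."""
    residue_names: frozenset
    heavy_atom_count: int


def is_putative_ligand(candidate, *, min_heavy_atom_count, max_atom_count,
                       excluded_residue_names=frozenset(),
                       included_residue_names=frozenset()):
    """Toy filter mirroring the parameter descriptions above."""
    # Assumption: an explicitly included residue name wins over everything else.
    if candidate.residue_names & included_residue_names:
        return True
    if candidate.residue_names & excluded_residue_names:
        return False
    return min_heavy_atom_count <= candidate.heavy_atom_count <= max_atom_count


# Illustrative thresholds only; the real defaults come from Maestro's settings.
limits = dict(min_heavy_atom_count=5, max_atom_count=130)
water = Candidate(frozenset({"HOH"}), 1)
drug = Candidate(frozenset({"LIG"}), 20)
zinc = Candidate(frozenset({"ZN"}), 1)

print(is_putative_ligand(drug, **limits))
print(is_putative_ligand(water, excluded_residue_names=frozenset({"HOH"}), **limits))
print(is_putative_ligand(zinc, included_residue_names=frozenset({"ZN"}), **limits))
```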
- __init__(copy_props=True, **kwargs)¶
Initialize searcher.
- property min_atom_count¶
- property max_atom_count¶
- property exclude_ions¶
- property exclude_amino_acids¶
- property excluded_residues¶
- search(st) → list¶
Find list of putative ligands matching either
ligand_asl
or the default internally generated ASL.
- class schrodinger.structutils.analyze.Ligand(complex_st, st, mol_num=None, atom_indexes=None, lig_asl=None, is_covalently_bound=None)¶
Bases:
object
A putative
AslLigandSearcher
ligand structure with read-only data and convenience methods.
Ligand
items sort from smallest to largest, by total number of atoms, then by SMILES.
Parameters:
* complex_st: Original complex structure.
* st: Ligand substructure.
* mol_num: Ligand molecule number in the original structure. NOTE: the molecule contains non-ligand atoms for covalently bound ligands.
* atom_indexes: Atom indices into the original structure for this ligand.
* atom_objects: List of ligand atom objects from the original structure.
* lig_asl: ASL that matches the ligand atoms in the original structure.
* is_covalently_bound: Whether the ligand is covalently bound. Deprecated.
* pdbres: PDB residue name identifier.
* centroid: Centroid of ligand as a 4-element numpy array: [x, y, z, 0.0]
* unique_smiles: SMILES string representing this ligand structure.
- __init__(complex_st, st, mol_num=None, atom_indexes=None, lig_asl=None, is_covalently_bound=None)¶
- Parameters
complex_st (
Structure
) – Original complex structure.
st – Ligand structure.
mol_num (int) – Molecular index identifier. Typically, the mol.n from the original structure from whence this ligand structure was derived. Note, depending on the nature of the ligand and the treatment of the original structure this mol.n index may not be valid.
atom_indexes (list) – Atom index identifiers. Typically, the at.n from the original structure from whence this ligand structure was derived.
lig_asl (str) – ASL identifier. Typically, the expression is defined in terms of the original structure from whence this ligand structure was derived.
- Deprecated is_covalently_bound
Whether this ligand is bonded to other atoms (including zero-order bonds). Will be False if the ligand spans a whole molecule.
- property is_covalently_bound¶
The Ligand.is_covalently_bound property returns True if this ligand has any bonds (including zero-order) to any other atoms, and returns False if the ligand spans a complete molecule.
- sort_key()¶
Enable sorting for
Ligand
objects:
    ligands.sort(key=lambda l: l.sort_key())
Comparison criteria for sorting Ligands: total number of atoms, unique smiles string, centroid.
- Returns
sort key
- Return type
list
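The comparison criteria can be mimicked with a small stand-in class whose key is a list of atom count, SMILES string, and centroid. `LigandRecord` is hypothetical and only mirrors the stated criteria; the exact contents of the real sort key are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LigandRecord:
    """Hypothetical stand-in carrying only the fields used for sorting."""
    name: str
    atom_count: int
    unique_smiles: str
    centroid: tuple

    def sort_key(self):
        # Compared element by element: size, then SMILES, then centroid.
        return [self.atom_count, self.unique_smiles, list(self.centroid)]


records = [
    LigandRecord("methane", 5, "C", (4.0, 0.0, 0.0, 0.0)),
    LigandRecord("water_b", 3, "O", (1.0, 0.0, 0.0, 0.0)),
    LigandRecord("water_a", 3, "O", (0.0, 0.0, 0.0, 0.0)),
]
records.sort(key=lambda record: record.sort_key())
ordered_names = [record.name for record in records]
print(ordered_names)
```

The two waters tie on atom count and SMILES, so the centroid breaks the tie.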
- property mol_num¶
Ligand’s molecule number as defined upon instantiation.
Warning: Depending on the nature of the ligand and the treatment of the original structure, e.g. zero-order bonds cut, this mol.n index may not be valid.
- Return type
int
- property atom_indexes¶
Indices of the Ligand atoms as defined upon instantiation.
- Return type
list
- property atom_objects¶
Atom objects from the original structure for the ligand atoms.
- Return type
list
- property pdbres¶
PDB residue name identifier. If the ligand is composed of multiple residues then the names are joined with a ‘-’ separator.
- Return type
str
- property centroid¶
Centroid of the Ligand as a 4-element numpy array: [x, y, z, 0.0]
- Return type
4-element numpy array
- property unique_smiles¶
Unique SMILES string representing this ligand structure.
- Return type
str
- property st¶
Copy of the ligand Structure.
- Return type
Structure
- property ligand_asl¶
The ASL used when searching for the ligand. It defines the ligand in the context of its original structure.
- Return type
str
- class schrodinger.structutils.analyze.MissingLoopFinder¶
Bases:
object
Compare SEQRES and ATOM records to find missing loops. The order of the residues in the CT is not used, which avoids issues that occur when missing loops are searched for after some loops have been added by another program.
- __init__()¶
- run(ct, include_tails=False, legacy_output=False, debug=False)¶
Compare the SEQRES records with the structural ATOM records to find missing loops.
- Returns
tuples of (<residue object of the residue before the missing loop, or None if this is a missing N-terminal tail>, <residue object of the residue after the missing loop, or None if this is a missing C-terminal tail>, <list of missing residue types as 3-character strings>)
- Parameters
debug (bool) – if True, the alignments will be printed to stdout
include_tails (bool) – if True, missing N- and C-terminal tails will be included
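The shape of the returned tuples can be illustrated with a simplified sketch that starts from a per-residue "resolved" mask rather than from SEQRES and ATOM records. It returns list indices where the real method returns residue objects, and `find_missing_loops` is a hypothetical helper, not the class's algorithm.

```python
def find_missing_loops(residue_types, resolved, include_tails=False):
    """Return (before_index, after_index, missing_types) for each unresolved run."""
    loops = []
    n = len(residue_types)
    pos = 0
    while pos < n:
        if resolved[pos]:
            pos += 1
            continue
        start = pos
        while pos < n and not resolved[pos]:
            pos += 1
        before = start - 1 if start > 0 else None  # None marks an N-terminal tail
        after = pos if pos < n else None           # None marks a C-terminal tail
        is_tail = before is None or after is None
        if include_tails or not is_tail:
            loops.append((before, after, residue_types[start:pos]))
    return loops


# Hypothetical six-residue chain: residue 0 and residues 2-3 are unresolved.
types = ["MET", "ALA", "GLY", "SER", "LYS", "GLU"]
mask = [False, True, False, False, True, True]
loops_only = find_missing_loops(types, mask)
with_tails = find_missing_loops(types, mask, include_tails=True)
print(loops_only, with_tails)
```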
- schrodinger.structutils.analyze.get_low_energy_reps(sts, eps, key=None)¶
Cluster the given structures by energy using the given energy key function and precision and return the lowest energy structures from each cluster sorted by energy.
- Parameters
sts (list[
schrodinger.structure.Structure
]) – the structures
eps (float) – the precision that controls the size of the clusters; see the sklearn.cluster.DBSCAN documentation for more details
key (function or None) – the function to get the energy from the structure, if None minimize.compute_energy will be used
- Raises
ValueError – if there is an issue
- Return type
list[
schrodinger.structure.Structure
]
- Returns
representative structures sorted by increasing energy
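For one-dimensional energies, density-based clustering with a neighborhood radius of eps reduces to splitting the sorted values wherever the gap between consecutive energies exceeds eps, provided every point may seed a cluster (DBSCAN with min_samples=1, which is an assumption of this sketch and not a documented setting of the function). The stdlib-only sketch below uses that shortcut; `low_energy_representatives` is a hypothetical name.

```python
def low_energy_representatives(items, eps, key):
    """Return the lowest-energy item of each cluster, sorted by energy."""
    ranked = sorted(items, key=key)
    representatives = []
    previous_energy = None
    for item in ranked:
        energy = key(item)
        # A gap larger than eps starts a new cluster; its first member is
        # the lowest in energy because the items are sorted.
        if previous_energy is None or energy - previous_energy > eps:
            representatives.append(item)
        previous_energy = energy
    return representatives


# Hypothetical energies keyed by conformer name.
energies = {"a": -10.0, "b": -9.5, "c": -4.0, "d": -3.5, "e": 2.0}
reps = low_energy_representatives(["d", "a", "e", "b", "c"], eps=1.0, key=energies.get)
print(reps)
```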