schrodinger.structutils.rgroup_enumerate module¶
Module for R-group enumeration.
- schrodinger.structutils.rgroup_enumerate.logger = <Logger rgroup_enumerate (INFO)>¶
RGroup properties:
atom_index: index of the atom bound to the core (aka the “leaving atom”)
source_index: index of the R-group source that will replace this group
leaving_atoms: list of all the atoms in the leaving group (by index)
staying_atom: index of the core atom bound to the leaving group
- class schrodinger.structutils.rgroup_enumerate.RGroup(atom_index, source_index, leaving_atoms, staying_atom, bond_order)¶
Bases:
tuple
- atom_index¶
Alias for field number 0
- bond_order¶
Alias for field number 4
- leaving_atoms¶
Alias for field number 2
- source_index¶
Alias for field number 1
- staying_atom¶
Alias for field number 3
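As a rough illustration of the field layout listed above, the following sketch builds a tuple type with the same field names and order using collections.namedtuple. RGroupSketch is a stand-in, not the module's own class, and the atom indices are made up.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented RGroup tuple.
RGroupSketch = namedtuple(
    "RGroupSketch",
    ["atom_index", "source_index", "leaving_atoms", "staying_atom", "bond_order"],
)

# Hypothetical values: atom 4 leaves together with atoms 13-15; atom 3 stays.
rg = RGroupSketch(
    atom_index=4,
    source_index=0,
    leaving_atoms=[4, 13, 14, 15],
    staying_atom=3,
    bond_order=1,
)

# Each field can be read by name or by its field number.
print(rg.atom_index, rg[0])
print(rg.bond_order, rg[4])
```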
- class schrodinger.structutils.rgroup_enumerate.Concentration(atoms, concentration)¶
Bases:
object
Concentration classes manage “concentration groups”, which define the exact number of possible combinations to be explored. They construct new instances from commandline strings. For each concentration group, they also track the associated R group sources, which are written as properties to enumerated products.
- __init__(atoms, concentration)¶
- Parameters
atoms (list(int)) – the atom indices of the input structure corresponding to specified attachment points.
concentration (int) – the concentration, a float that defines the maximum number of simultaneous substitutions across the provided atoms
- appendSource(rgroup_source)¶
- Parameters
rgroup_source (rgroup_enumerate.RGroupSource) – a pair of attachment points and R group input structures, used to enumerate new R groups at the specified attachment points
- addConcentrationProperty(product)¶
Add the concentration property (if applicable) to a product.
- Parameters
product (structure.Structure) – target product
- class schrodinger.structutils.rgroup_enumerate.RGroupSource(source_index, structures, atom_indices, attachment_indices)¶
Bases:
object
Class to generate rgroup structure iterators that are associated with specific attachment points.
- __init__(source_index, structures, atom_indices, attachment_indices)¶
- Parameters
structures (iter(structure.Structure)) – structure iterator
attachment_indices (list(int)) – affected attachment points
- makeIter(target_attachments)¶
Create a new iterator that stores and returns the associated target atoms.
- Parameters
target_attachments (list(int)) – list of attachment point indices specifying affected attachment points.
- Returns
iterator that knows which attachment points to modify
- Return type
RGroupIter
- exception schrodinger.structutils.rgroup_enumerate.RGroupError¶
Bases:
Exception
Exception class for errors specific to this module, which the caller may want to present to the user as a simple error message, as opposed to a traceback. This is meant for “user errors”, as opposed to bugs; for example, when an input structure doesn’t fulfill the requirements.
- exception schrodinger.structutils.rgroup_enumerate.BondOrderMismatch¶
Bases:
schrodinger.structutils.rgroup_enumerate.RGroupError
A specific kind of error that we’ll ignore to allow libraries that include R-groups with different bond orders.
- class schrodinger.structutils.rgroup_enumerate.RGroupEnumerator(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=())¶
Bases:
object
Enumerate a structure using R-group sources.
A source is a sequence with an iterable of Structure as its first element, followed by one or more core atom indices where the side chains from the source should be inserted.
RGroupEnumerator objects are iterable. Example:
    sources = [
        (StructureReader('r1.maegz'), 4, 12),
        (StructureReader('r2.maegz'), 8),
    ]
    for prod_st in RGroupEnumerator(core_st, sources):
        ...
will use the first reader to replace atoms 4 and 12 in a homo fashion (meaning that for a given product, the groups attached to atoms 4 and 12 are always the same), in combination with the structures from the second reader for atom 8.
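The combination logic described here can be sketched without any Schrödinger imports. In this sketch, plain strings stand in for Structure objects and enumerate_assignments is a hypothetical helper, not part of the module. It only shows which R-group lands on which core atom, not how the product is built.

```python
from itertools import product

# Plain strings stand in for the structures read from r1.maegz and r2.maegz.
sources = [
    (["Me", "Et"], 4, 12),    # same R-group on atoms 4 and 12 in each product
    (["F", "Cl", "Br"], 8),   # combined independently on atom 8
]

def enumerate_assignments(sources):
    """Yield one {core atom index: R-group} mapping per product."""
    rgroup_lists = [list(src[0]) for src in sources]
    for combo in product(*rgroup_lists):
        assignment = {}
        for rgroup, src in zip(combo, sources):
            # Every attachment atom of a source receives the same R-group.
            for atom_index in src[1:]:
                assignment[atom_index] = rgroup
        yield assignment

assignments = list(enumerate_assignments(sources))
for assignment in assignments:
    print(assignment)
```

Two R-groups for the first source and three for the second give six assignments, and atoms 4 and 12 always carry the same group.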
The generated structures have the title of the core structure and the title of each of the R-groups, encoded in CSV format. For ease of parsing, this information is also stored as separate properties: i_rge_num_r_groups has the number of R groups, and the title of each goes in properties r_rge_R1, r_rge_R2, etc.
As an option, all CT properties from the R groups can be copied to each product molecule. These properties have the original name prefixed with <type-char>_rge_R<index>_; for example, r_i_glide_gscore for the first R group becomes r_rge_R1_r_i_glide_gscore.
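The naming rule for copied properties can be written as a small string helper. copied_property_name is a hypothetical name used only for this sketch, and it assumes the type character is the first character of the original property name.

```python
def copied_property_name(original_name, rgroup_index):
    """Prefix a CT property name with <type-char>_rge_R<index>_.

    Assumption: the type character is the first character of the
    original property name (for example 'r' in r_i_glide_gscore).
    """
    type_char = original_name[0]
    return f"{type_char}_rge_R{rgroup_index}_{original_name}"

print(copied_property_name("r_i_glide_gscore", 1))
```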
The structures in each R-group source should each have one dummy atom (symbol ‘’, atomic number zero).
The user of the class can request only a slice of the full set of combinations to be yielded, by providing the optional ‘start’ and ‘stop’ constructor arguments. These follow the standard Python slicing convention.
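To make the slicing convention concrete, the sketch below applies itertools.islice to a stream of combinations. It only illustrates the start/stop semantics and is not how the class implements them. The R-group names are placeholders.

```python
from itertools import islice, product

r1 = ["Me", "Et", "Pr"]
r2 = ["F", "Cl"]

# Six combinations in total, numbered 0..5.
all_combos = list(product(r1, r2))

# start=2, stop=5 keeps combinations 2, 3 and 4 (stop is exclusive).
chunk = list(islice(product(r1, r2), 2, 5))

# start=0, stop=None keeps everything, like the constructor defaults.
everything = list(islice(product(r1, r2), 0, None))

print(chunk)
```

Splitting a large enumeration into non-overlapping (start, stop) ranges in this way lets each range be processed separately.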
If the core structure came from an SDF file with R-group labels (“M RGP” lines), the attachment atoms don’t need to be specified; the labels from the file can be used implicitly.
- __init__(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=())¶
Initialize an enumerator for a given core structure and specification of rgroup sources.
- Parameters
core_st (schrodinger.structure.Structure) – core structure
sources (list of list) – side chain sources. See class description for details.
optimize_sidechains (bool) – if true, generate 3D coordinates for the side chain atoms using Fast3D. The input coordinates will only be used for determining stereochemistry. If false, position the side chains using rigid rotation and translation (and an arbitrary torsional angle around the new bond).
deduplicate (bool) – use unique SMILES to identify and reject duplicate products
start (int) – beginning of results slice (used by subjobs)
stop (int or None) – end of results slice (used by subjobs)
copy_properties (bool) – if true, copy all CT properties from each R-group to the constructed molecule.
enumerate_cistrans (bool) – if True (default), emit both cis and trans isomers for double-bonded R-groups.
yield_renum_maps (bool) – if True then on each iteration yield not only the product structure but also the relevant old-to-new atom index map
concentrations (iterable(Concentration) or None) – List of concentrations, which define how many simultaneous R group substitutions are made across all attachment point atoms in the concentration group.
- attachSidechains(sidechains)¶
Attach the sidechains to the core structure and return the resulting structure and index map.
- Parameters
sidechains (list of schrodinger.structure.Structure) – list of sidechains. Should have the same length as the number of attachment atoms in the core.
- Yield
product structures and index maps
- Ytype
- combinations()¶
Return the number of combinations that will be generated for each tuple of R-groups. That is, combinations due to occupancy of the various attachment points when not all the concentrations are 1.0.
- schrodinger.structutils.rgroup_enumerate.filtered_combinations(rgroup_iters, selection_iters)¶
Given a set of rgroup structures (provided as a list of RGroupIterFactories) and attachment point combination iterators, return an iterator over all distinct combinations of structures and attachment point combinations. (Note that while the combinations are unique, the corresponding products may still have duplicates.)
- Parameters
rgroup_iters – iterators over R groups, one for every source.
selection_iters (list(iter(tuple(int)))) – iterators over combinations, one for every set of attachment points from which combinations are chosen
- Returns
iterator over all distinct product combinations
- Return type
itertools.product(tuple(structure.Structure, list(int)))
- schrodinger.structutils.rgroup_enumerate.filtered_combinations_for_selection(rgroup_iters, selection_iter)¶
Helper function to reduce the loop nesting in filtered_combinations. Handles the filtering of rgroup structure iterators per attachment point set (associating the rgroup iterator to its respective attachment points). (Note that while the combinations are unique, the corresponding products may still have duplicates.)
- Parameters
rgroup_iters – iterators over R groups, one for every source.
selection_iter – iterator over combinations for a set of attachment points
- Yield
iterator over all distinct product (R group + attachment point) combinations
- Ytype
itertools.product(tuple(structure.Structure, list(int)))
- schrodinger.structutils.rgroup_enumerate.find_rgroup_from_smarts(st, smarts, leaving_atom_pos, staying_atom_pos, bond_order=None)¶
Find the various ways in which a structure can be split into “R-group” and “functional group” using a SMARTS pattern.
The SMARTS pattern must consist of at least two atoms. Two of the atoms, identified by their position in the SMARTS string, are used to define the bond to be broken between the R group and the “leaving group”. If the two atoms are not directly connected, the bond leading from the leaving atom to the staying atom is broken.
For example, consider the structure c1ccccc1C(=O)O and the SMARTS pattern cC(=O)O. With leaving_atom_pos=2, staying_atom_pos=1, the entire carboxylate is removed, producing the R-group c1ccccc1*. With leaving_atom_pos=4, staying_atom_pos=2, only the terminal O is removed, leading to the R-group c1ccccc1C(=O)*. (The asterisks are shown here only to highlight the bond that was broken.)
The return value is a list of tuples, where the first element is the attachment atom index and the second is a list of the indexes of the atoms comprising the R-group. In the first example above, if we pretend there are no hydrogens, the return value might be [(7, [1,2,3,4,5,6])].
Notes: 1) ring bonds can’t be broken because they don’t split the structure in two; 2) if bond_order is not None, skip matches having the attachment bond of different order.
- Parameters
st (schrodinger.structure.Structure) – structure to analyze
smarts (str) – SMARTS pattern describing the functional group
leaving_atom_pos (index) – position of the leaving atom in the SMARTS pattern (1-based)
staying_atom_pos (index) – position of the attachment atom in the SMARTS pattern (1-based)
bond_order (int or NoneType) – If None (default), has no effect. Otherwise skip matches having R-group attachment bond of different order.
- Returns
list of tuples (attachment atom, list of R-group atom indexes). If no matches satisfied all the requirements, the list may be empty. May include duplicate R-groups (R-group in this context is the substructure made of the newly found R-group atoms).
- Return type
list
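The bond-breaking step can be illustrated on a plain bond list, without SMARTS matching or Structure objects. The sketch below assumes the heavy atoms of benzoic acid are numbered 1-6 for the ring, 7 for the carboxyl carbon, and 8 and 9 for the oxygens. split_at_bond is a hypothetical helper, not the module's implementation.

```python
from collections import deque

# Assumed heavy-atom numbering for benzoic acid (hydrogens ignored).
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (6, 7), (7, 8), (7, 9)]

def split_at_bond(bonds, leaving_atom, staying_atom):
    """Return (leaving_atom, R-group atom indices), or None for a ring bond."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Walk outward from the staying atom without crossing the broken bond.
    seen = {staying_atom}
    queue = deque([staying_atom])
    while queue:
        atom = queue.popleft()
        for nbr in neighbors[atom]:
            if {atom, nbr} == {leaving_atom, staying_atom}:
                continue
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    if leaving_atom in seen:
        return None  # still connected, so the bond was part of a ring
    return leaving_atom, sorted(seen)

whole_acid = split_at_bond(BONDS, leaving_atom=7, staying_atom=6)
terminal_o = split_at_bond(BONDS, leaving_atom=9, staying_atom=7)
ring_bond = split_at_bond(BONDS, leaving_atom=2, staying_atom=1)
print(whole_acid, terminal_o, ring_bond)
```

The first call reproduces the tuple from the first example above, and the last call shows why a ring bond yields no split: the leaving atom is still reachable from the staying atom.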
- schrodinger.structutils.rgroup_enumerate.find_staying_atom(st, leaving_atom)¶
Given a picked “leaving” atom, determine which of the atoms it is bonded to is part of the larger molecule - the “staying” atom. All other atoms bound to the leaving atom are considered to be part of the leaving group.
- Parameters
leaving_atom (schrodinger.structure._StructureAtom) – atom which defines the start of the leaving group
- Returns
“staying atom”: the core atom bound to the leaving atom
- Return type
schrodinger.structure._StructureAtom
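One plausible reading of "part of the larger molecule" is to pick the neighbor whose side of the molecule contains the most atoms. The sketch below works on a plain adjacency dictionary under that assumption. The helper names and the atom numbering (benzoic acid heavy atoms, ring 1-6, carboxyl carbon 7, oxygens 8 and 9) are illustrative only, and the module may use a different criterion.

```python
def fragment_size(neighbors, start, blocked):
    """Count atoms reachable from start without passing through blocked."""
    seen = {start}
    stack = [start]
    while stack:
        atom = stack.pop()
        for nbr in neighbors[atom]:
            if nbr != blocked and nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen)

def pick_staying_atom(neighbors, leaving_atom):
    """Return the neighbor of leaving_atom that leads to the largest fragment."""
    return max(
        neighbors[leaving_atom],
        key=lambda nbr: fragment_size(neighbors, nbr, leaving_atom),
    )

# Assumed adjacency for the heavy atoms of benzoic acid.
NEIGHBORS = {
    1: {2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6},
    6: {5, 1, 7}, 7: {6, 8, 9}, 8: {7}, 9: {7},
}

staying_for_carboxyl = pick_staying_atom(NEIGHBORS, 7)
staying_for_oxygen = pick_staying_atom(NEIGHBORS, 9)
print(staying_for_carboxyl, staying_for_oxygen)
```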
- schrodinger.structutils.rgroup_enumerate.get_dummy_filter()¶
Return a filter which has as criteria all the descriptors that can be computed by this module, along with their suggested default limiters (ranges).
- schrodinger.structutils.rgroup_enumerate.add_descriptors(st, filter_obj)¶
Add the descriptors required by a filter to a given Structure.
- schrodinger.structutils.rgroup_enumerate.list_to_csv(fields)¶
Convert a list into a CSV string representation.
- Parameters
fields (list) – list to convert
- Return type
str
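A comparable conversion can be sketched with the standard csv module. fields_to_csv is a hypothetical stand-in, and the quoting rules of the module's own function may differ.

```python
import csv
import io

def fields_to_csv(fields):
    """Join fields into one CSV record, quoting only where needed."""
    buffer = io.StringIO()
    # An empty line terminator keeps the result to a single line of text.
    csv.writer(buffer, lineterminator="").writerow(fields)
    return buffer.getvalue()

title = fields_to_csv(["core", "R1, methyl", "R2"])
print(title)

# Reading the record back recovers the original fields.
parsed = next(csv.reader([title]))
```

Quoting matters here because a structure title may itself contain a comma.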
- schrodinger.structutils.rgroup_enumerate.get_metals_and_neighbors(st)¶
Returns indices of metal atoms, and atoms bonded to them.
- Parameters
st – Structure.
- Returns
Set of atom indices in st.
- Return type
set(int)
- schrodinger.structutils.rgroup_enumerate.get_sources_from_r_labels(st, iters, prop='i_sd__MolFileRLabel')¶
Given a Structure and a list of iterables, return the “sources” data structure needed by RGroupEnumerator. The structure must have (some) atoms with the specified property; the values of this property must be in the range [1, len(iters)] and all the values in that range must be represented at least once. If this condition is not met, raise a ValueError.
- Parameters
st (schrodinger.structure.Structure) – core Structure
iters (list) – list of iterables of structures
prop (str) – name of the atom property holding the R-group labels. The default is what comes from reading an SD file with “M RGP” fields using RDKit.
- Returns
sources data structure. See RGroupEnumerator for details.
- Return type
list of list
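The label check and the resulting data structure can be sketched with a plain dictionary in place of atom properties. sources_from_labels is a hypothetical helper that takes a {atom index: R label} mapping instead of a Structure. Lists of strings stand in for the iterables of structures.

```python
def sources_from_labels(labels_by_atom, iters):
    """Build [[iterable, atom, ...], ...] from a {atom index: R label} map.

    Raises ValueError unless the labels in use are exactly 1..len(iters).
    """
    expected = set(range(1, len(iters) + 1))
    if set(labels_by_atom.values()) != expected:
        raise ValueError("R labels must cover 1..%d exactly" % len(iters))
    sources = []
    for label, rgroup_iter in enumerate(iters, start=1):
        atoms = sorted(a for a, lab in labels_by_atom.items() if lab == label)
        sources.append([rgroup_iter] + atoms)
    return sources

# Atoms 4 and 12 carry label R1; atom 8 carries label R2.
labels = {4: 1, 12: 1, 8: 2}
sources = sources_from_labels(labels, [["Me", "Et"], ["F"]])
print(sources)
```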
- schrodinger.structutils.rgroup_enumerate.convert_attachment_point(struct, bond_order=None)¶
Converts the attachment point from a methyl group to a dummy atom. Returns False if the R-group fragment is ‘Null’, and True otherwise. A Null R-group has an atom with atomic number -2 and growname ‘rpc1’.
- Parameters
struct (structure.Structure) – structure object
bond_order (int or NoneType) – If None, has no effect. Otherwise return False if attachment bond is of different order.
- Returns
True if conversion succeeded and False otherwise.
- Return type
bool
- schrodinger.structutils.rgroup_enumerate.get_attachment_point(st)¶
Identifies attachment point (dummy atom with a single neighbor) in the provided structure.
- Parameters
st (schrodinger.structure.Structure) – Structure
- Returns
Index of the dummy atom and order of the bond that joins the dummy to the rest.
- Return type
(int, int)
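The search for a single-neighbor dummy atom can be sketched on plain data. find_attachment_point is a hypothetical helper that takes atomic numbers and a bond list rather than a Structure. Raising ValueError when nothing is found is an assumption of this sketch, since the behavior in that case is not documented here.

```python
def find_attachment_point(atomic_numbers, bonds):
    """Return (dummy atom index, bond order) for a single-neighbor dummy atom.

    atomic_numbers: {atom index: atomic number}
    bonds: list of (atom1, atom2, bond order)
    """
    for index, number in atomic_numbers.items():
        if number != 0:
            continue  # only atoms with atomic number zero are dummies
        orders = [order for a, b, order in bonds if index in (a, b)]
        if len(orders) == 1:
            return index, orders[0]
    # Assumption: signal failure with an exception.
    raise ValueError("no dummy atom with exactly one neighbor")

# Hypothetical methyl R-group: dummy atom 1 single-bonded to carbon 2.
atomic_numbers = {1: 0, 2: 6, 3: 1, 4: 1, 5: 1}
bonds = [(1, 2, 1), (2, 3, 1), (2, 4, 1), (2, 5, 1)]
point = find_attachment_point(atomic_numbers, bonds)
print(point)
```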
- schrodinger.structutils.rgroup_enumerate.check_attachment_point(struct, bond_order=None)¶
Checks that provided structure contains attachment point.
- Parameters
bond_order (int or NoneType) – If None, has no effect. Otherwise return False if attachment bond is of different order.
- Returns
True if attachment point was found and False otherwise.
- Return type
bool
- schrodinger.structutils.rgroup_enumerate.create_fragment_structure(st, rgroup_data)¶
Creates r-group fragment structures from a given structure and a list of atoms that should be included in the r-group.
- Parameters
st (structure.Structure) – structure object
rgroup_data (tuple) – r-group data that contains index of attachment atom and indices of R-group fragment atoms.
- Returns
fragment structure
- Return type
structure.Structure
- schrodinger.structutils.rgroup_enumerate.RgroupEnumerator¶
alias of
schrodinger.structutils.rgroup_enumerate.RGroupEnumerator
- schrodinger.structutils.rgroup_enumerate.RgroupError¶
alias of
schrodinger.structutils.rgroup_enumerate.RGroupError