schrodinger.structutils.rgroup_enumerate module

Module for R-group enumeration.

schrodinger.structutils.rgroup_enumerate.logger = <Logger rgroup_enumerate (INFO)>

RGroup properties:

  • atom_index: index of the atom bound to the core (aka the “leaving atom”)

  • source_index: index of the R-group source that will replace this group

  • leaving_atoms: list of all the atoms in the leaving group (by index)

  • staying_atom: index of the core atom bound to the leaving group

class schrodinger.structutils.rgroup_enumerate.RGroup(atom_index, source_index, leaving_atoms, staying_atom, bond_order)

Bases: tuple

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

atom_index

Alias for field number 0

bond_order

Alias for field number 4

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

leaving_atoms

Alias for field number 2

source_index

Alias for field number 1

staying_atom

Alias for field number 3
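
The field layout above can be mirrored with a plain namedtuple. This is a minimal sketch, not the module's own definition: the stand-in class is built locally with collections.namedtuple, and every atom index below is a made-up value.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented RGroup tuple.
RGroup = namedtuple(
    "RGroup",
    ["atom_index", "source_index", "leaving_atoms", "staying_atom", "bond_order"],
)

# Hypothetical values: atom 7 leaves (together with atoms 8 and 9), atom 6 stays.
rg = RGroup(atom_index=7, source_index=0, leaving_atoms=[7, 8, 9],
            staying_atom=6, bond_order=1)

# Fields are reachable by name or by position ("Alias for field number N").
assert rg.atom_index == rg[0]
assert rg.bond_order == rg[4]
```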

class schrodinger.structutils.rgroup_enumerate.Concentration(atoms, concentration)

Bases: object

Concentration classes manage “concentration groups”, which define the exact number of possible combinations to be explored. They construct new instances from commandline strings. For each concentration group, they also track the associated R group sources, which are written as properties to enumerated products.

__init__(atoms, concentration)
Parameters
  • atoms (list(int)) – the atom indices of the input structure corresponding to specified attachment points.

  • concentration (float) – the concentration, which defines the maximum number of simultaneous substitutions across the provided atoms

appendSource(rgroup_source)
Parameters

rgroup_source (rgroup_enumerate.RGroupSource) – a pair of attachment points and R group input structures, used to enumerate new R groups at the specified attachment points

addConcentrationProperty(product)

Add the concentration property (if applicable) to a product.

Parameters

product (structure.Structure) – target product

class schrodinger.structutils.rgroup_enumerate.RGroupSource(source_index, structures, atom_indices, attachment_indices)

Bases: object

Class to generate rgroup structure iterators that are associated with specific attachment points.

__init__(source_index, structures, atom_indices, attachment_indices)
Parameters
  • structures (iter(structure.Structure)) – structure iterator

  • attachment_indices (list(int)) – affected attachment points

makeIter(target_attachments)

Create a new iterator that stores and returns the associated target atoms.

Parameters

target_attachments (list(int)) – list of attachment point indices specifying affected attachment points.

Returns

iterator that knows which attachment points to modify

Return type

RGroupIter

exception schrodinger.structutils.rgroup_enumerate.RGroupError

Bases: Exception

Exception class for errors specific to this module, which the caller may want to present to the user as a simple error message, as opposed to a traceback. This is meant for “user errors”, as opposed to bugs; for example, when an input structure doesn’t fulfill the requirements.

__init__(*args, **kwargs)
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception schrodinger.structutils.rgroup_enumerate.BondOrderMismatch

Bases: schrodinger.structutils.rgroup_enumerate.RGroupError

A specific kind of error that we’ll ignore to allow libraries that include R-groups with different bond orders.

__init__(*args, **kwargs)
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
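
The two exception classes suggest a handling pattern in which bond-order mismatches are skipped and other user errors are reported as short messages. The sketch below illustrates that pattern only: the exception classes are local stand-ins with the same hierarchy, and build_product is a hypothetical helper, not a function of this module.

```python
class RGroupError(Exception):
    """Stand-in for the documented user-error exception."""


class BondOrderMismatch(RGroupError):
    """Stand-in for the documented bond-order subclass."""


def build_product(rgroup_name):
    """Hypothetical builder that fails in the two documented ways."""
    if rgroup_name == "double-bonded":
        raise BondOrderMismatch("attachment bond order differs")
    if rgroup_name == "no-dummy":
        raise RGroupError("R-group has no dummy atom")
    return f"core+{rgroup_name}"


products, messages = [], []
for name in ["methyl", "double-bonded", "no-dummy"]:
    try:
        products.append(build_product(name))
    except BondOrderMismatch:
        # Catch the subclass first: this R-group is skipped silently.
        continue
    except RGroupError as err:
        # Any other user error becomes a one-line message, not a traceback.
        messages.append(f"Error: {err}")
```

The subclass must be caught before its base class, otherwise the RGroupError handler would also receive every BondOrderMismatch.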

class schrodinger.structutils.rgroup_enumerate.RGroupEnumerator(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=())

Bases: object

Enumerate a structure using R-group sources.

A source is a sequence with an iterable of Structure as its first element, followed by one or more core atom indices where the side chains from the source should be inserted.

RGroupEnumerator objects are iterable. Example:

sources = [
    (StructureReader('r1.maegz'), 4, 12),
    (StructureReader('r2.maegz'), 8),
]
for prod_st in RGroupEnumerator(core_st, sources):
    ...

will use the first reader to replace atoms 4 and 12 in a homo fashion (meaning that for a given product, the groups attached to atoms 4 and 12 are always the same), in combination with the structures from the second reader for atom 8.
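
The pairing described above can be sketched with itertools.product. This is an illustration of the combinatorics only, under simplifying assumptions: strings stand in for Structure objects, enumerate_assignments is a hypothetical helper, and deduplication, concentrations, and 3D placement are ignored.

```python
import itertools

# Hypothetical stand-ins: strings instead of Structure objects.
sources = [
    (["Me", "Et"], 4, 12),   # the same R-group goes on atoms 4 and 12
    (["F", "Cl", "Br"], 8),  # an independent R-group goes on atom 8
]


def enumerate_assignments(sources):
    """Yield one {core_atom_index: rgroup} dict per product."""
    pools = [list(src[0]) for src in sources]
    for combo in itertools.product(*pools):
        assignment = {}
        for rgroup, src in zip(combo, sources):
            # Every attachment atom of a source receives the same R-group.
            for atom_index in src[1:]:
                assignment[atom_index] = rgroup
        yield assignment


assignments = list(enumerate_assignments(sources))
```

With two structures in the first source and three in the second, the sketch yields 2 x 3 = 6 assignments, and atoms 4 and 12 always carry the same group.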

The generated structures have the title of the core structure and the title of each of the R-groups, encoded in CSV format. For ease of parsing, this information is also stored as separate properties: i_rge_num_r_groups has the number of R groups, and the title of each goes in properties r_rge_R1, r_rge_R2, etc.

As an option, all CT properties from the R groups can be copied to each product molecule. These properties have the original name prefixed with <type-char>_rge_R<index>_; for example, r_i_glide_gscore for the first R group becomes r_rge_R1_r_i_glide_gscore.
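
The renaming rule can be written as a one-line string transformation. The helper name rgroup_property_name is hypothetical; only the naming scheme itself comes from the description above.

```python
def rgroup_property_name(original_name, rgroup_index):
    """Build the product-level name for a property copied from R-group N."""
    # The type character is the part before the first underscore.
    type_char = original_name.split("_", 1)[0]
    return f"{type_char}_rge_R{rgroup_index}_{original_name}"


renamed = rgroup_property_name("r_i_glide_gscore", 1)
# renamed == "r_rge_R1_r_i_glide_gscore"
```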

The structures in each R-group source should each have one dummy atom (symbol ‘’, atomic number zero).

The user of the class can request only a slice of the full set of combinations to be yielded, by providing the optional ‘start’ and ‘stop’ constructor arguments. These follow the standard Python slicing convention.
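
The slicing convention is the half-open one used by itertools.islice: start is included, stop is excluded, and stop=None means "to the end". The sketch below shows how two subjobs could split a set of combinations without overlap. It is an assumption-laden illustration: sliced_combinations is a hypothetical helper, strings replace structures, and it does not account for deduplication or concentrations, which may affect how the real class counts results.

```python
import itertools


def sliced_combinations(pools, start=0, stop=None):
    """Return the [start, stop) slice of the full product of the pools."""
    return itertools.islice(itertools.product(*pools), start, stop)


pools = [["Me", "Et"], ["F", "Cl", "Br"]]

# Two hypothetical subjobs covering all six combinations exactly once.
first_half = list(sliced_combinations(pools, start=0, stop=3))
second_half = list(sliced_combinations(pools, start=3, stop=None))
```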

If the core structure came from an SDF file with R-group labels (“M RGP” lines), the attachment atoms don’t need to be specified; the labels from the file can be used implicitly.

__init__(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=())

Initialize an enumerator for a given core structure and specification of rgroup sources.

Parameters
  • core_st (schrodinger.structure.Structure) – core structure

  • sources (list of list) – side chain sources. See class description for details.

  • optimize_sidechains (bool) – if true, generate 3D coordinates for the side chain atoms using Fast3D. The input coordinates will only be used for determining stereochemistry. If false, position the side chains using rigid rotation and translation (and an arbitrary torsional angle around the new bond).

  • deduplicate (bool) – use unique SMILES to identify and reject duplicate products

  • start (int) – beginning of results slice (used by subjobs)

  • stop (int or None) – end of results slice (used by subjobs)

  • copy_properties (bool) – if true, copy all CT properties from each R-group to the constructed molecule.

  • enumerate_cistrans (bool) – if True (default), emit both cis and trans isomers for double-bonded R-groups.

  • yield_renum_maps (bool) – if True then on each iteration yield not only the product structure but also the relevant old-to-new atom index map

  • concentrations (iterable(Concentration) or None) – List of concentrations, which define the number of simultaneous R group substitutions made across all attachment point atoms in the concentration group.

attachSidechains(sidechains)

Attach the sidechains to the core structure and return the resulting structure and index map.

Parameters

sidechains (list of schrodinger.structure.Structure) – list of sidechains. Should have the same length as the number of attachment atoms in the core.

Yield

product structures and index maps

Yield type

schrodinger.structure.Structure, dict

combinations()

Return the number of combinations that will be generated for each tuple of R-groups. That is, combinations due to occupancy of the various attachment points when not all of the concentrations are 1.0.

schrodinger.structutils.rgroup_enumerate.filtered_combinations(rgroup_iters, selection_iters)

Given a set of rgroup structures (provided as a list of RGroupIterFactories) and attachment point combination iterators, return an iterator over all distinct combinations of structures and attachment point combinations. (Note that while the combinations are unique, the corresponding products may still have duplicates.)

Parameters
  • rgroup_iters – iterators over R groups, one for every source.

  • selection_iters (list(iter(tuple(int)))) – iterators over combinations, one for every set of attachment points from which combinations are chosen

Returns

iterator over all distinct product combinations

Return type

itertools.product(tuple(structure.Structure, list(int)))

schrodinger.structutils.rgroup_enumerate.filtered_combinations_for_selection(rgroup_iters, selection_iter)

Helper function to reduce the loop nesting in filtered_combinations. Handles the filtering of rgroup structure iterators per attachment point set (associating the rgroup iterator to its respective attachment points). (Note that while the combinations are unique, the corresponding products may still have duplicates.)

Parameters
  • rgroup_iters – iterators over R groups, one for every source.

  • selection_iter – iterator over combinations for a set of attachment points

Yield

iterator over all distinct product (R group + attachment point) combinations

Yield type

itertools.product(tuple(structure.Structure, list(int)))

schrodinger.structutils.rgroup_enumerate.find_rgroup_from_smarts(st, smarts, leaving_atom_pos, staying_atom_pos, bond_order=None)

Find the various ways in which a structure can be split into “R-group” and “functional group” using a SMARTS pattern.

The SMARTS pattern must consist of at least two atoms. Two of the atoms, identified by their position in the SMARTS string, are used to define the bond to be broken between the R group and the “leaving group”. If the two atoms are not directly connected, the bond leading from the leaving atom to the staying atom is broken.

For example, consider the structure c1ccccc1C(=O)O and the SMARTS pattern *C(=O)O. With leaving_atom_pos=2, staying_atom_pos=1, the entire carboxylate is removed, producing the R-group c1ccccc1*. With leaving_atom_pos=4, staying_atom_pos=2, only the terminal O is removed, leading to the R-group c1ccccc1C(=O)*. (The asterisks are shown here only to highlight the bond that was broken.)

The return value is a list of tuples, where the first element is the attachment atom index and the second is a list of the indexes of the atoms comprising the R-group. In the first example above, if we pretend there are no hydrogens, the return value might be [(7, [1,2,3,4,5,6])].

Notes: 1) ring bonds can’t be broken because they don’t split the structure in two; 2) if bond_order is not None, skip matches having the attachment bond of different order.
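
The splitting step, including the ring-bond restriction from note 1, can be sketched on a plain bond list. This is not the module's implementation: split_on_bond is a hypothetical helper, it skips the SMARTS matching entirely, and it assumes the staying and leaving atoms are directly bonded. The test molecule is benzoic acid without hydrogens, with ring atoms 1 to 6, the carboxyl carbon as atom 7, and the oxygens as atoms 8 and 9.

```python
def split_on_bond(bonds, staying_atom, leaving_atom):
    """Return the sorted atom indices on the staying side of a broken bond.

    Raises ValueError if the atoms are not bonded, or if the bond is part
    of a ring (breaking it would not split the structure in two).
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if leaving_atom not in neighbors.get(staying_atom, ()):
        raise ValueError("staying and leaving atoms are not bonded")

    # Depth-first walk from the staying atom that never crosses the broken bond.
    seen = {staying_atom}
    stack = [staying_atom]
    while stack:
        atom = stack.pop()
        for nbr in neighbors[atom]:
            if atom == staying_atom and nbr == leaving_atom:
                continue
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)

    # Reaching the leaving atom by another path means the bond is in a ring.
    if leaving_atom in seen:
        raise ValueError("ring bond: breaking it does not split the structure")
    return sorted(seen)


benzoic_acid = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
                (6, 7), (7, 8), (7, 9)]
phenyl = split_on_bond(benzoic_acid, staying_atom=6, leaving_atom=7)
benzoyl = split_on_bond(benzoic_acid, staying_atom=7, leaving_atom=9)
```

Breaking the bond between atoms 6 and 7 leaves the six ring atoms, which is the R-group atom list shown in the example return value above.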

Parameters
  • st (schrodinger.structure.Structure) – structure to analyze

  • smarts (str) – SMARTS pattern describing the functional group

  • leaving_atom_pos (index) – position of the leaving atom in the SMARTS pattern (1-based)

  • staying_atom_pos (index) – position of the attachment atom in the SMARTS pattern (1-based)

  • bond_order (int or NoneType) – If None (default), has no effect. Otherwise skip matches having R-group attachment bond of different order.

Returns

list of tuples (attachment atom, list of R-group atom indexes). If no matches satisfied all the requirements, the list may be empty. May include duplicate R-groups (R-group in this context is the substructure made of the newly found R-group atoms).

Return type

list

schrodinger.structutils.rgroup_enumerate.find_staying_atom(st, leaving_atom)

Given a picked “leaving” atom, determine which of the atoms it is bonded to is part of the larger molecule - the “staying” atom. All other atoms bound to the leaving atom are considered to be part of the leaving group.

Parameters

leaving_atom (schrodinger.structure._StructureAtom) – atom which defines the start of the leaving group

Returns

“staying atom”: the core atom bound to the leaving atom

Return type

schrodinger.structure._StructureAtom

schrodinger.structutils.rgroup_enumerate.get_dummy_filter()

Return a filter which has as criteria all the descriptors that can be computed by this module, along with their suggested default limiters (ranges).

Return type

schrodinger.ui.qt.filter_dialog_dir.filter_core.Filter

schrodinger.structutils.rgroup_enumerate.add_descriptors(st, filter_obj)

Add the descriptors required by a filter to a given Structure.

schrodinger.structutils.rgroup_enumerate.list_to_csv(fields)

Convert a list into a CSV string representation.

Parameters

fields (list) – list to convert

Return type

str
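
A conversion of this kind can be sketched with the standard csv module. The quoting behavior of the real function is not documented here, so the sketch assumes the csv module's default minimal quoting; list_to_csv_sketch is a hypothetical name.

```python
import csv
import io


def list_to_csv_sketch(fields):
    """Serialize one row to a CSV string, quoting fields only when needed."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    # The writer appends a line terminator; strip it to get a bare row.
    return buf.getvalue().rstrip("\r\n")


row = list_to_csv_sketch(["core", "methyl", "2,2-dimethylpropyl"])
# The comma inside the last title forces that field to be quoted.
parsed = next(csv.reader([row]))
```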

schrodinger.structutils.rgroup_enumerate.add_amide_constraints(st, frozen_set)

For amide bonds which have one atom frozen and the other not, add the necessary ct properties to tell fast3d to constrain the amides to the trans conformation.

Parameters
Returns

names of properties that were added

Return type

list of str

schrodinger.structutils.rgroup_enumerate.get_metals_and_neighbors(st)

Returns indices of metal atoms, and atoms bonded to them.

Parameters

st – Structure.

Returns

Set of atom indices in st.

Return type

set(int)

schrodinger.structutils.rgroup_enumerate.get_last_EZ_property_index(st)

Return the maximum index of the s_st_EZ_<index> properties of st. If there are no such properties, return 0.

Return type

int
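
The lookup can be sketched over a plain collection of property names. This is a minimal illustration, assuming the property names are available as strings; last_ez_index is a hypothetical helper that operates on a list rather than on a Structure.

```python
import re

# Matches names of the form s_st_EZ_<index> and captures the index.
_EZ_PATTERN = re.compile(r"^s_st_EZ_(\d+)$")


def last_ez_index(property_names):
    """Return the largest <index> among s_st_EZ_<index> names, or 0."""
    indices = []
    for name in property_names:
        match = _EZ_PATTERN.match(name)
        if match:
            indices.append(int(match.group(1)))
    return max(indices, default=0)


names = ["s_m_title", "s_st_EZ_1", "s_st_EZ_9", "s_st_EZ_10"]
highest = last_ez_index(names)
```

Comparing the indices as integers rather than as strings matters once they reach two digits: 10 is greater than 9 numerically, but "10" sorts before "9" as text.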

schrodinger.structutils.rgroup_enumerate.get_sources_from_r_labels(st, iters, prop='i_sd__MolFileRLabel')

Given a Structure and a list of iterables, return the “sources” data structure needed by RGroupEnumerator. The structure must have (some) atoms with the specified property; the values of this property must be in the range [1, len(iters)] and all the values in that range must be represented at least once. If this condition is not met, raise a ValueError.

Parameters
  • st (schrodinger.structure.Structure) – core Structure

  • iters (list) – list of iterables of structures

  • prop (str) – name of the atom property holding the R-group labels. The default is what comes from reading an SD file with “M RGP” fields using RDKit.

Returns

sources data structure. See RGroupEnumerator for details.

Return type

list of list
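
The label check and the shape of the result can be sketched with plain Python data. Several things here are assumptions: sources_from_labels is a hypothetical helper, the atom labels are passed as a dict instead of being read from atom properties, and label N is assumed to select the N-th iterable (iters[N - 1]).

```python
def sources_from_labels(atom_labels, iters):
    """Build a sources list from {atom_index: r_label} and a list of iterables."""
    by_label = {}
    for atom_index, label in sorted(atom_labels.items()):
        by_label.setdefault(label, []).append(atom_index)

    # Every label in [1, len(iters)] must appear, and no other label may.
    expected = set(range(1, len(iters) + 1))
    if set(by_label) != expected:
        raise ValueError(
            f"labels {sorted(by_label)} do not match 1..{len(iters)}")

    # One source per label: the iterable followed by its attachment atoms.
    return [[iters[label - 1]] + by_label[label] for label in sorted(expected)]


r1, r2 = ["Me", "Et"], ["F"]  # strings stand in for structures
sources = sources_from_labels({4: 1, 12: 1, 8: 2}, [r1, r2])
```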

schrodinger.structutils.rgroup_enumerate.convert_attachment_point(struct, bond_order=None)

Converts attachment point from methyl to dummy atom. If r-group fragment is ‘Null’ returns False, otherwise returns True. Null r-group has atom with atomic number -2 and growname ‘rpc1’.

Parameters
  • struct (structure.Structure) – structure object

  • bond_order (int or NoneType) – If None, has no effect. Otherwise return False if attachment bond is of different order.

Returns

True if conversion succeeded and False otherwise.

Return type

bool

schrodinger.structutils.rgroup_enumerate.get_attachment_point(st)

Identifies attachment point (dummy atom with a single neighbor) in the provided structure.

Parameters

st (schrodinger.structure.Structure) – Structure

Returns

Index of the dummy atom and order of the bond that joins the dummy to the rest.

Return type

(int, int)
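
The search can be sketched on plain data: look for an atom with atomic number zero that has exactly one bond, and report its index with that bond's order. The data layout and the helper name find_attachment_point are hypothetical, and raising ValueError when nothing is found is an assumption, since the behavior of the real function in that case is not documented here.

```python
def find_attachment_point(atomic_numbers, bonds):
    """Return (dummy_atom_index, bond_order) for a singly bonded dummy atom.

    atomic_numbers: {atom_index: atomic_number}
    bonds: list of (atom_a, atom_b, bond_order) tuples
    """
    for index, number in sorted(atomic_numbers.items()):
        if number != 0:
            continue  # not a dummy atom
        orders = [order for a, b, order in bonds if index in (a, b)]
        if len(orders) == 1:
            return index, orders[0]
    raise ValueError("no dummy atom with a single neighbor")


# Hypothetical fragment: dummy atom 1 single-bonded to a carbonyl carbon.
atoms = {1: 0, 2: 6, 3: 8}
fragment_bonds = [(1, 2, 1), (2, 3, 2)]
attachment = find_attachment_point(atoms, fragment_bonds)
```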

schrodinger.structutils.rgroup_enumerate.check_attachment_point(struct, bond_order=None)

Checks that provided structure contains attachment point.

Parameters

bond_order (int or NoneType) – If None, has no effect. Otherwise return False if attachment bond is of different order.

Returns

True if attachment point was found and False otherwise.

Return type

bool

schrodinger.structutils.rgroup_enumerate.create_fragment_structure(st, rgroup_data)

Creates r-group fragment structures from a given structure and a list of atoms that should be included in the r-group.

Parameters
  • st (structure.Structure) – structure object

  • rgroup_data (tuple) – r-group data that contains index of attachment atom and indices of R-group fragment atoms.

Returns

fragment structure

Return type

structure.Structure

schrodinger.structutils.rgroup_enumerate.RgroupEnumerator

alias of schrodinger.structutils.rgroup_enumerate.RGroupEnumerator

schrodinger.structutils.rgroup_enumerate.RgroupError

alias of schrodinger.structutils.rgroup_enumerate.RGroupError