schrodinger.application.matsci.reaction_workflow_enum_utils module

Utilities for enumerating reaction workflows.

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.application.matsci.reaction_workflow_enum_utils.Site(from_idx, to_idx, hash_idx, structure_idx)

Bases: tuple

from_idx

Alias for field number 0

hash_idx

Alias for field number 2

structure_idx

Alias for field number 3

to_idx

Alias for field number 1
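
Since Site derives from tuple and exposes named fields, it presumably behaves like a collections.namedtuple. The sketch below uses a local stand-in with the documented field order rather than importing the Schrodinger module, so the Site defined here is an illustration only.

```python
from collections import namedtuple

# Stand-in that mirrors the documented field order; not the Schrodinger class.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])

site = Site(from_idx=3, to_idx=7, hash_idx=1, structure_idx=1)

# Named access and positional access refer to the same fields.
by_name = (site.from_idx, site.to_idx, site.hash_idx, site.structure_idx)
by_position = (site[0], site[1], site[2], site[3])
```

Note that the documented alias numbers follow the constructor order (from_idx, to_idx, hash_idx, structure_idx), not the alphabetical order in which the fields are listed above.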

class schrodinger.application.matsci.reaction_workflow_enum_utils.Source(rgroup_st, site_idxs)

Bases: tuple

rgroup_st

Alias for field number 0

site_idxs

Alias for field number 1

class schrodinger.application.matsci.reaction_workflow_enum_utils.PositionedRGroup(hash_idx, sites, st, permutation)

Bases: tuple

hash_idx

Alias for field number 0

permutation

Alias for field number 3

sites

Alias for field number 1

st

Alias for field number 2

schrodinger.application.matsci.reaction_workflow_enum_utils.get_msg(site, msg)

schrodinger.application.matsci.reaction_workflow_enum_utils.get_core_idxs(st)

Return a set of atom indices for the core of the given structure.

Parameters:

st (schrodinger.structure.Structure) – the structure

Return type:

set

Returns:

core atom indices

schrodinger.application.matsci.reaction_workflow_enum_utils.get_rgroup_sts_by_n_site(rgroup_file)

Return R-group structures binned by the number of attachment sites on the R-group.

Parameters:

rgroup_file (str) – the R-group file

Return type:

dict[int]=list[schrodinger.structure.Structure]

Returns:

the R-group structures binned by the number of attachment sites on the R-group
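
The binning described above can be sketched with plain Python data. This is a minimal illustration, not the module's implementation: the function bin_by_n_site and the (title, site count) pairs are hypothetical stand-ins for Structure objects read from an R-group file.

```python
from collections import defaultdict

# Hypothetical stand-in data: (R-group title, number of attachment sites).
# In the real module the R-groups are Structure objects read from a file.
rgroups = [
    ('methyl', 1),
    ('phenyl', 1),
    ('bipyridine', 2),
    ('ethylene', 2),
    ('terpyridine', 3),
]


def bin_by_n_site(items):
    """Group titles by their attachment-site count, preserving input order."""
    binned = defaultdict(list)
    for title, n_site in items:
        binned[n_site].append(title)
    return dict(binned)


rgroups_by_n_site = bin_by_n_site(rgroups)
```

The result has the same shape as the documented return type: integer keys (site counts) mapping to lists of R-groups.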

schrodinger.application.matsci.reaction_workflow_enum_utils.get_rge_sources(st, rgroup_sts, binned_sites, old_to_new)

Return a list of Source objects prepared for enumeration with the R-Group Enumeration module, together with a copy of the input structure with potentially duplicated to-atoms.

Parameters:
  • st (schrodinger.structure.Structure) – the structure

  • rgroup_sts (list of structure.Structure) – the R-group structures

  • binned_sites (list of lists of Site) – the binned sites

  • old_to_new (dict) – a map of old-to-new atom indices

Return type:

list, schrodinger.structure.Structure

Returns:

contains Source, a copy of the input structure with potentially duplicated to-atoms

schrodinger.application.matsci.reaction_workflow_enum_utils.update_index_properties(st, old_to_new)

Update the index properties of the given structure.

Parameters:
  • st (structure.Structure) – the structure

  • old_to_new (dict) – a map of old-to-new atom indices

class schrodinger.application.matsci.reaction_workflow_enum_utils.RgroupEnumerator(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=())

Bases: RGroupEnumerator

See parent class.

schrodinger.application.matsci.reaction_workflow_enum_utils.substitute(st, rgroups_dict, sites_dict)

Return a copy of the given structure that has been substituted with the given R-groups at the given sites.

Parameters:
  • st (structure.Structure) – the structure on which the substitution is performed

  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

Raises:

InvalidInput – if there is an issue

Return type:

structure.Structure, dict, list[str]

Returns:

the substituted structure, dictionary mapping old atom indices to new atom indices, R-group titles

schrodinger.application.matsci.reaction_workflow_enum_utils.get_updated_sites_dict(sites_dict, old_to_new)

Return the given sites dictionary with updated indices according to the given index map.

Parameters:
  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • old_to_new (dict) – a map of old-to-new atom indices

Return type:

dict

Returns:

keys are integers which are the hash_idx values relating to R-groups, values are lists of Site
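
A minimal sketch of such a remapping, assuming that only from_idx and to_idx are atom indices and that every index appears in the map. The Site stand-in and the function remap_sites are hypothetical and do not come from the module.

```python
from collections import namedtuple

# Stand-in that mirrors the documented Site fields; not the Schrodinger class.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])


def remap_sites(sites_dict, old_to_new):
    """Return a new sites dict with from_idx and to_idx mapped through old_to_new."""
    return {
        hash_idx: [
            # Assumption: hash_idx and structure_idx are not atom indices
            # and therefore stay unchanged.
            site._replace(from_idx=old_to_new[site.from_idx],
                          to_idx=old_to_new[site.to_idx])
            for site in sites
        ]
        for hash_idx, sites in sites_dict.items()
    }


sites_dict = {1: [Site(2, 5, 1, 1)], 2: [Site(4, 9, 2, 1)]}
old_to_new = {2: 1, 4: 3, 5: 4, 9: 6}
updated = remap_sites(sites_dict, old_to_new)
```

Building new tuples with _replace leaves the input dictionary untouched, which matches the documented behavior of returning an updated dictionary rather than modifying one in place.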

schrodinger.application.matsci.reaction_workflow_enum_utils.substitute_or_sculpt(st, rgroups_dict, sites_dict, rgroups_perm_dict=None)

Return structures created by substituting or sculpting the given R-groups onto the given structure at the given sites.

Parameters:
  • st (structure.Structure) – the structure on which the substitutions or sculpts are performed

  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • rgroups_perm_dict (dict or None) – for hash indices that draw multi-site R-groups, specifies which permutation of the R-group sites to use when connecting the R-group to the structure; keys are integers which are the hash_idx values relating to sites, values are 1-based permutation indices; if None, all permutations are considered

Return type:

dict[schrodinger.structure.Structure]=list[str], list[str]

Returns:

keys are structures and values are lists of R-group titles added to the given structure, the list contains titles of enumerated structures that have ring-spears

schrodinger.application.matsci.reaction_workflow_enum_utils.get_smiles(st)

Return the SMILES of the given metal complex.

Parameters:

st (schrodinger.structure.Structure) – the metal complex

Return type:

str

Returns:

the SMILES string

exception schrodinger.application.matsci.reaction_workflow_enum_utils.InvalidInput

Bases: Exception

exception schrodinger.application.matsci.reaction_workflow_enum_utils.RingSpearsOutput

Bases: Exception

class schrodinger.application.matsci.reaction_workflow_enum_utils.Sites

Bases: object

Manage enumeration sites.

static getSites(sites_data, n_structures=1)

Return a list of Site from the given sites data.

Parameters:
  • sites_data (list) – contains for each site a list of data [from_idx, to_idx, hash_idx] with an optional fourth item structure_idx

  • n_structures (int) – the number of structures, used if the given sites lack the optional fourth item

Raises:

InvalidInput – if there is an issue

Return type:

list

Returns:

contains Site
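
The sketch below shows one way the sites data could be turned into Site tuples. It rests on an assumption the documentation does not confirm: a site without the optional fourth item is taken to apply to every structure from 1 to n_structures. The function build_sites is hypothetical, and ValueError stands in for InvalidInput.

```python
from collections import namedtuple

# Stand-in that mirrors the documented Site fields; not the Schrodinger class.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])


def build_sites(sites_data, n_structures=1):
    """Build Site tuples from [from_idx, to_idx, hash_idx(, structure_idx)] lists."""
    sites = []
    for data in sites_data:
        if len(data) == 4:
            sites.append(Site(*data))
        elif len(data) == 3:
            # Assumption: without structure_idx the site applies to all structures.
            for structure_idx in range(1, n_structures + 1):
                sites.append(Site(*data, structure_idx))
        else:
            # ValueError is a stand-in for the module's InvalidInput.
            raise ValueError(f'expected 3 or 4 items per site, got {len(data)}')
    return sites


sites = build_sites([[3, 7, 1], [5, 9, 2, 2]], n_structures=2)
```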

static validateSitesFormat(sites)

Validate the given sites format.

Parameters:

sites (list) – contains Site

Raises:

InvalidInput – if there is an issue

static delete_substitution_site_bonds(st, sites)

Delete bonds in the given structure that occur after the given substitution sites and return extracted core information.

Parameters:
  • st (structure.Structure) – the structure, potentially modified in place

  • sites (list) – contains Site

Return type:

structure.Structure, dict

Returns:

the extracted core and old-to-new atom index map

static validateSitesData(sites, st)

Validate the given sites data.

Parameters:
  • sites (list) – contains Site

  • st (structure.Structure) – the structure

Raises:

InvalidInput – if there is an issue

static getBinnedSites(sites)

Get the sites binned firstly by structure_idx and secondly by hash_idx.

Parameters:

sites (list) – contains Site

Return type:

dict

Returns:

keys are structure_idx, values are dicts whose keys are hash_idx and values are lists of Site
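
The two-level binning can be sketched as a nested dictionary. The Site stand-in and the function bin_sites below are illustrative assumptions, not the module's code.

```python
from collections import namedtuple

# Stand-in that mirrors the documented Site fields; not the Schrodinger class.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])


def bin_sites(sites):
    """Bin sites first by structure_idx, then by hash_idx."""
    binned = {}
    for site in sites:
        by_hash = binned.setdefault(site.structure_idx, {})
        by_hash.setdefault(site.hash_idx, []).append(site)
    return binned


sites = [
    Site(3, 7, 1, 1),
    Site(4, 8, 1, 1),
    Site(5, 9, 2, 1),
    Site(3, 7, 1, 2),
]
binned = bin_sites(sites)
```

Here binned[1][1] holds the two sites of structure 1 that share hash index 1, which is the grouping an R-group with two attachment sites would be drawn for.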

class schrodinger.application.matsci.reaction_workflow_enum_utils.SculptMultiRGroup(st, rgroups_dict, sites_dict, rgroups_perm_dict=None, xtb=False)

Bases: object

Use sculpting to enumerate using R-groups with multiple sites.

__init__(st, rgroups_dict, sites_dict, rgroups_perm_dict=None, xtb=False)

Parameters:
  • st (structure.Structure) – the structure on which the sculpts are performed

  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • rgroups_perm_dict (dict or None) – for hash indices that draw multi-site R-groups, specifies which permutation of the R-group sites to use when connecting the R-group to the structure; keys are integers which are the hash_idx values relating to sites, values are 1-based permutation indices; if None, all permutations are considered

  • xtb (bool) – if True then use xtb optimization for sculpting otherwise use a classical force field

getRGroupSitesDict()

Return the R-group sites dictionary.

Return type:

dict

Returns:

keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

prepareStructure()

Prepare the structure for enumeration by removing the appropriate atoms that exist after the substitution sites.

getPositionedRGroups(hash_idx)

Enumerate the R-group for the given hash index over the available structure sites and put it in a favorable position for sculpting.

Parameters:

hash_idx (int) – the hash index of the R-group

Return type:

list[PositionedRGroup]

Returns:

the positioned R-groups

addRGroups(positioned_r_groups)

Return a copy of the structure with the given R-groups added in their guess positions, bonded, and ready for minimization.

Parameters:

positioned_r_groups (list[PositionedRGroup]) – the positioned R-groups

Return type:

schrodinger.structure.Structure, list, dict, list

Returns:

the structure, list containing atom indices to fix, dictionary mapping R-group site from indices to a numpy array of the xyz coordinates indicating the location to which it will be sculpted, list of pair tuples of atom indices of bonds created

addDummyAtomAnchors(st, restrain_pairs)

Return a copy of the given structure with dummy atom anchors added at the given locations.

Parameters:
  • st (schrodinger.structure.Structure) – the structure

  • restrain_pairs (dict) – dictionary mapping R-group site from indices to a numpy array of the xyz coordinates indicating the location to which it will be sculpted

Return type:

schrodinger.structure.Structure, dict

Returns:

the structure, dictionary mapping R-group site from indices to dummy atom indices indicating the location to which it will be sculpted

sculptRGroupsWClassical(positioned_r_groups)

Sculpt the given R-groups onto the structure using a classical force field.

Parameters:

positioned_r_groups (list[PositionedRGroup]) – the positioned R-groups

Raises:

InvalidInput – if there is an issue

Return type:

schrodinger.structure.Structure

Returns:

the structure

sculptRGroupsWXtb(positioned_r_groups)

Sculpt the given R-groups onto the structure using xTB.

Parameters:

positioned_r_groups (list[PositionedRGroup]) – the positioned R-groups

Raises:

InvalidInput – if there is an issue

Return type:

schrodinger.structure.Structure

Returns:

the structure

sculptRGroups(positioned_r_groups)

Sculpt the given R-groups onto the structure.

Parameters:

positioned_r_groups (list[PositionedRGroup]) – the positioned R-groups

Return type:

schrodinger.structure.Structure

Returns:

the structure

sculptAllRGroups()

Sculpt all R-groups onto the structure.

Return type:

dict[schrodinger.structure.Structure]=list[str]

Returns:

keys are structures and values are lists of R-group titles added to the given structure

run()

Run it.

Return type:

dict[schrodinger.structure.Structure]=list[str]

Returns:

keys are structures and values are lists of R-group titles added to the given structure

class schrodinger.application.matsci.reaction_workflow_enum_utils.EnumerateReactionWorkflow(rxnwf_file, rgroup_files, sites, force_hetero_substitution=False, out_rep=None, base_name='enumerate_reaction_workflow', ext='.mae', dedup_smiles=True, logger=None)

Bases: object

Manage enumeration of a reaction workflow.

__init__(rxnwf_file, rgroup_files, sites, force_hetero_substitution=False, out_rep=None, base_name='enumerate_reaction_workflow', ext='.mae', dedup_smiles=True, logger=None)

Create an instance.

Parameters:
  • rxnwf_file (str) – the reaction workflow file

  • rgroup_files (dict) – keys are hash_idx (see sites), values are file names

  • sites (list) – contains Site

  • force_hetero_substitution (bool) – if True then for hetero-enumeration do not additionally include homo-enumeration results

  • out_rep (str or None) – if a string then must be either module constant parserutils.CENTROID or parserutils.ETA, if None then do nothing

  • base_name (str) – the base name to use in naming the enumerated output files

  • ext (str) – file name extension

  • dedup_smiles (bool) – if True then deduplicate by SMILES, needed because of multi-site R-group permutations

  • logger (logging.Logger) – the logger
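
The dedup_smiles option exists because different permutations of a multi-site R-group can produce the same molecule. The sketch below illustrates the general idea of deduplicating by SMILES; the function dedup_by_smiles and the (SMILES, title) pairs are hypothetical, and the sketch assumes the SMILES strings are already canonical so that identical molecules compare equal.

```python
def dedup_by_smiles(entries):
    """Keep the first (smiles, title) entry seen for each SMILES string."""
    unique = {}
    for smiles, title in entries:
        # setdefault only stores the title the first time a SMILES is seen.
        unique.setdefault(smiles, title)
    return list(unique.items())


# Hypothetical data: two permutations that yield the same molecule.
entries = [
    ('Cc1ccccc1', 'perm_1'),
    ('Cc1ccccc1', 'perm_2'),
    ('CCc1ccccc1', 'perm_1'),
]
unique_entries = dedup_by_smiles(entries)
```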

getNSitesByHashIdx(sites_dict, rgroup_file)

Return a dictionary giving, for each hash index in the given sites dictionary, the numbers of attachment sites for which R-groups can be drawn from the given R-group file.

Parameters:
  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • rgroup_file (str) – the R-group file

Raises:

InvalidInput – if there is an issue

Return type:

dict[int]=list[int]

Returns:

for each hash index, the numbers of attachment sites for which R-groups can be drawn from the given R-group file

getNCombinations(n_site_group, rgroup_file)

Return the number of combinations for drawing R-groups from the given file according to the given numbers of sites to draw for.

Parameters:
  • n_site_group (tuple(int)) – contains, for each hash index, the number of sites being drawn for

  • rgroup_file (str) – the R-group file

Return type:

int, str

Returns:

the number of combinations, the error message if there is one

getTotalNCombinations(sites_dict)

For the given sites dictionary return the total number of ways of enumerating the R-groups in the corresponding R-group files among the available sites.

Parameters:

sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

Raises:

InvalidInput – if there is an issue

Return type:

int

Returns:

the number of combinations

getNumberRXNWFFiles()

Return the maximum number of rxnwf files that will be enumerated.

Return type:

int

Returns:

the maximum number of rxnwf files that will be enumerated

validate()

Validate.

Raises:

InvalidInput – if there is an issue

setRGroupStructures()

Set the R-group structures.

getStructures()

Generates structure dictionaries where keys are enumeration indices and values are structures.

Return type:

dict

Returns:

keys are enumeration indices, values are structure.Structure

isCompatible(rgroups_dict)

Return True if the given R-group structures are compatible with the reaction workflow.

Parameters:

rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

Return type:

bool

Returns:

True if the given R-group structures are compatible

getPermutationIdxDict(rgroups_dict)

For each hash index create a list of 1-based permutation indices that specify how to connect the given R-group to the structure and return it as a dictionary.

Parameters:

rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

Return type:

dict[int]=list[int]

Returns:

keys are integers which are the hash_idx values relating to sites, values are lists of 1-based permutation indices
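
To illustrate what a 1-based permutation index can refer to, the sketch below numbers the orderings of an R-group's attachment sites. The function permutation_indices is hypothetical, and the ordering produced by itertools.permutations is an assumption; the module may number its permutations differently.

```python
from itertools import permutations


def permutation_indices(n_sites):
    """Map 1-based permutation indices to orderings of 1-based site labels."""
    site_labels = range(1, n_sites + 1)
    # Assumption: permutations are numbered in itertools.permutations order.
    return dict(enumerate(permutations(site_labels), start=1))


two_site = permutation_indices(2)
three_site = permutation_indices(3)
```

A two-site R-group has two ways to connect to a pair of structure sites and a three-site R-group has six, which is why deduplication by SMILES is offered for multi-site enumerations.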

getEnumeratedRXNWFSts(rgroups_dict, rgroups_perm_dict)

Return enumerated rxnwf structures.

Parameters:
  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • rgroups_perm_dict (dict) – for hash indices that draw multi-site R-groups, specifies which permutation of the R-group sites to use when connecting the R-group to the structure; keys are integers which are the hash_idx values relating to sites, values are 1-based permutation indices

Return type:

dict[schrodinger.structure.Structure]=list[str], list[str]

Returns:

keys are the enumerated rxnwf structures and values are lists of their R-group titles, the list contains titles of enumerated structures that have ring-spears

doEnumeration()

Do the enumeration.

Raises:

InvalidInput – if there is an issue

Return type:

int

Returns:

the number of SMILES-unique files written

run()

Run it.

Raises:

RingSpearsOutput – if ring-spears are present in the output

class schrodinger.application.matsci.reaction_workflow_enum_utils.EnumerateSwapMixin

Bases: object

Manage enumeration and swapping.

runEnumerateRXNWF(tag)

Run enumerate reaction workflow.

Parameters:

tag (str) – either the REFERENCE or NOVEL module constant

Return type:

set, dict[str]=list[str]

Returns:

the names of the enumerated reaction workflow files, the dict keys are names of enumerated reaction workflow files and values are lists of structure titles for structures that have ring-spears

runSwapFragments(enumerated_novel_files, reference_rxnwf_file)

Run swap fragments.

Parameters:
  • enumerated_novel_files (set) – the names of the enumerated novel files

  • reference_rxnwf_file (str) – the reference reaction workflow file

Raises:

InvalidInput – if there is an issue

Return type:

set

Returns:

the names of the enumerated reaction workflow files

getRXNWFInputFiles(job_name)

Return all reaction workflow input files.

Parameters:

job_name (str) – the job name

Return type:

set, dict[str]=list[str]

Returns:

all reaction workflow input files, the dict keys are names of enumerated reaction workflow files and values are lists of structure titles for structures that have ring-spears