schrodinger.application.matsci.reaction_workflow_enum_utils module

Utilities for enumerating reaction workflows.

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.application.matsci.reaction_workflow_enum_utils.Site(from_idx, to_idx, hash_idx, structure_idx)

Bases: tuple

from_idx

Alias for field number 0

hash_idx

Alias for field number 2

structure_idx

Alias for field number 3

to_idx

Alias for field number 1
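
The attributes above are listed alphabetically, which differs from the positional order in the constructor signature. A minimal stand-in sketch, assuming only the documented field order; it recreates the tuple shape with collections.namedtuple rather than importing the library class, and the index values are invented for illustration:

```python
from collections import namedtuple

# Stand-in with the documented field order; NOT imported from schrodinger.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])

# Hypothetical indices, for illustration only.
site = Site(from_idx=3, to_idx=7, hash_idx=1, structure_idx=2)

# Fields can be read by name or by position (field numbers 0 to 3).
by_name = (site.from_idx, site.to_idx, site.hash_idx, site.structure_idx)
by_position = (site[0], site[1], site[2], site[3])
```

Both tuples hold the same values, since each named attribute is an alias for one positional field.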

class schrodinger.application.matsci.reaction_workflow_enum_utils.Source(rgroup_st, site_idxs)

Bases: tuple

rgroup_st

Alias for field number 0

site_idxs

Alias for field number 1

class schrodinger.application.matsci.reaction_workflow_enum_utils.PositionedRGroup(hash_idx, sites, st, permutation)

Bases: tuple

hash_idx

Alias for field number 0

permutation

Alias for field number 3

sites

Alias for field number 1

st

Alias for field number 2

schrodinger.application.matsci.reaction_workflow_enum_utils.get_msg(site, msg)

schrodinger.application.matsci.reaction_workflow_enum_utils.get_core_idxs(st)

Return a set of atom indices for the core of the given structure.

Parameters

st (schrodinger.structure.Structure) – the structure

Return type

set

Returns

core atom indices

schrodinger.application.matsci.reaction_workflow_enum_utils.get_rgroup_sts_by_n_site(rgroup_file)

Return R-group structures binned by the number of attachment sites on the R-group.

Parameters

rgroup_file (str) – the R-group file

Return type

dict[int]=list[schrodinger.structure.Structure]

Returns

the R-group structures binned by the number of attachment sites on the R-group
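
The shape of the returned dictionary can be sketched as follows. This is an illustration of the binning idea only: the real function reads structures from a file and determines the site count itself, whereas the hypothetical helper bin_by_n_sites below takes invented (title, n_sites) pairs as stand-ins for structures.

```python
from collections import defaultdict


def bin_by_n_sites(rgroups):
    """Bin hypothetical (title, n_sites) records by n_sites, keeping input order."""
    binned = defaultdict(list)
    for title, n_sites in rgroups:
        binned[n_sites].append(title)
    return dict(binned)


# Invented records standing in for R-group structures.
rgroups = [('mono_a', 1), ('mono_b', 1), ('bi_c', 2)]
binned = bin_by_n_sites(rgroups)
```

Keys are the numbers of attachment sites, and each value lists the R-groups with that many sites.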

schrodinger.application.matsci.reaction_workflow_enum_utils.get_rge_sources(st, rgroup_sts, binned_sites, old_to_new)

Return a list of Source objects prepared for enumeration with the R-Group Enumeration module, together with a copy of the input structure with potentially duplicated to-atoms.

Parameters
  • st (schrodinger.structure.Structure) – the structure

  • rgroup_sts (list of structure.Structure) – the R-group structures

  • binned_sites (list of lists of Site) – the binned sites

  • old_to_new (dict) – a map of old-to-new atom indices

Return type

list, schrodinger.structure.Structure

Returns

a list containing Source, and a copy of the input structure with potentially duplicated to-atoms

schrodinger.application.matsci.reaction_workflow_enum_utils.update_index_properties(st, old_to_new)

Update the index properties of the given structure.

Parameters
  • st (structure.Structure) – the structure

  • old_to_new (dict) – a map of old-to-new atom indices

schrodinger.application.matsci.reaction_workflow_enum_utils.substitute(st, rgroups_dict, sites_dict)

Return a copy of the given structure that has been substituted with the given R-groups at the given sites.

Parameters
  • st (structure.Structure) – the structure on which the substitution is performed

  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

Raises

InvalidInput – if there is an issue

Return type

structure.Structure, dict, list[str]

Returns

the substituted structure, dictionary mapping old atom indices to new atom indices, R-group titles

schrodinger.application.matsci.reaction_workflow_enum_utils.get_updated_sites_dict(sites_dict, old_to_new)

Return the given sites dictionary with updated indices according to the given index map.

Parameters
  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • old_to_new (dict) – a map of old-to-new atom indices

Return type

dict

Returns

keys are integers which are the hash_idx values relating to R-groups, values are lists of Site
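
A rough sketch of what such an index update could look like. It assumes that both from_idx and to_idx are translated through old_to_new and that every index is present in the map; neither detail is confirmed by this page. Site is a local stand-in and remap_sites is a hypothetical name.

```python
from collections import namedtuple

# Local stand-in for the module's Site tuple.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])


def remap_sites(sites_dict, old_to_new):
    """Return a new dict whose sites have from_idx/to_idx mapped via old_to_new."""
    return {
        hash_idx: [
            site._replace(from_idx=old_to_new[site.from_idx],
                          to_idx=old_to_new[site.to_idx])
            for site in sites
        ]
        for hash_idx, sites in sites_dict.items()
    }


# Invented data: atoms 3 and 7 were renumbered to 2 and 5.
sites_dict = {1: [Site(3, 7, 1, 1)]}
updated = remap_sites(sites_dict, {3: 2, 7: 5})
```

The input dictionary is left unchanged because namedtuple._replace returns a new tuple.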

schrodinger.application.matsci.reaction_workflow_enum_utils.substitute_or_sculpt(st, rgroups_dict, sites_dict, rgroups_perm_dict=None)

Return structures created by substituting or sculpting the given R-groups onto the given structure at the given sites.

Parameters
  • st (structure.Structure) – the structure on which the substitutions or sculpts are performed

  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • rgroups_perm_dict (dict or None) – for hash indices that draw multi-site R-groups, specifies which permutation of the R-group sites to use when connecting the R-group to the structure; keys are integers which are the hash_idx values relating to sites, values are 1-based permutation indices; if None, all permutations are considered

Return type

dict[schrodinger.structure.Structure]=list[str]

Returns

keys are structures and values are lists of R-group titles added to the given structure

schrodinger.application.matsci.reaction_workflow_enum_utils.get_smiles(st)

Return the SMILES of the given metal complex.

Parameters

st (schrodinger.structure.Structure) – the metal complex

Return type

str

Returns

the SMILES string

exception schrodinger.application.matsci.reaction_workflow_enum_utils.InvalidInput

Bases: Exception

class schrodinger.application.matsci.reaction_workflow_enum_utils.Sites

Bases: object

Manage enumeration sites.

static getSites(sites_data, n_structures=1)

Return a list of Site from the given sites data.

Parameters
  • sites_data (list) – contains for each site a list of data [from_idx, to_idx, hash_idx] with an optional fourth item structure_idx

  • n_structures (int) – the number of structures, used if the given sites lack the optional fourth item

Raises

InvalidInput – if there is an issue

Return type

list

Returns

contains Site
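
The three-or-four item format of sites_data can be illustrated with a small parser. This is a sketch under stated assumptions, not the library's implementation: Site and InvalidInput are local stand-ins, parse_sites is a hypothetical name, and because this page does not say how n_structures fills in a missing structure_idx, the sketch leaves it as None rather than guessing.

```python
from collections import namedtuple

# Local stand-ins; NOT imported from schrodinger.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])


class InvalidInput(Exception):
    """Stand-in for the module's exception of the same name."""


def parse_sites(sites_data):
    """Convert [from_idx, to_idx, hash_idx(, structure_idx)] lists to Site tuples."""
    sites = []
    for item in sites_data:
        if len(item) == 3:
            from_idx, to_idx, hash_idx = item
            structure_idx = None  # n_structures handling is deliberately not modeled
        elif len(item) == 4:
            from_idx, to_idx, hash_idx, structure_idx = item
        else:
            raise InvalidInput(f'expected 3 or 4 items per site, got {len(item)}')
        sites.append(Site(from_idx, to_idx, hash_idx, structure_idx))
    return sites


# Invented data: one site without and one with the optional fourth item.
parsed = parse_sites([[1, 2, 1], [3, 4, 2, 1]])
```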

static validateSitesFormat(sites)

Validate the given sites format.

Parameters

sites (list) – contains Site

Raises

InvalidInput – if there is an issue

static delete_substitution_site_bonds(st, sites)

Delete bonds in the given structure that occur after the given substitution sites and return extracted core information.

Parameters
  • st (structure.Structure) – the structure, potentially modified in place

  • sites (list) – contains Site

Return type

structure.Structure, dict

Returns

the extracted core and old-to-new atom index map

static validateSitesData(sites, st)

Validate the given sites data.

Parameters
  • sites (list) – contains Site

  • st (structure.Structure) – the structure

Raises

InvalidInput – if there is an issue

static getBinnedSites(sites)

Get the sites binned firstly by structure_idx and secondly by hash_idx.

Parameters

sites (list) – contains Site

Return type

dict

Returns

keys are structure_idx, values are dicts whose keys are hash_idx and values are lists of Site
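
The two-level binning described above can be sketched with nested dictionaries. Site is a local stand-in, bin_sites is a hypothetical name, and the site values are invented; only the documented output shape is taken from this page.

```python
from collections import defaultdict, namedtuple

# Local stand-in for the module's Site tuple.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])


def bin_sites(sites):
    """Bin sites first by structure_idx, then by hash_idx."""
    binned = defaultdict(lambda: defaultdict(list))
    for site in sites:
        binned[site.structure_idx][site.hash_idx].append(site)
    # Convert to plain dicts so missing keys raise KeyError instead of being created.
    return {s_idx: dict(by_hash) for s_idx, by_hash in binned.items()}


sites = [Site(1, 2, 1, 1), Site(5, 6, 1, 1), Site(8, 9, 2, 1), Site(1, 2, 1, 2)]
binned_sites = bin_sites(sites)
```

Here binned_sites[1][1] holds the two sites that share structure index 1 and hash index 1.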

class schrodinger.application.matsci.reaction_workflow_enum_utils.SculptMultiRGroup(st, rgroups_dict, sites_dict, rgroups_perm_dict=None)

Bases: object

Use sculpting to enumerate using R-groups with multiple sites.

__init__(st, rgroups_dict, sites_dict, rgroups_perm_dict=None)

Parameters
  • st (structure.Structure) – the structure on which the sculpts are performed

  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • rgroups_perm_dict (dict or None) – for hash indices that draw multi-site R-groups, specifies which permutation of the R-group sites to use when connecting the R-group to the structure; keys are integers which are the hash_idx values relating to sites, values are 1-based permutation indices; if None, all permutations are considered

getRGroupSitesDict()

Return the R-group sites dictionary.

Return type

dict

Returns

keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

prepareStructure()

Prepare the structure for enumeration by removing the appropriate atoms that exist after the substitution sites.

getPositionedRGroups(hash_idx)

Enumerate the R-group for the given hash index on the available structure sites and put it in a favorable position for sculpting.

Parameters

hash_idx (int) – the hash index of the R-group

Return type

list[PositionedRGroup]

Returns

the positioned R-groups

addRGroups(positioned_r_groups)

Return a copy of the structure with the given R-groups added in their guess positions, bonded, and ready for minimization.

Parameters

positioned_r_groups (list[PositionedRGroup]) – the positioned R-groups

Return type

schrodinger.structure.Structure, list, dict

Returns

the structure, list containing atom indices to fix, dictionary mapping R-group site from indices to a numpy array of the xyz coordinates indicating the location to which it will be sculpted

addDummyAtomAnchors(st, restrain_pairs)

Return a copy of the given structure with dummy atom anchors added at the given locations.

Parameters
  • st (schrodinger.structure.Structure) – the structure

  • restrain_pairs (dict) – dictionary mapping R-group site from indices to a numpy array of the xyz coordinates indicating the location to which it will be sculpted

Return type

schrodinger.structure.Structure, dict

Returns

the structure, dictionary mapping R-group site from indices to dummy atom indices indicating the location to which it will be sculpted

sculptRGroups(positioned_r_groups)

Sculpt the given R-groups onto the structure.

Parameters

positioned_r_groups (list[PositionedRGroup]) – the positioned R-groups

Raises

InvalidInput – if there is an issue

Return type

schrodinger.structure.Structure

Returns

the structure

run()

Run it.

Return type

dict[schrodinger.structure.Structure]=list[str]

Returns

keys are structures and values are lists of R-group titles added to the given structure

class schrodinger.application.matsci.reaction_workflow_enum_utils.EnumerateReactionWorkflow(rxnwf_file, rgroup_files, sites, force_hetero_substitution=False, out_rep=None, base_name='enumerate_reaction_workflow', ext='.mae', dedup_smiles=True, logger=None)

Bases: object

Manage enumeration of a reaction workflow.

__init__(rxnwf_file, rgroup_files, sites, force_hetero_substitution=False, out_rep=None, base_name='enumerate_reaction_workflow', ext='.mae', dedup_smiles=True, logger=None)

Create an instance.

Parameters
  • rxnwf_file (str) – the reaction workflow file

  • rgroup_files (dict) – keys are hash_idx (see sites), values are file names

  • sites (list) – contains Site

  • force_hetero_substitution (bool) – if True then for hetero-enumeration do not additionally include homo-enumeration results

  • out_rep – if a string, it must be either the module constant parserutils.CENTROID or parserutils.ETA; if None, do nothing

  • base_name (str) – the base name to use in naming the enumerated output files

  • ext (str) – file name extension

  • dedup_smiles (bool) – if True then deduplicate by SMILES, needed because of multi-site R-group permutations

  • logger (logging.Logger) – the logger
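
To show how the arguments relate, here is a sketch that only assembles them. All file names and indices are hypothetical, Site is a local stand-in for the module's tuple, and the consistency check is an illustration rather than a description of what validate() does. The constructor call itself is left as a comment because it needs a Schrodinger installation.

```python
from collections import namedtuple

# Local stand-in for the module's Site tuple.
Site = namedtuple('Site', ['from_idx', 'to_idx', 'hash_idx', 'structure_idx'])

# Hypothetical sites: two hash indices on one structure.
sites = [Site(3, 7, 1, 1), Site(9, 12, 2, 1)]

# Keys are hash_idx values used by the sites; file names are invented.
rgroup_files = {1: 'rgroups_a.mae', 2: 'rgroups_b.mae'}

# Illustrative check: every hash_idx used by a site has an R-group file.
missing = {site.hash_idx for site in sites} - set(rgroup_files)

kwargs = dict(
    rxnwf_file='my_rxnwf_input.mae',  # hypothetical file name
    rgroup_files=rgroup_files,
    sites=sites,
    dedup_smiles=True,
)

# With the library available, the call would presumably look like:
# enumerator = EnumerateReactionWorkflow(**kwargs)
# enumerator.run()
```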

getNSitesByHashIdx(sites_dict, rgroup_file)

Return, as a dictionary, the numbers of attachment sites for which R-groups can be drawn from the given R-group file, for each hash index in the given sites dictionary.

Parameters
  • sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

  • rgroup_file (str) – the R-group file

Raises

InvalidInput – if there is an issue

Return type

dict[int]=list[int]

Returns

the numbers of attachment sites for which R-groups can be drawn from the given R-group file, for each hash index

getNCombinations(n_site_group, rgroup_file)

Return the number of combinations for drawing R-groups from the given file according to the given numbers of sites to draw for.

Parameters
  • n_site_group (tuple(int)) – contains, for each hash index, the number of sites being drawn for

  • rgroup_file (str) – the R-group file

Return type

int, str

Returns

the number of combinations, the error message if there is one

getTotalNCombinations(sites_dict)

For the given sites dictionary return the total number of ways of enumerating the R-groups in the corresponding R-group files among the available sites.

Parameters

sites_dict (dict) – keys are integers which are the hash_idx values relating to R-groups, values are lists of Site

Raises

InvalidInput – if there is an issue

Return type

int

Returns

the number of combinations

getNumberRXNWFFiles()

Return the maximum number of rxnwf files that will be enumerated.

Return type

int

Returns

the maximum number of rxnwf files that will be enumerated

validate()

Validate.

Raises

InvalidInput – if there is an issue

setRGroupStructures()

Set the R-group structures.

getStructures()

Generates structure dictionaries where keys are enumeration indices and values are structures.

Return type

dict

Returns

keys are enumeration indices, values are structure.Structure

isCompatible(rgroups_dict)

Return True if the given R-group structures are compatible with the reaction workflow.

Parameters

rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

Return type

bool

Returns

True if the given R-group structures are compatible

getPermutationIdxDict(rgroups_dict)

For each hash index create a list of 1-based permutation indices that specify how to connect the given R-group to the structure and return it as a dictionary.

Parameters

rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

Return type

dict[int]=list[int]

Returns

keys are integers which are the hash_idx values relating to sites, values are lists of 1-based permutation indices
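
One way to picture 1-based permutation indices is to number the orderings of an R-group's attachment sites. The sketch below assumes the numbering follows the order produced by itertools.permutations; the ordering actually used by the module is not stated on this page, and indexed_permutations is a hypothetical helper.

```python
from itertools import permutations


def indexed_permutations(n_sites):
    """Map 1-based permutation indices to orderings of n_sites R-group sites."""
    # ASSUMPTION: indices follow itertools.permutations order.
    return dict(enumerate(permutations(range(n_sites)), start=1))


# A two-site R-group can be connected in two ways.
perms = indexed_permutations(2)
```

An R-group with n sites has n! such orderings, which is why multi-site enumeration can produce duplicate products.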

getEnumeratedRXNWFSts(rgroups_dict, rgroups_perm_dict)

Return enumerated rxnwf structures.

Parameters
  • rgroups_dict (dict) – keys are integers which are the hash_idx values relating to sites, values are structure.Structure

  • rgroups_perm_dict (dict) – for hash indices that draw multi-site R-groups, specifies which permutation of the R-group sites to use when connecting the R-group to the structure; keys are integers which are the hash_idx values relating to sites, values are 1-based permutation indices

Return type

dict[schrodinger.structure.Structure]=list[str]

Returns

keys are the enumerated rxnwf structures and values are lists of their R-group titles

doEnumeration()

Do the enumeration.

Raises

InvalidInput – if there is an issue

Return type

int

Returns

the number of SMILES-unique files written
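
Deduplication by SMILES, as enabled by dedup_smiles, can be sketched as keeping one entry per SMILES string. The sketch assumes the strings are already canonical and that the first occurrence is the one kept; neither is confirmed here. The records and the helper name dedup_by_smiles are invented.

```python
def dedup_by_smiles(records):
    """Keep the first (smiles, name) record seen for each SMILES string."""
    unique = {}
    for smiles, name in records:
        # setdefault stores a name only if the SMILES key is new.
        unique.setdefault(smiles, name)
    return unique


# Invented records: the first two entries share a SMILES string.
records = [('CCO', 'enum_1'), ('CCO', 'enum_2'), ('CCN', 'enum_3')]
unique = dedup_by_smiles(records)
n_unique = len(unique)
```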

run()

Run it.

class schrodinger.application.matsci.reaction_workflow_enum_utils.EnumerateSwapMixin

Bases: object

Manage enumeration and swapping.

runEnumerateRXNWF(tag)

Run enumerate reaction workflow.

Parameters

tag (str) – either the REFERENCE or NOVEL module constant

Return type

set

Returns

the names of the enumerated reaction workflow files

runSwapFragments(enumerated_novel_files, reference_rxnwf_file)

Run swap fragments.

Parameters
  • enumerated_novel_files (set) – the names of the enumerated novel files

  • reference_rxnwf_file (str) – the reference reaction workflow file

Raises

InvalidInput – if there is an issue

Return type

set

Returns

the names of the enumerated reaction workflow files

getRXNWFInputFiles(job_name)

Return all reaction workflow input files.

Parameters

job_name (str) – the job name

Return type

set

Returns

all reaction workflow input files