schrodinger.application.transforms.enumerators module

class schrodinger.application.transforms.enumerators.SmilesTransformPair(smi: str, transform: str)

Bases: object

smi: str
transform: str
__init__(smi: str, transform: str) None
class schrodinger.application.transforms.enumerators.Fragment(core_smarts: str, max_mol_wt: float = inf, max_fragments: int = 500)

Bases: apache_beam.transforms.ptransform.PTransform

Fragment input molecules while maintaining a core substructure.

__init__(core_smarts: str, max_mol_wt: float = inf, max_fragments: int = 500)
Parameters
  • core_smarts – the core SMARTS string used for fragment matching

  • max_mol_wt – the maximum molecular weight of the fragments

  • max_fragments – the maximum number of fragments to generate

expand(inp_mols: apache_beam.pvalue.PCollection[rdkit.Chem.rdchem.Mol]) apache_beam.pvalue.PCollection[rdkit.Chem.rdchem.Mol]
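
A minimal usage sketch for Fragment, assuming that apache_beam, rdkit, and a Schrodinger installation are importable. The constructor arguments follow the signature documented above; the core SMARTS, the weight cutoff, the helper name run_fragment, and the pipeline step labels are illustrative choices, and the pipeline has not been verified against a Schrodinger release.

```python
# Hypothetical settings; only the keyword names come from the documented signature.
FRAGMENT_KWARGS = {
    "core_smarts": "c1ccccc1",  # assumed core: a benzene ring
    "max_mol_wt": 350.0,        # assumed molecular weight cutoff
    "max_fragments": 100,       # assumed cap on generated fragments
}


def run_fragment(smiles_list, **overrides):
    """Fragment the given SMILES strings and print each fragment as SMILES."""
    # Imported lazily so that this module loads without the optional packages.
    import apache_beam as beam
    from rdkit import Chem
    from schrodinger.application.transforms.enumerators import Fragment

    kwargs = {**FRAGMENT_KWARGS, **overrides}
    # Skip SMILES that RDKit cannot parse (MolFromSmiles returns None for them).
    mols = [m for m in map(Chem.MolFromSmiles, smiles_list) if m is not None]

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "CreateMols" >> beam.Create(mols)
            | "Fragment" >> Fragment(**kwargs)
            | "ToSmiles" >> beam.Map(lambda mol: Chem.MolToSmiles(mol))
            | "Print" >> beam.Map(print)
        )
```

Calling run_fragment(["c1ccccc1CCN"], max_fragments=10) would then print the fragments that retain the assumed benzene core.
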
class schrodinger.application.transforms.enumerators.SubstructureSubstitute(core_smarts: str, transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000, n_pair_bonds: int = 3, n_apply_bonds: int = 1)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns all matched-molecular-pair transformed molecules based on fragment-cliques in cliques_path, while protecting the core_smarts.

Because not every transform that one wants to apply can be expressed by cliques, combinations with the fragments in the optional transforms_path are always generated.

Note that only the first occurrence of the core_smarts in the molecule determines which part is protected. If more than one match is possible, the other matches are never protected and may therefore be modified.

__init__(core_smarts: str, transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000, n_pair_bonds: int = 3, n_apply_bonds: int = 1)
Parameters
  • core_smarts – the core SMARTS string used for fragment matching

  • transforms_path – optional JSON file of the list of transforms that are always to be applied (first). Set to None to use the default file.

  • cliques_path – the optional JSON file (if gzipped, its name must end with ‘gz’) of the fragment cliques used for enumeration. Set to None to use the default file.

  • sample_size – the maximum number of randomly sampled outputs to yield from the file given by cliques_path.

  • n_pair_bonds – the number of bonds beyond which atoms of the core are included for fragment matching (extension of the R-group atoms)

  • n_apply_bonds – the number of bonds beyond which atoms of the core are protected.

expand(pcoll)
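
The first-match rule and the n_apply_bonds cutoff can be pictured with a small graph sketch. This is one reading of the parameter descriptions above, not the library's implementation: the molecule is a plain adjacency dictionary, core matches are tuples of atom indices, and the helper names bond_distances and protected_core_atoms are hypothetical.

```python
from collections import deque


def bond_distances(adjacency, sources):
    """Breadth-first search: number of bonds from each atom to the nearest source."""
    dist = {atom: 0 for atom in sources}
    queue = deque(sources)
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def protected_core_atoms(adjacency, core_matches, n_apply_bonds=1):
    """Core atoms more than n_apply_bonds bonds away from any non-core atom.

    Assumption: only the first core match is considered, as described above.
    """
    if not core_matches:
        return set()
    core = set(core_matches[0])       # later matches are ignored
    rgroup = set(adjacency) - core    # everything outside the first match
    dist = bond_distances(adjacency, sorted(rgroup))
    return {a for a in core if dist.get(a, float("inf")) > n_apply_bonds}


# Linear chain 0-1-2-3-4; the core pattern matches atoms (0, 1, 2) first.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
matches = [(0, 1, 2), (2, 3, 4)]
print(protected_core_atoms(chain, matches, n_apply_bonds=1))
```

In this toy chain, atom 2 is one bond from the R-group, so with n_apply_bonds=1 it stays modifiable while atoms 0 and 1 are protected; atoms 3 and 4 belong only to the second match and are not protected at all.
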
class schrodinger.application.transforms.enumerators.CorelessSubstitute(transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns unique sanitized products after applying the transformations based on fragment-cliques in cliques_path.

Because not every transform that one wants to apply can be expressed by cliques, combinations with the fragments in the optional transforms_path are always generated.

__init__(transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000)
Parameters
  • transforms_path – optional JSON file of the list of transforms that are always to be applied (first). Set to None to use the default file.

  • cliques_path – the optional JSON file (if gzipped, its name must end with ‘gz’) of the fragment cliques used for enumeration. Set to None to use the default file.

  • sample_size – the maximum number of randomly sampled outputs to yield from the file given by cliques_path.

expand(pcoll)
class schrodinger.application.transforms.enumerators.Substitute(transform_smarts: apache_beam.pvalue.PCollection[str])

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns unique standardized molecules after applying the transform_smarts.

__init__(transform_smarts: apache_beam.pvalue.PCollection[str])
Parameters

  • transform_smarts – the reaction SMARTS for the transformation

expand(pcoll)
class schrodinger.application.transforms.enumerators.Decorate(core_smarts: str, rgroups: apache_beam.pvalue.PCollection[schrodinger.application.transforms.enumerators.RGroup], property_ranges: Optional[Dict[str, List[float]]] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that enumerates unique sanitized molecules formed by replacing a hydrogen on a C, N, or O atom in the ligand with an R-group that was attached to an Ar.

__init__(core_smarts: str, rgroups: apache_beam.pvalue.PCollection[schrodinger.application.transforms.enumerators.RGroup], property_ranges: Optional[Dict[str, List[float]]] = None)
Parameters
  • core_smarts – the SMARTS that the products should have and needs to be part of the input molecule

  • rgroups – the R-groups to use for decoration

  • property_ranges – the optional property ranges for the products

expand(pcoll)
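
The property_ranges argument has the type Dict[str, List[float]]. A plausible reading, stated here as an assumption, is that each list holds an inclusive [minimum, maximum] pair for the named property. The sketch below applies such a filter to precomputed property values; the property names and the helper within_ranges are hypothetical and not part of the documented API.

```python
def within_ranges(properties, property_ranges=None):
    """Return True if every ranged property lies inside its [low, high] bounds.

    Assumptions: each range is a two-element list, bounds are inclusive, and
    a product lacking a ranged property is rejected.
    """
    if not property_ranges:
        return True  # no ranges given: nothing to filter on
    for name, (low, high) in property_ranges.items():
        value = properties.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


# Hypothetical property names and values for two candidate products.
ranges = {"mol_wt": [200.0, 500.0], "logp": [-1.0, 5.0]}
candidates = [
    {"mol_wt": 312.4, "logp": 2.1},
    {"mol_wt": 640.9, "logp": 3.3},
]
kept = [c for c in candidates if within_ranges(c, ranges)]
print(kept)
```
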
class schrodinger.application.transforms.enumerators.Synthesize(core_smarts: str, depth: int = 1, dedupe_routes: bool = False, max_products_per_route: int = 100, max_tries_per_route: Optional[int] = None, reagent_libraries: Optional[List[pathlib.Path]] = None, seed: Optional[int] = None)

Bases: apache_beam.transforms.ptransform.PTransform

Enumerates unique sanitized molecules from a combinatorial synthesis using routes based on the input molecules using the default reaction dictionary and reagent library.

If the maximum number of products is less than the total number of combinations, the route synthesis will be done by random sampling, which may yield fewer products than requested; otherwise, a systematic set of unique products will be yielded.

__init__(core_smarts: str, depth: int = 1, dedupe_routes: bool = False, max_products_per_route: int = 100, max_tries_per_route: Optional[int] = None, reagent_libraries: Optional[List[pathlib.Path]] = None, seed: Optional[int] = None)
Parameters
  • core_smarts – the SMARTS that the products should have and needs to be part of the input molecule

  • depth – the maximum depth of the retrosynthetic routes to use

  • dedupe_routes – whether to deduplicate the routes

  • max_products_per_route – the maximum number of products to try to synthesize for each input molecule per route. Use 0 to force an exhaustive synthesis.

  • max_tries_per_route – the maximum number of tries to synthesize products for each input molecule per route. If None, the number of tries is automatically determined. If 0, the synthesis will be exhaustive.

  • reagent_libraries – the optional reagent libraries to use. If None or an empty list, the default reagent library will be used.

  • seed – seed for random number generator. If None, the random number generator will not be seeded.

expand(pcoll)
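
The choice between random sampling and systematic enumeration described for Synthesize can be illustrated on plain reagent lists. This is a sketch under stated assumptions rather than the library's algorithm: a "product" is just a tuple of reagents, reagents within a list are assumed unique, the function name combine is hypothetical, and the fallback of ten tries per requested product is an invented stand-in for the automatic choice made when max_tries_per_route is None.

```python
import itertools
import math
import random


def combine(reagent_lists, max_products=100, max_tries=None, seed=None):
    """Return unique reagent combinations, systematically or by sampling."""
    total = math.prod(len(reagents) for reagents in reagent_lists)
    # 0 for either limit forces an exhaustive run, as does a limit that
    # already covers every combination.
    if max_products == 0 or max_tries == 0 or max_products >= total:
        return list(itertools.product(*reagent_lists))

    rng = random.Random(seed)  # seed=None leaves the generator unseeded
    if max_tries is None:
        max_tries = 10 * max_products  # hypothetical default, not the library's
    products = {}  # dict keys keep insertion order and drop repeats
    for _ in range(max_tries):
        combo = tuple(rng.choice(reagents) for reagents in reagent_lists)
        products.setdefault(combo, None)
        if len(products) == max_products:
            break
    # Repeated draws can exhaust max_tries first, giving fewer products.
    return list(products)


acids = ["acid_a", "acid_b"]
amines = ["amine_x", "amine_y", "amine_z"]
print(len(combine([acids, amines], max_products=0)))
print(combine([acids, amines], max_products=3, seed=7))
```

Because repeated random draws are discarded, the sampled branch can return fewer than max_products combinations, which mirrors the caveat in the class description.
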
class schrodinger.application.transforms.enumerators.EnumerateRoutes(core_smarts: str, depth: int = 1, dedup: bool = False)

Bases: apache_beam.transforms.ptransform.PTransform

Enumerates synthesis routes for core_smarts containing molecules using the default reaction dictionary.

If dedup is True, the routes will be deduplicated based on their one-line representation.

__init__(core_smarts: str, depth: int = 1, dedup: bool = False)
expand(pcoll)
class schrodinger.application.transforms.enumerators.DeduplicateRoutes(label: Optional[str] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that deduplicates the route nodes based on their one-line representation.

expand(route_nodes)
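
Deduplicating by a one-line representation amounts to keeping one item per string key. The sketch below shows the idea on plain Python objects; the tuple route format and the one_line helper are hypothetical stand-ins, and nothing here reflects how the PTransform is implemented internally.

```python
def dedupe_by_one_line(routes, one_line=str):
    """Keep the first route seen for each one-line representation."""
    first_seen = {}
    for route in routes:
        # setdefault stores the route only if its key is new.
        first_seen.setdefault(one_line(route), route)
    return list(first_seen.values())


def one_line(route):
    """Hypothetical one-line form: reaction name followed by sorted reagents."""
    reaction, *reagents = route
    return reaction + "(" + ", ".join(sorted(reagents)) + ")"


# Hypothetical routes: (reaction name, reagent, reagent).
routes = [
    ("amide_coupling", "acid", "amine"),
    ("suzuki", "boronic_acid", "aryl_halide"),
    ("amide_coupling", "amine", "acid"),  # same route, reagents reordered
]
unique_routes = dedupe_by_one_line(routes, one_line)
print([one_line(r) for r in unique_routes])
```
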
class schrodinger.application.transforms.enumerators.EvaluateRoutes(max_products_per_route: int = 100, max_tries_per_route: Optional[int] = None, reagent_libraries: Optional[List[pathlib.Path]] = None, seed: Optional[int] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns unique sanitized molecules after applying the synthesis reactions to the input molecules.

The products may be generated by random sampling if max_products_per_route is not zero and the number of products is less than the total number of combinations; otherwise, a systematic set of unique products will be yielded.

__init__(max_products_per_route: int = 100, max_tries_per_route: Optional[int] = None, reagent_libraries: Optional[List[pathlib.Path]] = None, seed: Optional[int] = None)
expand(pcoll)
class schrodinger.application.transforms.enumerators.SystematicSynthesize(core_smarts: str, max_routes: Optional[int] = None, max_reactions_per_route: Optional[int] = None)

Bases: apache_beam.transforms.ptransform.PTransform

Enumerates unique sanitized molecules from a combinatorial synthesis of unique routes of depth 1, based on the input molecules with fixed frozen heavy core atoms using the default reaction dictionary and reagent library.

The number of routes used may be limited by max_routes, and the number of reactions per route may be limited by max_reactions_per_route.

The route synthesis will be done systematically.

__init__(core_smarts: str, max_routes: Optional[int] = None, max_reactions_per_route: Optional[int] = None)
expand(pcoll)