schrodinger.application.transforms.enumerators module

class schrodinger.application.transforms.enumerators.SmilesTransformPair(smi: str, transform: str)

Bases: object

smi: str
transform: str
__init__(smi: str, transform: str) → None
class schrodinger.application.transforms.enumerators.Fragment(core_smarts: str, max_mol_wt: float = inf, max_fragments: int = 500)

Bases: apache_beam.transforms.ptransform.PTransform

Fragment input molecules while maintaining a core substructure.

__init__(core_smarts: str, max_mol_wt: float = inf, max_fragments: int = 500)
Parameters
  • core_smarts – the core SMARTS string used for fragment matching

  • max_mol_wt – the maximum molecular weight of the fragments

  • max_fragments – the maximum number of fragments to generate

expand(inp_mols: apache_beam.pvalue.PCollection[rdkit.Chem.rdchem.Mol]) → apache_beam.pvalue.PCollection[rdkit.Chem.rdchem.Mol]
class schrodinger.application.transforms.enumerators.SubstructureSubstitute(core_smarts: str, transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000, n_pair_bonds: int = 3, n_apply_bonds: int = 1)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns all molecules produced by matched-molecular-pair transforms based on the fragment cliques in cliques_path, while protecting the atoms matched by core_smarts.

Not every desired transform can be expressed as a clique, so combinations with the fragments in the optional transforms_path are always generated.

Note that only the first match of core_smarts in the molecule determines which part is protected. If more than one match is possible, the other matches are never protected in its place, so the first match is never the one that is modified.

__init__(core_smarts: str, transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000, n_pair_bonds: int = 3, n_apply_bonds: int = 1)
Parameters
  • core_smarts – the core SMARTS string used for fragment matching

  • transforms_path – optional JSON file listing the transforms that are always applied (first). Set to None to use the default file.

  • cliques_path – optional JSON file of the fragment cliques used for enumeration; if gzipped, the file name must end with ‘gz’. Set to None to use the default file.

  • sample_size – the maximum number of randomly sampled outputs to yield from the file given by cliques_path.

  • n_pair_bonds – the number of bonds beyond which atoms of the core are included for fragment matching (extension of the R-group atoms)

  • n_apply_bonds – the number of bonds beyond which atoms of the core are protected.

expand(pcoll)
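
The cliques_path description above implies that compression is detected from the file name. The following is a minimal standard-library sketch of that convention, not the module's actual loader: load_json_maybe_gzipped is a hypothetical helper name, and the assumption that the file is UTF-8 text is not stated in this reference.

```python
import gzip
import json
from pathlib import Path


def load_json_maybe_gzipped(path: Path):
    """Load JSON from `path`, treating names that end in 'gz' as gzipped.

    Hypothetical helper: it only illustrates the file-name convention
    described for cliques_path and is not part of the enumerators module.
    """
    if path.name.endswith("gz"):
        # Open in text mode so json.load receives str rather than bytes.
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            return json.load(handle)
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)
```

Checking the suffix rather than the file's magic bytes mirrors the documented requirement: a gzipped file whose name does not end with ‘gz’ would be read as plain JSON and fail to parse.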
class schrodinger.application.transforms.enumerators.CorelessSubstitute(transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns unique sanitized products after applying the transformations based on fragment-cliques in cliques_path.

Not every desired transform can be expressed as a clique, so combinations with the fragments in the optional transforms_path are always generated.

__init__(transforms_path: Optional[pathlib.Path] = None, cliques_path: Optional[pathlib.Path] = None, sample_size: int = 500000)
Parameters
  • transforms_path – optional JSON file listing the transforms that are always applied (first). Set to None to use the default file.

  • cliques_path – optional JSON file of the fragment cliques used for enumeration; if gzipped, the file name must end with ‘gz’. Set to None to use the default file.

  • sample_size – the maximum number of randomly sampled outputs to yield from the file given by cliques_path.

expand(pcoll)
class schrodinger.application.transforms.enumerators.Substitute(transform_smarts: apache_beam.pvalue.PCollection[str])

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns unique standardized molecules after applying the transform_smarts.

__init__(transform_smarts: apache_beam.pvalue.PCollection[str])
Parameters
  • transform_smarts – the reaction SMARTS for the transformation

expand(pcoll)
class schrodinger.application.transforms.enumerators.Decorate(core_smarts: str, rgroups: apache_beam.pvalue.PCollection[schrodinger.application.transforms.enumerators.RGroup], property_ranges: Optional[Dict[str, List[float]]] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that enumerates unique sanitized molecules formed by replacing a hydrogen on a C, N, or O atom in the ligand with an R-group that was attached to an Ar.

__init__(core_smarts: str, rgroups: apache_beam.pvalue.PCollection[schrodinger.application.transforms.enumerators.RGroup], property_ranges: Optional[Dict[str, List[float]]] = None)
Parameters
  • core_smarts – the SMARTS that the products should contain and that must be present in the input molecule

  • rgroups – the R-groups to use for decoration

  • property_ranges – the optional property ranges for the products

expand(pcoll)
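
The property_ranges argument is typed as Optional[Dict[str, List[float]]], but this reference does not say how each list is interpreted. The sketch below assumes each list is an inclusive [low, high] pair and that a product missing a constrained property is rejected. The helper within_ranges and the property names are hypothetical and only illustrate the shape of the argument.

```python
from typing import Dict, List, Optional


def within_ranges(
    properties: Dict[str, float],
    property_ranges: Optional[Dict[str, List[float]]] = None,
) -> bool:
    """Return True if every constrained property lies in its range.

    Assumption: each list holds exactly two numbers, [low, high], and
    both bounds are inclusive. A missing property counts as a failure.
    """
    if property_ranges is None:
        return True  # no constraints were given
    for name, (low, high) in property_ranges.items():
        value = properties.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


# Hypothetical property names and values, for illustration only.
ranges = {"mol_wt": [200.0, 500.0], "logp": [-1.0, 5.0]}
print(within_ranges({"mol_wt": 320.4, "logp": 2.1}, ranges))  # True
print(within_ranges({"mol_wt": 612.7, "logp": 2.1}, ranges))  # False
```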
class schrodinger.application.transforms.enumerators.Synthesize(core_smarts: str, depth: int = 1, max_products_per_route: int = 100, seed: Optional[int] = None)

Bases: apache_beam.transforms.ptransform.PTransform

Enumerates unique sanitized molecules by combinatorial synthesis along routes based on the input molecules, using the default reaction dictionary and reagent library.

If the maximum number of products is less than the total number of combinations, the route synthesis is done by random sampling, which may yield fewer products than requested. Otherwise, a systematic set of unique products is yielded.

__init__(core_smarts: str, depth: int = 1, max_products_per_route: int = 100, seed: Optional[int] = None)
Parameters
  • core_smarts – the SMARTS that the products should contain and that must be present in the input molecule

  • depth – the maximum depth of the retrosynthetic routes to use

  • max_products_per_route – the maximum number of products to try to synthesize for each input molecule per route. Use 0 to force an exhaustive synthesis.

  • seed – seed for random number generator. If None, the random number generator will not be seeded.

expand(pcoll)
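
The choice between systematic enumeration and random sampling described for Synthesize can be illustrated on plain reagent labels. This is a sketch of the general technique, not the class's implementation: enumerate_combinations and the reagent labels are hypothetical, and discarding duplicate draws is assumed here as one plausible reason why sampling can return fewer products than requested.

```python
import itertools
import random
from typing import List, Optional, Sequence, Tuple


def enumerate_combinations(
    reagent_lists: Sequence[Sequence[str]],
    max_products: int = 100,
    seed: Optional[int] = None,
) -> List[Tuple[str, ...]]:
    """Return unique reagent combinations, sampled or exhaustive.

    Hypothetical stand-in for a single route: each inner sequence holds
    the candidate reagents for one position of the route.
    """
    total = 1
    for reagents in reagent_lists:
        total *= len(reagents)

    # 0 forces an exhaustive run; so does a limit that covers everything.
    if max_products == 0 or max_products >= total:
        return list(itertools.product(*reagent_lists))

    # Otherwise draw max_products times with replacement. Repeated draws
    # collapse in the set, so fewer unique combinations may come back.
    rng = random.Random(seed)  # seed=None leaves the generator unseeded
    seen = set()
    for _ in range(max_products):
        seen.add(tuple(rng.choice(reagents) for reagents in reagent_lists))
    return sorted(seen)
```

With three reagents at one position and two at another there are six combinations, so max_products=0 or max_products=10 returns all six, while max_products=4 returns at most four. Passing the same seed reproduces the same sample.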