schrodinger.application.pathfinder.route module¶
This module contains Node classes for representing retrosynthetic analyses and synthetic routes, as well as functions for reading and writing route files.
- class schrodinger.application.pathfinder.route.ReactionInstance(reaction, precursors, broken_bonds=frozenset({}))¶
Bases:
object
A ReactionInstance is the application of a Reaction to a list of reagents/precursors. For example, “Amide synthesis” is a Reaction; but “Amide synthesis from acetic acid and ethylamine” is a ReactionInstance.
- __init__(reaction, precursors, broken_bonds=frozenset({}))¶
- property name¶
The name of the reaction.
- class schrodinger.application.pathfinder.route.Node(mol=None, reagent_class=None)¶
Bases:
object
Base class for a node in a synthetic route or a retrosynthetic tree. A node is associated with a molecule and with a reagent class (both of which may be None). It is also associated with zero or more reaction instances, but the base class does not implement an API to add reaction instances because each subclass has a different policy concerning the number of reaction instances allowed.
- __init__(mol=None, reagent_class=None)¶
- treeAsString(products=True, starting_materials=True, indexes=True)¶
Return a recursive string representation of the tree (unlike __str__, which is a short representation of the current node). The reaction names are always shown; starting materials and products are optional.
- Parameters
products (bool) – include reaction product SMILES
starting_materials (bool) – include starting material SMILES
- Return type
str
- getReactionSet()¶
Return the set of reactions used by the route.
- class schrodinger.application.pathfinder.route.RetroSynthesisNode(*a, frozen=False, **d)¶
Bases:
schrodinger.application.pathfinder.route.Node
A node in a retrosynthetic analysis tree. A node may have one or more reaction instances, which represent the ways of synthesizing the node in a single step.
- __init__(*a, frozen=False, **d)¶
- Parameters
frozen (bool) – The molecule associated with the current node has frozen atoms. When generating a route, ReagentNodes derived from the current node won’t have a reagent class, so they won’t be enumerated by default.
- addReactionInstance(reaction, precursors, broken_bonds=frozenset({}))¶
Add a reaction instance to the current node. This represents a one-step synthesis of the node from a list of precursor nodes.
- Parameters
broken_bonds ({(int, int)}) – set of bonds broken by this reaction instance, where each bond is a tuple of two ints (sorted atom indexes in the original target molecule).
- getRoutes(require=None, seen=None, full_dedup=False, bond_reactions=None)¶
Generate the routes from a retrosynthesis tree.
- Parameters
require (set of str) – reaction tags or names to require in each route
seen (set of str) – set of routes that have already been seen and shouldn’t be generated again. Modified in place. This can be used to prevent duplicates when generating routes from multiple retrosynthesis trees.
full_dedup (bool) – “full deduplication”: do not generate routes consisting of the same reaction path but with different molecules
bond_reactions ({(int, int): set(str)}) – dict specifying which reactions are allowed to break certain bonds. Keys are tuples of two ints (sorted atom indexes); values are sets of reaction names. Only routes that break all the specified bonds, using one of the reactions specified for each bond, are generated.
- Returns
generator of routes (RouteNode)
- Return type
generator
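The bond_reactions condition can be illustrated with a small standalone sketch. It does not call the PathFinder API: route_satisfies_bond_reactions is a hypothetical helper, and the broken dict only mimics the {(int, int): str} mapping that RouteNode.brokenBonds() is documented to return. How getRoutes applies the filter internally is not shown here.

```python
def route_satisfies_bond_reactions(broken, bond_reactions):
    """Return True if every required bond is broken by an allowed reaction.

    broken: {(int, int): str}, bond -> name of the reaction that broke it
    bond_reactions: {(int, int): set(str)}, bond -> allowed reaction names
    (Hypothetical helper for illustration; not part of the module.)
    """
    return all(
        # A bond that was never broken gives None, which is in no name set.
        broken.get(bond) in allowed
        for bond, allowed in bond_reactions.items()
    )


broken = {(1, 2): "suzuki", (5, 8): "amide_coupling-1"}
required = {(1, 2): {"suzuki", "stille"}}
print(route_satisfies_bond_reactions(broken, required))  # True
```

Bonds that are broken but not listed in bond_reactions are left unconstrained in this sketch.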
- getReactionSet()¶
Return the set of reactions used by the route.
- treeAsString(products=True, starting_materials=True, indexes=True)¶
Return a recursive string representation of the tree (unlike __str__, which is a short representation of the current node). The reaction names are always shown; starting materials and products are optional.
- Parameters
products (bool) – include reaction product SMILES
starting_materials (bool) – include starting material SMILES
- Return type
str
- class schrodinger.application.pathfinder.route.RouteNode(mol=None, reagent_class=None, ncycles=None, fallback_reagent=None, include_smiles=True)¶
Bases:
schrodinger.application.pathfinder.route.Node
A node in a synthetic route. Similar to RetroSynthesisNode, except that it can have one reaction instance at most. Also, RouteNode provides additional methods that only make sense for a synthetic route.
- __init__(mol=None, reagent_class=None, ncycles=None, fallback_reagent=None, include_smiles=True)¶
- Parameters
mol (Mol or NoneType) – molecule associated with this node, if any
reagent_class (str or NoneType) – reagent class associated with this node, if any (only really relevant for starting materials)
ncycles (int or NoneType) – used by the Synthesizer: try to apply the reaction for this node up to ncycles times
fallback_reagent (int or NoneType) – used by the Synthesizer: when a reaction fails, the reagent identified by this 1-based index becomes the product
include_smiles (bool) – whether to include the SMILES string for each node.
- property precursors¶
The list of precursors of this node (may be empty).
- property reaction_instance¶
The reaction instance associated with this node. If the node has no reaction instance, raises KeyError.
- property reaction¶
The reaction associated with this node. If the node has no reaction instance, raises KeyError.
- isStartingMaterial()¶
Return True if the node represents a starting material (i.e., has no reaction instance).
- getStartingNodes()¶
Search recursively and return the Node objects for all the starting materials in the route.
- Return type
list of Node-derived objects
- updateTargetIndexMap(target_index_map)¶
Store target frozen atom indices for alignment.
- Parameters
target_index_map ({(reagent_index, atom_index): target_index}) – map of reagent_index, atom_index pairs to target_indices
- setReactionInstance(reaction, precursors, broken_bonds=frozenset({}))¶
Set the reaction instance of the current node. This represents a one-step synthesis of the node from a list of precursor nodes.
- Parameters
broken_bonds ({(int, int)}) – set of bonds broken by this reaction instance, where each bond is a tuple of two ints (sorted atom indexes in the original target molecule).
- steps()¶
Return the total number of steps in the route. For example, the following synthesis has 3 steps but depth 2.
Target => A + B
A => AA
B => BB
- Return type
int
- depth(_depth=0)¶
Return the maximum depth of the route. See example under steps().
- Return type
int
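The difference between steps() and depth() can be checked on a toy tree. The sketch below is an assumption-laden illustration: ToyNode is a hypothetical stand-in, not the RouteNode class, and it simply counts one step per node that has precursors and measures depth as the longest reaction chain down to a starting material.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a route node (not the PathFinder class)."""
    name: str
    precursors: list = field(default_factory=list)

    def steps(self):
        # One step for this node's own reaction (if any), plus all
        # the steps needed to make its precursors.
        own = 1 if self.precursors else 0
        return own + sum(p.steps() for p in self.precursors)

    def depth(self):
        # Longest chain of reactions from this node to a starting material.
        if not self.precursors:
            return 0
        return 1 + max(p.depth() for p in self.precursors)


# Target => A + B; A => AA; B => BB
route = ToyNode("Target", [
    ToyNode("A", [ToyNode("AA")]),
    ToyNode("B", [ToyNode("BB")]),
])
print(route.steps(), route.depth())  # 3 2
```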
- brokenBonds(broken_bonds=None)¶
Return a dict describing all the bonds broken by the route (going recursively down to the starting materials).
- Returns
dict where each key is a bond (tuple of two sorted int atom indexes) and each value is a reaction name.
- Return type
{(int, int): str}
- write(filename, self_contained=False)¶
Write a route file.
- Parameters
filename (str) – File to write.
self_contained (bool) – whether to write a “self-contained route file”, which includes the reactions used by the route.
- getTreeData(self_contained=False)¶
Return a simple data structure, suitable for dumping to JSON, representing the route. See write() for more details.
Example return value:
{
  "reaction_name": "alkylation-1",
  "smiles": "CNCC=O",
  "precursors": [
    {
      "smiles_list": ["O=CCBr"],
      "reagent": "halides-primsec"
    },
    {
      "reaction_name": "curtius-3",
      "smiles": "CN",
      "precursors": [
        {
          "smiles_list": ["CC(=O)O"],
          "reagent": "carboxylates"
        }
      ]
    }
  ],
  "steps": 2,
  "depth": 2
}
- Parameters
self_contained (bool) – whether to return the data for a “self-contained route file”, which includes the reactions used by the route.
- Returns
contents of route file
- Return type
dict
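Because the returned structure is plain dicts and lists, it can be traversed without any PathFinder classes. The following sketch parses the example value shown above and collects the reaction names and reagent classes; walk is a hypothetical helper, and the key names are taken from that example only, so other keys may exist in real route files.

```python
import json

route_json = """
{
  "reaction_name": "alkylation-1",
  "smiles": "CNCC=O",
  "precursors": [
    {"smiles_list": ["O=CCBr"], "reagent": "halides-primsec"},
    {
      "reaction_name": "curtius-3",
      "smiles": "CN",
      "precursors": [
        {"smiles_list": ["CC(=O)O"], "reagent": "carboxylates"}
      ]
    }
  ],
  "steps": 2,
  "depth": 2
}
"""


def walk(node):
    """Yield every node dict in the tree, depth-first (hypothetical helper)."""
    yield node
    for child in node.get("precursors", []):
        yield from walk(child)


data = json.loads(route_json)
reactions = [n["reaction_name"] for n in walk(data) if "reaction_name" in n]
reagents = [n["reagent"] for n in walk(data) if "reagent" in n]
print(reactions)  # ['alkylation-1', 'curtius-3']
print(reagents)   # ['halides-primsec', 'carboxylates']
```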
- getSimplifiedTreeData(_counter=None)¶
Return a data structure suitable for dumping into JSON. This is similar to getTreeData but more concise, and is meant for a one-line representation, roughly 100 characters long.
The same example given for getTreeData would look like this:
{"alkylation-1": ["O=C([O-])CBr", 2]}
This format is clearly more limited: there is no reagent class information, and each starting material must be either a single SMILES (i.e., no lists supported) or an integer (meaning a reagent source index). The format is also harder to expand in a backward-compatible way. Its purpose is not to be an input for an enumeration job, but just to provide a small representation of the route that can be stored as a structure property so a user can figure out how a compound was made.
- Returns
simplified route representation
- Return type
dict
- getOneLineRepresentation()¶
Return a one-line string with the simplified tree representation generated by getSimplifiedTreeData().
- Returns
simplified route representation
- Return type
str
- checkReactions(reqs)¶
Check that the route meets all requirements. Every element of ‘reqs’ must match a tag or reaction name for at least one of the reactions used by the route. Tag matching is exact; name matching uses shell-like globbing (*, ?, []).
- Returns
True if route meets requirements.
- Return type
bool
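The matching rule (exact for tags, shell-style globbing for names) can be sketched with the standard fnmatch module. This is an illustration under assumptions: meets_requirements is a hypothetical function, reactions are modeled as (name, tags) pairs, and case-sensitive matching via fnmatchcase is a guess about the real behavior.

```python
from fnmatch import fnmatchcase


def meets_requirements(reactions, reqs):
    """Check requirements against reactions given as (name, tags) pairs.

    Each requirement must equal a tag exactly, or glob-match a name,
    for at least one reaction. (Hypothetical helper for illustration.)
    """
    return all(
        any(req in tags or fnmatchcase(name, req) for name, tags in reactions)
        for req in reqs
    )


used = [("amide_coupling-1", {"coupling"}), ("suzuki", {"cross-coupling"})]
print(meets_requirements(used, ["amide_*", "cross-coupling"]))  # True
print(meets_requirements(used, ["cross-*"]))  # False
```

The second call fails because "cross-*" is compared literally against tags and no reaction name matches the pattern.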
- getReactionSmiles()¶
Return a representation of the current node as a reaction SMILES (not SMARTS!). The SMILES are kekulized and with explicit single bonds where applicable, to maximize compatibility with the sketcher.
- Returns
reaction SMILES (retrosynthetic)
- Return type
str
- getReactionSet()¶
Return the set of reactions used by the route.
- treeAsString(products=True, starting_materials=True, indexes=True)¶
Return a recursive string representation of the tree (unlike __str__, which is a short representation of the current node). The reaction names are always shown; starting materials and products are optional.
- Parameters
products (bool) – include reaction product SMILES
starting_materials (bool) – include starting material SMILES
- Return type
str
- class schrodinger.application.pathfinder.route.ReagentNode(mol=None, reagent_class=None, filename=None, smiles=None, smiles_list=None, frozen_atoms=None, innerfrozen_atoms=None)¶
Bases:
schrodinger.application.pathfinder.route.RouteNode
A node representing a starting material in a synthetic route. Unlike RouteNode, it cannot have any reaction instances. Reagent nodes are identified by a reagent class.
Reagents may optionally have a filename or a smiles or a list of smiles as a source of reagent molecules. If none of these is provided, the object can try to find a reagent file based on the reagent class alone.
- __init__(mol=None, reagent_class=None, filename=None, smiles=None, smiles_list=None, frozen_atoms=None, innerfrozen_atoms=None)¶
- Parameters
frozen_atoms (list of int or NoneType) – 1-based indexes of frozen atoms, referring to the atom order in the SMILES string. If smiles_list is provided, it must contain a single SMILES.
- findReagentFile(libpath=None)¶
First, look for structure files matching <reagent_class>.* in the CWD. If one is found, return it. If multiple matches are found, an exception is raised. If none are found, look for <reagent_class>.csv in the mmshare data directory and return it if it exists, or None otherwise.
- Parameters
libpath (list of str) – list of directories to prepend to the standard reagent library search path
- Returns
path to reagent file, or None if not found
- Return type
str
- Raises
ValueError if multiple matches are found in the CWD.
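The search order described above can be sketched with pathlib. This is a simplified illustration, not the module's implementation: find_reagent_file_sketch is a hypothetical function, it treats any file named <reagent_class>.* as a structure file, it ignores libpath, and it takes the data directory as an explicit argument instead of locating the mmshare data directory.

```python
import tempfile
from pathlib import Path


def find_reagent_file_sketch(reagent_class, cwd, data_dir):
    """Look in cwd first, then fall back to <reagent_class>.csv in data_dir."""
    matches = sorted(Path(cwd).glob(f"{reagent_class}.*"))
    if len(matches) > 1:
        raise ValueError(f"multiple reagent files for {reagent_class!r}")
    if matches:
        return str(matches[0])
    fallback = Path(data_dir) / f"{reagent_class}.csv"
    return str(fallback) if fallback.exists() else None


with tempfile.TemporaryDirectory() as cwd, \
        tempfile.TemporaryDirectory() as data_dir:
    # Only the fallback directory contains a file for this reagent class.
    (Path(data_dir) / "carboxylates.csv").write_text("SMILES\nCC(=O)O\n")
    found = find_reagent_file_sketch("carboxylates", cwd, data_dir)
    missing = find_reagent_file_sketch("amines", cwd, data_dir)
    print(Path(found).name)  # carboxylates.csv
    print(missing)  # None
```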
- getReagentSource(source=None, libpath=None)¶
Return a reagent source, which may be either a filename, a SMILES string, or the magical value route.SMILES which means “use the SMILES contained in the node”.
Precedence is the supplied source argument, if any, followed by filename associated with the node, reagent class associated with the node, and SMILES associated with the node.
- Parameters
source (str) – optional filename/SMILES to use; the special value ‘’ (empty string) means use the SMILES associated with the node.
libpath (list of str) – list of directories to prepend to the standard reagent library search path
- Returns
reagent source
- Return type
str
- Raises
ValueError when no source file is found, the supplied SMILES is invalid, or when there is neither a reagent class nor a SMILES associated with the node.
- getTargetIndexMap(sm_index)¶
Return map of (reagent_index, atom_index) pairs to target_indices, which will be used to align products by frozen atom.
- Parameters
sm_index (int) – the reagent index to return the map for.
- Returns
map of reagent_index, atom_index pairs to target_indices
- Return type
{(reagent_index, atom_index): target_index}
- getLabeledMols(sm_index)¶
Return Mol objects, based on the SMILES held by the node, in which frozen atoms have been labeled using isotopes according to the formula
label = FROZEN_FACTOR * sm_index + atom_index
where atom_index is the 1-based index from the SMILES string, and innerfrozen atoms have been labeled according to
label = INNERFROZEN_OFFSET + FROZEN_FACTOR * sm_index + atom_index
- Parameters
sm_index (int) – “starting material index”. Used to distinguish between frozen atoms coming from different reactants.
- Returns
labeled molecules
- Return type
list of rdkit.Chem.Mol
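The two labeling formulas can be worked through numerically. In the sketch below, the values of FROZEN_FACTOR and INNERFROZEN_OFFSET are placeholders chosen for illustration; the actual module constants are not given here and may differ. The functions frozen_label and decode_label are hypothetical.

```python
# Placeholder values for illustration only; the real constants may differ.
FROZEN_FACTOR = 100
INNERFROZEN_OFFSET = 10000


def frozen_label(sm_index, atom_index, inner=False):
    """Apply the documented formula for frozen or innerfrozen atoms."""
    label = FROZEN_FACTOR * sm_index + atom_index
    return INNERFROZEN_OFFSET + label if inner else label


def decode_label(label):
    """Invert frozen_label.

    Only valid while atom_index < FROZEN_FACTOR and the plain label
    stays below INNERFROZEN_OFFSET.
    """
    inner = label >= INNERFROZEN_OFFSET
    if inner:
        label -= INNERFROZEN_OFFSET
    sm_index, atom_index = divmod(label, FROZEN_FACTOR)
    return sm_index, atom_index, inner


print(frozen_label(2, 5))              # 205
print(frozen_label(2, 5, inner=True))  # 10205
print(decode_label(10205))             # (2, 5, True)
```

With these placeholder values, atom 5 of starting material 2 gets isotope label 205 when frozen and 10205 when innerfrozen.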
- brokenBonds(broken_bonds=None)¶
Return a dict describing all the bonds broken by the route (going recursively down to the starting materials).
- Returns
dict where each key is a bond (tuple of two sorted int atom indexes) and each value is a reaction name.
- Return type
{(int, int): str}
- checkReactions(reqs)¶
Check that the route meets all requirements. Every element of ‘reqs’ must match a tag or reaction name for at least one of the reactions used by the route. Tag matching is exact; name matching uses shell-like globbing (*, ?, []).
- Returns
True if route meets requirements.
- Return type
bool
- depth(_depth=0)¶
Return the maximum depth of the route. See example under steps().
- Return type
int
- getOneLineRepresentation()¶
Return a one-line string with the simplified tree representation generated by getSimplifiedTreeData().
- Returns
simplified route representation
- Return type
str
- getReactionSet()¶
Return the set of reactions used by the route.
- getReactionSmiles()¶
Return a representation of the current node as a reaction SMILES (not SMARTS!). The SMILES are kekulized and with explicit single bonds where applicable, to maximize compatibility with the sketcher.
- Returns
reaction SMILES (retrosynthetic)
- Return type
str
- getSimplifiedTreeData(_counter=None)¶
Return a data structure suitable for dumping into JSON. This is similar to getTreeData but more concise, and is meant for a one-line representation, roughly 100 characters long.
The same example given for getTreeData would look like this:
{"alkylation-1": ["O=C([O-])CBr", 2]}
This format is clearly more limited: there is no reagent class information, and each starting material must be either a single SMILES (i.e., no lists supported) or an integer (meaning a reagent source index). The format is also harder to expand in a backward-compatible way. Its purpose is not to be an input for an enumeration job, but just to provide a small representation of the route that can be stored as a structure property so a user can figure out how a compound was made.
- Returns
simplified route representation
- Return type
dict
- getStartingNodes()¶
Search recursively and return the Node objects for all the starting materials in the route.
- Return type
list of Node-derived objects
- getTreeData(self_contained=False)¶
Return a simple data structure, suitable for dumping to JSON, representing the route. See write() for more details.
Example return value:
{
  "reaction_name": "alkylation-1",
  "smiles": "CNCC=O",
  "precursors": [
    {
      "smiles_list": ["O=CCBr"],
      "reagent": "halides-primsec"
    },
    {
      "reaction_name": "curtius-3",
      "smiles": "CN",
      "precursors": [
        {
          "smiles_list": ["CC(=O)O"],
          "reagent": "carboxylates"
        }
      ]
    }
  ],
  "steps": 2,
  "depth": 2
}
- Parameters
self_contained (bool) – whether to return the data for a “self-contained route file”, which includes the reactions used by the route.
- Returns
contents of route file
- Return type
dict
- isStartingMaterial()¶
Return True if the node represents a starting material (i.e., has no reaction instance).
- property precursors¶
The list of precursors of this node (may be empty).
- property reaction¶
The reaction associated with this node. If the node has no reaction instance, raises KeyError.
- property reaction_instance¶
The reaction instance associated with this node. If the node has no reaction instance, raises KeyError.
- setReactionInstance(reaction, precursors, broken_bonds=frozenset({}))¶
Set the reaction instance of the current node. This represents a one-step synthesis of the node from a list of precursor nodes.
- Parameters
broken_bonds ({(int, int)}) – set of bonds broken by this reaction instance, where each bond is a tuple of two ints (sorted atom indexes in the original target molecule).
- steps()¶
Return the total number of steps in the route. For example, the following synthesis has 3 steps but depth 2.
Target => A + B
A => AA
B => BB
- Return type
int
- treeAsString(products=True, starting_materials=True, indexes=True)¶
Return a recursive string representation of the tree (unlike __str__, which is a short representation of the current node). The reaction names are always shown; starting materials and products are optional.
- Parameters
products (bool) – include reaction product SMILES
starting_materials (bool) – include starting material SMILES
- Return type
str
- updateTargetIndexMap(target_index_map)¶
Store target frozen atom indices for alignment.
- Parameters
target_index_map ({(reagent_index, atom_index): target_index}) – map of reagent_index, atom_index pairs to target_indices
- write(filename, self_contained=False)¶
Write a route file.
- Parameters
filename (str) – File to write.
self_contained (bool) – whether to write a “self-contained route file”, which includes the reactions used by the route.
- schrodinger.application.pathfinder.route.read_route_file(filename, reactions_dict=None)¶
Read a route file in JSON format, returning a RouteNode object.
- Parameters
reactions_dict (dict of {str: Reaction}) – dictionary of Reaction objects by name.
- schrodinger.application.pathfinder.route.parse_route_data(json_data, reactions_dict=None)¶
Generate a Route from the raw dict/list-based data structure usually obtained from a route JSON file.
- Parameters
reactions_dict (dict of {str: Reaction}) – dictionary of Reaction objects by name. Not required when using a self-contained route file.
- schrodinger.application.pathfinder.route.get_kekule_smiles(mol)¶
Return a Kekule SMILES, with explicit single bonds, for a molecule.
- Returns
Kekule SMILES
- Return type
str
- class schrodinger.application.pathfinder.route.LazyIterable(iterator)¶
Bases:
object
Lazily convert an iterator into an iterable. One could convert an iterator into a list, but that would consume the entire iterator upfront. This class only consumes as needed, but remembers everything that has been consumed so it can be reused.
- __init__(iterator)¶
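The behavior described for LazyIterable (consume on demand, remember what was consumed) can be sketched in a few lines. CachingIterable below is a hypothetical reimplementation written for illustration; it is not the module's source and may differ from it in details.

```python
class CachingIterable:
    """Hypothetical sketch of a lazily cached iterable."""

    def __init__(self, iterator):
        self._iterator = iter(iterator)
        self._cache = []

    def __iter__(self):
        index = 0
        while True:
            if index == len(self._cache):
                # Nothing cached at this position yet: pull one more item.
                try:
                    self._cache.append(next(self._iterator))
                except StopIteration:
                    return
            yield self._cache[index]
            index += 1


consumed = []


def numbers():
    for i in range(3):
        consumed.append(i)  # record what the underlying iterator produced
        yield i


lazy = CachingIterable(numbers())
first = next(iter(lazy))
print(first, consumed)         # 0 [0]
print(list(lazy), list(lazy))  # [0, 1, 2] [0, 1, 2]
```

The first call pulls a single item from the underlying generator; the later full passes reuse the cache and only pull what is still missing.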
- schrodinger.application.pathfinder.route.lazy_product(*iterators)¶
Like itertools.product, but does not consume the iterators before starting to yield tuples. For example, before yielding the first tuple, only the first element from each iterator gets consumed.
- Parameters
iterators – iterators
- Returns
generator of tuples
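One way to obtain this behavior is to wrap each iterator in a caching iterable and build the product recursively. The sketch below is an assumption about how such a function could be written, not the module's implementation; lazy_product_sketch and _Cached are hypothetical names. It yields tuples in the same order as itertools.product.

```python
import itertools


class _Cached:
    """Iterable that pulls from an iterator on demand and remembers items."""

    def __init__(self, iterator):
        self._iterator = iter(iterator)
        self._cache = []

    def __iter__(self):
        index = 0
        while True:
            if index == len(self._cache):
                try:
                    self._cache.append(next(self._iterator))
                except StopIteration:
                    return
            yield self._cache[index]
            index += 1


def lazy_product_sketch(*iterators):
    pools = [_Cached(it) for it in iterators]

    def recurse(i):
        if i == len(pools):
            yield ()
            return
        for item in pools[i]:
            # Later pools are re-iterated from their caches for each item.
            for rest in recurse(i + 1):
                yield (item,) + rest

    return recurse(0)


pulled = []


def tracked(label, items):
    for item in items:
        pulled.append((label, item))  # record each pull from the source
        yield item


gen = lazy_product_sketch(tracked("x", [1, 2]), tracked("y", "ab"))
first = next(gen)
pulled_before_rest = list(pulled)
print(first, pulled_before_rest)  # (1, 'a') [('x', 1), ('y', 'a')]
rest = list(gen)
print(rest)  # [(1, 'b'), (2, 'a'), (2, 'b')]
```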
- schrodinger.application.pathfinder.route.parse_bond_reactions(vals)¶
Parse a list of bond reaction spec strings such as
['1-2:suzuki,stille', '8-5:amide_coupling-1']
into a dict structure expected by the RouteNode APIs:
{(1, 2): {'suzuki', 'stille'}, (5, 8): {'amide_coupling-1'}}
- Parameters
vals (list of str) – values from the -bond_reactions argument
- Returns
dict where the keys are bonds (tuples of two atom indexes) and the values are sets of reaction names.
- Return type
{(int, int): {str}}
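The conversion can be reproduced with a short standalone function. This is a sketch under assumptions: parse_bond_reactions_sketch is a hypothetical name, it performs no validation, and merging the names when the same bond appears twice is a guess about the real behavior. The spec is split on ":" first because reaction names themselves may contain hyphens.

```python
def parse_bond_reactions_sketch(vals):
    """Turn 'A-B:name1,name2' strings into {(a, b): {names}}, a <= b."""
    result = {}
    for val in vals:
        bond_spec, _, names = val.partition(":")
        a, b = (int(x) for x in bond_spec.split("-"))
        bond = tuple(sorted((a, b)))  # '8-5' becomes (5, 8)
        result.setdefault(bond, set()).update(names.split(","))
    return result


parsed = parse_bond_reactions_sketch(
    ["1-2:suzuki,stille", "8-5:amide_coupling-1"])
print(parsed == {(1, 2): {"suzuki", "stille"}, (5, 8): {"amide_coupling-1"}})
```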
- schrodinger.application.pathfinder.route.mol_to_nomap_cxsmiles(mol)¶
Generate a CXSMILES from a Mol, but stripped of atom mappings. For example, return
CCO
instead of
CC[OH:3] |atomProp:2.molAtomMapNumber.3|
- Returns
CXSMILES
- Return type
str