schrodinger.livedesign.preprocessor module

schrodinger.livedesign.preprocessor.initialize_audit_log(verbose: bool)

Initialize global audit for logging purposes

class schrodinger.livedesign.preprocessor.ExplicitHydrogens(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

REMOVE_ALL = 1
KEEP_WEDGED = 2
ADD_ALL = 3
AS_IS = 4
ON_HETERO_AND_KEEP_WEDGED = 5
class schrodinger.livedesign.preprocessor.GenerateCoordinates(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

NONE = 1
FULL = 2
FULL_ALIGNED = 3
class schrodinger.livedesign.preprocessor.ChiralFlag0Meaning(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

UNGROUPED_ARE_ABSOLUTE = 1
UNGROUPED_ARE_RACEMIC = 2
UNGROUPED_ARE_RELATIVE = 3
class schrodinger.livedesign.preprocessor.PreprocessorOptions(MAX_NUM_ATOMS: Optional[int] = None, KEEP_ONLY_LARGEST_STRUCTURE: bool = True, STRIP_SALTS: Optional[Tuple[str]] = None, NEUTRALIZE: bool = True, TRANSFORMATIONS: Optional[Tuple[str]] = ('[#7+0;v5:1]=[#8+0:2]>>[#7+:1]-[#8-:2]', '[#6:3][P+:1]([#6:4])([#6:5])[#6-:2]>>[#6:3][P-0:1]([#6:4])([#6:5])=[#6+0:2]', '[#6:3][P-:1]([#6:4])([#6:5])[#6+:2]>>[#6:3][P-0:1]([#6:4])([#6:5])=[#6+0:2]', '[#6:3][S+:1]([#6:4])-[#8-:2]>>[#6:3][S;X3+0:1]([#6:4])=[#8-0:2]', '[#6:3][P+:1]([#8;X2:4])([#8;X2:5])[#8-:2]>>[#6:3][P+0:1]([#8:4])([#8:5])=[#8-0:2]', '[#6:3][S+:1]([#6:4])([#8-:2])=[O:5]>>[#6:3][S+0:1]([#6:4])(=[#8-0:2])=[O:5]', '[#7;A;X2-:1][N;X2+:2]#[N;X1:3]>>[#7-0:1]=[N+:2]=[#7-:3]', '[#6;X3-:1][N;X2+:2]#[N;X1:3]>>[#6-0;A:1]=[N+:2]=[#7-:3]'), CHOOSE_CANONICAL_TAUTOMER: bool = False, RESOLVE_AMBIGUOUS_TAUTOMERS: bool = False, CHIRAL_FLAG_0_MEANING: schrodinger.livedesign.preprocessor.ChiralFlag0Meaning = ChiralFlag0Meaning.UNGROUPED_ARE_ABSOLUTE, STRIP_AND_GROUPS_ON_SINGLE_ATOM: bool = False, PRESERVE_ENHANCED_STEREO_GROUP_IDS: bool = False, REMOVE_PROPERTIES: bool = False, GENERATE_COORDINATES: schrodinger.livedesign.preprocessor.GenerateCoordinates = GenerateCoordinates.FULL_ALIGNED, EXPLICIT_HYDROGENS: schrodinger.livedesign.preprocessor.ExplicitHydrogens = ExplicitHydrogens.REMOVE_ALL, CLEAR_INVALID_WEDGE_BONDS: bool = True, WEDGE_TWO_BONDS_AROUND_CHIRAL_ATOMS: bool = False, HEAVY_HYDROGEN_DT: bool = False)

Bases: NamedTuple

Options dictating preprocessor actions

Variables
  • KEEP_ONLY_LARGEST_STRUCTURE – whether additional discontiguous structures are removed

  • REMOVE_PROPERTIES – whether properties are removed

  • STRIP_SALTS – whether salts are removed

  • RESOLVE_AMBIGUOUS_TAUTOMERS – whether to guess tautomeric state for ambiguous input that would otherwise fail with a kekulization error

  • WEDGE_TWO_BONDS_AROUND_CHIRAL_ATOMS – whether to try to wedge/dash 2 bonds around chiral atoms which have 4 neighbors.

MAX_NUM_ATOMS: Optional[int]

Alias for field number 0

KEEP_ONLY_LARGEST_STRUCTURE: bool

Alias for field number 1

STRIP_SALTS: Optional[Tuple[str]]

Alias for field number 2

NEUTRALIZE: bool

Alias for field number 3

TRANSFORMATIONS: Optional[Tuple[str]]

Alias for field number 4

CHOOSE_CANONICAL_TAUTOMER: bool

Alias for field number 5

RESOLVE_AMBIGUOUS_TAUTOMERS: bool

Alias for field number 6

CHIRAL_FLAG_0_MEANING: schrodinger.livedesign.preprocessor.ChiralFlag0Meaning

Alias for field number 7

STRIP_AND_GROUPS_ON_SINGLE_ATOM: bool

Alias for field number 8

PRESERVE_ENHANCED_STEREO_GROUP_IDS: bool

Alias for field number 9

REMOVE_PROPERTIES: bool

Alias for field number 10

GENERATE_COORDINATES: schrodinger.livedesign.preprocessor.GenerateCoordinates

Alias for field number 11

EXPLICIT_HYDROGENS: schrodinger.livedesign.preprocessor.ExplicitHydrogens

Alias for field number 12

CLEAR_INVALID_WEDGE_BONDS: bool

Alias for field number 13

WEDGE_TWO_BONDS_AROUND_CHIRAL_ATOMS: bool

Alias for field number 14

HEAVY_HYDROGEN_DT: bool

Alias for field number 15
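
Because the class derives from NamedTuple, the standard NamedTuple helpers (`_replace`, `_asdict`, `_fields`) should apply to it. The sketch below illustrates this with `DemoOptions`, a hypothetical stand-in that copies only three of the documented fields and defaults so that it runs without a Schrödinger installation.

```python
from typing import NamedTuple, Optional


class DemoOptions(NamedTuple):
    """Hypothetical stand-in mirroring three documented fields and defaults."""
    MAX_NUM_ATOMS: Optional[int] = None
    KEEP_ONLY_LARGEST_STRUCTURE: bool = True
    NEUTRALIZE: bool = True


defaults = DemoOptions()

# NamedTuples are immutable, so _replace returns a modified copy.
custom = defaults._replace(MAX_NUM_ATOMS=200, NEUTRALIZE=False)

print(custom.MAX_NUM_ATOMS)  # access by field name
print(custom[0])             # the same field, addressed as "field number 0"
print(custom._asdict())      # plain dict view of all fields
```

Deriving a variant with `_replace` leaves the defaults object untouched, which is convenient when several configurations share most settings.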

static fromConfig(config: dict)
Parameters

config – configuration from which to build options

Raises
  • KeyError – if an unknown key is present

  • ValueError – if an unknown value is present
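
The error contract above can be illustrated with a small sketch. Everything here is hypothetical: `DemoHydrogens`, `DemoOptions` and `demo_from_config` are stand-ins, and the assumption that enum options are given by member name is not confirmed by this reference.

```python
from enum import Enum
from typing import NamedTuple


class DemoHydrogens(Enum):
    """Hypothetical stand-in for ExplicitHydrogens (two members only)."""
    REMOVE_ALL = 1
    AS_IS = 4


class DemoOptions(NamedTuple):
    """Hypothetical stand-in for PreprocessorOptions."""
    NEUTRALIZE: bool = True
    EXPLICIT_HYDROGENS: DemoHydrogens = DemoHydrogens.REMOVE_ALL


def demo_from_config(config: dict) -> DemoOptions:
    """Build options from a dict, rejecting unknown keys and values."""
    values = {}
    for key, value in config.items():
        if key not in DemoOptions._fields:
            raise KeyError(f"unknown option: {key}")
        if key == "EXPLICIT_HYDROGENS":
            # Assumption: enum-valued options are spelled by member name.
            if value not in DemoHydrogens.__members__:
                raise ValueError(f"unknown value for {key}: {value!r}")
            value = DemoHydrogens[value]
        values[key] = value
    return DemoOptions(**values)


options = demo_from_config({"EXPLICIT_HYDROGENS": "AS_IS"})
print(options)
```

Options missing from the dict keep their defaults in this sketch, so a config only needs to list what differs.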

toConfig() → dict
schrodinger.livedesign.preprocessor.remove_invalid_config_options(config: dict) → Tuple[str]
Parameters

config – configuration from which to build options, from which all invalid keys and values will be stripped

Returns

tuple of errors encountered
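
A minimal sketch of this strip-and-report pattern follows. The validator table and the function name `demo_remove_invalid` are hypothetical, and the sketch assumes the dict is modified in place, which the description suggests but does not state outright.

```python
from typing import Tuple

# Hypothetical validators: option name -> predicate accepting a value.
DEMO_VALIDATORS = {
    "NEUTRALIZE": lambda v: isinstance(v, bool),
    "MAX_NUM_ATOMS": lambda v: v is None
    or (isinstance(v, int) and not isinstance(v, bool)),
}


def demo_remove_invalid(config: dict) -> Tuple[str, ...]:
    """Strip invalid entries from config in place and report them."""
    errors = []
    for key in list(config):  # copy the keys: entries are deleted below
        validator = DEMO_VALIDATORS.get(key)
        if validator is None:
            errors.append(f"unknown key: {key}")
            del config[key]
        elif not validator(config[key]):
            errors.append(f"invalid value for {key}: {config[key]!r}")
            del config[key]
    return tuple(errors)


config = {"NEUTRALIZE": True, "BOGUS": 1, "MAX_NUM_ATOMS": "many"}
errors = demo_remove_invalid(config)
print(config)
print(errors)
```

Unlike a constructor that raises on the first problem, this style collects every problem, so a caller can log them all and continue with the cleaned config.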

schrodinger.livedesign.preprocessor.audit_changes(func: Callable, mol: rdkit.Chem.rdchem.Mol, *args)

When the global audit_changes_log is initialized, compares mol CXSMILES before and after the given function call, capturing information when the CXSMILES has been changed.

Parameters
  • func – transformation function

  • mol – molecule to apply transformation to
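
The before/after comparison can be sketched without RDKit by letting a plain string play the role of the molecule's CXSMILES. The names `demo_audit_log`, `demo_audit_changes` and `strip_chloride` are hypothetical, and the log entry layout is an assumption.

```python
from typing import Callable, List, Optional, Tuple

# None means auditing is off; a list means the log has been initialized.
demo_audit_log: Optional[List[Tuple[str, str, str]]] = None


def demo_audit_changes(func: Callable, value: str, *args) -> str:
    """Call func(value, *args); record (name, before, after) on change."""
    result = func(value, *args)
    if demo_audit_log is not None and result != value:
        demo_audit_log.append((func.__name__, value, result))
    return result


def strip_chloride(smiles: str) -> str:
    """Toy transformation acting on a SMILES-like string."""
    return smiles.replace(".[Cl-]", "")


demo_audit_log = []  # initialize the log to switch auditing on
demo_audit_changes(strip_chloride, "C[NH3+].[Cl-]")
demo_audit_changes(strip_chloride, "CCO")  # unchanged, so nothing is logged
print(demo_audit_log)
```

Only calls that actually change the representation leave a trace, which keeps the log short when most steps are no-ops.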

schrodinger.livedesign.preprocessor.getprop(getter: Callable, value: str, default: Any = None) → Any
schrodinger.livedesign.preprocessor.is_wildcard(atom)

Is atom a wildcard?

schrodinger.livedesign.preprocessor.is_queryatom_exception(atom)

Normally we raise an exception if query atoms are in the molecule to be preprocessed, but we don’t want to do that if the atom is an attachment point

Parameters

atom – the atom to check

Returns

whether or not the atom is allowed in the preprocessor

schrodinger.livedesign.preprocessor.coords_all_zero(conf)

Returns whether or not all atom positions in a conformer are zero
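
As an illustration of the check, the sketch below works on plain (x, y, z) tuples rather than an RDKit conformer. The function name is hypothetical, and comparing exactly against zero (with no tolerance) is an assumption.

```python
from typing import Iterable, Tuple


def demo_coords_all_zero(
        positions: Iterable[Tuple[float, float, float]]) -> bool:
    """Return True when every (x, y, z) position is exactly the origin."""
    return all(x == 0 and y == 0 and z == 0 for x, y, z in positions)


print(demo_coords_all_zero([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]))
print(demo_coords_all_zero([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]))
```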

schrodinger.livedesign.preprocessor.get_limited_sanitized_mol(mol: rdkit.Chem.rdchem.Mol) → rdkit.Chem.rdchem.Mol

Sanitize the molecule in a limited way, so as to avoid changing the molecule or throwing when valence errors are present. Specifically, we turn off:
  • SANITIZE_PROPERTIES – which otherwise checks valences

  • SANITIZE_CLEANUP – which can change the shape of the molecule

  • SANITIZE_CLEANUPCHIRALITY – which can remove chirality markers

  • SANITIZE_FINDRADICALS – which checks valences of radicals
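
Turning individual sanitization steps off is typically done by masking bits out of a flag set; RDKit's `Chem.SanitizeMol` accepts such a mask through its `sanitizeOps` argument. The sketch below shows only the masking idea, using `DemoSanitize`, a hypothetical stand-in flag enum with arbitrary values, so that it runs without RDKit.

```python
from enum import IntFlag, auto


class DemoSanitize(IntFlag):
    """Hypothetical stand-in for a sanitization flag set (values arbitrary)."""
    CLEANUP = auto()
    PROPERTIES = auto()
    KEKULIZE = auto()
    FINDRADICALS = auto()
    SETAROMATICITY = auto()
    CLEANUPCHIRALITY = auto()


# Union of every step, i.e. the "do everything" mask.
ALL_STEPS = DemoSanitize(0)
for flag in DemoSanitize:
    ALL_STEPS |= flag

# The four steps named in the description above.
SKIPPED = (DemoSanitize.PROPERTIES | DemoSanitize.CLEANUP
           | DemoSanitize.CLEANUPCHIRALITY | DemoSanitize.FINDRADICALS)

# XOR clears the skipped bits because SKIPPED is a subset of ALL_STEPS.
LIMITED = ALL_STEPS ^ SKIPPED

print(bool(LIMITED & DemoSanitize.KEKULIZE))    # step still enabled
print(bool(LIMITED & DemoSanitize.PROPERTIES))  # step disabled
```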

schrodinger.livedesign.preprocessor.setup_mol(mol)

Setup on a molecule that is always done regardless of configuration.

Parameters

mol – An unsanitized RDKit Mol

Returns

A partially sanitized RDKit mol, ready for the standardizer.

schrodinger.livedesign.preprocessor.check_kekulization(mol, options)
schrodinger.livedesign.preprocessor.assign_zero_coords_chirality(mol)

Molecules with all-zero coordinates need to have the “chirality tags” primed from the atom parity properties. Once these are in place, we remove the conformer, and leave the mol in a state equivalent to one that came from a SMILES input.

schrodinger.livedesign.preprocessor.correct_sgroup_coordinates(mol)

If coordinates are generated, make sure the FIELDDISP property of each Sgroup uses relative coordinates.

schrodinger.livedesign.preprocessor.preprocess_molblock(molblock: str, config: Optional[dict] = None, preserved_data_sgroups: Optional[List[str]] = None) → str

Standardizes an MDL molblock

Parameters
  • molblock – input molblock

  • config – dict specifying preprocessor options

  • preserved_data_sgroups – list of sgroup names to preserve

Returns

standardized molblock

schrodinger.livedesign.preprocessor.preprocess(mol: rdkit.Chem.rdchem.Mol, options: Optional[schrodinger.livedesign.preprocessor.PreprocessorOptions] = None, preserved_data_sgroups: Optional[List[str]] = None) → rdkit.Chem.rdchem.Mol

Standardizes an RDKit mol

Parameters
  • mol – input mol

  • options – preprocessor options

  • preserved_data_sgroups – list of sgroup names to preserve

Returns

standardized mol

exception schrodinger.livedesign.preprocessor.BlindedCompoundError

Bases: ValueError

schrodinger.livedesign.preprocessor.assert_not_blinded(mol: rdkit.Chem.rdchem.Mol, max_num_atoms: Optional[int] = None)

Checks incoming mol to confirm it has real atoms; if it doesn’t, it may have been intentionally stripped by the caller. LiveDesign marks these structures as having been “blinded”, meaning a customer may have IP/legal restrictions, or there’s a delay in registering the structure despite having assay data available. Currently, LiveDesign handles these structures by keeping a row in the LiveReport, but without an associated SDF or image. This is different from other registration errors, which are simply archived.

Parameters
  • mol – RDKit Mol to consider

  • max_num_atoms – maximum allowed number of atoms
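
Since BlindedCompoundError derives from ValueError, a caller that wants to treat blinded structures differently from other registration errors has to catch the subclass first. The sketch below illustrates that ordering with hypothetical stand-ins; it represents a molecule as a list of atom symbols, and it assumes "no real atoms" simply means an empty list.

```python
from typing import Sequence


class DemoBlindedCompoundError(ValueError):
    """Stand-in mirroring the documented ValueError subclass."""


def demo_assert_not_blinded(atom_symbols: Sequence[str]) -> None:
    """Raise when the structure carries no atoms (assumed criterion)."""
    if len(atom_symbols) == 0:
        raise DemoBlindedCompoundError("structure has no atoms")


def demo_register(atom_symbols: Sequence[str]) -> str:
    """Toy registration flow distinguishing the two failure modes."""
    try:
        demo_assert_not_blinded(atom_symbols)
        if "?" in atom_symbols:  # toy stand-in for any other failure
            raise ValueError("unsupported atom")
    except DemoBlindedCompoundError:
        # Must come before ValueError, or it would never be reached.
        return "blinded: keep row without structure"
    except ValueError:
        return "archived: other registration error"
    return "registered"


print(demo_register([]))
print(demo_register(["C", "?"]))
print(demo_register(["C", "O"]))
```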

schrodinger.livedesign.preprocessor.assert_not_query(mol: rdkit.Chem.rdchem.Mol)

Checks incoming mol to confirm there are no query features present on atoms or bonds, which would otherwise make it not compatible with registration.

Parameters

mol – RDKit Mol to consider

schrodinger.livedesign.preprocessor.get_atoms_mapping(mol_atoms)
schrodinger.livedesign.preprocessor.get_bonds_mapping(mol, original_bond_mapping, atom_idx_mapping)
schrodinger.livedesign.preprocessor.tag_mol_indexes(mol)

Tag atoms with the initial indexes on the mol and create a bond mapping. Bonds are less stable than atoms, so we create an external mapping to the atoms they bind
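
The idea of tagging atoms and keeping bonds in an external mapping can be sketched with plain dicts and tuples. All names and data structures below are hypothetical; the real functions operate on RDKit objects, and their exact mapping layout is not documented here.

```python
from typing import Dict, List, Tuple


def demo_tag_indexes(atoms: List[dict],
                     bonds: List[Tuple[int, int]]) -> Dict[int, Tuple[int, int]]:
    """Tag each atom with its initial index; map bond index -> atom pair."""
    for idx, atom in enumerate(atoms):
        atom["orig_idx"] = idx
    return {bond_idx: pair for bond_idx, pair in enumerate(bonds)}


atoms = [{"symbol": "C"}, {"symbol": "O"}, {"symbol": "H"}, {"symbol": "N"}]
bonds = [(0, 1), (1, 2), (0, 3)]  # (begin atom index, end atom index)
original_bond_mapping = demo_tag_indexes(atoms, bonds)

# Simulate a transformation that deletes the hydrogen (initial index 2).
atoms_after = [atom for atom in atoms if atom["symbol"] != "H"]

# Initial index -> current index, recovered from the tags on the atoms.
atom_idx_mapping = {a["orig_idx"]: new for new, a in enumerate(atoms_after)}

# A bond survives only if both of its atoms survived; re-express it in
# current atom indexes.
surviving_bonds = {
    bond_idx: (atom_idx_mapping[i], atom_idx_mapping[j])
    for bond_idx, (i, j) in original_bond_mapping.items()
    if i in atom_idx_mapping and j in atom_idx_mapping
}

print(atom_idx_mapping)
print(surviving_bonds)
```

The tags travel with the atoms through deletions and reordering, while bonds are looked up through the atoms they connect, which matches the motivation given above.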

schrodinger.livedesign.preprocessor.check_attachment_points_changed(sg, atom_idx_mapping)
schrodinger.livedesign.preprocessor.check_cstate_changed(sg, bond_idx_mapping)
schrodinger.livedesign.preprocessor.update_sgroup_indexes(sg, sg_atoms, sg_parent_atoms, sg_bonds, atom_idx_mapping, bond_idx_mapping)
schrodinger.livedesign.preprocessor.update_mol_groups(mol, stereo_groups, substance_groups, original_bond_mapping)

Update atoms and bonds in stereo and substance groups, dropping any atoms/groups that are no longer valid for the current state of the mol.

schrodinger.livedesign.preprocessor.update_sgroups(mol, substance_groups, atom_idx_mapping, bond_idx_mapping)

Update SGroups to reflect the transformations done on mol, updating with new atom and bond indexes, as well as atoms that might have been added or removed.

schrodinger.livedesign.preprocessor.update_stereo_groups(mol, stereo_groups, atom_idx_mapping)
schrodinger.livedesign.preprocessor.add_explicit_hydrogens(mol, only_on_hetero=False)
schrodinger.livedesign.preprocessor.remove_explicit_hydrogens(mol, sgroups, keep_wedged=False, keep_hetero=False)
schrodinger.livedesign.preprocessor.convert_to_molblock(mol, options=None)

Convert processed mol into a molblock and make necessary updates.

schrodinger.livedesign.preprocessor.convert_heavy_hydrogens(molblock)

NOTE that this operates on a molblock, not a molecule

The RDKit does not currently (v2020.03) support writing D or T to mol blocks, so we need to post-process the text. Fortunately it’s an easy regex in V3000 mol blocks. This does not work with V2000 mol blocks, so we throw a ValueError there. This doesn’t seem like a big deal since V2000 support is primarily being kept around for debugging purposes. If we need to eventually support V2000+HEAVY_HYDROGEN_DT, some not-completely-trivial code will need to be written.
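
A regex of the kind described might look like the sketch below. It is not the module's actual implementation: the function name is hypothetical, and it assumes heavy hydrogens appear in V3000 atom lines as element H carrying a MASS=2 (deuterium) or MASS=3 (tritium) property.

```python
import re

# Assumption: an atom line reads "M  V30 <idx> H <x> <y> <z> <aamap> ... MASS=n".
_HEAVY_H = re.compile(
    r"^(M  V30 \d+ )H( .*?) MASS=([23])(?= |$)", re.MULTILINE)


def demo_convert_heavy_hydrogens(molblock: str) -> str:
    """Rewrite H/MASS=2 as D and H/MASS=3 as T in a V3000 mol block."""
    if "V3000" not in molblock:
        raise ValueError("only V3000 mol blocks are supported")

    def _swap(match: "re.Match") -> str:
        symbol = "D" if match.group(3) == "2" else "T"
        # Drop the MASS property and substitute the element symbol.
        return f"{match.group(1)}{symbol}{match.group(2)}"

    return _HEAVY_H.sub(_swap, molblock)


sample = "\n".join([
    "  0  0  0     0  0            999 V3000",
    "M  V30 BEGIN ATOM",
    "M  V30 1 C 0 0 0 0",
    "M  V30 2 H 1.1 0 0 0 MASS=2",
    "M  V30 3 H -1.1 0 0 0 MASS=3",
    "M  V30 4 H 0 1.1 0 0",
    "M  V30 END ATOM",
])
converted = demo_convert_heavy_hydrogens(sample)
print(converted)
```

Ordinary hydrogens (no MASS property) and all non-hydrogen atoms are left as they are.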

schrodinger.livedesign.preprocessor.neutralize(mol, checkForProblematicHs=False)
schrodinger.livedesign.preprocessor.unicode_to_str(unicode_str)

Takes a unicode object and converts it to a str (utf-8). If the arg is already a str, returns unicode_str (i.e. if run with python 3). Needed to support python 2/3 with unicode_literals.

  • py2: type<unicode> -> type<str utf-8>

  • py3: type<str utf-8> (no unicode type exists)

Parameters

unicode_str (unicode (py2) or str (py3)) – the unicode that potentially needs converting (i.e. if run with python 2)

Returns

str

schrodinger.livedesign.preprocessor.transform(mol, transformation)

Apply the transformation to the molecule repeatedly until it no longer applies.

The maxTransformations argument is just there to prevent us from ending up in an infinite loop due to a bogus transformation.

Please note that running transformations may alter the stereochemistry of mol, so a stereo recalculation from coordinates might be required.
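
The apply-until-stable loop with an iteration cap can be sketched on strings instead of molecules. The names below are hypothetical, the cap of 100 is arbitrary, and raising RuntimeError when the cap is reached is an assumption; this reference does not say what the real function does in that case.

```python
from typing import Callable, Optional


def demo_transform(state: str,
                   apply_once: Callable[[str], Optional[str]],
                   max_transformations: int = 100) -> str:
    """Apply apply_once until it returns None (no longer applicable)."""
    for _ in range(max_transformations):
        new_state = apply_once(state)
        if new_state is None:
            return state
        state = new_state
    # Assumed behavior: give up loudly rather than loop forever.
    raise RuntimeError("transformation still applies after the limit")


def collapse_double_dash(text: str) -> Optional[str]:
    """Toy rule: rewrite the first '--' to '-'; None when nothing matches."""
    return text.replace("--", "-", 1) if "--" in text else None


print(demo_transform("a----b", collapse_double_dash))
```

A rule whose output always matches its own pattern again (for example, one that appends to the input) would never terminate on its own, which is the situation the cap guards against.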

schrodinger.livedesign.preprocessor.in_xy_plane(mol)
schrodinger.livedesign.preprocessor.generate_coordinates(mol, align=False)
schrodinger.livedesign.preprocessor.is_polymer(s_group)
schrodinger.livedesign.preprocessor.clean_up_polymer_brackets(mol, revert_to_mol=None, keep_existing_brackets=False)

Add polymer brackets back to mol.

Parameters
  • mol – RDKit mol to add polymer brackets to

  • revert_to_mol – RDKit mol to revert to if polymer brackets cannot be added correctly to provided mol. This will occur when brackets cross more than one bond.

  • keep_existing_brackets – whether to recalculate the positions of brackets that are already present

Returns

RDKit mol with polymer brackets

schrodinger.livedesign.preprocessor.copy_lewis_structure_and_hydrogens(st, mol)

Applies bond orders and charges from st to mol. Updates #implicitH to match

Assumes st includes all hydrogens. Adds implicit and explicit hydrogens to the mol, but does not add any graph hydrogens. May remove graph hydrogens.

schrodinger.livedesign.preprocessor.generate_canonical_tautomer(mol)
schrodinger.livedesign.preprocessor.clear_wedge_bonds_from_achiral_centers(mol)
schrodinger.livedesign.preprocessor.calculate_enhanced_stereo(mol, enh_stereo_default_grouping, initial_chiral_flag)
schrodinger.livedesign.preprocessor.strip_stereo_and(input_mol)

Removes any Stereo AND groups with only one center and flattens the bonds around it

Parameters

input_mol – The original molecule to consider

Returns

post-processed molecule, if the input molecule was modified

schrodinger.livedesign.preprocessor.frag_is_smaller(atoms, largest_atoms, weight, largest_weight, smiles, largest_smiles)

A fragment is considered larger if it has more atoms or a greater weight, if its SMILES string is longer, or, when the SMILES strings are of equal length, if its SMILES string is lexicographically smaller; i.e., ‘AAA’ is larger than ‘AAB’. Hence the final smiles > largest_smiles check, which rejects the fragment.
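
One way to read this comparison is as a cascade of tie-breakers, sketched below. The function name is hypothetical, and the order in which the criteria are tried (atoms, then weight, then SMILES length, then SMILES text) is an assumption taken from the criteria listed under remove_fragments.

```python
def demo_frag_is_smaller(atoms: int, largest_atoms: int,
                         weight: float, largest_weight: float,
                         smiles: str, largest_smiles: str) -> bool:
    """Return True when the candidate loses to the current largest."""
    if atoms != largest_atoms:
        return atoms < largest_atoms
    if weight != largest_weight:
        return weight < largest_weight
    if len(smiles) != len(largest_smiles):
        return len(smiles) < len(largest_smiles)
    # Equal length: the lexicographically later string is the smaller one.
    return smiles > largest_smiles


print(demo_frag_is_smaller(3, 3, 10.0, 10.0, "AAB", "AAA"))
print(demo_frag_is_smaller(3, 3, 10.0, 10.0, "AAA", "AAB"))
```

Note that two fully identical fragments are not "smaller" than each other under this sketch, so a caller has to detect that case separately.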

schrodinger.livedesign.preprocessor.connect_variable_attachment_points(mol)

Forms zero-order bonds between one of the atoms of a bond with an ATTACH property and the “main” molecule, so that the molecule plus variable attachment point is treated as a single fragment.

Returns a 2-tuple with:
  1. the modified molecule

  2. whether or not the molecule was modified

schrodinger.livedesign.preprocessor.remove_fragments(mol, substance_groups)

Fragments are not removed if the molecule contains any SGroups which are associated with polymers

Use the following criteria to remove unwanted fragments from mol:
  1. keep only the fragment with the greatest number of atoms

  2. break ties by keeping only fragments with the greatest molecular weight

  3. break ties with the longest SMILES string

  4. break additional ties by keeping the fragment with the earliest alpha sorted SMILES string

If two or more identical fragments remain after 1-4, we will throw a fatal error.
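
Criteria 1-4 can be expressed as a single sort key, as in the sketch below. `DemoFragment` and `demo_pick_fragment` are hypothetical names, the fragment data are illustrative, and raising ValueError for the "fatal error" case is an assumption.

```python
from typing import List, NamedTuple


class DemoFragment(NamedTuple):
    """Hypothetical summary of one fragment."""
    num_atoms: int
    weight: float
    smiles: str


def demo_pick_fragment(fragments: List[DemoFragment]) -> DemoFragment:
    """Pick the fragment to keep according to criteria 1-4."""
    if not fragments:
        raise ValueError("no fragments to choose from")
    ranked = sorted(
        fragments,
        # Negated terms sort descending; the SMILES text sorts ascending.
        key=lambda f: (-f.num_atoms, -f.weight, -len(f.smiles), f.smiles))
    if len(ranked) > 1 and ranked[0] == ranked[1]:
        raise ValueError("identical fragments remain after tie-breaking")
    return ranked[0]


fragments = [
    DemoFragment(1, 35.45, "[Cl-]"),
    DemoFragment(3, 46.07, "CCO"),
    DemoFragment(3, 46.07, "COC"),
]
print(demo_pick_fragment(fragments))
```

Here the two three-atom fragments tie on atom count, weight and SMILES length, so the alphabetically earlier SMILES decides.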

schrodinger.livedesign.preprocessor.remove_properties(mol)
schrodinger.livedesign.preprocessor.strip_salts(mol, salt_list)
schrodinger.livedesign.preprocessor.apply_transformations(mol, transformations)

Apply the given list of transformations, and recalculate stereo if at least one transformation applies.

schrodinger.livedesign.preprocessor.add_chiral_hs(mol)
schrodinger.livedesign.preprocessor.wedge_clean(mol, wedge_2_bonds_if_possible)
schrodinger.livedesign.preprocessor.remove_wiggly_bonds_around_double_bonds(mol)