schrodinger.application.transforms.filters module
- class schrodinger.application.transforms.filters.SubstructureFilter(filters: List[SingleSmartsFilter])
Bases: PTransform
A PTransform that returns structures or molecules that match every SingleSmartsFilter in filters.
Note: SMARTS patterns in the filters should be compatible with implicit H, since structures are converted to Mol objects with implicit H before filtering.
Example usage:
>>> import apache_beam as beam
>>> from rdkit import Chem
>>> from schrodinger.application.transforms.filters import SubstructureFilter
>>> from schrodinger.structutils.filter import SingleSmartsFilter
>>> filters = [SingleSmartsFilter(
...     smarts='Br', min_matches=0, max_matches=1, name='Bromine')]
>>> with beam.Pipeline() as p:
...     output = (p
...         | beam.Create(['CC', 'CBr', 'BrCBr'])
...         | beam.Map(lambda smiles: Chem.MolFromSmiles(smiles))
...         | SubstructureFilter(filters)
...         | beam.Map(lambda mol: Chem.MolToSmiles(mol))
...         | beam.LogElements()
...     )
CC
CBr
- Parameters:
filters – The SingleSmartsFilters that must all match.
- __init__(filters: List[SingleSmartsFilter])
- Parameters:
filters – the SingleSmartsFilters that must all match
- classmethod FromFilterFile(path: Union[str, Path]) → Self
Load substructure filters from an optionally encrypted file.
The file should contain one filter per line, with the format:
<smarts> <min_matches> <max_matches>[ <name>]
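The line format above can be illustrated with a small parser. This is only a sketch for unencrypted text: `FilterSpec` and `parse_filter_line` are hypothetical names, not part of the schrodinger API, and treating the match counts as integers and allowing spaces in the name are assumptions.

```python
from typing import NamedTuple, Optional


class FilterSpec(NamedTuple):
    """Hypothetical container for one parsed filter line."""
    smarts: str
    min_matches: int
    max_matches: int
    name: Optional[str]


def parse_filter_line(line: str) -> FilterSpec:
    """Parse one '<smarts> <min_matches> <max_matches>[ <name>]' line."""
    # Split into at most four fields so that a name containing spaces
    # stays in one piece (an assumption about the format).
    fields = line.strip().split(maxsplit=3)
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got: {line!r}")
    smarts = fields[0]
    min_matches = int(fields[1])
    max_matches = int(fields[2])
    # The name is optional, so it is None when only three fields are given.
    name = fields[3] if len(fields) == 4 else None
    return FilterSpec(smarts, min_matches, max_matches, name)


print(parse_filter_line("Br 0 1 Bromine"))
print(parse_filter_line("[N+] 0 0"))
```

The actual file handling, including encryption, is done by FromFilterFile and writeFilterFile; the sketch only shows how the fields of one line relate to the SingleSmartsFilter arguments used in the example above.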
- writeFilterFile(path: pathlib.Path | str) → Path
Write substructure filters to an optionally encrypted file.
- normalize(mols: list[rdkit.Chem.rdchem.Mol])
Modify the substructure filter by adjusting the single SMARTS filters to prevent them from rejecting the provided molecules.
- Parameters:
mols – the molecules to adjust the substructure filter for
- exclude(smarts: list[str])
Modify the substructure filter so that any molecule that matches any SMARTS pattern in the smarts argument will be excluded from the output.
- Parameters:
smarts – the SMARTS patterns
- class schrodinger.application.transforms.filters.PropertySpaceFilter(property_ranges: Dict[str, List[float]], uncharge: bool = False)
Bases: PTransform
A PTransform that returns structures or molecules based on RDKit property ranges, with the option to uncharge the input before filtering.
- Parameters:
property_ranges – Dictionary containing property names as keys and lists of two floats as values, representing the minimum and maximum values for the property range.
Possible properties include all rdkit descriptors. This includes all descriptors in the following rdkit modules:
For a comprehensive list of possible properties, see schrodinger.rdkit.descriptors.DESCRIPTORS_DICT.
Example:
>>> import apache_beam as beam
>>> from rdkit import Chem
>>> from schrodinger.application.transforms.filters import PropertySpaceFilter
>>> property_ranges = {
...     'MolWt': [0, 100],  # from rdkit's Lipinski descriptors
...     'NumAromaticRings': [1, 1]
... }
>>> smiles = ['c1ccccc1', 'Brc1ccccc1', 'CC']
>>> with beam.Pipeline() as p:
...     output = (p
...         | beam.Create(smiles)
...         | beam.Map(lambda smiles: Chem.MolFromSmiles(smiles))
...         | PropertySpaceFilter(property_ranges)
...         | beam.Map(lambda mol: Chem.MolToSmiles(mol))
...         | beam.LogElements()
...     )
c1ccccc1
- __init__(property_ranges: Dict[str, List[float]], uncharge: bool = False)
- expand(molecules)
- class schrodinger.application.transforms.filters.StructurePropertyFilter(property_ranges: Dict[str, List[float]])
Bases: PTransform
A PTransform that rejects structures that have one or more property values outside the allowed range as defined by property_ranges.
Properties that are not on the structure will not be used as filters.
- __init__(property_ranges: Dict[str, List[float]])
- Parameters:
property_ranges – the property ranges to filter on
- expand(pcoll)
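The rejection rule described for StructurePropertyFilter can be sketched in plain Python. This is not the library's implementation: `within_ranges` is a hypothetical helper, the property names are made up for illustration, and treating the range bounds as inclusive is an assumption.

```python
from typing import Dict, List


def within_ranges(properties: Dict[str, float],
                  property_ranges: Dict[str, List[float]]) -> bool:
    """Return True if no property present on the structure is out of range."""
    for name, (low, high) in property_ranges.items():
        if name not in properties:
            # Properties that are not on the structure are not used as filters.
            continue
        # Assumption: both bounds are inclusive.
        if not low <= properties[name] <= high:
            return False
    return True


# Hypothetical property names, used only for illustration.
ranges = {'r_user_score': [0.0, 5.0], 'i_user_count': [1, 3]}
print(within_ranges({'r_user_score': 2.5}, ranges))
print(within_ranges({'r_user_score': 7.0, 'i_user_count': 2}, ranges))
```

In this sketch the first structure is kept even though it has no `i_user_count` property, because a missing property never causes a rejection.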
- class schrodinger.application.transforms.filters.FepAmenable(fep_references_path: Path, max_hac_diff: int = 10, core_smarts: str = '')
Bases: PTransform
A PTransform that returns molecules that have a perturbation that is amenable to FEP calculations.
A perturbation is considered acceptable if the number of heavy atoms in the perturbation from the maximum common substructure (MCS) is less than or equal to max_hac_diff.
The core_smarts parameter can be used to specify a SMARTS pattern that is used to speed up filtering by avoiding the MCS calculation if possible.
- __init__(fep_references_path: Path, max_hac_diff: int = 10, core_smarts: str = '')
- Parameters:
fep_references_path – the path to the FEP references SMILES file
max_hac_diff – the maximum number of heavy atoms not part of the maximum common substructure with molecules in the FEP references
- expand(pcoll)
- validate_mol(mol: Mol, what='Molecule') → None
- Raises:
ValueError – if mol is not considered FepAmenable
- class schrodinger.application.transforms.filters.DistinctStructures(label: Optional[str] = None)
Bases: PTransform
A PTransform that returns the unique structures based on the SMILES.
- expand(pcoll)
- class schrodinger.application.transforms.filters.TanimotoFilter(references: Iterable[Mol], threshold: float, ignored_smarts: str = '', larger_is_better: bool = True)
Bases: PTransform
A PTransform that returns molecules whose Tanimoto similarity score to at least one molecule in references is better than the threshold. What is considered better depends on the larger_is_better parameter.
The optional ignored_smarts parameter can be used to ignore certain atoms in the Tanimoto similarity calculation.
The larger_is_better parameter determines whether a similarity score larger than or equal to the threshold is required to pass the filter. (Default is True)
- __init__(references: Iterable[Mol], threshold: float, ignored_smarts: str = '', larger_is_better: bool = True)
- Parameters:
references – the molecules to compare against
ignored_smarts – the SMARTS pattern for the atoms to ignore in the Tanimoto similarity calculation
threshold – the Tanimoto similarity threshold
larger_is_better – whether a larger similarity score is better
- expand(pcoll)
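The threshold logic of TanimotoFilter can be sketched without any chemistry toolkit by representing each fingerprint as a set of "on" bit indices. This is an illustration only: `tanimoto` and `passes_filter` are hypothetical helpers, the fingerprint type actually used by the library is not stated here, and the behavior for `larger_is_better=False` (a score less than or equal to the threshold passes) is an assumption.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    # Two empty fingerprints share nothing; define their similarity as 0.
    return len(a & b) / union if union else 0.0


def passes_filter(fingerprint: Set[int],
                  reference_fingerprints: Iterable[Set[int]],
                  threshold: float,
                  larger_is_better: bool = True) -> bool:
    """Return True if the score to at least one reference is good enough."""
    for reference in reference_fingerprints:
        score = tanimoto(fingerprint, reference)
        if larger_is_better and score >= threshold:
            return True
        # Assumption: with larger_is_better=False, a score at or below
        # the threshold passes.
        if not larger_is_better and score <= threshold:
            return True
    return False


references = [{1, 2, 3, 4}, {10, 11}]
print(passes_filter({1, 2, 3}, references, threshold=0.7))
print(passes_filter({1, 2, 3}, references, threshold=0.7,
                    larger_is_better=False))
```

Here the query shares three of four bits with the first reference (score 0.75) and none with the second (score 0.0), so it passes with either setting of `larger_is_better`, each time because of a different reference.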
- class schrodinger.application.transforms.filters.ChargedNHpKaFilter(min_pka: float, max_pka: float, exclude_smarts: Optional[str] = None)
Bases: PTransform
A PTransform that only passes structures if all known pKa values of the hydrogens on a formally charged nitrogen atom fall in the min_pka to max_pka range (bounds included).
The pKa values should be stored in the r_epik_H2O_pKa atom property, as is customarily done by ligprep. If the atom property is not defined, the atom is considered to have an acceptable pKa value.
- __init__(min_pka: float, max_pka: float, exclude_smarts: Optional[str] = None)
- expand(pcoll)
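The pass rule of ChargedNHpKaFilter can be sketched as a check over the pKa values collected from the relevant atoms. This is a simplified illustration: `pka_values_acceptable` is a hypothetical helper, and it assumes the r_epik_H2O_pKa values have already been read from the hydrogens on formally charged nitrogen atoms, with None standing for an atom where the property is not defined.

```python
from typing import Iterable, Optional


def pka_values_acceptable(pkas: Iterable[Optional[float]],
                          min_pka: float,
                          max_pka: float) -> bool:
    """Return True if every known pKa lies in [min_pka, max_pka]."""
    # An undefined pKa (None) is treated as acceptable; bounds are inclusive.
    return all(pka is None or min_pka <= pka <= max_pka for pka in pkas)


print(pka_values_acceptable([8.2, None, 9.5], min_pka=7.0, max_pka=10.0))
print(pka_values_acceptable([8.2, 11.3], min_pka=7.0, max_pka=10.0))
```

A structure with no relevant atoms, or with no defined pKa values at all, passes in this sketch because there is no known value that falls outside the range.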