schrodinger.active_learning.substruct_enrichment module

class schrodinger.active_learning.substruct_enrichment.SubstructEnrichCalculator(min_ratio=0.1, min_enrich=2, p_value_cut=0.01)

Bases: object

Calculates the substructure enrichment of a query set of ligands against a reference library.

DEFAULT_MIN_RATIO = 0.1
DEFAULT_MIN_ENRICH = 2
DEFAULT_P_VALUE_CUT = 0.01
__init__(min_ratio=0.1, min_enrich=2, p_value_cut=0.01)

Initialize the calculator.

Parameters:
  • min_ratio – minimum fraction of the query set in which a substructure must appear to be considered a candidate

  • min_enrich – minimum enrichment for a substructure to be considered a candidate

  • p_value_cut – p-value cutoff for a substructure to be considered a candidate
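The following is a minimal sketch of how the three thresholds above could combine into a single candidate filter. It is not the module's implementation. The helper names (hypergeom_sf, passes_thresholds), the choice of a one-sided hypergeometric test, the treatment of the query set as a sample drawn from the library, and the inclusive comparisons are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw (assumed test, not confirmed by the docs)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def passes_thresholds(query_hits, query_size, library_hits, library_size,
                      min_ratio=0.1, min_enrich=2, p_value_cut=0.01):
    """Hypothetical filter: True if a substructure clears all three thresholds."""
    query_ratio = query_hits / query_size
    library_ratio = library_hits / library_size
    # Enrichment is taken here as the ratio of the two frequencies (assumption).
    enrichment = query_ratio / library_ratio if library_ratio > 0 else float("inf")
    p_value = hypergeom_sf(query_hits, library_size, library_hits, query_size)
    return (
        query_ratio >= min_ratio
        and enrichment >= min_enrich
        and p_value <= p_value_cut
    )


# A substructure found in 30 of 100 query ligands and 500 of 10000 library ligands.
print(passes_thresholds(30, 100, 500, 10000))
```

In this sketch a substructure that is frequent in the query set but equally frequent in the library is rejected by the enrichment test, even though it clears min_ratio.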

static get_frags_from_mol(mol)

Get the fragments from a molecule using BRICS decomposition.

Parameters:

mol – RDKit molecule object

Returns:

set of SMILES strings of the fragments
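For orientation, BRICS decomposition is available in RDKit as BRICS.BRICSDecompose. The sketch below shows that call on its own. The wrapper name brics_fragment_smiles and the example molecule are hypothetical, and any pre- or post-processing that get_frags_from_mol may apply is not shown because it is not documented here.

```python
from rdkit import Chem
from rdkit.Chem import BRICS


def brics_fragment_smiles(mol):
    """Return the BRICS fragments of an RDKit molecule as a set of SMILES strings."""
    # BRICSDecompose yields SMILES strings by default (returnMols=False).
    return set(BRICS.BRICSDecompose(mol))


# Example molecule chosen only for illustration.
mol = Chem.MolFromSmiles("CCCOCC")
fragments = brics_fragment_smiles(mol)
print(sorted(fragments))
```

The fragment SMILES produced by RDKit contain labelled dummy atoms (for example [4*]) that mark where bonds were cut.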

static remove_explicit_H(mol)
static dedup_frags(frags_count_dict)

If a fragment is a substructure of another (parent) fragment, and the substructure’s count is less than or equal to the parent’s count, then remove that substructure.

Parameters:

frags_count_dict – dictionary of SMILES strings and their counts

Returns:

dictionary of deduplicated SMILES strings and their counts
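The rule above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the function name dedup_by_parent and the is_substructure callback are hypothetical, and the toy example uses plain string containment as a stand-in for a real substructure match, which is not chemically valid for SMILES in general.

```python
def dedup_by_parent(frags_count_dict, is_substructure):
    """Drop any fragment contained in a parent whose count is at least as large.

    is_substructure(child, parent) is a caller-supplied predicate (assumption);
    a real implementation would use a cheminformatics substructure match.
    """
    kept = {}
    for frag, count in frags_count_dict.items():
        redundant = any(
            other != frag
            and is_substructure(frag, other)
            and count <= other_count
            for other, other_count in frags_count_dict.items()
        )
        if not redundant:
            kept[frag] = count
    return kept


# Toy data: string containment stands in for substructure matching.
counts = {"CC": 10, "CCO": 10, "CCN": 4, "N": 9}
result = dedup_by_parent(counts, lambda child, parent: child in parent)
print(result)
```

Here "CC" is dropped because its parent "CCO" occurs just as often, while "N" is kept because it occurs more often than its only parent "CCN" and so carries information of its own.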

calculate(query_smiles_list, library_smiles_list)

Run the substructure enrichment calculation.

Parameters:
  • query_smiles_list – list of SMILES strings of the query set

  • library_smiles_list – list of SMILES strings of the reference library

Returns:

DataFrame of the enrichment results
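A minimal usage sketch, relying only on the constructor and calculate() signatures documented in this section. The wrapper run_enrichment and the SMILES lists are hypothetical placeholders. The columns of the returned DataFrame are not documented here, so the sketch makes no assumption about them. The import is guarded because the module is only available inside a Schrödinger installation.

```python
try:
    from schrodinger.active_learning.substruct_enrichment import (
        SubstructEnrichCalculator,
    )
except ImportError:
    # Not running inside a Schrödinger Python environment.
    SubstructEnrichCalculator = None


def run_enrichment(query_smiles_list, library_smiles_list):
    """Hypothetical wrapper: build a calculator with the documented defaults and run it."""
    if SubstructEnrichCalculator is None:
        raise RuntimeError("The schrodinger package is not available.")
    calculator = SubstructEnrichCalculator(
        min_ratio=0.1, min_enrich=2, p_value_cut=0.01
    )
    return calculator.calculate(query_smiles_list, library_smiles_list)


# Placeholder inputs for illustration only.
example_query = ["c1ccccc1O", "c1ccccc1N"]
example_library = ["CCO", "CCN", "c1ccccc1O", "c1ccccc1N", "CCCC"]
```

Calling run_enrichment(example_query, example_library) inside a Schrödinger environment would return the enrichment DataFrame described above.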