schrodinger.active_learning.scaffold_methods module

schrodinger.active_learning.scaffold_methods.validate_args(args: argparse.Namespace)
schrodinger.active_learning.scaffold_methods.prepare_args(args: argparse.Namespace)
schrodinger.active_learning.scaffold_methods.scaffold_dj(input_file: str, block_size: int = 100000, smiles_column: int = 0, logger: Optional[logging.Logger] = None) → tuple[list, list]

Takes an input .smi file and runs scaffold analysis on it in parallel using the scaffold_analysis_worker script. The number of subjobs launched depends on block_size and the number of available CPUs. Returns a list of scaffold SMILES and a list of the corresponding counts of those scaffolds in the input file.

Parameters
  • input_file – path to the input .smi file

  • block_size – maximum number of ligands in one subjob, assuming enough CPUs are available.

  • smiles_column – SMILES column in the input .smi file.

  • logger – logger to report to if one of the subjobs fails.

Returns

a tuple of two lists: the scaffold SMILES and the corresponding scaffold counts.
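The block-and-merge pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the module's implementation: iter_blocks and merge_scaffold_counts are hypothetical helper names, each subjob is assumed to return a dictionary mapping scaffold SMILES to counts, and job launching and the CPU-dependent choice of subjob count are left out.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, Mapping


def iter_blocks(lines: Iterable[str], block_size: int) -> Iterator[list]:
    """Yield consecutive blocks of at most block_size lines (hypothetical helper)."""
    it = iter(lines)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def merge_scaffold_counts(per_block: Iterable[Mapping[str, int]]) -> tuple:
    """Sum per-block scaffold counts and return (smiles, counts) as parallel lists."""
    total = Counter()
    for counts in per_block:
        # Counter.update adds counts for keys that are already present.
        total.update(counts)
    smiles = list(total)
    return smiles, [total[s] for s in smiles]
```

Each block would be handed to one subjob, and the per-subjob results merged once all subjobs finish, which yields the same two-list shape that scaffold_dj is documented to return.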

schrodinger.active_learning.scaffold_methods.get_rdkit_mol(smi: str) → Optional[rdkit.Chem.rdchem.Mol]
Parameters

smi – input SMILES string

Returns

an RDKit Mol object for the input SMILES, or None if RDKit fails to generate a Mol object from it.

schrodinger.active_learning.scaffold_methods.get_scaffold_dictionary(input_file: str, smiles_column: int = 0) → dict

Takes an input .smi file and counts the distinct generic Bemis-Murcko scaffolds present in the file. Returns a dictionary with the scaffold SMILES as keys and the number of occurrences of each scaffold as values.

Parameters

input_file – name of the SMILES file to read from.

Returns

dictionary with scaffold SMILES as keys and scaffold counts as values.
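For orientation, generic Bemis-Murcko scaffold counting can be approximated with public RDKit calls. This is a minimal sketch and not necessarily how get_scaffold_dictionary is implemented: count_generic_scaffolds is a hypothetical name, it takes SMILES strings directly rather than reading a .smi file, and it assumes that unparsable SMILES are simply skipped.

```python
from collections import Counter
from typing import Iterable

from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold


def count_generic_scaffolds(smiles_list: Iterable[str]) -> dict:
    """Count generic Bemis-Murcko scaffolds for a collection of SMILES strings."""
    counts = Counter()
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # RDKit returns None when it cannot parse the SMILES.
            continue
        # Strip side chains, then replace every atom with carbon and
        # every bond with a single bond to obtain the generic scaffold.
        scaffold = MurckoScaffold.GetScaffoldForMol(mol)
        generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)
        counts[Chem.MolToSmiles(generic)] += 1
    return dict(counts)
```

In this sketch toluene (Cc1ccccc1) and pyridine (c1ccncc1) both reduce to the same six-membered carbon ring, so they are counted under a single key.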