schrodinger.application.glide_ws.optimizer module¶

Ensemble optimizer module for WScore.

This module provides a class, WScoreOptimizer, for optimizing the gbshift and offset parameters and selecting the best ensembles by enrichment.

class schrodinger.application.glide_ws.optimizer.ScoredEnsembleTuple(score, ensemble)¶

Bases: tuple

ensemble¶: Alias for field number 1

score¶: Alias for field number 0
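
Because the class derives from tuple and exposes score and ensemble as aliases for fields 0 and 1, it behaves like a named tuple. The sketch below uses a stand-in built with collections.namedtuple; the stand-in definition and the sample values are illustrative assumptions, not the module's own code.

```python
from collections import namedtuple

# Hypothetical stand-in that mirrors the documented field order.
ScoredEnsembleTuple = namedtuple("ScoredEnsembleTuple", ["score", "ensemble"])

scored = ScoredEnsembleTuple(score=0.75, ensemble=["2flr_HTL"])

# Fields are reachable by position as well as by name.
first_field = scored[0]
second_field = scored[1]

# Tuples compare element by element, so sorting orders by score first.
ranked = sorted(
    [ScoredEnsembleTuple(0.2, ["a"]), ScoredEnsembleTuple(0.9, ["b"])],
    reverse=True,
)
```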

class schrodinger.application.glide_ws.optimizer.WScoreOptimizerReceptor(complex_id: int, complex_name: str, offset: float, gbshift: float)¶

Bases: object

A “receptor” is an ensemble member, as defined by its complex_id, offset, and gbshift.

Variables

complex_id (int) – unique int derived from the input file header block
complex_name (str) – unique name from the input file header block
offset (float) – offset added to raw WScore when docking to this receptor
gbshift (float) – adjustment to mmgbsa target energy for this receptor

complex_id: int¶

complex_name: str¶

offset: float¶

gbshift: float¶

copy()¶

Returns: a copy of self
Return type: WScoreOptimizerReceptor

__init__(complex_id: int, complex_name: str, offset: float, gbshift: float) → None¶
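
The following sketch shows why copy() is useful when tuning parameters: a trial copy can be modified without touching the original receptor. ReceptorSketch is a hypothetical stand-in with the documented fields; it is not the real class, and the values are made up.

```python
from dataclasses import dataclass, replace


@dataclass
class ReceptorSketch:
    """Hypothetical stand-in for WScoreOptimizerReceptor."""

    complex_id: int
    complex_name: str
    offset: float
    gbshift: float

    def copy(self):
        # replace() with no changes builds a new object with the same fields.
        return replace(self)


original = ReceptorSketch(1, "2flr_HTL", offset=0.5, gbshift=-2.0)
trial = original.copy()
trial.offset += 0.1  # tune the copy; the original keeps its value
```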

class schrodinger.application.glide_ws.optimizer.WScoreOptimizerEnsemble(receptors=None)¶

Bases: object

An ensemble is a list of receptors (WScoreOptimizerReceptor), each of which has an offset, gbshift, and complex ID.

__init__(receptors=None)¶

Parameters: receptors (list(WScoreOptimizerReceptor)) – list of receptors to initialize ensemble (optional)

append(recep)¶

Add a receptor to the ensemble.

Parameters: recep – receptor. NOTE: added by reference.

copy()¶

Returns: a copy of self
Return type: WScoreOptimizerEnsemble

property complex_names¶

Returns: read-only view of the complex names for all the receptors in the ensemble.
Return type: tuple(str)

property complex_ids¶

Returns: read-only view of the complex ids for all the receptors in the ensemble.
Return type: tuple(int)

property offsets¶

Returns: read-only view of the offsets for all the receptors in the ensemble.
Return type: tuple(float)

property gbshifts¶

Returns: read-only view of the gbshifts for all the receptors in the ensemble.
Return type: tuple(float)

__len__()¶

__contains__(recep)¶: Check if the ensemble contains a given receptor.
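
Two behaviors described above are easy to miss: append() stores the receptor by reference, and the properties return tuples rather than the underlying list. The sketch below illustrates both with hypothetical stand-in classes (ReceptorSketch, EnsembleSketch); the real implementation may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class ReceptorSketch:
    """Hypothetical stand-in for WScoreOptimizerReceptor."""

    complex_id: int
    complex_name: str
    offset: float
    gbshift: float


class EnsembleSketch:
    """Hypothetical stand-in for WScoreOptimizerEnsemble."""

    def __init__(self, receptors=None):
        self._receptors = list(receptors) if receptors else []

    def append(self, recep):
        # Stored by reference: later changes to recep are visible here.
        self._receptors.append(recep)

    @property
    def offsets(self):
        # A fresh tuple is built on each access, so callers cannot
        # modify the ensemble through the returned value.
        return tuple(r.offset for r in self._receptors)

    def __len__(self):
        return len(self._receptors)

    def __contains__(self, recep):
        return recep in self._receptors


recep = ReceptorSketch(1, "2flr_HTL", 0.0, 0.0)
ens = EnsembleSketch()
ens.append(recep)
before = ens.offsets
recep.offset = 1.5  # mutate the receptor after it was appended
after = ens.offsets
```

If independent receptors are needed, append a copy (recep.copy() in the documented API) instead of the original object.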

class schrodinger.application.glide_ws.optimizer.WScoreOptimizerPose(gbsa_score: float = 0.0, raw_score: float = 0.0, evdw: float = 0.0, lipo: float = 0.0, phobic_enc: float = 0.0, polar_grid: float = 0.0, voidrew: float = 0.0, complex_name: str = '', complex_id: int = 0)¶

Bases: object

Data class with properties needed by the combined scoring function that are dependent on the pose.

gbsa_score: float = 0.0¶

raw_score: float = 0.0¶

evdw: float = 0.0¶

lipo: float = 0.0¶

phobic_enc: float = 0.0¶

polar_grid: float = 0.0¶

voidrew: float = 0.0¶

complex_name: str = ''¶

complex_id: int = 0¶

property target_energy¶

__init__(gbsa_score: float = 0.0, raw_score: float = 0.0, evdw: float = 0.0, lipo: float = 0.0, phobic_enc: float = 0.0, polar_grid: float = 0.0, voidrew: float = 0.0, complex_name: str = '', complex_id: int = 0) → None¶

class schrodinger.application.glide_ws.optimizer.WScoreOptimizerLigand(ligtype='decoy', title='', ncrb=0, ichg=0.0, mw=0.0, exp_dg=None, complex=None, complex_id=None, **extra_kwds)¶

Bases: object

Attributes needed by the MMGBSA merging function that are independent of the pose. Decoys and actives both use this class; the ligtype attribute distinguishes them.

__init__(ligtype='decoy', title='', ncrb=0, ichg=0.0, mw=0.0, exp_dg=None, complex=None, complex_id=None, **extra_kwds)¶

Parameters

ligtype (str) – ligand type (active, decoy, testset)
title (str) – ligand name
ncrb (int) – number of core rotatable bonds
ichg (float) – formal charge
mw (float) – molecular weight
exp_dg (float) – experimental DeltaG (only relevant for actives)
complex (str) – name of complex associated with ligand (optional, only relevant for training complexes)
complex_id (int) – index of complex associated with ligand (optional, only relevant for training complexes)

addPosesFromDict(lig, complex_names, offsets)¶

Add the ligand poses from a dict representation, which usually comes from the JSON input file. Example:

{
  "title": "fviia_2flr_HTL_lig_ref",
  "ichg": 0,
  "nconf": 3,
  "ncrb": 2,
  "gbsconf[]": [ 0.0, 0.0, 0.0 ],
  "wsconf[]": [ -2.92, -1.06, -2.0 ],
  "evdw[]": [ -1.48, -1.55, -1.50 ],
  "lipo[]": [ -6.12, -4.42, -4.22 ],
  "phobic_encs[]": [ 0.0, 0.0, 0.0 ],
  "polar_grids[]": [ -0.54, -1.17, -0.62 ],
  "internales[]": [ 0.0, 0.0, 0.0 ],
  "mw": 358.40,
  "complex": "2flr_HTL",
  "exp_dg": -8.29
}

(“complex” and “exp_dg” are only present for actives.)

Parameters

lig (dict) – ligand dict
complex_names – list of names of ensemble members
offsets – list of offsets specified in the header of the input file, used to correct the raw score for each pose to avoid double counting.
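
In the example dict, per-pose values are stored as parallel arrays under keys ending in "[]", with "nconf" giving the number of poses. The sketch below shows one way such a dict could be split into one record per pose. The helper split_poses is hypothetical, the dict is a shortened copy of the example, and the offset correction that the real method applies to the raw score is deliberately left out.

```python
ligand = {
    "title": "fviia_2flr_HTL_lig_ref",
    "nconf": 3,
    "gbsconf[]": [0.0, 0.0, 0.0],
    "wsconf[]": [-2.92, -1.06, -2.0],
    "evdw[]": [-1.48, -1.55, -1.50],
    "lipo[]": [-6.12, -4.42, -4.22],
}


def split_poses(lig):
    """Turn the parallel "name[]" arrays into one dict per pose."""
    # Keep only the array-valued keys and drop the trailing "[]".
    arrays = {key[:-2]: values for key, values in lig.items() if key.endswith("[]")}
    nconf = lig["nconf"]
    for name, values in arrays.items():
        if len(values) != nconf:
            raise ValueError(f"{name}: expected {nconf} values, got {len(values)}")
    return [
        {name: values[i] for name, values in arrays.items()}
        for i in range(nconf)
    ]


poses = split_poses(ligand)
```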

getEnsemblePoses(ensemble)¶

Get the poses for the current ligand for a given ensemble.

Return type: list(WScoreOptimizerPose)

getNativePose()¶

Return the native pose, if applicable.

Return type: WScoreOptimizerPose

class schrodinger.application.glide_ws.optimizer.WScoreOptimizer(args=None)¶

Bases: object

Optimize the gbshift and offset parameters and select the best ensembles by enrichment.

__init__(args=None)¶

Parameters: args (argparse.Namespace) – command-line arguments

property actives¶

property testset_ligs¶

property decoys¶

optimizeReceptor(receptor)¶

Optimize either the offsets or the gbshifts (not both) for a given receptor to try to reach the scoring cutoff (aka “goodness threshold”) for each parameter:

If the initial decoy score is too high (> S_cut_offset), the offsets are optimized;
if it is too low (< S_cut_mmgbsa), the gbshifts are optimized;
otherwise, no optimization is done.

Parameters: receptor (WScoreOptimizerReceptor) – receptor to optimize
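
The three-way decision described above can be sketched as a small function. The name choose_parameter and the numeric thresholds are hypothetical; the sketch also assumes S_cut_mmgbsa is not larger than S_cut_offset, which the text implies but does not state.

```python
def choose_parameter(decoy_score, s_cut_offset, s_cut_mmgbsa):
    """Return which parameter to tune, or None if the score is acceptable."""
    if decoy_score > s_cut_offset:
        return "offset"  # decoy score too high
    if decoy_score < s_cut_mmgbsa:
        return "gbshift"  # decoy score too low
    return None  # inside the acceptable window, nothing to do


# Illustrative thresholds only.
too_high = choose_parameter(12.0, s_cut_offset=10.0, s_cut_mmgbsa=2.0)
too_low = choose_parameter(1.0, s_cut_offset=10.0, s_cut_mmgbsa=2.0)
in_window = choose_parameter(5.0, s_cut_offset=10.0, s_cut_mmgbsa=2.0)
```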

optimizeGbshift(gblo, gbhi, goodness_cutoff, ensemble, resolution=1.0)¶

Individual gbshifts for each receptor are optimized by systematically sampling from a range of values to obtain the best ensemble score.

Parameters

gblo (list(float)) – low end of gbshift range to explore for each receptor
gbhi (list(float)) – high end of gbshift range to explore for each receptor
goodness_cutoff (float) – when the score is above this value, it is considered good enough and the search stops.
ensemble (WScoreOptimizerEnsemble) – ensemble to optimize
resolution (float) – increment between samples

Returns

ensemble score after optimizing gbshifts

Return type

float
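
As a rough illustration of systematic sampling with an early stop, the sketch below scans each receptor's gbshift over its own [gblo, gbhi] range in steps of resolution and keeps any value that raises the score. It is a sketch under stated assumptions, not the module's algorithm: sample_gbshifts and score_fn are hypothetical, receptors are scanned one at a time, and the toy score simply peaks at a known target.

```python
def frange(lo, hi, step):
    """Yield lo, lo + step, ... without going past hi."""
    n_steps = int((hi - lo) / step + 1e-9)
    for i in range(n_steps + 1):
        yield lo + i * step


def sample_gbshifts(gbshifts, gblo, gbhi, score_fn, goodness_cutoff, resolution=1.0):
    best = list(gbshifts)
    best_score = score_fn(best)
    for i in range(len(best)):
        for value in frange(gblo[i], gbhi[i], resolution):
            # Stop as soon as the score is good enough.
            if best_score > goodness_cutoff:
                return best, best_score
            trial = best[:i] + [value] + best[i + 1:]
            trial_score = score_fn(trial)
            if trial_score > best_score:
                best, best_score = trial, trial_score
    return best, best_score


# Toy score: highest (zero) when the gbshifts hit the target exactly.
target = [2.0, -1.0]


def toy_score(gbshifts):
    return -sum((g - t) ** 2 for g, t in zip(gbshifts, target))


best, best_score = sample_gbshifts(
    [0.0, 0.0], gblo=[-3.0, -3.0], gbhi=[3.0, 3.0],
    score_fn=toy_score, goodness_cutoff=-0.5,
)
```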

run()¶

Run the ensemble optimization.

Returns: list of best ensembles
Return type: list(WScoreOptimizerEnsemble)

getEnsembleSizeRange(complete_ensemble_size)¶

Get the allowed lower and upper limits of the ensemble size, based on (1) the user-defined ensemble size and (2) the size of the complete ensemble.

Parameters: complete_ensemble_size (int) – size of the complete ensemble
Returns: (min_size, max_size), the lower and upper size limits for ensemble selection
Return type: (int, int)

getCompleteEnsemble()¶

Generate an ensemble consisting of every complex specified in the input file or on the command line, with the offset and gbshift for each complex optimized independently.

Return type: WScoreOptimizerEnsemble

scoreLigand(lig, ensemble)¶

Get the WScore, the MMGBSA correction, and the name of the complex for the best receptor for a single ligand using a given ensemble.

Return type: (float, float, str)

optimizeOffsets(max_offset_shift, goodness_cutoff, ensemble, delta_offset=0.1)¶

Attempt to improve the ensemble score by sampling the offset of each receptor one at a time.

Parameters

max_offset_shift (float) – size of the offset range to sample [current_offset..current_offset+max_offset_shift]
goodness_cutoff (float) – when the score is below this value, it is considered good enough and the search stops.
ensemble (WScoreOptimizerEnsemble) – ensemble to optimize
delta_offset (float) – step size when sampling offset values
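
The sketch below illustrates one-at-a-time offset sampling over [current_offset..current_offset+max_offset_shift], stopping once the score drops below goodness_cutoff. The function sample_offsets, the score_fn callable, and the toy score are hypothetical; how the real method scores an ensemble is not shown here.

```python
def sample_offsets(offsets, max_offset_shift, score_fn, goodness_cutoff, delta_offset=0.1):
    best = list(offsets)
    best_score = score_fn(best)
    n_steps = int(max_offset_shift / delta_offset + 1e-9)
    for i in range(len(best)):
        start = best[i]  # the range is anchored at the current offset
        for k in range(n_steps + 1):
            # Lower is better here; stop once the score is good enough.
            if best_score < goodness_cutoff:
                return best, best_score
            trial = list(best)
            trial[i] = start + k * delta_offset
            trial_score = score_fn(trial)
            if trial_score < best_score:
                best, best_score = trial, trial_score
    return best, best_score


# Toy score: total distance of the offsets from 0.5.
def toy_score(offsets):
    return sum(abs(o - 0.5) for o in offsets)


best_offsets, best_offset_score = sample_offsets(
    [0.0, 0.0], max_offset_shift=1.0, score_fn=toy_score,
    goodness_cutoff=0.1, delta_offset=0.25,
)
```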

getLigandWScore(lig, ensemble)¶

Return the WScore from the combined scoring function for a given ligand using a given ensemble.

Return type: float

getInitialOffset(complex_id, min_offset, max_offset)¶

Choose initial offsets to minimize the difference between the raw WScore values and the experimental binding affinity for each native compound.

Parameters

complex_id (int) – complex to generate offset for
min_offset (float) – minimum allowed offset
max_offset (float) – maximum allowed offset
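
Since the offset is added to the raw WScore, the offset that brings a single native ligand's score closest to its experimental binding affinity is the difference between the two, clamped to the allowed range. The sketch below shows that arithmetic; the function name initial_offset and the sample numbers are illustrative assumptions, and the real method may combine several poses or ligands differently.

```python
def initial_offset(raw_score, exp_dg, min_offset, max_offset):
    """Offset that makes raw_score + offset as close to exp_dg as allowed."""
    ideal = exp_dg - raw_score
    # Clamp to the permitted range.
    return max(min_offset, min(max_offset, ideal))


# Illustrative numbers only.
unclamped = initial_offset(raw_score=-2.92, exp_dg=-8.29, min_offset=-10.0, max_offset=0.0)
clamped = initial_offset(raw_score=-2.92, exp_dg=-8.29, min_offset=-3.0, max_offset=0.0)
```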

getInitialGbshift(offset, complex_id, mingb, maxgb, step=0.5)¶

Choose the largest value on the supported range for a given receptor such that its native ligand is not subject to a penalty by the WScore/MMGBSA merging function. If no penalty is found within the range, default to mingb.

Parameters

offset (float) – initial offset
complex_id (int) – complex to generate gbshift for
mingb (float) – minimum allowed gbshift
maxgb (float) – maximum allowed gbshift
step (float) – gbshift sampling step

computeBedrocForEnsemble(ensemble, testset=False)¶

Compute the BEDROC area under the curve for a given ensemble.

Parameters: testset (bool) – whether to use the testset ligs instead of the actives (decoys are always used)
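
For readers unfamiliar with BEDROC, the sketch below computes it from lists of active and decoy scores as the robust initial enhancement (RIE) rescaled between its minimum and maximum possible values. It is a standalone illustration, not the code path used by this method. The assumptions are that lower (more negative) scores rank first, that alpha = 20.0 is only an illustrative default, and that actives win ties with decoys.

```python
import math


def bedroc(active_scores, decoy_scores, alpha=20.0):
    """BEDROC in [0, 1], assuming lower scores rank first."""
    labeled = [(s, True) for s in active_scores] + [(s, False) for s in decoy_scores]
    # Stable sort: on equal scores, actives (listed first) keep the better rank.
    labeled.sort(key=lambda pair: pair[0])
    total = len(labeled)
    n_act = len(active_scores)
    if n_act == 0 or n_act == total:
        raise ValueError("need at least one active and one decoy")
    ratio = n_act / total
    # 1-based ranks of the actives in the sorted list.
    ranks = [i + 1 for i, (_, is_active) in enumerate(labeled) if is_active]
    observed = sum(math.exp(-alpha * r / total) for r in ranks) / n_act
    # Average exponential weight over all ranks (geometric series).
    expected = (1.0 - math.exp(-alpha)) / (total * (math.exp(alpha / total) - 1.0))
    rie = observed / expected
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


decoy_scores = [-1.0, -2.0, -3.0, -4.0, -5.0, -6.0, -7.0, -8.0]
perfect = bedroc([-10.0, -9.0], decoy_scores)  # actives ranked first
worst = bedroc([0.0, 1.0], decoy_scores)       # actives ranked last
mixed = bedroc([-10.0, 1.0], decoy_scores)     # one first, one last
```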

getEnrichmentCalculator(ensemble, testset=False)¶

Return an enrichment calculator loaded with the scored ligands for a given ensemble.

Parameters: testset (bool) – whether to use the testset ligs instead of the actives (decoys are always used)
Return type: schrodinger.analysis.enrichment.Calculator

generateBestEnsembles(complete_ensemble, min_size, max_size)¶

Determine the best ensembles in the following manner: first, N slots (from the -nslots command-line argument) are filled with the best 1-member ensembles, as measured by enrichment. Then, each ensemble is grown by adding another receptor, and the best N 2-member ensembles are selected. The process is repeated until the maximum ensemble size is reached, while keeping track of the single best ensemble of each size.

Note however that if the input file specified required complexes, those are included unconditionally before starting the growing cycle described above.

Parameters

min_size (int) – minimum ensemble size to return
max_size (int) – maximum ensemble size to return

Returns

list with the best ensemble of each size in the range [min_size..max_size], sorted by descending enrichment, but no more than the limit given by the -n_ensembles option

Return type

list(WScoreOptimizerEnsemble)
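
The growing procedure described above resembles a beam search with N slots. The sketch below reproduces that idea on plain tuples of member names. Everything in it is a simplifying assumption: best_ensembles, score_fn, and the toy weights are hypothetical, ensembles are treated as unordered sets when removing duplicates, and a higher score stands in for better enrichment.

```python
def best_ensembles(members, score_fn, nslots, min_size, max_size,
                   required=(), n_ensembles=None):
    """Return (score, ensemble) for the best ensemble of each size."""
    base = tuple(required)  # required complexes are always included
    beam = [base]
    best_by_size = {}
    if base:
        best_by_size[len(base)] = (score_fn(base), base)
    while len(beam[0]) < max_size:
        grown = {}
        for ens in beam:
            for member in members:
                if member in ens:
                    continue
                candidate = ens + (member,)
                key = frozenset(candidate)  # same members, any order
                if key not in grown:
                    grown[key] = (score_fn(candidate), candidate)
        if not grown:
            break  # no members left to add
        ranked = sorted(grown.values(), key=lambda pair: pair[0], reverse=True)
        beam = [candidate for _, candidate in ranked[:nslots]]  # keep N slots
        best_by_size[len(beam[0])] = ranked[0]
    picked = [pair for size, pair in best_by_size.items()
              if min_size <= size <= max_size]
    picked.sort(key=lambda pair: pair[0], reverse=True)
    return picked[:n_ensembles]


# Toy score: member weights minus a penalty that grows with ensemble size.
weights = {"a": 8, "b": 6, "c": 3, "d": 1}


def toy_score(ens):
    return sum(weights[m] for m in ens) - 2 * len(ens) ** 2


result = best_ensembles(["a", "b", "c", "d"], toy_score,
                        nslots=2, min_size=2, max_size=3)
```

Keeping only N candidates per size bounds the number of ensembles scored at each step, at the cost of possibly missing a combination whose smaller subsets all scored poorly.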

growEnsemble(base_ensemble, complete_ensemble)¶

Given a base_ensemble of size N, yield all ensembles of size N+1 by adding receptors from complete_ensemble that are not already members of base_ensemble.

Returns: generator of WScoreOptimizerEnsemble

getDecoyScore(ensemble)¶

Score the goodness of the decoy distribution by evaluating the number of high-scoring decoys and weighing each decoy by the difference between the individual score and the decoy_cutoff from the command line.

Return type: float
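
One plausible reading of this description is a sum, over the decoys that score better than the cutoff, of how far each one exceeds it. The sketch below implements that reading only as an illustration. It assumes that "high-scoring" means more negative than decoy_cutoff and that the weight is the plain difference; the function name decoy_penalty and the numbers are hypothetical, and the actual formula may differ.

```python
def decoy_penalty(decoy_scores, decoy_cutoff):
    """Sum of (cutoff - score) over decoys scoring better than the cutoff."""
    return sum(decoy_cutoff - s for s in decoy_scores if s < decoy_cutoff)


# Two of the three decoys score better (more negative) than the cutoff.
penalty = decoy_penalty([-9.0, -7.5, -4.0], decoy_cutoff=-7.0)
no_penalty = decoy_penalty([-4.0, -3.0], decoy_cutoff=-7.0)
```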

computeBindingAffinityRMSD(ensemble, use_testset=False)¶

Compute the RMSD between WScore and experimental binding affinity.

Parameters

ensemble (WScoreOptimizerEnsemble) – ensemble to use for computing WScore
use_testset (bool) – if true, use testset ligands instead of active ligands to do the calculation

Returns

RMSD

Return type

float
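
The RMSD itself is the square root of the mean squared difference between predicted and experimental values. A minimal sketch, assuming the WScore and experimental values have already been paired up per ligand (the function name and numbers are illustrative):

```python
import math


def binding_affinity_rmsd(predicted, experimental):
    """Root-mean-square deviation between two equal-length sequences."""
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Differences are 1.0 and 3.0, so the RMSD is sqrt((1 + 9) / 2).
rmsd = binding_affinity_rmsd([-8.0, -6.0], [-7.0, -9.0])
```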

readJsonFile(filename, ligtype)¶

Read a JSON file with ligand and pose data.

Parameters

filename (str) – path to JSON file
ligtype (str) – ligand type (active, decoy, testset)

processJsonData(json_input_data, ligtype)¶

Incorporate the ligand and pose data from a JSON file into the WScoreOptimizer object.

Parameters

json_input_data (dict) – parsed contents of JSON file
ligtype (str) – ligand type (active, decoy, testset)

getReportData(ensemble)¶

Return a dict containing all the data that needs to be reported for a given ensemble.

Return type: dict

writeCsv(filename, ensemble)¶

Write a CSV file with aggregate and per-ligand data for a given ensemble.

getCsvData(ensemble)¶

Generator of rows suitable for writing out as CSV, including a header row and a row with summary data, as well as the data for each ligand.

Return type: generator of list
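
A generator of rows pairs naturally with csv.writer.writerows, which accepts any iterable of rows. The sketch below shows the pattern with a header row, a summary row, and one row per ligand. The column names, the summary content, and the ligand records are made up for illustration and do not reflect the real report layout.

```python
import csv
import io


def csv_rows(ligands):
    """Yield a header row, a summary row, then one row per ligand."""
    yield ["title", "ligtype", "wscore"]
    yield ["SUMMARY", "", len(ligands)]  # hypothetical summary: ligand count
    for lig in ligands:
        yield [lig["title"], lig["ligtype"], lig["wscore"]]


ligands = [
    {"title": "lig_a", "ligtype": "active", "wscore": -8.1},
    {"title": "lig_b", "ligtype": "decoy", "wscore": -5.0},
]

# Write to an in-memory buffer; a real caller would open a file instead.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(csv_rows(ligands))
lines = buffer.getvalue().splitlines()
```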

writeOutput(ensembles, filename)¶

Write a JSON file describing the optimized ensembles and parameters. If running under job control, the output filename is registered for transfer.

findNativeLigand(complex_id)¶

Given a complex ID, return its native ligand.

Return type: WScoreOptimizerLigand

schrodinger.application.glide_ws.optimizer.fmt_bedroc(bedroc)¶: Format a BEDROC for display with reasonable precision and a special case for zero.

schrodinger.application.glide_ws.optimizer.fmt_score(score)¶: Format a score for display with reasonable precision.

schrodinger.application.glide_ws.optimizer.fmt_offsets(ensemble)¶: Return a comma-separated list of offsets with reasonable precision.

schrodinger.application.glide_ws.optimizer.fmt_gbshifts(ensemble)¶: Return a comma-separated list of gbshifts with reasonable precision.

schrodinger.application.glide_ws.optimizer.fmt_ensemble(ensemble)¶: Return a comma-separated list of ensemble member names.

schrodinger.application.glide_ws.optimizer.fmt_decoys_in_top_n(decoys_in_top_n, n)¶