schrodinger.application.glide_ws.optimizer module

Ensemble optimizer module for WScore.

This module provides a class, WScoreOptimizer, for optimizing the gbshift and offset parameters and selecting the best ensembles by enrichment.

class schrodinger.application.glide_ws.optimizer.ScoredEnsembleTuple(score, ensemble)

Bases: tuple

ensemble

Alias for field number 1

score

Alias for field number 0
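The field layout above matches a two-field named tuple. The following is a minimal stand-in built with collections.namedtuple rather than imported from the module; the scores and complex names are made-up sample values.

```python
from collections import namedtuple

# Stand-in with the documented field order: score first, ensemble second.
ScoredEnsembleTuple = namedtuple("ScoredEnsembleTuple", ["score", "ensemble"])

scored = ScoredEnsembleTuple(score=0.82, ensemble=["2flr_HTL"])

# Access by name and by position refers to the same values.
assert scored.score == scored[0]
assert scored.ensemble == scored[1]

# Tuples compare element by element, so sorting orders by score first.
# "hypothetical_complex" is an invented name used only for illustration.
ranked = sorted(
    [scored, ScoredEnsembleTuple(0.91, ["hypothetical_complex"])],
    reverse=True,
)
best = ranked[0]
```

Putting the score in field 0 is what makes plain tuple sorting rank ensembles by score without a key function.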

class schrodinger.application.glide_ws.optimizer.WScoreOptimizerReceptor(complex_id: int, complex_name: str, offset: float, gbshift: float)

Bases: object

A “receptor” is an ensemble member, as defined by its complex_id, offset, and gbshift.

Variables
  • complex_id (int) – unique int derived from the input file header block

  • complex_name (str) – unique name from the input file header block

  • offset (float) – offset added to raw WScore when docking to this receptor

  • gbshift (float) – adjustment to mmgbsa target energy for this receptor

complex_id: int
complex_name: str
offset: float
gbshift: float
copy()
Returns

a copy of self

Return type

WScoreOptimizerReceptor

__init__(complex_id: int, complex_name: str, offset: float, gbshift: float) → None
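The annotated fields and generated __init__ suggest a dataclass-style record. The sketch below is a hypothetical stand-in (ReceptorSketch is not part of the module) that mirrors the documented fields and shows why copy() is useful when trying out parameter values.

```python
from dataclasses import dataclass, replace


@dataclass
class ReceptorSketch:
    """Hypothetical stand-in mirroring the documented receptor fields."""

    complex_id: int
    complex_name: str
    offset: float
    gbshift: float

    def copy(self):
        # replace() with no changes builds a new instance with equal fields.
        return replace(self)


original = ReceptorSketch(
    complex_id=1, complex_name="2flr_HTL", offset=0.0, gbshift=0.0
)
trial = original.copy()
trial.offset += 0.5  # tune the copy without touching the original
```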
class schrodinger.application.glide_ws.optimizer.WScoreOptimizerEnsemble(receptors=None)

Bases: object

An ensemble is a list of receptors (WScoreOptimizerReceptor), each of which has an offset, gbshift, and complex ID.

__init__(receptors=None)
Parameters

receptors (list(WScoreOptimizerReceptor)) – list of receptors to initialize ensemble (optional)

append(recep)

Add a receptor to the ensemble.

Parameters

recep – receptor. NOTE: added by reference.

copy()
Returns

a copy of self

Return type

WScoreOptimizerEnsemble

property complex_names
Returns

read-only view of the complex names for all the receptors in the ensemble.

Return type

tuple(str)

property complex_ids
Returns

read-only view of the complex ids for all the receptors in the ensemble.

Return type

tuple(int)

property offsets
Returns

read-only view of the offsets for all the receptors in the ensemble.

Return type

tuple(float)

property gbshifts
Returns

read-only view of the gbshifts for all the receptors in the ensemble.

Return type

tuple(float)

__len__()
__contains__(recep)

Check if the ensemble contains a given receptor.
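The interplay between by-reference storage and the read-only tuple views can be illustrated with a small stand-in. EnsembleSketch and ReceptorSketch are hypothetical names, and deciding membership by complex_id is an assumption; the real class may compare receptors differently.

```python
from dataclasses import dataclass


@dataclass
class ReceptorSketch:
    complex_id: int
    complex_name: str
    offset: float
    gbshift: float


class EnsembleSketch:
    def __init__(self, receptors=None):
        # A new list is created, but the receptor objects are shared.
        self._receptors = list(receptors) if receptors else []

    def append(self, recep):
        # Added by reference: later changes to recep are visible here.
        self._receptors.append(recep)

    @property
    def complex_names(self):
        return tuple(r.complex_name for r in self._receptors)

    @property
    def offsets(self):
        return tuple(r.offset for r in self._receptors)

    def __len__(self):
        return len(self._receptors)

    def __contains__(self, recep):
        # Assumption: two receptors match when their complex_id matches.
        return any(r.complex_id == recep.complex_id for r in self._receptors)


recep = ReceptorSketch(1, "2flr_HTL", 0.0, 0.0)
ens = EnsembleSketch()
ens.append(recep)
recep.offset = 1.5  # visible through the ensemble view
```

Because the views are rebuilt as tuples on each access, callers cannot modify the ensemble through them, but they always reflect the current receptor values.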

class schrodinger.application.glide_ws.optimizer.WScoreOptimizerPose(gbsa_score: float = 0.0, raw_score: float = 0.0, evdw: float = 0.0, lipo: float = 0.0, phobic_enc: float = 0.0, polar_grid: float = 0.0, voidrew: float = 0.0, complex_name: str = '', complex_id: int = 0)

Bases: object

Data class with properties needed by the combined scoring function that are dependent on the pose.

gbsa_score: float = 0.0
raw_score: float = 0.0
evdw: float = 0.0
lipo: float = 0.0
phobic_enc: float = 0.0
polar_grid: float = 0.0
voidrew: float = 0.0
complex_name: str = ''
complex_id: int = 0
property target_energy
__init__(gbsa_score: float = 0.0, raw_score: float = 0.0, evdw: float = 0.0, lipo: float = 0.0, phobic_enc: float = 0.0, polar_grid: float = 0.0, voidrew: float = 0.0, complex_name: str = '', complex_id: int = 0) → None
class schrodinger.application.glide_ws.optimizer.WScoreOptimizerLigand(ligtype='decoy', title='', ncrb=0, ichg=0.0, mw=0.0, exp_dg=None, complex=None, complex_id=None, **extra_kwds)

Bases: object

Attributes needed by the MMGBSA merging function that are independent of the pose. Decoys and actives both use this class; the ligtype attribute distinguishes them.

__init__(ligtype='decoy', title='', ncrb=0, ichg=0.0, mw=0.0, exp_dg=None, complex=None, complex_id=None, **extra_kwds)
Parameters
  • ligtype (str) – ligand type (active, decoy, testset)

  • title (str) – ligand name

  • ncrb (int) – number of core rotatable bonds

  • ichg (float) – formal charge

  • mw (float) – molecular weight

  • exp_dg (float) – experimental DeltaG (only relevant for actives)

  • complex (str) – name of complex associated with ligand (optional, only relevant for training complexes)

  • complex_id (int) – index of complex associated with ligand (optional, only relevant for training complexes)

addPosesFromDict(lig, complex_names, offsets)

Add the ligand poses from a dict representation, which usually comes from the JSON input file. Example:

{
  "title": "fviia_2flr_HTL_lig_ref",
  "ichg": 0,
  "nconf": 3,
  "ncrb": 2,
  "gbsconf[]": [ 0.0, 0.0, 0.0 ],
  "wsconf[]": [ -2.92, -1.06, -2.0 ],
  "evdw[]": [ -1.48, -1.55, -1.50 ],
  "lipo[]": [ -6.12, -4.42, -4.22 ],
  "phobic_encs[]": [ 0.0, 0.0, 0.0 ],
  "polar_grids[]": [ -0.54, -1.17, -0.62 ],
  "internales[]": [ 0.0, 0.0, 0.0 ],
  "mw": 358.40,
  "complex": "2flr_HTL",
  "exp_dg": -8.29
}

(“complex” and “exp_dg” are only present for actives.)

Parameters
  • lig (dict) – ligand dict

  • complex_names – list of names of ensemble members

  • offsets – list of offsets specified in the header of the input file, used to correct the raw score for each pose to avoid double counting.
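The parallel arrays in the example dict can be turned into one record per pose. The sketch below rests on several assumptions that the documentation does not confirm: that array position i corresponds to complex_names[i], that "gbsconf[]" and "wsconf[]" hold the MMGBSA and raw WScore values, and that the correction subtracts the header offset from the stored score. The names "rec_b" and "rec_c" and the offset values are invented.

```python
def poses_from_dict(lig, complex_names, offsets):
    """Build one plain dict per pose from the parallel arrays in lig."""
    poses = []
    for i, (name, offset) in enumerate(zip(complex_names, offsets)):
        poses.append({
            "complex_name": name,
            "gbsa_score": lig["gbsconf[]"][i],
            # Assumption: the stored score already includes the header
            # offset, so it is removed here to avoid counting it twice.
            "raw_score": lig["wsconf[]"][i] - offset,
            "evdw": lig["evdw[]"][i],
            "lipo": lig["lipo[]"][i],
        })
    return poses


lig = {
    "title": "fviia_2flr_HTL_lig_ref",
    "nconf": 3,
    "gbsconf[]": [0.0, 0.0, 0.0],
    "wsconf[]": [-2.92, -1.06, -2.0],
    "evdw[]": [-1.48, -1.55, -1.50],
    "lipo[]": [-6.12, -4.42, -4.22],
}
complex_names = ["2flr_HTL", "rec_b", "rec_c"]  # last two are hypothetical
offsets = [0.5, 0.0, 0.0]  # hypothetical header offsets

poses = poses_from_dict(lig, complex_names, offsets)
```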

getEnsemblePoses(ensemble)

Get the poses for the current ligand for a given ensemble.

Return type

list(WScoreOptimizerPose)

getNativePose()

Return the native pose, if applicable.

Return type

WScoreOptimizerPose

class schrodinger.application.glide_ws.optimizer.WScoreOptimizer(args=None)

Bases: object

Optimize the gbshift and offset parameters and select the best ensembles by enrichment.

__init__(args=None)
Parameters

args (argparse.Namespace) – command-line arguments

property actives
property testset_ligs
property decoys
optimizeReceptor(receptor)

Optimize the offsets _or_ gbshifts for a given receptor to try to reach the scoring cutoff (aka “goodness threshold”) for each parameter:

  • If the initial decoy score is too high (> S_cut_offset), the offsets are optimized;

  • if it is too low (< S_cut_mmgbsa), the gbshifts are optimized;

  • otherwise, no optimization is done.

Parameters

receptor (WScoreOptimizerReceptor) – receptor to optimize
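The three-way choice described above can be sketched as a small function. The function name, the string return values, and the threshold numbers are hypothetical; only the comparisons against S_cut_offset and S_cut_mmgbsa come from the description.

```python
def choose_parameter_to_optimize(decoy_score, s_cut_offset, s_cut_mmgbsa):
    """Return which parameter to tune, or None if the score is acceptable."""
    if decoy_score > s_cut_offset:
        return "offset"   # decoy score too high: tune offsets
    if decoy_score < s_cut_mmgbsa:
        return "gbshift"  # decoy score too low: tune gbshifts
    return None           # within both thresholds: leave as is


# Hypothetical thresholds chosen only for the demonstration.
too_high = choose_parameter_to_optimize(12.0, s_cut_offset=10.0, s_cut_mmgbsa=2.0)
too_low = choose_parameter_to_optimize(1.0, s_cut_offset=10.0, s_cut_mmgbsa=2.0)
in_range = choose_parameter_to_optimize(5.0, s_cut_offset=10.0, s_cut_mmgbsa=2.0)
```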

optimizeGbshift(gblo, gbhi, goodness_cutoff, ensemble, resolution=1.0)

Individual gbshifts for each receptor are optimized by systematically sampling from a range of values to obtain the best ensemble score.

Parameters
  • gblo (list(float)) – low end of gbshift range to explore for each receptor

  • gbhi (list(float)) – high end of gbshift range to explore for each receptor

  • goodness_cutoff (float) – when the score is above this value, it is considered good enough and the search stops.

  • ensemble (WScoreOptimizerEnsemble) – ensemble to optimize

  • resolution (float) – increment between samples

Returns

ensemble score after optimizing gbshifts

Return type

float
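A systematic sampling of this kind can be sketched as a grid search with early exit. This is an illustration under assumptions, not the module's implementation: it samples one receptor at a time, treats higher scores as better (consistent with the goodness_cutoff description), and takes a caller-supplied score_fn in place of the real ensemble score. toy_score is an invented objective.

```python
def optimize_gbshifts(gbshifts, gblo, gbhi, goodness_cutoff, score_fn,
                      resolution=1.0):
    """Sample each gbshift on a grid, keeping any value that raises the score."""
    best = list(gbshifts)
    best_score = score_fn(best)
    for i in range(len(best)):
        # Number of grid points in [gblo[i], gbhi[i]] at the given spacing.
        n_steps = int((gbhi[i] - gblo[i]) / resolution + 1e-9) + 1
        for k in range(n_steps):
            if best_score > goodness_cutoff:
                return best, best_score  # good enough: stop searching
            trial = list(best)
            trial[i] = gblo[i] + k * resolution
            score = score_fn(trial)
            if score > best_score:
                best, best_score = trial, score
    return best, best_score


def toy_score(g):
    # Invented objective with its maximum (0) at gbshifts (2.0, -1.0).
    return -((g[0] - 2.0) ** 2 + (g[1] + 1.0) ** 2)


full = optimize_gbshifts([0.0, 0.0], [-3.0, -3.0], [3.0, 3.0], 1e9, toy_score)
early = optimize_gbshifts([0.0, 0.0], [-3.0, -3.0], [3.0, 3.0], -3.0, toy_score)
```

With an unreachable cutoff the search visits the whole grid; with a cutoff of -3.0 it returns as soon as a sampled value pushes the score past that threshold.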

run()

Run the ensemble optimization.

Returns

list of best ensembles

Return type

list(WScoreOptimizerEnsemble)

getEnsembleSizeRange(complete_ensemble_size)

Get the allowed lower and upper limits of the ensemble size based on: 1. the user-defined ensemble size; 2. the size of the complete ensemble.

Parameters

complete_ensemble_size (int) – size of the complete ensemble

Returns

min_size, max_size: lower and upper size limit for ensemble selection

Return type

(int, int)

getCompleteEnsemble()

Generate an ensemble consisting of every complex specified in the input file / command-line, with offsets and gbshift for each complex optimized independently.

Return type

WScoreOptimizerEnsemble

scoreLigand(lig, ensemble)

Get the WScore, MMGBSA correction, and name of complex for the best receptor for a single ligand using a given ensemble.

Return type

(float, float, str)

optimizeOffsets(max_offset_shift, goodness_cutoff, ensemble, delta_offset=0.1)

Attempt to improve the ensemble score by sampling the offset of each receptor one at a time.

Parameters
  • max_offset_shift (float) – size of the offset range to sample [current_offset..current_offset+max_offset_shift]

  • goodness_cutoff (float) – when the score is below this value, it is considered good enough and the search stops.

  • ensemble (WScoreOptimizerEnsemble) – ensemble to optimize

  • delta_offset (float) – step size when sampling offset values
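Sampling the offsets one receptor at a time might look like the sketch below. It is a sketch under assumptions: lower scores are treated as better (consistent with the goodness_cutoff description for this method), the sampled range starts at each receptor's current offset, and score_fn and toy_score are hypothetical stand-ins for the real ensemble score.

```python
def optimize_offsets(offsets, max_offset_shift, goodness_cutoff, score_fn,
                     delta_offset=0.1):
    """Try offsets in [current, current + max_offset_shift] per receptor."""
    best = list(offsets)
    best_score = score_fn(best)
    n_steps = int(round(max_offset_shift / delta_offset))
    for i in range(len(best)):
        start = best[i]
        for k in range(n_steps + 1):
            if best_score < goodness_cutoff:
                return best, best_score  # good enough: stop searching
            trial = list(best)
            # Multiply rather than accumulate to limit rounding drift.
            trial[i] = start + k * delta_offset
            score = score_fn(trial)
            if score < best_score:
                best, best_score = trial, score
    return best, best_score


def toy_score(offs):
    # Invented objective with its minimum at offsets (0.3, 0.0).
    return (offs[0] - 0.3) ** 2 + (offs[1] - 0.0) ** 2


tuned, tuned_score = optimize_offsets([0.0, 0.0], 0.5, 0.0, toy_score)
```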

getLigandWScore(lig, ensemble)

Return the WScore from the combined scoring function for a given ligand using a given ensemble.

Return type

float

getInitialOffset(complex_id, min_offset, max_offset)

Choose initial offsets to minimize the difference between the raw WScore values and the experimental binding affinity for each native compound.

Parameters
  • complex_id (int) – complex to generate offset for

  • min_offset (float) – minimum allowed offset

  • max_offset (float) – maximum allowed offset

getInitialGbshift(offset, complex_id, mingb, maxgb, step=0.5)

Choose the largest value on the supported range for a given receptor such that its native ligand is not subject to a penalty by the WScore/MMGBSA merging function. If no penalty is found within the range, default to mingb.

Parameters
  • offset (float) – initial offset

  • complex_id (int) – complex to generate gbshift for

  • mingb (float) – minimum allowed gbshift

  • maxgb (float) – maximum allowed gbshift

  • step (float) – gbshift sampling step

computeBedrocForEnsemble(ensemble, testset=False)

Compute the BEDROC area under the curve for a given ensemble.

Parameters

testset (bool) – whether to use the testset ligs instead of the actives (decoys are always used)
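For reference, BEDROC can be computed from the ranks of the actives in the score-sorted list. The sketch below uses the standard definition as a normalized robust initial enhancement (RIE); it is not the module's implementation, which delegates to an enrichment calculator. The alpha value of 20 and the convention that lower scores rank first are assumptions.

```python
import math


def ranks_of_actives(active_scores, decoy_scores):
    """1-based ranks of the actives, assuming lower scores rank first."""
    labeled = [(s, True) for s in active_scores]
    labeled += [(s, False) for s in decoy_scores]
    labeled.sort(key=lambda pair: pair[0])
    return [i + 1 for i, (_, is_active) in enumerate(labeled) if is_active]


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC from active ranks; requires 0 < len(active_ranks) < n_total."""
    n = len(active_ranks)
    ra = n / n_total
    # RIE: mean exponential weight of the actives over its random expectation.
    observed = sum(math.exp(-alpha * r / n_total) for r in active_ranks) / n
    expected = (1.0 - math.exp(-alpha)) / (
        n_total * (math.exp(alpha / n_total) - 1.0)
    )
    rie = observed / expected
    # RIE when all actives are ranked first (max) or last (min).
    rie_max = (1.0 - math.exp(-alpha * ra)) / (ra * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ra)) / (ra * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


perfect = bedroc([1, 2, 3, 4, 5], n_total=100)
worst = bedroc([96, 97, 98, 99, 100], n_total=100)
ranks = ranks_of_actives([-9.0, -3.0], [-5.0, -1.0])
```

Normalizing by the best and worst achievable RIE bounds the result between 0 and 1, which makes values comparable across ensembles scored on the same ligand set.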

getEnrichmentCalculator(ensemble, testset=False)

Return an enrichment calculator loaded with the scored ligands for a given ensemble.

Parameters

testset (bool) – whether to use the testset ligs instead of the actives (decoys are always used)

Return type

schrodinger.analysis.enrichment.Calculator

generateBestEnsembles(complete_ensemble, min_size, max_size)

Determine the best ensembles in the following manner: first, N slots (from the -nslots command-line argument) are filled with the best 1-member ensembles, as measured by enrichment. Then, each ensemble is grown by adding another receptor, and the best N 2-member ensembles are selected. The process is repeated until the maximum ensemble size is reached, while keeping track of the single best ensemble of each size.

Note however that if the input file specified required complexes, those are included unconditionally before starting the growing cycle described above.

Parameters
  • min_size (int) – minimum ensemble size to return

  • max_size (int) – maximum ensemble size to return

Returns

list with the best ensemble of each size in the range [min_size..max_size], sorted by descending enrichment, but no more than the limit given by the -n_ensembles option

Return type

list(WScoreOptimizerEnsemble)
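The slot-filling procedure described above resembles a beam search. The sketch below illustrates it with plain tuples of member names and a caller-supplied score function; best_ensembles, toy_score, and the weights are hypothetical, higher scores are assumed to be better, and the -n_ensembles limit is left out.

```python
def best_ensembles(members, score_fn, nslots, min_size, max_size, required=()):
    """Grow ensembles one member at a time, keeping the nslots best per size."""
    base = tuple(required)  # required members are always included
    candidates = [m for m in members if m not in base]
    beam = [base]
    best_by_size = {}
    while len(beam[0]) < max_size:
        grown = {}
        for ens in beam:
            for m in candidates:
                if m not in ens:
                    # Sort so the same member set is only scored once.
                    new = tuple(sorted(ens + (m,)))
                    if new not in grown:
                        grown[new] = score_fn(new)
        if not grown:
            break  # no members left to add
        ranked = sorted(grown.items(), key=lambda kv: kv[1], reverse=True)
        beam = [ens for ens, _ in ranked[:nslots]]
        best_by_size[len(beam[0])] = (ranked[0][1], ranked[0][0])
    results = [
        (score, ens)
        for size, (score, ens) in best_by_size.items()
        if min_size <= size <= max_size
    ]
    return sorted(results, reverse=True)


weights = {"a": 0.5, "b": 0.3, "c": 0.1}  # invented per-member contributions


def toy_score(ens):
    # Invented stand-in for enrichment: reward members, penalize size.
    return sum(weights[m] for m in ens) - 0.15 * len(ens)


results = best_ensembles(["a", "b", "c"], toy_score, nslots=2,
                         min_size=1, max_size=3)
ordered = [ens for _, ens in results]
```

Keeping only nslots candidates per size bounds the number of ensembles scored at each step, at the cost of possibly missing a combination whose smaller subsets scored poorly.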

growEnsemble(base_ensemble, complete_ensemble)

Given a base_ensemble of size N, yield all ensembles of size N+1 by adding receptors from complete_ensemble that are not already members of base_ensemble.

Returns

generator of WScoreOptimizerEnsemble
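The growth step can be sketched as a generator over plain lists. The function name grow and the member names are hypothetical; lists of strings stand in for ensembles of receptors.

```python
def grow(base, complete):
    """Yield every list formed by adding one missing member to base."""
    for member in complete:
        if member not in base:
            # Build a new list so the base is never modified.
            yield base + [member]


grown = list(grow(["a"], ["a", "b", "c"]))
```

Yielding lazily means a caller can score each candidate as it is produced instead of holding all N+1 ensembles in memory.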

getDecoyScore(ensemble)

Score the goodness of the decoy distribution by evaluating the number of high-scoring decoys and weighting each decoy by the difference between its individual score and the decoy_cutoff from the command line.

Return type

float
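One way to read this description is as a sum of cutoff violations. The sketch below assumes that a "high-scoring" decoy is one whose score is more negative than the cutoff and that each such decoy contributes its distance from the cutoff; both the sign convention and the exact weighting are assumptions, and the sample scores are invented.

```python
def decoy_score(decoy_scores, decoy_cutoff):
    """Sum, over decoys beyond the cutoff, of their distance from it."""
    # Assumption: more negative means a better (higher-ranking) score.
    return sum(decoy_cutoff - s for s in decoy_scores if s < decoy_cutoff)


# Two of the three invented decoys score beyond the -7.0 cutoff.
penalty = decoy_score([-9.0, -7.5, -5.0], decoy_cutoff=-7.0)
```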

computeBindingAffinityRMSD(ensemble, use_testset=False)

Compute the RMSD between WScore and experimental binding affinity.

Parameters
  • ensemble (WScoreOptimizerEnsemble) – ensemble to use for computing WScore

  • use_testset (bool) – if true, use testset ligands instead of active ligands to do the calculation

Returns

RMSD

Return type

float
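The RMSD itself is the square root of the mean squared difference between predicted and experimental values. The sketch below works on plain (wscore, exp_dg) pairs rather than ligand objects; the function name and sample values are hypothetical.

```python
import math


def binding_affinity_rmsd(pairs):
    """RMSD over (wscore, exp_dg) pairs; pairs must be non-empty."""
    squared_errors = [(wscore - exp_dg) ** 2 for wscore, exp_dg in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented values: one ligand over-predicted by 1, one under-predicted by 1.
rmsd = binding_affinity_rmsd([(-8.0, -9.0), (-7.0, -6.0)])
```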

readJsonFile(filename, ligtype)

Read a JSON file with ligand and pose data.

Parameters
  • filename (str) – path to JSON file

  • ligtype (str) – ligand type (active, decoy, testset)

processJsonData(json_input_data, ligtype)

Incorporate the ligand and pose data from a JSON file into the WScoreOptimizer object.

Parameters
  • json_input_data (dict) – parsed contents of JSON file

  • ligtype (str) – ligand type (active, decoy, testset)

getReportData(ensemble)

Return a dict containing all the data that needs to be reported for a given ensemble.

Return type

dict

writeCsv(filename, ensemble)

Write a CSV file with aggregate and per-ligand data for a given ensemble.

getCsvData(ensemble)

Generator of rows suitable for writing out as CSV, including a header row and a row with summary data, as well as the data for each ligand.

Return type

generator of list
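A row generator of this shape pairs naturally with csv.writer. In the sketch below the column names, the summary row layout, and the ligand records are all invented; only the overall structure (header row, summary row, one row per ligand) follows the description.

```python
import csv
import io


def csv_rows(summary_score, ligands):
    """Yield a header row, a summary row, then one row per ligand."""
    yield ["title", "ligtype", "wscore"]       # hypothetical columns
    yield ["SUMMARY", "", summary_score]       # hypothetical summary layout
    for lig in ligands:
        yield [lig["title"], lig["ligtype"], lig["wscore"]]


ligands = [
    {"title": "fviia_2flr_HTL_lig_ref", "ligtype": "active", "wscore": -8.1},
    {"title": "decoy_001", "ligtype": "decoy", "wscore": -4.2},
]

# Write to an in-memory buffer; a real caller would open a file instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(csv_rows(0.75, ligands))
lines = buffer.getvalue().splitlines()
```

Separating row generation from writing keeps the file handling in one place and lets the same rows be reused for other output formats.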

writeOutput(ensembles, filename)

Write a JSON file describing the optimized ensembles and parameters. If running under job control, the output filename is registered for transfer.

findNativeLigand(complex_id)

Given a complex ID, return its native ligand.

Return type

WScoreOptimizerLigand

schrodinger.application.glide_ws.optimizer.fmt_bedroc(bedroc)

Format a BEDROC for display with reasonable precision and a special case for zero.

schrodinger.application.glide_ws.optimizer.fmt_score(score)

Format a score for display with reasonable precision.

schrodinger.application.glide_ws.optimizer.fmt_offsets(ensemble)

Return a comma-separated list of offsets with reasonable precision.

schrodinger.application.glide_ws.optimizer.fmt_gbshifts(ensemble)

Return a comma-separated list of gbshifts with reasonable precision.

schrodinger.application.glide_ws.optimizer.fmt_ensemble(ensemble)

Return a comma-separated list of ensemble member names.

schrodinger.application.glide_ws.optimizer.fmt_decoys_in_top_n(decoys_in_top_n, n)