schrodinger.application.glide_ws.optimizer module¶
Ensemble optimizer module for WScore.
This module provides a class, WScoreOptimizer, for optimizing the gbshift and offset parameters and selecting the best ensembles by enrichment.
- class schrodinger.application.glide_ws.optimizer.ScoredEnsembleTuple(score, ensemble)¶
Bases:
tuple
- ensemble¶
Alias for field number 1
- score¶
Alias for field number 0
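As a rough illustration of the field layout, a named tuple with the same two fields can be built with the standard library. This is a stand-in written for this example, not the module's own definition, and the sample values are made up.

```python
from collections import namedtuple

# Stand-in with the documented field order: score is field 0, ensemble is field 1.
ScoredEnsembleTuple = namedtuple("ScoredEnsembleTuple", ["score", "ensemble"])

# Hypothetical values; in practice `ensemble` would be a WScoreOptimizerEnsemble.
candidates = [
    ScoredEnsembleTuple(0.42, ("1abc",)),
    ScoredEnsembleTuple(0.71, ("1abc", "2xyz")),
]

# Tuples compare field by field, so max() looks at the score first.
best = max(candidates)
```

Putting the score first means a list of these tuples can be sorted or ranked without a custom key function.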
- class schrodinger.application.glide_ws.optimizer.WScoreOptimizerReceptor(complex_id: int, complex_name: str, offset: float, gbshift: float)¶
Bases:
object
A “receptor” is an ensemble member, as defined by its complex_id, offset, and gbshift.
- Variables
complex_id (int) – unique int derived from the input file header block
complex_name (str) – unique name from the input file header block
offset (float) – offset added to raw WScore when docking to this receptor
gbshift (float) – adjustment to mmgbsa target energy for this receptor
- complex_id: int¶
- complex_name: str¶
- offset: float¶
- gbshift: float¶
- copy()¶
- Returns
a copy of self
- Return type
WScoreOptimizerReceptor
- __init__(complex_id: int, complex_name: str, offset: float, gbshift: float) None ¶
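The typed fields and generated `__init__` suggest a dataclass-style record. A minimal sketch of such a record with a `copy()` method follows; `ReceptorSketch` is a hypothetical name used only here, and the values are illustrative.

```python
from dataclasses import dataclass, replace


@dataclass
class ReceptorSketch:
    """Hypothetical stand-in with the four documented fields."""

    complex_id: int
    complex_name: str
    offset: float
    gbshift: float

    def copy(self):
        # replace() with no overrides builds a new instance with equal fields.
        return replace(self)


original = ReceptorSketch(1, "2flr_HTL", offset=0.5, gbshift=-2.0)
trial = original.copy()
trial.offset += 0.1  # changing the copy leaves the original untouched
```

Working on a copy is convenient when trial parameter values are sampled and only the best one should be kept.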
- class schrodinger.application.glide_ws.optimizer.WScoreOptimizerEnsemble(receptors=None)¶
Bases:
object
An ensemble is a list of receptors (WScoreOptimizerReceptor), each of which has an offset, gbshift, and complex ID.
- __init__(receptors=None)¶
- Parameters
receptors (list(WScoreOptimizerReceptor)) – list of receptors to initialize ensemble (optional)
- append(recep)¶
Add a receptor to the ensemble.
- Parameters
recep – receptor. NOTE: added by reference.
- copy()¶
- Returns
a copy of self
- Return type
WScoreOptimizerEnsemble
- property complex_names¶
- Returns
read-only view of the complex names for all the receptors in the ensemble.
- Return type
tuple(str)
- property complex_ids¶
- Returns
read-only view of the complex ids for all the receptors in the ensemble.
- Return type
tuple(int)
- property offsets¶
- Returns
read-only view of the offsets for all the receptors in the ensemble.
- Return type
tuple(float)
- property gbshifts¶
- Returns
read-only view of the gbshifts for all the receptors in the ensemble.
- Return type
tuple(float)
- __len__()¶
- __contains__(recep)¶
Check if the ensemble contains a given receptor.
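The combination of "added by reference" and read-only tuple views can be sketched with a small stand-in class. `EnsembleSketch` is hypothetical, receptors are represented by `SimpleNamespace` objects, and the membership test by ordinary equality is an assumption.

```python
from types import SimpleNamespace


class EnsembleSketch:
    """Simplified stand-in for an ensemble; not the module's implementation."""

    def __init__(self, receptors=None):
        self._receptors = list(receptors) if receptors is not None else []

    def append(self, recep):
        # The receptor object itself is stored, so later changes show through.
        self._receptors.append(recep)

    @property
    def complex_names(self):
        return tuple(r.complex_name for r in self._receptors)

    @property
    def offsets(self):
        # A fresh tuple is built on each access; callers cannot mutate the list.
        return tuple(r.offset for r in self._receptors)

    def __len__(self):
        return len(self._receptors)

    def __contains__(self, recep):
        # Assumption: membership is decided by ordinary equality.
        return recep in self._receptors


recep = SimpleNamespace(complex_id=1, complex_name="2flr_HTL", offset=0.5, gbshift=-2.0)
ensemble = EnsembleSketch()
ensemble.append(recep)
snapshot = ensemble.offsets
recep.offset = 0.75  # visible through the ensemble because it holds a reference
```

An earlier snapshot of `offsets` keeps its old values, while a new access reflects the change made to the shared receptor.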
- class schrodinger.application.glide_ws.optimizer.WScoreOptimizerPose(gbsa_score: float = 0.0, raw_score: float = 0.0, evdw: float = 0.0, lipo: float = 0.0, phobic_enc: float = 0.0, polar_grid: float = 0.0, voidrew: float = 0.0, complex_name: str = '', complex_id: int = 0)¶
Bases:
object
Data class with properties needed by the combined scoring function that are dependent on the pose.
- gbsa_score: float = 0.0¶
- raw_score: float = 0.0¶
- evdw: float = 0.0¶
- lipo: float = 0.0¶
- phobic_enc: float = 0.0¶
- polar_grid: float = 0.0¶
- voidrew: float = 0.0¶
- complex_name: str = ''¶
- complex_id: int = 0¶
- property target_energy¶
- __init__(gbsa_score: float = 0.0, raw_score: float = 0.0, evdw: float = 0.0, lipo: float = 0.0, phobic_enc: float = 0.0, polar_grid: float = 0.0, voidrew: float = 0.0, complex_name: str = '', complex_id: int = 0) None ¶
- class schrodinger.application.glide_ws.optimizer.WScoreOptimizerLigand(ligtype='decoy', title='', ncrb=0, ichg=0.0, mw=0.0, exp_dg=None, complex=None, complex_id=None, **extra_kwds)¶
Bases:
object
Attributes needed for MMGBSA merging function which are independent of the pose. Decoys and Actives both use this class, with the ligtype attribute used to distinguish them.
- __init__(ligtype='decoy', title='', ncrb=0, ichg=0.0, mw=0.0, exp_dg=None, complex=None, complex_id=None, **extra_kwds)¶
- Parameters
ligtype (str) – ligand type (active, decoy, testset)
title (str) – ligand name
ncrb (int) – number of core rotatable bonds
ichg (float) – formal charge
mw (float) – molecular weight
exp_dg (float) – experimental DeltaG (only relevant for actives)
complex (str) – name of complex associated with ligand (optional, only relevant for training complexes)
complex_id (str) – index of complex associated with ligand (optional, only relevant for training complexes)
- addPosesFromDict(lig, complex_names, offsets)¶
Add the ligand poses from a dict representation, which usually comes from the JSON input file. Example:
    {
        "title": "fviia_2flr_HTL_lig_ref",
        "ichg": 0,
        "nconf": 3,
        "ncrb": 2,
        "gbsconf[]": [ 0.0, 0.0, 0.0 ],
        "wsconf[]": [ -2.92, -1.06, -2.0 ],
        "evdw[]": [ -1.48, -1.55, -1.50 ],
        "lipo[]": [ -6.12, -4.42, -4.22 ],
        "phobic_encs[]": [ 0.0, 0.0, 0.0 ],
        "polar_grids[]": [ -0.54, -1.17, -0.62 ],
        "internales[]": [ 0.0, 0.0, 0.0 ],
        "mw": 358.40,
        "complex": "2flr_HTL",
        "exp_dg": -8.29
    }
(“complex” and “exp_dg” are only present for actives.)
- Parameters
lig (dict) – ligand dict
complex_names – list of names of ensemble members
offsets – list of offsets specified in the header of the input file, used to correct the raw score for each pose to avoid double counting.
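To make the input layout concrete, the sketch below turns the parallel `[]` arrays of the example dict into one record per pose. The helper `split_per_pose` is hypothetical; how the module maps these keys onto WScoreOptimizerPose fields, and how it applies the offsets correction, is not shown here.

```python
import json

# The example ligand from the documentation above.
lig = json.loads("""
{
    "title": "fviia_2flr_HTL_lig_ref", "ichg": 0, "nconf": 3, "ncrb": 2,
    "gbsconf[]": [0.0, 0.0, 0.0],
    "wsconf[]": [-2.92, -1.06, -2.0],
    "evdw[]": [-1.48, -1.55, -1.50],
    "lipo[]": [-6.12, -4.42, -4.22],
    "phobic_encs[]": [0.0, 0.0, 0.0],
    "polar_grids[]": [-0.54, -1.17, -0.62],
    "internales[]": [0.0, 0.0, 0.0],
    "mw": 358.40, "complex": "2flr_HTL", "exp_dg": -8.29
}
""")


def split_per_pose(lig_dict):
    """Return one dict per pose from the keys that end in '[]'."""
    n_poses = lig_dict["nconf"]
    arrays = {key[:-2]: values for key, values in lig_dict.items() if key.endswith("[]")}
    for name, values in arrays.items():
        if len(values) != n_poses:
            raise ValueError(f"{name} has {len(values)} entries, expected {n_poses}")
    return [{name: values[i] for name, values in arrays.items()} for i in range(n_poses)]


poses = split_per_pose(lig)
```

Each element of `poses` then carries the per-pose terms (for example `wsconf` and `lipo`), while pose-independent values such as `mw` stay on the ligand dict.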
- getEnsemblePoses(ensemble)¶
Get the poses for the current ligand for a given ensemble.
- Return type
list(WScoreOptimizerPose)
- getNativePose()¶
Return the native pose, if applicable.
- Return type
WScoreOptimizerPose
- class schrodinger.application.glide_ws.optimizer.WScoreOptimizer(args=None)¶
Bases:
object
Optimize the gbshift and offset parameters and select the best ensembles by enrichment.
- __init__(args=None)¶
- Parameters
args (argparse.Namespace) – command-line arguments
- property actives¶
- property testset_ligs¶
- property decoys¶
- optimizeReceptor(receptor)¶
Optimize the offsets _or_ gbshifts for a given receptor to try to reach the scoring cutoff (aka “goodness threshold”) for each parameter:
If the initial decoy score is too high (> S_cut_offset), the offsets are optimized;
if it is too low (< S_cut_mmgbsa), the gbshifts are optimized;
otherwise, no optimization is done.
- Parameters
receptor (WScoreOptimizerReceptor) – receptor to optimize
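The three-way choice described for optimizeReceptor can be written as a small decision function. This is only a sketch of the branching: the function name and the threshold values are hypothetical, and it assumes `s_cut_mmgbsa` is not larger than `s_cut_offset`.

```python
def choose_parameter_to_optimize(decoy_score, s_cut_offset, s_cut_mmgbsa):
    """Return which parameter to tune for a receptor, or None for neither."""
    if decoy_score > s_cut_offset:
        return "offset"  # decoy score too high: tune the offsets
    if decoy_score < s_cut_mmgbsa:
        return "gbshift"  # decoy score too low: tune the gbshifts
    return None  # already inside the acceptable band


# Hypothetical thresholds chosen only to exercise each branch.
decisions = [choose_parameter_to_optimize(s, 10.0, 2.0) for s in (12.0, 1.0, 5.0)]
```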
- optimizeGbshift(gblo, gbhi, goodness_cutoff, ensemble, resolution=1.0)¶
Individual gbshifts for each receptor are optimized by systematically sampling from a range of values to obtain the best ensemble score.
- Parameters
gblo (list(float)) – low end of gbshift range to explore for each receptor
gbhi (list(float)) – high end of gbshift range to explore for each receptor
goodness_cutoff (float) – when the score is above this value, it is considered good enough and the search stops.
ensemble (WScoreOptimizerEnsemble) – ensemble to optimize
resolution (float) – increment between samples
- Returns
ensemble score after optimizing gbshifts
- Return type
float
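A systematic search over per-receptor ranges can be sketched as a grid search with an early exit. The helper names and the toy `score_fn` are hypothetical; the real method scores an ensemble, which is replaced here by a plain function of the gbshift values.

```python
import math
from itertools import product


def grid_axis(lo, hi, step):
    """Evenly spaced values from lo up to at most hi."""
    n_steps = int(math.floor((hi - lo) / step + 1e-9))
    return [lo + i * step for i in range(n_steps + 1)]


def grid_search_gbshifts(gblo, gbhi, goodness_cutoff, score_fn, resolution=1.0):
    """Try every combination of gbshifts; stop once the score exceeds the cutoff."""
    axes = [grid_axis(lo, hi, resolution) for lo, hi in zip(gblo, gbhi)]
    best_score, best_shifts = float("-inf"), None
    for shifts in product(*axes):
        score = score_fn(shifts)
        if score > best_score:
            best_score, best_shifts = score, shifts
        if best_score > goodness_cutoff:
            break  # good enough, no need to sample further
    return best_score, best_shifts


def toy_score(shifts):
    # Toy objective with its maximum (0) at gbshifts (1, -2).
    return -((shifts[0] - 1.0) ** 2 + (shifts[1] + 2.0) ** 2)


full = grid_search_gbshifts([0.0, -3.0], [2.0, -1.0], 10.0, toy_score)
early = grid_search_gbshifts([0.0, -3.0], [2.0, -1.0], -1.5, toy_score)
```

The number of combinations grows as the product of the per-receptor sample counts, which is why a coarse `resolution` and an early-exit cutoff matter for larger ensembles.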
- run()¶
Run the ensemble optimization.
- Returns
list of best ensembles
- Return type
list(WScoreOptimizerEnsemble)
- getEnsembleSizeRange(complete_ensemble_size)¶
Get the allowed lower and upper limits of the ensemble size based on: 1. the user-defined ensemble size; 2. the size of the complete ensemble.
- Parameters
complete_ensemble_size (int) – size of the complete ensemble
- Returns
min_size, max_size: lower and upper size limit for ensemble selection
- Return type
min_size, max_size: int, int
- getCompleteEnsemble()¶
Generate an ensemble consisting of every complex specified in the input file / command-line, with offsets and gbshift for each complex optimized independently.
- Return type
WScoreOptimizerEnsemble
- scoreLigand(lig, ensemble)¶
Get the WScore, MMGBSA correction, and name of the complex for the best receptor for a single ligand using a given ensemble.
- Return type
(float, float, str)
- optimizeOffsets(max_offset_shift, goodness_cutoff, ensemble, delta_offset=0.1)¶
Attempt to improve the ensemble score by sampling the offset of each receptor one at a time.
- Parameters
max_offset_shift (float) – size of the offset range to sample [current_offset..current_offset+max_offset_shift]
goodness_cutoff (float) – when the score is below this value, it is considered good enough and the search stops.
ensemble (WScoreOptimizerEnsemble) – ensemble to optimize
delta_offset (float) – step size when sampling offset values
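Sampling one receptor's offset at a time amounts to a coordinate-wise search. The sketch below assumes a lower score is better (consistent with stopping once the score falls below the cutoff) and uses a hypothetical `score_fn` over a plain list of offsets in place of a real ensemble.

```python
import math


def sample_offsets(offsets, max_offset_shift, goodness_cutoff, score_fn, delta_offset=0.1):
    """Scan each offset over [start, start + max_offset_shift], keeping improvements."""
    current = list(offsets)
    best_score = score_fn(current)
    n_steps = int(math.floor(max_offset_shift / delta_offset + 1e-9))
    for i, start in enumerate(offsets):
        best_value = start
        for k in range(1, n_steps + 1):
            if best_score < goodness_cutoff:
                break  # good enough, stop sampling
            current[i] = start + k * delta_offset
            score = score_fn(current)
            if score < best_score:
                best_score, best_value = score, current[i]
        current[i] = best_value  # keep the best value found for this receptor
    return best_score, current


def toy_score(values):
    # Toy objective with its minimum at offsets (0.3, 0.0).
    targets = (0.3, 0.0)
    return sum((v - t) ** 2 for v, t in zip(values, targets))


score, tuned = sample_offsets([0.0, 0.0], 0.5, -1.0, toy_score)
```

Because each receptor is scanned while the others are held fixed, the cost grows linearly with ensemble size, at the price of possibly missing improvements that require changing two offsets together.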
- getLigandWScore(lig, ensemble)¶
Return the WScore from the combined scoring function for a given ligand using a given ensemble.
- Return type
float
- getInitialOffset(complex_id, min_offset, max_offset)¶
Choose initial offsets to minimize the difference between the raw WScore values and the experimental binding affinity for each native compound.
- Parameters
complex_id (int) – complex to generate offset for
min_offset (float) – minimum allowed offset
max_offset (float) – maximum allowed offset
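For a single native ligand, and assuming the offset is simply added to the raw score, the offset that minimizes the gap to experiment is the difference between the two values clamped to the allowed range. The function below is a sketch under that assumption; its name is hypothetical and the numbers are taken from the example ligand shown earlier.

```python
def initial_offset(raw_score, exp_dg, min_offset, max_offset):
    """Offset that brings raw_score + offset as close as possible to exp_dg."""
    ideal = exp_dg - raw_score
    return min(max(ideal, min_offset), max_offset)


# Raw WScore -2.92 and experimental DeltaG -8.29, as in the example ligand.
unclamped = initial_offset(-2.92, -8.29, min_offset=-10.0, max_offset=0.0)
clamped = initial_offset(-2.92, -8.29, min_offset=-3.0, max_offset=0.0)
```

When the ideal value lies outside the allowed range, the nearest bound is the best choice available, so the second call returns `min_offset`.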
- getInitialGbshift(offset, complex_id, mingb, maxgb, step=0.5)¶
Choose the largest value on the supported range for a given receptor such that its native ligand is not subject to a penalty by the WScore/MMGBSA merging function. If no penalty is found within the range, default to mingb.
- Parameters
offset (float) – initial offset
complex_id (int) – complex to generate gbshift for
mingb (float) – minimum allowed gbshift
maxgb (float) – maximum allowed gbshift
step (float) – gbshift sampling step
- computeBedrocForEnsemble(ensemble, testset=False)¶
Compute the BEDROC area under the curve for a given ensemble.
- Parameters
testset (bool) – whether to use the testset ligs instead of the actives (decoys are always used)
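For orientation, BEDROC can be computed from a ranked list with the standard normalized form, (RIE - RIE_min) / (RIE_max - RIE_min). The sketch below is a standalone illustration, not the module's code: the function name, the default `alpha=20.0`, and the convention that lower scores rank first are all assumptions.

```python
import math


def bedroc(scores_and_labels, alpha=20.0):
    """BEDROC for (score, is_active) pairs; lower scores are ranked first."""
    ranked = sorted(scores_and_labels, key=lambda pair: pair[0])
    n_total = len(ranked)
    active_ranks = [rank for rank, (_, is_active) in enumerate(ranked, start=1) if is_active]
    n_actives = len(active_ranks)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one decoy")
    ratio = n_actives / n_total
    # Exponentially weighted sum over the ranks of the actives.
    sum_exp = sum(math.exp(-alpha * rank / n_total) for rank in active_ranks)
    # Expected weight of a single active under a uniform random ranking.
    random_weight = (1 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1) / n_total
    rie = sum_exp / (n_actives * random_weight)
    rie_max = (1 - math.exp(-alpha * ratio)) / (ratio * (1 - math.exp(-alpha)))
    rie_min = (1 - math.exp(alpha * ratio)) / (ratio * (1 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


decoys = [(-5.0 + 0.1 * i, False) for i in range(8)]
perfect = bedroc([(-10.0, True), (-9.0, True)] + decoys)
worst = bedroc([(0.0, True), (1.0, True)] + decoys)
```

The value is 1 when every active outranks every decoy and 0 when every active ranks last; larger `alpha` values put more weight on the very top of the list.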
- getEnrichmentCalculator(ensemble, testset=False)¶
Return an enrichment calculator loaded with the scored ligands for a given ensemble.
- Parameters
testset (bool) – whether to use the testset ligs instead of the actives (decoys are always used)
- Return type
schrodinger.analysis.enrichment.Calculator
- generateBestEnsembles(complete_ensemble, min_size, max_size)¶
Determine the best ensembles in the following manner: first, N slots (from the -nslots command-line argument) are filled with the best 1-member ensembles, as measured by enrichment. Then, each ensemble is grown by adding another receptor, and the best N 2-member ensembles are selected. The process is repeated until the maximum ensemble size is reached, while keeping track of the single best ensemble of each size.
Note however that if the input file specified required complexes, those are included unconditionally before starting the growing cycle described above.
- Parameters
min_size (int) – minimum ensemble size to return
max_size (int) – maximum ensemble size to return
- Returns
list with the best ensemble of each size in the range [min_size..max_size], sorted by descending enrichment, but no more than the limit given by the -n_ensembles option
- Return type
list(WScoreOptimizerEnsemble)
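The slot-filling procedure resembles a beam search: keep the N best ensembles of each size, grow each by one member, and repeat. The sketch below illustrates that idea with plain tuples of names and a toy `score_fn`; the function name and scoring are hypothetical, and the -n_ensembles limit is left out.

```python
def best_ensembles(all_members, score_fn, nslots, min_size, max_size, required=()):
    """Beam search over ensembles, tracking the best ensemble of each size."""
    beam = [tuple(sorted(required))]  # required members are always included
    best_by_size = {}
    while len(beam[0]) < max_size:
        # Grow every kept ensemble by one member; sorting merges duplicates.
        grown = {
            tuple(sorted(base + (member,)))
            for base in beam
            for member in all_members
            if member not in base
        }
        if not grown:
            break
        ranked = sorted(sorted(grown), key=score_fn, reverse=True)
        beam = ranked[:nslots]  # keep only the N best of this size
        size = len(beam[0])
        if size >= min_size:
            best_by_size[size] = beam[0]
    return sorted(best_by_size.values(), key=score_fn, reverse=True)


# Toy enrichment: sum of per-member weights minus a penalty that grows with size.
weights = {"A": 0.5, "B": 0.3, "C": 0.1}


def toy_score(ensemble):
    return sum(weights[m] for m in ensemble) - 0.15 * (len(ensemble) - 1)


result = best_ensembles(["A", "B", "C"], toy_score, nslots=2, min_size=1, max_size=3)
```

Keeping only `nslots` candidates per size avoids scoring every possible subset, at the cost of possibly discarding a combination whose members score poorly on their own.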
- growEnsemble(base_ensemble, complete_ensemble)¶
Given a base_ensemble of size N, yield all ensembles of size N+1 by adding receptors from complete_ensemble that are not already members of base_ensemble.
- Returns
generator of WScoreOptimizerEnsemble
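The growth step can be sketched as a short generator. Plain lists of names stand in for ensembles here, and `grow` is a hypothetical helper rather than the method itself.

```python
def grow(base, complete):
    """Yield each ensemble made by adding one new member of `complete` to `base`."""
    for member in complete:
        if member not in base:
            yield base + [member]  # new list; `base` itself is not modified


base = ["A"]
grown = list(grow(base, ["A", "B", "C"]))
```

Using a generator means candidate ensembles are produced one at a time, so a caller can score and discard each one without holding them all in memory.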
- getDecoyScore(ensemble)¶
Score the goodness of the decoy distribution by evaluating the number of high-scoring decoys and weighing each decoy by the difference between the individual score and the decoy_cutoff from the command line.
- Return type
float
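One possible reading of this weighting is a sum, over the decoys that score better than the cutoff, of how far each one passes it. The sketch below follows that reading and assumes that more negative scores are better; both the sign convention and the function name are assumptions, not the module's definition.

```python
def decoy_penalty(decoy_scores, decoy_cutoff):
    """Sum of (cutoff - score) over decoys scoring better (lower) than the cutoff."""
    return sum(decoy_cutoff - score for score in decoy_scores if score < decoy_cutoff)


# Two of the four hypothetical decoys score better than the cutoff of -7.0.
penalty = decoy_penalty([-9.0, -7.5, -5.0, -3.0], decoy_cutoff=-7.0)
```

Under this reading the value is zero when no decoy passes the cutoff and grows both with the number of offending decoys and with how strongly each one scores.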
- computeBindingAffinityRMSD(ensemble, use_testset=False)¶
Compute the RMSD between WScore and experimental binding affinity.
- Parameters
ensemble (WScoreOptimizerEnsemble) – ensemble to use for computing WScore
use_testset (bool) – if true, use testset ligands instead of active ligands to do the calculation
- Returns
RMSD
- Return type
float
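The quantity itself is the usual root-mean-square deviation between paired values. A minimal standalone sketch, with a hypothetical function name and made-up numbers:

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equally long sequences."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Two ligands, each predicted 1 kcal/mol away from experiment.
value = rmsd([-8.0, -6.0], [-9.0, -7.0])
```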
- readJsonFile(filename, ligtype)¶
Read a JSON file with ligand and pose data.
- Parameters
filename (str) – path to JSON file
ligtype (str) – ligand type (active, decoy, testset)
- processJsonData(json_input_data, ligtype)¶
Incorporate the ligand and pose data from a JSON file into the WScoreOptimizer object.
- Parameters
json_input_data (dict) – parsed contents of JSON file
ligtype (str) – ligand type (active, decoy, testset)
- getReportData(ensemble)¶
Return a dict containing all the data that needs to be reported for a given ensemble.
- Return type
dict
- writeCsv(filename, ensemble)¶
Write a CSV file with aggregate and per-ligand data for a given ensemble.
- getCsvData(ensemble)¶
Generator of rows suitable for writing out as CSV, including a header row and a row with summary data, as well as the data for each ligand.
- Return type
generator of list
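A row generator of this kind pairs naturally with the standard `csv` module. The column names, summary row layout, and ligand records below are hypothetical and only show the header, summary, and per-ligand pattern.

```python
import csv
import io


def csv_rows(summary_score, ligands):
    """Yield a header row, one summary row, then one row per ligand."""
    yield ["title", "ligtype", "score"]
    yield ["SUMMARY", "", summary_score]
    for lig in ligands:
        yield [lig["title"], lig["ligtype"], lig["score"]]


ligands = [
    {"title": "lig_a", "ligtype": "active", "score": -8.1},
    {"title": "lig_b", "ligtype": "decoy", "score": -4.2},
]

# Write to an in-memory buffer; a real caller would open a file instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(csv_rows(0.75, ligands))
lines = buffer.getvalue().splitlines()
```

Separating row generation from writing keeps the formatting logic testable without touching the file system.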
- writeOutput(ensembles, filename)¶
Write a JSON file describing the optimized ensembles and parameters. If running under job control, the output filename is registered for transfer.
- findNativeLigand(complex_id)¶
Given a complex ID, return its native ligand.
- Return type
WScoreOptimizerLigand
- schrodinger.application.glide_ws.optimizer.fmt_bedroc(bedroc)¶
Format a BEDROC for display with reasonable precision and a special case for zero.
- schrodinger.application.glide_ws.optimizer.fmt_score(score)¶
Format a score for display with reasonable precision.
- schrodinger.application.glide_ws.optimizer.fmt_offsets(ensemble)¶
Return a comma-separated list of offsets with reasonable precision.
- schrodinger.application.glide_ws.optimizer.fmt_gbshifts(ensemble)¶
Return a comma-separated list of gbshifts with reasonable precision.
- schrodinger.application.glide_ws.optimizer.fmt_ensemble(ensemble)¶
Return a comma-separated list of ensemble member names.
- schrodinger.application.glide_ws.optimizer.fmt_decoys_in_top_n(decoys_in_top_n, n)¶