schrodinger.application.desmond.kinetics.utils module

class schrodinger.application.desmond.kinetics.utils.RmsdSelection(aligned_weights, displaced_weights, atom_idcs)

Bases: NamedTuple

aligned_weights: List[float]

Alias for field number 0

displaced_weights: List[float]

Alias for field number 1

atom_idcs: List[int]

Alias for field number 2

class schrodinger.application.desmond.kinetics.utils.MtdInfo(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

KILL_NOW = 'kill_now'
LAST_INDEX = 'last_index'
MINDIST = 'mindist'
TAU_ESTIMATE_SEC = 'tau_estimate_sec'
SIM_TIME = 'sim_time'
TOO_SLOW = 'too_slow'
STATUS = 'status'
class schrodinger.application.desmond.kinetics.utils.MtdStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

The Status of each MtD unbinding job

OK = 'OK'
TOO_SLOW = 'Too Slow'
BOUND = 'Bound'
class schrodinger.application.desmond.kinetics.utils.ResultsCols(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: enum.Enum

Columns used in MtD summary statistics

PATH = 'Path'
NAME = 'Name'
PRED_RES_TIME = 'Pred.Res.Time(s)'
P_VALUE = 'P-value'
SIM_TIME = 'Sim.Time(ns)'
SIMS = '#Sims'
UNBOUND_SIMS = '#Unbound.Sims'
schrodinger.application.desmond.kinetics.utils.find_ligand_sites(st: schrodinger.structure._structure.Structure, asl: str, nsites: int, sampling_size: int = 5000, rng_seed: int = 30) List[int]

Find a nsites atoms which are furthest away from each other on a provided small molecule. These atoms (sites) are then used in distance calculations.

schrodinger.application.desmond.kinetics.utils.find_protein_sites(st: schrodinger.structure._structure.Structure, asl: str, area_cutoff: Optional[float] = None) List[int]

Find solvent exposed protein residues which will be used to check if the ligand is still bound to the protein.

Parameters
  • st – Input system structure.

  • asl – ASL to identify protein residues eg. ‘protein’.

  • area_cutoff – SASA threshold where residues with SASA above this value have their C-alpha added to the set of protein sites. If no cutoff is given, return all C-alpha atom indices.

Returns

A list of C-alpha atoms indices passing the area_cutoff.

schrodinger.application.desmond.kinetics.utils.plot_descriptors(traj_times: numpy.array, rmsd: numpy.array, mindist: numpy.array, cvseq_times: numpy.array = None, force_constant: numpy.array = None, endpoints: Tuple[Optional[int], Optional[int]] = (None, None), rmsd_limits: Tuple[Optional[float], Optional[float]] = (None, None), mindist_limits: Tuple[Optional[float], Optional[float]] = (None, None))

Plot descriptors from a RAMD trajectory. Saved to CWD as “trj_descriptors.png”.

Parameters
  • traj_times – Array of times in ns corresponding to trajectory frames.

  • rmsd – Array of ligand RMSDs. Must be the same length as traj_times.

  • mindist – Array of receptor-ligand minimum distances. Must be the same length as traj_times.

  • cvseq_times – Array of times in ns corresponding to cvseq lines.

  • force_constant – Array of spring_fk from the RAMD cvseq. Must be the same length as cvseq_times.

  • endpoints – Reactive trajectory start/stop frame indices.

  • rmsd_limits – RMSD criteria for reactive start/stop frames. When entries are None, skip plotting that limit.

  • mindist_limits – MinDist criteria for reactive start/stop frames. When entries are None, skip plotting that limit.

schrodinger.application.desmond.kinetics.utils.extract_reactive_path(cms_filename: str, cvseq_filename: Optional[str] = None, asl_receptor: str = 'protein', asl_ligand: str = 'ligand', rmsd_on: float = 2.0, mindist_on: float = 6.0, rmsd_off: Optional[float] = None, mindist_off: Optional[float] = None, strict: Optional[bool] = None, pad_traj_start: int = 0, max_traj_frames: int = 2000, debug: bool = False) List[str]

Reactive path is equivelent is a the unbinding path of the ligand

Parameters
  • rmsd_on – value to start recording ligand detachment cutoff

  • rmsd_off – value to stop recording ligand detachment cutoff, this value should be larger than rmsd_on

  • mindist_on – minimal distance between receptor and ligand to start recording ligand detachment cutoff

  • mindist_off – minimal distance between receptor and ligand to stop recording ligand detachment cutoff. This value should be larger than rmsd_on

  • strict – mindist_off and rmsd_off must be both satisfied for reactive trajectory extraction

  • pad_traj_start – number of initial trajectory frames to prepend to the reactive trajectory frames

  • max_traj_frames – maximum number of frames to load. If trajectory contains more frames than this value, then we will slice the trajectory such that the specified number of frames is used in the downstream analysis

Returns

a list of filenames to be saved in the analsysis stage of the workflow

schrodinger.application.desmond.kinetics.utils.calc_tension(rmsd_matrix: numpy.array, ivec: Optional[List[int]] = None) float
schrodinger.application.desmond.kinetics.utils.path_rmsd_matrix(sts: List[schrodinger.structure._structure.Structure], rmsd_atoms: List[int]) numpy.array

calculate 2d RMSD of the ligands on prealigned structures

schrodinger.application.desmond.kinetics.utils.path_optimize(current_guess: List[int], rmsd_matrix: numpy.array, step: float) List[int]

Provided that you already have a rough path (init.mae) you can now optimize it, whiere the best set of available frames can be used. :return: a list with frame IDs, which will be used to extract the path

schrodinger.application.desmond.kinetics.utils.path_satisfies_mindist(sts: List[schrodinger.structure._structure.Structure], asl_receptor: str, asl_ligand: str, mindist_off: float) bool

Evaluate if the provided path satisfies the mindist criteria

schrodinger.application.desmond.kinetics.utils.path_prune(sts: List[schrodinger.structure._structure.Structure], rmsd_matrix: numpy.array, threshold: float = 1.5) List[schrodinger.structure._structure.Structure]

Remove waypoints that are too close to non-adjacent waypoints. This is treated as a shortest path problem using Dijkstra’s algorithm.

Parameters
  • sts – Waypoint structures

  • rmsd_matrix – Matrix of RMSDs between input waypoints

  • threshold – Factor used to set how far adjacent waypoints can be. This is multiplied by the median of adjacent waypoint RMSDs to set the maximum for when waypoints may be connected in the shortest path

Returns

Subset of input waypoint structures composing the shortest path

schrodinger.application.desmond.kinetics.utils.path_smooth(sts: List[schrodinger.structure._structure.Structure], path_resolution: float, receptor_atom_idcs: List[int], ligand_atom_idcs: List[int]) List[schrodinger.structure._structure.Structure]

As final path optimization step, we will shuffle each waypoint towards or away from its neightbors such that RMSD between all the waypoints is equal. If nwaypoints > len(sts), additional waypoints will be interpolated. This approach is using a string-like method to optimize the distances between each waypoint.

Parameters
  • sts – Waypoint structures, may have length < nwaypoints

  • path_resolution – Desired RMSD spacing between waypoints in Å

  • receptor_atom_idcs – List of receptor atom indices

  • ligand_atom_idcs – List of ligand atom indices

Returns

a new optimized path

schrodinger.application.desmond.kinetics.utils.plot_path_matrices(title_to_matrix_map: Dict[str, numpy.array], fname: str) None

Write a png showing the RMSD matrix for each path processing step

schrodinger.application.desmond.kinetics.utils.reduce_and_optimize_path(asl_receptor: str, asl_ligand: str, nwaypoints: int, path_resolution: float, mindist_off: float, debug: bool = False) List[str]

Reduce the reactive path to a given number of waypoints, optimize which frames are selected, and smooth the optimized path so waypoints are evenly spaced. Returns an empty list when no reactive path is found.

Parameters
  • nwaypoints – the number of snapshots/waypoints in the unbinding path to generate

  • path_resolution – Desired RMSD spacing between waypoints in Å

Returns

a list of filenames to be saved in the analysis stage

schrodinger.application.desmond.kinetics.utils.summarize_traj_with_com_st(cms_filename: str, asl_receptor: str, asl_ligand: str, tau: Optional[float] = None) Optional[schrodinger.structure._structure.Structure]

Read in a trajectory and create a summary structure with dummy atoms representing the ligand at its center of mass.

Parameters
  • cms_filename – CMS file to read linked trajectory and convert.

  • asl_receptor – Atom selection language for the receptor

  • asl_ligand – Atom selection language for the ligand

  • tau – Predicted residence time to save as structure property

schrodinger.application.desmond.kinetics.utils.check_ligand_stability_and_center(asl_receptor: str, asl_ligand: str, cms_filename: str, ligand_rmsd_cutoff: float, rmsd_plot_filename: Optional[str] = None) Tuple[bool, cms.Cms]

Measure Ligand RMSD and determine if its suitable for downstream RAMD calculations. Save the input CMS centered on the receptor. If a file name is specified, save a plot of the RMSD.

Returns

(ligand_stable, cms_model) A bool specifying if the ligand does not unbind during the relaxation stage (ligand RMSD is less than ligand_rmsd_cutoff) and a CMS file.

schrodinger.application.desmond.kinetics.utils.extend_path(sts: List[schrodinger.structure._structure.Structure], nwaypoints: int, ligand_atom_idcs: List[int], ligand_heavy_atom_idcs: List[int]) List[schrodinger.structure._structure.Structure]

Add additional waypoints to match the desired number of waypoints. If the input path has more or equal waypoints, return the input path.

Parameters
  • sts – Waypoint structures

  • nwaypoints – The final number of snapshots/waypoints in the unbinding path after extrapolation

  • ligand_atom_idcs – List of ligand atom indices

  • ligand_heavy_atom_idcs – List of ligand heavy atom indices

Returns

List of waypoint structures, extended to length nwaypoints

schrodinger.application.desmond.kinetics.utils.cluster_unbinding_paths(paths: List[List[schrodinger.structure._structure.Structure]], asl_ligand: str, max_iter: int, rng_seed: int = 0, all_paths_which_wps_bound: Optional[List[List[int]]] = None) Tuple[List[Tuple[int, int, int]], List[int]]

Given a list of paths, cluster them using AffinityPropogation method. Return information about the clusters as well as their centroids.

Parameters
  • paths – Lists of waypoint structures that define unbinding paths.

  • asl_receptor – Atom selection language for the receptor

  • asl_ligand – Atom selection language for the ligand

  • use_weights – Whether to use weights when calculating the distance between paths. Waypoints pairs that are not interacting with the receptor will be given 0 weight.

  • max_iter – Maximum iterations for AffinityPropagation.

  • rng_seed – RNG seed for AffinityPropagation.

  • all_paths_which_wps_bound – List of lists determining whether to consider each waypoint when calculating distances between paths. When both waypoints are unbound (value of 0), the comparison is skipped.

Returns

for each cluster return a (size, centroid, label) tuple and a “paths label” path label is a cluster all paths belong in.

schrodinger.application.desmond.kinetics.utils.set_all_st_group_hierarchies(sts: List[schrodinger.structure._structure.Structure], hierarchy: List[str])

Set the project group name for each structure in a list

schrodinger.application.desmond.kinetics.utils.get_outermost_group(st: schrodinger.structure._structure.Structure) str

Get the outermost group in a structure’s group hierarchy

schrodinger.application.desmond.kinetics.utils.write_ramd_clustering_summary(paths: List[List[schrodinger.structure._structure.Structure]], asl_ligand: str, cluster_info: List[Tuple[int, int, int]], paths_labels: List[int], jobname: str, all_paths_which_wps_bound: Optional[List[List[int]]] = None)

Write a summary .maegz which contains the centers of mass for each waypoint in each RAMD unbinding path as well as the simulation time needed to unbind. These waypoints are grouped and colored by cluster and centroids are labeled with (C) after their replica name.

Parameters
  • paths – Lists of waypoint structures that define unbinding paths.

  • asl_ligand – Atom selection language for the ligand

  • cluster_info – Tuple for each cluster of (size, centroid, label).

  • paths_labels – Cluster label for each path.

  • jobname – Master jobname used to label the main structure group.

  • all_paths_which_wps_bound – Label for whether each waypoint in each path is considered bound.

schrodinger.application.desmond.kinetics.utils.get_unbinding_pathway(cluster_info: List[Tuple[int, int, int]]) List[List[schrodinger.structure._structure.Structure]]

Using the clustering information, write the centroid unbinding paths. If the first N clusters have the the same number of members then we will return N centroids for these clusters.

Parameters

cluster_info – a list that contatins (size, center, label) of clusters

schrodinger.application.desmond.kinetics.utils.calculate_path_msd(path_sts: List[schrodinger.structure._structure.Structure], align_asl: str, displace_asl: str) float

Function to calculate ‘lambda’ variable as described in the reference below. This variable is comparable to the inverse of mean square displacement between neigboring waypoints. See for details: -Branduardi D, et. al., JChemPhys 126(5), 054103; doi:10.1063/1.2432340

Parameters

path_sts – structrures defining unbinding path

schrodinger.application.desmond.kinetics.utils.process_atom_selection(path_sts: List[schrodinger.structure._structure.Structure], align_asl: str, displace_asl: str) schrodinger.application.desmond.kinetics.utils.RmsdSelection

This function generates three atom lists used with required weights for M-expression RMSD calculation. The length of each list should be the same. Here we perform the bookkeeping to ensure the the length of these list are the same.

Parameters
  • path_sts – A list of structures that define unbinding path. These structures should not have any hydrogen atoms.

  • align_asl – Selection to align the system on

  • displace_asl – Selection to measure RMSD on

Returns

A tuple of three lists, all the same sizes, with weights for RMSD alignment/measurement for the first to variables, and atom indices that correspond to atoms used in RMSD calculation

schrodinger.application.desmond.kinetics.utils.insert_path_into_cms(cms_fname: str, path_sts_fn: str, combined_cms_fname: str)

This function reads in cms and path_structures and combines the two by inserting the coordinates for all the path waypoint (wp) structures as atom properties in full-system CT of the cms file. The path’s waypoint coordinates are stored in the properties like this: * r_tau_path_waypt00_{x,y,z} Which corresponds to the xyz coordinates in the first waypoint for that atom.

This is done due to a limitation that MSJ workflow can pass only one structure as an input.

Note: Hydrogen coordinates are not stored.

schrodinger.application.desmond.kinetics.utils.extract_path_from_cms(cms_fname: str) Tuple[List[structure.Structure], cms.Cms]

This function reads a cms file with stored path waypoints (see insert_path_in_cms()) in atom properties and extracts its coordinates into separate structures.

Parameters

cms_fname – filename to the CMS file with the waypoint coordinates

schrodinger.application.desmond.kinetics.utils.get_mtd_results(cvseq_filename: str) pandas.core.frame.DataFrame

Read in cvseq data from a filename and then convert it to a DataFrame. Since the results of mtd can get quite large we are going to change precision and/or change data types of several fields, for more efficient storage.

schrodinger.application.desmond.kinetics.utils.get_mtd_info(df: pandas.core.frame.DataFrame, residence_time_cutoff: Optional[float]) dict

Return a dict with key info about the MtD data in the provided DataFrame.

Parameters
  • df – dataframe with all the unbinding descriptors

  • residence_time_cutoff – a cutoff value to determine if the is too slow to unbind, in which case the workflow is terminated early

Returns

a dictionary with various descriptors calculated from the job. The

keys and their values of this dict are the following.

MtdInfo.TAU_ESTIMATE_SEC: Predicted resedence time (tau) in Seconds

MtdInfo.SIM_TIME: Total simulation time (in nanoseconds).

MtdInfo.MINDIST: Distance between the receptor and the ligand when the

simulation was terminated

MtdInfo.TOO_SLOW: A bool indicating if the unbinding event had

occured or not. A value of True suggests that no unbinding happend.

MtdInfo.STATUS: A string of the final status: ‘OK’, ‘BOUND’ ‘TOO SLOW’

schrodinger.application.desmond.kinetics.utils.calc_mean_tau_and_pval(res_times: List[float], ntrials: int = 100) Tuple[schrodinger.application.desmond.measurement.Measurement, schrodinger.application.desmond.measurement.Measurement]

Return mean Tau and P-val values as a Measurement object

schrodinger.application.desmond.kinetics.utils.plot_tau_fitting(res_times: List[float], tau: schrodinger.application.desmond.measurement.Measurement, pvalue: schrodinger.application.desmond.measurement.Measurement, jobname: str)

Plot predicted residence time (tau) from the replicas and the fitted value. The plot will be written to the current directory with the jobname as a filename.

Parameters
  • res_times – Residence times from previous subjobs

  • tau – Fitted residence time for the ligand

  • pvalue – P-value for the fitted residence time

  • jobname – Master jobname, will be used to save plot in PNG format

schrodinger.application.desmond.kinetics.utils.plot_mtd_report(results: pandas.core.frame.DataFrame, jobname: str)

Generate a series of plots to expore MtD unbinding simulations.

schrodinger.application.desmond.kinetics.utils.read_structs_from_tgz(file_str: str, tar_path: pathlib.Path) List[schrodinger.structure._structure.Structure]

Extract files from a tgz archive. Warning: This will find and extract the first file matching file_str inside any tarred subdirectory. Additional matches will be ignored.

Parameters
  • file_str – File name to seach for within the tar file

  • tar_path – Path of the tar file to extract structures from

schrodinger.application.desmond.kinetics.utils.get_indices_from_unbinding_subjob(subjob_name: str)

Retrieve the indices for the path CV and the replica from a subjob name

Parameters

subjob_name – The subjob name to extract indices from Some examples are cdk8_4f6s_full_mtd_replica-0-01 or cdk8_4f6s_full_mtd_replica-0-01_3 where the path cv index is 0 and the replica index is 01.

schrodinger.application.desmond.kinetics.utils.write_mtd_unbinding_path_summary(subjob_to_unbinding_path_map: Dict[str, List[schrodinger.structure._structure.Structure]], path_cv_idx_to_structure_map: Dict[str, List[schrodinger.structure._structure.Structure]], asl_ligand: str, jobname: str, results_df: pandas.core.frame.DataFrame)

Write a summary structure showing the metadynamics paths sampled for each path CV represented using their ligand center of mass. The structures will contain the receptor, the ligand, each path CV used, and the metadynamics paths grouped with their associated path CV. The mtd paths are also annotated with their simulation time and predicted residence time.

Parameters
  • subjob_to_unbinding_path_map – Dict mapping unbinding path subjob names to path structures

  • path_cv_idx_to_structure_map – Dict mapping path CV (equivalently RAMD cluster) number to path CV structures

  • asl_ligand – Atom selection language for the ligand

  • jobname – Master jobname used to label the main structure group.

schrodinger.application.desmond.kinetics.utils.pad_asl_if_invalid(asl: str) str

Pad ASL strings with spaces to ensure they are recognized as valid. This addresses the issue in DESMOND-13855 where ASL strings containing SMARTS will not be recognized if they are within parentheses unless they are padded.

Parameters

asl – Input ASL string to pad with spaces if needed

schrodinger.application.desmond.kinetics.utils.set_random_seed(seed: Optional[int])

Create a new random number generator for this module with the specified random seed.

schrodinger.application.desmond.kinetics.utils.get_dominant_path_idx(mean_res_times: numpy.array, frac_unbound: numpy.array, p_values: numpy.array, frac_unbound_thresholds: List[float], p_value_thresholds: List[float]) int

Select a path to use when reporting metadynamics summary statistics. We sort paths into confidence thresholds, evaluated using the number of unbinding events as well as the Kolmogorov-Smirnov p-values. We then select the fastest path in the highest confidence thresholds as the ‘dominant’ path.

Parameters
  • mean_res_times – Array of residence times for each path.

  • frac_unbound – Array of fraction unbound simulations for each path.

  • p_values – Array of p-values for each path.

  • frac_unbound_thresholds – Confidence level thresholds based on the fraction of simulations that unbound.

  • p_value_thresholds – Confidence level thresholds based on Kolmogorov-Smirnov p-values.

Returns

The index corresponding to the dominant path.

schrodinger.application.desmond.kinetics.utils.prepare_system(input_st: schrodinger.structure._structure.Structure, asl_ligand: str, jobname: str) Tuple[schrodinger.structure._structure.Structure, schrodinger.structure._structure.Structure]
Prepare system for MD by:

1) Split the ligand into a separate CT from the receptor/cofactor. This is so that we can assign custom charges to the ligand. 2) Label the ligand so that we can equilibrate the system with GCMC.

Parameters
  • input_st – Input structure containing the full system to simulate.

  • asl_ligand – ASL to select and separate the ligand.

  • jobname – Jobname to label the ligand in the separated structure.