schrodinger.application.desmond.kinetics.utils module¶
- class schrodinger.application.desmond.kinetics.utils.RmsdSelection(aligned_weights, displaced_weights, atom_idcs)¶
Bases:
NamedTuple
- class schrodinger.application.desmond.kinetics.utils.MtdInfo(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.Enum
- KILL_NOW = 'kill_now'¶
- LAST_INDEX = 'last_index'¶
- MINDIST = 'mindist'¶
- TAU_ESTIMATE_SEC = 'tau_estimate_sec'¶
- SIM_TIME = 'sim_time'¶
- TOO_SLOW = 'too_slow'¶
- STATUS = 'status'¶
- class schrodinger.application.desmond.kinetics.utils.MtdStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.Enum
The Status of each MtD unbinding job
- OK = 'OK'¶
- TOO_SLOW = 'Too Slow'¶
- BOUND = 'Bound'¶
- class schrodinger.application.desmond.kinetics.utils.ResultsCols(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
enum.Enum
Columns used in MtD summary statistics
- PATH = 'Path'¶
- NAME = 'Name'¶
- PRED_RES_TIME = 'Pred.Res.Time(s)'¶
- P_VALUE = 'P-value'¶
- SIM_TIME = 'Sim.Time(ns)'¶
- SIMS = '#Sims'¶
- UNBOUND_SIMS = '#Unbound.Sims'¶
- schrodinger.application.desmond.kinetics.utils.find_ligand_sites(st: schrodinger.structure._structure.Structure, asl: str, nsites: int, sampling_size: int = 5000, rng_seed: int = 30) List[int] ¶
Find a
nsites
atoms which are furthest away from each other on a provided small molecule. These atoms (sites) are then used in distance calculations.
- schrodinger.application.desmond.kinetics.utils.find_protein_sites(st: schrodinger.structure._structure.Structure, asl: str, area_cutoff: Optional[float] = None) List[int] ¶
Find solvent exposed protein residues which will be used to check if the ligand is still bound to the protein.
- Parameters
st – Input system structure.
asl – ASL to identify protein residues eg. ‘protein’.
area_cutoff – SASA threshold where residues with SASA above this value have their C-alpha added to the set of protein sites. If no cutoff is given, return all C-alpha atom indices.
- Returns
A list of C-alpha atoms indices passing the area_cutoff.
- schrodinger.application.desmond.kinetics.utils.plot_descriptors(traj_times: numpy.array, rmsd: numpy.array, mindist: numpy.array, cvseq_times: numpy.array = None, force_constant: numpy.array = None, endpoints: Tuple[Optional[int], Optional[int]] = (None, None), rmsd_limits: Tuple[Optional[float], Optional[float]] = (None, None), mindist_limits: Tuple[Optional[float], Optional[float]] = (None, None))¶
Plot descriptors from a RAMD trajectory. Saved to CWD as “trj_descriptors.png”.
- Parameters
traj_times – Array of times in ns corresponding to trajectory frames.
rmsd – Array of ligand RMSDs. Must be the same length as traj_times.
mindist – Array of receptor-ligand minimum distances. Must be the same length as traj_times.
cvseq_times – Array of times in ns corresponding to cvseq lines.
force_constant – Array of spring_fk from the RAMD cvseq. Must be the same length as cvseq_times.
endpoints – Reactive trajectory start/stop frame indices.
rmsd_limits – RMSD criteria for reactive start/stop frames. When entries are None, skip plotting that limit.
mindist_limits – MinDist criteria for reactive start/stop frames. When entries are None, skip plotting that limit.
- schrodinger.application.desmond.kinetics.utils.extract_reactive_path(cms_filename: str, cvseq_filename: Optional[str] = None, asl_receptor: str = 'protein', asl_ligand: str = 'ligand', rmsd_on: float = 2.0, mindist_on: float = 6.0, rmsd_off: Optional[float] = None, mindist_off: Optional[float] = None, strict: Optional[bool] = None, pad_traj_start: int = 0, max_traj_frames: int = 2000, debug: bool = False) List[str] ¶
Reactive path is equivelent is a the unbinding path of the ligand
- Parameters
rmsd_on – value to start recording ligand detachment cutoff
rmsd_off – value to stop recording ligand detachment cutoff, this value should be larger than
rmsd_on
mindist_on – minimal distance between receptor and ligand to start recording ligand detachment cutoff
mindist_off – minimal distance between receptor and ligand to stop recording ligand detachment cutoff. This value should be larger than
rmsd_on
strict – mindist_off and rmsd_off must be both satisfied for reactive trajectory extraction
pad_traj_start – number of initial trajectory frames to prepend to the reactive trajectory frames
max_traj_frames – maximum number of frames to load. If trajectory contains more frames than this value, then we will slice the trajectory such that the specified number of frames is used in the downstream analysis
- Returns
a list of filenames to be saved in the analsysis stage of the workflow
- schrodinger.application.desmond.kinetics.utils.calc_tension(rmsd_matrix: numpy.array, ivec: Optional[List[int]] = None) float ¶
- schrodinger.application.desmond.kinetics.utils.path_rmsd_matrix(sts: List[schrodinger.structure._structure.Structure], rmsd_atoms: List[int]) numpy.array ¶
calculate 2d RMSD of the ligands on prealigned structures
- schrodinger.application.desmond.kinetics.utils.path_optimize(current_guess: List[int], rmsd_matrix: numpy.array, step: float) List[int] ¶
Provided that you already have a rough path (init.mae) you can now optimize it, whiere the best set of available frames can be used. :return: a list with frame IDs, which will be used to extract the path
- schrodinger.application.desmond.kinetics.utils.path_satisfies_mindist(sts: List[schrodinger.structure._structure.Structure], asl_receptor: str, asl_ligand: str, mindist_off: float) bool ¶
Evaluate if the provided path satisfies the mindist criteria
- schrodinger.application.desmond.kinetics.utils.path_prune(sts: List[schrodinger.structure._structure.Structure], rmsd_matrix: numpy.array, threshold: float = 1.5) List[schrodinger.structure._structure.Structure] ¶
Remove waypoints that are too close to non-adjacent waypoints. This is treated as a shortest path problem using Dijkstra’s algorithm.
- Parameters
sts – Waypoint structures
rmsd_matrix – Matrix of RMSDs between input waypoints
threshold – Factor used to set how far adjacent waypoints can be. This is multiplied by the median of adjacent waypoint RMSDs to set the maximum for when waypoints may be connected in the shortest path
- Returns
Subset of input waypoint structures composing the shortest path
- schrodinger.application.desmond.kinetics.utils.path_smooth(sts: List[schrodinger.structure._structure.Structure], path_resolution: float, receptor_atom_idcs: List[int], ligand_atom_idcs: List[int]) List[schrodinger.structure._structure.Structure] ¶
As final path optimization step, we will shuffle each waypoint towards or away from its neightbors such that RMSD between all the waypoints is equal. If nwaypoints > len(sts), additional waypoints will be interpolated. This approach is using a string-like method to optimize the distances between each waypoint.
- Parameters
sts – Waypoint structures, may have length < nwaypoints
path_resolution – Desired RMSD spacing between waypoints in Å
receptor_atom_idcs – List of receptor atom indices
ligand_atom_idcs – List of ligand atom indices
- Returns
a new optimized path
- schrodinger.application.desmond.kinetics.utils.plot_path_matrices(title_to_matrix_map: Dict[str, numpy.array], fname: str) None ¶
Write a png showing the RMSD matrix for each path processing step
- schrodinger.application.desmond.kinetics.utils.reduce_and_optimize_path(asl_receptor: str, asl_ligand: str, nwaypoints: int, path_resolution: float, mindist_off: float, debug: bool = False) List[str] ¶
Reduce the reactive path to a given number of waypoints, optimize which frames are selected, and smooth the optimized path so waypoints are evenly spaced. Returns an empty list when no reactive path is found.
- Parameters
nwaypoints – the number of snapshots/waypoints in the unbinding path to generate
path_resolution – Desired RMSD spacing between waypoints in Å
- Returns
a list of filenames to be saved in the analysis stage
- schrodinger.application.desmond.kinetics.utils.summarize_traj_with_com_st(cms_filename: str, asl_receptor: str, asl_ligand: str, tau: Optional[float] = None) Optional[schrodinger.structure._structure.Structure] ¶
Read in a trajectory and create a summary structure with dummy atoms representing the ligand at its center of mass.
- Parameters
cms_filename – CMS file to read linked trajectory and convert.
asl_receptor – Atom selection language for the receptor
asl_ligand – Atom selection language for the ligand
tau – Predicted residence time to save as structure property
- schrodinger.application.desmond.kinetics.utils.check_ligand_stability_and_center(asl_receptor: str, asl_ligand: str, cms_filename: str, ligand_rmsd_cutoff: float, rmsd_plot_filename: Optional[str] = None) Tuple[bool, cms.Cms] ¶
Measure Ligand RMSD and determine if its suitable for downstream RAMD calculations. Save the input CMS centered on the receptor. If a file name is specified, save a plot of the RMSD.
- Returns
(ligand_stable, cms_model) A bool specifying if the ligand does not unbind during the relaxation stage (ligand RMSD is less than
ligand_rmsd_cutoff
) and a CMS file.
- schrodinger.application.desmond.kinetics.utils.extend_path(sts: List[schrodinger.structure._structure.Structure], nwaypoints: int, ligand_atom_idcs: List[int], ligand_heavy_atom_idcs: List[int]) List[schrodinger.structure._structure.Structure] ¶
Add additional waypoints to match the desired number of waypoints. If the input path has more or equal waypoints, return the input path.
- Parameters
sts – Waypoint structures
nwaypoints – The final number of snapshots/waypoints in the unbinding path after extrapolation
ligand_atom_idcs – List of ligand atom indices
ligand_heavy_atom_idcs – List of ligand heavy atom indices
- Returns
List of waypoint structures, extended to length
nwaypoints
- schrodinger.application.desmond.kinetics.utils.cluster_unbinding_paths(paths: List[List[schrodinger.structure._structure.Structure]], asl_ligand: str, max_iter: int, rng_seed: int = 0, all_paths_which_wps_bound: Optional[List[List[int]]] = None) Tuple[List[Tuple[int, int, int]], List[int]] ¶
Given a list of paths, cluster them using AffinityPropogation method. Return information about the clusters as well as their centroids.
- Parameters
paths – Lists of waypoint structures that define unbinding paths.
asl_receptor – Atom selection language for the receptor
asl_ligand – Atom selection language for the ligand
use_weights – Whether to use weights when calculating the distance between paths. Waypoints pairs that are not interacting with the receptor will be given 0 weight.
max_iter – Maximum iterations for AffinityPropagation.
rng_seed – RNG seed for AffinityPropagation.
all_paths_which_wps_bound – List of lists determining whether to consider each waypoint when calculating distances between paths. When both waypoints are unbound (value of 0), the comparison is skipped.
- Returns
for each cluster return a (size, centroid, label) tuple and a “paths label” path label is a cluster all paths belong in.
- schrodinger.application.desmond.kinetics.utils.set_all_st_group_hierarchies(sts: List[schrodinger.structure._structure.Structure], hierarchy: List[str])¶
Set the project group name for each structure in a list
- schrodinger.application.desmond.kinetics.utils.get_outermost_group(st: schrodinger.structure._structure.Structure) str ¶
Get the outermost group in a structure’s group hierarchy
- schrodinger.application.desmond.kinetics.utils.write_ramd_clustering_summary(paths: List[List[schrodinger.structure._structure.Structure]], asl_ligand: str, cluster_info: List[Tuple[int, int, int]], paths_labels: List[int], jobname: str, all_paths_which_wps_bound: Optional[List[List[int]]] = None)¶
Write a summary .maegz which contains the centers of mass for each waypoint in each RAMD unbinding path as well as the simulation time needed to unbind. These waypoints are grouped and colored by cluster and centroids are labeled with (C) after their replica name.
- Parameters
paths – Lists of waypoint structures that define unbinding paths.
asl_ligand – Atom selection language for the ligand
cluster_info – Tuple for each cluster of (size, centroid, label).
paths_labels – Cluster label for each path.
jobname – Master jobname used to label the main structure group.
all_paths_which_wps_bound – Label for whether each waypoint in each path is considered bound.
- schrodinger.application.desmond.kinetics.utils.get_unbinding_pathway(cluster_info: List[Tuple[int, int, int]]) List[List[schrodinger.structure._structure.Structure]] ¶
Using the clustering information, write the centroid unbinding paths. If the first N clusters have the the same number of members then we will return N centroids for these clusters.
- Parameters
cluster_info – a list that contatins (size, center, label) of clusters
- schrodinger.application.desmond.kinetics.utils.calculate_path_msd(path_sts: List[schrodinger.structure._structure.Structure], align_asl: str, displace_asl: str) float ¶
Function to calculate ‘lambda’ variable as described in the reference below. This variable is comparable to the inverse of mean square displacement between neigboring waypoints. See for details: -Branduardi D, et. al., JChemPhys 126(5), 054103; doi:10.1063/1.2432340
- Parameters
path_sts – structrures defining unbinding path
- schrodinger.application.desmond.kinetics.utils.process_atom_selection(path_sts: List[schrodinger.structure._structure.Structure], align_asl: str, displace_asl: str) schrodinger.application.desmond.kinetics.utils.RmsdSelection ¶
This function generates three atom lists used with required weights for M-expression RMSD calculation. The length of each list should be the same. Here we perform the bookkeeping to ensure the the length of these list are the same.
- Parameters
path_sts – A list of structures that define unbinding path. These structures should not have any hydrogen atoms.
align_asl – Selection to align the system on
displace_asl – Selection to measure RMSD on
- Returns
A tuple of three lists, all the same sizes, with weights for RMSD alignment/measurement for the first to variables, and atom indices that correspond to atoms used in RMSD calculation
- schrodinger.application.desmond.kinetics.utils.insert_path_into_cms(cms_fname: str, path_sts_fn: str, combined_cms_fname: str)¶
This function reads in cms and path_structures and combines the two by inserting the coordinates for all the path waypoint (wp) structures as atom properties in full-system CT of the cms file. The path’s waypoint coordinates are stored in the properties like this: * r_tau_path_waypt00_{x,y,z} Which corresponds to the xyz coordinates in the first waypoint for that atom.
This is done due to a limitation that MSJ workflow can pass only one structure as an input.
Note: Hydrogen coordinates are not stored.
- schrodinger.application.desmond.kinetics.utils.extract_path_from_cms(cms_fname: str) Tuple[List[structure.Structure], cms.Cms] ¶
This function reads a cms file with stored path waypoints (see
insert_path_in_cms()
) in atom properties and extracts its coordinates into separate structures.- Parameters
cms_fname – filename to the CMS file with the waypoint coordinates
- schrodinger.application.desmond.kinetics.utils.get_mtd_results(cvseq_filename: str) pandas.core.frame.DataFrame ¶
Read in cvseq data from a filename and then convert it to a DataFrame. Since the results of mtd can get quite large we are going to change precision and/or change data types of several fields, for more efficient storage.
- schrodinger.application.desmond.kinetics.utils.get_mtd_info(df: pandas.core.frame.DataFrame, residence_time_cutoff: Optional[float]) dict ¶
Return a dict with key info about the MtD data in the provided DataFrame.
- Parameters
df – dataframe with all the unbinding descriptors
residence_time_cutoff – a cutoff value to determine if the is too slow to unbind, in which case the workflow is terminated early
- Returns
- a dictionary with various descriptors calculated from the job. The
keys and their values of this
dict
are the following.
MtdInfo.TAU_ESTIMATE_SEC: Predicted resedence time (tau) in Seconds
MtdInfo.SIM_TIME: Total simulation time (in nanoseconds).
- MtdInfo.MINDIST: Distance between the receptor and the ligand when the
simulation was terminated
- MtdInfo.TOO_SLOW: A
bool
indicating if the unbinding event had occured or not. A value of
True
suggests that no unbinding happend.
MtdInfo.STATUS: A string of the final status: ‘OK’, ‘BOUND’ ‘TOO SLOW’
- schrodinger.application.desmond.kinetics.utils.calc_mean_tau_and_pval(res_times: List[float], ntrials: int = 100) Tuple[schrodinger.application.desmond.measurement.Measurement, schrodinger.application.desmond.measurement.Measurement] ¶
Return mean Tau and P-val values as a Measurement object
- schrodinger.application.desmond.kinetics.utils.plot_tau_fitting(res_times: List[float], tau: schrodinger.application.desmond.measurement.Measurement, pvalue: schrodinger.application.desmond.measurement.Measurement, jobname: str)¶
Plot predicted residence time (tau) from the replicas and the fitted value. The plot will be written to the current directory with the jobname as a filename.
- Parameters
res_times – Residence times from previous subjobs
tau – Fitted residence time for the ligand
pvalue – P-value for the fitted residence time
jobname – Master jobname, will be used to save plot in PNG format
- schrodinger.application.desmond.kinetics.utils.plot_mtd_report(results: pandas.core.frame.DataFrame, jobname: str)¶
Generate a series of plots to expore MtD unbinding simulations.
- schrodinger.application.desmond.kinetics.utils.read_structs_from_tgz(file_str: str, tar_path: pathlib.Path) List[schrodinger.structure._structure.Structure] ¶
Extract files from a tgz archive. Warning: This will find and extract the first file matching file_str inside any tarred subdirectory. Additional matches will be ignored.
- Parameters
file_str – File name to seach for within the tar file
tar_path – Path of the tar file to extract structures from
- schrodinger.application.desmond.kinetics.utils.get_indices_from_unbinding_subjob(subjob_name: str)¶
Retrieve the indices for the path CV and the replica from a subjob name
- Parameters
subjob_name – The subjob name to extract indices from Some examples are cdk8_4f6s_full_mtd_replica-0-01 or cdk8_4f6s_full_mtd_replica-0-01_3 where the path cv index is 0 and the replica index is 01.
- schrodinger.application.desmond.kinetics.utils.write_mtd_unbinding_path_summary(subjob_to_unbinding_path_map: Dict[str, List[schrodinger.structure._structure.Structure]], path_cv_idx_to_structure_map: Dict[str, List[schrodinger.structure._structure.Structure]], asl_ligand: str, jobname: str, results_df: pandas.core.frame.DataFrame)¶
Write a summary structure showing the metadynamics paths sampled for each path CV represented using their ligand center of mass. The structures will contain the receptor, the ligand, each path CV used, and the metadynamics paths grouped with their associated path CV. The mtd paths are also annotated with their simulation time and predicted residence time.
- Parameters
subjob_to_unbinding_path_map – Dict mapping unbinding path subjob names to path structures
path_cv_idx_to_structure_map – Dict mapping path CV (equivalently RAMD cluster) number to path CV structures
asl_ligand – Atom selection language for the ligand
jobname – Master jobname used to label the main structure group.
- schrodinger.application.desmond.kinetics.utils.pad_asl_if_invalid(asl: str) str ¶
Pad ASL strings with spaces to ensure they are recognized as valid. This addresses the issue in DESMOND-13855 where ASL strings containing SMARTS will not be recognized if they are within parentheses unless they are padded.
- Parameters
asl – Input ASL string to pad with spaces if needed
- schrodinger.application.desmond.kinetics.utils.set_random_seed(seed: Optional[int])¶
Create a new random number generator for this module with the specified random seed.
- schrodinger.application.desmond.kinetics.utils.get_dominant_path_idx(mean_res_times: numpy.array, frac_unbound: numpy.array, p_values: numpy.array, frac_unbound_thresholds: List[float], p_value_thresholds: List[float]) int ¶
Select a path to use when reporting metadynamics summary statistics. We sort paths into confidence thresholds, evaluated using the number of unbinding events as well as the Kolmogorov-Smirnov p-values. We then select the fastest path in the highest confidence thresholds as the ‘dominant’ path.
- Parameters
mean_res_times – Array of residence times for each path.
frac_unbound – Array of fraction unbound simulations for each path.
p_values – Array of p-values for each path.
frac_unbound_thresholds – Confidence level thresholds based on the fraction of simulations that unbound.
p_value_thresholds – Confidence level thresholds based on Kolmogorov-Smirnov p-values.
- Returns
The index corresponding to the dominant path.
- schrodinger.application.desmond.kinetics.utils.prepare_system(input_st: schrodinger.structure._structure.Structure, asl_ligand: str, jobname: str) Tuple[schrodinger.structure._structure.Structure, schrodinger.structure._structure.Structure] ¶
- Prepare system for MD by:
1) Split the ligand into a separate CT from the receptor/cofactor. This is so that we can assign custom charges to the ligand. 2) Label the ligand so that we can equilibrate the system with GCMC.
- Parameters
input_st – Input structure containing the full system to simulate.
asl_ligand – ASL to select and separate the ligand.
jobname – Jobname to label the ligand in the separated structure.