Working With Trajectories

The desmond package contains modules for various Desmond-related operations. In particular, the trajectory infrastructure and trajectory analyzers are in the following modules:

Atom indexing

For MD related workflows, there are two output files associated with the simulated trajectory:

  • A text file called <jobname>-out.cms, which contains structural information such as atom types, bonds, etc, as well as atom positions of the last trajectory frame

  • A binary file of the trajectory, which contains atom positions at all time points.

One tricky detail of the trajectory infrastructure is that the .cms files only contain physical atoms whereas the trajectory files may contain additional pseudo atoms. This leads to two atom indices:

  • Atom ID (AID) for structural files such as cms and mae files

    • physical atoms only

    • 1-indexed

    • results of ASL/SMARTS evaluation

  • Global ID (GID) for trajectory files and Desmond MSYS files

    • all atoms, both physical and pseudo

    • 0-indexed

Since pseudo atoms carry charge, they are relevant to all charge related analyses, such as electric dipole, center of charge, etc. Note that the pseudo atoms are not selected from ASL or SMARTS evaluations, and one needs to use APIs in the schrodinger.application.desmond.topo such as:

asl2gids(cms_model, asl, include_pseudoatoms=True)
aids2gids(cms_model, aids, include_pseudoatoms=True)

Here the argument cms_model has to be created from:

from schrodinger.application.desmond.packages import topo
msys_model, cms_model = topo.read_cms('example-out.cms')

This is the only way to guarantee the correct AID/GID mapping, which is saved in cms_model. Avoid loading the .cms file as a Schrodinger Structure object or as a Desmond Cms object directly if it is to interact with trajectories.

Trajectory frames

The Desmond trajectories are saved in binary files and both DTR and XTC formats are supported. To read a trajectory:

from schrodinger.application.desmond.packages import traj
tr = traj.read_traj('foo_trj')  # DTR format trajectory
fr = tr[0]  # first frame

Here tr is simply a python list of trajectory Frame objects. The guaranteed properties of Frame include:

  • fr.natoms: number of atoms in the frame, including the pseudo atoms

  • fr.nactive: number of active atoms in the frame, including the pseudo atoms. This number is relevant for a GCMC trajectory and it equals to fr.natoms for a non-GCMC trajectory.

  • fr.time: chemical time in pico-seconds

  • fr.box: simulation box (key information for periodic boundary unwrapping), which may change from frame to frame

  • fr.pos() or fr.pos(gid), or fr.pos(gids): atom positions

Here gid is an integer valued GID and gids is a list/tuple of them. The position calls return N x 3 Numpy array where N is the number of requested atom(s). Note that fr.pos() (which selects all atoms) and fr.pos(gid) return views whereas fr.pos(gids) returns a copy of the data in Frame. This is a result of Numpy advanced indexing.

Additional information may exist in the Frame objects. For example, atom velocities. In that case, the API fr.vel() has the same interface as fr.pos().

The path to the trajectory file is also saved in the <jobname>-out.cms file thus it is possible to load the trajectory from the .cms file:

from schrodinger.application.desmond.packages import traj_util
msys_model, cms_model, tr = traj_util.read_cms_and_traj('example-out.cms')

If only the path to the trajectory file is needed, use:

from schrodinger.application.desmond.packages import topo
trj_path = topo.find_traj_path(cms_model)  # or
trj_path = topo.find_traj_path_from_cms_path('example-out.cms')

Note that in order to utilize these helper functions, one should not rename the trajectory file or change the relative locations between the .cms file and the trajectory file.

Often one needs to update the coordinates of a group of atoms (e.g., a protein molecule from ASL selection) on a per-frame basis. This task can be fulfilled in two ways:

  • update all atom coordinates (i.e., the full system structure) for each frame and then extract the protein per frame

  • extract the protein once and then update its coordinates per frame

The second approach is more efficient and its implementation is shown below:

from schrodinger.application.desmond.packages import traj, topo

_, cms_model = topo.read_cms('example-out.cms')
tr = traj.read_traj('example_trj')
protein_aids = cms_model.select_atom('protein')
protein_st = cms_model.extract(protein_aids)
protein_gids = topo.aids2gids(cms_model, protein_aids,
                          include_pseudoatoms=False)
for fr in tr:
    protein_st.setXYZ(fr.pos(protein_gids))
    # whatever needs to be done on the protein structure
    ...

Note that this only works if the ASL selection doesn’t change over the frames.

In case the full system structure is needed per frame, use:

fsys_ct = cms_model.fsys_ct.copy()
for fr in tr:
    topo.update_fsys_ct_from_frame_GF(fsys_ct, cms_model, fr)
    # whatever needs to be done on the full system structure
    ...

Trajectory analysis

The schrodinger.application.desmond.analysis module contains about 60 trajectory analyzers, ranging from basic geometric operations such as angle, distance, vector, torsion, to specialized analyses such as order parameter, protein-ligand interactions.

To use the existing trajectory analyzers:

from schrodinger.application.desmond.packages import analysis, traj, topo

# load data
msys_model, cms_model = topo.read_cms('foo-out.cms')
tr = traj.read_traj('foo.xtc')  # XTC format trajectory

# define analyzers
analyzer1 = analysis.Com(msys_model, cms_model, asl='m.n 1')
analyzer2 = analysis.ProtLigInter(msys_model, cms_model, 'protein', 'm.n 2')

# compute result
results = analysis.analyze(tr, analyzer1, analyzer2, )

Here results is a list of two items corresponding to the outputs of the two analyzers. The outputs of the analyzers may differ in their format. Typically, each output is a list of frame-wise results.

There is a caching mechanism built in the analysis framework to avoid redundant calculations among analyzers, e.g., coordinate unwrapping around periodic boundary condition, centering solute atoms, etc. Thus it is better to analyze all analyzers together, as in the example.

A special feature of the geometric analyzers is that they can use center of mass Com, center of charge Coc, and centroid Centroid analyzers on an equal footing with atoms. For example:

from schrodinger.application.desmond.packages import analysis
# data loading for msys_model, cms_model, and tr is omitted
com = analysis.Com(msys_model, cms_model, asl='atom.num 1-4')
centroid = analysis.Centroid(msys_model, cms_model, asl='atom.num 40-200')

d1 = analysis.Distance(msys_model, cms_model, 1, 2)
d2 = analysis.Vector(msys_model, cms_model, com, 27)
d3 = analysis.Angle(msys_model, cms_model, com, centroid, 3)
d4 = analysis.Torsion(msys_model, cms_model, 10, com, centroid, 3)

results = analysis.analyze(tr, d1, d2, d3)

Here the inputs are either AIDs or some geometric-center analyzers.

Utility scripts

The following utility scripts are shipped with the Schrodinger Suites and can be triggered as:

$SCHRODINGER/run <script-name> ...
  • analyze_simulation.py: run geometric and energy group analyses on a simulation trajectory

  • build_constraints.py: build constraints

  • dtr2xtc.py: convert a Desmond trajectory from DTR format to XTC format

  • frame2cms.py: convert a trajectory into a series of .cms files

  • generate_solvent_box.py: generate solvent box

  • membrane_cms2fep.py: convert cms to FEP pv file

  • rdf.py: calculate the radial distribution function of selected atoms

  • trj2mae.py: convert a trajectory into a series of Maestro files

  • trj_align.py: trajectory alignment

  • trj_center.py: center selected atoms in the trajectory

  • trj_cluster.py: cluster trajectory frames based on the RMSD of selected atoms

  • trj_convert.py: convert a trajectory to the requested output name and format

  • trj_essential_dynamics.py: run protein essential dynamics analysis

  • trj_extract_subsystem.py: extract subsystem

  • trj_info.py: display information on the model system and the trajectory

  • trj_merge.py: merge trajectories

  • trj_occupancy.py: calculate the occupancy histogram of selected atoms

  • trj_parch.py: parch the trajectory

  • trj_rescue.py: attempt to rescue corrupted trajectory

  • trj_slice.py: slice the trajectory

  • trj_unwrap.py: unwrap the trajectory

  • trj_wrap.py: wrap the trajectory

  • viparr.py: forcefield assignment

Extracting trajectories from .fmp and .fmpdb files

Run the following command:

$SCHRODINGER/run -FROM scisol extract_fmpdb.py <input>.fmp <input>.fmpdb