schrodinger.structutils.structalign2 module

A module for performing protein structure alignment.

If the alignment is performed with the SKA program, Prime must be installed and licensed appropriately.

After the structural alignment is performed, some results are added as properties of the input structures. A more comprehensive Alignment object is returned per template-mobile structure pair.

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.structutils.structalign2.Alignment(rmsd: float, length: int, transform_matrix: numpy.array, align: Tuple[Tuple[int, int]], score: float = None)

Bases: NamedTuple

Container to store alignment data.

Parameters
  • score – pairwise alignment score produced by the underlying method.

  • rmsd – RMSD of the aligned atoms after superposition.

  • length – number of residues in the alignment.

  • transform_matrix – 4x4 array in which the upper-left 3x3 block is the rotation matrix, the first three rows of the last column are the translation vector, and the 4th row is a spectator row without meaningful information.

  • align – corresponding atom indices for each aligned structure. Only one atom per residue (e.g. Calpha).
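
The layout of transform_matrix can be illustrated with a small sketch. It assumes the usual convention that the rotation is applied first and the translation second (x' = R·x + t), which the description above suggests but does not state outright. The helper apply_transform and the example matrix are hypothetical and not part of this module; plain nested lists are used so that the sketch runs without NumPy.

```python
from typing import Sequence, Tuple

Vector = Tuple[float, float, float]


def apply_transform(matrix: Sequence[Sequence[float]], xyz: Vector) -> Vector:
    """Rotate xyz with the upper-left 3x3 block, then add the translation.

    The translation is read from rows 0-2 of the last column; the 4th row
    is never used (assumed convention, see the lead-in).
    """
    return tuple(
        sum(matrix[i][j] * xyz[j] for j in range(3)) + matrix[i][3]
        for i in range(3)
    )


# Hypothetical matrix: 90-degree rotation about z, then a shift of +1 along x.
example_matrix = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],  # spectator row, ignored by apply_transform
]

moved = apply_transform(example_matrix, (1.0, 0.0, 0.0))
print(moved)  # (1.0, 1.0, 0.0)
```

With an actual NumPy array m, the same operation would be m[:3, :3] @ xyz + m[:3, 3].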

rmsd: float

Alias for field number 0

length: int

Alias for field number 1

transform_matrix: numpy.array

Alias for field number 2

align: Tuple[Tuple[int, int]]

Alias for field number 3

score: float

Alias for field number 4

schrodinger.structutils.structalign2.get_guide_atoms(st)

Return a list of guide atoms for a structure.

Guide atoms are CA atoms for protein residues, and C4’ atoms for nucleic acid residues. Any other type of residue (e.g. small molecule) is ignored.

Parameters

st (structure.Structure) – Input structure to process.

schrodinger.structutils.structalign2.map_corresponding_atoms_by_alignment(st_ref, st_mobile, max_dist=1.5)

Map corresponding residues based on shortest guide atom distance.

Requires st_ref and st_mobile to be aligned. Two residues are considered aligned to each other if the distance between their respective guide atoms is smaller than max_dist. If multiple residues are within max_dist of a given guide atom in the reference structure, the closest one is assigned as the corresponding residue.

Returns a tuple of tuples with equivalent atom indices.
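
The distance rule described above can be sketched in plain Python. This is an illustration of the idea only, not the module's implementation: the function map_by_distance and the coordinate dictionaries (atom index mapped to guide-atom coordinates) are hypothetical stand-ins for real structures, and the sketch does not try to prevent one mobile atom from being matched to several reference atoms.

```python
import math
from typing import Dict, Tuple

Coords = Dict[int, Tuple[float, float, float]]


def map_by_distance(
    ref: Coords, mobile: Coords, max_dist: float = 1.5
) -> Tuple[Tuple[int, int], ...]:
    """Pair each reference guide atom with the closest mobile guide atom
    that lies strictly closer than max_dist; skip atoms without a match."""
    pairs = []
    for ref_idx, ref_xyz in ref.items():
        best_idx, best_dist = None, max_dist
        for mob_idx, mob_xyz in mobile.items():
            dist = math.dist(ref_xyz, mob_xyz)
            if dist < best_dist:
                best_idx, best_dist = mob_idx, dist
        if best_idx is not None:
            pairs.append((ref_idx, best_idx))
    return tuple(pairs)


# Hypothetical guide-atom coordinates of two already superposed structures.
ref_atoms = {1: (0.0, 0.0, 0.0), 5: (3.8, 0.0, 0.0)}
mobile_atoms = {2: (0.4, 0.0, 0.0), 7: (1.2, 0.0, 0.0), 11: (9.0, 0.0, 0.0)}

pairs = map_by_distance(ref_atoms, mobile_atoms)
print(pairs)  # ((1, 2),)
```

Reference atom 1 has two candidates under 1.5 Å (atoms 2 and 7) and is paired with the closer one; reference atom 5 has none and is left unmapped.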

schrodinger.structutils.structalign2.cealign(st_ref, st_mobile, *, atlist_ref=None, atlist_mobile=None, transform_mobile=True, window_size=8, max_gap=30, extend_alignment=True)

Align two structures using the CEAlign method.

Based on the method described in:

Shindyalov IN, Bourne PE. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng. 1998 Sep;11(9):739-47. doi: 10.1093/protein/11.9.739. PMID: 9796821.

Parameters
  • st_ref (structure.Structure) – Reference structure, always remains fixed.

  • st_mobile (structure.Structure) – Mobile structure to be aligned onto the reference.

  • atlist_ref (list(int)) – List of atom indices from reference structure to be used for the alignment and superposition. If None, uses guide atoms.

  • atlist_mobile (list(int)) – List of atom indices from mobile structure to be used for the alignment and superposition. If None, uses guide atoms.

  • transform_mobile (bool) – If True (default), the coordinates of the mobile structure will be modified after calculating the optimal alignment.

  • window_size (float) – CE algorithm parameter. Defines the length of fragments to be used for the alignment. Smaller values may yield more accurate alignments, up to a certain point, at the expense of computation time. Default is 8, as defined in the original publication.

  • max_gap (float) – CE algorithm parameter. Size of maximum gap allowed between two aligned fragment pairs, in number of residues. Smaller values yield faster computation times, but might prevent similar fragments from being found by the algorithm. Default is 30, as defined in the original publication and found to be a reasonable compromise between computational cost and alignment quality.

  • extend_alignment (bool) – extend the alignment to the full structure. Since CEAlign produces a local alignment, it can be useful to obtain a full mapping of which atoms correspond to which. This extension is done by analyzing distances between atoms in the structures after running CEAlign. This option has no effect on the reported RMSD. Default is True.

Return type

Alignment

Returns

a named tuple with the RMSD (in Angstrom) between the two structures after superposition, the alignment length, a mapping of the aligned residues, and the transform matrix as a 4x4 numpy array.
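
A minimal usage sketch, based only on the signature and field names documented here. The wrapper superpose_with_ce is hypothetical, and the two arguments are assumed to be structure.Structure objects loaded elsewhere. The import is placed inside the function so that the sketch can be loaded on a machine without a Schrodinger installation; calling it requires one.

```python
def superpose_with_ce(st_ref, st_mobile):
    """Run CEAlign on two already loaded structures and return
    (rmsd, length, align) taken from the resulting Alignment."""
    # Lazy import: only needed when the function is actually called.
    from schrodinger.structutils import structalign2

    result = structalign2.cealign(
        st_ref,
        st_mobile,
        transform_mobile=False,  # leave the mobile coordinates untouched
        window_size=8,           # documented default
        max_gap=30,              # documented default
    )
    return result.rmsd, result.length, result.align
```

Passing transform_mobile=False is useful when only the numbers are needed; result.transform_matrix can still be applied to the mobile structure later.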

schrodinger.structutils.structalign2.ska_pairwise_align(st_ref, st_mobile, *, asl_ref=None, asl_mobile=None, transform_mobile=True, use_scanning_alignment=False, use_automatic_settings=False, use_standard_residues=False, reorder_by_connectivity=False, reckless_align=False, window_length=5, minimum_length=2, minimum_similarity=1.0, gap_penalty=2.0, deletion_penalty=1.0, custom_logger=None)

Align two structures using SKA.

Parameters
  • st_ref – reference structure, remains fixed.

  • st_mobile – mobile structure, will be transformed to minimize RMSD to the reference after the alignment.

  • asl_ref – ASL to identify residues in the reference structure to be used for the alignment. If None, will consider all residues.

  • asl_mobile – ASL to identify residues in the mobile structure to be used for the alignment. If None, will consider all residues.

  • transform_mobile (bool) – If True (default), the coordinates of the mobile structure will be modified after calculating the optimal alignment.

  • use_scanning_alignment – will use a scanning alignment based on the specified window length if the sequences have low homology. Default is False.

  • use_automatic_settings – use automatic settings to produce a better alignment. Default is False.

  • use_standard_residues – if True, residue names will be translated to standard forms (e.g. HIE -> HIS). Default is False.

  • reorder_by_connectivity – if True, structure will be reordered by atom connectivity (N to C terminus in proteins). Default is False.

  • reckless_align – run SKA alignment in reckless mode, which produces an alignment even when similarity is very low. Default is False.

  • window_length – number of residues to consider in each window when scanning alignment mode is enabled. Default is 5.

  • minimum_length – minimum length of sequences needed to enable scanning alignment mode. Default is 2.

  • minimum_similarity – minimum similarity needed to enable alignment. Default is 1.0.

  • gap_penalty – sequence gap penalty when building the alignment. Default is 2.0.

  • deletion_penalty – sequence deletion penalty when building the alignment. Default is 1.0.

  • custom_logger – logger object used to emit log messages. If None, use the module logger object.

Return type

Alignment

Returns

a named tuple with the RMSD (in Angstrom) between the two structures after superposition, the alignment length, a mapping of the aligned residues, and the transform matrix as a 4x4 numpy array.
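
As a hedged illustration of the ASL parameters, the sketch below restricts the alignment to chain A of both structures. The wrapper name and the choice of chain are hypothetical; the keyword arguments are the ones listed in the signature above. Running it requires a Schrodinger installation with Prime licensed, so the import is kept inside the function.

```python
def superpose_chain_a_with_ska(st_ref, st_mobile):
    """Align chain A of st_mobile onto chain A of st_ref with SKA and
    return the resulting Alignment."""
    # Lazy import: only needed when the function is actually called.
    from schrodinger.structutils import structalign2

    return structalign2.ska_pairwise_align(
        st_ref,
        st_mobile,
        asl_ref="chain.name A",      # example ASL, adjust to the system
        asl_mobile="chain.name A",
        use_standard_residues=True,  # e.g. HIE is treated as HIS
    )
```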

schrodinger.structutils.structalign2.align_many(st_ref, mobile_st_list, **kwargs)

Aligns each structure in mobile_st_list to st_ref.

The coordinates of each mobile structure are modified to minimize RMSD to the reference after superposition.

Returns a list of Alignment objects containing multiple data fields about the alignments (e.g. rmsd, score).
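
One way to consume the returned list is to rank the mobile structures by RMSD. The sketch below assumes that the list is in the same order as mobile_st_list, which is not stated explicitly here. The helper rank_by_rmsd and the stand-in class FakeAlignment are hypothetical; the helper only relies on each item exposing an rmsd field, as the documented Alignment does.

```python
from typing import Iterable, List, NamedTuple, Tuple


def rank_by_rmsd(labels: Iterable[str], alignments: Iterable) -> List[Tuple[str, float]]:
    """Pair each label with the rmsd of its alignment, lowest RMSD first."""
    paired = [(label, aln.rmsd) for label, aln in zip(labels, alignments)]
    return sorted(paired, key=lambda item: item[1])


class FakeAlignment(NamedTuple):
    """Stand-in for the objects returned by align_many, for demonstration."""
    rmsd: float
    length: int


ranking = rank_by_rmsd(
    ["model_a", "model_b", "model_c"],
    [FakeAlignment(2.1, 90), FakeAlignment(0.8, 120), FakeAlignment(1.5, 100)],
)
print(ranking)  # [('model_b', 0.8), ('model_c', 1.5), ('model_a', 2.1)]
```

With real data, the second argument would be the list returned by align_many(st_ref, mobile_st_list).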

schrodinger.structutils.structalign2.align_pair(st_ref, st_mobile, **kwargs)

Aligns st_mobile to st_ref.

The coordinates of the mobile structure are modified to minimize RMSD after superposition.

Returns an Alignment object containing multiple data fields about the alignment (e.g. rmsd, score).