schrodinger.protein.analysis module¶
A class for diagnosing and reporting common structural problems of protein complexes.
Usage:
    instance = Report(ct)
    instance.write_to_stdout()
Copyright Schrodinger, LLC. All rights reserved.
- schrodinger.protein.analysis.use_gfactors = False¶
Use gfactors instead of deviations.
- schrodinger.protein.analysis.kT = 0.5925¶
Value of kT at room temperature.
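The documentation does not state the units of kT. The value is consistent with kcal/mol at about 298 K, which the following sketch checks under that assumption (the constant names below are illustrative and are not part of the module):

```python
# Assumption: kT is in kcal/mol and "room temperature" means 298.15 K.
GAS_CONSTANT_KCAL_PER_MOL_K = 0.0019872041  # R in kcal/(mol*K)
ROOM_TEMPERATURE_K = 298.15

kT = GAS_CONSTANT_KCAL_PER_MOL_K * ROOM_TEMPERATURE_K
print(round(kT, 4))  # 0.5925
```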
- class schrodinger.protein.analysis.Report(ct, sets_to_run=('ALL',))¶
Bases:
object
A class to calculate properties of proteins.
To use this class in a script to compute a set of data:
    reporter = Report(struct, ['SIDECHAIN DIHEDRALS', 'GAMMA BFACTORS'])
    dihedrals = reporter.get_set('SIDECHAIN DIHEDRALS')
    for point in dihedrals.points:
        resnum = point.descriptor.split()[1]
        resid = point.descriptor.split(':')[0] + ':' + resnum
        for val in point.values:
            try:
                true_value = float(val)
            except ValueError:
                # Some angles will have '-' if they aren't defined for
                # this residue type
                continue
            do_something_with_chi_angle(true_value)
- __init__(ct, sets_to_run=('ALL',))¶
Create a Report instance.
- Parameters
ct (Structure) – The structure this Report operates on
sets_to_run (list) – Either [‘ALL’] (default) or one or more of the valid sets.
Valid sets to compute:
‘STERIC CLASHES’
‘PRIMEX STERIC CLASHES’
‘BOND LENGTHS’
‘BOND ANGLES’
‘BACKBONE DIHEDRALS’
‘SIDECHAIN DIHEDRALS’
‘GFACTOR SUMMARY’
‘BFACTORS’
‘GAMMA BFACTORS’
‘PEPTIDE PLANARITY’
‘SIDECHAIN PLANARITY’
‘IMPROPER TORSIONS’
‘CHIRALITY’
‘MISSING ATOMS’
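How Report reacts to a misspelled label is not documented here, so a calling script may want to check labels before constructing the object. A minimal sketch, where check_set_labels is a hypothetical helper and VALID_SETS simply copies the list above:

```python
# Labels copied from the list of valid sets documented for Report.__init__.
VALID_SETS = (
    'STERIC CLASHES', 'PRIMEX STERIC CLASHES', 'BOND LENGTHS', 'BOND ANGLES',
    'BACKBONE DIHEDRALS', 'SIDECHAIN DIHEDRALS', 'GFACTOR SUMMARY',
    'BFACTORS', 'GAMMA BFACTORS', 'PEPTIDE PLANARITY',
    'SIDECHAIN PLANARITY', 'IMPROPER TORSIONS', 'CHIRALITY', 'MISSING ATOMS',
)


def check_set_labels(sets_to_run=('ALL',)):
    """Return the labels as a list, raising ValueError on unknown names.

    Hypothetical helper; not part of schrodinger.protein.analysis.
    """
    labels = list(sets_to_run)
    if labels == ['ALL']:
        return labels
    unknown = [label for label in labels if label not in VALID_SETS]
    if unknown:
        raise ValueError(f"Unknown set label(s): {unknown}")
    return labels
```

The checked list can then be passed on as the sets_to_run argument.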
- class data_set(label)¶
Bases:
object
Base class for all data sets.
- Variables
label (str) – the name of this data set
title (str) – the name of this data set when printing out the data
fields (list[str]) – first item is the name of the data_point descriptor property, remaining items are the names of the items in the data_point values property
points (list(data_point)) – list of data_point objects
summary (str) – overall summary of the data set for the entire protein
bad_points (list(data_point)) – a subset of problematic points for filtering in table and bubble plot
count (int) – the number of violations, in most cases the length of data_points, but for ‘global’ or non-residue specific properties it may be just 0 (for no issues) or 1 (e.g., X-Ray check)
score (float) – raw quality score, which has higher priority
bubble_scale (float) – normalized scale
color (str) – bubble color used as a pylab color argument
area (float) – bubble area
- __init__(label)¶
- class data_point(descriptor='', values=[], atoms=[])¶
Bases:
object
Class that holds the data for each point in a data set
- Variables
descriptor (str) – label for this point - typically the user friendly names of the atom or residues involved
values (list(float or str)) – the values at this point - varies by subclass
atoms (list(int)) – the atoms involved in this point
- __init__(descriptor='', values=[], atoms=[])¶
- add_point(descriptor='', values=[], atoms=[])¶
Add a new point to the points property
- Parameters
descriptor (str) – Label for this point - typically the user-friendly names of the atom or residues involved
values (list) – The values at this point - varies by subclass
atoms (list) – The atoms involved in this point
- report_data_points()¶
Return all data points for this set in a list
- Return type
list
- Returns
list of data for each point in self.points, each item is a list whose first item is the point.descriptor and remaining items are the point.values items.
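The row layout described above can be illustrated with stand-in objects. This is a sketch of the documented shape only, not the library's code; DataPoint and rows_from_points are hypothetical names:

```python
class DataPoint:
    """Stand-in for Report.data_set.data_point (illustrative only)."""

    def __init__(self, descriptor='', values=None, atoms=None):
        self.descriptor = descriptor
        self.values = list(values or [])
        self.atoms = list(atoms or [])


def rows_from_points(points):
    """Build one row per point: the descriptor followed by the values."""
    return [[point.descriptor] + list(point.values) for point in points]


points = [DataPoint('A:ALA   12', [-60.0, '-'], [5, 6, 7])]
rows = rows_from_points(points)
print(rows)
```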
- analyze(parent)¶
Must be subclassed, this implementation does nothing
- Parameters
parent (Report object) – The Report object that this is for
- report()¶
Must be subclassed, this implementation does nothing
- class steric_clash_data_set(*args, **kwargs)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for Steric Clashes.
Data point descriptor is atoms involved, values are “Distance”, “Min Allowed”, “Delta”.
Summary is N/A
See parent class data_set for additional documentation.
- __init__(*args, **kwargs)¶
- within_three_bonds(protein, iatom, target_atom)¶
Method to determine whether two atoms are within three bonds of each other.
- Parameters
protein (Report.local_protein) – The protein
iatom (int) – atom number of first atom
target_atom (int) – atom number of the second atom
- Return type
bool
- Returns
True if atoms are within 3 bonds of each other, False if not
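A bonded-distance test of this kind is commonly written as a depth-limited breadth-first search over the bond graph. The sketch below shows that general technique on a plain adjacency dictionary; it is not the library's implementation, and within_n_bonds and the bonds mapping are assumptions made for illustration:

```python
from collections import deque


def within_n_bonds(bonds, start, target, max_bonds=3):
    """Return True if target is reachable from start in at most max_bonds bonds.

    bonds maps an atom index to the indices of the atoms bonded to it.
    """
    if start == target:
        return True
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, depth = queue.popleft()
        if depth == max_bonds:
            continue  # do not expand beyond the bond limit
        for neighbor in bonds.get(atom, ()):
            if neighbor == target:
                return True
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return False


# A linear chain of five atoms: 1-2-3-4-5
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(within_n_bonds(chain, 1, 4))  # True: three bonds apart
print(within_n_bonds(chain, 1, 5))  # False: four bonds apart
```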
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- run_analysis(protein)¶
Iterate over the atom pairs and record clashes.
- Parameters
protein (Report.local_protein) – The protein
- analyze_pair(protein, atom1, atom2)¶
Test a pair of atoms and record data if they clash. Order does not matter.
- Parameters
protein (Report.local_protein) – The protein
atom1 (Report.local_atom) – One atom of the pair
atom2 (Report.local_atom) – The second atom of the pair
- check_hbond(protein, atom1, atom2, clash=None, distance=None, require_hydrogen=True)¶
Test an atom pair to see whether it can be considered a hydrogen bond. The presence of an H-bond makes permissible an atom proximity that would normally be considered a clash. Atom order does not matter.
- Parameters
protein (Report.local_protein) – The protein
atom1 (Report.local_atom) – atom 1
atom2 (Report.local_atom) – atom 2
clash (float or None) – Pre-computed clash ratio
distance (float or None) – Pre-computed distance
require_hydrogen (bool) – Whether an intervening hydrogen must be found to qualify as an H-bond.
- find_hbond_hydrogen(protein, donor, acceptor, distance=None)¶
Locate a hydrogen that is bound to the donor and is closer to the acceptor than the donor is.
- Parameters
protein (Report.local_protein) – The protein
donor (Report.local_atom) – Donor atom
acceptor (Report.local_atom) – Acceptor atom
distance (float or None) – Pre-computed distance between donor and acceptor
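The geometric criterion can be sketched with plain coordinates. This is an illustration under assumptions, not the library's code: closest_bridging_hydrogen is a hypothetical helper, atoms are represented as (x, y, z) tuples, and picking the hydrogen nearest to the acceptor when several qualify is a choice made here.

```python
import math


def closest_bridging_hydrogen(donor_xyz, acceptor_xyz, hydrogens_xyz):
    """Return the index of a donor-bound hydrogen that lies closer to the
    acceptor than the donor does, or None if no hydrogen qualifies.

    hydrogens_xyz holds coordinates of the hydrogens bonded to the donor.
    """
    best_index = None
    best_distance = math.dist(donor_xyz, acceptor_xyz)
    for index, h_xyz in enumerate(hydrogens_xyz):
        h_distance = math.dist(h_xyz, acceptor_xyz)
        if h_distance < best_distance:
            best_index, best_distance = index, h_distance
    return best_index


donor = (0.0, 0.0, 0.0)
acceptor = (3.0, 0.0, 0.0)
hydrogens = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(closest_bridging_hydrogen(donor, acceptor, hydrogens))  # 1
```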
- class primex_steric_clash_data_set(*args, **kwargs)¶
Bases:
schrodinger.protein.analysis.Report.steric_clash_data_set
A subclass of steric_clash_data_set that computes clashes using a different set of criteria used by PrimeX Polish.
- __init__(*args, **kwargs)¶
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- check_hbond(protein, atom1, atom2, clash=None, distance=None)¶
Runs a number of quick checks first before calling the super class’ method. See the super class for more information.
- add_point(descriptor, values, atoms)¶
Makes changes to how the output data is recorded relative to the super class.
- class bond_length_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for Bond Length Deviations
Data point descriptor is atoms involved, values are “Deviation”
Summary is the RMS of the deviations.
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
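The RMS summary reported for bond length deviations is the square root of the mean squared deviation. A minimal sketch of that formula, where rms is a hypothetical helper and returning 0.0 for an empty input is an assumption made here:

```python
import math


def rms(deviations):
    """Root mean square of a sequence of deviations (0.0 if empty)."""
    values = list(deviations)
    if not values:
        return 0.0
    return math.sqrt(sum(value * value for value in values) / len(values))


# Signs do not matter because each deviation is squared.
print(rms([0.03, -0.04, 0.0]))
```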
- class bond_angle_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for Bond Angle Deviations
Data point descriptor is atoms involved, values are “Deviation”
Summary is the RMS of the deviations.
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class backbone_dihedral_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for Backbone Dihedrals
Data point descriptor is the residue involved, values are “Phi”, “Psi”, “G-Factor”
Summary is N/A
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class sidechain_dihedral_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for Sidechain Dihedrals
Data point descriptor is the residue involved, values are “Chi1”, “Chi2”, “G-Factor”
Summary is N/A
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class Gfactor_summary_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for G-factor summaries of the Backbone and Sidechain dihedrals.
Data point descriptor is the residue involved, values are “Backbone”, “Sidechain”, “Total”
Summary is N/A
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class Bfactor_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for B-Factors
Data point descriptor is the residue involved, values are “Backbone”, “BBStdDev”, “Sidechain”, “SCStdDev”
Summary is the average B-Factor for backbone and sidechain atoms
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class gamma_Bfactor_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for B-Factors of sidechain gamma atoms
Data point descriptor is the residue involved, value is “B-Factor”
Summary is the average B-Factor for gamma atoms
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class peptide_planarity_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold data for Peptide Planarity
Data point descriptor is the atoms involved, value is “Dihedral Angle”
Summary is the average absolute planarity
See parent class data_set for additional documentation.
- MAX_PEPTIDE_BOND_LENGTH = 3.0¶
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
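The documentation does not define "average absolute planarity" precisely. One plausible reading, assumed here, is the mean absolute deviation of each peptide dihedral from its nearest planar value (0 or 180 degrees). The sketch below shows that reading only; planarity_deviation and mean_planarity_deviation are hypothetical helpers:

```python
def planarity_deviation(dihedral_degrees):
    """Absolute deviation of a dihedral from the nearest planar value.

    Assumption: planar means 0 degrees (cis) or 180 degrees (trans).
    """
    wrapped = abs((dihedral_degrees + 180.0) % 360.0 - 180.0)  # in [0, 180]
    return min(wrapped, 180.0 - wrapped)


def mean_planarity_deviation(dihedrals):
    """Average of the per-dihedral deviations (0.0 for an empty input)."""
    values = [planarity_deviation(angle) for angle in dihedrals]
    return sum(values) / len(values) if values else 0.0


print(mean_planarity_deviation([175.0, -178.0, 3.0]))
```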
- class sidechain_planarity_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold RMS data for planarity of sidechains
Data point descriptor is the residue involved, value is “RMSD From Planarity”
Summary is the average RMSD from planarity of sidechains
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class improper_torsion_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold RMS data for improper torsions
Data point descriptor is the residue involved, value is “RMS Deviation”
Summary is the average RMSD for improper torsions
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class chirality_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold C-alpha chirality data
Data point descriptor is the residue involved, value is C-alpha “Chirality”
Summary is N/A
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class missing_atoms_data_set(label)¶
Bases:
schrodinger.protein.analysis.Report.data_set
Class to compute and hold information on missing atoms
Data point descriptor is the residue involved, value is nothing
Summary is N/A
See parent class data_set for additional documentation.
- analyze(parent)¶
Compute and store the data for this class.
- Parameters
parent (Report object) – The Report object that this is for
- class local_atom¶
Bases:
object
Private class used to store atom information locally for speed
- __init__()¶
- class local_residue¶
Bases:
object
Private class used to store residue information locally for convenience
- __init__()¶
- get_backbone_indices()¶
- get_sidechain_indices()¶
- get_atom_indices()¶
- class local_protein¶
Bases:
object
Private class used to store protein information locally for convenience
- __init__()¶
- get_vdw_radius(atom, vdwr_mode=0)¶
- setup_protein(ct)¶
An internal method used to set up the protein for Report calculations. This method should be considered a private method of the Report class and need not be explicitly called by the user/calling script.
- Parameters
ct (Structure) – The structure this method operates on
- make_local_protein(ct, keep_hydrogens=False, vdwr_mode=0)¶
- Parameters
ct (Structure) – The structure this method operates on
keep_hydrogens (bool) – Whether to include hydrogens in the local protein
vdwr_mode (int) – Which VDWR source should be used
- get_residue_name(iatom, protein=None)¶
Create a residue name for the atom iatom.
- Parameters
iatom (int) – the atom index to build a residue name for
- Return type
str
- Returns
A residue name of the form: Chain:PDB_residue_codeResidue_numberInsertion_code where PDB_residue_code is a 4 character field and Residue_number is a 3 character field
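The field layout can be illustrated with a format string. The field widths come from the description above, but the alignment (residue code left-justified, residue number right-justified) is an assumption, and format_residue_name is a hypothetical helper rather than the library method:

```python
def format_residue_name(chain, pdb_code, resnum, inscode=''):
    """Build Chain:PDB_residue_codeResidue_numberInsertion_code.

    Assumption: the 4-character code is left-justified and the
    3-character number is right-justified.
    """
    return f"{chain}:{pdb_code:<4}{resnum:>3}{inscode}"


name = format_residue_name('A', 'ALA', 12)
chain_id = name.split(':')[0]   # text before the colon
number = name.split()[1]        # second whitespace-separated token
print(repr(name), chain_id, number)
```

Under these assumptions the name splits the same way the descriptor is split in the Report usage example: the chain precedes the colon and the residue number is the second whitespace-separated token.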
- get_set(set_label)¶
Return a single set of data that was specified using the sets_to_run parameter when the class object was created.
- Parameters
set_label (str) – One of the sets specified at Report object creation time using the sets_to_run parameter.
Valid sets are:
‘STERIC CLASHES’
‘BOND LENGTHS’
‘BOND ANGLES’
‘BACKBONE DIHEDRALS’
‘SIDECHAIN DIHEDRALS’
‘GFACTOR SUMMARY’
‘BFACTORS’
‘GAMMA BFACTORS’
‘PEPTIDE PLANARITY’
‘SIDECHAIN PLANARITY’
‘IMPROPER TORSIONS’
‘CHIRALITY’
‘MISSING ATOMS’
If the requested set was not specified at the Report object creation time, None will be returned.
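Because get_set returns None for a set that was not requested, calling code should check the result before reading its points. A minimal sketch using duck-typed stand-ins so that it runs without the Schrodinger libraries; points_or_empty, FakeSet and FakeReport are hypothetical names:

```python
def points_or_empty(reporter, set_label):
    """Return the points of a data set, or an empty list if the set
    was not computed (get_set returned None)."""
    data = reporter.get_set(set_label)
    if data is None:
        return []
    return list(data.points)


class FakeSet:
    """Stand-in for a data_set; only the points attribute is modelled."""

    def __init__(self, points):
        self.points = points


class FakeReport:
    """Stand-in for Report; only get_set is modelled."""

    def __init__(self, sets):
        self._sets = sets

    def get_set(self, set_label):
        return self._sets.get(set_label)


reporter = FakeReport({'BFACTORS': FakeSet(['p1', 'p2'])})
print(points_or_empty(reporter, 'BFACTORS'))   # ['p1', 'p2']
print(points_or_empty(reporter, 'CHIRALITY'))  # []
```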
- write_to_stdout()¶
Write each of the computed sets of data to the terminal