schrodinger.protein.analysis module

A class for diagnosing and reporting common structural problems of protein complexes.

Usage:

instance = Report(ct)
instance.write_to_stdout()

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.protein.analysis.use_gfactors = False

Use G-factors instead of deviations.

schrodinger.protein.analysis.kT = 0.5925

Value of kT at room temperature.
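The source does not state the units of kT. The value is consistent with kcal/mol at 298.15 K, and the following minimal sketch checks that under those two assumptions (the constant names are illustrative and are not part of the module).

```python
# Gas constant (Boltzmann constant per mole) in kcal/(mol*K).
GAS_CONSTANT_KCAL_PER_MOL_K = 0.0019872041

# Assumed "room temperature" of 25 degrees Celsius, in kelvin.
ROOM_TEMPERATURE_K = 298.15

# kT in kcal/mol under the assumptions above.
kT = GAS_CONSTANT_KCAL_PER_MOL_K * ROOM_TEMPERATURE_K
print(round(kT, 4))
```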

class schrodinger.protein.analysis.Report(ct, sets_to_run=('ALL',))

Bases: object

A class to calculate properties of proteins.

To use this class in a script to compute a set of data:

reporter = Report(struct, ['SIDECHAIN DIHEDRALS', 'GAMMA BFACTORS'])
dihedrals = reporter.get_set('SIDECHAIN DIHEDRALS')
for point in dihedrals.points:
    resnum = point.descriptor.split()[1]
    resid = point.descriptor.split(':')[0] + ':' + resnum
    for val in point.values:
        try:
            true_value = float(val)
        except ValueError:
            # Some angles will have '-' if they aren't defined for
            # this residue type
            continue
        do_something_with_chi_angle(true_value)
__init__(ct, sets_to_run=('ALL',))

Create a Report instance.

Parameters
  • ct (Structure) – The structure this Report operates on

  • sets_to_run (list) – Either [‘ALL’] (default) or a list of one or more of the valid sets below.

Valid sets to compute:

  • ‘STERIC CLASHES’

  • ‘PRIMEX STERIC CLASHES’

  • ‘BOND LENGTHS’

  • ‘BOND ANGLES’

  • ‘BACKBONE DIHEDRALS’

  • ‘SIDECHAIN DIHEDRALS’

  • ‘GFACTOR SUMMARY’

  • ‘BFACTORS’

  • ‘GAMMA BFACTORS’

  • ‘PEPTIDE PLANARITY’

  • ‘SIDECHAIN PLANARITY’

  • ‘IMPROPER TORSIONS’

  • ‘CHIRALITY’

  • ‘MISSING ATOMS’

class data_set(label)

Bases: object

Base class for all data sets.

Variables
  • label (str) – the name of this data set

  • title (str) – the name of this data set when printing out the data

  • fields (list[str]) – first item is the name of the data_point descriptor property, remaining items are the names of the items in the data_point values property

  • points (list(data_point)) – list of data_point objects

  • summary (str) – overall summary of the data set for the entire protein

  • bad_points (list(data_point)) – a subset of problematic points for filtering in table and bubble plot

  • count (int) – the number of violations, in most cases the length of data_points, but for ‘global’ or non-residue specific properties it may be just 0 (for no issues) or 1 (e.g., X-Ray check)

  • score (float) – raw quality score, which has higher priority

  • bubble_scale (float) – normalized scale

  • color (str) – bubble color used as a pylab color argument

  • area (float) – bubble area

__init__(label)
class data_point(descriptor='', values=[], atoms=[])

Bases: object

Class that holds the data for each point in a data set

Variables
  • descriptor (str) – label for this point, typically the user-friendly names of the atoms or residues involved

  • values (list(float or str)) – the values at this point - varies by subclass

  • atoms (list(int)) – the atoms involved in this point

__init__(descriptor='', values=[], atoms=[])
add_point(descriptor='', values=[], atoms=[])

Add a new point to the points property

Parameters
  • descriptor (str) – Label for this point, typically the user-friendly names of the atoms or residues involved

  • values (list) – The values at this point - varies by subclass

  • atoms (list) – The atoms involved in this point

report_data_points()

Return all data points for this set in a list

Return type

list

Returns

A list with one entry per point in self.points. Each entry is a list whose first item is point.descriptor and whose remaining items are the items of point.values.
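The row layout described above can be sketched without the library. SketchPoint is a hypothetical stand-in for data_point, rows_from_points is an illustrative helper rather than the actual implementation, and the descriptors and values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class SketchPoint:
    """Hypothetical stand-in for Report.data_set.data_point."""
    descriptor: str = ''
    values: List[Union[float, str]] = field(default_factory=list)
    atoms: List[int] = field(default_factory=list)


def rows_from_points(points):
    """Build one row per point: the descriptor followed by the values."""
    return [[point.descriptor] + list(point.values) for point in points]


# Made-up points; '-' marks a value that is not defined for the residue.
points = [
    SketchPoint('A:ALA  12', [-60.5, '-', 0.12], [101, 102]),
    SketchPoint('A:GLY  13', [178.0, 65.2, -0.40], [110, 111]),
]
rows = rows_from_points(points)
for row in rows:
    print(row)
```

Each row lines up with the fields variable of the data set: the first column is the descriptor and the remaining columns are the values.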

analyze(parent)

Must be overridden in subclasses; this implementation does nothing

Parameters

parent (Report object) – The Report object that this is for

report()

Must be overridden in subclasses; this implementation does nothing

class steric_clash_data_set(*args, **kwargs)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for Steric Clashes.

Data point descriptor is atoms involved, values are “Distance”, “Min Allowed”, “Delta”.

Summary is N/A

See parent class data_set for additional documentation

__init__(*args, **kwargs)
within_three_bonds(protein, iatom, target_atom)

Method to determine whether two atoms are within three bonds of each other.

Parameters
  • protein – The protein containing the two atoms

  • iatom (int) – atom number of first atom

  • target_atom (int) – atom number of the second atom

Return type

bool

Returns

True if atoms are within 3 bonds of each other, False if not
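One common way to answer this kind of question is a depth-limited breadth-first search over the bond graph. The sketch below illustrates that technique on a plain adjacency dictionary; it is not the library's implementation, and within_n_bonds and its bonds argument are hypothetical names.

```python
from collections import deque


def within_n_bonds(bonds, start, target, max_bonds=3):
    """Return True if target is reachable from start in at most max_bonds bonds.

    bonds maps each atom index to the indices of the atoms bonded to it.
    """
    if start == target:
        return True
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, depth = queue.popleft()
        if depth == max_bonds:
            continue  # do not expand beyond the bond limit
        for neighbor in bonds.get(atom, ()):
            if neighbor == target:
                return True
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
    return False


# Linear chain 1-2-3-4-5: atoms 1 and 4 are three bonds apart, 1 and 5 are four.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(within_n_bonds(chain, 1, 4))
print(within_n_bonds(chain, 1, 5))
```

Atoms separated by only a few bonds are normally close together for covalent reasons, which is presumably why a clash check needs this test.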

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

run_analysis(protein)

Iterate over the atom pairs and record clashes.

Parameters

protein (Report.local_protein) – The protein

analyze_pair(protein, atom1, atom2)

Test a pair of atoms and record data if they clash. Order does not matter.

Parameters
  • protein (Report.local_protein) – The protein

  • atom1 (Report.local_atom) – One atom of the pair

  • atom2 (Report.local_atom) – The second atom of the pair

check_hbond(protein, atom1, atom2, clash=None, distance=None, require_hydrogen=True)

Test an atom pair to see whether it can be considered a hydrogen bond. The presence of an H-bond permits atom proximity that would normally be considered a clash. Atom order does not matter.

Parameters
  • protein (Report.local_protein) – The protein

  • atom1 (Report.local_atom) – atom 1

  • atom2 (Report.local_atom) – atom 2

  • clash (float or None) – Pre-computed clash ratio

  • distance (float or None) – Pre-computed distance

  • require_hydrogen (bool) – Whether an intervening hydrogen must be found to qualify as an H-bond.

find_hbond_hydrogen(protein, donor, acceptor, distance=None)

Locate a hydrogen that is bound to the donor and is closer to the acceptor than the donor is.

Parameters
  • protein (Report.local_protein) – The protein

  • donor (Report.local_atom) – Donor atom

  • acceptor (Report.local_atom) – Acceptor atom

  • distance (float or None) – Pre-computed distance between donor and acceptor
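The geometric test described for find_hbond_hydrogen can be sketched with plain coordinates. This is an illustration of the stated criterion only: the function name, the coordinate-tuple arguments, and the index-or-None return value are assumptions, since the source does not document what the method returns.

```python
import math


def find_bridging_hydrogen(donor_xyz, acceptor_xyz, hydrogen_xyzs, distance=None):
    """Return the index of the first donor-bound hydrogen that lies closer to
    the acceptor than the donor does, or None if there is none.

    hydrogen_xyzs holds the coordinates of the hydrogens bonded to the donor.
    distance is an optional pre-computed donor-acceptor distance.
    """
    if distance is None:
        distance = math.dist(donor_xyz, acceptor_xyz)
    for index, h_xyz in enumerate(hydrogen_xyzs):
        if math.dist(h_xyz, acceptor_xyz) < distance:
            return index
    return None


# Made-up geometry: donor at the origin, acceptor 2.9 units away along x.
donor = (0.0, 0.0, 0.0)
acceptor = (2.9, 0.0, 0.0)
hydrogens = [(-1.0, 0.0, 0.0),  # points away from the acceptor
             (1.0, 0.0, 0.0)]   # points toward the acceptor
print(find_bridging_hydrogen(donor, acceptor, hydrogens))
```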

class primex_steric_clash_data_set(*args, **kwargs)

Bases: schrodinger.protein.analysis.Report.steric_clash_data_set

A subclass of steric_clash_data_set that computes clashes using a different set of criteria used by PrimeX Polish.

__init__(*args, **kwargs)
analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

check_hbond(protein, atom1, atom2, clash=None, distance=None)

Runs a number of quick checks before calling the superclass method. See the superclass for more information.

add_point(descriptor, values, atoms)

Makes changes to how the output data is recorded relative to the super class.

class bond_length_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for Bond Length Deviations

Data point descriptor is atoms involved, values are “Deviation”

Summary is the RMS of the deviations.

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class bond_angle_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for Bond Angle Deviations

Data point descriptor is atoms involved, values are “Deviation”

Summary is the RMS of the deviations.
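For reference, the root mean square of a set of deviations can be computed as in the sketch below. The rms helper and the sample numbers are illustrative; how the class collects its deviations is not described in the source.

```python
import math


def rms(deviations):
    """Root mean square of a sequence of deviations."""
    values = list(deviations)
    if not values:
        raise ValueError('rms() requires at least one deviation')
    return math.sqrt(sum(v * v for v in values) / len(values))


# Made-up bond angle deviations; the sign does not affect the result.
deviations = [3.0, -4.0, 0.0, 5.0]
summary = rms(deviations)
print(summary)
```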

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class backbone_dihedral_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for Backbone Dihedrals

Data point descriptor is the residue involved, values are “Phi”, “Psi”, “G-Factor”

Summary is N/A

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class sidechain_dihedral_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for Sidechain Dihedrals

Data point descriptor is the residue involved, values are “Chi1”, “Chi2”, “G-Factor”

Summary is N/A

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class Gfactor_summary_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for G-factor summaries of the Backbone and Sidechain dihedrals.

Data point descriptor is the residue involved, values are “Backbone”, “Sidechain”, “Total”

Summary is N/A

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class Bfactor_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for B-Factors

Data point descriptor is the residue involved, values are “Backbone”, “BBStdDev”, “Sidechain”, “SCStdDev”

Summary is the average B-Factor for backbone and sidechain atoms
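A per-residue row with those four values could be produced along the following lines. This is a sketch under stated assumptions: the backbone atom set (N, CA, C, O), the use of the population standard deviation, and the bfactor_row helper are all illustrative choices that the source does not confirm.

```python
from statistics import mean, pstdev

# Assumed definition of backbone atoms by PDB atom name.
BACKBONE_NAMES = {'N', 'CA', 'C', 'O'}


def bfactor_row(atoms):
    """Return [backbone mean, backbone std dev, sidechain mean, sidechain std dev].

    atoms is a list of (pdb_atom_name, bfactor) pairs for one residue.
    A group with no atoms contributes None for both of its entries.
    """
    backbone = [b for name, b in atoms if name in BACKBONE_NAMES]
    sidechain = [b for name, b in atoms if name not in BACKBONE_NAMES]
    row = []
    for group in (backbone, sidechain):
        if group:
            row.extend([mean(group), pstdev(group)])
        else:
            row.extend([None, None])
    return row


# Made-up B-factors for a single residue.
residue_atoms = [('N', 10.0), ('CA', 12.0), ('C', 14.0), ('O', 16.0),
                 ('CB', 20.0), ('CG', 24.0)]
row = bfactor_row(residue_atoms)
print(row)
```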

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class gamma_Bfactor_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for B-Factors of sidechain gamma atoms

Data point descriptor is the residue involved, value is “B-Factor”

Summary is the average B-Factor for gamma atoms

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class peptide_planarity_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold data for Peptide Planarity

Data point descriptor is the atoms involved, value is “Dihedral Angle”

Summary is the average absolute planarity
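A dihedral angle over four atom positions can be computed with the standard atan2 formula, sketched below in plain Python. The helper names are illustrative, and planarity_deviation encodes one assumed reading of "planarity" (distance of the angle from the nearest of 0 or 180 degrees); the source does not define the measure it reports.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_degrees(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4 in degrees, in the range [-180, 180]."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def planarity_deviation(angle_degrees):
    """Assumed measure: distance of the angle from the nearest of 0 or 180."""
    magnitude = abs(angle_degrees)
    return min(magnitude, 180.0 - magnitude)


# Made-up coordinates with the central bond along z.
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
twisted = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(trans, cis, twisted)
```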

See parent class data_set for additional documentation

MAX_PEPTIDE_BOND_LENGTH = 3.0
analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class sidechain_planarity_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold RMS data for planarity of sidechains

Data point descriptor is the residue involved, value is “RMSD From Planarity”

Summary is the average RMSD from planarity of sidechains

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class improper_torsion_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold RMS data for improper torsions

Data point descriptor is the residue involved, value is “RMS Deviation”

Summary is the average RMS deviation for improper torsions

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class chirality_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold C-alpha chirality data

Data point descriptor is the residue involved, value is C-alpha “Chirality”

Summary is N/A

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class missing_atoms_data_set(label)

Bases: schrodinger.protein.analysis.Report.data_set

Class to compute and hold information on missing atoms

Data point descriptor is the residue involved; there are no values

Summary is N/A

See parent class data_set for additional documentation

analyze(parent)

Compute and store the data for this class

Parameters

parent (Report object) – The Report object that this is for

class local_atom

Bases: object

Private class used to store atom information locally for speed

__init__()
class local_residue

Bases: object

Private class used to store residue information locally for convenience

__init__()
get_backbone_indices()
get_sidechain_indices()
get_atom_indices()
class local_protein

Bases: object

Private class used to store protein information locally for convenience

__init__()
get_vdw_radius(atom, vdwr_mode=0)
setup_protein(ct)

An internal method used to set up the protein for Report calculations. This method should be considered a private method of the Report class and need not be explicitly called by the user/calling script.

Parameters

ct (Structure) – The structure this method operates on

make_local_protein(ct, keep_hydrogens=False, vdwr_mode=0)
Parameters
  • ct (Structure) – The structure this method operates on

  • keep_hydrogens (bool) – Whether to include hydrogens in the local protein

  • vdwr_mode (int) – Which van der Waals radius (VDWR) source to use

get_residue_name(iatom, protein=None)

Create a residue name for the atom iatom.

Parameters

iatom (int) – the atom index to build a residue name for

Return type

str

Returns

A residue name of the form Chain:PDB_residue_codeResidue_numberInsertion_code, where PDB_residue_code is a 4-character field and Residue_number is a 3-character field
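The fixed-width layout can be illustrated with a format string. This is a sketch only: the alignment within each field (residue code left-justified, residue number right-justified) is an assumption, and format_residue_name is a hypothetical helper rather than part of the module.

```python
def format_residue_name(chain, pdb_residue_code, residue_number, insertion_code=''):
    """Build Chain:PDB_residue_codeResidue_numberInsertion_code.

    The residue code is padded to 4 characters and the residue number to 3;
    the justification used here is an assumption.
    """
    return f'{chain}:{pdb_residue_code:<4}{residue_number:>3}{insertion_code}'


name = format_residue_name('A', 'ALA', 12)
print(repr(name))

# Splitting on whitespace recovers the residue number, as in the Report
# usage example, provided the residue code is shorter than its field.
print(name.split()[1])
```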

get_set(set_label)

Return a single set of data that was specified using the sets_to_run parameter when the class object was created.

Parameters

set_label (str) – One of the sets specified at Report object creation time using the sets_to_run parameter.

Valid sets are:

  • ‘STERIC CLASHES’

  • ‘BOND LENGTHS’

  • ‘BOND ANGLES’

  • ‘BACKBONE DIHEDRALS’

  • ‘SIDECHAIN DIHEDRALS’

  • ‘GFACTOR SUMMARY’

  • ‘BFACTORS’

  • ‘GAMMA BFACTORS’

  • ‘PEPTIDE PLANARITY’

  • ‘SIDECHAIN PLANARITY’

  • ‘IMPROPER TORSIONS’

  • ‘CHIRALITY’

  • ‘MISSING ATOMS’

If the requested set was not specified at the Report object creation time, None will be returned.
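Because get_set can return None, calling code may want to guard against a set that was never requested. The sketch below shows one way to do that; numeric_values is a hypothetical helper, and StandInReport exists only so the example runs without the library.

```python
from types import SimpleNamespace


def numeric_values(reporter, set_label):
    """Collect the float-convertible values from one data set.

    Returns an empty list if get_set returns None, that is, if the set
    was not requested when the Report was created.
    """
    data = reporter.get_set(set_label)
    if data is None:
        return []
    numbers = []
    for point in data.points:
        for value in point.values:
            try:
                numbers.append(float(value))
            except (TypeError, ValueError):
                continue  # skip placeholders such as '-'
    return numbers


class StandInReport:
    """Hypothetical stand-in exposing only get_set(), for illustration."""

    def __init__(self, sets):
        self._sets = sets

    def get_set(self, set_label):
        return self._sets.get(set_label)


# Made-up data for a single requested set.
dihedrals = SimpleNamespace(points=[
    SimpleNamespace(descriptor='A:ALA  12', values=['-', '-', 0.1]),
    SimpleNamespace(descriptor='A:LEU  13', values=[-65.0, 172.5, -0.3]),
])
reporter = StandInReport({'SIDECHAIN DIHEDRALS': dihedrals})

print(numeric_values(reporter, 'SIDECHAIN DIHEDRALS'))
print(numeric_values(reporter, 'BFACTORS'))
```

With a real Report the same helper should apply unchanged, as long as the returned data set exposes points and each point exposes values as documented above.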

write_to_stdout()

Write each of the computed sets of data to the terminal