schrodinger.application.bioluminate.protein.protein module

Module to gather residue property data for proteins.

Copyright (c) Schrodinger, LLC. All rights reserved

schrodinger.application.bioluminate.protein.protein.get_residue_asl(residue, ca=False)

Creates an ASL based on a residue’s chain, residue number and inscode. The ASL can optionally only include the alpha carbon of the residue.

Parameters
  • residue (schrodinger.structure._Residue) – The residue to create an ASL for

  • ca (bool) – If True, include only the alpha carbon of the residue in the ASL

Raises

RuntimeError – If the passed in residue is incorrect type

Returns

ASL expression for residue

Return type

str

schrodinger.application.bioluminate.protein.protein.get_residues_asl(residues, ca=False)

Creates an ASL based on the chains, residue numbers and inscodes of a list of residues. The ASL can optionally include only the alpha carbon of each residue.

Parameters
  • residues (list or tuple of schrodinger.structure._Residue) – The residues to create an ASL for

  • ca (bool) – If True, include only the alpha carbon of each residue in the ASL

Raises
  • RuntimeError – If residues are not a list or tuple

  • RuntimeError – If any passed in residues are incorrect type

Returns

ASL expression for all residues

Return type

str
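
The following is a minimal sketch of how an expression of this kind could be assembled from a residue's chain, residue number and inscode. It does not use the Schrodinger API: Residue is a hypothetical stand-in for schrodinger.structure._Residue, the helper names residue_asl and residues_asl are invented for illustration, and the exact ASL text produced by get_residue_asl and get_residues_asl may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for schrodinger.structure._Residue. Only the three
# identifying attributes described above are modeled.
Residue = namedtuple("Residue", ["chain", "resnum", "inscode"])


def residue_asl(residue, ca=False):
    """Build an ASL-style expression for one residue (illustrative only)."""
    if not isinstance(residue, Residue):
        raise RuntimeError("expected a Residue, got %r" % type(residue))
    parts = ['chain.name "%s"' % residue.chain, "res.num %d" % residue.resnum]
    # Only add the insertion code when it is not blank.
    if residue.inscode.strip():
        parts.append('res.inscode "%s"' % residue.inscode)
    # Optionally restrict the expression to the alpha carbon.
    if ca:
        parts.append('atom.ptype " CA "')
    return "(" + " and ".join(parts) + ")"


def residues_asl(residues, ca=False):
    """Combine the per-residue expressions with 'or' (illustrative only)."""
    if not isinstance(residues, (list, tuple)):
        raise RuntimeError("residues must be a list or tuple")
    return " or ".join(residue_asl(r, ca=ca) for r in residues)


print(residues_asl([Residue("A", 15, " "), Residue("B", 20, "C")]))
```

The type checks mirror the RuntimeError conditions listed above; everything else about the sketch is an assumption.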

schrodinger.application.bioluminate.protein.protein.valid_asl(st, asl)

Returns True/False depending on whether the asl is a valid expression or not.

schrodinger.application.bioluminate.protein.protein.get_residues_within(st, residues, within=0.0, ca=False)

Returns a list of residues in st that are within within angstroms of the given residues. By default a residue is included if any one of its atoms falls inside the cutoff: for example, if within is set to 5.5 angstroms and only a single atom of a residue lies within that distance, the residue that the atom belongs to is still included. If the ca keyword is True, the within calculation only looks at the alpha carbons of residues.

Parameters
  • st (schrodinger.structure.Structure) – Structure to evaluate, to which all residues belong

  • residues (list or tuple of schrodinger.structure._Residue) – All residues targeted for refinement

  • within (float) – Distance (angstroms) of residues to include in refinement

  • ca (bool) – Use only alpha carbons to find residues within

Returns

List of schrodinger.structure._Residue objects

Return type

list
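
The difference between the default behavior and ca=True can be sketched with plain coordinates. This is an illustration under stated assumptions, not the library implementation: Atom and Residue are hypothetical stand-ins, residues_within is an invented helper, and the choice to apply the alpha-carbon filter only to the candidate residues (not the targets) is a guess.

```python
import math
from collections import namedtuple

# Hypothetical stand-ins; the real function works on Structure/_Residue objects.
Atom = namedtuple("Atom", ["name", "xyz"])
Residue = namedtuple("Residue", ["label", "atoms"])


def residues_within(all_residues, targets, within=0.0, ca=False):
    """Return residues having an atom within `within` of any target atom."""
    target_atoms = [atom for target in targets for atom in target.atoms]
    hits = []
    for res in all_residues:
        # With ca=True only the alpha carbon of the candidate is considered.
        candidates = [a for a in res.atoms if not ca or a.name == "CA"]
        if any(math.dist(a.xyz, t.xyz) <= within
               for a in candidates for t in target_atoms):
            hits.append(res)
    return hits


target = Residue("A:10", (Atom("CA", (0.0, 0.0, 0.0)),))
near = Residue("A:11", (Atom("CA", (8.0, 0.0, 0.0)), Atom("OG", (4.0, 0.0, 0.0))))
far = Residue("A:50", (Atom("CA", (30.0, 0.0, 0.0)),))

# A single side-chain atom inside the cutoff is enough by default.
any_atom = residues_within([target, near, far], [target], within=5.5)
# With ca=True the same residue is skipped because its CA is 8 angstroms away.
ca_only = residues_within([target, near, far], [target], within=5.5, ca=True)
print([r.label for r in any_atom], [r.label for r in ca_only])
```

In this sketch the target residue is part of its own result; whether get_residues_within does the same is not stated above.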

schrodinger.application.bioluminate.protein.protein.residue_is_polar(residue)

Tests whether a residue is polar

Parameters

residue (structure._Residue) – Residue to test

Return type

bool

schrodinger.application.bioluminate.protein.protein.residue_is_nonpolar(residue)

Tests whether a residue is nonpolar (for SASA)

Parameters

residue (structure._Residue) – Residue to test

Return type

bool

schrodinger.application.bioluminate.protein.protein.atom_is_nonpolar(atom)

Returns true if the atom is considered non-polar. Here are the rules for non-polar atoms:

  • The atom’s element is a C or S

  • The atom’s element is a H and one bonded atom’s element is C or S
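
The two rules above can be written out as a short predicate. This is a sketch on a hypothetical Atom stand-in (with invented element and bonded fields), not the module's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    """Hypothetical stand-in for a structure atom."""
    element: str
    bonded: list = field(default_factory=list)


def is_nonpolar(atom):
    """Apply the two non-polar rules listed above."""
    # Rule 1: carbon and sulfur are non-polar.
    if atom.element in ("C", "S"):
        return True
    # Rule 2: hydrogen is non-polar when bonded to a carbon or sulfur.
    if atom.element == "H":
        return any(n.element in ("C", "S") for n in atom.bonded)
    return False


carbon = Atom("C")
oxygen = Atom("O")
print(is_nonpolar(Atom("H", [carbon])), is_nonpolar(Atom("H", [oxygen])))
```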

class schrodinger.application.bioluminate.protein.protein.PrimeConfig(st_filename, set_defaults=True, **kwargs)

Bases: schrodinger.application.prime.input.Prime

Class containing the methods to write Prime input files. Note that this always uses OPLS2005.

ALL_RESIDUES = 'all'
__init__(st_filename, set_defaults=True, **kwargs)

Accepts one argument which is either a path or a keyword dictionary.

addResidues(residues=None)

Adds residues to consider for refinement. The passed in argument can take the form of:

  • ASL expression

  • List of schrodinger.structure._Residue objects

  • ‘all’

  • None

prepEnergy()
prepMinimize(residues=None)
prepResidue(residues=None)
prepSidechain(residues=None)
prepSidechainCBeta(residues=None)
prepSidechainBB(residues=None)
prepActive(lig_id, residues=None)
prepLoop(start_res=None, end_res=None, res_sphere=7.5, maxcalpha=None, protocol='LOOP_BLD', loop2=None, max_jobs=0, residues=None)
Parameters
  • start_res (string) – loop start residue, e.g. A:15

  • end_res (string) – loop end residue, e.g. A:20

  • res_sphere (float) – radius of nearby residue refinement

  • maxcalpha (float) – CA atom movement constraint

  • protocol (string) – loop refinement protocol

  • loop2 (list) – the definition of the second loop, e.g. ['A:4', 'A:6']

  • residues (None) – Unused, kept for API compatibility

  • max_jobs (int) – how many processes will be run simultaneously

prepAntibodyLoop(start_res=None, end_res=None, cpus=1, residues=None)
prepBldStruct(jobname, dirname)
class schrodinger.application.bioluminate.protein.protein.PrimeStructure(jobname)

Bases: object

__init__(jobname)
createTemplateFile(template_seq, filename=None)

Writes a template PDB file as .ent

createAlignFile(reference_seq, template_seq, filename=None)

Writes an alignment file for the template. If no filename is supplied the file will be named <jobname>.aln.

Parameters
  • reference_seq (sequence) – The reference sequence

  • template_seq (sequence) – The template sequence

exception schrodinger.application.bioluminate.protein.protein.PropkaError

Bases: Exception

A custom exception for any propka failures

class schrodinger.application.bioluminate.protein.protein.OrderedResidueDict(residues, default_value=None)

Bases: collections.OrderedDict

Creates an ordered dictionary for residues in a structure

__init__(residues, default_value=None)
class schrodinger.application.bioluminate.protein.protein.PropertyCalculator(struct, jobname, cleanup=True, nbcutoff=14.0, residues=None, lig_asl=None)

Bases: object

Class for calculating properties of proteins and protein residues.

Here is an example of how to calculate properties for a protein:

from schrodinger import structure
from schrodinger.application.bioluminate import protein

# Get the input structure
st = structure.Structure.read('receptor.maegz')

# Define the properties to calculate
calculations = [ 'e_pot', 'e_internal', 'e_interaction', 'prime_energy',
                 'pka', 'sasa_polar', 'sasa_nonpolar', 'sasa_total']

# Create the calculator
calculator = protein.PropertyCalculator(st, "my_calculator_jobname")

# Calculate the properties
properties = calculator.calculate(*calculations)

In the example above the properties output would look something like this:

properties = {
    'e_pot'         : 1573.4,
    'e_internal'    : 624.7,
    'e_interaction' : 994.8,
    'prime_energy'  : 744.2,
    'pka'           : 124.1,
    'sasa_polar'    : 3122.3,
    'sasa_nonpolar' : 271.1,
    'sasa_total'    : 3393.4
}
AGGREGATE_CALCULATIONS = ['e_pot', 'prime_energy', 'pka', 'sasa_polar', 'sasa_nonpolar', 'sasa_total', 'hydropathy', 'rotatable', 'vdw_surf_comp']
RESIDUE_CALCULATIONS = ['e_pot', 'e_internal', 'e_interaction', 'pka', 'sasa_polar', 'sasa_nonpolar', 'sasa_total', 'hydropathy', 'rotatable', 'vdw_surf_comp']
__init__(struct, jobname, cleanup=True, nbcutoff=14.0, residues=None, lig_asl=None)

Construct a PropertyCalculator instance from a structure and a jobname.

Parameters
  • struct (schrodinger.structure.Structure object) – The protein structure or protein/ligand structures

  • jobname – The jobname that will be used for all calculations that require output files.

  • residues (Iterable of schrodinger.structure._Residue objects.) – An iterable of _Residue objects to analyze. If not specified, all residues in the structure are considered.

  • lig_asl (str) – The ASL for the ligand substructure. Used for calculating the vdW surface complementarity.

progress

Variable that can be used to get the progress of calculations. This variable is only set in self.calculateOverResidues. Since that method returns a generator, each step can query self.progress to get a description of the progress. This variable is a tuple with the form ( step, total steps ).
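
The progress pattern described above can be sketched without the Schrodinger API. ProgressDemo and its method are hypothetical names; the only things taken from the description are that the generator updates a (step, total steps) tuple and that the caller reads it between iterations.

```python
class ProgressDemo:
    """Hypothetical class that mimics the progress attribute described above."""

    def __init__(self, items):
        self.items = list(items)
        self.progress = (0, len(self.items))

    def calculate_over_items(self):
        total = len(self.items)
        for step, item in enumerate(self.items, start=1):
            # Update progress before yielding so the caller sees the current step.
            self.progress = (step, total)
            yield item, {"length": len(item)}


demo = ProgressDemo(["ALA", "GLY", "SER"])
seen = []
for item, values in demo.calculate_over_items():
    step, total = demo.progress
    seen.append("%s %d/%d" % (item, step, total))
print(seen)
```

Because the generator is lazy, progress only advances as the caller consumes results, which is what makes it usable for a progress bar.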

property minimizer

The minimizer used in energy calculations.

runpKa()

Runs PROPKA to get the pKa of all residues in self.struct, then sets self.pka_data.

getResiduepKa(residue)

Returns the pKa for specified residue

Parameters

residue (structure._Residue) – Residue to get pKa for

Return type

float

getTotalpKa()

Gets the sum of the pKa values for the protein.

Return type

float

setpKaData(summary, renum_map=None)

Compares residues from the PROPKA summary with the residues in self.residues. When a match is found, the summary’s pKa is set for that residue in self.pka_data.

getTotalPrimeEnergy()

Run Prime Minimization on self.struct. This will launch a job using job control. After the job completes the total energy will be taken from the first CT using the “r_psp_Prime_Energy” property.

Returns

Prime energy of protein

Return type

float

getPrimeEnergyByResidues(residues)

Run Prime Minimization on self.struct only minimizing the residues in residues. This will launch a job using job control. After the job completes the total energy will be taken from the first CT using the “r_psp_Prime_Energy” property.

Parameters

residues (list of residues) – Residues to minimize

Returns

Prime energy of protein

Return type

float

getResiduePotentialEnergy(residue)

Return the potential energy for a residue.

Parameters

residue (structure._Residue) – Residue to get potential energy for

Return type

float

getPotentialEnergyGenerator()

Return a generator that iterates over each residue in self.struct, yielding the schrodinger.structure._Residue object and its potential energy.

Return type

generator

See

schrodinger.structutils.minimize.Minimizer.getSelfEnergy

See

schrodinger.structutils.minimize.Minimizer.getInteractionEnergy

getTotalPotentialEnergy()

Get the potential energy of self.struct which is calculated using schrodinger.structutils.minimize.Minimizer. The potential energy is the sum of the internal energies and the interaction energies.

Returns

Total potential energy of all the residues

Return type

float

See

schrodinger.structutils.minimize.Minimizer.getSelfEnergy

See

schrodinger.structutils.minimize.Minimizer.getInteractionEnergy
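
As a worked illustration of the relationship stated above (potential energy as the sum of internal and interaction energies), the arithmetic looks like this. The residue labels and energy values are made up for the example, and summing per-residue potentials to get the total is an assumption about how the terms are combined.

```python
# Made-up per-residue energies, keyed by a hypothetical residue label.
internal = {"A:10": 12.5, "A:11": -3.0}
interaction = {"A:10": -20.25, "A:11": 4.75}


def residue_potential(key):
    """Potential energy of one residue: internal plus interaction."""
    return internal[key] + interaction[key]


# Total over all residues in this toy data set.
total_potential = sum(residue_potential(key) for key in internal)
print(residue_potential("A:10"), residue_potential("A:11"), total_potential)
```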

getResidueInternalEnergy(residue)

Return the residue’s internal energy.

Parameters

residue (structure._Residue) – Residue to get internal energy for

Return type

float

See

schrodinger.structutils.minimize.Minimizer.getSelfEnergy

getInternalEnergyGenerator()

Return a generator that iterates over each residue in self.struct. This yields the schrodinger.structure._Residue object and its internal energy.

Return type

generator

See

schrodinger.structutils.minimize.Minimizer.getSelfEnergy

getResidueInteractionEnergy(residue)

Return the residue’s interaction energy.

Parameters

residue (structure._Residue) – Residue to get interaction energy for

Return type

float

See

schrodinger.structutils.minimize.Minimizer.getInteractionEnergy

getInteractionEnergyGenerator()

Return a generator that iterates over each residue in self.struct. This yields the schrodinger.structure._Residue object and its interaction energy.

Return type

generator

See

schrodinger.structutils.minimize.Minimizer.getInteractionEnergy

getResidueAtomicPolarSASA(residue, sidechain=False)

Returns SASA for all polar atoms in residue

Parameters
  • residue (structure._Residue) – Residue to get atomic polar SASA contribution for

  • sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getAtomicPolarSASAGenerator(sidechain=False)

Returns a generator that yields the schrodinger.structure._Residue object and its calculated SASA for only the polar atoms in each residue in self.struct.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

generator

getResidueAtomicNonPolarSASA(residue, sidechain=False)

Returns SASA for only the nonpolar atoms in residue

Parameters
  • residue (structure._Residue) – Residue to get atomic nonpolar SASA contribution for

  • sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getAtomicNonPolarSASAGenerator(sidechain=False)

Returns a generator that yields the schrodinger.structure._Residue object and its calculated SASA for only the nonpolar atoms in each residue in self.struct.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

generator

getResidueSASA(residue, sidechain=False)

Returns the SASA for residue.

Parameters
  • residue (structure._Residue) – Residue to get SASA for

  • sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getSASAPolarGenerator(sidechain=False)

Returns a generator that yields the schrodinger.structure._Residue object and its calculated SASA for each polar residue in self.struct.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

generator

getTotalSASAPolar(sidechain=False)

Returns the total approximate solvent accessible surface area for all polar residues.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getSASANonPolarGenerator(sidechain=False)

Returns a generator that yields the schrodinger.structure._Residue object and its calculated SASA for each nonpolar residue in self.struct.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

generator

getTotalSASANonPolar(sidechain=False)

Returns the total approximate solvent accessible surface area for all non-polar residues.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getSASAGenerator(sidechain=False)

Returns a generator that yields the schrodinger.structure._Residue object and its calculated SASA for each residue in self.struct.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

generator

getTotalSASA(sidechain=False)

Returns the total approximate solvent accessible surface area for all residues.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getResidueHydropathy(residue, sidechain=False)

Returns hydropathy value for residue

Parameters
  • residue (structure._Residue) – Residue to get hydropathy value for

  • sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getHydropathyGenerator(sidechain=False)

Returns a generator that yields the schrodinger.structure._Residue object and its calculated hydropathy for each residue in self.struct.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

generator

getTotalHydropathy(sidechain=False)

Returns the total calculated hydropathy value for all residues.

Parameters

sidechain (bool) – Only consider sidechain atoms when calculating SASA

Return type

float

getResidueRotatableBonds(residue)

Return the number of rotors for a residue.

Parameters

residue (structure._Residue) – Residue to get rotor count for

Return type

int

getRotatableBondsGenerator()

Returns a generator that yields the schrodinger.structure._Residue object and its number of rotors for each residue in self.struct.

Return type

generator

getTotalRotatableBonds()
Returns

Sum of rotors for all residues.

Return type

float

getTotalSurfComp()
Returns

Median of vdW surface complementarity values for all surface points for all residues.

Return type

float
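
A median taken over all surface points is generally not the same as a median of per-residue medians. The sketch below shows the difference on made-up values; the residue labels and numbers are invented, and only the pooling over all points is taken from the description above.

```python
from statistics import median

# Made-up complementarity values for the surface points of three residues.
points = {
    "A:10": [0.1, 0.1, 0.1, 0.1, 0.1],
    "A:11": [0.8],
    "A:12": [0.9],
}

# Per-residue medians, as a residue-level query would report them.
per_residue = {label: median(values) for label, values in points.items()}

# Median over every surface point of every residue, pooled together.
pooled = median(x for values in points.values() for x in values)

# Median of the per-residue medians, shown only for contrast.
median_of_medians = median(per_residue.values())
print(pooled, median_of_medians)
```

Residues with many surface points dominate the pooled value, so the total is not recoverable from the per-residue results alone.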

getResidueSurfComp(residue)
Parameters

residue (structure._Residue) – Residue to get the value for

Returns

Median of vdW surface complementarity values for all accounted points on the surface of this residue.

Return type

float

calculateOverResidues(*properties)

Helper method that returns a generator which will calculate multiple properties for self.struct. All results will be returned in a tuple with the form ( structure._Residue, calc dict ). Here is a list of valid properties to calculate:

  • e_pot

  • e_internal

  • e_interaction

  • pka

  • sasa_polar

  • sasa_nonpolar

  • sasa_total

  • hydropathy

  • rotatable

  • vdw_surf_comp

Parameters

properties (str (see PropertyCalculator.RESIDUE_CALCULATIONS)) – Properties to calculate

Raises

KeyError – If a property passed in is invalid

Returns

Generator that yields a structure._Residue and a dict where keys are the properties passed in and values are the value of each property for that residue, e.g. (_Residue, {'e_pot': 1324.3})

Return type

generator

calculate(*properties)

Helper method to calculate multiple properties for self.struct. All results will be returned in a dict where the keys are each of the properties in properties, and their values are the values returned from their corresponding method. Here is a list of valid properties to calculate:

  • e_pot

  • sasa_polar

  • sasa_nonpolar

  • sasa_total

  • prime_energy

  • pka

  • hydropathy

  • rotatable

  • vdw_surf_comp

Parameters

properties (str (see PropertyCalculator.AGGREGATE_CALCULATIONS)) – Properties to calculate

Raises

KeyError – If a property passed in is invalid

Returns

Dict where keys are properties passed in and values are the total value of the property for the protein, e.g. {'e_pot': 1324.3, 'sasa_total': 1846.9}

Return type

dict
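
One way to get the behavior described above (a dict keyed by the requested properties, and a KeyError for an unknown name) is a name-to-method lookup table. This is a sketch of that pattern only: AggregateDemo, its methods and its data are hypothetical, and nothing here is claimed about how PropertyCalculator is actually implemented.

```python
class AggregateDemo:
    """Hypothetical calculator that dispatches property names to methods."""

    AGGREGATE_CALCULATIONS = ["sasa_total", "rotatable"]

    def __init__(self, per_residue):
        self.per_residue = per_residue
        # Map each supported property name to the method that computes it.
        self._methods = {
            "sasa_total": self.get_total_sasa,
            "rotatable": self.get_total_rotatable,
        }

    def get_total_sasa(self):
        return sum(r["sasa"] for r in self.per_residue)

    def get_total_rotatable(self):
        return sum(r["rotatable"] for r in self.per_residue)

    def calculate(self, *properties):
        # An unknown name raises KeyError from the dictionary lookup.
        return {name: self._methods[name]() for name in properties}


calc = AggregateDemo([
    {"sasa": 100.0, "rotatable": 2},
    {"sasa": 50.5, "rotatable": 3},
])
results = calc.calculate("sasa_total", "rotatable")
print(results)
```

Only the requested properties are computed, which matters when some of them launch external jobs.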

getTotalAggregation()
getTotalSolubility()
getTotalComplementarity()
class schrodinger.application.bioluminate.protein.protein.Refiner(struct, residues=None)

Bases: object

Creates input files and runs calculations for protein refinement jobs using Prime and our schrodinger.structutils.minimize.Minimizer class.

Here is an example of how to refine a protein that just had a residue mutated. In this example only the residues within 7.0 angstroms of the mutated residue will be refined:

from schrodinger.structure import StructureReader
from schrodinger.structutils import build
from schrodinger.application.bioluminate import protein

# Get the structure (StructureReader is an iterator, so take the first entry)
st = next(StructureReader('receptor.maegz'))

# Atom number 30 is the alpha carbon of a GLU
ca = st.atom[30]

# Record the residue identifiers before mutating, since the mutation
# can renumber atoms and invalidate the atom reference
ca_keys = (ca.chain, ca.resnum, ca.inscode)

# Mutate GLU -> ASP
renum_map = build.mutate(st, ca.index, "ASP")

# Get the residue that was mutated
mutated_residue = None
for res in st.residue:
    res_keys = (res.chain, res.resnum, res.inscode)
    if ca_keys == res_keys:
        mutated_residue = res
        break

# We want to use the reference to gather the residues to refine
refine_residues = protein.get_residues_within(
    st,
    [mutated_residue],
    within = 7.0
)

# Create the refiner
refiner = protein.Refiner(st, residues=refine_residues)

# Run Prime minimization which returns the refined structure
refined_struct = refiner.runPrimeMinimization('my_refinement_jobname')
PYTHON_MINIMIZE = 'python_minimize'
PRIME_MINIMIZE = 'prime_minimize'
PRIME_RESIDUE = 'prime_residue'
PRIME_SIDECHAIN = 'prime_sidechain'
PRIME_SIDECHAIN_CBETA = 'prime_sidechain_cbeta'
PRIME_SIDECHAIN_BB = 'prime_sidechain_bb'
PRIME_LOOP_PRED = 'prime_loop_prediction'
PRIME_ANTIB_LOOP_PRED = 'prime_antibody_loop_prediction'
__init__(struct, residues=None)
Parameters
  • struct (schrodinger.structure.Structure) – The structure being refined

  • residues (None or list/tuple of structure.structure._Residue) – Residues to consider for refinement

setResidues(residues)

Set the residues to refine. This is a list of integers referring to the residue indices for the structure.

clean()

Remove all files created from the refinement job

writePrimeInput(refine_type, input_file, st_filename, **kwargs)

Writes the input file for a Prime refinement job.

Parameters
  • refine_type (str) – The type of Prime refinement to run (see class variables)

  • input_file (str) – Name of the input file for the refinement job

  • st_filename (str) – Filename of the structure to be refined

Raises

RuntimeError – If refine_type is not supported

Return type

None

refinePrime(refine_type, jobname, completed_callback=None, **kwargs)

Run a Prime refinement job through job control and return the refined output structure.

Parameters
  • refine_type (str) – The type of Prime refinement to run (see class variables)

  • jobname (str) – Jobname to use

  • completed_callback (callable) – If None, start the job and wait for it to finish. Otherwise, the given function is called with the Job object as its parameter on completion.

Raises
  • RuntimeError – If refine_type is not supported

  • RuntimeError – If launching the refinement job fails

  • RuntimeError – If the refinement job fails

Returns

Refined structure

Return type

schrodinger.structure.Structure object or schrodinger.job.jobcontrol.Job
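
The two calling modes implied by completed_callback (block and return the result, or hand the finished job to a callback) can be sketched as follows. FakeJob and refine are hypothetical stand-ins; the fake job completes immediately, whereas a real job control job runs asynchronously, and returning the job object in callback mode is an assumption based on the return type above.

```python
class FakeJob:
    """Hypothetical stand-in for a job control Job object."""

    def __init__(self, name):
        self.name = name
        self.result = None

    def run(self):
        # A real job would run externally; here it finishes at once.
        self.result = "refined:" + self.name


def refine(jobname, completed_callback=None):
    job = FakeJob(jobname)
    job.run()
    if completed_callback is None:
        # Blocking mode: return the refined result directly.
        return job.result
    # Callback mode: pass the finished job to the caller's function.
    completed_callback(job)
    return job


finished = []
blocking_result = refine("job1")
job = refine("job2", completed_callback=finished.append)
print(blocking_result, [j.name for j in finished])
```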

runPrimeMinimization(jobname)

Shortcut to run a Prime minimization job

See

Refiner.refinePrime documentation

runPrimeResidue(jobname)

Shortcut to run a Prime residue refinement job

See

Refiner.refinePrime documentation

runPrimeSidechain(jobname)

Shortcut to run a Prime sidechain refinement job

See

Refiner.refinePrime documentation

runPrimeSidechainCBeta(jobname)

Shortcut to run a Prime sidechain refinement job with CA-CB vector sampling. This will vary the orientation of the CA-CB bond by up to 30 degrees from the initial direction.

See

Refiner.refinePrime documentation

runPrimeSidechainBB(jobname)

Shortcut to run a Prime sidechain refinement job with backbone sampling. This will sample the backbone by running a loop prediction on a set of 3 residues centered on the residue for which the side chain is being refined.

See

Refiner.refinePrime documentation

runPrimeLoopPrediction(jobname, start_res=None, end_res=None)

Shortcut to run a Prime loop prediction refinement job.

See

Refiner.refinePrime documentation

runPythonMinimize(jobname)

Shortcut to run a schrodinger.structutils.minimize.Minimizer job.

Parameters

jobname (str) – Jobname to use

Returns

Minimized structure

Return type

schrodinger.structure.Structure object

runRefinement(refine_type, jobname, **kwargs)

Shortcut to run any of the available refinement jobs.

Parameters
  • refine_type (str) – The type of Prime refinement to run (see class variables)

  • jobname (str) – Jobname to use

Raises
  • RuntimeError – If refine_type is not supported

  • RuntimeError – If the refinement job fails

Returns

Refined structure

Return type

schrodinger.structure.Structure object
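
Selecting a refinement by its type string, and rejecting unsupported types with RuntimeError, can be sketched with a lookup table. The constant values below are copied from the class variables listed earlier; run_refinement and the stub runner functions are hypothetical and do not launch any job.

```python
# Values copied from the Refiner class variables.
PYTHON_MINIMIZE = "python_minimize"
PRIME_MINIMIZE = "prime_minimize"
PRIME_SIDECHAIN = "prime_sidechain"


def run_refinement(refine_type, jobname, runners):
    """Look up the runner for refine_type and call it with the jobname."""
    try:
        runner = runners[refine_type]
    except KeyError:
        # Convert the lookup failure into the documented exception type.
        raise RuntimeError("unsupported refine_type: %r" % refine_type) from None
    return runner(jobname)


# Stub runners that only report which branch was taken.
runners = {
    PYTHON_MINIMIZE: lambda jobname: "python:" + jobname,
    PRIME_MINIMIZE: lambda jobname: "prime:" + jobname,
}
print(run_refinement(PRIME_MINIMIZE, "my_job", runners))
```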

class schrodinger.application.bioluminate.protein.protein.Consensus(asl_map, minimum_number, dist_cutoff=2.0)

Bases: object

Access the atoms, residues, and molecules (or just their indices) that are considered to be consensus objects for a template structure and query structure. All properties are returned as an OrderedDict that maps the template objects to their consensus objects from the query structure.

Here is an example of how to get the consensus ligand atoms across the structures included in the Workspace. We define the cutoff here at 2 Angstroms:

from schrodinger import maestro
from schrodinger.structutils import analyze
from schrodinger.application.bioluminate import protein

pt = maestro.project_table_get()

# Create an ASL map for all ligands in the WS
asl_map = []
for row in pt.included_rows:
    st = row.getStructure()
    ligands = analyze.find_ligands(st)
    if not ligands:
        continue
    indices = []
    for ligand in ligands:
        indices.extend([str(i) for i in ligand.atom_indexes])

    asl = 'atom.n %s' % ','.join(indices)

    asl_map.append((st, asl))

# Create a consensus of all ligands, specifying that at least three
# structures must have a ligand atom within 2A from one another.
consensus = protein.Consensus(asl_map, 3, dist_cutoff=2)

# To get the atom objects
consensus_atoms = consensus.atoms

# To get the molecule objects
molecules = consensus.molecules
ASL_WATER = 'water and NOT (atom.ele H)'
ASL_WATER_NOZOB = 'water and NOT (atom.ele H) and NOT (withinbonds 1 (not water))'
ASL_IONS = 'ions'
ASL_LIGAND = '(((m.atoms 5-130)) and not ((ions) or (res.pt ACE ACT ACY BCT BME BOG CAC CIT CO3 DMS EDO EGL EPE FES FMT FS3 FS4 GOL HEC HED HEM IOD IPA MES MO6 MPD MYR NAG NCO NH2 NH3 NO3 PG4 PO4 POP SEO SO4 SPD SPM SUC SUL TRS )))'
__init__(asl_map, minimum_number, dist_cutoff=2.0)
Parameters
  • asl_map (list of (structure, ASL) tuples) – List of structures and the ASL used to limit the atoms used when calculating the consensus

  • minimum_number (int) – The minimum number of matches within structures. An atom will be considered a “consensus” atom if it is within the dist_cutoff of at least minimum_number of structures in the list of passed in structures.

  • dist_cutoff (float) – Distance in Angstroms used to define a consensus match

Attention

The list of consensus atoms (or molecules, residues, indices, etc. depending on the property called, i.e. self.molecules) will all be unique and will depend on the ASL passed in. If the ASL is not specific enough you may end up with poor results.
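
The role of minimum_number and dist_cutoff can be illustrated on bare coordinates. This is a sketch under stated assumptions, not the Consensus implementation: each structure is reduced to a list of points (as if already filtered by its ASL), the first structure is treated as the template, and the template is counted as one of the matching structures.

```python
import math


def consensus_points(structures, minimum_number, dist_cutoff=2.0):
    """Return template points matched in at least minimum_number structures."""
    template, others = structures[0], structures[1:]
    result = []
    for ref in template:
        # Assumption: the template itself counts as one matching structure.
        matches = 1
        for other in others:
            if any(math.dist(ref, p) <= dist_cutoff for p in other):
                matches += 1
        if matches >= minimum_number:
            result.append(ref)
    return result


s1 = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
s2 = [(0.5, 0.0, 0.0), (20.0, 0.0, 0.0)]
s3 = [(0.0, 1.0, 0.0), (10.5, 0.0, 0.0)]

# Only the first template point has a neighbor in all three structures.
strict = consensus_points([s1, s2, s3], 3, dist_cutoff=2.0)
# Lowering the threshold to two structures admits the second point as well.
loose = consensus_points([s1, s2, s3], 2, dist_cutoff=2.0)
print(strict, loose)
```

A broad ASL puts more points into each structure, which makes chance matches within the cutoff more likely; that is one way to read the warning above.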

getClosest(ref_atom, mob_atoms)

Gets the closest atom to the ref_atom from mob_atoms.

property atoms

Get the map of atom objects of consensus atoms.

Returns

Atoms of consensus atoms

Return type

OrderedDict of atom objects where the keys are the template atoms and their values are the consensus atoms from the query.

property atom_indices

Get the map of atom indices of consensus atoms.

Returns

Atom indices of consensus atoms

Return type

OrderedDict of ints where the keys are the template atom indices and their values are the consensus atom indices from the query.

property residues

Get the list of residue objects of consensus atoms for each structure in self.asl_map.

Returns

Residues of consensus atoms

Return type

list of unique consensus residue objects for each structure in self.asl_map. (Order is maintained)

property residue_indices

Get the list of residue indices of consensus atoms for each structure in self.asl_map.

Returns

Residue indices of consensus atoms

Return type

list of unique consensus residue indices for each structure in self.asl_map. (Order is maintained)

property molecules

Get the list of molecule objects of consensus atoms for each structure in self.asl_map.

Returns

Molecules of consensus atoms

Return type

list of unique consensus molecule objects for each structure in self.asl_map. (Order is maintained)

property molecule_indices

Get the list of molecule indices of consensus atoms for each structure in self.asl_map.

Returns

Molecule indices of consensus atoms

Return type

list of unique consensus molecule indices for each structure in self.asl_map. (Order is maintained)