schrodinger.application.bioluminate.protein.mutator module

class schrodinger.application.bioluminate.protein.mutator.ProteinMutator(ref_struct, mutations, concurrent=1, sequential=False, idealize=True)

Bases: schrodinger.application.bioluminate.mutation.MacromoleculeMutator

Mutates a set of residues in a protein structure. Concurrent mutations are supported, with an option to limit them to sequential residues only.

The following example mutates a Ser residue to Asp, Glu, Asn, and Gln (one-letter codes D, E, N, and Q, respectively). The Ser residue is in chain A and has residue number 22. The example writes a file named ‘mutated_structures.maegz’ that contains the reference structure as the first CT, followed by one CT per mutation, for five structures in total:

from schrodinger import structure
from schrodinger.application.bioluminate.protein.mutator import ProteinMutator

# Get the input structure
reference_st = structure.Structure.read('receptor.maegz')

# Create the writer for the output file and append the reference
writer = structure.StructureWriter('mutated_structures.maegz')
writer.append(reference_st)

# Define the residues and mutations
residues = ['A:22']
muts     = 'DENQ'

# Get a compatible list of mutations: a list of tuples describing
# residue A:22 and its target residues 'DENQ'
mutations = ProteinMutator.convert_residue_list(residues, muts)

# Construct the mutator from the reference structure
mutator = ProteinMutator(reference_st, mutations)

# Loop over each mutation
for mutation in mutator.generate():
    # Each mutation carries the mutated structure and a residue map
    mutated_structure = mutation.struct
    residue_map       = mutation.residue_map

    res_str = ", ".join(str(res) for res in residue_map.values())
    print('Residues affected by this mutation: %s' % res_str)

    # Do something with the mutated structure (refine maybe)

    writer.append(mutated_structure)

# Flush and close the output file
writer.close()

@todo: Add logging

MUTATIONS_PROPERTY = 's_bioluminate_Mutations'
UNFOLDED_PROPERTY = 'r_bioluminate_Unfolded_Contribution'
UNFOLDED_PROPERTY_PRIME = 'r_bioluminate_Unfolded_Contribution_Prime'
RES_FILE_REGEX = re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    \\s?                        # optional\n    (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE)
MUTS_FILE_REGEX = re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    ->\n    (?P<new_resname>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HIS|HIP|HID|HIE|ILE|LEU|LYS|MET|PHE|, re.VERBOSE)
GXG_DATA = {'ALA': -100.736, 'ARG': -118.478, 'ARN': -118.478, 'ASH': -149.255, 'ASN': -137.153, 'ASP': -149.255, 'CYS': -100.845, 'GLH': -143.536, 'GLN': -136.183, 'GLU': -143.536, 'GLY': -105.658, 'HID': -104.977, 'HIE': -104.977, 'HIP': -104.977, 'HIS': -104.977, 'ILE': -84.13, 'LEU': -92.4, 'LYN': -110.759, 'LYS': -110.759, 'MET': -99.708, 'PHE': -96.483, 'PRO': -66.763, 'SER': -96.365, 'THR': -98.156, 'TRP': -105.114, 'TYR': -101.858, 'VAL': -93.493}
GXG_DATA_PRIME = {'ALA': -112.635, 'ARG': -142.467, 'ARN': -142.467, 'ASH': -154.559, 'ASN': -156.375, 'ASP': -154.559, 'CYS': -113.747, 'GLH': -141.673, 'GLN': -152.611, 'GLU': -141.673, 'GLY': -116.263, 'HID': -121.6, 'HIE': -121.6, 'HIP': -121.6, 'HIS': -121.6, 'ILE': -97.541, 'LEU': -107.106, 'LYN': -123.751, 'LYS': -123.751, 'MET': -112.643, 'PHE': -113.719, 'PRO': -81.734, 'SER': -112.693, 'THR': -116.048, 'TRP': -122.407, 'TYR': -123.497, 'VAL': -109.017}
SUPPORTED_BUILD_RESIDUES = ['ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'HIP', 'HIE', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP', 'TYR', 'VAL']
__init__(ref_struct, mutations, concurrent=1, sequential=False, idealize=True)
Parameters
  • ref_struct (schrodinger.structure.Structure instance) – The reference (starting) structure

  • mutations (List of tuples) – A list of the mutations to carry out on the ref_struct. Each element of the list is a tuple that identifies the residue being altered and the standard PDB residue names to mutate it to. validateMutations requires tuples of length 4: (chain, resnum, inscode, mutation resnames).

  • concurrent (int) – Maximum concurrent mutations

  • sequential (bool) – Limit concurrent mutations to being sequential

  • idealize (bool) – Whether to idealize the reference structure by self-mutating the affected residues before calculating properties.

Raises

RuntimeError – If concurrent is less than 1.

See

ProteinMutator.convert_residue_list, for easy creation of the mutations argument.

classmethod validate_mutated_residues(residues)

Validates the residues used in the mutations passed in to the ProteinMutator class.

Raises

ValueError – If the 3-letter residue name is not supported by the build.mutate method.

classmethod convert_res_file(filename, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    \\s?                        # optional\n    (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE))

Shim for the mixed-case class method convertResFile.

classmethod convert_res_list(reslist, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    \\s?                        # optional\n    (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE), validate=True)

Shim for the mixed-case class method convertResList.

static get_3_letter_mutation(mutation)
classmethod get_3_letter_mutation_list(mutation_list)
classmethod convert_res_to_muts(res_str, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    \\s?                        # optional\n    (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE), validate=True)

Shim for the mixed-case class method convertResToMuts.

classmethod convert_muts_file(muts_file, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    ->\n    (?P<new_resname>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HIS|HIP|HID|HIE|ILE|LEU|LYS|MET|PHE|, re.VERBOSE))

Shim for the mixed-case class method convertMutsFile.

classmethod convert_residue_list(residues, mutations, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n    :\n    (?P<resnum>-?\\d+)\n    (?P<inscode>[a-zA-Z]{1})?  # optional\n    \\s?                        # optional\n    (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE))

Shim for the mixed-case method convertResidueList.

calculateMutationsList()

Calculate the mutations that will be performed, based on the input residues and their mutations, and the “concurrent” and “sequential” settings.
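
The enumeration itself is not documented here. The following is a minimal sketch of one plausible reading of the “concurrent” and “sequential” settings, written in plain Python so that it runs without the Schrödinger libraries. The names enumerate_mutation_sets and is_sequential are hypothetical, and two points are assumptions: each input entry is a (chain, resnum, inscode, list of residue names) tuple, and “sequential” means the residues share a chain and have consecutive residue numbers.

```python
from itertools import combinations, product


def is_sequential(sites):
    """Assumption: 'sequential' = same chain, consecutive residue numbers."""
    ordered = sorted(sites, key=lambda s: s[1])
    same_chain = len({s[0] for s in ordered}) == 1
    consecutive = all(b[1] - a[1] == 1 for a, b in zip(ordered, ordered[1:]))
    return same_chain and consecutive


def enumerate_mutation_sets(mutations, concurrent=1, sequential=False):
    """Hypothetical stand-in for calculateMutationsList, not the real code."""
    if concurrent < 1:
        raise RuntimeError("concurrent must be at least 1")
    results = []
    for size in range(1, concurrent + 1):
        for sites in combinations(mutations, size):
            if sequential and size > 1 and not is_sequential(sites):
                continue
            # One change per chosen site, over every choice of new residue.
            for names in product(*(site[3] for site in sites)):
                results.append([
                    (chain, resnum, inscode, name)
                    for (chain, resnum, inscode, _), name in zip(sites, names)
                ])
    return results


sites = [
    ("A", 22, "", ["ASP", "GLU"]),
    ("A", 23, "", ["ALA"]),
    ("A", 30, "", ["GLY"]),
]
singles = enumerate_mutation_sets(sites, concurrent=1)
pairs_any = enumerate_mutation_sets(sites, concurrent=2)
pairs_seq = enumerate_mutation_sets(sites, concurrent=2, sequential=True)
print(len(singles), len(pairs_any), len(pairs_seq))
```

In this sketch, raising concurrent from 1 to 2 adds every pairwise combination to the single-site mutations, and sequential=True keeps only the pair formed by the neighbouring residues 22 and 23.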

generate()

Used to loop over all mutations. Each mutation consists of the mutated structure and a residue mapping dict. The structure is raw, that is, unrefined in any way.

Returns

Generator for all mutations defined in self.mutations. Each step of the generator yields one mutation.

Return type

generator

getMutationFromChanges(changes)
classmethod convertMutsFile(muts_file, regex=None)

Converts lines in muts_file into a list of mutations to use. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “new residue name for mutation”).

Each line is one mutation (could be multiple residues)
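
As an illustration of the line format, here is a minimal sketch that parses mutation lines with a pattern modeled on the visible part of MUTS_FILE_REGEX. The documented pattern is truncated, so the residue-name group below is an assumption (any three upper-case letters), as are the whitespace separator between residues on one line, the empty string used for a missing insertion code, and the helper name parse_muts_lines. It operates on a list of strings rather than a file so that it runs on its own.

```python
import re

# Modeled on the visible part of MUTS_FILE_REGEX; the residue-name group is
# an assumption because the documented pattern is truncated.
MUTS_SKETCH_REGEX = re.compile(
    r"""
    (?P<chain>[a-zA-Z0-9_])
    :
    (?P<resnum>-?\d+)
    (?P<inscode>[a-zA-Z])?      # optional
    ->
    (?P<new_resname>[A-Z]{3})   # assumed: any three-letter upper-case name
    """,
    re.VERBOSE,
)


def parse_muts_lines(lines):
    """Hypothetical parser: one list of residue changes per non-blank line."""
    parsed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        changes = [
            (m.group("chain"), int(m.group("resnum")),
             m.group("inscode") or "", m.group("new_resname"))
            for m in MUTS_SKETCH_REGEX.finditer(line)
        ]
        if not changes:
            raise RuntimeError("No mutation found in line: %r" % line)
        parsed.append(changes)
    return parsed


example_lines = ["A:22->ASP", "A:22->GLU B:10A->ALA", ""]
parsed_muts = parse_muts_lines(example_lines)
print(parsed_muts)
```

The second example line changes two residues at once, which is how this sketch interprets “could be multiple residues”.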

classmethod convertResFile(filename, regex=None)

Converts lines in filename into a list of mutations to use. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “three-letter resnames for mutation”).

Each line could be multiple mutations (one residue to multiple mutation states)

Parameters
  • filename (str) – Name of file containing the list of mutations.

  • regex (regular expression object) – Regular expression for matching residues

Raises

RuntimeError – If any of chain, resnum or mutation is missing

Returns

List of mutations with valid syntax for the class

Return type

list of tuples
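
A minimal sketch of this kind of conversion follows, using a pattern modeled on the visible part of RES_FILE_REGEX. Because the documented pattern is truncated, the mutations group is an assumption (comma-separated three-letter names), as are the helper name parse_res_lines, the empty string for a missing insertion code, and the use of a list for the residue names. The sketch takes a list of lines rather than a filename so that it runs without any input file.

```python
import re

# Modeled on the visible part of RES_FILE_REGEX; everything after the
# optional whitespace is an assumption.
RES_SKETCH_REGEX = re.compile(
    r"""
    (?P<chain>[a-zA-Z0-9_])
    :
    (?P<resnum>-?\d+)
    (?P<inscode>[a-zA-Z])?                 # optional
    \s?                                    # optional
    (?P<mutations>[A-Z]{3}(?:,[A-Z]{3})*)  # assumed: comma-separated names
    $
    """,
    re.VERBOSE,
)


def parse_res_lines(lines):
    """Hypothetical stand-in for convertResFile, not the real code."""
    mutations = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = RES_SKETCH_REGEX.match(line)
        if match is None:
            raise RuntimeError(
                "Missing chain, resnum or mutation in: %r" % line)
        mutations.append((
            match.group("chain"),
            int(match.group("resnum")),
            match.group("inscode") or "",
            match.group("mutations").split(","),
        ))
    return mutations


res_mutations = parse_res_lines(["A:22 ASP,GLU", "_:5B ALA"])
print(res_mutations)
```

Each line yields one tuple, and a line that lacks a chain, residue number, or mutation raises RuntimeError, mirroring the behavior documented above.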

classmethod convertResList(reslist, regex=None, validate=True)

Converts list of residues into a list of mutations to use. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “three-letter resnames for mutation”).

Each residue string could be multiple mutations (one residue to multiple mutation states)

Parameters
  • reslist (list of str) – List of residues to convert to mutations

  • regex (regular expression object) – Regular expression for matching residues

  • validate (bool) – Whether to validate the potential mutations

Returns

List of mutations with valid syntax for the class or None if any item in the list is not valid

Return type

list of tuples or None

classmethod convertResToMuts(res_str, regex=None, validate=True)

Converts a residue string into a list of mutations to use. Returns a list of tuples of (“chain”, “resnum”, “inscode”, “one-letter resnames for mutation”). Will return None if any item in the list is not a valid residue string.

A residue string could be multiple mutations (one residue to multiple mutation states)

Parameters
  • res_str (str) – Residue string to convert to mutations

  • regex (regular expression object) – Regular expression for matching residues

  • validate (bool) – Whether to run validation on the mutation

Returns

List of mutations with valid syntax for the class or None if the res_str is not valid.

Return type

list of tuples or None

classmethod convertResidueList(residues, mutations, regex=None)

Convert a list of residues and mutations to a standard list of mutations. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “one-letter resnames for mutation”).

Parameters
  • residues (list of strings (Syntax: <chain>:<resnum> if no chain use "_")) – Residues that will be mutated.

  • mutations – The one-letter names for the residues that will be used in mutation, for example 'DENQ'.

Raises

RuntimeError – If any of chain, resnum or mutation is missing or if there is an invalid residue name

Returns

List of mutations with valid syntax for the class

Return type

list of tuples
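
The conversion can be pictured with the following minimal sketch, which pairs each <chain>:<resnum> string with the one-letter mutation codes. The function name convert_residue_list_sketch is hypothetical, and the handling of a trailing insertion-code letter, the empty string for a missing insertion code, and the restriction to the 20 standard one-letter codes are assumptions rather than documented behavior.

```python
# The 20 standard amino acid one-letter codes.
STANDARD_ONE_LETTER = set("ACDEFGHIKLMNPQRSTVWY")


def convert_residue_list_sketch(residues, mutations):
    """Hypothetical stand-in for convertResidueList, not the real code."""
    if not mutations or set(mutations) - STANDARD_ONE_LETTER:
        raise RuntimeError("Invalid residue name in %r" % mutations)
    converted = []
    for residue in residues:
        chain, sep, rest = residue.partition(":")
        if not sep or len(chain) != 1 or not rest:
            raise RuntimeError("Expected <chain>:<resnum>, got %r" % residue)
        # A trailing letter is treated as an insertion code (assumption).
        inscode = rest[-1] if rest[-1].isalpha() else ""
        number = rest[:-1] if inscode else rest
        try:
            resnum = int(number)
        except ValueError:
            raise RuntimeError("Invalid residue number in %r" % residue)
        converted.append((chain, resnum, inscode, mutations))
    return converted


converted = convert_residue_list_sketch(["A:22", "_:105B"], "DENQ")
print(converted)
```

Every residue in the list receives the same set of candidate mutations, which matches the single mutations argument in the signature above.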

getLoopMutation(mutated_st, res_str, new_resname)

Builds a loop insertion or deletion.

property mutations

The list of mutations that will be carried out

property total_mutations

Total number of mutations that will be generated

static validateMutatedResidues(cls, residues)

Validates the residues used in the mutations passed in to the ProteinMutator class.

Raises

ValueError – If the 1-letter residue name is not supported by the build.mutate method.

static validateMutations(mutations)

Private method for validating the mutations passed in to the ProteinMutator class.

Raises

ValueError – If the mutations passed in are not a list, if any item in the list is not a tuple, if a tuple is not of length 4 (chain, resnum idx, inscode, mutation resnames), if the resnum is not an integer, or if any of the 3-letter residue names in “mutation resnames” is not supported by the build.mutate method.
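
The checks listed above can be sketched as follows. This is a minimal illustration rather than the library's implementation: validate_mutations_sketch is a hypothetical name, the residue list is copied from SUPPORTED_BUILD_RESIDUES above, and treating “mutation resnames” as a list of three-letter names is an assumption.

```python
# Copied from SUPPORTED_BUILD_RESIDUES in this module's documentation.
SUPPORTED_BUILD_RESIDUES = [
    'ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'HIP',
    'HIE', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP',
    'TYR', 'VAL',
]


def validate_mutations_sketch(mutations):
    """Hypothetical stand-in for validateMutations, not the real code."""
    if not isinstance(mutations, list):
        raise ValueError("mutations must be a list")
    for item in mutations:
        if not isinstance(item, tuple):
            raise ValueError("each mutation must be a tuple: %r" % (item,))
        if len(item) != 4:
            raise ValueError("each tuple must have 4 items: %r" % (item,))
        chain, resnum, inscode, resnames = item
        if not isinstance(resnum, int):
            raise ValueError("resnum must be an integer: %r" % (resnum,))
        # Assumption: resnames is a list of three-letter residue names.
        for name in resnames:
            if name not in SUPPORTED_BUILD_RESIDUES:
                raise ValueError("unsupported residue name: %r" % name)
    return True


is_valid = validate_mutations_sketch([("A", 22, "", ["ASP", "GLU"])])
print(is_valid)
```

Each failure mode raises ValueError with a message naming the offending value, so a caller can report exactly which tuple needs to be fixed.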