schrodinger.application.bioluminate.protein.mutator module¶
- class schrodinger.application.bioluminate.protein.mutator.ProteinMutator(ref_struct, mutations, concurrent=1, sequential=False, idealize=True)¶
Bases:
schrodinger.application.bioluminate.mutation.MacromoleculeMutator
Mutates a set of residues in a protein structure. Concurrent mutations are allowed and can optionally be limited to sequential residues only.
Here is an example that mutates a Ser residue to Asp, Glu, Asn, and Gln (one-letter codes D, E, N, and Q, respectively). The Ser residue is in chain A and has residue number 22. The example writes a file named ‘mutated_structures.maegz’ that contains the reference structure as the first CT, followed by one CT per mutation, for five structures in total:
    from schrodinger import structure
    from schrodinger.application.bioluminate.protein.mutator import ProteinMutator

    # Get the input structure
    reference_st = structure.Structure.read('receptor.maegz')

    # Create the writer for the output file and append the reference
    writer = structure.StructureWriter('mutated_structures.maegz')
    writer.append(reference_st)

    # Define the residues and mutations
    residues = ['A:22']
    muts = 'DENQ'

    # Get a compatible list of mutations. The above turns into
    # [('A', 22, 'DENQ')]
    mutations = ProteinMutator.convert_residue_list(residues, muts)

    # Construct the mutator from the reference structure
    mutator = ProteinMutator(reference_st, mutations)

    # Loop over each mutation
    for mutation in mutator.generate():
        mutated_structure = mutation.struct
        residue_map = mutation.residue_map
        res_str = ", ".join(str(res) for res in residue_map.values())
        print('Residues affected by this mutation: %s' % res_str)
        # Do something with the mutated structure (refine maybe)
        writer.append(mutated_structure)

    # Flush and close the output file
    writer.close()
@todo: Add logging
- MUTATIONS_PROPERTY = 's_bioluminate_Mutations'¶
- UNFOLDED_PROPERTY = 'r_bioluminate_Unfolded_Contribution'¶
- UNFOLDED_PROPERTY_PRIME = 'r_bioluminate_Unfolded_Contribution_Prime'¶
- RES_FILE_REGEX = re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n \\s? # optional\n (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE)¶
- MUTS_FILE_REGEX = re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n ->\n (?P<new_resname>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HIS|HIP|HID|HIE|ILE|LEU|LYS|MET|PHE|, re.VERBOSE)¶
- GXG_DATA = {'ALA': -100.736, 'ARG': -118.478, 'ARN': -118.478, 'ASH': -149.255, 'ASN': -137.153, 'ASP': -149.255, 'CYS': -100.845, 'GLH': -143.536, 'GLN': -136.183, 'GLU': -143.536, 'GLY': -105.658, 'HID': -104.977, 'HIE': -104.977, 'HIP': -104.977, 'HIS': -104.977, 'ILE': -84.13, 'LEU': -92.4, 'LYN': -110.759, 'LYS': -110.759, 'MET': -99.708, 'PHE': -96.483, 'PRO': -66.763, 'SER': -96.365, 'THR': -98.156, 'TRP': -105.114, 'TYR': -101.858, 'VAL': -93.493}¶
- GXG_DATA_PRIME = {'ALA': -112.635, 'ARG': -142.467, 'ARN': -142.467, 'ASH': -154.559, 'ASN': -156.375, 'ASP': -154.559, 'CYS': -113.747, 'GLH': -141.673, 'GLN': -152.611, 'GLU': -141.673, 'GLY': -116.263, 'HID': -121.6, 'HIE': -121.6, 'HIP': -121.6, 'HIS': -121.6, 'ILE': -97.541, 'LEU': -107.106, 'LYN': -123.751, 'LYS': -123.751, 'MET': -112.643, 'PHE': -113.719, 'PRO': -81.734, 'SER': -112.693, 'THR': -116.048, 'TRP': -122.407, 'TYR': -123.497, 'VAL': -109.017}¶
- SUPPORTED_BUILD_RESIDUES = ['ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'HIP', 'HIE', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP', 'TYR', 'VAL']¶
- __init__(ref_struct, mutations, concurrent=1, sequential=False, idealize=True)¶
- Parameters
ref_struct (schrodinger.structure.Structure instance) – The reference (starting) structure
mutations (List of tuples) – A list of the mutations to carry out on the ref_struct. Each element of the list is a tuple of (“res num.”, [“pdbnames”]) where “res num.” is the residue number being altered and “pdbnames” is a list of the standard PDB residue names to mutate it to.
concurrent (int) – Maximum concurrent mutations
sequential (bool) – Limit concurrent mutations to being sequential
idealize (bool) – Whether to idealize the reference structure by self-mutating the affected residues before calculating properties.
- Raises
RuntimeError – If concurrent is less than 1.
- See
ProteinMutator.convert_residue_list, for easy creation of the mutations variable.
- classmethod validate_mutated_residues(residues)¶
Method for validating the residues used in mutations passed in to the ProteinMutator class.
- Raises
ValueError – If the 3-letter residue name is not supported by the build.mutate method.
- classmethod convert_res_file(filename, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n \\s? # optional\n (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE))¶
Shim for the mixed-case class method convertResFile.
- classmethod convert_res_list(reslist, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n \\s? # optional\n (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE), validate=True)¶
Shim for the mixed-case class method convertResList.
- static get_3_letter_mutation(mutation)¶
- classmethod get_3_letter_mutation_list(mutation_list)¶
- classmethod convert_res_to_muts(res_str, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n \\s? # optional\n (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE), validate=True)¶
Shim for the mixed-case class method convertResToMuts.
- classmethod convert_muts_file(muts_file, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n ->\n (?P<new_resname>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HIS|HIP|HID|HIE|ILE|LEU|LYS|MET|PHE|, re.VERBOSE))¶
Shim for the mixed-case class method convertMutsFile.
- classmethod convert_residue_list(residues, mutations, regex=re.compile('(?P<chain>[a-zA-Z0-9_]{1})\n :\n (?P<resnum>-?\\d+)\n (?P<inscode>[a-zA-Z]{1})? # optional\n \\s? # optional\n (?P<mutations>(ALA|ARG|ASN|ASP|CYS|GLN|GLU|GLY|HI, re.VERBOSE))¶
Shim for the mixed-case method convertResidueList.
- calculateMutationsList()¶
Calculate the mutations that will be performed, based on the input residues and their mutations, and the “concurrent” and “sequential” settings.
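The interaction between the “concurrent” and “sequential” settings can be illustrated with a small, self-contained sketch. It is not the library's implementation: the function names, the 4-tuple site format, and the rule that “sequential” means consecutive residue numbers on the same chain are all assumptions made for illustration.

```python
from itertools import combinations, product


def enumerate_mutations(mutations, concurrent=1, sequential=False):
    """Enumerate mutation sets (hypothetical helper, not the library's code).

    `mutations` is a list of (chain, resnum, inscode, targets) tuples, where
    `targets` is an iterable of residue names allowed at that site.
    """
    if concurrent < 1:
        raise RuntimeError("concurrent must be at least 1")
    results = []
    # Consider 1, 2, ..., `concurrent` sites mutated at the same time.
    for size in range(1, concurrent + 1):
        for sites in combinations(mutations, size):
            if sequential and not _are_sequential(sites):
                continue
            # One change per site; take every combination of target residues.
            for targets in product(*(site[3] for site in sites)):
                results.append([
                    (chain, resnum, inscode, target)
                    for (chain, resnum, inscode, _), target in zip(sites, targets)
                ])
    return results


def _are_sequential(sites):
    """Assumed rule: same chain and consecutive residue numbers."""
    chains = {site[0] for site in sites}
    resnums = sorted(site[1] for site in sites)
    return len(chains) == 1 and all(
        b - a == 1 for a, b in zip(resnums, resnums[1:])
    )
```

Under these assumptions, raising concurrent adds combined mutation sets on top of the single-site ones, and sequential=True discards combinations whose residues are not adjacent.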
- generate()¶
Used to loop over all mutations. Each mutation consists of the mutated structure and a residue mapping dict. The structure is raw, that is, unrefined in any way.
- Returns
Generator for all mutations defined in self.mutations. Each step of the generator yields a mutation.
- Return type
generator
- getMutationFromChanges(changes)¶
- classmethod convertMutsFile(muts_file, regex=None)¶
Converts the lines in muts_file into a list of mutations to use. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “one-letter nucleobase for mutation”).
Each line is one mutation (which may involve multiple residues).
- classmethod convertResFile(filename, regex=None)¶
Converts lines in filename into a list of mutations to use. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “three-letter resnames for mutation”).
Each line could be multiple mutations (one residue to multiple mutation states)
- Parameters
filename (str) – Name of file containing the list of mutations.
regex (regular expression object) – Regular expression for matching residues
- Raises
RuntimeError – If any of chain, resnum or mutation is missing
- Returns
List of mutations with valid syntax for the class
- Return type
list of tuples
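As a rough illustration of the line-to-tuple conversion described above, the sketch below parses residue lines with a simplified pattern. The real RES_FILE_REGEX is truncated on this page, so the pattern here is a stand-in: the comma-separated format of the mutations group, the required whitespace before it, the empty string for a missing insertion code, and the name parse_res_lines are all assumptions.

```python
import re

# Simplified stand-in for RES_FILE_REGEX. The chain, resnum and inscode
# groups follow the names shown on this page; the mutations group is an
# assumption (comma-separated three-letter residue names).
RES_LINE = re.compile(
    r"""
    (?P<chain>[a-zA-Z0-9_])                  # chain name
    :
    (?P<resnum>-?\d+)                        # residue number, may be negative
    (?P<inscode>[a-zA-Z])?                   # optional insertion code
    \s+
    (?P<mutations>[A-Z]{3}(?:,[A-Z]{3})*)    # assumed list format
    $
    """,
    re.VERBOSE,
)


def parse_res_lines(lines):
    """Convert residue lines to (chain, resnum, inscode, resnames) tuples."""
    mutations = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = RES_LINE.match(line)
        if match is None:
            # Mirrors the documented RuntimeError for missing fields.
            raise RuntimeError("Cannot parse residue line: %r" % line)
        mutations.append((
            match.group("chain"),
            int(match.group("resnum")),
            match.group("inscode") or "",
            match.group("mutations").split(","),
        ))
    return mutations
```

With this pattern, a line such as `A:22 ASP,GLU` describes one residue with two mutation states, matching the "one residue to multiple mutation states" behaviour noted above.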
- classmethod convertResList(reslist, regex=None, validate=True)¶
Converts list of residues into a list of mutations to use. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “three-letter resnames for mutation”).
Each residue string could be multiple mutations (one residue to multiple mutation states)
- Parameters
reslist (list of str) – List of residues to convert to mutations
regex (regular expression object) – Regular expression for matching residues
validate (bool) – Whether to validate the potential mutations
- Returns
List of mutations with valid syntax for the class or None if any item in the list is not valid
- Return type
list of tuples or None
- classmethod convertResToMuts(res_str, regex=None, validate=True)¶
Converts a residue string into a list of mutations to use. Returns a list of tuples of (“chain”, “resnum”, “inscode”, “one-letter resnames for mutation”). Will return None if any item in the list is not a valid residue string.
A residue string could be multiple mutations (one residue to multiple mutation states)
- Parameters
res_str (str) – Residue string to convert to mutations
regex (regular expression object) – Regular expression for matching residues
validate (bool) – Whether to run validation on the mutation
- Returns
List of mutations with valid syntax for the class or None if the res_str is not valid.
- Return type
list of tuples or None
- classmethod convertResidueList(residues, mutations, regex=None)¶
Convert a list of residues and mutations to a standard list of mutations. Returns a list of tuples where each tuple is (“chain”, “resnum”, “inscode”, “one-letter resnames for mutation”).
- Parameters
residues (list of strings; syntax: <chain>:<resnum>, use "_" if there is no chain) – Residues that will be mutated.
mutations – The one-letter names for the residues that will be used in mutation.
- Raises
RuntimeError – If any of chain, resnum or mutation is missing or if there is an invalid residue name
- Returns
List of mutations with valid syntax for the class
- Return type
list of tuples
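The conversion from <chain>:<resnum> strings to mutation tuples might look roughly like the following sketch. It is an illustration only: the function name, the empty string used for the insertion code, and the set of accepted one-letter codes (the 20 standard amino acids) are assumptions, not the library's behaviour.

```python
# Assumed set of valid one-letter codes: the 20 standard amino acids.
ONE_LETTER_CODES = set("ACDEFGHIKLMNPQRSTVWY")


def convert_residue_list_sketch(residues, mutations):
    """Pair each '<chain>:<resnum>' string with the one-letter mutations."""
    invalid = [code for code in mutations if code not in ONE_LETTER_CODES]
    if not mutations or invalid:
        raise RuntimeError("Invalid or missing mutations: %r" % mutations)
    converted = []
    for residue in residues:
        chain, sep, resnum = residue.partition(":")
        if not (chain and sep and resnum):
            raise RuntimeError("Expected <chain>:<resnum>, got %r" % residue)
        try:
            number = int(resnum)
        except ValueError:
            raise RuntimeError("Residue number is not an integer: %r" % residue)
        # Insertion codes are not handled in this sketch; "" is a placeholder.
        converted.append((chain, number, "", mutations))
    return converted
```

For example, calling it with `['A:22']` and `'DENQ'` gives a single tuple for chain A, residue 22, carrying all four target residues.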
- getLoopMutation(mutated_st, res_str, new_resname)¶
Build a loop insertion or deletion.
- property mutations¶
The list of mutations that will be carried out
- property total_mutations¶
Total number of mutations that will be generated
- static validateMutatedResidues(cls, residues)¶
Method for validating the residues used in mutations passed in to the ProteinMutator class.
- Raises
ValueError – If the 1-letter residue name is not supported by the build.mutate method.
- static validateMutations(mutations)¶
Private method for validating the mutations passed in to the ProteinMutator class.
- Raises
ValueError – If the mutations passed in is not a list, if each item in the list is not a tuple, if the tuple is not of length 4 (chain, resnum idx, inscode, mutation resnames), if the resnum is not an integer, or any of the 3-letter residue names in “mutation resnames” is not supported by the build.mutate method.
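The checks listed above can be sketched as a standalone function. This is an approximation based only on the description on this page, not the library's code: the function name is hypothetical, and the supported names are copied from SUPPORTED_BUILD_RESIDUES as listed earlier.

```python
# Copied from SUPPORTED_BUILD_RESIDUES as listed on this page.
SUPPORTED = {
    'ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'HIP',
    'HIE', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP',
    'TYR', 'VAL',
}


def validate_mutations_sketch(mutations):
    """Raise ValueError if `mutations` breaks any of the documented rules."""
    if not isinstance(mutations, list):
        raise ValueError("mutations must be a list")
    for item in mutations:
        if not isinstance(item, tuple):
            raise ValueError("each mutation must be a tuple: %r" % (item,))
        if len(item) != 4:
            raise ValueError("each mutation must have 4 elements: %r" % (item,))
        chain, resnum, inscode, resnames = item
        if not isinstance(resnum, int):
            raise ValueError("resnum must be an integer: %r" % (resnum,))
        for name in resnames:
            if name not in SUPPORTED:
                raise ValueError("unsupported residue name: %r" % (name,))
    return True
```

Running the checks before constructing the mutator lets malformed input fail early with a specific message rather than partway through structure generation.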