schrodinger.protein.annotation module

Annotations for biological sequences

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.protein.annotation.BINDING_SITE

Bases: Enum

CloseContact = 1
FarContact = 2
NoContact = 3
class schrodinger.protein.annotation.AntibodyCDRLabel

Bases: Enum

NotCDR = 1
L1 = 2
L2 = 3
L3 = 4
H1 = 5
H2 = 6
H3 = 7
class schrodinger.protein.annotation.AntibodyCDR(label, start, end)

Bases: tuple

end

Alias for field number 2

label

Alias for field number 0

start

Alias for field number 1

class schrodinger.protein.annotation.Region(label, value, start, end)

Bases: tuple

end

Alias for field number 3

label

Alias for field number 0

start

Alias for field number 2

value

Alias for field number 1

class schrodinger.protein.annotation.AntibodyRegionLabel

Bases: Enum

H = 1
HFR = 2
CH = 3
L = 4
LFR = 5
CL = 6
Hinge = 7
class schrodinger.protein.annotation.TCRRegionLabel

Bases: Enum

A = 1
AFR = 2
B = 3
BFR = 4
class schrodinger.protein.annotation.GPCRSegmentLabel

Bases: Enum

NTerm = 1
CTerm = 2
ICL = 3
ECL = 4
H8 = 5
TM = 6
Other = 7
class schrodinger.protein.annotation.Domains

Bases: Enum

Domain = 1
NoDomain = 2
class schrodinger.protein.annotation.ImmuneReceptorDisplayNames(name: str, single_letter: str, full_name: str, locus: str, anarci_shorthand: str, anarci_full_name: str)

Bases: object

name: str
single_letter: str
full_name: str
locus: str
anarci_shorthand: str
anarci_full_name: str
__init__(name: str, single_letter: str, full_name: str, locus: str, anarci_shorthand: str, anarci_full_name: str) None
class schrodinger.protein.annotation.AntibodyDisplayNames(name: str, single_letter: str, full_name: str, locus: str, anarci_shorthand: str, anarci_full_name: str, h_l_code: str)

Bases: ImmuneReceptorDisplayNames

h_l_code: str
__init__(name: str, single_letter: str, full_name: str, locus: str, anarci_shorthand: str, anarci_full_name: str, h_l_code: str) None
class schrodinger.protein.annotation.TCRDisplayNames(name: str, single_letter: str, full_name: str, locus: str, anarci_shorthand: str, anarci_full_name: str)

Bases: ImmuneReceptorDisplayNames

__init__(name: str, single_letter: str, full_name: str, locus: str, anarci_shorthand: str, anarci_full_name: str) None
class schrodinger.protein.annotation.KinaseConservation

Bases: JsonableEnum

VeryLow = 'Very Low'
Low = 'Low'
Medium = 'Medium'
High = 'High'
VeryHigh = 'Very High'
class schrodinger.protein.annotation.KinaseFeatureLabel

Bases: JsonableEnum

GLYCINE_RICH_LOOP = 'Glycine Rich Loop'
ALPHA_C = 'Alpha-C'
GATE_KEEPER = 'Gate Keeper'
HINGE = 'Hinge'
LINKER = 'Linker'
HRD = 'HRD'
CATALYTIC_LOOP = 'Catalytic Loop'
DFG = 'DFG'
ACTIVATION_LOOP = 'Activation Loop'
NO_FEATURE = 'No Feature'
class schrodinger.protein.annotation.Consensus

Bases: Enum

not_conserved = ' '
fully_conserved = '*'
strongly_conserved = ':'
weakly_conserved = '.'
property tooltip
class schrodinger.protein.annotation.TupleWithRange(iterable=(), /)

Bases: tuple

property range

The range of data contained in this tuple. Returns a tuple of (minimum value or zero, whichever is less; maximum value or zero, whichever is greater). None values are ignored. If the tuple contains no non-None values, returns (0, 0).

Return type:

tuple(int or float, int or float)
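The behavior described above can be sketched with a small stand-in class. This is an illustration only, not the library's source; it assumes that (0, 0) is the fallback whenever every entry is None or the tuple is empty.

```python
class TupleWithRange(tuple):
    """Stand-in sketch of the documented behavior (not the real class)."""

    @property
    def range(self):
        # None entries are ignored when computing the range.
        values = [v for v in self if v is not None]
        if not values:
            # Assumed fallback when nothing usable is stored.
            return (0, 0)
        # Zero is always included in the range, so the minimum is never
        # above zero and the maximum is never below zero.
        return (min(0, min(values)), max(0, max(values)))


positive_only = TupleWithRange([3, None, 7]).range
mixed_sign = TupleWithRange([-2.5, 1]).range
```

Including zero in the range means an all-positive data set still reports a lower bound of zero, which is convenient when the range is used to scale a plot axis.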

class schrodinger.protein.annotation.AbstractSequenceAnnotations(seq)

Bases: QObject

A base class for single-chain and combined-chain sequence annotations

Variables:

titleChanged (QtCore.pyqtSignal) – A signal emitted after an annotation’s title (row header) changes.

titleChanged

A pyqtSignal emitted by instances of the class.

__init__(seq)
Parameters:

seq (sequence.Sequence) – The sequence to store annotations for.

sequence

A descriptor for an instance attribute that should be stored as a weakref. Unlike weakref.proxy, this descriptor allows the attribute to be hashed.

Note that the weakref is stored on the instance using the same name as the descriptor (which is stored on the class). Since this descriptor implements __set__, it will always take precedence over the value stored on the instance.
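A descriptor of this kind might look roughly like the sketch below. The names WeakRefAttribute, Sequence, and Annotations are hypothetical stand-ins chosen for illustration; the library's own descriptor class may be implemented differently.

```python
import weakref


class WeakRefAttribute:
    """Hypothetical descriptor that stores its value as a weakref."""

    def __set_name__(self, owner, name):
        # Remember the attribute name so the weakref can be stored on the
        # instance under the same name as the descriptor.
        self._name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        ref = instance.__dict__.get(self._name)
        # Dereference the weakref; returns None once the target is gone.
        return None if ref is None else ref()

    def __set__(self, instance, value):
        # Defining __set__ makes this a data descriptor, so it takes
        # precedence over the entry in the instance dictionary.
        stored = None if value is None else weakref.ref(value)
        instance.__dict__[self._name] = stored


class Sequence:
    """Placeholder for a sequence object."""


class Annotations:
    sequence = WeakRefAttribute()

    def __init__(self, seq):
        self.sequence = seq


seq = Sequence()
ann = Annotations(seq)
```

Because attribute access returns the referenced object itself rather than a proxy, the result can be hashed and compared like any other object.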

class schrodinger.protein.annotation.AbstractProteinSequenceAnnotationsMixin(*args, **kwargs)

Bases: object

domainsChanged

A pyqtSignal emitted by instances of the class.

invalidatedDomains

A pyqtSignal emitted by instances of the class.

__init__(*args, **kwargs)
Parameters:

seq (sequence.Sequence) – The sequence to store annotations for.

property max_b_factor
property min_b_factor
invalidateMaxMinBFactor()
getAntibodyCDR(col, scheme)

Returns the antibody CDR information of the col’th index in the sequence under a given antibody CDR numbering scheme.

Parameters:
  • col (int) – index into the sequence

  • scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use

Returns:

Antibody CDR label, start, and end positions

Return type:

AntibodyCDR, which is a named tuple of (AntibodyCDRLabel, int, int) if col is in a CDR, otherwise (AntibodyCDRLabel.NotCDR, None, None)
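The return convention can be illustrated with stand-ins for AntibodyCDRLabel and AntibodyCDR. The helper find_cdr is hypothetical, the positions are made up, and treating start and end as inclusive bounds is an assumption rather than something the documentation states.

```python
from collections import namedtuple
from enum import Enum


class AntibodyCDRLabel(Enum):
    # Stand-in copying the enum members documented for this module.
    NotCDR = 1
    L1 = 2
    L2 = 3
    L3 = 4
    H1 = 5
    H2 = 6
    H3 = 7


# Stand-in for the documented named tuple (label, start, end).
AntibodyCDR = namedtuple("AntibodyCDR", ["label", "start", "end"])


def find_cdr(cdrs, col):
    """Hypothetical helper: return the CDR that contains index col."""
    for cdr in cdrs:
        if cdr.start <= col <= cdr.end:  # assumes inclusive bounds
            return cdr
    # Same shape as the documented result for a position outside any CDR.
    return AntibodyCDR(AntibodyCDRLabel.NotCDR, None, None)


# Made-up CDR positions, for illustration only.
heavy_cdrs = [
    AntibodyCDR(AntibodyCDRLabel.H1, 25, 32),
    AntibodyCDR(AntibodyCDRLabel.H2, 51, 57),
]
inside = find_cdr(heavy_cdrs, 30)
outside = find_cdr(heavy_cdrs, 40)
```

Callers can check `result.label is AntibodyCDRLabel.NotCDR` before using start and end, since both are None in that case.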

getAntibodyCDRs(scheme)

Returns a list of antibody CDR information for the entire sequence.

Parameters:

scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use

Returns:

A list of Antibody CDR labels, starts, and end positions

Return type:

list(AntibodyCDR)

getResID(res) str

Get the structure residue ID under the Kabat numbering scheme.

Parameters:

res – the residue

Returns:

the residue ID

getGPCRSegment(col: int) Optional[Region]

Return the GPCR segment information of the col’th index in the sequence.

Parameters:

col – index into the sequence

Returns:

GPCR Segment label, value, start, and end positions or None if col is not in a GPCR segment

getGPCRSegments() List[Region]

Return a list of GPCR segment information for the entire sequence.

Returns:

a list of GPCR Segments labels, values, start and end positions

getAntibodyRegion(col: int, scheme: AntibodyCDRScheme) Optional[Region]

Return the antibody region of the given residue based on the numbering scheme.

The regex will strip trailing numbers and get the label according to the AntibodyRegionLabel enum.

Example values: H1, CL, HINGE, LFR4, CH3

Parameters:
  • col – index into the sequence

  • scheme – the antibody CDR numbering scheme to use

Returns:

An AntibodyRegion with a label and value or None if there is no region
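The label lookup described above can be sketched as follows. The helper label_for is hypothetical, the enum and Region tuple are local stand-ins, and the case-insensitive match (so that "HINGE" resolves to the Hinge member) is an assumption made to cover the example values.

```python
import re
from collections import namedtuple
from enum import Enum


class AntibodyRegionLabel(Enum):
    # Stand-in copying the enum members documented for this module.
    H = 1
    HFR = 2
    CH = 3
    L = 4
    LFR = 5
    CL = 6
    Hinge = 7


# Stand-in for the documented named tuple (label, value, start, end).
Region = namedtuple("Region", ["label", "value", "start", "end"])

# Assumed case-insensitive lookup table keyed by upper-cased member name.
_LABELS_BY_NAME = {label.name.upper(): label for label in AntibodyRegionLabel}


def label_for(value):
    """Hypothetical helper: map a region value such as 'LFR4' to a label."""
    base = re.sub(r"\d+$", "", value)  # strip trailing digits
    return _LABELS_BY_NAME.get(base.upper())


# Made-up start and end indices, for illustration only.
region = Region(label_for("LFR4"), "LFR4", 97, 106)
```

In this sketch the value field keeps the full name (for example "LFR4"), while the label field carries only the coarse region type.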

getAntibodyRegions(scheme: AntibodyCDRScheme) List[Region]

Return the list of all antibody regions based on the numbering scheme.

Parameters:

scheme – the antibody CDR numbering scheme to use

Returns:

getTCRRegion(col: int) Optional[Region]

Return the TCR region information of the col’th index in the sequence.

Parameters:

col – index into the sequence

Returns:

TCR Region label, value, start, and end positions or None if col is not in a TCR Region

getTCRRegions() List[Region]

Return a list of TCR region information for the entire sequence.

Returns:

a list of TCR region labels, values, start and end positions

isAntibodyChain()
Returns:

Whether the sequence described is an antibody chain

Return type:

bool

isAntibodyHeavyChain()
Returns:

Whether the sequence described is an antibody heavy chain

Return type:

bool

isAntibodyLightChain()
Returns:

Whether the sequence described is an antibody light chain

Return type:

bool

property binding_sites
property ligands
property ligand_asls
setLigandDistance(distance)

Updates the ligand distance and invalidates the cache

property domains
getSSBondPartner(index)

Return the residue’s intra-sequence disulfide bond partner, if any.

If the residue is not involved in a disulfide bond, its partner has been deleted, or its partner is in another sequence, it will return None.

Parameters:

index (int) – Index of the residue to check

Returns:

the other Residue in the disulfide bond or None

Return type:

residue.Residue or None

clearAllCaching()
getNumAnnValues(ann)
class schrodinger.protein.annotation.SequenceAnnotations(seq)

Bases: AbstractSequenceAnnotations

Knows how to annotate a single-chain sequence

Annotations can be set at the level of the sequence as a whole, or be per sequence element annotations. If an attribute is accessed on the SequenceAnnotations object, the attribute is first looked for on the object and if not found is assumed to be a per sequence element annotation. If the elements in the sequence lack the attribute, an AttributeError will be raised.
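The attribute lookup order can be sketched with `__getattr__`, which Python calls only after normal lookup on the object has failed. The classes Element and SequenceAnnotationsSketch are hypothetical, and returning a plain list of per-element values is an assumption about the result shape.

```python
class Element:
    """Placeholder for a sequence element such as a residue."""

    def __init__(self, hydrophobicity):
        self.hydrophobicity = hydrophobicity


class SequenceAnnotationsSketch:
    """Hypothetical sketch of the documented lookup order."""

    def __init__(self, elements, title):
        self._elements = list(elements)
        self.title = title  # a sequence-level annotation

    def __getattr__(self, name):
        # Reached only when the attribute is not found on the object.
        if name.startswith("_"):
            # Avoid recursing on private names such as _elements.
            raise AttributeError(name)
        # Treat the name as a per-element annotation; getattr raises
        # AttributeError if the elements lack the attribute.
        return [getattr(elem, name) for elem in self._elements]


ann = SequenceAnnotationsSketch([Element(1.8), Element(-4.5)], "chain A")
```

Here `ann.title` is found on the object itself, while `ann.hydrophobicity` falls through to the per-element lookup.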

class schrodinger.protein.annotation.ProteinSequenceAnnotations(seq)

Bases: AbstractProteinSequenceAnnotationsMixin, SequenceAnnotations

Knows how to annotate a ProteinSequence

annotationInvalidated

A pyqtSignal emitted by instances of the class.

invalidatedLigandContacts

A pyqtSignal emitted by instances of the class.

invalidatedMaxMinBFactor

A pyqtSignal emitted by instances of the class.

class ANNOTATION_TYPES(*args, **kwargs)

Bases: JsonableClassMixin

alignment_set = 2
antibody_cdr = 21
antibody_regions = 35
b_factor = 15
beta_strand_propensity = 7
binding_sites = 19
custom_annotation = 34
disulfide_bonds = 5
domains = 20
exposure_tendency = 10
classmethod fromJsonImplementation(json_obj)

Abstract method that must be defined by all derived classes. Takes in a dictionary and constructs an instance of the derived class.

Parameters:

json_dict (dict) – A dictionary loaded from a JSON string or file.

Returns:

An instance of the derived class.

Return type:

cls

gpcr_generic_number = 33
gpcr_segment = 32
helix_propensity = 6
helix_termination_tendency = 9
hydrophobicity = 13
isoelectric_point = 14
kinase_conservation = 31
kinase_features = 30
pairwise_constraints = 1
pfam = 23
pred_accessibility = 26
pred_disordered = 27
pred_disulfide_bonds = 24
pred_domain_arr = 28
pred_secondary_structure = 25
proximity_constraints = 29
rescode = 4
resnum = 3
sasa = 22
secondary_structure = 18
side_chain_chem = 12
steric_group = 11
tcr_regions = 36
toJsonImplementation()

Abstract method that must be defined by all derived classes. Converts an instance of the derived class into a jsonifiable object.

Returns:

A dict made up of JSON native datatypes or Jsonable objects. See the link below for a table of such types. https://docs.python.org/2/library/json.html#encoders-and-decoders

turn_propensity = 8
window_hydrophobicity = 16
window_isoelectric_point = 17
RES_PROPENSITY_ANNOTATIONS = {<ANNOTATION_TYPES.exposure_tendency: 10>, <ANNOTATION_TYPES.beta_strand_propensity: 7>, <ANNOTATION_TYPES.helix_termination_tendency: 9>, <ANNOTATION_TYPES.steric_group: 11>, <ANNOTATION_TYPES.helix_propensity: 6>, <ANNOTATION_TYPES.side_chain_chem: 12>, <ANNOTATION_TYPES.turn_propensity: 8>}
PRED_ANNOTATION_TYPES = {<ANNOTATION_TYPES.pred_disordered: 27>, <ANNOTATION_TYPES.pred_accessibility: 26>, <ANNOTATION_TYPES.pred_disulfide_bonds: 24>, <ANNOTATION_TYPES.pred_secondary_structure: 25>, <ANNOTATION_TYPES.pred_domain_arr: 28>}
__init__(seq)
Parameters:

seq (sequence.Sequence) – The sequence to store annotations for.

invalidateMaxMinBFactor()
property window_hydrophobicity
property hydrophobicity_window_padding
property binding_site_residues

Binding site residues of the sequence, as a map from ligand name (str) to the set of binding site residues (protein.Residue).

property isoelectric_point_window_padding
invalidateWindowHydrophobicity()

Invalidate the cached window hydrophobicity data. Note that this method is also called from the sequence when the window size changes.
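The caching pattern behind this kind of invalidation might look like the sketch below. WindowedAnnotation and its methods are hypothetical, and the simple moving average with the given padding on each side is an assumption, not the library's actual window calculation.

```python
class WindowedAnnotation:
    """Hypothetical cache for a windowed per-residue value."""

    def __init__(self, values, padding):
        self._values = list(values)
        self._padding = padding
        self._cache = None

    @property
    def window_values(self):
        # Compute lazily and reuse the result until it is invalidated.
        if self._cache is None:
            self._cache = self._compute()
        return self._cache

    def _compute(self):
        n = len(self._values)
        averaged = []
        for i in range(n):
            # Assumed window: `padding` neighbors on each side, clipped
            # at the ends of the sequence.
            lo = max(0, i - self._padding)
            hi = min(n, i + self._padding + 1)
            window = self._values[lo:hi]
            averaged.append(sum(window) / len(window))
        return averaged

    def invalidate(self):
        self._cache = None

    def set_padding(self, padding):
        # A new window size makes the cached values stale.
        self._padding = padding
        self.invalidate()


ann = WindowedAnnotation([1.0, 2.0, 3.0], padding=0)
before = ann.window_values
ann.set_padding(1)
after = ann.window_values
```

The second access recomputes the values because changing the padding cleared the cache, mirroring the note that the sequence calls the invalidation method when the window size changes.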

property window_isoelectric_point
invalidateWindowIsoelectricPoint()

Invalidate the cached window isoelectric point data. Note that this method is also called from the sequence when the window size changes.

property sasa
getAntibodyCDR(col, scheme)

Returns the antibody CDR information of the col’th index in the sequence under a given antibody CDR numbering scheme.

Parameters:
  • col (int) – index into the sequence

  • scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use

Returns:

Antibody CDR label, start, and end positions

Return type:

AntibodyCDR, which is a named tuple of (AntibodyCDRLabel, int, int) if col is in a CDR, otherwise (AntibodyCDRLabel.NotCDR, None, None)

getAntibodyCDRs(scheme)

Returns a list of antibody CDR information for the entire sequence.

Parameters:

scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use

Returns:

A list of Antibody CDR labels, starts, and end positions

Return type:

list(AntibodyCDR)

getGPCRSegment(col: int) Optional[Region]

Return the GPCR segment information of the col’th index in the sequence.

Parameters:

col – index into the sequence

Returns:

GPCR Segment label, value, start, and end positions or None if col is not in a GPCR segment

getGPCRSegments() List[Region]

Return a list of GPCR segment information for the entire sequence.

Returns:

a list of GPCR Segments labels, values, start and end positions

getTCRRegion(col: int) Optional[Region]

Return the TCR region information of the col’th index in the sequence.

Parameters:

col – index into the sequence

Returns:

TCR Region label, value, start, and end positions or None if col is not in a TCR Region

getTCRRegions() List[Region]

Return a list of TCR region information for the entire sequence.

Returns:

a list of TCR region labels, values, start and end positions

isAntibodyChain()
Returns:

Whether the sequence described is an antibody chain

Return type:

bool

getAntibodyRegion(col: int, scheme: AntibodyCDRScheme)

Return the antibody region of the given residue based on the numbering scheme.

The regex will strip trailing numbers and get the label according to the AntibodyRegionLabel enum.

Example values: H1, CL, HINGE, LFR4, CH3

Parameters:
  • col – index into the sequence

  • scheme – the antibody CDR numbering scheme to use

Returns:

An AntibodyRegion with a label and value or None if there is no region

getAntibodyRegions(scheme: AntibodyCDRScheme)

Return the list of all antibody regions based on the numbering scheme.

Parameters:

scheme – the antibody CDR numbering scheme to use

Returns:

isAntibodyHeavyChain()
Returns:

Whether the sequence described is an antibody heavy chain

Return type:

bool

isAntibodyLightChain()
Returns:

Whether the sequence described is an antibody light chain

Return type:

bool

getSparseRescodes(modulo)
onStructureChanged()
setLigandDistance(distance)

Updates the ligand distance and invalidates the cache

parseDomains(filename)

Parse XML file from UniProt database to get domain information.

Parameters:

filename (str) – the XML file to parse for domain information

Returns:

a list of the domains (names) for the sequence in order

Return type:

list(str)

resetAnnotation(ann)

Force a reset of an annotation’s cache.

clearAllCaching()
property inscode
property resnum
getCDRResidueList(scheme)
Returns:

List of CDR Residues.

Return type:

List[str]

class schrodinger.protein.annotation.NucleicAcidSequenceAnnotations(seq)

Bases: ProteinSequenceAnnotations

isAntibodyChain()
Returns:

Whether the sequence described is an antibody chain

Return type:

bool

class schrodinger.protein.annotation.ProteinAlignmentAnnotations(aln)

Bases: object

Knows how to annotate an alignment (a collection of aligned sequences)

class ANNOTATION_TYPES(*args, **kwargs)

Bases: JsonableClassMixin

consensus_freq = 6
consensus_seq = 5
consensus_symbols = 4
classmethod fromJsonImplementation(json_obj)

Abstract method that must be defined by all derived classes. Takes in a dictionary and constructs an instance of the derived class.

Parameters:

json_dict (dict) – A dictionary loaded from a JSON string or file.

Returns:

An instance of the derived class.

Return type:

cls

indices = 1
mean_hydrophobicity = 2
mean_isoelectric_point = 3
toJsonImplementation()

Abstract method that must be defined by all derived classes. Converts an instance of the derived class into a jsonifiable object.

Returns:

A dict made up of JSON native datatypes or Jsonable objects. See the link below for a table of such types. https://docs.python.org/2/library/json.html#encoders-and-decoders

__init__(aln)
Parameters:

aln (alignment.Alignment)

alignment

A descriptor for an instance attribute that should be stored as a weakref. Unlike weakref.proxy, this descriptor allows the attribute to be hashed.

Note that the weakref is stored on the instance using the same name as the descriptor (which is stored on the class). Since this descriptor implements __set__, it will always take precedence over the value stored on the instance.

property indices

A numbering of all the column indices in an alignment

property mean_hydrophobicity

A list of floats representing the per-column average hydrophobicity of residues in the alignment.

property mean_isoelectric_point

A list of floats representing the per-column average isoelectric point of residues in the alignment.

property consensus_seq

Consensus sequence of the alignment. If more than one residue has the highest frequency in a column, all of them are kept.

Returns:

consensus sequence

Return type:

list(list(Residue))

property consensus_freq

Returns the frequency of the consensus residue in each alignment column as a list. Gaps are not used for calculation.

Returns:

consensus residue frequencies

Return type:

TupleWithRange(float)
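The consensus residue and consensus frequency calculations can be sketched on a single column of one-letter codes. The helper column_consensus is hypothetical; it assumes "-" marks a gap, that gaps are excluded from both the counts and the denominator, and that frequencies are fractions rather than percentages.

```python
from collections import Counter

GAP = "-"  # assumed gap character


def column_consensus(column):
    """Hypothetical helper: return (tied consensus residues, frequency)."""
    residues = [c for c in column if c != GAP]
    if not residues:
        # Assumed result for a column that contains only gaps.
        return [], 0.0
    counts = Counter(residues)
    top = max(counts.values())
    # Keep every residue that shares the highest count.
    winners = sorted(res for res, n in counts.items() if n == top)
    return winners, top / len(residues)


aligned = ["ACD-", "ACE-", "A-EK"]
# zip(*aligned) walks the alignment column by column.
per_column = [column_consensus(col) for col in zip(*aligned)]
```

In the second column the gap is skipped, so the two remaining C residues give a frequency of 1.0 rather than 2/3.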

property consensus_symbols

Consensus symbols in the alignment based on pre-defined residue sets, same as in ClustalW

Returns:

consensus symbols for each alignment position

Type:

A list of ConsensusSymbol enums.
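A ClustalW-style symbol assignment can be sketched as below, returning the same characters as the Consensus enum values. The residue groups are the strong and weak sets commonly listed for ClustalW; whether this module uses exactly these sets, and how it treats columns containing gaps, are assumptions.

```python
# Strongly and weakly conserved residue groups as commonly listed for
# ClustalW (assumed to match the pre-defined sets mentioned above).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column, gap="-"):
    """Hypothetical helper: return '*', ':', '.' or ' ' for one column."""
    residues = set(column)
    if not residues or gap in residues:
        # Assumption: a column with any gap is reported as not conserved.
        return " "
    if len(residues) == 1:
        return "*"  # fully conserved
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weak group
    return " "


symbols = "".join(consensus_symbol(col) for col in ["AAA", "ST", "CS", "AW"])
```

The strong groups are checked before the weak ones, so a column that fits both is reported with the stronger symbol.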

Calculates normalized frequencies of individual amino acids per alignment position, and overall estimate of column composition conservation (‘bits’). Bit values are weighted by the number of gaps in the column.

Schneider TD, Stephens RM (1990). “Sequence Logos: A New Way to Display Consensus Sequences”. Nucleic Acids Res 18 (20): 6097–6100. doi:10.1093/nar/18.20.6097

Returns:

the list of bits and frequencies (in decreasing order) of the residues in each column of the alignment.

Return type:

list(tuple(float, tuple(tuple(str, float))))
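The information content of one column, in the sense of Schneider and Stephens, can be sketched as the maximum entropy minus the observed entropy. The helper column_logo is hypothetical; it assumes a 20-letter alphabet, omits the small-sample correction from the paper, and guesses that gap weighting means scaling by the fraction of non-gap residues.

```python
import math
from collections import Counter


def column_logo(column, alphabet_size=20, gap="-"):
    """Hypothetical helper: return (bits, ((residue, frequency), ...))."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0, ()
    total = len(residues)
    counts = Counter(residues)
    # Normalized frequencies, in decreasing order (ties broken by letter).
    freqs = sorted(((res, n / total) for res, n in counts.items()),
                   key=lambda item: (-item[1], item[0]))
    entropy = -sum(f * math.log2(f) for _, f in freqs)
    bits = math.log2(alphabet_size) - entropy
    # Assumed gap weighting: scale by the fraction of non-gap residues.
    bits *= total / len(column)
    return bits, tuple(freqs)


conserved_bits, conserved_freqs = column_logo("AAAA")
split_bits, split_freqs = column_logo("AG")
```

A fully conserved, gap-free column reaches the maximum of log2(20), about 4.32 bits, while an even two-way split loses exactly one bit.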

clearAllCaching()
class schrodinger.protein.annotation.AbstractRegionFinder(seq)

Bases: object

Abstract class to help with finding annotated regions from the sequence. Values should be cached to reduce load on multiple requests from the table helpers and annotation requests.

VALUE_MAP = None
__init__(seq)
Parameters:

seq (schrodinger.protein.sequence.ProteinSequence) – The sequence to find the regions on

seq

A descriptor for an instance attribute that should be stored as a weakref. Unlike weakref.proxy, this descriptor allows the attribute to be hashed.

Note that the weakref is stored on the instance using the same name as the descriptor (which is stored on the class). Since this descriptor implements __set__, it will always take precedence over the value stored on the instance.

forceIndexReassignment()

Force a recalculation of the region start and end indices. This is required when gaps are inserted/removed.

This will always do a full recalculation, but is here to match _AntibodyCDRFinder’s API.

class schrodinger.protein.annotation.AntibodyRegionsFinder(*args, **kwargs)

Bases: AbstractRegionFinder

Class to help with finding Antibody Regions from the sequence. Values should be cached to reduce load on multiple requests from the table helpers and annotation requests.

VALUE_MAP = {'H1': AntibodyRegionLabel.H, 'H2': AntibodyRegionLabel.H, 'H3': AntibodyRegionLabel.H, 'HFR1': AntibodyRegionLabel.HFR, 'HFR2': AntibodyRegionLabel.HFR, 'HFR3': AntibodyRegionLabel.HFR, 'HFR4': AntibodyRegionLabel.HFR, 'L1': AntibodyRegionLabel.L, 'L2': AntibodyRegionLabel.L, 'L3': AntibodyRegionLabel.L, 'LFR1': AntibodyRegionLabel.LFR, 'LFR2': AntibodyRegionLabel.LFR, 'LFR3': AntibodyRegionLabel.LFR, 'LFR4': AntibodyRegionLabel.LFR}
__init__(*args, **kwargs)
Parameters:

seq (schrodinger.protein.sequence.ProteinSequence) – The sequence to find the regions on

getAntibodyRegions(scheme)
class schrodinger.protein.annotation.TCRRegionFinder(seq)

Bases: AbstractRegionFinder

Class to help with finding TCR Regions from the sequence. Values should be cached to reduce load on multiple requests from the table helpers and annotation requests.

VALUE_MAP = {'A1': TCRRegionLabel.A, 'A2': TCRRegionLabel.A, 'A3': TCRRegionLabel.A, 'AFR1': TCRRegionLabel.AFR, 'AFR2': TCRRegionLabel.AFR, 'AFR3': TCRRegionLabel.AFR, 'AFR4': TCRRegionLabel.AFR, 'B1': TCRRegionLabel.B, 'B2': TCRRegionLabel.B, 'B3': TCRRegionLabel.B, 'BFR1': TCRRegionLabel.BFR, 'BFR2': TCRRegionLabel.BFR, 'BFR3': TCRRegionLabel.BFR, 'BFR4': TCRRegionLabel.BFR}
getTCRRegions()
class schrodinger.protein.annotation.GPCRSegmentFinder(seq)

Bases: AbstractRegionFinder

Class to help with finding GPCR Segments from the sequence. Values should be cached to reduce load on multiple requests from the table helpers and annotation requests.

VALUE_MAP = {'C-term': GPCRSegmentLabel.CTerm, 'ECL1': GPCRSegmentLabel.ECL, 'ECL2': GPCRSegmentLabel.ECL, 'ECL3': GPCRSegmentLabel.ECL, 'H8': GPCRSegmentLabel.H8, 'ICL1': GPCRSegmentLabel.ICL, 'ICL2': GPCRSegmentLabel.ICL, 'ICL3': GPCRSegmentLabel.ICL, 'N-term': GPCRSegmentLabel.NTerm, 'TM1': GPCRSegmentLabel.TM, 'TM2': GPCRSegmentLabel.TM, 'TM3': GPCRSegmentLabel.TM, 'TM4': GPCRSegmentLabel.TM, 'TM5': GPCRSegmentLabel.TM, 'TM6': GPCRSegmentLabel.TM, 'TM7': GPCRSegmentLabel.TM, 'TM8': GPCRSegmentLabel.TM}
getGPCRs()
class schrodinger.protein.annotation.SeqTypeMixin(seq, *args, **kwargs)

Bases: object

Mixin to customize antibody.SeqType for MSV2. See _delayed_antibody_import for class declaration.

__init__(seq, *args, **kwargs)
isHeavyChain()
isLightChain()
class schrodinger.protein.annotation.CombinedChainSequenceAnnotationMeta(cls, bases, classdict, *, wraps=None, cached_annotations=(), wrapped_properties=())

Bases: QtDocstringWrapperMetaClass

The metaclass for CombinedChainSequenceAnnotations. This metaclass automatically wraps getters for all sequence annotations.

class schrodinger.protein.annotation.CombinedChainProteinSequenceAnnotations(seq)

Bases: AbstractProteinSequenceAnnotationsMixin, AbstractSequenceAnnotations

Sequence annotations for a sequence.CombinedChainProteinSequence. Annotations will be fetched from the ProteinSequenceAnnotations objects for each split-chain sequence.

sequence

A descriptor for an instance attribute that should be stored as a weakref. Unlike weakref.proxy, this descriptor allows the attribute to be hashed.

Note that the weakref is stored on the instance using the same name as the descriptor (which is stored on the class). Since this descriptor implements __set__, it will always take precedence over the value stored on the instance.

__init__(seq)
Parameters:

seq (sequence.CombinedChainProteinSequence) – The sequence to store annotations for.

chainAdded(chain)

Respond to a new chain being added to the sequence. The sequence is responsible for calling this method whenever a chain is added.

Parameters:

chain (sequence.ProteinSequence) – The newly added chain.

chainRemoved(chain)

Respond to a chain being removed from the sequence. The sequence is responsible for calling this method whenever a chain is removed.

Parameters:

chain (sequence.ProteinSequence) – The removed chain.

class ANNOTATION_TYPES(*args, **kwargs)

Bases: JsonableClassMixin

alignment_set = 2
antibody_cdr = 21
antibody_regions = 35
b_factor = 15
beta_strand_propensity = 7
binding_sites = 19
custom_annotation = 34
disulfide_bonds = 5
domains = 20
exposure_tendency = 10
classmethod fromJsonImplementation(json_obj)

Abstract method that must be defined by all derived classes. Takes in a dictionary and constructs an instance of the derived class.

Parameters:

json_dict (dict) – A dictionary loaded from a JSON string or file.

Returns:

An instance of the derived class.

Return type:

cls

gpcr_generic_number = 33
gpcr_segment = 32
helix_propensity = 6
helix_termination_tendency = 9
hydrophobicity = 13
isoelectric_point = 14
kinase_conservation = 31
kinase_features = 30
pairwise_constraints = 1
pfam = 23
pred_accessibility = 26
pred_disordered = 27
pred_disulfide_bonds = 24
pred_domain_arr = 28
pred_secondary_structure = 25
proximity_constraints = 29
rescode = 4
resnum = 3
sasa = 22
secondary_structure = 18
side_chain_chem = 12
steric_group = 11
tcr_regions = 36
toJsonImplementation()

Abstract method that must be defined by all derived classes. Converts an instance of the derived class into a jsonifiable object.

Returns:

A dict made up of JSON native datatypes or Jsonable objects. See the link below for a table of such types. https://docs.python.org/2/library/json.html#encoders-and-decoders

turn_propensity = 8
window_hydrophobicity = 16
window_isoelectric_point = 17
PRED_ANNOTATION_TYPES = {<ANNOTATION_TYPES.pred_disordered: 27>, <ANNOTATION_TYPES.pred_accessibility: 26>, <ANNOTATION_TYPES.pred_disulfide_bonds: 24>, <ANNOTATION_TYPES.pred_secondary_structure: 25>, <ANNOTATION_TYPES.pred_domain_arr: 28>}
RES_PROPENSITY_ANNOTATIONS = {<ANNOTATION_TYPES.exposure_tendency: 10>, <ANNOTATION_TYPES.beta_strand_propensity: 7>, <ANNOTATION_TYPES.helix_termination_tendency: 9>, <ANNOTATION_TYPES.steric_group: 11>, <ANNOTATION_TYPES.helix_propensity: 6>, <ANNOTATION_TYPES.side_chain_chem: 12>, <ANNOTATION_TYPES.turn_propensity: 8>}
property alignment_set
property antibody_cdr
property antibody_regions
property b_factor
property beta_strand_propensity
property custom_annotation
property disulfide_bonds
property domains
property exposure_tendency
property gpcr_generic_number
property gpcr_segment
property helix_propensity
property helix_termination_tendency
property hydrophobicity
property hydrophobicity_window_padding
property isoelectric_point
property isoelectric_point_window_padding
property kinase_conservation
property kinase_features
property pairwise_constraints
property pfam
property pred_accessibility
property pred_disordered
property pred_disulfide_bonds
property pred_domain_arr
property pred_secondary_structure
property proximity_constraints
property rescode
property resnum
property sasa
property secondary_structure
property side_chain_chem
property steric_group
property tcr_regions
property turn_propensity
property window_hydrophobicity
property window_isoelectric_point
getAntibodyCDR(col, scheme)

Returns the antibody CDR information of the col’th index in the sequence under a given antibody CDR numbering scheme.

Parameters:
  • col (int) – index into the sequence

  • scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use

Returns:

Antibody CDR label, start, and end positions

Return type:

AntibodyCDR, which is a named tuple of (AntibodyCDRLabel, int, int) if col is in a CDR, otherwise (AntibodyCDRLabel.NotCDR, None, None)

getAntibodyCDRs(scheme)

Returns a list of antibody CDR information for the entire sequence.

Parameters:

scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use

Returns:

A list of Antibody CDR labels, starts, and end positions

Return type:

list(AntibodyCDR)

getGPCRSegment(col: int) Optional[Region]

Return the GPCR segment information of the col’th index in the sequence.

Parameters:

col – index into the sequence

Returns:

GPCR Segment label, value, start, and end positions or None if col is not in a GPCR segment

getGPCRSegments() List[Region]

Return a list of GPCR segment information for the entire sequence.

Returns:

a list of GPCR Segments labels, values, start and end positions

getTCRRegion(col: int) Optional[Region]

Return the TCR region information of the col’th index in the sequence.

Parameters:

col – index into the sequence

Returns:

TCR Region label, value, start, and end positions or None if col is not in a TCR Region

getTCRRegions() List[Region]

Return a list of TCR region information for the entire sequence.

Returns:

a list of TCR region labels, values, start and end positions

getAntibodyRegion(col: int, scheme) Optional[Region]

Return the antibody region of the given residue based on the numbering scheme.

The regex will strip trailing numbers and get the label according to the AntibodyRegionLabel enum.

Example values: H1, CL, HINGE, LFR4, CH3

Parameters:
  • col – index into the sequence

  • scheme – the antibody CDR numbering scheme to use

Returns:

An AntibodyRegion with a label and value or None if there is no region

getAntibodyRegions(scheme) List[Region]

Return the list of all antibody regions based on the numbering scheme.

Parameters:

scheme – the antibody CDR numbering scheme to use

Returns:

isAntibodyChain()
Returns:

Whether the sequence described is an antibody chain

Return type:

bool

setLigandDistance(distance)

Updates the ligand distance and invalidates the cache

clearAllCaching()
schrodinger.protein.annotation.make_ligand_name_atom(ct, atom_index)

Make a unique, human-readable name for a ligand identified by atom index.

Parameters:
Returns:

The name for the ligand

Return type:

str

schrodinger.protein.annotation.make_ligand_name(ct, ligand)

Make a unique, human-readable name for a ligand. This name matches the ligand name in the structure hierarchy.

Parameters:
Returns:

The name for the ligand

Return type:

str

schrodinger.protein.annotation.parse_antibody_rescode(newcode)

Extract the resnum and inscode from a residue number assigned by the numbering scheme. A numeric inscode is converted to a letter, e.g. ‘H101.1’ -> ‘101A’. Residues that fall outside the numbering scheme catalog (FV), or that cannot be assigned properly, get the residue number ‘-1’, e.g. ‘H-1’.

Parameters:

newcode (str) – Residue code by the Antibody CDR numbering scheme.

Returns:

new residue number and insertion code.

Return type:

tuple

Raises:

KeyError – if newcode doesn’t follow the expected pattern.
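The parsing rules above can be sketched with a regular expression. The function parse_rescode_sketch is hypothetical and not the library's implementation; the accepted pattern, the integer type of the residue number, and the use of a single space for a blank insertion code are all assumptions.

```python
import re
import string

# Assumed pattern: a chain letter, a (possibly negative) residue number,
# and an optional insertion code given as ".<number>" or a letter.
_PATTERN = re.compile(r"^[A-Za-z](-?\d+)(?:\.(\d+)|([A-Za-z]))?$")


def parse_rescode_sketch(newcode):
    """Hypothetical parser returning (resnum, inscode)."""
    match = _PATTERN.match(newcode)
    if match is None:
        # Mirrors the documented KeyError for unexpected patterns.
        raise KeyError(newcode)
    resnum = int(match.group(1))
    numeric_ins, letter_ins = match.group(2), match.group(3)
    if numeric_ins is not None:
        index = int(numeric_ins)
        if not 1 <= index <= 26:
            raise KeyError(newcode)
        # Numeric insertion codes map to letters: 1 -> 'A', 2 -> 'B', ...
        inscode = string.ascii_uppercase[index - 1]
    else:
        inscode = letter_ins or " "  # assumed blank insertion code
    return resnum, inscode


with_insertion = parse_rescode_sketch("H101.1")
unassigned = parse_rescode_sketch("H-1")
```

The first call corresponds to the documented ‘H101.1’ -> ‘101A’ example, split into its residue number and insertion code.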