schrodinger.application.bioluminate.protein_data module

class schrodinger.application.bioluminate.protein_data.ProteinClassifier

Bases: object

A classifier for a protein sequence.

PROTEIN_TYPES = ()

__init__()

class schrodinger.application.bioluminate.protein_data.NumberedResidue(residue: str, chain_id: str, number: int, insertion_code: str)

Bases: object

A residue with a sequence number.

residue: str

chain_id: str

number: int

insertion_code: str

property number_with_ins_code: str

property number_with_ins_code_and_chain_id: str

__init__(residue: str, chain_id: str, number: int, insertion_code: str) → None
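The fields and the two string properties above can be pictured with a small local stand-in. This is a minimal sketch, not the library's implementation: `ResidueSketch` is a hypothetical class, and the exact string formats returned by the real properties are not documented here, so the formats below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ResidueSketch:
    """Hypothetical stand-in mirroring the documented NumberedResidue fields."""
    residue: str
    chain_id: str
    number: int
    insertion_code: str  # ' ' (a single space) when there is no insertion

    @property
    def number_with_ins_code(self) -> str:
        # ASSUMPTION: the number followed by the insertion code,
        # with a blank insertion code contributing nothing.
        return f"{self.number}{self.insertion_code.strip()}"

    @property
    def number_with_ins_code_and_chain_id(self) -> str:
        # ASSUMPTION: chain ID, a colon, then the number and insertion code.
        return f"{self.chain_id}:{self.number_with_ins_code}"


plain = ResidueSketch("E", "H", 1, " ")
inserted = ResidueSketch("G", "H", 100, "A")
labels = [plain.number_with_ins_code, inserted.number_with_ins_code]
chain_label = inserted.number_with_ins_code_and_chain_id
```

Under these assumed formats, a residue with a blank insertion code is labelled by its number alone, while an inserted residue carries its code as a suffix.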

class schrodinger.application.bioluminate.protein_data.NumberedSequence(*args, scheme=None, **kwargs)

Bases: list

__init__(*args, scheme=None, **kwargs)

class schrodinger.application.bioluminate.protein_data.ProteinClass(type: str = 'Unknown', region_bounds: dict[str, tuple[int, int]] = <factory>, numbering_with_gaps: NumberedSequence = <factory>)

Bases: object

The basic information about a protein class: its type, the bounds of its regions, and its renumbered sequence.

For example:

    sequence: 'EVQ…'
    type: Antibody VH
    region_bounds: {'FR1': (1, 26), 'CDR1': (27, 38), 'FR2': (39, 55), …}
    numbering_with_gaps: [
        NumberedResidue(residue='E', chain_id='H', number=1, insertion_code=' '),
        NumberedResidue(residue='V', chain_id='H', number=2, insertion_code=' '),
        NumberedResidue(residue='Q', chain_id='H', number=3, insertion_code=' '),
        …,
        # etc., the insertion code being ' ' is important; it cannot be ''.
    ]
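To illustrate how a `region_bounds` mapping like the one in the example might be consumed, here is a minimal sketch that looks up the region containing a given residue number. `ProteinClassSketch` and `region_of` are hypothetical names used only for illustration, and the sketch assumes both bounds are inclusive, which the adjacent ranges (1, 26), (27, 38), (39, 55) suggest but the documentation does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProteinClassSketch:
    """Hypothetical stand-in with two of the documented ProteinClass fields."""
    type: str = "Unknown"
    region_bounds: dict[str, tuple[int, int]] = field(default_factory=dict)


def region_of(protein: ProteinClassSketch, number: int) -> Optional[str]:
    """Return the name of the region containing `number`, or None.

    ASSUMPTION: both the start and end bounds are inclusive.
    """
    for name, (start, end) in protein.region_bounds.items():
        if start <= number <= end:
            return name
    return None


vh = ProteinClassSketch(
    type="Antibody VH",
    region_bounds={"FR1": (1, 26), "CDR1": (27, 38), "FR2": (39, 55)},
)
```

With these bounds, residue 27 falls in CDR1, and a number past the last listed bound belongs to no region in the sketch.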

type: str = 'Unknown'

region_bounds: dict[str, tuple[int, int]]

numbering_with_gaps: NumberedSequence

property numbering: NumberedSequence

property numbering_strings: list[str]

property numbering_strings_with_chain_id: list[str]

classmethod fromSequence(sequence, ab_scheme=AntibodyCDRScheme.Kabat, chain_name=None, classifiers=())

adjustRegionBounds(region_adjustments: dict[str, [int, int]])

Create a new ProteinClass with adjusted region bounds.
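The "create a new instance" behaviour can be sketched with `dataclasses.replace`, which copies a dataclass while overriding selected fields. The documentation does not say whether each adjustment pair holds offsets or replacement bounds. The sketch below assumes offsets applied to the start and end of each named region. `ProteinClassSketch` and `adjust_region_bounds_sketch` are hypothetical names, not the library's code.

```python
from dataclasses import dataclass, field, replace


@dataclass
class ProteinClassSketch:
    """Hypothetical stand-in with two of the documented ProteinClass fields."""
    type: str = "Unknown"
    region_bounds: dict[str, tuple[int, int]] = field(default_factory=dict)

    def adjust_region_bounds_sketch(self, region_adjustments):
        # ASSUMPTION: each value is a [start_delta, end_delta] pair of offsets.
        new_bounds = dict(self.region_bounds)  # copy, so self is untouched
        for name, (start_delta, end_delta) in region_adjustments.items():
            start, end = new_bounds[name]
            new_bounds[name] = (start + start_delta, end + end_delta)
        # Return a new object rather than mutating the existing one.
        return replace(self, region_bounds=new_bounds)


original = ProteinClassSketch("Antibody VH", {"FR1": (1, 26), "CDR1": (27, 38)})
adjusted = original.adjust_region_bounds_sketch({"FR1": [0, -1], "CDR1": [-1, 0]})
```

Here the boundary between FR1 and CDR1 moves one position earlier in `adjusted`, while `original` keeps its bounds.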

__init__(type: str = 'Unknown', region_bounds: dict[str, tuple[int, int]] = <factory>, numbering_with_gaps: NumberedSequence = <factory>) → None

schrodinger.application.bioluminate.protein_data.find_domains(sequence: str, scheme, chain_name: str, classifiers: tuple[ProteinClassifier, ...]) → list[ProteinClass]

Find domains in a protein sequence using the provided classifiers.

Parameters:

sequence – The protein sequence to classify.
scheme – The antibody CDR scheme to use for classification.
chain_name – The name of the chain to classify.
classifiers – A tuple of ProteinClassifier instances to use for classification.

Returns:

A list of ProteinClass instances representing the classified domains in the sequence.

schrodinger.application.bioluminate.protein_data.number_terminal_residues(numbering: NumberedSequence, start: int, end: int, full_sequence: str, skip_zero=True) → NumberedSequence

Number the terminal residues of a sequence based on the existing numbering. Each end of the sequence will be numbered sequentially. If skip_zero is True, the numbering will skip zero.

Parameters:

numbering – The numbering of the sequence.
start – The starting index where the sequence has already been numbered.
end – The ending index where the sequence has already been numbered.
full_sequence – The full sequence to number.
skip_zero – If True, the numbering skips zero.

Returns:

The updated numbering with terminal residues numbered.
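The idea of extending an existing numbering outward to both termini can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `number_terminals_sketch` is a hypothetical function, residues are plain `(residue, number)` tuples rather than `NumberedResidue` objects, the existing numbering is assumed to cover `full_sequence[start:end]` (end exclusive) with no gaps, and skipping zero is assumed to mean that 1 is preceded directly by -1.

```python
def number_terminals_sketch(numbering, start, end, full_sequence, skip_zero=True):
    """Extend `numbering` to cover all of `full_sequence`.

    numbering: list of (residue, number) pairs for full_sequence[start:end].
    """
    result = list(numbering)

    # Walk backwards from the first numbered residue toward the N-terminus.
    number = numbering[0][1]
    for residue in reversed(full_sequence[:start]):
        number -= 1
        if skip_zero and number == 0:
            number = -1
        result.insert(0, (residue, number))

    # Walk forwards from the last numbered residue toward the C-terminus.
    number = numbering[-1][1]
    for residue in full_sequence[end:]:
        number += 1
        if skip_zero and number == 0:
            number = 1
        result.append((residue, number))
    return result


core = [("E", 1), ("V", 2), ("Q", 3)]  # covers "MAEVQL"[2:5]
skipping = number_terminals_sketch(core, 2, 5, "MAEVQL")
with_zero = number_terminals_sketch(core, 2, 5, "MAEVQL", skip_zero=False)
```

In this sketch the two leading residues receive -2 and -1 when zero is skipped, and -1 and 0 otherwise. The trailing residue continues from the last existing number in both cases.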

schrodinger.application.bioluminate.protein_data.validate_numbering(numbering: NumberedSequence, full_sequence: str)

Validate that the numbering matches the sequence, ignoring gaps.

Parameters:

numbering – The numbered sequence to validate.
full_sequence – The full sequence to validate against.
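A check of this kind can be sketched by dropping gap positions from the numbering and comparing what remains with the full sequence. The sketch makes several assumptions not stated in the documentation: gaps are marked with `'-'`, residues are plain `(residue, number)` tuples, and a mismatch raises `ValueError`. `validate_numbering_sketch` is a hypothetical name.

```python
GAP = "-"  # ASSUMPTION: gap positions carry '-' as their residue


def validate_numbering_sketch(numbering, full_sequence):
    """Raise ValueError if the non-gap residues differ from `full_sequence`.

    numbering: list of (residue, number) pairs, possibly containing gaps.
    """
    residues = "".join(res for res, _ in numbering if res != GAP)
    if residues != full_sequence:
        raise ValueError(
            f"numbering spells {residues!r}, expected {full_sequence!r}"
        )


# A numbering with one gap position that still spells 'EVQ'.
gapped = [("E", 1), ("-", 2), ("V", 3), ("Q", 4)]
validate_numbering_sketch(gapped, "EVQ")  # passes silently
```

A numbering whose non-gap residues spell anything other than the full sequence is rejected under these assumptions.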