schrodinger.application.bioluminate.classify module

schrodinger.application.bioluminate.classify.renumber_st_chain(chain: Structure, scheme: AntibodyCDRScheme = AntibodyCDRScheme.Kabat, resid_map: dict[str, str] = None) → dict[str, str]

Renumber the residues in a structure chain according to the given numbering scheme. If a resid_map (a mapping from the original residue ID to the current residue ID) is provided, its residue IDs are updated to match.

Parameters:

chain – The structure chain to renumber
scheme – The numbering scheme to use
resid_map – A mapping from the original residue ID to the current residue ID

Returns:

The updated resid_map

schrodinger.application.bioluminate.classify.update_resid_map(chain: Structure, resid_map: dict[str, str] = None) → dict[str, str]

Update the resid_map with the current residue IDs from the chain. If no resid_map is provided, a new one will be created.

Parameters:

chain – The structure chain to update
resid_map – A mapping from the original residue ID to the current residue ID; if not provided, a new mapping will be created

Returns:

The updated resid_map

schrodinger.application.bioluminate.classify.apply_numbering(chain: Structure, numbered_sequence: NumberedSequence, resid_map: dict[str, str] = None) → dict[str, str]

Apply the numbering to the chain residues and to the resid_map.

Parameters:

chain – The structure chain to renumber
numbered_sequence – The sequence numbering to apply
resid_map – A mapping from the original residue ID to the current residue ID; the mapping will be updated and returned

Returns:

The updated resid_map
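
The resid_map bookkeeping described for the three functions above can be pictured with plain dictionaries. The following is a minimal sketch, not the module's implementation: remap_resids is a hypothetical helper, and the "chain:number" residue ID strings are an assumed format chosen only for illustration.

```python
def remap_resids(old_ids, new_ids, resid_map=None):
    """Return a map from original residue IDs to post-renumbering IDs.

    old_ids[i] and new_ids[i] are the IDs of the same residue before and
    after one renumbering step (hypothetical inputs, not Schrodinger objects).
    """
    if resid_map is None:
        # No history yet: each residue's original ID is its pre-renumbering ID.
        resid_map = {rid: rid for rid in old_ids}
    step = dict(zip(old_ids, new_ids))
    # Keys (original IDs) never change; only the current IDs are advanced.
    return {orig: step.get(cur, cur) for orig, cur in resid_map.items()}


# Two renumbering steps applied in sequence (assumed ID format "chain:number").
first = remap_resids(["A:1", "A:2"], ["A:1", "A:1A"])
second = remap_resids(["A:1", "A:1A"], ["H:1", "H:2"], first)
print(second)
```

Chaining the two steps keeps every key tied to the ID the residue started with, which is the property the resid_map parameter descriptions rely on.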

schrodinger.application.bioluminate.classify.get_sequence_numbering(sequence, ab_scheme=AntibodyCDRScheme.Kabat, chain_name=None, classifiers=(<class 'schrodinger.application.bioluminate.anarci_classifier.AnarciClassifier'>, )) → NumberedSequence

Get the sequence numbering for a sequence.

schrodinger.application.bioluminate.classify.get_protein_family_data(sequence: str, ab_scheme=AntibodyCDRScheme.Kabat, chain_name: str = None, classifiers=(<class 'schrodinger.application.bioluminate.anarci_classifier.AnarciClassifier'>, )) → ProteinClass

Get the protein family data for a sequence.

schrodinger.application.bioluminate.classify.get_region_lengths_from_bounds(region_bounds: dict[str, tuple[int, int]]) → dict[str, int]

Calculate the lengths of the regions based on their bounds.
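
As an illustration of the bounds-to-lengths calculation, here is a minimal sketch. It assumes that both ends of each (start, end) pair are inclusive, which the documentation above does not state; region_lengths is a hypothetical stand-in, not the module's function, and the region names are made up for the example.

```python
def region_lengths(region_bounds):
    """Map each region name to its length.

    Assumption: (start, end) bounds are inclusive on both ends, so a region
    covering positions 26..32 has length 7.
    """
    return {name: end - start + 1 for name, (start, end) in region_bounds.items()}


lengths = region_lengths({"CDR1": (26, 32), "FR2": (33, 47)})
print(lengths)
```

If the real bounds turn out to be half-open, the length would be end - start instead.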

schrodinger.application.bioluminate.classify.validate_sequence_for_format(sequence: str, ab_format: AntibodyFormat, ab_scheme=AntibodyCDRScheme.Kabat) → None

Check if the sequence contains the necessary regions for the specified antibody format.

Parameters:

sequence – The full sequence to check
ab_format – The antibody format specifying the required regions
ab_scheme – The antibody numbering scheme to use for determining the regions; defaults to DEFAULT_ANTIBODY_SCHEME (shown as AntibodyCDRScheme.Kabat in the signature)

Raises:

ValueError – If the sequence is not valid for the specified antibody format

schrodinger.application.bioluminate.classify.get_sequence_for_format(sequence: str, ab_format: AntibodyFormat, ab_scheme=AntibodyCDRScheme.Kabat) → str

Get the sequence segment corresponding to the specified antibody format.

Parameters:

sequence – The full sequence to extract the segment from
ab_format – The antibody format specifying the segment to extract
ab_scheme – The antibody numbering scheme to use for determining the regions; defaults to DEFAULT_ANTIBODY_SCHEME (shown as AntibodyCDRScheme.Kabat in the signature)

schrodinger.application.bioluminate.classify.get_regions_for_format(ab_format: AntibodyFormat, anarci_type: str) → list[str]

Get the region names corresponding to the specified antibody format and ANARCI type.

Parameters:

ab_format – The antibody format specifying the segment to extract
anarci_type – The ANARCI type of the sequence (e.g., ‘VH’, ‘VL’)

Returns:

A list of region names corresponding to the specified format and ANARCI type

schrodinger.application.bioluminate.classify.trim_sequence_to_regions(protein_data: ProteinClass, regions: tuple[str]) → str

Get the sequence segment corresponding to the specified regions.

Check that:

- All specified regions are present in the protein data.
- The regions are contiguous in the sequence, meaning there are no gaps between them.

Parameters:

protein_data – The ProteinClass instance containing the sequence and region boundaries
regions – A tuple of region names to extract from the sequence

Returns:

The concatenated sequence segments corresponding to the specified regions
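
The presence and contiguity checks described above can be sketched with a plain string and a dictionary of bounds. This is an illustration under stated assumptions, not the module's code: trim_to_regions is a hypothetical helper, the bounds are assumed to be 0-based and inclusive, and raising ValueError on a failed check is a choice made for this example.

```python
def trim_to_regions(sequence, region_bounds, regions):
    """Concatenate the segments of `sequence` covered by `regions`.

    Assumptions: region_bounds maps name -> (start, end), 0-based and
    inclusive; a failed check raises ValueError (chosen for this sketch).
    """
    missing = [name for name in regions if name not in region_bounds]
    if missing:
        raise ValueError(f"Regions not found: {missing}")

    # Order the requested regions by position in the sequence.
    ordered = sorted(region_bounds[name] for name in regions)

    # Contiguous means each region starts right after the previous one ends.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start != prev_end + 1:
            raise ValueError("Regions are not contiguous")

    return "".join(sequence[start:end + 1] for start, end in ordered)


bounds = {"FR1": (0, 2), "CDR1": (3, 5), "FR2": (6, 9)}
segment = trim_to_regions("ABCDEFGHIJ", bounds, ("FR1", "CDR1"))
print(segment)
```

Requesting ("FR1", "FR2") with these bounds would fail the contiguity check, because CDR1 lies between them.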

schrodinger.application.bioluminate.classify.get_merged_region_bounds(bounds_list: list[tuple[int, int]]) → list[tuple[int, int]]

Merges overlapping or adjacent regions from the given list of (start, end) bounds, returning a single region for each set of adjacent or overlapping regions.

Parameters:

bounds_list – list of (start, end) bounds for regions

Returns:

list of (start, end) bounds for each merged continuous region
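
A common way to merge overlapping or adjacent intervals is to sort them and sweep once from left to right. The sketch below shows that technique; it is not taken from the module. merge_bounds is a hypothetical name, and the code assumes inclusive integer bounds, so that (1, 3) and (4, 6) count as adjacent.

```python
def merge_bounds(bounds_list):
    """Merge overlapping or adjacent (start, end) bounds.

    Assumption: bounds are inclusive integers, so two regions are adjacent
    when the next start equals the previous end + 1.
    """
    merged = []
    for start, end in sorted(bounds_list):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the last merged region: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap before this region: start a new merged region.
            merged.append((start, end))
    return merged


result = merge_bounds([(10, 12), (1, 3), (4, 6), (11, 15)])
print(result)
```

Sorting first means each region only needs to be compared with the most recently merged one, so the sweep itself is a single pass.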