schrodinger.application.bioluminate.anarci.annotate module

class schrodinger.application.bioluminate.anarci.annotate.AnnotationResult(anarci_type: Optional[schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType], sequence: schrodinger.protein.sequence.ProteinSequence, start_index: int = 0)

Bases: object

Basic information about a domain found by anarci. Includes the type of domain anarci found, the annotated sequence, and the index on the input sequence where the domain starts.

anarci_type: Optional[schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType]
sequence: schrodinger.protein.sequence.ProteinSequence
start_index: int = 0
classmethod from_domain_result(domain_result: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciDomainResult)

Create an AnnotationResult from an AnarciDomainResult.

get_annotation_strings_with_gaps()
get_annotation_strings()
property annotation_strings_with_gaps
property annotation_strings
property ig_type_str
__init__(anarci_type: Optional[schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType], sequence: schrodinger.protein.sequence.ProteinSequence, start_index: int = 0) None
schrodinger.application.bioluminate.anarci.annotate.read_fasta(filename: str) list[schrodinger.application.bioluminate.anarci.anarci_adapter.InputSequence]
schrodinger.application.bioluminate.anarci.annotate.get_anarci_results_from_fasta(filename: str, **kwargs)
schrodinger.application.bioluminate.anarci.annotate.split_numbers_by_region(scheme: schrodinger.infra.util.AntibodyCDRScheme, anarci_type: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType, numbering: list[schrodinger.application.bioluminate.anarci.anarci_adapter.ResInfo], ignore_gaps=False) tuple[tuple[schrodinger.application.bioluminate.anarci.anarci_adapter.ResInfo], ...]

Split the numbering into tuples of ResInfo objects for each region

Parameters
  • scheme – antibody numbering scheme to use for the region definitions

  • anarci_type – the type of antibody to get the region indices for

  • numbering – numbered residue info objects in sequential order

  • ignore_gaps – whether to skip ResInfo objects which represent gap characters in the sequence

Returns

tuples of ResInfo objects for each region
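
The splitting described above can be sketched in plain Python without the Schrödinger libraries. This is an illustration under stated assumptions, not the module's implementation: ResInfo here is a hypothetical two-field stand-in for anarci_adapter.ResInfo, split_by_region is a hypothetical helper, and only the IMGT scheme is covered, using the standard IMGT position ranges.

```python
from typing import NamedTuple


class ResInfo(NamedTuple):
    """Hypothetical stand-in for anarci_adapter.ResInfo (assumed fields)."""
    number: int   # position in the numbering scheme
    residue: str  # one-letter code, or "-" for a gap


# Standard IMGT variable-domain ranges (inclusive), in order:
# FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4
IMGT_REGIONS = ((1, 26), (27, 38), (39, 55), (56, 65),
                (66, 104), (105, 117), (118, 128))


def split_by_region(numbering, ignore_gaps=False):
    """Group numbered residues into one tuple per region, in region order."""
    regions = []
    for first, last in IMGT_REGIONS:
        members = tuple(
            res for res in numbering
            if first <= res.number <= last
            and not (ignore_gaps and res.residue == "-")
        )
        regions.append(members)
    return tuple(regions)


numbering = [ResInfo(25, "A"), ResInfo(26, "S"), ResInfo(27, "G"),
             ResInfo(28, "-"), ResInfo(39, "M")]
regions = split_by_region(numbering, ignore_gaps=True)
```

With ignore_gaps=True the gap at position 28 is dropped, so the second region holds only the residue at position 27. Regions with no residues come back as empty tuples, so the result always has one entry per region.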

schrodinger.application.bioluminate.anarci.annotate.get_region_bounds(scheme: str, anarci_type: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType, numbering: list[schrodinger.application.bioluminate.anarci.anarci_adapter.ResInfo], ignore_gaps=False, start_index: int = 0) tuple[tuple[int, int], ...]

Get the start and end indices for each loop and non-loop region

Parameters
  • scheme – antibody numbering scheme to use for the region definitions

  • anarci_type – the type of antibody to get the region indices for

  • numbering – numbered residue info objects in sequential order

  • ignore_gaps – whether to skip ResInfo objects which represent gap characters in the sequence

  • start_index – the starting index of the domain in the sequence

Returns

a tuple of start and end indices for each region
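
As a rough illustration of how start_index shifts the bounds, the sketch below turns a sequence of region lengths into (start, end) pairs. The helper name region_bounds is hypothetical, and the half-open convention (end is one past the last residue) is an assumption; the documentation above does not state whether the real end index is inclusive.

```python
def region_bounds(region_lengths, start_index=0):
    """Convert consecutive region lengths into (start, end) index pairs.

    Assumption: bounds are half-open, i.e. end is one past the last residue.
    """
    bounds = []
    position = start_index
    for length in region_lengths:
        bounds.append((position, position + length))
        position += length
    return tuple(bounds)


# Seven regions (FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4) of a domain that
# starts at index 3 of the input sequence.
bounds = region_bounds([25, 8, 17, 8, 38, 12, 11], start_index=3)
```

Because each region starts where the previous one ends, a zero-length region yields a pair with equal start and end rather than being skipped.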

schrodinger.application.bioluminate.anarci.annotate.get_numbers_by_region_name(scheme: schrodinger.infra.util.AntibodyCDRScheme, anarci_type: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType, numbering: list[schrodinger.application.bioluminate.anarci.anarci_adapter.ResInfo], ignore_gaps=False) dict[str, tuple[schrodinger.application.bioluminate.anarci.anarci_adapter.ResInfo]]

Get a dictionary of the ResInfo objects in each loop and non-loop region for the given antibody type, keyed by the region name (e.g. “HFR1”, “L3”, etc.)

Parameters
  • scheme – antibody numbering scheme to use for the region definitions

  • anarci_type – the type of antibody to get the region indices for

  • numbering – numbered residue info objects in sequential order

  • ignore_gaps – whether to skip ResInfo objects which represent gap characters in the sequence

Returns

tuples of ResInfo objects for each region, keyed by the region name
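
The keyed form can be pictured as pairing an ordered list of region names with the per-region tuples. The sketch below is an assumption-laden illustration: the helpers region_names and key_by_region_name are hypothetical, and the naming pattern is extrapolated from the two examples given above ("HFR1", "L3"); the names the real module generates may differ.

```python
def region_names(chain_letter):
    """Build names such as HFR1, H1, HFR2, H2, HFR3, H3, HFR4.

    Assumption: framework regions are named <chain>FR<n> and loops <chain><n>.
    """
    names = []
    for i in range(1, 4):
        names.append(f"{chain_letter}FR{i}")
        names.append(f"{chain_letter}{i}")
    names.append(f"{chain_letter}FR4")
    return names


def key_by_region_name(chain_letter, regions):
    """Pair each region tuple with its name, preserving region order."""
    return dict(zip(region_names(chain_letter), regions))


# Plain strings stand in for the ResInfo objects of each region.
regions = (("Q", "V"), ("G",), ("W",), ("I",), ("R",), ("A", "R"), ("W",))
by_name = key_by_region_name("H", regions)
```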

schrodinger.application.bioluminate.anarci.annotate.get_variable_region_names(anarci_type: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType)

Get the names of the variable regions for the given anarci type

Parameters
  • anarci_type – the type to get the region names for

Returns

a generator of the region names

schrodinger.application.bioluminate.anarci.annotate.get_region_lengths(scheme: schrodinger.infra.util.AntibodyCDRScheme, anarci_type: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType, numbering: list[schrodinger.application.bioluminate.anarci.anarci_adapter.ResInfo], ignore_gaps=False) tuple[int, ...]

Get the length of each loop and non-loop region for the given numbered residues

Parameters
  • scheme – antibody numbering scheme to use for the region definitions

  • anarci_type – the type of antibody to get the region indices for

  • numbering – numbered residue info objects in sequential order

  • ignore_gaps – whether to skip ResInfo objects which represent gap characters in the sequence

Returns

the lengths of the regions before, inside, and in between the supplied ranges
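
The effect of ignore_gaps on the reported lengths can be shown with a small stand-alone sketch. It assumes each region is already available as a sequence of one-letter codes in which "-" marks a gap; the helper region_lengths is hypothetical and is not the module's function.

```python
def region_lengths(regions, ignore_gaps=False):
    """Count the residues in each region, optionally skipping gap characters."""
    return tuple(
        sum(1 for res in region if not (ignore_gaps and res == "-"))
        for region in regions
    )


regions = ["QVQ", "GF--T", "WVR"]
lengths_with_gaps = region_lengths(regions)
lengths_without_gaps = region_lengths(regions, ignore_gaps=True)
```

Here the middle region counts five positions when gaps are kept and three when they are ignored; the other regions are unaffected.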

schrodinger.application.bioluminate.anarci.annotate.get_annotations_from_results(results: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciResults) list[schrodinger.application.bioluminate.anarci.annotate.AnnotationResult]
schrodinger.application.bioluminate.anarci.annotate.get_annotations(sequences: list[schrodinger.application.bioluminate.anarci.anarci_adapter.InputSequence], scheme=AntibodyCDRScheme.IMGT) list[schrodinger.application.bioluminate.anarci.annotate.AnnotationResult]

Get the MSV annotations for the given sequences

Parameters
  • sequences – the sequences to annotate

  • scheme – numbering scheme

Returns

a list of AnnotationResult objects, each holding the type of immunoglobulin domain and its annotated sequence

schrodinger.application.bioluminate.anarci.annotate.show_msv_annotation(annotated_domains: list[schrodinger.protein.sequence.ProteinSequence], ig_type_str: str)

Show an MSV window with the given annotated domains

Parameters
  • annotated_domains – the MSV-annotated domain objects to show

  • ig_type_str – the type of immunoglobulin to show (TCR or Antibody)

schrodinger.application.bioluminate.anarci.annotate.write_annotated_fasta(out_filename: str, dom_annotations: list[schrodinger.application.bioluminate.anarci.annotate.AnnotationResult])
schrodinger.application.bioluminate.anarci.annotate.get_out_filename(filename: str) str

Get the output filename for the given input filename

schrodinger.application.bioluminate.anarci.annotate.parse_args(args)
schrodinger.application.bioluminate.anarci.annotate.main()
schrodinger.application.bioluminate.anarci.annotate.get_family_region_bounds(sequence: str, ab_scheme=AntibodyCDRScheme.Kabat, skip_constant=False) dict[str, tuple[int, int]]

Get the region bounds for a sequence.

schrodinger.application.bioluminate.anarci.annotate.get_family_region_bounds_from_chain_result(chain_result)
schrodinger.application.bioluminate.anarci.annotate.get_const_ref_filename(chain_type: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciType)
schrodinger.application.bioluminate.anarci.annotate.get_family_region_bounds_from_domain_result(result: schrodinger.application.bioluminate.anarci.anarci_adapter.AnarciDomainResult) dict[str, tuple[int, int]]

Get the region bounds for a domain result.

schrodinger.application.bioluminate.anarci.annotate.get_constant_domain_bounds(full_sequence, chain_type, subseq_bounds=None)