schrodinger.livedesign.biologics.sequence module¶
- class schrodinger.livedesign.biologics.sequence.AlignedSequence(sequence: 'str', natural_analog_sequence: 'str' = None, identity: 'float' = None, similarity: 'float' = None)¶
- Bases: object
 - sequence: str¶
 - natural_analog_sequence: str = None¶
 - identity: float = None¶
 - similarity: float = None¶
 - static fromProteinSequence(seq: ProteinSequence, ref_seq: ProteinSequence = None, nonstandard_symbol='X')¶
 - __init__(sequence: str, natural_analog_sequence: str = None, identity: float = None, similarity: float = None) None¶
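The fields above suggest a plain record type. The following is a minimal sketch of such a record, not the library's class: `AlignedSequenceSketch` and `percent_identity` are hypothetical names, and the identity definition (matching columns divided by columns where neither sequence has a gap) is an assumption, since the source does not say how `identity` is computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlignedSequenceSketch:
    """Hypothetical stand-in that mirrors the documented fields."""
    sequence: str
    natural_analog_sequence: Optional[str] = None
    identity: Optional[float] = None
    similarity: Optional[float] = None


def percent_identity(aligned: str, reference: str, gap: str = "-") -> float:
    """Assumed definition: matches / columns where neither side is a gap."""
    if len(aligned) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, r) for a, r in zip(aligned, reference)
                if a != gap and r != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, r in compared if a == r)
    return matches / len(compared)


record = AlignedSequenceSketch(
    sequence="AC-DE",
    identity=percent_identity("AC-DE", "ACQDF"),
)
```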
 
- schrodinger.livedesign.biologics.sequence.get_sequence_viewer_data(mol: Mol, scheme: AntibodyCDRScheme = AntibodyCDRScheme.Kabat)¶
- Parameters:
- mol – rdmol to extract sequence data from 
- Returns:
- a map from polymer id to a dictionary mapping antibody regions to monomer indices in the corresponding simple polymer 
- Raises:
- RuntimeError – if the molecule contains nonlinear peptides 
 
- schrodinger.livedesign.biologics.sequence.get_annotations_for_helm_model(model: HelmModel, scheme: AntibodyCDRScheme) Dict[str, Dict[str, Union[Tuple[int, int], List[str]]]]¶
- HelmModels reorder polymer chains to canonicalize input, which means that the same polymer can have two different polymer ids in two models if those two models contain different peptide polymers. This function goes back through a HELM model and computes the mapping between each antibody chain and its constituent region annotations.
- Parameters:
- model – HelmModel to extract annotations for 
- Returns:
- a map from polymer id to a dictionary mapping antibody regions to monomer indices in the corresponding simple polymer. 
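To make the nested return type concrete, here is a small sketch that builds a value of that shape and filters it. The polymer ids, region names, index values, and the `tuple_regions` helper are all hypothetical; the source gives only the type signature, not the keys or the meaning of the list-valued entries.

```python
from typing import Dict, List, Tuple, Union

Annotation = Union[Tuple[int, int], List[str]]
AnnotationMap = Dict[str, Dict[str, Annotation]]


def tuple_regions(
        annotations: AnnotationMap) -> Dict[str, Dict[str, Tuple[int, int]]]:
    """Keep only the entries whose value is an (int, int) tuple."""
    return {
        polymer_id: {
            name: value
            for name, value in regions.items()
            if isinstance(value, tuple)
        }
        for polymer_id, regions in annotations.items()
    }


# Illustrative data only; real keys and values come from the library.
example: AnnotationMap = {
    "PEPTIDE1": {"H1": (25, 34), "notes": ["hypothetical"]},
    "PEPTIDE2": {"L1": (23, 33)},
}
ranges = tuple_regions(example)
```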
 
- schrodinger.livedesign.biologics.sequence.get_sequence_filter_chain_name(entity_class: EntityClass) str¶
- Simplify chain names presented in the sequence viewer filter combobox.
- Parameters:
- entity_class – the entity class of the given polymer chain 
- Returns:
- chain name to label the given entity’s sequence viewer data 
 
- schrodinger.livedesign.biologics.sequence.get_polymer_annotations(polymer: HelmPolymer, scheme: AntibodyCDRScheme) Tuple[str, Dict[str, Union[Tuple[int, int], List[str]]]]¶
- Returns the chain ID and sequence annotations for a HelmPolymer. 
- schrodinger.livedesign.biologics.sequence.get_monomer_data(polymer: HelmPolymer) dict[str, dict[str, Any]]¶
- Returns a dictionary of dictionaries containing monomer information for each monomer in the polymer. 
- schrodinger.livedesign.biologics.sequence.get_ab_annotations(fasta_sequence: str, scheme: AntibodyCDRScheme) Dict[str, Union[Tuple[int, int], List[str]]]¶
- Cheap cache wrapper around antibody.SeqType to reduce the cost of calling get_annotations for each RegistrationData object. 
- schrodinger.livedesign.biologics.sequence.split_by_hierarchy(region_dict: Dict[str, List[int]]) Dict[str, Dict[str, List[int]]]¶
- Splits a region dictionary into a dictionary of antibody domain boundaries (e.g., VH, CH1) and a dictionary of subdomain boundaries (e.g., HFR1, H1). 
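The split described above can be sketched as a partition of the input keys. The outer keys `"domains"` and `"subdomains"`, the `DOMAIN_NAMES` set, and the function name `split_regions` are assumptions made for illustration; the source does not state how the library labels the two levels or decides membership.

```python
from typing import Dict, List

# Assumed set of domain-level region names; everything else is
# treated as a subdomain in this sketch.
DOMAIN_NAMES = frozenset({"VH", "VL", "CH1", "CH2", "CH3", "CL"})


def split_regions(
        region_dict: Dict[str, List[int]]) -> Dict[str, Dict[str, List[int]]]:
    """Partition regions into domain-level and subdomain-level maps."""
    split: Dict[str, Dict[str, List[int]]] = {"domains": {}, "subdomains": {}}
    for name, indices in region_dict.items():
        level = "domains" if name in DOMAIN_NAMES else "subdomains"
        split[level][name] = list(indices)  # copy to avoid aliasing input
    return split


result = split_regions({
    "VH": [0, 120],
    "HFR1": [0, 24],
    "H1": [25, 34],
    "CH1": [121, 218],
})
```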
- schrodinger.livedesign.biologics.sequence.get_arm_indices(model: HelmModel) Dict[str, int]¶
- Returns a mapping from polymer id to arm pairs. If no arm pairing is provided, assigns a unique arm pair to each polymer id. 
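The fallback behavior can be sketched as follows. This operates on plain polymer id strings instead of a HelmModel, and the `assign_arm_indices` name, the `pairing` argument, and the choice of consecutive integers as unique arm values are assumptions.

```python
from typing import Dict, List, Optional


def assign_arm_indices(
        polymer_ids: List[str],
        pairing: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Use the given pairing, or give every polymer its own arm index."""
    if pairing:
        return {polymer_id: pairing[polymer_id] for polymer_id in polymer_ids}
    # No pairing supplied: each polymer id gets a distinct index.
    return {polymer_id: index for index, polymer_id in enumerate(polymer_ids)}


unpaired = assign_arm_indices(["PEPTIDE1", "PEPTIDE2"])
paired = assign_arm_indices(
    ["PEPTIDE1", "PEPTIDE2"],
    pairing={"PEPTIDE1": 0, "PEPTIDE2": 0},
)
```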
- schrodinger.livedesign.biologics.sequence.align_helm_polymer_sequences(sequences: List[HelmPolymer], align_mode=SeqAlnMode.Multiple, ref_seq_index: Optional[int] = None, gap_open_penalty=None, gap_extend_penalty=None) List[AlignedSequence]¶
- Aligns the given HelmPolymer sequences and returns them as a list of AlignedSequence objects.
- Parameters:
- sequences – sequences to align 
- ref_seq_index – if not None, all sequences are pairwise aligned using the sequence at ref_seq_index as a reference sequence 
- polymer_type – the type of polymer to align (Default: PEPTIDE) 
 
- Returns:
- the aligned sequences as a list of AlignedSequence objects 
 
- schrodinger.livedesign.biologics.sequence.align_sequences(sequences: List[str], natural_analog_sequences: List[str] = None, align_mode=SeqAlnMode.Multiple, ref_seq_index: Optional[int] = None, polymer_type=PolymerType.PEPTIDE, gap_open_penalty=None, gap_extend_penalty=None) List[AlignedSequence]¶
- Aligns the given sequences and returns them as a list of AlignedSequence objects.
- Parameters:
- sequences – sequences to align 
- ref_seq_index – if not None, all sequences are pairwise aligned using the sequence at ref_seq_index as a reference sequence 
- polymer_type – the type of polymer to align (Default: PEPTIDE) 
 
- Returns:
- the aligned sequences as a list of AlignedSequence objects 
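To illustrate what pairwise alignment against a reference sequence means, here is a self-contained Needleman-Wunsch sketch. It is not the aligner this module uses: it applies a single linear gap penalty, whereas the documented signature takes separate gap-open and gap-extend penalties, and the scoring values and function names are assumptions.

```python
from typing import List, Tuple


def global_align(ref: str, seq: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""

    def substitution(a: str, b: str) -> int:
        return match if a == b else mismatch

    rows, cols = len(ref) + 1, len(seq) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(ref[i - 1], seq[j - 1]),
                score[i - 1][j] + gap,  # gap in seq
                score[i][j - 1] + gap,  # gap in ref
            )

    # Trace back from the bottom-right corner to recover one optimal path.
    aligned_ref: List[str] = []
    aligned_seq: List[str] = []
    i, j = len(ref), len(seq)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j] ==
                score[i - 1][j - 1] + substitution(ref[i - 1], seq[j - 1])):
            aligned_ref.append(ref[i - 1])
            aligned_seq.append(seq[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_seq.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_seq.append(seq[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_seq))


def align_to_reference(sequences: List[str],
                       ref_seq_index: int) -> List[Tuple[str, str]]:
    """Align every sequence pairwise against the chosen reference."""
    ref = sequences[ref_seq_index]
    return [global_align(ref, seq) for seq in sequences]


pairs = align_to_reference(["ACDE", "ADE"], ref_seq_index=0)
```

Each entry of `pairs` holds the gapped reference and the gapped query for one input sequence, so the reference aligned to itself comes back unchanged.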
 
- schrodinger.livedesign.biologics.sequence.make_alignment_seq_with_analogs(seq: str, natural_analog_seq: str, polymer_type: PolymerType) ProteinSequence¶
- Returns a ProteinSequence with nonstandard monomers replaced with their natural analogs.
- Parameters:
- seq – sequence to replace nonstandard monomers in 
- polymer_type – the type of polymer to align (Default: PEPTIDE) 
- nonstandard_symbol – the symbol to use for nonstandard monomers 
 
 
- schrodinger.livedesign.biologics.sequence.get_res_type(monomer: str, analog_code: str, polymer_type: PolymerType) ResidueType¶
- Returns the ResidueType (an alignment-specific data structure) for a given monomer id. It returns one of three possible ResidueTypes:
 1. the ResidueType for the monomer, if it is a standard monomer
 2. the ResidueType for the natural analog of the monomer, if it is a nonstandard monomer with a natural analog in the monomer database
 3. a nonstandard ResidueType with the monomer id as the symbol, if it is a nonstandard monomer without a natural analog in the monomer database
- Parameters:
- monomer_id – the monomer id to get the residue type for 
- polymer_type – the type of polymer to align (Default: PEPTIDE) 
- nonstandard_symbol – the symbol to use for nonstandard monomers 
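The three-way fallback described for this function can be sketched with plain strings. `ResidueKind`, `resolve_residue`, and the `STANDARD_AMINO_ACIDS` set are hypothetical stand-ins for the library's ResidueType and monomer database, and the sketch covers peptides only.

```python
from dataclasses import dataclass
from typing import Optional

# The 20 standard amino acid one-letter codes (peptide case only).
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class ResidueKind:
    """Hypothetical stand-in for an alignment residue type."""
    symbol: str
    standard: bool


def resolve_residue(monomer: str, analog_code: Optional[str]) -> ResidueKind:
    # Case 1: the monomer itself is standard.
    if monomer in STANDARD_AMINO_ACIDS:
        return ResidueKind(monomer, True)
    # Case 2: nonstandard monomer with a standard natural analog.
    if analog_code in STANDARD_AMINO_ACIDS:
        return ResidueKind(analog_code, True)
    # Case 3: no usable analog, so keep the monomer id as the symbol.
    return ResidueKind(monomer, False)


standard = resolve_residue("A", None)
with_analog = resolve_residue("meA", "A")
without_analog = resolve_residue("Xyz", None)
```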
 
 
- schrodinger.livedesign.biologics.sequence.align_all_to_reference(aln: ProteinAlignment, ref_seq_index: int, align_settings: AlignSettingsModel) None¶
- Aligns a given ProteinAlignment pairwise with respect to the specified reference sequence. Because of how alignments are implemented (see protein.alignment.BaseAlignment), ref_seq must be a sequence already in the alignment. The input ProteinAlignment is modified in place and not returned.
- Parameters:
- aln – the alignment to be aligned 
- ref_seq_index – index corresponding to the reference sequence 
- align_settings – settings for the alignment 
 
 
- schrodinger.livedesign.biologics.sequence.multiple_align(aln: ProteinAlignment, align_settings: AlignSettingsModel) None¶
- Aligns a given ProteinAlignment via multiple sequence alignment. The input ProteinAlignment is modified in place and not returned.
- Parameters:
- aln – the alignment to be aligned 
 
- schrodinger.livedesign.biologics.sequence.get_fasta_monomers(model: HelmModel) List[List[HelmMonomer]]¶
- Returns a list of sequences of monomers from a given HelmModel. Only monomers that are FASTA-compatible (nucleotides and amino acids) are included in the output.
- Parameters:
- model – the HelmModel to get the monomers from
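The filtering step can be sketched on plain data. `MonomerSketch`, the polymer type labels in `FASTA_COMPATIBLE_TYPES`, and the decision to keep a polymer's list even when it ends up empty are assumptions; the source states only that non-FASTA-compatible monomers are excluded.

```python
from dataclasses import dataclass
from typing import List

# Assumed labels for polymer types that carry amino acids or nucleotides.
FASTA_COMPATIBLE_TYPES = frozenset({"PEPTIDE", "RNA"})


@dataclass(frozen=True)
class MonomerSketch:
    """Hypothetical stand-in for a HELM monomer."""
    symbol: str
    polymer_type: str


def fasta_monomers(
        polymers: List[List[MonomerSketch]]) -> List[List[MonomerSketch]]:
    """Keep, per polymer, only monomers of a FASTA-compatible type."""
    return [
        [m for m in polymer if m.polymer_type in FASTA_COMPATIBLE_TYPES]
        for polymer in polymers
    ]


model_polymers = [
    [MonomerSketch("A", "PEPTIDE"), MonomerSketch("C", "PEPTIDE")],
    [MonomerSketch("PEG2", "CHEM")],
]
filtered = fasta_monomers(model_polymers)
```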