schrodinger.livedesign.bbchem_endpoints module

Collection of functions intended as bbchem web endpoints.

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.livedesign.bbchem_endpoints.split_data_blocks(data: str, input_format: Format, options: Optional[RegistrationOptions] = None)

Iterates across serialized formats, yielding a single data block at a time. Supports iterating across SD, Maestro, and FASTA files; other formats are returned as a single block. NOTE: if a FASTA mapping is set on the registration options, the FASTA is parsed as a single block.

Parameters:
  • data – input text string

  • input_format – input format of the data

  • options – registration options

Returns:

an iterator of data blocks
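
As a rough illustration of what "a single data block at a time" means for SD input, the sketch below splits text on the `$$$$` record terminator using only the standard library. `iter_sd_blocks` is a hypothetical helper, not part of this module, and the real function may handle whitespace, FASTA, Maestro, and edge cases differently.

```python
from typing import Iterator


def iter_sd_blocks(data: str) -> Iterator[str]:
    """Yield one SD record at a time, each ending with its '$$$$' line.

    Hypothetical stand-in for illustration; not the bbchem implementation.
    """
    block: list[str] = []
    for line in data.splitlines():
        block.append(line)
        if line.strip() == "$$$$":
            yield "\n".join(block) + "\n"
            block = []
    # Trailing text without a terminator is still returned as a final block.
    if any(line.strip() for line in block):
        yield "\n".join(block) + "\n"


# Two placeholder records; the "..." lines stand in for real atom/bond blocks.
sd_text = "mol1\n...\nM  END\n$$$$\nmol2\n...\nM  END\n$$$$\n"
blocks = list(iter_sd_blocks(sd_text))
```

Yielding blocks lazily lets a caller register or report errors per record without holding every parsed structure in memory.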

schrodinger.livedesign.bbchem_endpoints.to_registration_data(data: str, input_format: Format, options: Optional[RegistrationOptions] = None) Iterator[Union[RegistrationData, Exception]]

Generalizes the small molecule and biologics registration processes, returning all data that LiveDesign stores in its internal databases. This includes returning rdmol binaries directly for each entity. Output includes properties from the input mol, computed properties, and potentially any child data of derived virtuals.

schrodinger.livedesign.bbchem_endpoints.to_format(mol_input: str, input_format: Format, output_format: Format, additional_properties: Optional[Dict] = None) str

Main entrypoint for converting to a serialized text format.

Parameters:
  • mol_input – serialized mol

  • input_format – input format of the mol string

  • output_format – desired format for output string

  • additional_properties – property data to include on serialization

Returns:

converted text string

schrodinger.livedesign.bbchem_endpoints.to_image(mol_input: Optional[str], alignment_input: Optional[str] = None, substructure_options: Optional[QueryOptions] = None, highlight_input: Optional[str] = None, render_options: Optional[ImageGenOptions] = None, force_atomistic: bool = False) bytes

Generates an image from a serialized input string; the request may include alignment, or substructure highlighting, or both.

Parameters:
  • mol_input – serialized mol

  • alignment_input – molecule to align to prior to image generation

  • substructure_options – substructure matching options

  • highlight_input – core to highlight in generated image

  • render_options – image generation options

  • force_atomistic – whether to force monomeric mols to be represented atomistically

Returns:

generated image SVG or PNG bytes
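
Because the result may be either SVG or PNG bytes, a caller that does not control the render options may need to tell the two apart. The sketch below is one possible way to do that from the payload alone. `guess_image_type` is a hypothetical helper, not part of this module. The PNG signature is fixed by the PNG format, while the SVG check is a simple heuristic.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte header of every PNG file


def guess_image_type(image: bytes) -> str:
    """Return 'png', 'svg', or 'unknown' for raw image bytes.

    Hypothetical helper for illustration; not part of bbchem_endpoints.
    """
    if image.startswith(PNG_SIGNATURE):
        return "png"
    # SVG is XML text; look for an <svg tag near the start of the payload.
    head = image[:1024].lstrip().lower()
    if head.startswith(b"<svg") or (head.startswith(b"<?xml") and b"<svg" in head):
        return "svg"
    return "unknown"
```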

schrodinger.livedesign.bbchem_endpoints.to_entity_stock_image(entity_class: EntityClass, draw_options: ImageGenOptions) bytes
Parameters:

entity_class – the entity class to get the stock image for

Returns:

bytes for the entity specific stock image

schrodinger.livedesign.bbchem_endpoints.requires_entity_stock_image(mol_input: str) bool
Parameters:

mol_input – serialized mol

Returns:

whether the image should fall back to stock images derived from entity type

schrodinger.livedesign.bbchem_endpoints.to_fingerprint(mol_input: str, use: FingerprintUse, substructure_options: Optional[QueryOptions] = None) ExplicitBitVect

Generates a substructure or similarity fingerprint for a given mol.

Parameters:
  • mol_input – serialized mol

  • use – type of fingerprint to generate

  • substructure_options – substructure matching options

schrodinger.livedesign.bbchem_endpoints.num_substructure_matches(*args, **kwargs) int
Returns:

number of substructure/subsequence matches

schrodinger.livedesign.bbchem_endpoints.has_substructure_match(*args, **kwargs) bool
Returns:

whether any substructure/subsequence match was found

schrodinger.livedesign.bbchem_endpoints.to_sequence_viewer_data(mol_binary_str: str, scheme: AntibodyCDRScheme = AntibodyCDRScheme.Kabat) Dict[str, Dict]
Returns:

biologics sequence viewer data for the given mol

schrodinger.livedesign.bbchem_endpoints.get_mutated_helm(mol_binary_str: str, mut_res_by_idx: dict[tuple[str, int], str]) str

Given a monomeric mol and mutations that indicate which monomers to mutate to which other monomers, returns a new monomeric mol HELM string with the mutations applied.

Parameters:
  • mol_binary_str – input monomeric mol as RDMOL_BINARY_BASE64 string

  • mut_res_by_idx – dictionary mapping (chain_id, residue_index) tuples to new monomer names

Returns:

mutated monomeric mol as HELM string

schrodinger.livedesign.bbchem_endpoints.get_mutated_helms(mol_binary_str: str, mut_res_by_idx_list: list[dict[tuple[str, int], str]]) Generator[str, None, None]

Given a monomeric mol and a list of mutations that indicate which monomers to mutate to which other monomers, yields new monomeric mol HELM strings with the mutations applied.

Parameters:
  • mol_binary_str – input monomeric mol as RDMOL_BINARY_BASE64 string

  • mut_res_by_idx_list – list of dictionaries mapping (chain_id, residue_index) tuples to new monomer names

Yields:

mutated monomeric mols as HELM strings
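
To make the shape of the mutation arguments concrete, the sketch below applies `(chain_id, residue_index)` mappings to a plain dictionary of monomer lists. It is a stand-in only: `apply_mutations` is hypothetical, the `PEPTIDE1` chain ID and the 1-based indexing are assumptions, and the real endpoints operate on a base64 rdmol binary and return HELM strings.

```python
def apply_mutations(
    chains: dict[str, list[str]],
    mut_res_by_idx: dict[tuple[str, int], str],
) -> dict[str, list[str]]:
    """Return a copy of `chains` with the requested monomers replaced.

    Assumes 1-based residue indices; the real endpoint's indexing is not
    stated in this reference.
    """
    mutated = {chain_id: list(monomers) for chain_id, monomers in chains.items()}
    for (chain_id, residue_index), new_monomer in mut_res_by_idx.items():
        mutated[chain_id][residue_index - 1] = new_monomer
    return mutated


chains = {"PEPTIDE1": ["A", "C", "D"]}

# One mapping, as passed to get_mutated_helm.
single = apply_mutations(chains, {("PEPTIDE1", 2): "G"})

# A list of mappings, as passed to get_mutated_helms: one variant per mapping.
mutation_list = [{("PEPTIDE1", 1): "G"}, {("PEPTIDE1", 3): "K"}]
variants = [apply_mutations(chains, m) for m in mutation_list]
```

Each mapping in the list is applied to the original input independently, which matches the one-result-per-mapping behavior described above.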

schrodinger.livedesign.bbchem_endpoints.generate_image(mol: Mol, alignment_mol: Optional[Mol] = None, substructure_options: Optional[QueryOptions] = None, highlight_mol: Optional[Mol] = None, draw_options: Optional[ImageGenOptions] = None) bytes

DEPRECATED: Remove once bbchem is updated

schrodinger.livedesign.bbchem_endpoints.generate_sar_analysis_image(match_mol: Mol, scaffold_mol: Mol, substructure_options: Optional[QueryOptions] = None, draw_options: Optional[ImageGenOptions] = None) bytes

Generates the image that LiveDesign uses for SAR analysis output, highlighting the core and all R-groups from the decomposition.

Parameters:
  • match_mol – source molecule for R-group decomposition to highlight and generate image of

  • scaffold_mol – scaffold molecule on which to find R-groups

  • substructure_options – substructure matching options

  • draw_options – image generation options

Returns:

generated image bytes

schrodinger.livedesign.bbchem_endpoints.pop_properties(mol: Mol) dict
Parameters:

mol – molecule to extract, then clear all properties from

Returns:

map of all removed properties as strings

schrodinger.livedesign.bbchem_endpoints.set_properties(mol: Mol, new_props: dict)
Parameters:
  • mol – molecule to clear, then set given properties on

  • new_props – map of properties to add onto the molecule

schrodinger.livedesign.bbchem_endpoints.split_fragments(mol: Mol)
Parameters:

mol – input molecule

Returns:

iterable containing each fragment mol

schrodinger.livedesign.bbchem_endpoints.enumerate_stereoisomers(mol: Mol, max_stereoisomers: int = 512) Iterator[Mol]

Generates stereoisomers of the given molecule.

Parameters:
  • mol – molecule from which to generate stereoisomers

  • max_stereoisomers – maximum number of stereoisomers to generate

Returns:

generated stereoisomers

schrodinger.livedesign.bbchem_endpoints.rgroup_decompose(scaffold_mol: Mol, match_mol: Mol, options: Optional[QueryOptions] = None) Optional[List[dict]]

Decomposes a molecule into its core and R-groups given a scaffold.

Parameters:
  • scaffold_mol – scaffold molecule on which to find R-groups

  • match_mol – source molecule for R-group decomposition

  • options – substructure matching options, such as whether to consider bond stereochemistry and atom chirality of the scaffold

Returns:

list of dicts of R-group matches

schrodinger.livedesign.bbchem_endpoints.get_rgroup_labels(scaffold_mol: Mol) List[str]
Parameters:

scaffold_mol – scaffold molecule

Returns:

R-group labels present on the scaffold

schrodinger.livedesign.bbchem_endpoints.check_reaction(rxn_input: str) RxnCheckResult
schrodinger.livedesign.bbchem_endpoints.setup_reaction(rxn_input: str) str

Tidy up and convert user sketched reactions into a format that can be used for reaction enumeration.

Parameters:

rxn_input – a RXNBlock or RXNSMARTS describing the user’s reaction.

Returns:

a SMARTS string describing the cleaned up reaction

schrodinger.livedesign.bbchem_endpoints.run_reaction(rxn_input: str, reactant_lists: List[List[str]], reactant_id_lists: Optional[List[List[str]]] = None, max_products: Optional[int] = None, property_filters: Optional[Dict] = None) Iterator[Tuple[str, List[str]]]

Executes a reaction on one or more sets of reagents.

Each reaction can take one or more reagents, and can be run on one or more sets of reagents.

Parameters:
  • rxn_input – reaction definition in any supported format, such as RXN or reaction SMARTS.

  • reactant_lists – lists of reactants in any supported format, such as MOL or SMILES. Each list has to have the correct length, matching the number of reactant templates used by the reaction.

  • reactant_id_lists – lists of IDs of the reactants. Each list should have exactly the same size as the corresponding list in reactant_lists. For reactants with no ID, an empty string should be used.

  • max_products – yield at most this many unique products. If not provided, all products are returned without limit or deduplication. (The canonical SMILES is used as the key.)

  • property_filters – dictionary with JSON data describing the property filters, with the schema expected by schrodinger.ui.qt.filter_dialog_dir.filter_core.Filter.

Returns:

generator of tuples, each holding a product in SDF format and its list of IDs.
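
The `max_products` behavior described above (no limit and no deduplication when it is omitted, otherwise a cap on unique products keyed by canonical SMILES) can be sketched with a small generator. `limit_products` is a hypothetical helper operating on precomputed `(key, payload)` pairs. It does not run any chemistry and is not the module's implementation.

```python
from typing import Iterable, Iterator, Optional


def limit_products(
    products: Iterable[tuple[str, str]],
    max_products: Optional[int] = None,
) -> Iterator[str]:
    """Yield payloads from (dedup_key, payload) pairs.

    With max_products=None everything passes through, duplicates included.
    Otherwise at most max_products payloads with distinct keys are yielded.
    """
    if max_products is None:
        for _key, payload in products:
            yield payload
        return
    seen: set[str] = set()
    for key, payload in products:
        if len(seen) >= max_products:
            break
        if key in seen:
            continue  # same key already yielded; skip the duplicate
        seen.add(key)
        yield payload


# Keys stand in for canonical SMILES; payloads stand in for SDF text.
pairs = [("CCO", "p1"), ("CCO", "p2"), ("CCN", "p3"), ("CCC", "p4")]
everything = list(limit_products(pairs))
first_two_unique = list(limit_products(pairs, max_products=2))
```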

schrodinger.livedesign.bbchem_endpoints.get_entity_properties(mol_input: str, input_format: Format = Format.AUTO_DETECT) MolecularProperties

Computes and returns molecular properties for a given molecule.

Parameters:
  • mol_input – serialized mol string

  • input_format – input format of the mol string

Returns:

NamedTuple of computed properties

Raises:

ValueError – if molecule is monomeric (biologics not supported)

schrodinger.livedesign.bbchem_endpoints.get_json_formatted_structure_hierarchy(mol_input: str, input_format: Format, structure_schemes: Optional[List[str]] = None) str

Returns a JSON string representing the structure hierarchy of the molecule.

Parameters:
  • mol_input – serialized mol string

  • input_format – input format of the mol string

  • structure_schemes – schemes to use for annotation.

Returns:

JSON string representing the structure hierarchy in each requested scheme, as a dictionary of the form:

{"output_response": [{"scheme": "<requested_scheme>", "output": "<output_structure_hierarchy>"}]}
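
A caller will typically want to look up the hierarchy by scheme. The sketch below parses a response of the documented shape with the standard `json` module. `hierarchy_by_scheme` is a hypothetical helper, and the scheme name and payload in `sample` are placeholders rather than real output.

```python
import json


def hierarchy_by_scheme(response_json: str) -> dict[str, str]:
    """Map each scheme name in the response to its hierarchy payload.

    Hypothetical helper; assumes the documented "output_response" layout.
    """
    response = json.loads(response_json)
    return {entry["scheme"]: entry["output"] for entry in response["output_response"]}


# Placeholder response; "Kabat" and "<hierarchy>" are illustrative values only.
sample = '{"output_response": [{"scheme": "Kabat", "output": "<hierarchy>"}]}'
by_scheme = hierarchy_by_scheme(sample)
```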

schrodinger.livedesign.bbchem_endpoints.get_sequence_to_structure_mapping(sequence_annotations: dict, structure_hierarchy_json: str) str

Get the mapping of sequences to their chain names.

Parameters:
  • sequence_annotations – dictionary of sequences to their chain names

  • structure_hierarchy_json – structure hierarchy json string

Returns:

json string mapping sequence keys to their chain ids

schrodinger.livedesign.bbchem_endpoints.has_subregion_match(target_input: str, query_input: str, regions: List[str], scheme: AntibodyCDRScheme = AntibodyCDRScheme.Kabat) bool

Checks for subsequence matches within specified subregions of monomeric molecules, e.g., CDRs of antibodies. Returns true if any of the specified regions has a match.

Parameters:
  • target_input – serialized mol of what to search

  • query_input – serialized mol of the query molecule to search with

  • regions – list of subregions to search within the target

  • scheme – numbering scheme to use for subregion matching

Returns:

whether any subsequence match was found within the given subregions

schrodinger.livedesign.bbchem_endpoints.get_3dviz_data(mol_input: str, input_format: Format, structure_schemes: Optional[List[str]] = None) VizData

Returns JSON strings for structure hierarchy and stereo labels for the given molecule input.

Parameters:
  • mol_input – serialized mol

  • input_format – input format of the mol string

  • structure_schemes – List of antibody structure schemes to generate structure hierarchy for.

Returns:

VizData object containing the JSON strings detailing the structure hierarchy and the chirality labels.

schrodinger.livedesign.bbchem_endpoints.get_sequence_logo_data(aligned_seqs: list[str]) tuple[float, list[tuple[str, float]]]

Returns data for generating logo plots for each residue in a sequence alignment.

Parameters:

aligned_seqs – List of sequences in the alignment.

Returns:

List of conservation scores and residue data for each alignment position. The residue data details the residue codes and their respective frequencies (ratio of occurrence in the position), ordered from highest to lowest frequency. Example:

[
    (4.3219..., [("A", 1.0)]),
    (3.3219..., [("E", 0.5), ("D", 0.5)]),
    ...
]
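
The example values are consistent with a conservation score of log2(20) minus the Shannon entropy of the column: about 4.3219 bits for a fully conserved position, and one bit less for an even two-way split. That formula is inferred from the example and is not stated in this reference. The sketch below follows it under that assumption. `sequence_logo_data` is a hypothetical stand-in that assumes equal-length sequences and treats every character, gaps included, as a residue.

```python
import math
from collections import Counter


def sequence_logo_data(
    aligned_seqs: list[str],
) -> list[tuple[float, list[tuple[str, float]]]]:
    """Per-column (conservation, [(residue, frequency), ...]) for an alignment.

    Assumed scoring: log2(20) minus the column's Shannon entropy, in bits.
    """
    max_bits = math.log2(20)  # assumes a 20-letter amino acid alphabet
    n_seqs = len(aligned_seqs)
    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        # most_common() orders by count, highest first; ties keep input order.
        residues = [(res, count / n_seqs) for res, count in counts.most_common()]
        entropy = -sum(freq * math.log2(freq) for _, freq in residues)
        result.append((max_bits - entropy, residues))
    return result


logo = sequence_logo_data(["AE", "AD"])
```

With the two-sequence alignment above, the first column is fully conserved and the second is split evenly between two residues, which mirrors the two rows of the documented example.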