schrodinger.application.matsci.species module

Module to group structure objects into subgroups of species.

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.application.matsci.species.SUB_STRUCT_INFO(main_st, sub_st, aids)

Bases: tuple

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

aids

Alias for field number 2

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

main_st

Alias for field number 0

sub_st

Alias for field number 1
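The field layout above can be illustrated with a small stand-in. The sketch below rebuilds a namedtuple with the same three field names using collections.namedtuple. SubStructInfo is a hypothetical local name, not the class exported by this module, and the strings are placeholders for real structure objects.

```python
from collections import namedtuple

# Local stand-in with the same field order as the documented SUB_STRUCT_INFO.
SubStructInfo = namedtuple("SubStructInfo", ["main_st", "sub_st", "aids"])

# Placeholder values: strings stand in for Structure objects (assumption).
info = SubStructInfo(
    main_st="parent structure", sub_st="extracted molecule", aids=[4, 5, 6])

# Fields are reachable by name or by position.
by_name = info.aids
by_position = info[2]

# Tuple unpacking follows the documented field order.
main_st, sub_st, aids = info
```

Because the class derives from tuple, the inherited count and index methods operate on the three field values.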

class schrodinger.application.matsci.species.SC_TYPE_DATA(menus, finder, signature, split_unique)

Bases: tuple

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

count(value, /)

Return number of occurrences of value.

finder

Alias for field number 1

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

menus

Alias for field number 0

signature

Alias for field number 2

split_unique

Alias for field number 3

schrodinger.application.matsci.species.format_species_display_name(name, count, is_cg=False, typo_type=None)

Gets the formatted species display name for an atom/particle name and its count.

Parameters
  • name (str) – The element/particle name

  • count (int) – The number of element/particle names in the structure

  • is_cg (bool) – Whether the structure is coarse-grained.

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

Returns

The formatted name

Return type

str

schrodinger.application.matsci.species.get_molecular_formula(struct, is_cg=None, typo_type=None)

Gets the molecular formula for the passed structure.

Parameters
  • struct (schrodinger.structure.Structure) – the structure to get formula from

  • is_cg (bool or None) – Whether the structure is coarse-grained. Will check if None is passed.

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

Return type

str

Returns

The molecular formula of the passed structure

schrodinger.application.matsci.species.group_objects_by_hash(grp_list, hash_func)

Group a list of objects by their hashes, where their hashes are calculated by the given hashing function

Parameters
  • grp_list (list) – The list of objects to group

  • hash_func (callable) – The function to generate hash for the passed object

Returns

Dictionary where each key is the hash generated for an object and each value is the list of objects that generated that hash

Return type

dict
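The grouping described above can be sketched in a few lines. This is a minimal illustration of the documented contract (list in, dict of lists out), not the module's own implementation; group_by_hash is a hypothetical name, and the first-letter hash function is only a toy example.

```python
from collections import defaultdict


def group_by_hash(grp_list, hash_func):
    """Group objects by the value hash_func returns for each of them."""
    groups = defaultdict(list)
    for obj in grp_list:
        # Objects that produce the same hash end up in the same list.
        groups[hash_func(obj)].append(obj)
    return dict(groups)


# Toy example: group element symbols by their first letter.
grouped = group_by_hash(["C", "Cl", "N", "Na", "O"], lambda name: name[0])
```

Within each group the objects keep the order they had in the input list.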

schrodinger.application.matsci.species.split_using_composition(atom_collections)

Splits a list of atom collections into unique and non-unique atom collections based on their elemental composition.

Parameters

atom_collections (list(structure._AtomCollection)) – A list of atom collections

Returns

The first element is the list of atom collections with unique elemental composition and second element is the list of atom collections that have non-unique elemental composition

Return type

tuple(list, list)
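One way to read the split above is that a collection counts as unique when no other collection shares its elemental composition. The sketch below follows that reading, which is an assumption rather than confirmed behavior. Plain lists of element symbols stand in for atom collections, and split_by_composition is a hypothetical name.

```python
from collections import Counter


def split_by_composition(atom_collections):
    """Return (unique, non_unique) lists based on elemental composition."""

    def composition(elements):
        # Order-independent fingerprint: element symbol -> count.
        return frozenset(Counter(elements).items())

    # How many collections share each composition.
    tally = Counter(composition(c) for c in atom_collections)
    unique = [c for c in atom_collections if tally[composition(c)] == 1]
    non_unique = [c for c in atom_collections if tally[composition(c)] > 1]
    return unique, non_unique


# Each collection is a list of element symbols (assumption).
water_a = ["O", "H", "H"]
water_b = ["H", "O", "H"]
methane = ["C", "H", "H", "H", "H"]
unique, non_unique = split_by_composition([water_a, methane, water_b])
```

Collections in the non-unique list share a composition but may still differ chemically (isomers, for example), which is why a finer signature such as SMILES would be needed to separate them.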

schrodinger.application.matsci.species.get_unique_molecule_nums(struct)

Get the molecule numbers of the unique representative molecules in the structure

Parameters

struct (structure.Structure) – The structure to find unique molecule numbers in

Return type

list

Returns

list of molecule numbers that are unique

schrodinger.application.matsci.species.get_unique_molecule_struct(struct)

Gets the structure comprising only unique representative molecules.

Parameters

struct (structure.Structure) – The structure to find unique molecules from

Returns

The structure with only unique molecules

Return type

structure.Structure

class schrodinger.application.matsci.species.BaseSignature

Bases: object

Base signature class to calculate key and name for a passed structure

abstract getKeyName(struct, is_cg=None, unique=False, typo_type=None)

Signature function to calculate key and name for passed structure

Parameters
  • struct (structure.Structure) – The structure to find signature

  • is_cg (bool) – Whether the structure is CG. If None then structure is tested for CG.

  • unique (bool) – Whether the structure is unique substructure

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

Returns

Tuple of two items where first item is the key and second the name

Return type

tuple(str, str)
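The abstract-method contract above can be sketched with the standard abc module. Both classes below are hypothetical stand-ins, not the module's classes: struct is a plain list of element symbols, and the alphabetical formula string is a simplification of a real molecular formula.

```python
from abc import ABC, abstractmethod
from collections import Counter


class SignatureSketch(ABC):
    """Stand-in for a signature base class (not the library class)."""

    @abstractmethod
    def getKeyName(self, struct, is_cg=None, unique=False, typo_type=None):
        """Return a (key, name) tuple of strings for struct."""


class FormulaSketch(SignatureSketch):
    """Uses a simple alphabetical formula string as both key and name."""

    def getKeyName(self, struct, is_cg=None, unique=False, typo_type=None):
        # struct is a list of element symbols here (assumption).
        counts = Counter(struct)
        formula = "".join(
            f"{element}{count if count > 1 else ''}"
            for element, count in sorted(counts.items()))
        return formula, formula


key, name = FormulaSketch().getKeyName(["H", "O", "H"])
```

The key is what groups substructures into one species, while the name is what gets displayed, so a subclass is free to return two different strings.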

class schrodinger.application.matsci.species.MolecularFormulaSignature

Bases: schrodinger.application.matsci.species.BaseSignature

Signature class to calculate signature using molecular formula

getKeyName(struct, is_cg=None, unique=False, typo_type=None)

Calculate key and name as the molecular formula. See parent for more information.

Returns

Tuple of two items. Both items here are the generated molecular formula

Return type

tuple(str, str)

class schrodinger.application.matsci.species.SmilesSignature(stereo_allowed=True)

Bases: schrodinger.application.matsci.species.BaseSignature

Signature class to calculate signature using SMILES

__init__(stereo_allowed=True)

Constructs a new SmilesSignature

Parameters

stereo_allowed (bool) – Whether the SMILES pattern should be stereo aware

getKeyName(struct, is_cg=None, unique=False, typo_type=None)

Calculate key as SMILES and name as molecular formula. See parent for more information.

Returns

Tuple of two items. The first item is the generated SMILES key and the second is the generated molecular formula

Return type

tuple(str, str)

class schrodinger.application.matsci.species.PolymerSmilesSignature(stereo_allowed=True)

Bases: schrodinger.application.matsci.species.SmilesSignature

Signature class to calculate signature using polymer name. Reverts to SMILES if the structure is not a polymer.

getKeyName(struct, is_cg=None, unique=False, typo_type=None)

Calculate key and name as polymer name. It will revert to calculating key as SMILES and name as molecular formula if the structure is not a polymer. See parent for more information.

Returns

Tuple of two items. Both items are the polymer name if the structure is a polymer; otherwise the first item is the SMILES key and the second is the molecular formula

Return type

tuple(str, str)

__init__(stereo_allowed=True)

Constructs a new SmilesSignature

Parameters

stereo_allowed (bool) – Whether the SMILES pattern should be stereo aware

class schrodinger.application.matsci.species.SpeciesData(key, name)

Bases: object

Tracks information about each chemically distinct group of atoms in a system

__init__(key, name)

Constructs a new instance.

Parameters
  • key (str) – The key unique to the group of atoms

  • name (str) – The name of the group of atoms

property count

Get the number of groups in the species

addAtomGroup(struct, atom_group)

Add a new group of atoms to the species

Parameters
  • struct (structure.Structure) – Structure to which the atoms belong

  • atom_group (iterable) – Atom indices of atoms belonging to the group

getStructures()

Get all structures to which the groups of atoms in this species belong

Returns

All structures to which the groups of atoms in this species belong

Return type

list(structure.Structure)

getAtomGroups(struct)

Gets the groups of atoms belonging to passed structure for the current species.

Parameters

struct (structure.Structure) – Structure in which to search for the groups of atoms.

Returns

All the groups of atoms belonging to passed structure

Return type

set(frozenset)

getAllAtomIds(struct)

Gets all atom indices belonging to the species for the passed structure. If all atom groups belong to the same structure, the structure is optional.

Parameters

struct (structure.Structure) – Structure in which to search for the groups of atoms.

Returns

all atom indices belonging to the species for the passed structure

Return type

list(int)

isAtomMember(atom_id, struct)

Check if the atom is part of the species. If all atom groups belong to the same structure, the structure is optional.

Parameters
  • atom_id (int) – Index of the atom to check

  • struct (structure.Structure) – Structure in which to search for the groups of atoms.

Return type

bool

Returns

Whether the atom is part of the species

Raises

RuntimeError – If struct is None and the species contains atom indices from different structures.

getExampleAtomGroup(struct)

Get a group of atoms for the species with the lowest atom index

Parameters

struct (structure.Structure) – Structure to which the example group of atoms should belong.

Return type

frozenset(int)

Returns

The first group of atoms of the species

getSampleAtomId(struct)

Gets the sample atom index that belongs to the species.

Parameters

struct (structure.Structure) – Structure to which the sample atom should belong.

Return type

int

Returns

Index of an atom that belongs to the species
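The bookkeeping described for SpeciesData can be pictured as a mapping from each structure to a set of frozen atom-index groups. The sketch below is a stand-in under that assumption and says nothing about the class's actual internals. SpeciesSketch is a hypothetical name, and a string stands in for a structure object.

```python
from collections import defaultdict


class SpeciesSketch:
    """Stand-in that tracks atom groups per structure for one species."""

    def __init__(self, key, name):
        self.key = key
        self.name = name
        # structure -> set of frozensets of atom indices
        self._groups = defaultdict(set)

    @property
    def count(self):
        # Total number of atom groups across all structures.
        return sum(len(groups) for groups in self._groups.values())

    def addAtomGroup(self, struct, atom_group):
        self._groups[struct].add(frozenset(atom_group))

    def getAllAtomIds(self, struct):
        groups = self._groups.get(struct, ())
        return sorted(aid for group in groups for aid in group)

    def getExampleAtomGroup(self, struct):
        # The group that contains the lowest atom index.
        return min(self._groups[struct], key=min)


water = SpeciesSketch(key="O", name="H2O")
water.addAtomGroup("box", [4, 5, 6])  # "box" stands in for a Structure
water.addAtomGroup("box", [1, 2, 3])
```

Storing groups as frozensets matches the documented set(frozenset) return type of getAtomGroups and makes adding the same group twice a no-op.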

class schrodinger.application.matsci.species.MoleculeSpeciesFinder(signature, split_unique=True)

Bases: object

Class to find unique molecule species in structures.

__init__(signature, split_unique=True)

Constructs a new instance of MoleculeSpeciesFinder

Parameters
  • signature (BaseSignature) – The signature class to generate name and key for the species

  • split_unique (bool) – Whether to split using composition before splitting using signature key. Note this is skipped when finding species in multiple structures.

getKeyNamePrefix(struct)

Gets the prefix for the key and name generated from the signature

Parameters

struct (structure.Structure) – The sub structure used to generate key and name

Returns

The prefix for key and name generated by the signature

Return type

str

iterAtomCollections(struct)

Generate atom collections of molecules for the passed structure

Parameters

struct (schrodinger.Structure) – The structure to be divided into molecules

Returns

Atom collections for the passed structure that divides the structure into list of molecules

Return type

iterable(structure.Structure._AtomCollection)

getSubStAndAids(struct)

Gets the extracted sub structures and their associated atom indices

Parameters

struct (structure.Structure) – The structure

getAtomGroupKeyName(sub_st, is_cg, unique, typo_type)

Gets the substructure key and name from the signature

Parameters
  • sub_st (structure.Structure) – The sub structure for current group of atoms

  • is_cg (bool) – Whether the substructure is cg

  • unique (bool) – Whether the substructure is unique in composition

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

Return type

tuple(str, str)

Returns

Tuple with two items. First item is the signature key associated with the substructure and the second item is the signature name for the substructure

addAtomGroup(sub_st_info, is_cg, unique=False, typo_type=None)

Add a group of atoms to the species

Parameters
  • sub_st_info (SUB_STRUCT_INFO) – Namedtuple containing information regarding the group of atoms to be added

  • is_cg (bool) – Whether the substructure is cg

  • unique (bool) – Whether the substructure is unique in composition

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

getSampleAtomAndStructTag(species)

Gets the sample atom and structure tag for the passed species

Parameters

species (SpeciesData) – The species to get the structure tag and sample atom from

Returns

Tuple containing two items. First is the sample atom and second is the structure tag for the sample structure.

Return type

tuple(structure.Structure._StructureAtom, str)

getExampleTag(species)

Get the example tag for the passed species

Parameters

species (SpeciesData) – The species to get the example tag from

Returns

The example tag for the passed species

Return type

str

setDisplayNames(all_species)

Sets the display names for the list of species such that they are unique

Parameters

all_species (list(SpeciesData)) – All species

findSpeciesInStruct(struct, split_unique, typo_type)

Finds species in the passed structure

Parameters
  • struct (structure.Structure) – The structure to find species in

  • split_unique (bool) – Whether to split the structure first based on its composition

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

getSpecies(structs, typo_type=None)

Gets the species for the passed structures

Parameters
  • structs (list(structure.Structure)) – The list of structures to find species in

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None, no display name is set for the species.

Returns

The list of species

Return type

list(SpeciesData)

addSpeciesAsProps(all_species)

Add Species name as an atom property in its structure

Parameters

all_species (list(SpeciesData)) – List of species

class schrodinger.application.matsci.species.ResidueSpeciesFinder(signature, split_unique=True)

Bases: schrodinger.application.matsci.species.MoleculeSpeciesFinder

Class to find unique residue species in structures.

getKeyNamePrefix(struct)

Gets the prefix for the key and name generated from the signature

Parameters

struct (structure.Structure) – The sub structure used to generate key and name

Returns

The prefix for key and name generated by the signature

Return type

str

iterAtomCollections(struct)

Generate atom collections of residues for the passed structure

Parameters

struct (schrodinger.Structure) – The structure to be divided into residues

Returns

Atom collections for the passed structure that divides the structure into list of residues

Return type

iterable(structure.Structure._AtomCollection)

getExampleTag(species)

Get the example tag for the passed species

Parameters

species (SpeciesData) – The species to get the example tag from

Returns

The example tag for the passed species

Return type

str

__init__(signature, split_unique=True)

Constructs a new instance of MoleculeSpeciesFinder

Parameters
  • signature (BaseSignature) – The signature class to generate name and key for the species

  • split_unique (bool) – Whether to split using composition before splitting using signature key. Note this is skipped when finding species in multiple structures.

addAtomGroup(sub_st_info, is_cg, unique=False, typo_type=None)

Add a group of atoms to the species

Parameters
  • sub_st_info (SUB_STRUCT_INFO) – Namedtuple containing information regarding the group of atoms to be added

  • is_cg (bool) – Whether the substructure is cg

  • unique (bool) – Whether the substructure is unique in composition

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

addSpeciesAsProps(all_species)

Add Species name as an atom property in its structure

Parameters

all_species (list(SpeciesData)) – List of species

findSpeciesInStruct(struct, split_unique, typo_type)

Finds species in the passed structure

Parameters
  • struct (structure.Structure) – The structure to find species in

  • split_unique (bool) – Whether to split the structure first based on its composition

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

getAtomGroupKeyName(sub_st, is_cg, unique, typo_type)

Gets the substructure key and name from the signature

Parameters
  • sub_st (structure.Structure) – The sub structure for current group of atoms

  • is_cg (bool) – Whether the substructure is cg

  • unique (bool) – Whether the substructure is unique in composition

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None no formatting will be done.

Return type

tuple(str, str)

Returns

Tuple with two items. First item is the signature key associated with the substructure and the second item is the signature name for the substructure

getSampleAtomAndStructTag(species)

Gets the sample atom and structure tag for the passed species

Parameters

species (SpeciesData) – The species to get the structure tag and sample atom from

Returns

Tuple containing two items. First is the sample atom and second is the structure tag for the sample structure.

Return type

tuple(structure.Structure._StructureAtom, str)

getSpecies(structs, typo_type=None)

Gets the species for the passed structures

Parameters
  • structs (list(structure.Structure)) – The list of structures to find species in

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None, no display name is set for the species.

Returns

The list of species

Return type

list(SpeciesData)

getSubStAndAids(struct)

Gets the extracted sub structures and their associated atom indices

Parameters

struct (structure.Structure) – The structure

setDisplayNames(all_species)

Sets the display names for the list of species such that they are unique

Parameters

all_species (list(SpeciesData)) – All species

class schrodinger.application.matsci.species.SpeciesCollectionTypes(value)

Bases: enum.Enum

Enumeration for various species collection types

Variables
  • formula (SC_TYPE_DATA) – Molecules grouped using their molecular formula

  • mol (SC_TYPE_DATA) – Molecules grouped using their stereochemically unaware SMILES

  • chiral_mol (SC_TYPE_DATA) – Molecules grouped using their stereochemically aware SMILES

  • poly_mol (SC_TYPE_DATA) – Molecules grouped using their polymer name. If polymer name is not available then stereochemically unaware SMILES is used.

  • poly_chiral_mol (SC_TYPE_DATA) – Molecules grouped using their polymer name. If polymer name is not available then stereochemically aware SMILES is used.

  • formula_res (SC_TYPE_DATA) – Residues grouped using their molecular formula

  • res (SC_TYPE_DATA) – Residues grouped using their stereochemically unaware SMILES

  • chiral_res (SC_TYPE_DATA) – Residues grouped using their stereochemically aware SMILES

formula = SC_TYPE_DATA(menus=['Molecule', 'Formula'], finder=<class 'schrodinger.application.matsci.species.MoleculeSpeciesFinder'>, signature=<class 'schrodinger.application.matsci.species.MolecularFormulaSignature'>, split_unique=False)
mol = SC_TYPE_DATA(menus=['Molecule', 'SMILES'], finder=<class 'schrodinger.application.matsci.species.MoleculeSpeciesFinder'>, signature=functools.partial(<class 'schrodinger.application.matsci.species.SmilesSignature'>, False), split_unique=True)
chiral_mol = SC_TYPE_DATA(menus=['Molecule', 'Chiral SMILES'], finder=<class 'schrodinger.application.matsci.species.MoleculeSpeciesFinder'>, signature=functools.partial(<class 'schrodinger.application.matsci.species.SmilesSignature'>, True), split_unique=True)
poly_mol = SC_TYPE_DATA(menus=['Polymer', 'Name and SMILES'], finder=<class 'schrodinger.application.matsci.species.MoleculeSpeciesFinder'>, signature=functools.partial(<class 'schrodinger.application.matsci.species.PolymerSmilesSignature'>, False), split_unique=True)
poly_chiral_mol = SC_TYPE_DATA(menus=['Polymer', 'Name and Chiral SMILES'], finder=<class 'schrodinger.application.matsci.species.MoleculeSpeciesFinder'>, signature=functools.partial(<class 'schrodinger.application.matsci.species.PolymerSmilesSignature'>, True), split_unique=True)
formula_res = SC_TYPE_DATA(menus=['Residue', 'Formula'], finder=<class 'schrodinger.application.matsci.species.ResidueSpeciesFinder'>, signature=<class 'schrodinger.application.matsci.species.MolecularFormulaSignature'>, split_unique=False)
res = SC_TYPE_DATA(menus=['Residue', 'SMILES'], finder=<class 'schrodinger.application.matsci.species.ResidueSpeciesFinder'>, signature=functools.partial(<class 'schrodinger.application.matsci.species.SmilesSignature'>, False), split_unique=False)
chiral_res = SC_TYPE_DATA(menus=['Residue', 'Chiral SMILES'], finder=<class 'schrodinger.application.matsci.species.ResidueSpeciesFinder'>, signature=functools.partial(<class 'schrodinger.application.matsci.species.SmilesSignature'>, True), split_unique=False)
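The member values above pair a finder class with a signature factory, where functools.partial pre-binds the stereo_allowed argument. The sketch below reproduces that pattern with local stand-ins to show how a type name could be turned into a configured finder. All names in it (TypeData, Finder, Smiles, CollectionTypes, build_finder) are hypothetical, and how the module itself consumes these fields is an assumption.

```python
import enum
import functools
from collections import namedtuple

# Stand-in with the same field names as the documented SC_TYPE_DATA.
TypeData = namedtuple(
    "TypeData", ["menus", "finder", "signature", "split_unique"])


class Smiles:
    """Stand-in signature class that only records its stereo setting."""

    def __init__(self, stereo_allowed=True):
        self.stereo_allowed = stereo_allowed


class Finder:
    """Stand-in finder that stores its signature and split flag."""

    def __init__(self, signature, split_unique=True):
        self.signature = signature
        self.split_unique = split_unique


class CollectionTypes(enum.Enum):
    mol = TypeData(["Molecule", "SMILES"], Finder,
                   functools.partial(Smiles, False), True)
    chiral_mol = TypeData(["Molecule", "Chiral SMILES"], Finder,
                          functools.partial(Smiles, True), True)


def build_finder(type_name):
    """Look up a member by name and build its finder from the stored data."""
    data = CollectionTypes[type_name].value
    # Calling the stored factory creates the signature instance.
    return data.finder(data.signature(), split_unique=data.split_unique)


finder = build_finder("mol")
```

Keeping a factory rather than an instance in each member means every lookup can build a fresh signature object with the right stereo setting.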
class schrodinger.application.matsci.species.SpeciesCollection(sc_type, typo_type=None)

Bases: object

Class to find and interact with species in a structure.

__init__(sc_type, typo_type=None)

Constructs a new instance SpeciesCollection

Parameters
  • sc_type (str) – Name of the SpeciesCollectionTypes

  • typo_type (TYPO_TYPE or None) – The typography type style used for formatting text. If None, no display name is set for the species.

__len__()

Get the number of species in the current species collection

Return type

int

Returns

The number of species in the collection

loadSpecies(structs)

Load species in the current species collection

Parameters

structs – The structures to find species in

getSpeciesFromAtomIndex(atom_idx, struct)

Get the species data to which the atom index belongs

Parameters
  • atom_idx (int) – The atom index

  • struct – The structure to which the atom index belongs

Returns

The species data to which the atom index belongs.

Return type

SpeciesData

Raises

KeyError – If the atom index is not found in any of the species.

getSpeciesCount()

Get the species count for each species

Return type

dict

Returns

Dictionary containing the species name and count