schrodinger.application.matsci.smartsutils module

Utilities for working with SMARTS patterns

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.application.matsci.smartsutils.SMARTSGroupData(number, name, pattern, indexes)

Bases: tuple

indexes

Alias for field number 3

name

Alias for field number 1

number

Alias for field number 0

pattern

Alias for field number 2
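
The field layout above can be illustrated with a stand-in built from collections.namedtuple. This is only a sketch, assuming the real class behaves like an ordinary named tuple; the group name, pattern, and indexes used here are made-up values.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented class. This is NOT
# the library class, just an ordinary named tuple used for illustration.
SMARTSGroupData = namedtuple('SMARTSGroupData',
                             ['number', 'name', 'pattern', 'indexes'])

# Hypothetical values for illustration only.
group = SMARTSGroupData(number=1, name='amide', pattern='C(=O)N',
                        indexes=[4, 5, 6])

# Each field is reachable by name or by its position in the tuple.
assert group.number == group[0]
assert group.name == group[1]
assert group.pattern == group[2]
assert group.indexes == group[3]
```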

schrodinger.application.matsci.smartsutils.defines_integer_atom_props(st)

Return True if the given structure has SMARTS-related integer atom properties defined. These properties were originally used when each atom was forced to belong to a single SMARTS group. That condition has been relaxed but these properties are kept for backwards compatibility.

Parameters:

st (schrodinger.structure.Structure) – the structure

Return type:

bool

Returns:

True if such properties are defined

schrodinger.application.matsci.smartsutils.get_group_names(atom)

Return a list of group names for the given atom.

Parameters:

atom (schrodinger.structure._StructureAtom) – the atom

Return type:

list

Returns:

group names

schrodinger.application.matsci.smartsutils.get_group_atom_indices(atom)

Return a list of group atom indices for the given atom.

Parameters:

atom (schrodinger.structure._StructureAtom) – the atom

Return type:

list

Returns:

group atom indices

schrodinger.application.matsci.smartsutils.get_group_numbers(atom)

Return a list of group numbers for the given atom.

Parameters:

atom (schrodinger.structure._StructureAtom) – the atom

Return type:

list

Returns:

group numbers

schrodinger.application.matsci.smartsutils.append_property(atom, key, value)

Append the given property to the atom.

Parameters:
  • atom (schrodinger.structure._StructureAtom) – the atom

  • key (str) – the property key

  • value (str) – the property value

schrodinger.application.matsci.smartsutils.validate_name(name)

Check that name contains only valid characters.

Parameters:

name (str) – The string to check

Return type:

bool

Returns:

True if name has no invalid characters, False if any characters are invalid

exception schrodinger.application.matsci.smartsutils.SMARTSGroupError

Bases: Exception

Class for exceptions related to SMARTS group finding

schrodinger.application.matsci.smartsutils.delete_group_properties(struct)

Delete all SMARTS group properties (structure and atom) from the structure

Parameters:

struct (schrodinger.structure.Structure) – The structure to delete properties from

schrodinger.application.matsci.smartsutils.find_group_data(struct)

Find any SMARTS group data on the structure.

Parameters:

struct (schrodinger.structure.Structure) – The structure to find groups on

Return type:

dict

Returns:

A dictionary. Keys are SMARTS group numbers; each value is the SMARTSGroupData named tuple for the SMARTS group with that number.

Raises:

SMARTSGroupError – If something in the data is not consistent

schrodinger.application.matsci.smartsutils.get_rdkit_atoms(smarts)

Return a collection of rdkit atoms for the given SMARTS. The length of the return value is the number of atoms in a single potential match: 2 for ‘cc’, 2 for ‘[$([NH]([CH2])[CH2])]C’, 2 for ‘[n-0X2].[n-0X2]’, and so on. This length does not depend on how many matches the pattern would produce if it were matched against a structure.

Parameters:

smarts (str) – the SMARTS pattern

Raises:

RuntimeError – if rdkit has a problem with the SMARTS

Return type:

rdkit.Chem.rdchem._ROAtomSeq

Returns:

the rdkit atoms

schrodinger.application.matsci.smartsutils.is_smarts_bonding_pair(smarts)

Return True if the given SMARTS would match a bonding pair, False otherwise.

Parameters:

smarts (str) – the SMARTS pattern

Return type:

bool

Returns:

True if the SMARTS would match a bonding pair, False otherwise

class schrodinger.application.matsci.smartsutils.SMARTSGroup(name, pattern, logger=None)

Bases: object

Handles matching and record-keeping for a SMARTS pattern

__init__(name, pattern, logger=None)

Create a SMARTSGroup object

Parameters:
  • name (str) – The name of this SMARTS group

  • pattern (str) – The SMARTS pattern for this group

Raises:
  • ValueError – If name has invalid characters

  • ValueError – If the SMARTS is invalid

nextNumber(numbers_used)

Get the next unused group number

Parameters:

numbers_used (set) – Each member is a number that has already been used for a group and is unavailable. The number returned by this function is added to the numbers_used set.

Return type:

int

Returns:

The lowest available number. This number will have been added to the numbers_used set.
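
The contract described above (return the lowest free number and record it in the caller's set) can be sketched as a standalone function. The name next_number is hypothetical, and starting the numbering at 1 is an assumption; the source does not state the first group number.

```python
def next_number(numbers_used):
    """Return the lowest unused number and add it to numbers_used."""
    number = 1  # ASSUMPTION: numbering starts at 1 (not stated in the docs)
    while number in numbers_used:
        number += 1
    # The set is modified in place, so the caller sees the new number as used.
    numbers_used.add(number)
    return number


used = {1, 2, 4}
first = next_number(used)   # fills the gap left between 2 and 4
second = next_number(used)  # every number up to 4 is now taken
```

Because the set is updated in place, repeated calls never hand out the same number twice.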

prioritizeBackbone(matches, backbone_atoms)

Prioritize matches that are in the backbone of the molecule

Parameters:
  • matches (list) – List of lists containing the SMARTS pattern matches

  • backbone_atoms (dict) – Dictionary that maps each molecule number to an ordered list of backbone atom indices

Returns:

List of lists containing the SMARTS pattern matches, with the matches in the backbone appearing first

Return type:

list
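
One way to picture this reordering is a stable partition of the match list. The sketch below is not the library implementation; the function name is hypothetical, and it assumes a match counts as a backbone match only when all of its atoms are backbone atoms.

```python
def prioritize_backbone(matches, backbone_atoms):
    """Return matches with backbone matches first, preserving relative order."""
    # Flatten the per-molecule backbone lists into one lookup set.
    backbone = {idx for indices in backbone_atoms.values() for idx in indices}

    def in_backbone(match):
        # ASSUMPTION: every atom of the match must be a backbone atom.
        return all(idx in backbone for idx in match)

    first = [match for match in matches if in_backbone(match)]
    rest = [match for match in matches if not in_backbone(match)]
    return first + rest


# Hypothetical data: molecule 1 has backbone atoms 1-4.
matches = [[7, 8], [1, 2], [9, 10], [2, 3]]
ordered = prioritize_backbone(matches, {1: [1, 2, 3, 4]})
```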

prioritizeSameMonomers(struct, matches)

Prioritize matches whose atoms belong to a single monomer or, failing that, to the fewest unique monomers.

Parameters:
  • struct (schrodinger.structure.Structure) – The structure in which to find SMARTS pattern matches

  • matches (list) – List of lists containing the SMARTS pattern matches

Returns:

List of lists containing the SMARTS pattern matches, with the matches that belong to the same (or the fewest) unique monomers appearing first

Return type:

list
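
The ordering can be sketched as a stable sort keyed on the number of distinct monomers each match touches. This is an illustration only: the real method reads monomer membership from the structure, whereas the monomer_of mapping (atom index to monomer label) and the function name below are hypothetical stand-ins.

```python
def prioritize_same_monomers(matches, monomer_of):
    """Sort matches so those spanning the fewest unique monomers come first."""
    # sorted() is stable, so matches with equal counts keep their input order.
    return sorted(matches,
                  key=lambda match: len({monomer_of[idx] for idx in match}))


# Hypothetical atom index -> monomer label mapping.
monomer_of = {1: 'A', 2: 'A', 3: 'B', 4: 'B', 5: 'C'}
matches = [[2, 3], [1, 2], [3, 4, 5], [3, 4]]
ordered = prioritize_same_monomers(matches, monomer_of)
```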

orderedMatches(struct, backbone_atoms)

Evaluate the SMARTS pattern matches, ordered to follow the network sequence. Matches on backbone atoms come first in the sequence; side chain matches follow, ordered by atom index.

Parameters:
  • struct (schrodinger.structure.Structure) – The structure in which to find SMARTS pattern matches

  • backbone_atoms (dict) – Dictionary that maps each molecule number to an ordered list of backbone atom indices

Returns:

List of lists containing the SMARTS pattern matches

Return type:

list(list)

getSmartsDictData(struct, match, allow_partial_overlap)

Get the SMARTS dictionary data for the given match

Parameters:
  • struct (schrodinger.structure.Structure) – The structure to which the SMARTS match belongs

  • match (list) – The list of atom indices that belong to the SMARTS match

  • allow_partial_overlap (bool) – Whether partial overlap is allowed

Returns:

Dictionary where the key is a SMARTSGroupData and the value is the list of atom indices that belong to the group

Return type:

dict

Raises:

ValueError – If overlapping is not supported due to backwards incompatibility

isAlreadyMatched(match, smarts_group_matches)

Check if atoms have already been matched to the same SMARTS pattern. This allows creating a group from some atoms already matched to one group and other atoms already matched to a different group, but prevents creating a group that is a sub-group of another group.

Parameters:
  • match (list) – The list of atom indices that belong to the SMARTS match

  • smarts_group_matches (dict) – Dictionary where the key is a unique SMARTS pattern and the value is the list of atom indices that belong to the group

Return type:

bool

Returns:

Whether the match has already been matched to the same SMARTS
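
The sub-group rule can be sketched with set containment. This is a simplified illustration, not the library code: it assumes the earlier matches for the same pattern are available as a plain list of atom index lists (the hypothetical previous_matches argument) instead of the dictionary described above.

```python
def is_already_matched(match, previous_matches):
    """Return True if match lies entirely inside one earlier group."""
    candidate = set(match)
    # A match is rejected only when a single earlier group contains all of
    # its atoms. Atoms drawn from several different groups are allowed.
    return any(candidate <= set(previous) for previous in previous_matches)


# Hypothetical earlier matches for the same SMARTS pattern.
previous_matches = [[1, 2, 3], [4, 5, 6]]
sub_group = is_already_matched([2, 3], previous_matches)     # inside group 1
spans_groups = is_already_matched([3, 4], previous_matches)  # one atom from each
new_atoms = is_already_matched([7, 8], previous_matches)     # nothing matched yet
```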

onlyDifferByAHydrogen(struct, match, smarts_group_matches)

Check whether all atoms except a single H atom have already been matched to the same SMARTS group. This prevents matching a terminal methyl 3 times, as would otherwise happen for a polymer head or tail monomer with a terminating hydrogen.

Parameters:
  • struct (schrodinger.structure.Structure) – The structure to which the SMARTS match belongs

  • match (list) – The list of atom indices that belong to the SMARTS match

  • smarts_group_matches (dict) – Dictionary where the key is a unique SMARTS pattern and the value is the list of atom indices that belong to the group

Return type:

bool

Returns:

Whether all atoms except a single H atom have already been matched

isOverlapAllowedInMatch(struct, match, smarts_group_data_dict)

Determines if overlap is allowed for the current match.

Parameters:
  • struct (schrodinger.structure.Structure) – The structure to which the SMARTS match belongs

  • match (list) – The list of atom indices that belong to the SMARTS match

  • smarts_group_data_dict (dict) – Dictionary where the key is a SMARTSGroupData and the value is the list of atom indices that belong to the group

Return type:

bool

Returns:

Whether overlapping is allowed for the current match

addMatchToAtoms(struct, numbers_used, match)

Add the match information to the atom properties of the structure

Parameters:
  • struct (schrodinger.structure.Structure) – The structure to which the SMARTS match belongs

  • numbers_used (set) – Each member is a number that has already been used for a group and is unavailable. The number assigned to this match is added to the numbers_used set.

  • match (list) – The list of atom indices from the match

log(msg)

Log message if logger is present

Parameters:

msg (str) – The message to log

match(struct, numbers_used, backbone_atoms, allow_partial_overlap=False)

Find all the SMARTS groups matching the SMARTS pattern and mark them with the appropriate properties

Parameters:
  • struct (schrodinger.structure.Structure) – The structure to find SMARTS pattern match in

  • numbers_used (set) – A set of all the group numbers that have been used and are unavailable

  • backbone_atoms (dict) – Dictionary that maps each molecule number to an ordered list of backbone atom indices

  • allow_partial_overlap (bool) – whether atoms can belong to multiple groups

class schrodinger.application.matsci.smartsutils.SMARTSComplexity(smarts)

Bases: object

Class that holds a SMARTS pattern and allows the complexity of SMARTS patterns to be compared

__init__(smarts)

Constructs a new instance of SMARTSComplexity

Parameters:

smarts (str) – The SMARTS pattern