schrodinger.structutils.smilesfilter module

A group of functions and classes for filtering structure files by unique SMILES strings.

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.structutils.smilesfilter.remove_dupes(smiles_generator, struct_iterator, put_unique, put_dupes=None, error_handler=None, reporter=None)

Process structures from the provided structure iterator ‘struct_iterator’ using the ‘smiles_generator’ to compute unique SMILES strings. Unique structures are passed to the ‘put_unique’ callback and duplicates to ‘put_dupes’ (if provided).

Parameters
  • smiles_generator (SmilesGenerator) – The generator used to compute the unique SMILES strings.

  • struct_iterator (iterator returning Structure objects) – Any iterator returning Structure objects (such as a StructureReader) will work.

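  • put_unique, put_dupes (callable) – These functions are called with (structure, SMILES string) arguments: put_unique for each structure whose SMILES string has not been seen before, and put_dupes (if provided) for each duplicate.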

  • error_handler (callable) – A callable that takes (index, structure, exception) and is called when SMILES generation raises a RuntimeError.

  • reporter (FilterReporter) – If provided, it is used to log the filtering results.
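The callback contract described above can be sketched without the Schrodinger libraries. The block below is only an illustration under stated assumptions: FakeStructure, fake_smiles, and dedupe_sketch are hypothetical stand-ins and not the library's implementation, the 1-based index is a guess, and the behavior when no error_handler is given is not specified on this page.

```python
from dataclasses import dataclass


@dataclass
class FakeStructure:
    """Hypothetical stand-in for a Structure: a title plus a canned SMILES."""
    title: str
    smiles: str


def fake_smiles(st):
    """Hypothetical SMILES function that raises RuntimeError on failure."""
    if not st.smiles:
        raise RuntimeError("cannot generate SMILES for %s" % st.title)
    return st.smiles


def dedupe_sketch(to_smiles, struct_iterator, put_unique, put_dupes=None,
                  error_handler=None):
    """Stand-in that mimics the documented callback contract of remove_dupes."""
    seen = set()
    # Starting the index at 1 is an assumption made for this sketch.
    for index, st in enumerate(struct_iterator, start=1):
        try:
            smi = to_smiles(st)
        except RuntimeError as exc:
            # Without a handler this sketch just skips the structure.
            if error_handler is not None:
                error_handler(index, st, exc)
            continue
        if smi not in seen:
            seen.add(smi)
            put_unique(st, smi)
        elif put_dupes is not None:
            put_dupes(st, smi)


unique, dupes, errors = [], [], []
structs = [
    FakeStructure("ethanol", "CCO"),
    FakeStructure("ethanol copy", "CCO"),
    FakeStructure("broken", ""),
    FakeStructure("methane", "C"),
]
dedupe_sketch(
    fake_smiles,
    structs,
    put_unique=lambda st, smi: unique.append((st.title, smi)),
    put_dupes=lambda st, smi: dupes.append((st.title, smi)),
    error_handler=lambda i, st, exc: errors.append((i, st.title, str(exc))),
)
```

With the real module, the same three callbacks would be passed to remove_dupes together with a SmilesGenerator and a structure iterator, following the signature shown above.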

schrodinger.structutils.smilesfilter.add_smiles(smiles_generator, struct_iterator, put_output, error_handler=None)

Calculate SMILES strings with ‘smiles_generator’ for each structure in ‘struct_iterator’.

Parameters

smiles_generator (SmilesGenerator)

struct_iterator (iterator returning Structure objects)

Any iterator returning Structure objects (such as a StructureReader) will work.

put_output (callable)

This function will be called with (structure, SMILES string) arguments when a SMILES string can be calculated.

error_handler (callable)

A callback function that takes (index, structure, exception) and is called when SMILES generation raises a RuntimeError.
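Because put_output only needs to be a callable accepting (structure, SMILES string), any object with a suitable __call__ method should work. The following is a minimal sketch of one possible sink; SmilesCsvSink is a hypothetical helper, and the structures here are SimpleNamespace stand-ins that are assumed to expose a title attribute.

```python
import csv
import io
from types import SimpleNamespace


class SmilesCsvSink:
    """Hypothetical put_output callable: one (title, SMILES) CSV row per call."""

    def __init__(self, stream):
        self._writer = csv.writer(stream, lineterminator="\n")
        self._writer.writerow(["title", "smiles"])
        self.count = 0

    def __call__(self, structure, smiles):
        # A missing title falls back to an empty string in this sketch.
        title = getattr(structure, "title", "")
        self._writer.writerow([title, smiles])
        self.count += 1


buffer = io.StringIO()
sink = SmilesCsvSink(buffer)

# Simulate the calls that add_smiles is documented to make.
sink(SimpleNamespace(title="ethanol"), "CCO")
sink(SimpleNamespace(title="benzene"), "c1ccccc1")

csv_text = buffer.getvalue()
```

An instance of such a class could then be supplied as the put_output argument of add_smiles.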

class schrodinger.structutils.smilesfilter.FilterReporter(logger)

Bases: object

A class to handle reporting of results from the remove_dupes function.

__init__(logger)

Parameters

logger (logging.Logger)

The logger instance that will be used to output messages.
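A standard-library logger is all that the constructor asks for. The sketch below prepares one that writes to an in-memory stream; the logger name and the INFO threshold are assumptions, since this page does not state the level at which FilterReporter emits its messages.

```python
import io
import logging

log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("smilesfilter_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)  # assumed threshold
logger.propagate = False  # keep messages out of the root logger
logger.addHandler(handler)

logger.info("logger ready")
logged_text = log_stream.getvalue()

# With the Schrodinger libraries available, the logger would be passed as:
# reporter = FilterReporter(logger)
```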

logResult(index, orig_index, title, smiles, dupe_index)

Log messages for a given result.

Parameters

index (int)

Index of the current structure.

orig_index (int)

Index of the unique structure of which the current one is a duplicate.

title (str)

Title of the current structure.

smiles (str)

SMILES string of the structure.

dupe_index (int)

Index of the duplicate structure in the saved duplicates file.

summarize(total, unique, duplicates, error_count)

Generate a summary of the filter results.

Parameters

total (int)

Total number of structures filtered.

unique (int)

Number of unique structures found.

duplicates (int)

Number of duplicates found.

error_count (int)

Number of structures generating errors in SMILES conversion.

class schrodinger.structutils.smilesfilter.SmilesErrorHandler(logger, struct_writer=None, message=None)

Bases: object

A class that acts as an error handler for cases where SMILES generation fails. It is used as a callable that takes arguments of (index, struct, exception).

__init__(logger, struct_writer=None, message=None)

Parameters

logger (logging.Logger)

The logger instance that will be used to output messages.

struct_writer (object with an append method)

An instance to which structures that have errors are appended. Any instance with an append method (such as a StructureWriter or list) can be used.

message (str)

The message to use as the logging template. It should have format conversion specifiers for (index, structure title, exception) (types of int, str, exception).
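As an illustration of a message template with the three conversion specifiers, the sketch below formats one by hand. The wording of the template, the structure stand-in, and the sample values are all hypothetical; only the (index, structure title, exception) order comes from the description above.

```python
from types import SimpleNamespace

# %d for the index, %s for the title, %s for the exception.
message = "Structure %d (%s) could not be converted: %s"

struct = SimpleNamespace(title="ligand_7")  # stand-in for a Structure
exc = RuntimeError("unsupported valence")

rendered = message % (12, struct.title, exc)

# A plain list has an append method, so it can collect failed structures.
failed = []
failed.append(struct)
```

With the real class, such a template and list would be passed as SmilesErrorHandler(logger, struct_writer=failed, message=message).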