schrodinger.stepper.core_steps module

Core steps that are generally useful in workflows.

These steps are generic and are often highly optimized using privileged access to stepper framework internals. It is strongly recommended that you neither subclass these steps nor use their implementations as examples of which methods are safe to override in your own steps.

class schrodinger.stepper.core_steps.DeduplicationStep(*args, **kwargs)

Bases: schrodinger.stepper.stepper.ReduceStep

A step that deduplicates its inputs.

Optimized for fast deduplication when dealing with large numbers of inputs from PubSub topics.

Input = <object object>
Output = <object object>
reduceFunction(inputs)

The main computation for this step. This function should take in an iterable of inputs and return an iterable of outputs.

Example:

def reduceFunction(self, words):
    # Find all unique words
    seen_words = set()
    for word in words:
        if word not in seen_words:
            seen_words.add(word)
            yield word

class schrodinger.stepper.core_steps.RandomSampleFilter(*args, **kwargs)

Bases: schrodinger.stepper.stepper.ReduceStep

A filter that takes a random subsample of its inputs. The sample size is set through the step’s n setting.

An implementation of Algorithm R (reservoir sampling), which draws the sample without knowing the size of the sequence in advance. See https://en.wikipedia.org/wiki/Reservoir_sampling
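
To illustrate the idea, here is a minimal standalone sketch of Algorithm R in plain Python. It is not the step's actual implementation: the function name reservoir_sample is hypothetical, and its n and seed arguments merely mirror the names of the step's settings.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], n: int,
                     seed: Optional[int] = None) -> List[T]:
    """Return up to n items chosen uniformly at random from a stream.

    Hypothetical helper: the stream length does not need to be known.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # A private generator, so that a fixed seed gives a repeatable sample.
    rng = random.Random(seed)
    reservoir: List[T] = []
    for index, item in enumerate(items):
        if index < n:
            # The first n items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Keep the new item with probability n / (index + 1) by
            # drawing a slot in [0, index] and replacing only if it
            # falls inside the reservoir.
            slot = rng.randint(0, index)
            if slot < n:
                reservoir[slot] = item
    return reservoir
```

Only the n reservoir slots are held in memory, regardless of how many inputs are seen, which is why this approach suits streams of unknown length.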

Input = <object object>
Output = <object object>
class Settings(*args, _param_type=<object object>, **kwargs)

Bases: schrodinger.models.parameters.CompoundParam

n: int

Base class for all Param classes. A Param is a descriptor for storing data, which means that a single Param instance will manage the data values for multiple instances of the class that owns it. Example:

class Coord(CompoundParam):
    x: int
    y: int

An instance of the Coord class can be created normally, and Params can be accessed as normal attributes:

coord = Coord()
coord.x = 4

When a Param value is set, the valueChanged signal is emitted. Params can be serialized and deserialized to and from JSON. Params can also be nested:

class Atom(CompoundParam):
    coord: Coord
    element: str

seed: int

nChanged

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

types is normally a sequence of individual types. Each type is either a type object or a string that is the name of a C++ type. Alternatively each type could itself be a sequence of types each describing a different overloaded signal. name is the optional C++ name of the signal. If it is not specified then the name of the class attribute that is bound to the signal is used. revision is the optional revision of the signal that is exported to QML. If it is not specified then 0 is used. arguments is the optional sequence of the names of the signal’s arguments.

nReplaced

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

seedChanged

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

seedReplaced

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

validateSettings()

Check whether the step settings are valid and return a list of SettingsError and SettingsWarning to report any invalid settings. Default implementation checks that all stepper files are set to valid file paths.

Return type

list[TaskError or TaskWarning]

reduceFunction(inps)

The main computation for this step. This function should take in an iterable of inputs and return an iterable of outputs.

class schrodinger.stepper.core_steps.DedupeAndRandomSampleFilter(*args, **kwargs)

Bases: schrodinger.stepper.core_steps.DeduplicationStep
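
No description is given for this class. Judging only from its name, its base class, and its n and seed settings, it presumably deduplicates its inputs and then keeps a random sample of n of them. The following standalone sketch shows one way such a combination could work in a single pass. It is an assumption, not the library's implementation, and the function name dedupe_and_sample is hypothetical.

```python
import random
from typing import Hashable, Iterable, List, Optional, Set


def dedupe_and_sample(items: Iterable[Hashable], n: int,
                      seed: Optional[int] = None) -> List[Hashable]:
    """Sample up to n distinct items uniformly from a stream (hypothetical)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    rng = random.Random(seed)  # fixed seed gives a repeatable sample
    seen: Set[Hashable] = set()
    reservoir: List[Hashable] = []
    unique_count = 0  # number of distinct items processed so far
    for item in items:
        if item in seen:
            # Duplicates are dropped before they can affect the sample.
            continue
        seen.add(item)
        if unique_count < n:
            reservoir.append(item)
        else:
            # Reservoir sampling over the distinct items only.
            slot = rng.randint(0, unique_count)
            if slot < n:
                reservoir[slot] = item
        unique_count += 1
    return reservoir
```

Deduplicating before sampling means each distinct input has the same chance of being selected, however many times it appears in the stream.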

Input = <object object>
Output = <object object>
class Settings(*args, _param_type=<object object>, **kwargs)

Bases: schrodinger.models.parameters.CompoundParam

n: int

seed: int

nChanged

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

nReplaced

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

seedChanged

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

seedReplaced

pyqtSignal(*types, name: str = …, revision: int = …, arguments: Sequence = …) -> PYQT_SIGNAL

__init__(*args, **kwargs)

See class docstring for info on the different constructor arguments.

validateSettings()

Check whether the step settings are valid and return a list of SettingsError and SettingsWarning to report any invalid settings. Default implementation checks that all stepper files are set to valid file paths.

Return type

list[TaskError or TaskWarning]

reduceFunction(inps)

The main computation for this step. This function should take in an iterable of inputs and return an iterable of outputs.
