schrodinger.tasks.beamtasks module

class schrodinger.tasks.beamtasks.BeamTask(*args, _param_type=<object object>, **kwargs)

Bases: MapTaskMixin, ComboJobTask

A base class for tasks that use Apache Beam to process input items in a pipeline. To create a BeamTask:

  • Specify the ItemInputClass that defines the type of input items.

  • Define a Settings CompoundParam (if the pipeline requires any settings).

  • Override the definePipeline method to define the Beam pipeline.

For an example, see scripts/examples/tasks/hello_beamtask.py.

__init__(*args, **kwargs)

combineFunction(settings)

combineFunction gets inputs via _generateInputItems, processes them using the settings, and calls _handleOutputItem for each output item.

This method should be implemented by subclasses.
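The contract described above can be sketched in plain Python. This is an illustration only, not the real ComboJobTask: the class name ComboSketch, the dictionary-based settings, and the signatures of _generateInputItems and _handleOutputItem are assumptions made for the example.

```python
class ComboSketch:
    """Plain-Python stand-in for a task; not the real ComboJobTask."""

    def __init__(self, items):
        self._items = items
        self.outputs = []

    def _generateInputItems(self):
        # Assumed behavior: yield the input items one at a time.
        yield from self._items

    def _handleOutputItem(self, item):
        # Assumed behavior: record a single output item.
        self.outputs.append(item)

    def combineFunction(self, settings):
        # Pull each input, process it with the settings, and hand off
        # every result individually.
        for item in self._generateInputItems():
            self._handleOutputItem(item * settings["factor"])


task = ComboSketch([1, 2, 3])
task.combineFunction({"factor": 10})  # hypothetical settings object
print(task.outputs)
```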

static definePipeline(InputTransform, OutputTransform, settings, options)

Override this static method to define the Beam pipeline. This method should contain the following block:

with beam.Pipeline(runner=SeamRunner(), options=options) as p:
    (p
     | InputTransform()
     # Additional transforms to define the workflow. Use settings
     # as needed to configure transforms
     | OutputTransform()
     )

Parameters:
  • InputTransform – A PTransform to create the input PCollection.

  • OutputTransform – A PTransform to save the output PCollection.

  • settings – The workflow settings, defined by the Settings class.

  • options – The pipeline options to use for the Beam pipeline. Pass these options to the beam.Pipeline constructor with options=options.
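The order of stages that definePipeline sets up can be illustrated without Beam or the Schrodinger runtime. The sketch below is a plain-Python stand-in, not the real API: define_pipeline, fake_input, fake_output, and the suffix setting are hypothetical names, ordinary function calls replace Beam's | operator, and the options argument is omitted. It shows only the flow from the input transform, through a settings-configured step, to the output transform.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical setting used to configure the middle step.
    suffix: str = "!"


def define_pipeline(input_transform, output_transform, settings):
    """Stand-in for definePipeline: read, process, then save."""
    items = input_transform()  # plays the role of InputTransform()
    # Settings-driven processing step between input and output.
    processed = (item + settings.suffix for item in items)
    return output_transform(processed)  # plays the role of OutputTransform()


def fake_input():
    # Hypothetical source of input items.
    yield from ["a", "b", "c"]


def fake_output(items):
    # Hypothetical sink; here it simply materializes the items.
    return list(items)


result = define_pipeline(fake_input, fake_output, Settings(suffix="?"))
print(result)
```

In a real subclass the same three stages would be chained inside the with beam.Pipeline(...) block shown earlier.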

processOutputs(output_list: list)

Override this method to do output processing in the launch process. The output_list will contain the items from the PCollection returned by the definePipeline method.

This is a convenience API to allow for basic output processing in the launch process. However, using this API causes all output to be collected in memory.

This feature also relies on a global output_list variable. Running multiple BeamTasks in the same process may cause unexpected behavior.

For better robustness and lower memory usage, move output processing and writing into the Beam pipeline itself.
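The memory trade-off can be seen in a small plain-Python comparison. This sketch does not use Beam or BeamTask; produce_items, collect_then_process, and write_as_you_go are hypothetical helpers that only contrast the two styles.

```python
import io


def produce_items(n):
    """Hypothetical stand-in for the items a pipeline would emit."""
    for i in range(n):
        yield f"item-{i}"


def collect_then_process(items):
    # Mirrors the processOutputs style: every item is held in one list
    # before any processing happens.
    output_list = list(items)
    return len(output_list)


def write_as_you_go(items, sink):
    # Mirrors in-pipeline writing: each item is written and then dropped,
    # so only one item is held at a time.
    count = 0
    for item in items:
        sink.write(item + "\n")
        count += 1
    return count


sink = io.StringIO()
written = write_as_you_go(produce_items(3), sink)
collected = collect_then_process(produce_items(3))
print(written, collected)
```

Both functions see the same items, but only the first one needs memory proportional to the total output size.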

class Input(*args, _param_type=<object object>, **kwargs)

Bases: Settings

Variables:
  • input_items – A list of input items.

  • input_file – A file containing input items.

input_file: TaskFile

A parameter of the class.

input_fileChanged

A pyqtSignal emitted by instances of the class.

input_fileReplaced

A pyqtSignal emitted by instances of the class.

input_items: list[NotImplemented]

A parameter of the class.

input_itemsChanged

A pyqtSignal emitted by instances of the class.

input_itemsReplaced

A pyqtSignal emitted by instances of the class.

class Output(*args, _param_type=<object object>, **kwargs)

Bases: CompoundParam

output_items: list[NotImplemented]

A parameter of the class.

output_itemsChanged

A pyqtSignal emitted by instances of the class.

output_itemsReplaced

A pyqtSignal emitted by instances of the class.

calling_contextChanged

A pyqtSignal emitted by instances of the class.

calling_contextReplaced

A pyqtSignal emitted by instances of the class.

failure_infoChanged

A pyqtSignal emitted by instances of the class.

failure_infoReplaced

A pyqtSignal emitted by instances of the class.

input: parameters.CompoundParam

A parameter of the class.

inputChanged

A pyqtSignal emitted by instances of the class.

inputReplaced

A pyqtSignal emitted by instances of the class.

job_configChanged

A pyqtSignal emitted by instances of the class.

job_configReplaced

A pyqtSignal emitted by instances of the class.

max_progressChanged

A pyqtSignal emitted by instances of the class.

max_progressReplaced

A pyqtSignal emitted by instances of the class.

nameChanged

A pyqtSignal emitted by instances of the class.

nameReplaced

A pyqtSignal emitted by instances of the class.

output: parameters.CompoundParam

A parameter of the class.

outputChanged

A pyqtSignal emitted by instances of the class.

outputReplaced

A pyqtSignal emitted by instances of the class.

progressChanged

A pyqtSignal emitted by instances of the class.

progressReplaced

A pyqtSignal emitted by instances of the class.

progress_stringChanged

A pyqtSignal emitted by instances of the class.

progress_stringReplaced

A pyqtSignal emitted by instances of the class.

statusChanged

A pyqtSignal emitted by instances of the class.

statusReplaced

A pyqtSignal emitted by instances of the class.