schrodinger.seam.transforms.core module¶
- schrodinger.seam.transforms.core.GlobalDoFn(do_fn_cls: type)¶
A decorator that wraps a DoFn class to make it a global DoFn.
The only functional difference is that the finish_bundle method of the wrapped DoFn can yield values directly as opposed to having to include them with a window.
Example usage:
>>> import apache_beam as beam >>> from schrodinger.seam.transforms import core >>> >>> @core.GlobalDoFn ... class Batcher(beam.DoFn): ... ... def __init__(self): ... self._batch = [] ... ... def start_bundle(self): ... self._batch.clear() ... ... def process(self, element): ... self._batch.append(element) ... ... def finish_bundle(self): ... # With a normal DoFn, this would have to be ... # yield windowed_value.WindowedValue(self._batch, ... # MIN_TIMESTAMP, ... # [GlobalWindow()]) ... yield self._batch ... self._batch.clear() ...
- class schrodinger.seam.transforms.core.GroupIntoBatches(batch_size: int)¶
Bases:
apache_beam.transforms.ptransform.PTransform
PTransform that batches the input into desired batch size after grouping by key.
Identical behavior to Beam’s GroupIntoBatches but doesn’t use Beam timer’s (which are unsupported by the SeamRunner).
- __init__(batch_size: int)¶
- expand(pcoll)¶