schrodinger.seam.transforms.combiners module

class schrodinger.seam.transforms.combiners.Smallest

Bases: object

class Globally(*, margin: float, keyfunc: Optional[Callable[[schrodinger.seam.transforms.combiners.T], float]] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that reduces the input PCollection to just its smallest element(s).

margin is a required keyword argument. Every element whose compared value lies within margin of the smallest compared value is returned.

If keyfunc is specified, it is called on each element to extract the value to compare. If it is not supplied, the elements themselves are compared and should be numeric (float or int).

Example usages:

>>> import apache_beam as beam
>>> from schrodinger.seam.transforms.combiners import Smallest
>>> with beam.Pipeline() as p:
...     smallest = (p | beam.Create([1, 2, 3])
...        | Smallest.Globally(margin=1))
>>> # smallest will contain [1, 2]
>>> with beam.Pipeline() as p:
...     smallest = (p | beam.Create([1, 2, 3])
...        | Smallest.Globally(margin=1, keyfunc=lambda x: -x))
>>> # smallest will contain [2, 3]
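
The margin and keyfunc semantics can be illustrated without a Beam pipeline. The following is a minimal pure-Python sketch, not the library's implementation: the helper name smallest_within_margin is hypothetical, the inclusive comparison (<=) is inferred from the first example, where margin=1 keeps both 1 and 2, and sorting the result by compared value is an assumption made only for readable output.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def smallest_within_margin(
    elements: Iterable[T],
    *,
    margin: float,
    keyfunc: Optional[Callable[[T], float]] = None,
) -> List[T]:
    """Hypothetical helper: keep elements whose compared value is within
    margin of the smallest compared value."""
    items = list(elements)
    if not items:
        return []
    # Without a keyfunc, the elements themselves are compared.
    key = keyfunc if keyfunc is not None else (lambda x: x)
    lowest = min(key(item) for item in items)
    # Assumption: the margin bound is inclusive, as the examples suggest.
    kept = (item for item in items if key(item) - lowest <= margin)
    # Sorted by compared value purely for deterministic output.
    return sorted(kept, key=key)


print(smallest_within_margin([1, 2, 3], margin=1))
print(smallest_within_margin([1, 2, 3], margin=1, keyfunc=lambda x: -x))
```

With keyfunc=lambda x: -x the compared values are -1, -2 and -3, so the "smallest" element is 3 and the only other element within the margin is 2.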
__init__(*, margin: float, keyfunc: Optional[Callable[[schrodinger.seam.transforms.combiners.T], float]] = None)
expand(pcoll)
default_label() → str
class PerKey(*, margin: float, keyfunc: Optional[Callable[[schrodinger.seam.transforms.combiners.V], float]] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns the smallest elements for each key of a PCollection of (key, value) pairs.

margin is a required keyword argument. For each key, every value whose compared value lies within margin of that key's smallest compared value is returned.

If keyfunc is specified, it is called on each value to extract the value to compare. If it is not supplied, the values themselves are compared and should be numeric (float or int).

Example usages:

>>> import apache_beam as beam
>>> from schrodinger.seam.transforms.combiners import Smallest
>>> with beam.Pipeline() as p:
...     smallest = (p | beam.Create([("a", 1), ("a", 2), ("b", 3), ("b", 5)])
...        | Smallest.PerKey(margin=1))
>>> # smallest will contain [("a", [1, 2]), ("b", [3])]
>>> with beam.Pipeline() as p:
...     smallest = (p | beam.Create([("a", 1), ("a", 2), ("b", 3), ("b", 5)])
...        | Smallest.PerKey(margin=1, keyfunc=lambda x: -x))
>>> # smallest will contain [("a", [2, 1]), ("b", [5])]
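
A keyfunc is most useful when the values are not plain numbers. The sketch below mimics the per-key behaviour in pure Python for values that are (name, score) records. It is an illustration under stated assumptions rather than the library's implementation: the helper smallest_per_key, the record layout, and the sample data are all hypothetical, and the margin bound is assumed to be inclusive.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple, TypeVar

V = TypeVar("V")


def smallest_per_key(
    pairs: Iterable[Tuple[Hashable, V]],
    *,
    margin: float,
    keyfunc: Optional[Callable[[V], float]] = None,
) -> Dict[Hashable, List[V]]:
    """Hypothetical helper: group (key, value) pairs by key, then keep the
    values within margin of each key's smallest compared value."""
    grouped: Dict[Hashable, List[V]] = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)

    key = keyfunc if keyfunc is not None else (lambda v: v)
    result: Dict[Hashable, List[V]] = {}
    for k, values in grouped.items():
        lowest = min(key(v) for v in values)
        # Assumption: the margin bound is inclusive.
        result[k] = sorted((v for v in values if key(v) - lowest <= margin), key=key)
    return result


# Hypothetical records: (name, score), compared by score.
pairs = [
    ("a", ("x", 1.0)),
    ("a", ("y", 1.5)),
    ("a", ("z", 3.0)),
    ("b", ("w", 2.0)),
]
print(smallest_per_key(pairs, margin=0.5, keyfunc=lambda rec: rec[1]))
```

For key "a" the smallest score is 1.0, so the records scoring 1.0 and 1.5 are kept and the one scoring 3.0 is dropped. Key "b" has a single record, which is always kept.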
__init__(*, margin: float, keyfunc: Optional[Callable[[schrodinger.seam.transforms.combiners.V], float]] = None)
expand(pcoll)
default_label() → str
class PerGroupedValues(*, margin: float, keyfunc: Optional[Callable[[schrodinger.seam.transforms.combiners.V], float]] = None)

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform that returns the smallest elements for each key of an already grouped PCollection, that is, a PCollection of (key, iterable of values) pairs.

margin is a required keyword argument. For each key, every value whose compared value lies within margin of that key's smallest compared value is returned.

If keyfunc is specified, it is called on each value to extract the value to compare. If it is not supplied, the values themselves are compared and should be numeric (float or int).

Example usages:

>>> import apache_beam as beam
>>> from schrodinger.seam.transforms.combiners import Smallest
>>> with beam.Pipeline() as p:
...     smallest = (p | beam.Create([("a", [1, 2, 3]), ("b", [1, 3])])
...        | Smallest.PerGroupedValues(margin=1))
>>> # smallest will contain [("a", [1, 2]), ("b", [1])]
>>> with beam.Pipeline() as p:
...     smallest = (p | beam.Create([("a", [1, 2, 3]), ("b", [1, 3])])
...        | Smallest.PerGroupedValues(margin=1, keyfunc=lambda x: -x))
>>> # smallest will contain [("a", [3, 2]), ("b", [3])]
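
Unlike PerKey, the input here is already grouped, so each element can be processed on its own. The following pure-Python sketch reproduces the two results above. It is not the library's implementation: the generator name smallest_per_grouped_values is hypothetical, and both the inclusive margin bound and the handling of an empty group (an empty list is emitted) are assumptions.

```python
from typing import Callable, Hashable, Iterable, Iterator, List, Optional, Tuple, TypeVar

V = TypeVar("V")


def smallest_per_grouped_values(
    grouped: Iterable[Tuple[Hashable, Iterable[V]]],
    *,
    margin: float,
    keyfunc: Optional[Callable[[V], float]] = None,
) -> Iterator[Tuple[Hashable, List[V]]]:
    """Hypothetical helper: for each (key, values) pair, keep the values
    within margin of that group's smallest compared value."""
    key = keyfunc if keyfunc is not None else (lambda v: v)
    for group_key, values in grouped:
        values = list(values)  # the iterable may be single-pass
        if not values:
            # Assumption: an empty group yields an empty result.
            yield group_key, []
            continue
        lowest = min(key(v) for v in values)
        # Assumption: the margin bound is inclusive.
        yield group_key, sorted((v for v in values if key(v) - lowest <= margin), key=key)


grouped = [("a", [1, 2, 3]), ("b", [1, 3])]
print(list(smallest_per_grouped_values(grouped, margin=1)))
print(list(smallest_per_grouped_values(grouped, margin=1, keyfunc=lambda x: -x)))
```

Each group is reduced independently, so no shuffle between groups is needed in this sketch.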
__init__(*, margin: float, keyfunc: Optional[Callable[[schrodinger.seam.transforms.combiners.V], float]] = None)
expand(pcoll)
default_label() → str