schrodinger.application.transforms.sourceid module

Apache Beam transforms for SourceID tagging and ancestor lookup.

class schrodinger.application.transforms.sourceid.TagSourceIds(tag: str)

Bases: PTransform

Tag the SourceID on each structure.

Adds a tag to each structure’s SourceID. Downstream, KeyByParentSourceId or GroupByParentSourceId can be given the same tag to find the ancestor each structure had at this point in the pipeline.
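The tagging idea can be illustrated without Beam or the Schrodinger libraries. The sketch below is a conceptual model only: the SourceId dataclass, its uid/parent/tag fields, and tag_source_id are hypothetical stand-ins, and the real SourceID representation is not described in this reference.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SourceId:
    """Hypothetical stand-in for a SourceID (not the Schrodinger class)."""
    uid: str
    parent: Optional["SourceId"] = None
    tag: Optional[str] = None


def tag_source_id(source_id: SourceId, tag: str) -> SourceId:
    """Return a copy of the ID that carries the given tag."""
    return replace(source_id, tag=tag)


# Tag an ID at one point in a (hypothetical) pipeline.
ligand_id = SourceId(uid="lig-1")
tagged_id = tag_source_id(ligand_id, "input_ligand")

# A later step derives a new structure whose ID points back to the tagged
# one, so the tag stays reachable by walking the parent chain.
conformer_id = SourceId(uid="lig-1-conf-0", parent=tagged_id)
```

In this model the tag marks a position in the lineage, which is what allows a later lookup to stop at the right ancestor.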

__init__(tag: str)
Parameters:

tag – Tag to set on each structure’s SourceID

expand(pcoll)

class schrodinger.application.transforms.sourceid.KeyByParentSourceId(tag: str)

Bases: PTransform

Key each structure by its ancestor SourceID with the given tag.

Raises an error if a structure has no ancestor with the given tag.
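A minimal pure-Python sketch of the lookup follows, under stated assumptions: the SourceId dataclass and key_by_tagged_ancestor are hypothetical, LookupError is chosen only for illustration (the exception actually raised is not documented here), and treating a structure's own ID as a candidate match is likewise an assumption. Beam represents keyed elements as (key, value) tuples, which the sketch mirrors.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SourceId:
    """Hypothetical stand-in for a SourceID (not the Schrodinger class)."""
    uid: str
    parent: Optional["SourceId"] = None
    tag: Optional[str] = None


def key_by_tagged_ancestor(source_id: SourceId,
                           tag: str) -> Tuple[str, SourceId]:
    """Walk up the parent chain and key the ID by the first tagged match."""
    node: Optional[SourceId] = source_id
    while node is not None:
        if node.tag == tag:
            return (node.uid, source_id)
        node = node.parent
    # Assumed error type; the real transform may raise something else.
    raise LookupError(f"no ancestor tagged {tag!r} for {source_id.uid}")


# Lineage: tagged ligand -> conformer -> minimized conformer.
ligand = SourceId(uid="lig-1", tag="input_ligand")
conformer = SourceId(uid="lig-1-conf-0", parent=ligand)
minimized = SourceId(uid="lig-1-conf-0-min", parent=conformer)

key, value = key_by_tagged_ancestor(minimized, "input_ligand")
```

Failing loudly on a missing tag, rather than emitting an unkeyed element, keeps a mistyped tag from silently dropping structures out of later grouping steps.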

__init__(tag: str)
Parameters:

tag – Tag identifying which ancestor to key by

expand(pcoll)

class schrodinger.application.transforms.sourceid.GroupByParentSourceId(tag: str)

Bases: PTransform

Group structures by their ancestor SourceID with the given tag.
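Conceptually, grouping can be thought of as keying each structure by its tagged ancestor and then collecting the values that share a key. Whether the transform is implemented this way is an assumption. The sketch below uses plain Python, with hypothetical string names standing in for structures and ancestor IDs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical keyed elements: (ancestor ID, derived structure name).
keyed = [
    ("lig-1", "lig-1-conf-0"),
    ("lig-2", "lig-2-conf-0"),
    ("lig-1", "lig-1-conf-1"),
]


def group_by_key(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect all values that share the same key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


grouped = group_by_key(keyed)
```

Here every conformer ends up in one group with its siblings from the same input ligand. In a real Beam pipeline the order of values within a group is not guaranteed, so downstream code should not rely on it.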

__init__(tag: str)
Parameters:

tag – Tag identifying which ancestor to group by

expand(pcoll)