schrodinger.seam.metric module

NOTE: The Metrics API is still actively being developed and is subject to change.

An API to access the user-defined and Beam-provided metrics of a Seam pipeline.

class schrodinger.seam.metric.WorkerPeakMemUsageDistributionInMB(mem_min: float, mem_p25: float, mem_median: float, mem_p75: float, mem_max: float)

Bases: JsonableClassMixin

Distribution of peak memory usage across the workers used to execute a stage. The class name indicates that values are in megabytes (MB).

Variables:
  • mem_min – The minimum memory usage of the workers.

  • mem_p25 – The 25th percentile memory usage of the workers.

  • mem_median – The median memory usage of the workers.

  • mem_p75 – The 75th percentile memory usage of the workers.

  • mem_max – The maximum memory usage of the workers.
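
As a rough illustration of what these five fields hold, the sketch below derives them from a list of per-worker peak memory readings. `PeakMemDistribution` and `summarize_peak_memory` are hypothetical names used only for this example; the dataclass mirrors the documented fields but is not the Seam class. Using `statistics.quantiles` with the inclusive method is an assumption about how the percentiles might be computed, not a description of Seam's implementation.

```python
from dataclasses import dataclass
from statistics import quantiles
from typing import Sequence


@dataclass
class PeakMemDistribution:
    """Hypothetical stand-in mirroring the documented fields."""
    mem_min: float
    mem_p25: float
    mem_median: float
    mem_p75: float
    mem_max: float


def summarize_peak_memory(peaks: Sequence[float]) -> PeakMemDistribution:
    """Build a five-number summary from per-worker peak memory values."""
    if len(peaks) < 2:
        raise ValueError("need at least two worker readings")
    # quantiles(n=4) returns the three quartile cut points.
    p25, median, p75 = quantiles(peaks, n=4, method="inclusive")
    return PeakMemDistribution(
        mem_min=float(min(peaks)),
        mem_p25=p25,
        mem_median=median,
        mem_p75=p75,
        mem_max=float(max(peaks)),
    )


# Made-up readings for five workers.
summary = summarize_peak_memory([100.0, 200.0, 300.0, 400.0, 500.0])
```

Here `mem_max` is the largest reading and `mem_min` the smallest, so the five fields together form a five-number summary of the workers' peak memory usage.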

mem_min: float
mem_p25: float
mem_median: float
mem_p75: float
mem_max: float
toJsonImplementation()

Abstract method that must be defined by all derived classes. Converts an instance of the derived class into a jsonifiable object.

Returns:

A dict made up of JSON native datatypes or Jsonable objects. See the link below for a table of such types. https://docs.python.org/2/library/json.html#encoders-and-decoders

classmethod fromJsonImplementation(json_dict)

Abstract method that must be defined by all derived classes. Takes in a dictionary and constructs an instance of the derived class.

Parameters:

json_dict (dict) – A dictionary loaded from a JSON string or file.

Returns:

An instance of the derived class.

Return type:

cls
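
The two methods above describe a round trip: an instance is turned into JSON-native data and later rebuilt from it. The sketch below shows that pattern on a hypothetical stand-in dataclass. It does not use `JsonableClassMixin`, whose actual behavior may involve additional machinery; the method bodies here are assumptions chosen only to illustrate the contract.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PeakMemDistribution:
    """Hypothetical stand-in; not the Seam class or its mixin."""
    mem_min: float
    mem_p25: float
    mem_median: float
    mem_p75: float
    mem_max: float

    def toJsonImplementation(self) -> dict:
        # Return only JSON-native types (here, a dict of floats).
        return asdict(self)

    @classmethod
    def fromJsonImplementation(cls, json_dict: dict) -> "PeakMemDistribution":
        # Rebuild the instance from the decoded dictionary.
        return cls(**json_dict)


original = PeakMemDistribution(100.0, 200.0, 300.0, 400.0, 500.0)
encoded = json.dumps(original.toJsonImplementation())
restored = PeakMemDistribution.fromJsonImplementation(json.loads(encoded))
```

Because the stand-in is a dataclass, `restored == original` compares field by field, which makes the round trip easy to check.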

__init__(mem_min: float, mem_p25: float, mem_median: float, mem_p75: float, mem_max: float) → None
class schrodinger.seam.metric.WorkerMetrics(worker_peak_mem_distribution_mb: WorkerPeakMemUsageDistributionInMB)

Bases: JsonableClassMixin

Metrics for the workers used to execute a stage. The field name indicates that the peak memory distribution is in megabytes (MB).

worker_peak_mem_distribution_mb: WorkerPeakMemUsageDistributionInMB
toJsonImplementation()

Abstract method that must be defined by all derived classes. Converts an instance of the derived class into a jsonifiable object.

Returns:

A dict made up of JSON native datatypes or Jsonable objects. See the link below for a table of such types. https://docs.python.org/2/library/json.html#encoders-and-decoders

classmethod fromJsonImplementation(json_dict)

Abstract method that must be defined by all derived classes. Takes in a dictionary and constructs an instance of the derived class.

Parameters:

json_dict (dict) – A dictionary loaded from a JSON string or file.

Returns:

An instance of the derived class.

Return type:

cls

__init__(worker_peak_mem_distribution_mb: WorkerPeakMemUsageDistributionInMB) → None
class schrodinger.seam.metric.ExecutionMetrics(output_pcoll_counts: ~typing.Dict[str, ~typing.Dict[str, int]] = <factory>, user_labels: ~typing.List[str] = <factory>, short_name: ~typing.Optional[str] = None, execution_cputime_seconds: ~typing.Dict[str, int] = <factory>, walltime_seconds: ~typing.Optional[float] = None, cputime_seconds: ~typing.Optional[float] = None, worker_metrics: ~typing.Optional[~schrodinger.seam.metric.WorkerMetrics] = None)

Bases: JsonableClassMixin

Metrics for a single execution of a stage.

Variables:
  • output_pcoll_counts – Mapping of each xform unique name in the stage to the number of elements in that xform's output PCollection(s).

  • user_labels – List of user-defined labels for the stage.

  • short_name – The SHORT_NAME of the PrimitiveExecutor that executed the stage.

  • execution_cputime_seconds – Mapping of an identifier relevant to the PrimitiveExecutor to the CPU time spent in that identifier. For example, the ExecutableStageExecutor maps each xform unique name to the CPU time spent in that xform.

  • walltime_seconds – The total wall time spent in the stage.

  • cputime_seconds – The total CPU time spent in the stage.

  • worker_metrics – Metrics for the workers used to execute the stage, if available.
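
To show how these fields might be consumed, the sketch below totals the element counts and computes a CPU-to-wall-time ratio. `StageMetrics` is a hypothetical stand-in that mirrors a subset of the documented fields; it is not the Seam class. The xform names and the inner dictionary keys (assumed here to identify an output PCollection) are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StageMetrics:
    """Hypothetical stand-in mirroring some documented fields."""
    output_pcoll_counts: Dict[str, Dict[str, int]] = field(default_factory=dict)
    user_labels: List[str] = field(default_factory=list)
    walltime_seconds: Optional[float] = None
    cputime_seconds: Optional[float] = None


def total_output_elements(metrics: StageMetrics) -> int:
    """Sum element counts over every xform and output PCollection."""
    return sum(
        count
        for per_output in metrics.output_pcoll_counts.values()
        for count in per_output.values()
    )


def cpu_to_wall_ratio(metrics: StageMetrics) -> Optional[float]:
    """Return cputime / walltime, or None when either is unavailable."""
    if not metrics.walltime_seconds or metrics.cputime_seconds is None:
        return None
    return metrics.cputime_seconds / metrics.walltime_seconds


# Made-up xform names and counts.
stage = StageMetrics(
    output_pcoll_counts={
        "ReadLigands": {"out": 10},
        "Filter": {"out": 7, "rejected": 3},
    },
    walltime_seconds=4.0,
    cputime_seconds=6.0,
)
```

The optional fields default to None, so helpers like `cpu_to_wall_ratio` need to handle a stage for which timing was not recorded.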

output_pcoll_counts: Dict[str, Dict[str, int]]
user_labels: List[str]
short_name: Optional[str] = None
execution_cputime_seconds: Dict[str, int]
walltime_seconds: Optional[float] = None
cputime_seconds: Optional[float] = None
worker_metrics: Optional[WorkerMetrics] = None
toJsonImplementation()

Abstract method that must be defined by all derived classes. Converts an instance of the derived class into a jsonifiable object.

Returns:

A dict made up of JSON native datatypes or Jsonable objects. See the link below for a table of such types. https://docs.python.org/2/library/json.html#encoders-and-decoders

classmethod fromJsonImplementation(json_dict)

Abstract method that must be defined by all derived classes. Takes in a dictionary and constructs an instance of the derived class.

Parameters:

json_dict (dict) – A dictionary loaded from a JSON string or file.

Returns:

An instance of the derived class.

Return type:

cls

__init__(output_pcoll_counts: ~typing.Dict[str, ~typing.Dict[str, int]] = <factory>, user_labels: ~typing.List[str] = <factory>, short_name: ~typing.Optional[str] = None, execution_cputime_seconds: ~typing.Dict[str, int] = <factory>, walltime_seconds: ~typing.Optional[float] = None, cputime_seconds: ~typing.Optional[float] = None, worker_metrics: ~typing.Optional[~schrodinger.seam.metric.WorkerMetrics] = None) → None
class schrodinger.seam.metric.SeamMetrics(*args, stage_metrics: Dict[str, ExecutionMetrics], **kwargs)

Bases: MetricResults

__init__(*args, stage_metrics: Dict[str, ExecutionMetrics], **kwargs)
query(filter: Optional[MetricsFilter] = None) → Dict[str, List[MetricResult]]

Queries the runner for existing user metrics that match the filter.

It should return a dictionary, with lists of each kind of metric, and each list contains the corresponding kind of MetricResult. Like so:

    {
        "counters": [MetricResult(counter_key, committed, attempted), ...],
        "distributions": [MetricResult(dist_key, committed, attempted), ...],
        "gauges": []  // Empty list if nothing matched the filter.
    }

The committed / attempted values are DistributionResult / GaugeResult / int objects.
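
A dictionary of this shape can be walked with ordinary Python. The sketch below collects the committed value of each counter, keyed by namespace and name. It assumes each result exposes `key.metric.namespace`, `key.metric.name`, and `committed`, which is my understanding of Beam's `MetricResult` layout and should be checked against the Beam version in use. `committed_counters` and `fake_result` are hypothetical helpers, and the sample data is made up so the example runs without Beam or Seam installed.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def committed_counters(query_result: Dict[str, List[Any]]) -> Dict[str, int]:
    """Map 'namespace:name' to the summed committed counter values."""
    totals: Dict[str, int] = {}
    for result in query_result.get("counters", []):
        name = result.key.metric
        label = f"{name.namespace}:{name.name}"
        # The same counter may be reported by more than one step.
        totals[label] = totals.get(label, 0) + result.committed
    return totals


def fake_result(namespace: str, name: str, committed: int) -> SimpleNamespace:
    """Hypothetical stand-in with only the attributes used above."""
    metric = SimpleNamespace(namespace=namespace, name=name)
    return SimpleNamespace(
        key=SimpleNamespace(metric=metric),
        committed=committed,
        attempted=committed,
    )


sample = {
    "counters": [fake_result("demo", "parsed", 5), fake_result("demo", "parsed", 2)],
    "distributions": [],
    "gauges": [],
}
counts = committed_counters(sample)
```

Using `.get("counters", [])` keeps the helper safe when a kind of metric is missing from the result entirely.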

save(file_path: Union[str, Path])

Save a JSON dump of the metrics to a file.

Parameters:

file_path – The path to the file to save the metrics to.
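
Since the file is a JSON dump, it can be read back with the standard `json` module. The sketch below shows one way to do that. `load_metrics_dump` is a hypothetical helper, and the file written here is a made-up stand-in for the output of save(); the actual layout of the dump is not documented in this section and is not assumed.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Union


def load_metrics_dump(file_path: Union[str, Path]) -> Any:
    """Read a JSON metrics dump back into plain Python objects."""
    with open(file_path, encoding="utf-8") as handle:
        return json.load(handle)


# Stand-in for a file written by save(); the layout is made up.
with tempfile.TemporaryDirectory() as tmp_dir:
    dump_path = Path(tmp_dir) / "metrics.json"
    dump_path.write_text(
        json.dumps({"stage_1": {"walltime_seconds": 4.0}}), encoding="utf-8"
    )
    loaded = load_metrics_dump(dump_path)
```

The helper accepts either a string or a `Path`, matching the `file_path` type that save() accepts.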