schrodinger.seam.testing.benchmarks module

Usage:

$SCHRODINGER/run seam_benchmarks.py

class schrodinger.seam.testing.benchmarks.BenchmarkResult(value: float, units: Optional[str] = None)

Bases: object

Variables:
  • value – Current value of the metric

  • units – (optional) units of the value.

value: float
units: Optional[str] = None
__init__(value: float, units: Optional[str] = None) → None
class schrodinger.seam.testing.benchmarks.Benchmark

Bases: object

UNIT_TEST_N = 1
PERFORMANCE_TEST_N = 500
setup()
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process
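The Benchmark interface above (setup(), run(n), and the two N constants) can be illustrated with a small subclass. This is a minimal sketch only: BenchmarkResult and Benchmark below are local stand-ins that mirror the documented signatures and are not imported from schrodinger.seam, SortIntegersRuntime is a hypothetical benchmark, and the assumption that a harness calls setup() before run() is not confirmed by this page.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkResult:
    """Local stand-in mirroring the documented fields; not the real class."""
    value: float
    units: Optional[str] = None


class Benchmark:
    """Local stand-in mirroring the documented base class; not the real class."""
    UNIT_TEST_N = 1
    PERFORMANCE_TEST_N = 500

    def setup(self):
        pass

    def run(self, n: int) -> BenchmarkResult:
        raise NotImplementedError


class SortIntegersRuntime(Benchmark):
    """Hypothetical benchmark: time sorting n integers in reverse order."""
    UNIT_TEST_N = 100
    PERFORMANCE_TEST_N = 100000

    def run(self, n: int) -> BenchmarkResult:
        data = list(range(n, 0, -1))
        start = time.perf_counter()
        sorted(data)
        elapsed = time.perf_counter() - start
        return BenchmarkResult(value=elapsed, units="s")


benchmark = SortIntegersRuntime()
benchmark.setup()  # assumed to be called before run()
result = benchmark.run(SortIntegersRuntime.UNIT_TEST_N)
```

A subclass would presumably override UNIT_TEST_N and PERFORMANCE_TEST_N when the defaults are too small or too large for the workload, as several of the benchmarks listed on this page do.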

schrodinger.seam.testing.benchmarks.AlkaneStructure(n: int) → Structure
schrodinger.seam.testing.benchmarks.tmp_cwd()
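No description is given for tmp_cwd(). Its name suggests a context manager that runs code inside a temporary working directory. The sketch below shows that pattern under this assumption; tmp_cwd_sketch is a hypothetical name and the real function may behave differently.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def tmp_cwd_sketch():
    """Hypothetical helper: chdir into a temporary directory, then restore."""
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as tmpdir:
        os.chdir(tmpdir)
        try:
            yield tmpdir
        finally:
            # Leave the directory before it is deleted.
            os.chdir(original)


before = os.getcwd()
with tmp_cwd_sketch() as tmpdir:
    inside = os.getcwd()
after = os.getcwd()
```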
class schrodinger.seam.testing.benchmarks.WriteStructuresToFile_MaxSizeOfDirectory

Bases: Benchmark

Measures the maximum shard directory size of WriteStructuresToFile with ‘compress_intermediate_files=True’.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process
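How the shard directory size is measured is not stated here. One common approach is to sum the file sizes under the directory, as in the sketch below. directory_size_bytes and the shard file names are hypothetical, and the sketch takes a single measurement rather than tracking a maximum over time.

```python
import os
import tempfile


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


with tempfile.TemporaryDirectory() as shard_dir:
    # Write three 1 KiB files to stand in for shard files.
    for i in range(3):
        with open(os.path.join(shard_dir, f"shard_{i}.bin"), "wb") as fh:
            fh.write(b"x" * 1024)
    size = directory_size_bytes(shard_dir)
    # size == 3072
```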

class schrodinger.seam.testing.benchmarks.WriteStructuresToFile_Runtime

Bases: Benchmark

Measures the runtime of WriteStructuresToFile with ‘compress_intermediate_files=True’.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.GroupByKeyWithSlowCoder

Bases: Benchmark

Tests the performance of GroupByKey when keys and values are slow to deserialize.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process
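To illustrate why slow deserialization matters for grouping, the sketch below uses a hypothetical SlowCoder whose decode step sleeps, and groups elements by their encoded key bytes so that nothing is decoded during grouping. Both the coder and the grouping strategy are assumptions for illustration; this page does not say how the real GroupByKey or the benchmark's coder is implemented.

```python
import pickle
import time
from collections import defaultdict


class SlowCoder:
    """Hypothetical coder whose decode step is deliberately slow."""

    def __init__(self, delay_s: float = 0.001):
        self.delay_s = delay_s
        self.decode_calls = 0

    def encode(self, obj) -> bytes:
        return pickle.dumps(obj)

    def decode(self, data: bytes):
        self.decode_calls += 1
        time.sleep(self.delay_s)
        return pickle.loads(data)


def group_encoded(pairs):
    """Group (key_bytes, value_bytes) pairs without decoding anything."""
    groups = defaultdict(list)
    for key_bytes, value_bytes in pairs:
        groups[key_bytes].append(value_bytes)
    return groups


coder = SlowCoder()
encoded = [(coder.encode(i % 3), coder.encode(i)) for i in range(9)]
groups = group_encoded(encoded)
decode_calls_during_grouping = coder.decode_calls
```

Counting decode calls, as decode_calls does here, is one way to check that a transform does not deserialize more often than necessary.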

class schrodinger.seam.testing.benchmarks.ExecStagesWithSlowCoder

Bases: Benchmark

Tests the performance of ExecutableStages whose outputs are slow to deserialize.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.FlattenWithSlowCoder

Bases: Benchmark

Tests the performance of Flatten when elements are slow to deserialize.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.GroupByKey

Bases: Benchmark

property gbk_executor

Set up a GroupByKeyExecutor with an input and an output EmbeddedPCollManager.

Groups inputs of type (int, int) into outputs of type (int, Iterable[int]).

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process
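The type mapping from (int, int) to (int, Iterable[int]) can be shown with a plain-Python grouping. This is only a sketch of the semantics; it does not use GroupByKeyExecutor or EmbeddedPCollManager, and group_by_key is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_key(pairs: Iterable[Tuple[int, int]]) -> List[Tuple[int, List[int]]]:
    """Collect all values that share a key, returning groups sorted by key."""
    grouped: Dict[int, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


result = group_by_key([(1, 10), (2, 20), (1, 11)])
# result == [(1, [10, 11]), (2, [20])]
```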

class schrodinger.seam.testing.benchmarks.MaxBundleMemUsage

Bases: Benchmark

Test that bundles loaded directly into memory for processing never exceed the max bundle memory limit.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process
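A memory-limit check of this kind can be sketched with the standard-library tracemalloc module. The names peak_memory_bytes, load_bundle, and MAX_BUNDLE_BYTES are hypothetical, and the real benchmark may measure memory differently (for example, process-level usage rather than Python allocations).

```python
import tracemalloc


def peak_memory_bytes(func, *args) -> int:
    """Return the peak Python-allocated memory while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def load_bundle(n: int):
    """Hypothetical bundle loader: n elements of 1 KiB each."""
    return [bytes(1024) for _ in range(n)]


MAX_BUNDLE_BYTES = 10 * 1024 * 1024  # hypothetical limit, not from the source
peak = peak_memory_bytes(load_bundle, 100)
within_limit = peak <= MAX_BUNDLE_BYTES
```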

class schrodinger.seam.testing.benchmarks.WriteStructuresToFileMemoryBenchmark

Bases: Benchmark

Test that bundles loaded directly into memory for processing never exceed the max bundle memory limit.

run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.LargeOutputPerBundleWithInProcessWorkerBenchmark

Bases: Benchmark

Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption.

UNIT_TEST_N = 100
PERFORMANCE_TEST_N = 2000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.LargeOutputPerBundleWithSubprocessWorkerBenchmark

Bases: Benchmark

Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption.

UNIT_TEST_N = 100
PERFORMANCE_TEST_N = 8000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.LargeOutputPerBundleWithLocalWorkerBenchmark

Bases: Benchmark

Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption even when using a DoFn marked with @requires_local_execution.

UNIT_TEST_N = 100
PERFORMANCE_TEST_N = 8000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.WorkerServerWithSortingBundleProcessingTime

Bases: Benchmark

Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption even when using a DoFn marked with @requires_local_execution.

UNIT_TEST_N = 1000
PERFORMANCE_TEST_N = 1000000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.WorkerServerBundleProcessingTime

Bases: Benchmark

Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption even when using a DoFn marked with @requires_local_execution.

UNIT_TEST_N = 1000
PERFORMANCE_TEST_N = 100000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.PCollFileManager_CreateReadersWithManyDataFiles

Bases: Benchmark

static pcoll_mngr()
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.GroupByKeyDiskSpaceWithLargeValues

Bases: Benchmark

UNIT_TEST_N = 100
static generate_1M_random_string()
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.GroupByKeyLargeElements_memory

Bases: Benchmark

UNIT_TEST_N = 100
PERFORMANCE_TEST_N = 1000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

class schrodinger.seam.testing.benchmarks.GroupByKeyWithManyDataChunkFiles

Bases: Benchmark

Test that running GroupByKey with more data chunk files than the _MAX_OPEN_FILES limit does not lead to excessive memory usage.

UNIT_TEST_N = 100
PERFORMANCE_TEST_N = 1000
run(n: int) → BenchmarkResult
Parameters:

n – The number of elements to process

schrodinger.seam.testing.benchmarks.get_benchmarks()
schrodinger.seam.testing.benchmarks.green(text)
schrodinger.seam.testing.benchmarks.yellow(text)
schrodinger.seam.testing.benchmarks.white(text)
schrodinger.seam.testing.benchmarks.rfill(text, width)
schrodinger.seam.testing.benchmarks.main()