schrodinger.seam.testing.benchmarks module
Benchmarks for different parts of the SeamRunner or Seam transforms.
Usage:
$SCHRODINGER/run seam_benchmarks.py
- class schrodinger.seam.testing.benchmarks.BenchmarkResult(value: float, units: Optional[str] = None)
- Bases: object
- Variables:
- value – Current value of the metric
- units – (optional) Units of the value
 - value: float
 - units: Optional[str] = None
 - __init__(value: float, units: Optional[str] = None) -> None
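The documented fields and `__init__` signature suggest a dataclass-like container. The following is a minimal sketch under that assumption; the `BenchmarkResult` defined here is a local stand-in that only mirrors the documented fields, not the class shipped in the module.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkResult:
    """Local stand-in mirroring the documented fields (assumption: dataclass)."""
    value: float
    units: Optional[str] = None


# A result with explicit units, e.g. a runtime in seconds.
runtime = BenchmarkResult(value=1.25, units="s")

# `units` is optional and defaults to None.
unitless = BenchmarkResult(value=42.0)

print(runtime)
print(unitless.units)
```

Keeping the units next to the value lets a single result type cover runtime, memory, and disk-size metrics.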
 
- class schrodinger.seam.testing.benchmarks.Benchmark
- Bases: object
 - UNIT_TEST_N = 1
 - PERFORMANCE_TEST_N = 500
 - setup()
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
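As an illustration of how a subclass might fit this interface, here is a hedged sketch. `Benchmark` and `BenchmarkResult` below are local stand-ins that copy only the documented attributes and signatures, and `SumOfSquaresRuntime` is a hypothetical benchmark. Two things are assumptions inferred from the names rather than stated in the documentation: that `setup()` is called before `run()`, and that `UNIT_TEST_N` and `PERFORMANCE_TEST_N` are the values of `n` used for quick unit tests and for full performance runs.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkResult:
    """Local stand-in for the documented result type."""
    value: float
    units: Optional[str] = None


class Benchmark:
    """Local stand-in exposing the documented attributes and methods."""
    UNIT_TEST_N = 1
    PERFORMANCE_TEST_N = 500

    def setup(self) -> None:
        pass

    def run(self, n: int) -> BenchmarkResult:
        raise NotImplementedError


class SumOfSquaresRuntime(Benchmark):
    """Hypothetical benchmark: time a simple computation over n elements."""
    UNIT_TEST_N = 10

    def setup(self) -> None:
        # Assumption: setup() prepares state before run() is called.
        self.calls = 0

    def run(self, n: int) -> BenchmarkResult:
        start = time.perf_counter()
        sum(i * i for i in range(n))
        self.calls += 1
        return BenchmarkResult(time.perf_counter() - start, units="s")


bench = SumOfSquaresRuntime()
bench.setup()
result = bench.run(bench.UNIT_TEST_N)
print(result)
```

Overriding `UNIT_TEST_N` per subclass, as several classes on this page do, would let each benchmark pick a small `n` that still exercises its code path.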
 
 
- schrodinger.seam.testing.benchmarks.tmp_cwd()
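`tmp_cwd()` is undocumented here. Judging only by its name, it may be a context manager that runs a block inside a temporary working directory. The sketch below shows that pattern with the standard library; `tmp_cwd_sketch` is a hypothetical name, and the behavior is an assumption rather than the module's implementation.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def tmp_cwd_sketch():
    """Hypothetical: chdir into a fresh temp directory, then restore the old cwd."""
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as tmpdir:
        os.chdir(tmpdir)
        try:
            yield tmpdir
        finally:
            # Leave the directory before it is deleted.
            os.chdir(original)


before = os.getcwd()
with tmp_cwd_sketch() as tmpdir:
    inside = os.getcwd()
after = os.getcwd()
```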
- class schrodinger.seam.testing.benchmarks.WriteStructuresToFile_MaxSizeOfDirectory
- Bases: Benchmark
- Measures WriteStructuresToFile’s maximum shard directory size.
 - UNIT_TEST_N = 5
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.WriteStructuresToFile_Runtime
- Bases: Benchmark
- Measures WriteStructuresToFile’s runtime.
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.GroupByKeyWithSlowCoder
- Bases: Benchmark
- Tests the performance of GroupByKey with keys and values that are slow to deserialize.
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.ExecStagesWithSlowCoder
- Bases: Benchmark
- Tests the performance of ExecutableStages whose outputs are slow to deserialize.
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.FlattenWithSlowCoder
- Bases: Benchmark
- Tests the performance of Flatten with elements that are slow to deserialize.
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.FlattenBenchmark
- Bases: Benchmark
 - UNIT_TEST_N = 3
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.GroupByKey
- Bases: Benchmark
 - property gbk_executor
- Sets up a GroupByKeyExecutor with an input and an output EmbeddedPCollManager. Groups inputs of type (int, int) into outputs of type (int, Iterable[int]).
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
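The type transformation described for `gbk_executor`, from `(int, int)` pairs to `(int, Iterable[int])` groups, can be illustrated in plain Python. This sketch shows only the grouping semantics. It does not use `GroupByKeyExecutor` or `EmbeddedPCollManager`, and `group_by_key` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_key(pairs: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Collect every value that shares a key into one list per key."""
    grouped: Dict[int, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


pairs = [(1, 10), (2, 20), (1, 11)]
grouped = group_by_key(pairs)
print(grouped)
```

This in-memory version keeps values in input order. Whether the real executor guarantees any ordering within a group is not stated in this documentation.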
 
 
- class schrodinger.seam.testing.benchmarks.MaxBundleMemUsage
- Bases: Benchmark
- Tests that bundles loaded directly into memory for processing never exceed the max bundle memory limit.
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
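How the memory benchmarks measure usage is not documented here. As one possible approach, the sketch below records peak Python-level allocations with the standard library's `tracemalloc` and reports them as a result with units. `BenchmarkResult` is a local stand-in, and `peak_memory_mb` and `limit_mb` are hypothetical names. Note that `tracemalloc` sees only allocations made through Python's allocator, so a real benchmark may well rely on a different mechanism.

```python
import tracemalloc
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkResult:
    """Local stand-in for the documented result type."""
    value: float
    units: Optional[str] = None


def peak_memory_mb(n: int) -> BenchmarkResult:
    """Allocate n 1 KiB payloads and report the peak traced memory in MB."""
    tracemalloc.start()
    try:
        payload = [bytes(1024) for _ in range(n)]
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return BenchmarkResult(value=peak / (1024 * 1024), units="MB")


result = peak_memory_mb(100)
limit_mb = 10.0  # hypothetical limit, for illustration only
print(result, result.value < limit_mb)
```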
 
 
- class schrodinger.seam.testing.benchmarks.WriteStructuresToFileMemoryBenchmark
- Bases: Benchmark
- Tests that bundles loaded directly into memory for processing never exceed the max bundle memory limit.
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.LargeOutputPerBundleWithInProcessWorkerBenchmark
- Bases: Benchmark
- Tests that running a pipeline that generates large outputs per bundle does not result in large memory consumption.
 - UNIT_TEST_N = 5
 - PERFORMANCE_TEST_N = 2000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.LargeOutputPerBundleWithSubprocessWorkerBenchmark
- Bases: Benchmark
- Tests that running a pipeline that generates large outputs per bundle does not result in large memory consumption.
 - UNIT_TEST_N = 5
 - PERFORMANCE_TEST_N = 8000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.LargeOutputPerBundleWithLocalWorkerBenchmark
- Bases: Benchmark
- Tests that running a pipeline that generates large outputs per bundle does not result in large memory consumption, even when using a DoFn marked with @requires_local_execution.
 - UNIT_TEST_N = 5
 - PERFORMANCE_TEST_N = 8000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.WorkerServerWithSortingBundleProcessingTime
- Bases: Benchmark
- Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption even when using a DoFn marked with @requires_local_execution.
 - UNIT_TEST_N = 1000
 - PERFORMANCE_TEST_N = 1000000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.WorkerServerBundleProcessingTime
- Bases: Benchmark
- Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption even when using a DoFn marked with @requires_local_execution.
 - UNIT_TEST_N = 1000
 - PERFORMANCE_TEST_N = 100000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.PCollFileManager_CreateReadersWithManyDataFiles
- Bases: Benchmark
 - static pcoll_mngr()
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.GroupByKeyLargeValues_runtime
- Bases: Benchmark
 - UNIT_TEST_N = 5
 - static generate_1M_random_string()
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.GroupByKeyDiskSpaceWithLargeValues
- Bases: Benchmark
 - UNIT_TEST_N = 100
 - static generate_1M_random_string()
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.GroupByKeyLargeElements_memory
- Bases: Benchmark
 - UNIT_TEST_N = 5
 - PERFORMANCE_TEST_N = 1000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.GroupByKeyWithManyDataChunkFiles
- Bases: Benchmark
- Tests that running GroupByKey with more data chunk files than the _MAX_OPEN_FILES limit does not lead to excessive memory usage.
 - UNIT_TEST_N = 5
 - PERFORMANCE_TEST_N = 1000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.RedistributingWithLargeElements
- Bases: Benchmark
- Test that running a pipeline where the size of outputs generated per bundle is large doesn’t result in large memory consumption even when using a DoFn marked with @requires_local_execution.
 - UNIT_TEST_N = 100
 - PERFORMANCE_TEST_N = 1000
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- class schrodinger.seam.testing.benchmarks.SeamRunnerOverheadBenchmark
- Bases: Benchmark
- Runs a basic Map transform with the SeamRunner to measure the overhead of the SeamRunner itself.
 - UNIT_TEST_N = 1
 - PERFORMANCE_TEST_N = 1
 - run(n: int) -> BenchmarkResult
- Parameters:
- n – The number of elements to process
 
 
- schrodinger.seam.testing.benchmarks.get_benchmarks()