Performance Regression Testing Automation: Design
Overview
The design of the performance regression test is composed of the following components:
- The Kitware GitLab instance, which triggers the benchmark jobs automatically on pushes to master and manually in merge requests.
- GitLab CI jobs for running the benchmarks and for generating the statistics and plots.
- The Kitware CDash instance, which stores the benchmark measurements for each master commit.
The performance regression test runs when the user triggers the manual `benchtest` job, which invokes the benchmark suite in a GitLab runner job and then compares its results against the historical results, stored in CDash, of its most immediate master ancestor. The outcome of this comparison is then presented as a brief report posted as a comment on the corresponding GitLab merge request.
Details
Selection of Benchmarks
While we could run all of the provided benchmarks in the continuous build track, to avoid potential performance and latency issues in the CI I propose to initially limit the benchmark suites to the following:
- BenchmarkFilters
- BenchmarkDeviceAdapters
Benchmark ctest
We must provide a CMake function named `add_benchmark_test` that registers the benchmark as a CTest test and uploads its corresponding output report to CDash. The CMake function should take only the name of the benchmark target.
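A minimal sketch of what such a function could look like is shown below; the output location, the `--benchmark_out` flags, and the use of the `ATTACHED_FILES` test property are assumptions for illustration, not the final implementation:

```cmake
# Sketch only: register a Google Benchmark executable as a CTest test and
# attach its JSON report so that a dashboard (CDash) submission uploads it.
function(add_benchmark_test benchmark)
  add_test(NAME ${benchmark}
           COMMAND ${benchmark}
                   --benchmark_out=${CMAKE_BINARY_DIR}/${benchmark}.json
                   --benchmark_out_format=json)

  # ATTACHED_FILES makes CTest upload the listed files to CDash together
  # with the test result when running in dashboard mode.
  set_tests_properties(${benchmark} PROPERTIES
    ATTACHED_FILES ${CMAKE_BINARY_DIR}/${benchmark}.json)
endfunction()
```

It would then be invoked as, for example, `add_benchmark_test(BenchmarkFilters)`.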
CMake options to enable this
- Benchmark tests will be enabled in a CMake build that sets both `VTKm_ENABLE_BENCHMARKS` and `VTKm_ENABLE_TESTING`.
New GitLab Runner requirements
- It must have every type of CPU frequency scaling disabled, at both the BIOS and the kernel level (the cpufreq governor).
- It must provide a GitLab runner with a concurrency level of 1 to avoid other jobs being scheduled while the benchmark is executing (see the example after this list).
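As an illustration only (the exact tooling differs per machine and distribution), the runner setup could look like:

```sh
# Pin the cpufreq governor so frequency scaling does not skew measurements.
sudo cpupower frequency-set --governor performance

# Limit the runner to one concurrent job in /etc/gitlab-runner/config.toml:
#   concurrent = 1
```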
GitLab job for running the benchmarks
The job should build the Google Benchmark based benchmark suites in Release mode with the appropriate benchmarking options enabled. After running the benchmarks, it must upload the results to the Kitware CDash instance.
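A rough sketch of what such a job might look like in `.gitlab-ci.yml` follows; the job name, runner tag, and exact `ctest` invocation are assumptions for illustration (the real job would follow VTK-m's existing CI layout), while the configure options come from the CMake options section above:

```yaml
# Illustrative sketch only.
benchtest:
  stage: test
  when: manual              # triggered by hand from the merge request
  tags: [benchmark]         # routed to the dedicated, concurrency-1 runner
  script:
    # Configure a Release build with benchmarks and testing enabled.
    - cmake -B build -S . -DCMAKE_BUILD_TYPE=Release
        -DVTKm_ENABLE_BENCHMARKS=ON -DVTKm_ENABLE_TESTING=ON
    - cmake --build build --parallel
    # Run the benchmark tests in dashboard mode so their reports are
    # submitted to the Kitware CDash instance.
    - ctest --test-dir build -R Benchmark -D Experimental
```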
GitLab job for generating the statistics and plots
The job must use a Docker image that provides a Python 3 distribution for the `compare.py` script, and possibly the R tidyverse stack to generate the plots and further statistics.
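For illustration, assuming `compare.py` is taken from the Google Benchmark sources under `tools/` and that the baseline report has already been fetched from CDash (the image name and all paths below are placeholders), the job could look like:

```yaml
# Illustrative sketch only.
benchtest-report:
  stage: report
  image: python:3           # placeholder; a custom image could also carry the R tidyverse stack
  script:
    # compare.py ships with Google Benchmark under tools/.
    - pip install -r benchmark/tools/requirements.txt
    # Compare the master ancestor's results against this branch's results.
    - python benchmark/tools/compare.py benchmarks baseline.json current.json | tee compare.txt
  artifacts:
    paths: [compare.txt]
```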
The job must write its output to the corresponding merge request in the following format:
In the first field we display the raw output of the Google Benchmark `compare.py` script for a few selected benchmarks. For this purpose, I propose to display only the output of the following benchmarks:
- Contour
- Gradient
- Threshold
- BenchSort
Below is an example of the shape of this raw output:
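The benchmark names and measurements in the snippet are placeholders, meant only to illustrate the column layout produced by `compare.py` (relative differences followed by old and new Time/CPU values):

```
Benchmark                      Time     CPU   Time Old   Time New   CPU Old   CPU New
--------------------------------------------------------------------------------------
BenchContour/manual_time    +0.0312 +0.0315       1520       1567      1498      1545
BenchGradient/manual_time   -0.0045 -0.0042       2301       2291      2275      2266
```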
Tasklists
- Create the CTest test that runs the desired benchmarks and captures their output files.
- Create the GitLab CI job that runs the specified benchmarks and uploads the results to CDash.
- Create the GitLab CI job that generates the statistics (using the Google Benchmark `compare.py` script) and also generates plots to help the user interpret the results.