Add tools to analyze CI fleet status
A pipeline that gathers information from our project pipelines and feeds it into a persistent store (pgsql? sqlite?) would be good to have. Dashboards (Grafana? Superset?) built on that store could then help answer questions like:
- what projects could use more resources? (long queue times)
- are there projects that are underutilizing their hardware? (idle hardware)
- are there excessively expensive projects? (caching, targeted perf investigation)
- error pattern detection
  - find failing jobs
  - download logs
  - analyze and classify logs
  - pull down specified artifact files; may be project-specific (e.g., `compile_output.log` or `superbuild/*/stamp/*.log`)
  - store classification back to the persistent store
- statistics gathering
  - e.g., `.ninja_log` files from artifacts
  - e.g.,
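
As a starting point for discussion, here is a minimal sketch of what the persistent store and the "long queue times" question could look like with sqlite. The table layout, the column names (`queued_seconds`, `duration_seconds`), and the helper functions are all assumptions made for illustration; nothing here is tied to a particular CI API, and the collector that fetches job records is left out.

```python
import sqlite3

# Hypothetical schema: one row per CI job. Column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id           INTEGER PRIMARY KEY,
    project          TEXT NOT NULL,
    status           TEXT NOT NULL,
    queued_seconds   REAL,
    duration_seconds REAL
);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_jobs(conn, jobs):
    """Insert or update job records given as dicts keyed by column name."""
    conn.executemany(
        "INSERT OR REPLACE INTO jobs "
        "VALUES (:job_id, :project, :status, :queued_seconds, :duration_seconds)",
        jobs,
    )
    conn.commit()


def queue_time_by_project(conn):
    """Return (project, average queue seconds, job count), worst project first."""
    return conn.execute(
        "SELECT project, AVG(queued_seconds) AS avg_queue, COUNT(*) AS n "
        "FROM jobs WHERE queued_seconds IS NOT NULL "
        "GROUP BY project ORDER BY avg_queue DESC"
    ).fetchall()
```

A dashboard would run queries like `queue_time_by_project` directly against the store. Moving from sqlite to pgsql should mostly affect the connection code rather than the queries.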
The API also exposes test results; extending this to analyze test result patterns would be useful as well.
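
The "analyze and classify logs" step could start out as an ordered table of regular expressions, and the same approach might carry over to test output. The sketch below assumes the log text has already been downloaded. The category names and patterns are hypothetical placeholders, not an agreed taxonomy, and real projects would likely need their own.

```python
import re

# Hypothetical categories, checked in priority order: the first pattern that
# matches anywhere in the log decides the classification.
PATTERNS = [
    ("out-of-disk", re.compile(r"No space left on device")),
    ("timeout", re.compile(r"timed? ?out", re.IGNORECASE)),
    ("compile-error", re.compile(r"\berror: ")),
]


def classify_log(text):
    """Return the first matching category name, or "unclassified"."""
    for category, pattern in PATTERNS:
        if pattern.search(text):
            return category
    return "unclassified"
```

The returned category is what would be stored back to the persistent store alongside the job record. Keeping the patterns in priority order means a root cause such as a full disk wins over the generic compiler error it may trigger.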
Edited by Ben Boeckel