I am trying to set up an automated quality gate in a GitHub Actions CI/CD pipeline to catch microarchitecture-level performance regressions (such as instruction bloat, branch mispredictions, or bad spatial locality leading to cache misses) before deployment.
The application is written in [Insert your language here, e.g., C++ / Rust], and while the source code passes standard semantic unit tests, different compilation targets occasionally result in binaries that cause unexpected 100% CPU utilization spikes on physical silicon.
Because general source-code static analysis tools do not map to final machine code behavior, I am looking to automate this at the compiled level.
What I have tried / analyzed so far:
Dynamic Profiling (eBPF / Valgrind): I ran runtime benchmarks from the runner, either under an emulator (QEMU) or on remote test rigs. This adds significant time overhead, and environmental noise in virtualized runners makes the pipeline runs flaky.
Manual Disassembly: Running objdump or another disassembler and manually inspecting the assembly output for inefficient compiler-generated patterns, which does not scale.
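To make the manual step concrete, here is a rough sketch of what a scripted version of that inspection might look like. It assumes the listing comes from `objdump -d -M intel --no-show-raw-insn <binary>`. The `WATCHED` set and the helper names are hypothetical placeholders of my own, not an established rule set, and the sample listing is hand-written for illustration.

```python
import re

# Assumes text produced by: objdump -d -M intel --no-show-raw-insn <binary>
# Function header, e.g. "0000000000401000 <scale>:"
FUNC_RE = re.compile(r"^[0-9a-f]+ <(?P<name>[^>]+)>:$")
# Instruction line, e.g. "  401003:\tidiv   esi"
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\s+(?P<mnemonic>[a-z][a-z0-9]*)")

# Hypothetical watchlist, purely illustrative; a real one would be tuned per target.
WATCHED = {"div", "idiv"}


def parse_functions(listing: str) -> dict[str, list[str]]:
    """Map each function symbol to its ordered list of mnemonics."""
    functions: dict[str, list[str]] = {}
    current = None
    for line in listing.splitlines():
        header = FUNC_RE.match(line)
        if header:
            current = header.group("name")
            functions[current] = []
            continue
        insn = INSN_RE.match(line)
        if insn and current is not None:
            functions[current].append(insn.group("mnemonic"))
    return functions


def flag_watched(listing: str) -> dict[str, list[str]]:
    """Return {function: [watched mnemonics found]} for functions with hits."""
    hits = {}
    for name, mnemonics in parse_functions(listing).items():
        found = [m for m in mnemonics if m in WATCHED]
        if found:
            hits[name] = found
    return hits


# Hand-written sample listing (not taken from a real build).
SAMPLE = (
    "0000000000401000 <scale>:\n"
    "  401000:\tmov    eax,edi\n"
    "  401002:\tcdq\n"
    "  401003:\tidiv   esi\n"
    "  401005:\tret\n"
    "\n"
    "0000000000401010 <add>:\n"
    "  401010:\tlea    eax,[rdi+rsi*1]\n"
    "  401014:\tret\n"
)

print(flag_watched(SAMPLE))  # {'scale': ['idiv']}
```

This only matches single mnemonics, so it cannot see multi-instruction sequences or anything that depends on runtime data such as branch behavior.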
My Constraints:
The validation must run entirely as an automated step in a CI/CD pipeline.
It needs to analyze the compiled binary artifact or its assembly representation statically or predictively, without executing a full dynamic benchmark suite on physical hardware.
My Questions:
Are there established automated methods or parsing frameworks capable of statically auditing a compiled binary's assembly blocks to flag known sub-optimal instruction sequences?
How can we programmatically compare the assembly structures of two build artifacts during a Pull Request to flag structural compilation differences that heavily impact microarchitecture timing?
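For the second question, the kind of comparison I have in mind looks roughly like the sketch below. It counts instructions per function in two `objdump -d -M intel --no-show-raw-insn` listings (base branch vs. PR) and flags functions whose count grew past a threshold. The function names, the `max_growth` parameter, and its default of 1.25 are my own assumptions, and the listings are hand-written. Instruction count is only a crude proxy: it says nothing about branch prediction or cache behavior, which is exactly the gap I am asking about.

```python
import re

# Assumes text produced by: objdump -d -M intel --no-show-raw-insn <binary>
FUNC_RE = re.compile(r"^[0-9a-f]+ <(?P<name>[^>]+)>:$")
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\s+[a-z][a-z0-9]*")


def instruction_counts(listing: str) -> dict[str, int]:
    """Count instruction lines under each function symbol."""
    counts: dict[str, int] = {}
    current = None
    for line in listing.splitlines():
        header = FUNC_RE.match(line)
        if header:
            current = header.group("name")
            counts[current] = 0
        elif current is not None and INSN_RE.match(line):
            counts[current] += 1
    return counts


def regressions(base: str, head: str, max_growth: float = 1.25) -> dict[str, tuple[int, int]]:
    """Return {function: (before, after)} where after/before exceeds max_growth.

    Only functions present in both listings are compared; max_growth is a
    hypothetical threshold that would need tuning per project.
    """
    before, after = instruction_counts(base), instruction_counts(head)
    flagged = {}
    for name in before.keys() & after.keys():
        if before[name] > 0 and after[name] / before[name] > max_growth:
            flagged[name] = (before[name], after[name])
    return flagged


# Hand-written sample listings (not taken from real builds).
BASE = (
    "0000000000401000 <sum>:\n"
    "  401000:\txor    eax,eax\n"
    "  401002:\tadd    eax,edi\n"
    "  401004:\tadd    eax,esi\n"
    "  401006:\tret\n"
    "\n"
    "0000000000401010 <id>:\n"
    "  401010:\tmov    eax,edi\n"
    "  401012:\tret\n"
)
HEAD = (
    "0000000000401000 <sum>:\n"
    "  401000:\txor    eax,eax\n"
    "  401002:\tmov    ecx,edi\n"
    "  401004:\tadd    eax,ecx\n"
    "  401006:\tmov    ecx,esi\n"
    "  401008:\tadd    eax,ecx\n"
    "  40100a:\tnop\n"
    "  40100b:\tret\n"
    "\n"
    "0000000000401020 <id>:\n"
    "  401020:\tmov    eax,edi\n"
    "  401022:\tret\n"
)

print(regressions(BASE, HEAD))  # {'sum': (4, 7)}
```

In a workflow step, the script could exit non-zero whenever `regressions(...)` returns a non-empty dict so that the Pull Request check fails.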