Commit Graph

2 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
93e9970964 Run clang-format on bench_small_matrix.cpp
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:28:20 -07:00
Rasmus Munk Larsen
8ddbe44799 Add small fixed-size matrix benchmarks for robotics/CV workloads
Benchmark the operations that dominate robotics and computer vision
inner loops: fixed-size matrix multiply, matrix-vector, inverse,
determinant, LLT, LDLT, PartialPivLU, ColPivHouseholderQR, JacobiSVD,
and SelfAdjointEigenSolver for sizes 2x2 through 8x9.

Key findings from the baseline measurements:
- MatMul/MatVec: excellent (<1ns for 3x3 float)
- Inverse 3x3: excellent (3.4ns)
- LLT 3x3→4x4: 8x jump (3.9→31.7ns float) due to inlining threshold
- ColPivQR 3x3: 166ns — expensive for such a small matrix
- JacobiSVD 3x3: 498ns double — the main CV bottleneck
- SelfAdjointEig: computeDirect() is 3.2x faster than iterative for 3x3
  (71ns vs 230ns) — many users may not know this API exists

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:06:32 -07:00