Technical architecture and a proposed 3x3 benchmarking methodology for the OpenXLA compiler ecosystem across TPU v6e, NVIDIA H200, and AMD MI300X.

Technical Architecture and Systematic Benchmarking of the OpenXLA Ecosystem

DRAFT v2 — April 2026

A cross-platform analysis of modern ML compilers: an architectural study of the OpenXLA pipeline (CHLO → StableHLO → XLA → LLVM) paired with a proposed 3×3 benchmarking protocol evaluating three workloads (LLM inference, dense training, sparse embedding training) across TPU v6e, NVIDIA H200, and AMD MI300X.

Status: Proposed research protocol. Architectural analysis and methodology are complete; experimental results are forthcoming pending hardware access.

Read the paper (PDF) · LaTeX source · About this work

Technical Architecture and Systematic Benchmarking of the OpenXLA Ecosystem

Posts

Paper Overview: The OpenXLA Ecosystem and a 3×3 Benchmarking Protocol