Boosting WebAssembly Performance: Speculative Optimizations with Deopts and Inlining in V8

V8 recently added two speculative optimizations for WebAssembly: deoptimization support and speculative call_indirect inlining, both shipped in Chrome M137. These techniques have long been used for JavaScript and now bring speedups to WebAssembly, especially for WasmGC programs. Below, we answer common questions about how these optimizations work, why they matter, and what results you can expect.

1. What are speculative optimizations and why are they important for WebAssembly?

Speculative optimizations use runtime feedback to make assumptions that let the compiler generate faster machine code. For example, if a variable has only ever held an integer, the compiler can emit code specialized for that case. If the assumption later fails, a deoptimization (deopt) transfers execution back to slower code that handles every case. JavaScript has relied on this for years, but WebAssembly 1.0 did not need it because its static typing and ahead-of-time compilation already provided good performance. With the advent of WasmGC, which supports managed languages like Java and Dart, speculative optimizations become crucial. They allow V8 to generate more efficient code for programs built on high-level constructs like structs and arrays, where static information alone leaves performance on the table.
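The idea can be illustrated with a small Python sketch. This is a toy model, not V8 code: the names `generic_add`, `specialized_int_add`, and `SpeculationFailed` are hypothetical, and the "specialized" function only stands in for machine code compiled under the assumption that both operands are integers.

```python
from collections import Counter


class SpeculationFailed(Exception):
    """Raised when the guard in the specialized code does not hold."""


def generic_add(a, b):
    # Slow path: handles every supported operand combination.
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def specialized_int_add(a, b):
    # Fast path: only valid under the assumption "both operands are ints".
    # The guard makes the assumption explicit and checkable at runtime.
    if type(a) is not int or type(b) is not int:
        raise SpeculationFailed
    return a + b


def run(pairs):
    feedback = Counter()  # runtime feedback: which operand types were seen
    results = []
    for a, b in pairs:
        feedback[(type(a).__name__, type(b).__name__)] += 1
        try:
            results.append(specialized_int_add(a, b))
        except SpeculationFailed:
            # The assumption failed: fall back to the code that is always correct.
            results.append(generic_add(a, b))
    return results, feedback


results, feedback = run([(1, 2), (3, 4), ("x", 5)])
```

In this sketch the first two pairs stay on the fast path, while the third violates the assumption and is handled by the generic code, so the result is correct either way.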

Source: v8.dev

2. How does deoptimization work for WebAssembly in V8?

Deoptimization in V8 for WebAssembly mirrors the approach used for JavaScript. When the compiler generates speculative machine code, it records the assumptions it made, for instance the expected target of an indirect function call. If runtime behavior violates one of these assumptions, V8 triggers a deopt, which discards the optimized code and continues execution in slower, unoptimized code that can handle the new situation. The runtime then collects additional feedback and may re-optimize later. For WebAssembly, deoptimization is a new capability that makes speculative inlining and other dynamic optimizations feasible. It preserves correctness while reaping performance gains in the common case, much like it does in JavaScript JIT compilation.
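The lifecycle described above (collect feedback, optimize, deopt, collect more feedback) can be sketched as a small state machine. Everything here is an illustrative assumption: the class `TieredFunction`, the `HOT_THRESHOLD` value, and the rule "optimize only when a single target has been seen" are made up for the example and do not reflect V8's actual heuristics.

```python
class TieredFunction:
    HOT_THRESHOLD = 3  # hypothetical number of calls before optimizing

    def __init__(self, targets):
        self.targets = targets     # selector -> callable
        self.seen = {}             # feedback: selector -> call count
        self.optimized_for = None  # the selector the optimized code assumes
        self.deopt_count = 0

    def call(self, selector, arg):
        if self.optimized_for is not None:
            if selector == self.optimized_for:
                # Guard holds: stay on the optimized fast path.
                return self.targets[selector](arg)
            # Guard failed: discard the optimized code (deopt) and
            # continue below in the unoptimized path.
            self.optimized_for = None
            self.deopt_count += 1
        return self._baseline(selector, arg)

    def _baseline(self, selector, arg):
        # Unoptimized path: always correct, and it records feedback.
        self.seen[selector] = self.seen.get(selector, 0) + 1
        result = self.targets[selector](arg)
        hot = sum(self.seen.values()) >= self.HOT_THRESHOLD
        if hot and len(self.seen) == 1:
            # Only one target observed so far: speculate on it.
            self.optimized_for = selector
        return result


f = TieredFunction({"a": lambda x: x + 1, "b": lambda x: x * 2})
for i in range(3):
    f.call("a", i)
optimized_before = f.optimized_for  # "a" after three monomorphic calls
deopt_result = f.call("b", 5)       # violates the assumption, triggers a deopt
```

After the deopt, the feedback in this sketch contains two targets, so the function is not re-optimized under the same single-target assumption. A real engine can re-optimize with the updated feedback instead.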

3. Why hasn't WebAssembly needed speculative optimizations before?

Traditional WebAssembly (Wasm 1.0) benefits from extensive static information: all functions, instructions, and variables are statically typed. Languages like C, C++, and Rust, commonly compiled to Wasm, are amenable to ahead-of-time optimizations in toolchains such as Emscripten (LLVM) or Binaryen. These tools produce well-optimized binaries without requiring runtime speculation. Additionally, Wasm 1.0 lacks the rich type system of WasmGC (it has no struct or array types, no subtyping, and no garbage collection), so the compiler can often generate efficient code statically. The introduction of WasmGC changed this landscape, as higher-level bytecode benefits from dynamic feedback to narrow down types and inline calls speculatively.

4. What motivated V8 to add speculative optimizations for WebAssembly now?

The primary motivation is the evolution of WebAssembly itself, particularly the WasmGC proposal. WasmGC enables compiling managed languages like Java, Kotlin, and Dart to WebAssembly, producing bytecode that includes rich types (structs, arrays) and subtyping relationships. These high-level constructs are less amenable to static optimization alone. With speculative inlining and deoptimization, V8 can make assumptions, for example that a polymorphic function call always targets one specific implementation, and generate highly optimized code for that case. When those assumptions hold, execution speeds up significantly: Dart microbenchmarks show an average speedup of more than 50%. This capability also lays the groundwork for future optimizations, making WasmGC more competitive with native runtimes.

5. How does speculative inlining of call_indirect work?

The call_indirect instruction in WebAssembly performs an indirect function call through a table, much like a call through a function pointer. Without speculation, V8 must emit a dynamic dispatch that checks the table index and the type signature on every call, which can be slow. With speculative inlining, V8 collects runtime feedback on which function is actually called at a given call site. If a single target dominates, the compiler inlines that function directly into the caller, assuming the call will continue to go there. A guard that compares the actual call target against the expected one is inserted to verify the assumption at runtime. If it holds, execution bypasses the general dispatch, yielding faster code. If it fails, a deoptimization occurs and execution falls back to the unoptimized path. This technique, common in JavaScript JITs, is now applied to WebAssembly's call_indirect.
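The difference between the generic dispatch and the speculatively inlined version can be modeled in Python. This is a simplified sketch under stated assumptions: the table layout, the `Trap` and `Deopt` exceptions, and the caller functions are all hypothetical, and the guard is reduced to an identity comparison rather than whatever check V8 actually emits.

```python
class Trap(Exception):
    """Models a WebAssembly trap raised by a failed call_indirect check."""


class Deopt(Exception):
    """Models leaving optimized code because a speculation guard failed."""


SIG_I32_TO_I32 = (("i32",), ("i32",))  # (parameter types, result types)


def square(x):
    return x * x


def negate(x):
    return -x


# Function table: each slot is (signature, function), or None if empty.
TABLE = [(SIG_I32_TO_I32, square), (SIG_I32_TO_I32, negate), None]


def call_indirect(table, index, expected_sig, arg):
    # Generic dispatch: every check runs on every call.
    if not 0 <= index < len(table):
        raise Trap("table index out of bounds")
    entry = table[index]
    if entry is None:
        raise Trap("uninitialized table entry")
    sig, func = entry
    if sig != expected_sig:
        raise Trap("signature mismatch")
    return func(arg)


def caller_generic(table, index, x):
    return call_indirect(table, index, SIG_I32_TO_I32, x) + 1


def caller_speculative(table, index, x):
    # Compiled under the assumption "this call site always reaches square".
    in_bounds = 0 <= index < len(table)
    entry = table[index] if in_bounds else None
    if entry is None or entry[1] is not square:
        raise Deopt  # guard failed: leave the optimized code
    return x * x + 1  # body of square inlined into the caller


def run_caller(table, index, x):
    try:
        return caller_speculative(table, index, x)
    except Deopt:
        # The unoptimized path handles every case, including traps.
        return caller_generic(table, index, x)
```

Calling `run_caller(TABLE, 0, 3)` stays on the inlined path, `run_caller(TABLE, 1, 3)` fails the guard and goes through the generic dispatch, and `run_caller(TABLE, 2, 3)` still traps because the fallback keeps the original semantics.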

6. What speedup can users expect from these optimizations?

Performance gains vary by workload. For WasmGC programs, benefits are substantial: on Dart microbenchmarks, the combination of speculative inlining and deoptimization yields an average speedup of over 50%. For larger, realistic applications (e.g., compiled from Java or Kotlin), the improvement is more modest but still meaningful, between 1% and 8%. Traditional C/C++ Wasm binaries may see less impact because they already benefit from static optimization. However, as WasmGC adoption grows, these speculative techniques will become increasingly important. Additionally, deoptimization serves as an enabling technology for future optimizations, promising even larger gains down the line.
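To put these percentages in terms of running time, the following sketch converts a speedup into a new execution time. It assumes that "a speedup of p%" means the program does (1 + p/100) times as much work per unit of time. The text above does not define the term, so this reading is an assumption, and the 100 ms baseline is an arbitrary illustrative value.

```python
def time_after_speedup(baseline_ms, speedup_percent):
    # Assumption: an N% speedup means (1 + N/100) times the throughput,
    # so the running time shrinks by that same factor.
    return baseline_ms / (1 + speedup_percent / 100)


microbenchmark_ms = time_after_speedup(100, 50)  # about 66.7 ms
app_low_ms = time_after_speedup(100, 1)          # about 99.0 ms
app_high_ms = time_after_speedup(100, 8)         # about 92.6 ms
```

Under this reading, a workload that took 100 ms would drop to roughly 67 ms at a 50% speedup, and to somewhere between about 99 ms and 93 ms at the 1% to 8% reported for larger applications.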

7. What are the future plans for deoptimization in WebAssembly?

Deoptimization is a fundamental building block for many advanced compiler optimizations. In the future, V8 plans to leverage it for techniques like speculative loop unrolling, polymorphic inline caching, and type specialization for WasmGC. These optimizations can further improve performance by making bold assumptions and safely recovering when they prove wrong. The team is also exploring how to apply these methods to other WebAssembly proposals, such as reference types and exception handling. As WebAssembly continues to evolve, runtime feedback and deoptimization will likely become standard practice, enabling Wasm to approach the performance of native code for a wider range of languages.