End-to-End Testing
Introduction
WIP. Assertion vocabulary, JSON keys, and the in-browser test designer all work today; field names may shift as the assertion engine grows.
A test is a JSON object with a name, optional setup, and an ordered list of steps. A step is one debug command (click, text_input, resize, …) or one assertion (assert_text, assert_layout, …). The runner drives a real App instance and evaluates each assertion against the live DOM. Two ways to run:
AZ_E2E=<file> ./my_appboots your application headlessly and runs the JSON file as a single batch. Prints cargo-test-style output, exits0if every test passes,1otherwise.POST /e2e/runfrom a script or the in-browser inspector queues the same payload against a running app.
A first test
{
"name": "Button click increments counter",
"setup": {
"window_width": 800,
"window_height": 600,
"dpi": 96
},
"steps": [
{ "op": "wait_frame" },
{ "op": "click", "selector": "button" },
{ "op": "wait_frame" },
{ "op": "assert_text", "selector": "p", "expected": "6" },
{ "op": "take_screenshot" }
]
}
Run it:
AZ_BACKEND=headless AZ_E2E=test.json ./my_app
Output mirrors cargo test:
running 1 test
test "Button click increments counter" ... ok
test result: ok. 1 passed; 0 failed; finished in 0.21s
A test file is either one test object or an array of them.
File and step shape
{
"name": "string",
"description": "string?",
"config": {
"continue_on_failure": false,
"delay_between_steps_ms": 0
},
"setup": {
"window_width": 800,
"window_height": 600,
"dpi": 96,
"app_state": { "any": "json" }
},
"steps": [
{ "op": "...", "screenshot": false, "<params>": "..." }
]
}
config.continue_on_failure keeps running steps after the first failure (still reports the test as failed). config.delay_between_steps_ms inserts a sleep, useful for visually inspecting a test that runs against a visible window. setup.app_state puts each test into a known state without restarting the process.
screenshot: true on any step captures the rendered window after the step and embeds the base64 PNG in that step's result.
Step operations
Every op accepted by POST / (covered in Debugging) is a valid step. Plus the assertion ops below.
assert_text(params:selector,expected). First-match text content equalsexpected.assert_exists(params:selector). At least one node matches.assert_not_exists(params:selector). No node matches.assert_node_count(params:selector,expectedint). Exactlyexpectednodes match.assert_layout(params:selector,property,expected,tolerance?).propertyisx,y,width, orheight. Default tolerance0.5px.assert_css(params:selector,property,expected). Computed CSS value equalsexpected.assert_app_state(params:path,expected). Dot-path against the JSON-serialised app data.assert_scroll(params:selector,x?,y?,tolerance?). Scroll position of the first match.assert_screenshot(params:reference,threshold?,max_diff_ratio?,save_actual?). Compares the current screenshot against a reference PNG.
Selector resolution accepts CSS selectors (.btn, #counter, div > span), explicit node_id integers, or a text substring match. Pick whichever is least brittle. selector is preferred because the inspector can build them by clicking nodes in the DOM tree.
Step results
Each step returns:
{
"step_index": 1,
"op": "click",
"status": "pass",
"duration_ms": 12,
"logs": ["[Input] click at (200, 300)", "..."],
"screenshot": "data:image/png;base64,...",
"response": { "...command-specific data..." },
"error": null
}
A test rolls steps up:
{
"name": "Button click increments counter",
"status": "pass",
"duration_ms": 156,
"step_count": 5,
"steps_passed": 5,
"steps_failed": 0,
"steps": [ ... ],
"final_screenshot": "data:image/png;base64,..."
}
final_screenshot is the screenshot from the last step that requested one, or null.
Driving with curl
The debug server's POST / accepts a run_e2e_tests op that bypasses the file-based runner:
curl -s -X POST http://127.0.0.1:8765/ \
-H 'Content-Type: application/json' \
-d @tests.json | jq '.data.value.results'
Default request timeout is 30 s; tests that take longer must either pass "timeout_secs": 600 in the request body or use AZ_E2E= (which uses a 600 s timeout internally). The Hello World shell driver in Debugging shows the same wait-frame / click / read-back pattern that an E2E step performs internally. You can build a test scenario incrementally as a curl script, then crystallise it into a JSON file once it works.
Continuation across relayout
Some steps (resize, set_node_text, delete_node) require a relayout pass to complete before the next step can read the resulting state. The runner detects these and yields back to the timer; the test resumes on the next tick. From the test author's perspective this is invisible: write resize followed by assert_layout and the runner handles the suspension.
This is why AZ_E2E requires the application to reach the event loop. The test cannot make progress while the timer is not running. With AZ_BACKEND=headless (or AZUL_HEADLESS=1) the event loop runs without an OS window, which is the standard CI configuration.
CI integration
A typical workflow:
- Build the application once:
cargo build --release. - Run the test bundle headlessly:
AZ_BACKEND=headless AZ_E2E=tests/smoke.json ./target/release/my_app. - Process exit code is the test verdict. Use it as the CI step's exit code.
- For screenshot diffs, set
assert_screenshotsteps with areferencePNG path; commit the references to the repo and update them via aBLESS=1workflow when intentional UI changes land.
For per-PR feedback, every shell script in tests/e2e/ is a curl-against-AZ_DEBUG driver that became a JSON test once it was stable.
The language-binding board (scripts/e2e_language_matrix.sh) runs the same counter scenario against every binding and prints a status table. On a --gate-shipped failure it now dumps the tail of each failed binding's build/run log and its AZ_RECORD trace, so CI logs explain the failure without a local repro.
Language toolchain prerequisites
Each binding drives the prebuilt library (built with --features build-dll,debug-server, which compiles in the AZ_E2E runner) and needs its own toolchain to build and run the hello-world. Missing toolchains report ⊘ SKIP (they do not gate); a present-but-broken toolchain reports ✗ FAILS. The shipped tier:
| Binding | Toolchain | Notes / common gotchas |
|---|---|---|
| Rust | cargo/rustc 1.88+ |
The example links the prebuilt DLL with --no-default-features --features link-dynamic (not link-static, which rebuilds Azul without the debug-server E2E runner). On Windows the import lib must be reachable as both azul.dll.lib and azul.lib. |
| C | clang/gcc; MinGW on Windows (links azul.dll.lib) |
Include the generated azul.h. |
| C++ | clang++/g++ with -std=c++20 (also 03–23) |
Include azul20.hpp. Wrapper types are move-only RAII; pass owning values with std::move — never copy a moved-from wrapper. |
| Python | CPython 3.10+ | Uses the azul extension module (a separate build), not FFI. |
| Node | node + the koffi npm package |
macOS strips DYLD_* from the hardened node, so koffi must find the lib by an explicit path or the lib must sit next to the script. |
| C# | .NET SDK (dotnet) |
P/Invoke; the native lib must be loadable next to the build output. |
| Java | JDK 17+ and Maven (mvn) + JNA 5.14 |
The build targets release 11, so a JDK older than 11 fails with invalid target release: 11 — pin JDK 17 (e.g. actions/setup-java@v4 with Temurin 17). On macOS the JVM needs -XstartOnFirstThread. |
| Kotlin | JDK 17+, kotlinc (or Gradle) + JNA 5.14 |
Same JDK requirement as Java; kotlinc compiles Azul.kt + HelloWorld.kt against the JNA jar. |
| Ruby | ruby + the ffi gem |
require 'ffi' fails with LoadError unless the gem is installed: gem install ffi. It is not preinstalled on CI runners. |
| Lua | LuaJIT (the stock PUC lua has no FFI); full E2E runs on arm64/macOS, not x86-64 |
LuaJIT's ffi cannot call a C function that passes an aggregate by value on some ABIs: on x86-64 (SysV), App.create(.., AppConfig) raises NYI: cannot call this C function (yet). This is a LuaJIT limitation, not a version issue (a freshly-built current LuaJIT 2.1 still NYIs) and not a binding bug — the same azul.lua runs on arm64/macOS, where the struct is passed differently. CI therefore reports lua as ⊘ SKIP on x86-64 (a toolchain limit, which never gates). |
| Go | go toolchain + a C compiler (cgo) |
— |
AZ_LOG is on by default (see Debugging), so a binding that builds but exits early will print the platform-layer trace on stderr, which the board captures.
Recording a test from the inspector
The in-browser inspector at http://localhost:<port>/ includes an E2E designer. Click „Record“, interact with the running window normally, and the inspector captures each click, text input, scroll, and resize as a step. Click „Stop“, review the steps, and either run them in place or export to JSON for AZ_E2E.
The recorder uses the same selector resolver as assert_*, so the captured steps are robust to layout shifts as long as your DOM has stable IDs and classes.
Limitations
- A test runs against the current application instance — there is no per-test sandbox. Use
setup.app_stateto put the app in a known state before each test. take_native_screenshotreturns the actual framebuffer of the running window. Pixel-identical comparison across platforms is unrealistic; useassert_screenshotwith amax_diff_ratiotolerance, or pin the diff job to one platform in CI.- The runner does not multi-thread tests. Each test runs to completion before the next starts. If you need parallel runs, spawn N processes on N ports.
Coming Up Next
- Debugging — Debug overlays, the inspector, and structured logging
- Headless Rendering — Running the pipeline without a window
- Code Generation — How
azul-docregenerates bindings fromapi.json