Running a virtual SDV diagnostics stack — no hardware required
The standard embedded workflow: write firmware, build it, flash it to a development board, connect a CAN analyzer, run the diagnostic sequences manually. The board lives on someone's desk. It's shared. It's not always available. The test sequence is not automated. If something breaks, you find out when you run the sequence again — which might be days later.
Software-defined vehicle programs are trying to change this. Firmware updates deploy over the air. Diagnostic interfaces need to be verified before a release, not after. The team that finds a protocol regression three days before the OTA cutoff has a very bad week.
This post shows what a virtual diagnostics stack looks like in practice — a complete environment where you can define an ECU, generate its UDS implementation, run automated protocol campaigns, and inspect the results, all without a single piece of physical hardware. The same stack that runs in CI runs in your browser if you want it to.
What "virtual" means here
Virtual doesn't mean simulated at reduced fidelity. It means the same C code that runs on your STM32 or Nordic SoC, compiled for the host platform (native_sim in Zephyr's terminology) and connected to a software loopback transport instead of a CAN controller.
The ISO-TP framing runs. The session state machine runs. The ASIL-B safety chain runs — the five-step DID validation, the security access protocol, the violation counters. The DTC persistence runs against a simulated NVM. The only thing that's simulated is the physical bus.
This matters because the bugs that kill diagnostics programs (session gate regressions, security access sequence errors, NRC code mismatches, DTC state machine failures) are all bugs in the UDS core and the generated safety wrappers. They're not bugs in the CAN peripheral driver. You can find and eliminate them entirely in the virtual environment, before you ever touch hardware.
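To make "the ISO-TP framing runs" concrete, here is a minimal Python sketch of how a 20-byte UDS response (the positive response to a read of the 17-byte VIN at DID 0xF190) is split into classical CAN frames under ISO 15765-2. It is an illustration only, not the stack's C implementation: the function name and the placeholder VIN are made up, and flow control, padding, and timing are left out.

```python
def isotp_segment(payload: bytes) -> list[bytes]:
    """Split a UDS payload into classical-CAN ISO-TP frames.

    Illustrative only: no padding, no flow-control handling, no timing.
    """
    if len(payload) <= 7:
        # Single frame: PCI byte = 0x0 in the high nibble, length in the low nibble
        return [bytes([len(payload)]) + payload]
    if len(payload) > 0xFFF:
        raise ValueError("payload too long for a 12-bit first-frame length")

    # First frame: high nibble 0x1, 12-bit total length, then the first 6 data bytes
    frames = [bytes([0x10 | (len(payload) >> 8), len(payload) & 0xFF]) + payload[:6]]

    seq = 1
    for offset in range(6, len(payload), 7):
        # Consecutive frame: high nibble 0x2, 4-bit sequence number that wraps at 16
        frames.append(bytes([0x20 | (seq & 0x0F)]) + payload[offset:offset + 7])
        seq += 1
    return frames


# Positive response to ReadDataByIdentifier: 0x62, the DID, then the data.
vin = b"TESTVIN0000000001"  # placeholder 17-byte VIN, not a real one
response = bytes([0x62, 0xF1, 0x90]) + vin
frames = isotp_segment(response)

for frame in frames:
    print(frame.hex(" "))
```

The response does not fit in a single frame, so it goes out as one first frame and two consecutive frames. On a real bus the sender would wait for a flow-control frame from the tester after the first frame; that handshake is where the N_Bs and N_Cr timeouts come into play.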
The full stack, from YAML to report
Step 1: Define the ECU's diagnostic interface
ecu:
  name: "bms_ecu"
dids:
  - id: "0xF190"
    name: "VIN"
    data_length: 17
    access: [read]
    sessions: [default, extended]
    read_handler: "vin_read"
  - id: "0xD001"
    name: "BatteryVoltage"
    data_length: 4
    access: [read]
    sessions: [extended]
    read_security_level: 0
    read_handler: "battery_voltage_read"
  - id: "0xD010"
    name: "CellBalanceThreshold"
    data_length: 2
    access: [read, write]
    sessions: [extended]
    read_security_level: 0
    write_security_level: 1
    read_handler: "cell_balance_read"
    write_handler: "cell_balance_write"
dtcs:
  - code: "0xD00101"
    name: "CellOvervoltage"
    severity: high
  - code: "0xD00201"
    name: "PackUndervoltage"
    severity: high
Step 2: Generate the UDS implementation
python3 tools/codegen.py \
    --config diagnostics_config.yaml \
    --out generated/ \
    --safety-wrappers \
    --asil-level B \
    --test-gen
The generator produces: DID handler stubs (did_handlers.c), ASIL-B safety wrappers (did_safety_wrappers.c), the DTC registration table, the complete UDS init sequence with uds_safety_self_test(), and a pytest suite covering every DID and DTC in the config. It also checks that every write-capable DID has a non-zero write_security_level — if not, the generation fails with a fatal error. This is enforced at generation time, not runtime.
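The generation-time check on write-capable DIDs can be pictured with a short Python sketch. This is not codegen.py's actual code: the function names are hypothetical, the config is shown as an already-parsed dictionary, and treating a missing write_security_level as zero is an assumption.

```python
# Parsed form of (part of) the config from Step 1, written as a dict so the
# sketch needs no YAML library.
config = {
    "dids": [
        {"id": "0xF190", "name": "VIN", "access": ["read"]},
        {"id": "0xD010", "name": "CellBalanceThreshold",
         "access": ["read", "write"], "write_security_level": 1},
    ]
}


def find_unprotected_writes(cfg: dict) -> list[str]:
    """Return ids of write-capable DIDs with a missing or zero write_security_level."""
    return [
        did["id"]
        for did in cfg.get("dids", [])
        if "write" in did.get("access", [])
        and did.get("write_security_level", 0) == 0  # assumption: missing means 0
    ]


def check_or_fail(cfg: dict) -> None:
    """Abort generation if any write-capable DID is unprotected (hypothetical helper)."""
    unprotected = find_unprotected_writes(cfg)
    if unprotected:
        raise SystemExit(f"fatal: writable DIDs without security: {unprotected}")


check_or_fail(config)  # passes: the only writable DID requires level 1
```

Failing at this point means an unprotected writable DID never reaches a build, which is the practical meaning of "generation time, not runtime".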
Step 3: Build and run the virtual ECU
west build -b native_sim examples/bms_ecu \
    -- -DDTC_OVERLAY_FILE=boards/native_sim.overlay
west build -t run
The ECU is now running on your laptop. It's listening on a virtual CAN socket. The CAN loopback is provided by Zephyr's native_sim CAN driver: two sockets wired together in process. The binary is built from the same firmware source that will run on your STM32 target. It is compiled for x86-64, but from the same C code, with the same compiler flags (minus architecture-specific ones) and the same generated wrappers.
Step 4: Run a UDS campaign against it
testlab-run \
    --config diagnostics_config.yaml \
    --campaign campaigns/bms_validation.yaml \
    --job extended_session_check \
    --virtual \
    --json reports/run_$(date +%Y%m%d_%H%M%S).json
The --virtual flag bypasses the CAN socket entirely: it uses TestLab's VirtualBus, an in-process Python queue pair. The campaign then runs against TestLab's ECU simulator (Python, all 14 UDS services), which lives in the same process as the campaign runner, rather than against the native_sim binary from Step 3. No west, no vcan, no modprobe. Pure Python, runnable anywhere.
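An "in-process queue pair" is a small idea, and a sketch makes it concrete. The code below is not TestLab's VirtualBus; the class and function names are hypothetical, and it shows only the shape of the transport: two queues, cross-wired so that what one side sends is what the other side receives.

```python
import queue


class Endpoint:
    """One side of an in-process bus: writes to one queue, reads from the other."""

    def __init__(self, tx: queue.Queue, rx: queue.Queue) -> None:
        self._tx = tx
        self._rx = rx

    def send(self, frame: bytes) -> None:
        self._tx.put(frame)

    def recv(self, timeout: float = 1.0) -> bytes:
        # Raises queue.Empty if nothing arrives within the timeout
        return self._rx.get(timeout=timeout)


def make_bus_pair() -> tuple[Endpoint, Endpoint]:
    """Create two endpoints whose send and receive queues are cross-wired."""
    a_to_b: queue.Queue = queue.Queue()
    b_to_a: queue.Queue = queue.Queue()
    return Endpoint(a_to_b, b_to_a), Endpoint(b_to_a, a_to_b)


tester, ecu = make_bus_pair()

tester.send(bytes([0x3E, 0x00]))   # TesterPresent request
request = ecu.recv()
ecu.send(bytes([0x7E, 0x00]))      # positive response SID = request SID + 0x40
reply = tester.recv()
```

Because nothing here touches a socket or a kernel module, the same code runs unchanged on a laptop or a hosted CI runner.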
[INFO] Campaign: bms_validation / extended_session_check
[PASS] session → extended
[PASS] security_access → level 1
[PASS] read_did(0xD001) BatteryVoltage — 4 bytes
[PASS] read_did(0xD010) CellBalanceThreshold — 2 bytes
[FAIL] write_did(0xF190) VIN — NRC 0x31 requestOutOfRange
[PASS] clear_dtc
[PASS] read_dtc
──────────────────────────────────────────────────
7 steps · 6 PASS · 1 FAIL
One step failed: the write to VIN, which is read-only. The ECU rejected it with NRC 0x31. That rejection is what you want if the campaign is testing that a read-only DID refuses writes. If the campaign expected the write to succeed, the expectation is wrong, not the firmware: the safety wrapper is correctly enforcing the access type gate (the fourth step of the five-step DID validation chain).
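The kind of gating that produces this NRC can be sketched in a few lines of Python. This is not the generated C wrapper and not the full five-step chain, whose other steps and ordering are not spelled out here. The check order, the NRC chosen for the session gate, and the "unlocked level must be at least the required level" rule are all assumptions; only the 0x31 for a write to a read-only DID comes from the run above.

```python
NRC_REQUEST_OUT_OF_RANGE = 0x31     # requestOutOfRange
NRC_SECURITY_ACCESS_DENIED = 0x33   # securityAccessDenied

# Two DIDs from the Step 1 config, keyed by numeric id
DIDS = {
    0xF190: {"access": {"read"}, "sessions": {"default", "extended"},
             "write_security_level": 0},
    0xD010: {"access": {"read", "write"}, "sessions": {"extended"},
             "write_security_level": 1},
}


def check_write(did: int, session: str, unlocked_level: int):
    """Return None if the write may proceed, otherwise the NRC to send back."""
    entry = DIDS.get(did)
    if entry is None:
        return NRC_REQUEST_OUT_OF_RANGE          # unknown DID
    if session not in entry["sessions"]:
        return NRC_REQUEST_OUT_OF_RANGE          # assumption: NRC for the session gate
    if "write" not in entry["access"]:
        return NRC_REQUEST_OUT_OF_RANGE          # access type gate, as in the log
    if unlocked_level < entry["write_security_level"]:
        return NRC_SECURITY_ACCESS_DENIED        # assumption: levels are ordered
    return None


vin_write = check_write(0xF190, "extended", 1)
locked_write = check_write(0xD010, "extended", 0)
allowed_write = check_write(0xD010, "extended", 1)
print(hex(vin_write), hex(locked_write), allowed_write)
```

The point of the sketch is that each gate maps to a specific, testable NRC, which is why an NRC mismatch in a campaign run is a meaningful signal rather than noise.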
Step 5: Generate the HTML report
testlab report --results reports/run_*.json --out reports/bms_weekly.html
The HTML report is self-contained — no CSS CDN, no JavaScript CDN, no web server. Open it in any browser. Forward it in email. Attach it to a bug ticket. It includes per-step status, NRC codes with descriptions, trend charts across runs, and a regression diff against the previous run.
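A regression diff between two runs comes down to comparing step statuses by name. The sketch below assumes a hypothetical result shape, a list of objects with "step" and "status" keys; the real TestLab JSON schema is not shown in this post and may differ.

```python
def regression_diff(previous: list[dict], current: list[dict]) -> list[tuple]:
    """Return (step, old_status, new_status) for every step whose status changed.

    Assumes each result is {"step": str, "status": "PASS" | "FAIL"}, which is a
    hypothetical schema used only for illustration.
    """
    old = {result["step"]: result["status"] for result in previous}
    changes = []
    for result in current:
        before = old.get(result["step"])
        if before is not None and before != result["status"]:
            changes.append((result["step"], before, result["status"]))
    return changes


last_week = [
    {"step": "read_did(0xD001)", "status": "PASS"},
    {"step": "write_did(0xD010)", "status": "PASS"},
]
this_week = [
    {"step": "read_did(0xD001)", "status": "PASS"},
    {"step": "write_did(0xD010)", "status": "FAIL"},
]

changes = regression_diff(last_week, this_week)
print(changes)
```

Steps that appear in only one of the two runs are ignored here; a real report would probably want to list those separately.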
Step 6: Inspect in the dashboard
testlab serve --reports ./reports # → http://127.0.0.1:8765
The dashboard shows all ECUs, all campaigns, DTC aggregates across runs, and a live view that updates as new testlab-run results land in the reports directory. It reads the JSON files directly — no database, no daemon, no shared state.
The CI version of this
In GitHub Actions:
jobs:
  diagnostics-validation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: pip install xaloqi-testlab
      - name: Run UDS campaign
        env:
          XALOQI_TESTLAB_KEY: ${{ secrets.XALOQI_TESTLAB_KEY }}
        run: |
          testlab-run \
            --config docker/ecu_sim/diagnostics_config.yaml \
            --campaign campaigns/bms_validation.yaml \
            --job extended_session_check \
            --virtual \
            --json reports/ci_run_${{ github.run_id }}.json
      - name: Generate report
        run: |
          testlab report \
            --results reports/ci_run_*.json \
            --out reports/ci_report.html
      - uses: actions/upload-artifact@v4
        with:
          name: diagnostics-report
          path: reports/ci_report.html
No hardware. No self-hosted runner. No vcan kernel module. Runs in three minutes on the standard GitHub-hosted Ubuntu runner. The HTML report is uploaded as an artifact — accessible from the Actions run page.
The browser path
For engineers who aren't running a local Zephyr toolchain, the same workflow runs entirely in the browser via Xaloqi Cloud.
Open xaloqi.com/workspace. Paste or type your diagnostics_config.yaml in the Monaco editor. Click Generate — the server runs codegen.py on your config and returns a ZIP with the C/H source files, the pytest suite, and CAPL scripts for CANoe. Click Run Tests — the server runs the virtual campaign against the in-process ECU simulator and streams results live. Download the ZIP. Take it into your local Zephyr environment.
The generator running on the Cloud server is the same codegen.py that runs locally in EDS. The ECU simulator running in the test runner is the same module that TestLab uses for --virtual. The YAML format is identical. There is no "Cloud mode" — it's the same tools running on a server you don't have to operate.
The AI panel generates diagnostics_config.yaml from a plain-English description of your ECU. Describe it: "BMS ECU, Cortex-M4, needs to expose battery voltage and cell balance threshold in extended session with SecurityAccess level 1 for writes, plus an over-voltage DTC." The model outputs a valid YAML config. You can also import an AUTOSAR ECU Extract ARXML file and the Cloud server parses it into diagnostics_config.yaml directly.
Where the virtual stack ends
The virtual ECU in TestLab implements all 14 UDS services correctly. It enforces session state, security access, DID access gates, and DTC state machine semantics. What it doesn't replicate:
CAN bus timing and arbitration. The virtual ECU has no concept of CAN bus load, arbitration delays, or error frames. If your firmware has timing-dependent behavior in ISO-TP frame handling — edge cases in N_Bs or N_Cr timeouts — those require a real CAN bus to reproduce.
Hardware fault behavior. A real ECU can lose power mid-transfer. The NVM can have bad blocks. The CAN peripheral can enter bus-off state. None of these fault scenarios are simulated. The virtual stack assumes a reliable transport.
Peripheral driver correctness. The virtual ECU tests the UDS protocol layer. It doesn't test your HAL_CAN_AddTxMessage implementation, your flash write routine, or your interrupt priority configuration.
For those scenarios, you need hardware. But the virtual stack gets you through the protocol-layer development and verification work, which is the majority of the diagnostic stack's complexity. By the time you're testing on hardware, the protocol bugs are gone.
The practical difference
Teams that have a virtual diagnostics stack don't do manual protocol verification on shared benches. They run the campaign on every commit, look at the HTML report when something fails, and open testlab explain to understand the NRC code. Protocol regressions show up in the CI dashboard, not in a field engineer's bug report.
That shift — from bench-dependent to CI-native diagnostics testing — is what "software-defined" means for the diagnostics workflow. The vehicle is software-defined. The diagnostics toolchain should be too.