ADR-0026 — Cross-store verifier segment pattern

Updated 2026-05-11 00:00 EDT

Status: Accepted
Date: 2026-05-08
Deciders: Natan
Source: PRD-07 ratification. The pattern emerged in src/systems/snappy/domain-activation-verify.ts and needs a doc anchor before a second SUT adopts it.

Context #

A synthetic transaction drives the SUT through a workflow and observes the SUT's response (the black-box view, per Google SRE ch. 17). That answers "did the API say yes?" but not "did the side-effects the workflow promised actually land?"

For non-trivial workflows the answers to those two questions diverge. Snappy's domain-activation pipeline writes to seven stores (Mongo, S3, ClickHouse, Loki, Hatchet workflow events, …) during a single POST /api/domains { activate: true }. The API returning status: "active" after polling proves the primary state machine landed; it doesn't prove the audit row was written, the analytics mirror synced, the robots archive uploaded, or the workflow logs were emitted. Each of those is a separate failure surface in production.

Charity Majors et al., Observability Engineering (O'Reilly 2022) ch. 5: "the answer to 'is X working' is rarely a single question". The white-box view fills the gap.

Decision #

A scenario that produces non-trivial cross-store side-effects gets two segments, in this order:

1. Drive segment: the synthetic transaction, named <scenario-name>.ts. POSTs / polls / DELETEs against the SUT's public API. Captures identifiers (e.g. domainId) into the probe-registry.

2. Verify segment: <scenario-name>-verify.ts. Runs after drive settles. Reads <identifier> from the probe-registry and queries each store the workflow was supposed to write to. One step per store. Each step:

   - skips cleanly when the identifier is absent (drive failed)
   - skips cleanly when SQA can't reach that store (in-cluster only, etc.), with the skip reason naming the follow-up
   - passes when the expected side-effect is found
   - fails when the store is reachable but the side-effect is missing (the system said it did the work and didn't)
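The four step outcomes above can be sketched as a single decision function. This is a minimal illustration, not the code in domain-activation-verify.ts: the names StepOutcome, StoreProbe, and verifyStep are hypothetical, and the probe is synchronous here only to keep the sketch short (a real store query would presumably be async).

```typescript
// Hypothetical types; none of these names come from the SQA codebase.
type StepOutcome =
  | { status: "skip"; reason: string }
  | { status: "pass" }
  | { status: "fail"; reason: string };

// A probe reports whether SQA can reach the store and, if it can,
// whether the expected side-effect exists for the given identifier.
type StoreProbe = (
  id: string,
) => { reachable: false; followUp: string } | { reachable: true; found: boolean };

function verifyStep(
  storeName: string,
  id: string | undefined,
  probe: StoreProbe,
): StepOutcome {
  // Skip: the drive segment never captured the identifier.
  if (id === undefined) {
    return { status: "skip", reason: "identifier absent (drive failed)" };
  }
  const result = probe(id);
  // Skip: the store is out of reach; the reason names the follow-up.
  if (!result.reachable) {
    return {
      status: "skip",
      reason: `${storeName} unreachable: ${result.followUp}`,
    };
  }
  // Pass: the expected side-effect was found.
  if (result.found) {
    return { status: "pass" };
  }
  // Fail: the store is reachable but the side-effect is missing.
  return {
    status: "fail",
    reason: `${storeName} reachable but side-effect missing for ${id}`,
  };
}
```

The ordering of the checks matters: reachability is tested before presence, so an unreachable store can never be reported as a missing side-effect.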

The two segments compose at the system's index.ts as siblings of preflight. They share state through the probe-registry, not through function arguments — the registry is the documented seam (ADR-0022).
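As a rough sketch of that seam, assuming the registry behaves like a shared key-value store: the Map below is a stand-in for the real probe-registry described in ADR-0022, and driveSegment, verifySegment, and the value "dom-123" are all hypothetical.

```typescript
// Stand-in for the probe-registry (ADR-0022); the real API may differ.
const registry = new Map<string, string>();

// Drive segment: would POST / poll the public API, then record what it
// learned. The identifier value here is a made-up placeholder.
function driveSegment(): string {
  registry.set("domainId", "dom-123");
  return "drive: captured domainId";
}

// Verify segment: takes no arguments from drive. It reads the identifier
// from the registry and skips if drive never wrote it.
function verifySegment(): string {
  const domainId = registry.get("domainId");
  if (domainId === undefined) {
    return "verify: skipped, no domainId in registry";
  }
  return `verify: checking stores for ${domainId}`;
}

// Composition as siblings: each segment runs in order, and the only
// coupling between them is the registry key.
function runSegments(segments: Array<() => string>): string[] {
  const lines: string[] = [];
  for (const segment of segments) {
    lines.push(segment());
  }
  return lines;
}
```

Because neither function takes the other's output as a parameter, a verify segment can be added or removed in index.ts without touching the drive segment's signature.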

Why two segments and not one #

- Different failure remediations. Drive fail = system bug or probe-config bug (ambiguous; investigate API). Verify fail = system said yes but didn't actually do the work (unambiguous; investigate the specific store's writer). Surfacing them as separate Result subtrees gives operators the taxonomy for free.

- Different reachability profiles. The drive segment hits the SUT's public ingress (always reachable). The verify segment may need direct DB access that is only sometimes reachable (in-cluster, behind a separate ingress, …). Letting verify steps skip independently of drive prevents one missing capability from masking the rest.

- Different evolution rate. New stores get added to a workflow on a different cadence than the workflow's API contract changes. Two files, two diffs.

Why named `-verify.ts`, not `-checks.ts` or `-side-effects.ts` #

"Verify" is the role the file plays in a synthetic transaction: drive → observe → verify. The three-word phrasing belongs to the test-pyramid lineage (Cohn, Succeeding with Agile, 2009; Vocke, The Practical Test Pyramid). Newman uses it as received vocabulary in Building Microservices ch. 10 (in-production testing), which remains the closest prior art SQA inherits from. The role is not new; pinning the suffix matches the glossary's existing vocabulary for scenario phases. See ADR-0032 for why an SQA contract doc is not a Newman-style consumer-driven contract: the words "contract" and "verify" collide with Pact-ecosystem vocabulary and the disambiguation matters.

Consequences #

Architectural #

- Glossary §segment grows a third example shape: drive segment + verify segment together form one synthetic transaction.

- The probe-registry's role broadens: not just fixture IDs from preflight, but also identifiers from the drive segment.

- A scenario that doesn't cross multiple stores doesn't need a verify segment; the drive segment alone is sufficient. This ADR doesn't mandate the split; it pins the shape when the split is justified.

Behavioural #

- A green run with a verify segment shows roughly twice the leaf count of the same scenario without one. Cron noise increases proportionally; that's the cost of catching cross-store drift.

- A failing verify step never affects the drive segment's outcome (they're sibling segments). The aggregate outcome is still worst-child-wins, so a verify-fail will surface in the run summary's HEADLINE and exit code.
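Worst-child-wins aggregation over sibling segments might look like the sketch below. The ResultNode shape, the severity ordering (pass < skip < fail), the store step names, and the exit-code mapping are assumptions made for illustration; SQA's actual Result type is not shown in this ADR.

```typescript
type Outcome = "pass" | "skip" | "fail";

// Assumed ordering: a fail outranks a skip, which outranks a pass.
const severity: Record<Outcome, number> = { pass: 0, skip: 1, fail: 2 };

// Hypothetical result tree: leaves carry an outcome, parents carry children.
interface ResultNode {
  name: string;
  outcome?: Outcome;
  children?: ResultNode[];
}

// A parent's outcome is the most severe outcome among its children.
// A leaf with no recorded outcome is treated as a skip.
function aggregate(node: ResultNode): Outcome {
  const children = node.children ?? [];
  if (children.length === 0) {
    return node.outcome ?? "skip";
  }
  return children
    .map((child) => aggregate(child))
    .reduce<Outcome>(
      (worst, o) => (severity[o] > severity[worst] ? o : worst),
      "pass",
    );
}

const drive: ResultNode = {
  name: "domain-activation",
  children: [{ name: "activate", outcome: "pass" }],
};

const verify: ResultNode = {
  name: "domain-activation-verify",
  children: [
    { name: "mongo", outcome: "pass" },
    { name: "s3", outcome: "fail" },
  ],
};

const run: ResultNode = { name: "snappy", children: [drive, verify] };

// Assumed mapping: only an aggregate fail yields a non-zero exit code.
const exitCode = aggregate(run) === "fail" ? 1 : 0;
```

Here the drive subtree still aggregates to pass while the run as a whole aggregates to fail, which is the separation the behaviour above describes.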

Falsifiability #

- If a future scenario lands a verify segment that simply re-checks the API the drive segment polled, this ADR has failed: verify must observe a different surface (a store, a log stream, a metric).

- If contributors start adding verify steps for stores the system might write to but isn't documented to, this ADR has failed: steps should cite the specific writer in the SUT's source.