ADR-0026 — Cross-store verifier segment pattern
- Status: Accepted
- Date: 2026-05-08
- Deciders: Natan
- Source: PRD-07 ratification. The pattern emerged in src/systems/snappy/domain-activation-verify.ts and needs a doc anchor before a second SUT adopts it.
Context
A synthetic transaction drives the SUT through a workflow and observes the SUT's response (the black-box view, per Google SRE ch. 17). That answers "did the API say yes?" but not "did the side-effects the workflow promised actually land?"
For non-trivial workflows the answers to those two questions diverge. Snappy's domain-activation pipeline writes to seven stores (Mongo, S3, ClickHouse, Loki, Hatchet workflow events, …) during a single POST /api/domains { activate: true }. The API returning status: "active" after polling proves the primary state machine landed; it doesn't prove the audit row was written, the analytics mirror synced, the robots archive uploaded, or the workflow logs were emitted. Each of those is a separate failure surface in production.
Charity Majors et al., Observability Engineering (O'Reilly 2022) ch. 5: "the answer to 'is X working' is rarely a single question". The white-box view fills the gap.
Decision
A scenario that produces non-trivial cross-store side-effects gets two segments, in this order:
- Drive segment: the synthetic transaction, named <scenario-name>.ts. POSTs / polls / DELETEs against the SUT's public API. Captures identifiers (e.g. domainId) into the probe-registry.
- Verify segment: <scenario-name>-verify.ts. Runs after drive settles. Reads <identifier> from the probe-registry and queries each store the workflow was supposed to write to. One step per store. Each step:
  - skips cleanly when the identifier is absent (drive failed)
  - skips cleanly when SQA can't reach that store (in-cluster only, etc.), with the skip reason naming the follow-up
  - passes when the expected side-effect is found
  - fails when the store is reachable but the side-effect is missing (the system said it did the work and didn't)
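The four step outcomes listed above can be sketched as one function. This is a minimal illustration under stated assumptions, not the code in domain-activation-verify.ts: the Outcome, StepResult, Lookup, and StoreCheck types, the verifyStep name, and the synchronous lookup callback are all hypothetical, and a real step would presumably await a store client.

```typescript
// Hypothetical shapes; the real SQA Result type is not shown in this ADR.
type Outcome = "pass" | "fail" | "skip";

interface StepResult {
  store: string;
  outcome: Outcome;
  reason: string;
}

// What a store lookup can report. "unreachable" means SQA cannot reach the store at all.
type Lookup = "found" | "missing" | "unreachable";

interface StoreCheck {
  store: string;
  // Follow-up to name in the skip reason when the store is unreachable.
  followUp: string;
  // Synchronous for brevity (assumption); a real step would await a client call.
  lookup: (id: string) => Lookup;
}

function verifyStep(check: StoreCheck, id: string | undefined): StepResult {
  const { store } = check;
  // 1. Drive never captured the identifier: nothing to look up, so skip.
  if (id === undefined) {
    return { store, outcome: "skip", reason: "identifier absent; drive did not capture it" };
  }
  const seen = check.lookup(id);
  // 2. Store not reachable from here: skip, and name the follow-up.
  if (seen === "unreachable") {
    return { store, outcome: "skip", reason: `store unreachable; follow-up: ${check.followUp}` };
  }
  // 3. Side-effect landed.
  if (seen === "found") {
    return { store, outcome: "pass", reason: "expected side-effect found" };
  }
  // 4. Store reachable but side-effect missing: the only failing case.
  return { store, outcome: "fail", reason: "store reachable but side-effect missing" };
}
```

Keeping "unreachable" distinct from "missing" in the lookup result is what lets a step skip rather than fail when the capability, not the side-effect, is absent.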
The two segments compose at the system's index.ts as siblings of preflight. They share state through the probe-registry, not through function arguments — the registry is the documented seam (ADR-0022).
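The registry seam described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: ProbeRegistry is a stand-in for the ADR-0022 module (whose real API is not shown in this document), and Segment, driveSegment, verifySegment, and runSystem are hypothetical names.

```typescript
// Hypothetical stand-in for the probe-registry (ADR-0022).
class ProbeRegistry {
  private readonly values = new Map<string, string>();
  set(key: string, value: string): void {
    this.values.set(key, value);
  }
  get(key: string): string | undefined {
    return this.values.get(key);
  }
}

// Hypothetical segment shape: each segment only ever receives the registry.
interface Segment {
  name: string;
  run(registry: ProbeRegistry): string;
}

// Drive: talks to the public API (faked by `activate`) and records the identifier.
function driveSegment(activate: () => string | undefined): Segment {
  return {
    name: "domain-activation",
    run(registry) {
      const domainId = activate();
      if (domainId === undefined) return "fail: API did not return a domain id";
      registry.set("domainId", domainId);
      return "pass";
    },
  };
}

// Verify: takes no arguments from drive; it reads the registry and nothing else.
function verifySegment(): Segment {
  return {
    name: "domain-activation-verify",
    run(registry) {
      const domainId = registry.get("domainId");
      return domainId === undefined
        ? "skip: domainId absent"
        : `checked stores for ${domainId}`;
    },
  };
}

// Sibling segments run in declaration order against one shared registry.
function runSystem(segments: Segment[]): Record<string, string> {
  const registry = new ProbeRegistry();
  const report: Record<string, string> = {};
  for (const segment of segments) {
    report[segment.name] = segment.run(registry);
  }
  return report;
}
```

Because verifySegment has no parameter carrying drive's output, a failed drive shows up in verify only as an absent registry key, which is the clean-skip case.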
Why two segments and not one
- Different failure remediations. Drive fail = system bug or probe-config bug (ambiguous; investigate API). Verify fail = system said yes but didn't actually do the work (unambiguous; investigate the specific store's writer). Surfacing them as separate Result subtrees gives operators the taxonomy for free.
- Different reachability profiles. The drive segment hits the SUT's public ingress (always reachable). The verify segment may need direct DB access that is only sometimes reachable (in-cluster, behind a separate ingress, …). Letting verify steps skip independently of drive prevents one missing capability from masking the rest.
- Different evolution rate. New stores get added to a workflow on a different cadence than the workflow's API contract changes. Two files, two diffs.
Why named -verify.ts, not -checks.ts or -side-effects.ts
"Verify" is the role the file plays in a synthetic transaction: drive → observe → verify. The three-word phrasing belongs to the test-pyramid lineage (Cohn, Succeeding with Agile, 2009; Vocke, The Practical Test Pyramid) — Newman uses it as received vocabulary in Building Microservices Ch 10 (in-production testing), which remains the closest prior art SQA inherits from. The role is not new; pinning the suffix matches the glossary's existing vocabulary for scenario phases. See ADR-0032 for why an SQA contract doc is not a Newman-style consumer-driven contract — the words "contract" and "verify" collide with Pact-ecosystem vocabulary and the disambiguation matters.
Consequences
Architectural
- Glossary §segment grows a third example shape: drive segment + verify segment together form one synthetic transaction.
- The probe-registry's role broadens: not just fixture IDs from preflight, but also identifiers from the drive segment.
- A scenario that doesn't cross multiple stores doesn't need a verify segment; the drive segment alone is sufficient. This ADR doesn't mandate the split; it pins the shape when the split is justified.
Behavioural
- A green run with a verify segment shows roughly twice the leaf count of the same scenario without one. Cron noise increases proportionally; that's the cost of catching cross-store drift.
- A failing verify step never affects the drive segment's outcome (they're sibling segments). The aggregate outcome is still worst-child-wins, so a verify-fail will surface in the run summary's HEADLINE and exit code.
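Worst-child-wins aggregation over sibling segments can be sketched as below. The ResultNode shape, the aggregate function, and the severity order (pass, then skip, then fail) are assumptions for illustration; this ADR does not show SQA's actual Result type or state how skips rank.

```typescript
type NodeOutcome = "pass" | "skip" | "fail";

// Hypothetical result tree: leaves carry an outcome, parents carry children.
interface ResultNode {
  name: string;
  outcome?: NodeOutcome;
  children?: ResultNode[];
}

// Assumed severity order: pass < skip < fail.
const severity: Record<NodeOutcome, number> = { pass: 0, skip: 1, fail: 2 };

// A parent's outcome is the most severe outcome among its children.
function aggregate(node: ResultNode): NodeOutcome {
  if (!node.children || node.children.length === 0) {
    // Leaf with no recorded outcome is treated as a skip (assumption).
    return node.outcome ?? "skip";
  }
  let worst: NodeOutcome = "pass";
  for (const child of node.children) {
    const childOutcome = aggregate(child);
    if (severity[childOutcome] > severity[worst]) {
      worst = childOutcome;
    }
  }
  return worst;
}
```

Aggregating each sibling separately keeps the drive subtree green while the run root still reports the verify failure.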
Falsifiability
- If a future scenario lands a verify segment that simply re-checks the API the drive segment polled, this ADR has failed: verify must observe a different surface (a store, a log stream, a metric).
- If contributors start adding verify steps for stores the system might write to but isn't documented to, this ADR has failed: steps should cite the specific writer in the SUT's source.
See also
- PRD-07: the implementation ratifying this ADR. PRD-07 was deleted on 2026-05-08; the live contract is at docs/contracts/snappy/domain-activation.md.
- PRD-06: the broader cross-store verification scope; this ADR lands its first concrete instance.
- ADR-0022: the probe-registry module this ADR reuses.
- Newman, Building Microservices (2nd ed.) ch. 10 (in-production testing + synthetic transactions, the prior art SQA inherits from). Note: the drive → observe → verify phrasing predates Newman; it belongs to the test-pyramid lineage (Cohn 2009).
- Majors et al., Observability Engineering (O'Reilly 2022) ch. 5.
- ADR-0031: the Tier-4 outbox-tail verifier pattern that complements this ADR (added by PRD-14).
— disambiguates this ADR's Newman citation (added by PRD-14).