Snappy’s promise: crawl and structure the web into a reliable, high-quality corpus — complete, fresh, and on-target for what was asked.
The claim SQA tests (claim C0): the jobs the metaintro-chat search engine returned in the chat thread are relevant to what the user asked for.
This run tested the system against its contract, clause by clause. A single run can only witness some clauses; the rest stay UNKNOWN — never a faked pass.
1 pass · 0 fail · 7 unknown
C0 · MUST · Headline promise · relevancy score = 73/100 (pass-band ≥ 60) · PASS
C1 · MUST · User can sign in · no 'login' step in this run · UNKNOWN
C2 · MUST · User can open a new thread · no 'open-thread' step in this run · UNKNOWN
C3 · SHOULD · Onboarding gate completes · no 'onboarding' step in this run · UNKNOWN
C4 · SHOULD · Filters from onboarding don't bias the query · no 'clear-filters' step in this run · UNKNOWN
C5 · MUST · User-typed query is what the engine sees · no 'submit-query' step in this run · UNKNOWN
C10 · SHOULD · Score holds across reruns · needs a sweep; a single run cannot witness this clause · UNKNOWN
C12 · MAY · Run completes within budget · needs a sweep; a single run cannot witness this clause · UNKNOWN
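The three-valued verdict used in the contract above can be sketched as follows. This is a minimal illustration, not SQA's actual implementation: the `Clause` type, the `verdict` function, and the idea of storing each clause's evidence as one optional numeric score are assumptions made for the example. Only the pass-band (≥ 60) and the C0 score (73) come from the report.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

PASS_BAND = 60  # from the report: pass-band >= 60


@dataclass
class Clause:
    clause_id: str
    level: str            # MUST / SHOULD / MAY
    score: Optional[int]  # hypothetical: None means the run produced no evidence


def verdict(clause: Clause) -> str:
    """A clause with no evidence is UNKNOWN, never a pass."""
    if clause.score is None:
        return "UNKNOWN"
    return "PASS" if clause.score >= PASS_BAND else "FAIL"


# The eight clauses listed in the contract; only C0 was witnessed in this run.
clauses = [
    Clause("C0", "MUST", 73),
    Clause("C1", "MUST", None),
    Clause("C2", "MUST", None),
    Clause("C3", "SHOULD", None),
    Clause("C4", "SHOULD", None),
    Clause("C5", "MUST", None),
    Clause("C10", "SHOULD", None),
    Clause("C12", "MAY", None),
]

tally = Counter(verdict(c) for c in clauses)
print(f"{tally['PASS']} pass · {tally['FAIL']} fail · {tally['UNKNOWN']} unknown")
```

Treating missing evidence as its own state, rather than defaulting it to pass or fail, is what keeps the tally honest when a single run exercises only part of the contract.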
TL;DR · 30-second primer
· Snappy (SUT) completed 1 run on profile corpus-f500.
· Result: Corpus Quality Index 73/100, rated Strong; see “Why this verdict” (each gap maps to a claim in the Contract).
1 · THE VERDICT
the answer in one number
30-day CQI history
SNAPPY · DOMAIN ACTIVATION · RUN #8
Strong.
Run #8 of snappy on profile corpus-f500 for the query "CQI sweep — domain-activation". Corpus Quality Index 73/100.
Verdict WARN: C5 Freshness-checker cron stalled — 12% of corpus returned snapshots >30d old.
AI synthesis · openai/gpt-4o-mini
The system succeeded but with a degradation, achieving a Corpus Quality Index (CQI) score of 73 out of 100. The primary issue was a stall in the freshness-checker cron, which resulted in 12% of the corpus returning snapshots older than 30 days. Overall, the domain-activation process ran effectively, but the freshness of the data was compromised.
2 · WHY THIS VERDICT
ranked by severity
HARD
Freshness-checker cron stalled — 12% of corpus returned snapshots >30d old
Expected
<3% of corpus older than 30 days at activation time
Observed
609 of 5,000 orgs served snapshots aged 31–47 days; freshness-checker had not advanced its cursor in ~26h
Why it matters
Customers querying the registry get stale headcount, funding, and career-page data — the corpus silently degrades while CQI on individual orgs still reads GREEN.
Recommended action · 1 sprint
Restart freshness-checker workflow; add a watchdog alert on cursor-staleness > 4h.
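The two checks behind this finding can be sketched as below: the stale-share threshold (expected <3%, observed 609 of 5,000) and the proposed cursor-staleness watchdog (alert above 4h). The function names and the way the cursor timestamp is obtained are hypothetical. The report does not describe how the freshness-checker exposes its cursor, so this is an illustration under those assumptions, not the actual alerting code.

```python
from datetime import datetime, timedelta, timezone

MAX_STALE_SHARE = 0.03               # report: <3% of corpus older than 30 days
MAX_CURSOR_AGE = timedelta(hours=4)  # report: alert on cursor-staleness > 4h


def stale_share(stale_orgs: int, total_orgs: int) -> float:
    """Fraction of the corpus serving snapshots older than the freshness limit."""
    return stale_orgs / total_orgs


def cursor_is_stale(last_advanced: datetime, now: datetime) -> bool:
    """True when the cursor has not advanced within the allowed window."""
    return now - last_advanced > MAX_CURSOR_AGE


# Figures from the Observed line: 609 of 5,000 orgs.
share = stale_share(609, 5_000)
print(f"stale share {share:.1%}, breach: {share >= MAX_STALE_SHARE}")

# Hypothetical cursor reading: last advanced about 26h ago, as observed in this run.
now = datetime.now(timezone.utc)
last_advanced = now - timedelta(hours=26)
print("alert:", cursor_is_stale(last_advanced, now))
```

With a 4h limit, a stall like the one in this run would have raised an alert roughly 22 hours before the ~26h mark at which it was observed.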