The contract

Snappy’s promise: crawl and structure the web into a reliable, high-quality corpus — complete, fresh, and on-target for what was asked.

The claim SQA tests (claim C0): the jobs the metaintro-chat search engine returned in the chat thread are relevant to what the user asked for.

This run tested the system against its contract, clause by clause. A single run can only witness some clauses; the rest stay UNKNOWN — never a faked pass.

1 pass · 0 fail · 7 unknown

Clause · Level · Promise · Evidence · Status
C0 · MUST · Headline promise · relevancy score = 78/100 (pass-band ≥ 60) · PASS
C1 · MUST · User can sign in · no 'login' step in this run · UNKNOWN
C2 · MUST · User can open a new thread · no 'open-thread' step in this run · UNKNOWN
C3 · SHOULD · Onboarding gate completes · no 'onboarding' step in this run · UNKNOWN
C4 · SHOULD · Filters from onboarding don't bias the query · no 'clear-filters' step in this run · UNKNOWN
C5 · MUST · User-typed query is what the engine sees · no 'submit-query' step in this run · UNKNOWN
C10 · SHOULD · Score holds across reruns · needs a sweep: a single run cannot witness this clause · UNKNOWN
C12 · MAY · Run completes within budget · needs a sweep: a single run cannot witness this clause · UNKNOWN
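
The tally above follows a three-valued rule: a clause can only PASS or FAIL when the run contains the step that witnesses it, and otherwise it stays UNKNOWN. A minimal sketch in Python, assuming a hypothetical `Clause` record, `evaluate` helper, and a `score` step name (none of these are confirmed parts of Snappy or SQA; only the pass-band of 60 and the `login` step name come from the report):

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Clause:
    cid: str                       # e.g. "C0"
    level: str                     # MUST / SHOULD / MAY
    step: Optional[str]            # witnessing step; None means "needs a sweep"
    check: Callable[[dict], bool]  # applied to the step's recorded result


def evaluate(clause: Clause, steps: dict) -> str:
    """Return PASS or FAIL only if the witnessing step ran; otherwise UNKNOWN."""
    if clause.step is None or clause.step not in steps:
        return "UNKNOWN"
    return "PASS" if clause.check(steps[clause.step]) else "FAIL"


# Hypothetical subset of the contract; "score" is an assumed step name.
clauses = [
    Clause("C0", "MUST", "score", lambda r: r["score"] >= 60),
    Clause("C1", "MUST", "login", lambda r: r["ok"]),
    Clause("C10", "SHOULD", None, lambda r: True),  # cannot be witnessed by one run
]

steps = {"score": {"score": 78}}  # the only step this run recorded
verdicts = {c.cid: evaluate(c, steps) for c in clauses}
tally = Counter(verdicts.values())
```

The design point is that a missing step never defaults to PASS: absence of evidence is reported as UNKNOWN, which is why this run shows seven unknown clauses rather than seven passes.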

TL;DR · 30-second primer

·Snappy (SUT) ran 1 run on profile corpus-f500.
·Result: Corpus Quality Index 78/100. Strong; see “Why this verdict” (each gap maps to a claim in the Contract).

1 ·THE VERDICT

the answer in one number

30-day CQI history

SNAPPY · DOMAIN ACTIVATION · RUN #4

Strong.

Run #4 of snappy on profile corpus-f500 for the query "CQI sweep — domain-activation". Corpus Quality Index 78/100.

Verdict PASS: every step completed cleanly; nothing pulled the verdict down.

AI synthesis · openai/gpt-4o-mini

The system successfully completed its job with a Corpus Quality Index (CQI) score of 78 out of 100. While the overall outcome was a pass, it was noted that 21% of activated organizations fell into the yellow band, indicating some areas for improvement. The run took a total of 184.2 seconds to complete.

2 ·WHY THIS VERDICT

ranked by severity

SOFT

21% of activated orgs sit in the YELLOW band (CQI 50–69)

Expected

≥85% of activated organizations reach GREEN (CQI ≥70)

Observed

1,043 of 5,000 F500-corpus orgs landed YELLOW; most are missing the registration-number and headcount fields

Why it matters

YELLOW-band orgs sit just above the CLASSIFY_INDUSTRY_CQI_FLOOR (50): one missing field can flip them to RED and out of downstream classification.

Recommended action· 1 sprint

Raise LLM enrichment ceiling for the headcount logic module; backfill registration-number from the secondary registry source.
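
The band arithmetic behind this finding can be checked in a few lines. The sketch below uses the thresholds stated in the finding (GREEN at CQI ≥ 70, YELLOW at 50–69) and assumes RED means anything below the floor of 50. The `band` function and variable names are illustrative, not Snappy's actual code.

```python
CQI_FLOOR = 50      # CLASSIFY_INDUSTRY_CQI_FLOOR from the finding
GREEN_MIN = 70      # lower edge of the GREEN band
GREEN_TARGET = 0.85 # expected share of activated orgs in GREEN


def band(cqi: float) -> str:
    """Map a CQI score to its band (RED below the floor is an assumption)."""
    if cqi >= GREEN_MIN:
        return "GREEN"
    if cqi >= CQI_FLOOR:
        return "YELLOW"
    return "RED"


yellow, total = 1043, 5000
yellow_share = yellow / total        # 0.2086, reported as 21%

# Upper bound on the GREEN share: it assumes no org landed RED.
max_green_share = 1 - yellow_share   # at most about 79.1%
meets_target = max_green_share >= GREEN_TARGET
```

Even under the most favorable assumption (no RED orgs), the GREEN share stays below the 85% target, which is why the gap is listed as a SOFT finding.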

3 ·THE STORY

what went in, what came out

Input

what the probe sent in

Query

Skills

(no skill inferred)

ESCO —

Industry

Computer Systems Design

NAICS 541512

Location

United States

ISO US

Education

Bachelor or equivalent

ISCED 6

5 ·SESSION RECORDING

watch what the probe saw

No recording available for Metaintro.

6 ·RUN MECHANICS

provenance & reproducibility

Duration

3m 4.2s

Steps

Judges

—

Commit

demo-seed

Started

2026-04-29 13:00 UTC

Trace

Evidence by step

every artifact, link, excerpt, row, metric & recording — grouped by the step that produced it

No evidence recorded for this run.

Evidence integrity

each artifact is SHA-256 hashed at capture — proof it is unmodified

No integrity manifest recorded for this run.
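
For context, hashing each artifact at capture and re-checking it later might look like the sketch below. It uses Python's standard `hashlib`. The manifest layout (artifact name mapped to hex digest), the helper names, and the sample artifact path are assumptions for illustration, not the format this tool records.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(artifacts: dict) -> dict:
    """Hash every artifact at capture time: name -> SHA-256 hex digest."""
    return {name: sha256_hex(blob) for name, blob in artifacts.items()}


def verify(artifacts: dict, manifest: dict) -> list:
    """Return names whose current hash is missing from or differs from the manifest."""
    return [
        name
        for name, blob in artifacts.items()
        if manifest.get(name) != sha256_hex(blob)
    ]


# Hypothetical artifact captured by a step.
captured = {"step-1/response.json": b'{"jobs": []}'}
manifest = build_manifest(captured)

# Any later change to the bytes produces a different digest.
tampered = {"step-1/response.json": b'{"jobs": [1]}'}
modified = verify(tampered, manifest)
```

An empty result from `verify` means every artifact still matches the digest recorded at capture. A manifest like this only proves integrity if it is itself stored where the artifacts' editors cannot rewrite it.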

7 ·SYSTEM ANATOMY

which component drove the verdict

Every component held — no failure attributed.
