- C0 [MUST, PASS] Headline promise: relevancy score = 70/100 (pass-band ≥ 60)
- C1 [MUST, PASS] User can sign in: login = pass
- C2 [MUST, PASS] User can open a new thread: open-thread = pass
- C3 [SHOULD, PASS] Onboarding gate completes: onboarding = pass
- C4 [SHOULD, PASS] Filters from onboarding don't bias the query: clear-filters = pass
- C5 [MUST, PASS] User-typed query is what the engine sees: submit-query = pass
- C6 [MUST, PASS] The chat returns job cards: wait-for-jobs = pass
- C7 [MUST, UNKNOWN] Job cards have required fields: no 'card-shape' step in this run
- C8 [MUST, PASS] Returned jobs are relevant to the query · HEADLINE: relevancy score = 70/100 (pass-band ≥ 60)
- C9 [SHOULD, PASS] All aspects of the query are covered: relevancy score = 70/100 (pass-band ≥ 60)
- C10 [SHOULD, UNKNOWN] Score holds across reruns: needs a sweep; a single run cannot witness this clause
- C11 [SHOULD, UNKNOWN] Competitive vs LinkedIn / Indeed / Google: needs a sweep; a single run cannot witness this clause
- C12 [MAY, UNKNOWN] Run completes within budget: needs a sweep; a single run cannot witness this clause
- Metaintro Chat (SUT) ran 1 run on behalf of 1 seeker (P1, Lena Park).
- The chat returned 10 jobs. The judge scored them.
- Result: Job-Seeker Index 70/100. Mostly relevant; see “Why this verdict” (each gap maps to a claim in the Contract).
- Compared with 3 competitors (LinkedIn / Indeed / Google) further down.
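The contract clauses listed above can be tallied by requirement level and status. The following is a minimal sketch, not the harness's actual logic: the `Clause` type and the `score_status` and `summarize` helpers are hypothetical names introduced for illustration, and the only values taken from the report are the clause IDs, levels, statuses, the score of 70 and the pass-band of 60. It does not attempt to reproduce how the WARN verdict is derived.

```python
from dataclasses import dataclass

PASS_BAND = 60  # from the report: pass-band >= 60


@dataclass(frozen=True)
class Clause:
    cid: str
    level: str   # "MUST", "SHOULD", or "MAY"
    status: str  # "PASS", "FAIL", or "UNKNOWN"


def score_status(score: int, pass_band: int = PASS_BAND) -> str:
    """Map a 0-100 score to PASS/FAIL against the pass-band (assumed rule)."""
    return "PASS" if score >= pass_band else "FAIL"


def summarize(clauses):
    """Count statuses per requirement level."""
    counts = {}
    for c in clauses:
        per_level = counts.setdefault(c.level, {})
        per_level[c.status] = per_level.get(c.status, 0) + 1
    return counts


# Clause levels and statuses as listed in this run's contract.
clauses = [
    Clause("C0", "MUST", score_status(70)),
    Clause("C1", "MUST", "PASS"),
    Clause("C2", "MUST", "PASS"),
    Clause("C3", "SHOULD", "PASS"),
    Clause("C4", "SHOULD", "PASS"),
    Clause("C5", "MUST", "PASS"),
    Clause("C6", "MUST", "PASS"),
    Clause("C7", "MUST", "UNKNOWN"),
    Clause("C8", "MUST", score_status(70)),
    Clause("C9", "SHOULD", score_status(70)),
    Clause("C10", "SHOULD", "UNKNOWN"),
    Clause("C11", "SHOULD", "UNKNOWN"),
    Clause("C12", "MAY", "UNKNOWN"),
]

summary = summarize(clauses)
for level in ("MUST", "SHOULD", "MAY"):
    print(level, summary.get(level, {}))
```

A tally like this makes it easy to see which UNKNOWN clauses remain unwitnessed by a single run and at which requirement level.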
1 ·THE VERDICT
the answer in one number
Mostly relevant.
Run #25 of metaintro-chat on profile P1 for the query "senior react engineer remote". Job-Seeker Index 70/100.
Verdict WARN: c2-relevancy (outcome score); c3-coverage (outcome score); legacy-composite (outcome score); baselines (disabled; set captureBaselines: true to enable). All journey steps (login, open-thread, onboarding, clear-filters, submit-query, wait-for-jobs, observe, c1-job-card-shape) passed.
The system returned job listings, but the run received a warning for degraded relevancy, scoring 70 out of 100. Ten jobs came back, and not all of them closely matched the query for a senior React engineer role. The results included titles such as Senior Full Stack Developer and Fullstack Engineer, which may not reflect the specific request for React expertise.
2 ·WHY THIS VERDICT
ranked by severity
3 ·THE STORY
what went in, what came out
Input
the same input was run against all 4 platforms
Output · jobs returned
| Title | Company | Location | Posted | Link |
|---|---|---|---|---|
| Job: Senior Full Stack Developer | an AI-powered platform | — | — | open ↗ |
| Job: Fullstack Engineer | RYZ Labs | — | — | open ↗ |
| Job: Staff Full Stack Engineer | Assured | — | — | open ↗ |
| Job: Full-Stack Engineer | Elation | — | — | open ↗ |
| Job: Full Stack Product Engineer | Vanta | — | — | open ↗ |
| Job: Senior Frontend React Developer | Global | — | — | open ↗ |
| Job: Senior Frontend Developer (React) | Capco | — | — | open ↗ |
| Job: Senior Full-Stack Engineer | Human Agency | — | — | open ↗ |
| Job: Senior React Native Software Engineer (Javascript) | Bouncy | — | — | open ↗ |
| Job: Senior Full Stack Engineer | Cobalt AI | — | — | open ↗ |
4 ·THE BENCHMARK
vs. LinkedIn, Indeed, Google
Benchmark · 4 platforms × 7 axes
- Metaintro (us)
- Indeed
Capability matrix · platforms × axes
| Platform | Recognition | Specificity | Reachability | Recency | Match quality | Hostility filter | Salary surface | Overall |
|---|---|---|---|---|---|---|---|---|
| Metaintro (us) | 65 | | | | | | | |
|  | 59 | | | | | | | |
| Indeed | 53 | | | | | | | |
|  | 53 | | | | | | | |
5 ·SESSION RECORDING
watch the probe drive each platform
Session recording
watch what the probe actually saw
6 ·RUN MECHANICS
provenance & reproducibility
rerun-real
Evidence by step
every artifact, link, excerpt, row, metric & recording, grouped by the step that produced it
No evidence recorded for this run.
Evidence integrity
each artifact is SHA-256 hashed at capture as proof it is unmodified
No integrity manifest recorded for this run.
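Hashing each artifact at capture and re-checking it later can be sketched as follows. This is an illustration only, assuming artifacts are available as in-memory bytes keyed by name; the `build_manifest` and `verify` helpers and the artifact names are hypothetical, and the report does not describe the real manifest format. Only `hashlib.sha256` from the Python standard library is relied on.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(artifacts: dict) -> dict:
    """Hash every artifact at capture time: name -> hex digest."""
    return {name: sha256_hex(blob) for name, blob in sorted(artifacts.items())}


def verify(artifacts: dict, manifest: dict) -> list:
    """Return the names whose current hash is missing from or differs from the manifest."""
    return [
        name
        for name, blob in sorted(artifacts.items())
        if manifest.get(name) != sha256_hex(blob)
    ]


# Hypothetical artifacts captured during a run.
captured = {"jobs.json": b'{"jobs": []}', "transcript.txt": b"hello"}
manifest = build_manifest(captured)

# Later: one artifact has been altered after capture.
later = dict(captured, **{"transcript.txt": b"hello!"})
print(verify(captured, manifest))  # unchanged artifacts
print(verify(later, manifest))     # the altered artifact is reported
```

A manifest only proves integrity relative to the moment of capture, so in this sketch it would need to be stored separately from the artifacts it describes.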
7 ·SYSTEM ANATOMY
which component drove the verdict
Every component held; no failure attributed.