SQA Cockpit
The contract
Metaintro Chat’s promise: tell us what you need — we return relevant, quality job results based on your ask, verified daily, no ghost jobs.
The claim SQA tests: The jobs returned by Metaintro Chat answer what the user asked for, judged by an LLM ensemble against the query and the user's profile (claim C8).
This run tested the system against its contract, clause by clause. A single run can only witness some clauses; the rest stay UNKNOWN — never a faked pass.
6 pass · 0 fail · 7 unknown
  • C0 MUST
    Headline promise
    could-not-evaluate — no relevancy score in this run
    UNKNOWN
  • C1 MUST
    User can sign in
    login = pass
    PASS
  • C2 MUST
    User can open a new thread
    open-thread = pass
    PASS
  • C3 SHOULD
    Onboarding gate completes
    onboarding = pass
    PASS
  • C4 SHOULD
    Filters from onboarding don't bias the query
    clear-filters = pass
    PASS
  • C5 MUST
    User-typed query is what the engine sees
    submit-query = pass
    PASS
  • C6 MUST
    The chat returns job cards
    wait-for-jobs = pass
    PASS
  • C7 MUST
    Job cards have required fields
    no 'card-shape' step in this run
    UNKNOWN
  • C8 MUST
    Returned jobs are relevant to the query · HEADLINE
    could-not-evaluate — no relevancy score in this run
    UNKNOWN
  • C9 SHOULD
    All aspects of the query are covered
    could-not-evaluate — no relevancy score in this run
    UNKNOWN
  • C10 SHOULD
    Score holds across reruns
    needs a sweep: a single run cannot witness this clause
    UNKNOWN
  • C11 SHOULD
    Competitive vs LinkedIn / Indeed / Google
    needs a sweep: a single run cannot witness this clause
    UNKNOWN
  • C12 MAY
    Run completes within budget
    needs a sweep: a single run cannot witness this clause
    UNKNOWN
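
The clause list above follows a three-valued rule: a clause passes or fails only when this run contains a step that can witness it, and otherwise stays UNKNOWN. The sketch below illustrates that rule in Python. The `WITNESS` mapping, the function names, and the treatment of any non-pass, non-fail result as UNKNOWN are assumptions made for illustration, not the cockpit's actual implementation.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNKNOWN = "unknown"


# Hypothetical mapping: clause id -> the journey step that can witness it.
# None means the clause needs a relevancy score or a sweep, which a
# single run of journey steps cannot provide.
WITNESS = {
    "C0": None, "C1": "login", "C2": "open-thread", "C3": "onboarding",
    "C4": "clear-filters", "C5": "submit-query", "C6": "wait-for-jobs",
    "C7": "card-shape", "C8": None, "C9": None,
    "C10": None, "C11": None, "C12": None,
}


def evaluate_clause(step, step_results):
    """Return PASS/FAIL only when the witnessing step actually ran."""
    if step is None or step not in step_results:
        return Status.UNKNOWN  # never a faked pass
    result = step_results[step]
    if result == "pass":
        return Status.PASS
    if result == "fail":
        return Status.FAIL
    return Status.UNKNOWN  # e.g. a skipped step witnesses nothing


def tally(step_results):
    """Summarise all clauses in the same form as the report header."""
    counts = Counter(evaluate_clause(s, step_results) for s in WITNESS.values())
    return (f"{counts[Status.PASS]} pass · {counts[Status.FAIL]} fail · "
            f"{counts[Status.UNKNOWN]} unknown")


# Step results as listed in the clause table above.
step_results = {
    "login": "pass", "open-thread": "pass", "onboarding": "pass",
    "clear-filters": "pass", "submit-query": "pass", "wait-for-jobs": "pass",
}
summary = tally(step_results)
```

With the six step results listed in the contract, `summary` matches the header line: 6 pass, 0 fail, 7 unknown.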
TL;DR · 30-second primer
  • Metaintro Chat (SUT) ran 1 run on behalf of 1 seeker (P1, Lena Park).
  • The chat returned 10 jobs. The judge scored them.
  • Result: Job-Seeker Index 0/100. Not relevant; see “Why this verdict” (each gap maps to a claim in the Contract).
  • Compared to 3 competitors (LinkedIn / Indeed / Google) further down.

1 ·THE VERDICT

the answer in one number
30-day JSI history
METAINTRO CHAT · JSI · RUN #3

Not relevant.

Run #3 of metaintro-chat on profile P1 for the query "senior react engineer remote". Job-Seeker Index 0/100.

Verdict WARN. evaluate: OPENROUTER_API_KEY not configured, score not computed; query-coverage: OPENROUTER_API_KEY not configured, coverage not computed; legacy-composite: evaluate=skip, query-coverage=skip (both must be score for composite); baselines: baselines disabled (set captureBaselines: true to enable). All journey steps (login, open-thread, onboarding, clear-filters, submit-query, wait-for-jobs, observe, c1-job-card-shape) passed.

AI synthesis · openai/gpt-4o-mini

The system successfully returned job listings but received a warning due to a configuration issue. Specifically, the OPENROUTER_API_KEY was not configured, which prevented the Job-Seeker Index (JSI) score from being computed. Despite this, the system drove through all necessary steps, returning 10 job listings, including positions like Fullstack Engineer and Senior Frontend React Developer. The total duration of the run was 73.5 seconds.

2 ·WHY THIS VERDICT

ranked by severity
Verdict WARN:
  • evaluate: OPENROUTER_API_KEY not configured, score not computed
  • query-coverage: OPENROUTER_API_KEY not configured, coverage not computed
  • legacy-composite: evaluate=skip, query-coverage=skip (both must be score for composite)
  • baselines: baselines disabled (set captureBaselines: true to enable)
All journey steps (login, open-thread, onboarding, clear-filters, submit-query, wait-for-jobs, observe, c1-job-card-shape) passed.
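
The legacy-composite rule in this verdict (both evaluate and query-coverage must report score before a composite is computed) can be sketched as a small gate. This is a minimal illustration in Python: the function name, the dictionary shape, and the "OK" label for the non-warning case are assumptions, not the cockpit's real code.

```python
def composite_gate(step_status):
    """Decide whether a composite score may be computed.

    step_status maps a step name to its outcome, e.g. "score" or "skip".
    Returns a (verdict, reason) pair. The "OK" verdict label is a
    placeholder chosen for this sketch.
    """
    required = ("evaluate", "query-coverage")
    not_scored = [name for name in required if step_status.get(name) != "score"]
    if not_scored:
        # Report the state of both required steps, as the verdict line does.
        detail = ", ".join(
            f"{name}={step_status.get(name, 'missing')}" for name in required
        )
        return "WARN", f"legacy-composite {detail}; both must be score for composite"
    return "OK", "composite can be computed"


# The situation in this run: neither judge step produced a score.
verdict, reason = composite_gate({"evaluate": "skip", "query-coverage": "skip"})
```

The gate never substitutes a default score for a skipped step, which keeps a missing API key from being reported as a computed result.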

3 ·THE STORY

what went in, what came out

Input

the same input was run against all 4 platforms
Query
senior react engineer remote
Job-seeker profile
Skills
React.js, JavaScript
ESCO S6.0.2
Industry
Software Publishers
NAICS 511210
Location
Remote · United States
ISO US · remote=true
Education
Bachelor or equivalent
ISCED 6

Output · jobs returned

Title | Company | Location | Posted | Link
Fullstack Engineer | our growing team | | | open ↗
AI Engineer (Full-Stack & Applied UI) | Reltio | | | open ↗
Full-Stack Software Engineer | GovWell | | | open ↗
Full Stack Developer | Miratech | | | open ↗
Full-Stack Software Engineer | GovWell | | | open ↗
Senior Frontend React Developer | Global | | | open ↗
Senior Frontend Developer (React) | Capco | | | open ↗
Senior Full-Stack Engineer | Human Agency | | | open ↗
Senior React Native Software Engineer (Javascript) | Bouncy | | | open ↗
Senior Full Stack Engineer | Cobalt AI | | | open ↗

4 ·THE BENCHMARK

vs. LinkedIn, Indeed, Google

Benchmark · 4 platforms × 7 axes

Recognition · Specificity · Reachability · Recency · Match quality · Hostility filter · Salary surface
  • Metaintro (us)
  • LinkedIn
  • Indeed
  • Google

Capability matrix · platforms × axes

Platform | Recognition | Specificity | Reachability | Recency | Match quality | Hostility filter | Salary surface | Overall
Metaintro (us) | | | | | | | | 67
LinkedIn | | | | | | | | 58
Indeed | | | | | | | | 50
Google | | | | | | | | 54

5 ·SESSION RECORDING

watch the probe drive each platform

Session recording

watch what the probe actually saw
No recording available for Metaintro.

6 ·RUN MECHANICS

provenance & reproducibility
Duration
1m 13.5s
Steps
14
Judges
Commit
rerun-real
Started
2026-05-28 14:08 UTC
Trace

Evidence by step

every artifact, link, excerpt, row, metric & recording — grouped by the step that produced it

No evidence recorded for this run.

Evidence integrity

each artifact is SHA-256 hashed at capture — proof it is unmodified

No integrity manifest recorded for this run.
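
For reference, hashing each artifact with SHA-256 at capture and checking it later could look like the sketch below. It uses Python's standard `hashlib`. The manifest shape (artifact name mapped to hex digest), the function names, and the sample artifact are assumptions for illustration; this run recorded no manifest, so none of the values here come from it.

```python
import hashlib


def hash_artifact(data: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(artifacts: dict) -> dict:
    """Hash every artifact at capture time: name -> hex digest."""
    return {name: hash_artifact(data) for name, data in artifacts.items()}


def find_modified(artifacts: dict, manifest: dict) -> list:
    """List artifacts that are missing or whose hash no longer matches."""
    return [
        name
        for name, digest in manifest.items()
        if name not in artifacts or hash_artifact(artifacts[name]) != digest
    ]


# Hypothetical artifact captured during a step.
artifacts = {"step-01.txt": b"login = pass"}
manifest = build_manifest(artifacts)

# Unchanged artifacts verify cleanly; any edit changes the digest.
unchanged = find_modified(artifacts, manifest)
tampered = find_modified({"step-01.txt": b"login = fail"}, manifest)
```

Here `unchanged` is an empty list and `tampered` names the altered artifact. A matching digest shows the bytes are the same as at capture; it says nothing about whether the capture itself was correct.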

7 ·SYSTEM ANATOMY

which component drove the verdict

Every component held — no failure attributed.
