The five agents

Each market goes through a structured debate. Three sub-researchers run in parallel with isolated context: they don't see each other's drafts. A Supervisor merges their drafts with a weighted-Bayesian rule and must commit to a falsifiable claim. A Critic audits the result across six rigor dimensions; if any dimension scores below 0.4, the Supervisor re-runs once with the Critic's feedback inlined. Receipts that fail the audit on the second pass never reach the chain.

Pipeline

     scanner — Polymarket Gamma poll
          │
          ▼
   ┌──────┼──────┐
   ▼      ▼      ▼
 [Bull]  [Bear]  [Edge]       ← parallel, isolated context, ~3s per stance
   │      │      │
   └──────┼──────┘
          ▼
     [Supervisor]              ← weighted-Bayesian merge, mandates falsifiable claim,
          │                      consumes calibration_prior from past Brier
          ▼
       [Critic]                ← 6-dim audit; verdict ∈ {approved, needs_revision, rejected}
          │
     ┌────┴────┬──────────┐
     │ needs_  │ approved │ rejected
     │revision │          │
     ▼         ▼          ▼
   Supervisor  emit on    SKIP — no on-chain commit,
   re-runs    Arc V2 +    no calibration noise
   once       Irys upload
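
The flow in the diagram can be sketched as a small asyncio orchestrator. This is a minimal illustration under stated assumptions, not the project's code: `run_stance`, `run_supervisor`, `run_critic`, and `emit` are hypothetical callables supplied by the caller, and treating a second non-approved verdict as a skip follows the rule that receipts failing the second pass never reach the chain.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

STANCES = ("bull", "bear", "edge")


async def debate(
    market: str,
    run_stance: Callable[[str, str], Awaitable[dict]],
    run_supervisor: Callable[[list, Optional[dict]], Awaitable[dict]],
    run_critic: Callable[[dict], Awaitable[dict]],
    emit: Callable[[dict], Awaitable[Any]],
) -> str:
    """Run one market through the debate. All four callables are hypothetical."""
    # Each stance receives only the market prompt, so contexts stay isolated.
    drafts = await asyncio.gather(*(run_stance(s, market) for s in STANCES))

    feedback = None
    for _attempt in range(2):  # first pass plus at most one revision
        merged = await run_supervisor(list(drafts), feedback)
        audit = await run_critic(merged)
        if audit["verdict"] == "approved":
            await emit(merged)
            return "emitted"
        if audit["verdict"] == "rejected":
            return "skipped"
        feedback = audit  # needs_revision: inline the critic's feedback
    return "skipped"  # still not approved after the single re-run
```

Passing the agents in as callables keeps the control flow testable without any model calls; the real system presumably wires in Gemini-backed implementations.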

Agent cards

Bull

Argue YES — strongest defensible case

Model: Gemini 3.1 Pro Preview (Vertex AI, global region)
Grounding: Google Search at request time
Context isolation: Sees only the market prompt — never Bear or Edge's drafts
Output: probability_estimate ≥ 0.55, key factors, ≥ 2 cited evidence URLs

Bear

Argue NO — strongest defensible case

Model: Gemini 3.1 Pro Preview (Vertex AI, global region)
Grounding: Google Search at request time
Context isolation: Same — opposite advocate, independent context
Output: probability_estimate ≤ 0.45, key factors, ≥ 2 cited evidence URLs

Edge

Surface tail risks both partisans miss

Model: Gemini 3.1 Pro Preview (Vertex AI, global region)
Grounding: Google Search at request time
Context isolation: Same — adversarial-to-conventional-wisdom, independent context
Output: Tail-risk factors, structural assumptions, ≥ 1 historical analog
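
The output constraints on the three stance cards are simple enough to check mechanically. The sketch below is one way to do that; `probability_estimate` comes from the cards, while the field names `evidence_urls` and `historical_analogs` are assumptions made for illustration.

```python
from typing import List


def check_stance(stance: str, draft: dict) -> List[str]:
    """Return constraint violations for a draft (an empty list means it passes)."""
    problems = []
    p = draft.get("probability_estimate")
    urls = draft.get("evidence_urls", [])          # assumed field name
    analogs = draft.get("historical_analogs", [])  # assumed field name

    if stance in ("bull", "bear"):
        if len(urls) < 2:
            problems.append(f"{stance} needs at least 2 cited evidence URLs")
        if p is None:
            problems.append(f"{stance} is missing probability_estimate")
        elif stance == "bull" and p < 0.55:
            problems.append("bull needs probability_estimate >= 0.55")
        elif stance == "bear" and p > 0.45:
            problems.append("bear needs probability_estimate <= 0.45")
    elif stance == "edge":
        if len(analogs) < 1:
            problems.append("edge needs at least 1 historical analog")
    else:
        raise ValueError(f"unknown stance: {stance}")
    return problems
```

For example, a Bear draft that reports 0.5 with a single URL would come back with two violations, one for the probability bound and one for the evidence count.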

Supervisor

Weighted-Bayesian merge of three stances

Model: Gemini 3.1 Pro Preview, low temperature (0.2)
Grounding: None — synthesises drafts, no fresh search
Context isolation: Reads all three drafts. Cannot reach back to a stance for clarification.
Output: final probability + confidence, stance weights ∈ [0.1, 0.7] summing to 1.0, disagreement_pp, mandatory ≥ 1 falsifiable claim with checkable_by date, calibration_prior_used
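
The card does not spell out the merge formula, so the following is only a sketch of one plausible reading: weighted pooling in log-odds space, with the weight bounds from the card enforced up front. It assumes every stance contributes a probability (the Edge card does not state one) and that `disagreement_pp` is the max-minus-min spread in percentage points; both are assumptions, as is the function name `merge`.

```python
import math


def merge(probs: dict, weights: dict) -> dict:
    """Pool stance probabilities in log-odds space (assumed formula)."""
    if set(probs) != set(weights):
        raise ValueError("probs and weights must cover the same stances")
    if any(not 0.0 < p < 1.0 for p in probs.values()):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    if any(not 0.1 <= w <= 0.7 for w in weights.values()):
        raise ValueError("each stance weight must lie in [0.1, 0.7]")
    if abs(sum(weights.values()) - 1.0) > 1e-6:
        raise ValueError("stance weights must sum to 1.0")

    # Weighted sum of log-odds, mapped back to a probability.
    logit = sum(weights[k] * math.log(probs[k] / (1.0 - probs[k])) for k in probs)
    final = 1.0 / (1.0 + math.exp(-logit))

    # Assumed definition: spread between the most and least bullish stance.
    spread_pp = (max(probs.values()) - min(probs.values())) * 100.0
    return {"probability": final, "disagreement_pp": spread_pp}
```

With symmetric inputs such as Bull 0.7, Bear 0.3, Edge 0.5 and weights 0.4, 0.4, 0.2, the log-odds cancel and the pooled probability is 0.5, with a spread of about 40 percentage points.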

Critic

Audit the merged trace across 6 rigor dimensions

Model: Gemini 3 Flash Preview (smaller, faster, cheaper)
Grounding: None — reads only the trace under audit
Context isolation: Single pass. Returns verdict: approved / needs_revision / rejected.
Output: Per-dimension score in [0, 1]: evidence_relevance, falsifiability, scope, coherence, exploration_integrity, methodology. A rule applied to the scores overrides the model's self-reported verdict.
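
A rule that takes precedence over the model's own verdict might look like the sketch below. The six dimension names and the 0.4 threshold come from this page; the exact precedence (a rejection is never upgraded, and any low score blocks approval) is an assumption, as is the function name `rule_verdict`.

```python
DIMENSIONS = (
    "evidence_relevance",
    "falsifiability",
    "scope",
    "coherence",
    "exploration_integrity",
    "methodology",
)
THRESHOLD = 0.4


def rule_verdict(scores: dict, self_reported: str) -> str:
    """Derive the final verdict from per-dimension scores (assumed precedence)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    if self_reported == "rejected":
        return "rejected"  # assumption: the rule never upgrades a rejection
    if any(scores[d] < THRESHOLD for d in DIMENSIONS):
        return "needs_revision"  # a low score overrides a self-reported approval
    return self_reported
```

The point of applying the rule outside the model is that a trace cannot be approved simply because the model says so while one of its own scores sits below the threshold.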

Multi-model fallback

Every Gemini call routes through a fallback chain. When the primary model returns HTTP 429 (usually Pro Preview hitting the free-tier quota mid-tick), the wrapper transparently retries with the next model in the chain. The fallback has fired hundreds of times in production today, keeping the loop emitting receipts without operator intervention.

Stance + Supervisor: gemini-3.1-pro-preview → gemini-3-flash-preview → gemini-2.5-flash
Critic:              gemini-3-flash-preview → gemini-2.5-flash → gemini-2.5-flash-lite
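
The chains above can be walked with a small wrapper like the one below. It is a sketch under assumptions: `RateLimited` is a hypothetical exception standing in for whatever the real client raises on HTTP 429, and `call` is a hypothetical callable that takes a model name and a prompt.

```python
from typing import Any, Callable, Sequence, Tuple

STANCE_CHAIN = ("gemini-3.1-pro-preview", "gemini-3-flash-preview", "gemini-2.5-flash")
CRITIC_CHAIN = ("gemini-3-flash-preview", "gemini-2.5-flash", "gemini-2.5-flash-lite")


class RateLimited(Exception):
    """Hypothetical error raised by the client wrapper on HTTP 429."""


def call_with_fallback(
    chain: Sequence[str],
    call: Callable[[str, str], Any],
    prompt: str,
) -> Tuple[str, Any]:
    """Try each model in order; return (model_used, response) from the first success."""
    last_error = None
    for model in chain:
        try:
            return model, call(model, prompt)
        except RateLimited as err:
            last_error = err  # quota hit: fall through to the next model
    raise RuntimeError("every model in the chain was rate limited") from last_error
```

Returning the model name alongside the response makes it possible to record which model actually produced a receipt. Only rate-limit errors trigger the fallback here; any other exception propagates to the caller.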

Watch them debate in real time

The home page has a live SSE feed. The v3 pill on a row tells you the receipt came out of a 5-agent debate; hover over it to see the Bull/Bear/Edge disagreement in percentage points. Click any v3 row to see the full ensemble panel, the critic radar, and the falsifiable claims the Supervisor committed to.