Skip to content

ELO-Ranked Bid Selection & Escalation Pipeline

Author: Larry (via OpenClaw)
Date: 2026-02-11
Status: Proposal
Audience: Developers, Product Managers


Overview

This system implements an automated ranking and escalation pipeline for incoming project proposals (bids). It uses an Elo rating system—similar to chess rankings—to dynamically score bids against each other based on quality, feasibility, and strategic fit. High-scoring bids are automatically escalated to human reviewers, while low-scoring bids are filtered out.

Core Problem

When the "Bid Factory" generates hundreds of potential project bids per day, human review becomes the bottleneck. We need a way to mathematically sort the "signal" from the "noise" without manually reading every submission.

System Architecture

1. Bid Ingestion & Normalization

Bids arrive from various sources (User input, Agent generated). They are normalized into a standard JSON structure suitable for analysis.

2. Pairwise Comparison Engine

The core logic. An LLM (gemini-2.0-flash) acts as the judge. - It takes two bids (A and B). - It evaluates them on 5 key dimensions: 1. Completeness: Is the plan fully formed? 2. Evidence: Is it backed by data? 3. ROI: Is the return worth the risk? 4. Feasibility: Can we actually build this? 5. Strategic Fit: Does it align with current goals? - It outputs a "Win Probability" for Bid A.

3. Elo Update Worker

A background worker processes the LLM's decision and updates the Elo scores of both bids. - K-Factor: We use a dynamic K-factor. New bids have high K (volatile rating), established bids have low K (stable rating).

4. Escalation Monitor

A specialized service that watches for: - Elite Bids: Elo > 1800 (Top 5%). Immediate Slack/Email alert to Investment Committee. - Promising Bids: Elo > 1500 (Top 50%). Added to the "Weekly Review" queue. - Junk Bids: Elo < 1200. Auto-archived.


Ranking Model

Standard Elo Formula

We use the standard logistic curve for expected score:

$$E_A = \frac{1}{1 + 10^{(R_B - R_A) / 400}}$$

Where: - $E_A$ is the expected score for Bid A. - $R_A$ and $R_B$ are the current ratings.

Update Rule

$$R_A' = R_A + K \cdot (S_A - E_A)$$

Where: - $S_A$ is the actual score (1 for win, 0 for loss, 0.5 for draw). - $K$ is the K-factor (defaults to 32).

Dynamic K-Factor

To quickly identify diamonds in the rough: - If bids_count < 10: $K = 64$ - If bids_count > 10: $K = 32$ - If escalated = True: $K = 16$ (Stability mode)


Database Schema

bids

The central table for all project proposals.

Column Type Description
id UUID Primary Key
title TEXT Project Title
author_id UUID Creator
current_elo INT Default 1500
status ENUM new, ranking, escalated, rejected
metadata JSONB Full bid content

comparisons

Log of all pairwise battles.

Column Type Description
id UUID Primary Key
bid_a_id UUID FK to Bids
bid_b_id UUID FK to Bids
winner_id UUID FK to Bids (or NULL for draw)
score_delta INT Points exchanged
judge_model TEXT LLM used (e.g., gemini-2.0-flash)
reasoning TEXT LLM explanation for the decision

escalation_queue

The priority list for human review.

Column Type Description
id UUID Primary Key
bid_id UUID FK to Bids
reason TEXT "Top 5% Elo" or "High ROI"
priority INT 1 (Urgent) to 5 (Standard)
assigned_to UUID Reviewer ID

API Reference

POST /api/bids/submit

Ingest a new bid for ranking.

{
  "title": "Mars Colony Logistics",
  "content": "A plan to..."
}

GET /api/bids/queue/escalated

Fetch the current top priorities for human review.

[
  {
    "bid_id": "b_999",
    "title": "Mars Colony Logistics",
    "elo": 1850,
    "escalation_reason": "Top 1% Elo Score",
    "link": "/review/b_999"
  }
]

POST /api/admin/force_match

Manually trigger a comparison between two specific bids (for calibration).

{
  "bid_a": "uuid_1",
  "bid_b": "uuid_2"
}


Integration with Notification Systems

The Escalation Monitor connects to external Webhooks: - Slack: Posts to #deal-flow-elite for >1800 Elo. - Email: Weekly digest of >1500 Elo bids. - Dashboard: Real-time leaderboard widget.

Future Enhancements

  • Tournament Mode: Periodically re-rank the top 50 bids against each other to ensure the "King of the Hill" is truly the best.
  • Niche Pools: Separate Elo ladders for different sectors (e.g., "BioTech Elo", "Crypto Elo").

Detailed Implementation Plan

Phase A — Selection Funnel Definition (1 week)

  1. Define funnel cutoffs and policy defaults (20% -> 5% -> 1%).
  2. Define promotion/demotion rules with override controls.
  3. Define confidence requirements for borderline candidates.

Phase B — Ranking Pipeline Integration (2 weeks)

  1. Pull ELO and percentile outputs from proposal-07 stack.
  2. Add domain-fit and verification features to ranking vector.
  3. Compute composite selection score for escalation decisions.

Phase C — Escalation Workflow (2 weeks)

  1. Route top cohort to expert verification and premium refinement.
  2. Track escalation outcomes and review costs.
  3. Add auto-stop for candidates that fail critical post-escalation checks.

Phase D — Outcome Learning Loop (2 weeks)

  1. Ingest real bid outcomes (win/loss/shortlist).
  2. Recalibrate ranking and threshold policies.
  3. Add model drift alerts when ranking precision degrades.

Data model additions

  • selection_funnel_runs
  • selection_scores
  • escalation_events
  • bid_outcome_feedback

Validation checklist

  • Precision@top cohorts
  • Win-rate lift vs non-ranked baseline
  • Cost savings from reduced deep review
  • Stability of thresholds across domains