ELO-Ranked Bid Selection & Escalation Pipeline
Author: Larry (via OpenClaw)
Date: 2026-02-11
Status: Proposal
Audience: Developers, Product Managers
Overview
This system implements an automated ranking and escalation pipeline for incoming project proposals (bids). It uses an Elo rating system—similar to chess rankings—to dynamically score bids against each other based on quality, feasibility, and strategic fit. High-scoring bids are automatically escalated to human reviewers, while low-scoring bids are filtered out.
Core Problem
When the "Bid Factory" generates hundreds of potential project bids per day, human review becomes the bottleneck. We need a way to mathematically sort the "signal" from the "noise" without manually reading every submission.
System Architecture
1. Bid Ingestion & Normalization
Bids arrive from various sources (User input, Agent generated). They are normalized into a standard JSON structure suitable for analysis.
2. Pairwise Comparison Engine
The core logic. An LLM (gemini-2.0-flash) acts as the judge.
- It takes two bids (A and B).
- It evaluates them on 5 key dimensions:
1. Completeness: Is the plan fully formed?
2. Evidence: Is it backed by data?
3. ROI: Is the return worth the risk?
4. Feasibility: Can we actually build this?
5. Strategic Fit: Does it align with current goals?
- It outputs a "Win Probability" for Bid A.
3. Elo Update Worker
A background worker processes the LLM's decision and updates the Elo scores of both bids. - K-Factor: We use a dynamic K-factor. New bids have high K (volatile rating), established bids have low K (stable rating).
4. Escalation Monitor
A specialized service that watches for: - Elite Bids: Elo > 1800 (Top 5%). Immediate Slack/Email alert to Investment Committee. - Promising Bids: Elo > 1500 (Top 50%). Added to the "Weekly Review" queue. - Junk Bids: Elo < 1200. Auto-archived.
Ranking Model
Standard Elo Formula
We use the standard logistic curve for expected score:
$$E_A = \frac{1}{1 + 10^{(R_B - R_A) / 400}}$$
Where: - $E_A$ is the expected score for Bid A. - $R_A$ and $R_B$ are the current ratings.
Update Rule
$$R_A' = R_A + K \cdot (S_A - E_A)$$
Where: - $S_A$ is the actual score (1 for win, 0 for loss, 0.5 for draw). - $K$ is the K-factor (defaults to 32).
Dynamic K-Factor
To quickly identify diamonds in the rough:
- If bids_count < 10: $K = 64$
- If bids_count > 10: $K = 32$
- If escalated = True: $K = 16$ (Stability mode)
Database Schema
bids
The central table for all project proposals.
| Column | Type | Description |
|---|---|---|
id |
UUID | Primary Key |
title |
TEXT | Project Title |
author_id |
UUID | Creator |
current_elo |
INT | Default 1500 |
status |
ENUM | new, ranking, escalated, rejected |
metadata |
JSONB | Full bid content |
comparisons
Log of all pairwise battles.
| Column | Type | Description |
|---|---|---|
id |
UUID | Primary Key |
bid_a_id |
UUID | FK to Bids |
bid_b_id |
UUID | FK to Bids |
winner_id |
UUID | FK to Bids (or NULL for draw) |
score_delta |
INT | Points exchanged |
judge_model |
TEXT | LLM used (e.g., gemini-2.0-flash) |
reasoning |
TEXT | LLM explanation for the decision |
escalation_queue
The priority list for human review.
| Column | Type | Description |
|---|---|---|
id |
UUID | Primary Key |
bid_id |
UUID | FK to Bids |
reason |
TEXT | "Top 5% Elo" or "High ROI" |
priority |
INT | 1 (Urgent) to 5 (Standard) |
assigned_to |
UUID | Reviewer ID |
API Reference
POST /api/bids/submit
Ingest a new bid for ranking.
GET /api/bids/queue/escalated
Fetch the current top priorities for human review.
[
{
"bid_id": "b_999",
"title": "Mars Colony Logistics",
"elo": 1850,
"escalation_reason": "Top 1% Elo Score",
"link": "/review/b_999"
}
]
POST /api/admin/force_match
Manually trigger a comparison between two specific bids (for calibration).
Integration with Notification Systems
The Escalation Monitor connects to external Webhooks:
- Slack: Posts to #deal-flow-elite for >1800 Elo.
- Email: Weekly digest of >1500 Elo bids.
- Dashboard: Real-time leaderboard widget.
Future Enhancements
- Tournament Mode: Periodically re-rank the top 50 bids against each other to ensure the "King of the Hill" is truly the best.
- Niche Pools: Separate Elo ladders for different sectors (e.g., "BioTech Elo", "Crypto Elo").
Detailed Implementation Plan
Phase A — Selection Funnel Definition (1 week)
- Define funnel cutoffs and policy defaults (20% -> 5% -> 1%).
- Define promotion/demotion rules with override controls.
- Define confidence requirements for borderline candidates.
Phase B — Ranking Pipeline Integration (2 weeks)
- Pull ELO and percentile outputs from proposal-07 stack.
- Add domain-fit and verification features to ranking vector.
- Compute composite selection score for escalation decisions.
Phase C — Escalation Workflow (2 weeks)
- Route top cohort to expert verification and premium refinement.
- Track escalation outcomes and review costs.
- Add auto-stop for candidates that fail critical post-escalation checks.
Phase D — Outcome Learning Loop (2 weeks)
- Ingest real bid outcomes (win/loss/shortlist).
- Recalibrate ranking and threshold policies.
- Add model drift alerts when ranking precision degrades.
Data model additions
selection_funnel_runsselection_scoresescalation_eventsbid_outcome_feedback
Validation checklist
- Precision@top cohorts
- Win-rate lift vs non-ranked baseline
- Cost savings from reduced deep review
- Stability of thresholds across domains