Monte Carlo Plan Success Probability Engine
Author: PlanExe Team
Date: 2026-02-11
Status: Proposal
Audience: Data Scientists, Project Managers
Overview
The Monte Carlo Plan Success Engine moves planning from deterministic "best guesses" to probabilistic distributions. By running 10,000 stochastic simulations of the project schedule and budget, it provides a mathematical confidence level for success (e.g., "There is a 12% chance of finishing by Q3").
It replaces single-point estimates (e.g., "This task takes 5 days") with probability distributions (e.g., "Between 3 and 10 days, most likely 5").
Core Problem
"The Flaw of Averages": The average outcome of a project is worse than the plan constructed from average inputs, due to non-linear dependencies (Jensen's Inequality). Deterministic plans systematically underestimate risk.
System Architecture
1. Distribution Modeler
Converts scalar plan inputs into statistical distributions.
- Tasks: Beta-PERT distribution, with mean $(O + 4M + P)/6$, for durations.
- Costs: Lognormal distribution (right-skewed) for budget items.
- Risks: Bernoulli trials with probability $p$ and impact distributions $I$.
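For concreteness, the modeled inputs could be carried in simple dataclasses; the field names below mirror those used by the simulation code later in this document, though the exact contract is an assumption of this sketch.

from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    min: float   # optimistic duration (days)
    mode: float  # most likely duration (days)
    max: float   # pessimistic duration (days)

@dataclass
class Risk:
    probability: float  # Bernoulli trial parameter p
    delay_days: float   # schedule impact if the risk fires
    impacts: list[str] = field(default_factory=list)  # affected task ids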
2. Simulation Loop (The Engine)
Runs $N$ iterations (default 10,000). For each iteration $i$:
1. Sample a duration $d_{i,t}$ for every task $t$.
2. Sample a cost $c_{i,b}$ for every budget line $b$.
3. Trigger risk events according to their probabilities.
4. Recompute the critical path $CP_i$ and total cost $TC_i$.
5. Store the pair $(CP_i, TC_i)$.
3. Analytics Service
Aggregates the simulation results into interpretable metrics (P10, P50, P90).
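A minimal sketch of the aggregation step (one plausible shape for the aggregate_stats helper referenced in the simulation code below):

import numpy as np

def aggregate_stats(results):
    """Collapse per-iteration outcomes into percentile metrics."""
    arr = np.asarray(results, dtype=float)
    return {
        "p10": float(np.percentile(arr, 10)),
        "p50": float(np.percentile(arr, 50)),
        "p90": float(np.percentile(arr, 90)),
        "mean": float(arr.mean()),
    }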
Technical Specifications
Input Distributions
Most task durations follow a Beta-PERT distribution, where $O$, $M$, and $P$ are the optimistic, most likely, and pessimistic estimates: $$E = \frac{O + 4M + P}{6}$$ $$\operatorname{Var} = \left(\frac{P - O}{6}\right)^2$$
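Sampling is commonly done via a scaled Beta distribution whose shape parameters are derived from the mode (the standard λ = 4 PERT parameterization). The sample_pert helper used in the algorithm below is assumed to look roughly like this:

import numpy as np

rng = np.random.default_rng()

def sample_pert(o, m, p, lam=4.0):
    """Draw one duration from a Beta-PERT(o, m, p) distribution."""
    if p == o:  # degenerate case: no uncertainty
        return float(o)
    alpha = 1 + lam * (m - o) / (p - o)
    beta = 1 + lam * (p - m) / (p - o)
    return o + rng.beta(alpha, beta) * (p - o)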
Cost inputs often follow a Lognormal distribution: $$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$$ This reflects the reality that costs can explode (10x overrun) but rarely shrink below a floor.
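In practice, lognormal parameters are easier to elicit as a median and a pessimistic (P90) estimate than as $\mu$ and $\sigma$ directly; a sketch of that conversion (1.2816 is the standard normal 90th-percentile z-score):

import numpy as np

rng = np.random.default_rng()

def sample_lognormal_cost(median, p90):
    """Draw one cost from a lognormal elicited as (median, P90)."""
    mu = np.log(median)
    sigma = (np.log(p90) - mu) / 1.2816  # z-score of the 90th percentile
    return rng.lognormal(mean=mu, sigma=sigma)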
The Simulation Algorithm
import random

def run_simulation(plan, n_iterations=10_000):
    """Run n_iterations stochastic scenarios and aggregate the outcomes."""
    results = []
    for _ in range(n_iterations):
        scenario = {}

        # 1. Sample a duration for every task from its Beta-PERT distribution.
        for task in plan.tasks:
            scenario[task.id] = sample_pert(task.min, task.mode, task.max)

        # 2. Trigger risk events: each risk fires with probability
        #    risk.probability and delays every task it impacts.
        for risk in plan.risks:
            if random.random() < risk.probability:
                for target in risk.impacts:
                    scenario[target] += risk.delay_days

        # 3. Solve the critical path for this scenario's sampled durations.
        finish_date = solve_cpm(plan.dependencies, scenario)
        results.append(finish_date)

    return aggregate_stats(results)
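The solve_cpm helper is the standard longest-path computation over the task DAG. A minimal sketch, assuming dependencies maps each task id to the list of its predecessors:

from graphlib import TopologicalSorter

def solve_cpm(dependencies, durations):
    """Return the project finish time (longest path through the task DAG).

    dependencies: {task_id: [predecessor_ids]}, durations: {task_id: days}.
    """
    finish = {}
    for task in TopologicalSorter(dependencies).static_order():
        start = max((finish[p] for p in dependencies.get(task, [])), default=0.0)
        finish[task] = start + durations[task]
    return max(finish.values())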
Output Analysis (Sensitivity)
Tornado Charts
We calculate the correlation coefficient between each task's sampled duration and the total project duration.
- High correlation: the task is a "Driver"; focusing on it reduces overall variance.
- Low correlation: the task has "Slack"; delays here are unlikely to hurt the project.
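A sketch of the driver ranking, assuming per-iteration task durations and project totals are retained as NumPy arrays:

import numpy as np

def rank_drivers(task_durations, project_totals):
    """Rank tasks by |correlation| with the total project duration.

    task_durations: {task_id: array of shape (n_iterations,)}
    project_totals: array of shape (n_iterations,)
    """
    corr = {
        task_id: float(np.corrcoef(d, project_totals)[0, 1])
        for task_id, d in task_durations.items()
    }
    return sorted(corr.items(), key=lambda kv: abs(kv[1]), reverse=True)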
Probabilistic S-Curves
A plot of cumulative probability vs. time/cost.
- P50 (Median): the "coin flip" outcome.
- P80 (Commitment): the recommended internal deadline.
- P95 (Safe Bet): the recommended external promise.
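The S-curve is simply the empirical CDF of the simulated outcomes, and the headline success probability (e.g., the 0.42 in the API example below) falls out of the same array:

import numpy as np

def success_probability(finish_times, deadline):
    """Fraction of iterations finishing on or before the deadline."""
    arr = np.asarray(finish_times, dtype=float)
    return float((arr <= deadline).mean())

def s_curve(finish_times):
    """Empirical CDF points: sorted outcomes vs. cumulative probability."""
    arr = np.sort(np.asarray(finish_times, dtype=float))
    return arr, np.arange(1, len(arr) + 1) / len(arr)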
Integration
Assumption Drift Monitor
When drift is detected (e.g., "Steel prices up 10%"), the Monte Carlo engine automatically re-simulates using the new actuals as the baseline, updating the percentile estimates (P10/P50/P90) in near real time.
API Reference
POST /api/simulate/plan/{id}
Trigger a full 10k run.
GET /api/simulate/results/{id}
Get the distribution data.
{
  "p10_date": "2026-06-01",
  "p50_date": "2026-07-15",
  "p90_date": "2026-09-01",
  "success_prob_deadline": 0.42
}

success_prob_deadline is the probability of finishing by the plan's stated deadline (here, a 42% chance).
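A hypothetical client interaction with the two endpoints above, using the requests library (the host, plan id, and synchronous flow are placeholders; long runs would normally be queued, per the operational safeguards below):

import requests

BASE = "https://planexe.example.com"  # placeholder host

# Trigger a full run, then fetch the aggregated results.
requests.post(f"{BASE}/api/simulate/plan/123", timeout=30).raise_for_status()

resp = requests.get(f"{BASE}/api/simulate/results/123", timeout=30)
resp.raise_for_status()
print(resp.json()["success_prob_deadline"])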
Detailed Implementation Plan
Phase 1: Simulation Foundation (2–3 weeks)
- Define normalized simulation input contract:
  - task duration distributions (optimistic / most likely / pessimistic)
  - cost uncertainty distributions per CBS line item
  - risk event probabilities and impact models
- Build Monte Carlo core package:
  - deterministic seed support for reproducibility
  - vectorized simulation runner (NumPy/JAX-compatible)
  - scenario persistence for post-run diagnostics
- Add baseline output artifacts:
  - percentile summaries (P10/P50/P90)
  - success probability against target date/cost
  - critical-path frequency map (see the sketch after this list)
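The critical-path frequency map can be a straightforward count over iterations; a sketch, assuming the CPM solver can also report which tasks were critical in each run:

from collections import Counter

def critical_path_frequency(critical_tasks_per_iteration):
    """Fraction of iterations in which each task sat on the critical path.

    critical_tasks_per_iteration: list of task-id lists, one per iteration.
    """
    counts = Counter()
    for tasks in critical_tasks_per_iteration:
        counts.update(tasks)
    n = len(critical_tasks_per_iteration)
    return {task: c / n for task, c in counts.items()}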
Phase 2: Pipeline Integration (2 weeks)
- Add a post-planning stage, simulate_plan_success, in run_plan_pipeline.py that consumes the WBS, CBS, and risk register.
- Persist results in structured storage:
  - run-level summary table
  - optional per-iteration table for top-N scenario replay
- Expose API and report sections:
  - POST /api/simulate/plan/{id} trigger endpoint
  - GET /api/simulate/results/{id} retrieval endpoint
  - markdown report block in generated plan output
Phase 3: Calibration + Reliability (2–4 weeks)
- Backtest against historical completed projects.
- Calibrate distribution parameters by domain.
- Add quality gates:
  - minimum input completeness threshold
  - confidence labels (high/medium/low) by data quality
- Add drift detection:
  - trigger re-simulation when assumptions or benchmark inputs change
Data and API contracts
Suggested summary payload extension:
{
  "simulation": {
    "iterations": 10000,
    "seed": 42,
    "success_probability": 0.42,
    "schedule": {"p10": "2026-06-01", "p50": "2026-07-15", "p90": "2026-09-01"},
    "cost": {"p10": 1200000, "p50": 1500000, "p90": 2100000},
    "top_drivers": ["foundation_work", "steel_supply", "permit_delay"]
  }
}
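For type-safe consumers, the payload could be mirrored as TypedDicts (field names are taken from the JSON above; the typing itself is a suggestion):

from typing import TypedDict

class SchedulePercentiles(TypedDict):
    p10: str  # ISO dates
    p50: str
    p90: str

class CostPercentiles(TypedDict):
    p10: int
    p50: int
    p90: int

class SimulationSummary(TypedDict):
    iterations: int
    seed: int
    success_probability: float
    schedule: SchedulePercentiles
    cost: CostPercentiles
    top_drivers: list[str]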
Operational safeguards
- Hard cap on iterations for interactive usage.
- Queue long simulations asynchronously.
- Cache deterministic runs by (plan_hash, assumptions_hash, seed).
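A sketch of the deterministic cache key, assuming the plan and assumption snapshots can be serialized to canonical JSON:

import hashlib
import json

def cache_key(plan_snapshot, assumptions_snapshot, seed):
    """Build the (plan_hash, assumptions_hash, seed) cache key."""
    def digest(obj):
        blob = json.dumps(obj, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()
    return (digest(plan_snapshot), digest(assumptions_snapshot), seed)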
Validation checklist
- Unit tests for distribution samplers.
- Statistical sanity tests (mean/variance bounds).
- Integration tests for API + report rendering.
- Backtest acceptance criteria per domain.
Detailed Implementation Plan (Compute + Reliability)
Compute Strategy
- Interactive mode: 1,000 runs for quick feedback
- Batch mode: 10,000+ runs for decision-grade outputs
- Queue heavy runs with cancellation support
Reliability Controls
- Seeded reproducibility for audit
- Input validation for distribution parameter sanity
- Guardrails against invalid dependency graphs
Explainability
- Output top driver contributions per percentile shift
- Include “what changed this outcome” narrative snippets
SLOs
- P95 interactive response < 8s (1k runs)
- Batch completion < 2 min for standard plan size