Monte Carlo Plan Success Probability Engine

Author: PlanExe Team
Date: 2026-02-11
Status: Proposal
Audience: Data Scientists, Project Managers


Overview

The Monte Carlo Plan Success Engine moves planning from deterministic "best guesses" to probabilistic distributions. By running 10,000 stochastic simulations of the project schedule and budget, it provides a mathematical confidence level for success (e.g., "There is a 12% chance of finishing by Q3").

It replaces single-point estimates (e.g., "This task takes 5 days") with probability distributions (e.g., "Between 3 and 10 days, most likely 5").

Core Problem

"The Flaw of Averages": The average outcome of a project is worse than the plan constructed from average inputs, due to non-linear dependencies (Jensen's Inequality). Deterministic plans systematically underestimate risk.

System Architecture

1. Distribution Modeler

Converts scalar plan inputs into statistical distributions.

  • Tasks: Beta-PERT distribution (mean $(O + 4M + P)/6$) for duration.
  • Costs: Lognormal distribution (right-skewed) for budget items.
  • Risks: Bernoulli trials (probability $p$) with impact distributions ($I$).

2. Simulation Loop (The Engine)

Runs $N$ iterations (default 10,000). For each iteration $i$:

  1. Sample duration $d_{i,t}$ for every task $t$.
  2. Sample cost $c_{i,b}$ for every budget line $b$.
  3. Trigger risks based on their probabilities.
  4. Recompute the critical path ($CP_i$) and the total cost ($TC_i$).
  5. Store the pair $(CP_i, TC_i)$.

3. Analytics Service

Aggregates the simulation results into interpretable metrics (P10, P50, P90).
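
A minimal sketch of this aggregation, assuming the per-iteration results are numeric (durations in days, or costs); the name aggregate_stats matches the helper used in the algorithm below:

import numpy as np

def aggregate_stats(results):
    """Summarize simulated outcomes as P10/P50/P90 percentiles."""
    p10, p50, p90 = np.percentile(results, [10, 50, 90])
    return {"p10": p10, "p50": p50, "p90": p90}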


Technical Specifications

Input Distributions

Most task durations follow a Beta-PERT distribution: $$E = \frac{Optimistic + 4 \cdot MostLikely + Pessimistic}{6}$$ $$Var = \left(\frac{Pessimistic - Optimistic}{6}\right)^2$$
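
As a concrete sketch, sample_pert (used by the algorithm below) can be built on a scaled Beta distribution with the conventional $\lambda = 4$ weighting, which reproduces the mean formula above:

import random

def sample_pert(optimistic, mode, pessimistic):
    """Draw one duration from a Beta-PERT(optimistic, mode, pessimistic)."""
    spread = pessimistic - optimistic
    alpha = 1 + 4 * (mode - optimistic) / spread
    beta = 1 + 4 * (pessimistic - mode) / spread
    return optimistic + random.betavariate(alpha, beta) * spread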

Cost inputs often follow a Lognormal distribution: $$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$$ This reflects the reality that costs can explode (10x overrun) but rarely shrink below a floor.
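
One hedged way to fit $\mu$ and $\sigma$ from planner inputs, assuming each budget line supplies a median estimate and a P90 value (1.2816 is the standard normal 90th-percentile quantile; the names here are illustrative):

import math
import random

def sample_lognormal_cost(median, p90):
    """Draw one cost from a lognormal fitted to a median and a P90 estimate."""
    mu = math.log(median)                  # lognormal median = exp(mu)
    sigma = (math.log(p90) - mu) / 1.2816  # solve P90 = exp(mu + 1.2816 * sigma)
    return random.lognormvariate(mu, sigma)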

The Simulation Algorithm

import random

def run_simulation(plan, n_iterations=10000):
    """Run n_iterations stochastic passes over the plan and aggregate the results."""
    results = []
    for _ in range(n_iterations):
        scenario = {}
        # 1. Sample a duration for every task from its Beta-PERT distribution.
        for task in plan.tasks:
            duration = sample_pert(task.min, task.mode, task.max)
            scenario[task.id] = duration

        # 2. Trigger risks via Bernoulli trials; add delays to impacted tasks.
        for risk in plan.risks:
            if random.random() < risk.probability:
                for target in risk.impacts:
                    scenario[target] += risk.delay_days

        # 3. Recompute the critical path to get this iteration's finish date.
        finish_date = solve_cpm(plan.dependencies, scenario)
        results.append(finish_date)

    # sample_pert and aggregate_stats are sketched above; solve_cpm is the CPM solver.
    return aggregate_stats(results)

Output Analysis (Sensitivity)

Tornado Charts

We calculate the correlation coefficient between each task's sampled duration and the total project duration across all iterations.

  • High correlation: the task is a "Driver". Focusing on it reduces overall variance.
  • Low correlation: the task has "Slack". Delays here are unlikely to hurt the project.
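
A minimal sketch of this driver calculation, assuming the engine persisted a per-iteration duration matrix (one row per iteration, one column per task; the names are illustrative):

import numpy as np

def tornado_drivers(task_durations, total_durations):
    """Correlation of each task's sampled durations with the project total.

    task_durations: (n_iterations, n_tasks) array; total_durations: (n_iterations,) array.
    """
    n_tasks = task_durations.shape[1]
    return np.array([
        np.corrcoef(task_durations[:, t], total_durations)[0, 1]
        for t in range(n_tasks)
    ])  # sort descending to order the tornado chart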

Probabilistic S-Curves

A plot of cumulative probability versus time (or cost).

  • P50 (Median): the "coin flip" outcome.
  • P80 (Commitment): the recommended internal deadline.
  • P95 (Safe Bet): the recommended external promise.
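
An empirical S-curve is just the sorted outcomes paired with cumulative probabilities; a minimal sketch (plotting omitted):

import numpy as np

def s_curve(results):
    """Return (outcome, cumulative_probability) arrays for an empirical S-curve."""
    x = np.sort(np.asarray(results))
    cum_prob = np.arange(1, len(x) + 1) / len(x)
    return x, cum_prob  # x[np.searchsorted(cum_prob, 0.80)] approximates the P80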


Integration

Assumption Drift Monitor

When drift is detected (e.g., "steel prices up 10%"), the Monte Carlo engine automatically re-simulates using the new actuals as the baseline, updating the percentile estimates (P50/P80/P95) in near real time.
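
A hedged sketch of that trigger, assuming a drift event carries the plan id and the changed assumptions (the event shape and the engine methods here are illustrative, not an existing PlanExe interface):

def on_assumption_drift(event, engine):
    """Re-simulate a plan when a monitored assumption drifts."""
    plan = engine.load_plan(event.plan_id)
    plan.apply_actuals(event.changed_assumptions)  # e.g., steel prices up 10%
    return engine.simulate(plan, iterations=10_000, seed=event.seed)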

API Reference

POST /api/simulate/plan/{id}
Trigger a full 10,000-iteration run.

{
  "iterations": 10000,
  "seed": 42
}
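
For illustration, the trigger call over HTTP (the base URL and plan id are placeholders):

import requests

resp = requests.post(
    "https://planexe.example.com/api/simulate/plan/123",
    json={"iterations": 10000, "seed": 42},
    timeout=30,
)
resp.raise_for_status()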

GET /api/simulate/results/{id}
Get the distribution data.

{
  "p10_date": "2026-06-01",
  "p50_date": "2026-07-15",
  "p90_date": "2026-09-01",
  "success_prob_deadline": 0.42 # 42% chance to hit deadline
}
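
success_prob_deadline is the fraction of iterations that hit the target deadline (42% in this example). A minimal sketch of how it could be computed from the per-iteration finish dates (finish_dates and deadline are illustrative names, not part of the stored schema):

import numpy as np

def success_probability(finish_dates, deadline):
    """Fraction of simulated finish dates that land on or before the deadline."""
    finish = np.asarray(finish_dates, dtype="datetime64[D]")
    return float(np.mean(finish <= np.datetime64(deadline)))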

Detailed Implementation Plan

Phase 1: Simulation Foundation (2–3 weeks)

  1. Define normalized simulation input contract:
     • task duration distributions (optimistic/most likely/pessimistic)
     • cost uncertainty distributions per CBS line item
     • risk event probabilities and impact models

  2. Build Monte Carlo core package:
     • deterministic seed support for reproducibility
     • vectorized simulation runner (NumPy/JAX-compatible; see the sketch after this list)
     • scenario persistence for post-run diagnostics

  3. Add baseline output artifacts:
     • percentile summaries (P10/P50/P90)
     • success probability against target date/cost
     • critical-path frequency map
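
A minimal sketch of the vectorized runner's sampling step, assuming independent tasks and NumPy's seeded Generator (array shapes and names are illustrative):

import numpy as np

def sample_task_durations(optimistic, mode, pessimistic, n_iterations, seed=42):
    """Vectorized Beta-PERT sampling: one row per iteration, one column per task."""
    rng = np.random.default_rng(seed)
    o, m, p = (np.asarray(a, dtype=float) for a in (optimistic, mode, pessimistic))
    alpha = 1 + 4 * (m - o) / (p - o)
    beta = 1 + 4 * (p - m) / (p - o)
    # Beta parameters of shape (n_tasks,) broadcast against the requested output shape.
    return o + rng.beta(alpha, beta, size=(n_iterations, len(o))) * (p - o)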

Phase 2: Pipeline Integration (2 weeks)

  1. Add a post-planning stage in run_plan_pipeline.py:
     • simulate_plan_success
     • consumes WBS, CBS, risk register

  2. Persist results in structured storage:
     • run-level summary table
     • optional per-iteration table for top-N scenario replay

  3. Expose API and report sections:
     • /api/simulate/plan/{id} trigger endpoint
     • /api/simulate/results/{id} retrieval endpoint
     • markdown report block in generated plan output

Phase 3: Calibration + Reliability (2–4 weeks)

  1. Backtest against historical completed projects.
  2. Calibrate distribution parameters by domain.
  3. Add quality gates:
     • minimum input completeness threshold
     • confidence labels (high/medium/low) by data quality
  4. Add drift detection:
     • trigger re-simulation when assumptions or benchmark inputs change.

Data and API contracts

Suggested summary payload extension:

{
  "simulation": {
    "iterations": 10000,
    "seed": 42,
    "success_probability": 0.42,
    "schedule": {"p10": "2026-06-01", "p50": "2026-07-15", "p90": "2026-09-01"},
    "cost": {"p10": 1200000, "p50": 1500000, "p90": 2100000},
    "top_drivers": ["foundation_work", "steel_supply", "permit_delay"]
  }
}

Operational safeguards

  • Hard cap on iterations for interactive usage.
  • Queue long simulations asynchronously.
  • Cache deterministic runs by (plan_hash, assumptions_hash, seed).
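
A minimal sketch of that cache key, assuming canonical JSON serializations of the plan and assumptions are available (helper and argument names are illustrative):

import hashlib
import json

def simulation_cache_key(plan_dict, assumptions_dict, seed):
    """Stable (plan_hash, assumptions_hash, seed) key for memoizing deterministic runs."""
    def digest(obj):
        return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()
    return (digest(plan_dict), digest(assumptions_dict), seed)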

Validation checklist

  • Unit tests for distribution samplers.
  • Statistical sanity tests (mean/variance bounds).
  • Integration tests for API + report rendering.
  • Backtest acceptance criteria per domain.

Detailed Implementation Plan (Compute + Reliability)

Compute Strategy

  • Interactive mode: 1,000 runs for quick feedback
  • Batch mode: 10,000+ runs for decision-grade outputs
  • Queue heavy runs with cancellation support

Reliability Controls

  • Seeded reproducibility for audit
  • Input validation for distribution parameter sanity
  • Guardrails against invalid dependency graphs

Explainability

  • Output top driver contributions per percentile shift
  • Include “what changed this outcome” narrative snippets

SLOs

  • P95 interactive response < 8s (1k runs)
  • Batch completion < 2 min for standard plan size