Counterfactual Scenario Explorer
Author: PlanExe Team
Date: 2026-02-11
Status: Proposal
Audience: Architects, Data Scientists
Overview
The Counterfactual Scenario Explorer allows stakeholders to test the resilience of a plan by simulating "What If?" scenarios. Instead of a single linear roadmap, it treats the plan as a probabilistic graph that can be "stressed" by changing key input parameters.
It uses a Monte Carlo simulation engine to spawn thousands of "parallel universe" outcomes, helping decision makers understand the range of possible futures.
Core Problem
Standard plans suffer from the "Planning Fallacy": the tendency to underestimate time, costs, and risks. A static Gantt chart implies certainty where none exists.
System Architecture
1. The World Generator
The heart of the system. It takes the "Base Plan" and tweaks input variables based on statistical distributions.
- Inputs: Plan Tasks, Durations, Costs, Risk Registers.
- Variables: "Inflation Rate", "Supplier Delay", "Weather Impact".
- Distributions: Normal, Log-Normal, Beta (PERT).
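A minimal sketch of how the World Generator might draw perturbed task durations from a Beta (PERT) distribution. The `Task` fields, the `sample_pert_duration` helper, and the standard PERT shape parameter `lam=4` are illustrative assumptions, not the shipped API.

```python
# Sketch: sampling a perturbed task duration from a PERT (rescaled Beta) distribution.
# Task fields and lam=4 are illustrative assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class Task:
    task_id: str
    optimistic: float   # best-case duration in days
    most_likely: float  # modal duration in days
    pessimistic: float  # worst-case duration in days

def sample_pert_duration(task: Task, rng: np.random.Generator, lam: float = 4.0) -> float:
    """Draw one duration from a PERT distribution."""
    a, m, b = task.optimistic, task.most_likely, task.pessimistic
    if b <= a:
        return m  # degenerate case: no uncertainty
    alpha = 1.0 + lam * (m - a) / (b - a)
    beta = 1.0 + lam * (b - m) / (b - a)
    return a + rng.beta(alpha, beta) * (b - a)

rng = np.random.default_rng(seed=42)
permit = Task("task_45", optimistic=20, most_likely=30, pessimistic=90)
print([round(sample_pert_duration(permit, rng), 1) for _ in range(5)])
```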
2. Simulation Engine
For each scenario:
1. Perturb: Apply random noise to task durations/costs based on the scenario type.
2. Propagate: Recalculate the critical path and total budget.
3. Detect Failure: Check if any "hard constraints" (e.g., launch date) are violated.
4. Log Outcome: Record success/failure, final cost, final duration.
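A minimal sketch of one iteration, assuming a toy serial task chain and log-normal noise; a production engine would recompute the critical path over the full dependency graph. The function and parameter names here are illustrative, not part of the proposed API.

```python
# Sketch of one Monte Carlo iteration: perturb, propagate, detect failure, log.
# Assumes a serial task chain; a real engine would re-solve the critical path.
import numpy as np

def run_one_scenario(base_durations, base_costs, deadline, budget, rng, noise=0.2):
    # 1. Perturb: log-normal noise keeps durations/costs positive.
    durations = base_durations * rng.lognormal(0.0, noise, size=len(base_durations))
    costs = base_costs * rng.lognormal(0.0, noise, size=len(base_costs))
    # 2. Propagate: totals for a serial chain.
    total_duration = durations.sum()
    total_cost = costs.sum()
    # 3. Detect failure: hard constraints on launch date and budget.
    failed = total_duration > deadline or total_cost > budget
    # 4. Log outcome.
    return {"duration": total_duration, "cost": total_cost, "failed": failed}

rng = np.random.default_rng(7)
base_durations = np.array([30.0, 45.0, 60.0, 45.0])                  # days
base_costs = np.array([200_000.0, 300_000.0, 350_000.0, 150_000.0])  # currency units
outcomes = [run_one_scenario(base_durations, base_costs, 210, 1_200_000, rng)
            for _ in range(10_000)]
p_success = 1 - np.mean([o["failed"] for o in outcomes])
print(f"P(success) = {p_success:.2f}")
```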
3. Resilience Scorer
Aggregates the results of N simulations (typically 10,000) into a single "Resilience Score".
Scenario Types
| Scenario | Description | Simulation Logic |
|---|---|---|
| Optimistic | "Blue Sky" | Skew distributions to P10 (Best Case). Remove 50% of risks. |
| Pessimistic | "Murphy's Law" | Skew distributions to P90 (Worst Case). Trigger 80% of risks. |
| Black Swan | "Total Chaos" | Introduce 1-2 "Catastrophic" events (e.g., factory fire, regulation ban). |
| Inflationary | "Cost Shock" | Increase all material/labor costs by 20-50%. Keep schedule same. |
| Delay Spiral | "Gridlock" | Increase all task durations by 20-50%. Keep costs same. |
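One way the scenario table could be encoded is as a library of transform parameters consumed by the World Generator. This is a sketch only; the field names and exact multiplier ranges are assumptions.

```python
# Sketch: the scenario library as transform parameters. Field names and
# multiplier ranges are illustrative assumptions drawn from the table above.
SCENARIO_PROFILES = {
    # "Blue Sky": skew toward P10, drop half of the risk register.
    "optimistic":   {"percentile_skew": 0.10, "risk_removal_rate": 0.50},
    # "Murphy's Law": skew toward P90, trigger most registered risks.
    "pessimistic":  {"percentile_skew": 0.90, "risk_trigger_rate": 0.80},
    # "Total Chaos": inject one or two catastrophic events on top of the baseline.
    "black_swan":   {"catastrophic_events": (1, 2)},
    # "Cost Shock": inflate material/labor costs, keep the schedule untouched.
    "inflationary": {"cost_multiplier_range": (1.2, 1.5)},
    # "Gridlock": stretch task durations, keep costs untouched.
    "delay_spiral": {"duration_multiplier_range": (1.2, 1.5)},
}
```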
Resilience Scoring Formula
The ResilienceScore (0-100) measures how robust the plan is across all scenarios.
$$Score = (0.4 \times P_{success}) + (0.3 \times (1 - \frac{Cost_{P90}}{Budget})) + (0.3 \times (1 - \frac{Time_{P90}}{Deadline}))$$
Where:
- $P_{success}$: Probability of meeting minimum success criteria.
- $Cost_{P90}$: The 90th percentile cost outcome.
- $Time_{P90}$: The 90th percentile duration outcome.
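A minimal sketch of the formula, assuming each term is clamped to [0, 1] and the weighted sum is scaled to 0-100; the formula above only fixes the weights, so the clamping and scaling are assumptions. The example inputs are hypothetical.

```python
# Sketch of the resilience score. Clamping each term to [0, 1] and scaling the
# weighted sum to 0-100 are assumptions; the spec only defines the weights.
def resilience_score(p_success: float, cost_p90: float, budget: float,
                     time_p90: float, deadline: float) -> float:
    cost_term = max(0.0, min(1.0, 1.0 - cost_p90 / budget))
    time_term = max(0.0, min(1.0, 1.0 - time_p90 / deadline))
    raw = 0.4 * p_success + 0.3 * cost_term + 0.3 * time_term
    return round(100 * raw, 1)

# Hypothetical inputs: 85% success probability, P90 cost/time inside the limits.
print(resilience_score(p_success=0.85, cost_p90=950_000, budget=1_200_000,
                       time_p90=200, deadline=240))
```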
Output Schema (JSON)
The result of a full simulation run:
```json
{
  "plan_id": "plan_123",
  "scenarios_run": 10000,
  "resilience_score": 72,
  "baseline": {
    "cost": 1000000,
    "duration_days": 180
  },
  "p90_outcome": {
    "cost": 1450000,
    "duration_days": 210
  },
  "key_drivers": [
    {
      "task_id": "task_45",
      "name": "Wait for Permit",
      "sensitivity": 0.85,
      "suggestion": "Parallelize this task"
    }
  ]
}
```

A `sensitivity` of 0.85 means the task shows an 85% correlation with overall project delay.
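As a rough illustration of how `key_drivers` could be derived, the sketch below estimates per-task sensitivity as the correlation between a task's sampled duration and the total project duration across all simulations. The toy data and the choice of Pearson correlation are assumptions; rank correlation would work equally well.

```python
# Sketch: per-task sensitivity as correlation with total project duration.
# Toy data and Pearson correlation are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_tasks = 10_000, 4
# task_durations[i, j] = sampled duration of task j in simulation i.
task_durations = rng.lognormal(mean=np.log([30, 45, 60, 45]),
                               sigma=[0.1, 0.1, 0.5, 0.1],
                               size=(n_sims, n_tasks))
total_duration = task_durations.sum(axis=1)

sensitivities = [float(np.corrcoef(task_durations[:, j], total_duration)[0, 1])
                 for j in range(n_tasks)]
key_driver = int(np.argmax(sensitivities))
print(f"task index {key_driver} drives delay (corr = {sensitivities[key_driver]:.2f})")
```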
User Interface: "The Matrix View"
A 2x2 grid visualizing the trade-offs.
- X-Axis: Cost (Budget vs Overrun)
- Y-Axis: Time (Early vs Late)
- Scatter Plot: Each dot is one simulation outcome.
- Heatmap: Colored zones showing "Safe", "Risky", and "Failed".
The user can iterate by adjusting plan parameters (e.g., "Add 2 more engineers") and re-running the simulation to see if the dots move into the "Safe" zone.
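As a rough illustration of how each simulation dot could be bucketed into the matrix zones, the sketch below classifies outcomes as "safe", "risky", or "failed"; the 10% tolerance band separating "risky" from "failed" is an assumption.

```python
# Sketch: bucketing simulation outcomes into the 2x2 matrix zones.
# The 10% tolerance band is an illustrative assumption.
def classify_outcome(cost, duration, budget, deadline, tolerance=0.10):
    if cost <= budget and duration <= deadline:
        return "safe"
    if cost <= budget * (1 + tolerance) and duration <= deadline * (1 + tolerance):
        return "risky"   # small overrun on one or both axes
    return "failed"

for cost, days in [(950_000, 175), (1_040_000, 185), (1_450_000, 230)]:
    print(classify_outcome(cost, days, budget=1_000_000, deadline=180))
```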
Future Enhancements
- AI Recommendations: "If you split Task B into two parallel tasks, your P90 duration drops by 15 days."
- Historical Training: Calibrate distributions based on actual past project data (e.g., "Software projects usually slip 30%, not 10%").
Detailed Implementation Plan
Phase A — Scenario DSL and Input Model (2 weeks)
- Define a scenario definition language (DSL), sketched after this list, covering:
  - variable overrides
  - distribution overrides
  - deterministic shocks
  - policy constraints
- Add scenario library:
  - optimistic
  - pessimistic
  - black swan
  - inflationary
  - delay spiral
- Validate scenario compatibility with plan domain.
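A minimal sketch of what one DSL entry might look like (the `dsl_json` payload of `scenario_definitions`); every field name here is an assumption, not a finalized schema.

```python
# Sketch: a "pessimistic" scenario expressed in the DSL, as it might be stored
# in scenario_definitions.dsl_json. All field names are illustrative assumptions.
pessimistic_scenario = {
    "scenario_id": "scn_pessimistic_01",
    "plan_id": "plan_123",
    "variable_overrides": {"inflation_rate": 0.08, "supplier_delay_days": 14},
    "distribution_overrides": {
        # Skew task durations toward the worst case (P90).
        "task_duration": {"type": "pert", "percentile_skew": 0.90}
    },
    "deterministic_shocks": [
        {"task_id": "task_45", "duration_multiplier": 1.5}
    ],
    "policy_constraints": {"max_budget": 1_200_000, "hard_deadline_days": 210},
}
```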
Phase B — Counterfactual Engine (2–3 weeks)
- Build engine to clone baseline plan state and apply scenario transforms (see the sketch after this list).
- Recompute schedule, cost, and risk metrics per scenario.
- Run Monte Carlo for each scenario profile.
- Store output distributions and key drivers.
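A minimal sketch of the clone-and-transform step, assuming the plan is a plain dict of tasks; `apply_scenario` and the plan structure are illustrative only.

```python
# Sketch of the Phase B engine step: clone the baseline, apply scenario
# transforms, then hand off to the Monte Carlo run. Plan structure is a toy.
import copy

def apply_scenario(baseline_plan: dict, scenario: dict) -> dict:
    plan = copy.deepcopy(baseline_plan)  # never mutate the baseline
    for shock in scenario.get("deterministic_shocks", []):
        task = plan["tasks"][shock["task_id"]]
        task["duration_days"] *= shock.get("duration_multiplier", 1.0)
        task["cost"] *= shock.get("cost_multiplier", 1.0)
    return plan

baseline = {"tasks": {"task_45": {"duration_days": 30, "cost": 50_000}}}
stressed = apply_scenario(baseline, {"deterministic_shocks": [
    {"task_id": "task_45", "duration_multiplier": 1.5}]})
print(stressed["tasks"]["task_45"]["duration_days"])  # 45.0
print(baseline["tasks"]["task_45"]["duration_days"])  # 30 (unchanged)
```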
Phase C — Comparative Analytics Layer (2 weeks)
- Compute deltas vs baseline (see the sketch after this list):
  - schedule delta distribution
  - cost delta distribution
  - probability of failing hard constraints
- Generate resilience score and explainability outputs:
  - top 5 sensitivity drivers
  - counterfactual interventions ranked by impact
- Add recommendation generator constrained by feasibility and capacity.
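A minimal sketch of the delta computation, using synthetic scenario outcomes; the array names, the distributions, and the 210-day hard deadline are assumptions.

```python
# Sketch of the Phase C comparison step: delta distributions vs baseline and the
# probability of breaking a hard constraint. Inputs here are synthetic.
import numpy as np

baseline_duration, baseline_cost = 180.0, 1_000_000.0
scenario_durations = np.random.default_rng(1).normal(205, 15, size=10_000)
scenario_costs = np.random.default_rng(2).normal(1_300_000, 120_000, size=10_000)

schedule_delta = scenario_durations - baseline_duration
cost_delta = scenario_costs - baseline_cost
p_miss_deadline = float(np.mean(scenario_durations > 210))  # hard launch-date constraint

print("schedule delta P50/P90:", np.percentile(schedule_delta, [50, 90]).round(1))
print("cost delta P50/P90:", np.percentile(cost_delta, [50, 90]).round(0))
print(f"P(miss deadline) = {p_miss_deadline:.2f}")
```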
Phase D — Interactive UX + API (2 weeks)
- Add scenario explorer UI with matrix/heatmap + scatter cloud.
- Add parameter sliders with guardrails.
- Expose APIs:
- create scenario
- run analysis
- retrieve comparative report
Data model additions
- scenario_definitions(scenario_id, plan_id, dsl_json)
- scenario_runs(scenario_id, run_id, metrics_json)
- scenario_comparisons(baseline_run_id, scenario_run_id, deltas_json)
Operational safeguards
- async queue for heavy scenario batches
- seed control for reproducibility
- max scenario complexity limits to avoid runaway compute
Validation checklist
- deterministic baseline replay
- scenario transform correctness tests
- resilience score monotonicity sanity checks
- UI interpretability tests with PMs and analysts