Skip to content

title: MCP Interface — Evaluation and Roadmap

date: 2026-02-26

MCP Interface — Evaluation and Roadmap

An honest audit of the current MCP surface (mcp_cloud + mcp_local), followed by concrete improvements and promotion ideas.

Revision history:

  • 2026-02-26 (rev 1): Initial version after task_* → plan_* rename.
  • 2026-02-26 (rev 2): Updated after app.py refactor into modules, plan_list user_api_key made optional in schema (auto-injected by HTTP layer), and re-evaluation of all open issues.
  • 2026-02-26 (rev 3): Updated after completing 4.9 — all stale task variable names, request classes, helper functions, and backward-compat aliases renamed/removed across mcp_cloud and mcp_local. Test files renamed from test_task_* to test_plan_*.
  • 2026-02-26 (rev 4): Updated after completing 4.2 — added separate download rate limiter with configurable limits (default 10 req/60s).
  • 2026-02-26 (rev 5): Renamed external-facing fields: task_idplan_id, tasksplans, error codes TASK_NOT_FOUNDPLAN_NOT_FOUND, TASK_NOT_FAILEDPLAN_NOT_FAILED. Internal function names and download URL paths unchanged.

1. Current Tool Surface

Nine tools, split across two transports:

Tool Cloud (mcp_cloud) Local (mcp_local) Auth Annotations
example_prompts yes yes Public readOnly, idempotent
model_profiles yes yes Public readOnly, idempotent
plan_create yes yes Required openWorld
plan_status yes yes Required readOnly, idempotent
plan_stop yes yes Required destructive, idempotent
plan_retry yes yes Required openWorld
plan_file_info yes Required readOnly, idempotent
plan_download yes Required openWorld
plan_list yes yes Required readOnly, idempotent

plan_download is a local-only synthetic tool that internally proxies to plan_file_info on the cloud, then downloads and saves the artifact to the user's filesystem. This intentional asymmetry is tested in test_tool_surface_consistency.py.

Auth model for plan_create and plan_list: Both tools accept an optional user_api_key in the visible MCP input schema. When called over HTTP, the middleware authenticates the caller via the X-API-Key header and auto-injects user_api_key into handler arguments. This means MCP clients never need to pass user_api_key explicitly — the key is invisible in the tool's published schema but enforced at runtime. Both handlers return USER_API_KEY_REQUIRED if no key arrives by either path.


2. What's Working Well

Dual transport. mcp_cloud (stateless HTTP / Railway) and mcp_local (stdio proxy) cover the two major deployment patterns. Most users can pick one without reading source code.

Clean module structure. mcp_cloud/app.py is now a thin re-export facade (~195 lines). Logic lives in focused modules: handlers.py (tool handlers), schemas.py (tool definitions), tool_models.py (Pydantic models), db_queries.py (DB operations), auth.py (key hashing/user resolution), download_tokens.py (signed tokens), model_profiles.py, worker_fetchers.py, zip_utils.py, example_prompts.py. This makes PRs reviewable and bugs easy to isolate.

Consistent plan_* naming throughout. The rename from task_* to plan_* covers the full stack: external tool names, handler functions, request classes (PlanCreateRequest, etc.), DB query helpers (_create_plan_sync, get_plan_by_id, etc.), local variable names, and test file names. No backward-compat aliases remain.

Layered authentication. Two distinct auth paths — a server-wide PLANEXE_MCP_API_KEY for self-hosters, and per-user pex_… keys issued by home.planexe.org — are a good design. The key-normalisation helper (_normalize_api_key_value in http_server.py) handles common copy-paste artefacts (Bearer prefix, surrounding quotes, full header line pasted as value).

Auto-injected user_api_key. For plan_create and plan_list, the HTTP layer reads the authenticated user from the request context and injects user_api_key into handler arguments automatically. Callers never see user_api_key as a required field in the MCP schema — a clean separation between transport-level auth and tool-level logic.

Structured output schemas. Every tool declares an output_schema, so MCP clients can validate responses without guessing. TestAllToolsHaveOutputSchema enforces this at CI time.

Tool annotations. readOnlyHint, destructiveHint, idempotentHint, openWorldHint are set on every tool and tested. This is ahead of most MCP servers.

**plan_retry with model_profile selection.** Allowing the caller to re-run a failed task with a stronger model (e.g. upgrade from baseline to premium) at retry time is genuinely useful.

Signed download tokens. plan_file_info returns download URLs with HMAC-SHA256 signed, time-limited tokens (15-min default TTL) scoped to one artifact (task_id:filename:expiry). Tokens work in a browser without an API key header. Defence-in-depth: the download endpoint re-validates even after middleware has passed the token. The secret fallback chain is: PLANEXE_DOWNLOAD_TOKEN_SECRETPLANEXE_API_KEY_SECRET → per-process random (with warning).

Glama + llms.txt. Being listed in the Glama registry and providing llms.txt lowers the discovery barrier for new users.

Rate limiting on all MCP endpoints. _enforce_rate_limit in http_server.py applies to /mcp, /mcp/, and /mcp/tools/call. The default limit (60 req / 60 s per client, keyed by API key or IP) is high enough that normal plan_status polling is never affected.

Prompt guidance in schema. The prompt field description ("300–800 words … objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria") sets user expectations up front.

**plan_list for plan recovery.** Authenticated users can list their most recent plans (up to 50, newest-first) to recover a lost plan_id. Each entry includes plan_id, state, progress_percentage, created_at, and prompt_excerpt.

Comprehensive test suite. 12 test files covering tool surface consistency, auth key parsing, CORS config, download tokens, HTTP routing, and individual tool behaviour (test_plan_create_tool.py, test_plan_status_tool.py, test_plan_retry_tool.py, test_plan_file_info_tool.py, test_model_profiles_tool.py).


3. What's Been Fixed (Previously Reported)

3.1 ~~skills/planexe-mcp/SKILL.md says "5 tools"~~ (FIXED)

Updated to nine tools; SKILL.md now lists all tools with example JSON-RPC calls.

3.2 ~~Trailing-slash inconsistency~~ (FIXED)

The canonical URL (https://mcp.planexe.org/mcp, no trailing slash) is used in all JSON config files and registry entries.

3.3 ~~speed_vs_detail documented but hidden from agents~~ (FIXED)

Removed entirely from the MCP interface.

3.4 ~~plan_file_info returns {} on success instead of isError~~ (FIXED)

Now returns {"ready": false, "reason": "processing"} while running and {"ready": false, "reason": "failed", "error": {...}} on failure.

3.5 ~~Rate limiting covers REST but not Streamable HTTP /mcp~~ (FIXED)

_enforce_rate_limit now covers /mcp, /mcp/, and /mcp/tools/call.

3.6 ~~No plan_list tool — lost task_id = lost task~~ (FIXED)

Added plan_list to both mcp_cloud and mcp_local. Returns up to 50 tasks newest-first.

3.7 ~~Signed, expiring download tokens~~ (FIXED)

HMAC-SHA256 tokens, 15-minute default TTL, scoped per-artifact.

3.8 ~~Tools used task_* prefix instead of plan_*~~ (FIXED)

All external tool names renamed to plan_*.

3.9 ~~app.py is a 76 KB monolith~~ (FIXED)

Refactored into 10+ focused modules (commit 9f1a7db9). app.py is now a thin re-export facade.

3.10 ~~plan_list requires user_api_key in visible MCP schema~~ (FIXED)

user_api_key is now optional in the PlanListInput schema (not in required list), matching plan_create. The HTTP layer auto-injects it from the X-API-Key header via _get_authenticated_user_api_key(). The handler still enforces the key at runtime (returns USER_API_KEY_REQUIRED if absent).


4. What's Broken or Inconsistent

~~4.1 Dev-secret fallback in production~~ (FIXED)

auth.py now exports validate_api_key_secret() which raises RuntimeError when PLANEXE_API_KEY_SECRET is not set. download_tokens.py exports validate_download_token_secret() which raises when neither PLANEXE_DOWNLOAD_TOKEN_SECRET nor PLANEXE_API_KEY_SECRET is set. Both are called at module level in http_server.py when AUTH_REQUIRED is true, so the server fails hard at startup instead of silently falling back to dev secrets. The existing runtime fallbacks ("dev-api-key-secret" and random per-process secret) remain for local development with PLANEXE_MCP_REQUIRE_AUTH=false.

~~4.2 /download endpoint not rate-limited~~ (FIXED)

A separate download rate limiter (_enforce_download_rate_limit) now covers /download paths with its own bucket and configurable limits: PLANEXE_MCP_DOWNLOAD_RATE_LIMIT (default 10 req) and PLANEXE_MCP_DOWNLOAD_RATE_WINDOW_SECONDS (default 60s). This is deliberately tighter than the MCP rate limit (60 req/60s) since download responses are 700KB–6MB. The sweep task cleans up download buckets alongside MCP buckets.

~~4.3 Body size validation only on REST endpoint~~ (FIXED)

_enforce_body_size now checks both /mcp/tools/call and /mcp/ POST requests. The Content-Length requirement (411) is only enforced on the REST endpoint since Streamable HTTP may use chunked encoding without Content-Length; however, when Content-Length is present on either endpoint it is validated against MAX_BODY_BYTES.

~~4.4 plan_file_info silently defaults invalid artifact to "report"~~ (FIXED)

Both handle_plan_file_info (cloud) and handle_plan_download (local) now return INVALID_ARGUMENT with a descriptive message when the artifact value is not "report" or "zip".

~~4.5 No dedicated plan_list test~~ (FIXED)

Added mcp_cloud/tests/test_plan_list_tool.py with 8 tests covering: tool listed, returns tasks, empty result, limit clamping (both directions), invalid API key, USER_API_KEY_REQUIRED when env requires key, no-key passthrough when not required (user_id=None), and default limit.

~~4.6 CORS default is wildcard~~ (FIXED)

When AUTH_REQUIRED is true and PLANEXE_MCP_CORS_ORIGINS is unset, the default is now ["https://mcp.planexe.org", "https://home.planexe.org"] instead of ["*"]. Wildcard CORS is only used in dev mode (PLANEXE_MCP_REQUIRE_AUTH=false) so browser-based tools like MCP Inspector work without extra configuration. Operators can override via PLANEXE_MCP_CORS_ORIGINS.

~~4.7 No request logging for successful tool calls~~ (FIXED)

handle_call_tool now logs every tool call at INFO level with tool name, result (ok/error/exception), and duration in milliseconds. Unknown tools are logged at WARNING. Format: tool_call tool=<name> result=<ok|error|exception> duration_ms=<N>.

~~4.8 Prompt excerpt length hardcoded~~ (FIXED)

Extracted to PROMPT_EXCERPT_MAX_LENGTH = 100 at module level in db_queries.py.

~~4.9 Stale task variable names and backward-compat aliases~~ (FIXED)

All internal naming now uses plan consistently. Request classes renamed (TaskCreateRequestPlanCreateRequest, etc.), DB query helpers renamed (_create_task_sync_create_plan_sync, get_task_by_idget_plan_by_id, etc.), local variables renamed (task_snapshotplan_snapshot, etc.), all backward-compat aliases removed from tool_models.py, schemas.py, handlers.py, app.py, and mcp_local/planexe_mcp_local.py (~86 lines deleted). Test files renamed from test_task_*.py to test_plan_*.py with patch targets updated.

~~4.10 plan_list auth differs from plan_create~~ (FIXED)

plan_list now uses the same PLANEXE_MCP_REQUIRE_USER_KEY check as plan_create. When the key is not required and not provided, plan_list returns all tasks (no user scoping). _list_tasks_sync accepts user_id=None to support this.


5. Proposed Improvements

5.1 SSE progress streaming (UX)

Long-running plans (10–20 minutes) give the user no feedback. A log_lines array in the plan_status response (last 50 lines of agent output) would dramatically improve perceived responsiveness.

5.2 Webhook / push notification (power users)

Add an optional webhook_url to plan_create. When the task transitions to completed or failed, POST a JSON summary to that URL. This removes the need for polling and enables CI/CD integrations.

5.3 API versioning

All tool names and schemas are currently unversioned. A future breaking change will silently break clients. Add a server_version field to the plan_status output and document a stability policy.

5.4 Startup environment validation

Add an explicit check at server startup that required secrets (PLANEXE_API_KEY_SECRET, PLANEXE_DOWNLOAD_TOKEN_SECRET) are set when auth is enabled. Fail loudly instead of falling back to dev defaults.


6. Promotion and Growth Strategies

6.1 MCP registries

  • Glama — already listed
  • mcp.so — submit server.json; high traffic from Claude desktop users
  • Smithery — another fast-growing directory; supports one-click install
  • awesome-mcp-servers (GitHub) — submit a PR; maintainers merge quickly
  • OpenTools — focus on enterprise MCP discovery

6.1.1 Glama

Already listed, but stuck in an unclaimed state, where I can't customize it.

https://glama.ai/mcp/connectors/io.github.PlanExeOrg/planexe

MCP Servers that have been claimed, can alter their profile, and assign an icon.

I'm seing any health connectors that have been claimed and customized.

https://glama.ai/blog/2025-10-22-what-are-mcp-connectors

https://glama.ai/mcp/connectors?attributes=status%3Ahealthy&sort=featured%3Adesc

Outstanding issues:

  • Claim ownership of PlanExe inside Glama.ai
  • Add a /.well-known/glama.json to claim ownership of planexe. No luck.
  • Add a /glama.json to repo to claim ownership of planexe. No luck.
  • Customize profile text, categories, favicon.

6.2 Content

  • Blog post: "From prompt to project plan in 60 seconds" — a short walkthrough showing MCP Inspector → plan_createplan_status → download. Publish on dev.to, Hacker News (Show HN), and the PlanExe GitHub Discussions.
  • YouTube demo (2–3 minutes) — screen recording of Claude Desktop using PlanExe MCP end-to-end. Pin it to the README.
  • Twitter/X thread — "I built an MCP server that turns a ~500-word prompt into a full project plan. Here's how it works:"

6.3 Community integrations

  • Claude Desktop config snippet — provide a ready-to-paste claude_desktop_config.json block in the README.
  • Cursor / Windsurf rule — provide a .cursorrules or .windsurfrules snippet that wires PlanExe MCP automatically.
  • GitHub Actions — a reusable workflow planexe/create-plan@v1 that runs plan_create and uploads the result as a release asset. This is a high-visibility integration channel.

Add 10–15 high-quality example prompts (startup, research paper, home renovation, hiring plan, …) to example_prompts. Agents and users copy-paste these; each successful use is a social proof data point.

6.5 Observability / social proof

  • Add a public counter to the homepage: "X plans created this week".
  • Post a monthly changelog to GitHub Discussions so subscribers see activity.
  • Badge in the README: ![Plans created](https://img.shields.io/badge/dynamic/json?url=https://mcp.planexe.org/stats&label=plans+created).

7. Quick-win Checklist

Priority Task Effort Status
P0 ~~Fix SKILL.md tool count~~ DONE
P0 ~~Standardise URL trailing slash~~ DONE
P0 ~~Fix speed_vs_detail schema/docs mismatch~~ DONE
P0 ~~Rename tools from task_* to plan_*~~ DONE
P1 ~~Add plan_list tool~~ DONE
P1 ~~Fix plan_file_info empty-dict response~~ DONE
P1 ~~Add rate limiting to /mcp endpoint~~ DONE
P1 ~~Signed download tokens~~ DONE
P1 ~~Refactor app.py into modules~~ DONE
P1 ~~Remove user_api_key from plan_list visible schema~~ DONE
P1 ~~Fail-hard on missing secrets in production (4.1)~~ DONE
P1 ~~Rate-limit /download endpoint (4.2)~~ DONE
P1 ~~Add plan_list handler tests (4.5)~~ DONE
P1 Submit to mcp.so + Smithery 30 min
P1 Write README demo GIF / YouTube link 1 h
P2 ~~Body size validation on Streamable HTTP (4.3)~~ DONE
P2 ~~Return error for invalid artifact value (4.4)~~ DONE
P2 ~~Add tool-call audit logging (4.7)~~ DONE
P2 Add log_lines to plan_status (5.1) 4 h
P2 ~~Rename internal task variables/classes/helpers to plan (4.9)~~ DONE
P2 ~~Remove backward-compat Task*/handle_task_*/TASK_* aliases (4.9)~~ DONE
P2 ~~Rename test files from test_task_* to test_plan_* (4.9)~~ DONE
P2 ~~Tighten default CORS origins (4.6)~~ DONE
P2 ~~Align plan_list auth with plan_create (4.10)~~ DONE
P3 Webhook support (5.2) 1 day
P3 API versioning (5.3) 4 h
P3 GitHub Actions integration (6.3) 1 day

8. Summary

The MCP surface is functionally solid and ahead of most MCP servers in terms of schema rigour, annotation coverage, and security (signed download tokens, layered auth, auto-injected user keys). The codebase has been significantly improved since rev 1: app.py was refactored from a 76 KB monolith into 10+ focused modules, plan_list now follows the same auth-injection pattern as plan_create, and all P0 issues are resolved.

All P1 code-quality issues are now resolved, including fail-hard on missing secrets in production (4.1). The remaining checklist items are promotion/growth tasks (mcp.so submission, README demo) and lower-priority enhancements (CORS tightening, SSE streaming, webhooks, API versioning).