ADR-014: Removal of Server-Side LLM Dependency in Favor of MCP Sampling
Status
Accepted (proposed 2026-04-30; supersedes Phase 2.5 of ADR-009)
Date
2026-04-30
Context
DocuMCP currently bundles a multi-provider LLM client (src/utils/llm-client.ts, ~369 LOC) that is called from src/utils/semantic-analyzer.ts and src/tools/simulate-execution.ts to enable hybrid semantic code analysis. This was introduced in v0.5.3 (Phase 2.5 of ADR-009, commit f7b6fcd) to support DeepSeek, OpenAI, Anthropic, and Ollama as providers behind the same interface.
This is misaligned with current MCP best practice as documented in docs/CE-MCP-FINDINGS.md: MCP clients are the correct execution layer for LLM calls, not servers. Bundling an LLM client inside a server creates five concrete problems:
- Duplicated trust boundary. The host already has an LLM relationship governed by the user's policies, quotas, and consent. A second, server-internal LLM bypasses that boundary.
- Unauthorized data egress. User code is forwarded to third-party APIs (DeepSeek, OpenAI, Anthropic) that the host did not authorize. The server-side client makes this default, not opt-in.
- Parallel configuration surface. `DOCUMCP_LLM_PROVIDER`, `DOCUMCP_LLM_API_KEY`, `DOCUMCP_LLM_MODEL`, and `DOCUMCP_LLM_BASE_URL` are env vars users must learn, set, and rotate independently of their host's API keys.
- Dual cost streams. A single user session can incur LLM charges on both the host's account and on whatever provider is configured server-side, with no unified visibility.
- Non-deterministic test baseline. Hybrid mode is the default analysis path, which makes deterministic AST tests harder to author and makes regressions harder to attribute.
Adoption signals from the documcp Knowledge Graph show zero deployments that opted into hybrid mode in production, suggesting the feature primarily adds risk without measurable benefit.
Issues #106–#110, filed 2026-04-30, track this decision: #106 is the umbrella epic; #107 deletes the LLM client; #108 collapses semantic-analyzer.ts to AST-only; #109 removes the simulate_execution tool and the execution-simulator utility; #110 updates the documentation surface.
Decision
Remove the bundled LLM client and all server-side LLM execution paths from DocuMCP. Specifically:
- Delete `src/utils/llm-client.ts`, `src/utils/execution-simulator.ts`, `src/tools/simulate-execution.ts`, `tests/utils/llm-client.test.ts`, and the `docs/how-to/llm-integration.md` how-to.
- Refactor `src/utils/semantic-analyzer.ts` to AST-only mode: remove the `useLLM`, `confidenceThreshold`, and `llmConfig` options from `createSemanticAnalyzer()`; `analysisMode` always returns `'ast'`; the `SemanticAnalyzer` public API stays compatible where reasonable.
- Drop `DOCUMCP_LLM_PROVIDER`, `DOCUMCP_LLM_API_KEY`, `DOCUMCP_LLM_MODEL`, and `DOCUMCP_LLM_BASE_URL` parsing from `src/index.ts`. Emit a one-version deprecation warning before deleting the parsing logic to give downstream users a clear signal.
- Remove the `simulate_execution` tool entry from the `TOOLS` array and the `CallToolRequestSchema` switch in `src/index.ts`.
- Mark Phase 2.5 of ADR-009 as Superseded by this ADR. The rest of ADR-009 (the broader content-accuracy framework) remains Accepted.
- Defer any future LLM-driven semantic analysis to MCP Sampling. When DocuMCP genuinely needs an LLM completion (for example, context-aware example generation that AST cannot reach), it will issue a `sampling/createMessage` request through the host so that the host's LLM, policy, and quotas apply uniformly.
- Ship under semver as v0.6.0 with a `BREAKING CHANGE:` footer in CHANGELOG.md and a clear migration note for hybrid-mode users.
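To make the deferred Sampling path concrete, here is a minimal sketch of the JSON-RPC payload a `sampling/createMessage` request carries when a server asks the host for a completion. The message shape follows the MCP sampling request; the helper name `buildSamplingRequest` and the prompt wording are illustrative, not part of DocuMCP today.

```typescript
// Hypothetical sketch: the JSON-RPC request a server sends through the host
// when it needs an LLM completion via MCP Sampling. The host — not the
// server — resolves which model answers, under the user's policy and quotas.
type SamplingMessage = {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
};

function buildSamplingRequest(code: string, id: number) {
  const messages: SamplingMessage[] = [
    {
      role: "user",
      content: {
        type: "text",
        text: `Suggest a context-aware usage example for:\n${code}`,
      },
    },
  ];
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "sampling/createMessage",
    params: { messages, maxTokens: 512 },
  };
}

const req = buildSamplingRequest("export function add(a: number, b: number) {}", 1);
console.log(req.method); // "sampling/createMessage"
```

In an actual integration the server would not hand-build JSON-RPC frames; the MCP SDK exposes the request flow directly, but the wire shape above is what crosses the trust boundary either way.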
Alternatives Considered
Keep the LLM client but mark it deprecated and feature-flag-off by default
- Pros: Lowest immediate disruption; preserves a code path that can be revived if Sampling proves inadequate.
- Cons: The code path still exists, still requires maintenance whenever any provider's API drifts, and still creates a tempting non-Sampling shortcut for future contributors. The cost of keeping it dormant exceeds the cost of removing and re-implementing via Sampling later if needed.
- Decision: Rejected.
Migrate the LLM client to use MCP Sampling immediately
- Pros: Preserves the user-visible feature; aligns with MCP best practice without a feature gap.
- Cons: Sampling support is uneven across MCP hosts in 2026 — Claude Desktop has it; several other clients do not. Shipping a feature that silently degrades on hosts without Sampling is worse than removing it cleanly.
- Decision: Rejected for v0.6.0; revisit when host coverage is broad.
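When this alternative is revisited, the silent-degradation concern can be addressed with an explicit capability guard. The sketch below assumes the capability object a host advertises during MCP initialization includes a `sampling` key; `hasSamplingSupport` is a hypothetical helper, not current DocuMCP code.

```typescript
// Illustrative guard so a future Sampling integration fails loudly on hosts
// without sampling support, instead of degrading silently.
type ClientCapabilities = {
  sampling?: Record<string, unknown>;
  [key: string]: unknown;
};

function hasSamplingSupport(caps: ClientCapabilities | undefined): boolean {
  // Hosts that support Sampling advertise it in their capabilities at
  // initialize time; absence means sampling/createMessage would be rejected.
  return caps?.sampling !== undefined;
}

hasSamplingSupport({ sampling: {} }); // true — host advertises sampling
hasSamplingSupport({});               // false — feature should be refused, not degraded
```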
Restrict hybrid mode to Ollama only (local-only)
- Pros: Solves the privacy concern by keeping inference on the user's machine.
- Cons: Still maintains the multi-provider client codebase; still requires per-OS Ollama setup documentation; still leaves `DOCUMCP_LLM_BASE_URL` as an env-var surface. Marginal benefit, full cost.
- Decision: Rejected.
Keep the client; document privacy implications loudly; let users opt in
- Pros: Avoids the breaking change; lets sophisticated users self-select.
- Cons: "Opt-in privacy" is widely understood to be an antipattern. Aligns poorly with the MCP best-practice direction documented in CE-MCP-FINDINGS.md.
- Decision: Rejected.
Consequences
Positive
- Smaller surface. ~1,378 LOC removed across four source files plus tests.
- Simpler deployment. No LLM env vars to set, rotate, or document.
- Stronger privacy default. No automatic data egress to third-party LLM providers.
- Reduced maintenance burden. One fewer multi-provider client to track for breaking API changes.
- Cleaner test baseline. Deterministic AST analysis only.
- Architectural alignment. Aligns DocuMCP with MCP best practice as documented in CE-MCP-FINDINGS and reinforced by ADR-011.
Negative
- `simulate_execution` and hybrid semantic analysis are no longer available. Users who relied on them must implement their own host-side workflow or wait for the MCP Sampling integration.
- A subset of behavioral-change detection (semantic, beyond AST diffs) is lost in v0.6.0.
Risks and Mitigations
- Risk: External orchestrators that import `documcp` programmatically and depend on `LLMClient` or `SemanticAnalyzer`'s hybrid mode will break. Mitigation: the v0.6.0 `BREAKING CHANGE:` tag, a one-minor-version deprecation-warning grace period, and a CHANGELOG migration section for hybrid-mode consumers.
- Risk: Loss of behavioral-change detection beyond AST-level diffs. Mitigation: continued investment in AST analysis (see ADR-015) and an open path to MCP Sampling for cases that genuinely need LLM reasoning.
- Risk: The DocuMCP Orchestrator (external repo) must update to either drop hybrid mode or implement Sampling client-side. Mitigation: Flagged as an integration item; out of scope for this ADR but tracked separately.
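The deprecation-warning grace period could look like the sketch below: detect the soon-to-be-removed env vars at startup and warn once before the parsing logic is deleted. The helper name `warnDeprecatedLlmEnvVars` and the exact wording are assumptions, not the shipped implementation.

```typescript
// Minimal sketch of the one-version deprecation warning, assuming the env
// vars are checked during server startup in src/index.ts.
const DEPRECATED_LLM_ENV_VARS = [
  "DOCUMCP_LLM_PROVIDER",
  "DOCUMCP_LLM_API_KEY",
  "DOCUMCP_LLM_MODEL",
  "DOCUMCP_LLM_BASE_URL",
] as const;

function warnDeprecatedLlmEnvVars(
  env: Record<string, string | undefined>,
): string[] {
  const found = DEPRECATED_LLM_ENV_VARS.filter((name) => env[name] !== undefined);
  if (found.length > 0) {
    // Write to stderr so stdout stays clean for MCP protocol traffic.
    console.error(
      `DocuMCP: ${found.join(", ")} are deprecated and ignored; ` +
        "server-side LLM support was removed (see ADR-014).",
    );
  }
  return found;
}

warnDeprecatedLlmEnvVars({ DOCUMCP_LLM_PROVIDER: "ollama" }); // warns on stderr
```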
Implementation Tracking
This ADR is implemented by the following GitHub issues, all in milestone v0.6.0 — LLM Removal:
- #106 Epic: Remove server-side LLM dependency from DocuMCP
- #107 Delete LLM client and provider configuration
- #108 Refactor semantic-analyzer to AST-only mode
- #109 Remove simulate_execution tool and execution-simulator utility
- #110 Update ADR-009 and remove LLM how-to documentation
Evidence
- docs/CE-MCP-FINDINGS.md (2025-12-09): documents that MCP clients, not servers, are the correct LLM execution layer.
- `src/utils/llm-client.ts`: 369 LOC of multi-provider client logic spanning DeepSeek, OpenAI, Anthropic, and Ollama; each provider's API surface drifts independently.
- `src/utils/semantic-analyzer.ts`: 456 LOC that branches on `llm`/`ast`/`hybrid` modes; the hybrid path is the default and complicates the test baseline.
- `src/tools/simulate-execution.ts`: 553 LOC of LLM-based code tracing that cannot be made deterministic without a server-side LLM.
- `docs/how-to/llm-integration.md`: documents the `DOCUMCP_LLM_*` env-var surface that this ADR removes.
- GitHub issues #106–#110 (filed 2026-04-30): the concrete implementation tracking.
- MCP specification 2025-06-18, `sampling/createMessage`: defines the host-side completion request flow that replaces server-side LLM calls when needed.
Related Decisions
- ADR-009: Content Accuracy and Validation Framework (Phase 2.5 superseded by this ADR; rest remains Accepted)
- ADR-011: CE-MCP Compatibility
- ADR-015: Multi-Language AST Analysis Strategy (compensates for the loss of LLM-driven semantic detection)