AI Vyuh Security
aivyuh security
MCPThreat ModelSecurity ResearchAI Agents

MCP Security: The Complete Threat Model for AI Agents

7+ attack vectors in the Model Context Protocol — from tool poisoning to shadow servers. Includes a hardening checklist for production MCP deployments.

AI Vyuh Security ·

MCP security is the blind spot in most AI agent deployments today. The Model Context Protocol — Anthropic’s open standard for connecting AI agents to external tools — launched in late 2024 and is now integrated into Claude Desktop, Cursor, Windsurf, VS Code, and dozens of agent frameworks. It solves a real problem: giving AI agents structured, consistent access to databases, APIs, file systems, and third-party services.

But Model Context Protocol security hasn’t kept pace with adoption. MCP creates a new attack surface that traditional security tooling doesn’t cover. Your pentest won’t flag a poisoned tool description. Your WAF won’t catch a confused deputy attack. Your SIEM won’t see a shadow MCP server running on a developer’s laptop. The MCP security risks are real — researchers found 43% of public MCP server implementations contain command injection flaws.

This article is the definitive guide to MCP vulnerabilities: every known attack vector, real-world exploits and CVEs, and an 18-point hardening checklist for production deployments. It draws on disclosed vulnerabilities from Invariant Labs, Trail of Bits, JFrog, Elastic Security Labs, and the OWASP MCP Top 10 project.


The 7 MCP Attack Vectors

1. Tool Poisoning

Tool poisoning is the signature MCP vulnerability. Attackers embed hidden instructions inside MCP tool descriptions — text that the LLM processes as part of its context but that may be hidden from users in the IDE interface.

How it works: When a client calls tools/list, the MCP server returns tool definitions including descriptions. Attackers inject directives (often wrapped in <IMPORTANT> tags) into these docstrings. The LLM follows the embedded instructions, believing them to be legitimate tool documentation.

Real-world PoC (Invariant Labs): A poisoned add tool included a docstring instructing the model to read ~/.cursor/mcp.json and ~/.ssh/id_rsa, then exfiltrate them via a hidden sidenote parameter. The Cursor IDE confirmation dialog displayed only simplified information, masking the malicious payload. The attack succeeded — the agent read and transmitted the credentials.

The rug pull variant: MCP servers can change tool definitions after initial approval. A tool that was clean on Day 1 can silently reroute API keys by Day 7. No re-approval is triggered. This makes one-time audits insufficient — you need continuous monitoring of tool definition integrity.

Defenses: Pin tool packages with hash verification. Display full tool descriptions to users, not summaries. Use Invariant Labs’ mcp-scan to detect definition changes between sessions.


2. Prompt Injection via Tool Responses

This isn’t traditional prompt injection through user input — it’s injection through the data that MCP tools return. Attackers craft malicious content in data sources the MCP server reads, and when the agent fetches that data, the embedded instructions hijack subsequent behavior.

How it works: A GitHub issue, Jira ticket, email, or database record contains hidden prompt instructions. When a developer’s AI assistant processes this data via MCP tools, the embedded commands redirect agent behavior — exfiltrating data, modifying files, or escalating permissions.

Real-world PoC (Invariant Labs, May 2025): A public GitHub issue contained hidden prompt injection. When a developer’s AI assistant processed open issues via the GitHub MCP server, the embedded commands caused it to exfiltrate private repository data into a public pull request. The attack combined an over-privileged Personal Access Token with untrusted content flowing through the MCP tool chain.

Obfuscation techniques (Elastic Security Labs): Base64-encoded commands, ASCII smuggling using invisible Unicode characters, and ANSI escape sequences that are invisible to human reviewers but parsed by the LLM.

Defenses: Treat all tool outputs as untrusted input. Implement output sanitization layers. Consider Trail of Bits’ mcp-context-protector as a defense wrapper. Deploy guardrail models to scan tool responses before they reach the primary agent.


3. Confused Deputy Attacks

In a confused deputy attack, a malicious MCP server manipulates the LLM into misusing tools from a different, trusted MCP server. The trusted server becomes the unwitting “deputy” executing attacker-directed actions.

How it works: When multiple MCP servers are connected to a single agent, a malicious server’s tool description instructs the LLM to alter how it uses tools from other servers. The LLM doesn’t enforce trust boundaries between servers — it sees all tools in a single flat namespace.

Real-world PoC (Invariant Labs): A malicious add tool’s description contained instructions forcing a trusted send_email tool (from a separate, legitimate MCP server) to redirect all emails to attacker-controlled addresses. The user approved the add tool with no awareness of the cross-server manipulation.

Why this matters: Most MCP deployments connect multiple servers. A filesystem server, a database server, a communications server — each trusted independently. But the LLM treats them as a single toolbox with no isolation between them.

Defenses: Implement strict namespace isolation between MCP servers. Monitor cross-server tool invocations for unexpected patterns. Enforce per-server permission boundaries that the LLM cannot override.


4. Shadow MCP Servers

Shadow MCP servers are unapproved deployments operating outside organizational security governance — invisible to your SIEM, firewall, and security team. OWASP ranks this as MCP09 in their MCP Top 10.

How it works: Developers spin up local MCP servers with default credentials and permissive configurations for experimentation. These communicate over localhost, bypass centralized authentication, and create unmonitored attack surfaces. They persist long after the experiment ends.

Real-world scenarios documented by OWASP:

  • Unprotected shadow MCPs indexed by internal search engines, exposing customer datasets
  • Outdated framework versions on forgotten servers exploited for persistent backdoors
  • Malicious IDE plugins installed on shadow MCPs uploading API keys to command-and-control servers

Defenses: Maintain a centralized MCP registry tied to CI/CD enforcement. Run automated weekly shadow detection scans across your network. Require mTLS and OAuth for all MCP connections — no exceptions for “local” servers.


5. Supply Chain Attacks

The MCP ecosystem is growing fast, and the package registries (npm, pip) are a target. Malicious MCP server packages, typosquatting, and dependency attacks are already documented.

CVE-2025-6514 (CVSS 9.6) — mcp-remote RCE: The mcp-remote package had 437,000+ downloads and was recommended by Cloudflare, Hugging Face, and Auth0. A malicious MCP server could return a crafted OAuth authorization_endpoint that triggered command injection via PowerShell’s $() subexpression operator. Full remote code execution on the client machine. Discovered by JFrog, fixed in v0.1.16.

Malicious Postmark MCP Server (Sep 2025): A fake Postmark MCP package — a typosquat of the legitimate one — injected BCC copies of all outgoing emails to attacker-controlled addresses. Classic supply chain attack, MCP-specific impact.

Smithery Path Traversal (Oct 2025): A path traversal vulnerability in smithery.yaml configuration exposed Docker config files and Fly.io API tokens, affecting 3,000+ applications. Discovered by GitGuardian.

Defenses: Generate SBOMs for all MCP server dependencies. Pin packages with cryptographic hash verification. Only install from verified registries. Run MCP servers in Docker containers with minimal permissions. Audit your mcp.json / mcp_config.json for unfamiliar entries.


6. Transport-Layer Vulnerabilities

MCP supports two transports: stdio (local process communication) and SSE/Streamable HTTP (network-based). Both have security gaps that are easy to overlook.

stdio risks: No network port exposed, but scripts piped from the internet execute with zero validation. No built-in authentication mechanism. The security model relies entirely on OS-level process permissions.

SSE/HTTP risks: Exposes a network port vulnerable to DNS rebinding attacks — malicious websites can access local MCP servers by manipulating DNS records. Many implementations ship without TLS, OAuth, or CORS protections.

Credential storage (Trail of Bits, April 2025): Claude Desktop stores claude_desktop_config.json with world-readable permissions (-rw-r--r--). Cursor and Windsurf store conversation logs containing credentials at ~/.cursor/logs/conversations/ with identical permissions. The Figma MCP Server used fs.writeFileSync() creating files with 0666 permissions — world-readable and world-writable.

CVE-2025-6515 — Session hijacking in oatpp-mcp: Session ID generation used memory pointers (this pointer) instead of cryptographically secure random values. Memory allocators reuse freed addresses, enabling session hijacking by spraying messages with low event numbers. Discovered by JFrog.

Defenses: Enforce TLS/mTLS on all HTTP-based transports. Use OS-native credential APIs (Windows Credential Manager, macOS Keychain) instead of plaintext config files. Validate Origin headers on SSE connections. Generate session IDs with 128+ bits of cryptographic entropy.


7. Permission and Capability Escalation

Agents can gain more access than intended through MCP tool interactions — not through exploiting bugs, but through the natural behavior of LLMs following instructions in tool contexts.

Implicit tool invocation (Elastic Security Labs): A poisoned tool description instructs the LLM to call other tools without user approval. A documented PoC showed a benign daily_quote tool that triggered payment skim operations through indirect manipulation of a payment tool.

Pre-authorized tool exploitation: Tools like grep_search in GitHub Copilot ship pre-authorized — they execute without user confirmation. Attackers embed instructions requiring the LLM to invoke these built-in tools to automatically exfiltrate secrets from the local filesystem.

Semantic parameter leakage: Parameters with names like summary_of_environment_details cause the LLM to auto-populate them with system state, file contents, and chat history — without any explicit request. The parameter name itself is the attack vector.

Excessive permission scopes (OWASP MCP02): MCP servers request broad OAuth scopes that expand over time. Temporary elevated permissions granted for a single operation become permanent, creating ambient authority.

Defenses: Enforce least-privilege access per session, not per server. Disable auto-approve for all tools. Require manual approval for operations touching sensitive resources. Audit parameter names in tool definitions for semantic leakage patterns.


Real-World Exploits and CVEs

The MCP attack surface isn’t theoretical. Here are documented incidents:

IncidentSeverityDiscovererDate
CVE-2025-6514 — RCE in mcp-remote via OAuth endpoint injectionCVSS 9.6JFrogJul 2025
CVE-2025-6515 — Session hijacking via memory pointer reuseHighJFrog2025
CVE-2025-53967 — RCE in Figma MCP Server via command injectionCVSS 7.5CommunityOct 2025
CVE-2025-49596 — RCE in MCP Inspector via unauthenticated listenersCVSS 9.4Researchers2025
WhatsApp MCP chat history exfiltrationCriticalInvariant LabsApr 2025
GitHub MCP private repo exfiltrationHighInvariant LabsMay 2025
Asana MCP cross-tenant data exposureHighAsanaJun 2025
Anthropic Filesystem MCP sandbox escape via symlinkHighResearchersAug 2025
Malicious Postmark MCP (typosquat)CriticalCommunitySep 2025
Smithery path traversal exposing 3,000+ app tokensCriticalGitGuardianOct 2025

The key statistic: Researchers analyzing publicly available MCP server implementations found that 43% contained command injection flaws and 30% permitted unrestricted URL fetching.


The MCP Hardening Checklist

This is the actionable output. 18 controls across 5 domains, designed for production MCP deployments.

Transport Security

  1. Enforce TLS 1.3 on all HTTP-based MCP transports. No plaintext SSE connections, even in staging.
  2. Implement mTLS for server-to-server MCP connections. Both sides authenticate, not just the client.
  3. Validate Origin and Host headers on all SSE endpoints to block DNS rebinding attacks.
  4. Use OS-native credential storage (macOS Keychain, Windows Credential Manager, Linux Secret Service) — never store tokens in plaintext config files.

Tool Governance

  1. Maintain an explicit tool allowlist. Default-deny. No tool executes without prior approval.
  2. Pin MCP server packages with cryptographic hashes. Detect supply chain tampering before it reaches your agents.
  3. Hash tool definitions and alert on changes. Use mcp-scan or equivalent to detect rug-pull modifications between sessions.
  4. Display full tool descriptions to users — not truncated summaries. Hidden text in descriptions is the #1 attack vector.

Input/Output Security

  1. Sanitize all tool outputs before they reach the LLM context. Strip control characters, Unicode smuggling, ANSI escapes, and HTML/script tags.
  2. Validate tool inputs against strict schemas. Reject unexpected parameters. Audit parameter names for semantic leakage patterns (summary_of_, details_about_).
  3. Rate-limit tool calls per session. An agent making 500 file reads in 30 seconds is exfiltrating, not working.
  4. Implement output size limits. A tool returning 10MB of data is either broken or being exploited.

Server Provenance

  1. Run MCP servers in containerized environments (Docker/OCI) with minimal permissions and no host network access.
  2. Generate and maintain SBOMs for all MCP server dependencies. Scan for known CVEs weekly.
  3. Maintain a centralized MCP server registry. Every server deployed in your organization must be registered, version-tracked, and owner-assigned.
  4. Run weekly shadow MCP server scans. Detect unauthorized deployments across your network.

Monitoring and Response

  1. Log all MCP tool calls with full input/output payloads, timestamps, and session context. Ship to your SIEM.
  2. Deploy anomaly detection on agent behavior patterns: unusual tool sequences, cross-server calls, after-hours activity, spike in data volume returned by tools.

MCP vs Direct API Calls: Security Trade-Offs

A common question: why not skip MCP and call APIs directly?

Direct API calls give you full control — you define the exact request, validate the response, and handle errors explicitly. The attack surface is well-understood and covered by existing security tooling (WAFs, API gateways, rate limiters).

MCP adds a layer of abstraction. The LLM decides which tool to call, what parameters to pass, and how to interpret the response. This creates the attack surface described above: the LLM becomes an intermediary that can be manipulated through tool descriptions, tool responses, and cross-server interactions.

The trade-off is real:

FactorDirect APIMCP
Agent flexibilityLow — hardcoded integrationsHigh — dynamic tool discovery
Attack surfaceTraditional API risksAPI risks + LLM manipulation layer
Security tooling maturityMature (WAFs, gateways)Nascent (mcp-scan, context-protector)
Multi-tool orchestrationManual plumbingBuilt-in protocol
Audit trailStandard API logsRequires MCP-aware logging

The pragmatic answer: MCP is worth using for multi-tool agent systems where the flexibility gains are substantial. But it must be hardened. Treat MCP like you treated early REST APIs — powerful, useful, and dangerous without proper security controls. For a broader view of the threat landscape beyond MCP, see our coverage of broader AI agent security risks.


Frequently Asked Questions

Is MCP safe for production?

MCP can be used safely in production, but it requires deliberate hardening. Out-of-the-box MCP deployments carry significant risks — researchers found 43% of public MCP server implementations contain command injection flaws. Production use demands transport-layer encryption (TLS/mTLS), tool allowlisting, output sanitization, server provenance verification, and continuous monitoring. Follow the 18-point hardening checklist above before deploying MCP to production.

How to secure MCP servers?

Secure MCP servers by: (1) enforcing TLS/mTLS on all transports, (2) implementing OAuth 2.0 with short-lived tokens instead of static API keys, (3) allowlisting approved tools explicitly, (4) validating and sanitizing all tool inputs and outputs, (5) running servers in containerized environments with minimal permissions, (6) pinning server packages with hash verification to prevent supply chain attacks, (7) logging all tool calls with full input/output for audit, and (8) deploying anomaly detection on agent behavior patterns.


Further Reading


Need an MCP security assessment? Your agents are only as secure as the tools they connect to. AI Vyuh Security provides purpose-built AI agent security assessments — including MCP-specific threat modeling, tool definition auditing, and cross-server trust boundary testing. Backed by AI Vyuh, the AI agent economy platform.

Request an assessment →


MCP vulnerabilities are a critical part of the broader AI agent attack surface. For the big picture on why traditional pentests miss these risks entirely, read why AI agents need their own security assessment.

If your MCP-connected agents run on AI-generated code, the security exposure compounds — 53% of AI-generated code ships with vulnerabilities, and many of those vulnerabilities sit in the exact input validation and error handling paths that MCP exploits target.