June 10, 2026 · Averta Team · 20 minute read

MCP Security: Threats, Best Practices, and Hardening

MCP is the default integration layer for AI agents. The threat model, real incidents and CVEs, MCP security best practices, and how to deploy MCP safely.

The Model Context Protocol (MCP), introduced by Anthropic in late 2024 and donated to the Linux Foundation's Agentic AI Foundation in December 2025, has become the de facto standard for connecting AI agents to tools, data, and other systems. With tens of millions of monthly SDK downloads and thousands of active servers, it is the default integration layer for new agentic deployments. That adoption velocity has also created the fastest-growing attack surface in AI security.

MCP servers expose tools, resources, and prompts to agents. The agent reads the descriptions, decides what to call, and acts on the results. Every step of that loop is a potential injection point: the description text, the tool schema, the resource content, the result returned, and the metadata of the server itself. And the surface is no longer theoretical. 2025 brought the first malicious MCP server found in the wild, multiple critical CVEs in widely used MCP components, and a steady stream of attack research from Microsoft, Palo Alto Networks Unit 42, Docker, Invariant Labs, and OWASP working groups.

The sections below cover the working MCP threat model, the MCP security best practices checklist for anyone building or operating servers, the four-layer defender's architecture for deploying MCP safely in production, the real incidents defenders should know, and how MCP security maps to the OWASP Top 10 for Agentic Applications 2026. The article is written for security and platform leaders. It contains no working payloads.

What is MCP security?

MCP security is the discipline of protecting AI agents and their integrations from threats introduced by the Model Context Protocol layer. It covers the security of MCP servers (the tool and resource providers), MCP clients (the agents calling them), the communication between them, and the runtime controls that govern what actions the agent is allowed to take based on MCP-mediated information.

The Model Context Protocol is an open standard, created by Anthropic and now governed under the Linux Foundation's Agentic AI Foundation, that lets agents discover and use tools, data sources, and prompts in a uniform way. MCP security is what stops a hostile MCP server, a poisoned tool description, a malicious resource, or a compromised tool result from compromising the agent that consumes it.

Why MCP security matters now

Three things changed in the last 18 months that put MCP security at the top of the agent-security agenda.

MCP became the default agent integration layer. Custom integration code and ad-hoc framework wrappers are giving way to MCP servers. Adoption by Microsoft, Google, OpenAI, and the broader vendor ecosystem, and the protocol's move to neutral governance under the Agentic AI Foundation (co-founded by Anthropic, OpenAI, and Block), made MCP the lowest-friction path to agent-tool integration. Every new MCP server is a new tool surface every consuming agent can reach.

The supply chain expanded. The official MCP Registry and community registries host thousands of MCP servers. Some are curated. Many are not. Agents that bind to community MCP servers inherit the trust assumptions of those servers without a vendor security review. In September 2025 that risk stopped being hypothetical: a fake "postmark-mcp" npm package became the first malicious MCP server found in the wild, silently copying every email sent through it to an attacker.

The first wave of attacks and CVEs landed. Beyond research from Microsoft, Unit 42, Docker, and Invariant Labs, 2025-2026 produced real, scored vulnerabilities in MCP components themselves: a CVSS 9.6 remote code execution flaw in the mcp-remote proxy, an RCE in the MCP Inspector developer tool, and a persistent code-execution flaw in how a major AI IDE handled MCP configuration.

Combined, these forces moved MCP security from "emerging concern" in mid-2025 to "table-stakes operational risk" in mid-2026. MCP security is one part of the broader agentic AI security program; this piece covers the MCP layer specifically.

The MCP threat model: 11 categories of attack

The categories below describe what attackers can do when MCP is part of the deployment. Each entry explains the mechanism of the attack, not a step-by-step technique.

Figure: MCP security injection surface. Tool descriptions, tool results, resource content, and sampling all flow from server to agent.
Caption: The MCP session loop. The agent only sends tool calls; everything flowing the other way is text the model will read, which makes each flow an injection surface.

1. Server impersonation

An attacker stands up an MCP server that impersonates a legitimate one (typo-squatted name, identical tool list, same metadata). An agent or developer who selects the server based on name and description connects to the malicious server instead. From that moment, every tool call goes to the attacker. The postmark-mcp incident was exactly this: a package impersonating a legitimate email service, with one hostile line added.

Defender's view. Discovery and trust on MCP must be cryptographic, not nominal. Servers should be signed, the agent should pin known-good signatures, and any new server should require explicit approval before binding.
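
The pinning idea can be sketched in a few lines. This is a minimal illustration, assuming the client keeps a local allowlist that maps each approved server name to the SHA-256 digest of its release artifact. The names `PINNED_SERVERS` and `may_bind` and the example entry are hypothetical, and a bare hash stands in for a real publisher signature check.

```python
import hashlib
import hmac

# Hypothetical allowlist: server name -> SHA-256 hex digest of the approved artifact.
PINNED_SERVERS = {
    "example-mail-mcp": hashlib.sha256(b"approved release bytes").hexdigest(),
}

def may_bind(server_name: str, artifact: bytes) -> bool:
    """Return True only if the server is pinned and its artifact matches the pin."""
    pinned = PINNED_SERVERS.get(server_name)
    if pinned is None:
        return False  # unknown server: needs explicit approval before binding
    actual = hashlib.sha256(artifact).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(pinned, actual)
```

A look-alike name fails because it is not in the allowlist, and a familiar name with altered contents fails because the digest no longer matches.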

2. MCP tool poisoning

The MCP server publishes a tool description that includes hidden instructions inside the description text or the schema. When the agent reads the available tools (which is part of every MCP session bootstrap), the injected instruction takes effect before any tool is even called. The attacker only has to control what the agent reads about the tool, not the tool's behavior. Invariant Labs documented this category in April 2025 and it has been a staple of MCP attack research since.

Defender's view. Treat tool descriptions as untrusted input. Classify them at ingestion. Mark instruction-eligible content explicitly so the agent's policy layer can detect override attempts.
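
As a rough sketch of classification at ingestion, the snippet below flags tool descriptions that contain invisible characters or instruction-like phrasing. The pattern list and the function name `classify_description` are illustrative assumptions; a handful of regular expressions is a heuristic, not a complete classifier.

```python
import re
import unicodedata

# Illustrative patterns only; a production classifier would be far broader.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.I),
    re.compile(r"do\s+not\s+(tell|mention|reveal)", re.I),
    re.compile(r"<\s*(system|important|secret)\s*>", re.I),
]

def classify_description(text: str) -> list[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    # Invisible format characters (Unicode category Cf) can hide text from reviewers.
    if any(unicodedata.category(ch) == "Cf" for ch in text):
        findings.append("invisible-characters")
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(text):
            findings.append(f"pattern:{pattern.pattern}")
    return findings
```

A non-empty result would typically route the server to human review rather than silently dropping the tool.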

3. Indirect prompt injection via tool description

Closely related to tool poisoning but worth its own entry. An MCP server hosts a tool that looks legitimate but whose description contains an injected instruction targeted at a specific downstream behavior. The agent reads the description, the instruction enters the model's context, and the model treats it as authoritative.

Defender's view. Same as tool poisoning, plus runtime plan-level review. The agent's plan should be inspected separately from its text output, and plans that exceed the agent's chartered scope should be blocked even if the model produced them. For the broader walkthrough, see our What is Prompt Injection guide.

4. Indirect prompt injection via resource content

MCP servers expose resources (documents, data sources, structured content). When an agent reads a resource, the resource content enters its context. An attacker who can write to the resource (a Confluence page, a wiki, a database row, a GitHub issue) can place hostile instructions there. The agent reads the resource, follows the instructions, and the user has no idea what just happened.

Defender's view. Input classification on every resource the agent ingests. Treat resources as untrusted by default, regardless of the system they came from.
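
One way to make "untrusted by default" concrete is to wrap every resource in a type that carries its provenance and trust flags. The sketch below assumes hypothetical names (`IngestedContent`, `ingest_resource`, `render_for_model`) and a made-up resource URI. Delimiters alone do not stop injection; they give the policy layer something to key on.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IngestedContent:
    source: str                        # e.g. "mcp://wiki/page/123" (hypothetical URI)
    text: str
    trusted: bool = False              # untrusted unless explicitly promoted
    instruction_eligible: bool = False

def ingest_resource(source: str, text: str) -> IngestedContent:
    """Wrap resource content as untrusted data, regardless of where it came from."""
    return IngestedContent(source=source, text=text)

def render_for_model(item: IngestedContent) -> str:
    """Present the content to the model as labeled data with its provenance."""
    label = "INSTRUCTIONS" if item.instruction_eligible else "UNTRUSTED DATA"
    return f"[{label} from {item.source}]\n{item.text}\n[END {label}]"
```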

5. Sampling injection

MCP supports a "sampling" capability in which the server can ask the client (the agent) to generate text on its behalf. Public Unit 42 research demonstrated that sampling can be abused: a hostile server can request samples that, when combined with the agent's existing context, produce instruction-overriding behavior in the agent's next action. The attack mechanism is non-obvious to developers who think of sampling as a benign capability.

Defender's view. Disable sampling for untrusted servers, classify sampling requests, and treat sample outputs as untrusted before they re-enter the agent's planning loop.
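
A client-side gate for sampling requests might look like the following sketch. The allowlist `SAMPLING_ALLOWED`, the cap `MAX_SAMPLING_TOKENS`, and the `SamplingRequest` shape are assumptions for illustration, not fields from the MCP specification.

```python
from dataclasses import dataclass

# Hypothetical set of servers that passed review and may use sampling.
SAMPLING_ALLOWED = {"internal-docs-mcp"}
MAX_SAMPLING_TOKENS = 512  # assumed cap; tune per deployment

@dataclass
class SamplingRequest:
    server: str
    prompt: str
    max_tokens: int

def review_sampling_request(req: SamplingRequest) -> tuple[bool, str]:
    """Decide whether a server-initiated sampling request may proceed."""
    if req.server not in SAMPLING_ALLOWED:
        return False, "sampling disabled for untrusted server"
    if req.max_tokens > MAX_SAMPLING_TOKENS:
        return False, "requested token budget exceeds cap"
    return True, "allowed; output must still be handled as untrusted"
```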

6. Capability escalation

The agent connects to multiple MCP servers, each with a limited set of capabilities. An attacker exploits the combination: server A's tool is benign, server B's tool is benign, but the chain (A then B with specific parameters) achieves an action the agent would not have been allowed to take through either alone.

Defender's view. Plan-level review that inspects the agent's full proposed sequence, not individual tool calls in isolation. Identity-bound scope that limits what combinations are reachable.
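
To illustrate reviewing the sequence rather than each call, the sketch below checks a proposed plan against a small policy of forbidden tool orderings. The tool names and the `FORBIDDEN_SEQUENCES` policy are hypothetical; a real policy engine would also consider parameters and identity.

```python
# Hypothetical policy: ordered (earlier, later) tool pairs that must never
# appear in the same plan, even though each tool is allowed on its own.
FORBIDDEN_SEQUENCES = {
    ("crm.read_customer_records", "mail.send_external"),
}

def review_plan(plan: list[str]) -> list[tuple[str, str]]:
    """Return every forbidden (earlier, later) pair found in the proposed plan."""
    violations = []
    for i, earlier in enumerate(plan):
        for later in plan[i + 1:]:
            pair = (earlier, later)
            if pair in FORBIDDEN_SEQUENCES and pair not in violations:
                violations.append(pair)
    return violations
```

The check catches the pair even when harmless steps sit between the two calls, which per-call inspection would miss.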

7. Subcontractor MCP server compromise (supply chain)

The MCP server the agent uses is itself a wrapper around upstream services. If an upstream service is compromised, the MCP server faithfully relays the compromise. The agent has no way to know the upstream is hostile. This is the MCP instance of ASI04:2026 Agentic Supply Chain Compromise in the OWASP agentic framework, and of LLM03:2025 Supply Chain in the LLM framework.

Defender's view. AI Bill of Materials extended to MCP server composition. Subcontractor disclosures from MCP server operators. Periodic re-evaluation of upstream trust assumptions.

8. Privilege confusion across tools

The agent has been granted permission for tool A and tool B, each with appropriate scope. An attacker crafts a sequence of tool calls that uses tool A's identity context to authenticate to tool B, or that uses tool B's data to satisfy tool A's expected input shape. The composition crosses scope boundaries the developer thought were isolated.

Defender's view. Identity guardrails that enforce scope at the moment of action, not just at session start. Audit of every tool call's identity context and parameter origin.

9. Resource exhaustion via MCP

A hostile MCP server returns huge or recursive responses that consume the agent's context budget, the agent's compute budget, or downstream API quotas. This is LLM10:2025 Unbounded Consumption applied at the MCP layer, and a common trigger for cascading failures across connected agents.

Defender's view. Cost and rate guardrails at the MCP boundary. Hard limits on per-server response size, per-tool call rate, and total tokens per agent run.
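
A hard limit at the MCP boundary can be as simple as the sketch below, which rejects oversized responses and charges the rest against a per-run budget. The class name `RunBudget` and the four-bytes-per-token estimate are assumptions; a real deployment would use the model's tokenizer.

```python
class BudgetExceeded(Exception):
    pass

class RunBudget:
    """Tracks per-response and per-run limits for one agent run."""

    def __init__(self, max_response_bytes: int, max_total_tokens: int):
        self.max_response_bytes = max_response_bytes
        self.max_total_tokens = max_total_tokens
        self.tokens_used = 0

    def admit(self, response: str) -> str:
        """Reject oversized responses and charge the rest against the run budget."""
        size = len(response.encode("utf-8"))
        if size > self.max_response_bytes:
            raise BudgetExceeded(f"response of {size} bytes exceeds per-call cap")
        # Rough assumption: about 4 bytes per token.
        estimated = size // 4 + 1
        if self.tokens_used + estimated > self.max_total_tokens:
            raise BudgetExceeded("agent run token budget exhausted")
        self.tokens_used += estimated
        return response
```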

10. Server-side data exfiltration

The MCP server records every prompt, every tool call, and every result it processes for the agent. An attacker who controls the server has visibility into the agent's entire operational trace, including any sensitive content the agent processed. Even if the server's code is benign today, a compromise of the server's infrastructure becomes a compromise of every agent that ever connected to it.

Defender's view. Treat MCP servers as data processors under the same controls as any other third-party service. Apply data residency, retention, and breach-notification expectations. Prefer self-hosted or signed-and-attested MCP servers for sensitive workflows.

11. Remote code execution through vulnerable MCP components

The newest category, and in CVE terms the most active: classic software vulnerabilities in the MCP tooling itself. In 2025-2026 this included a CVSS 9.6 command injection in the widely used mcp-remote proxy (CVE-2025-6514, over 400,000 weekly downloads at disclosure), an RCE in the MCP Inspector developer tool (CVE-2025-49596), a persistent code-execution flaw in Cursor's MCP configuration handling (CVE-2025-54136, "MCPoison"), and a three-CVE RCE chain in the reference mcp-server-git implementation. An attacker does not need to poison a prompt if the MCP plumbing executes their code directly.

Defender's view. MCP components are software dependencies and need the same hygiene: version pinning, vulnerability scanning, fast patching, and sandboxed execution so a compromised component cannot reach the host. This is ASI05:2026 Unexpected Code Execution territory.
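
The version-hygiene point can be sketched as a check of installed components against a table of minimum patched versions. The component names and version numbers below are placeholders, not real advisories; in practice this table would come from a vulnerability scanner or advisory feed.

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '1.4.2' (no pre-release tags)."""
    return tuple(int(part) for part in v.split("."))

# Hypothetical component names and minimum patched versions, for illustration only.
MINIMUM_PATCHED = {
    "example-mcp-proxy": "0.2.0",
    "example-mcp-inspector": "1.4.3",
}

def outdated_components(installed: dict[str, str]) -> list[str]:
    """Return the names of installed components below their minimum patched version."""
    stale = []
    for name, version in installed.items():
        minimum = MINIMUM_PATCHED.get(name)
        if minimum and parse_version(version) < parse_version(minimum):
            stale.append(name)
    return sorted(stale)
```

Versions are compared as integer tuples so that 0.10.0 correctly sorts above 0.2.0, which a plain string comparison would get wrong.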

MCP security best practices: the server hardening checklist

For developers building or operating MCP servers, the following MCP security best practices cover the controls most often missed in early deployments. The list draws from the OWASP Agentic Security Initiative's Practical Guide for Secure MCP Server Development and from the public research and incidents above.

MCP authentication and authorization

  • Implement MCP authorization per the official MCP authorization specification, which is now OAuth 2.1 based and has been hardened substantially since mid-2025
  • Use short-lived, scoped tokens for every client session
  • Enforce per-tool authorization, not just per-server
  • Validate the calling client's identity on every tool call, not just at session start
  • Reject tool calls when the calling identity does not match the requested scope
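
The per-call checks in this list can be sketched as a single decision function. The token shape, scope names, and `TOOL_SCOPES` mapping are hypothetical; a real server would first validate a signed OAuth 2.1 access token and only then apply logic like this.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class AccessToken:
    subject: str
    scopes: frozenset    # e.g. frozenset({"tickets.read"}); scope names are hypothetical
    expires_at: float    # Unix timestamp

# Hypothetical mapping from tool name to the scope it requires.
TOOL_SCOPES = {"list_tickets": "tickets.read", "close_ticket": "tickets.write"}

def authorize_call(token: AccessToken, tool: str, now: Optional[float] = None) -> bool:
    """Check expiry and per-tool scope on every call, not only at session start."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False  # short-lived token has expired
    required = TOOL_SCOPES.get(tool)
    # Unknown tools are denied by default.
    return required is not None and required in token.scopes
```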

Tool definition integrity

  • Sign tool definitions and schemas
  • Pin published descriptions to a versioned hash
  • Validate that runtime tool invocations match the signed schema
  • Reject any tool definition change without a re-signature event
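
A minimal sketch of definition integrity follows: serialize the tool definition canonically, then sign and verify it. HMAC with a shared key is used here only to keep the example within the standard library; an asymmetric signature is the better fit in practice, since clients should not hold the signing key. The function names are hypothetical.

```python
import hashlib
import hmac
import json

def canonical_bytes(tool_def: dict) -> bytes:
    """Serialize deterministically so equal definitions always hash the same."""
    return json.dumps(tool_def, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign_tool(tool_def: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the canonical definition."""
    return hmac.new(key, canonical_bytes(tool_def), hashlib.sha256).hexdigest()

def verify_tool(tool_def: dict, signature: str, key: bytes) -> bool:
    """Reject any definition whose content changed since it was signed."""
    return hmac.compare_digest(sign_tool(tool_def, key), signature)
```

Because keys are sorted before hashing, reordering fields does not invalidate the signature, while any change to the description or schema does.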

Resource content classification

  • Classify all returned resource content as untrusted by default
  • Mark instruction-eligible content explicitly so consuming agents can apply policy
  • Apply the same content classifiers to resources that are applied to user inputs

Sandboxing

  • Run tool implementations in sandboxed environments (containers, microVMs) with no host filesystem access by default
  • Drop privileges aggressively in the tool execution context
  • Allowlist network egress per tool, not per server

Logging and audit

  • Log every tool invocation with full request and response content (subject to data residency rules)
  • Log identity context for every call
  • Make audit logs append-only and tamper-evident
  • Retain logs long enough to satisfy incident-response and compliance obligations
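
One common way to make a log tamper-evident is a hash chain, sketched below with a hypothetical `AuditLog` class. Each entry commits to the hash of the previous one, so editing an earlier record breaks verification. Detecting truncation of the tail additionally requires storing the latest hash somewhere the server cannot rewrite.

```python
import hashlib
import json

GENESIS = "0" * 64

class AuditLog:
    """Append-only log where each entry commits to the hash of the previous one."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(record: dict) -> str:
        # Hash everything except the stored hash itself.
        body = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

    def append(self, identity: str, tool: str, request: dict, response: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        record = {"identity": identity, "tool": tool,
                  "request": request, "response": response, "prev": prev}
        record["hash"] = self._digest(record)
        self.entries.append(record)
        return record["hash"]

    def verify(self) -> bool:
        """Recompute the chain; an edited entry or one removed mid-chain breaks it."""
        prev = GENESIS
        for record in self.entries:
            if record["prev"] != prev or record["hash"] != self._digest(record):
                return False
            prev = record["hash"]
        return True
```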

Rate limits and resource caps

  • Per-tool rate limits per identity
  • Per-session token caps
  • Maximum response size per tool call
  • Recursion-depth limits for chained tool calls
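
The first and last items in this list can be sketched as a sliding-window limiter keyed by identity and tool, plus a depth check. The class name, the window size, and the default depth of 5 are assumptions chosen for illustration.

```python
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limit on calls per (identity, tool) pair."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self.calls = defaultdict(deque)

    def allow(self, identity: str, tool: str, now: float) -> bool:
        recent = self.calls[(identity, tool)]
        while recent and now - recent[0] >= self.window:
            recent.popleft()  # drop timestamps that have left the window
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True

def check_depth(depth: int, max_depth: int = 5) -> bool:
    """Refuse chained tool calls nested deeper than max_depth (assumed default)."""
    return depth <= max_depth
```

Passing `now` explicitly keeps the limiter deterministic in tests; production code would supply a monotonic clock reading.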

Dependency and supply chain hygiene

  • Pin MCP component versions (clients, proxies, servers) and scan them like any other dependency
  • Document and disclose every upstream service the MCP server depends on
  • Apply equivalent security expectations to upstream services
  • Provide vendor change-notification to consuming customers

Update and patch management

  • Signed releases
  • Semantic versioning for breaking changes
  • Vulnerability disclosure process advertised in the server metadata
  • Fast-channel security updates separate from feature releases

Pentesting and red teaming

  • Pre-launch adversarial testing on every new MCP server
  • Continuous red teaming for production servers
  • Public bug bounty for community-deployed servers

Vulnerability disclosure

  • Standard security.txt advertising the disclosure contact
  • Coordinated disclosure timelines
  • Public advisories when vulnerabilities are fixed

This checklist is the operational backbone of MCP server security. Most early MCP deployments cover authentication and rate limits but skip tool definition signing, sandboxing, and audit. The skipped items are exactly the ones the public attack research targets.

The defender's architecture: where to enforce MCP security

A production MCP deployment needs controls at four layers. Each catches a different class of attack. None is sufficient on its own.

Figure: Four-layer MCP security architecture. Server hardening, transport, client-side classification, and a governed MCP gateway.
Caption: Where MCP security gets enforced. The gateway is the one layer that holds even when the agent's own SDK or a previously trusted server is compromised.

Server-side controls. Run the hardening checklist above on the MCP server itself. This is the only layer that sits inside the server operator's control and the only layer that can attest to the integrity of tool definitions and resource content.

Transport-layer controls. Mutual TLS, signed messages, and integrity checks on the MCP traffic itself. Catches tampering and impersonation between client and server. Becomes critical when MCP traffic crosses untrusted networks or transits through brokers.

Client-side and SDK-level controls. The agent's MCP client should classify every input it receives from the MCP layer (tool descriptions, resource content, tool results) before passing it to the model. The classification should happen in the agent's SDK, not in the model.

A governed MCP gateway with inline runtime guardrails. Independent of the agent's own code, a governed MCP gateway consolidates every MCP connection behind one proxy: per-agent tool permissions, MCP authentication held at the gateway instead of scattered across agents, and an audit trail for every tool call. Combined with inline runtime enforcement on every tool call, this is the only layer that can catch attacks where the agent's SDK has been compromised, where a previously-trusted server starts behaving badly, or where multi-server interactions cross scope boundaries. For the vendor landscape that addresses this layer, see our top AI agent security tools buyer's guide.
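
The gateway's core decision can be sketched as a single choke point that checks per-agent permissions and records every attempt. The `McpGateway` class, the "server.tool" naming, and the in-memory audit list are hypothetical simplifications; a real gateway would also hold downstream credentials and proxy actual MCP traffic.

```python
from typing import Callable

class McpGateway:
    """Single choke point: per-agent tool permissions plus an audit trail."""

    def __init__(self, permissions: dict):
        self.permissions = permissions   # agent id -> set of allowed "server.tool" names
        self.backends = {}               # "server.tool" -> handler callable
        self.audit = []                  # (agent, tool, decision) tuples

    def register(self, qualified_tool: str, handler: Callable[[dict], dict]) -> None:
        self.backends[qualified_tool] = handler

    def call(self, agent: str, qualified_tool: str, args: dict) -> dict:
        allowed = qualified_tool in self.permissions.get(agent, set())
        if not allowed or qualified_tool not in self.backends:
            self.audit.append((agent, qualified_tool, "denied"))
            raise PermissionError(f"{agent} may not call {qualified_tool}")
        self.audit.append((agent, qualified_tool, "allowed"))
        return self.backends[qualified_tool](args)
```

Denied calls are logged as well as allowed ones, since a pattern of denials is often the first visible sign of a hijacked agent.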

The strongest production deployments combine all four. Server-side controls and transport controls protect the MCP layer itself. SDK, gateway, and runtime controls protect the agent that consumes it. The two halves of the architecture defend against different threat models and together produce a defensible posture.

MCP security in OWASP and the broader AI security frameworks

MCP-specific risks map cleanly to the OWASP Top 10 for Agentic Applications 2026 (ASI01:2026 to ASI10:2026) and to several entries in the OWASP Top 10 for LLM Applications (LLM01:2025 to LLM10:2025).

MCP threat category | OWASP mapping
Server impersonation | ASI03 Agent Identity & Privilege Abuse, ASI04 Agentic Supply Chain Compromise
MCP tool poisoning | ASI01 Agent Goal Hijack, ASI04, LLM01:2025 Prompt Injection
Indirect injection via tool description | ASI01, LLM01:2025
Indirect injection via resource content | ASI01, ASI06 Memory & Context Poisoning, LLM01:2025
Sampling injection | ASI01, LLM01:2025
Capability escalation | ASI02 Tool Misuse & Exploitation, LLM06:2025 Excessive Agency
Subcontractor compromise | ASI04, LLM03:2025 Supply Chain
Privilege confusion across tools | ASI03, ASI02
Resource exhaustion via MCP | LLM10:2025 Unbounded Consumption, amplifier for ASI08 Cascading Agent Failures
Server-side data exfiltration | LLM02:2025 Sensitive Information Disclosure
RCE through vulnerable components | ASI05 Unexpected Code Execution, ASI04, LLM03:2025

MCP security is the operational specialization of the agentic threat model; the OWASP frameworks are the canonical reference. For the broader twelve-category agentic threat model and the eight-layer defense, see our agentic AI security guide.

Real-world MCP security incidents

The incidents and research below are what moved MCP security from theory to practice. Each is summarized at the level of mechanism. Specific exploit details are in the original disclosures.

Figure: Five real MCP security incidents. postmark-mcp, mcp-remote RCE, MCP Inspector, MCPoison, and the GitHub data heist.
Caption: The 2025 incident record that made MCP security an operational concern rather than a research topic.

postmark-mcp: the first malicious MCP server in the wild (September 2025). A fake npm package impersonating the Postmark email service shipped fifteen legitimate versions, built trust, and then added one line in version 1.0.16 that silently BCC'd every email sent through it to an attacker-controlled address. Roughly 1,500 weekly installs were affected. It is the canonical demonstration of MCP supply-chain risk: the server worked exactly as advertised, plus one thing more.

CVE-2025-6514: remote code execution in mcp-remote (2025). A CVSS 9.6 command injection in the mcp-remote proxy used by local MCP clients (Claude Desktop, Cursor, Windsurf) to reach remote servers. A malicious or compromised remote server could execute arbitrary commands on the connecting developer's machine. The package had over 400,000 weekly downloads at disclosure.

CVE-2025-49596: the MCP Inspector drive-by (2025). An RCE in Anthropic's MCP Inspector developer tool, documented in Docker's MCP Horror Stories series as the "drive-by localhost breach": a malicious webpage could reach the Inspector listening on localhost and execute code on the developer's machine.

MCPoison, CVE-2025-54136 (August 2025). A persistent code-execution flaw in how Cursor handled MCP configuration changes: once a user approved an MCP server, the configuration could be silently swapped for a malicious one without re-approval.

The Invariant Labs GitHub MCP data heist (April-May 2025). Researchers demonstrated that a crafted GitHub issue could hijack an agent connected to the GitHub MCP server, turning a routine "check open issues" request into exfiltration of private repository data through the agent's own over-broad personal access token.

Palo Alto Networks Unit 42 on MCP threats (2025-2026). Unit 42's agentic threat research catalogued the top AI agent risks and demonstrated sampling-based injection vectors in MCP-using agents, changing how several agent platforms handle the sampling capability.

Microsoft research on indirect prompt injection in MCP (2025). Microsoft's security teams published research on how indirect prompt injection through MCP-mediated content compromises consuming agents, informing the guardrails in Microsoft's own copilot products.

OWASP Agentic Security Initiative MCP guide (2025-2026). OWASP's Practical Guide for Secure MCP Server Development is the consolidated community reference and the source for several items in the hardening checklist above.

The pattern across this record is consistent: MCP attacks exploit trust assumptions that are not validated. Tool descriptions are trusted as documentation. Resource content is trusted as data. Configuration is trusted after first approval. Components are trusted as plumbing. Each assumption is a defender's responsibility to validate.

MCP security FAQ

What is MCP security? MCP security is the discipline of protecting AI agents and their integrations from threats introduced by the Model Context Protocol layer. It covers MCP server security, MCP client security, transport security, and runtime guardrails on the actions an agent takes based on MCP-mediated information.

What is the Model Context Protocol? The Model Context Protocol is an open standard, introduced by Anthropic in late 2024 and governed since December 2025 by the Agentic AI Foundation under the Linux Foundation, that lets AI agents discover and use tools, data, and prompts in a uniform way. MCP servers expose capabilities; MCP clients (agents) consume them. Documentation is at modelcontextprotocol.io.

What are the most common MCP security risks? The eleven categories covered in this guide: server impersonation, tool poisoning, indirect prompt injection through tool descriptions, indirect prompt injection through resource content, sampling injection, capability escalation, subcontractor compromise, privilege confusion across tools, resource exhaustion, server-side data exfiltration, and remote code execution through vulnerable MCP components.

Is MCP secure by default? Out of the box, MCP provides authorization, authentication, and basic transport security per the official specification, and the authorization spec has been substantially hardened since mid-2025. It does not, by default, address tool poisoning, indirect prompt injection through tool descriptions or resource content, or supply-chain risks from upstream services. Those require additional controls at the server, client, gateway, and runtime layers.

How does MCP authentication work? MCP authentication follows the official authorization specification, which is built on OAuth 2.1: the MCP server acts as a resource server, clients obtain short-lived scoped tokens from an authorization server, and recent spec revisions added enterprise-managed authorization for organization-wide control. In practice, many teams centralize MCP authentication at a governed gateway so individual agents never hold downstream credentials at all.

Should I use community-published MCP servers in production? Only with explicit security review. The official MCP Registry and community registries are valuable for discovery but carry the same supply-chain risk as any third-party software, as the postmark-mcp incident demonstrated. Pin signed versions, verify the server's source, and apply the hardening checklist before binding production agents to community servers.

What is MCP prompt injection? MCP prompt injection is indirect prompt injection that arrives through the MCP layer: in tool descriptions, in resource content, or in tool results returned to the agent. It is one of the most-cited categories of MCP attack and overlaps with OWASP LLM01:2025 Prompt Injection and ASI01:2026 Agent Goal Hijack. For the broader walkthrough, see our What is Prompt Injection guide.

How do I pentest an MCP server? A working MCP pentest covers: authentication and authorization (can scope be bypassed?), tool definition integrity (can descriptions be tampered with?), resource content classification (does the server treat its own content as untrusted?), sandboxing (can a tool escape its execution environment?), dependency hygiene (are the MCP components themselves vulnerable?), and supply chain (what upstream services does the server depend on?). The OWASP Agentic Security Initiative's MCP server guide provides a more detailed test plan, and continuous adversarial testing should follow the pre-launch pentest.

Are MCP guardrails the same as AI agent guardrails? Closely related but distinct. AI agent guardrails are the broader runtime control layer that governs what an agent does. MCP guardrails are the subset that operates at the MCP boundary specifically: validating tool definitions, classifying resource content, enforcing scope on tool calls, and rate-limiting MCP traffic. A governed MCP gateway is the cleanest place to enforce them.

What is "agentic MCP"? The phrase "agentic MCP" describes MCP-mediated capabilities designed for autonomous agent workflows: long-running sessions, agent-to-agent handoff, capability discovery, and multi-step orchestration. The agentic-MCP framing is increasingly common in vendor marketing, but the underlying protocol is the same. The security concerns are the ones documented above.
