AWS Just Published an MCP Strategy Guide. Here Is What Actually Matters.


AWS quietly dropped a prescriptive guidance document on MCP strategies this month. If you have spent any time with MCP servers, you know the protocol itself is straightforward. The hard part is everything around it: how many tools to expose, where to host the servers, how to stop an agent from deleting your production database with inherited admin credentials.

This guide covers all three. Most of it is what you would expect from AWS documentation. But buried in the tool design and governance sections are insights backed by actual research that are worth pulling out.

Here is what I found genuinely useful, and what you can skip.

The Goldilocks Problem with Tool Counts

The guide opens with a deceptively simple observation: too few tools and your agent hallucinates because it cannot find the right context. Too many tools and it gets confused about selection and sequencing, leading to different hallucinations. The goal is to get the number just right.

Everyone building with MCP has felt this. You start with three tools and the agent works beautifully. You add twelve more because they seem useful, and suddenly the agent is calling the wrong tool half the time and burning tokens on tool definitions it never uses.

AWS cites research showing that agents using structured tool wrappers are roughly 3x more accurate on database tasks than agents accessing raw APIs directly (Middleware for LLMs, ACL Anthology 2024). That is not a small number. The implication is clear: do not just expose your API endpoints as MCP tools. Wrap them in workflow-scoped abstractions that match how the agent actually needs to use them.

The practical takeaway: if you have 20 tools, you are spending 5,000 to 10,000 tokens per invocation on tool descriptions alone. That is latency, cost, and degraded accuracy compounding on every single call. Scope your tools to workflows, not endpoints.
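To make the tax concrete, here is a rough back-of-the-envelope sketch of what tool definitions cost per invocation, using the common ~4-characters-per-token heuristic. The tool names, descriptions, and the heuristic itself are illustrative assumptions, not figures from the guide.

```python
import json

def tool_definition_tokens(tools: list[dict]) -> int:
    """Approximate tokens consumed by serialised tool schemas."""
    serialised = json.dumps(tools)
    return len(serialised) // 4  # crude heuristic: ~4 chars per token

# Twenty hypothetical tools with realistic-length descriptions.
tools = [
    {
        "name": f"tool_{i}",
        "description": "Does something useful for a specific workflow " * 10,
        "inputSchema": {"type": "object", "properties": {"id": {"type": "string"}}},
    }
    for i in range(20)
]

# This cost is paid on every single agent invocation, used or not.
print(tool_definition_tokens(tools))
```

Halving the registered tool count halves this overhead before the agent has done any work at all.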

Golden Path Abstractions Are the Real Insight

The guide introduces a concept it calls “golden path abstractions” and this is the part I keep coming back to. Instead of letting an agent figure out how to deploy infrastructure by reasoning over raw API calls, you build an MCP tool called deploy_secure_infrastructure that embeds your organisation’s standards directly. The agent does not need to reason about encryption policies, access controls, or compliance patterns. The tool handles it deterministically.

This applies far beyond deployment. Think about any complex, multi-step workflow where getting the sequence wrong has consequences:

  • A process_patient_data tool that validates HIPAA compliance, anonymises PII, and transforms to HL7 FHIR format. The agent calls one tool instead of orchestrating five steps where getting any wrong means a compliance violation.
  • A create_production_database tool that enforces your naming conventions, backup policies, and network configuration. No more hoping the LLM remembers your internal standards.

The principle: if the cost of getting a workflow wrong is high and the correct sequence is known, do not make the agent discover it. Encode it in the tool. Let the LLM decide when to use the tool. Let deterministic code decide how.
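A minimal sketch of what that split looks like in practice. Everything here is hypothetical, not an AWS API: the agent supplies only intent-level parameters, and every organisational standard is applied deterministically inside the tool.

```python
from dataclasses import dataclass, field

@dataclass
class DeploymentSpec:
    app_name: str
    environment: str
    encryption: str = "aes-256"      # org standard, never agent-chosen
    public_access: bool = False      # locked down by default
    tags: dict = field(default_factory=dict)

def deploy_secure_infrastructure(app_name: str, environment: str) -> DeploymentSpec:
    """Golden path tool: the LLM decides *when* to call this;
    deterministic code decides *how* it is done."""
    if environment not in {"dev", "staging", "prod"}:
        raise ValueError(f"unknown environment: {environment}")
    spec = DeploymentSpec(app_name=app_name, environment=environment)
    spec.tags = {"owner": "platform-team", "compliance": "validated"}
    # ...hand the fully validated spec to your real provisioning API here
    return spec

spec = deploy_secure_infrastructure("billing-api", "staging")
print(spec.encryption, spec.public_access)  # standards applied, not reasoned about
```

The agent's surface area shrinks to two parameters; the encryption policy, access posture, and tagging standards are no longer things it can get wrong.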

The Credential Scenario Everyone Should Read

The governance section contains a scenario that should be required reading for anyone building agentic systems with production access.

A user with full admin permissions asks an agent to clone a production database for use in pre-production. To do this, the agent only needs READ and CREATE permissions. But then the LLM hallucinates and believes it needs to clean up the old database as part of the request. If the agent inherited the user’s full credentials, the delete succeeds. Production database gone.

AWS’s recommendation: token isolation. The MCP server should use explicitly scoped, purpose-generated tokens for each downstream call. The user’s credentials should never propagate through the agentic system. If the clone operation gets a token with only READ and CREATE permissions, the hallucinated delete fails safely.

This is not theoretical. Every team running agents with database access should be thinking about this pattern right now. The guide recommends Amazon Bedrock AgentCore Identity for managing both workload tokens (machine-to-machine) and user tokens (delegated access), but the principle applies regardless of your identity provider: scope credentials per-tool, validate audience claims, and never reuse tokens between servers.
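The pattern fits in a few lines. This is a toy sketch of the isolation idea, with an illustrative token format and permission check rather than any real identity provider: each downstream call gets a token scoped to exactly what the workflow needs, so the hallucinated extra action fails safely.

```python
import secrets

def issue_scoped_token(permissions: set[str]) -> dict:
    """Mint a short-lived, purpose-scoped token (never the user's own)."""
    return {"token": secrets.token_hex(16), "scope": frozenset(permissions)}

def call_database(token: dict, action: str) -> str:
    """Downstream call that honours the token's scope, not the user's role."""
    if action not in token["scope"]:
        raise PermissionError(f"{action} not permitted by this token")
    return f"{action}: ok"

# The clone workflow only ever needs READ and CREATE.
clone_token = issue_scoped_token({"READ", "CREATE"})
print(call_database(clone_token, "READ"))    # succeeds
try:
    call_database(clone_token, "DELETE")     # the hallucinated cleanup
except PermissionError:
    print("delete blocked: production database survives")
```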

Three Hosting Models, One Clear Direction

The guide maps out a hosting spectrum that most teams will recognise:

| Model | How it works | Best for |
| --- | --- | --- |
| Local | MCP server runs as a subprocess on the dev machine, communicates via stdio | Individual developer tooling, IDE integrations |
| Remote | MCP server hosted centrally, accessed over HTTPS | Shared team tools, centralised governance |
| Gateway | Single proxy endpoint routing to multiple MCP servers | Enterprise scale, tool discovery, auth orchestration |

Most teams today are running local MCP servers. That is fine for individual productivity. The operational challenge is lifecycle management: you cannot control which version users are running, you cannot enforce security patches, and you cannot audit tool usage.

The gateway model is where this is heading. AgentCore Gateway and Docker MCP Gateway both provide a single endpoint that handles auth, routing, and protocol translation. New tools become available to agents without redeployment. The gateway can perform semantic search across registered tools, which directly addresses the context window problem: the agent does not need all 50 tool definitions loaded, just the ones relevant to the current task.

The gateway pattern is also how MCP governance becomes practical. You cannot audit tool usage, enforce rate limits, or manage credentials centrally when servers are running as subprocesses on developer laptops.
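To illustrate the tool-discovery shape (not any actual gateway implementation): real gateways use semantic embedding search, but even naive keyword overlap shows how a registry of many tools collapses to the handful relevant to the current task. All names below are hypothetical.

```python
def relevant_tools(task: str, registry: dict[str, str], top_k: int = 3) -> list[str]:
    """Return the top_k registered tools whose descriptions best match the task.
    Stand-in for semantic search: scores by simple word overlap."""
    task_words = set(task.lower().split())
    scored = [
        (len(task_words & set(desc.lower().split())), name)
        for name, desc in registry.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]

registry = {
    "clone_database": "clone a production database into another environment",
    "rotate_credentials": "rotate service credentials and secrets",
    "deploy_secure_infrastructure": "deploy infrastructure with security standards",
    "summarise_logs": "summarise application logs for an environment",
}

# Only matching definitions would be loaded into the agent's context.
print(relevant_tools("clone the production database for pre-production", registry))
```

The agent pays the context cost for two or three tool definitions instead of fifty.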

The 28% You Get for Free

One statistic worth highlighting: the guide cites research showing that following governance recommendations improves task accuracy by 28 to 32 percent (MARCO: Multi-Agent Real-time Chat Orchestration, ACL Anthology 2024). Governance is not just a compliance checkbox. Proper tool scoping, credential isolation, and rate limiting make your agents measurably better at their actual job.

The operational metrics section is practical too. AWS recommends tracking tool selection accuracy, output token volume per tool (alarm when a tool exceeds a threshold for context window usage), and building golden datasets for regression testing generated synthetically from historical API invocation logs. That last one is clever: use your production tool-call history to generate test cases, then measure whether agent upgrades or tool changes degrade accuracy.
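The golden-dataset idea reduces to a small loop. This sketch assumes a hypothetical log format and uses a trivial keyword router in place of a real agent, purely to show the shape: each past successful (request, tool) pair becomes a test case, and tool-selection accuracy is measured against it after every upgrade.

```python
def build_golden_dataset(logs: list[dict]) -> list[tuple[str, str]]:
    """Turn successful historical invocations into regression test cases."""
    return [(entry["request"], entry["tool"]) for entry in logs if entry.get("success")]

def tool_selection_accuracy(agent, dataset: list[tuple[str, str]]) -> float:
    """Fraction of golden cases where the agent picks the historically correct tool."""
    correct = sum(1 for request, expected in dataset if agent(request) == expected)
    return correct / len(dataset)

# Illustrative production tool-call history.
logs = [
    {"request": "clone prod db to staging", "tool": "clone_database", "success": True},
    {"request": "deploy the billing service", "tool": "deploy_secure_infrastructure", "success": True},
    {"request": "something that failed", "tool": "clone_database", "success": False},
]
dataset = build_golden_dataset(logs)

# Stand-in agent: a keyword router where a real LLM would sit.
def stub_agent(request: str) -> str:
    return "clone_database" if "clone" in request else "deploy_secure_infrastructure"

print(tool_selection_accuracy(stub_agent, dataset))
```

Run the same dataset before and after a model or tool change; a drop in the score is your regression signal.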

What You Can Skip

The “What is MCP?” section is unnecessary if you have built anything with MCP. The Well-Architected Framework mapping is checkbox compliance content. The hosting section is useful for the comparison but light on implementation detail.

What to Actually Do with This

If you are running MCP servers today, three things from this guide are worth acting on immediately:

  1. Audit your tool count. If you have more than 10 tools registered with any single agent, you are likely paying a context window tax. Scope tools to workflows, not API endpoints.

  2. Implement token isolation. If your MCP servers are using the calling user’s credentials for downstream access, stop. Generate purpose-scoped tokens per tool call with minimum necessary permissions.

  3. Build one golden path tool. Pick your most complex, most error-prone multi-step workflow and encode it as a single MCP tool. Measure the accuracy difference.

The full guide is at docs.aws.amazon.com/prescriptive-guidance/latest/mcp-strategies. Read the governance section first. That is where the real value is.
