The 21 Production Cliffs of Wiring Microsoft Copilot Studio to AWS — A Senior Engineer's Field Notes
Shipping Production AI:
I Wired Microsoft Copilot Studio to AWS via MCP.
104 tools. $12/month. Zero write permissions, so even a hallucination can't change anything.
Here's the architecture, the code, and the 21 cliffs I fell off so you don't have to.
It's 2 AM on a Saturday.
PagerDuty is shrieking. A production EC2 instance can't reach its database. Our incident channel has 30 people in it, and the on-call junior engineer — let's call her Sarah, six months out of bootcamp — is staring at the AWS Console with 47 browser tabs open.
She doesn't know which security group, which subnet, which NACL, or which route table to look at first.
She types in our incident channel:
Three of us wake up. We spend the next 90 minutes pulling threads. Eventually we find it: a security group rule got removed by an unrelated Terraform change four hours earlier.
I went back to bed at 4 AM thinking the same thing every senior engineer thinks at 4 AM:
"There has to be a better way."
By Sunday night, there was.
What I built
An AI agent. 104 read-only AWS tools. Two front doors. Junior engineers can use it from inside Microsoft Teams. Senior engineers can use it from Claude Desktop on their laptops. Both talk to the exact same brain.
Now when Sarah types in Teams:
90 minutes of war room → 4 seconds in Teams.
This isn't a mockup. This is Microsoft Copilot Studio talking to my MCP server right now.
↑ The agent AWSNETOPSMCPSHRN in Copilot Studio with all 104 AWS tools discovered live: whoami, describe_security_group, find_instance, query_flow_logs, simulate_principal_policy, and 99 more. Each one with a description that the AI reads to decide when to call it.
How it works (the senior-engineer version)
Here's the architecture. One Python codebase. Two AI front doors. Same brain.
┌──────────────────────┐ ┌──────────────────────┐
│ 💼 Microsoft Teams │ │ 🖥️ Claude Desktop │
│ (junior engineers) │ │ (senior engineers) │
└──────────┬───────────┘ └──────────┬───────────┘
│ │
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ 🤖 Copilot Studio │ │ 📡 stdio (local) │
│ + Generative AI │ └──────────┬───────────┘
└──────────┬───────────┘ │
│ HTTPS + MCP │
▼ │
┌──────────────────────────────────┐ │
│ ☁️ Azure Container Apps │ │
│ /mcp + /health endpoints │◄───────┘
│ Python · FastMCP · Uvicorn │
└──────────┬───────────────────────┘
│ boto3 + Read-Only IAM
▼
┌──────────────────────────────────┐
│ 🟧 AWS APIs (READ-ONLY) │
│ EC2 · VPC · IAM · CloudTrail │
│ Security Hub · GuardDuty · ... │
└──────────────────────────────────┘
The key insight: Anthropic's Model Context Protocol (MCP) is essentially USB-C for AI. Write the tool server once, and any compatible AI client — Claude Desktop, Microsoft Copilot Studio, GitHub Copilot, Cursor, ChatGPT — can use it.
For Microsoft Copilot Studio specifically, I needed the tools accessible over HTTPS. So I wrapped the local stdio MCP server in a Starlette + Uvicorn ASGI app and deployed it to Azure Container Apps. Total cost: ~$12/month.
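To make the shape of that HTTP wrapper concrete, here is a minimal sketch of a bare ASGI callable, the kind of object Uvicorn serves. It is not the repo's aws_netops_mcp_http.py: it uses only the standard library, the /mcp branch is a placeholder where the real server would hand off to the MCP transport, and the `call` helper is a hypothetical in-process driver used only for the demo.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Bare ASGI callable: answers /health, leaves /mcp as a placeholder."""
    if scope["type"] != "http":
        return  # this sketch only handles plain HTTP requests
    if scope["path"] == "/health":
        status, payload = 200, {"status": "ok"}
    elif scope["path"] == "/mcp":
        # Placeholder: the real server passes this path to the MCP transport.
        status, payload = 501, {"error": "MCP transport not wired in this sketch"}
    else:
        status, payload = 404, {"error": "not found"}
    body = json.dumps(payload).encode()
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


def call(path):
    """Drive the app in-process, without a server; return (status, parsed JSON)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], json.loads(sent[1]["body"])


print(call("/health"))
```

Keeping /health separate from /mcp gives the platform and the deploy script something cheap to poll without touching the MCP session machinery.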
Each tool ships with a description the AI reads to decide when to call it.
↑ The Copilot Studio Tools tab. Server, Connection, and Available-to fields all green. The right column is the killer feature — every tool's docstring becomes a prompt-aware description. The AI reads these descriptions to decide which tool to call. Write a clear docstring → get a smart agent.
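The docstring-to-description idea can be sketched in a few lines of plain Python. This is an illustration, not the repo's code: the `tool` decorator and `TOOLS` registry are hypothetical stand-ins for what an MCP framework does when it registers a function, and the signature and body of `describe_security_group` are assumptions made for the example.

```python
import inspect

TOOLS = {}  # hypothetical registry: tool name -> metadata


def tool(func):
    """Register func and use its docstring as the description the model sees."""
    TOOLS[func.__name__] = {
        "description": inspect.getdoc(func) or "",
        "parameters": list(inspect.signature(func).parameters),
        "call": func,
    }
    return func


@tool
def describe_security_group(group_id: str) -> dict:
    """Return the inbound and outbound rules of one security group.

    Use this when a question names a specific sg-... ID or asks why
    traffic to an instance is blocked.
    """
    # Placeholder body: a real tool would query the AWS API here.
    return {"group_id": group_id, "rules": []}


for name, meta in TOOLS.items():
    print(name, meta["parameters"])
    print(meta["description"])
```

The second paragraph of the docstring is the part that earns its keep: it tells the model when to pick this tool, not just what it returns.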
The 21 cliffs I fell off (and how I climbed back up)
Here's the part where most engineering blog posts say "and it just worked!" 🤥
It absolutely did not just work. I hit 21 distinct issues. Each one cost me 15-180 minutes. The codebase is now battle-hardened against every single one of them — and the GitHub repo includes a PreFlight-Check.ps1 script that catches all 21 before you spend 10 minutes deploying.
A few highlights:
🎯 The 3-hour boss fight: "Invalid Host header"
MCP 1.27+ added DNS rebinding protection that only allows localhost in the Host header. Azure's load balancer sends the public FQDN. Result: every request rejected. The fix is 3 lines of Python — but you only know which 3 lines after spending 3 hours instrumenting the running container with the Azure Portal Console (which most engineers don't even know exists).
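The failure mode is easier to see with a toy version of a Host-header allow-list. The sketch below is not the MCP SDK's implementation; `hostname_of`, `is_host_allowed`, and the example FQDN are hypothetical names chosen for illustration. It only shows why a localhost-only list rejects every request that arrives through a public load balancer.

```python
LOCAL_ONLY = {"localhost", "127.0.0.1", "[::1]"}

# Hypothetical public hostname, standing in for the Container App's FQDN.
PUBLIC_FQDN = "my-mcp-app.example.azurecontainerapps.io"


def hostname_of(host_header):
    """Strip an optional :port from a Host header value."""
    host = host_header.strip().lower()
    if host.startswith("["):  # IPv6 literal such as [::1]:8000
        return host.split("]", 1)[0] + "]"
    return host.split(":", 1)[0]


def is_host_allowed(host_header, allowed_hosts):
    """Return True when the request's Host header is on the allow-list."""
    return hostname_of(host_header) in allowed_hosts


print(is_host_allowed("localhost:8000", LOCAL_ONLY))             # local dev
print(is_host_allowed(PUBLIC_FQDN, LOCAL_ONLY))                  # behind the load balancer
print(is_host_allowed(PUBLIC_FQDN, LOCAL_ONLY | {PUBLIC_FQDN}))  # with the FQDN added
```

The last line shows the general shape of a narrower remedy, adding the public hostname to the list; whether that option is available depends on the SDK version in use.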
⚠️ The silent killer: PowerShell paste duplication
Right-click paste in PowerShell sometimes pastes your AWS access key twice. Your 20-char key becomes 40 characters. AWS rejects it with InvalidClientTokenId. You delete and recreate the key three times before realizing the problem isn't the key. Nothing in the error message points at the paste, which is why it is so easy to lose hours to it.
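A doubled paste is mechanical enough to detect in code. The following is a small sketch, not the repo's PreFlight-Check.ps1; the function names are hypothetical, and the sample value is the documentation-style example key that appears later in this post.

```python
def looks_pasted_twice(value):
    """True when value is one string repeated twice back to back."""
    half, remainder = divmod(len(value), 2)
    return remainder == 0 and half > 0 and value[:half] == value[half:]


def check_access_key_id(value):
    """Return a list of problems; an empty list means the key ID looks sane."""
    problems = []
    if looks_pasted_twice(value):
        problems.append("value appears to be pasted twice")
    if len(value) != 20:
        problems.append(f"length is {len(value)}, expected 20")
    if not value.startswith("AKIA"):
        problems.append("does not start with AKIA")
    return problems


sample = "AKIAIOSFODNN7EXAMPLE"
print(check_access_key_id(sample))      # clean paste
print(check_access_key_id(sample * 2))  # doubled paste
```

Checking for the repeat explicitly gives a far more useful message than a bare length mismatch.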
🐳 The buildpack hijack
If you put your Dockerfile in a subfolder (organized!), Azure Cloud Build silently ignores it and falls back to its Python buildpack, which generates a broken gunicorn application:app command that doesn't match anything in your code. No errors. No warnings. You just spend 45 minutes wondering why your perfectly fine container won't start. The fix: keep the Dockerfile at the repo root.
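A check for this one is short. The sketch below is an assumption about how such a pre-flight test could look, written in Python for illustration; `find_dockerfiles` is a hypothetical helper, and the demo builds a throwaway directory tree instead of touching a real repo.

```python
import tempfile
from pathlib import Path


def find_dockerfiles(repo_root):
    """Return (at_root, nested): is there a root Dockerfile, and which ones are nested."""
    root = Path(repo_root)
    at_root = (root / "Dockerfile").is_file()
    nested = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("Dockerfile")
        if p.is_file() and p.parent != root
    )
    return at_root, nested


# Demo on a temporary tree that reproduces the mistake: Dockerfile in a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "docker").mkdir()
    (Path(tmp) / "docker" / "Dockerfile").write_text("FROM python:3.12-slim\n")
    demo_result = find_dockerfiles(tmp)

print(demo_result)
```

A result with no root Dockerfile but at least one nested one is exactly the layout that triggers the silent buildpack fallback described above.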
I documented every single fix — root cause + solution + how to detect it — in THE-21-FIXES-EXPLAINED.md in the repo. It's the post-mortem document I wish someone had handed me on Friday night.
📚 Lessons Learned — All 21 Fixes
Here's the full list. Bookmark this section if you're building anything similar — every single one of these cost me real time and is now solved in code.
🐍 Category 1 — Python Wrapper Fixes (7 fixes)
All in aws_netops_mcp_http.py. Without these, the container starts but Copilot Studio can't talk to it.
🐳 Category 2 — Build & Packaging Fixes (2 fixes)
🚀 Category 3 — PowerShell & Azure CLI Fixes (6 fixes)
🔍 Category 4 — Debugging Methodology (3 fixes)
🎯 Category 5 — Copilot Studio Configuration (3 fixes)
FIX #1 — DNS Rebinding Protection
This was the boss fight. Everything else is prevention or polish.
This one fix is what makes MCP fundamentally work behind any cloud load balancer.
.enable_dns_rebinding_protection = False
📊 Before vs After Applying All 21 Fixes
All 21 fixes are now baked into the deployment script. A fresh user can extract the package, set their AWS keys, and have a working MCP integration with Copilot Studio in 10 minutes on the first try. No "Invalid Host header." No paste duplication. No buildpack hijacks. Just a working AI agent talking to AWS.
"But what about security?"
When I first showed this to my CISO, his first question was: "What if the AI hallucinates and deletes prod?"
My answer: It physically cannot.
The agent can investigate. It cannot modify. That guarantee is enforced by IAM at the AWS API level, outside the model's reach. It is not a soft "the prompt told it not to" guarantee.
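To see why a write call fails no matter what the model asks for, here is a deliberately simplified model of IAM policy evaluation. It is a sketch under stated assumptions: `POLICY` is a hypothetical, heavily trimmed stand-in (the real AWS-managed ReadOnlyAccess policy is far larger), and `is_allowed` ignores resources and conditions. It keeps only the core rule: explicit Deny wins, then Allow, otherwise the request is implicitly denied.

```python
from fnmatch import fnmatchcase

# Hypothetical, heavily trimmed stand-in for a read-only policy.
POLICY = [
    {"Effect": "Allow", "Action": ["ec2:Describe*", "iam:Get*", "iam:List*"]},
]


def is_allowed(action, statements):
    """Simplified evaluation: explicit Deny wins, then Allow, else implicit deny."""
    action = action.lower()  # action names are matched case-insensitively
    allowed = False
    for statement in statements:
        matches = any(
            fnmatchcase(action, pattern.lower()) for pattern in statement["Action"]
        )
        if not matches:
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


print(is_allowed("ec2:DescribeSecurityGroups", POLICY))      # read call
print(is_allowed("ec2:RevokeSecurityGroupIngress", POLICY))  # write call
```

The write call is never matched by an Allow statement, so it is refused before it reaches any resource, regardless of how the prompt was worded.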
This is what 8 years of security engineering teaches you: the only defense you can trust is the one the attacker can't politely talk their way around.
Download the complete codebase
Production-tested. MIT licensed. 21 cliffs documented.
Battle-hardened deployment scripts. Works first try.
🛠️ Want to Deploy It Yourself? Step-by-Step Guide for Beginners
If you've never done anything like this before, don't worry. I'll walk you through every single step — from creating accounts to seeing the AI agent answer your first question. Total time: ~45 minutes for a complete beginner, ~15 minutes if you've done some cloud work before.
📋 What you'll need before starting
🟧 Set Up AWS (Read-Only IAM User)
First, we'll create an AWS user with read-only access. This is the user the AI will impersonate when investigating your AWS environment. It cannot modify anything — by design.
Step 1.1 — Sign in to AWS Console
Go to https://console.aws.amazon.com and sign in. If you don't have an account, click "Create a new AWS account" (free tier is plenty).
Step 1.2 — Create the IAM user
- Type "IAM" in the top search bar → click the IAM service
- Left menu: Users → Create user
- User name: netops-mcp
- UNCHECK "Provide user access to AWS Management Console" (we only need API access)
- Click Next
Step 1.3 — Attach the ReadOnly policy
- Choose "Attach policies directly"
- In the search box, type: ReadOnlyAccess
- Check the box next to the AWS-managed ReadOnlyAccess policy
- Click Next → Create user
Step 1.4 — Create the access key
- Click on your new netops-mcp user
- Go to the Security credentials tab → scroll to Access keys
- Click Create access key
- Use case: Command Line Interface (CLI) → check the confirmation box → Next
- Description tag: aws-netops-mcp → Create access key
- Click Download .csv file before leaving the page (the secret is shown only once)
Step 1.5 — Open the CSV in Notepad (this saves you hours)
Open the downloaded CSV in Notepad (NOT Excel — Excel mangles the values). You'll see two lines:
Access key ID,Secret access key
AKIAIOSFODNN7EXAMPLE,wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Verify in Notepad:
- Access Key ID is exactly 20 characters, starts with AKIA
- Secret Access Key is exactly 40 characters
Keep Notepad open. You'll copy values from here in later steps. Why Notepad? Because copying from Excel or browser windows can introduce invisible duplicate characters that cause the famous InvalidClientTokenId error. (This is FIX #10 in the lessons learned.)
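If you would rather not count characters by eye, the same check can be scripted. This is a minimal sketch that assumes the two-line CSV layout shown above; `read_key_csv` is a hypothetical helper, and the sample text reuses the example values from this step, not real credentials.

```python
import csv
import io


def read_key_csv(text):
    """Parse the two-line credentials CSV and return (key_id, secret)."""
    text = text.lstrip("\ufeff")  # tolerate a byte-order mark at the start
    rows = list(csv.DictReader(io.StringIO(text)))
    if len(rows) != 1:
        raise ValueError(f"expected exactly one data row, found {len(rows)}")
    row = rows[0]
    return row["Access key ID"].strip(), row["Secret access key"].strip()


SAMPLE_CSV = (
    "Access key ID,Secret access key\n"
    "AKIAIOSFODNN7EXAMPLE,wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\n"
)

key_id, secret = read_key_csv(SAMPLE_CSV)
print(len(key_id), len(secret), key_id[:4])
```

To run it against your own file, read the file's text and pass it in; the function never prints the secret itself, only lengths and the key prefix.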
🖥️ Connect to Claude Desktop (The Easy Win)
This gives you the AI agent on your laptop. You'll have it working in 10 minutes. The Copilot Studio integration (for your team in Teams) comes later.
Step 2.1 — Install Python 3.10 or newer
- Windows: Download from python.org — during install, check "Add Python to PATH"
- Mac: Run brew install python@3.12
- Linux: Already installed; verify with python3 --version
Open a fresh terminal and verify:
python --version
pip --version
Step 2.2 — Download the AWS NetOps MCP repo
git clone https://github.com/PowerofAutomation2026/aws-netops-mcp.git
cd aws-netops-mcp
pip install -r requirements.txt
Don't have Git? Download the ZIP from the GitHub page and unzip it.
Step 2.3 — Install AWS CLI
- Windows: winget install Amazon.AWSCLI
- Mac: brew install awscli
- Linux: Follow the official guide
Verify: aws --version
Step 2.4 — Configure your AWS profile
Open Notepad with your CSV from Step 1.5, then in your terminal:
aws configure --profile netops
It will ask 4 questions. For each, follow this exact ritual:
- In Notepad, double-click the value to select (don't drag-select)
- Press Ctrl+C
- Click in the terminal
- Press Ctrl+V (or right-click once) — never twice!
- Press Enter
Test it works:
aws sts get-caller-identity --profile netops
You should see your account ID and the netops-mcp ARN. If you get an error, fix it now before continuing.
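The identity check can also be automated by parsing the JSON that the command prints. The sketch below assumes the default JSON output format with `Account` and `Arn` fields; `identity_matches` is a hypothetical helper, and the sample output uses a placeholder account number.

```python
import json


def identity_matches(cli_output, expected_user="netops-mcp"):
    """Check that `aws sts get-caller-identity` output names the expected IAM user."""
    identity = json.loads(cli_output)
    arn = identity["Arn"]
    return arn.endswith(f":user/{expected_user}"), identity["Account"], arn


# Placeholder output in the shape the CLI prints; the IDs are made up.
SAMPLE_OUTPUT = """
{
    "UserId": "AIDAEXAMPLEUSERID",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/netops-mcp"
}
"""

ok, account, arn = identity_matches(SAMPLE_OUTPUT)
print(ok, account, arn)
```

If the ARN ends in a different user name, the profile is pointing at the wrong credentials, which is worth catching before any deployment step.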
Step 2.5 — Install Claude Desktop
Download from https://claude.ai/download and sign in with your Anthropic account.
Step 2.6 — Add the MCP server to Claude Desktop's config
Find the config file:
If the file doesn't exist, create it. Open it in Notepad and paste:
{
"mcpServers": {
"aws-netops": {
"command": "python",
"args": ["C:\\path\\to\\aws-netops-mcp\\aws_netops_mcp.py"],
"env": {
"AWS_PROFILE": "netops",
"AWS_DEFAULT_REGION": "us-east-1"
}
}
}
}
Replace the path with where you cloned the repo. Use double backslashes on Windows.
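Hand-escaping backslashes is an easy place to slip, so one option is to let Python's json module do it. This is a sketch, not part of the repo: `build_config` is a hypothetical helper that reproduces the structure shown above, and the path is the same placeholder used in this step.

```python
import json
from pathlib import PureWindowsPath


def build_config(script_path, profile="netops", region="us-east-1"):
    """Build the Claude Desktop MCP config as a plain dict."""
    return {
        "mcpServers": {
            "aws-netops": {
                "command": "python",
                "args": [str(script_path)],
                "env": {"AWS_PROFILE": profile, "AWS_DEFAULT_REGION": region},
            }
        }
    }


# Placeholder path from the step above; replace it with your clone location.
script = PureWindowsPath(r"C:\path\to\aws-netops-mcp") / "aws_netops_mcp.py"
config_text = json.dumps(build_config(script), indent=2)
print(config_text)
```

The printed text already contains the doubled backslashes, so it can be pasted into the config file as-is.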
Step 2.7 — Restart Claude Desktop and test
Fully quit Claude Desktop (right-click system tray icon → Quit), then reopen. Click the 🔌 plug icon in the chat input. You should see "aws-netops" with 104 tools.
Test it: type into Claude Desktop:
✅ You should get back your AWS account ID and the netops-mcp ARN. If yes, you've got a working AI agent on your laptop.
☁️ Set Up Azure (For Microsoft Teams Access)
If you only need the agent on your laptop, you can stop after Part 2. Continue if you want the agent in Microsoft Teams for your whole team.
Step 3.1 — Sign up for Azure (if you haven't)
Go to azure.microsoft.com/free. The free tier gives you $200 credit for 30 days.
Step 3.2 — Install Azure CLI
- Windows: winget install Microsoft.AzureCLI
- Mac: brew install azure-cli
- Linux (Debian/Ubuntu): curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
Verify: az --version (should print 2.84+)
Step 3.3 — Sign in to Azure
az login
A browser opens. Sign in. Pick your subscription if prompted.
🚀 Deploy to Azure Container Apps
Step 4.1 — Set AWS credentials in PowerShell (the Notepad ritual again)
Open the same CSV in Notepad. Then in PowerShell:
# Type the line, then paste your 20-char key between the quotes:
$env:AWS_ACCESS_KEY_ID = "PASTE-AKIA-KEY-HERE"
# Type the line, then paste your 40-char secret between the quotes:
$env:AWS_SECRET_ACCESS_KEY = "PASTE-SECRET-HERE"
# Region — type the value, no need to paste:
$env:AWS_DEFAULT_REGION = "us-east-1"
For each paste: double-click in Notepad → Ctrl+C → click in PowerShell between the quotes → Ctrl+V once → Enter.
Step 4.2 — MANDATORY: Verify the credentials before deploying
$env:AWS_ACCESS_KEY_ID.Length # MUST be 20
$env:AWS_SECRET_ACCESS_KEY.Length # MUST be 40
$env:AWS_ACCESS_KEY_ID.Substring(0,4) # MUST be AKIA
$env:AWS_DEFAULT_REGION # us-east-1 or your region
If any check fails: the most common reason is paste duplication (Length is 40 instead of 20). Re-paste from Notepad with single Ctrl+V.
Step 4.3 — Run the pre-flight check
cd C:\path\to\aws-netops-mcp
.\copilot-studio\PreFlight-Check.ps1
This validates all 21 fixes are in place. Should complete in ~30 seconds. You want to see ✓ ALL CRITICAL CHECKS PASSED.
Step 4.4 — Run the deploy script
.\copilot-studio\Deploy-AwsNetOpsMcp.ps1
The script does 9 stages automatically:
- Verifies prerequisites (Azure CLI, sign-in, AWS keys)
- Installs Container Apps CLI extension
- Registers Azure resource providers
- Creates resource group rg-aws-netops-mcp
- Creates Container Apps environment (~2 min)
- Detects and cleans zombie containers
- Builds and deploys Docker image (~5 min)
- Injects AWS credentials as env vars
- Tests /health and /mcp endpoints
Total time: ~8-10 minutes. The script prints your MCP endpoint URL at the end. Copy it.
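If you want to repeat the final endpoint test yourself later, a small probe is enough. This is a sketch and not the deploy script's own check: `probe` and `smoke_test` are hypothetical helpers, and the `opener` parameter exists only so the function can be exercised without a network. Only /health is assumed to answer 200 to a plain GET; what /mcp returns to a bare GET depends on the MCP transport, so the sketch just reports the status code.

```python
import urllib.error
import urllib.request


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with a 2xx
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def smoke_test(base_url, opener=urllib.request.urlopen):
    """Probe /health and /mcp under base_url and return {path: status}."""
    base = base_url.rstrip("/")
    return {path: probe(base + path, opener) for path in ("/health", "/mcp")}


# Usage (needs network access; pass the app's base URL without the /mcp suffix):
# print(smoke_test("https://<your-app-fqdn>"))
```

A None for either path means the request never reached the container, which points at DNS, ingress, or a container that is not running, not at the MCP layer.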
💼 Connect Microsoft Copilot Studio
Step 5.1 — Open Copilot Studio
Go to copilotstudio.microsoft.com and sign in with your work account.
Step 5.2 — Create a new agent
- Click Create at the top → New agent
- Name: AWS NetOps Assistant
- Description: Read-only AWS troubleshooting agent for the network team
- For instructions, paste:
You are an AWS network troubleshooting assistant for engineers.
When asked about an AWS resource:
1. ALWAYS call get_path_trace_methodology first if the question is about reachability
2. Use whoami to confirm AWS account context
3. Prefer specific tools over generic ones
4. Cite the exact resource IDs (sg-..., vpc-..., i-...) in your answer
5. If you find a problem, suggest the AWS CLI command to fix it
6. Never claim to have made changes — you are read-only
Step 5.3 — CRITICAL: Switch orchestration to Generative
- Open Settings (top right)
- Go to Generative AI
- Set Orchestration to Generative (NOT Classic)
Step 5.4 — Add the MCP tool
- Click the Tools tab
- Click + Add a tool → New tool → Model Context Protocol
- Fill in:
- Click Create
- Wait ~5 seconds — you should see "Discovered 104 tools" ✅
Step 5.5 — Test it
In the right-side test panel, type:
✅ The agent should reply with your AWS account ID and the netops-mcp ARN — same answer Claude Desktop gives you.
Step 5.6 — Publish to Microsoft Teams
- Click Publish (top right)
- Go to the Channels tab
- Click the Microsoft Teams tile → Add channel
- Copilot Studio gives you a deep-link to install in Teams. Send it to your team.
🎉 You did it!
You now have an AI agent that can investigate your AWS environment from both Claude Desktop on your laptop and Microsoft Teams for your whole team.
Total time: ~45 minutes. Total cost: ~$12/month.
Tribal knowledge: democratized.
🚨 If something breaks
All 21 fixes are documented in detail in the repo at THE-21-FIXES-EXPLAINED.md.
Why I'm sharing all of this
Honest answer? Three reasons.
1. The next on-call engineer shouldn't have to wake up at 2 AM for problems an AI can solve in 4 seconds. Tribal knowledge dies with people. Code lives.
2. The MCP ecosystem is 12 months old. The patterns are still being figured out. If this saves one team a weekend of debugging, the post earned its keep.
3. I'm a senior engineer who likes shipping things — and I'm always open to talking with teams building at the intersection of AI, security, and cloud infrastructure. If that's you, my contact info is below.
What this project demonstrates about my engineering
🧠 Technical depth
🛠️ Engineering rigor
What's next
The MCP pattern extends to almost anything:
- Azure NetOps MCP — same idea, but Resource Graph + Defender for Cloud
- GCP NetOps MCP — for the multi-cloud teams
- Datadog/Splunk MCP — let the agent query observability data
- ServiceNow/Jira MCP — close the loop from incident detection to ticket resolution
Write the tools once. Use them in every AI client. That's the real promise of MCP — and we're still in the first inning.
If this project ever saves you 3 hours of 2 AM debugging,
please ⭐ the repo so the next on-call engineer can find it.
Made with ☕, 🐍, and a 2 AM PagerDuty alert.
Tags: #AI #MCP #ModelContextProtocol #Anthropic #Claude #CopilotStudio
#Azure #AzureContainerApps #AWS #SecurityEngineering #DevOps
#Cisco #Microsoft #InfrastructureAsCode #SRE #IncidentResponse #OnCall
