The first version of our internal MCP server ran on a developer's laptop. It worked beautifully, until they took a Friday off and the FinOps Slack channel filled with "MCP server unavailable" complaints by 11am.
We moved it to Azure Container Apps with workload identity federation. Same protocol, same code, no keys, autoscale to zero when no one's asking. This is the Bicep, the Dockerfile, and the three things that broke before it stuck.
Why Container Apps over AKS, Functions, or App Service
For a stdio-mode MCP server fronted by an SSE-over-HTTP shim, Container Apps wins on three axes:
- Scale-to-zero with HTTP-based scaling. Most internal MCP servers see traffic in bursts.
- Workload identity built in: az containerapp identity assign --system-assigned and you're done.
- No node pool to babysit like AKS, no cold-start tax like the Functions consumption tier.
App Service is fine but doesn't scale to zero on the cheaper SKUs and over-charges for low-traffic workloads.
The Dockerfile
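# Build stage: install all dependencies and compile the TypeScript sources.
FROM node:22-alpine AS build
WORKDIR /src
COPY package*.json tsconfig.json ./
RUN npm ci
COPY src ./src
# Drop devDependencies so only runtime packages are copied forward.
RUN npm run build && npm prune --omit=dev

# Runtime stage: compiled output and production node_modules only.
FROM node:22-alpine
WORKDIR /app
COPY --from=build /src/node_modules ./node_modules
COPY --from=build /src/dist ./dist
COPY package.json ./
# Non-root user; UID 1000 is the built-in "node" user in this image.
USER 1000
ENV NODE_ENV=production
EXPOSE 8080
CMD ["node", "dist/server.js", "--transport=http"]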
Two non-obvious bits:
- Run as UID 1000. Container Apps doesn't enforce non-root by default, but Defender for Cloud flags it. Cheaper to fix once.
- Multi-stage build drops the image from ~340MB to ~120MB. Pull time on a cold scale-up is the difference between a 4-second cold start and an 11-second one.
The Bicep: workload identity, not a managed identity assignment
param location string = resourceGroup().location
param image string

resource env 'Microsoft.App/managedEnvironments@2024-03-01' = {
  name: 'cae-mcp'
  location: location
  properties: {
    workloadProfiles: [
      { name: 'Consumption', workloadProfileType: 'Consumption' }
    ]
  }
}

resource app 'Microsoft.App/containerApps@2024-03-01' = {
  name: 'ca-mcp-cost'
  location: location
  identity: { type: 'SystemAssigned' }
  properties: {
    managedEnvironmentId: env.id
    workloadProfileName: 'Consumption'
    configuration: {
      ingress: {
        external: false
        targetPort: 8080
        transport: 'auto'
        traffic: [{ latestRevision: true, weight: 100 }]
      }
    }
    template: {
      containers: [
        {
          name: 'mcp'
          image: image
          resources: { cpu: json('0.5'), memory: '1Gi' }
          probes: [
            {
              type: 'Liveness'
              httpGet: { path: '/healthz', port: 8080 }
              initialDelaySeconds: 5
            }
          ]
        }
      ]
      scale: { minReplicas: 0, maxReplicas: 5 }
    }
  }
}
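
// Cost Management Reader at subscription scope.
// Caveat: in a file with the default resourceGroup target scope, Bicep will
// likely reject scope: subscription() here (BCP139). In that case this resource
// probably needs to move into a module with targetScope = 'subscription'.
resource costReader 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(subscription().id, app.id, 'cost-reader')
  scope: subscription()
  properties: {
    roleDefinitionId: subscriptionResourceId(
      'Microsoft.Authorization/roleDefinitions',
      '72fafb9e-0641-4937-9268-a91bfd8191a3') // Cost Management Reader
    principalId: app.identity.principalId
    principalType: 'ServicePrincipal'
  }
}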
Setting external: false gives internal-only ingress on the managed environment's private IP. Clients inside the corporate network reach the MCP server over a private endpoint (or VNet integration). Internet exposure on an MCP server is almost never what you want.
The HTTP shim: stdio doesn't fit serverless
The official MCP transports are stdio and SSE. Container Apps wants HTTP. The trick is running an SSE transport over HTTP; most SDKs ship one:
import express from "express";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";

// mcpServer is the MCP server instance (with its tool definitions) created elsewhere.
const app = express();
const transports = new Map<string, SSEServerTransport>();

// GET /sse opens the long-lived event stream and registers the session.
app.get("/sse", async (req, res) => {
  const transport = new SSEServerTransport("/messages", res);
  transports.set(transport.sessionId, transport);
  res.on("close", () => transports.delete(transport.sessionId));
  await mcpServer.connect(transport);
});

// POST /messages routes a client message to the transport that owns the session.
app.post("/messages", express.json(), async (req, res) => {
  const id = req.query.sessionId as string;
  const transport = transports.get(id);
  if (!transport) return res.status(404).end();
  // express.json() has already consumed the request stream, so hand over the parsed body.
  await transport.handlePostMessage(req, res, req.body);
});

app.get("/healthz", (_req, res) => res.status(200).send("ok"));
app.listen(8080);
Two endpoints: GET /sse opens the long-lived event stream, and POST /messages is how the client sends tool calls. Sessions are keyed on a UUID minted by the SSE transport.
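The session bookkeeping can be pulled out of the route handlers and looked at on its own. The following is a minimal sketch, not the SDK's implementation: SessionTransport and SessionRegistry are hypothetical names, and the 202/404 status codes are assumptions about what the handler should send.

```typescript
// Hypothetical stand-in for the SDK transport: anything that can accept a message.
interface SessionTransport {
  handle(message: unknown): void;
}

class SessionRegistry<T extends SessionTransport> {
  private readonly sessions = new Map<string, T>();

  // Call from the GET /sse handler once the transport has minted its session id.
  // The returned function is meant to be wired to the response's "close" event.
  register(sessionId: string, transport: T): () => void {
    this.sessions.set(sessionId, transport);
    return () => {
      this.sessions.delete(sessionId);
    };
  }

  // Call from POST /messages. Returns the HTTP status the handler should send:
  // 404 for a missing or unknown session, 202 once the message is handed over.
  dispatch(sessionId: string | undefined, message: unknown): 202 | 404 {
    const transport = sessionId ? this.sessions.get(sessionId) : undefined;
    if (!transport) return 404;
    transport.handle(message);
    return 202;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

Because the map lives in process memory, a POST only finds its session on the replica that holds the matching stream. With maxReplicas above 1, something like session affinity on the ingress is probably needed; that is not part of the Bicep shown earlier.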
What broke first
Container Apps closes idle HTTP connections at 4 minutes. SSE connections look idle to the load balancer between events. Set additionalPortMappings with a longer timeout, or, easier, emit a : keepalive\n\n SSE comment every 25 seconds. The official SDK's SSE transport doesn't do this; you wrap it.
// Inside the GET /sse handler, where res is the open SSE response.
const keepalive = setInterval(() => {
  res.write(": keepalive\n\n");
}, 25_000);
res.on("close", () => clearInterval(keepalive));
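If the same wrapper is needed in more than one handler, it can be factored into a helper. This is a sketch under assumptions: attachKeepalive, KeepaliveTarget and Timers are hypothetical names, and the timer functions are injectable only so the behavior can be checked without waiting 25 seconds.

```typescript
// The only parts of an HTTP response this helper touches.
interface KeepaliveTarget {
  write(chunk: string): unknown;
  on(event: "close", listener: () => void): unknown;
}

// Injectable timer functions; defaults to the real ones.
interface Timers {
  setInterval(fn: () => void, ms: number): unknown;
  clearInterval(handle: unknown): void;
}

const realTimers: Timers = {
  setInterval: (fn, ms) => setInterval(fn, ms),
  clearInterval: (handle) =>
    clearInterval(handle as ReturnType<typeof setInterval>),
};

// Writes an SSE comment line on a fixed interval until the response closes.
// Returns a stop function for callers that want to cancel early.
function attachKeepalive(
  res: KeepaliveTarget,
  intervalMs: number = 25_000,
  timers: Timers = realTimers,
): () => void {
  const handle = timers.setInterval(() => {
    // Lines starting with ":" are SSE comments, so clients ignore them.
    res.write(": keepalive\n\n");
  }, intervalMs);

  let stopped = false;
  const stop = () => {
    if (stopped) return; // clear the timer only once
    stopped = true;
    timers.clearInterval(handle);
  };
  res.on("close", stop);
  return stop;
}
```

In the /sse handler this would be called as attachKeepalive(res) right after the transport is created.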
Cold start on the first request after scale-to-zero was 9 seconds. Most of it was Node startup + DefaultAzureCredential token acquisition. Fix: bake a 0.25-CPU "always-on" replica with minReplicas: 1 for prod, accept the $7/month. Dev environments stay at zero.
DefaultAzureCredential tried IMDS, MSAL, and AzureCLI in order. First call took ~1.4s. Pin to the actual credential type once you know it:
import { ManagedIdentityCredential } from "@azure/identity";
const credential = new ManagedIdentityCredential();
That dropped first-call latency by 800ms.
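Pinning the credential unconditionally breaks local development, where there is no managed identity. One way to keep both working is to choose at startup. The sketch below assumes the platform injects an IDENTITY_ENDPOINT environment variable when a managed identity is attached; pickCredentialKind and CredentialKind are hypothetical names.

```typescript
type CredentialKind = "managed-identity" | "default-chain";

// Assumption: IDENTITY_ENDPOINT is present only when the container runs
// with a managed identity attached. Anything else falls back to the
// full DefaultAzureCredential chain (for example, a laptop with az login).
function pickCredentialKind(
  env: Record<string, string | undefined>,
): CredentialKind {
  return env.IDENTITY_ENDPOINT ? "managed-identity" : "default-chain";
}

// Usage with @azure/identity (not imported here to keep the sketch standalone):
//
//   const credential =
//     pickCredentialKind(process.env) === "managed-identity"
//       ? new ManagedIdentityCredential()
//       : new DefaultAzureCredential();
```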
What I'd do differently
Front it with Application Gateway instead of Front Door. Front Door is overkill for an internal MCP server; AGW with a private listener gets you WAF + private IP + a stable DNS name without a global edge. Cost halved.
I would NOT host one MCP server per tool category. We tried that early, with cost-mcp, boards-mcp, and monitor-mcp as separate Container Apps. Total complexity went up faster than the modularity helped. One MCP server, multiple tool definitions, a single revision pipeline. Split if it grows beyond ~30 tools.
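A single server can still keep categories visible by prefixing tool names. The following is an illustrative sketch only: ToolCatalog, ToolDefinition, the underscore naming scheme, and the example tool names are all hypothetical, and the threshold of 30 simply mirrors the rule of thumb above.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => unknown;
}

class ToolCatalog {
  private readonly tools = new Map<string, ToolDefinition>();
  private readonly splitThreshold: number;

  constructor(splitThreshold: number = 30) {
    this.splitThreshold = splitThreshold;
  }

  // Registers a tool under "<category>_<name>", e.g. "cost_query".
  register(category: string, tool: ToolDefinition): string {
    const fullName = `${category}_${tool.name}`;
    if (this.tools.has(fullName)) {
      throw new Error(`duplicate tool: ${fullName}`);
    }
    this.tools.set(fullName, { ...tool, name: fullName });
    return fullName;
  }

  // Number of tools per category prefix, useful when deciding what to split out.
  countByCategory(): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const name of this.tools.keys()) {
      const category = name.slice(0, name.indexOf("_"));
      counts[category] = (counts[category] ?? 0) + 1;
    }
    return counts;
  }

  // True once the catalog has outgrown the single-server rule of thumb.
  shouldConsiderSplit(): boolean {
    return this.tools.size > this.splitThreshold;
  }
}
```

Each registered definition would then be handed to the one MCP server instance, so all categories ship through the same revision pipeline.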
