I review Bicep PRs. A lot of them. Half my comments before this tool were variants of "did you run what-if?", because the answer was usually no, and the diff would have caught it.
I built a small MCP server that wraps az deployment what-if and az resource show, exposes them as Tools, and lets me ask Copilot in the editor "what would this Bicep change in the prod resource group, and is anything in the live state already drifted from the file?" The answer comes back as a markdown table inline. This is the implementation and the gotchas.
Two tools, that's it
const tools = [
{
name: "bicep_whatif",
description:
"Runs `az deployment group what-if` against a resource group and " +
"returns the structured change list. Use this to preview the effect " +
"of a Bicep file before deploying.",
inputSchema: {
type: "object",
required: ["resourceGroup", "templatePath"],
properties: {
subscriptionId: { type: "string" },
resourceGroup: { type: "string" },
templatePath: { type: "string", description: "absolute or workspace-relative" },
parameterPath: { type: "string", description: "optional .bicepparam or .json" },
},
},
},
{
name: "drift_detect",
description:
"Compares live resource state against a Bicep file. Returns properties " +
"that exist in Azure but not in the template, or differ in value.",
inputSchema: {
type: "object",
required: ["resourceGroup", "templatePath"],
properties: {
resourceGroup: { type: "string" },
templatePath: { type: "string" },
scope: { enum: ["all", "core"], default: "core" },
},
},
},
];
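The handlers further down take `WhatIfArgs` and `DriftArgs`, which the post never shows. A minimal sketch of what they might look like, assuming they simply mirror the two `inputSchema` objects above. `parseWhatIfArgs` and `missingRequired` are hypothetical helper names, not part of any SDK.

```typescript
// Hypothetical argument types, assumed to mirror the inputSchema objects.
interface WhatIfArgs {
  subscriptionId?: string;
  resourceGroup: string;
  templatePath: string;
  parameterPath?: string;
}

interface DriftArgs {
  resourceGroup: string;
  templatePath: string;
  scope?: "all" | "core";
}

// Returns the required keys that are absent or not non-empty strings.
function missingRequired(input: Record<string, unknown>, required: string[]): string[] {
  return required.filter((k) => typeof input[k] !== "string" || input[k] === "");
}

// Narrows untyped tool input to WhatIfArgs, throwing on missing required fields.
function parseWhatIfArgs(input: Record<string, unknown>): WhatIfArgs {
  const missing = missingRequired(input, ["resourceGroup", "templatePath"]);
  if (missing.length > 0) {
    throw new Error(`bicep_whatif: missing required argument(s): ${missing.join(", ")}`);
  }
  const args: WhatIfArgs = {
    resourceGroup: input["resourceGroup"] as string,
    templatePath: input["templatePath"] as string,
  };
  // Optional fields are copied only when they are actually strings.
  const sub = input["subscriptionId"];
  if (typeof sub === "string") args.subscriptionId = sub;
  const params = input["parameterPath"];
  if (typeof params === "string") args.parameterPath = params;
  return args;
}
```

Validating at the boundary means a malformed call from the model fails with a readable message instead of an opaque `az` error.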
Running what-if
import { execFile } from "node:child_process";
import { promisify } from "node:util";
const exec = promisify(execFile);
async function bicepWhatIf(args: WhatIfArgs) {
const cmd = [
"deployment", "group", "what-if",
// subscriptionId is optional in the schema; passing undefined to execFile would throw
...(args.subscriptionId ? ["--subscription", args.subscriptionId] : []),
"--resource-group", args.resourceGroup,
"--template-file", args.templatePath,
"--no-pretty-print",
"--result-format", "FullResourcePayloads",
...(args.parameterPath ? ["--parameters", args.parameterPath] : []),
];
const { stdout } = await exec("az", cmd, {
maxBuffer: 50 * 1024 * 1024, // 50MB; what-if can be huge
env: { ...process.env, AZURE_CORE_OUTPUT: "json" },
});
const payload = JSON.parse(stdout);
return summarise(payload.changes);
}
Passing --result-format FullResourcePayloads explicitly is what keeps the output useful. The other mode, ResourceIdOnly, tells you something changed without telling you what, so the model has to ask follow-up questions to find out.
Summarising the result
The model consumes JSON badly when there are many changes. I summarise to a markdown table before returning:
function summarise(changes: any[]) {
const buckets: Record<string, any[]> = {
Create: [], Modify: [], Delete: [], Deploy: [], NoChange: [], Ignore: [],
};
for (const c of changes) (buckets[c.changeType] ||= []).push(c);
const rows = [
`| Type | Resource | Property changes |`,
`| --- | --- | --- |`,
];
for (const c of [...buckets.Delete, ...buckets.Modify, ...buckets.Create]) {
const props = (c.delta ?? [])
.filter((d: any) => d.propertyChangeType !== "NoEffect")
.map((d: any) => d.path)
.slice(0, 5)
.join(", ");
rows.push(`| ${c.changeType} | \`${shortName(c.resourceId)}\` | ${props || "—"} |`);
}
rows.push("");
rows.push(
`**Counts:** ` +
Object.entries(buckets)
.filter(([, v]) => v.length)
.map(([k, v]) => `${k}: ${v.length}`)
.join(" · ")
);
return { content: [{ type: "text", text: rows.join("\n") }] };
}
Three categories matter: Delete, Modify, Create. NoChange is noise; Ignore is intentional.
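`summarise` calls `shortName`, which isn't shown in the post. A plausible version, assuming its job is to cut a full ARM resource ID down to the part after the last `/providers/` segment so the table stays narrow:

```typescript
// Sketch of the shortName helper used by summarise (assumed behaviour):
// keep everything after the last "/providers/" segment of an ARM resource ID.
function shortName(resourceId: string): string {
  const marker = "/providers/";
  const idx = resourceId.toLowerCase().lastIndexOf(marker);
  if (idx === -1) {
    // No provider segment (e.g. a resource group ID): fall back to the last path segment.
    const parts = resourceId.split("/").filter((p) => p !== "");
    return parts.length > 0 ? parts[parts.length - 1] : resourceId;
  }
  return resourceId.slice(idx + marker.length);
}
```

Using the last `/providers/` match rather than the first keeps extension resources (role assignments, locks) readable, since their IDs contain the marker twice.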
Drift detection: diff what's in Azure against the file
What-if shows what would happen if you deployed the file. Drift shows what someone changed in the portal that isn't reflected in the file. They're different signals, both useful.
async function detectDrift(args: DriftArgs) {
const liveRg = await exec("az", ["resource", "list",
"--resource-group", args.resourceGroup, "--output", "json"]);
const live = JSON.parse(liveRg.stdout);
// ARM-compile the template once
const { stdout: arm } = await exec("bicep", ["build", args.templatePath, "--stdout"]);
const template = JSON.parse(arm);
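// With symbolic-name codegen (languageVersion 2.0), `resources` is an object
// keyed by symbolic name rather than an array, so handle both shapes.
const resources: any[] = Array.isArray(template.resources)
? template.resources
: Object.values(template.resources ?? {});
// Caveat: names built from parameters compile to ARM expressions
// (e.g. "[parameters('name')]") and will not match live resource names.
const declared = new Set(
resources.map((r: any) => `${r.type}/${r.name}`.toLowerCase())
);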
const drifts: { resource: string; reason: string }[] = [];
for (const r of live) {
const key = `${r.type}/${r.name}`.toLowerCase();
if (!declared.has(key)) {
drifts.push({ resource: key, reason: "exists in Azure, missing from template" });
}
}
// (deeper property-level diff is a longer routine — see repo)
return formatDriftTable(drifts);
}
The bicep build --stdout step is what makes this feasible: you can't diff structurally against a .bicep file, but the compiled ARM JSON is a fixed shape.
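The property-level diff lives in the repo and isn't reproduced here. As a rough idea of its shape, here is a sketch of a recursive comparison that reports paths present only in Azure or differing in value. This is my own simplification, not the author's routine: `diffProperties` and `PropertyDrift` are hypothetical names, and unevaluated ARM expressions in the template are skipped rather than resolved.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface PropertyDrift {
  path: string;
  reason: "only in Azure" | "value differs";
}

function isPlainObject(v: Json | undefined): v is { [key: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Walks the live object and compares each property against the declared one.
function diffProperties(declared: Json | undefined, live: Json, path: string = "properties"): PropertyDrift[] {
  if (declared === undefined) return [{ path, reason: "only in Azure" }];
  // "[...]" strings are ARM expressions; they cannot be compared without evaluating them.
  if (typeof declared === "string" && declared.charAt(0) === "[" && declared.charAt(declared.length - 1) === "]") {
    return [];
  }
  if (isPlainObject(declared) && isPlainObject(live)) {
    const out: PropertyDrift[] = [];
    for (const key of Object.keys(live)) {
      const child = diffProperties(declared[key], live[key] as Json, `${path}.${key}`);
      for (const d of child) out.push(d);
    }
    return out;
  }
  // Scalars and arrays: compare by serialised form (sensitive to key order inside arrays).
  return JSON.stringify(declared) === JSON.stringify(live) ? [] : [{ path, reason: "value differs" }];
}
```

Expect noise from a naive version like this: live resources carry server-populated fields such as `provisioningState`, so a real routine needs an ignore list before the output is worth showing to anyone.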
What broke first
The server shells out to az, and az runs as whoever is logged in. When the MCP server runs as a Container App, that's the workload identity, which is fine. When it runs as a stdio process on a developer laptop, it's the az login user. Mixing the two is what turned an "I see drift" call into a 403 error. The fix is to require the identity to be passed explicitly via an env var at server startup, log it on every tool call, and refuse to run if the identity name doesn't match a configured allowlist.
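A sketch of that startup check, under stated assumptions: the env var names `MCP_AZ_IDENTITY` and `MCP_AZ_IDENTITY_ALLOWLIST` are invented for illustration (the post doesn't name them), and the allowlist is taken to be a comma-separated string compared case-insensitively.

```typescript
// Hypothetical env var names; the post does not say what they are called.
const IDENTITY_VAR = "MCP_AZ_IDENTITY";
const ALLOWLIST_VAR = "MCP_AZ_IDENTITY_ALLOWLIST";

// Reads the identity from the environment and refuses to continue
// unless it appears in the comma-separated allowlist.
function resolveIdentity(env: Record<string, string | undefined>): string {
  const identity = (env[IDENTITY_VAR] ?? "").trim();
  if (identity === "") {
    throw new Error(`${IDENTITY_VAR} must be set at server startup`);
  }
  const allowed = (env[ALLOWLIST_VAR] ?? "")
    .split(",")
    .map((s) => s.trim().toLowerCase())
    .filter((s) => s !== "");
  if (allowed.indexOf(identity.toLowerCase()) === -1) {
    throw new Error(`identity "${identity}" is not in ${ALLOWLIST_VAR}`);
  }
  return identity;
}

// Logs the identity on every tool call. Defaults to stderr because a stdio
// MCP server uses stdout for protocol messages.
function logToolCall(identity: string, tool: string, log: (line: string) => void = console.error): void {
  log(`[mcp] tool=${tool} identity=${identity}`);
}
```

At startup this would be called as `resolveIdentity(process.env)`, with the result kept around and passed to `logToolCall` from each handler.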
What-if against a 200-resource group took 90 seconds. The model would time out. I added a tool argument scope: "all" | "core" where core filters to networking + compute + identity resource types. Most "did this PR break something?" questions don't need to wait on the answer for storage and tags. Default to core, opt into all.
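The `core` filter itself can be as simple as a prefix match on resource type. The sketch below assumes networking, compute and identity map to the `Microsoft.Network`, `Microsoft.Compute` and `Microsoft.ManagedIdentity` provider namespaces; the actual list isn't given in the post, and `inScope` and `filterByScope` are hypothetical names.

```typescript
// Assumed provider namespaces for the "core" scope (not confirmed by the post).
const CORE_PREFIXES = ["microsoft.network/", "microsoft.compute/", "microsoft.managedidentity/"];

// True when a resource type should be included for the given scope.
function inScope(resourceType: string, scope: "all" | "core" = "core"): boolean {
  if (scope === "all") return true;
  const t = resourceType.toLowerCase();
  return CORE_PREFIXES.some((p) => t.indexOf(p) === 0);
}

// Filters any list of objects carrying an ARM `type` field.
function filterByScope<T extends { type: string }>(resources: T[], scope: "all" | "core" = "core"): T[] {
  return resources.filter((r) => inScope(r.type, scope));
}
```

This only shows the predicate. Where it gets applied (trimming the template before the call, or trimming the result afterwards) decides whether it actually saves time, and the post doesn't spell that out.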
The CLI's --no-pretty-print flag changed behaviour in az 2.61+. Before that release, JSON output was pretty by default; the flag disabled it. After 2.61 it's compact by default and the flag is a no-op. Pin the CLI version in the Container App image and don't trust shell-out behaviour to be stable across users.
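Pinning in the image covers the container, but a stdio server on a laptop can still pick up whatever `az` is on the PATH. One way to catch that is to compare the output of `az version` (which prints JSON with an `azure-cli` key) against the pinned value at startup. A sketch, with `assertPinnedAzVersion` as a hypothetical name and the pinned version supplied by the caller:

```typescript
// Parses the JSON printed by `az version` and throws unless the CLI
// version matches the pinned one exactly.
function assertPinnedAzVersion(azVersionJson: string, pinned: string): string {
  const parsed = JSON.parse(azVersionJson) as Record<string, unknown>;
  const actual = parsed["azure-cli"];
  if (typeof actual !== "string") {
    throw new Error("could not find \"azure-cli\" in `az version` output");
  }
  if (actual !== pinned) {
    throw new Error(`az CLI is ${actual}, but this server is pinned to ${pinned}`);
  }
  return actual;
}
```

The input string would come from something like `exec("az", ["version"])`, using the same `exec` wrapper as the tool handlers.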
Bicep registries needed login. If your template imports from br:myregistry.azurecr.io/..., the MCP server's identity needs AcrPull on that registry, and bicep build has to run after az acr login. I now run az acr login --name myregistry on server startup and every 8 hours after that.
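The refresh loop is small enough to sketch. This version takes the login step as an injected function so it can be exercised without touching Azure; `keepAcrLoginFresh` is a hypothetical name, and the 8-hour default is taken from the text above.

```typescript
const EIGHT_HOURS_MS = 8 * 60 * 60 * 1000;

// Runs the login once immediately, then again on a fixed interval.
// Failures are reported through onError rather than crashing the server.
function keepAcrLoginFresh(
  login: () => Promise<void>,
  intervalMs: number = EIGHT_HOURS_MS,
  onError: (err: unknown) => void = (err) => console.error("[mcp] acr login failed", err)
): ReturnType<typeof setInterval> {
  const run = (): void => {
    login().catch(onError);
  };
  run();
  return setInterval(run, intervalMs);
}
```

In the server, `login` would wrap the CLI call, for example `() => exec("az", ["acr", "login", "--name", "myregistry"]).then(() => undefined)`, and the returned timer can be passed to `clearInterval` on shutdown.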
Numbers, after a quarter
- ~40% of Bicep PRs now have a what-if comment by the time I review them
- 6 instances of accidental resource deletion caught at PR time, before merge
- 11 drifts surfaced; most were innocuous, but two were undocumented production changes
What I'd do differently
Bake the bicep and az CLIs into the container image rather than relying on the host. We had a few weeks of "works on my laptop" issues until I pinned versions in the Dockerfile. Add a tools_version Tool that reports what's installed; the model uses it when answering "why did this break?" questions.
I would NOT auto-comment on PRs from the MCP server itself. The server is a retrieval tool. PR commenting is a separate workflow with different security implications. Keep the boundary clean.
