DevOps
Pipelines, runners, IaC, release engineering on Azure.
Five Gotchas When Wiring Azure DevOps MCP Server Into VS Code Copilot
The Azure DevOps MCP Server's setup docs make it look like a five-minute task. It is, if everything goes right. Most teams hit one or more of these five issues, lose an afternoon, and conclude the tool is "buggy" when really it's …
Time-Slicing vs MIG for Bursty LLM Inference Traffic on AKS GPU Node Pools
NVIDIA gives you two ways to share a single GPU across multiple workloads on Kubernetes: time-slicing and MIG (Multi-Instance GPU). The first is software-based and flexible. The second is hardware-partitioned and rigid.
Swapping ACR for Harbor in an AKS GitOps Pipeline: What Broke, What Didn't
Azure Container Registry (ACR) is the default registry for AKS workloads, and for most teams it's the right call — managed, integrated with Entra ID, geo-replicated.
Backstage on AKS With CAPZ + ASO Instead of Crossplane: When the Tooling Choice Matters
Most Backstage-on-AKS internal-platform tutorials reach for Crossplane to do the resource provisioning. We started there too.
Day 1 vs Day 90 on an AKS Internal Platform: What I'd Wire Differently
Three months ago I stood up a new internal developer platform on AKS for a 30-engineer team. Backstage as the portal, ArgoCD for delivery, Crossplane for resource provisioning, the usual stack.
Migrating an AKS Cluster Off Flux v2 to the New ArgoCD Extension Without Dropping Reconciliation
When the ArgoCD extension for AKS hit GA at KubeCon Europe 2026, we had four production AKS clusters running Flux v2 GitOps and a long-standing preference among application-team developers for ArgoCD's UI.
Letting Copilot Agent Mode Own Our Monthly AKS Maintenance Run: Five Failure Modes I Hit
Once a month I do the same boring AKS chore: rotate certificates, prune unused resources, check node pool versions against the support matrix, and update the Helm releases for our common platform services.
Building a Free Bicep-Aware PR Reviewer With GitHub Actions and Azure OpenAI
We had a tool gap. Our application code got AI review on every PR. Our infrastructure code — Bicep templates, Terraform modules, Helm charts — went through whatever the human on rotation was willing to look at, which was usually "…
What an SRE Agent Caught Last Quarter (and What It Missed)
The Azure SRE Agent has been running against our production AKS cluster for one quarter. Three months. About 90 incidents.
30 Days With the Azure DevOps MCP Server: What Actually Changed in My Backlog Triage
I track tickets like most people: poorly. The backlog has 240 open work items in it, the average age is 71 days, and roughly a third are duplicates of each other under slightly different wording.
Plugging Azure OpenAI Into Azure Pipelines for PR Review: A Real-World Setup
The first time we tried this, the bot left a comment on every PR that just said "Looks good!" — including on a PR that introduced a hard-coded SAS token.