Claude Code · Azure · FinOps
I built a Claude Code skill that does a 20-minute Azure cost review
A read-only Azure cost auditor that produces a written report a human can read and act on. Calibrated against a real tenant. Finds roughly 3× what Azure Advisor surfaces.
The pitch
Most Azure cost reviews are some combination of: someone clicking through the portal for a day, a £15k consultant deck three weeks later, or Azure Advisor — which is honest about what it sees but conservative by design.
I wanted a fourth option: a Claude Code skill that runs in twenty minutes, against a snapshot of the tenant, and produces a written report a human can read and act on. Read-only. Citation-grounded. Reproducible.
Against a tenant doing roughly £30k/month in Azure spend, the latest run produces:
Total estimated monthly savings: £5,436 – £7,369 / month
Findings by severity: 0 Critical · 18 High · 38 Medium · 174 Low · 8 Info
That's roughly 3× Azure Advisor's published recommendation ceiling for the same tenant (~£20k/year). Not because Advisor is wrong — because the skill carries opinions Advisor cannot.
Below: what it catches, and the architecture that lets it grow without becoming flaky.
What it found
The single biggest finding wasn't a misconfigured VM. It was a 5× App Service P2v3 reservation running at low utilisation with 9 months remaining before expiry. The skill reports both the monthly band and the term-to-expiry forecast:
Underused reservation: P2v3 × 5 — 40% utilisation
Forecast: £4,558 – £5,535 between now and expiry
That single line is worth more attention than any number of orphaned-disk findings. It's also the kind of thing Advisor never says, because Advisor doesn't combine spend + utilisation + term remaining into a single forward-looking £ figure.
Other examples from the same run:
- GeoRedundant backup vault flagged on a non-production subscription — switching to LocallyRedundant is a 50% storage cut on Microsoft published rates, but only if there are no protected items at switch time.
- Backup retention bloat on a workload-policy vault: weekly=104, monthly=60, yearly=10 — well above what the source workload actually needs.
- A subscription on full PAYG/EA/CSP rates whose name says non-prod: a hint that whoever provisioned it skipped the Dev/Test offer. The skill is offer-aware (CSP vs EA vs PAYG) and writes the recommendation accordingly.
- Premium SSD P30 reservations at 25% utilisation, forecast £1,447–£1,953 over the remaining 8 months.
The point isn't any single number — it's the texture. The skill surfaces the kinds of findings that show up in a senior FinOps consultant's deck, not the kinds an Advisor cron job emits.
The architecture
The skill is a monorepo of two installable Python packages:
- `azure-investigator-core` — the read-only collectors. Pulls from the `az` CLI, normalises into a snapshot directory.
- `azure-cost-investigator` — the rule engine. Reads a snapshot, evaluates rules, writes a report.
Numbers as of the latest commit:
- 15 cost rules
- 14 knowledge corpus docs
- 18 collectors
- 227 passing tests (1 skipped — a smoke test gated on a real-tenant snapshot env var)
- Output: `report.md` + `report.html` + `findings.yaml`
Reservation forecast: a small change with a big narrative effect
A naïve "underused reservation" rule reports a monthly band: "this reservation is wasting £X/month at current utilisation". True, but small.
The skill reads `expiryDateTime` from the reservation snapshot and adds three evidence fields:

```yaml
term_remaining_months: 9
forecast_low_gbp_to_expiry: 4558
forecast_high_gbp_to_expiry: 5535
```

If `expiryDateTime` is missing, it falls back to a 12-month default (overridable via config). The recommendation copy reads "£4,558–£5,535 between now and expiry" instead of "£506–£615/month" — same underlying data, dramatically clearer impact.
The same rule dispatches across VM, App Service, and Premium SSD SKU families, not just VMs.
Six architectural bets
1. Read-only is absolute. A single function (azcli.run_json) gates every Azure CLI call. Twenty-three unit tests verify it rejects every one of 33 forbidden write verbs before subprocess is reached. Every collector — vms, disks, reservations, recovery_services, log_analytics — goes through the same gate. There is no escape hatch.
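A single choke-point like that might look as follows. This is a hedged sketch, not the repo's `azcli.run_json`: the verb set shown is an illustrative subset of the 33 forbidden verbs, and the exception name is mine.

```python
import json
import subprocess

# Illustrative subset; the real gate lists 33 forbidden write verbs.
FORBIDDEN_VERBS = {"create", "delete", "update", "set", "start", "stop",
                   "restart", "deallocate", "purge", "restore", "assign"}

class WriteAttemptError(RuntimeError):
    """Raised before any subprocess is spawned for a write-like command."""

def run_json(args: list[str]) -> object:
    """Run `az <args> -o json`, refusing any command containing a write verb."""
    blocked = {token.lower() for token in args} & FORBIDDEN_VERBS
    if blocked:
        raise WriteAttemptError(f"read-only gate: refusing verbs {sorted(blocked)}")
    out = subprocess.run(["az", *args, "-o", "json"],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)
```

The key property is that the check happens before `subprocess.run` is ever reached, so a collector cannot accidentally mutate a tenant even if handed a bad argument list.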
2. Citation-grounded knowledge corpus. Every rule declares KNOWLEDGE_REFS pointing at committed .md files in the corpus. Each knowledge doc has frontmatter with source_url, source_retrieved, and source_sha256. A rule that cites a missing document fails to load. This is how I keep the model from inventing numbers — the rule is allowed to compute, but it must justify its limits from a doc that has a Microsoft URL on it.
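The load-time check can be sketched like this. The frontmatter keys come from the article; the function name and the naive `---` delimiter parsing are assumptions, not the repo's implementation.

```python
from pathlib import Path

REQUIRED_FRONTMATTER = ("source_url", "source_retrieved", "source_sha256")

def validate_knowledge_refs(corpus_dir: Path, refs: list[str]) -> None:
    """Refuse to load a rule whose cited docs are missing or lack sourcing."""
    for ref in refs:
        doc = corpus_dir / ref
        if not doc.is_file():
            raise FileNotFoundError(f"rule cites missing knowledge doc: {ref}")
        # Frontmatter is the block between the first pair of --- delimiters.
        parts = doc.read_text(encoding="utf-8").split("---")
        frontmatter = parts[1] if len(parts) >= 3 else ""
        missing = [key for key in REQUIRED_FRONTMATTER
                   if f"{key}:" not in frontmatter]
        if missing:
            raise ValueError(f"{ref} is missing frontmatter keys: {missing}")
```

Failing at load time rather than report time means a rule with a dangling citation never contributes a number to the output.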
3. Severity × Confidence as independent axes. Findings carry both. A high-severity / low-confidence finding ("we think this £4k reservation is wasted, but utilisation data is sparse") gets handled differently from a low-severity / high-confidence finding ("this £4 IP is definitely orphaned"). Single-axis lists hide that distinction.
4. Savings as ranges, not point estimates. Every monetary recommendation comes with a non-empty assumption string explaining what's not netted out: negotiated discounts, reservation coverage, regional price drift. Treating the range itself as the answer is a feature.
5. Persona separation in the core package. The core package is shared with a sibling azure-security-investigator skill (a v2 stub in the same repo). When I added the recovery-services collector for cost, the security skill inherited it for free.
6. Rendering is a first-class output. The skill produces report.html automatically alongside report.md, with print-friendly CSS. Browser-print-to-PDF works without extra tooling. This matters because cost reports get sent to humans who don't read terminal Markdown.
The rule corpus
The fifteen rules currently in the skill:
- `orphaned_disks` — unattached managed disks billed by capacity.
- `unattached_public_ips` — public IPs not bound to a NIC or load balancer.
- `vm_rightsizing` — P95 CPU under 3% and outbound under 2 Mb/s on a 7-day window.
- `underused_reservations` — VM, App Service, and Premium SSD reservations under the utilisation threshold, with term-to-expiry forecast.
- `appservice_idle_plans` — App Service Plans with zero traffic.
- `storage_tier_mismatch` — Hot blobs older than the Cool break-even.
- `basic_sku_retirement` — Public IP Basic SKU and other deprecated SKUs.
- `tagging_governance` — missing CAF-recommended tags.
- `env_mismatch_in_prod_sub` — non-prod-named resources living in production subscriptions.
- `unused_snapshots` — disk snapshots older than the retention threshold.
- `nic_orphans` — NICs detached from any VM.
- `rsv_backup_retention` — Recovery Services Vault retention bloat plus GRS-redundancy scrutiny.
- `dev_test_offer_eligibility` — PAYG/EA/CSP subscriptions that look non-prod by name.
- `log_analytics_retention` — Log Analytics workspace retention exceeding workload need.
- `advisor_cost_recommendations` — surfaces Azure Advisor's own £-quantified findings.
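To make the shape of a rule concrete, here is a minimal sketch of what an `orphaned_disks`-style module might look like. The module layout and the snapshot keys are assumptions for illustration; only the rule name and the `KNOWLEDGE_REFS` convention come from the article.

```python
# Hypothetical rule module; names are illustrative, not the repo's API.
KNOWLEDGE_REFS = ["managed_disk_billing.md"]

def evaluate(snapshot: dict) -> list[dict]:
    """Flag managed disks billed by capacity but attached to nothing."""
    findings = []
    for disk in snapshot.get("disks", []):
        if disk.get("managedBy") is None and disk.get("diskState") == "Unattached":
            findings.append({
                "rule": "orphaned_disks",
                "resource": disk["name"],
                "severity": "low",
                "confidence": "high",
                "assumptions": "list price; no reserved-capacity discount netted out",
            })
    return findings
```

A rule that is a pure function of the snapshot is trivially testable: feed it a dict, assert on the findings, no Azure required.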
Adding a sixteenth rule means: writing a new file in rules/, declaring its KNOWLEDGE_REFS, writing the knowledge doc, writing tests. No registration step.
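"No registration step" suggests the engine discovers rule modules by scanning the package. A plausible sketch, assuming the package path and the `evaluate`/`KNOWLEDGE_REFS` contract (the function name here is mine):

```python
import importlib
import pkgutil

def discover_rules(package_name: str = "azure_cost_investigator.rules") -> list:
    """Import every module under rules/ and keep those exposing the rule contract."""
    package = importlib.import_module(package_name)
    rules = []
    for mod_info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package_name}.{mod_info.name}")
        if hasattr(module, "evaluate") and hasattr(module, "KNOWLEDGE_REFS"):
            rules.append(module)
    return rules
```

Discovery by convention means dropping a file into `rules/` is the whole integration; there is no central list to forget to update.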
What it deliberately doesn't do
A short list:
- No remediation verbs. Read-only is a guarantee, not a default.
- No live web fetches at runtime. Knowledge is committed and SHA256-tracked.
- No reservation purchase recommendations. Sizing reservations is a finance decision, not a script's call.
- No multi-cloud comparisons. AWS/GCP have different mental models.
- No ticket-system integration. The output is a Markdown file. What you do with it is yours.
- No web UI. A static `report.html` is the closest concession.
Security findings live in azure-security-investigator, a separate v2 stub in the same repo, sharing the core package. Splitting cost from security at the skill boundary keeps each persona's report focused.
Running it
```shell
az login                                     # interactive
uv sync --all-packages                       # one-time
azure-investigator pull -s <SUB_NAME>        # one or more --subscription flags
azure-cost-investigator analyse latest       # writes report.md + report.html + findings.yaml
```
Twenty minutes is roughly the wall-clock for a tenant with a few dozen subscriptions worth of resources. Most of it is az calls; the rule evaluation is sub-second.
For a printable PDF: open report.html in a browser, ⌘P → Save as PDF. The CSS is print-tuned.
Repo
github.com/mindaugasnakrosis/azure-costs-analyzer
MIT-licensed. Python 3.11+. CI green on 3.11 and 3.12.
If you run it against your own tenant and it surfaces something interesting (or — more usefully — something wrong), open an issue. Calibration against more tenants is the next thing keeping this honest.
Written with Claude Code. The code is the artefact; the article is the receipt.
Try it
Clone the repo and follow the README — the install path is documented end-to-end, including a smoke test that runs without network or LLM. mindaugasnakrosis/azure-costs-analyzer →