Claude Code · Azure · FinOps

I built a Claude Code skill that does a 20-minute Azure cost review

A read-only Azure cost auditor that produces a written report a human can read and act on. Calibrated against a real tenant. Finds roughly 3× what Azure Advisor surfaces.

The pitch

Most Azure cost reviews are some combination of: someone clicking through the portal for a day, a £15k consultant deck three weeks later, or Azure Advisor — which is honest about what it sees but conservative by design.

I wanted a fourth option: a Claude Code skill that runs in twenty minutes, against a snapshot of the tenant, and produces a written report a human can read and act on. Read-only. Citation-grounded. Reproducible.

Run against a tenant doing roughly £30k/month in Azure spend, the latest run produces:

Total estimated monthly savings: £5,436 – £7,369 / month

Findings by severity: 0 Critical · 18 High · 38 Medium · 174 Low · 8 Info

Annualised, that is about £65k to £88k, which at the low end is roughly 3× Azure Advisor's published recommendation ceiling for the same tenant (~£20k/year). Not because Advisor is wrong, but because the skill carries opinions Advisor cannot.

Below: what it catches, and the architecture that lets it grow without becoming flaky.


What it found

The single biggest finding wasn't a misconfigured VM. It was a 5× App Service P2v3 reservation running at low utilisation with 9 months remaining before expiry. The skill reports both the monthly band and the term-to-expiry forecast:

Underused reservation: P2v3 × 5 — 40% utilisation Forecast: £4,558 – £5,535 between now and expiry

That single line is worth more attention than any number of orphaned-disk findings. It's also the kind of thing Advisor never says, because Advisor doesn't combine spend + utilisation + term remaining into a single forward-looking £ figure.

Other examples from the same run:

The point isn't any single number — it's the texture. The skill surfaces the kinds of findings that show up in a senior FinOps consultant's deck, not the kinds an Advisor cron job emits.


The architecture

The skill is a monorepo of two installable Python packages:

Numbers as of the latest commit:


Reservation forecast: a small change with a big narrative effect

A naïve "underused reservation" rule reports a monthly band: "this reservation is wasting £X/month at current utilisation". True, but small.

The skill reads expiryDateTime from the reservation snapshot and adds three evidence fields:

term_remaining_months: 9
forecast_low_gbp_to_expiry: 4558
forecast_high_gbp_to_expiry: 5535

If expiryDateTime is missing, it falls back to a 12-month default (overridable via config). The recommendation copy reads "£4,558–£5,535 between now and expiry" instead of "£506–£615/month" — same underlying data, dramatically clearer impact.

The same rule dispatches across VM, App Service, and Premium SSD SKU families, not just VMs.
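
A minimal sketch of the forecast arithmetic, assuming the monthly band has already been computed elsewhere. The function names, the DEFAULT_TERM_MONTHS constant, and the choice to count a partial month as a full one are my assumptions for illustration, not the skill's actual code; only the three output field names and the 12-month fallback come from the description above.

```python
from __future__ import annotations

from datetime import datetime, timezone

# Fallback when expiryDateTime is missing; the constant name is hypothetical.
DEFAULT_TERM_MONTHS = 12


def months_remaining(expiry_iso: str | None, now: datetime,
                     default: int = DEFAULT_TERM_MONTHS) -> int:
    """Calendar months from `now` to expiry; `default` if expiry is missing."""
    if not expiry_iso:
        return default
    # Normalise a trailing "Z" so older Python versions can parse it too.
    expiry = datetime.fromisoformat(expiry_iso.replace("Z", "+00:00"))
    months = (expiry.year - now.year) * 12 + (expiry.month - now.month)
    if expiry.day > now.day:
        months += 1  # assumption: a partial month counts as a full one
    return max(months, 0)  # an already-expired reservation forecasts to zero


def forecast_to_expiry(monthly_low: float, monthly_high: float,
                       expiry_iso: str | None, now: datetime) -> dict:
    """Scale a monthly waste band to the remaining reservation term."""
    months = months_remaining(expiry_iso, now)
    return {
        "term_remaining_months": months,
        "forecast_low_gbp_to_expiry": round(monthly_low * months),
        "forecast_high_gbp_to_expiry": round(monthly_high * months),
    }


if __name__ == "__main__":
    now = datetime(2025, 1, 15, tzinfo=timezone.utc)
    print(forecast_to_expiry(500, 600, "2025-10-15T00:00:00Z", now))
```

With an illustrative £500 to £600 monthly band and nine months left, this yields 4500 and 5400 for the low and high forecasts. With no expiry date it falls back to 12 months and yields 6000 and 7200.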


Six architectural bets

1. Read-only is absolute. A single function (azcli.run_json) gates every Azure CLI call. Twenty-three unit tests verify it rejects every one of 33 forbidden write verbs before subprocess is reached. Every collector — vms, disks, reservations, recovery_services, log_analytics — goes through the same gate. There is no escape hatch.

2. Citation-grounded knowledge corpus. Every rule declares KNOWLEDGE_REFS pointing at committed .md files in the corpus. Each knowledge doc has frontmatter with source_url, source_retrieved, and source_sha256. A rule that cites a missing document fails to load. This is how I keep the model from inventing numbers — the rule is allowed to compute, but it must justify its limits from a doc that has a Microsoft URL on it.

3. Severity × Confidence as independent axes. Findings carry both. A high-severity / low-confidence finding ("we think this £4k reservation is wasted, but utilisation data is sparse") gets handled differently from a low-severity / high-confidence finding ("this £4 IP is definitely orphaned"). Single-axis lists hide that distinction.

4. Savings as ranges, not point estimates. Every monetary recommendation comes with a non-empty assumption string explaining what's not netted out: negotiated discounts, reservation coverage, regional price drift. Treating the range itself as the answer is a feature.

5. Persona separation in the core package. The core package is shared with a sibling azure-security-investigator skill (a v2 stub in the same repo). When I added the recovery-services collector for cost, the security skill inherited it for free.

6. Rendering is a first-class output. The skill produces report.html automatically alongside report.md, with print-friendly CSS. Browser-print-to-PDF works without extra tooling. This matters because cost reports get sent to humans who don't read terminal Markdown.
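
To make the first bet concrete, here is a hedged sketch of what a single read-only gate could look like. Only the name run_json comes from the description above. The verb list is a small illustrative subset (the real one has 33 entries), and the exception class, the injectable runner parameter, and the rule of checking only the tokens before the first flag are my assumptions.

```python
import json
import subprocess
from itertools import takewhile

# Illustrative subset only; the skill's real list is described as 33 verbs.
FORBIDDEN_VERBS = frozenset(
    {"create", "delete", "update", "set", "start", "stop", "restart", "purge"}
)


class WriteVerbRejected(RuntimeError):
    """Raised before any subprocess is spawned."""


def run_json(args, runner=subprocess.run):
    """Run `az <args> --output json` and return the parsed stdout.

    `runner` is injectable so tests can prove nothing was executed.
    """
    # The command path is every token before the first flag, e.g.
    # ["vm", "list"] in `az vm list --resource-group rg`.
    command_path = list(takewhile(lambda a: not a.startswith("-"), args))
    blocked = [a for a in command_path if a.lower() in FORBIDDEN_VERBS]
    if blocked:
        raise WriteVerbRejected(f"refusing write verb(s): {blocked}")
    proc = runner(
        ["az", *args, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout or "null")
```

Because the check runs before the runner is called, a test can pass in a fake runner and assert it was never invoked for a forbidden verb, which is the property "rejected before subprocess is reached" describes.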


The rule corpus

The fifteen rules currently in the skill:

Adding a sixteenth rule means: writing a new file in rules/, declaring its KNOWLEDGE_REFS, writing the knowledge doc, writing tests. No registration step.
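
A rough sketch of how "no registration step" and "a rule that cites a missing document fails to load" could fit together: discover every module in rules/ and refuse any whose KNOWLEDGE_REFS point at files that are not in the corpus. The load_rules function, the RuleLoadError class, and the decision to reject an empty KNOWLEDGE_REFS are assumptions for illustration; the skill's real loader may work differently.

```python
import importlib.util
from pathlib import Path


class RuleLoadError(Exception):
    """A rule module could not be loaded safely."""


def load_rules(rules_dir: Path, knowledge_dir: Path) -> dict:
    """Import every rule file in `rules_dir`, validating its citations."""
    rules = {}
    for path in sorted(rules_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(f"rule_{path.stem}", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        refs = getattr(module, "KNOWLEDGE_REFS", None)
        if not refs:
            # Assumption: an uncited rule is treated as an error too.
            raise RuleLoadError(f"{path.name}: no KNOWLEDGE_REFS declared")
        missing = [ref for ref in refs if not (knowledge_dir / ref).is_file()]
        if missing:
            raise RuleLoadError(f"{path.name}: missing knowledge docs {missing}")
        rules[path.stem] = module
    return rules
```

Dropping a new file into rules/ is then enough for it to be picked up, and a typo in a citation surfaces at load time rather than in a report.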


What it deliberately doesn't do

A short list:

Security findings live in azure-security-investigator, a separate v2 stub in the same repo, sharing the core package. Splitting cost from security at the skill boundary keeps each persona's report focused.


Running it

az login                                  # interactive
uv sync --all-packages                    # one-time
azure-investigator pull -s <SUB_NAME>     # one or more --subscription flags
azure-cost-investigator analyse latest    # writes report.md + report.html + findings.yaml

Twenty minutes is roughly the wall-clock time for a tenant with a few dozen subscriptions' worth of resources. Most of it is az calls; the rule evaluation is sub-second.

For a printable PDF: open report.html in a browser, ⌘P → Save as PDF. The CSS is print-tuned.


Repo

github.com/mindaugasnakrosis/azure-costs-analyzer

MIT-licensed. Python 3.11+. CI green on 3.11 and 3.12.

If you run it against your own tenant and it surfaces something interesting (or — more usefully — something wrong), open an issue. Calibration against more tenants is the next thing keeping this honest.


Written with Claude Code. The code is the artefact; the article is the receipt.

Try it

Clone the repo and follow the README: the install path is documented end-to-end, including a smoke test that runs without network or LLM.