QuotaWatch: Proactive AWS Quota Monitoring with AI-Powered Doc Parsing

The Problem

I kept running into the same failure mode: a deployment would fail because some AWS service quota was exhausted, and nobody knew until the terraform apply blew up. It happened with VPC subnets, Transit Gateway attachments, PrivateLink endpoints. Every time, it was the same scramble: figure out which quota was hit, check if it’s adjustable, file a support ticket, wait.

The frustrating part is that AWS has the Service Quotas API. You can query it programmatically. But it has real gaps. Not every quota shows up in the API. The metadata is often incomplete: you might get a quota name and a limit value, but no context about what happens when you hit it, whether it’s adjustable, or what the typical lead time is for an increase. And if you’re working across multiple AWS accounts, which most teams are, you’re repeating this exercise per account.

Networking quotas are where this hurts the most. These are the limits that silently prevent infrastructure expansion. You’re trying to peer a new VPC, attach another spoke to a Transit Gateway, or spin up a new PrivateLink endpoint, and the deployment fails. The blast radius is high because networking changes tend to gate everything else.

I looked at the existing options. CloudWatch alarms can alert on some quotas, but you need to set them up individually per quota per account, and they still can’t tell you about quotas that aren’t in the Service Quotas API. Commercial monitoring tools like Datadog or CloudHealth have quota features, but they’re expensive and still limited by the same API gaps. Adding another vendor dependency for what should be a straightforward data problem felt wrong.

So I built QuotaWatch.

The Approach

Local-First Architecture

QuotaWatch is a Python CLI that stores everything in a local SQLite database. The database holds quota definitions, current values, usage metrics, and enriched metadata. No cloud infrastructure to manage. No servers to run. You install it, configure your AWS profiles, and start querying.
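To make the storage model concrete, here is a minimal sketch of what a local quota table could look like. The table name, column names, and the `open_db` and `record_quota` helpers are assumptions for illustration, not QuotaWatch's actual schema.

```python
import sqlite3

# Hypothetical schema: every name here is an illustrative assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS quotas (
    account       TEXT NOT NULL,  -- profile name from config.toml
    region        TEXT NOT NULL,
    service_code  TEXT NOT NULL,
    quota_code    TEXT NOT NULL,
    quota_name    TEXT NOT NULL,
    limit_value   REAL,           -- currently applied value
    default_value REAL,
    usage_value   REAL,           -- NULL when no usage metric is available
    adjustable    INTEGER,        -- 1, 0, or NULL when unknown
    source        TEXT NOT NULL,  -- 'api' or 'docs'
    synced_at     TEXT NOT NULL,  -- ISO 8601 timestamp of the sync
    PRIMARY KEY (account, region, service_code, quota_code, synced_at)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_quota(conn, row):
    """Insert one quota snapshot; `row` is a dict keyed by column name."""
    conn.execute(
        "INSERT INTO quotas VALUES (:account, :region, :service_code, "
        ":quota_code, :quota_name, :limit_value, :default_value, "
        ":usage_value, :adjustable, :source, :synced_at)",
        row,
    )
    conn.commit()
```

Including the sync timestamp in the primary key means each sync adds a new snapshot rather than overwriting the previous one, which is one way to keep history in a single table.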

The configuration lives in ~/.config/quotawatch/config.toml. Each AWS account gets its own profile section that names the AWS credentials profile to use, the regions to scan, and any account-specific settings. This makes it straightforward to monitor quotas across a whole AWS organization from one machine.

[Figure: QuotaWatch data pipeline from AWS APIs to SQLite]

AWS Service Quotas API Integration

The core data layer syncs directly from the AWS Service Quotas API. QuotaWatch pulls quota definitions, current applied values, and default values for 11 AWS networking services:

VPC (subnets, route tables, security groups, NAT gateways)
Transit Gateway (attachments, route tables, peerings)
Cloud WAN (segments, attachments, core network edges)
PrivateLink (endpoints, endpoint services, gateway endpoints)
Direct Connect (connections, virtual interfaces, gateways)
Route 53 (hosted zones, records per zone, health checks)
Route 53 Resolver (endpoints, rules, rule associations)
Network Firewall (firewalls, rule groups, policies)
VPC Lattice (services, service networks, targets)
Global Accelerator (accelerators, listeners, endpoint groups)
Elastic Load Balancing (load balancers, target groups, listeners)

Each sync stores the full quota record in SQLite with timestamps, so you can track changes over time without hitting the API repeatedly.
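A sync for one service can be sketched roughly as follows. This assumes the boto3 `service-quotas` client and its `list_service_quotas` paginator; the `fetch_quotas` function and the shape of the records it yields are hypothetical, not QuotaWatch's actual code. Default values would come from a separate call (`list_aws_default_service_quotas`), which is left out here.

```python
from datetime import datetime, timezone


def fetch_quotas(client, service_code):
    """Yield normalized quota records for one service.

    `client` is expected to behave like boto3.client("service-quotas").
    The output field names are illustrative assumptions.
    """
    paginator = client.get_paginator("list_service_quotas")
    synced_at = datetime.now(timezone.utc).isoformat()
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page.get("Quotas", []):
            yield {
                "service_code": quota["ServiceCode"],
                "quota_code": quota["QuotaCode"],
                "quota_name": quota["QuotaName"],
                # The API can omit these, so read them defensively.
                "value": quota.get("Value"),
                "adjustable": quota.get("Adjustable"),
                "global_quota": quota.get("GlobalQuota"),
                "synced_at": synced_at,
            }


# Example wiring (requires boto3 and valid credentials, so left commented out):
# import boto3
# session = boto3.Session(profile_name="prod-admin", region_name="us-east-1")
# records = list(fetch_quotas(session.client("service-quotas"), "vpc"))
```

Passing the client in as an argument keeps the function easy to exercise against a stub, without AWS access.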

AI-Powered Documentation Parsing

Here’s where QuotaWatch goes beyond what the raw API gives you. The AWS Service Quotas API is incomplete. Some quotas are missing entirely. Others have names but no description, no adjustability flag, no context.

QuotaWatch uses Anthropic Claude to parse AWS documentation pages and extract quota information that the API doesn’t provide. It reads the official docs, identifies quota tables and descriptions, and structures the extracted data into the same schema as the API records. The enrichment pipeline then cross-references what it got from the API with what it parsed from the docs, filling in the gaps.
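The cross-referencing step might look something like the sketch below. Matching records by normalized quota name and letting API values take precedence are both assumptions made for illustration; the real pipeline may match and merge differently.

```python
def normalize(name):
    """Lowercase and collapse whitespace so names from both sources compare equal."""
    return " ".join(name.lower().split())


def enrich(api_records, doc_records):
    """Fill gaps in API records with doc-derived fields, never overwriting API data.

    Hypothetical merge policy: API wins on conflicts, docs fill in None values,
    and quotas that exist only in the docs are added as new records.
    """
    merged = {}
    for rec in api_records:
        merged[normalize(rec["quota_name"])] = {**rec, "source": "api"}

    for doc in doc_records:
        key = normalize(doc["quota_name"])
        if key not in merged:
            merged[key] = {**doc, "source": "docs"}
            continue
        target = merged[key]
        for field, value in doc.items():
            if target.get(field) is None and value is not None:
                target[field] = value
                target["source"] = "api+docs"
    return list(merged.values())
```

Tracking a `source` value per record keeps it visible which fields came from the API and which were inferred from documentation.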

This is not summarization. It’s structured extraction. Claude reads a documentation page for, say, VPC Lattice quotas and pulls out the quota name, default value, whether it’s adjustable, which API action it applies to, and any caveats mentioned in the docs. That structured data goes into the same SQLite database alongside the API-sourced records.
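One way to express structured extraction in code is to pin down the output schema in the prompt and then validate whatever comes back. The sketch below is an assumption about how that could be done, not QuotaWatch's actual prompt or parser: the JSON key names, the prompt wording, and the `call_model` callable are all hypothetical.

```python
import json

# The fields come from the description above; the exact key names are assumed.
REQUIRED_FIELDS = ("quota_name", "default_value", "adjustable", "api_action", "caveats")


def build_extraction_prompt(doc_text):
    """Ask for a strict JSON array instead of a free-form summary."""
    keys = ", ".join(f'"{field}"' for field in REQUIRED_FIELDS)
    return (
        "Extract every service quota from the AWS documentation below.\n"
        f"Respond with only a JSON array of objects with exactly these keys: {keys}.\n"
        "Use null for any value the page does not state.\n\n"
        f"<documentation>\n{doc_text}\n</documentation>"
    )


def parse_extraction(response_text):
    """Reject any reply that is not a list of objects carrying every field."""
    data = json.loads(response_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of quota objects")
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("expected each array element to be an object")
        missing = [field for field in REQUIRED_FIELDS if field not in item]
        if missing:
            raise ValueError(f"quota object is missing fields: {missing}")
    return data


def extract_quotas(doc_text, call_model):
    """`call_model` is any callable mapping a prompt string to the reply text."""
    return parse_extraction(call_model(build_extraction_prompt(doc_text)))
```

In practice `call_model` would be a thin wrapper around the Anthropic SDK's messages API. Keeping it as a plain callable means the prompt and validation logic can be tested without an API key.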

You need an ANTHROPIC_API_KEY to use the doc parsing features. The rest of the tool works fine without it, but you’ll miss the enrichment layer.

Multi-Account Support

Most teams don’t have one AWS account. They have dev, staging, production, shared services, maybe a dedicated networking account. QuotaWatch handles this with TOML-based profiles:

[profiles.production]
aws_profile = "prod-admin"
regions = ["us-east-1", "us-west-2"]

[profiles.staging]
aws_profile = "staging-admin"
regions = ["us-east-1"]

[profiles.networking]
aws_profile = "network-admin"
regions = ["us-east-1", "us-west-2", "eu-west-1"]

Each profile maps to an AWS credentials profile. You can sync and query them independently or sweep across all of them. The SQLite database tags every record with its source account and region, so you always know where a quota value came from.

Watchlist and Priority Monitoring

Not every quota matters equally. The watchlist system lets you flag specific quotas as high priority and set custom utilization thresholds. When a watched quota crosses its threshold during a sync, QuotaWatch flags it immediately.

This is the difference between “here are 500 quotas across your accounts” and “these 3 quotas are at 80% utilization and one of them is not adjustable.” The watchlist turns a data dump into actionable monitoring.
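The threshold check itself is simple to sketch. The `WatchItem` class, the record field names, and the 0.8 default threshold below are assumptions for illustration rather than QuotaWatch's real data model.

```python
from dataclasses import dataclass


@dataclass
class WatchItem:
    quota_code: str
    threshold: float = 0.8  # fraction of the limit; assumed default


def check_watchlist(watchlist, records):
    """Return alerts for watched quotas whose utilization meets the threshold."""
    by_code = {item.quota_code: item for item in watchlist}
    alerts = []
    for rec in records:
        item = by_code.get(rec["quota_code"])
        # Skip unwatched quotas and records without usable usage or limit data.
        if item is None or not rec.get("limit") or rec.get("usage") is None:
            continue
        utilization = rec["usage"] / rec["limit"]
        if utilization >= item.threshold:
            alerts.append({
                "quota_code": rec["quota_code"],
                "quota_name": rec["quota_name"],
                "utilization": round(utilization, 3),
                "adjustable": rec.get("adjustable"),
            })
    # Most urgent first.
    alerts.sort(key=lambda alert: alert["utilization"], reverse=True)
    return alerts
```

Carrying the `adjustable` flag into each alert is what lets the output distinguish "file a support ticket" from "redesign before you hit this."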

What I Learned

The AWS Service Quotas API is better than nothing, but it’s not enough. I started this project assuming the API would cover most of what I needed and I’d just need a nice CLI wrapper. In practice, the API has significant coverage gaps, especially for newer services and for quota metadata beyond the raw limit value. The documentation parsing wasn’t a nice-to-have; it was necessary to get a complete picture.

AI-powered extraction works well for structured documentation. AWS docs follow consistent patterns: quota tables with columns for name, default, adjustable, and description. Claude handles this structured extraction reliably. The key was being specific about the output schema rather than asking for a general summary. When you tell the model exactly what fields you need and what format they should be in, the extraction is accurate.

SQLite is the right database for CLI tools. No setup, no server, no dependencies. The database file lives alongside the config and travels with the user. Queries are fast enough that the CLI feels instant even with thousands of quota records. And because it’s a real database, you can do things like “show me all quotas across all accounts where utilization is above 70% and the quota is not adjustable” as a single query.
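That single query can be sketched against an in-memory database. The table layout and the sample rows are made up for illustration and do not reflect QuotaWatch's actual schema or any real account data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotas (account TEXT, region TEXT, quota_name TEXT, "
    "limit_value REAL, usage_value REAL, adjustable INTEGER)"
)
# Made-up sample rows.
conn.executemany(
    "INSERT INTO quotas VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("production", "us-east-1", "Example quota A", 100, 85, 0),
        ("staging", "us-east-1", "Example quota A", 100, 20, 0),
        ("networking", "eu-west-1", "Example quota B", 50, 45, 1),
        ("networking", "us-west-2", "Example quota C", 10, 8, 0),
    ],
)

# Non-adjustable quotas above 70% utilization, across every account and region.
AT_RISK = """
SELECT account, region, quota_name,
       ROUND(100.0 * usage_value / limit_value, 1) AS pct_used
FROM quotas
WHERE adjustable = 0
  AND limit_value > 0
  AND usage_value / limit_value > 0.70
ORDER BY pct_used DESC
"""

at_risk = conn.execute(AT_RISK).fetchall()
for row in at_risk:
    print(row)
```

The `limit_value > 0` guard avoids dividing by zero for quotas whose limit is unset or zero.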

Multi-account is table stakes, not a feature. I initially built this for a single account and immediately found it inadequate. Real AWS environments have multiple accounts, and quota exhaustion in a shared networking account can block deployments across all of them. Adding multi-account support early changed the architecture for the better: it forced clean separation between data collection and data storage, which made the enrichment pipeline simpler too.

The gap between “data exists” and “data is useful” is where tooling lives. AWS exposes quota data through the API, through the console, through documentation. But none of those surfaces answer the question an engineer actually has: “Am I going to hit a limit on my next deployment?” Turning scattered data into that answer required combining multiple data sources, enriching them, and presenting the result in a way that’s actionable. That’s the entire value proposition of QuotaWatch.

Co-authored with AI, based on the author's working sessions, dictations, and notes.
