CloudChat
#0027

Whoops, No VMs!!!

Summary

You’ve planned for redundancy, scaling, and failover, but what happens when the cloud itself runs out of space? In this episode, Carl and Brandon untangle capacity (what the provider physically or logically has available in a region or zone) versus quota (the soft limit on what you can consume). Mixing the two leads to painful surprises during scale events and failovers.

We talk through how capacity shortfalls show up in real life (zones that are full, SKUs that vary by location, and limited supply for GPU-heavy instances) and the patterns that help: design for multiple zones and regions, add retry and fallback logic with flexible SKUs, balance spot with on-demand, and hold a baseline with reservations or time-bound commitments.
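To make the retry-and-fallback pattern concrete, here is a minimal Python sketch. It is not tied to any provider: `provision`, `CapacityError`, and `QuotaError` are hypothetical names invented for illustration, and a real SDK will surface these conditions through its own error codes. The idea is to try the preferred SKU in every zone before falling back to the next SKU, and to treat quota errors as a different problem that moving elsewhere may not fix.

```python
import itertools
import time
from typing import Callable, Iterable, Tuple


class CapacityError(Exception):
    """Hypothetical: the provider has no room for this SKU in this zone."""


class QuotaError(Exception):
    """Hypothetical: the account's limit is exhausted."""


def provision_with_fallback(
    provision: Callable[[str, str], str],  # hypothetical: (zone, sku) -> VM id
    zones: Iterable[str],
    skus: Iterable[str],
    attempts_per_target: int = 2,
    base_delay: float = 0.0,
) -> Tuple[str, str, str]:
    """Try each SKU (in preference order) across every zone.

    Returns (vm_id, zone, sku) for the first target that succeeds.
    Capacity errors trigger a retry with exponential backoff, then a
    move to the next target. Quota errors are not caught here, so they
    propagate to the caller immediately.
    """
    last_error = None
    for sku, zone in itertools.product(skus, zones):
        for attempt in range(attempts_per_target):
            try:
                return provision(zone, sku), zone, sku
            except CapacityError as err:
                last_error = err
                time.sleep(base_delay * (2 ** attempt))
    raise CapacityError(f"no capacity in any target; last error: {last_error}")
```

Ordering the loop by SKU first keeps the workload on the preferred instance type for as long as any zone can supply it. Swapping the arguments to `itertools.product` would instead favor staying in the preferred zone.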

We close on the business side: the price of headroom, when commitments make sense, and simple pipeline and monitoring checks so “no capacity” errors fail fast instead of 30 minutes into a deploy.
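A pipeline check along these lines could run before any resources are created. The sketch below is an assumption-heavy illustration: `Request`, `preflight`, and both input dictionaries are hypothetical, and how you populate them (quota APIs, SKU availability listings) depends on your provider. Note that a SKU being offered in a zone does not guarantee capacity at deploy time, so this catches only the cheap, knowable failures early.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Request:
    """One line of a deployment plan (hypothetical shape)."""
    sku: str
    zone: str
    count: int


def preflight(
    requests: Iterable[Request],
    quota_remaining: Dict[str, int],         # assumed: sku -> instances still allowed
    offered: Dict[Tuple[str, str], bool],    # assumed: (sku, zone) -> currently offered
) -> List[str]:
    """Return a list of problems; an empty list means the plan may proceed."""
    problems: List[str] = []
    needed_by_sku: Dict[str, int] = {}
    for req in requests:
        needed_by_sku[req.sku] = needed_by_sku.get(req.sku, 0) + req.count
        # Capacity-side check: is the SKU offered in that zone at all?
        if not offered.get((req.sku, req.zone), False):
            problems.append(f"capacity: {req.sku} is not available in {req.zone}")
    # Quota-side check: does the account limit cover the total ask per SKU?
    for sku, needed in needed_by_sku.items():
        remaining = quota_remaining.get(sku, 0)
        if needed > remaining:
            problems.append(f"quota: {sku} needs {needed}, only {remaining} left")
    return problems
```

Failing the pipeline when `preflight` returns a non-empty list turns a late deploy failure into an immediate, labeled one, and the `capacity:` versus `quota:` prefixes tell you whether to pick another zone or SKU, or to request a limit increase.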