Question 1

Why is Claude rate-limiting users in 2026?

Accepted Answer

Anthropic's revenue run-rate went from $9B to $30B in four months (end of 2025 to March 2026), while capacity buildout takes 1-2 years. The result is acute compute scarcity. In late March 2026 Anthropic capped Pro/Max session usage during weekday peak hours (5-11am Pacific) and moved Claude Code's prompt-cache TTL from 1 hour back to 5 minutes. API uptime over the 90 days ending April 8 was 98.95%, well below the 99.99% commonly cited as a benchmark for cloud services.
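To put the uptime figures in concrete terms, here is a minimal sketch that converts an uptime percentage into implied downtime. The function name is illustrative, and the only inputs are the 98.95%, 99.99%, and 90-day figures quoted above.

```python
def downtime_hours(uptime_percent: float, window_days: int) -> float:
    """Hours of downtime implied by an uptime percentage over a window."""
    total_hours = window_days * 24
    return total_hours * (1 - uptime_percent / 100)


# Figures quoted in the answer above.
observed = downtime_hours(98.95, 90)
benchmark = downtime_hours(99.99, 90)

print(f"98.95% over 90 days: {observed:.1f} hours of downtime")
print(f"99.99% over 90 days: {benchmark * 60:.0f} minutes of downtime")
```

Under these numbers the gap is roughly 22.7 hours of downtime versus about 13 minutes over the same 90 days.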

Question 2

Why are GPU rental prices spiking?

Accepted Answer

Demand is outrunning supply. As of April 2026 the spot rental price for an Nvidia Blackwell GPU was $4.08 per hour, up 48% in two months. CoreWeave raised prices by more than 20% at the end of 2025 and now requires 3-year contracts from smaller customers. Bank of America analysts expect demand to outstrip supply through at least 2029.
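The answer gives the current price and the percentage increase but not the earlier price. A small sketch can back it out, assuming the 48% is measured against the price two months earlier. The function name is illustrative.

```python
def price_before_increase(current_price: float, pct_increase: float) -> float:
    """Price before a percentage increase, given the price after it."""
    return current_price / (1 + pct_increase / 100)


current = 4.08  # USD per GPU-hour, from the answer above
earlier = price_before_increase(current, 48)

print(f"Implied price two months earlier: ${earlier:.2f}/hour")
print(f"Increase per GPU-hour: ${current - earlier:.2f}")
```

This implies a starting point of about $2.76 per hour, so the rise is roughly $1.32 per GPU-hour.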

Question 3

How does electricity scarcity affect local AI?

Accepted Answer

Asymmetrically, and in local AI's favor. The PJM region's capacity auction prices jumped 9.3x year-over-year, and PJM is projecting supply shortages by 2027 if data center demand keeps growing. Planned data centers totaling 56 GW will bypass the grid entirely with on-site generation. None of this affects a Mac running a 30B model at 30-50W under load. Local AI is a hedge against centralized constraints; if those constraints bind harder, the hedge becomes more valuable, not less.
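For a sense of scale, the 30-50W figure can be turned into annual energy and cost. This is a rough sketch: the 8 hours per day under load and the $0.17/kWh rate are illustrative assumptions, not figures from this FAQ, so substitute your own usage and tariff.

```python
HOURS_PER_DAY = 8    # assumption: hours per day the machine is under load
USD_PER_KWH = 0.17   # assumption: residential electricity rate


def annual_kwh(watts: float, hours_per_day: float = HOURS_PER_DAY) -> float:
    """Energy used in a year at a constant power draw."""
    return watts / 1000 * hours_per_day * 365


def annual_cost_usd(watts: float,
                    hours_per_day: float = HOURS_PER_DAY,
                    usd_per_kwh: float = USD_PER_KWH) -> float:
    """Yearly electricity cost for that energy at a flat rate."""
    return annual_kwh(watts, hours_per_day) * usd_per_kwh


for watts in (30, 50):  # the range quoted in the answer above
    print(f"{watts} W: {annual_kwh(watts):.1f} kWh/year, "
          f"${annual_cost_usd(watts):.2f}/year")
```

Under these assumptions the cost lands at roughly $15 to $25 per year, which is in line with the $25/year figure given in Question 5.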

Question 4

When does cloud compute capacity catch up?

Accepted Answer

Anthropic's announced 1 GW of Google TPU capacity comes online 'starting 2027.' Anthropic CEO Dario Amodei has publicly said data centers take 1-2 years to build. The implication is that the rate-limiting and consumption-billing posture is structural for at least 12-24 more months.

Question 5

What's the rational hedge for a developer right now?

Accepted Answer

Buy a Mac with 48-64 GB unified memory once; pay $25/year in electricity; run open-weight models in the 8B-70B range that increasingly close the quality gap with frontier APIs for coding, drafting, and most agentic workflows. Use cloud APIs for the truly hard tasks where local quality lags. The mix shifts in local's favor every month the cloud rationing continues.
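Whether a model in the 8B-70B range fits in 48-64 GB depends mostly on how its weights are quantized. The sketch below estimates weight memory only. It ignores the KV cache and runtime overhead, and the 75% usable-memory fraction is an illustrative assumption, not a figure from this FAQ or from Apple.

```python
USABLE_FRACTION = 0.75  # assumption: share of unified memory left for weights


def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights in GB (weights only)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, memory_gb: float) -> bool:
    """True if the weights fit within the assumed usable share of memory."""
    needed = weight_memory_gb(params_billion, bits_per_weight)
    return needed <= memory_gb * USABLE_FRACTION


for params in (8, 30, 70):
    for bits in (4, 8):
        size = weight_memory_gb(params, bits)
        print(f"{params}B @ {bits}-bit: {size:.0f} GB weights, "
              f"fits 48 GB: {fits(params, bits, 48)}, "
              f"fits 64 GB: {fits(params, bits, 64)}")
```

On this estimate a 70B model needs about 35 GB at 4-bit and about 70 GB at 8-bit, so the top of the range is only practical on these machines with aggressive quantization.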

Compute is constrained.
Power is constrained.
The cloud is rationing you.

