The sunset of cheap vibe coding

George, Chief Technology Officer

Published April 23, 2026

7 min read

Key takeaways

Token economics broke the flat plan

API costs scale with context size. A stuck agent can burn 30,000–75,000 tokens across 15 useless retries, and heavy agentic use can exceed a month of $20 subscription revenue in a few days.

Everyone is repricing

Cursor moved to credit pools tied to raw API cost. Replit went to effort-based billing. Anthropic pushed Claude Code into Max ($100/month) and above.

The Month 3 Problem is real

Vibe-coded prototypes feel great for weeks. Then you try to add a feature and discover the codebase has no structure anyone can hold in their head.

Senior engineers got more valuable, not less

AI dropped the floor on who can ship something. It raised the ceiling on what good architectural judgment is worth. Refactoring vibe-coded debt usually costs more than building it right the first time.

Compute is the line item nobody budgets for

Modeling across UK firms shows every £1 spent on AI licenses pulls £1.80 of compute, storage, and network spend in year one. By year three, it's £3.20.

Vibe coding had a good run. For about 18 months, $20 a month bought you an autonomous agent that could scaffold a feature, debug a flaky test, and ship something usable by Friday. The pitch wasn't subtle: real software development, priced like a Spotify plan.

That math was never going to hold. Anthropic, Cursor, and Replit have all pushed heavy usage behind higher tiers, and the people surprised by it weren't looking at the meter. The $20 plan was a customer-acquisition price propped up by venture capital and underpriced compute. Now the bill is closer to what these systems actually cost to run.

Two things drive the real cost. The first is context bloat: a session's context window fills up as the conversation goes on, so each call costs more than the last. The second is the agent loop tax: when an agent gets stuck, it keeps trying variations of the same failed attempt with the meter running the whole time.

The agent loop tax

Context bloat is the slow leak. The agent loop tax is the burst pipe. When a human engineer hits a test failure, they stop and think. When an autonomous agent hits one, it tries another variation of the same approach, then another, with no real sense that it's going in circles (morphllm.com).

Each retry feeds the failed code and the new error trace back into the context window. The context keeps growing while the agent gets nowhere. Reviews of failed sessions show models burning 30,000 to 75,000 tokens across 15 useless iterations before someone steps in (morphllm.com). You pay for the entire context window on every call, so the 15th attempt costs a lot more than the first. A boring refactor can turn into a $50 API bill because the model didn't know it was stuck (morphllm.com, devteam.space).
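The compounding described above can be sketched in a few lines of Python. This is a minimal model, not a billing calculator: the 2,000-token starting context and the 300 tokens added per retry are hypothetical values chosen so the total lands inside the reported 30,000–75,000 range, and the sketch counts input tokens only, ignoring output tokens and any prompt caching.

```python
def retry_loop_tokens(base_context, added_per_retry, retries):
    """Return (input tokens billed per call, cumulative input tokens)."""
    per_call = []
    context = base_context
    for _ in range(retries):
        per_call.append(context)    # the whole context is billed on every call
        context += added_per_retry  # failed code and the new error trace are appended
    return per_call, sum(per_call)


# Hypothetical session: 2,000-token starting context, 300 tokens added per retry.
per_call, total = retry_loop_tokens(base_context=2_000, added_per_retry=300, retries=15)
print(per_call[0], per_call[-1], total)  # 2000 6200 61500
```

Under these assumptions the 15th call bills about three times the input of the first, and more than half of the 61,500 tokens come from re-sending context the agent had already seen.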

The table below shows what that looks like over a 30-day billing cycle: a flat $20 of subscription revenue on one side, cumulative compute cost that grows faster than linearly on the other.

Billing day	Monthly subscription revenue	Cumulative autocomplete cost	Cumulative agentic workflow cost	Margin status
Day 1	$20.00	$0.15	$4.50	Profitable
Day 5	$20.00	$0.75	$32.00	Loss / subsidized
Day 15	$20.00	$2.25	$145.00	Severe loss
Day 30	$20.00	$4.50	$350.00+	Unsustainable

Autocomplete is cheap and stays cheap. Agentic workflows exceed the $20 of subscription revenue inside a week. That's the force pulling pricing up across the industry. Nobody chose to make things more expensive. The numbers chose for them.
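To make the table's arithmetic explicit, here is a small Python sketch that recomputes the margin at each checkpoint. The figures are copied from the table; the function names are illustrative, and the Day 30 agentic cost is listed as "$350.00+", so it is treated as a lower bound.

```python
MONTHLY_REVENUE = 20.00

# (billing day, cumulative autocomplete cost, cumulative agentic cost) from the table.
# The Day 30 agentic figure is a lower bound ("$350.00+").
ROWS = [(1, 0.15, 4.50), (5, 0.75, 32.00), (15, 2.25, 145.00), (30, 4.50, 350.00)]


def agentic_margin(rows, revenue=MONTHLY_REVENUE):
    """Revenue minus cumulative agentic cost at each checkpoint."""
    return [(day, round(revenue - agentic, 2)) for day, _, agentic in rows]


def average_daily_agentic_cost(rows):
    """Cumulative agentic cost per elapsed day; rising values mean faster-than-linear growth."""
    return [(day, round(agentic / day, 2)) for day, _, agentic in rows]


def first_loss_day(rows, revenue=MONTHLY_REVENUE):
    """First listed checkpoint where cumulative agentic cost exceeds revenue."""
    for day, _, agentic in rows:
        if agentic > revenue:
            return day
    return None


print(agentic_margin(ROWS))  # [(1, 15.5), (5, -12.0), (15, -125.0), (30, -330.0)]
print(first_loss_day(ROWS))  # 5
```

The average daily agentic cost rises from $4.50 on Day 1 to roughly $11.67 by Day 30, which is the faster-than-linear growth the table illustrates.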

How Cursor and Replit changed their pricing

Once token costs got hard to ignore, the big platforms moved to hybrid models: a baseline subscription plus credits that get drawn down by actual API spend. The goal is to keep something that looks predictable for customers while not eating losses on power users (flexprice.io, wpbrigade.com).

Cursor's messy switch to credit pools

Cursor is one of the most popular AI-native IDEs, and its transition was rough. Before mid-2025, Pro users got a flat allowance of “fast requests” per month — predictable, easy to reason about (vantage.sh, getaiperks.com). Once people started routing those requests to frontier models like Claude 3.5 Sonnet on multi-file agentic tasks, the math stopped working.

In June 2025, Cursor swapped request quotas for usage-based credits tied to raw API costs (vantage.sh, getaiperks.com, flexprice.io). The $20 Pro plan stayed, but now you got $20 of compute credits, not a guaranteed number of requests. The rollout went badly. Tasks that used to count as one request started draining wallets in minutes. Cursor issued public apologies and refunds (vantage.sh).

The current tiers tell the story. Pro at $20 is for light users. Pro+ is $60 a month with $70 of credits. Ultra is $200 a month with $400 of API usage (getaiperks.com, uibakery.io, nocode.mba). The messaging around Ultra is what's interesting: it isn't sold as a productivity subscription. It's sold as infrastructure for people doing AI-native development full time.
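One way to read those tiers is as credits bought per subscription dollar. The sketch below uses only the prices and credit amounts quoted above; the helper names are hypothetical, and it deliberately ignores overage pricing, which this article does not specify.

```python
# (monthly price in USD, included credits in USD) as quoted above.
TIERS = {"Pro": (20, 20), "Pro+": (60, 70), "Ultra": (200, 400)}


def credits_per_dollar(tiers=TIERS):
    """Included credits divided by subscription price, per tier."""
    return {name: credits / price for name, (price, credits) in tiers.items()}


def cheapest_covering_tier(expected_api_spend, tiers=TIERS):
    """Cheapest tier whose included credits cover the spend, or None if none does."""
    covering = [(price, name) for name, (price, credits) in tiers.items()
                if credits >= expected_api_spend]
    return min(covering)[1] if covering else None


print(credits_per_dollar())         # Pro: 1.0, Pro+: about 1.17, Ultra: 2.0
print(cheapest_covering_tier(50))   # Pro+
print(cheapest_covering_tier(100))  # Ultra
```

On these numbers, Ultra buys twice the credits per dollar that Pro does, which fits the framing of Ultra as infrastructure rather than a productivity subscription.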

Replit and effort-based billing

Replit went a different direction. In 2025–2026 it rolled out effort-based billing alongside an aggressive autonomous Agent (wpbrigade.com, launchpad.io). You pay a baseline subscription (Core is $25/month) that comes with a matching pool of usage credits.

Replit doesn't bill on tokens. It bills on the computational “effort” the AI spends (wpbrigade.com, launchpad.io). That is harder to predict than it sounds. The Agent doesn't just generate code; it audits, builds, verifies, and iterates. Building and auditing each consume credits, so work that used to cost $0.50 can run $3.00 before the first prompt is even finished (launchpad.io, vitara.ai).

Turn on Turbo Mode and the burn rate jumps (wpbrigade.com). Run out of credits and the platform charges overages automatically, with the financial risk landing on the developer. The product is genuinely impressive — idea to deployment in one browser tab is real — but for teams that lean on it heavily, the bill arrives with surprises (launchpad.io, vitara.ai).
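The credit-pool mechanics can be sketched as a simple drawdown with overage. This is an illustration, not Replit's actual billing logic: it assumes the Core plan's $25 comes with $25 of credits, that overages are charged at face value, and that every task costs a flat amount. Amounts are in cents to avoid floating-point rounding.

```python
def monthly_bill(task_costs_cents, subscription_cents=2500, included_credit_cents=2500):
    """Return (total bill, overage) in cents for one month of usage.

    Assumes overage is charged at face value once included credits run out.
    """
    usage = sum(task_costs_cents)
    overage = max(0, usage - included_credit_cents)
    return subscription_cents + overage, overage


# 20 tasks at the old $0.50 price stay inside the pool...
print(monthly_bill([50] * 20))   # (2500, 0)
# ...while the same 20 tasks at $3.00 each more than double the bill.
print(monthly_bill([300] * 20))  # (6000, 3500)
```

With $3.00 tasks, the included pool in this sketch covers only eight tasks before automatic overages start.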


The Month 3 Problem

Token costs and subscription tiers are the loud part of the story. The expensive part is the technical debt these tools generate at machine speed, and the human time it takes to clean up (ibagroupit.com, reddit.com).

Inside venture studios and startups, this has a name: the Month 3 Problem (reddit.com). The first month feels great. A non-technical founder or junior developer types prompts into Bolt, Lovable, or Replit and gets demo-ready applications back in days instead of weeks. The shipping speed is real (reddit.com).

The architecture underneath is usually a mess. AI is good at writing a function. It is much worse at keeping a whole application coherent as it grows (ibagroupit.com, reddit.com). By month three, adding a small feature feels disproportionately hard, and edge cases turn into multi-day chases.

The codebase becomes spaghetti. People end up arguing with a chatbot about why a tweak to the auth module broke three unrelated UI components (reddit.com). The founder has shipped an application they no longer understand well enough to safely change. They built it, but they can't really own it.

The numbers on cleanup are bleak. One product manager tracked the economics on a vibe-coded prototype: 43 hours of back-and-forth with an AI to reach a demo, which works out to about $6,450 of labor (reddit.com). The architecture was so unsound that a senior engineer had to rewrite the whole thing — and did it in three hours, using AI as a tool rather than a substitute.
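The labor figure implies an hourly rate, and a quick calculation puts the rewrite in the same terms. The 43 hours, the $6,450, and the three hours come from the account above; pricing the senior engineer's time at the same implied rate is an assumption made here for comparison, and a real senior rate would likely be higher.

```python
def implied_hourly_rate(total_cost, hours):
    """Hourly rate implied by a total labor cost and the hours worked."""
    return total_cost / hours


def rewrite_comparison(prototype_hours=43, prototype_cost=6_450, rewrite_hours=3):
    """Return (implied rate, rewrite cost at that rate, combined cost) in USD."""
    rate = implied_hourly_rate(prototype_cost, prototype_hours)
    rewrite_cost = rewrite_hours * rate  # assumption: same hourly rate for the rewrite
    return rate, rewrite_cost, prototype_cost + rewrite_cost


print(rewrite_comparison())  # (150.0, 450.0, 6900.0)
```

At the implied $150 an hour, the three-hour rewrite costs about 7% of what the prototype did.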

That pattern shows up over and over. AI lowers the floor on who can ship something. It raises the ceiling on what good engineering judgment is worth. Whole consultancies now exist to rescue and refactor vibe-coded applications, and the cost of remediation often exceeds what building it correctly would have cost in the first place (reddit.com).

Security, compliance, and energy

The bill for vibe coding doesn't stop at API invoices and engineering time. It also includes a security problem and an energy problem, both growing fast.

Security people are nervous, and they have reason to be (infosecurity-magazine.com). A developer can now produce thousands of lines of code in an afternoon. Nobody has the human bandwidth to audit it at that speed. If your agentic workflows hit third-party APIs hosted overseas, you also have a compliance problem under the EU AI Act and GDPR — particularly in regulated industries where data sovereignty matters (ibagroupit.com).

The energy side is becoming a real number. Processing one token takes roughly 0.4 joules (clune.org). Run a few coding agents in parallel across a team and the continuous electricity draw is comparable to a major household appliance running 24/7.
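The conversion from joules per token to a continuous load is simple: watts are joules per second, so power equals tokens per second times joules per token. The sketch below uses the 0.4 J figure cited above; the 1,000 W appliance and the resulting throughput are hypothetical illustrations, not measurements.

```python
JOULES_PER_TOKEN = 0.4  # figure cited above


def continuous_power_watts(tokens_per_second, joules_per_token=JOULES_PER_TOKEN):
    """Average electrical power for a sustained token throughput."""
    return tokens_per_second * joules_per_token


def tokens_per_second_for(power_watts, joules_per_token=JOULES_PER_TOKEN):
    """Sustained throughput needed to draw a given average power."""
    return power_watts / joules_per_token


def daily_energy_kwh(power_watts):
    """Energy used in 24 hours at a constant power draw."""
    return power_watts * 24 / 1000


# Hypothetical: what aggregate throughput matches a 1,000 W appliance?
print(tokens_per_second_for(1_000))  # about 2,500 tokens per second
print(daily_energy_kwh(1_000))       # 24.0 kWh per day
```

Under that assumption, a team whose agents collectively process about 2,500 tokens per second around the clock would draw roughly 24 kWh a day.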

The International Energy Agency projects that by 2026, AI and its supporting data centers will use about as much electricity as all of Japan (wustl.edu). Generative AI alone is on track to consume ten times the energy it used in 2023, mostly because of the larger context windows and multi-agent orchestration that make modern coding agents possible.

The Alan Turing Institute modeled what this looks like for UK firms. For every £1 spent on AI licenses, the firm spends another £1.80 on compute, storage, and network in year one (360strategy.co.uk). By year three, that ratio rises to £3.20. The license fee is a small part of what AI actually costs to run.
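Applied to a budget, those ratios look like this. The multipliers are the two quoted above; the £100,000 license spend is a hypothetical example, and the sketch assumes the year-three ratio applies to that year's license spend. No year-two figure is given, so none is interpolated.

```python
# Infrastructure spend per £1 of license spend, by year, as quoted above.
INFRA_PER_LICENSE_POUND = {1: 1.80, 3: 3.20}


def total_cost(license_spend, year):
    """License plus compute, storage, and network spend for a year with a known ratio."""
    infra = license_spend * INFRA_PER_LICENSE_POUND[year]  # KeyError for years without data
    return license_spend + infra


def license_share(year):
    """Fraction of total spend that is the license fee itself."""
    return 1 / (1 + INFRA_PER_LICENSE_POUND[year])


# Hypothetical firm spending £100,000 a year on AI licenses.
for year in (1, 3):
    print(year, round(total_cost(100_000, year)), f"{license_share(year):.0%}")
# 1 280000 36%
# 3 420000 24%
```

In this example the license fee is about 36% of total spend in year one and about 24% by year three.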

Where this leaves us

The end of cheap vibe coding isn't a step backward. It's the bill arriving. The idea that anyone could rent a software team for $20 a month was always going to break. Billions in venture capital and underpriced compute were holding it up, and subsidies like that don't last forever.

Anthropic, GitHub, Cursor, and Replit are now pricing closer to what these systems actually cost to run. Agentic coding is still a useful technology. It can compress weeks into hours when it works. But it works by orchestrating large, energy-hungry compute clusters, and someone has to pay for those clusters.

The unlimited-loops era is done, for enterprises and solo developers alike. What works now looks more like normal engineering with AI inside it: budget discipline, careful context management, and a senior human in the loop who can tell when the generated code is actually correct. The model writes a lot of code. Knowing whether it should ship is the part that still needs a human.

Frequently asked questions

What is the agent loop tax?

It's the compounding cost when an agent gets stuck and keeps trying near-identical fixes. Every retry adds the failed code and the new error to the context window, so each call costs more than the last. A stuck agent can burn 30,000–75,000 tokens across 15 useless iterations before anyone steps in.

Why are AI coding tools getting more expensive?

The $20 plans were propped up by venture capital. Agentic workflows use far more compute than chat, and a heavy user can run up more than $20 of actual API cost in a few days. Anthropic, Cursor, and Replit have all moved heavy usage behind credit pools, effort-based billing, or higher tiers.

What is the Month 3 Problem?

It's the pattern where vibe-coded prototypes feel great for the first month, then turn painful to extend by month three. AI writes individual functions well but doesn't keep the whole application coherent as it grows. Founders end up with a codebase they shipped but can't safely change.

Is vibe coding still worth it?

For prototypes and throwaway experiments, yes. It breaks when the code becomes load-bearing — real users depending on code no engineer has reviewed or owns. The unlimited-loops era is over. What works now is AI used by senior engineers, not in place of them.