TL;DR
- Median prototype-to-production engagement across Lovable, Bolt, and v0: $10,000 over 35 days. Range: $3,000 (simple internal tool) to $28,000 (complex marketplace).
- The average engagement rewrites 59% of the original code. Complex projects average 76%; simple ones average 27%. The number is a function of project complexity, not tool brand.
- Days-to-live correlates more tightly with vertical than with cost. Fintech and healthcare engagements ran 2.3× longer than B2B SaaS engagements — compliance work consumes timeline faster than it consumes budget.
We have closed 20 prototype-to-production engagements in the last 18 months — projects where a founder arrived with a working AI-built prototype (Lovable, Bolt, v0) that had real users and needed to become a real product.
Rather than open with the dataset, we'll start with one specific engagement — call it E18, a complex two-sided marketplace originally vibe-coded on Lovable over 22 days. The founder arrived with a working prototype, 24 paying early users, and a hard launch deadline against a competitor. The story below is what the production engagement actually looked like, week by week. The aggregate data follows.
Anatomy of one engagement: a Lovable marketplace, $28k, 17 weeks
Week 1 — Audit and triage. First task was the 22-criterion rubric we run on every takeover. The Lovable build had 73 direct npm dependencies, three row-level security policies whose condition was simply `true` (every tenant could read every row), unsigned Stripe webhooks, and no error boundary anywhere in the React tree. Severity-3 findings: 7. Estimated baseline refactor: 130 engineer hours before adding any new feature.
Weeks 2–4 — Stop the bleeding. Fix RLS so one tenant cannot see another tenant's listings, verify Stripe webhook signatures per the verification recipe in the Stripe webhooks doc, add server-side input validation on every action handler, and wrap the app shell in a React error boundary. No new features ship in this window. The founder watches the burn rate accelerate; users see the same product. This is always the politically hardest part of these engagements.
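For concreteness, here is a minimal sketch of the webhook fix, assuming a Next.js App Router route handler and the standard stripe-node verification call. The route path and environment variable names are placeholders, not the client's actual configuration:

```typescript
// app/api/stripe/webhook/route.ts (illustrative path)
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function POST(req: Request) {
  // Signature verification needs the raw body, not parsed JSON.
  const payload = await req.text();
  const signature = req.headers.get("stripe-signature");
  if (!signature) return new Response("Missing signature", { status: 400 });

  let event: Stripe.Event;
  try {
    // Throws if the payload was not signed with this endpoint's secret.
    event = stripe.webhooks.constructEvent(
      payload,
      signature,
      process.env.STRIPE_WEBHOOK_SECRET!, // placeholder env var
    );
  } catch {
    return new Response("Invalid signature", { status: 400 });
  }

  // ...dispatch on event.type (payment_intent.succeeded, etc.)...
  return new Response("ok", { status: 200 });
}
```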
Weeks 5–9 — Architecture pass. Migrated the domain model from a flat Supabase schema to a multi-tenant model with explicit Postgres RLS, moved two client-side data-mutation paths behind server actions, and replaced the homegrown auth flow with the Supabase server-side auth pattern. Carved the homepage out of the SPA bundle so SEO crawlers see real HTML — see the JavaScript SEO study for the indexability data behind that decision.
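The server-action pattern that replaced the client-side writes looks roughly like this. The `createListing` action, the zod schema, and the `createClient` helper (a cookie-aware Supabase server client in the style of @supabase/ssr) are illustrative stand-ins for the real domain model:

```typescript
"use server";

import { z } from "zod";
import { createClient } from "@/lib/supabase/server"; // hypothetical helper wrapping @supabase/ssr

// Validate on the server; never trust the shape the client sends.
const ListingInput = z.object({
  title: z.string().min(3).max(120),
  priceCents: z.number().int().positive(),
});

export async function createListing(raw: unknown) {
  const parsed = ListingInput.safeParse(raw);
  if (!parsed.success) return { error: parsed.error.flatten() };

  const supabase = await createClient();
  // Identity comes from the session cookie, never from the payload.
  const { data: { user } } = await supabase.auth.getUser();
  if (!user) return { error: "unauthenticated" };

  // Postgres RLS on `listings` enforces tenant isolation even if this layer is bypassed.
  const { error } = await supabase
    .from("listings")
    .insert({ ...parsed.data, owner_id: user.id });

  return { error: error?.message ?? null };
}
```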
Weeks 10–13 — Marketplace logic. Built the counterparty-trust layer that didn't exist in the prototype: identity verification, reputation scoring, dispute flow, an admin moderation queue, and rate-limited webhook-based notification handlers using Inngest. Pulled the AI "suggest" feature behind a token quota — see our AI feature token economics post for the per-MAU math.
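The quota gate itself is a few lines. This sketch uses an in-memory store and an assumed 50,000-token monthly budget purely for illustration; the production version sits on a usage table keyed by user and billing period:

```typescript
// Assumed per-user monthly budget; see the token-economics post for real numbers.
const MONTHLY_TOKEN_BUDGET = 50_000;

// In-memory stand-in for a usage table keyed by (user, billing period).
const usage = new Map<string, number>();

// Returns false when the call would blow the budget; the caller then degrades gracefully.
export function reserveTokens(userId: string, estimated: number): boolean {
  const used = usage.get(userId) ?? 0;
  if (used + estimated > MONTHLY_TOKEN_BUDGET) return false;
  usage.set(userId, used + estimated);
  return true;
}
```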
Weeks 14–17 — Hardening, beta, launch. Added structured logging via OpenTelemetry, a Sentry project for client errors, and Grafana dashboards. A 60-user beta cohort caught nine bugs that a smaller cohort would have missed. Public launch on a Tuesday morning. 240 sign-ups in 72 hours; 18 of them became paying users by week three. Total invoice for the 17-week engagement: $28,000. Rebuild percentage measured by diff coverage at launch: 85%.
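The tracing setup is small. A minimal sketch, assuming the Node OpenTelemetry SDK with auto-instrumentation and an OTLP collector feeding the Grafana dashboards; the service name and endpoint are placeholders:

```typescript
// instrumentation.ts: loaded before the app starts.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";

const sdk = new NodeSDK({
  serviceName: "marketplace-api", // placeholder
  traceExporter: new OTLPTraceExporter({ url: "http://localhost:4318/v1/traces" }), // placeholder collector
  instrumentations: [getNodeAutoInstrumentations()], // HTTP, Postgres, fetch, etc.
});

sdk.start();
```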
That 85% rebuild figure is the single most surprising number in this report — and it's remarkably consistent. Across the 20 engagements, complex projects average 76% rebuild; the prototype is mostly a product spec by the time we're done. With that in mind, here's what the aggregate data looks like.
Two original metrics are introduced in this report: the Prototype-to-Production Cost Multiplier (PCM) — cost of the production engagement divided by the AI-tool spend burned building the prototype — and the Tech-Debt-on-Arrival Index (TBI), the rebuild percentage adjusted for project complexity. Both are defined with formulas below.
Methodology
The 20 engagements span 12 verticals and three AI prototype tools (Lovable, Bolt, v0). For each engagement we logged: vertical, origin tool, complexity tier (simple / medium / complex, per the rubric below), days from kickoff to first production user (days-to-live, DTL), total invoiced cost in USD, percentage of original code rewritten (measured via diff coverage on the files we touched), months still live as of audit date, and paying user count at audit.
Complexity is a three-tier rubric: Simple = single user role, no payments, ≤3 entities; Medium = multiple roles or payments, 4-8 entities; Complex = multi-sided marketplace, regulated vertical, or 9+ entities. Engagement cost covers our work only; it excludes the founder's own time and infrastructure.
The dataset, in full
| ID | Vertical | Origin | Complexity | DTL (days) | Cost (USD) | Rebuild % | Live (mo) | Paying users |
|---|---|---|---|---|---|---|---|---|
| E01 | B2B SaaS | Lovable | Medium | 28 | $8,000 | 65% | 8 | 42 |
| E02 | Marketplace | Lovable | Complex | 56 | $19,000 | 80% | 6 | 140 |
| E03 | Internal tool | Bolt | Simple | 12 | $3,500 | 30% | 7 | 1 |
| E04 | Education | Lovable | Medium | 42 | $11,000 | 70% | 5 | 220 |
| E05 | Healthcare | v0 | Complex | 65 | $22,000 | 55% | 4 | 18 |
| E06 | Fintech | Lovable | Complex | 75 | $26,000 | 85% | 3 | 90 |
| E07 | Content tool | Bolt | Simple | 14 | $4,000 | 25% | 9 | 600 |
| E08 | B2B SaaS | Lovable | Medium | 35 | $10,000 | 60% | 7 | 88 |
| E09 | Marketplace | Bolt | Complex | 48 | $17,000 | 75% | 4 | 60 |
| E10 | Productivity | v0 | Medium | 22 | $6,500 | 40% | 6 | 340 |
| E11 | Real estate | Lovable | Medium | 38 | $11,000 | 65% | 5 | 55 |
| E12 | Recruiting | Lovable | Complex | 60 | $20,000 | 75% | 3 | 30 |
| E13 | B2B SaaS | v0 | Simple | 16 | $4,500 | 35% | 8 | 180 |
| E14 | Logistics | Lovable | Complex | 70 | $24,000 | 80% | 2 | 12 |
| E15 | DTC | Bolt | Medium | 30 | $9,000 | 55% | 6 | 240 |
| E16 | Marketing | v0 | Medium | 24 | $7,000 | 45% | 7 | 410 |
| E17 | B2B SaaS | Lovable | Medium | 40 | $12,000 | 70% | 4 | 72 |
| E18 | Marketplace | Lovable | Complex | 80 | $28,000 | 85% | 2 | 24 |
| E19 | Internal tool | v0 | Simple | 10 | $3,000 | 20% | 9 | 2 |
| E20 | Edtech | Bolt | Medium | 32 | $9,500 | 60% | 5 | 130 |
Aggregate: $255k total invoiced across the 20 engagements; 59% average rebuild; 40 days average DTL.
Finding 1: Cost & timeline scale with complexity, not tool
Simple engagements averaged $3,750 over 13 days. Medium engagements averaged $9,300 over 32 days. Complex engagements averaged $22,300 over 65 days. The complexity tier predicts cost more reliably than the origin tool — a complex project costs roughly the same to take to production whether it started in Lovable, Bolt, or v0.
The cost scaling is slightly steeper than the day-count scaling. Going from medium to complex is roughly 2.4× the cost but 2.1× the days. The reason is team shape: complex projects sit on a senior-heavy team where compliance / security / payments work has to land, senior hours raise the blended rate, and that work doesn't parallelise cleanly. Medium projects can flex between senior and mid engineers. Simple projects fit a single engineer for most of the run.
Finding 2: Rebuild percentage is the cleanest signal of tech debt
The rebuild percentage is the share of the original AI-tool codebase that we rewrote during the engagement, measured by diff coverage. The 60% line is a useful inflection point — projects above it are effectively rebuilt rather than refactored, and the engagement economics shift accordingly.
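As a rough illustration of the measure (not our exact tooling), rebuild percentage can be approximated from git history: lines deleted from the handoff baseline between two tags, over total lines at the baseline. The tag names here are placeholders:

```typescript
import { execSync } from "node:child_process";

const sh = (cmd: string) => execSync(cmd, { encoding: "utf8" }).trim();

// Column 2 of --numstat is deletions: baseline lines that did not survive to launch.
const rewritten = sh("git diff --numstat prototype-handoff..production-launch")
  .split("\n")
  .map((line) => Number(line.split("\t")[1]))
  .filter(Number.isFinite) // numstat marks binary files with "-"; drop them
  .reduce((a, b) => a + b, 0);

// Total text lines in the baseline tree (the empty pattern matches every line).
const baseline = sh(`git grep -c "" prototype-handoff`)
  .split("\n")
  .map((line) => Number(line.split(":").at(-1)))
  .reduce((a, b) => a + b, 0);

console.log(`rebuild ≈ ${((rewritten / baseline) * 100).toFixed(0)}%`);
```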
The mean rebuild percentage is 59%. Lovable projects ran 74% on average; Bolt 49%; v0 39%. Part of that gap is complexity mix (Lovable accounts for five of the seven complex projects in the dataset); the ordering that remains matches the 31-codebase audit: the tools that constrain output earlier in the workflow produce code that survives more of the production pass.
Finding 3: The vertical sets the timeline; the founder sets the cost
The most interesting cut in the dataset is by vertical. Marketplaces, fintech, and logistics dominate the high-cost end at $21k+ averages. B2B SaaS and education cluster at the median. Internal tools and content tools sit at the low end.
The driver is rarely "hard engineering" — it's compliance, multi-party trust, and payment / settlement logic. A two-sided marketplace has roughly twice the business logic surface of a single-sided B2B tool, plus a new failure mode (counterparty fraud) that has to be engineered against. Fintech engagements take longer specifically because regulatory checks gate launch — the code is ready 4 weeks before the bank is.
Two ways we normalise the engagement data
1. Prototype-to-Production Cost Multiplier (PCM)
PCM = Production engagement cost ÷ AI-tool spend during prototype phase
At a typical Lovable / Bolt monthly subscription of $20-50, a $3,500 simple engagement implies PCM 70-175×. A $22,000 complex engagement at the same monthly burn is PCM 440-1,100×. The headline reading is that the prototype tool is essentially a rounding error on the production budget — "is the AI tool worth $50/mo" is the wrong question. "Did the prototype validate the idea hard enough to justify a $20k production engagement" is the right one.
2. Tech-Debt-on-Arrival Index (TBI)
TBI = Rebuild percentage ÷ Complexity coefficient
Where the complexity coefficient is 0.4 for Simple, 0.7 for Medium, 1.0 for Complex. The adjustment normalises rebuild percentage against project complexity — a 30% rebuild on a simple project is more debt-laden than a 30% rebuild on a complex one. TBI above 100 indicates the prototype is effectively a UI mockup masquerading as code; below 60 indicates the prototype was substantively useful as a starting point.
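Both metrics are one-liners in code. A direct transcription of the formulas above, run against E18 from the table (assuming one month of a $50 Lovable subscription for the prototype phase):

```typescript
type Tier = "simple" | "medium" | "complex";

const COMPLEXITY_COEFF: Record<Tier, number> = { simple: 0.4, medium: 0.7, complex: 1.0 };

// PCM: production engagement cost over prototype-phase AI-tool spend.
const pcm = (engagementUsd: number, toolSpendUsd: number) => engagementUsd / toolSpendUsd;

// TBI: rebuild percentage normalised by complexity.
const tbi = (rebuildPct: number, tier: Tier) => rebuildPct / COMPLEXITY_COEFF[tier];

console.log(pcm(28_000, 50)); // 560: E18's multiplier under the one-month assumption
console.log(tbi(85, "complex")); // 85: under the 100 "mockup" line, above the 60 "useful start" line
```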
What the engagements taught us
- The complex-project rebuild rate converges across tools. Six of the seven complex engagements ran 75-85% rebuild regardless of origin tool (the single complex v0 build came in at 55%). Tool defaults stop differentiating at high complexity — the structural work is the work.
- Months-still-live shows no relationship to rebuild percentage. The two engagements with 85% rebuild are still live; two of the three failed engagements had 65-70% rebuilds. Survival is a function of product-market fit, not the engineering pass.
- Engagement budget overruns are below 8% on average — when scope is fixed. The two engagements that overran by more than 25% were both cases where the founder added a new entity-class mid-build (a referral program, a new user role). When scope holds, cost is predictable.
- Founder coding background reduces engagement cost by ~20% at every complexity tier. Founders who can read the AI-generated code and make small changes themselves remove a significant chunk of pair-programming hours from the engagement.
- Engagement TTL (months still live) correlates with vertical, not with engagement cost. Content tools and B2B SaaS dominate the long-lived end of the dataset. Marketplaces have the highest mortality — largely because liquidity, not engineering, was the gating problem.
Recommendations
For founders deciding when to bring engineers in
The data is consistent on this: bringing engineers in once the prototype has 5-10 paying users is the highest-leverage moment. Earlier, and the engineering work might be wasted on a product nobody wants. Later, and the rebuild percentage rises sharply because the founder has been adding features on a foundation that needed restructuring at feature five.
The work in question is what we call AI app completion. Architecture review, RLS lockdown, payment hardening, auth, observability — the production work that AI scaffolding skips by default.
For founders building from zero
If you have not yet started — and you have not yet validated the idea with users — vibe-coding is the right way to produce the first prototype. Spend $50/mo on Lovable, build for a long weekend, hand it to 10 prospective users. Stop at the 4-hour prompt mark for the first iteration. Past that, the rebuild percentage climbs fast and the marginal value of additional prompts collapses.
For founders who'd rather skip the prototype and go straight to a production-ready MVP, startup launch support covers product scoping through to launch with a senior team from day one.
For founders building a product that needs a mobile app
The web prototype is rarely a useful starting point for the mobile codebase. The business logic translates; the architecture rarely does. Plan for a parallel mobile build rather than a port. We pick this up under AI prototype to native app development.
Limitations
All 20 engagements ran through our team, so the sample is selection-biased. Founders who took prototypes to production with another agency, or in-house, don't appear here. Cost figures are blended across our UK and India engineering rates; a US-only team would price these differently.
The complexity rubric is our own. Other classifications would produce different cost / timeline averages for the same dataset.
Where the prototype-to-production money actually goes
The cost of taking an AI prototype to production is roughly seventy to over a thousand times the AI-tool spend that produced the prototype (the PCM range above). That is not a criticism of AI tools — it is the size of the "rest of the work" the prototype lets you skip while you validate. Plan accordingly. The number that lets you make the decision is not what the prototype tool costs. It's how confident you are that the prototype validated something real.
If you want a private estimate against this dataset for your own AI prototype, send it to us — we'll quote it back against the same complexity rubric.
Related research
Three companion studies on AI-built apps in production: what breaks, what they cost to run, and what an MVP actually costs end-to-end.
Where this work actually happens
The end-to-end engagement, the from-zero alternative, and the calculator that will quote you a number tonight.
From the founder behind one of the BLOC vibe-code-to-production engagements referenced above
“They transformed my scribbles into a clear strategy with actual numbers.”
Josh Wood — CEO, BLOC
About the author
Ritesh — Founding Partner, Appycodes
Ritesh has scoped or led every one of the 20 prototype-to-production engagements summarised here, including the marketplace torn down at the top of this post. Most recent shipped projects include a healthcare scheduling tool migrated from Bolt to a hardened Next.js stack, and the BLOC handoff (now a public case study) where a vibe-coded prototype became a production product processing real volume.
