Why didn't you start with credits from day one?

Because in March 2023, subscription tiers were the default model for SaaS products, and we didn't question it. The problem with subscriptions only became obvious as we added more AI models with wildly different per-token costs. A Boss Mode subscriber paying a fixed monthly fee while running up expensive Claude API calls was losing us money. Credits aligned cost with actual usage, but we needed two years of billing data to see that clearly.

How does per-model pricing work with credits?

Each AI model has a credit cost based on its actual API pricing. A GPT-4o request costs fewer credits than a Claude Opus request because the underlying token costs differ. The LLM proxy calculates credit cost at request time based on input tokens, output tokens, and the model's rate card. Users see their credit balance decrease in real time. This transparency was impossible with subscription tiers, where usage was hidden behind a fixed monthly payment.
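
As a rough illustration of that calculation, here is a minimal sketch. It assumes a rate card expressed as tokens per credit and rounds the result up to a whole credit. The model names, the numbers, and the `creditCost` function are hypothetical, not the production rate card.

```typescript
// Hypothetical rate card: how many tokens one credit buys for a given model.
interface RateCard {
  inputTokensPerCredit: number;
  outputTokensPerCredit: number;
}

// Illustrative numbers only; these are not real prices.
const RATE_CARDS: Record<string, RateCard> = {
  "cheap-model": { inputTokensPerCredit: 1000, outputTokensPerCredit: 250 },
  "premium-model": { inputTokensPerCredit: 200, outputTokensPerCredit: 50 },
};

// Credit cost of one request, rounded up to a whole credit (an assumption).
function creditCost(model: string, inputTokens: number, outputTokens: number): number {
  const card = RATE_CARDS[model];
  if (!card) throw new Error(`No rate card for model: ${model}`);
  const raw =
    inputTokens / card.inputTokensPerCredit +
    outputTokens / card.outputTokensPerCredit;
  return Math.ceil(raw);
}
```

With these made-up rates, the same 2,000-token prompt and 500-token reply costs 4 credits on the cheap model and 20 on the premium one, which is the kind of gap users see in their balance.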

What happens when credits run out mid-conversation?

The agent completes the current response, then the system notifies the user that their credits are depleted. We don't cut off mid-stream -- that would be a terrible experience. Users can purchase more credits immediately or switch to a free-tier model if available. The free tier exists to keep users engaged even when credits are exhausted, which was a lesson from subscription churn: users who hit a wall leave. Users who get a degraded experience stay and eventually upgrade.

From Subscriptions to Credits

March 2023. Our first Stripe integration was 200 lines of code. Create a checkout session, listen for the webhook, flip a boolean on the user record. Subscribed or not subscribed. That was the entire billing system.

Three years and two complete rewrites later, we run a credits-based pay-as-you-go system with per-model pricing, free-tier detection, admin credit grants, and Google Play IAP for mobile. The journey from “is this user subscribed” to “does this user have enough credits for this specific model” is a story about how AI products break traditional SaaS billing assumptions.

Version 1: Direct Stripe (March 2023)

Dashboard v1 launched with the simplest possible billing. Stripe Checkout handled the payment flow. A webhook listener caught checkout.session.completed events and updated the user’s subscription status in MongoDB. A middleware checked the subscription status on every API request.
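
The shape of that first version can be sketched in a few lines. This is an illustration, not the original 200 lines: the in-memory `users` map stands in for MongoDB, the event type is a trimmed-down subset of Stripe's event object, and mapping the session to a user through `client_reference_id` is an assumption. Webhook signature verification is omitted for brevity.

```typescript
// Trimmed-down event shape; the real Stripe event object has many more fields.
interface CheckoutEvent {
  type: string;
  data: { object: { client_reference_id: string | null } };
}

// Stand-in for the MongoDB user collection (hypothetical).
const users = new Map<string, { subscribed: boolean }>();

// Webhook listener: flip the boolean when checkout completes.
function handleWebhook(event: CheckoutEvent): void {
  if (event.type !== "checkout.session.completed") return;
  const userId = event.data.object.client_reference_id;
  if (!userId) return;
  const user = users.get(userId);
  if (user) user.subscribed = true;
}

// Middleware check: subscribed or not subscribed, nothing else.
function isAllowed(userId: string): boolean {
  return users.get(userId)?.subscribed === true;
}
```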

The subscription tiers were straightforward:

Free: 50 model requests per day, 2 assistants, 3 image generations, 12 Pro Search queries
Starter: Everything in Free plus access to GPT-4-turbo and GPT-4o
Boss Mode: All models including Claude, DeepSeek R1, and 20+ OpenRouter models

This worked for six months. Then it started breaking.

The problem was economics. A Boss Mode subscriber paying a fixed monthly fee could consume unlimited Claude API calls. Claude’s token pricing at the time made some conversations cost us more than the monthly subscription fee. We were subsidizing power users with revenue from casual users, and the power users were growing faster.

The tier boundaries were also artificial. Why should access to Claude require Boss Mode? The cost difference between GPT-4o and Claude per request was knowable and specific. Bundling them into tiers hid that cost from users and from us.

Version 2: External billing API (July 2023)

Four months after launch, we extracted billing into its own service. billing-api became a standalone Node.js service that handled all Stripe operations, subscription management, and usage tracking. Communication happened through RabbitMQ – when a user signed up, the auth service published a message. When they subscribed, billing-api published a confirmation that other services consumed.

The motivation was multi-product support. Autopilot needed billing too. Running separate Stripe integrations in each product meant separate customer records, separate webhook handlers, and separate subscription logic. billing-api centralized all of that.

This architecture was technically sound and operationally painful. Here’s what went wrong.

RabbitMQ message ordering. When a user subscribed and immediately started using premium features, there was a race condition. The billing confirmation message might not have reached the dashboard service before the user’s first premium request. We added a synchronous fallback – if the local subscription cache was stale, the dashboard made a direct gRPC call to billing-api. That defeated the purpose of async messaging for the most common use case.
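
A minimal sketch of that fallback, under stated assumptions: "stale" here means missing or older than `maxAgeMs`, and `fetchFromBilling` is a hypothetical stand-in for the direct call to billing-api. The class and method names are illustrative.

```typescript
interface CachedStatus {
  active: boolean;
  fetchedAt: number; // milliseconds since epoch
}

class SubscriptionCache {
  private entries = new Map<string, CachedStatus>();
  private fetchFromBilling: (userId: string) => Promise<boolean>;
  private maxAgeMs: number;
  private now: () => number;

  constructor(
    fetchFromBilling: (userId: string) => Promise<boolean>,
    maxAgeMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.fetchFromBilling = fetchFromBilling;
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  // Called by the message consumer when a billing confirmation arrives.
  recordMessage(userId: string, active: boolean): void {
    this.entries.set(userId, { active, fetchedAt: this.now() });
  }

  // Fast path: fresh cache entry. Slow path: synchronous call to billing.
  async isActive(userId: string): Promise<boolean> {
    const cached = this.entries.get(userId);
    if (cached && this.now() - cached.fetchedAt <= this.maxAgeMs) {
      return cached.active;
    }
    const active = await this.fetchFromBilling(userId);
    this.entries.set(userId, { active, fetchedAt: this.now() });
    return active;
  }
}
```

A brand-new subscriber has no cache entry yet, so their first premium request always takes the slow path. That is why the fallback ended up on the most common route.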

Stripe webhook complexity. The billing-api service had to handle every Stripe webhook event type in the subscription lifecycle: invoice.paid, invoice.payment_failed, customer.subscription.updated, customer.subscription.deleted, checkout.session.completed. Each event could arrive out of order, multiple times, or not at all. Idempotency handling alone was hundreds of lines of code.
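
To show where those lines go, here is a small sketch of the two core checks: skip events already seen, and ignore events older than the state already applied. It relies on the `id` and `created` fields that Stripe events carry. The flattened `subscriptionId` and `status` fields, the in-memory stores, and the `applyEvent` function are simplifications for illustration. Events that never arrive need a separate reconciliation pass, which this sketch does not cover.

```typescript
// Simplified event: real Stripe events nest the subscription under data.object.
interface BillingEvent {
  id: string;             // Stripe event id
  type: string;
  created: number;        // seconds since epoch, from the event's `created` field
  subscriptionId: string; // flattened for this sketch
  status: string;         // flattened for this sketch
}

// In-memory stand-ins for persistent storage (hypothetical).
const processedEventIds = new Set<string>();
const subscriptions = new Map<string, { status: string; lastEventAt: number }>();

function applyEvent(event: BillingEvent): "applied" | "duplicate" | "stale" {
  // Duplicate delivery: the same event id was already handled.
  if (processedEventIds.has(event.id)) return "duplicate";
  processedEventIds.add(event.id);

  // Out-of-order delivery: a newer event has already been applied.
  const current = subscriptions.get(event.subscriptionId);
  if (current && current.lastEventAt > event.created) return "stale";

  subscriptions.set(event.subscriptionId, {
    status: event.status,
    lastEventAt: event.created,
  });
  return "applied";
}
```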

Cross-service debugging. When a user reported “I subscribed but I don’t have access,” the investigation spanned three services, two message queues, and Stripe’s event log. Resolution time for billing issues tripled.

We lived with this architecture for a year and a half. It worked. It was stable. It was nobody’s favorite part of the system.

The corporate subscription experiment (October 2024)

In October 2024, we added a Corporate tier with a different acquisition model. Instead of individual checkout, organizations could auto-enroll users based on email domain. Everyone at @company.com got Corporate access with 500,000 credits per month.

This was our first exposure to credits as a concept. Corporate accounts had credits. Consumer accounts had subscription tiers. The two systems coexisted awkwardly. The codebase had conditional logic everywhere: if (user.isCorporate) { checkCredits() } else { checkSubscription() }.

The Corporate tier taught us two things. First, credits were a better mental model for users. Corporate users understood “you have 500K credits, each model costs X credits per request” better than “you have Boss Mode access.” Second, mixing credits and subscriptions in the same codebase was a maintenance nightmare.

Version 3: Pay-as-you-go credits (March 2025 - February 2026)

The transition to credits happened in two phases.

Phase 1 (March 2025): Credits added alongside subscriptions. Users could buy credit packs on top of their subscription. This was the “migration” phase – existing subscribers kept their plans, but credits were available as a top-up. The LLM proxy started tracking per-request credit costs, which gave us data we’d never had: actual cost per user per model per request.

That data was revealing. We discovered that 80% of our API costs came from 12% of users. The subscription model meant those users paid the same as everyone else. Credits would have captured the actual cost difference.

Phase 2 (February 2026): Credits only. The commit message was explicit: feat: pay-as-you-go billing refactor -- gate by credits, not subscription. Subscription tiers were replaced entirely. Model selection required credits. The LLM proxy calculated credit cost at request time based on token counts and per-model pricing.

The implementation changes were substantial:

Per-model credit pricing in the LLM proxy. Every model has a credit rate card: input tokens per credit, output tokens per credit. When a user selects Claude Sonnet 4.5, the proxy estimates the credit cost before execution and deducts after completion. The estimate handles edge cases like tool calls that generate additional tokens.
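
The estimate-then-deduct flow can be sketched as follows. Several details are assumptions made for illustration: the estimate is padded by a fixed percentage to leave room for tool calls, and a balance that would go negative is clamped at zero because the response is allowed to finish. All names are hypothetical.

```typescript
interface Account {
  balance: number;
  depleted: boolean;
}

// Pad the base estimate so tool calls that add tokens are less likely to overshoot.
// The 20% default is an arbitrary illustrative value.
function estimateWithMargin(baseCost: number, marginPercent: number = 20): number {
  return Math.ceil((baseCost * (100 + marginPercent)) / 100);
}

// Gate before execution: only start if the estimate fits in the balance.
function canStart(account: Account, estimatedCost: number): boolean {
  return account.balance >= estimatedCost;
}

// Deduct after completion using the actual cost. The request is never cut off
// mid-stream, so the balance is clamped at zero (an assumption of this sketch).
function settle(account: Account, actualCost: number): Account {
  const balance = Math.max(0, account.balance - actualCost);
  return { balance, depleted: balance === 0 };
}
```

Separating the gate from the settlement keeps the check cheap at request time while still charging for what was actually consumed.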

Free-tier detection. Users without credits can still use a limited set of models. The system checks whether the selected model is free-tier-eligible and skips credit deduction for qualifying requests. This keeps new users engaged without requiring immediate payment.
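
A sketch of that check, assuming free-tier eligibility is a simple allow-list of model names and that eligible requests never deduct credits, even when the user has a balance. The model name and the `authorize` function are hypothetical.

```typescript
// Hypothetical allow-list of free-tier-eligible models.
const FREE_TIER_MODELS = new Set<string>(["free-model-a"]);

type Decision =
  | { allowed: true; charge: boolean }
  | { allowed: false; reason: string };

function authorize(model: string, balance: number, estimatedCost: number): Decision {
  // Free-tier models run without touching the balance.
  if (FREE_TIER_MODELS.has(model)) return { allowed: true, charge: false };

  // Paid models need enough credits to cover the estimate.
  if (balance >= estimatedCost) return { allowed: true, charge: true };

  return { allowed: false, reason: "insufficient credits" };
}
```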

Admin credit grant endpoint. An API endpoint that lets administrators add credits to any user account by email. Used for promotions, customer support resolutions, and beta testing programs. Simple feature, high utility.

Real-time balance display. The UI shows credit balance and per-request cost estimates before the user hits send. No more surprise bills. No more “I thought my plan included Claude.”

The LikeClaw billing variant

When we built the LikeClaw standalone variant in February 2026, we had to make billing work without the external billing-api service. LikeClaw runs as a self-contained deployment – no dependency on shared infrastructure.

The solution was local Stripe billing with credits only. The implementation used inline price_data in Stripe Checkout sessions instead of pre-created Stripe products. This meant we didn’t need to maintain a product catalog in Stripe’s dashboard. Each credit purchase creates a one-time checkout session with the credit amount and price calculated on the fly.
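
To make the inline `price_data` approach concrete, here is a sketch that builds the parameters for a one-time Checkout session. The field names follow Stripe's Checkout Session create parameters. The price per thousand credits, the product name, the redirect URLs, the metadata keys, and the `buildCreditCheckout` function are assumptions for illustration.

```typescript
// Subset of Stripe's Checkout Session create parameters used in this sketch.
interface CheckoutParams {
  mode: "payment";
  line_items: Array<{
    quantity: number;
    price_data: {
      currency: string;
      unit_amount: number; // smallest currency unit (cents for USD)
      product_data: { name: string };
    };
  }>;
  metadata: Record<string, string>;
  success_url: string;
  cancel_url: string;
}

function buildCreditCheckout(
  userId: string,
  credits: number,
  centsPerThousandCredits: number, // hypothetical pricing input
  baseUrl: string,
): CheckoutParams {
  if (!Number.isInteger(credits) || credits <= 0) {
    throw new Error("credits must be a positive integer");
  }
  // Price is computed on the fly; no product or price exists in the Stripe dashboard.
  const unitAmount = Math.round((credits * centsPerThousandCredits) / 1000);
  return {
    mode: "payment",
    line_items: [
      {
        quantity: 1,
        price_data: {
          currency: "usd",
          unit_amount: unitAmount,
          product_data: { name: `${credits} credits` },
        },
      },
    ],
    // Metadata values must be strings; the webhook can read them back to grant credits.
    metadata: { userId, credits: String(credits) },
    success_url: `${baseUrl}/billing/success`,
    cancel_url: `${baseUrl}/billing/cancel`,
  };
}
```

The returned object would be passed to `stripe.checkout.sessions.create(...)` in the Stripe Node library. Building it in a pure function keeps the pricing arithmetic testable without network calls.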

Google Play IAP was added for mobile users. The server verifies purchase tokens through Google’s API, grants credits, and tracks the transaction. Same credit system, different payment rail.

Why credits beat subscriptions for AI products

The fundamental problem with subscription tiers for AI products is that usage costs vary by orders of magnitude. A user who sends ten GPT-4o-mini messages a day costs us pennies. A user who runs twenty Claude Opus conversations with tool execution costs us dollars. Putting both users on the same $20/month plan is a pricing fiction.

Credits solve this by making cost proportional to usage. Heavy users pay more because they consume more. Light users pay less because they don’t. The economics are transparent for everyone.

Per-model pricing takes this further. When a user sees that a Claude Opus request costs 5x more credits than a GPT-4o-mini request, they make informed choices. Some conversations justify the premium model. Others don’t. That cost awareness was impossible under subscription tiers where all models were “included.”

The user behavior shift was measurable. Under subscriptions, users defaulted to the most expensive model available. Why wouldn’t they, when it was included? Under credits, users match model capability to task complexity. Quick questions go to cheaper models. Complex analysis goes to expensive models. Average per-user costs dropped while user satisfaction (measured by retention) stayed flat.

Three architectures, three lessons

Version 1 taught us: billing needs to be a first-class concern, not a boolean. “Subscribed or not” doesn’t capture the reality of AI product economics.

Version 2 taught us: distributed billing adds latency, debugging complexity, and message ordering headaches. Centralize only when you genuinely need multi-product support, and understand the operational cost.

Version 3 taught us: align your pricing model with your cost model. When usage costs are variable and per-request, pricing should be variable and per-request. Credits aren’t just a billing mechanism. They’re a communication tool that tells users what things actually cost.

If we could go back to March 2023, we’d ship credits from day one. Not because we were wrong to start with subscriptions – we needed to ship fast. But every month we ran subscriptions, we accumulated users whose expectations were set by a pricing model we’d eventually change. That migration was harder than building credits from scratch. The billing system is never done. But for the first time in three years, we have one that matches how our product actually works.
