Quiet Failures in Usage Metering

Engineering 16 Feb 2026 5 min. read

Usage metering usually does not fail with a big crash. It fails in a way that is much worse: it looks fine.

Everything keeps working. Requests return 200. Jobs keep running. Your graphs still move. And yet, somewhere in the background, the numbers start drifting away from reality.

It happens in boring places:

Two requests arrive at the same time and both pass the limit check.
A client retries after a timeout and the same action is counted twice.
A queue job fails halfway through and nobody returns the quota.
Overage gets written twice because an integration ran again.

None of these show up as a dramatic error. They show up later, as a customer complaint you cannot confidently answer.

The first time you feel it, it is never a log line. It is a message like this:

"Why am I blocked? We still have quota."

Or:

"Can you explain this invoice, item by item?"

And if you cannot explain it, you cannot defend it. That is the moment usage metering stops being an engineering detail and turns into a business risk.

At Moneo, we have spent years building in fintech and insurtech environments where the system is expected to be correct under stress. Not mostly correct. Correct when traffic spikes, when integrations retry, when background jobs fail, and when money is involved. We brought that mindset into SaaS usage metering and built Laravel Usage Limiter. We open-sourced it so teams can inspect the hard parts, improve them, and avoid relearning the same painful lessons in production.

The real problem is not counting, it is being right

Most products start with a counter and a plan table. That is fine until the day you scale.

The usual failure patterns are painfully predictable:

Race condition: Two parallel requests read the same usage value, both think there is room, and both increment. You overshoot the limit.
Double counting: A client retries after a timeout. Your endpoint runs twice. Usage is counted twice. Sometimes billing is too.
Phantom usage: A job crashes. The work never completes. The quota is still consumed.
Duplicate overage: Overage is recorded twice because a webhook was delivered again or a worker restarted mid-flow.

And because none of this is a clean crash, it is hard to notice early. Your app keeps working. But the numbers stop being reliable.

Once you monetize usage, you are running something that behaves like a financial system. The numbers have to be defensible.

What we built: Laravel Usage Limiter

Laravel Usage Limiter is a production-grade, metric-agnostic usage metering and enforcement engine for Laravel.

It is built for teams that want three things without duct tape:

Atomic concurrency safety so limits do not overshoot under traffic
Idempotency so retries do not turn into double counting
Pricing flexibility so you can evolve plans without rewriting the core

It lets you track any resource, enforce any limit, and bill in prepaid, postpaid, or hybrid ways.

Metric-agnostic by design

A metric is just a string. That is the point.

ai_tokens, api_calls, execution_minutes, storage_mb, events, seats, projects, or whatever your business decides next month. The engine does not care what the metric represents. It only cares about enforcing rules correctly.

That keeps the engine stable even when your product changes fast: a new metric is a new string, not a change to the engine.

Billing account aggregation that matches multi-tenant SaaS

All usage is tracked per billing_account_id. That single dimension keeps things consistent across teams and workspaces:

If your billing entity is a team, the billing account belongs to that team.
If you sell per user, you create one billing account per user.
If you have organizations and workspaces, the billing account sits at the organization level and everyone shares the same quota.

Your app can reshape its team, user, or workspace models without touching the metering engine, because the engine only ever sees the billing account.
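To make the idea concrete, here is a minimal sketch in Python (chosen for illustration only; it is not the package's PHP API). It assumes usage is keyed by the pair of billing account and metric string, and that a hypothetical lookup table maps workspaces to the account that pays for them.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Tuple

# Hypothetical mapping: two workspaces owned by one organization-level billing account.
WORKSPACE_ACCOUNT: Dict[str, int] = {"ws-design": 42, "ws-backend": 42}

# Usage is keyed by (billing_account_id, metric); the metric is an opaque string.
usage: DefaultDict[Tuple[int, str], int] = defaultdict(int)


def record(workspace: str, metric: str, amount: int) -> None:
    """Attribute usage to the billing account that owns the workspace."""
    account_id = WORKSPACE_ACCOUNT[workspace]
    usage[(account_id, metric)] += amount


record("ws-design", "ai_tokens", 1200)
record("ws-backend", "ai_tokens", 300)
record("ws-backend", "storage_mb", 50)
```

Both workspaces draw from the same `ai_tokens` total because they resolve to the same billing account. Nothing in `record` knows what a token or a megabyte is.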

Atomic enforcement: the part that saves you later

The most common metering bug is simple: check, then increment.

Under concurrency, that breaks.

Laravel Usage Limiter uses an atomic pattern where the database performs the check and reserve as a single operation. Two requests cannot both claim the last available unit. That is the difference between limits that feel correct and limits that randomly embarrass you.
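One common way to get this behavior is a single conditional UPDATE, sketched below with Python and SQLite for illustration. The table and column names are hypothetical, and this is not a copy of the package's actual query; it only shows the shape of the technique.

```python
import sqlite3


def try_reserve(conn: sqlite3.Connection, account_id: int, metric: str, amount: int) -> bool:
    """Reserve `amount` units only if the limit would not be exceeded.

    The WHERE clause makes the check and the increment one statement, so the
    database decides atomically; there is no separate read to race against.
    """
    cur = conn.execute(
        """
        UPDATE usage_aggregates
           SET reserved = reserved + ?
         WHERE billing_account_id = ?
           AND metric = ?
           AND committed + reserved + ? <= usage_limit
        """,
        (amount, account_id, metric, amount),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows updated means the reservation was denied


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_aggregates ("
    " billing_account_id INTEGER, metric TEXT,"
    " committed INTEGER, reserved INTEGER, usage_limit INTEGER)"
)
conn.execute("INSERT INTO usage_aggregates VALUES (1, 'api_calls', 98, 1, 100)")

first = try_reserve(conn, 1, "api_calls", 1)   # claims the last available unit
second = try_reserve(conn, 1, "api_calls", 1)  # nothing left to claim
```

Here `first` succeeds and `second` is denied, even though both calls asked for the same last unit.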

This is the kind of detail we tend to care about in fintech-style systems, because once money and trust are involved, best effort is not good enough. It is also why our long-term collaboration model works so well in these environments.

Reserve, execute, commit or release

A lot of usage is tied to work that can fail:

An AI call can throw.
A transcode job can crash.
An export can time out.

If you count usage before the work succeeds, you will create phantom usage and angry customers.

The package models a lifecycle that matches reality:

Reserve capacity before work begins
Commit on success
Release on failure

The package provides this through the ExecutionGateway, which wraps your callable so you do not need to hand-roll the control flow every time.

When the callback throws, the reservation is released automatically.
When it succeeds, usage is committed.

This is how you keep numbers honest without turning every feature into a billing project.
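The lifecycle can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the ExecutionGateway itself: `InMemoryLedger` and `run_metered` are hypothetical names, and the in-memory check is not atomic the way a database-backed one would be.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaDenied(Exception):
    """Raised when no capacity can be reserved."""


class InMemoryLedger:
    """Hypothetical stand-in for the real storage layer."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.reserved = 0
        self.committed = 0

    def reserve(self, amount: int) -> None:
        if self.committed + self.reserved + amount > self.limit:
            raise QuotaDenied("limit reached")
        self.reserved += amount

    def commit(self, amount: int) -> None:
        self.reserved -= amount
        self.committed += amount

    def release(self, amount: int) -> None:
        self.reserved -= amount


def run_metered(ledger: InMemoryLedger, amount: int, work: Callable[[], T]) -> T:
    ledger.reserve(amount)      # denied before any expensive work starts
    try:
        result = work()
    except Exception:
        ledger.release(amount)  # failure returns the quota
        raise
    ledger.commit(amount)       # success turns the reservation into usage
    return result


def failing_export() -> str:
    raise TimeoutError("export timed out")


ledger = InMemoryLedger(limit=10)
run_metered(ledger, 3, lambda: "transcoded")
try:
    run_metered(ledger, 5, failing_export)
except TimeoutError:
    pass  # the caller still sees the error; only the quota is restored
```

After both calls, three units are committed and nothing is left reserved: the failed export consumed no quota.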

Idempotency: the retry tax you should not pay

Retries are normal. Your system should act as if the action happened once, even if it arrives twice.

The package supports idempotency keys across critical operations so you can safely retry without mutating state again. If the same key is used, the original result is returned. No side effects, no duplicate records.

This is the difference between a resilient system and one that slowly turns retries into revenue disputes.
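A minimal sketch of the idea, in Python for illustration: the first call with a given key mutates state and stores its result, and any later call with the same key returns the stored result untouched. `IdempotentRecorder` is a hypothetical name, and a production system would keep the keys in a table with a unique constraint rather than in a dictionary.

```python
from typing import Dict


class IdempotentRecorder:
    """Records usage once per idempotency key."""

    def __init__(self) -> None:
        self.total = 0
        self._results: Dict[str, int] = {}

    def record(self, key: str, amount: int) -> int:
        """Return the running total as it stood after this key was first applied."""
        if key in self._results:
            return self._results[key]  # replay: original result, no side effects
        self.total += amount
        self._results[key] = self.total
        return self.total


recorder = IdempotentRecorder()
first = recorder.record("req-8f3a", 5)
retry = recorder.record("req-8f3a", 5)  # client timed out and sent it again
other = recorder.record("req-b210", 2)
```

The retry returns the same value as the first call, and the total reflects the action once.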

Three integration points that keep your codebase sane

One reason metering gets messy is that it spreads everywhere.

Laravel Usage Limiter gives you three clear routes so metering stays consistent and reviewable.

1. `ExecutionGateway`

Use it when there is a real execution phase and you only want to count on success.

Reserve happens first. If it is denied, you do not even start the expensive work.
Commit happens on success.
Release happens on exception.

Great for: AI calls, media processing, and any operation where failure should not consume quota.

2. `EventIngestor`

Use it when there is no execution phase and you simply want to record that something happened.

API calls, inbound events, webhook deliveries, bytes stored.
Supports batch ingestion so you can push multiple metric events in one call.

3. Job Middleware

Queue jobs are a common source of quiet failures:

A worker can restart.
A job can retry.
A job can crash mid-run.

The middleware enforces usage consistently around the job execution. If the job throws, the reservation is released. If it retries, idempotency prevents double counting.

It keeps your job logic clean and keeps billing logic out of random places.
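The combination of release-on-failure and idempotent retries can be sketched as follows. This is Python for illustration, not the package's middleware; `JobMeter` is a hypothetical name, and skipping an already counted job entirely is one possible design choice among several.

```python
from typing import Callable, Dict, Set


class JobMeter:
    """Wraps job handlers so that each job id is counted at most once."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.reserved = 0
        self.committed = 0
        self._counted: Set[str] = set()

    def run(self, job_id: str, amount: int, handler: Callable[[], None]) -> None:
        if job_id in self._counted:
            return  # duplicate delivery of a finished job: neither rerun nor recount
        if self.committed + self.reserved + amount > self.limit:
            raise RuntimeError("quota denied")
        self.reserved += amount
        try:
            handler()
        except Exception:
            self.reserved -= amount  # crash mid-run: give the quota back
            raise
        self.reserved -= amount
        self.committed += amount
        self._counted.add(job_id)


attempts: Dict[str, int] = {"n": 0}


def flaky_job() -> None:
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("worker restarted")


meter = JobMeter(limit=10)
try:
    meter.run("job-7", 4, flaky_job)  # first attempt fails, reservation released
except RuntimeError:
    pass
meter.run("job-7", 4, flaky_job)      # retry succeeds and is counted
meter.run("job-7", 4, flaky_job)      # redelivery is ignored
```

Three deliveries of the same job end with four units committed, not eight or twelve.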

Hard and soft enforcement, because the business decides

Sometimes you need a hard stop at the limit. Sometimes you want to allow usage but warn and record overage.

Mode	Behavior
Hard enforcement	Denies reservations that would exceed the effective limit.
Soft enforcement	Allows them but returns a warning decision, records overage, and fires events so you can notify the customer and bill correctly.

Both are useful. The important part is that the behavior is consistent and predictable.
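The two modes differ only in what happens once a request would cross the limit. The Python sketch below illustrates that branch; the `Decision` shape and the `decide` function are hypothetical and are not taken from the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    warning: bool
    overage: int  # units of this request that fall beyond the limit


def decide(mode: str, used: int, limit: int, amount: int) -> Decision:
    """Evaluate one reservation request under hard or soft enforcement."""
    excess = min(amount, max(0, used + amount - limit))
    if excess == 0:
        return Decision(allowed=True, warning=False, overage=0)
    if mode == "hard":
        return Decision(allowed=False, warning=False, overage=0)
    if mode == "soft":
        return Decision(allowed=True, warning=True, overage=excess)
    raise ValueError(f"unknown enforcement mode: {mode}")


hard = decide("hard", used=95, limit=100, amount=10)
soft = decide("soft", used=95, limit=100, amount=10)
```

With 95 of 100 units used and a request for 10, hard enforcement denies the request, while soft enforcement allows it, flags a warning, and reports 5 units of overage.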

Prepaid, postpaid, hybrid: pricing that can evolve

Pricing changes more often than engineers wish. Wallets, invoicing, included allowances, negotiated deals, credits, top-ups.

Laravel Usage Limiter separates pricing into policies so your billing model can evolve without rewriting enforcement and metering:

Prepaid: Debits a wallet on commit and denies usage when the wallet cannot afford it.
Postpaid: Accumulates overages for end-of-period invoicing.
Hybrid: Gives an included allowance with overflow handled by prepaid or postpaid.

And if your pricing needs are special, the architecture is designed to be extended through contracts and configuration.
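As a rough illustration of how the three modes differ at commit time, consider the Python sketch below. All names are hypothetical, prices are in integer cents to avoid rounding issues, and the hybrid variant shown here sends overflow to the prepaid wallet, which is only one of the two options described above.

```python
from dataclasses import dataclass


@dataclass
class Account:
    included: int            # units included in the plan for the period
    used: int = 0
    wallet_cents: int = 0
    overage_units: int = 0


def commit_prepaid(acct: Account, units: int, unit_price_cents: int) -> bool:
    """Debit the wallet on commit; deny when it cannot afford the usage."""
    cost = units * unit_price_cents
    if cost > acct.wallet_cents:
        return False
    acct.wallet_cents -= cost
    acct.used += units
    return True


def commit_postpaid(acct: Account, units: int) -> None:
    """Always allow; accumulate anything beyond the allowance for invoicing."""
    acct.used += units
    acct.overage_units = max(0, acct.used - acct.included)


def commit_hybrid(acct: Account, units: int, unit_price_cents: int) -> bool:
    """Consume the included allowance first, then charge the wallet for the rest."""
    free = max(0, acct.included - acct.used)
    billable = max(0, units - free)
    cost = billable * unit_price_cents
    if cost > acct.wallet_cents:
        return False
    acct.wallet_cents -= cost
    acct.used += units
    return True


acct = Account(included=100, wallet_cents=500)
within_allowance = commit_hybrid(acct, 90, unit_price_cents=10)  # free, wallet untouched
partly_billed = commit_hybrid(acct, 30, unit_price_cents=10)     # 10 free, 20 billed
too_expensive = commit_hybrid(acct, 40, unit_price_cents=10)     # 400 cents > 300 left
```

Because each mode is a separate function behind the same inputs, changing a customer's billing model does not touch the reservation logic.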

Plans, overrides, and the reality of enterprise

Sooner or later, you will have a customer that does not fit your default plans:

Maybe they negotiated extra quota.
Maybe they need a temporary boost.
Maybe you want a VIP exception.

The package supports per-account, per-metric overrides that take precedence over plan defaults. You can adjust a single field without cloning a whole plan.

That keeps the system flexible without making it fragile.
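Field-level overrides can be sketched as a merge where only the fields that are set on the override replace the plan default. The Python below is illustrative; the field names are hypothetical and do not come from the package's schema.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class MetricLimits:
    included: int
    enforcement: str
    unit_price_cents: int


@dataclass(frozen=True)
class Override:
    # None means "inherit from the plan".
    included: Optional[int] = None
    enforcement: Optional[str] = None
    unit_price_cents: Optional[int] = None


def effective_limits(plan: MetricLimits, override: Optional[Override]) -> MetricLimits:
    """Apply a per-account, per-metric override on top of the plan defaults."""
    if override is None:
        return plan
    changes = {
        f.name: getattr(override, f.name)
        for f in fields(override)
        if getattr(override, f.name) is not None
    }
    return replace(plan, **changes)


pro_plan = MetricLimits(included=1000, enforcement="hard", unit_price_cents=2)
negotiated = effective_limits(pro_plan, Override(included=5000))
```

The negotiated account gets 5000 included units while still inheriting the plan's enforcement mode and unit price.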

Events that connect metering to your application

Metering does not exist in isolation. When usage hits a threshold, someone needs to be notified. When a wallet runs low, a payment needs to happen. When overage is recorded, your invoicing system needs to know.

Laravel Usage Limiter dispatches Laravel events at every critical point in the lifecycle:

LimitApproaching fires when usage reaches the warning threshold (default 80%). Wire it to a Slack notification or an in-app banner so customers are never surprised.
LimitExceeded fires when committed usage passes the included amount under soft enforcement.
WalletTopupRequested fires when a prepaid wallet balance drops below the auto-topup threshold. Your listener charges the customer's payment method and credits the wallet; the engine handles the rest.
OverageAccumulated fires when postpaid overage is recorded on commit, so your invoicing pipeline can pick it up.
ReconciliationDivergenceDetected fires when the reconciliation command finds a mismatch between aggregates and reservation records.

Usage lifecycle events (UsageReserved, UsageCommitted, UsageReleased) are also dispatched, so you can build audit trails, analytics, or webhook integrations on top of them.

This is what turns a metering engine into something your application can react to instead of polling.
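The sketch below shows the general pattern of threshold events in Python, for illustration only. The event names come from the list above, but the `EventBus` class, the payload shape, and the assumption that an event fires once on the commit that crosses the line are all hypothetical.

```python
from typing import Callable, Dict, List

Listener = Callable[[dict], None]


class EventBus:
    """Tiny synchronous dispatcher standing in for a framework event system."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = {}

    def listen(self, name: str, listener: Listener) -> None:
        self._listeners.setdefault(name, []).append(listener)

    def dispatch(self, name: str, payload: dict) -> None:
        for listener in self._listeners.get(name, []):
            listener(payload)


def commit_usage(bus: EventBus, used: int, amount: int, limit: int,
                 warn_percent: int = 80) -> int:
    """Commit usage and dispatch an event when a threshold is crossed."""
    new_used = used + amount
    warn_at = limit * warn_percent  # compared in percent units to stay in integers
    if used * 100 < warn_at <= new_used * 100:
        bus.dispatch("LimitApproaching", {"used": new_used, "limit": limit})
    if used <= limit < new_used:
        bus.dispatch("LimitExceeded", {"used": new_used, "limit": limit})
    return new_used


bus = EventBus()
fired: List[str] = []
bus.listen("LimitApproaching", lambda payload: fired.append("approaching"))
bus.listen("LimitExceeded", lambda payload: fired.append("exceeded"))

used = commit_usage(bus, used=70, amount=15, limit=100)    # crosses 80%
used = commit_usage(bus, used=used, amount=5, limit=100)   # already past 80%, silent
used = commit_usage(bus, used=used, amount=20, limit=100)  # crosses the limit
```

Listeners stay small and focused: one sends a notification, another feeds invoicing, and the metering code knows about neither.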

Pluggable architecture: nothing is locked in

One of the most intentional decisions in the package is that every major behavior is behind a contract interface. If the defaults do not fit your business, you swap the implementation instead of forking the package.

Pricing is governed by the PricingPolicy contract. The package ships with prepaid, postpaid, and hybrid policies. If you need credit-based billing, tiered pricing, or anything custom, you implement the interface and register it in the config. Your new mode becomes a first-class citizen immediately.

Enforcement is governed by the EnforcementPolicy contract. Hard and soft are built in, but you can write a grace-period policy, a gradual throttle, or a time-of-day rule. Same pattern: implement, register, use.

Billing periods are governed by the PeriodResolver contract. The package ships with calendar month, weekly, and rolling 30-day resolvers. If your customers have anniversary-based billing cycles, you write your own resolver and point the config at it.

Plan resolution is governed by the PlanResolver contract. If your plans live in Stripe, a feature flag system, or an external API instead of the local database, you replace the default Eloquent resolver with your own.

This means the package can evolve with your business without becoming a bottleneck. The contracts are stable; the implementations are yours to change.
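The "implement, register, use" pattern can be sketched with the anniversary billing case mentioned above. The Python below is an illustration: the `resolve` method, the `RESOLVERS` registry, and its keys are hypothetical and do not reflect the real contract's signature or the package's config format.

```python
import calendar
from datetime import date
from typing import Dict, Protocol, Tuple


def _clamped(year: int, month: int, day: int) -> date:
    """Build a date, clamping the day to the month's length (31 becomes 28 in February)."""
    return date(year, month, min(day, calendar.monthrange(year, month)[1]))


def _shift_month(year: int, month: int, delta: int) -> Tuple[int, int]:
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


class PeriodResolver(Protocol):
    def resolve(self, today: date) -> Tuple[date, date]:
        """Return the half-open period [start, end) containing `today`."""
        ...


class CalendarMonthResolver:
    def resolve(self, today: date) -> Tuple[date, date]:
        start = today.replace(day=1)
        year, month = _shift_month(start.year, start.month, 1)
        return start, date(year, month, 1)


class AnniversaryResolver:
    """Custom resolver: each period starts on the customer's signup day."""

    def __init__(self, anchor_day: int) -> None:
        self.anchor_day = anchor_day

    def resolve(self, today: date) -> Tuple[date, date]:
        start = _clamped(today.year, today.month, self.anchor_day)
        if start > today:  # this month's anchor is still ahead of us
            year, month = _shift_month(today.year, today.month, -1)
            start = _clamped(year, month, self.anchor_day)
        year, month = _shift_month(start.year, start.month, 1)
        return start, _clamped(year, month, self.anchor_day)


# Hypothetical registry playing the role of a config entry.
RESOLVERS: Dict[str, PeriodResolver] = {
    "calendar_month": CalendarMonthResolver(),
    "anniversary_31": AnniversaryResolver(anchor_day=31),
}

period = RESOLVERS["anniversary_31"].resolve(date(2026, 2, 16))
```

For a customer who signed up on the 31st, 16 February 2026 falls in the period from 31 January to 28 February. Callers only depend on `resolve`, so swapping the resolver changes no enforcement code.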

The maintenance side you will eventually need

Even with good patterns, real systems need maintenance. Jobs crash. Networks fail. The world is messy.

Laravel Usage Limiter includes schedulable commands to keep things clean and auditable:

Expire stale reservations that were never committed or released
Reconcile aggregates against reservation records and detect drift, with optional auto-correction
Clean up expired idempotency records to keep the database lean
Recalculate overages from actual usage
Reconcile wallet balances against the transaction ledger

This is not glamorous, but it is what turns metering into something you can trust month after month.
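Two of these tasks, expiring stale reservations and detecting drift, can be sketched as follows. This is Python for illustration with hypothetical record shapes and status names; it shows the kind of check involved, not the package's commands.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Reservation:
    metric: str
    amount: int
    status: str  # "reserved", "committed", "released", or "expired"
    created_at: datetime


def expire_stale(reservations: List[Reservation], now: datetime, ttl: timedelta) -> int:
    """Mark reservations that were never committed or released as expired."""
    expired = 0
    for r in reservations:
        if r.status == "reserved" and now - r.created_at > ttl:
            r.status = "expired"
            expired += 1
    return expired


def find_drift(aggregates: Dict[str, int], reservations: List[Reservation]) -> Dict[str, int]:
    """Return aggregate minus recomputed usage for every metric that disagrees."""
    actual: Dict[str, int] = {}
    for r in reservations:
        if r.status == "committed":
            actual[r.metric] = actual.get(r.metric, 0) + r.amount
    drift: Dict[str, int] = {}
    for metric in set(aggregates) | set(actual):
        diff = aggregates.get(metric, 0) - actual.get(metric, 0)
        if diff != 0:
            drift[metric] = diff
    return drift


now = datetime(2026, 2, 16, 12, 0)
records = [
    Reservation("api_calls", 5, "committed", now - timedelta(hours=2)),
    Reservation("api_calls", 3, "reserved", now - timedelta(hours=2)),     # stale
    Reservation("api_calls", 2, "reserved", now - timedelta(minutes=10)),  # still live
]
expired_count = expire_stale(records, now, ttl=timedelta(hours=1))
drift = find_drift({"api_calls": 7}, records)
```

Here one reservation is expired, and the aggregate of 7 is found to be 2 units above what the committed records support. Whether to auto-correct or only report is a policy decision.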

Where it fits

If your Laravel application meters, limits, or bills for resource consumption, this engine is designed for you:

AI platforms selling tokens
API providers metering endpoint calls
Media processing platforms billing by compute time
IoT and event ingestion systems counting inbound data at scale
Storage providers metering disk usage
Multi-tenant SaaS products tracking seats, projects, builds, or custom metrics

If you are building any of these, you already know: this is not about counting. It is about being right under pressure.

Give it a try

Laravel Usage Limiter is our answer to that problem.

It is built for the exact moments where most homegrown solutions start breaking: real concurrency, real retries, real background jobs, and real billing pressure. The goal is simple: when someone asks “why”, you can point to a system that is consistent, auditable, and fair.

If you are building a Laravel SaaS and you do not want metering to become a constant source of support tickets and revenue uncertainty, take it for a spin, read the code, and use what helps. Since it is open source, you can also shape it with us. Issues, discussions, and pull requests are always welcome. If you need hands-on guidance scaling your SaaS infrastructure, we can help.


Emir Karşıyakalı

Founder & CEO
