Designing a Simple Quota System

AI products are powerful, but they are not free to run. Every prompt, experiment, and evaluation consumes real infrastructure resources behind the scenes. At the same time, teams use LaikaTest very differently. Some users experiment casually, while others run large-scale prompt comparisons across teams every day.

Early on, we made a conscious decision to offer generous free access. That helped teams get value quickly, but it also exposed a real problem: without limits, it becomes difficult to guarantee performance, fairness, and long-term sustainability. Usage limits are not about restriction. They are about ensuring reliability and making sure every team gets a predictable experience as the platform grows.

Our goal was to build limits that guide users toward the right plan as their usage grows, without breaking trust or disrupting workflows.

Why Most Usage Limits Feel Broken

Most quota systems fail at the user experience layer. Actions suddenly stop working. Users see messages like “403 Forbidden” or “Access denied” with no context. Limits feel arbitrary because users do not know what they consumed, what they exceeded, or what to do next.

From an engineering perspective, these systems are often tightly coupled, opaque, and difficult to evolve. From a user’s perspective, they feel like bugs.

We wanted to avoid both outcomes. Our north star was simple: when a limit is hit, the system should explain what happened clearly, preserve the user’s flow, and give them control over the next step.

Starting Point: No Limits at All

In the earliest versions of LaikaTest, we did not enforce any limits. Every organization effectively had unlimited access. This helped us validate the product quickly, but it was not sustainable.

As usage increased, we realized we needed a quota system that could:

Scale with growing traffic
Support multiple plans and pricing tiers
Avoid slowing down core user actions
Remain flexible as plans evolve

This marked the beginning of our quota system design.

First Design: A Brute-Force Database Approach

Our initial approach was straightforward and database-driven.

At the time of any resource creation, the system would:

Fetch subscription plan details from PostgreSQL
Fetch the organization’s active subscription
Fetch the organization’s current usage
Compare usage with allowed limits
Allow or block the action

This worked functionally, but it quickly showed performance issues. Multiple database calls were happening for every create operation. Some of this data changed rarely, while other parts changed frequently. Treating all of it the same was inefficient.

This forced us to step back and rethink.

Observing Access Patterns and Rethinking the Design

We analyzed how often different pieces of data changed:

Subscription plans table: Changes infrequently. Plans and pricing are stable.
Organization subscription mapping: Changes occasionally when a customer upgrades or downgrades.
Organization usage counters: Changes frequently with every prompt, experiment, or team member added.

This pointed to a clear optimization: cache the data that rarely changes, and read only the fast-changing usage counters live.

Introducing Caching Where It Made Sense

We redesigned the system using a hybrid approach with PostgreSQL and Redis.

The final flow looks like this:

Subscription plan metadata is cached in Redis
Organization’s active subscription is cached in Redis
Organization usage counters are fetched directly from PostgreSQL
Limits are computed by comparing cached plan data with live usage data

Redis entries use a TTL of one month.

After implementing this approach, we observed a 66 percent reduction in latency for quota checks.
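
A rough sketch of this hybrid flow, under stated assumptions: `InMemoryCache` is a stand-in that mimics the shape of Redis GET and SETEX so the example runs without a server, the dictionaries stand in for PostgreSQL, and the key names, plan names, and numbers are invented for illustration.

```python
import json
import time

ONE_MONTH_SECONDS = 30 * 24 * 60 * 60


class InMemoryCache:
    """Stand-in for Redis with GET/SETEX-like methods (assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # entry outlived its TTL
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


# Hypothetical "tables" standing in for PostgreSQL.
PLANS = {"free": {"prompts": 50}}
SUBSCRIPTIONS = {"org_1": "free"}
USAGE = {"org_1": {"prompts": 42}}
db_reads = []  # records which table each simulated query touched


def db_read(table_name, table, key):
    db_reads.append(table_name)
    return table[key]


def cached(cache, key, loader):
    """Cache-aside read: try the cache first, fall back to the loader on a miss."""
    raw = cache.get(key)
    if raw is not None:
        return json.loads(raw)
    value = loader()
    cache.setex(key, ONE_MONTH_SECONDS, json.dumps(value))
    return value


def remaining(cache, org_id, resource):
    plan_name = cached(cache, f"org:{org_id}:plan",
                       lambda: db_read("subscriptions", SUBSCRIPTIONS, org_id))
    plan = cached(cache, f"plan:{plan_name}",
                  lambda: db_read("plans", PLANS, plan_name))
    usage = db_read("usage", USAGE, org_id)  # usage is always read live
    return plan[resource] - usage[resource]
```

In this sketch, the first call to `remaining` reads all three tables, and every later call within the TTL reads only the usage table. With a TTL this long, a sketch like this would also need to refresh or delete the cached subscription key whenever an organization changes plans.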

Organization-Level Limits, Not User-Level Limits

Another key design shift came during implementation. Initially, we considered enforcing limits at the user level. However, LaikaTest is fundamentally an organization-centric product. All work happens inside an organization.

We moved all limits to the organization level, which aligned naturally with:

Subscription billing
Team workflows
Shared resources

Figure: End-to-end usage decision flow from resource creation to user outcome

Making the System Flexible for the Future

Limits are defined in a flexible structure that allows:

Adding new resource types
Increasing limits for existing plans
Introducing unlimited tiers
Supporting future add-ons
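
One way such a structure could look is sketched below, assuming Python 3.9 or later. The plan names, resource names, numbers, and the convention that `None` means "unlimited" are all assumptions made for illustration, not our actual schema.

```python
from typing import Optional

# Hypothetical plan definitions. None means "unlimited" for that resource.
PLAN_LIMITS: dict[str, dict[str, Optional[int]]] = {
    "free": {"projects": 2, "prompts": 50, "experiments": 10, "team_members": 3},
    "pro": {"projects": 20, "prompts": 5000, "experiments": 500, "team_members": 25},
    "enterprise": {"projects": None, "prompts": None,
                   "experiments": None, "team_members": None},
}


def effective_limit(plan: str, resource: str,
                    addons: Optional[dict[str, int]] = None) -> Optional[int]:
    """Return the plan's base limit plus any add-on capacity, or None if unlimited."""
    base = PLAN_LIMITS[plan][resource]
    if base is None:
        return None
    return base + (addons or {}).get(resource, 0)


def is_allowed(used: int, limit: Optional[int]) -> bool:
    """An unlimited tier always passes; otherwise usage must be below the limit."""
    return limit is None or used < limit
```

Because limits are plain data here, adding a resource type means adding a key, raising a limit means changing a number, and an add-on is extra capacity layered on top of the base plan.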

Lightweight Enforcement Through Middleware

Before creating any resource such as a project, prompt, experiment, or team member, the system performs a quota check. Because the heavy lifting is done ahead of time through caching and counters, this middleware remains lightweight and fast.
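
As an illustration of the idea, the sketch below wraps a create handler in a quota check using a decorator. The decorator name, the `QuotaExceeded` exception, and the in-memory `LIMITS` and `USAGE` lookups are hypothetical stand-ins for the cached plan data and live counters described earlier.

```python
from functools import wraps


class QuotaExceeded(Exception):
    """Raised when an organization has no remaining capacity for a resource."""

    def __init__(self, resource, used, limit):
        super().__init__(f"{resource} limit reached: {used} of {limit} used")
        self.resource = resource
        self.used = used
        self.limit = limit


# Hypothetical lookups: limits would come from cached plan data,
# usage from live counters. None would mean "unlimited".
LIMITS = {"org_1": {"prompts": 2}}
USAGE = {"org_1": {"prompts": 1}}


def enforce_quota(resource):
    """Run a quota check before the wrapped create handler executes."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(org_id, *args, **kwargs):
            used = USAGE[org_id][resource]
            limit = LIMITS[org_id][resource]
            if limit is not None and used >= limit:
                raise QuotaExceeded(resource, used, limit)
            result = handler(org_id, *args, **kwargs)
            # Count the resource only after it was created successfully.
            # Not concurrency-safe; a real system would update atomically.
            USAGE[org_id][resource] = used + 1
            return result
        return wrapper
    return decorator


@enforce_quota("prompts")
def create_prompt(org_id, text):
    return {"org_id": org_id, "text": text}
```

The handler itself stays unaware of plans and limits. In this sketch the first `create_prompt("org_1", ...)` call succeeds and the second raises `QuotaExceeded`, which the caller can turn into a clear message instead of a bare error code.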

Figure: How subscription data and live usage are combined during a resource limit check

When a Limit Is Reached

When usage exceeds the allowed limit:

The action is paused
The user is informed exactly which limit was reached
Current usage is shown
Upgrade options are presented clearly
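
To make the contrast with a bare "403 Forbidden" concrete, here is a sketch of what a limit-reached response could carry. The field names, plan names, and message wording are assumptions for illustration, not the actual LaikaTest API.

```python
def limit_reached_response(resource, used, limit, current_plan, upgrade_plans):
    """Build a response that names the limit, shows usage, and lists next steps.

    upgrade_plans maps a plan name to its limit for this resource.
    """
    return {
        "status": "limit_reached",
        "resource": resource,
        "used": used,
        "limit": limit,
        "message": (
            f"Your {current_plan} plan includes {limit} {resource}, "
            f"and your organization has used {used}."
        ),
        "upgrade_options": [
            {"plan": name, "limit": plan_limit}
            for name, plan_limit in upgrade_plans.items()
        ],
    }
```

For example, `limit_reached_response("prompts", 50, 50, "free", {"pro": 5000})` tells the user which limit was hit, how much has been used, and what a higher plan would offer, so the client can render a clear prompt and let the user decide what to do next.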

Upgrading Should Feel Like a Level-Up, Not a Tax

Upgrading in LaikaTest is intentionally designed to feel like a natural progression, not a penalty for hitting a limit. Users are never forced into an upgrade, and staying on the current plan is always a valid and respected choice. If a team’s existing plan continues to meet their needs, they can keep using it without disruption. When an upgrade option is shown, it is presented with clear context. Users can see what additional capacity or features they would unlock and decide whether it aligns with their current goals. This approach ensures that upgrades happen because the product is delivering real value, not because users feel blocked or pressured. Growth becomes an informed choice rather than an obligation.

Figure: Customer-first experience system

A Customer-First System by Design

Every major design decision in this system was guided by one principle: the customer comes first. Performance matters because users feel latency. Clarity matters because confusion breaks trust. Flexibility matters because teams grow in unpredictable ways. Instead of treating usage limits as a billing mechanism, we treated them as part of the core product experience. That mindset shaped everything from where we introduced caching, to how limits are enforced, to how messages are shown when a boundary is reached. The goal was never to block users, but to guide them clearly and respectfully as they scale.

Conclusion

Building a quota system forced us to think beyond limits and pricing. It required us to treat usage enforcement as part of the core system design rather than an afterthought. The choices we made around data ownership, caching, and enforcement boundaries were all driven by a single goal: keep the product fast, predictable, and respectful of how teams actually work.

By combining PostgreSQL for accuracy, Redis for low-latency access, and organization-level constraints, we arrived at a system that scales without becoming brittle. More importantly, it gives us a foundation we can confidently build on. As LaikaTest evolves, this system allows plans, usage, and features to change without breaking trust or experience. In the end, a good quota system is not defined by the limits it enforces, but by how invisibly and reliably it supports growth.