Rate Limiting
Upstash-style rate limiting with Convex-first storage and middleware-friendly DX.
In this guide, we'll set up kitcn/ratelimit with an Upstash-parity API, Convex-first storage, and a UI-friendly hook.
We'll keep it practical: a local ratelimit plugin, middleware wiring, algorithm choices, and client-side button disabling with live countdowns.
Approach
We'll build this in four steps so you can ship quickly and keep things maintainable:
- Scaffold the local schema extension so internal storage tables exist.
- Add one local ratelimit plugin that owns buckets and policy.
- Wire that plugin into middleware so handlers stay focused.
- Add optional client-side UX with `useRatelimit()` for disabled states and retry timers.
We'll end with a full API reference so you can tune behavior without guesswork.
Why this package
kitcn/ratelimit is designed as a hard cutover from component-driven APIs.
| What you get | Why it matters |
|---|---|
| Upstash-style API | Easier migration if your team already knows Upstash (limit, check, getRemaining, resetUsedTokens) |
| Convex-first tables | No component registration and no component migration path to manage |
| Read dedupe helpers | Common repeated reads may reuse cached results and reduce duplicate DB fetches |
| React hook support | hookAPI() + useRatelimit() gives accurate client countdown and button states |
| Fail-closed default | Safer behavior under pressure (failureMode: "closed") |
Install
```sh
bun add kitcn
```

Scaffold the starter (required)
Rate limiting is opt-in, so scaffold the full starter once:
```sh
npx kitcn add ratelimit
```

That creates:

- `convex/lib/plugins/ratelimit/schema.ts`
- `convex/lib/plugins/ratelimit/plugin.ts`

and registers `ratelimitExtension()` in `convex/functions/schema.ts`.
```ts
import { defineSchema } from 'kitcn/orm';
import { ratelimitExtension } from '../lib/plugins/ratelimit/schema';

export const tables = {
  // your tables...
};

export default defineSchema(tables).extend(ratelimitExtension());
```

Create a local ratelimit plugin
Scaffold a local plugin and keep one default bucket for every mutation. Add named buckets only when a procedure genuinely needs stricter behavior.
```ts
import { getSessionNetworkSignals } from 'kitcn/auth';
import { MINUTE, Ratelimit, RatelimitPlugin } from 'kitcn/ratelimit';
import type { MutationCtx } from '../../../functions/generated/server';
import type { Select } from '../../../shared/api';

const fixed = (rate: number) => Ratelimit.fixedWindow(rate, MINUTE);

export const ratelimitBuckets = {
  default: {
    public: fixed(30),
    free: fixed(60),
    premium: fixed(200),
  },
} as const;

type RatelimitTier = keyof (typeof ratelimitBuckets)['default'];
export type RatelimitBucket = keyof typeof ratelimitBuckets;

type RatelimitUser = {
  id: string;
  isAdmin?: boolean;
  plan?: 'premium' | 'team' | null;
  session?: Select<'session'> | null;
};

type RatelimitCtx = MutationCtx & {
  user?: RatelimitUser | null;
};

type RatelimitMeta = {
  ratelimit?: RatelimitBucket;
};

export function getUserTier(user: RatelimitUser | null): RatelimitTier {
  if (!user) return 'public';
  if (user.isAdmin || user.plan) return 'premium';
  return 'free';
}

export const ratelimit = RatelimitPlugin.configure({
  buckets: ratelimitBuckets,
  getBucket: ({ meta }: { meta: RatelimitMeta }) => meta.ratelimit ?? 'default',
  getUser: ({ ctx }: { ctx: RatelimitCtx }) => ctx.user ?? null,
  getIdentifier: ({ user }: { user: RatelimitUser | null }) =>
    user?.id ?? 'anonymous',
  getTier: getUserTier,
  getSignals: ({ ctx, user }: { ctx: RatelimitCtx; user: RatelimitUser | null }) =>
    getSessionNetworkSignals(ctx, user?.session ?? null),
  prefix: ({ bucket, tier }) => `ratelimit:${bucket}:${tier}`,
  failureMode: 'closed',
  enableProtection: true,
  denyListThreshold: 30,
});
```

`getSessionNetworkSignals()` returns `{}` when no session exists and fills `ip` / `userAgent` when Better Auth session data is available.
Wire it into middleware
Apply the plugin once in your mutation builders. The default bucket covers normal writes. meta.ratelimit is an optional named-bucket override.
```ts
import { ratelimit, type RatelimitBucket } from './plugins/ratelimit/plugin';

const c = initCRPC
  .meta<{ ratelimit?: RatelimitBucket }>()
  .create();

export const publicMutation = c.mutation.use(ratelimit.middleware());
```

Normal writes use the default bucket:
```ts
export const createTodo = authMutation
  .input(z.object({ title: z.string().min(1) }))
  .mutation(async ({ ctx, input }) => {
    // business logic
  });
```

Add a named bucket only for exceptions:
```ts
export const stressTest = publicMutation
  // assumes an 'interactive' bucket has been added to ratelimitBuckets
  .meta({ ratelimit: 'interactive' })
  .input(z.object({ id: z.string() }))
  .mutation(async ({ ctx, input }) => {
    // stricter bucket
  });
```

Choose your algorithm
Start simple and pick based on workload shape.
Fixed window
Best when hard windows are acceptable. Tokens reset at the start of each window.
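The bookkeeping behind this is simple. As a minimal illustrative sketch (not the library's internals; it ignores shards, capacity, and reservations): derive a window index from the timestamp, reset the count whenever the index changes, and reject once the count would exceed the limit.

```typescript
// Illustrative sketch of fixed-window accounting (not kitcn/ratelimit internals).
type FixedWindowState = { windowIndex: number; used: number };

function fixedWindowAllow(
  state: FixedWindowState,
  now: number,
  limit: number,
  windowMs: number,
  count = 1,
): { ok: boolean; state: FixedWindowState; reset: number } {
  const windowIndex = Math.floor(now / windowMs);
  // Entering a new window resets the used-token count.
  const used = windowIndex === state.windowIndex ? state.used : 0;
  const ok = used + count <= limit;
  return {
    ok,
    state: { windowIndex, used: ok ? used + count : used },
    // Tokens replenish at the start of the next window.
    reset: (windowIndex + 1) * windowMs,
  };
}
```

The hard reset at each window boundary is exactly the trade-off called out above: cheap and predictable, but bursty at the edges.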
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'post:create',
  limiter: Ratelimit.fixedWindow(10, '1 m'),
});
```

Sliding window
Best when you want smoother request shaping without hard resets. Weighs the previous window proportionally so you don't get bursts at window boundaries.
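The proportional weighting can be sketched like this (illustrative only, not the library's internals): the previous window's count is scaled by how much of it still overlaps the sliding window, so the effective count decays smoothly instead of dropping to zero at a boundary.

```typescript
// Illustrative sketch of sliding-window weighting (not kitcn/ratelimit internals).
function slidingWindowCount(
  prevCount: number, // requests counted in the previous fixed window
  currCount: number, // requests counted in the current fixed window
  now: number,
  windowMs: number,
): number {
  const elapsed = now % windowMs; // time elapsed in the current window
  // Fraction of the previous window still inside the sliding window.
  const prevWeight = (windowMs - elapsed) / windowMs;
  return prevCount * prevWeight + currCount;
}
```

Halfway into the current window, only half of the previous window's requests still count toward the limit.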
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'search',
  limiter: Ratelimit.slidingWindow(50, '1 m'),
});
```

Token bucket
Best for burst-friendly throughput with long-term control. Tokens refill at a steady rate up to maxTokens. Use maxReserved to allow requests to "borrow" from future tokens when the bucket is empty.
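The refill arithmetic can be pictured as follows (an illustrative sketch under the standard token-bucket definition, not the library's internals; reservation via maxReserved is omitted): tokens accrue continuously at refillRate per interval, capped at maxTokens.

```typescript
// Illustrative sketch of token-bucket refill (not kitcn/ratelimit internals).
type BucketState = { tokens: number; ts: number };

function tokenBucketTake(
  state: BucketState,
  now: number,
  refillRate: number,
  intervalMs: number,
  maxTokens: number,
  count = 1,
): { ok: boolean; state: BucketState } {
  // Refill proportionally to elapsed time, capped at bucket capacity.
  const refilled = ((now - state.ts) / intervalMs) * refillRate;
  const tokens = Math.min(maxTokens, state.tokens + refilled);
  const ok = tokens >= count;
  return { ok, state: { tokens: ok ? tokens - count : tokens, ts: now } };
}
```

Because tokens accumulate up to maxTokens during idle periods, short bursts are absorbed while the long-term rate stays bounded by refillRate.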
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'llm:tokens',
  limiter: Ratelimit.tokenBucket(1000, '1 m', 1000, { maxReserved: 3000 }),
});
```

Algorithm options
All three algorithm builders accept an optional options object as the last argument.
| Option | Type | Default | Description |
|---|---|---|---|
| shards | number | 1 | Number of shards for write distribution. Higher values reduce contention at the cost of less precise counts (see Sharding). |
| maxReserved | number | undefined | Maximum tokens a request can "borrow" from future capacity. Only applies to fixedWindow and tokenBucket. Not supported by slidingWindow. |
| capacity | number | limit | Maximum stored tokens. Only applies to fixedWindow. Useful when you want a higher burst capacity than the per-window refill. |
| start | number | 0 | Epoch offset (ms) for window alignment. Only applies to fixedWindow. Aligns windows to a custom origin instead of epoch zero. |
Duration formats
Every window or interval parameter accepts a Duration — either a raw millisecond number or a human-readable string.
String format: "<number> <unit>" or "<number><unit>". Both '1 m' and '1m' work.
| Unit | Meaning | Example |
|---|---|---|
| ms | milliseconds | '500 ms' |
| s | seconds | '30 s' |
| m | minutes | '1 m' |
| h | hours | '1 h' |
| d | days | '1 d' |
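A parser for this format fits in a few lines. This is an illustrative sketch of the documented grammar (the real parsing lives inside kitcn/ratelimit):

```typescript
// Illustrative Duration parser for the documented "<number> <unit>" format.
type Duration = number | string;

const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function toMs(duration: Duration): number {
  if (typeof duration === 'number') return duration; // raw milliseconds
  // Accept both '1 m' and '1m' (optional single space before the unit).
  const match = /^(\d+(?:\.\d+)?)\s?(ms|s|m|h|d)$/.exec(duration.trim());
  if (!match) throw new Error(`Invalid duration: ${duration}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}
```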
You can also use the pre-defined constants from kitcn/ratelimit:
```ts
import { SECOND, MINUTE, HOUR, DAY, WEEK } from 'kitcn/ratelimit';

Ratelimit.fixedWindow(100, MINUTE); // 60_000 ms
Ratelimit.slidingWindow(50, 30 * SECOND); // 30_000 ms
Ratelimit.tokenBucket(10, HOUR, 100); // 3_600_000 ms
```

Done. You now have deterministic, application-layer limits with one API surface.
Add a client-side limiter UX
Server enforcement is mandatory. Client checks are for better UX — disabled buttons, countdowns, and retry hints.
Expose the hook API
First, export the hook API from a Convex file. The hookAPI() method returns a getRatelimit query and a getServerTime mutation that the React hook consumes.
```ts
import { Ratelimit } from 'kitcn/ratelimit';

const limiter = new Ratelimit({
  limiter: Ratelimit.fixedWindow(3, '30 s'),
});

export const { getRatelimit, getServerTime } = limiter.hookAPI({
  identifier: async (_ctx, fromClient) => fromClient ?? 'anonymous',
  sampleShards: 1,
});
```

The identifier option can be a static string, or an async callback that receives (ctx, fromClient). Use the callback to resolve the identifier server-side (e.g. from auth) while still accepting a client-provided fallback.
sampleShards controls how many shards to read when estimating the remaining count. Set it to 1 for low-cost reads, or increase it for more accurate estimates on high-shard configs.
Use the React hook
Then wire it up in your component with useRatelimit:
```ts
import { useRatelimit } from 'kitcn/ratelimit/react';

const ratelimitRef = 'ratelimitDemo:getInteractiveRatelimit' as const;
const serverTimeRef = 'ratelimitDemo:getInteractiveServerTime' as const;

const { status, check } = useRatelimit(ratelimitRef, {
  identifier: sessionId,
  count: 1,
  getServerTimeMutation: serverTimeRef,
});

const blocked = status?.ok === false;
const retryAt = status?.retryAt;
```

useRatelimit accepts either:
- a Convex function path string (`'module:functionName'`), which is what the `/ratelimit` demo uses.
- a generated `FunctionReference` from `api`.
The hook returns:
| Field | Type | Description |
|---|---|---|
| status | HookStatus \| undefined | undefined while loading. { ok: true } when allowed, { ok: false, retryAt: number } when blocked. Auto-updates when retryAt passes. |
| check | (ts?, count?) => HookCheckValue \| undefined | Manual projection function. Call it with a timestamp and count to get a precise snapshot for custom gauges or progress bars. |
The HookCheckValue returned by check() has this shape:
| Field | Type | Description |
|---|---|---|
| value | number | Projected remaining tokens (negative means over-limit) |
| ts | number | Timestamp of the projection (client time) |
| config | ResolvedAlgorithm | The algorithm config for further calculations |
| shard | number | Which shard was sampled |
| ok | boolean | true when value >= 0 |
| retryAt | number \| undefined | Client timestamp when tokens become available |
If you need precise projected values (for custom gauges), call check(ts, count).
Protection and deny lists
When enableProtection is on, the limiter tracks repeated failures per identifier, IP, user-agent, and country. Once a value reaches denyListThreshold, it gets blocked for 24 hours — without even checking the database.
You can also provide static deny lists to block known bad actors immediately.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  failureMode: 'closed',
  enableProtection: true,
  denyListThreshold: 30,
  denyList: {
    identifiers: ['known-bad-user-id'],
    ips: ['203.0.113.0'],
    userAgents: ['BadBot/1.0'],
    countries: ['XX'],
  },
});
```

To trigger deny-list matching on request metadata, pass ip, userAgent, or country in the limit() call:
```ts
const result = await limiter.limit(userId, {
  ip: request.headers.get('x-forwarded-for') ?? undefined,
  userAgent: request.headers.get('user-agent') ?? undefined,
  country: request.headers.get('x-country') ?? undefined,
});
```

Important: Deny-list state is in-memory and non-durable. It can survive across warm runtime requests, but is lost on cold starts/deploys. For persistent blocking, use an external deny list or database-backed blocklist.
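The threshold mechanic can be pictured as a per-value failure counter with a 24-hour block once the threshold is reached. This is an illustrative sketch, not the limiter's actual implementation; class and method names here are hypothetical.

```typescript
// Illustrative sketch of threshold-based deny-listing (not kitcn/ratelimit internals).
const DAY_MS = 24 * 60 * 60 * 1000;

class DenyTracker {
  private failures = new Map<string, number>();
  private blockedUntil = new Map<string, number>();
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  isBlocked(value: string, now: number): boolean {
    const until = this.blockedUntil.get(value);
    return until !== undefined && now < until;
  }

  // Record a rate-limit failure; block the value for 24h once it hits the threshold.
  recordFailure(value: string, now: number): void {
    const n = (this.failures.get(value) ?? 0) + 1;
    this.failures.set(value, n);
    if (n >= this.threshold) this.blockedUntil.set(value, now + DAY_MS);
  }
}
```

In the real limiter this tracking applies independently per identifier, IP, user-agent, and country, and the blocked check runs before any database read.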
Dynamic limits
Dynamic limits let you change rate limits at runtime — useful for feature flags, admin overrides, or gradual rollouts. Enable them with dynamicLimits: true in the constructor.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api:search',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  dynamicLimits: true,
});
```

Then use setDynamicLimit to override the configured limit at runtime:
```ts
// Double the limit during a sale
await limiter.setDynamicLimit({ limit: 200 });

// Read the current override
const { dynamicLimit } = await limiter.getDynamicLimit();
// dynamicLimit === 200

// Remove the override (reverts to configured limit)
await limiter.setDynamicLimit({ limit: false });
```

The dynamic limit overrides the limit field of the algorithm. For token bucket, it overrides refillRate (and maxTokens if they were originally equal).
Limits and mitigations you should know
Important: This is application-layer limiting. It protects business logic and expensive downstream work, but it is not a network firewall or DDoS shield.
Recommended production posture:
- Enforce auth early and reject fast.
- Protect anonymous flows with captcha + validated session IDs.
- Put network-layer controls (Cloudflare or equivalent) in front when IP-based mitigation is required.
- Alert on request spikes and fail safely (`failureMode: 'closed'` by default).
API Reference
Constructor options
Create a Ratelimit instance with a config object:
```ts
const limiter = new Ratelimit(config: RatelimitConfig);
```

| Option | Type | Default | Description |
|---|---|---|---|
| db | ctx.db | — | Convex database context. Required for limit, check, getRemaining, getValue, resetUsedTokens, setDynamicLimit, getDynamicLimit. Not needed for hookAPI() (it receives db from the query/mutation context). |
| limiter | ResolvedAlgorithm | — | Required. Algorithm created by Ratelimit.fixedWindow(), Ratelimit.slidingWindow(), or Ratelimit.tokenBucket(). |
| prefix | string | 'kitcn/ratelimit' | Namespaces stored state in the database. Use unique prefixes for different rate limit scopes. |
| dynamicLimits | boolean | false | Enables setDynamicLimit() / getDynamicLimit(). |
| failureMode | 'closed' \| 'open' | 'closed' | Behavior on timeout. 'closed' rejects, 'open' allows. |
| timeout | number | 5000 | Milliseconds before triggering failureMode behavior. |
| enableProtection | boolean | false | Enables deny-list tracking on repeated failures. |
| denyListThreshold | number | 30 | Consecutive failures before an identifier is blocked (24h). Requires enableProtection: true. |
| denyList | ProtectionLists | undefined | Static deny lists. See Protection and deny lists. |
| ephemeralCache | Map<string, number> \| false | new Map() | In-memory block cache. Shared across requests in the same Convex invocation. Pass false to disable. |
Algorithm builders
All builders are available as static methods on Ratelimit.
Ratelimit.fixedWindow(limit, window, options?)
```ts
fixedWindow(limit: number, window: Duration, options?: AlgorithmOptions): FixedWindowAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| limit | number | Tokens replenished per window |
| window | Duration | Window length (number in ms, or string like '1 m') |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed from future windows |
| options.capacity | number | Max stored tokens (default = limit) |
| options.start | number | Epoch offset for window alignment |
Ratelimit.slidingWindow(limit, window, options?)
```ts
slidingWindow(limit: number, window: Duration, options?: AlgorithmOptions): SlidingWindowAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| limit | number | Max requests in the sliding window |
| window | Duration | Window length |
| options.shards | number | Write distribution shards (default 1) |
Note: reserve is not supported with sliding window. The algorithm needs both current and previous window counts, which makes reservation impractical.
Ratelimit.tokenBucket(refillRate, interval, maxTokens, options?)
```ts
tokenBucket(refillRate: number, interval: Duration, maxTokens: number, options?: AlgorithmOptions): TokenBucketAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| refillRate | number | Tokens added per interval |
| interval | Duration | Refill interval |
| maxTokens | number | Maximum bucket capacity |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed from future refills |
Core methods
limit(identifier, options?)
Consume tokens and return a response. This is the primary method for enforcing rate limits.
```ts
limit(identifier: string, options?: LimitRequest): Promise<RatelimitResponse>
```

check(identifier, options?)
Evaluate without consuming tokens. Use this for read-only checks (e.g. showing a warning before the user submits).
```ts
check(identifier: string, options?: CheckRequest): Promise<RatelimitResponse>
```

getRemaining(identifier)
Return the remaining tokens, reset time, and limit for an identifier.
```ts
getRemaining(identifier: string): Promise<RemainingResponse>
```

getValue(identifier, options?)
Return a raw snapshot for custom projections and UI calculations.
```ts
getValue(identifier: string, options?: { sampleShards?: number }): Promise<RatelimitSnapshot>
```

resetUsedTokens(identifier)
Clear all stored state for an identifier. Useful for admin resets.
```ts
resetUsedTokens(identifier: string): Promise<void>
```

setDynamicLimit(options)
Override the configured limit at runtime. Pass { limit: false } to remove the override. Requires dynamicLimits: true.
```ts
setDynamicLimit(options: { limit: number | false }): Promise<void>
```

getDynamicLimit()
Read the current dynamic override. Returns { dynamicLimit: number | null }. Requires dynamicLimits: true.
```ts
getDynamicLimit(): Promise<DynamicLimitResponse>
```

hookAPI(options?)
Export a getRatelimit query and getServerTime mutation for the React hook.
```ts
hookAPI(options?: HookAPIOptions): {
  getRatelimit: FunctionReference<'query'>;
  getServerTime: FunctionReference<'mutation'>;
}
```

Request options
LimitRequest
Pass these options to limit() to customize behavior per-call.
| Field | Type | Default | Description |
|---|---|---|---|
| rate | number | 1 | Alias for count. Tokens to consume. |
| count | number | 1 | Tokens to consume. Takes precedence if both rate and count are set. |
| reserve | boolean | false | Allow borrowing from future capacity (up to maxReserved). Not supported by slidingWindow. |
| ip | string | — | IP address for deny-list matching |
| userAgent | string | — | User-agent for deny-list matching |
| country | string | — | Country code for deny-list matching |
| geo | unknown | — | Reserved for future geo-based rules |
CheckRequest
Same fields as LimitRequest, but since check() is read-only, no tokens are consumed.
Response types
RatelimitResponse
Returned by limit() and check().
| Field | Type | Description |
|---|---|---|
| success | boolean | true if the request was allowed |
| ok | boolean | Alias for success (Convex DX parity) |
| limit | number | Maximum tokens for this algorithm |
| remaining | number | Tokens left after this request (floored to 0) |
| reset | number | Epoch ms when tokens will be available |
| pending | Promise<unknown> | Resolves when async side-effects complete |
| reason | 'timeout' \| 'cacheBlock' \| 'denyList' | Present when a reason applies. Note: failureMode: 'open' can return success: true with reason: 'timeout'. |
| deniedValue | string | Present only when reason === 'denyList'. The value that triggered the block. |
RemainingResponse
Returned by getRemaining().
| Field | Type | Description |
|---|---|---|
| remaining | number | Tokens available |
| reset | number | Epoch ms of next replenishment |
| limit | number | Maximum tokens |
RatelimitSnapshot
Returned by getValue(). Used for custom projections and the React hook.
| Field | Type | Description |
|---|---|---|
| value | number | Current token count |
| ts | number | Timestamp of last state update |
| shard | number | Which shard was read |
| config | ResolvedAlgorithm | Full algorithm config for calculateRatelimit() |
Hook API
HookAPIOptions
Options for hookAPI().
| Field | Type | Default | Description |
|---|---|---|---|
| identifier | string \| (ctx, fromClient?) => string \| Promise<string> | — | How to resolve the identifier. A string uses it directly. A callback receives the Convex context and the optional client-provided identifier. |
| sampleShards | number | 1 | How many shards to sample when reading. Higher = more accurate, more reads. |
UseRatelimitOptions
Options for the useRatelimit() React hook.
```ts
useRatelimit(
  getRatelimitValueQuery: FunctionReference<'query'> | string,
  options?: UseRatelimitOptions
)
```

| Field | Type | Default | Description |
|---|---|---|---|
| identifier | string | — | Passed to the getRatelimit query |
| count | number | 1 | Tokens to project for status calculation |
| sampleShards | number | — | Override sampleShards from the hook API |
| getServerTimeMutation | FunctionReference \| string | — | Enables clock-skew correction between client and server |
Time constants
Pre-defined millisecond constants exported from kitcn/ratelimit:
| Constant | Value |
|---|---|
| SECOND | 1_000 |
| MINUTE | 60_000 |
| HOUR | 3_600_000 |
| DAY | 86_400_000 |
| WEEK | 604_800_000 |
Internal tables
The rate limiter stores state in three local schema keys. These come from your local convex/lib/plugins/ratelimit/schema.ts extension, so do not define these keys twice. The underlying Convex storage table names stay underscored.
| Schema key | Purpose |
|---|---|
| ratelimitState | Per-identifier, per-shard token state |
| ratelimitDynamicLimit | Dynamic limit overrides per prefix |
| ratelimitProtectionHit | Protection tracking (hits, blocks) per prefix |
Advanced notes
calculateRatelimit
The calculateRatelimit function is exported for custom projections and UI calculations. It takes a state snapshot, algorithm config, current timestamp, and count, and returns the evaluated result without touching the database.
```ts
import { calculateRatelimit } from 'kitcn/ratelimit';

const result = calculateRatelimit(
  { value: 8, ts: Date.now() - 30_000 },
  Ratelimit.fixedWindow(10, '1 m'),
  Date.now(),
  1
);
// result.remaining, result.reset, result.retryAfter
```

Sharding
When shards > 1, each limit() call picks a random shard (or two, using power-of-two-choices when shards >= 3) to reduce write contention. The trade-off: reads (check, getRemaining, getValue) only sample a subset of shards, so remaining counts are approximate. For most use cases, shards: 1 (the default) is fine. Increase shards only when you see write contention on hot identifiers.
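Power-of-two-choices can be sketched as follows (illustrative only, not the library's internals): sample two random shards and write to the less-loaded of the two, which keeps load markedly more even than a single random pick.

```typescript
// Illustrative power-of-two-choices shard selection (not kitcn/ratelimit internals).
function pickShard(
  shardLoads: number[],                 // used tokens per shard
  random: () => number = Math.random,   // injectable for testing
): number {
  const n = shardLoads.length;
  if (n < 3) return Math.floor(random() * n); // plain random pick for few shards
  const a = Math.floor(random() * n);
  const b = Math.floor(random() * n);
  // Prefer the less-loaded of the two sampled shards.
  return shardLoads[a] <= shardLoads[b] ? a : b;
}
```

Reads still have to sample shards the same way, which is why remaining counts become approximate as the shard count grows.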
Ephemeral cache
The ephemeral block cache is an in-memory Map<string, number> that caches "blocked until" timestamps. When a limit() call fails, subsequent calls for the same identifier skip the database read entirely until the block expires. The cache is per-Ratelimit instance and resets on each Convex function invocation. Pass ephemeralCache: false to disable it, or pass a shared Map across multiple Ratelimit instances to share the cache.
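Conceptually the cache is just a map from cache key to a block-expiry timestamp. A sketch of the short-circuit logic (illustrative; function names here are hypothetical, not the library's API):

```typescript
// Illustrative ephemeral block cache (not kitcn/ratelimit internals).
const ephemeralCache = new Map<string, number>(); // key -> blocked-until (epoch ms)

function isEphemerallyBlocked(key: string, now: number): boolean {
  const blockedUntil = ephemeralCache.get(key);
  if (blockedUntil === undefined) return false;
  if (now >= blockedUntil) {
    ephemeralCache.delete(key); // block expired; fall through to the real check
    return false;
  }
  return true; // still blocked: skip the database read entirely
}

function recordBlock(key: string, reset: number): void {
  ephemeralCache.set(key, reset); // remember when tokens become available again
}
```

Because the map lives only for the current invocation, it saves repeated database reads within one request burst without introducing any durable state.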
ok alias
The response includes both success and ok. They are always identical. ok exists for Convex DX parity with patterns like if (!result.ok) throw ....