Rate limits

The Mnemos API meters requests per workspace, per key, and per endpoint class. Every response tells you where you stand.

Tiers

Plan        Standard     Burst        Chat / search   Note
Starter     60           120          30 / min        Per workspace.
Business    600          1,200        300 / min       Per workspace, with per-key overrides.
Enterprise  Contractual  Contractual  Contractual     Negotiated. Hard ceilings live in the rate-limit dashboard.

Response headers

Every API response includes the following headers. They are the source of truth, so read them at runtime rather than hardcoding the table above.

headers
X-RateLimit-Limit:      600
X-RateLimit-Remaining:  427
X-RateLimit-Reset:      2026-05-19T15:00:00Z
X-RateLimit-Bucket:     workspace:org_01H...:chat
Retry-After:            14    (only on 429)
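As a rough illustration of reading these headers, the sketch below parses them into a typed object. The names `parseRateLimit`, `RateLimitInfo`, and `HeaderReader` are hypothetical and not part of the Mnemos API. It assumes the values have the shapes shown above: integers, an ISO 8601 reset timestamp, and Retry-After in seconds.

```typescript
// Minimal header accessor; the Fetch API's Headers object satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical shape for illustration; not a type shipped by Mnemos.
interface RateLimitInfo {
  limit: number | null;
  remaining: number | null;
  resetAt: Date | null;
  bucket: string | null;
  retryAfterSeconds: number | null; // only expected on 429 responses
}

// Parses a base-10 integer header value, returning null if absent or malformed.
function toInt(value: string | null): number | null {
  if (value === null) return null;
  const n = Number.parseInt(value.trim(), 10);
  return Number.isNaN(n) ? null : n;
}

function parseRateLimit(headers: HeaderReader): RateLimitInfo {
  const reset = headers.get("x-ratelimit-reset");
  const resetAt = reset === null ? null : new Date(reset.trim());
  return {
    limit: toInt(headers.get("x-ratelimit-limit")),
    remaining: toInt(headers.get("x-ratelimit-remaining")),
    // Treat an unparseable timestamp as missing rather than an Invalid Date.
    resetAt: resetAt !== null && !Number.isNaN(resetAt.getTime()) ? resetAt : null,
    bucket: headers.get("x-ratelimit-bucket"),
    retryAfterSeconds: toInt(headers.get("retry-after")),
  };
}
```

A client might call `parseRateLimit(response.headers)` after each request and pause until `resetAt` once `remaining` reaches zero, rather than waiting for a 429.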

Backoff

On 429 rate_limited, honor Retry-After (a wait in seconds) if present. Otherwise back off with full jitter: pick a random delay between 0 and an exponentially growing cap that starts at 1s, doubles on each failure, and never exceeds 60s. Do not retry the same request more than 5 times (caps of 1s, 2s, 4s, 8s, and 16s across those five retries).

example
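// Retries fn on 429 responses, up to `max` retries, then rethrows the error.
async function withBackoff<T>(fn: () => Promise<T>, max = 5): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await fn();
    } catch (err: any) {
      // Only 429s are retried; give up once the retry budget is spent.
      if (err?.status !== 429 || attempt >= max) throw err;
      // Retry-After is in seconds; a missing or non-numeric value falls through to jitter.
      const ra = Number(err.headers?.get?.("retry-after") ?? 0);
      // Full jitter: uniform delay in [0, cap), with cap doubling from 1s up to 60s.
      const cap = Math.min(60_000, 1_000 * 2 ** attempt);
      const delay = ra > 0 ? ra * 1000 : Math.random() * cap;
      await new Promise((r) => setTimeout(r, delay));
      attempt++;
    }
  }
}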
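withBackoff inspects `status` and `headers` on the thrown error, but `fetch` does not throw for non-2xx responses on its own. One possible way to bridge that gap is sketched below. `HttpError`, `ResponseLike`, and `throwOnError` are hypothetical names for illustration, not part of any Mnemos SDK, and the URL in the usage comment is a placeholder.

```typescript
// Minimal view of a fetch Response; the real Response type satisfies it.
interface ResponseLike {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical error type exposing the two fields withBackoff reads.
class HttpError extends Error {
  readonly status: number;
  readonly headers: ResponseLike["headers"];

  constructor(status: number, headers: ResponseLike["headers"]) {
    super(`HTTP ${status}`);
    this.name = "HttpError";
    this.status = status;
    this.headers = headers;
  }
}

// Runs a request and throws HttpError for any non-2xx response.
async function throwOnError<R extends ResponseLike>(
  request: () => Promise<R>,
): Promise<R> {
  const res = await request();
  if (!res.ok) throw new HttpError(res.status, res.headers);
  return res;
}

// Usage with the withBackoff helper above (placeholder URL):
// const res = await withBackoff(() =>
//   throwOnError(() => fetch("https://api.example.com/v1/notes")));
```

Keeping the conversion separate from the retry loop means the same wrapper works for any client that returns a response with `ok`, `status`, and `headers`.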

Need higher limits?

Business customers can request key-scoped overrides for a single workload (for example, a one-time backfill) without amending the contract. Enterprise customers can negotiate permanent ceilings.