Rate Limits
Rate limiting by API key and plan tier.
QANATIX enforces rate limits with a Redis-backed token bucket. Limits apply per tenant and are keyed by API key.
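The token bucket idea can be illustrated with a small in-memory sketch. This is not QANATIX's implementation: the real limiter keeps its state in Redis, and the names `TokenBucket` and `check`, the continuous refill at limit/60 tokens per second, and the burst capacity equal to the per-minute limit are all assumptions made for illustration.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """In-memory token bucket (hypothetical; the real service stores state in Redis)."""

    def __init__(self, limit_per_minute: int, now: Optional[float] = None):
        self.capacity = float(limit_per_minute)     # assumed burst size
        self.refill_rate = limit_per_minute / 60.0  # tokens per second (assumed)
        self.tokens = self.capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per API key, mirroring "per tenant, keyed by API key".
buckets: Dict[str, TokenBucket] = {}


def check(api_key: str, limit_per_minute: int, now: Optional[float] = None) -> bool:
    """Look up (or create) the bucket for this key and try to take a token."""
    if api_key not in buckets:
        buckets[api_key] = TokenBucket(limit_per_minute, now)
    return buckets[api_key].allow(now)
```

Under these assumptions, a key on a 60 rpm limit can burst 60 requests at once and then regains one token per second. Keys never share a bucket, so one tenant exhausting its limit does not affect another.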
Plan tiers
| Plan | General (requests/min) | Search (requests/min) |
|---|---|---|
| Free | 60 | 30 |
| Starter | 300 | 150 |
| Pro | 1,000 | 500 |
| Enterprise | 5,000 | 2,500 |
Search endpoints have separate, lower limits to protect the vector search infrastructure.
Response headers
Every response includes rate limit headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1709251200

| Header | Description |
|---|---|
| X-RateLimit-Limit | Max requests per window |
| X-RateLimit-Remaining | Tokens remaining in current bucket |
| X-RateLimit-Reset | Unix timestamp when bucket refills |
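A client can use these headers to pace itself instead of waiting to be rejected. The sketch below is one possible approach, not an official client: the function name `seconds_to_wait` is hypothetical, and it assumes `Retry-After` (sent with a 429 response) is always an integer number of seconds. It takes the status code and headers of the last response and returns how long to pause before the next request.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long a client should pause before its next request.

    `headers` is treated as case-sensitive here; most HTTP libraries
    expose a case-insensitive mapping that can be passed in directly.
    """
    now = time.time() if now is None else now

    # On a 429, the server states the wait explicitly via Retry-After
    # (assumed to be a number of seconds).
    if status == 429 and "Retry-After" in headers:
        return float(headers["Retry-After"])

    # Otherwise pause only when the bucket is empty, until the reset time.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining <= 0 and "X-RateLimit-Reset" in headers:
        reset_at = float(headers["X-RateLimit-Reset"])
        return max(0.0, reset_at - now)

    return 0.0
```

For example, with `X-RateLimit-Remaining: 0` and `X-RateLimit-Reset: 1709251200`, a call made at Unix time 1709251190 returns a 10-second pause.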
Rate limited response
When you exceed the limit:
HTTP 429 Too Many Requests
Retry-After: 42
{"detail": "Rate limit exceeded. Retry after 42 seconds."}
Degraded mode
If Redis is unavailable, rate limiting is bypassed and all requests are allowed.
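This "fail open" behavior can be sketched as a wrapper around the limiter check. The names `LimiterUnavailable` and `is_allowed` are hypothetical and do not reflect QANATIX's internals. The sketch only illustrates the pattern of allowing a request when the rate-limit store cannot be reached.

```python
from typing import Callable


class LimiterUnavailable(Exception):
    """Hypothetical error raised when the rate-limit store cannot be reached."""


def is_allowed(api_key: str, check: Callable[[str], bool]) -> bool:
    """Apply the limiter, but fail open if its backing store is down."""
    try:
        return check(api_key)
    except LimiterUnavailable:
        # Degraded mode: the limiter cannot be consulted, so allow the request.
        return True
```

Failing open favors availability over protection: an outage of the limiter's store does not take the API down with it, but limits are not enforced until the store recovers.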
Self-hosted
Rate limiting is enabled by default. Set RATE_LIMIT_ENABLED=false to disable entirely (not recommended for production).