Per-user quotas (refilled on a schedule)
| Limit | Free | Pro | Resets |
|---|---|---|---|
| Chat queries | 50 | 2,000 (fair use) | Monthly |
| Sources (total stored) | 100 | 2,000 | Never (you delete to free up) |
| Active vaults | 1 | 5 | Archive to free up |
| ScholarFlow ingests | 1/day | 100/day | Daily |
| Gemini cost (USD) | $1/day | $5/day | Daily |
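As a rough illustration, the table above could be held as a small lookup structure. The numbers are copied from the table; the names `TierLimits`, `LIMITS`, and `remaining` are hypothetical and not part of any documented API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    chat_queries_per_month: int
    sources_total: int
    active_vaults: int
    scholarflow_ingests_per_day: int
    gemini_usd_per_day: float


# Values copied from the quota table; the structure itself is an assumption.
LIMITS = {
    "free": TierLimits(50, 100, 1, 1, 1.0),
    "pro": TierLimits(2000, 2000, 5, 100, 5.0),
}


def remaining(tier: str, metric: str, used: float) -> float:
    """Return how much of a quota is left, never going below zero."""
    limit = getattr(LIMITS[tier], metric)
    return max(0, limit - used)
```

For example, `remaining("free", "chat_queries_per_month", 48)` reports how many chat queries a Free user has left this month.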
Org-wide cost ceiling
Across all users combined, we cap Gemini spend at $50/day. If yesterday's spend reached that cap, the ScholarFlow cron is frozen for today and the chat endpoint may return HTTP 503 with a friendly "Service is rate-limited today, try again tomorrow" message. This is the backstop against a worst-case usage spike, which is rare at this scale.
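The ceiling check described above might look roughly like this. It is a minimal sketch: the $50 figure comes from the text, while `service_state` and its return keys are hypothetical names, and how yesterday's spend is measured is not specified here.

```python
ORG_DAILY_CAP_USD = 50.0  # org-wide Gemini cap stated in the text


def service_state(yesterday_spend_usd: float) -> dict:
    """Decide today's behavior from yesterday's total spend (assumed input)."""
    frozen = yesterday_spend_usd >= ORG_DAILY_CAP_USD
    return {
        # The ScholarFlow cron is frozen for the day once the cap was reached.
        "scholarflow_cron_enabled": not frozen,
        # The chat endpoint may answer 503 on a frozen day.
        "chat_may_return_503": frozen,
    }
```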
Per-IP rate limits (anonymous demo)
- Public demo chat — 5 questions per IP per day. Resets midnight UTC.
- Signup — 5 per IP per hour (anti-abuse).
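One way to get a "5 per IP per day, resets at midnight UTC" limit is a fixed-window counter keyed by IP and UTC calendar date. The sketch below assumes an in-memory store and timezone-aware timestamps; `DailyIpLimiter` is a hypothetical name, and the real service may use a different mechanism.

```python
from collections import defaultdict
from datetime import datetime, timezone

DEMO_LIMIT_PER_DAY = 5  # public demo chat limit from the list above


class DailyIpLimiter:
    def __init__(self, limit: int = DEMO_LIMIT_PER_DAY):
        self.limit = limit
        # (ip, UTC date) -> requests seen; old keys are never purged here.
        self.counts = defaultdict(int)

    def allow(self, ip: str, now: datetime) -> bool:
        """`now` must be timezone-aware; it is converted to UTC for the key."""
        key = (ip, now.astimezone(timezone.utc).date())
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

Because the key includes the UTC date, the count starts from zero at midnight UTC without any explicit reset job.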
Per-route burst limits
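We cap bursts per user. Chat and Ingest are counted within any 1-minute window; ScholarFlow search is counted per hour:
- Chat: 60 requests per user per minute
- Ingest: 30 requests per user per minute
- ScholarFlow search: 20 requests per user per hour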
Hitting these returns HTTP 429 with a Retry-After header.
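A sliding-window limiter is one common way to enforce caps like these and to compute a Retry-After value. This is a sketch under assumptions: the service's actual algorithm is not documented here, `SlidingWindowLimiter` is a hypothetical name, and timestamps are plain seconds passed in by the caller.

```python
import math
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = {}  # user_id -> deque of accepted request timestamps

    def check(self, user_id: str, now: float) -> int:
        """Return 0 if the request is allowed, else whole seconds to wait."""
        q = self.hits.setdefault(user_id, deque())
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window_seconds:
            q.popleft()
        if len(q) < self.max_requests:
            q.append(now)
            return 0
        # The oldest hit leaving the window frees a slot; round up for the header.
        return max(1, math.ceil(q[0] + self.window_seconds - now))
```

A non-zero return value would map to an HTTP 429 response whose Retry-After header carries that number of seconds. Chat would use `SlidingWindowLimiter(60, 60)` under the limits listed above.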
What you see when you hit a quota
For per-user quotas (queries, sources, vaults), the relevant endpoint returns HTTP 402 with a JSON body like:
{
  "error": "upgrade_required",
  "limit_hit": "queries_quota_month",
  "tier": "free",
  "message": "You've used your 50 chat queries for this month. Upgrade to Pro for unlimited queries."
}
The UI catches that response and shows an upgrade banner with a link to /billing.
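For a client other than the built-in UI, handling the 402 response could be sketched as follows. The field names come from the JSON body above; `upgrade_banner` and the fallback text are hypothetical, and the function only interprets a status code and body that the caller has already fetched.

```python
import json


def upgrade_banner(status: int, body: str):
    """Return banner info for a quota response, or None for anything else."""
    if status != 402:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or payload.get("error") != "upgrade_required":
        return None
    return {
        # Fallback text is an assumption for bodies that omit "message".
        "text": payload.get("message", "You have reached a plan limit."),
        "link": "/billing",
        "limit_hit": payload.get("limit_hit"),
    }
```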
Why we cap Gemini cost
Every chat call hits the Gemini API. Even on Free at 50 queries/month, a runaway bug could make spend spike. The $1/day cap is the failsafe: if our model-routing logic or your usage pattern blows up, we don't end up with a surprise bill and we don't have to email you about throttling.
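A per-user cost guard of this kind can be as simple as a check before each model call. The sketch below assumes the cost of a call can be estimated up front and that spend so far today is tracked elsewhere; `can_call_model` is a hypothetical name, and the caps are the per-tier daily figures from the quota table.

```python
# Daily Gemini cost caps per tier, in USD, from the quota table.
DAILY_COST_CAP_USD = {"free": 1.0, "pro": 5.0}


def can_call_model(tier: str, spent_today_usd: float, estimated_cost_usd: float) -> bool:
    """Allow the call only if it would keep today's spend within the cap."""
    return spent_today_usd + estimated_cost_usd <= DAILY_COST_CAP_USD[tier]
```

Checking before the call, rather than after, means a single expensive request cannot push a user far past the cap.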
How to check current usage
/billing shows live usage bars for every metric. Bars turn red when you're at the cap.