Quotas, limits, and rate caps

Every limit we enforce, what triggers it, and what response you get when you hit it.

Per-user quotas (refilled on a schedule)

Limit                    Free      Pro                 Resets
Chat queries             50        2,000 (fair use)    Monthly
Sources (total stored)   100       2,000               Never (you delete to free up)
Active vaults            1         5                   Archive to free up
ScholarFlow ingests      1/day     100/day             Daily
Gemini cost (USD)        $1/day    $5/day              Daily
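
As a rough illustration, the per-user quotas above can be written as a small lookup table. This is a sketch only: the metric keys, the QUOTAS name, and the remaining() helper are hypothetical and are not part of our API; only the numbers come from the table.

```python
# Numbers copied from the quota table; key names are hypothetical.
QUOTAS = {
    "chat_queries":        {"free": 50,  "pro": 2000, "resets": "monthly"},
    "sources":             {"free": 100, "pro": 2000, "resets": "never"},
    "active_vaults":       {"free": 1,   "pro": 5,    "resets": "never"},
    "scholarflow_ingests": {"free": 1,   "pro": 100,  "resets": "daily"},
    "gemini_cost_usd":     {"free": 1.0, "pro": 5.0,  "resets": "daily"},
}


def remaining(metric: str, tier: str, used: float) -> float:
    """Return how much of a quota is left, never going below zero."""
    limit = QUOTAS[metric][tier]
    return max(0, limit - used)
```

For example, remaining("chat_queries", "free", 48) gives 2, and a Pro user holding 7 vaults gets 0 rather than a negative number.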

Org-wide cost ceiling

Across all users combined, we cap Gemini spend at $50/day. If the previous day's spend reached that cap, the ScholarFlow cron is frozen for the current day and the chat endpoint may return HTTP 503 with a friendly "Service is rate-limited today, try again tomorrow" message. This is the safety net against a worst-case usage spike, which is rare at this scale.
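
The ceiling check described above might look roughly like the following. This is a minimal sketch under assumptions: the function names are hypothetical, and it always returns 503 once the cap is reached, whereas the real endpoint only may do so.

```python
ORG_DAILY_CAP_USD = 50.0  # org-wide ceiling stated in the text


def org_cap_reached(yesterday_spend_usd: float) -> bool:
    """True when the previous day's combined spend reached the ceiling."""
    return yesterday_spend_usd >= ORG_DAILY_CAP_USD


def chat_response_when_capped(yesterday_spend_usd: float):
    """Return (status, message) when capped, or None to proceed normally.

    Assumption: this sketch always answers 503 once the cap is reached.
    """
    if org_cap_reached(yesterday_spend_usd):
        return 503, "Service is rate-limited today, try again tomorrow"
    return None
```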

Per-IP rate limits (anonymous demo)

  • Public demo chat — 5 questions per IP per day. Resets midnight UTC.
  • Signup — 5 per IP per hour (anti-abuse).
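
One way to picture the demo limit is a counter keyed by IP and UTC date, so a new UTC day starts a fresh count. The class below is an illustrative in-memory sketch, not our implementation; DailyIpLimiter and its methods are hypothetical names.

```python
from collections import defaultdict
from datetime import datetime, timezone

DEMO_LIMIT_PER_DAY = 5  # public demo chat limit stated in the list above


class DailyIpLimiter:
    """In-memory counter keyed by (ip, UTC date)."""

    def __init__(self, limit: int = DEMO_LIMIT_PER_DAY):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, ip: str, now: datetime) -> bool:
        """Count one request; `now` must be a timezone-aware datetime."""
        key = (ip, now.astimezone(timezone.utc).date())
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

Keying on the UTC date gives the midnight-UTC reset without a scheduled job. A production version would also need shared storage and eviction of old keys, which this sketch leaves out.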

Per-route burst limits

Burst limits apply per user. Chat and ingest are counted within any 1-minute window; ScholarFlow search is counted per hour:

  • Chat: 60 requests per minute
  • Ingest: 30 requests per minute
  • ScholarFlow search: 20 requests per hour

Hitting these returns HTTP 429 with a Retry-After header.
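
A client can use the Retry-After value to decide how long to wait before retrying. We don't state here which form the header takes, so this sketch accepts both forms that HTTP allows (a number of seconds or an HTTP date). The function name and the 1-second fallback are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(retry_after: Optional[str], now: datetime,
                        default: float = 1.0) -> float:
    """Convert a Retry-After header value into seconds to wait.

    `now` must be timezone-aware. Falls back to `default` when the
    header is missing or cannot be parsed.
    """
    if retry_after is None:
        return default
    value = retry_after.strip()
    if value.isdigit():  # delta-seconds form, e.g. "30"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

Call it with the header value from the 429 response and the current time, for example retry_delay_seconds(headers.get("Retry-After"), datetime.now(timezone.utc)), then sleep for the result before retrying.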

What you see when you hit a quota

For per-user quotas (queries, sources, vaults), the relevant endpoint returns HTTP 402 with a JSON body like:

{
  "error": "upgrade_required",
  "limit_hit": "queries_quota_month",
  "tier": "free",
  "message": "You've used your 50 chat queries for this month. Upgrade to Pro for unlimited queries."
}

The UI catches that response and shows an upgrade banner with a link to /billing.
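
If you call the API from your own client, the 402 body can be handled along these lines. The field names come from the sample above; the QuotaError class and parse_quota_error function are hypothetical helpers written for this sketch.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuotaError:
    limit_hit: str
    tier: str
    message: str


def parse_quota_error(status: int, body: str) -> Optional[QuotaError]:
    """Return a QuotaError for a 402 upgrade_required response, else None."""
    if status != 402:
        return None
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("error") != "upgrade_required":
        return None
    return QuotaError(
        limit_hit=str(data.get("limit_hit", "")),
        tier=str(data.get("tier", "")),
        message=str(data.get("message", "")),
    )
```

A caller can then show the message field to the user and point them to /billing, much as the UI does.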

Why we cap Gemini cost

Every chat call hits the Gemini API. Even on Free at 50 queries/month, a runaway bug could drive costs up sharply. The per-user daily cap ($1 on Free, $5 on Pro) is the failsafe: if our model-routing logic or your usage pattern blows up, we don't end up with a surprise bill and we don't have to email you about throttling.

How to check current usage

/billing shows live usage bars for every metric. Bars turn red when you're at the cap.
