Chatting with your vault

Ask anything; every answer is cited. We refuse off-topic questions instead of fabricating.

How a single answer is built

  1. Query rewrite — multi-turn condense (so "what about side effects" resolves against the previous question) plus peptide-synonym expansion (e.g. "BPC-157" → also searches for "Body Protection Compound 157").
  2. Retrieve + ground — Gemini File Search pulls the top-K (10) most relevant chunks from your vault, then generates the answer using only those chunks. Grounding is mandatory; we can't answer outside your sources.
  3. Rerank citations — a small Gemini call scores the citations the model used for relevance, drops the bottom half, and applies a diversity filter (max 2 cites per source).
  4. The final top-5 citations are attached to the answer for the UI to render.
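
Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the production code: the Citation type, the select_citations name, and the choice to round up when halving are all assumptions, and the score field stands in for whatever the small Gemini rerank call returns.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    source_id: str  # which vault source the chunk came from (hypothetical field)
    chunk: str      # the cited text
    score: float    # relevance score; assumed to come from the rerank call


def select_citations(cites, max_per_source=2, final_k=5):
    """Drop the bottom half by score, cap cites per source, keep the top final_k."""
    # Highest relevance first.
    ranked = sorted(cites, key=lambda c: c.score, reverse=True)
    # Keep the upper half; rounding up for odd counts is an assumption.
    kept = ranked[: (len(ranked) + 1) // 2]

    per_source = {}
    selected = []
    for cite in kept:
        # Diversity filter: skip a source that already has max_per_source cites.
        if per_source.get(cite.source_id, 0) >= max_per_source:
            continue
        per_source[cite.source_id] = per_source.get(cite.source_id, 0) + 1
        selected.append(cite)
        if len(selected) == final_k:
            break
    return selected
```

One consequence of this ordering: because the diversity filter runs after the halving, an answer can end up with fewer than five citations when one source dominates the top half.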

The off-topic refusal

If you ask something the corpus can't answer (e.g. "what's the weather"), you get the canonical refusal:

I can only answer questions about the documents in this vault. Try rephrasing your question to be about something in your sources.

This is deliberate — we'd rather refuse than fabricate. If you think the corpus DOES have the answer and you got refused, try adding 1-2 more specific keywords (an author name, a peptide abbreviation) to surface the right chunks.

Model routing

We pick the cheapest model that still answers well, based on your plan and the shape of your question:

  • Definitional queries on Free ("what is BPC-157?") → gemini-2.5-flash-lite
  • Everything else → gemini-2.5-flash
  • Pro + deep-research mode (Phase 6+) → gemini-2.5-pro
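
The routing rules above amount to a small decision function. The sketch below is illustrative only: pick_model and is_definitional are hypothetical names, and the prefix check used to detect a definitional question is an assumed heuristic, since this article does not say how the shape of a question is classified.

```python
def is_definitional(question: str) -> bool:
    # Assumed heuristic: treat "what is ...", "what are ..." and "define ..." as definitional.
    q = question.strip().lower()
    return q.startswith(("what is ", "what are ", "define "))


def pick_model(plan: str, question: str, deep_research: bool = False) -> str:
    """Return the model name for a query, following the three rules listed above."""
    # Pro + deep-research mode (Phase 6+) goes to the largest model.
    if plan == "pro" and deep_research:
        return "gemini-2.5-pro"
    # Definitional queries on Free go to the cheapest model.
    if plan == "free" and is_definitional(question):
        return "gemini-2.5-flash-lite"
    # Everything else.
    return "gemini-2.5-flash"
```

For example, pick_model("free", "what is BPC-157?") returns "gemini-2.5-flash-lite", while the same question on Pro falls through to "gemini-2.5-flash".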

Sessions and history

Each chat sits in a session. Sessions remember the prior turns so follow-ups work naturally ("and the dosing in that study?"). You can rename sessions, browse them at /vaults/{id}/history, and delete them individually.

What we don't do (yet)

  • Streaming answers — the non-streaming endpoint returns the full reply at once. Phase 8 work.
  • Source toggles in the prompt — every enabled source in the vault is searched. To exclude a source, toggle it off in the sources list.
  • Multi-vault queries — one chat = one vault. To search across, copy sources between vaults or wait for the cross-vault feature.

Quota

Free: 50 queries/month. Pro: unlimited within fair use (about 2,000 queries/month and a $5/day Gemini cost ceiling). When you hit a cap, the chat endpoint returns HTTP 402 upgrade_required; see quotas.
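
As a rough illustration of how these caps could be checked, here is a sketch. The check_quota function, its parameters, and the choice to reject once usage reaches the cap are assumptions; the Pro monthly figure is approximate in the text above, so the constant here is only a stand-in.

```python
FREE_MONTHLY_QUERIES = 50
PRO_MONTHLY_FAIR_USE = 2000     # approximate fair-use figure, not an exact limit
PRO_DAILY_COST_CEILING = 5.00   # USD of Gemini spend per day


def check_quota(plan: str, queries_this_month: int, cost_today_usd: float = 0.0):
    """Return (http_status, error_code) for the next chat request."""
    if plan == "free":
        over_cap = queries_this_month >= FREE_MONTHLY_QUERIES
    else:
        # Pro is capped by both the monthly fair-use count and the daily cost ceiling.
        over_cap = (
            queries_this_month >= PRO_MONTHLY_FAIR_USE
            or cost_today_usd >= PRO_DAILY_COST_CEILING
        )
    if over_cap:
        return 402, "upgrade_required"
    return 200, None
```

A client would treat the 402 response as a signal to show an upgrade prompt rather than retrying the request.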
