Chatting with your vault — Help — Peptides Vault

How a single answer is built

Query rewrite — multi-turn condense (so "what about side effects" resolves against the previous question) plus peptide-synonym expansion (e.g. "BPC-157" → also searches for "Body Protection Compound 157").
Retrieve + ground — Gemini File Search pulls the top-K (10) most relevant chunks from your vault, then generates the answer using only those chunks. Grounding is mandatory; we can't answer outside your sources.
Rerank citations — a small Gemini call scores the citations the model used for relevance, drops the bottom half, and applies a diversity filter (max 2 cites per source).
Attach the final top-5 citations to the answer for the UI to render.
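
The rewrite and citation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the SYNONYMS table, the Citation shape, and both function names are hypothetical, and rounding up when halving an odd-sized list is a guess. Only the BPC-157 synonym, the "drop the bottom half" rule, the 2-per-source limit, and the final top 5 come from the description above.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical synonym table; only the BPC-157 entry comes from the text above.
SYNONYMS = {"BPC-157": ["Body Protection Compound 157"]}


def expand_query(query):
    """Return the query plus one variant per matching synonym."""
    variants = [query]
    for term, alternatives in SYNONYMS.items():
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(query):
            for alt in alternatives:
                # A callable replacement avoids backslash surprises in `alt`.
                variants.append(pattern.sub(lambda _m, alt=alt: alt, query))
    return variants


@dataclass
class Citation:
    source_id: str
    chunk_id: str
    score: float  # relevance score from the rerank call; higher is better


def select_citations(citations, max_per_source=2, final_k=5):
    """Drop the bottom half by score, cap citations per source, keep the top K."""
    ranked = sorted(citations, key=lambda c: c.score, reverse=True)
    # Assumption: with an odd count, the middle citation survives.
    kept = ranked[: (len(ranked) + 1) // 2]

    per_source = defaultdict(int)
    selected = []
    for cite in kept:
        if per_source[cite.source_id] >= max_per_source:
            continue  # diversity filter: this source already has its share
        per_source[cite.source_id] += 1
        selected.append(cite)
        if len(selected) == final_k:
            break
    return selected
```

Note that with 10 retrieved chunks, halving leaves at most 5 candidates, so the diversity filter can leave an answer with fewer than 5 citations when one source dominates the top scores.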

The off-topic refusal

If you ask something the corpus can't answer (e.g. "what's the weather"), you get the canonical refusal:

I can only answer questions about the documents in this vault. Try rephrasing your question to be about something in your sources.

This is deliberate — we'd rather refuse than fabricate. If you think the corpus DOES have the answer and you got refused, try adding 1-2 more specific keywords (an author name, a peptide abbreviation) to surface the right chunks.
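
If you call the chat programmatically, a client could detect the refusal and retry with extra keywords along these lines. This is a minimal sketch that assumes the refusal arrives as the answer text exactly as quoted above; the helper names are hypothetical.

```python
# The canonical refusal, copied from the text above.
REFUSAL = (
    "I can only answer questions about the documents in this vault. "
    "Try rephrasing your question to be about something in your sources."
)


def is_refusal(answer):
    """True when the answer is the canonical refusal (ignoring outer whitespace)."""
    return answer.strip() == REFUSAL


def add_keywords(question, keywords):
    """Append specific keywords that the question does not already contain."""
    missing = [k for k in keywords if k.lower() not in question.lower()]
    return " ".join([question] + missing)
```

For example, add_keywords("What dose was used?", ["BPC-157"]) yields a retry question that names the peptide explicitly.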

Model routing

We pick the cheapest model that still answers well, based on your plan and the shape of your question:

Definitional queries on Free ("what is BPC-157?") → gemini-2.5-flash-lite
Everything else → gemini-2.5-flash
Pro + deep-research mode (Phase 6+) → gemini-2.5-pro
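
The routing rules above amount to a small decision function. The sketch below is illustrative only: the plan strings, the deep_research flag, and the prefix-based is_definitional heuristic are assumptions, since how the service classifies a question is not documented here. Only the three model names and the order of the rules come from the list above.

```python
# Hypothetical heuristic: the real classifier for "definitional" is not documented.
DEFINITIONAL_PREFIXES = ("what is", "what are", "define")


def is_definitional(question):
    return question.strip().lower().startswith(DEFINITIONAL_PREFIXES)


def pick_model(plan, question, deep_research=False):
    """Pick the cheapest model that fits the plan and the question shape."""
    if plan == "pro" and deep_research:
        return "gemini-2.5-pro"
    if plan == "free" and is_definitional(question):
        return "gemini-2.5-flash-lite"
    return "gemini-2.5-flash"  # everything else
```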

Sessions and history

Each chat sits in a session. Sessions remember the prior turns so follow-ups work naturally ("and the dosing in that study?"). You can rename sessions, browse them at /vaults/{id}/history, and delete them individually.

What we don't do (yet)

Streaming answers: the chat endpoint is non-streaming and returns the full reply at once. Streaming is planned for Phase 8.
Source toggles in the prompt: every enabled source in the vault is searched. To exclude a source, toggle it off in the sources list.
Multi-vault queries: one chat = one vault. To search across vaults, copy sources between them or wait for the cross-vault feature.

Quota

Free: 50 queries/month. Pro: unlimited within fair use (about 2,000 queries/month and a $5/day Gemini cost ceiling). When you hit a cap, the chat endpoint returns HTTP 402 upgrade_required; see quotas.
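
As a rough illustration of how these caps combine, the check below returns the status the chat endpoint would give. The Usage shape, the function name, and treating "at the limit" as over the cap are assumptions; the numbers and the 402 upgrade_required response come from the paragraph above.

```python
from dataclasses import dataclass

FREE_MONTHLY_QUERIES = 50
PRO_MONTHLY_QUERIES = 2000   # approximate fair-use figure
PRO_DAILY_COST_USD = 5.00    # Gemini cost ceiling per day


@dataclass
class Usage:
    queries_this_month: int
    cost_today_usd: float


def check_quota(plan, usage):
    """Return (http_status, error_code) for the next chat request."""
    if plan == "free":
        over = usage.queries_this_month >= FREE_MONTHLY_QUERIES
    else:
        # Pro trips on either the monthly fair-use count or the daily cost ceiling.
        over = (
            usage.queries_this_month >= PRO_MONTHLY_QUERIES
            or usage.cost_today_usd >= PRO_DAILY_COST_USD
        )
    return (402, "upgrade_required") if over else (200, None)
```

A client would typically treat the 402 as a prompt to show an upgrade or "try again later" message rather than as a retryable error.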