A vector‑store‑backed memory layer the model writes to on its own. A frozen profile that keeps the prompt cache warm. Facts and notes scored by semantic relevance with recency decay. Automatic summaries of idle sessions. Pin what matters; archive what doesn't. Recall is one tool call away.
The actual memory manager — every fact keyed, ranked, and recallable, with pin and archive controls. No black box.
One unified LanceDB table, three semantic kinds. The model picks the right one when it calls update_memory. You curate from the UI.
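As a rough sketch of what "one table, three kinds" could look like, the Python below models a row and a keyed upsert. Everything here is an assumption for illustration: the kind labels (`profile`, `fact`, `note`), the field names, the in-memory `dict` standing in for the LanceDB table, and the `update_memory` signature, which is not documented in this page.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Dict


class Kind(str, Enum):
    # Hypothetical labels for the three semantic kinds.
    PROFILE = "profile"
    FACT = "fact"
    NOTE = "note"


@dataclass
class MemoryRow:
    key: str
    kind: Kind
    text: str
    pinned: bool = False
    archived: bool = False
    last_referenced: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def update_memory(table: Dict[str, MemoryRow], key: str, kind: str, text: str) -> MemoryRow:
    """Upsert one row by key (assumed signature, not the real tool schema).

    Unknown kinds raise ValueError. Pin and archive flags set earlier
    (for example from the UI) are carried over on rewrite.
    """
    existing = table.get(key)
    row = MemoryRow(
        key=key,
        kind=Kind(kind),
        text=text,
        pinned=existing.pinned if existing else False,
        archived=existing.archived if existing else False,
    )
    table[key] = row
    return row
```

Keeping the kind as a column rather than a separate table means one recall query can rank facts and notes together.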
Who you are, what you do, how you like things explained. Baked into the system prompt at session start so prompt‑cache stays warm. The model edits it via a tool call; mid‑session writes don't destabilize the cache because the system prompt doesn't re‑render live.
"Our CRO prioritizes net retention through FY26." "We deploy on Cloudflare Pages." One claim per row. Vector‑recallable when relevant. Semantic dedup merges duplicates; recency decay (30‑day half‑life) pushes stale ones down without deleting them.
Sessions with ≥3 messages that sit idle for 10+ minutes get summarized in the background and saved as notes. No manual "save this session" step. The conversation becomes recallable without your intervention.
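The sweep rule above can be sketched as a single predicate. This is a minimal illustration, not the shipped scheduler: the function name and the `already_summarized` flag are assumptions, while the two constants come straight from the rule as stated.

```python
from datetime import datetime, timedelta

IDLE_AFTER = timedelta(minutes=10)  # "10+ minutes of session inactivity"
MIN_MESSAGES = 3                    # "sessions with >=3 messages"


def should_summarize(
    last_activity: datetime,
    message_count: int,
    already_summarized: bool,
    now: datetime,
) -> bool:
    """Return True when a session qualifies for a background summary.

    `already_summarized` is a hypothetical guard so that the same idle
    session is not swept twice.
    """
    if already_summarized:
        return False
    idle_long_enough = now - last_activity >= IDLE_AFTER
    return idle_long_enough and message_count >= MIN_MESSAGES
```

A periodic job could call this for each open session and hand the qualifying ones to the summarizer.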
A brand‑new chat — the profile is already in the system prompt, and the model pulls the rest with recall_memory. You stop re‑briefing it.
Vector stores tend to return everything, just ranked. That's a quality problem in chat: the model gets 30 dusty rows and burns context. Ours is opinionated.
The recall tool drops anything scoring under the relevance threshold before it reaches the model. No noise, no "loosely related" injection that bloats the prompt and confuses the answer.
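A minimal sketch of that cutoff, assuming scores are normalized to the range 0 to 1 and the threshold is 0.50 (the 50% figure that appears in the recall example on this page). The function name and the `(text, score)` tuple shape are illustrative, not the tool's real interface.

```python
from typing import List, Tuple

THRESHOLD = 0.50  # assumed: the "50% threshold" expressed as a 0..1 score


def filter_by_threshold(
    scored_rows: List[Tuple[str, float]],
    threshold: float = THRESHOLD,
) -> List[Tuple[str, float]]:
    """Keep rows scoring at or above the threshold, best match first."""
    kept = [(text, score) for text, score in scored_rows if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Filtering before the rows are serialized into the tool result means a query with no good matches returns an empty list rather than weak ones.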
Every time a fact gets surfaced or referenced, its last_referenced timestamp updates. Facts that haven't been referenced recently rank lower. Yesterday's note beats a six‑month‑old one with identical similarity.
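One way to picture the ranking: weight each row's similarity by an exponential decay with the 30‑day half‑life mentioned earlier. How the product actually combines similarity and recency is not stated here, so the multiplication below is an assumption, as are the function names.

```python
from datetime import datetime

HALF_LIFE_DAYS = 30.0  # from the stated 30-day half-life


def recency_weight(last_referenced: datetime, now: datetime) -> float:
    """1.0 for a row referenced just now, 0.5 after 30 days, 0.25 after 60."""
    age_days = max((now - last_referenced).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def decayed_score(similarity: float, last_referenced: datetime, now: datetime) -> float:
    """Assumed combination: similarity scaled by the recency weight."""
    return similarity * recency_weight(last_referenced, now)


def touch(row: dict, now: datetime) -> None:
    """Refresh the timestamp when a row is surfaced or referenced."""
    row["last_referenced"] = now
```

Under these assumptions, two rows with identical similarity differ only by age: one referenced yesterday keeps about 98% of its score, while one last referenced 180 days ago keeps under 2%.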
For the handful of facts you never want to lose — the org chart, the brand guidelines, the "always use postgres‑replica for reads" rule — pin them. Recall returns them regardless of score.
Archive an entry and it stays in the database but stops appearing in recall results. Un‑archive when relevant again. Nothing's gone; nothing's noisy.
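Pin and archive can be read as two overrides on the normal score filter. The sketch below is one possible reading: the row fields, the function name, and the choice that archive wins when a row is both pinned and archived are all assumptions.

```python
from typing import Dict, List


def select_for_recall(rows: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Apply archive and pin rules to already-scored rows.

    Each row is a dict with 'text', 'score', 'pinned', 'archived'.
    Archived rows are never returned (assumed to override a pin).
    Pinned rows are returned regardless of score, ahead of the rest.
    """
    visible = [r for r in rows if not r["archived"]]
    pinned = [r for r in visible if r["pinned"]]
    scored = sorted(
        (r for r in visible if not r["pinned"] and r["score"] >= threshold),
        key=lambda r: r["score"],
        reverse=True,
    )
    return pinned + scored
```

Nothing is deleted in either path; un‑archiving a row only flips the flag, and the row competes on score again.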
The background curator runs semantic comparison across facts; near‑duplicates merge into a canonical row that keeps the better wording and the union of sources.
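A simplified sketch of that merge, using cosine similarity over embedding vectors. The 0.9 cutoff, the row shape, and the greedy first‑match strategy are assumptions. "Better wording" is not defined on this page, so the default below keeps the longer text purely as a placeholder; pass your own `pick_wording` to change it.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_near_duplicates(
    rows: List[Dict],
    min_similarity: float = 0.9,  # assumed cutoff
    pick_wording: Optional[Callable[[str, str], str]] = None,
) -> List[Dict]:
    """Greedily fold each row into the first canonical row it nearly matches.

    Rows are dicts with 'text', 'vector', and 'sources' (a set).
    """
    # Placeholder heuristic: longer text stands in for "better wording".
    pick = pick_wording or (lambda a, b: a if len(a) >= len(b) else b)
    canonical: List[Dict] = []
    for row in rows:
        for kept in canonical:
            if cosine(kept["vector"], row["vector"]) >= min_similarity:
                kept["text"] = pick(kept["text"], row["text"])
                kept["sources"] = kept["sources"] | row["sources"]
                break
        else:
            # No near-duplicate found: this row becomes a new canonical row.
            canonical.append({
                "text": row["text"],
                "vector": list(row["vector"]),
                "sources": set(row["sources"]),
            })
    return canonical
```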
When the 2000‑char profile fills up, the model is asked to prune it before adding new content, keeping the highest‑signal claims and discarding noise. The cache stays warm; the profile stays sharp.
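The budget check can be sketched as follows. The 2000‑character limit is from the text; the function name, the newline separator, and the `prune` callback (a stand‑in for asking the model to condense the profile) are assumptions.

```python
from typing import Callable

PROFILE_LIMIT = 2000  # characters, per the stated profile size


def append_to_profile(
    profile: str,
    addition: str,
    prune: Callable[[str, int], str],
) -> str:
    """Append `addition`, pruning the existing profile first if needed.

    `prune(text, room)` is a hypothetical hook that must return a version
    of `text` no longer than `room` characters.
    """
    candidate = f"{profile}\n{addition}".strip()
    if len(candidate) <= PROFILE_LIMIT:
        return candidate
    room = PROFILE_LIMIT - len(addition) - 1  # 1 for the newline separator
    pruned = prune(profile, room)
    if len(pruned) > room:
        raise ValueError("prune step did not free enough room")
    return f"{pruned}\n{addition}".strip()
```

Because the result only lands in the system prompt at the next session start, the pruning pass never touches the prompt of the session that triggered it.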
LanceDB on disk (or wherever you point it). Sentence‑transformer embeddings computed locally. The recall tool returns rows to the model in your tenant — they never leave the deployment. If you self‑host, no third party sees what your AI remembers about you.
CRO prioritizes net retention over new logo growth through FY26.
Board agreed to defer mid‑market hiring until pipeline coverage exceeds 3.2×.
Enterprise expansion ARR concentrated in top 14 accounts; renewals Q1–Q2.
(below 50% threshold — not returned)
Every session inherits the profile. Every important fact is recallable. Nothing leaves your tenant. The model gets sharper; you stop repeating yourself.