# Pi-Inspired Personal Assistant Memory Design

Date: 2026-02-25

Status: approved

Inspired by: [badlogic/pi-mono](https://github.com/badlogic/pi-mono)

Scope: Flynn-native implementation — no dependency on pi-agent-core

## Problem

Flynn's memory model has four concrete gaps that make it feel like a generic chatbot rather than a personal assistant:

1. **Forgets across sessions** — memory extraction runs but is unreliable; facts don't survive compaction consistently.
2. **Clunky compaction** — compaction summaries are generic and discarded after context trimming; important personal context is lost.
3. **No proactive recall** — `buildAdaptiveMemoryContext` exists but only scores by keyword overlap with the current message, never surfaces context unprompted.
4. **Fragmented across channels** — Telegram, Discord, and the gateway each have isolated sessions with no shared sense of "you."

## Design Goals

- Pick up where the last conversation left off, across any channel
- Never lose recent context to compaction
- Stable facts (preferences, patterns) persist indefinitely
- All behavior gated behind config; default is current behavior (opt-in)

## Non-Goals (this phase)

- Multi-user deployments with per-user auth
- Proactive mid-session memory surfacing (beyond session start)
- Full vector/semantic replacement of adaptive injection

---

## Architecture

Two-tier memory structure added to the orchestrator:

```
Long-term store (existing)         Working memory (new)
  memory/user/profile       ←→       memory/user/working
  memory/user/patterns               (TTL: ~14 days)
  memory/sessions/...                (replaced per compaction)
          ↓                                  ↓
  injected via                       injected wholesale
  adaptive scoring                   at session start
  (keyword/vector match)             (always present if fresh)
```

**Long-term store** — existing `MemoryStore` namespaces, unchanged. Stable facts extracted from conversations, searched adaptively per-turn.

**Working memory** — a new `user/working` namespace written on every compaction. Acts as a "what's been happening lately" snapshot.
Injected in full at session start. Expires after N days (default 14).

**Unified user namespace** — a canonical `user/*` tree shared across all channels, replacing today's session-scoped isolation.

---

## Section 1: Unified User Namespace

### Namespace Layout

```
memory/
  user/
    profile      ← stable facts: name, timezone, role, preferences
    patterns     ← recurring behaviors: working style, recurring topics
    working      ← rolling compaction summary (TTL-based)
  sessions/
    telegram:123/...   ← session-specific (unchanged, existing behavior)
    ws:abc/...
```

### Identity Model

A single `memory.user_namespace` config key (default: unset) ties all channels together. All channels on the Flynn instance with this config treat memory as belonging to one person. Unset = current session-scoped behavior, unchanged.

This is appropriate for personal assistant deployments (one person, many surfaces). Multi-user is out of scope.

### Config

```yaml
memory:
  user_namespace: "user"   # enables shared identity; absent = session-scoped (current)
```

### Extraction Routing

When `user_namespace` is set:

- Stable extracted facts → `user/profile`, `user/patterns`
- Compaction summary → `user/working`
- Session-specific context → `sessions//...` (unchanged)

---

## Section 2: Working Memory Layer

### Storage

`user/working` namespace in the existing `MemoryStore` (flat file, no new storage engine). File format:

```
# Working Memory
Updated: 2026-02-25T11:30:00Z
Expires: 2026-03-10T11:30:00Z

[compaction summary content]
```

### Lifecycle

| Event | Action |
|---|---|
| Compaction runs | Write summary to `user/working`, replacing previous content |
| Session starts | Read `user/working`; inject if `Expires` is in the future |
| `Expires` in the past | File is ignored; overwritten on next compaction |
| Memory store not configured | Entire feature is a no-op |

No background cleanup job required — expiry is checked lazily on read.

### Size Budget

Capped at `working_memory_max_tokens` (default 1000 tokens).
If the compaction summary exceeds the budget it is truncated before writing. Keeps injection overhead predictable.

### Config

```yaml
memory:
  working_memory_ttl_days: 14       # expiry window; default 14
  working_memory_max_tokens: 1000   # injection size cap; default 1000
```

---

## Section 3: Compaction → Working Memory Flow

### Current Flow

```
history exceeds threshold
  → compactHistory() produces summary string
  → summary replaces trimmed messages in session history
  → summary string discarded
```

### New Flow

```
history exceeds threshold
  → compactHistory() produces summary string
  → summary replaces trimmed messages in session history
  → summary written to user/working (replaces previous)
  → memory extraction writes facts to user/profile + user/patterns
```

### Compaction Prompt Change

Today's compaction uses a generic summarization prompt. Add a personal-assistant-focused variant that explicitly captures:

- What the user was working on and current status
- Decisions made and their outcomes
- Preferences or constraints the user expressed
- Open threads and follow-up items

This makes `user/working` genuinely useful as a "picking up where we left off" snapshot rather than a generic recap.

The improved prompt is only used when `user_namespace` is set. Existing generic compaction is unchanged otherwise.

---

## Section 4: Session Start Injection

### Injection Point

A new one-time `_injectSessionContext()` call in the orchestrator, triggered before the first user message of a new session. Separate from the existing per-turn `_injectMemoryContext()`.

### Injection Order in System Prompt

```
[base system prompt — SOUL.md / IDENTITY.md / etc.]

--- Who you're talking to ---
[user/profile content]        ← always injected if present

--- Recent context ---
[user/working content]        ← injected if not expired

[adaptive per-turn memory injection — unchanged, runs every turn]
```

### Idempotency

Session-start injection is tracked by a boolean flag on the orchestrator instance.
Reconnects to the same session ID do not re-inject.

### Graceful Degradation

| Condition | Behavior |
|---|---|
| No `user/profile` file | Skip block silently |
| `user/working` expired | Skip block, log at debug level |
| Memory store not configured | Entire feature no-ops |
| `user_namespace` not set | Current behavior, unchanged |

### Optional Proactive Greeting

When `proactive_session_greeting: true`, include a system instruction on the first turn:

> "If relevant, briefly acknowledge what the user was last working on before responding to their first message."

Off by default. Gives the Pi-like "picking up the thread" feel when enabled.

```yaml
memory:
  proactive_session_greeting: false   # default off
```

---

## Section 5: Cross-Channel Identity

No per-channel plumbing needed. All channels share the orchestrator config. When `user_namespace` is set, every channel reads/writes `user/*` automatically.

**First message on a new channel** — if the user switches from web UI to Telegram, `user/working` from web UI sessions is already present. The Telegram session injects it on first turn. This is the intended behavior.

---

## File-Level Change Summary

| File | Change |
|---|---|
| `src/memory/workingMemory.ts` | **New** — read/write/expiry logic for `user/working` |
| `src/memory/store.ts` | Add `writeWithMetadata()` supporting timestamped/expiry headers |
| `src/context/compaction.ts` | Add personal-assistant-focused compaction prompt option |
| `src/backends/native/orchestrator.ts` | Session-start injection + write working memory after compaction |
| `src/config/schema.ts` | New fields: `user_namespace`, `working_memory_ttl_days`, `working_memory_max_tokens`, `proactive_session_greeting` |
| `src/daemon/index.ts` | Pass user namespace config through to orchestrator |

---

## Config Reference (full)

```yaml
memory:
  # Shared identity namespace. When set, all channels share user/* memory.
  # Absent (default) = current session-scoped behavior, unchanged.
  user_namespace: "user"

  # How long working memory stays valid after the last compaction.
  working_memory_ttl_days: 14

  # Token budget for working memory injection at session start.
  working_memory_max_tokens: 1000

  # If true, instruct the model to acknowledge prior context on session start.
  proactive_session_greeting: false
```

---

## Success Criteria

1. Working memory survives a daemon restart and is injected on next session start.
2. Switching channels (e.g. Telegram → web UI) injects the same `user/working` content.
3. `user_namespace` absent = zero behavior change vs today (regression-safe).
4. Compaction with `user_namespace` set writes to `user/working` on every run.
5. Expired working memory is silently ignored without error.
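
### Appendix: Working Memory File Sketch

The Section 2 lifecycle (write on compaction, truncate to budget, check expiry lazily on read) might look roughly like the sketch below. It is illustrative only: the function names are hypothetical rather than the planned `workingMemory.ts` API, and the four-characters-per-token estimate is an assumption standing in for whatever tokenizer Flynn actually uses.

```typescript
// Hypothetical helpers; names and the token estimate are assumptions.
interface WorkingMemory {
  updated: Date;
  expires: Date;
  content: string;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;
const CHARS_PER_TOKEN = 4; // crude estimate, not a real tokenizer

// Drop milliseconds so timestamps match the "2026-02-25T11:30:00Z" style.
function isoSeconds(d: Date): string {
  return d.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Build the file body written to user/working after a compaction run.
function serializeWorkingMemory(
  summary: string,
  now: Date,
  ttlDays = 14,
  maxTokens = 1000,
): string {
  const truncated = summary.slice(0, maxTokens * CHARS_PER_TOKEN);
  const expires = new Date(now.getTime() + ttlDays * MS_PER_DAY);
  return [
    "# Working Memory",
    `Updated: ${isoSeconds(now)}`,
    `Expires: ${isoSeconds(expires)}`,
    "",
    truncated,
  ].join("\n");
}

// Parse the header lines; returns null if the file is malformed.
function parseWorkingMemory(text: string): WorkingMemory | null {
  const updated = /^Updated: (.+)$/m.exec(text);
  const expires = /^Expires: (.+)$/m.exec(text);
  if (!updated || !expires) return null;
  const updatedAt = new Date(updated[1]);
  const expiresAt = new Date(expires[1]);
  if (Number.isNaN(updatedAt.getTime()) || Number.isNaN(expiresAt.getTime())) {
    return null;
  }
  const blank = text.indexOf("\n\n");
  const content = blank === -1 ? "" : text.slice(blank + 2);
  return { updated: updatedAt, expires: expiresAt, content };
}

// Lazy expiry check on read: expired or malformed files are simply ignored.
function readFreshWorkingMemory(text: string, now: Date): string | null {
  const parsed = parseWorkingMemory(text);
  if (!parsed || parsed.expires.getTime() <= now.getTime()) return null;
  return parsed.content;
}
```

Because expiry is evaluated only inside `readFreshWorkingMemory`, no background cleanup is needed; a stale file stays on disk until the next compaction overwrites it.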
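
### Appendix: Session-Start Injection Sketch

The injection order, idempotency flag, and graceful-degradation rules from Section 4 can be sketched as a small class. This is a minimal illustration under stated assumptions: `SessionContextInjector` and `SessionMemory` are hypothetical names, not the orchestrator's actual `_injectSessionContext()` implementation, and `working` is assumed to have already passed the expiry check.

```typescript
// Hypothetical sketch; class and field names are assumptions, not Flynn's API.
interface SessionMemory {
  profile: string | null; // contents of user/profile, or null if the file is absent
  working: string | null; // contents of user/working, or null if absent or expired
}

class SessionContextInjector {
  // Per-instance boolean flag: reconnects to the same session do not re-inject.
  private injected = false;

  // Returns the blocks to append after the base system prompt,
  // or "" when nothing applies.
  inject(memory: SessionMemory, userNamespace?: string): string {
    // user_namespace unset means current behavior, unchanged.
    if (!userNamespace || this.injected) return "";
    this.injected = true;

    const blocks: string[] = [];
    if (memory.profile) {
      blocks.push(`--- Who you're talking to ---\n${memory.profile}`);
    }
    if (memory.working) {
      blocks.push(`--- Recent context ---\n${memory.working}`);
    }
    // Missing blocks are skipped silently.
    return blocks.join("\n\n");
  }
}
```

A first call such as `new SessionContextInjector().inject({ profile: "Name: Sam", working: null }, "user")` would yield only the profile block; a second call on the same instance returns an empty string.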