# Pi-Inspired Personal Assistant Memory Design

Date: 2026-02-25
Status: approved
Inspired by: [badlogic/pi-mono](https://github.com/badlogic/pi-mono)
Scope: Flynn-native implementation — no dependency on pi-agent-core

## Problem

Flynn's memory model has four concrete gaps that make it feel like a generic chatbot rather than a personal assistant:

1. **Forgets across sessions** — memory extraction runs but is unreliable; facts don't survive compaction consistently.
2. **Clunky compaction** — compaction summaries are generic and discarded after context trimming; important personal context is lost.
3. **No proactive recall** — `buildAdaptiveMemoryContext` exists but only scores by keyword overlap with the current message, never surfaces context unprompted.
4. **Fragmented across channels** — Telegram, Discord, and the gateway each have isolated sessions with no shared sense of "you."

## Design Goals

- Pick up where the last conversation left off, across any channel
- Never lose recent context to compaction
- Stable facts (preferences, patterns) persist indefinitely
- All behavior gated behind config; default is current behavior (opt-in)

## Non-Goals (this phase)

- Multi-user deployments with per-user auth
- Proactive mid-session memory surfacing (beyond session start)
- Full vector/semantic replacement of adaptive injection

---

## Architecture

Two-tier memory structure added to the orchestrator:

```
Long-term store (existing)          Working memory (new)
  memory/user/profile      ←→         memory/user/working
  memory/user/patterns                  (TTL: ~14 days)
  memory/sessions/...                   (replaced per compaction)
        ↓                                      ↓
   injected via                         injected wholesale
   adaptive scoring                     at session start
   (keyword/vector match)               (always present if fresh)
```

**Long-term store** — existing `MemoryStore` namespaces, unchanged. Stable facts extracted from conversations, searched adaptively per-turn.

**Working memory** — a new `user/working` namespace written on every compaction. Acts as a "what's been happening lately" snapshot. Injected in full at session start. Expires after N days (default 14).

**Unified user namespace** — a canonical `user/*` tree shared across all channels, replacing today's session-scoped isolation.

---

## Section 1: Unified User Namespace

### Namespace Layout

```
memory/
  user/
    profile      ← stable facts: name, timezone, role, preferences
    patterns     ← recurring behaviors: working style, recurring topics
    working      ← rolling compaction summary (TTL-based)
  sessions/
    telegram:123/...    ← session-specific (unchanged, existing behavior)
    ws:abc/...
```

### Identity Model

A single `memory.user_namespace` config key (default: unset) ties all channels together. When it is set, every channel on the Flynn instance treats memory as belonging to one person. When it is unset, the current session-scoped behavior applies, unchanged.

This is appropriate for personal assistant deployments (one person, many surfaces). Multi-user is out of scope.

### Config

```yaml
memory:
  user_namespace: "user"   # enables shared identity; absent = session-scoped (current)
```

### Extraction Routing

When `user_namespace` is set:
- Stable extracted facts → `user/profile`, `user/patterns`
- Compaction summary → `user/working`
- Session-specific context → `sessions/<id>/...` (unchanged)
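
The routing rules above can be sketched as a small pure function. This is a minimal illustration, not Flynn's actual extractor: `ItemKind`, `USER_SUFFIX`, and `routeNamespace` are hypothetical names, and returning `null` when `user_namespace` is unset is an assumed way to signal "fall through to the existing path".

```typescript
// Hypothetical item kinds; the real extractor's output shape is not specified in this design.
type ItemKind = "profile" | "pattern" | "working" | "session";

// Suffix under the user namespace for each user-scoped kind.
const USER_SUFFIX: Record<Exclude<ItemKind, "session">, string> = {
  profile: "profile",
  pattern: "patterns",
  working: "working",
};

/**
 * Returns the namespace an item should be written to, or null when
 * user_namespace is unset and the caller should keep the existing path.
 */
function routeNamespace(
  kind: ItemKind,
  sessionId: string,
  userNamespace?: string,
): string | null {
  if (!userNamespace) return null;
  if (kind === "session") return `sessions/${sessionId}`;
  return `${userNamespace}/${USER_SUFFIX[kind]}`;
}
```

Keeping the routing free of I/O makes the opt-in gate easy to unit test separately from the store.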

---

## Section 2: Working Memory Layer

### Storage

`user/working` namespace in the existing `MemoryStore` (flat file, no new storage engine). File format:

```
# Working Memory
Updated: 2026-02-25T11:30:00Z
Expires: 2026-03-11T11:30:00Z

[compaction summary content]
```
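
A minimal sketch of reading and writing this file format, assuming the header is exactly three lines followed by one blank line. The `WorkingMemory` type and the function names are hypothetical; the planned `writeWithMetadata()` in `src/memory/store.ts` may look quite different.

```typescript
interface WorkingMemory {
  updated: Date;
  expires: Date;
  body: string;
}

// ISO 8601 in UTC without milliseconds, matching the header style above.
function toIsoSeconds(d: Date): string {
  return d.toISOString().replace(/\.\d{3}Z$/, "Z");
}

function serializeWorkingMemory(wm: WorkingMemory): string {
  return [
    "# Working Memory",
    `Updated: ${toIsoSeconds(wm.updated)}`,
    `Expires: ${toIsoSeconds(wm.expires)}`,
    "",
    wm.body,
  ].join("\n");
}

// Parses a "Key: <ISO date>" header line; returns null if malformed.
function headerDate(line: string | undefined, key: string): Date | null {
  if (!line || !line.startsWith(`${key}: `)) return null;
  const d = new Date(line.slice(key.length + 2));
  return Number.isNaN(d.getTime()) ? null : d;
}

// Returns null for anything that does not match the expected layout,
// so a corrupt file degrades to "no working memory" rather than an error.
function parseWorkingMemory(text: string): WorkingMemory | null {
  const lines = text.split("\n");
  if (lines[0] !== "# Working Memory") return null;
  const updated = headerDate(lines[1], "Updated");
  const expires = headerDate(lines[2], "Expires");
  if (!updated || !expires) return null;
  // Line 3 is the blank separator; the body is everything after it.
  return { updated, expires, body: lines.slice(4).join("\n") };
}
```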

### Lifecycle

| Event | Action |
|---|---|
| Compaction runs | Write summary to `user/working`, replacing previous content |
| Session starts | Read `user/working`; inject if `Expires` is in the future |
| `Expires` in the past | File is ignored; overwritten on next compaction |
| Memory store not configured | Entire feature is a no-op |

No background cleanup job required — expiry is checked lazily on read.
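
The lazy expiry check can be sketched as follows. The helper names are hypothetical; the only behavior taken from the design is that `Expires` is stamped at write time and compared against the clock at read time.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Expiry stamped at write time: Updated + TTL.
function computeExpiry(updated: Date, ttlDays: number): Date {
  return new Date(updated.getTime() + ttlDays * MS_PER_DAY);
}

// Lazy check at read time; no background job is involved.
function isFresh(expires: Date, now: Date = new Date()): boolean {
  return expires.getTime() > now.getTime();
}

// Returns the body to inject, or null if the snapshot is missing or expired.
function bodyIfFresh(
  snapshot: { expires: Date; body: string } | null,
  now: Date = new Date(),
): string | null {
  if (!snapshot || !isFresh(snapshot.expires, now)) return null;
  return snapshot.body;
}
```

Passing `now` as a parameter keeps the check deterministic in tests.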

### Size Budget

Working memory is capped at `working_memory_max_tokens` (default 1000 tokens). A compaction summary that exceeds the budget is truncated before it is written, which keeps injection overhead predictable.
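
One way the truncation step might look, as a rough sketch. The design does not say how tokens are counted, so this assumes a crude heuristic of about 4 characters per token; a real implementation would more likely use the model's tokenizer. `truncateToTokenBudget` is a hypothetical name.

```typescript
// ASSUMPTION: ~4 characters per token. This is a rough heuristic, not a tokenizer.
const APPROX_CHARS_PER_TOKEN = 4;

function truncateToTokenBudget(summary: string, maxTokens: number): string {
  const maxChars = maxTokens * APPROX_CHARS_PER_TOKEN;
  if (summary.length <= maxChars) return summary;
  const cut = summary.slice(0, maxChars);
  // Prefer cutting at the last line break so no half-written line is stored.
  const lastBreak = cut.lastIndexOf("\n");
  return (lastBreak > 0 ? cut.slice(0, lastBreak) : cut).trimEnd();
}
```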

### Config

```yaml
memory:
  working_memory_ttl_days: 14      # expiry window; default 14
  working_memory_max_tokens: 1000  # injection size cap; default 1000
```

---

## Section 3: Compaction → Working Memory Flow

### Current Flow

```
history exceeds threshold
  → compactHistory() produces summary string
  → summary replaces trimmed messages in session history
  → summary string discarded
```

### New Flow

```
history exceeds threshold
  → compactHistory() produces summary string
  → summary replaces trimmed messages in session history
  → summary written to user/working (replaces previous)
  → memory extraction writes facts to user/profile + user/patterns
```
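
The new flow can be sketched with injected dependencies. Only `compactHistory` is named in this design, and its signature here is assumed; `Message`, `FlowDeps`, `writeWorking`, the `keepLast` threshold, and storing the summary as a `system` message are all illustrative assumptions. The memory-extraction step is omitted for brevity.

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

interface FlowDeps {
  // Name from the design; the signature is an assumption.
  compactHistory(trimmed: Message[]): Promise<string>;
  // Left undefined when user_namespace or the memory store is not configured.
  writeWorking?: (summary: string) => Promise<void>;
}

async function compactAndPersist(
  history: Message[],
  keepLast: number,
  deps: FlowDeps,
): Promise<Message[]> {
  // Below the threshold: nothing to compact.
  if (history.length <= keepLast) return history;

  const trimmed = history.slice(0, history.length - keepLast);
  const kept = history.slice(history.length - keepLast);

  const summary = await deps.compactHistory(trimmed);

  // New step: persist the summary instead of discarding it.
  if (deps.writeWorking) await deps.writeWorking(summary);

  // The summary replaces the trimmed messages in session history.
  return [{ role: "system", content: summary }, ...kept];
}
```

With `writeWorking` undefined the function reduces to the current flow, which matches the opt-in requirement.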

### Compaction Prompt Change

Today's compaction uses a generic summarization prompt. Add a personal-assistant-focused variant that explicitly captures:

- What the user was working on and current status
- Decisions made and their outcomes
- Preferences or constraints the user expressed
- Open threads and follow-up items

This makes `user/working` genuinely useful as a "picking up where we left off" snapshot rather than a generic recap.

The improved prompt is only used when `user_namespace` is set. Existing generic compaction is unchanged otherwise.
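
A sketch of how the prompt selection might be gated. Both prompt strings are placeholders written for illustration, not Flynn's actual prompts, and `selectCompactionPrompt` is a hypothetical name.

```typescript
// Placeholder text; the existing generic prompt is not reproduced in this document.
const GENERIC_PROMPT = "Summarize the conversation so far.";

// Illustrative wording built from the four capture points listed above.
const PERSONAL_ASSISTANT_PROMPT = [
  "Summarize this conversation so a personal assistant can resume it later. Capture:",
  "- What the user was working on and the current status",
  "- Decisions made and their outcomes",
  "- Preferences or constraints the user expressed",
  "- Open threads and follow-up items",
].join("\n");

// The personal-assistant variant is used only when user_namespace is set.
function selectCompactionPrompt(userNamespace?: string): string {
  return userNamespace ? PERSONAL_ASSISTANT_PROMPT : GENERIC_PROMPT;
}
```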

---

## Section 4: Session Start Injection

### Injection Point

A new one-time `_injectSessionContext()` call in the orchestrator, triggered before the first user message of a new session. Separate from the existing per-turn `_injectMemoryContext()`.

### Injection Order in System Prompt

```
[base system prompt — SOUL.md / IDENTITY.md / etc.]

--- Who you're talking to ---
[user/profile content]          ← always injected if present

--- Recent context ---
[user/working content]          ← injected if not expired

[adaptive per-turn memory injection — unchanged, runs every turn]
```
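
The ordering above could be assembled roughly like this. `SessionContext` and `buildSessionSystemPrompt` are hypothetical names, and the sketch assumes the caller has already resolved expiry, passing `null` for a missing or expired block.

```typescript
interface SessionContext {
  basePrompt: string;
  profile?: string | null; // user/profile content, if present
  working?: string | null; // user/working content, already checked for expiry
}

function buildSessionSystemPrompt(ctx: SessionContext): string {
  const parts = [ctx.basePrompt];
  // Each block is skipped silently when its content is absent.
  if (ctx.profile) parts.push(`--- Who you're talking to ---\n${ctx.profile}`);
  if (ctx.working) parts.push(`--- Recent context ---\n${ctx.working}`);
  return parts.join("\n\n");
}
```

Per-turn adaptive injection is left out because it runs separately on every turn.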

### Idempotency

Session-start injection is tracked by a boolean flag on the orchestrator instance, so it runs at most once per instance. Reconnects to the same session ID do not re-inject.
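
A minimal sketch of the flag-based guard. The class and method names are hypothetical; in the real orchestrator the flag would presumably sit next to `_injectSessionContext()`.

```typescript
class SessionStartInjector {
  private injected = false;
  private readonly buildContext: () => string;

  constructor(buildContext: () => string) {
    this.buildContext = buildContext;
  }

  // Returns the context on the first call and null on every later call.
  injectOnce(): string | null {
    if (this.injected) return null;
    this.injected = true;
    return this.buildContext();
  }
}
```

Because the flag lives in memory, a new orchestrator instance (for example after a daemon restart) injects again on its first session start.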

### Graceful Degradation

| Condition | Behavior |
|---|---|
| No `user/profile` file | Skip block silently |
| `user/working` expired | Skip block, log at debug level |
| Memory store not configured | Entire feature no-ops |
| `user_namespace` not set | Current behavior, unchanged |

### Optional Proactive Greeting

When `proactive_session_greeting: true`, include a system instruction on the first turn:

> "If relevant, briefly acknowledge what the user was last working on before responding to their first message."

Off by default. Gives the Pi-like "picking up the thread" feel when enabled.

```yaml
memory:
  proactive_session_greeting: false   # default off
```
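
As a sketch, the gate for this instruction might be as small as the function below. `firstTurnInstructions` and its parameters are hypothetical; only the instruction text comes from this document.

```typescript
const GREETING_INSTRUCTION =
  "If relevant, briefly acknowledge what the user was last working on " +
  "before responding to their first message.";

// Extra system instructions for a turn; empty unless the feature is on
// and this is the first turn of the session.
function firstTurnInstructions(
  proactiveSessionGreeting: boolean,
  isFirstTurn: boolean,
): string[] {
  return proactiveSessionGreeting && isFirstTurn ? [GREETING_INSTRUCTION] : [];
}
```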

---

## Section 5: Cross-Channel Identity

No per-channel plumbing needed. All channels share the orchestrator config. When `user_namespace` is set, every channel reads/writes `user/*` automatically.

**First message on a new channel** — if the user switches from web UI to Telegram, `user/working` from web UI sessions is already present. The Telegram session injects it on first turn. This is the intended behavior.
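
The cross-channel behavior can be illustrated with an in-memory stand-in for `MemoryStore`. Everything here (`InMemoryStore`, `ChannelSession`, the method names) is hypothetical; the point is only that the working-memory path depends on the shared config and never on the session ID.

```typescript
// In-memory stand-in for MemoryStore, for illustration only.
class InMemoryStore {
  private files = new Map<string, string>();
  write(namespace: string, content: string): void {
    this.files.set(namespace, content);
  }
  read(namespace: string): string | undefined {
    return this.files.get(namespace);
  }
}

class ChannelSession {
  readonly sessionId: string;
  private store: InMemoryStore;
  private userNamespace: string | undefined;

  constructor(sessionId: string, store: InMemoryStore, userNamespace?: string) {
    this.sessionId = sessionId;
    this.store = store;
    this.userNamespace = userNamespace;
  }

  // Compaction on any channel overwrites the single shared snapshot.
  onCompaction(summary: string): void {
    if (this.userNamespace) this.store.write(`${this.userNamespace}/working`, summary);
  }

  // Session start on any channel reads the same snapshot.
  onSessionStart(): string | undefined {
    return this.userNamespace
      ? this.store.read(`${this.userNamespace}/working`)
      : undefined;
  }
}
```

A compaction in a `ws:abc` session followed by a session start in `telegram:123` returns the same summary, as long as both share the store and `user_namespace`.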

---

## File-Level Change Summary

| File | Change |
|---|---|
| `src/memory/workingMemory.ts` | **New** — read/write/expiry logic for `user/working` |
| `src/memory/store.ts` | Add `writeWithMetadata()` supporting timestamped/expiry headers |
| `src/context/compaction.ts` | Add personal-assistant-focused compaction prompt option |
| `src/backends/native/orchestrator.ts` | Session-start injection + write working memory after compaction |
| `src/config/schema.ts` | New fields: `user_namespace`, `working_memory_ttl_days`, `working_memory_max_tokens`, `proactive_session_greeting` |
| `src/daemon/index.ts` | Pass user namespace config through to orchestrator |

---

## Config Reference (full)

```yaml
memory:
  # Shared identity namespace. When set, all channels share user/* memory.
  # Absent (default) = current session-scoped behavior, unchanged.
  user_namespace: "user"

  # How long working memory stays valid after the last compaction.
  working_memory_ttl_days: 14

  # Token budget for working memory injection at session start.
  working_memory_max_tokens: 1000

  # If true, instruct the model to acknowledge prior context on session start.
  proactive_session_greeting: false
```
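
A sketch of how these fields and their defaults might be resolved in code. The interface and function names are hypothetical, and `src/config/schema.ts` may well use a schema library instead; treating an empty string as unset is also an assumption.

```typescript
// Raw shape of the `memory:` block; every field is optional.
interface MemoryConfig {
  user_namespace?: string;
  working_memory_ttl_days?: number;
  working_memory_max_tokens?: number;
  proactive_session_greeting?: boolean;
}

interface ResolvedMemoryConfig {
  userNamespace: string | undefined;
  ttlDays: number;
  maxTokens: number;
  proactiveGreeting: boolean;
}

function resolveMemoryConfig(raw: MemoryConfig = {}): ResolvedMemoryConfig {
  return {
    // ASSUMPTION: an empty string counts as unset (session-scoped behavior).
    userNamespace: raw.user_namespace || undefined,
    ttlDays: raw.working_memory_ttl_days ?? 14,
    maxTokens: raw.working_memory_max_tokens ?? 1000,
    proactiveGreeting: raw.proactive_session_greeting ?? false,
  };
}
```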

---

## Success Criteria

1. Working memory survives a daemon restart and is injected on next session start.
2. Switching channels (e.g. Telegram → web UI) injects the same `user/working` content.
3. `user_namespace` absent = zero behavior change vs today (regression-safe).
4. Compaction with `user_namespace` set writes to `user/working` on every run.
5. Expired working memory is silently ignored without error.