flynn/docs/plans/2026-02-25-pi-personal-assistant-memory-plan.md

# Pi-Inspired Personal Assistant Memory — Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Add two-tier personal-assistant memory: a unified `user/*` namespace shared across channels and a working-memory layer (`user/working`) that survives restarts and is injected at session start.

**Architecture:** `user/working` is written on every compaction (TTL-based flat file with header metadata). On the first message of a new session, `user/profile` and `user/working` are composed into the system prompt alongside the existing memory context. All behavior is behind a `memory.user_namespace` config key; when unset the feature is entirely inert.

**Tech Stack:** TypeScript, Vitest, existing `MemoryStore` (`src/memory/store.ts`), `AgentOrchestrator` (`src/backends/native/orchestrator.ts`), Zod config schema

**Key design decisions (from review):**
- Session context is stored in a separate `_sessionContext` field — `_systemPromptBase` is never mutated
- Compaction prompt is parameterized via `buildCompactionPrompt()` rather than duplicating the constant
- Expired-memory test uses a hardcoded past date (not TTL=0) for determinism
- Memory extraction writes to `{userNamespace}/facts` when user namespace is set

---

## Task 1: Config schema — new memory fields

**Files:**
- Modify: `src/config/schema.ts`

Add four new fields to `memorySchema`. They must come before `.default({})`.

**Step 1: Run baseline typecheck**

```bash
pnpm typecheck
```
Expected: passes (baseline).

**Step 2: Add fields to memorySchema**

In `src/config/schema.ts`, inside `memorySchema` (after `qmd: qmdSchema,`, before `}).default({});`), add:

```typescript
  /**
   * When set, all channels share user/* memory (unified identity namespace).
   * Absent = current session-scoped behavior, unchanged.
   */
  user_namespace: z.string().optional(),
  /** How long working memory remains valid after the last compaction (days). */
  working_memory_ttl_days: z.number().min(1).max(365).default(14),
  /** Token budget for working memory injection at session start. */
  working_memory_max_tokens: z.number().min(100).max(4000).default(1000),
  /** When true, instruct the model to acknowledge prior context on session start. */
  proactive_session_greeting: z.boolean().default(false),
```

**Step 3: Run typecheck**

```bash
pnpm typecheck
```
Expected: no errors.

**Step 4: Commit**

```bash
git add src/config/schema.ts
git commit -m "feat(memory): add user_namespace and working memory config fields"
```

---

## Task 2: WorkingMemory module

**Files:**
- Create: `src/memory/workingMemory.ts`
- Create: `src/memory/workingMemory.test.ts`

Working memory file format (stored at `{ns}/working.md`):

```
# Working Memory
Updated: 2026-02-25T11:30:00Z
Expires: 2026-03-11T11:30:00Z

[content]
```
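The `Expires` header is `Updated` plus the configured TTL, and expiry is checked lazily on read. A minimal sketch of that arithmetic, assuming two hypothetical helpers, `computeExpiry` and `isExpired` (neither is part of the plan; the module in Step 3 inlines the same logic):

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Hypothetical helper: expiry = update time + TTL in days. */
function computeExpiry(updatedAt: Date, ttlDays: number): Date {
  return new Date(updatedAt.getTime() + ttlDays * MS_PER_DAY);
}

/** Hypothetical helper: an entry is expired once `now` reaches its expiry. */
function isExpired(expiresAt: Date, now: Date = new Date()): boolean {
  return expiresAt.getTime() <= now.getTime();
}

const updated = new Date('2026-01-01T00:00:00Z');
const expires = computeExpiry(updated, 14);
console.log(expires.toISOString()); // 2026-01-15T00:00:00.000Z
console.log(isExpired(expires, updated)); // false
```

`Date.prototype.toISOString()` always includes milliseconds, so files written by the module will carry timestamps such as `2026-01-15T00:00:00.000Z`; `new Date(...)` parses both forms.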

**Step 1: Write the failing tests**

Create `src/memory/workingMemory.test.ts`:

```typescript
import { describe, it, expect } from 'vitest';
import { join } from 'path';
import { mkdtempSync, rmSync } from 'fs';
import { tmpdir } from 'os';
import { MemoryStore } from './store.js';
import { writeWorkingMemory, readWorkingMemory } from './workingMemory.js';

function makeStore(): { store: MemoryStore; dir: string } {
  const dir = mkdtempSync(join(tmpdir(), 'wm-test-'));
  const store = new MemoryStore({ dir, maxContextTokens: 2000 });
  return { store, dir };
}

describe('writeWorkingMemory', () => {
  it('writes a file with Updated/Expires headers', () => {
    const { store, dir } = makeStore();
    writeWorkingMemory(store, 'user/working', 'some content', 14, 1000);
    const raw = store.read('user/working');
    expect(raw).toContain('# Working Memory');
    expect(raw).toContain('Updated:');
    expect(raw).toContain('Expires:');
    expect(raw).toContain('some content');
    rmSync(dir, { recursive: true });
  });

  it('truncates content to token budget', () => {
    const { store, dir } = makeStore();
    const longContent = 'x'.repeat(10000);
    writeWorkingMemory(store, 'user/working', longContent, 14, 100);
    const raw = store.read('user/working');
    // 100 tokens * 4 chars = 400 chars budget for content
    const contentPart = raw.split('\n\n').slice(1).join('\n\n');
    expect(contentPart.length).toBeLessThanOrEqual(400 + 10); // small tolerance
    rmSync(dir, { recursive: true });
  });
});

describe('readWorkingMemory', () => {
  it('returns null when file does not exist', () => {
    const { store, dir } = makeStore();
    expect(readWorkingMemory(store, 'user/working')).toBeNull();
    rmSync(dir, { recursive: true });
  });

  it('returns content when not expired', () => {
    const { store, dir } = makeStore();
    writeWorkingMemory(store, 'user/working', 'hello world', 14, 1000);
    const result = readWorkingMemory(store, 'user/working');
    expect(result).not.toBeNull();
    expect(result!.content).toBe('hello world');
    rmSync(dir, { recursive: true });
  });

  it('returns null when expired', () => {
    const { store, dir } = makeStore();
    // Write a file with a hardcoded past expiry date for deterministic testing
    const expiredFile = [
      '# Working Memory',
      'Updated: 2025-01-01T00:00:00Z',
      'Expires: 2025-01-02T00:00:00Z',
      '',
      'stale content',
    ].join('\n');
    store.write('user/working', expiredFile, 'replace');
    const result = readWorkingMemory(store, 'user/working');
    expect(result).toBeNull();
    rmSync(dir, { recursive: true });
  });

  it('returns null for malformed file', () => {
    const { store, dir } = makeStore();
    store.write('user/working', 'no headers here', 'replace');
    expect(readWorkingMemory(store, 'user/working')).toBeNull();
    rmSync(dir, { recursive: true });
  });
});
```

**Step 2: Run test to verify it fails**

```bash
pnpm test:run src/memory/workingMemory.test.ts
```
Expected: FAIL — module not found.

**Step 3: Implement `src/memory/workingMemory.ts`**

```typescript
import type { MemoryStore } from './store.js';

export interface WorkingMemoryEntry {
  content: string;
  updatedAt: Date;
  expiresAt: Date;
}

const HEADER_PREFIX = '# Working Memory\n';

/**
 * Write a compaction summary to the working memory namespace.
 * Content is capped at maxTokens (estimated at 4 chars/token).
 * The file format includes Updated/Expires timestamps for lazy expiry checks.
 */
export function writeWorkingMemory(
  store: MemoryStore,
  namespace: string,
  content: string,
  ttlDays: number,
  maxTokens: number,
): void {
  const now = new Date();
  const expiresAt = new Date(now.getTime() + ttlDays * 24 * 60 * 60 * 1000);
  const maxChars = maxTokens * 4;
  const truncatedContent = content.length > maxChars ? content.slice(0, maxChars) : content;

  const file = [
    '# Working Memory',
    `Updated: ${now.toISOString()}`,
    `Expires: ${expiresAt.toISOString()}`,
    '',
    truncatedContent,
  ].join('\n');

  store.write(namespace, file, 'replace');
}

/**
 * Read working memory. Returns null if the file is absent, malformed, or expired.
 * MemoryStore.read() returns '' for missing files, so falsy check works.
 * Expiry is checked lazily here — no background cleanup needed.
 */
export function readWorkingMemory(
  store: MemoryStore,
  namespace: string,
): WorkingMemoryEntry | null {
  const raw = store.read(namespace);
  if (!raw) {
    return null;
  }

  if (!raw.startsWith(HEADER_PREFIX)) {
    return null;
  }

  const lines = raw.split('\n');
  let updatedAt: Date | null = null;
  let expiresAt: Date | null = null;
  let contentStartLine = 0;

  for (let i = 1; i < lines.length; i++) {
    const line = lines[i];
    if (line.startsWith('Updated: ')) {
      updatedAt = new Date(line.slice('Updated: '.length));
    } else if (line.startsWith('Expires: ')) {
      expiresAt = new Date(line.slice('Expires: '.length));
    } else if (line === '' && expiresAt !== null) {
      contentStartLine = i + 1;
      break;
    }
  }

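  // Reject files with missing or unparseable timestamps, or with no blank
  // line separating the headers from the content.
  if (
    !updatedAt || !expiresAt ||
    isNaN(updatedAt.getTime()) || isNaN(expiresAt.getTime()) ||
    contentStartLine === 0
  ) {
    return null;
  }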

  if (expiresAt.getTime() <= Date.now()) {
    console.debug('[Flynn:working-memory] Working memory expired, skipping injection');
    return null;
  }

  const content = lines.slice(contentStartLine).join('\n').trim();

  return { content, updatedAt, expiresAt };
}
```
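The 4-characters-per-token cap applied by `writeWorkingMemory` can be exercised in isolation. A minimal sketch, assuming a hypothetical helper `truncateToTokenBudget` (not part of the plan; the module above inlines the same slice):

```typescript
/** Rough estimate used in this plan: about 4 characters per token. */
const CHARS_PER_TOKEN = 4;

/** Hypothetical helper: cap `content` at roughly `maxTokens` tokens. */
function truncateToTokenBudget(content: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  return content.length > maxChars ? content.slice(0, maxChars) : content;
}

console.log(truncateToTokenBudget('x'.repeat(10000), 100).length); // 400
console.log(truncateToTokenBudget('short', 100)); // short
```

A plain `slice` can cut mid-word or mid-bullet. That is acceptable for a budget guard, but worth keeping in mind when reading a truncated summary.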

**Step 4: Run tests**

```bash
pnpm test:run src/memory/workingMemory.test.ts
```
Expected: all pass.

**Step 5: Typecheck**

```bash
pnpm typecheck
```
Expected: no errors.

**Step 6: Commit**

```bash
git add src/memory/workingMemory.ts src/memory/workingMemory.test.ts
git commit -m "feat(memory): add working memory read/write with TTL expiry"
```

---

## Task 3: Parameterized compaction prompt

**Files:**
- Modify: `src/backends/native/prompts.ts`

Instead of duplicating `COMPACTION_SYSTEM_PROMPT`, add a `buildCompactionPrompt()` function that shares rules and parameterizes the focus section. The existing `COMPACTION_SYSTEM_PROMPT` constant stays for backward compatibility.

**Step 1: Add the parameterized prompt builder**

In `src/backends/native/prompts.ts`, after `COMPACTION_SYSTEM_PROMPT`, add:

```typescript
/**
 * Build a compaction system prompt. When `personalAssistant` is true,
 * the prompt focuses on continuity context (what the user was working on,
 * decisions, preferences, open threads) rather than generic summarization.
 *
 * The shared rules (preserve facts, 20% length, bullet points, no invention,
 * skip transient content) are identical in both variants.
 */
export function buildCompactionPrompt(opts?: { personalAssistant?: boolean }): string {
  const focus = opts?.personalAssistant
    ? `Focus on:
- What the user was working on and its current status (be specific: which files, commands, or steps were involved)
- Decisions made and why (include rationale when stated)
- Preferences or constraints the user expressed (tools, styles, approaches to avoid or prefer)
- Open threads, unresolved questions, or explicit follow-up items`
    : `Focus on:
- Key topics discussed and conclusions reached
- Important decisions, commitments, or action items
- Technical details, code changes, or configurations that were established`;

  const preamble = opts?.personalAssistant
    ? 'You are summarising a conversation for a personal assistant. Your summary will be injected at the start of the next session so the assistant can pick up exactly where things left off.'
    : 'You are a conversation summarizer. Create a concise summary of the conversation that captures all important information.';

  return `${preamble}

${focus}

Rules:
- Preserve key facts, file paths, error messages, and specific values verbatim.
- Be concise but complete — aim for roughly 20% of the original length.
- Use bullet points under short descriptive headings.
- Never invent information not present in the conversation.
- Skip purely transient content (one-off commands, status messages with no lasting significance).

Output format:
Return a markdown summary. No preamble — output only the summary.`;
}
```

**Step 2: Typecheck**

```bash
pnpm typecheck
```
Expected: no errors.

**Step 3: Commit**

```bash
git add src/backends/native/prompts.ts
git commit -m "feat(memory): add parameterized compaction prompt builder"
```

---

## Task 4: Thread working memory through compaction

**Files:**
- Modify: `src/context/compaction.ts`

Add `summary` to `CompactionResult` so the orchestrator can write it to `user/working` without recomputing it. Add an option that selects the personal-assistant (PA) compaction prompt, and route memory extraction to `{userNamespace}/facts` when a user namespace is set.

**Step 1: Update `CompactionResult` interface**

In `src/context/compaction.ts`, add `summary` to `CompactionResult`:

```typescript
export interface CompactionResult {
  /** The compacted messages: [summary, ...recentMessages]. */
  messages: Message[];
  /** Number of messages that were compacted (removed). */
  compactedCount: number;
  /** Estimated tokens before compaction. */
  tokensBefore: number;
  /** Estimated tokens after compaction. */
  tokensAfter: number;
  /** The raw summary text produced by the compaction model (populated when compaction ran). */
  summary?: string;
}
```

**Step 2: Add `usePersonalAssistantPrompt` option and populate `summary`**

Update the `compactHistory` function signature:

```typescript
export async function compactHistory(opts: {
  messages: Message[];
  orchestrator: AgentOrchestrator;
  config: CompactionConfig;
  memoryStore?: MemoryStore;
  autoExtract?: boolean;
  usePersonalAssistantPrompt?: boolean;
  memoryExtractionNamespace?: string;
}): Promise<CompactionResult> {
```

Update the imports to include `buildCompactionPrompt`:

```typescript
import { COMPACTION_SYSTEM_PROMPT, MEMORY_EXTRACTION_PROMPT, buildCompactionPrompt } from '../backends/native/prompts.js';
```

In the body, replace the hardcoded `COMPACTION_SYSTEM_PROMPT` in the `orchestrator.delegate()` call:

```typescript
const systemPrompt = opts.usePersonalAssistantPrompt
  ? buildCompactionPrompt({ personalAssistant: true })
  : COMPACTION_SYSTEM_PROMPT;

const result = await orchestrator.delegate({
  task: 'compaction',
  tier,
  systemPrompt,
  message: formattedConversation,
  maxTokens: config.summaryMaxTokens,
});
```

Populate `summary` in the return value:

```typescript
  return {
    messages: [...preservedMessages, summaryMessage, ...toKeep],
    compactedCount: toSummarize.length,
    tokensBefore: estimateMessageTokens(messages),
    tokensAfter: estimateMessageTokens([...preservedMessages, summaryMessage, ...toKeep]),
    summary: result.content,
  };
```

**Step 3: Route memory extraction to user namespace when set**

In the memory extraction block (around line 133), change the write target:

```typescript
  if (opts.memoryStore && opts.autoExtract !== false) {
    try {
      // ...existing extraction delegate call...

      const extractedContent = extraction.content.trim();
      if (extractedContent.length > 0 && !extractedContent.toLowerCase().includes('no facts')) {
        const extractionNs = opts.memoryExtractionNamespace ?? 'global';
        opts.memoryStore.write(extractionNs, extractedContent, 'append');
        console.log(`[Flynn:memory] Extracted ${extractedContent.length} chars of facts to ${extractionNs} memory`);
      }
    } catch (error) {
      console.warn('[Flynn:memory] Failed to extract facts during compaction:', error);
    }
  }
```
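Combined with the orchestrator change in Task 5 (which passes `{userNamespace}/facts` or `undefined`), the write target resolves as sketched below. `resolveExtractionNamespace` is a hypothetical helper used only for illustration; the plan itself splits this logic between `compact()` and `compactHistory()`:

```typescript
/**
 * Hypothetical helper mirroring the routing rule in this plan:
 * facts go to `{userNamespace}/facts` when a user namespace is configured,
 * otherwise to the existing `global` namespace.
 */
function resolveExtractionNamespace(userNamespace?: string): string {
  return userNamespace ? `${userNamespace}/facts` : 'global';
}

console.log(resolveExtractionNamespace('user')); // user/facts
console.log(resolveExtractionNamespace());       // global
console.log(resolveExtractionNamespace(''));     // global
```

An empty string is treated the same as an unset namespace, which matches the truthiness checks used in the orchestrator snippets.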

**Step 4: Typecheck**

```bash
pnpm typecheck
```
Expected: no errors.

**Step 5: Run full test suite**

```bash
pnpm test:run
```
Expected: all pass (summary field is additive, existing tests unaffected).

**Step 6: Commit**

```bash
git add src/context/compaction.ts
git commit -m "feat(memory): thread summary and PA prompt through compaction result"
```

---

## Task 5: Orchestrator — session-start injection + post-compaction write

**Files:**
- Modify: `src/backends/native/orchestrator.ts`

This task has two parts:
1. Write working memory after every compaction (when `userNamespace` is set)
2. Inject `user/profile` + `user/working` into the system prompt on first message

**Critical design note:** `_systemPromptBase` must remain immutable (set once in constructor). Session context is stored in a separate `_sessionContext` field and composed into the system prompt in `_injectMemoryContext()`.

**Step 1: Explore existing test patterns**

```bash
ls src/backends/native/*.test.ts
```

Examine the existing orchestrator test setup so that any orchestrator tests added for this task follow the same mock patterns.

**Step 2: Add new fields to `OrchestratorConfig`**

In `src/backends/native/orchestrator.ts`, in `OrchestratorConfig` (after `attachmentCollector`), add:

```typescript
  /** Shared identity namespace for cross-channel memory (e.g. 'user'). Absent = session-scoped. */
  userNamespace?: string;
  /** TTL in days for working memory. Defaults to 14. */
  workingMemoryTtlDays?: number;
  /** Token budget for working memory injection. Defaults to 1000. */
  workingMemoryMaxTokens?: number;
  /** When true, instruct the model to acknowledge prior context on session start. */
  proactiveSessionGreeting?: boolean;
```

**Step 3: Store new config fields on the class**

Add private fields (near the other `_memory*` fields):

```typescript
private _userNamespace?: string;
private _workingMemoryTtlDays: number;
private _workingMemoryMaxTokens: number;
private _proactiveSessionGreeting: boolean;
private _sessionContext: string | null = null;
```

In the constructor, after the existing memory field assignments:

```typescript
this._userNamespace = config.userNamespace;
this._workingMemoryTtlDays = config.workingMemoryTtlDays ?? 14;
this._workingMemoryMaxTokens = config.workingMemoryMaxTokens ?? 1000;
this._proactiveSessionGreeting = config.proactiveSessionGreeting ?? false;
```

**Step 4: Post-compaction write in `compact()`**

Import `writeWorkingMemory` at the top of orchestrator.ts:

```typescript
import { writeWorkingMemory } from '../../memory/workingMemory.js';
```

In the `compact()` method, after the `auditLogger?.sessionCompact(...)` call, add:

```typescript
    // Write working memory when user namespace is configured
    if (result.summary && this._userNamespace && this._memoryStore) {
      const workingNs = `${this._userNamespace}/working`;
      writeWorkingMemory(
        this._memoryStore,
        workingNs,
        result.summary,
        this._workingMemoryTtlDays,
        this._workingMemoryMaxTokens,
      );
      console.log(`[Flynn:working-memory] Updated ${workingNs} after compaction`);
    }
```

Also, pass `usePersonalAssistantPrompt` and `memoryExtractionNamespace` to `compactHistory()` in `compact()`:

```typescript
    const result = await compactHistory({
      messages,
      orchestrator: this,
      config,
      memoryStore: this._memoryStore,
      autoExtract: this._memoryAutoExtract,
      usePersonalAssistantPrompt: Boolean(this._userNamespace),
      memoryExtractionNamespace: this._userNamespace ? `${this._userNamespace}/facts` : undefined,
    });
```

**Step 5: Session-start injection — `_buildSessionContext()`**

Add a private method that builds the session context string (but does NOT mutate `_systemPromptBase`):

```typescript
  /**
   * Build session context from user/profile and user/working memory and store
   * it in _sessionContext. Called on every process() call, but only computes
   * while _sessionContext is null (first call, or first call after reset()).
   * Stores '' when no profile or working memory is found.
   * Does NOT mutate _systemPromptBase; _injectMemoryContext() composes
   * _sessionContext into the system prompt.
   */
  private _buildSessionContext(): void {
    if (this._sessionContext !== null || !this._memoryStore || !this._userNamespace) {
      return;
    }

    const sections: string[] = [];

    // User profile block
    const profile = this._memoryStore.read(`${this._userNamespace}/profile`);
    if (profile.length > 0) {
      sections.push(`--- Who you're talking to ---\n${profile}`);
    }

    // Working memory block
    const working = readWorkingMemory(this._memoryStore, `${this._userNamespace}/working`);
    if (working) {
      sections.push(`--- Recent context ---\n${working.content}`);
    }

    if (sections.length === 0) {
      // Set to empty string (not null) to indicate we've run but found nothing.
      // Null means "not yet computed".
      this._sessionContext = '';
      return;
    }

    let ctx = sections.join('\n\n');

    if (this._proactiveSessionGreeting) {
      ctx += '\n\n[If relevant, briefly acknowledge what the user was last working on before responding to their first message.]';
    }

    this._sessionContext = ctx;
  }
```

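Extend the `workingMemory` import added in Step 4 so it also brings in `readWorkingMemory` (replace the earlier line rather than adding a second import):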

```typescript
import { writeWorkingMemory, readWorkingMemory } from '../../memory/workingMemory.js';
```

**Step 6: Compose session context in `_injectMemoryContext()`**

In `_injectMemoryContext()`, change the base prompt computation to include session context:

Where the method currently does:
```typescript
this._agent.setSystemPrompt(this._systemPromptBase);
// and later:
const enrichedPrompt = `${this._systemPromptBase}\n\n# Memory Context\n\n...`;
```

Change to compose session context into the effective base:
```typescript
const effectiveBase = this._sessionContext
  ? `${this._systemPromptBase}\n\n${this._sessionContext}`
  : this._systemPromptBase;
```

Then use `effectiveBase` everywhere `_systemPromptBase` was used in that method.
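
The composition is a pure function of the base prompt and the session context, which is what keeps `_systemPromptBase` immutable. A minimal sketch, assuming a hypothetical standalone helper `composeEffectiveBase` and a placeholder base prompt:

```typescript
/**
 * Hypothetical pure helper: derive the effective base prompt without
 * touching the original. A null or empty session context yields the base unchanged.
 */
function composeEffectiveBase(base: string, sessionContext: string | null): string {
  return sessionContext ? `${base}\n\n${sessionContext}` : base;
}

const base = 'You are Flynn.'; // placeholder, not the real system prompt
console.log(composeEffectiveBase(base, null) === base); // true
console.log(composeEffectiveBase(base, '') === base);   // true
console.log(composeEffectiveBase(base, '--- Recent context ---\nfixing CI'));
// You are Flynn.
//
// --- Recent context ---
// fixing CI
```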

**Step 7: Call `_buildSessionContext()` from `process()`**

In the `process()` method, before `_injectMemoryContext()`:

```typescript
    // One-time session-start context injection (user/profile + user/working)
    this._buildSessionContext();
    this._injectMemoryContext(userMessage);
```

**Step 8: Reset session context in `reset()`**

```typescript
  reset(): void {
    this._agent.reset();
    this._usageByTier.clear();
    this._lastContextAlertLevel = null;
    this._pendingContextAlert = undefined;
    this._lastCheckpointAt = 0;
    this._sessionContext = null;   // ← add: re-read on next process()
  }
```
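The three states of `_sessionContext` (`null` = not yet computed, `''` = computed but empty, non-empty = context found) and the effect of `reset()` can be sketched in isolation. `SessionContextCache` below is a hypothetical stand-in for illustration, not a class the plan introduces:

```typescript
/** Hypothetical stand-in for the orchestrator's session-context lifecycle. */
class SessionContextCache {
  // null = not yet computed; '' = computed, nothing found; otherwise the context.
  private value: string | null = null;
  private loads = 0;
  private readonly load: () => string;

  constructor(load: () => string) {
    this.load = load;
  }

  /** Compute on first call only; later calls reuse the cached value. */
  get(): string {
    if (this.value === null) {
      this.value = this.load();
      this.loads++;
    }
    return this.value;
  }

  /** Mirrors reset(): forget the cached value so the next get() re-reads. */
  reset(): void {
    this.value = null;
  }

  get loadCount(): number {
    return this.loads;
  }
}

const cache = new SessionContextCache(() => ''); // loader that finds nothing
cache.get();
cache.get();
console.log(cache.loadCount); // 1
cache.reset();
cache.get();
console.log(cache.loadCount); // 2
```

Distinguishing `''` from `null` is what prevents the memory store from being re-read on every message when no profile or working memory exists.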

**Step 9: Typecheck**

```bash
pnpm typecheck
```
Expected: no errors.

**Step 10: Run full test suite**

```bash
pnpm test:run
```
Expected: all pass.

**Step 11: Commit**

```bash
git add src/backends/native/orchestrator.ts
git commit -m "feat(memory): inject session context and write working memory after compaction"
```

---

## Task 6: Daemon routing — wire new config fields

**Files:**
- Modify: `src/daemon/routing.ts`

**Step 1: Add the new fields after `memoryDailyLogMaxAssistantChars`**

In `src/daemon/routing.ts`, after:
```typescript
        memoryDailyLogMaxAssistantChars: deps.config.memory?.daily_log?.max_assistant_chars,
```

Add:
```typescript
        userNamespace: deps.config.memory?.user_namespace,
        workingMemoryTtlDays: deps.config.memory?.working_memory_ttl_days,
        workingMemoryMaxTokens: deps.config.memory?.working_memory_max_tokens,
        proactiveSessionGreeting: deps.config.memory?.proactive_session_greeting,
```
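The wiring is a field-by-field rename from the snake_case config keys to the camelCase `OrchestratorConfig` fields. A minimal sketch of that mapping, assuming hypothetical type names `MemoryConfigSlice` and `OrchestratorMemoryOptions` and a hypothetical helper `toOrchestratorMemoryOptions` (the plan inlines these four lines in `routing.ts`):

```typescript
/** Subset of the parsed `memory` config relevant to this plan (snake_case). */
interface MemoryConfigSlice {
  user_namespace?: string;
  working_memory_ttl_days?: number;
  working_memory_max_tokens?: number;
  proactive_session_greeting?: boolean;
}

/** Subset of OrchestratorConfig added in Task 5 (camelCase). */
interface OrchestratorMemoryOptions {
  userNamespace?: string;
  workingMemoryTtlDays?: number;
  workingMemoryMaxTokens?: number;
  proactiveSessionGreeting?: boolean;
}

/** Hypothetical helper: the same mapping as the routing change above. */
function toOrchestratorMemoryOptions(memory?: MemoryConfigSlice): OrchestratorMemoryOptions {
  return {
    userNamespace: memory?.user_namespace,
    workingMemoryTtlDays: memory?.working_memory_ttl_days,
    workingMemoryMaxTokens: memory?.working_memory_max_tokens,
    proactiveSessionGreeting: memory?.proactive_session_greeting,
  };
}

const opts = toOrchestratorMemoryOptions({ user_namespace: 'user', working_memory_ttl_days: 14 });
console.log(opts.userNamespace);        // user
console.log(opts.workingMemoryTtlDays); // 14
console.log(toOrchestratorMemoryOptions(undefined).userNamespace); // undefined
```

When `memory` or `user_namespace` is absent, `userNamespace` stays `undefined`, which is the condition every new code path checks before doing anything.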

**Step 2: Typecheck**

```bash
pnpm typecheck
```
Expected: no errors.

**Step 3: Run full test suite**

```bash
pnpm test:run
```
Expected: all pass.

**Step 4: Commit**

```bash
git add src/daemon/routing.ts
git commit -m "feat(memory): wire user_namespace and working memory config to orchestrator"
```

---

## Task 7: Smoke test and state.json update

**Step 1: Run full test suite one final time**

```bash
pnpm test:run
```
Expected: all pass.

**Step 2: Manual smoke test (optional but recommended)**

Add this to your `config.yaml`:
```yaml
memory:
  user_namespace: "user"
  working_memory_ttl_days: 14
  working_memory_max_tokens: 1000
  proactive_session_greeting: false
```

Start a session, chat for a while, trigger a compaction (or wait for auto-compaction), then restart and start a new session. Verify `user/working.md` exists in your memory dir and is injected on the new session's first turn.

**Step 3: Update state.json**

In `docs/plans/state.json`, add an entry to `completed`:

```json
"pi_personal_assistant_memory": {
  "status": "complete",
  "commit": "<sha>",
  "summary": "Two-tier personal assistant memory: working memory (user/working, TTL-based) written on compaction, injected at session start; unified user/* namespace across channels; parameterized compaction prompt; memory extraction routed to user/facts; proactive session greeting option."
}
```

**Step 4: Final commit**

```bash
git add docs/plans/state.json
git commit -m "docs(state): mark pi personal assistant memory as complete"
```

---

## Success Criteria Checklist

- [ ] `user_namespace` absent → zero behavior change (all new code paths guarded by `this._userNamespace` check)
- [ ] After compaction with `user_namespace` set, `user/working.md` exists in memory dir
- [ ] Daemon restart + new session → `user/working` content appears in the first-turn system prompt
- [ ] Telegram and web UI sessions share the same `user/working` file
- [ ] Expired working memory (TTL elapsed) is silently ignored
- [ ] `_systemPromptBase` is never mutated — session context composed via `_sessionContext` field
- [ ] Memory extraction writes to `{userNamespace}/facts` when namespace set, `global` otherwise
- [ ] All existing tests pass

---

## Future considerations (not in scope)

- **OrchestratorConfig refactor:** Group the 17+ flat `memory*` fields into a nested `MemoryConfig` interface. Mechanical but improves readability. Do this in a separate PR.
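- **Concurrent compaction guard:** Multiple sessions could write `user/working` at nearly the same time. Within a single Node process, synchronous `writeFileSync` calls cannot interleave, so the file will not contain mixed content, but the last writer wins and the earlier summary is lost. `writeFileSync` is not atomic across processes or crashes; if more than one process can write, consider a write-to-temp-then-rename pattern, and a merge strategy if lost summaries become a problem.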
- **Token counting:** Currently using `maxTokens * 4` char heuristic. Could reuse `estimateMessageTokens()` for consistency.
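
For the concurrent-write item above, one common way to make each write all-or-nothing, should stronger guarantees be needed, is to write a temp file and rename it over the target. The sketch below assumes `MemoryStore` writes through `fs.writeFileSync`, as that item suggests; `writeFileAtomic` is a hypothetical helper, not an existing function in the codebase:

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

/**
 * Hypothetical helper: write to a sibling temp file, then rename over the target.
 * On POSIX filesystems rename() replaces the target atomically, so readers see
 * either the old file or the new one, never a partial write.
 */
function writeFileAtomic(path: string, data: string): void {
  const tmpPath = `${path}.${process.pid}.${Date.now()}.tmp`;
  writeFileSync(tmpPath, data, 'utf8');
  renameSync(tmpPath, path);
}

// Usage: the last writer still wins, but the file is always complete.
const dir = mkdtempSync(join(tmpdir(), 'wm-atomic-'));
const target = join(dir, 'working.md');
writeFileAtomic(target, 'first');
writeFileAtomic(target, 'second');
console.log(readFileSync(target, 'utf8')); // second
```

This does not merge competing summaries; it only guarantees that a reader never observes a half-written file.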

---

## Reference: Config

```yaml
memory:
  # Enables shared identity and working memory. Absent = unchanged behavior.
  user_namespace: "user"
  working_memory_ttl_days: 14       # default
  working_memory_max_tokens: 1000   # default
  proactive_session_greeting: false # default
```