flynn/docs/plans/2026-02-25-pi-personal-assistant-memory-plan.md
William Valentin ed53d6d215 docs(memory): revise pi personal assistant memory plan after code review
Key changes from review:
- Use separate _sessionContext field instead of mutating _systemPromptBase
- Parameterize compaction prompt via buildCompactionPrompt() instead of duplicating
- Fix flaky TTL=0 test to use hardcoded past expiry date
- Route memory extraction to {userNamespace}/facts when namespace is set
- Document future considerations (config refactor, concurrent writes, token counting)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 12:50:24 -08:00


# Pi-Inspired Personal Assistant Memory — Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add two-tier personal-assistant memory: a unified `user/*` namespace shared across channels and a working-memory layer (`user/working`) that survives restarts and is injected at session start.
**Architecture:** `user/working` is written on every compaction (TTL-based flat file with header metadata). On the first message of a new session, `user/profile` and `user/working` are composed into the system prompt alongside the existing memory context. All behavior is behind a `memory.user_namespace` config key; when unset the feature is entirely inert.
**Tech Stack:** TypeScript, Vitest, existing `MemoryStore` (`src/memory/store.ts`), `AgentOrchestrator` (`src/backends/native/orchestrator.ts`), Zod config schema
**Key design decisions (from review):**
- Session context is stored in a separate `_sessionContext` field — `_systemPromptBase` is never mutated
- Compaction prompt is parameterized via `buildCompactionPrompt()` rather than duplicating the constant
- Expired-memory test uses a hardcoded past date (not TTL=0) for determinism
- Memory extraction writes to `{userNamespace}/facts` when user namespace is set
---
## Task 1: Config schema — new memory fields
**Files:**
- Modify: `src/config/schema.ts`
Add four new fields to `memorySchema`. They must come before `.default({})`.
**Step 1: Run baseline typecheck**
```bash
pnpm typecheck
```
Expected: passes (baseline).
**Step 2: Add fields to memorySchema**
In `src/config/schema.ts`, inside `memorySchema` (after `qmd: qmdSchema,`, before `}).default({});`), add:
```typescript
  /**
   * When set, all channels share user/* memory (unified identity namespace).
   * Absent = current session-scoped behavior, unchanged.
   */
  user_namespace: z.string().optional(),
  /** How long working memory remains valid after the last compaction (days). */
  working_memory_ttl_days: z.number().min(1).max(365).default(14),
  /** Token budget for working memory injection at session start. */
  working_memory_max_tokens: z.number().min(100).max(4000).default(1000),
  /** When true, instruct the model to acknowledge prior context on session start. */
  proactive_session_greeting: z.boolean().default(false),
```
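To make the defaults and bounds concrete, here is a minimal dependency-free sketch of how the four fields resolve when keys are omitted. `MemoryInput`, `ResolvedMemory`, `inRange`, and `resolveMemoryDefaults` are hypothetical names used only for illustration; in the actual change the Zod schema does this work.

```typescript
// Hypothetical, Zod-free illustration of the defaults and bounds declared above.
interface MemoryInput {
  user_namespace?: string;
  working_memory_ttl_days?: number;
  working_memory_max_tokens?: number;
  proactive_session_greeting?: boolean;
}

interface ResolvedMemory {
  user_namespace: string | undefined;
  working_memory_ttl_days: number;
  working_memory_max_tokens: number;
  proactive_session_greeting: boolean;
}

// Mirrors z.number().min(min).max(max): reject values outside the inclusive range.
function inRange(name: string, value: number, min: number, max: number): number {
  if (value < min || value > max) {
    throw new RangeError(`${name} must be between ${min} and ${max}, got ${value}`);
  }
  return value;
}

function resolveMemoryDefaults(input: MemoryInput = {}): ResolvedMemory {
  return {
    // Optional with no default: stays undefined, which keeps the feature inert.
    user_namespace: input.user_namespace,
    working_memory_ttl_days: inRange('working_memory_ttl_days', input.working_memory_ttl_days ?? 14, 1, 365),
    working_memory_max_tokens: inRange('working_memory_max_tokens', input.working_memory_max_tokens ?? 1000, 100, 4000),
    proactive_session_greeting: input.proactive_session_greeting ?? false,
  };
}

console.log(resolveMemoryDefaults());
```

An empty `memory` section therefore leaves `user_namespace` undefined, so none of the later tasks change behavior until the key is set.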
**Step 3: Run typecheck**
```bash
pnpm typecheck
```
Expected: no errors.
**Step 4: Commit**
```bash
git add src/config/schema.ts
git commit -m "feat(memory): add user_namespace and working memory config fields"
```
---
## Task 2: WorkingMemory module
**Files:**
- Create: `src/memory/workingMemory.ts`
- Create: `src/memory/workingMemory.test.ts`
Working memory file format (stored at `{ns}/working.md`):
```
# Working Memory
Updated: 2026-02-25T11:30:00Z
Expires: 2026-03-11T11:30:00Z
[content]
```
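The `Expires` header is the `Updated` time plus the configured TTL. A small worked sketch of that arithmetic, assuming the default 14-day TTL (`computeExpiry` and `MS_PER_DAY` are illustrative names, not part of the plan):

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Expiry is the update time plus the TTL, expressed in milliseconds.
function computeExpiry(updatedAt: Date, ttlDays: number): Date {
  return new Date(updatedAt.getTime() + ttlDays * MS_PER_DAY);
}

const updated = new Date('2026-02-25T11:30:00Z');
const expires = computeExpiry(updated, 14);

// February 2026 has 28 days, so 14 days after Feb 25 is Mar 11.
console.log(expires.toISOString()); // 2026-03-11T11:30:00.000Z
```

Note that `Date.prototype.toISOString()` always emits milliseconds (`.000Z`), so files written by code carry slightly longer timestamps than the hand-written format example. Both forms parse with `new Date(...)`.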
**Step 1: Write the failing tests**
Create `src/memory/workingMemory.test.ts`:
```typescript
import { describe, it, expect } from 'vitest';
import { join } from 'path';
import { mkdtempSync, rmSync } from 'fs';
import { tmpdir } from 'os';
import { MemoryStore } from './store.js';
import { writeWorkingMemory, readWorkingMemory } from './workingMemory.js';

function makeStore(): { store: MemoryStore; dir: string } {
  const dir = mkdtempSync(join(tmpdir(), 'wm-test-'));
  const store = new MemoryStore({ dir, maxContextTokens: 2000 });
  return { store, dir };
}

describe('writeWorkingMemory', () => {
  it('writes a file with Updated/Expires headers', () => {
    const { store, dir } = makeStore();
    writeWorkingMemory(store, 'user/working', 'some content', 14, 1000);
    const raw = store.read('user/working');
    expect(raw).toContain('# Working Memory');
    expect(raw).toContain('Updated:');
    expect(raw).toContain('Expires:');
    expect(raw).toContain('some content');
    rmSync(dir, { recursive: true });
  });

  it('truncates content to token budget', () => {
    const { store, dir } = makeStore();
    const longContent = 'x'.repeat(10000);
    writeWorkingMemory(store, 'user/working', longContent, 14, 100);
    const raw = store.read('user/working');
    // 100 tokens * 4 chars = 400 chars budget for content
    const contentPart = raw.split('\n\n').slice(1).join('\n\n');
    expect(contentPart.length).toBeLessThanOrEqual(400 + 10); // small tolerance
    rmSync(dir, { recursive: true });
  });
});

describe('readWorkingMemory', () => {
  it('returns null when file does not exist', () => {
    const { store, dir } = makeStore();
    expect(readWorkingMemory(store, 'user/working')).toBeNull();
    rmSync(dir, { recursive: true });
  });

  it('returns content when not expired', () => {
    const { store, dir } = makeStore();
    writeWorkingMemory(store, 'user/working', 'hello world', 14, 1000);
    const result = readWorkingMemory(store, 'user/working');
    expect(result).not.toBeNull();
    expect(result!.content).toBe('hello world');
    rmSync(dir, { recursive: true });
  });

  it('returns null when expired', () => {
    const { store, dir } = makeStore();
    // Write a file with a hardcoded past expiry date for deterministic testing
    const expiredFile = [
      '# Working Memory',
      'Updated: 2025-01-01T00:00:00Z',
      'Expires: 2025-01-02T00:00:00Z',
      '',
      'stale content',
    ].join('\n');
    store.write('user/working', expiredFile, 'replace');
    const result = readWorkingMemory(store, 'user/working');
    expect(result).toBeNull();
    rmSync(dir, { recursive: true });
  });

  it('returns null for malformed file', () => {
    const { store, dir } = makeStore();
    store.write('user/working', 'no headers here', 'replace');
    expect(readWorkingMemory(store, 'user/working')).toBeNull();
    rmSync(dir, { recursive: true });
  });
});
```
**Step 2: Run test to verify it fails**
```bash
pnpm test:run src/memory/workingMemory.test.ts
```
Expected: FAIL — module not found.
**Step 3: Implement `src/memory/workingMemory.ts`**
```typescript
import type { MemoryStore } from './store.js';

export interface WorkingMemoryEntry {
  content: string;
  updatedAt: Date;
  expiresAt: Date;
}

const HEADER_PREFIX = '# Working Memory\n';

/**
 * Write a compaction summary to the working memory namespace.
 * Content is capped at maxTokens (estimated at 4 chars/token).
 * The file format includes Updated/Expires timestamps for lazy expiry checks.
 */
export function writeWorkingMemory(
  store: MemoryStore,
  namespace: string,
  content: string,
  ttlDays: number,
  maxTokens: number,
): void {
  const now = new Date();
  const expiresAt = new Date(now.getTime() + ttlDays * 24 * 60 * 60 * 1000);
  const maxChars = maxTokens * 4;
  const truncatedContent = content.length > maxChars ? content.slice(0, maxChars) : content;
  const file = [
    '# Working Memory',
    `Updated: ${now.toISOString()}`,
    `Expires: ${expiresAt.toISOString()}`,
    '',
    truncatedContent,
  ].join('\n');
  store.write(namespace, file, 'replace');
}

/**
 * Read working memory. Returns null if the file is absent, malformed, or expired.
 * MemoryStore.read() returns '' for missing files, so a falsy check covers that case.
 * Expiry is checked lazily here, so no background cleanup is needed.
 */
export function readWorkingMemory(
  store: MemoryStore,
  namespace: string,
): WorkingMemoryEntry | null {
  const raw = store.read(namespace);
  if (!raw || !raw.startsWith(HEADER_PREFIX)) {
    return null;
  }

  const lines = raw.split('\n');
  let updatedAt: Date | null = null;
  let expiresAt: Date | null = null;
  let contentStartLine = 0;

  // Header lines follow the title; the first blank line after Expires ends the header.
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i];
    if (line.startsWith('Updated: ')) {
      updatedAt = new Date(line.slice('Updated: '.length));
    } else if (line.startsWith('Expires: ')) {
      expiresAt = new Date(line.slice('Expires: '.length));
    } else if (line === '' && expiresAt !== null) {
      contentStartLine = i + 1;
      break;
    }
  }

  // Reject missing or unparseable timestamps, and files with no blank line
  // separating the headers from the content (otherwise headers would leak into content).
  if (
    !updatedAt ||
    !expiresAt ||
    isNaN(updatedAt.getTime()) ||
    isNaN(expiresAt.getTime()) ||
    contentStartLine === 0
  ) {
    return null;
  }

  if (expiresAt.getTime() <= Date.now()) {
    console.debug('[Flynn:working-memory] Working memory expired, skipping injection');
    return null;
  }

  const content = lines.slice(contentStartLine).join('\n').trim();
  return { content, updatedAt, expiresAt };
}
```
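The character cap in `writeWorkingMemory` can cut a summary in the middle of a bullet. As a possible refinement (not part of this plan), the sketch below truncates at the last complete line inside the budget. `truncateAtLineBoundary` and `CHARS_PER_TOKEN` are hypothetical names; the 4 chars/token estimate is the same heuristic the plan uses.

```typescript
const CHARS_PER_TOKEN = 4; // same rough estimate the plan uses

// Cap content at the token budget, preferring to cut at a line boundary
// so the stored summary does not end with half a bullet point.
function truncateAtLineBoundary(content: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (content.length <= maxChars) {
    return content;
  }
  const hardCut = content.slice(0, maxChars);
  const lastNewline = hardCut.lastIndexOf('\n');
  // Fall back to the hard cut when the budget contains no complete line.
  return lastNewline > 0 ? hardCut.slice(0, lastNewline) : hardCut;
}

console.log(JSON.stringify(truncateAtLineBoundary('- a\n- b\n- c', 2))); // "- a\n- b"
```

The result is never longer than the hard cut, so the existing "truncates content to token budget" test would still hold if this were swapped in.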
**Step 4: Run tests**
```bash
pnpm test:run src/memory/workingMemory.test.ts
```
Expected: all pass.
**Step 5: Typecheck**
```bash
pnpm typecheck
```
Expected: no errors.
**Step 6: Commit**
```bash
git add src/memory/workingMemory.ts src/memory/workingMemory.test.ts
git commit -m "feat(memory): add working memory read/write with TTL expiry"
```
---
## Task 3: Parameterized compaction prompt
**Files:**
- Modify: `src/backends/native/prompts.ts`
Instead of duplicating `COMPACTION_SYSTEM_PROMPT`, add a `buildCompactionPrompt()` function that shares rules and parameterizes the focus section. The existing `COMPACTION_SYSTEM_PROMPT` constant stays for backward compatibility.
**Step 1: Add the parameterized prompt builder**
In `src/backends/native/prompts.ts`, after `COMPACTION_SYSTEM_PROMPT`, add:
```typescript
/**
* Build a compaction system prompt. When `personalAssistant` is true,
* the prompt focuses on continuity context (what the user was working on,
* decisions, preferences, open threads) rather than generic summarization.
*
* The shared rules (preserve facts, 20% length, bullet points, no invention,
* skip transient content) are identical in both variants.
*/
export function buildCompactionPrompt(opts?: { personalAssistant?: boolean }): string {
const focus = opts?.personalAssistant
? `Focus on:
- What the user was working on and its current status (be specific: which files, commands, or steps were involved)
- Decisions made and why (include rationale when stated)
- Preferences or constraints the user expressed (tools, styles, approaches to avoid or prefer)
- Open threads, unresolved questions, or explicit follow-up items`
: `Focus on:
- Key topics discussed and conclusions reached
- Important decisions, commitments, or action items
- Technical details, code changes, or configurations that were established`;
const preamble = opts?.personalAssistant
? 'You are summarising a conversation for a personal assistant. Your summary will be injected at the start of the next session so the assistant can pick up exactly where things left off.'
: 'You are a conversation summarizer. Create a concise summary of the conversation that captures all important information.';
return `${preamble}
${focus}
Rules:
- Preserve key facts, file paths, error messages, and specific values verbatim.
- Be concise but complete — aim for roughly 20% of the original length.
- Use bullet points under short descriptive headings.
- Never invent information not present in the conversation.
- Skip purely transient content (one-off commands, status messages with no lasting significance).
Output format:
Return a markdown summary. No preamble — output only the summary.`;
}
```
**Step 2: Typecheck**
```bash
pnpm typecheck
```
Expected: no errors.
**Step 3: Commit**
```bash
git add src/backends/native/prompts.ts
git commit -m "feat(memory): add parameterized compaction prompt builder"
```
---
## Task 4: Thread working memory through compaction
**Files:**
- Modify: `src/context/compaction.ts`
Add `summary` to `CompactionResult` so the orchestrator can write it to `user/working` without re-computing it. Thread the personal-assistant (PA) prompt option through `compactHistory()`. Also route memory extraction to `{userNamespace}/facts` when the user namespace is set.
**Step 1: Update `CompactionResult` interface**
In `src/context/compaction.ts`, add `summary` to `CompactionResult`:
```typescript
export interface CompactionResult {
  /** The compacted messages: [summary, ...recentMessages]. */
  messages: Message[];
  /** Number of messages that were compacted (removed). */
  compactedCount: number;
  /** Estimated tokens before compaction. */
  tokensBefore: number;
  /** Estimated tokens after compaction. */
  tokensAfter: number;
  /** The raw summary text produced by the compaction model (populated when compaction ran). */
  summary?: string;
}
```
**Step 2: Add `usePersonalAssistantPrompt` option and populate `summary`**
Update the `compactHistory` function signature:
```typescript
export async function compactHistory(opts: {
  messages: Message[];
  orchestrator: AgentOrchestrator;
  config: CompactionConfig;
  memoryStore?: MemoryStore;
  autoExtract?: boolean;
  usePersonalAssistantPrompt?: boolean;
  memoryExtractionNamespace?: string;
}): Promise<CompactionResult> {
```
Update the imports to include `buildCompactionPrompt`:
```typescript
import { COMPACTION_SYSTEM_PROMPT, MEMORY_EXTRACTION_PROMPT, buildCompactionPrompt } from '../backends/native/prompts.js';
```
In the body, replace the hardcoded `COMPACTION_SYSTEM_PROMPT` in the `orchestrator.delegate()` call:
```typescript
const systemPrompt = opts.usePersonalAssistantPrompt
  ? buildCompactionPrompt({ personalAssistant: true })
  : COMPACTION_SYSTEM_PROMPT;
const result = await orchestrator.delegate({
  task: 'compaction',
  tier,
  systemPrompt,
  message: formattedConversation,
  maxTokens: config.summaryMaxTokens,
});
```
Populate `summary` in the return value:
```typescript
return {
  messages: [...preservedMessages, summaryMessage, ...toKeep],
  compactedCount: toSummarize.length,
  tokensBefore: estimateMessageTokens(messages),
  tokensAfter: estimateMessageTokens([...preservedMessages, summaryMessage, ...toKeep]),
  summary: result.content,
};
```
**Step 3: Route memory extraction to user namespace when set**
In the memory extraction block (around line 133), change the write target:
```typescript
if (opts.memoryStore && opts.autoExtract !== false) {
  try {
    // ...existing extraction delegate call...
    const extractedContent = extraction.content.trim();
    if (extractedContent.length > 0 && !extractedContent.toLowerCase().includes('no facts')) {
      const extractionNs = opts.memoryExtractionNamespace ?? 'global';
      opts.memoryStore.write(extractionNs, extractedContent, 'append');
      console.log(`[Flynn:memory] Extracted ${extractedContent.length} chars of facts to ${extractionNs} memory`);
    }
  } catch (error) {
    console.warn('[Flynn:memory] Failed to extract facts during compaction:', error);
  }
}
```
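The routing is split across two places: the orchestrator decides whether to pass a namespace (Task 5), and `compactHistory()` falls back to `global` when none is passed. A minimal sketch of both halves as pure functions follows; `extractionNamespaceFor` and `resolveExtractionTarget` are hypothetical helper names used only to make the behavior easy to check.

```typescript
// Orchestrator side: derive the extraction namespace from the user namespace.
// An unset (or empty) user namespace yields undefined, i.e. "no override".
function extractionNamespaceFor(userNamespace?: string): string | undefined {
  return userNamespace ? `${userNamespace}/facts` : undefined;
}

// Compaction side: fall back to the existing 'global' target when no override is given.
function resolveExtractionTarget(memoryExtractionNamespace?: string): string {
  return memoryExtractionNamespace ?? 'global';
}

console.log(resolveExtractionTarget(extractionNamespaceFor('user')));    // user/facts
console.log(resolveExtractionTarget(extractionNamespaceFor(undefined))); // global
```

Because the fallback lives in `compactHistory()`, callers that never pass `memoryExtractionNamespace` keep writing to `global`, which is what keeps the feature inert when `user_namespace` is unset.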
**Step 4: Typecheck**
```bash
pnpm typecheck
```
Expected: no errors.
**Step 5: Run full test suite**
```bash
pnpm test:run
```
Expected: all pass (summary field is additive, existing tests unaffected).
**Step 6: Commit**
```bash
git add src/context/compaction.ts
git commit -m "feat(memory): thread summary and PA prompt through compaction result"
```
---
## Task 5: Orchestrator — session-start injection + post-compaction write
**Files:**
- Modify: `src/backends/native/orchestrator.ts`
This task has two parts:
1. Write working memory after every compaction (when `userNamespace` is set)
2. Inject `user/profile` + `user/working` into the system prompt on first message
**Critical design note:** `_systemPromptBase` must remain immutable (set once in constructor). Session context is stored in a separate `_sessionContext` field and composed into the system prompt in `_injectMemoryContext()`.
**Step 1: Explore existing test patterns**
```bash
ls src/backends/native/*.test.ts
```
Examine the existing orchestrator test setup to understand mock patterns before writing the integration test.
**Step 2: Add new fields to `OrchestratorConfig`**
In `src/backends/native/orchestrator.ts`, in `OrchestratorConfig` (after `attachmentCollector`), add:
```typescript
/** Shared identity namespace for cross-channel memory (e.g. 'user'). Absent = session-scoped. */
userNamespace?: string;
/** TTL in days for working memory. Defaults to 14. */
workingMemoryTtlDays?: number;
/** Token budget for working memory injection. Defaults to 1000. */
workingMemoryMaxTokens?: number;
/** When true, instruct the model to acknowledge prior context on session start. */
proactiveSessionGreeting?: boolean;
```
**Step 3: Store new config fields on the class**
Add private fields (near the other `_memory*` fields):
```typescript
private _userNamespace?: string;
private _workingMemoryTtlDays: number;
private _workingMemoryMaxTokens: number;
private _proactiveSessionGreeting: boolean;
private _sessionContext: string | null = null;
```
In the constructor, after the existing memory field assignments:
```typescript
this._userNamespace = config.userNamespace;
this._workingMemoryTtlDays = config.workingMemoryTtlDays ?? 14;
this._workingMemoryMaxTokens = config.workingMemoryMaxTokens ?? 1000;
this._proactiveSessionGreeting = config.proactiveSessionGreeting ?? false;
```
**Step 4: Post-compaction write in `compact()`**
Import `writeWorkingMemory` at the top of orchestrator.ts:
```typescript
import { writeWorkingMemory } from '../../memory/workingMemory.js';
```
In the `compact()` method, after the `auditLogger?.sessionCompact(...)` call, add:
```typescript
// Write working memory when user namespace is configured
if (result.summary && this._userNamespace && this._memoryStore) {
  const workingNs = `${this._userNamespace}/working`;
  writeWorkingMemory(
    this._memoryStore,
    workingNs,
    result.summary,
    this._workingMemoryTtlDays,
    this._workingMemoryMaxTokens,
  );
  console.log(`[Flynn:working-memory] Updated ${workingNs} after compaction`);
}
```
Also, pass `usePersonalAssistantPrompt` and `memoryExtractionNamespace` to `compactHistory()` in `compact()`:
```typescript
const result = await compactHistory({
  messages,
  orchestrator: this,
  config,
  memoryStore: this._memoryStore,
  autoExtract: this._memoryAutoExtract,
  usePersonalAssistantPrompt: Boolean(this._userNamespace),
  memoryExtractionNamespace: this._userNamespace ? `${this._userNamespace}/facts` : undefined,
});
```
**Step 5: Session-start injection — `_buildSessionContext()`**
Add a private method that builds the session context string (but does NOT mutate `_systemPromptBase`):
```typescript
/**
 * Build session context from user/profile and user/working memory.
 * Called from process() on every turn, but only computes once per session:
 * null means "not yet computed", '' means "computed, nothing to inject".
 * Does NOT mutate _systemPromptBase. The result is stored in _sessionContext
 * and composed into the system prompt by _injectMemoryContext().
 */
private _buildSessionContext(): void {
  if (this._sessionContext !== null || !this._memoryStore || !this._userNamespace) {
    return;
  }
  const sections: string[] = [];

  // User profile block
  const profile = this._memoryStore.read(`${this._userNamespace}/profile`);
  if (profile.length > 0) {
    sections.push(`--- Who you're talking to ---\n${profile}`);
  }

  // Working memory block
  const working = readWorkingMemory(this._memoryStore, `${this._userNamespace}/working`);
  if (working) {
    sections.push(`--- Recent context ---\n${working.content}`);
  }

  if (sections.length === 0) {
    // Set to empty string (not null) to indicate we've run but found nothing.
    // Null means "not yet computed".
    this._sessionContext = '';
    return;
  }

  let ctx = sections.join('\n\n');
  if (this._proactiveSessionGreeting) {
    ctx += '\n\n[If relevant, briefly acknowledge what the user was last working on before responding to their first message.]';
  }
  this._sessionContext = ctx;
}
```
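The string-building part of this method is easy to reason about as a pure function. The sketch below isolates it so the three outcomes (nothing, profile only, profile plus working memory) can be checked without an orchestrator. `SessionContextInput` and `composeSessionContext` are hypothetical names; the section headers and greeting hint are copied from the method above.

```typescript
interface SessionContextInput {
  profile: string;               // '' when user/profile is missing
  workingContent: string | null; // null when working memory is absent or expired
  proactiveGreeting: boolean;
}

const GREETING_HINT =
  '[If relevant, briefly acknowledge what the user was last working on before responding to their first message.]';

// Returns '' when there is nothing to inject, otherwise the composed context block.
function composeSessionContext(input: SessionContextInput): string {
  const sections: string[] = [];
  if (input.profile.length > 0) {
    sections.push(`--- Who you're talking to ---\n${input.profile}`);
  }
  if (input.workingContent !== null) {
    sections.push(`--- Recent context ---\n${input.workingContent}`);
  }
  if (sections.length === 0) {
    return '';
  }
  const ctx = sections.join('\n\n');
  // The greeting hint is only appended when there is context to acknowledge.
  return input.proactiveGreeting ? `${ctx}\n\n${GREETING_HINT}` : ctx;
}

console.log(composeSessionContext({ profile: 'Name: Alice', workingContent: '- Fixing CI', proactiveGreeting: false }));
```

Note that the greeting hint never appears on its own: with no profile and no working memory the result is `''` regardless of `proactiveGreeting`.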
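Extend the import added in Step 4 so it also brings in `readWorkingMemory` (replace the earlier line; do not add a second import of the same module):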
```typescript
import { writeWorkingMemory, readWorkingMemory } from '../../memory/workingMemory.js';
```
**Step 6: Compose session context in `_injectMemoryContext()`**
In `_injectMemoryContext()`, change the base prompt computation to include session context:
Where the method currently does:
```typescript
this._agent.setSystemPrompt(this._systemPromptBase);
// and later:
const enrichedPrompt = `${this._systemPromptBase}\n\n# Memory Context\n\n...`;
```
Change to compose session context into the effective base:
```typescript
const effectiveBase = this._sessionContext
  ? `${this._systemPromptBase}\n\n${this._sessionContext}`
  : this._systemPromptBase;
```
Then use `effectiveBase` everywhere `_systemPromptBase` was used in that method.
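The truthiness check matters because `_sessionContext` has three states. A small sketch, using a hypothetical standalone `composeEffectiveBase` function in place of the inline expression, shows that both `null` (not yet computed) and `''` (computed, nothing found) leave the base prompt untouched:

```typescript
// Standalone version of the effectiveBase expression, for illustration only.
function composeEffectiveBase(systemPromptBase: string, sessionContext: string | null): string {
  return sessionContext
    ? `${systemPromptBase}\n\n${sessionContext}`
    : systemPromptBase;
}

const base = 'You are Flynn.';
console.log(composeEffectiveBase(base, null) === base); // true
console.log(composeEffectiveBase(base, '') === base);   // true
console.log(composeEffectiveBase(base, '--- Recent context ---\n- Fixing CI'));
```

Since the base string is passed in and never reassigned, this composition can run on every turn without accumulating duplicate context.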
**Step 7: Call `_buildSessionContext()` from `process()`**
In the `process()` method, before `_injectMemoryContext()`:
```typescript
// One-time session-start context injection (user/profile + user/working)
this._buildSessionContext();
this._injectMemoryContext(userMessage);
```
**Step 8: Reset session context in `reset()`**
```typescript
reset(): void {
  this._agent.reset();
  this._usageByTier.clear();
  this._lastContextAlertLevel = null;
  this._pendingContextAlert = undefined;
  this._lastCheckpointAt = 0;
  this._sessionContext = null; // ← add: re-read on next process()
}
```
**Step 9: Typecheck**
```bash
pnpm typecheck
```
Expected: no errors.
**Step 10: Run full test suite**
```bash
pnpm test:run
```
Expected: all pass.
**Step 11: Commit**
```bash
git add src/backends/native/orchestrator.ts
git commit -m "feat(memory): inject session context and write working memory after compaction"
```
---
## Task 6: Daemon routing — wire new config fields
**Files:**
- Modify: `src/daemon/routing.ts`
**Step 1: Add the new fields after `memoryDailyLogMaxAssistantChars`**
In `src/daemon/routing.ts`, after:
```typescript
memoryDailyLogMaxAssistantChars: deps.config.memory?.daily_log?.max_assistant_chars,
```
Add:
```typescript
userNamespace: deps.config.memory?.user_namespace,
workingMemoryTtlDays: deps.config.memory?.working_memory_ttl_days,
workingMemoryMaxTokens: deps.config.memory?.working_memory_max_tokens,
proactiveSessionGreeting: deps.config.memory?.proactive_session_greeting,
```
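The wiring is a plain snake_case to camelCase mapping. The sketch below isolates it with hypothetical `MemorySection` and `MemoryOptions` shapes (the real types come from the config schema and `OrchestratorConfig`), mainly to show that any value left `undefined` here falls through to the `??` defaults in the orchestrator constructor.

```typescript
// Hypothetical shapes for illustration; field names match the plan.
interface MemorySection {
  user_namespace?: string;
  working_memory_ttl_days?: number;
  working_memory_max_tokens?: number;
  proactive_session_greeting?: boolean;
}

interface MemoryOptions {
  userNamespace?: string;
  workingMemoryTtlDays?: number;
  workingMemoryMaxTokens?: number;
  proactiveSessionGreeting?: boolean;
}

// Optional chaining keeps this safe when the memory section itself is missing.
function toMemoryOptions(memory?: MemorySection): MemoryOptions {
  return {
    userNamespace: memory?.user_namespace,
    workingMemoryTtlDays: memory?.working_memory_ttl_days,
    workingMemoryMaxTokens: memory?.working_memory_max_tokens,
    proactiveSessionGreeting: memory?.proactive_session_greeting,
  };
}

console.log(toMemoryOptions({ user_namespace: 'user', working_memory_ttl_days: 7 }));
```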
**Step 2: Typecheck**
```bash
pnpm typecheck
```
Expected: no errors.
**Step 3: Run full test suite**
```bash
pnpm test:run
```
Expected: all pass.
**Step 4: Commit**
```bash
git add src/daemon/routing.ts
git commit -m "feat(memory): wire user_namespace and working memory config to orchestrator"
```
---
## Task 7: Smoke test and state.json update
**Step 1: Run full test suite one final time**
```bash
pnpm test:run
```
Expected: all pass.
**Step 2: Manual smoke test (optional but recommended)**
Add this to your `config.yaml`:
```yaml
memory:
  user_namespace: "user"
  working_memory_ttl_days: 14
  working_memory_max_tokens: 1000
  proactive_session_greeting: false
```
Start a session, chat for a while, trigger a compaction (or wait for auto-compaction), then restart and start a new session. Verify `user/working.md` exists in your memory dir and is injected on the new session's first turn.
**Step 3: Update state.json**
In `docs/plans/state.json`, add an entry to `completed`:
```json
"pi_personal_assistant_memory": {
"status": "complete",
"commit": "<sha>",
"summary": "Two-tier personal assistant memory: working memory (user/working, TTL-based) written on compaction, injected at session start; unified user/* namespace across channels; parameterized compaction prompt; memory extraction routed to user/facts; proactive session greeting option."
}
```
**Step 4: Final commit**
```bash
git add docs/plans/state.json
git commit -m "docs(state): mark pi personal assistant memory as complete"
```
---
## Success Criteria Checklist
- [ ] `user_namespace` absent → zero behavior change (all new code paths guarded by `this._userNamespace` check)
- [ ] After compaction with `user_namespace` set, `user/working.md` exists in memory dir
- [ ] Daemon restart + new session → `user/working` content appears in the first-turn system prompt
- [ ] Telegram and web UI sessions share the same `user/working` file
- [ ] Expired working memory (TTL elapsed) is silently ignored
- [ ] `_systemPromptBase` is never mutated — session context composed via `_sessionContext` field
- [ ] Memory extraction writes to `{userNamespace}/facts` when namespace set, `global` otherwise
- [ ] All existing tests pass
---
## Future considerations (not in scope)
- **OrchestratorConfig refactor:** Group the 17+ flat `memory*` fields into a nested `MemoryConfig` interface. Mechanical but improves readability. Do this in a separate PR.
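- **Concurrent compaction guard:** Multiple sessions could write `user/working` at nearly the same time. Within a single Node.js process, synchronous `writeFileSync` calls cannot interleave, so the file will not contain a mix of two writes, but the last writer wins and the earlier summary is lost. Consider a merge strategy if this becomes a problem.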
- **Token counting:** Currently using `maxTokens * 4` char heuristic. Could reuse `estimateMessageTokens()` for consistency.
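
For the concurrent-write consideration, one common hardening step is to write to a temporary file and rename it into place, so a reader or a crash never observes a half-written file. This is a sketch under assumptions, not part of the plan: `writeFileAtomic` is a hypothetical helper, how `MemoryStore` writes files internally is not shown here, and the rename is only atomic when both paths are on the same filesystem (POSIX semantics). It does not change the last-writer-wins behavior.

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Write to a sibling temp file first, then rename over the target.
// The temp file lives in the same directory so the rename stays on one filesystem.
function writeFileAtomic(targetPath: string, data: string): void {
  const tmpPath = `${targetPath}.${process.pid}.${Date.now()}.tmp`;
  writeFileSync(tmpPath, data, 'utf8');
  renameSync(tmpPath, targetPath);
}

// Usage: write twice to the same path; the second write fully replaces the first.
const demoDir = mkdtempSync(join(tmpdir(), 'wm-atomic-'));
const demoTarget = join(demoDir, 'working.md');
writeFileAtomic(demoTarget, '# Working Memory\nfirst');
writeFileAtomic(demoTarget, '# Working Memory\nsecond');
const demoResult = readFileSync(demoTarget, 'utf8');
console.log(demoResult);
rmSync(demoDir, { recursive: true });
```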
---
## Reference: Config
```yaml
memory:
  # Enables shared identity and working memory. Absent = unchanged behavior.
  user_namespace: "user"
  working_memory_ttl_days: 14 # default
  working_memory_max_tokens: 1000 # default
  proactive_session_greeting: false # default
```