diff --git a/plans/fizzy-puzzling-candy.md b/plans/fizzy-puzzling-candy.md new file mode 100644 index 0000000..3e984b8 --- /dev/null +++ b/plans/fizzy-puzzling-candy.md @@ -0,0 +1,150 @@ +# Plan: Session Summarization Hook + +## Problem + +Sessions are tracked in `~/.claude/state/personal-assistant/history/index.json` but: +1. No conversation logs are captured to our history folder +2. Sessions never get marked as summarized +3. Memory files remain empty (decisions, preferences, projects, facts) + +## Root Cause + +Missing `SessionEnd` hook to trigger summarization when sessions end. + +## Solution + +Add a `SessionEnd` hook that: +1. Reads the transcript from Claude's built-in storage (`transcript_path`) +2. Extracts key information (decisions, preferences, project context, facts) +3. Saves to appropriate memory files +4. Updates history index to mark session as summarized + +## Files to Modify + +| File | Action | +|------|--------| +| `~/.claude/hooks/hooks.json` | Add `SessionEnd` hook entry | +| `~/.claude/hooks/scripts/session-end.sh` | **Create** - orchestrates summarization | +| `~/.claude/hooks/scripts/summarize-transcript.py` | **Create** - Python script to process transcript | + +## Implementation Details + +### 1. Hook Configuration (`hooks.json`) + +Add `SessionEnd` hook that calls the summarization script: + +```json +"SessionEnd": [ + { + "hooks": [ + { + "type": "command", + "command": "~/.claude/hooks/scripts/session-end.sh", + "timeout": 120 + } + ] + } +] +``` + +### 2. Session End Script (`session-end.sh`) + +- Receives JSON via stdin with `session_id`, `transcript_path`, `reason` +- Calls Python summarization script +- Handles errors gracefully (session end shouldn't fail) + +### 3. Summarization Script (`summarize-transcript.py`) + +The script will: + +1. **Parse transcript** - Read the `.jsonl` file from `transcript_path` +2. **Extract key items** - Use heuristics to identify: + - Decisions: "let's use", "we decided", "I'll go with" + - Preferences: "I prefer", "always", "never", "I like" + - Project context: file paths, config references + - Facts: environment info, tool locations +3. **Deduplicate** - Check against existing memory items +4. **Save to memory files** - Append new items with UUIDs +5. **Update history index** - Mark session as summarized, add topics + +### Processing Approach: Hybrid (Decision) + +**Step 1: Threshold check** +- Skip sessions with < 3 user messages +- Skip sessions that are only quick commands (no substantive discussion) + +**Step 2: Heuristic extraction (fast, no API)** +- File paths mentioned → project context +- Environment facts (tool locations, versions) +- Simple preferences with clear keywords + +**Step 3: LLM extraction (if substantive content)** +- Complex decisions with rationale +- Nuanced preferences +- Project context requiring interpretation +- Use Claude API (Haiku for cost efficiency) + +### Transcript Storage (Decision) + +Reference Claude's existing transcript location (`~/.claude/projects/.../[uuid].jsonl`) rather than copying to our history folder. The history index will store the transcript path for future reference. + +### Memory File Format + +Each item: +```json +{ + "id": "uuid", + "date": "YYYY-MM-DD", + "content": "Brief description", + "context": "Additional context", + "session": "session-id" +} +``` + +## Implementation Steps + +### Step 1: Update hooks.json + +Add `SessionEnd` entry to `~/.claude/hooks/hooks.json` + +### Step 2: Create session-end.sh + +Shell wrapper at `~/.claude/hooks/scripts/session-end.sh`: +- Parse JSON input from stdin +- Extract session_id, transcript_path, reason +- Call Python summarization script +- Handle errors silently (don't break session exit) + +### Step 3: Create summarize-transcript.py + +Python script at `~/.claude/hooks/scripts/summarize-transcript.py`: + +``` +Arguments: --session-id --transcript [--reason ] + +1. Load transcript (.jsonl) +2. Count user messages → skip if < 3 +3. Heuristic pass: + - Extract file paths → projects.json + - Extract env facts → facts.json +4. If substantive content detected: + - Call Claude API (Haiku) for decisions/preferences + - Parse response → decisions.json, preferences.json +5. Update history/index.json: + - Set summarized: true + - Add transcript_path + - Add extracted topics +``` + +### Step 4: Update history index schema + +Add `transcript_path` field to session entries in `history/index.json` + +## Testing + +1. Start a test session with substantive discussion +2. Exit session normally +3. Verify: + - Hook fired (check with `--debug`) + - Memory files updated + - History index marked summarized