# Plan: Session Summarization Hook

## Problem

Sessions are tracked in `~/.claude/state/personal-assistant/history/index.json`, but:

1. No conversation logs are captured to our history folder
2. Sessions never get marked as summarized
3. Memory files remain empty (decisions, preferences, projects, facts)

## Root Cause

No `SessionEnd` hook is registered, so nothing triggers summarization when a session ends.

## Solution

Add a `SessionEnd` hook that:

1. Reads the transcript from Claude's built-in storage (`transcript_path`)
2. Extracts key information (decisions, preferences, project context, facts)
3. Saves each item to the matching memory file
4. Updates the history index to mark the session as summarized

## Files to Modify

| File | Action |
|------|--------|
| `~/.claude/hooks/hooks.json` | Add `SessionEnd` hook entry |
| `~/.claude/hooks/scripts/session-end.sh` | **Create** - orchestrates summarization |
| `~/.claude/hooks/scripts/summarize-transcript.py` | **Create** - Python script to process transcript |

## Implementation Details

### 1. Hook Configuration (`hooks.json`)

Add a `SessionEnd` hook that runs the session-end wrapper script, which in turn calls the summarization script:

```json
"SessionEnd": [
  {
    "hooks": [
      {
        "type": "command",
        "command": "~/.claude/hooks/scripts/session-end.sh",
        "timeout": 120
      }
    ]
  }
]
```

### 2. Session End Script (`session-end.sh`)

- Receives JSON via stdin with `session_id`, `transcript_path`, `reason`
- Calls the Python summarization script
- Handles errors gracefully: a summarization failure must not cause session exit to fail

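The payload handling described above can be sketched as follows. The plan puts this logic in a shell wrapper; the sketch below expresses the same parsing in Python for illustration only. `parse_hook_payload` and `main` are hypothetical names, and the `"unknown"` fallback for `reason` is an assumption rather than documented behaviour.

```python
import json
import sys
from typing import Optional


def parse_hook_payload(raw: str) -> Optional[dict]:
    """Return the three fields the plan relies on, or None if the payload is unusable."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None

    session_id = payload.get("session_id")
    transcript_path = payload.get("transcript_path")
    if not session_id or not transcript_path:
        # Without both fields there is nothing to summarize.
        return None

    return {
        "session_id": session_id,
        "transcript_path": transcript_path,
        # Assumption: fall back to "unknown" when no reason is supplied.
        "reason": payload.get("reason", "unknown"),
    }


def main() -> int:
    """Read the hook payload from stdin; never report failure to the caller."""
    fields = parse_hook_payload(sys.stdin.read())
    if fields is None:
        print("session-end: unusable payload, skipping", file=sys.stderr)
    # Always return 0 so a bad payload cannot break session exit.
    return 0
```

Returning `None` instead of raising keeps the error path in one place, which makes the "session end shouldn't fail" requirement easy to satisfy.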
### 3. Summarization Script (`summarize-transcript.py`)

The script will:

1. **Parse transcript** - Read the `.jsonl` file from `transcript_path`
2. **Extract key items** - Use heuristics to identify:
   - Decisions: "let's use", "we decided", "I'll go with"
   - Preferences: "I prefer", "always", "never", "I like"
   - Project context: file paths, config references
   - Facts: environment info, tool locations
3. **Deduplicate** - Check against existing memory items
4. **Save to memory files** - Append new items with UUIDs
5. **Update history index** - Mark session as summarized, add topics

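As a rough illustration of steps 2 and 3, the sketch below classifies a message using the trigger phrases listed above and filters out items that already exist in memory. The function names are hypothetical, and treating two items as duplicates when they match after lowercasing and whitespace collapsing is an assumption; the plan does not define the comparison rule.

```python
import re
from typing import Iterable, List, Set

# Trigger phrases taken from the plan.
DECISION_PHRASES = ("let's use", "we decided", "i'll go with")
PREFERENCE_PHRASES = ("i prefer", "always", "never", "i like")


def _compile(phrases: Iterable[str]) -> "re.Pattern[str]":
    # Word boundaries stop "never" from matching inside "nevertheless".
    body = "|".join(re.escape(p) for p in phrases)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


DECISION_RE = _compile(DECISION_PHRASES)
PREFERENCE_RE = _compile(PREFERENCE_PHRASES)


def classify_message(text: str) -> Set[str]:
    """Return the memory categories whose trigger phrases appear in the text."""
    categories = set()
    if DECISION_RE.search(text):
        categories.add("decisions")
    if PREFERENCE_RE.search(text):
        categories.add("preferences")
    return categories


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (assumed duplicate-comparison key)."""
    return " ".join(text.lower().split())


def new_items(candidates: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Keep candidates not already in memory, preserving their order."""
    seen = {normalize(item) for item in existing}
    fresh = []
    for candidate in candidates:
        key = normalize(candidate)
        if key not in seen:
            seen.add(key)
            fresh.append(candidate)
    return fresh
```

Keyword matching of this kind is cheap but noisy: "always" and "never" appear in plenty of sentences that are not preferences, which is one reason the plan reserves nuanced cases for the LLM pass.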
### Processing Approach: Hybrid (Decision)

**Step 1: Threshold check**
- Skip sessions with fewer than 3 user messages
- Skip sessions that are only quick commands (no substantive discussion)

**Step 2: Heuristic extraction (fast, no API)**
- File paths mentioned → project context
- Environment facts (tool locations, versions)
- Simple preferences with clear keywords

**Step 3: LLM extraction (if substantive content)**
- Complex decisions with rationale
- Nuanced preferences
- Project context requiring interpretation
- Use Claude API (Haiku for cost efficiency)

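Steps 1 and 2 can be sketched without any API calls. The 3-message threshold comes from the plan; everything else here is an assumption made for illustration: the definition of a "quick command" (a slash command, or a message shorter than `MIN_WORDS` words), the path regex, and all function names.

```python
import re
from typing import List

MIN_USER_MESSAGES = 3  # threshold stated in the plan
MIN_WORDS = 4          # assumption: shorter messages count as quick commands

# Assumption: a slash command is "/" plus a single word, e.g. "/clear".
SLASH_COMMAND_RE = re.compile(r"^/[\w-]+(?:\s|$)")

# Assumption: paths start with "/", "~/", "./" or "../" and are not part of a URL.
PATH_RE = re.compile(r"(?<![\w/])(?:~|\.{1,2})?/(?:[\w.\-]+/)*[\w.\-]+")


def is_quick_command(message: str) -> bool:
    """Heuristic: slash commands and very short messages are not substantive."""
    stripped = message.strip()
    if SLASH_COMMAND_RE.match(stripped):
        return True
    return len(stripped.split()) < MIN_WORDS


def should_summarize(user_messages: List[str]) -> bool:
    """Apply the threshold check before any extraction work is done."""
    if len(user_messages) < MIN_USER_MESSAGES:
        return False
    return any(not is_quick_command(m) for m in user_messages)


def extract_paths(text: str) -> List[str]:
    """Return unique file paths in order of first appearance."""
    paths = []
    for match in PATH_RE.finditer(text):
        path = match.group(0).rstrip(".")  # drop sentence-ending periods
        if path not in paths:
            paths.append(path)
    return paths
```

A bare slash command such as `/clear` also looks like a path to this regex, so it is safer to run `extract_paths` only on messages that `is_quick_command` has not rejected.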
### Transcript Storage (Decision)

Reference Claude's existing transcript location (`~/.claude/projects/.../[uuid].jsonl`) rather than copying to our history folder. The history index will store the transcript path for future reference.

### Memory File Format

Each item:
```json
{
  "id": "uuid",
  "date": "YYYY-MM-DD",
  "content": "Brief description",
  "context": "Additional context",
  "session": "session-id"
}
```

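A minimal sketch of building and appending an item in this format follows. It assumes each memory file holds a top-level JSON array, which the plan does not specify, and `make_memory_item` and `append_items` are hypothetical helper names.

```python
import json
import uuid
from datetime import date
from pathlib import Path
from typing import List, Optional


def make_memory_item(content: str, context: str, session_id: str,
                     today: Optional[date] = None) -> dict:
    """Build one item with the five fields shown above."""
    return {
        "id": str(uuid.uuid4()),
        "date": (today or date.today()).isoformat(),  # YYYY-MM-DD
        "content": content,
        "context": context,
        "session": session_id,
    }


def append_items(path: Path, items: List[dict]) -> int:
    """Append items to a memory file and return the new item count.

    Assumption: the file contains a top-level JSON array, or is missing/empty.
    """
    existing = []
    if path.exists() and path.stat().st_size > 0:
        existing = json.loads(path.read_text(encoding="utf-8"))
    existing.extend(items)

    # Write to a temporary file first so an interrupted run cannot
    # leave a half-written memory file behind.
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(existing, indent=2), encoding="utf-8")
    tmp.replace(path)
    return len(existing)
```

The optional `today` parameter exists only to make the date deterministic in tests.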
## Implementation Steps

### Step 1: Update hooks.json

Add `SessionEnd` entry to `~/.claude/hooks/hooks.json`

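If the entry is added by a script rather than by hand, it helps to make the update idempotent. The sketch below assumes `events` is the mapping inside `hooks.json` whose keys are event names such as `SessionEnd`; where that mapping sits in the file is not stated in this plan, and `add_session_end_hook` is a hypothetical helper.

```python
from typing import Dict, List


def add_session_end_hook(events: Dict[str, List[dict]], command: str,
                         timeout: int = 120) -> bool:
    """Add a SessionEnd command hook unless the same command is already present.

    Returns True if an entry was added, False if it already existed.
    """
    groups = events.setdefault("SessionEnd", [])
    for group in groups:
        for hook in group.get("hooks", []):
            if hook.get("command") == command:
                return False  # already registered, leave the config untouched

    groups.append({
        "hooks": [
            {"type": "command", "command": command, "timeout": timeout}
        ]
    })
    return True
```

Calling it twice with `~/.claude/hooks/scripts/session-end.sh` leaves a single entry, so re-running a setup script does not register the hook twice.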
### Step 2: Create session-end.sh

Shell wrapper at `~/.claude/hooks/scripts/session-end.sh`:
- Parse JSON input from stdin
- Extract `session_id`, `transcript_path`, `reason`
- Call the Python summarization script
- Handle errors silently (don't break session exit)

### Step 3: Create summarize-transcript.py

Python script at `~/.claude/hooks/scripts/summarize-transcript.py`:

```
Arguments: --session-id <id> --transcript <path> [--reason <reason>]

1. Load transcript (.jsonl)
2. Count user messages → skip if < 3
3. Heuristic pass:
   - Extract file paths → projects.json
   - Extract env facts → facts.json
4. If substantive content detected:
   - Call Claude API (Haiku) for decisions/preferences
   - Parse response → decisions.json, preferences.json
5. Update history/index.json:
   - Set summarized: true
   - Add transcript_path
   - Add extracted topics
```

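The argument handling and the first two steps of the outline above might look like the sketch below. The flag names come from the outline. The transcript entry shape (`{"type": "user", "message": {"content": ...}}`, where `content` is either a string or a list of typed blocks) is an assumption about Claude's `.jsonl` format that should be checked against a real transcript, and the function names are hypothetical.

```python
import argparse
import json
from pathlib import Path
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """CLI matching the outline: --session-id, --transcript, optional --reason."""
    parser = argparse.ArgumentParser(description="Summarize a session transcript")
    parser.add_argument("--session-id", required=True)
    parser.add_argument("--transcript", required=True, type=Path)
    parser.add_argument("--reason", default="unknown")
    return parser


def load_transcript(path) -> List[dict]:
    """Read a .jsonl file, skipping blank and malformed lines."""
    entries = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                entries.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # tolerate a truncated final line
    return entries


def user_messages(entries: List[dict]) -> List[str]:
    """Collect user-authored text (entry shape is an assumption, see lead-in)."""
    texts = []
    for entry in entries:
        if not isinstance(entry, dict) or entry.get("type") != "user":
            continue
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if isinstance(content, str):
            texts.append(content)
        elif isinstance(content, list):
            parts = [block.get("text", "") for block in content
                     if isinstance(block, dict) and block.get("type") == "text"]
            if parts:
                texts.append("\n".join(parts))
    return texts
```

With these pieces, step 2 of the outline reduces to comparing `len(user_messages(entries))` against the threshold of 3.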
### Step 4: Update history index schema

Add `transcript_path` field to session entries in `history/index.json`

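The index update could be sketched as below. The plan does not show the index schema, so the layout used here (a top-level `"sessions"` list whose entries carry an `"id"` key) is purely hypothetical, as is the `mark_summarized` helper; only the `summarized`, `transcript_path`, and topics fields come from the plan.

```python
from typing import Iterable


def mark_summarized(index: dict, session_id: str, transcript_path: str,
                    topics: Iterable[str]) -> bool:
    """Mark one session entry as summarized; return False if it is not found.

    Assumed schema: {"sessions": [{"id": "...", ...}, ...]}
    """
    for entry in index.get("sessions", []):
        if entry.get("id") == session_id:
            entry["summarized"] = True
            entry["transcript_path"] = transcript_path
            # Merge with any topics already recorded, without duplicates.
            merged = set(entry.get("topics", [])) | set(topics)
            entry["topics"] = sorted(merged)
            return True
    return False
```

Returning `False` for an unknown session lets the caller decide whether to log and continue or to create a new entry.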
## Testing

1. Start a test session with substantive discussion
2. Exit the session normally
3. Verify:
   - Hook fired (check with `--debug`)
   - Memory files updated
   - History index marks the session as summarized