| phase | plan | type | wave | depends_on | files_modified | autonomous | must_haves |
|---|---|---|---|---|---|---|---|
| 03-live-ops-dashboard | 02 | execute | 2 | | | false | |
Purpose: This is the user-facing deliverable — the operator opens the dashboard and sees real-time system health without tailing logs. All data comes from the RPC handlers created in Plan 01.
Output: Enhanced dashboard.js with four new sections, supporting CSS, human-verified live dashboard.
<execution_context> @/home/will/.config/opencode/get-shit-done/workflows/execute-plan.md @/home/will/.config/opencode/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/03-live-ops-dashboard/03-01-SUMMARY.md @src/gateway/ui/pages/dashboard.js @src/gateway/ui/style.css @src/gateway/ui/index.html @src/gateway/ui/app.js @src/gateway/ui/lib/ws-client.js

Task 1: Extend dashboard page with live ops sections

src/gateway/ui/pages/dashboard.js, src/gateway/ui/style.css

**IMPORTANT: Extend the existing vanilla JS dashboard; do NOT replace it with React or any framework. This is a locked user decision.**

Rewrite src/gateway/ui/pages/dashboard.js to show five sections (four new ones plus the existing channels grid), replacing the current simple health/channels/usage layout:
Section 1: Core Counters (top row of stat cards)
- Messages Processed (from `system.metrics` → `messagesProcessed`)
- Active Sessions (from `system.health` → `sessions`)
- Queue Depth (from `system.metrics` → `queueDepth`)
- Daemon Uptime (from `system.metrics` → `uptime`, formatted as "Xd Xh Xm Xs")
- Active Requests (from `system.metrics` → `activeRequests`)
- Errors (from `system.metrics` → `errors`, colored red if > 0)
Use the existing .stats-grid and .stat-card CSS classes.
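As a rough illustration of the counter mapping above, the sketch below turns the two RPC payloads into six stat-card entries. `buildCounters` is a hypothetical name, the `formatUptime` shown here is only a stand-in for the existing helper (which this plan does not reproduce), and it assumes `uptime` is reported in seconds and the other fields are plain numbers.

```javascript
// Hypothetical stand-in for the existing formatUptime helper.
// Assumes system.metrics reports uptime in seconds.
function formatUptime(totalSeconds) {
  const s = Math.max(0, Math.floor(totalSeconds));
  const days = Math.floor(s / 86400);
  const hours = Math.floor((s % 86400) / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  return `${days}d ${hours}h ${minutes}m ${seconds}s`;
}

// Builds the six stat-card entries in display order.
// `alert` marks the card that should be colored red (errors > 0).
function buildCounters(metrics, health) {
  return [
    { label: 'Messages Processed', value: String(metrics.messagesProcessed), alert: false },
    { label: 'Active Sessions', value: String(health.sessions), alert: false },
    { label: 'Queue Depth', value: String(metrics.queueDepth), alert: false },
    { label: 'Daemon Uptime', value: formatUptime(metrics.uptime), alert: false },
    { label: 'Active Requests', value: String(metrics.activeRequests), alert: false },
    { label: 'Errors', value: String(metrics.errors), alert: metrics.errors > 0 },
  ];
}
```

Keeping this step free of DOM access means the same array can feed both the first render and the later targeted updates.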
Section 2: Model Performance (table of recent model calls)
- Show the most recent 20 model calls from `system.metrics` → `modelCalls.recentCalls`
- Table columns: Time (relative, e.g. "3s ago"), Provider, Latency (ms), Tokens/sec, In/Out tokens, Status (✓ or ✗)
- Summary row above the table: Total calls, Avg latency, Error rate %
- Use existing table CSS classes
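The relative-time column and the summary row could be derived along these lines. This is a sketch under assumptions: the field names `latencyMs` and `ok`, and the oldest-first ordering of `recentCalls`, are guesses about the payload shape, and the summary is computed over the displayed rows only (the real metrics payload may already carry its own aggregates).

```javascript
// Formats a timestamp as "3s ago" / "2m ago" / "1h ago" relative to nowMs.
function relativeTime(timestampMs, nowMs) {
  const seconds = Math.max(0, Math.floor((nowMs - timestampMs) / 1000));
  if (seconds < 60) return `${seconds}s ago`;
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ago`;
  return `${Math.floor(minutes / 60)}h ago`;
}

// Summarizes the rows shown in the table.
// Assumed fields per call: latencyMs (number), ok (boolean).
function summarizeCalls(recentCalls) {
  const shown = recentCalls.slice(-20); // last 20, assuming oldest-first order
  const total = shown.length;
  if (total === 0) return { total: 0, avgLatencyMs: 0, errorRatePct: 0 };
  const latencySum = shown.reduce((sum, call) => sum + call.latencyMs, 0);
  const failures = shown.filter((call) => !call.ok).length;
  return {
    total,
    avgLatencyMs: Math.round(latencySum / total),
    errorRatePct: Math.round((failures / total) * 1000) / 10, // one decimal place
  };
}
```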
Section 3: Event Stream (scrollable log)
- Fetch from `system.events` with `{ limit: 50 }`
- Each event rendered as a row: `[HH:MM:SS] [LEVEL] source: message`
- Color-code: error=red, warn=yellow, info=default
- Container has max-height with overflow-y: auto and auto-scrolls to bottom on new entries
- New class `.event-stream` for the container, `.event-row` for each entry, `.event-level-error`, `.event-level-warn`, `.event-level-info` for coloring
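One possible way to produce the row text and CSS class for each event is sketched below. The event fields (`timestamp`, `level`, `source`, `message`) are assumptions about what `system.events` returns, and `formatEventRow` is a hypothetical helper name; unknown levels fall back to the info styling.

```javascript
function pad2(n) {
  return String(n).padStart(2, '0');
}

// Returns the display text and class list for one event row.
// Uses local time for the HH:MM:SS prefix.
function formatEventRow(event) {
  const d = new Date(event.timestamp);
  const time = `${pad2(d.getHours())}:${pad2(d.getMinutes())}:${pad2(d.getSeconds())}`;
  const level = ['error', 'warn'].includes(event.level) ? event.level : 'info';
  return {
    text: `[${time}] [${level.toUpperCase()}] ${event.source}: ${event.message}`,
    className: `event-row event-level-${level}`,
  };
}
```

After appending new rows, setting `container.scrollTop = container.scrollHeight` on the `.event-stream` element scrolls it to the bottom.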
Section 4: Active Requests (table, only shown when requests in flight)
- Fetch from `system.activeRequests`
- Table columns: Session, Channel, Duration (live-updating), Started
- If no active requests, show "No active requests" muted text
- Use existing table CSS
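The empty state and the live duration column might be prepared as follows. This is only a sketch: `activeRequestRows` is a hypothetical name, and the fields `sessionId`, `channel`, and `startedAt` (epoch milliseconds) are assumptions about the `system.activeRequests` payload.

```javascript
// Returns either an empty-state message or one row per in-flight request.
// Duration is recomputed from startedAt on every call, so re-running this on
// each fast tick keeps the column live.
function activeRequestRows(requests, nowMs) {
  if (requests.length === 0) {
    return { emptyText: 'No active requests', rows: [] };
  }
  const rows = requests.map((r) => ({
    session: r.sessionId,
    channel: r.channel,
    duration: `${((nowMs - r.startedAt) / 1000).toFixed(1)}s`,
    started: new Date(r.startedAt).toLocaleTimeString(),
  }));
  return { emptyText: null, rows };
}
```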
Section 5: Channels (keep existing)
- Keep the existing channels grid showing connected/disconnected channel adapters
Refresh strategy:
- Replace the current 10-second interval with a 3-second interval for the core data (system.metrics, system.events, system.activeRequests)
- Fetch system.health and system.channels every 10 seconds (less dynamic data)
- Use `Promise.all` to batch the frequent calls together
- Keep the existing `teardown()` pattern with `clearInterval`
Implementation approach:
- Keep the same module pattern: `loadDashboard(el, client)` function + `DashboardPage` export with `render`/`teardown`
- Use two timers: `_fastTimer` (3s) for metrics/events/requests, `_slowTimer` (10s) for health/channels
- On first render, fetch everything with `Promise.all`
- On subsequent fast ticks, only update the dynamic sections (don't re-render the whole page; use targeted DOM updates via `getElementById` for each section)
- Generate unique section IDs: `#ops-counters`, `#ops-model-table`, `#ops-events`, `#ops-requests`, `#ops-channels`
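The two-timer wiring described above could look roughly like this. It is a sketch, not the existing module: `createRefresher`, the `handlers` callbacks, and the injectable `timers` argument are hypothetical (the last exists only so the logic can run outside a browser), and passing `{ limit: 50 }` as a second argument to `client.call` is an assumption about the ws-client signature.

```javascript
function createRefresher(
  client,
  handlers,
  timers = { set: (fn, ms) => setInterval(fn, ms), clear: (id) => clearInterval(id) }
) {
  let fastTimer = null;
  let slowTimer = null;

  // Fast path (3s): batch the three frequently changing calls.
  async function fastTick() {
    const [metrics, events, requests] = await Promise.all([
      client.call('system.metrics'),
      client.call('system.events', { limit: 50 }),
      client.call('system.activeRequests'),
    ]);
    handlers.onFast({ metrics, events, requests });
  }

  // Slow path (10s): less dynamic data.
  async function slowTick() {
    const [health, channels] = await Promise.all([
      client.call('system.health'),
      client.call('system.channels'),
    ]);
    handlers.onSlow({ health, channels });
  }

  // Route failures to an optional handler instead of leaving rejections unhandled.
  const safe = (tick) => () => tick().catch((err) => handlers.onError?.(err));

  return {
    start() {
      // First render: fetch everything at once, then start both intervals.
      Promise.all([fastTick(), slowTick()]).catch((err) => handlers.onError?.(err));
      fastTimer = timers.set(safe(fastTick), 3000);
      slowTimer = timers.set(safe(slowTick), 10000);
    },
    teardown() {
      timers.clear(fastTimer);
      timers.clear(slowTimer);
      fastTimer = null;
      slowTimer = null;
    },
  };
}
```

In the page module, `render` would call `start()` and `teardown` would call `teardown()`, which matches the existing `clearInterval` cleanup pattern with two timers instead of one.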
CSS additions in src/gateway/ui/style.css:
Add at the end of the file (before the responsive section):
/* ── Event Stream ──────────────────────────────────────── */
.event-stream {
max-height: 300px;
overflow-y: auto;
background-color: var(--bg-secondary);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 8px;
font-size: var(--font-size-sm);
font-family: var(--font-mono);
}
.event-row {
padding: 4px 8px;
border-bottom: 1px solid var(--border-light);
white-space: pre-wrap;
word-break: break-word;
}
.event-row:last-child {
border-bottom: none;
}
.event-level-error { color: var(--error); }
.event-level-warn { color: var(--warning); }
.event-level-info { color: var(--text-secondary); }
/* ── Model Metrics Summary ─────────────────────────────── */
.metrics-summary {
display: flex;
gap: 24px;
margin-bottom: 12px;
font-size: var(--font-size-sm);
color: var(--text-secondary);
}
.metrics-summary .metric {
display: flex;
gap: 6px;
}
.metrics-summary .metric-value {
font-weight: 600;
color: var(--text-primary);
}
Keep the formatUptime helper — it already exists and works perfectly.
Avoid: Do NOT add animations or transitions. Do NOT import external libraries. Do NOT use template literals with innerHTML for the fast-update path — use targeted textContent/innerHTML updates on specific elements to avoid flicker.
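For the fast-update path, a small guard like the one below can keep writes targeted and skip elements whose text has not changed. `setTextIfChanged` is a hypothetical helper, and the `lookup` parameter is injectable only so the sketch can run outside a browser; by default it uses `document.getElementById`.

```javascript
// Writes `text` into the element with the given id only when it differs from
// the current content. Returns true if a write happened, false otherwise.
function setTextIfChanged(id, text, lookup = (x) => document.getElementById(x)) {
  const el = lookup(id);
  if (!el || el.textContent === text) return false;
  el.textContent = text;
  return true;
}
```

Skipping no-op writes avoids touching the DOM on ticks where a value is unchanged, which is one way to reduce flicker without rebuilding a section.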
pnpm typecheck — no type errors (vanilla JS won't affect this, but ensures no TS regressions).
pnpm build — builds successfully (UI files are served as static assets, not compiled).
Manual check: Open src/gateway/ui/pages/dashboard.js and verify it:
- Calls `client.call('system.metrics')`
- Calls `client.call('system.events')`
- Calls `client.call('system.activeRequests')`
- Has 3-second and 10-second refresh timers
- Has `teardown()` that cleans up both timers

Dashboard page shows five sections: core counters, model performance table, event stream, active requests, and channels. Counters and events refresh every 3 seconds. Health and channels refresh every 10 seconds. Event stream auto-scrolls and is color-coded by level. Active requests section shows in-flight requests or "no active requests" message. All existing stat-card and table CSS reused; new event-stream CSS added.
Steps to verify:
- Start Flynn: `pnpm dev`
- Open the dashboard in a browser (default: http://localhost:3100 or configured port)
- Verify the dashboard shows:
  - Core counters row: Messages Processed, Active Sessions, Queue Depth, Uptime, Active Requests, Errors
  - Model Performance section: table of recent model calls (may be empty if no messages sent yet)
  - Event Stream section: scrollable log (may show startup events)
  - Active Requests section: "No active requests" or table
  - Channels section: connected channel adapters
- Send a message through the chat page (or via a connected channel) and verify:
  - Messages Processed counter increments within 3 seconds
  - Model Performance table shows the new call with latency and tokens/sec
  - Event stream shows relevant entries
- Trigger an error (e.g., send a message that causes a tool error) and verify it appears in the event stream in red
- Test HTTP /health: `curl http://localhost:3100/health` should return JSON with status, uptime, version
- Run `pnpm test:run` and confirm all tests pass
Resume signal: Type "approved" or describe issues.

Human confirms dashboard displays correctly and updates in real-time.

Dashboard visually confirmed working with live-updating metrics, event stream, and model performance data.

1. Dashboard loads without errors in browser console
2. All five sections render with real data
3. Counters update within 3 seconds of events occurring
4. Event stream is scrollable and color-coded
5. `curl /health` returns valid JSON
6. `pnpm test:run`: all tests pass
7. `pnpm typecheck`: zero type errors

<success_criteria>
- Dashboard shows live-updating counters that change as messages flow (DASH-01)
- Model call metrics visible with latency and tokens/sec (DASH-02)
- Event stream shows errors with timestamps and context (DASH-03)
- Active requests tracked and displayed (DASH-04)
- GET /health returns JSON status (DASH-05)
- Existing dashboard pages (chat, sessions, usage, settings) unaffected
- Zero test regressions </success_criteria>