Dashboard with Real-time Stats and Graphs

Date: 2026-03-14
Status: Approved

Overview

Add a comprehensive dashboard at / combining server-side aggregation endpoints, the existing WebSocket stream, and uPlot charts for real-time agent monitoring with stats and graphs.

Architecture

Three data sources feed the dashboard:

  1. Server-side aggregation endpoints (new) - historical stats on page load
  2. WebSocket stream (existing) - live events update charts in real-time
  3. Existing REST endpoints - sessions/runs for linking out

New Backend Endpoints

GET /v1/stats/summary

Current-day aggregates:

{
  "active_sessions": 3,
  "runs_today": 47,
  "tool_calls_today": 312,
  "errors_today": 2,
  "by_framework": {
    "openclaw": { "runs": 30, "tools": 210, "errors": 1 },
    "claude-code": { "runs": 17, "tools": 102, "errors": 1 }
  }
}

Implemented as a simple COUNT with GROUP BY over the events table on the type and source_framework columns, filtered to ts >= midnight today.
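As a rough sketch of how the grouped counts could be folded into this response, the Go below defines types matching the JSON above and a fold step. The names Summary, FrameworkStats, GroupedRow, and FoldSummary are illustrative, and the "runs"/"tools"/"errors" category strings are an assumed mapping from the type column, whose actual values are not specified in this plan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FrameworkStats mirrors one entry under "by_framework".
type FrameworkStats struct {
	Runs   int `json:"runs"`
	Tools  int `json:"tools"`
	Errors int `json:"errors"`
}

// Summary mirrors the /v1/stats/summary response body.
type Summary struct {
	ActiveSessions int                       `json:"active_sessions"`
	RunsToday      int                       `json:"runs_today"`
	ToolCallsToday int                       `json:"tool_calls_today"`
	ErrorsToday    int                       `json:"errors_today"`
	ByFramework    map[string]FrameworkStats `json:"by_framework"`
}

// GroupedRow is a hypothetical shape for one GROUP BY result row.
// Category is assumed to be derived from the event type column.
type GroupedRow struct {
	Framework string
	Category  string // "runs", "tools", or "errors" (assumed)
	Count     int
}

// FoldSummary adds each grouped count to both the per-framework
// breakdown and the day totals.
func FoldSummary(activeSessions int, rows []GroupedRow) Summary {
	s := Summary{ActiveSessions: activeSessions, ByFramework: map[string]FrameworkStats{}}
	for _, r := range rows {
		fs := s.ByFramework[r.Framework]
		switch r.Category {
		case "runs":
			fs.Runs += r.Count
			s.RunsToday += r.Count
		case "tools":
			fs.Tools += r.Count
			s.ToolCallsToday += r.Count
		case "errors":
			fs.Errors += r.Count
			s.ErrorsToday += r.Count
		default:
			continue // unknown categories are skipped in this sketch
		}
		s.ByFramework[r.Framework] = fs
	}
	return s
}

func main() {
	rows := []GroupedRow{
		{"openclaw", "runs", 30}, {"openclaw", "tools", 210}, {"openclaw", "errors", 1},
		{"claude-code", "runs", 17}, {"claude-code", "tools", 102}, {"claude-code", "errors", 1},
	}
	out, err := json.MarshalIndent(FoldSummary(3, rows), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The active session count is passed in separately because it depends on the absence of a session.end event rather than on a plain count of today's events.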

GET /v1/stats/timeseries?window=1h&bucket=1m

Bucketed event counts:

{
  "window": "1h",
  "bucket": "1m",
  "series": [
    { "ts": "2026-03-14T10:00:00Z", "runs": 2, "tools": 14, "errors": 0 },
    { "ts": "2026-03-14T10:01:00Z", "runs": 1, "tools": 8, "errors": 0 }
  ]
}

If bucket is omitted, the bucket size is derived from the window: 1h→1m, 6h→5m, 24h→15m, 7d→1h. Bucketing uses the Postgres date_bin function.

Dashboard Layout

Single-scroll page with four sections:

1. Summary Strip

Four stat cards in a horizontal row:

  • Active Sessions - sessions with no session.end event, with framework breakdown
  • Runs Today - total runs since midnight
  • Tool Calls - total tool spans today
  • Errors - error events today, red-highlighted if > 0

2. OpenClaw VM Strip

Reuse existing VM pill component (zap/orb/sun online/offline status).

3. Activity Charts

Two charts side by side:

  • Left: Event rate - uPlot stacked area time-series. One series per category (runs, tools, errors). Time window selector (1h/6h/24h/7d) in top-right.
  • Right: Framework breakdown - horizontal bar chart showing events by framework. Rendered as styled divs (categorical, no uPlot needed).

4. Bottom Panels

Two columns:

  • Left: Recent activity feed - last 20 events as compact timeline (reuse existing timeline helpers)
  • Right: Top tools - ranked list with counts and bar visualization

Frontend

  • uPlot loaded from CDN for time-series charts
  • Thin wrapper createChart(el, series, opts) applies dark theme from existing CSS vars
  • Real-time update flow:
    1. Page load → fetch summary + timeseries (default 1h window)
    2. Connect WebSocket → append live events to charts
    3. On each WS event: increment counters, update chart bucket, update top tools, prepend to feed
    4. Time window change → re-fetch from server, rebuild chart
  • Framework bars: styled <div> elements with proportional widths
  • Activity feed: reuse getEventIcon, getEventLabel, getEventBody from existing code (extracted as shared helpers)
  • Time window selector: segmented control with 1h / 6h / 24h / 7d buttons

File Changes

New files

  • internal/store/postgres/stats.go - Summary() and Timeseries() query functions

Modified files

  • cmd/query-api/main.go - add /v1/stats/summary and /v1/stats/timeseries handlers
  • cmd/web-ui/static/index.html - add uPlot CDN script/link tags
  • cmd/web-ui/static/app.js - dashboard route at /, shared helpers, chart rendering, real-time updates
  • cmd/web-ui/static/style.css - dashboard layout, summary cards, chart containers, time window selector, framework bars

No changes to

  • Database schema (queries use existing events table)
  • Event format or ingestion pipeline
  • Existing pages (sessions, agents, openclaw)