agentmon/docs/plans/2026-04-22-web-ui-improvements-plan.md
2026-05-20 17:35:56 -07:00


Web UI Improvements — Plan

Date: 2026-04-22
Status: Pending

For Claude: Use code-implementer agent to execute tasks. Each task is self-contained and can be committed independently. All changes are frontend-only (cmd/web-ui/static/) unless noted.

Context: The previous UI plan (2026-03-28-ui-ux-improvements-design.md) has been fully implemented. This plan covers the next layer of improvements: data richness, UX gaps, bug fixes, code quality, and new pages.

Architecture: Vanilla JS SPA, no build tools. app.js is a single IIFE (~3700 lines). style.css uses CSS custom properties. Go server is a static file host + API reverse proxy + WebSocket proxy. No backend changes are needed for most tasks; backend tasks are noted explicitly.


Summary

| # | Task | Files | Impact | Effort |
|---|---|---|---|---|
| 1 | Fix Ctrl+K label on Linux | app.js, index.html | Low | Trivial |
| 2 | Fix WS unsubscribe variable reuse bug | app.js | Medium | Small |
| 3 | Stack multiple toasts | app.js, style.css | Low | Small |
| 4 | Show token usage + cost in run meta tiles | app.js | High | Small |
| 5 | Show aggregate tokens + cost in session detail | app.js | High | Small |
| 6 | Error count column in sessions table | app.js, style.css | Medium | Small |
| 7 | "Showing X of Y" pagination indicator | app.js, query-api/main.go, store/ | Medium | Medium |
| 8 | Sortable sessions table | app.js, style.css | Medium | Medium |
| 9 | Relax + expand global search | app.js | Medium | Small |
| 10 | Agent deep-link routes (/agents/:key) | app.js, style.css | Medium | Medium |
| 11 | Infrastructure manual refresh | app.js, style.css | Low | Small |
| 12 | Settings/admin page (data retention UI) | app.js, style.css | Medium | Medium |
| 13 | Cost/usage analytics panel | app.js, style.css | High | Medium |
| 14 | Span waterfall / trace view | app.js, style.css | High | Large |
| 15 | Error boundary for render functions | app.js | Medium | Medium |
| 16 | Split app.js into logical modules | app.js → multiple files, index.html | Medium | Large |

Task 1: Fix Ctrl+K Label on Linux

Problem: The command palette hint button in the header shows ⌘K (macOS), but on Linux the keyboard shortcut is Ctrl+K. The keyboard handler already correctly handles both (e.metaKey || e.ctrlKey), but the visible hint is wrong.

Files: cmd/web-ui/static/index.html, cmd/web-ui/static/app.js

Changes:

In index.html, the button currently contains static ⌘K text:

<button class="cmd-k-hint" id="cmd-k-hint" title="Command palette" type="button">
  <kbd>⌘K</kbd>
</button>

Replace with a JS-rendered label. In app.js, in the DOMContentLoaded handler where cmd-k-hint is wired up, also set the button text:

const isMac = /Mac|iPhone|iPad/.test(navigator.platform || navigator.userAgentData?.platform || '');
const btn = document.getElementById('cmd-k-hint');
if (btn) {
  btn.innerHTML = `<kbd>${isMac ? '⌘K' : 'Ctrl+K'}</kbd>`;
  btn.addEventListener('click', openCommandPalette);
}

Remove the static <kbd>⌘K</kbd> from index.html (leave the button element, just clear its body — JS will fill it).

Commit: fix(web-ui): show Ctrl+K shortcut hint on non-Mac platforms


Task 2: Fix WebSocket Unsubscribe Variable Reuse Bug

Problem: sessionsUnsubscribe (declared at line 326 of app.js) is reused as the cleanup handle for three different pages: the sessions list (renderSessions), the session detail (renderSession), and the run detail (renderRun). If navigation happens quickly, the previous page's cleanup may not run correctly or may clean up the wrong subscription.

Files: cmd/web-ui/static/app.js

Changes:

Split the shared variable into one handle per page so each page releases only its own subscription. cleanupLiveViews already calls sessionsUnsubscribe(); keep that cleanup path, but extend it to release all three handles:

  1. Replace the declaration at line 326 with three declarations:

    let sessionsPageUnsubscribe = null;  // was sessionsUnsubscribe
    let sessionDetailUnsubscribe = null;
    let runDetailUnsubscribe = null;
    
  2. In cleanupLiveViews, clean up all three:

    if (sessionsPageUnsubscribe) { sessionsPageUnsubscribe(); sessionsPageUnsubscribe = null; }
    if (sessionDetailUnsubscribe) { sessionDetailUnsubscribe(); sessionDetailUnsubscribe = null; }
    if (runDetailUnsubscribe) { runDetailUnsubscribe(); runDetailUnsubscribe = null; }
    
  3. In renderSessions, assign to sessionsPageUnsubscribe.

  4. In renderSession, assign to sessionDetailUnsubscribe.

  5. In renderRun, assign to runDetailUnsubscribe.

  6. In loadRunDetailData, when unsetting the subscription after run ends, reference runDetailUnsubscribe.
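To illustrate why separate handles matter, here is a minimal sketch of the pattern with plain closures standing in for WebSocket subscriptions. `subscribe` and the `active` set are hypothetical stand-ins; the real subscription API in app.js may look different.

```javascript
// Stand-in for a WS subscription: records a name and returns an unsubscribe function.
const active = new Set();
function subscribe(name) {
  active.add(name);
  return () => active.delete(name);
}

let sessionsPageUnsubscribe = null;
let sessionDetailUnsubscribe = null;
let runDetailUnsubscribe = null;

// Releases every per-page handle; safe to call when some are still null.
function cleanupLiveViews() {
  if (sessionsPageUnsubscribe) { sessionsPageUnsubscribe(); sessionsPageUnsubscribe = null; }
  if (sessionDetailUnsubscribe) { sessionDetailUnsubscribe(); sessionDetailUnsubscribe = null; }
  if (runDetailUnsubscribe) { runDetailUnsubscribe(); runDetailUnsubscribe = null; }
}

// Two pages subscribe back to back without an intervening cleanup.
sessionsPageUnsubscribe = subscribe('sessions');
sessionDetailUnsubscribe = subscribe('session-detail');
cleanupLiveViews();
```

With a single shared variable, the second assignment would overwrite the first handle, and the `sessions` subscription would never be released.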

Commit: fix(web-ui): rename per-page WS unsubscribe variables to prevent reuse bug


Task 3: Stack Multiple Toast Notifications

Problem: The current showToast function removes any existing toast before adding a new one. If two events arrive in quick succession (e.g., a copy success followed by an API error), the first toast disappears immediately.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css

Changes:

In showToast, instead of removing the existing toast, append new toasts and position them as a stack. Limit to 3 visible at a time.

function showToast(message, type) {
  // Limit to 3 toasts
  const existing = document.querySelectorAll('.toast');
  if (existing.length >= 3) existing[0].remove();

  const toast = document.createElement('div');
  toast.className = 'toast toast-' + (type || 'info');
  toast.textContent = message;
  document.body.appendChild(toast);

  // Stack: offset each toast by its position in the stack
  const toasts = document.querySelectorAll('.toast');
  toasts.forEach((t, i) => {
    t.style.bottom = (2 + i * 3.5) + 'rem';
  });

  requestAnimationFrame(() => toast.classList.add('visible'));
  setTimeout(() => {
    toast.classList.remove('visible');
    setTimeout(() => {
      toast.remove();
      // Re-stack remaining
      document.querySelectorAll('.toast').forEach((t, i) => {
        t.style.bottom = (2 + i * 3.5) + 'rem';
      });
    }, 300);
  }, 4000);
}

Add a transition: bottom 200ms ease to the .toast CSS rule so stacking animates smoothly.
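The stacking arithmetic appears twice in showToast (once when a toast is added, once when one is removed). A small pure helper could hold it in one place. `toastOffsets` is a hypothetical name, and the 2rem base and 3.5rem step are taken from the snippet above.

```javascript
// Hypothetical helper: bottom offsets for a stack of `count` toasts,
// oldest first, using the same 2rem base and 3.5rem step as showToast.
function toastOffsets(count) {
  return Array.from({ length: count }, (_, i) => (2 + i * 3.5) + 'rem');
}
```

Both re-stack loops could then assign `t.style.bottom = offsets[i]` from a single call.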

Commit: feat(web-ui): stack multiple toast notifications instead of replacing


Task 4: Token Usage + Cost in Run Meta Tiles

Problem: The run detail page shows Started, Duration, Model, and Tool Calls in its meta tiles. Token usage and cost are available in run.end span payload data but are not surfaced anywhere in the run detail view.

The run object returned by /v1/runs/:id already includes whatever the processor stores — check what fields are available. The run detail page renders from data.run (r).

Files: cmd/web-ui/static/app.js

Changes:

In renderRun (around line 1374), extend the meta tiles section to include token/cost data if present. The run object fields to look for: total_tokens, input_tokens, output_tokens, total_cost (or nested under usage).

// After existing meta tiles, add conditionally:
${r.total_tokens ? `
  <div class="meta-tile">
    <div class="meta-tile-label">Tokens</div>
    <div class="meta-tile-value" style="font-size:1.2rem">${formatTokenCount(r.total_tokens)}</div>
    ${r.input_tokens || r.output_tokens ? `<div class="meta-tile-sub">${formatTokenCount(r.input_tokens || 0)} in · ${formatTokenCount(r.output_tokens || 0)} out</div>` : ''}
  </div>` : ''}
${r.total_cost != null ? `
  <div class="meta-tile">
    <div class="meta-tile-label">Cost</div>
    <div class="meta-tile-value" style="font-size:1.2rem">${formatCost(r.total_cost)}</div>
  </div>` : ''}

Also add a .meta-tile-sub rule to style.css for the secondary line:

.meta-tile-sub {
  font-family: var(--font-mono);
  font-size: 0.68rem;
  color: var(--text-dim);
  margin-top: 0.2rem;
}

Backend check: Verify that the runs table in the postgres store returns these fields. If not, they may need to be populated by the event processor from run.end payload. Check internal/store/postgres/ and cmd/event-processor/ — if the fields are missing from the schema, add a small backend task here.
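Because the exact response shape is still unverified, the render code could read the fields through a small normalizer. The sketch below assumes the fields are either top-level or nested under `usage`, as mentioned above; `readRunUsage` is a hypothetical helper, and the real field names need to be confirmed against the API.

```javascript
// Hypothetical helper: normalizes token and cost fields from a run object.
// Assumes the fields are either top-level or nested under `usage`; missing
// values come back as null so the tiles can be skipped.
function readRunUsage(run) {
  const src = run || {};
  const nested = src.usage || {};
  const pick = (key) => (src[key] != null ? src[key] : nested[key] != null ? nested[key] : null);
  const input = pick('input_tokens');
  const output = pick('output_tokens');
  let total = pick('total_tokens');
  // Derive the total when only the input/output split is present.
  if (total == null && (input != null || output != null)) {
    total = (input || 0) + (output || 0);
  }
  return { total_tokens: total, input_tokens: input, output_tokens: output, total_cost: pick('total_cost') };
}
```

A cost of `0` is preserved rather than treated as missing, which matches the `r.total_cost != null` check in the tile template.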

Commit: feat(web-ui): show token usage and cost in run detail meta tiles


Task 5: Aggregate Tokens + Cost in Session Detail

Problem: The session detail page shows Started, Framework, Host, and Duration. It has no aggregate view of tokens consumed or cost incurred across all runs in the session.

Files: cmd/web-ui/static/app.js

Changes:

In renderSession, after the existing meta tiles, compute totals from the runs array and add up to three more tiles (tokens, cost, and tool calls), each shown only when its total is non-zero:

// After building the meta tiles section:
const totalTokens = runs.reduce((sum, r) => sum + (r.total_tokens || 0), 0);
const totalCost = runs.reduce((sum, r) => sum + (r.total_cost || 0), 0);
const totalTools = runs.reduce((sum, r) => sum + (r.tool_count || 0), 0);

// Add to meta tiles HTML:
${totalTokens > 0 ? `
  <div class="meta-tile">
    <div class="meta-tile-label">Total Tokens</div>
    <div class="meta-tile-value">${formatTokenCount(totalTokens)}</div>
  </div>` : ''}
${totalCost > 0 ? `
  <div class="meta-tile">
    <div class="meta-tile-label">Total Cost</div>
    <div class="meta-tile-value">${formatCost(totalCost)}</div>
  </div>` : ''}
${totalTools > 0 ? `
  <div class="meta-tile">
    <div class="meta-tile-label">Total Tools</div>
    <div class="meta-tile-value">${totalTools}</div>
  </div>` : ''}

These tiles should also re-render when loadSessionData refreshes the runs list.

Commit: feat(web-ui): aggregate token count and cost in session detail header


Task 6: Error Count Column in Sessions Table

Problem: Sessions track _errorCount in client-side state and WS updates increment it, but the sessions table never shows it. Users have no way to know which sessions had errors without clicking into each one.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css

Changes:

  1. Add an Errors column to the sessions table header in renderSessions:

    <th>Errors</th>
    
  2. Update refreshSessionsTable to render the error count in each row. If s._errorCount > 0, show a red badge:

    const errorCell = s._errorCount > 0
      ? `<span class="error-count-badge">${s._errorCount}</span>`
      : '<span style="color:var(--text-dim)">—</span>';
    
  3. Add the CSS badge:

    .error-count-badge {
      display: inline-flex;
      align-items: center;
      justify-content: center;
      min-width: 20px;
      height: 20px;
      padding: 0 5px;
      background: rgba(248, 113, 113, 0.15);
      color: var(--error);
      border: 1px solid rgba(248, 113, 113, 0.25);
      border-radius: 10px;
      font-family: var(--font-mono);
      font-size: 0.72rem;
      font-weight: 600;
    }
    
  4. Update renderSessionRow to include the new column as well (used by the WS update path).

  5. Update the colspan on empty-state rows from 5 to 6.

Note: The API response for sessions doesn't currently include an error_count field; the count is only maintained client-side via WS events. Sessions loaded before connecting to WS will show the "—" placeholder. A follow-up could add this to the DB query.

Commit: feat(web-ui): add error count column to sessions table


Task 7: "Showing X of Y" Pagination Indicator

Problem: The sessions page has no count indicator. Users don't know how many sessions exist, how many are loaded, or how many are hidden by their current filter.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css, cmd/query-api/main.go, internal/store/postgres/sessions.go

Backend changes required:

Add a total field to the /v1/sessions response. This requires:

  1. In internal/store/postgres/sessions.go, add a CountSessions(ctx, filter) method that runs a SELECT COUNT(*) with the same WHERE clause as ListSessions.

  2. In cmd/query-api/main.go, call it in parallel with ListSessions and include total in the response:

    total, err := db.CountSessions(r.Context(), f)
    // ...
    resp["total"] = total
    

Frontend changes:

In loadSessions, read data.total and store it:

sessionsState.total = data.total || 0;

In renderSessions, add a count indicator above the table:

<div class="pagination-info" id="pagination-info"></div>

After each loadSessions call, update it:

function updatePaginationInfo() {
  const el = document.getElementById('pagination-info');
  if (!el) return;
  const loaded = sessionsState.sessions.length;
  const total = sessionsState.total || loaded;
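  // applySessionFilter is a hypothetical helper standing in for the existing sessionFilterMode logic.
  const filtered = applySessionFilter(sessionsState.sessions).length;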
  el.textContent = filtered < loaded
    ? `Showing ${filtered} of ${loaded} loaded (${total} total)`
    : `Showing ${loaded} of ${total}`;
}

Add CSS:

.pagination-info {
  font-family: var(--font-mono);
  font-size: 0.72rem;
  color: var(--text-dim);
  margin-bottom: 0.75rem;
  letter-spacing: 0.02em;
}
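The label logic in updatePaginationInfo can be separated from the DOM lookup, which makes the three cases (filtered, partially loaded, fully loaded) easy to check. `paginationLabel` is a hypothetical helper; the wording of the strings follows the snippet above.

```javascript
// Hypothetical helper: builds the indicator text from three counts.
// `total` may be 0 or undefined when the backend has not sent it yet,
// in which case the loaded count is used as the best available total.
function paginationLabel(filtered, loaded, total) {
  const knownTotal = total || loaded;
  return filtered < loaded
    ? `Showing ${filtered} of ${loaded} loaded (${knownTotal} total)`
    : `Showing ${loaded} of ${knownTotal}`;
}
```

For example, with 50 sessions loaded out of 120 and a filter that leaves 3 visible, the indicator reads "Showing 3 of 50 loaded (120 total)".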

Commit: feat: show session count with pagination indicator (backend + frontend)


Task 8: Sortable Sessions Table

Problem: The sessions table is always sorted newest-first. There's no way to sort by duration, run count, or framework.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css

Approach: Client-side sort of the already-loaded sessionsState.sessions array. No new API endpoints needed.

Changes:

  1. Add sort state:

    let sessionSortKey = 'started_at'; // default
    let sessionSortDir = 'desc';       // 'asc' | 'desc'
    
  2. In renderSessions, make table headers clickable with sort indicators:

    <th class="sortable" data-sort="session_id">Session <span class="sort-icon"></span></th>
    <th class="sortable" data-sort="framework">Framework <span class="sort-icon"></span></th>
    <th class="sortable" data-sort="host">Host <span class="sort-icon"></span></th>
    <th class="sortable" data-sort="run_count">Runs <span class="sort-icon"></span></th>
    <th class="sortable" data-sort="started_at">Time <span class="sort-icon"></span></th>
    <th>Errors</th>
    
  3. Add sort handler:

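    document.querySelectorAll('th.sortable').forEach(th => {
      th.addEventListener('click', () => {
        const key = th.dataset.sort;
        if (sessionSortKey === key) {
          sessionSortDir = sessionSortDir === 'asc' ? 'desc' : 'asc';
        } else {
          sessionSortKey = key;
          sessionSortDir = 'desc';
        }
        // Reflect the active sort on the headers so the CSS arrows render.
        document.querySelectorAll('th.sortable').forEach(h => h.classList.remove('sort-asc', 'sort-desc'));
        th.classList.add('sort-' + sessionSortDir);
        refreshSessionsTable();
      });
    });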
    
  4. In refreshSessionsTable, sort before filtering:

    function sortSessions(sessions) {
      return [...sessions].sort((a, b) => {
        let av = a[sessionSortKey], bv = b[sessionSortKey];
        if (sessionSortKey === 'started_at') { av = new Date(av).getTime(); bv = new Date(bv).getTime(); }
        if (av < bv) return sessionSortDir === 'asc' ? -1 : 1;
        if (av > bv) return sessionSortDir === 'asc' ? 1 : -1;
        return 0;
      });
    }
    
  5. CSS for sortable headers and icons:

    th.sortable { cursor: pointer; user-select: none; }
    th.sortable:hover { color: var(--text-bright); }
    th.sortable.sort-asc .sort-icon::after { content: ' ↑'; }
    th.sortable.sort-desc .sort-icon::after { content: ' ↓'; }
    .sort-icon { color: var(--accent); font-size: 0.7rem; }
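The comparator in step 4 compares raw values, so rows with a missing host or run count can end up in unpredictable positions. One possible hardening is sketched below; `makeSessionComparator` is a hypothetical name, and the choice to sort missing values last in both directions is an assumption, not a stated requirement.

```javascript
// Hypothetical comparator factory for the sessions table.
// Missing values sort last in either direction; timestamps compare
// numerically and strings compare case-insensitively.
function makeSessionComparator(key, dir) {
  const sign = dir === 'asc' ? 1 : -1;
  const norm = (v) => {
    if (v == null || v === '') return null;
    if (key === 'started_at') {
      const t = new Date(v).getTime();
      return Number.isNaN(t) ? null : t;
    }
    return typeof v === 'string' ? v.toLowerCase() : v;
  };
  return (a, b) => {
    const av = norm(a[key]);
    const bv = norm(b[key]);
    if (av === null && bv === null) return 0;
    if (av === null) return 1;
    if (bv === null) return -1;
    if (av < bv) return -1 * sign;
    if (av > bv) return 1 * sign;
    return 0;
  };
}
```

Usage would be `[...sessions].sort(makeSessionComparator(sessionSortKey, sessionSortDir))`, which keeps the copy-then-sort behavior of sortSessions.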
    

Commit: feat(web-ui): sortable columns in sessions table


Task 9: Relax + Expand Global Search

Problem: The global search (header input + command palette) requires 8+ characters and only searches by exact session or run ID prefix. This is unnecessarily restrictive.

Files: cmd/web-ui/static/app.js

Changes:

  1. Lower minimum from 8 to 4 characters in handleGlobalSearch:

    if (id.length < 4) {
      showToast('Search ID must be at least 4 characters', 'info');
      return;
    }
    

    Also update the command palette ID detection threshold.

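  2. Search by framework/host shortcut: If the query doesn't look like a hex ID (it contains characters other than 0-9, a-f, and hyphens), navigate to /sessions?framework=<query>. The snippet handles only the framework case; a /sessions?host=<query> variant would follow the same pattern: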

    async function handleGlobalSearch(query) {
      query = query.trim();
      if (query.length < 4) { showToast('Enter at least 4 characters', 'info'); return; }
    
      // Hex ID pattern
      if (/^[a-f0-9-]{4,}$/i.test(query)) {
        // Try session, then run (existing logic)
        // ...
      } else {
        // Non-hex: treat as framework or host search
        navigate('/sessions?framework=' + encodeURIComponent(query));
      }
    }
    
  3. Update the ⌘K palette search hint text from "Search for ID: " to "Search: ".
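The branching in handleGlobalSearch can be expressed as a pure classifier, which keeps the length check and the hex test in one place for both the header input and the command palette. `classifySearchQuery` and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: decides how a search query should be handled.
// Returns kind 'too-short', 'id' (hex-like session/run ID prefix), or
// 'text' (anything else, routed to the framework filter).
function classifySearchQuery(raw) {
  const query = String(raw || '').trim();
  if (query.length < 4) return { kind: 'too-short', query };
  if (/^[a-f0-9-]{4,}$/i.test(query)) return { kind: 'id', query };
  return { kind: 'text', query, path: '/sessions?framework=' + encodeURIComponent(query) };
}
```

Note that a short word made only of the letters a-f, such as `cafe`, is still treated as an ID prefix under this pattern.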

Commit: feat(web-ui): relax search minimum to 4 chars, support framework/host queries


Task 10: Agent Deep-Link Routes (/agents/:key)

Problem: The agents page has no per-agent URL. You can't share a link to a specific agent's live view, and the browser back button doesn't restore the selected agent.

Files: cmd/web-ui/static/app.js

Changes:

  1. Add route handler in the route() function:

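    } else if (path.startsWith('/agents/')) {
      // selectAgent() encodes the key with encodeURIComponent, so decode it here.
      const agentKey = decodeURIComponent(path.slice('/agents/'.length));
      renderAgents(agentKey || undefined); // pass initial selected key
    } else if (path.startsWith('/agents')) {
      renderAgents();
    }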
    
  2. Update renderAgents to accept an optional initialKey parameter and pre-select that agent after data loads.

  3. Update selectAgent to push a URL state change:

    function selectAgent(key, nextMode) {
      if (!key || !agentsState.agents[key]) return;
      agentsState.selectedAgentKey = key;
      if (nextMode) agentsState.viewMode = nextMode;
      // Push URL so it's shareable/bookmarkable
      const newPath = '/agents/' + encodeURIComponent(key);
      if (window.location.pathname !== newPath) {
        history.pushState(null, '', newPath);
      }
      renderAgentsContent();
      if (agentsState.viewMode === 'live') void loadSelectedAgentLiveData();
    }
    
  4. Update renderBreadcrumbs to show a readable label (the agent name) rather than the raw key for /agents/:key paths.
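Since selectAgent writes the key into the URL with encodeURIComponent, the route handler has to reverse that step, and it should tolerate hand-edited or truncated URLs. A possible parser is sketched below; `parseAgentPath` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the decoded agent key for '/agents/<key>',
// or null for '/agents', '/agents/', unrelated paths, and malformed
// percent-encoding.
function parseAgentPath(path) {
  const prefix = '/agents/';
  if (!path.startsWith(prefix)) return null;
  const rawKey = path.slice(prefix.length).split('/')[0];
  if (!rawKey) return null;
  try {
    return decodeURIComponent(rawKey);
  } catch (err) {
    return null; // decodeURIComponent throws URIError on malformed input
  }
}
```

A `null` result would fall through to the plain `renderAgents()` call, so a bad link still lands on the agents page.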

Commit: feat(web-ui): deep-link routes for individual agents (/agents/:key)


Task 11: Infrastructure Manual Refresh

Problem: The infrastructure page only updates when WebSocket events arrive. If the page is stale (e.g., no new snapshots have arrived), there's no way to force a refresh.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css

Changes:

  1. Add a refresh button to the infra page header:

    // In renderInfraGrid(), update the page-header to include:
    <button class="refresh-btn" id="infra-refresh-btn" type="button" title="Refresh">
      <svg ...spinner icon...></svg> Refresh
    </button>
    
  2. Wire the button to re-fetch all infra data:

    document.getElementById('infra-refresh-btn')?.addEventListener('click', async () => {
      const btn = document.getElementById('infra-refresh-btn');
      if (btn) btn.disabled = true;
      try {
        const [ocData, swarmData] = await Promise.all([
          api('/v1/events?event_type=openclaw.snapshot&limit=100'),
          api('/v1/events?event_type=swarm.snapshot&limit=10').catch(() => ({ events: [] })),
        ]);
        mergeOpenClawEvents(ocData.events || []);
        for (const evt of swarmData.events || []) mergeSwarmSnapshot(evt);
        renderInfraGrid();
      } catch (e) {
        showToast('Refresh failed: ' + e.message, 'error');
      } finally {
        if (btn) btn.disabled = false;
      }
    });
    
  3. Add CSS for .refresh-btn (small, subtle button with a spinner animation on :disabled).

Commit: feat(web-ui): manual refresh button on infrastructure page


Task 12: Settings / Admin Page

Problem: The API exposes POST /v1/admin/retention for managing data retention, but there is no UI for it. There's no settings page at all.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css, cmd/web-ui/static/index.html

Changes:

  1. Add a Settings nav link in index.html:

    <a href="/settings">Settings</a>
    

    This is the only nav change.

  2. Add route handler in route():

    } else if (path === '/settings') {
      renderSettings();
    }
    
  3. Add keyboard shortcut in the g-prefix block: g+p → /settings.

  4. Implement renderSettings():

    async function renderSettings() {
      app.innerHTML = `
        <div class="page-header"><h2>Settings</h2></div>
    
        <div class="settings-section">
          <h3 class="settings-section-title">Data Retention</h3>
          <p class="settings-section-desc">Delete events older than the specified number of days. This runs automatically every 24 hours. Currently configured via <code>RETENTION_DAYS</code> environment variable (default: 30 days).</p>
          <div class="settings-row">
            <label class="settings-label" for="retention-days">Purge events older than</label>
            <div class="settings-input-group">
              <input type="number" id="retention-days" class="settings-input" min="1" max="365" value="30">
              <span class="settings-input-suffix">days</span>
              <button class="settings-btn" id="run-retention-btn" type="button">Run Now</button>
            </div>
          </div>
          <div id="retention-result" class="settings-result"></div>
        </div>
      `;
    
      document.getElementById('run-retention-btn')?.addEventListener('click', async () => {
        const days = parseInt(document.getElementById('retention-days').value, 10);
        if (!days || days < 1) { showToast('Enter a valid number of days', 'error'); return; }
        const btn = document.getElementById('run-retention-btn');
        btn.disabled = true;
        btn.textContent = 'Running…';
        try {
          const resp = await fetch('/api/v1/admin/retention', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ days }),
          });
          const data = await resp.json();
          if (!resp.ok) throw new Error(data.error || 'Request failed');
          const result = document.getElementById('retention-result');
          if (result) result.innerHTML = `<span class="settings-result-ok">Deleted ${data.deleted} events older than ${new Date(data.cutoff).toLocaleDateString()}.</span>`;
          showToast(`Deleted ${data.deleted} events`, 'success');
        } catch (e) {
          showToast('Retention failed: ' + e.message, 'error');
        } finally {
          btn.disabled = false;
          btn.textContent = 'Run Now';
        }
      });
    }
    
  5. Add settings page CSS:

    .settings-section {
      background: var(--surface);
      border: 1px solid var(--border);
      border-radius: var(--radius-lg);
      padding: 1.5rem;
      margin-bottom: 1.5rem;
      max-width: 640px;
    }
    .settings-section-title {
      font-family: var(--font-display);
      font-size: 1rem;
      font-weight: 700;
      color: var(--text-bright);
      margin-bottom: 0.5rem;
    }
    .settings-section-desc {
      font-size: 0.82rem;
      color: var(--text-dim);
      margin-bottom: 1.25rem;
      line-height: 1.6;
    }
    .settings-row {
      display: flex;
      flex-direction: column;
      gap: 0.5rem;
    }
    .settings-label {
      font-size: 0.78rem;
      font-weight: 600;
      color: var(--text-dim);
      text-transform: uppercase;
      letter-spacing: 0.06em;
    }
    .settings-input-group {
      display: flex;
      align-items: center;
      gap: 0.75rem;
    }
    .settings-input {
      background: var(--surface-2);
      border: 1px solid var(--border);
      border-radius: var(--radius);
      color: var(--text);
      padding: 0.45rem 0.75rem;
      font-family: var(--font-mono);
      font-size: 0.88rem;
      width: 80px;
      outline: none;
    }
    .settings-input:focus { border-color: var(--accent); box-shadow: 0 0 0 3px var(--accent-dim); }
    .settings-input-suffix { font-size: 0.82rem; color: var(--text-dim); }
    .settings-btn {
      background: var(--accent-dim);
      border: 1px solid var(--accent-glow);
      border-radius: var(--radius);
      color: var(--accent);
      font-family: var(--font-body);
      font-size: 0.82rem;
      font-weight: 600;
      padding: 0.45rem 1rem;
      cursor: pointer;
      transition: background 0.15s, border-color 0.15s;
    }
    .settings-btn:hover { background: rgba(34, 211, 238, 0.15); border-color: var(--accent); }
    .settings-btn:disabled { opacity: 0.5; cursor: default; }
    .settings-result { margin-top: 0.75rem; font-size: 0.82rem; }
    .settings-result-ok { color: var(--success); }
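The click handler in step 4 validates the input with parseInt, which accepts strings such as `12abc` and does not enforce the input's max of 365. A stricter check could look like the sketch below; `parseRetentionDays` is a hypothetical helper, and the 1 to 365 range is taken from the input's min/max attributes.

```javascript
// Hypothetical helper: returns a whole number of days within [min, max],
// or null when the input is empty, non-numeric, fractional, or out of range.
function parseRetentionDays(raw, min = 1, max = 365) {
  const text = String(raw).trim();
  if (!/^\d+$/.test(text)) return null;
  const days = Number(text);
  return days >= min && days <= max ? days : null;
}
```

The handler would then show the "Enter a valid number of days" toast whenever the result is `null`.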
    

Commit: feat(web-ui): add Settings page with data retention UI


Task 13: Cost / Usage Analytics Panel

Problem: The dashboard shows top tools and top models in the right panel, but there's no dedicated view for cost trends over time, per-framework costs, or a historical breakdown. The API already has all needed endpoints: /v1/stats/timeseries, /v1/stats/top-tools, /v1/stats/top-models, /v1/stats/summary.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css

Approach: There are two options: a "Usage" tab on the dashboard, or a dedicated /usage page. This plan uses the dedicated /usage page, which is cleaner given the existing nav structure.

Changes:

  1. Add a /usage route in route() and a nav link in index.html:

    <a href="/usage">Usage</a>
    
  2. Add renderUsage() function:

    • Fetch: summary, top-tools (limit 20), top-models (limit 10), timeseries (window=7d)
    • Render sections:
      • Summary bar: Sessions, Runs, Tool Calls, Errors — with "today" vs "7d" comparisons
      • Token & cost overview: total tokens (7d), estimated total cost (7d)
      • Top models: bar chart of usage counts + cost breakdown per model
      • Top tools: ranked list with usage bars
      • Activity chart: reuse the uPlot timeseries chart (stacked runs/tools/errors over 7d)
  3. The top-models endpoint returns { models: [{ model, count, total_cost, total_tokens }] }. Display as a ranked table with inline bar tracks.

  4. The top-tools endpoint returns { tools: [{ name, count }] }. Display as a ranked list.

  5. Add CSS for the usage page layout (reuse existing .stat-card, .fw-bars, .fw-bar-* classes where possible).
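For the ranked lists in steps 3 and 4, each bar's width can be computed relative to the largest count. The sketch below assumes items shaped like the `{ name, count }` and `{ model, count, ... }` entries described above; `withBarWidths` and the `barPct` field are hypothetical.

```javascript
// Hypothetical helper: sorts items by count (descending) and attaches
// barPct, the count as a rounded percentage of the largest count.
// Does not mutate the input array.
function withBarWidths(items) {
  const max = Math.max(0, ...items.map((it) => it.count || 0));
  return [...items]
    .sort((a, b) => (b.count || 0) - (a.count || 0))
    .map((it) => ({ ...it, barPct: max > 0 ? Math.round(((it.count || 0) / max) * 100) : 0 }));
}
```

The render code could then set `style="width:${item.barPct}%"` on each bar fill element.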

Commit: feat(web-ui): add Usage analytics page with token/cost breakdown


Task 14: Span Waterfall / Trace View

Problem: The run detail page shows spans in a flat table (name, kind, status, duration). There's no visualization of how spans overlap in time, which spans are children of others, or the overall execution timeline. For debugging long runs or understanding parallelism, a Gantt-style waterfall is much more useful.

Files: cmd/web-ui/static/app.js, cmd/web-ui/static/style.css

Approach: Add a "Waterfall" toggle button on the run detail page that switches the spans view from the existing table to an SVG/HTML timeline. Keep the table as the default (for backward compatibility); waterfall is opt-in.

Data requirements: Each span needs started_at, ended_at (or duration_ms), name, kind, status. The /v1/runs/:id response already returns spans with this data via data.spans.

Changes:

  1. Add a view toggle in the run detail section title row:

    `<div class="section-title">
      Spans <span class="count" id="run-detail-span-count">${spans.length}</span>
      <div class="view-toggle" style="margin-left:auto">
        <button class="view-toggle-btn active" id="spans-view-table">Table</button>
        <button class="view-toggle-btn" id="spans-view-waterfall">Waterfall</button>
      </div>
    </div>`
    
  2. Implement renderSpanWaterfall(spans, runStartedAt, runDurationMS):

    Compute the run's start time from r.started_at. For each span, compute left% and width% relative to the total run duration:

    function renderSpanWaterfall(spans, runStartedAt, runDurationMS) {
      if (!spans || spans.length === 0) return '<p class="empty-state">No spans</p>';
      const runStart = new Date(runStartedAt).getTime();
      const totalMS = runDurationMS || Math.max(...spans.map(sp => {
        const s = new Date(sp.started_at || runStartedAt).getTime();
        return (s - runStart) + (sp.duration_ms || 0);
      }), 1);
    
      return `
        <div class="waterfall">
          <div class="waterfall-header">
            <div class="waterfall-name-col">Span</div>
            <div class="waterfall-bar-col">
              <div class="waterfall-timescale">${renderTimescale(totalMS)}</div>
            </div>
          </div>
          ${spans.map(sp => {
            const spStart = sp.started_at ? new Date(sp.started_at).getTime() - runStart : 0;
            const spDur = sp.duration_ms || 0;
            const leftPct = (spStart / totalMS * 100).toFixed(2);
            const widthPct = Math.max(0.5, (spDur / totalMS * 100)).toFixed(2);
            const kindClass = sp.kind || 'unknown';
            const statusClass = sp.status === 'error' ? ' wf-error' : sp.status === 'success' ? ' wf-success' : '';
            return `
              <div class="waterfall-row">
                <div class="waterfall-name-col">
                  <span class="span-kind-badge ${kindClass}">${sp.kind || '?'}</span>
                  <span class="waterfall-name" title="${escapeHTML(sp.name || '')}">${escapeHTML((sp.name || '(unnamed)').slice(0, 40))}</span>
                </div>
                <div class="waterfall-bar-col">
                  <div class="waterfall-bar-track">
                    <div class="waterfall-bar${statusClass}" style="left:${leftPct}%;width:${widthPct}%" title="${formatDuration(spDur)}">
                      <span class="waterfall-bar-label">${spDur > totalMS * 0.05 ? formatDuration(spDur) : ''}</span>
                    </div>
                  </div>
                </div>
              </div>`;
          }).join('')}
        </div>`;
    }
    
    function renderTimescale(totalMS) {
      const ticks = 5;
      return Array.from({ length: ticks + 1 }, (_, i) => {
        const pct = (i / ticks * 100).toFixed(0);
        return `<span style="left:${pct}%">${formatDuration(totalMS * i / ticks)}</span>`;
      }).join('');
    }
    
  3. Wire the toggle buttons to swap the spans container between table and waterfall:

    document.getElementById('spans-view-waterfall')?.addEventListener('click', () => {
      document.getElementById('spans-container').innerHTML = renderSpanWaterfall(spans, r.started_at, r.ended_at ? new Date(r.ended_at) - new Date(r.started_at) : null);
      document.getElementById('spans-view-waterfall').classList.add('active');
      document.getElementById('spans-view-table').classList.remove('active');
    });
    
  4. CSS for the waterfall:

    .waterfall {
      overflow-x: auto;
    }
    .waterfall-header,
    .waterfall-row {
      display: grid;
      grid-template-columns: 240px 1fr;
      gap: 0.75rem;
      align-items: center;
      padding: 0.4rem 1.25rem;
      border-bottom: 1px solid var(--border-soft);
    }
    .waterfall-header { background: var(--surface-2); font-size: 0.68rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.08em; color: var(--text-dim); }
    .waterfall-row:hover { background: var(--surface-2); }
    .waterfall-name-col { display: flex; align-items: center; gap: 0.4rem; min-width: 0; }
    .waterfall-name { font-size: 0.8rem; color: var(--text); white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
    .waterfall-bar-col { position: relative; }
    .waterfall-bar-track { position: relative; height: 20px; background: var(--surface-2); border-radius: 3px; }
    .waterfall-bar {
      position: absolute;
      top: 2px;
      height: 16px;
      border-radius: 3px;
      background: var(--accent);
      opacity: 0.7;
      display: flex;
      align-items: center;
      overflow: hidden;
      transition: opacity 0.15s;
    }
    .waterfall-bar:hover { opacity: 1; }
    .waterfall-bar.wf-error { background: var(--error); }
    .waterfall-bar.wf-success { background: var(--success); }
    .waterfall-bar-label { font-family: var(--font-mono); font-size: 0.6rem; padding: 0 4px; color: #fff; white-space: nowrap; }
    .waterfall-timescale { position: relative; height: 16px; }
    .waterfall-timescale span { position: absolute; transform: translateX(-50%); font-family: var(--font-mono); font-size: 0.62rem; color: var(--text-dim); }
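The left/width computation in renderSpanWaterfall does not guard against spans that start before the run or extend past its end, either of which would push a bar outside its track. One way to clamp the geometry is sketched below; `spanGeometry` is a hypothetical helper, and the 0.5% minimum width mirrors the snippet above.

```javascript
// Hypothetical helper: percentage offsets for one waterfall bar.
// Clamps the start into the run window and keeps left + width within 100%.
function spanGeometry(span, runStartMS, totalMS) {
  const total = Math.max(totalMS || 0, 1);
  const startMS = span.started_at ? new Date(span.started_at).getTime() - runStartMS : 0;
  const offset = Math.min(Math.max(startMS, 0), total);
  const duration = Math.max(span.duration_ms || 0, 0);
  // Cap left at 99.5% so the 0.5% minimum width always fits in the track.
  const left = Math.min((offset / total) * 100, 99.5);
  const width = Math.min(Math.max((duration / total) * 100, 0.5), 100 - left);
  return { leftPct: left.toFixed(2), widthPct: width.toFixed(2) };
}
```

The row template would use `leftPct` and `widthPct` in place of the inline calculations.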
    

Commit: feat(web-ui): span waterfall / trace view on run detail page


Task 15: Error Boundary for Render Functions

Problem: An unhandled exception in any render function (e.g., unexpected data shape from the API) leaves #app in a broken or empty state with no recovery path. The user sees a blank page with no indication of what happened.

Files: cmd/web-ui/static/app.js

Changes:

  1. Wrap the render dispatch in route() with a try/catch:

    try {
      if (path === '/') renderDashboard();
      else if (path === '/sessions') renderSessions();
      // ... etc
    } catch (err) {
      console.error('Render error:', err);
      app.innerHTML = `
        <div class="error-boundary">
          <h2>Something went wrong</h2>
          <p>An error occurred while rendering this page.</p>
          <pre class="error-boundary-detail">${escapeHTML(err.message)}</pre>
          <button onclick="navigate('/')">Back to Dashboard</button>
        </div>
      `;
    }
    
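  2. Also wrap each individual page render function's top-level body in a try/catch so errors are attributed to the correct page. This matters most for any render function that is async: the try/catch in route() only catches synchronous throws, so a rejection after an await never reaches it unless route() awaits the call.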

  3. Add CSS for the error boundary:

    .error-boundary {
      padding: 3rem 2rem;
      max-width: 560px;
      margin: 0 auto;
    }
    .error-boundary h2 {
      font-family: var(--font-display);
      font-size: 1.4rem;
      color: var(--error);
      margin-bottom: 0.5rem;
    }
    .error-boundary p { color: var(--text-dim); margin-bottom: 1rem; }
    .error-boundary-detail {
      background: var(--surface-2);
      border: 1px solid var(--border);
      border-radius: var(--radius);
      padding: 0.75rem 1rem;
      font-family: var(--font-mono);
      font-size: 0.78rem;
      color: var(--code-text);
      margin-bottom: 1.25rem;
      white-space: pre-wrap;
      word-break: break-word;
    }
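
The try/catch steps above could be folded into one wrapper that covers both synchronous throws and rejected promises. This is a minimal sketch, not the existing code: `safeRender` and `renderErrorBoundary` are hypothetical names, `mount` stands in for the #app element, and the local `escapeHTML` is only a stand-in for the helper app.js already provides.

```javascript
// Stand-in for the escapeHTML helper that app.js already has (assumed behavior).
function escapeHTML(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical: writes the fallback markup into the mount element.
// The "Back to Dashboard" button is omitted here for brevity.
function renderErrorBoundary(mount, pageName, err) {
  const message = err && err.message ? err.message : String(err);
  mount.innerHTML =
    '<div class="error-boundary">' +
    '<h2>Something went wrong</h2>' +
    '<p>An error occurred while rendering ' + escapeHTML(pageName) + '.</p>' +
    '<pre class="error-boundary-detail">' + escapeHTML(message) + '</pre>' +
    '</div>';
}

// Hypothetical: runs renderFn and shows the fallback on failure.
// Awaiting the result means a rejected promise is caught the same way
// as a synchronous throw. Resolves to true on success, false on failure.
async function safeRender(mount, pageName, renderFn) {
  try {
    await renderFn();
    return true;
  } catch (err) {
    console.error('Render error in ' + pageName + ':', err);
    renderErrorBoundary(mount, pageName, err);
    return false;
  }
}
```

In route(), each branch would then read something like `safeRender(app, 'sessions', renderSessions)`, which also records which page failed.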
    

Commit: feat(web-ui): error boundary with fallback UI for render failures


Task 16: Split app.js into Logical Modules

Problem: app.js is a single ~3700-line IIFE with no internal separation. Everything is tangled: routing, WebSocket management, per-page state, render functions, and utilities. This makes it hard to navigate, test, or extend.

Approach: Split into focused files loaded as ES modules. No bundler required — use native <script type="module"> in index.html.

Files: cmd/web-ui/static/app.js → split into multiple files, cmd/web-ui/static/index.html

Target structure:

static/
├── modules/
│   ├── api.js         — fetch wrapper, showToast
│   ├── ws.js          — WebSocket management (connectWS, subscribeWS, updateWSIndicator)
│   ├── router.js      — route(), navigate(), renderBreadcrumbs(), updateActiveNav()
│   ├── theme.js       — cycleTheme, getTheme, applyTheme, updateToggleBtn
│   ├── palette.js     — command palette (openCommandPalette, closeCommandPalette, etc.)
│   ├── utils.js       — relativeTime, formatDuration, formatBytes, formatCost,
│   │                    formatTokenCount, escapeHTML, copyToClipboard, skeletonRows
│   ├── state.js       — all state declarations (sessionsState, agentsState, etc.)
│   ├── pages/
│   │   ├── dashboard.js
│   │   ├── sessions.js
│   │   ├── session-detail.js
│   │   ├── run-detail.js
│   │   ├── agents.js
│   │   ├── infrastructure.js
│   │   ├── settings.js
│   │   └── usage.js
│   └── infra/
│       ├── openclaw.js    — mergeOpenClawEvents, renderVMCard
│       └── swarm.js       — mergeSwarmSnapshot, renderServiceCard
├── app.js             — thin entry point: imports modules, wires DOMContentLoaded
├── style.css
└── index.html         — <script type="module" src="/static/app.js">

Migration strategy:

  1. Start with the pure utility functions (no dependencies on other app state) — utils.js first.
  2. Then theme.js, api.js, ws.js (minimal dependencies).
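  3. Then state.js (all let state declarations). Imported bindings are read-only in ES modules, so any state that is currently reassigned must be exposed as properties of an exported object or through setter functions.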
  4. Then router.js (depends on state, ws, utils).
  5. Then individual page modules (each depends on api, ws, utils, state).
  6. Finally, app.js becomes the entry point that imports everything and calls route() on load.
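
The ordering above is a topological sort of the module dependency graph, and it can be checked mechanically before starting. The sketch below is illustrative only: the edges for router and pages come from the steps above, while treating utils, theme, api, ws, and state as dependency-free, and the edges listed for app, are assumptions. `extractionOrder` is a hypothetical helper, not part of the codebase.

```javascript
// Planned modules and what each one imports (partly assumed, see lead-in).
const deps = {
  utils: [],
  theme: [],
  api: [],
  ws: [],
  state: [],
  router: ['state', 'ws', 'utils'],
  pages: ['api', 'ws', 'utils', 'state'],
  app: ['router', 'pages', 'theme'],
};

// Returns module names ordered so that each module comes after everything
// it depends on. Throws on a circular import, which would have to be
// broken up before the split can proceed.
function extractionOrder(graph) {
  const order = [];
  const status = new Map(); // name -> 'visiting' | 'done'

  function visit(name, trail) {
    if (status.get(name) === 'done') return;
    if (status.get(name) === 'visiting') {
      throw new Error('Circular import: ' + trail.concat(name).join(' -> '));
    }
    if (!Object.prototype.hasOwnProperty.call(graph, name)) {
      throw new Error('Unknown module: ' + name);
    }
    status.set(name, 'visiting');
    for (const dep of graph[name]) visit(dep, trail.concat(name));
    status.set(name, 'done');
    order.push(name);
  }

  for (const name of Object.keys(graph)) visit(name, []);
  return order;
}

console.log(extractionOrder(deps).join(', '));
```

If a page module ends up importing navigate() from router.js while router.js imports the page's render function, this check reports the cycle. One way out is to have app.js pass the render functions to the router instead of having the router import them.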

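Note: The Go embed.FS serves from static/, so static/modules/*.js files are served without server changes, as long as the //go:embed pattern covers the whole directory rather than only top-level file globs such as static/*.js. Verify the directive before starting.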

index.html change:

<!-- Replace: -->
<script src="/static/app.js"></script>
<!-- With: -->
<script type="module" src="/static/app.js"></script>
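
One consequence of this switch: functions declared inside a module are module-scoped, so inline handlers such as onclick="navigate('/')" can no longer find them unless they are explicitly assigned to window. A minimal sketch of an alternative using event delegation follows. The `data-nav` attribute and `handleNavClick` are hypothetical names, and `navigate` is assumed to be the existing router function.

```javascript
// Hypothetical: elements carry data-nav="/path" instead of an inline onclick.
// Returns the path that was navigated to, or null when the click did not
// land on (or inside) an element with a data-nav attribute.
function handleNavClick(event, navigate) {
  const target = event.target;
  const el = target && typeof target.closest === 'function'
    ? target.closest('[data-nav]')
    : null;
  if (!el) return null;

  event.preventDefault();
  const path = el.getAttribute('data-nav');
  navigate(path);
  return path;
}

// Wiring in the entry point (browser only), shown as a comment so the
// sketch stays runnable outside a browser:
// document.addEventListener('click', (e) => handleNavClick(e, navigate));
```

A single listener on document also covers markup that is injected later through innerHTML, such as the error boundary fallback.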

Commit strategy: Do this in multiple commits — one per module extracted. Start with the ones at the bottom of the dependency tree.

Commits:

refactor(web-ui): extract utility functions to modules/utils.js
refactor(web-ui): extract theme, api, ws to separate modules
refactor(web-ui): extract state declarations to modules/state.js
refactor(web-ui): extract router to modules/router.js
refactor(web-ui): extract dashboard page to modules/pages/dashboard.js
refactor(web-ui): extract sessions pages to modules/pages/sessions*.js
refactor(web-ui): extract agents page to modules/pages/agents.js
refactor(web-ui): extract infrastructure to modules/pages/infrastructure.js + infra/
refactor(web-ui): extract command palette to modules/palette.js
refactor(web-ui): finalize module split, remove monolithic app.js IIFE

Execution Order

Recommended sequence, grouped by dependency and risk:

Phase 1 — Bug Fixes (low risk, do first)

  1. Task 2 — Fix WS unsubscribe variable reuse bug
  2. Task 1 — Fix Ctrl+K label on Linux

Phase 2 — Data Richness (high value, easy wins)

  1. Task 4 — Token usage + cost in run meta tiles
  2. Task 5 — Aggregate tokens + cost in session detail
  3. Task 6 — Error count column in sessions table

Phase 3 — UX Improvements

  1. Task 3 — Stack multiple toasts
  2. Task 9 — Relax + expand global search
  3. Task 11 — Infrastructure manual refresh
  4. Task 8 — Sortable sessions table

Phase 4 — New Features

  1. Task 10 — Agent deep-link routes
  2. Task 12 — Settings/admin page
  3. Task 13 — Cost/usage analytics panel
  4. Task 7 — Pagination indicator (requires backend changes)

Phase 5 — Advanced Features

  1. Task 14 — Span waterfall / trace view
  2. Task 15 — Error boundary

Phase 6 — Code Quality (do last, after features are stable)

  1. Task 16 — Split app.js into modules

Notes

  • No build tooling: All JS is plain ES2020+ with no transpilation or bundling. Tasks 1–15 keep the existing IIFE structure. Task 16 migrates to native ES modules and should only be done after all feature tasks are complete and stable.
  • Backend tasks: Only Task 7 (pagination count) requires a backend change. All other tasks are frontend-only.
  • Testing: Run make test after any Go changes (only Task 7 touches Go code). For frontend changes, manually verify in a browser against a running instance.
  • CSS naming: Follow the existing convention — BEM-like flat class names, no nesting, CSS custom properties for all colors.