agentmon

Author	SHA1	Message	Date
William Valentin	ebc944702f	chore: drop retired orb and sun VMs Only the zap VM remains in the fleet. Remove orb/sun from the README architecture/config docs, the getVMClassName allowlist, and their .timeline-vm-tag color styles. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-27 10:38:04 -07:00
William Valentin	69eb87ebc9	feat(web-ui): improve Agents page legibility and scannability Targeted UI/UX polish on the Agents page, keeping the existing dark aesthetic and both Overview/Live view modes: - Add a readable --text-mute token (dark + light) and apply it to the summary chips, lane meta, and idle/offline status, which previously used the near-invisible --text-dim. - Event feed: demote the generic "Span Started/Completed" label to a quiet mono category tag and promote the tool name, with a left-edge accent by event kind (run/span/error/session). Scoped to #agents-content so other pages' feeds are unaffected. - Active-op pills: add a per-kind left accent bar (tool/subagent/run). - Lane sparkline: raise opacity and add a gradient so it actually reads. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-27 10:35:33 -07:00
William Valentin	478c7529a7	feat(hooks): emit per-run token usage and duration on run.end The stats layer reads usage/duration only from run.end, but neither framework populated them, so tokens/cost/avg-duration were always 0. - hermes: accumulate token usage across each run's api-result calls in session state and attach the summed usage plus a computed duration_ms (from a stored runStartedAt) onto run.end. metric.snapshot emission is unchanged, so there is no double counting. - claude-code: store runStartedAt and use it as a duration_ms fallback at all run.end sites. Usage is unavailable from CC hook inputs. Live verification: a real hermes run now reports duration_ms and total_tokens on run.end; dashboard tokens_today/avg_duration_ms, both previously 0, now populate. cost_today stays 0 (no provider emits cost through the hooks). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 11:16:23 -07:00
William Valentin	5014d89258	feat(metrics): surface tool-span latency in stats and dashboard Tool spans already carry duration_ms and status, but the metrics layer only counted them. Expose that data: - GetTopTools now returns avg/p95 duration and error count per tool. - Timeseries buckets gain tool_avg_ms / tool_p95_ms (filtered percentile_cont over tool spans). - Dashboard Top Tools shows avg latency per tool; the Latency panel, previously always empty (it read run-level duration that is never emitted), now plots real tool-span latency (min/avg/p95). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 11:16:23 -07:00
William Valentin	c44e7fe72e	refactor(web-ui): extract shared component primitives Introduce components.js with barTrack, barRow, barRankList, metricPill, metricStrip, and chartHeader helpers. Migrate dashboard.js and usage.js to use these primitives, replacing 13 families of duplicated CSS (stat-list, fw-bar, token-bar, metric-pill, chart-insight, chart-header, usage-chart-total, etc.) with a unified .am-* namespace. Net: -256 lines. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 12:21:48 -07:00
William Valentin	8753c0c9d5	feat(web-ui): better stats and ergonomics Usage page: add 7-day trend chart (activity/tokens/cost tabs), framework breakdown panel with per-framework run/tool/error counts and proportional bars, and 7d aggregate pills above the chart. Dashboard: add avg cost/run metric pill to the metrics strip. Run detail: extract and display prompt preview from the first agent span's payload above the spans table. Bug fixes: stat-list bars now render correctly (flex-direction:column), right-panel-tab active background uses correct accent color, missing framework colors added for hermes/codex/gemini/copilot. Dead code renderSessionRow removed from sessions.js. Hardcoded font-family replaced with CSS variable in metric-pill-value and token-stat-value. Usage page cleanup() wired into router teardown. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 16:49:05 -07:00
William Valentin	1b01f0b0cd	chore(compose): pin postgres patch image	2026-05-20 22:19:58 -07:00
William Valentin	27d40ce28f	feat(hooks): add Hermes telemetry handler	2026-05-20 17:35:56 -07:00
William Valentin	78376bdd83	feat(query): include session totals and stable framework names	2026-05-20 17:35:56 -07:00
William Valentin	db73eca6fd	chore(infra): pin nats image digest	2026-05-20 17:35:56 -07:00
William Valentin	f8bec2d6d5	fix: ignore non-persistent claude startups	2026-04-30 17:07:19 -07:00
William Valentin	476c0e347f	fix: count only live dashboard sessions	2026-04-30 17:07:17 -07:00
William Valentin	fd17628e94	fix: ignore invalid claude hook starts	2026-04-29 09:41:07 -07:00
William Valentin	6799cc3681	docs: add run detail improvements design Covers three improvements to /runs/:id: prompt/error header callouts, client-side span filter/search, and an interactive waterfall with hover tooltips and inline detail drawers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 13:35:20 -07:00
William Valentin	184aa5e6cb	fix(web-ui): security hardening, SPA nav, and modularization Ship the in-progress ES-module refactor of the web-ui (new static/modules/ layout, Usage/Settings pages, uplot-based dashboard) alongside a round of security and UX fixes: - main.go: add CSP + X-Frame-Options: DENY + X-Content-Type-Options: nosniff + Referrer-Policy middleware on every response; WS CheckOrigin now requires Origin host to match Host (blocks cross-site WebSocket hijacking); upgrade client before dialing upstream so origin check runs first; fatal on unparseable AGENTMON_QUERY_BASE. - app.js: delegated click handler intercepts same-origin <a> clicks for SPA navigation (prev. every nav link caused a full page reload, dropping WS + in-memory state); delegated .copy-btn[data-copy] handler replaces inline onclick=; removed window.navigate / window.copyToClipboard globals and the duplicated handleGlobalSearch. - modules/nav-signal.js: per-route AbortController so in-flight fetches are cancelled when the user navigates away, preventing stale toasts and wasted renders. - modules/api.js: honours the nav signal by default; AbortError is silent. - modules/router.js: resets the nav controller on every route; dropped the fixed 80ms transition delay; breadcrumbs no longer emit inline onclick= (delegated handler picks them up). - modules/utils.js: renderCopyButton emits data-copy="..." instead of nesting a JS string inside an HTML attribute — fixes an XSS where values containing ' broke out via &#39; decoding. Verified: go build clean; `node --check` clean on all modified modules; manual curl probes confirm security headers present on every response and WS upgrade returns 403 for cross-origin/missing Origin while 101 for same-origin. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-23 15:36:12 -07:00
William Valentin	41b7165800	fix(store): backfill spans in run detail	2026-04-21 13:07:09 -07:00
William Valentin	43113f6241	feat(web-ui): improve navigation and session UX	2026-04-21 13:07:05 -07:00
William Valentin	8f766b4019	chore(gitignore): ignore build directory	2026-04-21 13:07:01 -07:00
William Valentin	d5154b8eec	fix(codex): recover session lifecycle from hooks	2026-04-21 13:02:58 -07:00
William Valentin	8b6ce8e628	Add restart policy to docker-compose services	2026-03-27 20:47:24 -07:00
William Valentin	c53283ac07	feat: improve web UI UX with global search, breadcrumbs, and better feedback	2026-03-26 14:24:52 -07:00
William Valentin	8bca99573b	feat(web-ui): redesign dashboard and live sessions	2026-03-26 11:22:49 -07:00
William Valentin	5ff4794d98	feat(openclaw-monitor): add MinIO telemetry	2026-03-26 11:22:45 -07:00
William Valentin	6605780b58	feat(ingest): batch event writes and harden transport	2026-03-26 11:22:42 -07:00
William Valentin	43877a5448	feat(query-api): add richer stats and retention	2026-03-26 11:22:34 -07:00
William Valentin	fdfcb50e80	feat(hooks): consolidate shared transport helpers	2026-03-26 11:22:27 -07:00
William Valentin	d49785cb25	fix: filter dashboard activity feed events	2026-03-20 14:05:59 -07:00
William Valentin	687a7aa79d	Add live agent views and improve Codex monitoring	2026-03-20 13:59:51 -07:00
William Valentin	a87bbc6983	fix(claude-hook): derive span durations from start timestamps	2026-03-20 11:17:40 -07:00
William Valentin	d235e3c873	feat(hooks): add telemetry handlers for codex/copilot/gemini	2026-03-20 11:17:26 -07:00
William Valentin	c88746693a	docs(plans): add dashboard and realtime agent plans	2026-03-20 11:17:17 -07:00
William Valentin	2e277fb138	fix: preserve session state across turns in claude-code hook handler handleNotification("Done") was incorrectly emitting session.end and calling clearState at the end of each Claude turn. Since "Done" means a turn finished (not the session), clearing state caused subsequent tool calls to find no runId, storing spans without run_id and making them invisible in run-level queries. - handleNotification: remove session.end emission and clearState call; only emit run.end for the completed turn - handleSessionEnd: load state file to get runId (in-memory activeRuns is always empty in a subprocess) - handlePromptSubmit: load state file to get runId for ending previous run before starting a new one Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 23:42:22 -07:00
William Valentin	f8ddea3698	feat: add agentmon services section to infrastructure page Label all agentmon docker-compose services with agentmon.monitor=true and agentmon.group=agentmon so the swarm-monitor picks them up. Adds Group field to ServiceSnapshot, probes /healthz for api/web roles, and renders a separate "Agentmon" section below Swarm Services on the Infrastructure page with new api and worker card renderers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 13:41:26 -07:00
William Valentin	d2d044a3d8	fix: use Docker socket HTTP API in swarm collector, no CLI dependency Replace exec.CommandContext calls (docker ps, docker inspect, nc -z) with direct HTTP calls over the Unix socket using Go's net/http + custom transport. Also removes netcat-openbsd from Dockerfile since nc is no longer used. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 10:36:32 -07:00
William Valentin	f48953781b	fix: add swarm-monitor binary and netcat to Dockerfile	2026-03-18 10:31:28 -07:00
William Valentin	edaa7bac45	feat: add swarm-monitor service to docker-compose	2026-03-18 10:29:40 -07:00
William Valentin	1b3c74b441	fix: add /infrastructure to SPA catch-all routes	2026-03-18 10:27:06 -07:00
William Valentin	cd2f345454	feat: rename OpenClaw to Infrastructure page, add service cards Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 10:20:28 -07:00
William Valentin	93edd39a2b	feat: add infrastructure page CSS	2026-03-18 10:16:50 -07:00
William Valentin	07c16653cd	feat: add swarm strip to dashboard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 10:14:48 -07:00
William Valentin	7c043b78a4	feat: add swarm-monitor binary	2026-03-18 10:12:18 -07:00
William Valentin	9c2f048b92	feat: add swarm collector with docker inspect + HTTP probes	2026-03-18 10:10:34 -07:00
William Valentin	083e522bb7	feat: add swarm monitor types	2026-03-18 10:08:54 -07:00
William Valentin	22bc16bf51	docs: swarm monitor implementation plan	2026-03-18 09:57:51 -07:00
William Valentin	ecabc7fd19	docs: swarm monitor design — infra page, docker labels, role-driven cards	2026-03-18 09:53:39 -07:00
William Valentin	e7be607db4	feat: extend agentmon hook with agent:bootstrap for embedded/cron runs - Add agent:bootstrap handler to capture run.start events for cron and automation runs that bypass the message:received path - Remove dead event subscriptions (tool_result_persist, session:compact:*) which are plugin hook events and never fire through triggerInternalHook - Remove AGENTMON_INGEST_URL from requires.env since handler has a hardcoded fallback URL - Drop activeCompactions map (no longer needed after removing compaction handlers) Deployed to zap VM with hooks.internal.enabled=true in openclaw.json. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-15 17:32:32 -07:00
William Valentin	13356adfbd	feat: openclaw card dividers, running pulse, issue label section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-14 12:16:30 -07:00
William Valentin	acd89e95a9	feat: stat card top accents, timeline time hierarchy	2026-03-14 12:14:15 -07:00
William Valentin	5dbfd68fb5	feat: meta tiles, back link button, css chevron, span-details bg fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-14 12:11:55 -07:00
William Valentin	eb12319f19	feat: framework color dots in sessions table, filter toolbar panel Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-14 12:05:40 -07:00

75 Commits