agentmon/docs/plans/2026-03-13-agent-monitoring-plan.md
William Valentin 3434db3c59 feat: complete agent monitoring - hook, UI, and backend filter
- Add event_type and framework filters to events query endpoint
- Add /agents SPA route to web-ui server
- Add Agents nav link and route in frontend
- Add agents page CSS (timeline, VM pills, stats panel)
- Build VM status strip, activity timeline, and real-time stats
- Add agentmon hook for OpenClaw (HOOK.md + handler.ts)
- Add docker-compose, Dockerfile, and supporting infra files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 00:26:42 -07:00


# Agent Activity Monitoring Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Monitor all OpenClaw agent activity (tool calls, messages, sessions, errors) across three VMs and display it in a real-time dashboard.
**Architecture:** An OpenClaw hook (TypeScript) on each VM captures agent events and POSTs them as agentmon envelopes to the ingest gateway. A new `/agents` page in the web UI renders a live activity timeline via WebSocket. A small backend addition filters events by framework.
**Tech Stack:** TypeScript (hook), Go (backend filter), Vanilla JS/CSS (UI)
**Design doc:** `docs/plans/2026-03-13-agent-monitoring-design.md`
---
## Task 1: Add framework filter to events query
**Files:**
- Modify: `internal/store/postgres/query.go`
- Modify: `cmd/query-api/main.go`
The agents UI needs to load recent events filtered by `source_framework = 'openclaw'`. The existing `ListRecentEvents` only takes `limit`.
**Step 1: Add EventsFilter struct and update ListRecentEvents**
In `internal/store/postgres/query.go`, replace the current function:
```go
type EventsFilter struct {
	Limit     int
	EventType string
	Framework string
}

func (d *DB) ListRecentEvents(ctx context.Context, f EventsFilter) ([]EventRow, error) {
	if f.Limit <= 0 {
		f.Limit = 100
	}
	if f.Limit > 1000 {
		f.Limit = 1000
	}

	query := "SELECT event_id, ts, type, payload FROM events WHERE 1=1"
	args := []any{}
	argN := 1
	if f.EventType != "" {
		query += fmt.Sprintf(" AND type = $%d", argN)
		args = append(args, f.EventType)
		argN++
	}
	if f.Framework != "" {
		query += fmt.Sprintf(" AND source_framework = $%d", argN)
		args = append(args, f.Framework)
		argN++
	}
	query += fmt.Sprintf(" ORDER BY ts DESC LIMIT $%d", argN)
	args = append(args, f.Limit)

	rows, err := d.sql.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []EventRow
	for rows.Next() {
		var r EventRow
		if err := rows.Scan(&r.EventID, &r.TS, &r.Type, &r.Payload); err != nil {
			return nil, err
		}
		out = append(out, r)
	}
	return out, rows.Err()
}
```
Add `"fmt"` to the import block.
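To make the placeholder numbering easier to check, here is a minimal TypeScript port of the same query-building logic. It is an illustration only and not part of the codebase; `buildEventsQuery` and the lower-camel-case field names are hypothetical. It shows that each optional filter takes the next `$n` slot and that the limit always takes the last one.

```typescript
// Illustrative port of the Go query builder above; names are hypothetical.
interface EventsFilter {
  limit: number;
  eventType?: string;
  framework?: string;
}

function buildEventsQuery(f: EventsFilter): { sql: string; args: (string | number)[] } {
  // Same clamping as the Go version: default 100, cap 1000.
  let limit = f.limit;
  if (limit <= 0) limit = 100;
  if (limit > 1000) limit = 1000;

  let sql = "SELECT event_id, ts, type, payload FROM events WHERE 1=1";
  const args: (string | number)[] = [];

  // Each filter pushes its value first, so args.length is its placeholder index.
  if (f.eventType) {
    args.push(f.eventType);
    sql += ` AND type = $${args.length}`;
  }
  if (f.framework) {
    args.push(f.framework);
    sql += ` AND source_framework = $${args.length}`;
  }
  args.push(limit);
  sql += ` ORDER BY ts DESC LIMIT $${args.length}`;
  return { sql, args };
}

const demoQuery = buildEventsQuery({ limit: 5, framework: "openclaw" });
console.log(demoQuery.sql);
console.log(demoQuery.args);
```

With only `framework` set, the framework value is bound to `$1` and the limit to `$2`; with both filters set, the limit moves to `$3`.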
**Step 2: Update the query-api handler to pass filters**
In `cmd/query-api/main.go`, update the `/v1/events` handler (around line 113):
```go
r.Get("/v1/events", func(w http.ResponseWriter, r *http.Request) {
	limit, _ := strconv.Atoi(r.URL.Query().Get("limit"))
	f := postgres.EventsFilter{
		Limit:     limit,
		EventType: r.URL.Query().Get("event_type"),
		Framework: r.URL.Query().Get("framework"),
	}
	events, err := db.ListRecentEvents(r.Context(), f)
	if err != nil {
		httpx.WriteJSON(w, http.StatusInternalServerError, map[string]any{"error": "db_error"})
		return
	}
	httpx.WriteJSON(w, http.StatusOK, map[string]any{"events": events})
})
```
**Step 3: Verify it compiles**
Run: `cd /home/will/lab/agentmon && go build ./...`
Expected: No errors
**Step 4: Test the filter with curl**
Run: `curl -s 'http://localhost:8081/v1/events?framework=openclaw&limit=5'`
Expected: `{"events":null}` or `{"events":[]}` (no openclaw events yet, but no errors)
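On the client side, the same three query parameters (`limit`, `event_type`, `framework`) can be assembled with `URLSearchParams` instead of string concatenation. The helper below is a sketch; `eventsPath` and the `EventsQuery` type are hypothetical names and do not exist in `app.js`.

```typescript
// Hypothetical helper: builds the /v1/events path from optional filters.
interface EventsQuery {
  eventType?: string;
  framework?: string;
  limit?: number;
}

function eventsPath(query: EventsQuery): string {
  const params = new URLSearchParams();
  // Parameter names match what the query-api handler reads.
  if (query.eventType) params.set("event_type", query.eventType);
  if (query.framework) params.set("framework", query.framework);
  if (query.limit !== undefined) params.set("limit", String(query.limit));
  const qs = params.toString();
  return qs ? `/v1/events?${qs}` : "/v1/events";
}

console.log(eventsPath({ framework: "openclaw", limit: 5 }));
console.log(eventsPath({ eventType: "openclaw.snapshot", limit: 100 }));
```

The first call produces the same path as the curl command in this step.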
**Step 5: Commit**
```bash
git add internal/store/postgres/query.go cmd/query-api/main.go
git commit -m "feat: add event_type and framework filters to events endpoint"
```
---
## Task 2: Add /agents SPA route to Go server
**Files:**
- Modify: `cmd/web-ui/main.go:51`
**Step 1: Add /agents to the SPA catch-all**
In `cmd/web-ui/main.go`, update line 51 to include `/agents`:
Change:
```go
if r.URL.Path == "/" || strings.HasPrefix(r.URL.Path, "/sessions") || strings.HasPrefix(r.URL.Path, "/runs") || strings.HasPrefix(r.URL.Path, "/openclaw") {
```
To:
```go
if r.URL.Path == "/" || strings.HasPrefix(r.URL.Path, "/sessions") || strings.HasPrefix(r.URL.Path, "/runs") || strings.HasPrefix(r.URL.Path, "/openclaw") || strings.HasPrefix(r.URL.Path, "/agents") {
```
**Step 2: Verify it compiles**
Run: `cd /home/will/lab/agentmon && go build ./cmd/web-ui/`
Expected: No errors
**Step 3: Commit**
```bash
git add cmd/web-ui/main.go
git commit -m "feat: add /agents SPA route to web-ui server"
```
---
## Task 3: Add Agents nav link and route in app.js
**Files:**
- Modify: `cmd/web-ui/static/index.html`
- Modify: `cmd/web-ui/static/app.js`
**Step 1: Add "Agents" nav link in index.html**
In `cmd/web-ui/static/index.html`, update the `<nav>` to add an Agents link before OpenClaw:
```html
<nav><a href="/agents">Agents</a><a href="/openclaw">OpenClaw</a></nav>
```
**Step 2: Add /agents route to the SPA router**
In `cmd/web-ui/static/app.js`, update the `route()` function (around line 58). Add the agents route before the openclaw check:
```javascript
function route() {
  const path = window.location.pathname;
  if (path === '/' || path === '/sessions') {
    renderSessions();
  } else if (path.startsWith('/agents')) {
    renderAgents();
  } else if (path.startsWith('/openclaw')) {
    renderOpenClaw();
  } else if (path.startsWith('/sessions/')) {
    const sessionID = path.split('/sessions/')[1];
    renderSession(sessionID);
  } else if (path.startsWith('/runs/')) {
    const runID = path.split('/runs/')[1];
    renderRun(runID);
  } else {
    app.innerHTML = '<p>Page not found</p>';
  }
}
```
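The branch order above can be checked without a browser by separating the matching from the rendering. The following sketch assumes nothing beyond the paths listed in `route()`; `resolveRoute` and the `ResolvedRoute` type are hypothetical and only mirror the conditions for illustration.

```typescript
// Hypothetical pure version of the route() matching, for illustration only.
type ResolvedRoute = { page: string; id?: string };

function resolveRoute(path: string): ResolvedRoute {
  if (path === "/" || path === "/sessions") return { page: "sessions" };
  if (path.startsWith("/agents")) return { page: "agents" };
  if (path.startsWith("/openclaw")) return { page: "openclaw" };
  if (path.startsWith("/sessions/")) {
    return { page: "session", id: path.slice("/sessions/".length) };
  }
  if (path.startsWith("/runs/")) {
    return { page: "run", id: path.slice("/runs/".length) };
  }
  return { page: "not-found" };
}

for (const p of ["/", "/agents", "/sessions/abc", "/runs/r1", "/nope"]) {
  console.log(p, JSON.stringify(resolveRoute(p)));
}
```

Because `/agents` is matched by prefix, any sub-path such as `/agents/zap` would also resolve to the agents page.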
**Step 3: Add a stub renderAgents function**
Add this above the `// Start` comment at the bottom of `app.js`:
```javascript
// Agents page
let agentsState = { events: [], vmStatus: {} };
let agentsUnsubscribe = null;

async function renderAgents() {
  app.innerHTML = `
    <div class="page-header"><h2>Agents</h2></div>
    <p class="empty-state">Loading agent activity...</p>
  `;
}
```
**Step 4: Verify in browser**
Open `http://localhost:8082/agents` — should show "Loading agent activity..."
**Step 5: Commit**
```bash
git add cmd/web-ui/static/index.html cmd/web-ui/static/app.js
git commit -m "feat: add /agents route stub and nav link"
```
---
## Task 4: Agents page CSS
**Files:**
- Modify: `cmd/web-ui/static/style.css`
**Step 1: Add all agents page styles**
Append to the end of `style.css`:
```css
/* ── Agents Page ───────────────────────────────────────── */
.agents-layout {
display: grid;
grid-template-columns: 1fr 280px;
gap: 1.5rem;
margin-top: 1.25rem;
}
@media (max-width: 900px) {
.agents-layout { grid-template-columns: 1fr; }
}
/* VM Status Strip */
.vm-strip {
display: flex;
gap: 0.75rem;
margin-bottom: 1.5rem;
}
.vm-pill {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.5rem 1rem;
background: var(--surface);
border: 1px solid var(--border);
border-radius: 999px;
font-size: 0.78rem;
font-weight: 600;
letter-spacing: 0.04em;
transition: border-color 0.2s;
}
.vm-pill.active {
border-color: rgba(52, 211, 153, 0.3);
}
.vm-pill.inactive {
border-color: rgba(248, 113, 113, 0.2);
opacity: 0.6;
}
.vm-pill-dot {
width: 7px;
height: 7px;
border-radius: 50%;
flex-shrink: 0;
}
.vm-pill.active .vm-pill-dot {
background: var(--success);
box-shadow: 0 0 6px rgba(52, 211, 153, 0.5);
animation: livePulse 2s ease-in-out infinite;
}
.vm-pill.inactive .vm-pill-dot {
background: var(--error);
}
.vm-pill-name {
font-family: var(--font-mono);
color: var(--text-bright);
}
.vm-pill-label {
color: var(--text-dim);
font-size: 0.68rem;
text-transform: uppercase;
letter-spacing: 0.06em;
}
/* Activity Timeline */
.timeline {
display: flex;
flex-direction: column;
gap: 0.5rem;
}
.timeline-event {
background: var(--card);
border: 1px solid var(--border);
border-radius: var(--radius-lg);
padding: 0.875rem 1.125rem;
backdrop-filter: blur(8px);
-webkit-backdrop-filter: blur(8px);
animation: fadeUp 0.25s ease both;
transition: border-color 0.15s;
}
.timeline-event:hover {
border-color: rgba(34, 211, 238, 0.15);
}
.timeline-event-header {
display: flex;
align-items: center;
gap: 0.6rem;
margin-bottom: 0.35rem;
}
.timeline-vm-tag {
font-family: var(--font-mono);
font-size: 0.68rem;
font-weight: 700;
padding: 0.15rem 0.5rem;
border-radius: 4px;
letter-spacing: 0.05em;
text-transform: uppercase;
}
.timeline-vm-tag.zap {
background: rgba(34, 211, 238, 0.12);
color: var(--accent);
border: 1px solid rgba(34, 211, 238, 0.2);
}
.timeline-vm-tag.orb {
background: rgba(167, 139, 250, 0.12);
color: var(--purple);
border: 1px solid rgba(167, 139, 250, 0.2);
}
.timeline-vm-tag.sun {
background: rgba(251, 191, 36, 0.12);
color: var(--warning);
border: 1px solid rgba(251, 191, 36, 0.2);
}
.timeline-event-type {
font-size: 0.75rem;
font-weight: 600;
color: var(--text-bright);
}
.timeline-event-time {
font-family: var(--font-mono);
font-size: 0.68rem;
color: var(--text-dim);
margin-left: auto;
}
.timeline-event-body {
font-size: 0.82rem;
color: var(--text);
line-height: 1.5;
padding-left: 0.15rem;
}
.timeline-event-body.tool-name {
font-family: var(--font-mono);
color: var(--accent);
font-size: 0.78rem;
}
.timeline-event-body.message-preview {
color: var(--text-dim);
font-style: italic;
}
.timeline-event-body.error-message {
color: var(--error);
}
.timeline-duration {
font-family: var(--font-mono);
font-size: 0.72rem;
color: var(--text-dim);
margin-left: 0.5rem;
}
.timeline-detail {
margin-top: 0.5rem;
padding: 0.75rem;
background: #020508;
border-radius: var(--radius);
font-family: var(--font-mono);
font-size: 0.75rem;
color: #7a9ab5;
white-space: pre-wrap;
word-break: break-all;
line-height: 1.65;
display: none;
}
.timeline-event.expanded .timeline-detail {
display: block;
}
.timeline-expand-hint {
font-size: 0.68rem;
color: var(--text-dim);
cursor: pointer;
margin-top: 0.3rem;
letter-spacing: 0.03em;
}
.timeline-expand-hint:hover {
color: var(--accent);
}
/* Stats Panel */
.stats-panel {
display: flex;
flex-direction: column;
gap: 1rem;
}
.stat-card {
background: var(--surface);
border: 1px solid var(--border);
border-radius: var(--radius-lg);
padding: 1rem;
}
.stat-card-title {
font-size: 0.68rem;
font-weight: 700;
color: var(--text-dim);
text-transform: uppercase;
letter-spacing: 0.1em;
margin-bottom: 0.6rem;
}
.stat-card-value {
font-family: var(--font-display);
font-size: 1.6rem;
font-weight: 800;
color: var(--text-bright);
letter-spacing: -0.02em;
}
.stat-card-sub {
font-size: 0.72rem;
color: var(--text-dim);
margin-top: 0.1rem;
}
.stat-list {
list-style: none;
}
.stat-list li {
display: flex;
justify-content: space-between;
align-items: center;
padding: 0.35rem 0;
border-bottom: 1px solid var(--border-soft);
font-size: 0.8rem;
}
.stat-list li:last-child { border-bottom: none; }
.stat-list-name {
font-family: var(--font-mono);
font-size: 0.75rem;
color: var(--text);
}
.stat-list-count {
font-family: var(--font-mono);
font-size: 0.72rem;
color: var(--text-dim);
background: var(--surface-2);
padding: 0.1rem 0.4rem;
border-radius: 4px;
}
/* Event type icons */
.event-icon {
width: 18px;
height: 18px;
border-radius: 4px;
display: flex;
align-items: center;
justify-content: center;
font-size: 0.6rem;
flex-shrink: 0;
}
.event-icon.message-in {
background: rgba(52, 211, 153, 0.12);
color: var(--success);
border: 1px solid rgba(52, 211, 153, 0.25);
}
.event-icon.message-out {
background: rgba(34, 211, 238, 0.12);
color: var(--accent);
border: 1px solid rgba(34, 211, 238, 0.25);
}
.event-icon.tool {
background: rgba(167, 139, 250, 0.12);
color: var(--purple);
border: 1px solid rgba(167, 139, 250, 0.25);
}
.event-icon.error {
background: rgba(248, 113, 113, 0.12);
color: var(--error);
border: 1px solid rgba(248, 113, 113, 0.25);
}
.event-icon.session {
background: rgba(251, 191, 36, 0.12);
color: var(--warning);
border: 1px solid rgba(251, 191, 36, 0.25);
}
.event-icon.internal {
background: var(--surface-2);
color: var(--text-dim);
border: 1px solid var(--border);
}
```
**Step 2: Commit**
```bash
git add cmd/web-ui/static/style.css
git commit -m "feat: add agents page CSS with timeline, stats panel, vm pills"
```
---
## Task 5: Build the VM status strip
**Files:**
- Modify: `cmd/web-ui/static/app.js`
**Step 1: Update renderAgents to show VM status strip**
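Replace the whole stub from Task 3 (the `agentsState` and `agentsUnsubscribe` declarations as well as `renderAgents`) with the block below. Leaving the old `let` declarations in place would cause a duplicate-declaration `SyntaxError`.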
```javascript
// Agents page
let agentsState = { events: [], vmStatus: {}, stats: { messages: 0, tools: 0, errors: 0, toolCounts: {} } };
let agentsUnsubscribe = null;
function getVMStatus() {
// Reuse openclaw instance data if available
const vms = ['zap', 'orb', 'sun'];
return vms.map(name => {
const inst = openclawState.instances[name];
if (!inst) return { name, active: false };
const payload = inst.payload?.payload || inst.payload || {};
const host = payload.host || {};
return { name, active: host.state === 'running' };
});
}
async function renderAgents() {
// Load openclaw status for VM pills
try {
const data = await api('/v1/events?event_type=openclaw.snapshot&limit=100');
updateOpenClawState(data.events || []);
} catch (e) { /* ignore */ }
const vms = getVMStatus();
app.innerHTML = `
<div class="page-header">
<h2>Agents <span class="live-indicator"><span class="live-dot"></span>Live</span></h2>
</div>
<div class="vm-strip">
${vms.map(vm => `
<div class="vm-pill ${vm.active ? 'active' : 'inactive'}">
<span class="vm-pill-dot"></span>
<span class="vm-pill-name">${vm.name}</span>
<span class="vm-pill-label">${vm.active ? 'online' : 'offline'}</span>
</div>
`).join('')}
</div>
<div class="agents-layout">
<div class="timeline" id="agents-timeline">
<p class="empty-state">Waiting for agent activity...</p>
</div>
<div class="stats-panel" id="agents-stats">
<div class="stat-card">
<div class="stat-card-title">Messages</div>
<div class="stat-card-value" id="stat-messages">0</div>
<div class="stat-card-sub">received &amp; sent</div>
</div>
<div class="stat-card">
<div class="stat-card-title">Tool Calls</div>
<div class="stat-card-value" id="stat-tools">0</div>
</div>
<div class="stat-card">
<div class="stat-card-title">Errors</div>
<div class="stat-card-value" id="stat-errors">0</div>
</div>
<div class="stat-card">
<div class="stat-card-title">Top Tools</div>
<ul class="stat-list" id="stat-top-tools">
<li style="color:var(--text-dim);font-size:0.8rem">No data yet</li>
</ul>
</div>
</div>
</div>
`;
// Load initial events and subscribe to WebSocket
await loadAgentEvents();
if (agentsUnsubscribe) agentsUnsubscribe();
agentsUnsubscribe = subscribeWS(handleAgentsWS);
}
```
**Step 2: Verify in browser**
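Open `http://localhost:8082/agents`; it should show the VM pills and the empty timeline with stats cards. `loadAgentEvents` and `handleAgentsWS` are not defined until Task 6, so a `ReferenceError` in the browser console is expected at this point.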
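The online/offline decision in `getVMStatus` can be exercised in isolation. The sketch below is a typed, pure version of that logic with sample data; the `VMInstance` shape is an assumption inferred from the fields the function reads (`payload.payload.host.state` or `payload.host.state`), and `vmStatus` is a hypothetical name.

```typescript
// Assumed snapshot shapes, inferred from the fields getVMStatus() reads.
type HostInfo = { state?: string };
type SnapshotBody = { host?: HostInfo };
type VMInstance = { payload?: SnapshotBody & { payload?: SnapshotBody } };

function vmStatus(
  names: string[],
  instances: Record<string, VMInstance | undefined>
): { name: string; active: boolean }[] {
  return names.map((name) => {
    const inst = instances[name];
    // Prefer the nested payload, fall back to the outer one, then to an empty body.
    const body: SnapshotBody = inst?.payload?.payload ?? inst?.payload ?? {};
    return { name, active: body.host?.state === "running" };
  });
}

const sampleInstances: Record<string, VMInstance> = {
  zap: { payload: { payload: { host: { state: "running" } } } },
  orb: { payload: { host: { state: "shut off" } } },
  // "sun" has no snapshot yet, so it is reported as inactive.
};

console.log(JSON.stringify(vmStatus(["zap", "orb", "sun"], sampleInstances)));
```

A VM with no snapshot at all is treated the same as one whose host state is not `running`, which is why the pill shows "offline" in both cases.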
**Step 3: Commit**
```bash
git add cmd/web-ui/static/app.js
git commit -m "feat: agents page layout with VM strip, timeline, and stats panel"
```
---
## Task 6: Build the activity timeline and stats
**Files:**
- Modify: `cmd/web-ui/static/app.js`
**Step 1: Add loadAgentEvents, handleAgentsWS, and rendering functions**
Add these functions after `renderAgents`:
```javascript
async function loadAgentEvents() {
try {
const data = await api('/v1/events?framework=openclaw&limit=200');
const events = (data.events || []).reverse();
agentsState.events = events;
agentsState.stats = { messages: 0, tools: 0, errors: 0, toolCounts: {} };
events.forEach(e => updateAgentStats(e));
renderAgentTimeline();
renderAgentStats();
} catch (e) {
console.error('Failed to load agent events:', e);
}
}
function handleAgentsWS(msg) {
if (msg.type !== 'message') return;
try {
const raw = typeof msg.data === 'string' ? JSON.parse(msg.data) : msg.data;
// Check if this is an openclaw agent event (framework = openclaw, but not openclaw.snapshot)
const eventType = raw.event?.type || raw.payload?.event?.type || raw.type;
const framework = raw.event?.source?.framework || raw.payload?.event?.source?.framework || raw.source_framework;
if (framework !== 'openclaw' || eventType === 'openclaw.snapshot') return;
agentsState.events.push(raw);
// Keep max 500 events in memory
if (agentsState.events.length > 500) agentsState.events.shift();
updateAgentStats(raw);
renderAgentTimeline();
renderAgentStats();
} catch (e) { /* ignore parse errors */ }
}
function updateAgentStats(evt) {
const payload = evt.payload || {};
const eventType = evt.type || payload.event?.type || evt.event?.type;
const attrs = evt.attributes || payload.attributes || {};
if (eventType === 'run.start' || eventType === 'run.end') {
agentsState.stats.messages++;
} else if (eventType === 'span.end') {
// The hook also emits span.end for context compaction (span_kind 'internal'); count only tool spans
if (attrs.span_kind === 'internal') return;
agentsState.stats.tools++;
const toolName = attrs.name || 'unknown';
agentsState.stats.toolCounts[toolName] = (agentsState.stats.toolCounts[toolName] || 0) + 1;
} else if (eventType === 'error') {
agentsState.stats.errors++;
}
}
function getEventIcon(eventType) {
switch (eventType) {
case 'run.start': return '<div class="event-icon message-in">&#x2193;</div>';
case 'run.end': return '<div class="event-icon message-out">&#x2191;</div>';
case 'span.start':
case 'span.end': return '<div class="event-icon tool">&#x2699;</div>';
case 'error': return '<div class="event-icon error">!</div>';
case 'session.start':
case 'session.end': return '<div class="event-icon session">&#x25CB;</div>';
default: return '<div class="event-icon internal">&#xB7;</div>';
}
}
function getEventLabel(eventType) {
const labels = {
'session.start': 'Session Started',
'session.end': 'Session Ended',
'run.start': 'Message Received',
'run.end': 'Response Sent',
'span.start': 'Tool Started',
'span.end': 'Tool Completed',
'error': 'Error',
'metric.snapshot': 'Metric',
};
return labels[eventType] || eventType;
}
function getVMName(evt) {
const payload = evt.payload || {};
return evt.client_id || payload.event?.source?.client_id || payload.source?.client_id || evt.event?.source?.client_id || 'unknown';
}
// Escape untrusted event fields (message previews, tool names, errors) before they go into innerHTML
function escapeAgentHTML(value) {
return String(value ?? '')
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#39;');
}
function getEventBody(evt) {
const eventType = evt.type || evt.payload?.event?.type || evt.event?.type;
const payload = evt.payload?.payload || evt.payload || {};
const attrs = evt.attributes || evt.payload?.attributes || {};
if (eventType === 'span.end' || eventType === 'span.start') {
const name = escapeAgentHTML(attrs.name || 'unknown tool');
const dur = payload.duration_ms ? ` <span class="timeline-duration">${formatDuration(payload.duration_ms)}</span>` : '';
return `<div class="timeline-event-body tool-name">${name}${dur}</div>`;
}
if (eventType === 'run.start') {
const preview = String(payload.message_preview || '');
// Truncate first, then escape, so an entity is never cut in half
const shown = escapeAgentHTML(preview.substring(0, 120)) + (preview.length > 120 ? '...' : '');
return preview ? `<div class="timeline-event-body message-preview">"${shown}"</div>` : '';
}
if (eventType === 'run.end') {
const status = payload.status || 'unknown';
const error = payload.error;
if (error) return `<div class="timeline-event-body error-message">${escapeAgentHTML(error)}</div>`;
return `<div class="timeline-event-body">${statusIcon(status)}</div>`;
}
if (eventType === 'error') {
const errPayload = payload.error || {};
return `<div class="timeline-event-body error-message">${escapeAgentHTML(errPayload.type || 'error')}: ${escapeAgentHTML(errPayload.message || 'unknown')}</div>`;
}
return '';
}
function renderAgentTimeline() {
const timeline = document.getElementById('agents-timeline');
if (!timeline) return;
// Show most recent events first (last N)
const recent = agentsState.events.slice(-100).reverse();
if (recent.length === 0) {
timeline.innerHTML = '<p class="empty-state">Waiting for agent activity...</p>';
return;
}
timeline.innerHTML = recent.map((evt, i) => {
const eventType = evt.type || evt.payload?.event?.type || evt.event?.type;
const ts = evt.ts || evt.payload?.event?.ts || evt.event?.ts;
const vmName = escapeAgentHTML(getVMName(evt));
const timeStr = ts ? new Date(ts).toLocaleTimeString() : '';
const hasPayload = evt.payload && Object.keys(evt.payload).length > 0;
return `
<div class="timeline-event" data-index="${i}">
<div class="timeline-event-header">
${getEventIcon(eventType)}
<span class="timeline-vm-tag ${vmName}">${vmName}</span>
<span class="timeline-event-type">${escapeAgentHTML(getEventLabel(eventType))}</span>
<span class="timeline-event-time">${timeStr}</span>
</div>
${getEventBody(evt)}
${hasPayload ? '<div class="timeline-expand-hint" onclick="this.parentElement.classList.toggle(\'expanded\')">details</div>' : ''}
${hasPayload ? `<div class="timeline-detail">${escapeAgentHTML(JSON.stringify(evt.payload, null, 2))}</div>` : ''}
</div>
`;
}).join('');
}
function renderAgentStats() {
const s = agentsState.stats;
const el = (id) => document.getElementById(id);
if (el('stat-messages')) el('stat-messages').textContent = s.messages;
if (el('stat-tools')) el('stat-tools').textContent = s.tools;
if (el('stat-errors')) el('stat-errors').textContent = s.errors;
const topTools = Object.entries(s.toolCounts)
.sort((a, b) => b[1] - a[1])
.slice(0, 8);
const list = el('stat-top-tools');
if (list) {
if (topTools.length === 0) {
list.innerHTML = '<li style="color:var(--text-dim);font-size:0.8rem">No data yet</li>';
} else {
list.innerHTML = topTools.map(([name, count]) => `
<li>
<span class="stat-list-name">${escapeAgentHTML(name)}</span>
<span class="stat-list-count">${count}</span>
</li>
`).join('');
}
}
}
```
**Step 2: Verify in browser**
Open `http://localhost:8082/agents` — should show the full layout. No events yet (no hook deployed) but the structure should be complete with the empty timeline and zero stats.
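The timeline code reads the event type, timestamp, and VM name through fallback chains because events arrive in two shapes: rows from `/v1/events` (flat `type` and `ts`, with the envelope under `payload`) and raw envelopes over the WebSocket. The sketch below collapses those chains into one function. The `RawEvent` shape is an assumption inferred from the property paths used above, and `normalizeEvent` is a hypothetical helper that is not in `app.js`; it also omits the `payload.source.client_id` fallback that `getVMName` has.

```typescript
// Assumed shapes, inferred from the property paths the UI code reads.
type EventEnvelope = {
  event?: { type?: string; ts?: string; source?: { client_id?: string; framework?: string } };
};
type RawEvent = EventEnvelope & {
  type?: string;
  ts?: string;
  client_id?: string;
  payload?: EventEnvelope;
};

function normalizeEvent(raw: RawEvent): { type?: string; ts?: string; vm: string } {
  // REST rows nest the envelope under payload; WebSocket messages carry it at the top level.
  const inner = raw.payload?.event ?? raw.event;
  return {
    type: raw.type ?? inner?.type,
    ts: raw.ts ?? inner?.ts,
    vm: raw.client_id ?? inner?.source?.client_id ?? "unknown",
  };
}

const restRow: RawEvent = {
  type: "run.start",
  ts: "2026-03-14T07:00:00Z",
  payload: { event: { type: "run.start", source: { client_id: "zap", framework: "openclaw" } } },
};
const wsEnvelope: RawEvent = {
  event: { type: "span.end", ts: "2026-03-14T07:00:01Z", source: { client_id: "orb" } },
};

console.log(JSON.stringify(normalizeEvent(restRow)));
console.log(JSON.stringify(normalizeEvent(wsEnvelope)));
```

Normalizing once at the point where events enter `agentsState` would let the rendering functions read plain fields, at the cost of one extra object per event.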
**Step 3: Commit**
```bash
git add cmd/web-ui/static/app.js
git commit -m "feat: agents activity timeline and stats panel with real-time WebSocket"
```
---
## Task 7: Create the OpenClaw hook — HOOK.md
**Files:**
- Create: `hooks/agentmon/HOOK.md`
**Step 1: Create the hook metadata file**
````markdown
---
name: agentmon
description: "Emit agent telemetry events to the agentmon monitoring system"
metadata:
  openclaw:
    emoji: "\U0001F4CA"
    events:
      - "command:new"
      - "command:stop"
      - "command:reset"
      - "message:received"
      - "message:sent"
      - "tool_result_persist"
      - "session:compact:before"
      - "session:compact:after"
    export: "default"
    requires:
      env:
        - "AGENTMON_INGEST_URL"
---
# Agentmon Telemetry Hook
Captures OpenClaw agent activity and emits it as structured events to the
agentmon ingest gateway for monitoring and visualization.
## Events Captured
| Event | Maps To | Description |
|-------|---------|-------------|
| command:new | session.start | Agent session begins |
| command:stop/reset | session.end | Session ends |
| message:received | run.start | Inbound message starts a turn |
| message:sent | run.end | Agent response completes |
| tool_result_persist | span.end | Tool call completed |
| session:compact:* | span (internal) | Context management |
## Configuration
Set `AGENTMON_INGEST_URL` to the agentmon ingest gateway URL:
```
export AGENTMON_INGEST_URL=http://192.168.122.1:8080
```
Optionally set `AGENTMON_VM_NAME` to override the VM identifier (defaults
to system hostname).
````
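The mapping in the "Events Captured" table can also be written as a lookup, which makes it easy to see which agentmon event types each hook event produces. This is a sketch for illustration: `EVENT_MAP` and `mapHookEvent` are hypothetical names, and the `command:reset` entry follows the handler in Task 8, which emits `session.end` followed by a fresh `session.start`.

```typescript
// Hypothetical lookup mirroring the "Events Captured" table.
const EVENT_MAP: Record<string, string[]> = {
  "command:new": ["session.start"],
  "command:stop": ["session.end"],
  // A reset closes the current session and immediately opens a new one.
  "command:reset": ["session.end", "session.start"],
  "message:received": ["run.start"],
  "message:sent": ["run.end"],
  "tool_result_persist": ["span.end"],
  "session:compact:before": ["span.start"],
  "session:compact:after": ["span.end"],
};

function mapHookEvent(hookEvent: string): string[] {
  // Unknown hook events map to nothing rather than throwing.
  return EVENT_MAP[hookEvent] ?? [];
}

console.log(mapHookEvent("command:reset").join(" -> "));
console.log(mapHookEvent("tool_result_persist").join(" -> "));
```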
**Step 2: Commit**
```bash
git add hooks/agentmon/HOOK.md
git commit -m "feat: add agentmon hook metadata for OpenClaw"
```
---
## Task 8: Create the OpenClaw hook — handler.ts
**Files:**
- Create: `hooks/agentmon/handler.ts`
**Step 1: Write the complete handler**
```typescript
import { randomUUID } from 'crypto';
import { hostname } from 'os';
// --- Configuration ---
const INGEST_URL = process.env.AGENTMON_INGEST_URL || 'http://192.168.122.1:8080';
const VM_NAME = process.env.AGENTMON_VM_NAME || hostname();
const BATCH_SIZE = 10;
const FLUSH_MS = 2000;
const FETCH_TIMEOUT_MS = 500;
// --- State ---
let buffer: any[] = [];
let flushTimer: ReturnType<typeof setTimeout> | null = null;
const activeRuns = new Map<string, string>(); // sessionKey -> runId
const activeCompactions = new Map<string, string>(); // sessionKey -> spanId
// --- Envelope builder ---
function envelope(
type: string,
sessionKey: string,
opts: {
runId?: string;
spanId?: string;
traceId?: string;
parentSpanId?: string;
attributes?: Record<string, any>;
payload?: Record<string, any>;
} = {}
) {
return {
schema: { name: 'agentmon.event', version: 1 },
event: {
id: randomUUID(),
type,
ts: new Date().toISOString(),
source: {
framework: 'openclaw',
client_id: VM_NAME,
host: VM_NAME,
},
},
correlation: {
session_id: sessionKey || undefined,
run_id: opts.runId || undefined,
trace_id: opts.traceId || undefined,
span_id: opts.spanId || undefined,
parent_span_id: opts.parentSpanId || undefined,
},
attributes: opts.attributes || undefined,
payload: opts.payload || undefined,
};
}
// --- Buffered emitter ---
function enqueue(evt: any) {
buffer.push(evt);
if (buffer.length >= BATCH_SIZE) {
flush();
} else if (!flushTimer) {
flushTimer = setTimeout(flush, FLUSH_MS);
}
}
async function flush() {
if (flushTimer) {
clearTimeout(flushTimer);
flushTimer = null;
}
if (buffer.length === 0) return;
const batch = buffer.splice(0);
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
try {
await fetch(`${INGEST_URL}/v1/events`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(batch),
signal: controller.signal,
});
} catch {
// Fire-and-forget: the batch is dropped on failure so the agent loop is never blocked
console.debug(`[agentmon] flush failed for ${batch.length} events`);
} finally {
// Always clear the abort timer, including when fetch rejects before the timeout fires
clearTimeout(timeout);
}
}
// --- Event handler ---
const handler = async (event: any) => {
const sk = event.sessionKey || 'unknown';
try {
// --- Command events ---
if (event.type === 'command') {
if (event.action === 'new') {
enqueue(envelope('session.start', sk));
} else if (event.action === 'stop') {
enqueue(envelope('session.end', sk));
activeRuns.delete(sk);
activeCompactions.delete(sk);
} else if (event.action === 'reset') {
enqueue(envelope('session.end', sk));
activeRuns.delete(sk);
activeCompactions.delete(sk);
enqueue(envelope('session.start', sk));
}
return;
}
// --- Message events ---
if (event.type === 'message') {
if (event.action === 'received') {
const runId = randomUUID();
activeRuns.set(sk, runId);
enqueue(envelope('run.start', sk, {
runId,
attributes: {
channel: event.context?.channelId,
from: event.context?.from,
},
payload: {
message_preview: (event.context?.content || '').substring(0, 200),
},
}));
} else if (event.action === 'sent') {
const runId = activeRuns.get(sk);
enqueue(envelope('run.end', sk, {
runId,
attributes: {
channel: event.context?.channelId,
to: event.context?.to,
},
payload: {
status: event.context?.success !== false ? 'success' : 'error',
error: event.context?.error || undefined,
},
}));
// Don't delete runId yet - tools may still fire between sent events
}
return;
}
// --- Tool result ---
if (event.type === 'tool_result_persist') {
const runId = activeRuns.get(sk);
const spanId = randomUUID();
const toolName = event.context?.toolName || event.context?.name || 'unknown_tool';
enqueue(envelope('span.end', sk, {
runId,
spanId,
attributes: {
span_kind: 'tool',
name: toolName,
},
payload: {
status: 'success',
result_preview: JSON.stringify(event.context?.result || '').substring(0, 500),
},
}));
return;
}
// --- Session compaction ---
if (event.type === 'session') {
const runId = activeRuns.get(sk);
if (event.action === 'compact:before') {
const spanId = randomUUID();
activeCompactions.set(sk, spanId);
enqueue(envelope('span.start', sk, {
runId,
spanId,
attributes: { span_kind: 'internal', name: 'context_compaction' },
}));
} else if (event.action === 'compact:after') {
const spanId = activeCompactions.get(sk) || randomUUID();
activeCompactions.delete(sk);
enqueue(envelope('span.end', sk, {
runId,
spanId,
attributes: { span_kind: 'internal', name: 'context_compaction' },
payload: { status: 'success' },
}));
}
return;
}
} catch {
console.debug('[agentmon] handler error');
}
};
export default handler;
```
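The buffering behaviour (flush when the batch is full, otherwise after a delay, and never let a failed send reach the caller) can be tested without a network by injecting the sender. The sketch below restates that pattern under those assumptions; `createBatcher` is a hypothetical helper, not part of `handler.ts`, and the in-memory `sentBatches` array stands in for the POST to the ingest gateway.

```typescript
// Hypothetical, network-free restatement of the enqueue/flush pattern.
function createBatcher<T>(
  send: (batch: T[]) => Promise<void>,
  batchSize: number,
  flushMs: number
) {
  let items: T[] = [];
  let timer: ReturnType<typeof setTimeout> | null = null;

  async function flush(): Promise<void> {
    if (timer) {
      clearTimeout(timer);
      timer = null;
    }
    if (items.length === 0) return;
    const batch = items;
    items = [];
    try {
      await send(batch);
    } catch {
      // A failed send drops the batch; the caller is never interrupted.
    }
  }

  function enqueue(item: T): void {
    items.push(item);
    if (items.length >= batchSize) {
      void flush(); // Batch is full: send now.
    } else if (!timer) {
      timer = setTimeout(() => void flush(), flushMs); // Otherwise send after a delay.
    }
  }

  return { enqueue, flush };
}

const sentBatches: number[][] = [];
const demoBatcher = createBatcher<number>(async (b) => { sentBatches.push(b); }, 3, 2000);
[1, 2, 3, 4].forEach((n) => demoBatcher.enqueue(n));
// Drain the remainder explicitly so no timer is left pending.
demoBatcher.flush().then(() => console.log(JSON.stringify(sentBatches)));
```

The first three items go out as one batch as soon as the third is enqueued; the fourth waits for the timer or an explicit `flush()`.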
**Step 2: Commit**
```bash
git add hooks/agentmon/handler.ts
git commit -m "feat: agentmon hook handler with buffered event emission"
```
---
## Task 9: Rebuild and verify end-to-end
**Step 1: Rebuild containers**
```bash
docker compose build --no-cache
docker compose up -d
```
**Step 2: Verify the framework filter works**
```bash
curl -s 'http://localhost:8081/v1/events?framework=openclaw&limit=5'
```
Expected: `{"events":null}` or empty array (no errors)
**Step 3: Send a synthetic test event to verify the pipeline**
```bash
curl -s -X POST http://localhost:8080/v1/events \
-H 'Content-Type: application/json' \
-d '[{
"schema": {"name": "agentmon.event", "version": 1},
"event": {
"id": "test-agent-001",
"type": "run.start",
"ts": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'",
"source": {"framework": "openclaw", "client_id": "zap", "host": "zap"}
},
"correlation": {"session_id": "test-session-1", "run_id": "test-run-1"},
"payload": {"message_preview": "Hello, check the logs please"}
}]'
```
Expected: `{"accepted":1,"rejected":0}`
**Step 4: Verify it appears in the agents page**
```bash
curl -s 'http://localhost:8081/v1/events?framework=openclaw&limit=5'
```
Expected: The test event appears in the response.
Open `http://localhost:8082/agents` in a browser — the test event should appear in the timeline.
**Step 5: Commit**
```bash
git add -A
git commit -m "feat: complete agent monitoring - hook, UI, and backend filter"
```
---
## Task 10: Deploy hook to VMs
**Step 1: Copy the hook to each VM**
```bash
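for vm in zap orb sun; do
  IP=$(virsh domifaddr "$vm" | grep -oP '192\.168\.122\.\d+')
  # Copy into the parent directory so that re-running does not nest agentmon/agentmon
  ssh "openclaw@${IP}" 'mkdir -p ~/.openclaw/hooks'
  scp -r hooks/agentmon "openclaw@${IP}:~/.openclaw/hooks/"
done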
```
**Step 2: Set the environment variable on each VM**
```bash
# Match only the address after "inet"; a bare pattern would also capture the broadcast address
HOST_IP=$(ip -4 -o addr show virbr0 | grep -oP 'inet \K[\d.]+')
for vm in zap orb sun; do
IP=$(virsh domifaddr $vm | grep -oP '192\.168\.122\.\d+')
ssh openclaw@${IP} "echo 'AGENTMON_INGEST_URL=http://${HOST_IP}:8080' >> ~/.openclaw/.env"
ssh openclaw@${IP} "echo 'AGENTMON_VM_NAME=${vm}' >> ~/.openclaw/.env"
done
```
**Step 3: Verify hooks are discovered**
```bash
for vm in zap orb sun; do
IP=$(virsh domifaddr $vm | grep -oP '192\.168\.122\.\d+')
ssh openclaw@${IP} "openclaw hooks list --json" | jq '.[] | select(.name == "agentmon")'
done
```
Expected: The agentmon hook appears in the list for each VM.
**Step 4: Send a test message to one agent and confirm the event appears**
Send a message to any OpenClaw instance through its configured channel. Watch the agents page at `http://localhost:8082/agents` for the event to appear in the timeline.
---
## Summary
| Task | Component | What it does |
|------|-----------|-------------|
| 1 | Backend | Add framework/event_type filters to events query |
| 2 | Backend | Add /agents SPA route to Go server |
| 3 | Frontend | Add nav link and route stub |
| 4 | Frontend | Agents page CSS (timeline, stats, VM pills) |
| 5 | Frontend | VM status strip and page layout |
| 6 | Frontend | Activity timeline rendering and stats panel |
| 7 | Hook | HOOK.md metadata file |
| 8 | Hook | handler.ts with buffered event emission |
| 9 | Infra | Rebuild, test pipeline end-to-end |
| 10 | Infra | Deploy hook to VMs |