feat: complete agent monitoring - hook, UI, and backend filter

- Add event_type and framework filters to events query endpoint
- Add /agents SPA route to web-ui server
- Add Agents nav link and route in frontend
- Add agents page CSS (timeline, VM pills, stats panel)
- Build VM status strip, activity timeline, and real-time stats
- Add agentmon hook for OpenClaw (HOOK.md + handler.ts)
- Add docker-compose, Dockerfile, and supporting infra files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,14 @@
# Local development configuration
# Copy this file to .env and adjust as needed

# Postgres database (running via docker-compose)
DATABASE_URL=postgres://postgres:pass@localhost:5432/agentmon?sslmode=disable

# NATS message queue (running via docker-compose)
NATS_URL=nats://localhost:4222

# NATS topic for events
NATS_TOPIC=agentmon.events.v1

# Query API base URL (for web-ui proxy)
AGENTMON_QUERY_BASE=http://localhost:8081
@@ -1 +1,5 @@
|
||||
.worktrees/
|
||||
.env
|
||||
/ingest-gateway
|
||||
/query-api
|
||||
/web-ui
|
||||
|
||||
@@ -0,0 +1,166 @@
|
||||
# AGENTS.md
|
||||
|
||||
This file provides guidelines for coding agents working on the agentmon repository.
|
||||
|
||||
## Build/Lint/Test Commands
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
go test ./...
|
||||
|
||||
# Run tests for a specific package
|
||||
go test ./internal/event
|
||||
|
||||
# Run a single test
|
||||
go test ./internal/event -run TestValidate_ValidEvent
|
||||
|
||||
# Run tests with verbose output
|
||||
go test -v ./...
|
||||
|
||||
# Tidy dependencies
|
||||
go mod tidy
|
||||
|
||||
# Run services via Makefile
|
||||
make tidy
|
||||
make test
|
||||
make run-ingest # Ingest gateway (requires NATS_URL, NATS_TOPIC)
|
||||
make run-query # Query API (requires DATABASE_URL)
|
||||
make run-ui # Web UI
|
||||
make run-processor # Event processor (requires DATABASE_URL, NATS_URL, NATS_TOPIC)
|
||||
|
||||
# Build executables
|
||||
go build -o ingest-gateway ./cmd/ingest-gateway
|
||||
go build -o query-api ./cmd/query-api
|
||||
go build -o web-ui ./cmd/web-ui
|
||||
```
|
||||
|
||||
## Code Style Guidelines
|
||||
|
||||
### Imports
|
||||
- Order: stdlib, internal packages, external packages
|
||||
- Separate each group with a blank line
|
||||
- No unused imports
|
||||
|
||||
Example:
|
||||
```go
|
||||
import (
|
||||
"context"
|
||||
"database/sql"
|
||||
|
||||
"agentmon/internal/event"
|
||||
"agentmon/internal/httpx"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
"github.com/jackc/pgx/v5"
|
||||
)
|
||||
```
|
||||
|
||||
### Formatting
|
||||
- Use `go fmt` or enable auto-formatting
|
||||
- Standard Go formatting rules apply
|
||||
- No inline comments unless necessary
|
||||
|
||||
### Types
|
||||
- Use `any` for generic types (not `interface{}`)
|
||||
- Use pointer types (`*int64`) for optional JSON fields
|
||||
- Struct tags for JSON serialization: `json:"field_name,omitempty"`
|
||||
- Use `sql.ErrNoRows` for "not found" database errors
|
||||
|
||||
### Naming Conventions
|
||||
- Exported: CamelCase (e.g., `ValidationError`, `Publish`)
|
||||
- Unexported: camelCase (e.g., `db`, `validate`)
|
||||
- Acronyms: keep uppercase (e.g., `DB`, `NATS`, `URL`)
|
||||
- Constants: CamelCase when exported, camelCase when unexported (e.g., `validTypes`)
|
||||
- Test functions: `Test<Function>_<Scenario>`
|
||||
|
||||
### Error Handling
|
||||
- Always check errors; never ignore them
|
||||
- Use `log.Fatalf` for startup errors (main package only)
|
||||
- Use `errors.As()` for type assertions: `errors.As(err, &ve)`
|
||||
- Custom error types must implement `Error() string` method
|
||||
- Return errors from functions, handle at call site
|
||||
- HTTP errors: return JSON with error field, appropriate status code
|
||||
|
||||
Example:
|
||||
```go
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("operation failed: %w", err)
|
||||
}
|
||||
|
||||
var ve ValidationError
if errors.As(err, &ve) {
return ValidationError{Field: ve.Field, Message: ve.Message}
}
|
||||
```
|
||||
|
||||
### Database
|
||||
- Use `context.Context` for all DB operations
|
||||
- Use pgx/v5 via stdlib interface: `sql.Open("pgx", url)`
|
||||
- Check `sql.ErrNoRows` explicitly for not-found cases
|
||||
- Always defer `db.Close()` in main functions
|
||||
|
||||
### HTTP
|
||||
- Use chi router with middleware
|
||||
- Standard middleware chain: `RequestID`, `RealIP`, `Logger`, `Recoverer`
|
||||
- Health check endpoint: `GET /healthz` returns 200 with "ok"
|
||||
- JSON responses: use `httpx.WriteJSON(w, status, data)`
|
||||
- Get path params: `chi.URLParam(r, "paramName")`
|
||||
- Get query params: `r.URL.Query().Get("key")`
|
||||
|
||||
### Configuration
|
||||
- Use environment variables for configuration
|
||||
- Helper pattern: `envDefault(key, defaultValue)` function
|
||||
- Required env vars: call `log.Fatalf` if missing
|
||||
- Optional env vars: provide sensible defaults
|
||||
|
||||
### Validation
|
||||
- Validate all input (HTTP, events)
|
||||
- Return structured errors with field path and message
|
||||
- Use `errors.As` for error type checking (see Error Handling)
|
||||
- Valid event types: `session.start`, `session.end`, `run.start`, `run.end`, `span.start`, `span.end`, `error`, `metric.snapshot`
|
||||
|
||||
### Testing
|
||||
- Use standard `testing` package
|
||||
- Test file naming: `*_test.go`
|
||||
- Test function naming: `Test<Function>_<Scenario>`
|
||||
- Use `t.Fatalf` for setup failures
|
||||
- Use `t.Fatal` for assertion failures (not `t.Error`)
|
||||
- Minimal test structure: setup, act, assert
|
||||
|
||||
Example:
|
||||
```go
|
||||
func TestValidate_ValidEvent(t *testing.T) {
|
||||
raw := `{"schema": {"name": "agentmon.event", "version": 1}}`
|
||||
var m map[string]any
|
||||
if err := json.Unmarshal([]byte(raw), &m); err != nil {
t.Fatalf("unmarshal: %v", err)
}
|
||||
|
||||
err := Validate(m)
|
||||
if err != nil {
|
||||
t.Fatalf("expected no error, got %v", err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Context Usage
|
||||
- Pass `context.Context` to functions that do I/O or external calls
|
||||
- Use `context.WithTimeout` for operations with deadlines
|
||||
- Defer cancel functions
|
||||
- Example: `ctx, cancel := context.WithTimeout(ctx, 5*time.Second)`
|
||||
|
||||
### Package Structure
|
||||
- `cmd/`: Executable entry points (main packages)
|
||||
- `internal/`: Internal packages not imported by external code
|
||||
- `internal/event/`: Event schema and validation
|
||||
- `internal/httpx/`: HTTP utilities
|
||||
- `internal/store/postgres/`: Database operations
|
||||
- `internal/queue/nats/`: NATS publishing
|
||||
|
||||
### JSON Handling
|
||||
- Decode to `map[string]any` for flexible event processing
|
||||
- Type assertions: `if v, ok := m["key"].(string); ok {}`
|
||||
- Use `json.RawMessage` to hold raw JSON when decoding is deferred
|
||||
- JSON encoder/decoder for I/O
|
||||
|
||||
### Logging
|
||||
- Use `log.Printf` for general logging
|
||||
- Use `log.Fatalf` for unrecoverable errors in main
|
||||
- Minimal logging in packages (prefer returning errors)
|
||||
+37
@@ -0,0 +1,37 @@
|
||||
FROM golang:1.25 AS builder
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
COPY go.mod go.sum ./
|
||||
RUN go mod download
|
||||
|
||||
COPY . .
|
||||
|
||||
RUN CGO_ENABLED=0 GOOS=linux go build -o /usr/local/bin/ingest-gateway ./cmd/ingest-gateway
|
||||
RUN CGO_ENABLED=0 GOOS=linux go build -o /usr/local/bin/query-api ./cmd/query-api
|
||||
RUN CGO_ENABLED=0 GOOS=linux go build -o /usr/local/bin/web-ui ./cmd/web-ui
|
||||
RUN CGO_ENABLED=0 GOOS=linux go build -o /usr/local/bin/event-processor ./cmd/event-processor
|
||||
RUN CGO_ENABLED=0 GOOS=linux go build -o /usr/local/bin/openclaw-monitor ./cmd/openclaw-monitor
|
||||
|
||||
FROM debian:bookworm-slim
|
||||
|
||||
RUN apt-get update && apt-get install -y \
|
||||
ca-certificates \
|
||||
libvirt-clients \
|
||||
openssh-client \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
COPY --from=builder /usr/local/bin/* /usr/local/bin/
|
||||
|
||||
ENV AGENTMON_ADDR=:8080
|
||||
ENV AGENTMON_QUERY_ADDR=:8081
|
||||
ENV AGENTMON_UI_ADDR=:8082
|
||||
ENV DATABASE_URL=postgres://postgres:pass@postgres:5432/agentmon?sslmode=disable
|
||||
ENV NATS_URL=nats://nats:4222
|
||||
ENV NATS_TOPIC=agentmon.events.v1
|
||||
ENV OPENCLAW_REGISTRY=/openclaw-registry/openclaw-instances.json
|
||||
ENV POLL_INTERVAL=30s
|
||||
|
||||
CMD ["ingest-gateway"]
|
||||
@@ -1,4 +1,10 @@
|
||||
.PHONY: tidy test run-ingest run-query run-ui run-processor
|
||||
.PHONY: tidy test run-ingest run-query run-ui run-processor run-openclaw-monitor up down logs
|
||||
|
||||
# Load .env file
|
||||
ifneq (,$(wildcard ./.env))
|
||||
include .env
|
||||
export
|
||||
endif
|
||||
|
||||
tidy:
|
||||
go mod tidy
|
||||
@@ -18,29 +24,14 @@ run-ui:
|
||||
run-processor:
|
||||
DATABASE_URL=$${DATABASE_URL:?set DATABASE_URL} NATS_URL=$${NATS_URL:-nats://nats:4222} NATS_TOPIC=$${NATS_TOPIC:-agentmon.events.v1} go run ./cmd/event-processor
|
||||
|
||||
tidy:
|
||||
go mod tidy
|
||||
run-openclaw-monitor:
|
||||
NATS_URL=$${NATS_URL:-nats://nats:4222} NATS_TOPIC=$${NATS_TOPIC:-agentmon.events.v1} OPENCLAW_REGISTRY=$${OPENCLAW_REGISTRY:-/home/will/.claude/state/openclaw-instances.json} POLL_INTERVAL=$${POLL_INTERVAL:-30s} go run ./cmd/openclaw-monitor
|
||||
|
||||
test:
|
||||
go test ./...
|
||||
up:
|
||||
docker-compose up -d
|
||||
|
||||
run-ingest:
|
||||
AGENTMON_ADDR=:8080 go run ./cmd/ingest-gateway
|
||||
down:
|
||||
docker-compose down
|
||||
|
||||
run-query:
|
||||
AGENTMON_QUERY_ADDR=:8081 go run ./cmd/query-api
|
||||
|
||||
run-ui:
|
||||
AGENTMON_UI_ADDR=:8082 go run ./cmd/web-ui
|
||||
|
||||
tidy:
|
||||
go mod tidy
|
||||
|
||||
test:
|
||||
go test ./...
|
||||
|
||||
run-ingest:
|
||||
AGENTMON_ADDR=:8080 go run ./cmd/ingest-gateway
|
||||
|
||||
run-query:
|
||||
AGENTMON_QUERY_ADDR=:8081 go run ./cmd/query-api
|
||||
logs:
|
||||
docker-compose logs -f
|
||||
|
||||
@@ -0,0 +1,253 @@
|
||||
# agentmon
|
||||
|
||||
Telemetry and monitoring system for AI agent activity across [OpenClaw](https://openclaw.ai/) instances running on KVM virtual machines. Captures sessions, runs, tool calls, errors, and VM health metrics — viewable in a real-time web dashboard.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌──────────────────────────┐
|
||||
│ OpenClaw VMs │
|
||||
│ (zap, orb, sun) │
|
||||
│ │
|
||||
│ hooks/agentmon/ │
|
||||
│ → handler.ts │
|
||||
└──────────┬───────────────┘
|
||||
│ HTTP POST
|
||||
▼
|
||||
┌─────────────┐ publish ┌──────────────┐
|
||||
│ openclaw- │────────────▶│ NATS │
|
||||
│ monitor │ │ :4222 │
|
||||
│ (VM polls) │ └──────┬───────┘
|
||||
└─────────────┘ │ subscribe
|
||||
▼
|
||||
┌──────────────────┐
|
||||
│ event-processor │
|
||||
└────────┬─────────┘
|
||||
│ INSERT
|
||||
▼
|
||||
┌─────────────┐ query ┌──────────────┐ proxy ┌──────────────┐
|
||||
│ web-ui │◀────────▶│ query-api │◀──────────│ browser │
|
||||
│ :8082 │ │ :8081 │ └──────────────┘
|
||||
└─────────────┘ └──────────────┘
|
||||
▲
|
||||
│
|
||||
┌────────┴───────┐
|
||||
│ PostgreSQL │
|
||||
│ :5432 │
|
||||
└────────────────┘
|
||||
```
|
||||
|
||||
**Data flow:** OpenClaw hooks emit telemetry events over HTTP to the **ingest gateway**, which publishes them to **NATS**. The **event processor** subscribes and persists events to **PostgreSQL**. The **query API** serves aggregated data (sessions, runs, spans) to the **web UI**. A separate **openclaw-monitor** polls VM health metrics (CPU, memory, disk, service status) via libvirt and SSH.
|
||||
|
||||
Real-time updates flow through NATS → query-api → WebSocket → browser.
|
||||
|
||||
## Services
|
||||
|
||||
| Service | Port | Description |
|
||||
|---------|------|-------------|
|
||||
| **ingest-gateway** | 8080 | HTTP + WebSocket event ingestion, publishes to NATS |
|
||||
| **query-api** | 8081 | REST API for sessions, runs, spans; WebSocket live feed |
|
||||
| **web-ui** | 8082 | SPA frontend with reverse proxy to query-api |
|
||||
| **event-processor** | — | NATS subscriber, persists events to Postgres |
|
||||
| **openclaw-monitor** | — | Polls VM instances via libvirt/SSH, emits snapshots |
|
||||
| **postgres** | 5432 | Event storage |
|
||||
| **nats** | 4222 | Message queue (JetStream) |
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
make up
|
||||
```
|
||||
|
||||
This starts Postgres, NATS, and all application services via Docker Compose. Open http://localhost:8082.
|
||||
|
||||
For local development, start infrastructure only and run services manually:
|
||||
|
||||
```bash
|
||||
make up # postgres + nats
|
||||
make run-ingest # terminal 1
|
||||
make run-query # terminal 2
|
||||
make run-ui # terminal 3
|
||||
make run-processor # terminal 4
|
||||
make run-openclaw-monitor # terminal 5
|
||||
```
|
||||
|
||||
Or use the convenience scripts:
|
||||
|
||||
```bash
|
||||
./start-all.sh # start everything
|
||||
./stop-all.sh # stop everything
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
Environment variables (see `.env.example`):
|
||||
|
||||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `DATABASE_URL` | — | Postgres connection string (required) |
|
||||
| `NATS_URL` | `nats://nats:4222` | NATS server address |
|
||||
| `NATS_TOPIC` | `agentmon.events.v1` | NATS topic for events |
|
||||
| `AGENTMON_ADDR` | `:8080` | Ingest gateway listen address |
|
||||
| `AGENTMON_QUERY_ADDR` | `:8081` | Query API listen address |
|
||||
| `AGENTMON_UI_ADDR` | `:8082` | Web UI listen address |
|
||||
| `AGENTMON_QUERY_BASE` | `http://query-api` | Query API URL (for web-ui proxy) |
|
||||
| `OPENCLAW_REGISTRY` | `~/.claude/state/openclaw-instances.json` | VM instance registry |
|
||||
| `POLL_INTERVAL` | `30s` | VM polling interval |
|
||||
|
||||
## API
|
||||
|
||||
### Ingest Gateway (`:8080`)
|
||||
|
||||
```
|
||||
GET /healthz Health check
|
||||
POST /v1/events Batch event ingestion (JSON array)
|
||||
GET /v1/ws WebSocket event stream
|
||||
```
|
||||
|
||||
### Query API (`:8081`)
|
||||
|
||||
```
|
||||
GET /healthz Health check
|
||||
GET /v1/events List events (?event_type=&framework=&limit=)
|
||||
GET /v1/sessions List sessions (?from=&to=&framework=&host=&cursor=&limit=)
|
||||
GET /v1/sessions/{id} Session detail with runs
|
||||
GET /v1/runs/{id} Run detail with spans
|
||||
GET /v1/ws WebSocket live event broadcast
|
||||
```
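
As an illustration of the `/v1/events` filters, a client could build the request URL as sketched below. The `eventsURL` helper is hypothetical and not part of the repository; only the path and the `event_type`, `framework`, and `limit` parameter names come from the endpoint list above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// eventsURL is a hypothetical helper that builds a GET /v1/events URL.
// Empty filters and non-positive limits are omitted from the query string.
func eventsURL(base, eventType, framework string, limit int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", fmt.Errorf("parse base URL: %w", err)
	}
	u.Path = "/v1/events"

	q := url.Values{}
	if eventType != "" {
		q.Set("event_type", eventType)
	}
	if framework != "" {
		q.Set("framework", framework)
	}
	if limit > 0 {
		q.Set("limit", strconv.Itoa(limit))
	}
	u.RawQuery = q.Encode() // Encode sorts parameters by key.
	return u.String(), nil
}

func main() {
	u, err := eventsURL("http://localhost:8081", "run.start", "openclaw", 50)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```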
|
||||
|
||||
## Event Schema
|
||||
|
||||
Events follow the `agentmon.event` envelope format:
|
||||
|
||||
```json
|
||||
{
|
||||
"schema": { "name": "agentmon.event", "version": 1 },
|
||||
"event": {
|
||||
"id": "uuid",
|
||||
"type": "session.start",
|
||||
"ts": "2026-03-13T12:00:00Z",
|
||||
"source": {
|
||||
"framework": "openclaw",
|
||||
"client_id": "zap",
|
||||
"host": "zap"
|
||||
}
|
||||
},
|
||||
"correlation": {
|
||||
"session_id": "uuid",
|
||||
"run_id": "uuid",
|
||||
"span_id": "uuid"
|
||||
},
|
||||
"attributes": {},
|
||||
"payload": {}
|
||||
}
|
||||
```
|
||||
|
||||
**Event types:** `session.start`, `session.end`, `run.start`, `run.end`, `span.start`, `span.end`, `error`, `metric.snapshot`, `openclaw.snapshot`
|
||||
|
||||
## Database Schema
|
||||
|
||||
```sql
|
||||
CREATE TABLE events (
|
||||
event_id TEXT PRIMARY KEY,
|
||||
ts TIMESTAMPTZ NOT NULL,
|
||||
type TEXT NOT NULL,
|
||||
session_id TEXT,
|
||||
run_id TEXT,
|
||||
trace_id TEXT,
|
||||
span_id TEXT,
|
||||
parent_span_id TEXT,
|
||||
source_framework TEXT,
|
||||
client_id TEXT,
|
||||
payload JSONB NOT NULL
|
||||
);
|
||||
```
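
The `event_type` and `framework` filters on `/v1/events` map naturally onto the `type` and `source_framework` columns. The sketch below shows one way to build a parameterized query from such a filter. It is an assumption-laden illustration, not the code in `internal/store/postgres`: `buildEventsQuery`, the selected columns, and the default limit of 100 are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// EventsFilter mirrors the filter fields set by the /v1/events handler.
type EventsFilter struct {
	Limit     int
	EventType string
	Framework string
}

// buildEventsQuery is a hypothetical builder that returns SQL with numbered
// placeholders plus the matching arguments, so values are never interpolated.
func buildEventsQuery(f EventsFilter) (string, []any) {
	var where []string
	var args []any

	if f.EventType != "" {
		args = append(args, f.EventType)
		where = append(where, fmt.Sprintf("type = $%d", len(args)))
	}
	if f.Framework != "" {
		args = append(args, f.Framework)
		where = append(where, fmt.Sprintf("source_framework = $%d", len(args)))
	}

	limit := f.Limit
	if limit <= 0 {
		limit = 100 // assumed default, not taken from the repository
	}
	args = append(args, limit)

	query := "SELECT event_id, ts, type, payload FROM events"
	if len(where) > 0 {
		query += " WHERE " + strings.Join(where, " AND ")
	}
	query += fmt.Sprintf(" ORDER BY ts DESC LIMIT $%d", len(args))
	return query, args
}

func main() {
	query, args := buildEventsQuery(EventsFilter{EventType: "run.start", Framework: "openclaw", Limit: 10})
	fmt.Println(query)
	fmt.Println(args...)
}
```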
|
||||
|
||||
## OpenClaw Hook
|
||||
|
||||
The `hooks/agentmon/` directory contains a TypeScript hook that captures agent activity from OpenClaw instances and emits it to the ingest gateway. It maps OpenClaw events to agentmon's session/run/span model:
|
||||
|
||||
| OpenClaw Event | agentmon Event | Description |
|
||||
|----------------|----------------|-------------|
|
||||
| `command:new` | `session.start` | New conversation started |
|
||||
| `command:stop` | `session.end` | Conversation ended |
|
||||
| `command:reset` | `session.end` + `session.start` | Conversation reset |
|
||||
| `message:received` | `run.start` | User message received |
|
||||
| `message:sent` | `run.end` | Agent response sent |
|
||||
| `tool_result_persist` | `span.end` | Tool call completed |
|
||||
| `session:compact:before` | `span.start` | Context compaction started |
|
||||
| `session:compact:after` | `span.end` | Context compaction finished |
|
||||
|
||||
### Deploying the hook
|
||||
|
||||
The hook is deployed to each VM at `~/.openclaw/hooks/agentmon/`. Two environment variables are required in `~/.openclaw/.env`:
|
||||
|
||||
```bash
|
||||
AGENTMON_INGEST_URL=http://192.168.122.1:8080
|
||||
AGENTMON_VM_NAME=zap # or orb, sun
|
||||
```
|
||||
|
||||
Deployment is automated via Ansible — see the [swarm ansible playbook](https://gitea-http.taildb3494.ts.net/will/swarm) `playbooks/customize.yml`.
|
||||
|
||||
## Go SDK
|
||||
|
||||
Emit events from Go applications:
|
||||
|
||||
```go
|
||||
emitter, err := sdk.NewEmitter(sdk.Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
Framework: "my-agent",
|
||||
ClientID: "client-001",
|
||||
Host: "localhost",
|
||||
})
|
||||
defer emitter.Close(ctx)
|
||||
|
||||
emitter.Emit(ctx, sdk.NewSessionStart(sessionID, sdk.WithSource(emitter)))
|
||||
emitter.Emit(ctx, sdk.NewRunStart(sessionID, runID))
|
||||
emitter.Emit(ctx, sdk.NewRunEnd(sessionID, runID, sdk.WithPayload(map[string]any{
|
||||
"status": "success",
|
||||
"duration_ms": 1234,
|
||||
})))
|
||||
```
|
||||
|
||||
## Web UI
|
||||
|
||||
The dashboard has four views:
|
||||
|
||||
- **Sessions** — browse all agent sessions with date range and framework filters
|
||||
- **Session Detail** — view runs within a session, drill into individual runs
|
||||
- **OpenClaw** — real-time grid of VM health cards (state, CPU, memory, disk, issues)
|
||||
- **Agents** — live timeline of agent events with statistics (message counts, tool usage, errors)
|
||||
|
||||
## Development
|
||||
|
||||
```bash
|
||||
make test # run tests
|
||||
make tidy # go mod tidy
|
||||
make logs # docker compose logs
|
||||
make down # stop everything
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
cmd/
|
||||
├── ingest-gateway/ HTTP event ingestion service
|
||||
├── query-api/ REST API for querying events
|
||||
├── web-ui/ SPA frontend + static assets
|
||||
│ └── static/ HTML, CSS, JS
|
||||
├── event-processor/ NATS → Postgres persistence
|
||||
└── openclaw-monitor/ VM health polling
|
||||
internal/
|
||||
├── event/ Envelope types and validation
|
||||
├── httpx/ HTTP response helpers
|
||||
├── queue/nats/ NATS publisher and subscriber
|
||||
├── store/postgres/ Database queries (sessions, runs, spans)
|
||||
├── sdk/ Go client library for emitting events
|
||||
└── monitor/openclaw/ VM metrics collection (libvirt, SSH)
|
||||
hooks/
|
||||
└── agentmon/ OpenClaw hook (TypeScript)
|
||||
deploy/
|
||||
└── k8s/ Database schema (postgres.sql)
|
||||
```
|
||||
@@ -0,0 +1,174 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"log"
|
||||
"os"
|
||||
"time"
|
||||
|
||||
"agentmon/internal/monitor/openclaw"
|
||||
qnats "agentmon/internal/queue/nats"
|
||||
)
|
||||
|
||||
type Event struct {
|
||||
Schema map[string]any `json:"schema"`
|
||||
Event map[string]any `json:"event"`
|
||||
Payload map[string]any `json:"payload"`
|
||||
}
|
||||
|
||||
func main() {
|
||||
natsURL := envDefault("NATS_URL", "nats://nats:4222")
|
||||
natsTopic := envDefault("NATS_TOPIC", "agentmon.events.v1")
|
||||
registryPath := envDefault("OPENCLAW_REGISTRY", "/home/will/.claude/state/openclaw-instances.json")
|
||||
interval := envDefault("POLL_INTERVAL", "30s")
|
||||
|
||||
pub, err := qnats.NewPublisher(natsURL, natsTopic)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to connect to NATS: %v", err)
|
||||
}
|
||||
defer pub.Close()
|
||||
|
||||
pollDuration, err := time.ParseDuration(interval)
|
||||
if err != nil {
|
||||
log.Fatalf("invalid poll interval: %v", err)
|
||||
}
|
||||
|
||||
ticker := time.NewTicker(pollDuration)
|
||||
defer ticker.Stop()
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
log.Printf("openclaw-monitor started, polling every %s", pollDuration)
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-ticker.C:
|
||||
if err := pollInstances(ctx, pub, registryPath); err != nil {
|
||||
log.Printf("poll error: %v", err)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func pollInstances(ctx context.Context, pub *qnats.Publisher, registryPath string) error {
|
||||
instances, err := openclaw.LoadInstances(registryPath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
for _, instance := range instances {
|
||||
metrics := openclaw.Metrics{
|
||||
Instance: instance,
|
||||
Timestamp: time.Now().UTC(),
|
||||
}
|
||||
|
||||
hostMetrics, err := openclaw.CollectHostMetrics(instance.Domain)
|
||||
if err != nil {
|
||||
metrics.Error = err.Error()
|
||||
emitEvent(ctx, pub, instance.Name, metrics)
|
||||
continue
|
||||
}
|
||||
metrics.Host = hostMetrics
|
||||
|
||||
if hostMetrics.State == "running" && instance.Host != nil {
|
||||
guestMetrics, err := openclaw.CollectGuestMetrics(*instance.Host, instance.User)
|
||||
if err != nil {
|
||||
log.Printf("guest collection failed for %s: %v", instance.Name, err)
|
||||
} else {
|
||||
metrics.Guest = guestMetrics
|
||||
}
|
||||
}
|
||||
|
||||
backupStatus, err := openclaw.CollectBackupStatus(instance.Name)
|
||||
if err != nil {
|
||||
log.Printf("backup collection failed for %s: %v", instance.Name, err)
|
||||
} else {
|
||||
metrics.Backup = backupStatus
|
||||
}
|
||||
|
||||
issues := openclaw.DetectIssues(metrics)
|
||||
if anyIssues(issues) {
|
||||
log.Printf("issues detected for %s: %+v", instance.Name, issues)
|
||||
}
|
||||
|
||||
emitEvent(ctx, pub, instance.Name, metrics)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func emitEvent(ctx context.Context, pub *qnats.Publisher, instanceName string, metrics openclaw.Metrics) {
|
||||
event := Event{
|
||||
Schema: map[string]any{
|
||||
"name": "agentmon.openclaw",
|
||||
"version": 1,
|
||||
},
|
||||
Event: map[string]any{
|
||||
"id": generateID(),
|
||||
"type": "openclaw.snapshot",
|
||||
"ts": metrics.Timestamp.UTC().Format(time.RFC3339Nano),
|
||||
},
|
||||
Payload: map[string]any{
|
||||
"instance": metrics.Instance,
|
||||
"host": metrics.Host,
|
||||
},
|
||||
}
|
||||
|
||||
if metrics.Guest != nil {
|
||||
event.Payload["guest"] = metrics.Guest
|
||||
}
|
||||
if metrics.Backup != nil {
|
||||
event.Payload["backup"] = metrics.Backup
|
||||
}
|
||||
if metrics.Error != "" {
|
||||
event.Payload["error"] = metrics.Error
|
||||
}
|
||||
|
||||
issues := openclaw.DetectIssues(metrics)
|
||||
if anyIssues(issues) {
|
||||
event.Payload["issues"] = issues
|
||||
}
|
||||
|
||||
data, err := json.Marshal(event)
|
||||
if err != nil {
|
||||
log.Printf("failed to marshal event for %s: %v", instanceName, err)
|
||||
return
|
||||
}
|
||||
|
||||
if err := pub.Publish(ctx, data); err != nil {
|
||||
log.Printf("failed to publish event for %s: %v", instanceName, err)
|
||||
}
|
||||
}
|
||||
|
||||
func anyIssues(issues openclaw.Issues) bool {
|
||||
return issues.GuestDiskUsageHigh ||
|
||||
issues.GuestMemoryUsageHigh ||
|
||||
issues.HostDiskUsageHigh ||
|
||||
issues.GatewayDown ||
|
||||
issues.HTTPUnhealthy ||
|
||||
issues.VersionMismatch ||
|
||||
issues.VMNotRunning ||
|
||||
issues.BackupStale
|
||||
}
|
||||
|
||||
func generateID() string {
|
||||
return time.Now().Format("20060102150405") + "-" + randomString(8)
|
||||
}
|
||||
|
||||
func randomString(n int) string {
|
||||
const chars = "abcdefghijklmnopqrstuvwxyz0123456789"
|
||||
b := make([]byte, n)
|
||||
for i := range b {
|
||||
b[i] = chars[time.Now().Nanosecond()%len(chars)]
|
||||
time.Sleep(time.Nanosecond)
|
||||
}
|
||||
return string(b)
|
||||
}
|
||||
|
||||
func envDefault(key, def string) string {
|
||||
if v := os.Getenv(key); v != "" {
|
||||
return v
|
||||
}
|
||||
return def
|
||||
}
|
||||
+86
-1
@@ -6,6 +6,7 @@ import (
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"agentmon/internal/httpx"
|
||||
@@ -13,11 +14,80 @@ import (
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
"github.com/go-chi/chi/v5/middleware"
|
||||
"github.com/gorilla/websocket"
|
||||
"github.com/nats-io/nats.go"
|
||||
)
|
||||
|
||||
var (
|
||||
wsUpgrader = websocket.Upgrader{
|
||||
CheckOrigin: func(r *http.Request) bool { return true },
|
||||
}
|
||||
wsClients = make(map[*websocket.Conn]bool)
|
||||
wsMu sync.RWMutex
|
||||
natsConn *nats.Conn
|
||||
)
|
||||
|
||||
func subscribeToNATS(nc *nats.Conn) {
|
||||
topic := envDefault("NATS_TOPIC", "agentmon.events.v1")
|
||||
sub, err := nc.Subscribe(topic, func(msg *nats.Msg) {
|
||||
wsMu.RLock()
|
||||
var stale []*websocket.Conn
|
||||
for conn := range wsClients {
|
||||
err := conn.WriteMessage(websocket.TextMessage, msg.Data)
|
||||
if err != nil {
|
||||
conn.Close()
|
||||
stale = append(stale, conn)
|
||||
}
|
||||
}
|
||||
wsMu.RUnlock()
|
||||
|
||||
if len(stale) > 0 {
|
||||
wsMu.Lock()
|
||||
for _, conn := range stale {
|
||||
delete(wsClients, conn)
|
||||
}
|
||||
wsMu.Unlock()
|
||||
}
|
||||
})
|
||||
if err != nil {
|
||||
log.Printf("failed to subscribe to NATS: %v", err)
|
||||
return
|
||||
}
|
||||
log.Printf("subscribed to NATS topic: %s", topic)
|
||||
_ = sub
|
||||
}
|
||||
|
||||
func wsHandler(w http.ResponseWriter, r *http.Request) {
|
||||
conn, err := wsUpgrader.Upgrade(w, r, nil)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer conn.Close()
|
||||
|
||||
wsMu.Lock()
|
||||
wsClients[conn] = true
|
||||
wsMu.Unlock()
|
||||
|
||||
log.Printf("WebSocket client connected")
|
||||
|
||||
for {
|
||||
_, _, err := conn.ReadMessage()
|
||||
if err != nil {
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
wsMu.Lock()
|
||||
delete(wsClients, conn)
|
||||
wsMu.Unlock()
|
||||
log.Printf("WebSocket client disconnected")
|
||||
}
|
||||
|
||||
func main() {
|
||||
addr := envDefault("AGENTMON_QUERY_ADDR", ":8081")
|
||||
dsn := os.Getenv("DATABASE_URL")
|
||||
natsURL := envDefault("NATS_URL", "nats://localhost:4222")
|
||||
|
||||
if dsn == "" {
|
||||
log.Fatalf("DATABASE_URL is required")
|
||||
}
|
||||
@@ -28,6 +98,14 @@ func main() {
|
||||
}
|
||||
defer func() { _ = db.Close() }()
|
||||
|
||||
nc, err := nats.Connect(natsURL)
|
||||
if err != nil {
|
||||
log.Printf("warning: failed to connect to NATS: %v", err)
|
||||
} else {
|
||||
natsConn = nc
|
||||
go subscribeToNATS(nc)
|
||||
}
|
||||
|
||||
r := chi.NewRouter()
|
||||
r.Use(middleware.RequestID)
|
||||
r.Use(middleware.RealIP)
|
||||
@@ -39,9 +117,16 @@ func main() {
|
||||
_, _ = w.Write([]byte("ok"))
|
||||
})
|
||||
|
||||
r.Get("/v1/ws", wsHandler)
|
||||
|
||||
r.Get("/v1/events", func(w http.ResponseWriter, r *http.Request) {
|
||||
limit, _ := strconv.Atoi(r.URL.Query().Get("limit"))
|
||||
events, err := db.ListRecentEvents(r.Context(), limit)
|
||||
f := postgres.EventsFilter{
|
||||
Limit: limit,
|
||||
EventType: r.URL.Query().Get("event_type"),
|
||||
Framework: r.URL.Query().Get("framework"),
|
||||
}
|
||||
events, err := db.ListRecentEvents(r.Context(), f)
|
||||
if err != nil {
|
||||
httpx.WriteJSON(w, http.StatusInternalServerError, map[string]any{"error": "db_error"})
|
||||
return
|
||||
|
||||
+1
-1
@@ -48,7 +48,7 @@ func main() {
|
||||
// SPA catch-all: serve index.html for all other routes
|
||||
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
|
||||
// Serve index.html for SPA routes
|
||||
if r.URL.Path == "/" || strings.HasPrefix(r.URL.Path, "/sessions") || strings.HasPrefix(r.URL.Path, "/runs") {
|
||||
if r.URL.Path == "/" || strings.HasPrefix(r.URL.Path, "/sessions") || strings.HasPrefix(r.URL.Path, "/runs") || strings.HasPrefix(r.URL.Path, "/openclaw") || strings.HasPrefix(r.URL.Path, "/agents") {
|
||||
f, err := staticFiles.Open("static/index.html")
|
||||
if err != nil {
|
||||
http.Error(w, "index.html not found", http.StatusInternalServerError)
|
||||
|
||||
+744
-105
@@ -1,18 +1,91 @@
|
||||
(function() {
|
||||
const app = document.getElementById('app');
|
||||
|
||||
// Router
|
||||
function route() {
|
||||
const path = window.location.pathname;
|
||||
let ws = null;
|
||||
let wsReconnectTimeout = null;
|
||||
const wsCallbacks = new Set();
|
||||
|
||||
let sessionsState = { sessions: [], cursor: null };
|
||||
let openclawState = { instances: {} };
|
||||
let openclawUnsubscribe = null;
|
||||
let agentsState = createAgentsState();
|
||||
let agentsUnsubscribe = null;
|
||||
|
||||
function getWsURL() {
|
||||
const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
|
||||
return protocol + '//' + window.location.host + '/api/v1/ws';
|
||||
}
|
||||
|
||||
function connectWS() {
|
||||
if (ws && (ws.readyState === WebSocket.OPEN || ws.readyState === WebSocket.CONNECTING)) {
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
ws = new WebSocket(getWsURL());
|
||||
|
||||
ws.onopen = () => {
|
||||
console.log('WebSocket connected');
|
||||
wsCallbacks.forEach(cb => cb({ type: 'connected' }));
|
||||
};
|
||||
|
||||
ws.onmessage = (event) => {
|
||||
try {
|
||||
const data = JSON.parse(event.data);
|
||||
wsCallbacks.forEach(cb => cb({ type: 'message', data }));
|
||||
} catch (e) {
|
||||
console.error('Failed to parse WS message:', e);
|
||||
}
|
||||
};
|
||||
|
||||
ws.onclose = () => {
|
||||
console.log('WebSocket disconnected');
|
||||
wsCallbacks.forEach(cb => cb({ type: 'disconnected' }));
|
||||
wsReconnectTimeout = setTimeout(connectWS, 5000);
|
||||
};
|
||||
|
||||
ws.onerror = (err) => {
|
||||
console.error('WebSocket error:', err);
|
||||
};
|
||||
} catch (e) {
|
||||
console.error('Failed to connect WebSocket:', e);
|
||||
wsReconnectTimeout = setTimeout(connectWS, 5000);
|
||||
}
|
||||
}
|
||||
|
||||
function subscribeWS(callback) {
|
||||
wsCallbacks.add(callback);
|
||||
if (!ws || ws.readyState !== WebSocket.OPEN) {
|
||||
connectWS();
|
||||
}
|
||||
return () => wsCallbacks.delete(callback);
|
||||
}
|
||||
|
||||
function cleanupLiveViews() {
|
||||
if (openclawUnsubscribe) {
|
||||
openclawUnsubscribe();
|
||||
openclawUnsubscribe = null;
|
||||
}
|
||||
if (agentsUnsubscribe) {
|
||||
agentsUnsubscribe();
|
||||
agentsUnsubscribe = null;
|
||||
}
|
||||
}
|
||||
|
||||
function route() {
|
||||
cleanupLiveViews();
|
||||
|
||||
const path = window.location.pathname;
|
||||
if (path === '/' || path === '/sessions') {
|
||||
renderSessions();
|
||||
} else if (path.startsWith('/agents')) {
|
||||
renderAgents();
|
||||
} else if (path.startsWith('/openclaw')) {
|
||||
renderOpenClaw();
|
||||
} else if (path.startsWith('/sessions/')) {
|
||||
const sessionID = path.split('/sessions/')[1];
|
||||
renderSession(sessionID);
|
||||
renderSession(path.split('/sessions/')[1]);
|
||||
} else if (path.startsWith('/runs/')) {
|
||||
const runID = path.split('/runs/')[1];
|
||||
renderRun(runID);
|
||||
renderRun(path.split('/runs/')[1]);
|
||||
} else {
|
||||
app.innerHTML = '<p>Page not found</p>';
|
||||
}
|
||||
@@ -25,14 +98,28 @@
|
||||
|
||||
window.addEventListener('popstate', route);
|
||||
|
||||
// API helpers
|
||||
async function api(path) {
|
||||
const resp = await fetch('/api' + path);
|
||||
if (!resp.ok) throw new Error('API error');
|
||||
if (!resp.ok) {
|
||||
throw new Error('API error');
|
||||
}
|
||||
return resp.json();
|
||||
}
|
||||
|
||||
function escapeHTML(value) {
|
||||
return String(value ?? '')
|
||||
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#39;');
|
||||
}
|
||||
|
||||
function relativeTime(ts) {
|
||||
if (!ts) {
|
||||
return '-';
|
||||
}
|
||||
|
||||
const now = Date.now();
|
||||
const then = new Date(ts).getTime();
|
||||
const diff = now - then;
|
||||
@@ -44,23 +131,82 @@
|
||||
}
|
||||
|
||||
function formatDuration(ms) {
|
||||
if (!ms) return '-';
|
||||
if (ms === undefined || ms === null || ms === '') return '-';
|
||||
if (ms < 1000) return ms + 'ms';
|
||||
if (ms < 60000) return (ms / 1000).toFixed(1) + 's';
|
||||
return (ms / 60000).toFixed(1) + 'm';
|
||||
}
|
||||
|
||||
function statusIcon(status) {
|
||||
if (status === 'success') return '<span class="status-success">✓</span>';
|
||||
if (status === 'error') return '<span class="status-error">✗</span>';
|
||||
return '<span class="status-unknown">●</span>';
|
||||
function formatBytes(bytes) {
|
||||
if (!bytes) return null;
|
||||
const units = ['B', 'KB', 'MB', 'GB', 'TB'];
|
||||
let unitIndex = 0;
|
||||
let value = bytes;
|
||||
while (value >= 1024 && unitIndex < units.length - 1) {
|
||||
value /= 1024;
|
||||
unitIndex++;
|
||||
}
|
||||
return value.toFixed(1) + ' ' + units[unitIndex];
|
||||
}
|
||||
|
||||
// Sessions list
|
||||
let sessionsState = { sessions: [], cursor: null, filters: {} };
|
||||
function statusIcon(status) {
|
||||
if (status === 'success') return '<span class="status-badge status-success"><span class="status-dot"></span>success</span>';
|
||||
if (status === 'error') return '<span class="status-badge status-error"><span class="status-dot"></span>error</span>';
|
||||
return '<span class="status-badge status-unknown"><span class="status-dot"></span>unknown</span>';
|
||||
}
|
||||
|
||||
function extractEnvelope(record) {
|
||||
if (record && typeof record === 'object' && record.payload && record.payload.event && record.payload.schema) {
|
||||
return record.payload;
|
||||
}
|
||||
return record || {};
|
||||
}
|
||||
|
||||
function getEnvelopeEvent(record) {
|
||||
const envelope = extractEnvelope(record);
|
||||
return envelope.event || envelope.Event || {};
|
||||
}
|
||||
|
||||
function getEnvelopeType(record) {
|
||||
return record?.type || getEnvelopeEvent(record).type || '';
|
||||
}
|
||||
|
||||
function getEnvelopeTS(record) {
|
||||
return record?.ts || getEnvelopeEvent(record).ts || '';
|
||||
}
|
||||
|
||||
function getEnvelopeSource(record) {
|
||||
return getEnvelopeEvent(record).source || {};
|
||||
}
|
||||
|
||||
function getEnvelopePayload(record) {
|
||||
const envelope = extractEnvelope(record);
|
||||
return envelope.payload || envelope.Payload || {};
|
||||
}
|
||||
|
||||
function getEnvelopeAttributes(record) {
|
||||
const envelope = extractEnvelope(record);
|
||||
return envelope.attributes || envelope.Attributes || {};
|
||||
}
|
||||
|
||||
function getEnvelopeCorrelation(record) {
|
||||
const envelope = extractEnvelope(record);
|
||||
return envelope.correlation || envelope.Correlation || {};
|
||||
}
|
||||
|
||||
function getRecordID(record) {
|
||||
return record?.event_id || getEnvelopeEvent(record).id || '';
|
||||
}
|
||||
|
||||
function isCurrentPath(prefix) {
|
||||
return window.location.pathname.startsWith(prefix);
|
||||
}
|
||||
|
||||
async function renderSessions() {
|
||||
app.innerHTML = `
|
||||
<div class="page-header">
|
||||
<h2>Sessions</h2>
|
||||
</div>
|
||||
<div class="filters">
|
||||
<label>From <input type="date" id="filter-from"></label>
|
||||
<label>To <input type="date" id="filter-to"></label>
|
||||
@@ -69,26 +215,28 @@
|
||||
<option value="">All</option>
|
||||
<option value="claude-code">claude-code</option>
|
||||
<option value="opencode">opencode</option>
|
||||
<option value="openclaw">openclaw</option>
|
||||
</select>
|
||||
</label>
|
||||
<label>Host <input type="text" id="filter-host" placeholder="hostname"></label>
|
||||
</div>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Session</th>
|
||||
<th>Framework</th>
|
||||
<th>Host</th>
|
||||
<th>Runs</th>
|
||||
<th>Time</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody id="sessions-body"></tbody>
|
||||
</table>
|
||||
<div class="table-container">
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Session</th>
|
||||
<th>Framework</th>
|
||||
<th>Host</th>
|
||||
<th>Runs</th>
|
||||
<th>Time</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody id="sessions-body"></tbody>
|
||||
</table>
|
||||
</div>
|
||||
<button id="load-more" class="load-more" style="display:none">Load more</button>
|
||||
`;
|
||||
|
||||
// Bind filter events
|
||||
['from', 'to', 'framework', 'host'].forEach(f => {
|
||||
document.getElementById('filter-' + f).addEventListener('change', () => {
|
||||
sessionsState.sessions = [];
|
||||
@@ -122,12 +270,12 @@
|
||||
|
||||
const tbody = document.getElementById('sessions-body');
|
||||
tbody.innerHTML = sessionsState.sessions.map(s => `
|
||||
<tr class="clickable" data-session="${s.session_id}">
|
||||
<td>${s.session_id.substring(0, 12)}...</td>
|
||||
<td>${s.framework || '-'}</td>
|
||||
<td>${s.host || '-'}</td>
|
||||
<tr class="clickable" data-session="${escapeHTML(s.session_id)}">
|
||||
<td class="id-cell">${escapeHTML(s.session_id.substring(0, 12))}...</td>
|
||||
<td>${escapeHTML(s.framework || '-')}</td>
|
||||
<td>${escapeHTML(s.host || '-')}</td>
|
||||
<td>${s.run_count}</td>
|
||||
<td title="${s.started_at}">${relativeTime(s.started_at)}</td>
|
||||
<td title="${escapeHTML(s.started_at)}">${escapeHTML(relativeTime(s.started_at))}</td>
|
||||
</tr>
|
||||
`).join('') || '<tr><td colspan="5" class="empty-state">No sessions found</td></tr>';
|
||||
|
||||
@@ -138,12 +286,10 @@
|
||||
document.getElementById('load-more').style.display = sessionsState.cursor ? 'block' : 'none';
|
||||
}
|
||||
|
||||
// Session detail
|
||||
async function renderSession(sessionID) {
|
||||
const data = await api('/v1/sessions/' + sessionID);
|
||||
const s = data.session;
|
||||
const runs = data.runs || [];
|
||||
|
||||
const duration = s.ended_at
|
||||
? formatDuration(new Date(s.ended_at) - new Date(s.started_at))
|
||||
: 'ongoing';
|
||||
@@ -151,42 +297,44 @@
|
||||
app.innerHTML = `
|
||||
<a href="/sessions" class="back-link">← Back to Sessions</a>
|
||||
<div class="page-header">
|
||||
<h2>Session ${sessionID.substring(0, 16)}...</h2>
|
||||
<p class="meta">
|
||||
Started: ${new Date(s.started_at).toLocaleString()} •
|
||||
Framework: ${s.framework || '-'} •
|
||||
Host: ${s.host || '-'} •
|
||||
Duration: ${duration}
|
||||
</p>
|
||||
<h2>Session <span style="font-family:var(--font-mono);font-size:1.1rem;color:var(--accent)">${escapeHTML(sessionID.substring(0, 16))}...</span></h2>
|
||||
<div class="meta">
|
||||
<span class="meta-item"><span class="meta-label">Started</span> ${escapeHTML(new Date(s.started_at).toLocaleString())}</span>
|
||||
<span class="meta-item"><span class="meta-label">Framework</span> ${escapeHTML(s.framework || '-')}</span>
|
||||
<span class="meta-item"><span class="meta-label">Host</span> ${escapeHTML(s.host || '-')}</span>
|
||||
<span class="meta-item"><span class="meta-label">Duration</span> ${escapeHTML(duration)}</span>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section-title">Runs <span class="count">${runs.length}</span></div>
|
||||
<div class="table-container">
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Run ID</th>
|
||||
<th>Status</th>
|
||||
<th>Spans</th>
|
||||
<th>Duration</th>
|
||||
<th>Started</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
${runs.map(r => {
|
||||
const runDuration = r.ended_at
|
||||
? formatDuration(new Date(r.ended_at) - new Date(r.started_at))
|
||||
: '-';
|
||||
return `
|
||||
<tr class="clickable" data-run="${escapeHTML(r.run_id)}">
|
||||
<td class="id-cell">${escapeHTML(r.run_id.substring(0, 12))}...</td>
|
||||
<td>${statusIcon(r.status)}</td>
|
||||
<td>${r.span_count}</td>
|
||||
<td>${escapeHTML(runDuration)}</td>
|
||||
<td>${escapeHTML(new Date(r.started_at).toLocaleTimeString())}</td>
|
||||
</tr>
|
||||
`;
|
||||
}).join('') || '<tr><td colspan="5" class="empty-state">No runs</td></tr>'}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<h3>Runs (${runs.length})</h3>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Run ID</th>
|
||||
<th>Status</th>
|
||||
<th>Spans</th>
|
||||
<th>Duration</th>
|
||||
<th>Started</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
${runs.map(r => {
|
||||
const dur = r.ended_at
|
||||
? formatDuration(new Date(r.ended_at) - new Date(r.started_at))
|
||||
: '-';
|
||||
return `
|
||||
<tr class="clickable" data-run="${r.run_id}">
|
||||
<td>${r.run_id.substring(0, 12)}...</td>
|
||||
<td>${statusIcon(r.status)} ${r.status}</td>
|
||||
<td>${r.span_count}</td>
|
||||
<td>${dur}</td>
|
||||
<td>${new Date(r.started_at).toLocaleTimeString()}</td>
|
||||
</tr>
|
||||
`;
|
||||
}).join('') || '<tr><td colspan="5" class="empty-state">No runs</td></tr>'}
|
||||
</tbody>
|
||||
</table>
|
||||
`;
|
||||
|
||||
document.querySelectorAll('tr.clickable').forEach(row => {
|
||||
@@ -199,51 +347,51 @@
|
||||
});
|
||||
}
|
||||
|
||||
// Run detail
|
||||
async function renderRun(runID) {
|
||||
const data = await api('/v1/runs/' + runID);
|
||||
const r = data.run;
|
||||
const spans = data.spans || [];
|
||||
|
||||
const duration = r.ended_at
|
||||
? formatDuration(new Date(r.ended_at) - new Date(r.started_at))
|
||||
: 'ongoing';
|
||||
|
||||
app.innerHTML = `
|
||||
<a href="/sessions/${r.session_id}" class="back-link">← Back to Session</a>
|
||||
<a href="/sessions/${escapeHTML(r.session_id)}" class="back-link">← Back to Session</a>
|
||||
<div class="page-header">
|
||||
<h2>Run ${runID.substring(0, 16)}... ${statusIcon(r.status)}</h2>
|
||||
<p class="meta">
|
||||
Started: ${new Date(r.started_at).toLocaleString()} •
|
||||
Duration: ${duration}
|
||||
</p>
|
||||
<h2>Run <span style="font-family:var(--font-mono);font-size:1.1rem;color:var(--accent)">${escapeHTML(runID.substring(0, 16))}...</span> ${statusIcon(r.status)}</h2>
|
||||
<div class="meta">
|
||||
<span class="meta-item"><span class="meta-label">Started</span> ${escapeHTML(new Date(r.started_at).toLocaleString())}</span>
|
||||
<span class="meta-item"><span class="meta-label">Duration</span> ${escapeHTML(duration)}</span>
|
||||
</div>
|
||||
</div>
|
||||
<h3>Spans (${spans.length})</h3>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Name</th>
|
||||
<th>Kind</th>
|
||||
<th>Status</th>
|
||||
<th>Duration</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody id="spans-body">
|
||||
${spans.map((sp, i) => `
|
||||
<tr class="expandable" data-index="${i}">
|
||||
<td><span class="expand-icon">▶</span>${sp.name}</td>
|
||||
<td>${sp.kind}</td>
|
||||
<td>${statusIcon(sp.status)}</td>
|
||||
<td>${formatDuration(sp.duration_ms)}</td>
|
||||
<div class="section-title">Spans <span class="count">${spans.length}</span></div>
|
||||
<div class="table-container">
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Name</th>
|
||||
<th>Kind</th>
|
||||
<th>Status</th>
|
||||
<th>Duration</th>
|
||||
</tr>
|
||||
<tr class="span-detail-row" data-index="${i}" style="display:none">
|
||||
<td colspan="4">
|
||||
<div class="span-details">${JSON.stringify(sp.payload, null, 2)}</div>
|
||||
</td>
|
||||
</tr>
|
||||
`).join('') || '<tr><td colspan="4" class="empty-state">No spans</td></tr>'}
|
||||
</tbody>
|
||||
</table>
|
||||
</thead>
|
||||
<tbody id="spans-body">
|
||||
${spans.map((sp, i) => `
|
||||
<tr class="expandable" data-index="${i}">
|
||||
<td><span class="expand-icon">▶</span>${escapeHTML(sp.name)}</td>
|
||||
<td>${escapeHTML(sp.kind)}</td>
|
||||
<td>${statusIcon(sp.status)}</td>
|
||||
<td>${escapeHTML(formatDuration(sp.duration_ms))}</td>
|
||||
</tr>
|
||||
<tr class="span-detail-row" data-index="${i}" style="display:none">
|
||||
<td colspan="4">
|
||||
<div class="span-details">${escapeHTML(JSON.stringify(sp.payload, null, 2))}</div>
|
||||
</td>
|
||||
</tr>
|
||||
`).join('') || '<tr><td colspan="4" class="empty-state">No spans</td></tr>'}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
`;
|
||||
|
||||
document.querySelectorAll('tr.expandable').forEach(row => {
|
||||
@@ -267,6 +415,497 @@
|
||||
});
|
||||
}
|
||||
|
||||
// Start
|
||||
async function renderOpenClaw() {
|
||||
app.innerHTML = '<div class="page-header"><h2>OpenClaw</h2></div><p class="empty-state">Loading...</p>';
|
||||
|
||||
openclawUnsubscribe = subscribeWS(handleOpenClawWS);
|
||||
|
||||
try {
|
||||
const data = await api('/v1/events?event_type=openclaw.snapshot&limit=100');
|
||||
mergeOpenClawEvents(data.events || []);
|
||||
if (isCurrentPath('/openclaw')) {
|
||||
renderOpenClawGrid();
|
||||
}
|
||||
} catch (e) {
|
||||
if (isCurrentPath('/openclaw')) {
|
||||
app.innerHTML = `<div class="page-header"><h2>OpenClaw</h2></div><p class="empty-state">Error loading: ${escapeHTML(e.message)}</p>`;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
function handleOpenClawWS(msg) {
|
||||
if (msg.type !== 'message') {
|
||||
return;
|
||||
}
|
||||
|
||||
if (getEnvelopeType(msg.data) !== 'openclaw.snapshot') {
|
||||
return;
|
||||
}
|
||||
|
||||
mergeOpenClawEvents([msg.data]);
|
||||
|
||||
if (isCurrentPath('/openclaw')) {
|
||||
renderOpenClawGrid();
|
||||
}
|
||||
if (isCurrentPath('/agents')) {
|
||||
renderAgentVMStrip();
|
||||
}
|
||||
}
|
||||
|
||||
function mergeOpenClawEvents(events) {
|
||||
for (const evt of events) {
|
||||
const payload = getEnvelopePayload(evt);
|
||||
const instance = payload.instance || {};
|
||||
if (!instance.name) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const existing = openclawState.instances[instance.name];
|
||||
const nextTS = new Date(getEnvelopeTS(evt) || 0).getTime();
|
||||
const currentTS = existing ? new Date(getEnvelopeTS(existing) || 0).getTime() : 0;
|
||||
if (!existing || nextTS >= currentTS) {
|
||||
openclawState.instances[instance.name] = evt;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
function renderOpenClawGrid() {
|
||||
const names = Object.keys(openclawState.instances).sort();
|
||||
|
||||
if (names.length === 0) {
|
||||
app.innerHTML = `
|
||||
<div class="page-header"><h2>OpenClaw</h2></div>
|
||||
<p class="empty-state">No OpenClaw instances found</p>
|
||||
`;
|
||||
return;
|
||||
}
|
||||
|
||||
app.innerHTML = `
|
||||
<div class="page-header">
|
||||
<h2>OpenClaw <span class="live-indicator"><span class="live-dot"></span>Live</span></h2>
|
||||
</div>
|
||||
<div class="vm-grid">
|
||||
${names.map(name => {
|
||||
const evt = openclawState.instances[name];
|
||||
const payload = getEnvelopePayload(evt);
|
||||
const inst = payload.instance || {};
|
||||
const host = payload.host || {};
|
||||
const guest = payload.guest;
|
||||
const issues = payload.issues;
|
||||
|
||||
return `
|
||||
<div class="vm-card">
|
||||
<div class="vm-card-header">
|
||||
<h3>${escapeHTML(inst.name || name)}</h3>
|
||||
<div class="vm-status ${host.state === 'running' ? 'running' : 'stopped'}">
|
||||
${host.state === 'running' ? 'Running' : 'Stopped'}
|
||||
</div>
|
||||
</div>
|
||||
<div class="vm-updated">Updated ${escapeHTML(relativeTime(getEnvelopeTS(evt)))}</div>
|
||||
<table class="vm-stats">
|
||||
<tr><td>Host</td><td>${escapeHTML(inst.host || '-')}</td></tr>
|
||||
<tr><td>Domain</td><td>${escapeHTML(inst.domain || '-')}</td></tr>
|
||||
<tr><td>vCPUs</td><td>${host.vcpus || '-'}</td></tr>
|
||||
<tr><td>Memory</td><td>${escapeHTML(formatBytes(host.memory_kib ? host.memory_kib * 1024 : 0) || '-')}</td></tr>
|
||||
<tr><td>Disk</td><td>${escapeHTML(formatBytes(host.disk_actual_bytes) || '-')}</td></tr>
|
||||
<tr><td>Autostart</td><td>${host.autostart ? 'Yes' : 'No'}</td></tr>
|
||||
${guest ? `
|
||||
<tr><td>Gateway</td><td class="${guest.service_active ? 'status-success' : 'status-error'}">${guest.service_active ? 'Active' : 'Inactive'}</td></tr>
|
||||
<tr><td>HTTP</td><td class="${guest.http_status === 200 ? 'status-success' : 'status-error'}">${guest.http_status || 'N/A'}</td></tr>
|
||||
<tr><td>Version</td><td>${escapeHTML(guest.version || '-')}</td></tr>
|
||||
<tr><td>Guest Memory</td><td>${guest.memory_percent !== undefined ? guest.memory_percent.toFixed(1) : '-'}%</td></tr>
|
||||
<tr><td>Guest Disk</td><td>${guest.disk_percent !== undefined ? guest.disk_percent.toFixed(1) : '-'}%</td></tr>
|
||||
<tr><td>Load</td><td>${guest.load_average !== undefined ? guest.load_average.toFixed(2) : '-'}</td></tr>
|
||||
<tr><td>Uptime</td><td>${escapeHTML(guest.service_uptime || '-')}</td></tr>
|
||||
` : ''}
|
||||
</table>
|
||||
${issues ? `
|
||||
<div class="vm-issues">
|
||||
${Object.entries(issues).filter(([, value]) => value).map(([key]) => `
|
||||
<span class="issue ${escapeHTML(key)}">${escapeHTML(key.replace(/_/g, ' '))}</span>
|
||||
`).join('')}
|
||||
</div>
|
||||
` : ''}
|
||||
</div>
|
||||
`;
|
||||
}).join('')}
|
||||
</div>
|
||||
`;
|
||||
}
|
||||
|
||||
function createAgentsState() {
|
||||
return {
|
||||
events: [],
|
||||
eventIDs: new Set(),
|
||||
stats: {
|
||||
messages: 0,
|
||||
tools: 0,
|
||||
errors: 0,
|
||||
toolCounts: {},
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
function getVMStatus() {
|
||||
const names = ['zap', 'orb', 'sun'];
|
||||
return names.map(name => {
|
||||
const snapshot = openclawState.instances[name];
|
||||
const payload = snapshot ? getEnvelopePayload(snapshot) : {};
|
||||
const host = payload.host || {};
|
||||
return {
|
||||
name,
|
||||
active: host.state === 'running',
|
||||
};
|
||||
});
|
||||
}
|
||||
|
||||
async function renderAgents() {
|
||||
agentsState = createAgentsState();
|
||||
|
||||
app.innerHTML = `
|
||||
<div class="page-header">
|
||||
<h2>Agents <span class="live-indicator"><span class="live-dot"></span>Live</span></h2>
|
||||
</div>
|
||||
<div class="vm-strip" id="agents-vm-strip"></div>
|
||||
<div class="agents-layout">
|
||||
<div class="timeline" id="agents-timeline">
|
||||
<p class="empty-state">Loading agent activity...</p>
|
||||
</div>
|
||||
<div class="stats-panel">
|
||||
<div class="stat-card">
|
||||
<div class="stat-card-title">Messages</div>
|
||||
<div class="stat-card-value" id="stat-messages">0</div>
|
||||
<div class="stat-card-sub">received and sent</div>
|
||||
</div>
|
||||
<div class="stat-card">
|
||||
<div class="stat-card-title">Tool Calls</div>
|
||||
<div class="stat-card-value" id="stat-tools">0</div>
|
||||
</div>
|
||||
<div class="stat-card">
|
||||
<div class="stat-card-title">Errors</div>
|
||||
<div class="stat-card-value" id="stat-errors">0</div>
|
||||
</div>
|
||||
<div class="stat-card">
|
||||
<div class="stat-card-title">Top Tools</div>
|
||||
<ul class="stat-list" id="stat-top-tools">
|
||||
<li style="color:var(--text-dim);font-size:0.8rem">No data yet</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
`;
|
||||
|
||||
renderAgentVMStrip();
|
||||
|
||||
try {
|
||||
const [snapshots, events] = await Promise.all([
|
||||
api('/v1/events?event_type=openclaw.snapshot&limit=100').catch(() => ({ events: [] })),
|
||||
api('/v1/events?framework=openclaw&limit=200'),
|
||||
]);
|
||||
|
||||
if (!isCurrentPath('/agents')) {
|
||||
return;
|
||||
}
|
||||
|
||||
mergeOpenClawEvents(snapshots.events || []);
|
||||
renderAgentVMStrip();
|
||||
addAgentEvents((events.events || []).slice().reverse());
|
||||
renderAgentTimeline();
|
||||
renderAgentStats();
|
||||
} catch (e) {
|
||||
const timeline = document.getElementById('agents-timeline');
|
||||
if (timeline) {
|
||||
timeline.innerHTML = `<p class="empty-state">Error loading agent activity: ${escapeHTML(e.message)}</p>`;
|
||||
}
|
||||
}
|
||||
|
||||
agentsUnsubscribe = subscribeWS(handleAgentsWS);
|
||||
}
|
||||
|
||||
function renderAgentVMStrip() {
|
||||
const strip = document.getElementById('agents-vm-strip');
|
||||
if (!strip) {
|
||||
return;
|
||||
}
|
||||
|
||||
const vms = getVMStatus();
|
||||
strip.innerHTML = vms.map(vm => `
|
||||
<div class="vm-pill ${vm.active ? 'active' : 'inactive'}">
|
||||
<span class="vm-pill-dot"></span>
|
||||
<span class="vm-pill-name">${escapeHTML(vm.name)}</span>
|
||||
<span class="vm-pill-label">${vm.active ? 'online' : 'offline'}</span>
|
||||
</div>
|
||||
`).join('');
|
||||
}
|
||||
|
||||
function handleAgentsWS(msg) {
|
||||
if (msg.type !== 'message') {
|
||||
return;
|
||||
}
|
||||
|
||||
const eventType = getEnvelopeType(msg.data);
|
||||
if (eventType === 'openclaw.snapshot') {
|
||||
mergeOpenClawEvents([msg.data]);
|
||||
renderAgentVMStrip();
|
||||
return;
|
||||
}
|
||||
|
||||
const framework = getEnvelopeSource(msg.data).framework || msg.data.source_framework;
|
||||
if (framework !== 'openclaw') {
|
||||
return;
|
||||
}
|
||||
|
||||
addAgentEvents([msg.data]);
|
||||
renderAgentTimeline();
|
||||
renderAgentStats();
|
||||
}
|
||||
|
||||
function addAgentEvents(events) {
|
||||
let changed = false;
|
||||
|
||||
for (const evt of events) {
|
||||
const id = getRecordID(evt);
|
||||
if (!id || agentsState.eventIDs.has(id)) {
|
||||
continue;
|
||||
}
|
||||
agentsState.eventIDs.add(id);
|
||||
agentsState.events.push(evt);
|
||||
changed = true;
|
||||
}
|
||||
|
||||
if (!changed) {
|
||||
return;
|
||||
}
|
||||
|
||||
agentsState.events.sort((a, b) => new Date(getEnvelopeTS(a)).getTime() - new Date(getEnvelopeTS(b)).getTime());
|
||||
|
||||
while (agentsState.events.length > 500) {
|
||||
const removed = agentsState.events.shift();
|
||||
agentsState.eventIDs.delete(getRecordID(removed));
|
||||
}
|
||||
|
||||
recomputeAgentStats();
|
||||
}
|
||||
|
||||
function recomputeAgentStats() {
|
||||
const stats = {
|
||||
messages: 0,
|
||||
tools: 0,
|
||||
errors: 0,
|
||||
toolCounts: {},
|
||||
};
|
||||
|
||||
for (const evt of agentsState.events) {
|
||||
const eventType = getEnvelopeType(evt);
|
||||
const attrs = getEnvelopeAttributes(evt);
|
||||
|
||||
if (eventType === 'run.start' || eventType === 'run.end') {
|
||||
stats.messages++;
|
||||
}
|
||||
|
||||
if (eventType === 'span.end' && attrs.span_kind === 'tool') {
|
||||
stats.tools++;
|
||||
const toolName = attrs.name || 'unknown';
|
||||
stats.toolCounts[toolName] = (stats.toolCounts[toolName] || 0) + 1;
|
||||
}
|
||||
|
||||
if (eventType === 'error') {
|
||||
stats.errors++;
|
||||
}
|
||||
}
|
||||
|
||||
agentsState.stats = stats;
|
||||
}
|
||||
|
||||
function getEventIcon(eventType) {
|
||||
switch (eventType) {
|
||||
case 'run.start':
|
||||
return '<div class="event-icon message-in">↓</div>';
|
||||
case 'run.end':
|
||||
return '<div class="event-icon message-out">↑</div>';
|
||||
case 'span.start':
|
||||
case 'span.end':
|
||||
return '<div class="event-icon tool">⚙</div>';
|
||||
case 'error':
|
||||
return '<div class="event-icon error">!</div>';
|
||||
case 'session.start':
|
||||
case 'session.end':
|
||||
return '<div class="event-icon session">○</div>';
|
||||
default:
|
||||
return '<div class="event-icon internal">·</div>';
|
||||
}
|
||||
}
|
||||
|
||||
function getEventLabel(eventType) {
|
||||
const labels = {
|
||||
'session.start': 'Session Started',
|
||||
'session.end': 'Session Ended',
|
||||
'run.start': 'Message Received',
|
||||
'run.end': 'Response Sent',
|
||||
'span.start': 'Span Started',
|
||||
'span.end': 'Span Completed',
|
||||
'error': 'Error',
|
||||
'metric.snapshot': 'Metric',
|
||||
};
|
||||
return labels[eventType] || eventType;
|
||||
}
|
||||
|
||||
function getVMName(evt) {
|
||||
return getEnvelopeSource(evt).client_id || evt.client_id || 'unknown';
|
||||
}
|
||||
|
||||
function getVMClassName(vmName) {
|
||||
const normalized = String(vmName || 'unknown').toLowerCase();
|
||||
return ['zap', 'orb', 'sun'].includes(normalized) ? normalized : 'unknown';
|
||||
}
|
||||
|
||||
function getEventBody(evt) {
|
||||
const eventType = getEnvelopeType(evt);
|
||||
const payload = getEnvelopePayload(evt);
|
||||
const attrs = getEnvelopeAttributes(evt);
|
||||
const correlation = getEnvelopeCorrelation(evt);
|
||||
|
||||
if (eventType === 'span.start' || eventType === 'span.end') {
|
||||
const name = attrs.name || attrs.span_kind || 'unknown span';
|
||||
const duration = payload.duration_ms !== undefined && payload.duration_ms !== null
|
||||
? ` <span class="timeline-duration">${escapeHTML(formatDuration(payload.duration_ms))}</span>`
|
||||
: '';
|
||||
return `<div class="timeline-event-body tool-name">${escapeHTML(name)}${duration}</div>`;
|
||||
}
|
||||
|
||||
if (eventType === 'run.start') {
|
||||
const preview = payload.message_preview || payload.message || '';
|
||||
if (!preview) {
|
||||
return '';
|
||||
}
|
||||
const trimmed = preview.length > 140 ? preview.slice(0, 140) + '...' : preview;
|
||||
return `<div class="timeline-event-body message-preview">"${escapeHTML(trimmed)}"</div>`;
|
||||
}
|
||||
|
||||
if (eventType === 'run.end') {
|
||||
return `<div class="timeline-event-body">${statusIcon(payload.status || 'unknown')}</div>`;
|
||||
}
|
||||
|
||||
if (eventType === 'error') {
|
||||
const errPayload = payload.error || {};
|
||||
const errType = errPayload.type || 'error';
|
||||
const message = errPayload.message || payload.message || 'unknown';
|
||||
return `<div class="timeline-event-body error-message">${escapeHTML(errType + ': ' + message)}</div>`;
|
||||
}
|
||||
|
||||
if (eventType === 'session.start' || eventType === 'session.end') {
|
||||
return correlation.session_id
|
||||
? `<div class="timeline-event-body">session ${escapeHTML(correlation.session_id)}</div>`
|
||||
: '';
|
||||
}
|
||||
|
||||
return '';
|
||||
}
|
||||
|
||||
function getEventDetails(evt) {
|
||||
const details = {};
|
||||
const correlation = getEnvelopeCorrelation(evt);
|
||||
const attributes = getEnvelopeAttributes(evt);
|
||||
const payload = getEnvelopePayload(evt);
|
||||
|
||||
if (Object.keys(correlation).length > 0) {
|
||||
details.correlation = correlation;
|
||||
}
|
||||
if (Object.keys(attributes).length > 0) {
|
||||
details.attributes = attributes;
|
||||
}
|
||||
if (Object.keys(payload).length > 0) {
|
||||
details.payload = payload;
|
||||
}
|
||||
|
||||
if (Object.keys(details).length === 0) {
|
||||
return '';
|
||||
}
|
||||
|
||||
return JSON.stringify(details, null, 2);
|
||||
}
|
||||
|
||||
function renderAgentTimeline() {
|
||||
const timeline = document.getElementById('agents-timeline');
|
||||
if (!timeline) {
|
||||
return;
|
||||
}
|
||||
|
||||
const recent = agentsState.events.slice(-100).reverse();
|
||||
if (recent.length === 0) {
|
||||
timeline.innerHTML = '<p class="empty-state">Waiting for agent activity...</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
timeline.innerHTML = recent.map((evt, index) => {
|
||||
const eventType = getEnvelopeType(evt);
|
||||
const vmName = getVMName(evt);
|
||||
const vmClass = getVMClassName(vmName);
|
||||
const details = getEventDetails(evt);
|
||||
const detailHTML = details ? `<div class="timeline-detail">${escapeHTML(details)}</div>` : '';
|
||||
const expandHTML = details ? '<button class="timeline-expand-hint" type="button">details</button>' : '';
|
||||
|
||||
return `
|
||||
<div class="timeline-event" data-index="${index}">
|
||||
<div class="timeline-event-header">
|
||||
${getEventIcon(eventType)}
|
||||
<span class="timeline-vm-tag ${vmClass}">${escapeHTML(vmName)}</span>
|
||||
<span class="timeline-event-type">${escapeHTML(getEventLabel(eventType))}</span>
|
||||
<span class="timeline-event-time">${escapeHTML(new Date(getEnvelopeTS(evt)).toLocaleTimeString())}</span>
|
||||
</div>
|
||||
${getEventBody(evt)}
|
||||
${expandHTML}
|
||||
${detailHTML}
|
||||
</div>
|
||||
`;
|
||||
}).join('');
|
||||
|
||||
timeline.querySelectorAll('.timeline-expand-hint').forEach(button => {
|
||||
button.addEventListener('click', () => {
|
||||
button.parentElement.classList.toggle('expanded');
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
function renderAgentStats() {
|
||||
const stats = agentsState.stats;
|
||||
|
||||
const messagesEl = document.getElementById('stat-messages');
|
||||
if (messagesEl) {
|
||||
messagesEl.textContent = String(stats.messages);
|
||||
}
|
||||
|
||||
const toolsEl = document.getElementById('stat-tools');
|
||||
if (toolsEl) {
|
||||
toolsEl.textContent = String(stats.tools);
|
||||
}
|
||||
|
||||
const errorsEl = document.getElementById('stat-errors');
|
||||
if (errorsEl) {
|
||||
errorsEl.textContent = String(stats.errors);
|
||||
}
|
||||
|
||||
const list = document.getElementById('stat-top-tools');
|
||||
if (!list) {
|
||||
return;
|
||||
}
|
||||
|
||||
const topTools = Object.entries(stats.toolCounts)
|
||||
.sort((a, b) => b[1] - a[1])
|
||||
.slice(0, 8);
|
||||
|
||||
if (topTools.length === 0) {
|
||||
list.innerHTML = '<li style="color:var(--text-dim);font-size:0.8rem">No data yet</li>';
|
||||
return;
|
||||
}
|
||||
|
||||
list.innerHTML = topTools.map(([name, count]) => `
|
||||
<li>
|
||||
<span class="stat-list-name">${escapeHTML(name)}</span>
|
||||
<span class="stat-list-count">${count}</span>
|
||||
</li>
|
||||
`).join('');
|
||||
}
|
||||
|
||||
route();
|
||||
})();
|
||||
|
||||
@@ -4,11 +4,17 @@
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>agentmon</title>
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Syne:wght@400;500;600;700;800&family=Outfit:wght@300;400;500;600&family=Fira+Code:wght@400;500&display=swap" rel="stylesheet">
|
||||
<link rel="stylesheet" href="/static/style.css">
|
||||
</head>
|
||||
<body>
|
||||
<header>
|
||||
<h1><a href="/sessions">agentmon</a></h1>
|
||||
<div class="header-logo">
|
||||
<h1><a href="/sessions">agentmon<span class="logo-dot"></span></a></h1>
|
||||
</div>
|
||||
<nav><a href="/agents">Agents</a><a href="/openclaw">OpenClaw</a></nav>
|
||||
</header>
|
||||
<main id="app">
|
||||
<p>Loading...</p>
|
||||
|
||||
+870
-61
@@ -1,70 +1,280 @@
|
||||
* { box-sizing: border-box; margin: 0; padding: 0; }
|
||||
/* ============================================================
|
||||
agentmon — Refined Dark UI
|
||||
============================================================ */
|
||||
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
|
||||
background: #0d1117;
|
||||
color: #c9d1d9;
|
||||
line-height: 1.5;
|
||||
:root {
|
||||
--bg: #07090f;
|
||||
--surface: #0d1117;
|
||||
--surface-2: #121922;
|
||||
--card: rgba(13, 20, 32, 0.85);
|
||||
--border: #1c2637;
|
||||
--border-soft: rgba(28, 38, 55, 0.6);
|
||||
|
||||
--text: #c8d3e0;
|
||||
--text-dim: #4e6070;
|
||||
--text-bright: #e8eef4;
|
||||
|
||||
--accent: #22d3ee;
|
||||
--accent-dim: rgba(34, 211, 238, 0.08);
|
||||
--accent-glow: rgba(34, 211, 238, 0.2);
|
||||
|
||||
--success: #34d399;
|
||||
--error: #f87171;
|
||||
--warning: #fbbf24;
|
||||
--purple: #a78bfa;
|
||||
|
||||
--font-display: 'Syne', sans-serif;
|
||||
--font-body: 'Outfit', sans-serif;
|
||||
--font-mono: 'Fira Code', monospace;
|
||||
|
||||
--radius: 8px;
|
||||
--radius-lg: 12px;
|
||||
--radius-xl: 16px;
|
||||
}
|
||||
|
||||
/* ── Reset ─────────────────────────────────────────────────── */
|
||||
*, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
|
||||
|
||||
/* ── Base ──────────────────────────────────────────────────── */
|
||||
html { scroll-behavior: smooth; }
|
||||
|
||||
body {
|
||||
font-family: var(--font-body);
|
||||
font-size: 15px;
|
||||
background-color: var(--bg);
|
||||
background-image:
|
||||
radial-gradient(ellipse 80% 40% at 50% -20%, rgba(34, 211, 238, 0.04) 0%, transparent 70%),
|
||||
radial-gradient(circle at 1px 1px, rgba(34, 211, 238, 0.045) 1px, transparent 0);
|
||||
background-size: 100% 100%, 28px 28px;
|
||||
color: var(--text);
|
||||
line-height: 1.6;
|
||||
min-height: 100vh;
|
||||
-webkit-font-smoothing: antialiased;
|
||||
}
|
||||
|
||||
/* ── Header ────────────────────────────────────────────────── */
|
||||
header {
|
||||
background: #161b22;
|
||||
padding: 1rem 2rem;
|
||||
border-bottom: 1px solid #30363d;
|
||||
position: sticky;
|
||||
top: 0;
|
||||
z-index: 100;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
padding: 0 2rem;
|
||||
height: 54px;
|
||||
background: rgba(7, 9, 15, 0.82);
|
||||
backdrop-filter: blur(16px);
|
||||
-webkit-backdrop-filter: blur(16px);
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
|
||||
header::after {
|
||||
content: '';
|
||||
position: absolute;
|
||||
bottom: -1px;
|
||||
left: 0;
|
||||
right: 0;
|
||||
height: 1px;
|
||||
background: linear-gradient(90deg, transparent 0%, var(--accent) 40%, var(--accent) 60%, transparent 100%);
|
||||
opacity: 0.15;
|
||||
pointer-events: none;
|
||||
}
|
||||
|
||||
.header-logo {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
header h1 a {
|
||||
color: #58a6ff;
|
||||
font-family: var(--font-display);
|
||||
font-size: 1rem;
|
||||
font-weight: 800;
|
||||
color: var(--text-bright);
|
||||
text-decoration: none;
|
||||
letter-spacing: 0.08em;
|
||||
text-transform: uppercase;
|
||||
}
|
||||
|
||||
main {
|
||||
max-width: 1200px;
|
||||
margin: 0 auto;
|
||||
padding: 2rem;
|
||||
}
|
||||
|
||||
.back-link {
|
||||
.logo-dot {
|
||||
display: inline-block;
|
||||
margin-bottom: 1rem;
|
||||
color: #58a6ff;
|
||||
text-decoration: none;
|
||||
width: 6px;
|
||||
height: 6px;
|
||||
border-radius: 50%;
|
||||
background: var(--accent);
|
||||
box-shadow: 0 0 8px var(--accent-glow);
|
||||
margin-left: 2px;
|
||||
vertical-align: middle;
|
||||
position: relative;
|
||||
top: -1px;
|
||||
}
|
||||
|
||||
.back-link:hover { text-decoration: underline; }
|
||||
header nav {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.25rem;
|
||||
}
|
||||
|
||||
header nav a {
|
||||
font-size: 0.8rem;
|
||||
font-weight: 500;
|
||||
color: var(--text-dim);
|
||||
text-decoration: none;
|
||||
padding: 0.375rem 0.875rem;
|
||||
border-radius: var(--radius);
|
||||
letter-spacing: 0.04em;
|
||||
transition: color 0.15s, background 0.15s;
|
||||
}
|
||||
|
||||
header nav a:hover {
|
||||
color: var(--text-bright);
|
||||
background: var(--surface-2);
|
||||
}
|
||||
|
||||
/* ── Main ──────────────────────────────────────────────────── */
|
||||
main {
|
||||
max-width: 1240px;
|
||||
margin: 0 auto;
|
||||
padding: 2.5rem 2rem;
|
||||
}
|
||||
|
||||
/* Page entry animation */
|
||||
@keyframes fadeUp {
|
||||
from { opacity: 0; transform: translateY(10px); }
|
||||
to { opacity: 1; transform: translateY(0); }
|
||||
}
|
||||
|
||||
main > * {
|
||||
animation: fadeUp 0.28s ease both;
|
||||
}
|
||||
|
||||
/* ── Back link ─────────────────────────────────────────────── */
|
||||
.back-link {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.4rem;
|
||||
margin-bottom: 1.75rem;
|
||||
color: var(--text-dim);
|
||||
text-decoration: none;
|
||||
font-size: 0.8rem;
|
||||
font-weight: 500;
|
||||
letter-spacing: 0.03em;
|
||||
transition: color 0.15s;
|
||||
}
|
||||
|
||||
.back-link:hover { color: var(--accent); }
|
||||
|
||||
/* ── Page header ───────────────────────────────────────────── */
|
||||
.page-header {
|
||||
margin-bottom: 1.5rem;
|
||||
margin-bottom: 2rem;
|
||||
padding-bottom: 1.5rem;
|
||||
border-bottom: 1px solid var(--border-soft);
|
||||
}
|
||||
|
||||
.page-header h2 {
|
||||
font-size: 1.5rem;
|
||||
margin-bottom: 0.5rem;
|
||||
font-family: var(--font-display);
|
||||
font-size: 1.55rem;
|
||||
font-weight: 700;
|
||||
color: var(--text-bright);
|
||||
margin-bottom: 0.6rem;
|
||||
letter-spacing: -0.02em;
|
||||
}
|
||||
|
||||
.meta { color: #8b949e; font-size: 0.9rem; }
|
||||
.meta {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.4rem 1.25rem;
|
||||
color: var(--text-dim);
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
.meta-item {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.3rem;
|
||||
}
|
||||
|
||||
.meta-label {
|
||||
font-size: 0.72rem;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.06em;
|
||||
}
|
||||
|
||||
/* ── Section title ─────────────────────────────────────────── */
|
||||
.section-title {
|
||||
font-family: var(--font-display);
|
||||
font-size: 1rem;
|
||||
font-weight: 700;
|
||||
color: var(--text-bright);
|
||||
margin-bottom: 1rem;
|
||||
letter-spacing: 0.01em;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.6rem;
|
||||
}
|
||||
|
||||
.section-title .count {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.72rem;
|
||||
font-weight: 500;
|
||||
color: var(--text-dim);
|
||||
background: var(--surface-2);
|
||||
border: 1px solid var(--border);
|
||||
padding: 0.1rem 0.45rem;
|
||||
border-radius: 999px;
|
||||
letter-spacing: 0.04em;
|
||||
}
|
||||
|
||||
/* ── Filters ───────────────────────────────────────────────── */
|
||||
.filters {
|
||||
display: flex;
|
||||
gap: 1rem;
|
||||
gap: 0.75rem;
|
||||
margin-bottom: 1.5rem;
|
||||
flex-wrap: wrap;
|
||||
align-items: flex-end;
|
||||
}
|
||||
|
||||
.filters label {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.25rem;
|
||||
font-size: 0.85rem;
|
||||
color: #8b949e;
|
||||
gap: 0.35rem;
|
||||
font-size: 0.7rem;
|
||||
font-weight: 600;
|
||||
color: var(--text-dim);
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.08em;
|
||||
}
|
||||
|
||||
.filters input, .filters select {
|
||||
background: #21262d;
|
||||
border: 1px solid #30363d;
|
||||
color: #c9d1d9;
|
||||
padding: 0.5rem;
|
||||
border-radius: 4px;
|
||||
.filters input,
|
||||
.filters select {
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
color: var(--text);
|
||||
padding: 0.45rem 0.75rem;
|
||||
border-radius: var(--radius);
|
||||
font-family: var(--font-body);
|
||||
font-size: 0.85rem;
|
||||
transition: border-color 0.15s, box-shadow 0.15s;
|
||||
outline: none;
|
||||
min-width: 140px;
|
||||
}
|
||||
|
||||
.filters input:focus,
|
||||
.filters select:focus {
|
||||
border-color: var(--accent);
|
||||
box-shadow: 0 0 0 3px var(--accent-dim);
|
||||
}
|
||||
|
||||
.filters select option {
|
||||
background: var(--surface-2);
|
||||
}
|
||||
|
||||
/* ── Table container ───────────────────────────────────────── */
|
||||
.table-container {
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius-lg);
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
table {
|
||||
@@ -74,55 +284,654 @@ table {
|
||||
|
||||
th, td {
|
||||
text-align: left;
|
||||
padding: 0.75rem 1rem;
|
||||
border-bottom: 1px solid #21262d;
|
||||
padding: 0.7rem 1.25rem;
|
||||
}
|
||||
|
||||
th {
|
||||
background: #161b22;
|
||||
font-weight: 600;
|
||||
font-size: 0.85rem;
|
||||
background: var(--surface-2);
|
||||
font-size: 0.68rem;
|
||||
font-weight: 700;
|
||||
text-transform: uppercase;
|
||||
color: #8b949e;
|
||||
letter-spacing: 0.1em;
|
||||
color: var(--text-dim);
|
||||
border-bottom: 1px solid var(--border);
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
tr:hover { background: #161b22; }
|
||||
td {
|
||||
font-size: 0.875rem;
|
||||
border-bottom: 1px solid var(--border-soft);
|
||||
color: var(--text);
|
||||
}
|
||||
|
||||
tr:last-child td { border-bottom: none; }
|
||||
|
||||
tr.clickable { cursor: pointer; }
|
||||
|
||||
.status-success { color: #3fb950; }
|
||||
.status-error { color: #f85149; }
|
||||
.status-unknown { color: #d29922; }
|
||||
tr.clickable:hover td {
|
||||
background: var(--surface-2);
|
||||
color: var(--text-bright);
|
||||
}
|
||||
|
||||
tr.clickable:hover td:first-child {
|
||||
border-left: 2px solid var(--accent);
|
||||
padding-left: calc(1.25rem - 2px);
|
||||
}
|
||||
|
||||
/* ── Status badges ─────────────────────────────────────────── */
|
||||
.status-badge {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.35rem;
|
||||
padding: 0.2rem 0.6rem;
|
||||
border-radius: 999px;
|
||||
font-size: 0.72rem;
|
||||
font-weight: 600;
|
||||
letter-spacing: 0.04em;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.status-dot {
|
||||
width: 5px;
|
||||
height: 5px;
|
||||
border-radius: 50%;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.status-success {
|
||||
color: var(--success);
|
||||
background: rgba(52, 211, 153, 0.1);
|
||||
border: 1px solid rgba(52, 211, 153, 0.2);
|
||||
}
|
||||
.status-success .status-dot { background: var(--success); }
|
||||
|
||||
.status-error {
|
||||
color: var(--error);
|
||||
background: rgba(248, 113, 113, 0.1);
|
||||
border: 1px solid rgba(248, 113, 113, 0.2);
|
||||
}
|
||||
.status-error .status-dot { background: var(--error); }
|
||||
|
||||
.status-unknown {
|
||||
color: var(--warning);
|
||||
background: rgba(251, 191, 36, 0.1);
|
||||
border: 1px solid rgba(251, 191, 36, 0.2);
|
||||
}
|
||||
.status-unknown .status-dot { background: var(--warning); }
|
||||
|
||||
/* ── Monospace cells ───────────────────────────────────────── */
|
||||
.id-cell {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.78rem;
|
||||
color: var(--accent);
|
||||
letter-spacing: 0.02em;
|
||||
}
|
||||
|
||||
/* ── Load more ─────────────────────────────────────────────── */
|
||||
.load-more {
|
||||
display: block;
|
||||
width: 100%;
|
||||
margin-top: 1rem;
|
||||
padding: 0.75rem;
|
||||
background: #21262d;
|
||||
border: 1px solid #30363d;
|
||||
color: #c9d1d9;
|
||||
margin-top: 0.875rem;
|
||||
padding: 0.7rem;
|
||||
background: transparent;
|
||||
border: 1px dashed var(--border);
|
||||
color: var(--text-dim);
|
||||
cursor: pointer;
|
||||
border-radius: 4px;
|
||||
border-radius: var(--radius);
|
||||
font-family: var(--font-body);
|
||||
font-size: 0.8rem;
|
||||
font-weight: 500;
|
||||
letter-spacing: 0.05em;
|
||||
text-transform: uppercase;
|
||||
transition: border-color 0.15s, color 0.15s, background 0.15s;
|
||||
}
|
||||
|
||||
.load-more:hover { background: #30363d; }
|
||||
.load-more:hover {
|
||||
border-color: var(--accent);
|
||||
color: var(--accent);
|
||||
background: var(--accent-dim);
|
||||
}
|
||||
|
||||
/* ── Span expand ───────────────────────────────────────────── */
|
||||
.expandable { cursor: pointer; }
|
||||
.expand-icon { margin-right: 0.5rem; }
|
||||
|
||||
.expand-icon {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
width: 16px;
|
||||
height: 16px;
|
||||
margin-right: 0.5rem;
|
||||
color: var(--text-dim);
|
||||
font-size: 0.6rem;
|
||||
transition: transform 0.18s ease;
|
||||
}
|
||||
|
||||
.span-details {
|
||||
background: #161b22;
|
||||
padding: 1rem;
|
||||
margin: 0.5rem 0;
|
||||
border-radius: 4px;
|
||||
font-family: monospace;
|
||||
font-size: 0.85rem;
|
||||
background: #020508;
|
||||
padding: 1.25rem;
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.78rem;
|
||||
white-space: pre-wrap;
|
||||
word-break: break-all;
|
||||
color: #7a9ab5;
|
||||
line-height: 1.75;
|
||||
border-top: 1px solid var(--border);
|
||||
}
|
||||
|
||||
/* ── Empty state ───────────────────────────────────────────── */
|
||||
.empty-state {
|
||||
text-align: center;
|
||||
padding: 3rem;
|
||||
color: #8b949e;
|
||||
padding: 4rem 2rem;
|
||||
color: var(--text-dim);
|
||||
font-size: 0.875rem;
|
||||
letter-spacing: 0.02em;
|
||||
}
|
||||
|
||||
/* ── Live indicator ────────────────────────────────────────── */
|
||||
.live-indicator {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.4rem;
|
||||
font-size: 0.68rem;
|
||||
font-weight: 700;
|
||||
letter-spacing: 0.1em;
|
||||
text-transform: uppercase;
|
||||
color: var(--success);
|
||||
}
|
||||
|
||||
.live-dot {
|
||||
width: 6px;
|
||||
height: 6px;
|
||||
border-radius: 50%;
|
||||
background: var(--success);
|
||||
box-shadow: 0 0 6px rgba(52, 211, 153, 0.5);
|
||||
animation: livePulse 2s ease-in-out infinite;
|
||||
}
|
||||
|
||||
@keyframes livePulse {
|
||||
0%, 100% { opacity: 1; box-shadow: 0 0 6px rgba(52, 211, 153, 0.5); }
|
||||
50% { opacity: 0.6; box-shadow: 0 0 2px rgba(52, 211, 153, 0.2); }
|
||||
}
|
||||
|
||||
/* ── VM Grid ───────────────────────────────────────────────── */
|
||||
.vm-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fill, minmax(360px, 1fr));
|
||||
gap: 1.25rem;
|
||||
}
|
||||
|
||||
/* ── VM Card ───────────────────────────────────────────────── */
|
||||
.vm-card {
|
||||
background: var(--card);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius-xl);
|
||||
padding: 1.25rem;
|
||||
backdrop-filter: blur(10px);
|
||||
-webkit-backdrop-filter: blur(10px);
|
||||
transition: border-color 0.2s, transform 0.2s;
|
||||
position: relative;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.vm-card::before {
|
||||
content: '';
|
||||
position: absolute;
|
||||
top: 0; left: 0; right: 0;
|
||||
height: 1px;
|
||||
background: linear-gradient(90deg, transparent, var(--accent-glow), transparent);
|
||||
opacity: 0;
|
||||
transition: opacity 0.2s;
|
||||
}
|
||||
|
||||
.vm-card:hover {
|
||||
border-color: rgba(34, 211, 238, 0.18);
|
||||
transform: translateY(-2px);
|
||||
}
|
||||
|
||||
.vm-card:hover::before { opacity: 1; }
|
||||
|
||||
.vm-card-header {
|
||||
display: flex;
|
||||
align-items: flex-start;
|
||||
justify-content: space-between;
|
||||
margin-bottom: 0.875rem;
|
||||
}
|
||||
|
||||
.vm-card h3 {
|
||||
font-family: var(--font-display);
|
||||
font-size: 0.95rem;
|
||||
font-weight: 700;
|
||||
color: var(--text-bright);
|
||||
letter-spacing: 0.03em;
|
||||
}
|
||||
|
||||
.vm-status {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.35rem;
|
||||
padding: 0.2rem 0.65rem;
|
||||
border-radius: 999px;
|
||||
font-size: 0.68rem;
|
||||
font-weight: 700;
|
||||
letter-spacing: 0.08em;
|
||||
text-transform: uppercase;
|
||||
}
|
||||
|
||||
.vm-status.running {
|
||||
background: rgba(52, 211, 153, 0.1);
|
||||
color: var(--success);
|
||||
border: 1px solid rgba(52, 211, 153, 0.2);
|
||||
}
|
||||
|
||||
.vm-status.stopped {
|
||||
background: rgba(248, 113, 113, 0.1);
|
||||
color: var(--error);
|
||||
border: 1px solid rgba(248, 113, 113, 0.2);
|
||||
}
|
||||
|
||||
.vm-updated {
|
||||
font-size: 0.7rem;
|
||||
color: var(--text-dim);
|
||||
margin-bottom: 0.75rem;
|
||||
font-family: var(--font-mono);
|
||||
letter-spacing: 0.02em;
|
||||
}
|
||||
|
||||
.vm-divider {
|
||||
height: 1px;
|
||||
background: var(--border-soft);
|
||||
margin: 0.875rem 0;
|
||||
}
|
||||
|
||||
.vm-stats { width: 100%; }
|
||||
|
||||
.vm-stats td {
|
||||
padding: 0.28rem 0;
|
||||
border-bottom: none;
|
||||
font-size: 0.82rem;
|
||||
}
|
||||
|
||||
.vm-stats td:first-child {
|
||||
color: var(--text-dim);
|
||||
font-size: 0.7rem;
|
||||
font-weight: 600;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.07em;
|
||||
width: 42%;
|
||||
}
|
||||
|
||||
.vm-stats td:last-child {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.78rem;
|
||||
color: var(--text);
|
||||
}
|
||||
|
||||
.vm-issues {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.4rem;
|
||||
margin-top: 0.875rem;
|
||||
}
|
||||
|
||||
.issue {
|
||||
font-size: 0.68rem;
|
||||
padding: 0.22rem 0.6rem;
|
||||
border-radius: 4px;
|
||||
font-weight: 700;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.06em;
|
||||
}
|
||||
|
||||
.issue.gateway_down {
|
||||
background: rgba(248, 113, 113, 0.12);
|
||||
color: var(--error);
|
||||
border: 1px solid rgba(248, 113, 113, 0.2);
|
||||
}
|
||||
|
||||
.issue.http_unhealthy {
|
||||
background: rgba(251, 191, 36, 0.1);
|
||||
color: var(--warning);
|
||||
border: 1px solid rgba(251, 191, 36, 0.2);
|
||||
}
|
||||
|
||||
.issue.backup_stale {
|
||||
background: rgba(251, 191, 36, 0.08);
|
||||
color: var(--warning);
|
||||
border: 1px solid rgba(251, 191, 36, 0.15);
|
||||
}
|
||||
|
||||
.issue.version_mismatch {
|
||||
background: rgba(167, 139, 250, 0.1);
|
||||
color: var(--purple);
|
||||
border: 1px solid rgba(167, 139, 250, 0.2);
|
||||
}
|
||||
|
||||
/* ── Agents Page ───────────────────────────────────────────── */
|
||||
.agents-layout {
|
||||
display: grid;
|
||||
grid-template-columns: minmax(0, 1fr) 280px;
|
||||
gap: 1.5rem;
|
||||
margin-top: 1.25rem;
|
||||
}
|
||||
|
||||
@media (max-width: 900px) {
|
||||
.agents-layout {
|
||||
grid-template-columns: 1fr;
|
||||
}
|
||||
}
|
||||
|
||||
.vm-strip {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.75rem;
|
||||
margin-bottom: 1.5rem;
|
||||
}
|
||||
|
||||
.vm-pill {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
padding: 0.5rem 1rem;
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 999px;
|
||||
font-size: 0.78rem;
|
||||
font-weight: 600;
|
||||
letter-spacing: 0.04em;
|
||||
transition: border-color 0.2s, opacity 0.2s;
|
||||
}
|
||||
|
||||
.vm-pill.active {
|
||||
border-color: rgba(52, 211, 153, 0.3);
|
||||
}
|
||||
|
||||
.vm-pill.inactive {
|
||||
border-color: rgba(248, 113, 113, 0.2);
|
||||
opacity: 0.6;
|
||||
}
|
||||
|
||||
.vm-pill-dot {
|
||||
width: 7px;
|
||||
height: 7px;
|
||||
border-radius: 50%;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.vm-pill.active .vm-pill-dot {
|
||||
background: var(--success);
|
||||
box-shadow: 0 0 6px rgba(52, 211, 153, 0.5);
|
||||
animation: livePulse 2s ease-in-out infinite;
|
||||
}
|
||||
|
||||
.vm-pill.inactive .vm-pill-dot {
|
||||
background: var(--error);
|
||||
}
|
||||
|
||||
.vm-pill-name {
|
||||
font-family: var(--font-mono);
|
||||
color: var(--text-bright);
|
||||
}
|
||||
|
||||
.vm-pill-label {
|
||||
color: var(--text-dim);
|
||||
font-size: 0.68rem;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.06em;
|
||||
}
|
||||
|
||||
.timeline {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.5rem;
|
||||
min-width: 0;
|
||||
}
|
||||
|
||||
.timeline-event {
|
||||
background: var(--card);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius-lg);
|
||||
padding: 0.875rem 1.125rem;
|
||||
backdrop-filter: blur(8px);
|
||||
-webkit-backdrop-filter: blur(8px);
|
||||
animation: fadeUp 0.25s ease both;
|
||||
transition: border-color 0.15s;
|
||||
}
|
||||
|
||||
.timeline-event:hover {
|
||||
border-color: rgba(34, 211, 238, 0.15);
|
||||
}
|
||||
|
||||
.timeline-event-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.6rem;
|
||||
margin-bottom: 0.35rem;
|
||||
}
|
||||
|
||||
.timeline-vm-tag {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.68rem;
|
||||
font-weight: 700;
|
||||
padding: 0.15rem 0.5rem;
|
||||
border-radius: 4px;
|
||||
letter-spacing: 0.05em;
|
||||
text-transform: uppercase;
|
||||
}
|
||||
|
||||
.timeline-vm-tag.zap {
|
||||
background: rgba(34, 211, 238, 0.12);
|
||||
color: var(--accent);
|
||||
border: 1px solid rgba(34, 211, 238, 0.2);
|
||||
}
|
||||
|
||||
.timeline-vm-tag.orb {
|
||||
background: rgba(167, 139, 250, 0.12);
|
||||
color: var(--purple);
|
||||
border: 1px solid rgba(167, 139, 250, 0.2);
|
||||
}
|
||||
|
||||
.timeline-vm-tag.sun {
|
||||
background: rgba(251, 191, 36, 0.12);
|
||||
color: var(--warning);
|
||||
border: 1px solid rgba(251, 191, 36, 0.2);
|
||||
}
|
||||
|
||||
.timeline-vm-tag.unknown {
|
||||
background: var(--surface-2);
|
||||
color: var(--text-dim);
|
||||
border: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.timeline-event-type {
|
||||
font-size: 0.75rem;
|
||||
font-weight: 600;
|
||||
color: var(--text-bright);
|
||||
}
|
||||
|
||||
.timeline-event-time {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.68rem;
|
||||
color: var(--text-dim);
|
||||
margin-left: auto;
|
||||
}
|
||||
|
||||
.timeline-event-body {
|
||||
font-size: 0.82rem;
|
||||
color: var(--text);
|
||||
line-height: 1.5;
|
||||
padding-left: 0.15rem;
|
||||
}
|
||||
|
||||
.timeline-event-body.tool-name {
|
||||
font-family: var(--font-mono);
|
||||
color: var(--accent);
|
||||
font-size: 0.78rem;
|
||||
}
|
||||
|
||||
.timeline-event-body.message-preview {
|
||||
color: var(--text-dim);
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.timeline-event-body.error-message {
|
||||
color: var(--error);
|
||||
}
|
||||
|
||||
.timeline-duration {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.72rem;
|
||||
color: var(--text-dim);
|
||||
margin-left: 0.5rem;
|
||||
}
|
||||
|
||||
.timeline-detail {
|
||||
margin-top: 0.5rem;
|
||||
padding: 0.75rem;
|
||||
background: #020508;
|
||||
border-radius: var(--radius);
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
color: #7a9ab5;
|
||||
white-space: pre-wrap;
|
||||
word-break: break-all;
|
||||
line-height: 1.65;
|
||||
display: none;
|
||||
}
|
||||
|
||||
.timeline-event.expanded .timeline-detail {
|
||||
display: block;
|
||||
}
|
||||
|
||||
.timeline-expand-hint {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
margin-top: 0.3rem;
|
||||
padding: 0;
|
||||
background: transparent;
|
||||
border: none;
|
||||
color: var(--text-dim);
|
||||
cursor: pointer;
|
||||
font-family: var(--font-body);
|
||||
font-size: 0.68rem;
|
||||
letter-spacing: 0.03em;
|
||||
}
|
||||
|
||||
.timeline-expand-hint:hover {
|
||||
color: var(--accent);
|
||||
}
|
||||
|
||||
.stats-panel {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
.stat-card {
|
||||
background: var(--surface);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius-lg);
|
||||
padding: 1rem;
|
||||
}
|
||||
|
||||
.stat-card-title {
|
||||
font-size: 0.68rem;
|
||||
font-weight: 700;
|
||||
color: var(--text-dim);
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 0.1em;
|
||||
margin-bottom: 0.6rem;
|
||||
}
|
||||
|
||||
.stat-card-value {
|
||||
font-family: var(--font-display);
|
||||
font-size: 1.6rem;
|
||||
font-weight: 800;
|
||||
color: var(--text-bright);
|
||||
letter-spacing: -0.02em;
|
||||
}
|
||||
|
||||
.stat-card-sub {
|
||||
font-size: 0.72rem;
|
||||
color: var(--text-dim);
|
||||
margin-top: 0.1rem;
|
||||
}
|
||||
|
||||
.stat-list {
|
||||
list-style: none;
|
||||
}
|
||||
|
||||
.stat-list li {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
padding: 0.35rem 0;
|
||||
border-bottom: 1px solid var(--border-soft);
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
.stat-list li:last-child {
|
||||
border-bottom: none;
|
||||
}
|
||||
|
||||
.stat-list-name {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.75rem;
|
||||
color: var(--text);
|
||||
}
|
||||
|
||||
.stat-list-count {
|
||||
font-family: var(--font-mono);
|
||||
font-size: 0.72rem;
|
||||
color: var(--text-dim);
|
||||
background: var(--surface-2);
|
||||
padding: 0.1rem 0.4rem;
|
||||
border-radius: 4px;
|
||||
}
|
||||
|
||||
.event-icon {
|
||||
width: 18px;
|
||||
height: 18px;
|
||||
border-radius: 4px;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
font-size: 0.6rem;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.event-icon.message-in {
|
||||
background: rgba(52, 211, 153, 0.12);
|
||||
color: var(--success);
|
||||
border: 1px solid rgba(52, 211, 153, 0.25);
|
||||
}
|
||||
|
||||
.event-icon.message-out {
|
||||
background: rgba(34, 211, 238, 0.12);
|
||||
color: var(--accent);
|
||||
border: 1px solid rgba(34, 211, 238, 0.25);
|
||||
}
|
||||
|
||||
.event-icon.tool {
|
||||
background: rgba(167, 139, 250, 0.12);
|
||||
color: var(--purple);
|
||||
border: 1px solid rgba(167, 139, 250, 0.25);
|
||||
}
|
||||
|
||||
.event-icon.error {
|
||||
background: rgba(248, 113, 113, 0.12);
|
||||
color: var(--error);
|
||||
border: 1px solid rgba(248, 113, 113, 0.25);
|
||||
}
|
||||
|
||||
.event-icon.session {
|
||||
background: rgba(251, 191, 36, 0.12);
|
||||
color: var(--warning);
|
||||
border: 1px solid rgba(251, 191, 36, 0.25);
|
||||
}
|
||||
|
||||
.event-icon.internal {
|
||||
background: var(--surface-2);
|
||||
color: var(--text-dim);
|
||||
border: 1px solid var(--border);
|
||||
}
|
||||
|
||||
@@ -0,0 +1,110 @@
services:
  postgres:
    image: postgres:16
    container_name: agentmon-db
    environment:
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: agentmon
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./deploy/k8s/postgres.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  nats:
    image: nats:latest
    container_name: agentmon-nats
    ports:
      - "4222:4222"
    command: "--jetstream"
    volumes:
      - nats-data:/data

  ingest-gateway:
    build: .
    container_name: agentmon-ingest
    command: ingest-gateway
    ports:
      - "8080:8080"
    depends_on:
      nats:
        condition: service_started
    environment:
      AGENTMON_ADDR: :8080
      NATS_URL: nats://nats:4222
      NATS_TOPIC: agentmon.events.v1

  query-api:
    build: .
    container_name: agentmon-query
    command: query-api
    ports:
      - "8081:8081"
    depends_on:
      nats:
        condition: service_started
      postgres:
        condition: service_healthy
    environment:
      AGENTMON_QUERY_ADDR: :8081
      DATABASE_URL: postgres://postgres:pass@postgres:5432/agentmon?sslmode=disable
      AGENTMON_QUERY_BASE: http://localhost:8081
      NATS_URL: nats://nats:4222
      NATS_TOPIC: agentmon.events.v1

  web-ui:
    build: .
    container_name: agentmon-ui
    command: web-ui
    ports:
      - "8082:8082"
    depends_on:
      query-api:
        condition: service_started
    environment:
      AGENTMON_UI_ADDR: :8082
      AGENTMON_QUERY_BASE: http://query-api:8081

  event-processor:
    build: .
    container_name: agentmon-processor
    command: event-processor
    depends_on:
      postgres:
        condition: service_healthy
      nats:
        condition: service_started
    environment:
      DATABASE_URL: postgres://postgres:pass@postgres:5432/agentmon?sslmode=disable
      NATS_URL: nats://nats:4222
      NATS_TOPIC: agentmon.events.v1

  openclaw-monitor:
    build: .
    container_name: agentmon-openclaw-monitor
    command: openclaw-monitor
    network_mode: host
    depends_on:
      nats:
        condition: service_started
    environment:
      NATS_URL: nats://localhost:4222
      NATS_TOPIC: agentmon.events.v1
      OPENCLAW_REGISTRY: /openclaw-registry/openclaw-instances.json
      POLL_INTERVAL: 30s
    volumes:
      - /home/will/.claude/state/openclaw-instances.json:/openclaw-registry/openclaw-instances.json:ro
      - /var/run/libvirt/libvirt-sock:/var/run/libvirt/libvirt-sock
      - /home/will/.ssh/id_rsa:/root/.ssh/id_rsa:ro
      - /home/will/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub:ro
      - /home/will/.ssh/authorized_keys:/root/.ssh/authorized_keys:ro
      - /var/lib/libvirt:/var/lib/libvirt:ro

volumes:
  postgres-data:
  nats-data:
@@ -0,0 +1,147 @@
# Agent Activity Monitoring via OpenClaw Hooks

**Date:** 2026-03-13
**Status:** Approved

## Goal

Monitor all OpenClaw agent and subagent activity across the three VMs (zap, orb, sun), covering tool calls, conversation flow, token usage, session lifecycle, and errors, and display it in a real-time dashboard in the agentmon web UI.

## Architecture

```
┌──────────────────────────────────────────────────────┐
│  VM (zap / orb / sun)                                │
│                                                      │
│  OpenClaw Gateway                                    │
│    ├── agent loop (messages, tools, sessions)        │
│    └── agentmon-hook (TypeScript)                    │
│         │  listens to: message:received/sent,        │
│         │  tool_result_persist, command:*, session:* │
│         │                                            │
│         └──── POST /v1/events ──────────────────┐    │
│                                                 │    │
└─────────────────────────────────────────────────│────┘
                                                  │
                                                  ▼
┌──────────────────────────────────────────────────────┐
│  Host                                                │
│  agentmon ingest-gateway (:8080)                     │
│    → NATS → event-processor → Postgres               │
│    → query-api → web-ui (new "Agents" page)          │
└──────────────────────────────────────────────────────┘
```

One hook deployed to all three VMs captures everything and ships it to the existing agentmon pipeline. No changes needed to ingest, NATS, or storage.

## Event Mapping

| OpenClaw Event | agentmon Event | What it captures |
|---|---|---|
| `command:new` | `session.start` | Agent session begins |
| `command:stop` / `command:reset` | `session.end` | Session ends |
| `message:received` | `run.start` | Inbound message starts a turn |
| `message:sent` | `run.end` | Agent response completes the turn |
| `tool_result_persist` | `span.start` + `span.end` | Tool call with result |
| `session:compact:before/after` | `span` (kind: `internal`) | Context window management |
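
The mapping table can be expressed as a small lookup. The following is a minimal sketch, not the actual `handler.ts`: the names `MappedEvent`, `EVENT_MAP`, and `mapEventType` are hypothetical, and splitting the compaction pair into a `span.start` on `before` and a `span.end` on `after` is an assumption about how the single `span` row would be emitted.

```typescript
// Illustrative only: these names are hypothetical, not taken from handler.ts.
type AgentmonEventType =
  | "session.start" | "session.end"
  | "run.start" | "run.end"
  | "span.start" | "span.end";

interface MappedEvent {
  eventType: AgentmonEventType;
  spanKind?: "internal"; // set only for context-compaction spans
}

const EVENT_MAP = new Map<string, MappedEvent[]>([
  ["command:new", [{ eventType: "session.start" }]],
  ["command:stop", [{ eventType: "session.end" }]],
  ["command:reset", [{ eventType: "session.end" }]],
  ["message:received", [{ eventType: "run.start" }]],
  ["message:sent", [{ eventType: "run.end" }]],
  // A persisted tool result describes a finished call, so it yields both span ends.
  ["tool_result_persist", [{ eventType: "span.start" }, { eventType: "span.end" }]],
  // Assumption: "before" opens the internal span and "after" closes it.
  ["session:compact:before", [{ eventType: "span.start", spanKind: "internal" }]],
  ["session:compact:after", [{ eventType: "span.end", spanKind: "internal" }]],
]);

// Returns the agentmon events for one OpenClaw event, or [] for unmapped events.
function mapEventType(openclawEvent: string): MappedEvent[] {
  return EVENT_MAP.get(openclawEvent) ?? [];
}
```

Returning an empty array for unknown event names lets the caller skip them without a special case.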

### Correlation

- `session_id` = OpenClaw `sessionKey`
- `run_id` = generated UUID per inbound message, carried through to `message:sent`
- `framework` = `"openclaw"`
- `client_id` = VM name (zap / orb / sun)

Token usage and cost are attached to `run.end` events via `WithLLMUsage` attributes when the `message:sent` payload includes usage metadata.
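
One way to carry a `run_id` from `message:received` through to `message:sent` is to remember the open run per session. This is a sketch under assumptions: `RunTracker` and `correlationFields` are hypothetical names, the ID generator is injected (in the hook it could be a UUID function), and it assumes at most one open turn per session at a time.

```typescript
// Hypothetical helper: tracks the run opened by the latest inbound message per session.
class RunTracker {
  private readonly openRuns = new Map<string, string>();
  private readonly newId: () => string;

  constructor(newId: () => string) {
    this.newId = newId;
  }

  // Call on message:received. Allocates a run_id and remembers it for the session.
  start(sessionKey: string): string {
    const runId = this.newId();
    this.openRuns.set(sessionKey, runId);
    return runId;
  }

  // Call on message:sent. Returns the matching run_id, or undefined if none is open.
  end(sessionKey: string): string | undefined {
    const runId = this.openRuns.get(sessionKey);
    this.openRuns.delete(sessionKey);
    return runId;
  }
}

// Builds the correlation fields listed above for one event.
function correlationFields(sessionKey: string, vmName: string, runId?: string) {
  return {
    session_id: sessionKey,
    run_id: runId,
    framework: "openclaw",
    client_id: vmName,
  };
}
```

Session-level events such as `session.start` would simply omit `runId`.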

## Hook Design

### Directory Structure

```
~/.openclaw/hooks/agentmon/
├── HOOK.md        # metadata: events, requirements
├── handler.ts     # event capture + HTTP emit
└── package.json   # minimal deps
```

### Deployment

SCP the directory to each VM. OpenClaw's hook loader discovers the hook automatically, so no config changes are needed beyond having hooks enabled.

The hook POSTs to the ingest gateway on the host machine. The VMs are on the libvirt bridge (192.168.122.x), so the gateway URL is supplied through an env var or points directly at the host's bridge IP.
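
Resolving the gateway URL could look like the sketch below. It rests on two assumptions that are not in the design: `AGENTMON_URL` is a hypothetical variable name, and `192.168.122.1` is libvirt's usual host address on the default bridge, which may differ on a given host. Port `8080` and the `/v1/events` path come from the architecture diagram.

```typescript
// Assumption: libvirt's default network puts the host at 192.168.122.1.
const DEFAULT_GATEWAY_BASE = "http://192.168.122.1:8080";

// Picks the ingest endpoint from an environment map, falling back to the bridge IP.
// AGENTMON_URL is a hypothetical variable name used for illustration.
function resolveIngestUrl(env: Record<string, string | undefined>): string {
  const configured = (env["AGENTMON_URL"] ?? "").trim();
  const base = configured.length > 0 ? configured : DEFAULT_GATEWAY_BASE;
  // Strip trailing slashes so the path is not doubled.
  return base.replace(/\/+$/, "") + "/v1/events";
}
```

In the hook this would typically be called once at load time with `process.env`.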

### Resilience

- Fire-and-forget with a small in-memory buffer (batch up to 10 events or 2s, whichever comes first)
- 500ms timeout on fetch calls: if agentmon is slow, skip and move on
- Events that fail to send are logged locally but not retried
- The hook must never slow down the OpenClaw agent loop
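
The resilience rules above can be sketched as a buffer with an injected sender. This is an illustration, not the shipped emitter: `EventBuffer`, `SendBatch`, and `makeHttpSender` are hypothetical names, and posting the batch as a JSON array is an assumption about the ingest wire format.

```typescript
type SendBatch = (batch: unknown[]) => Promise<void>;

// Collects events and hands them to `send` when 10 are queued or 2s have passed.
class EventBuffer {
  private pending: unknown[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly send: SendBatch;
  private readonly maxBatch: number;
  private readonly maxWaitMs: number;

  constructor(send: SendBatch, maxBatch = 10, maxWaitMs = 2000) {
    this.send = send;
    this.maxBatch = maxBatch;
    this.maxWaitMs = maxWaitMs;
  }

  // Never awaits and never throws, so the agent loop is not held up.
  push(event: unknown): void {
    this.pending.push(event);
    if (this.pending.length >= this.maxBatch) {
      this.flush();
    } else if (this.timer === undefined) {
      this.timer = setTimeout(() => this.flush(), this.maxWaitMs);
    }
  }

  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    try {
      // Fire-and-forget: a failed batch is logged locally and not retried.
      this.send(batch).catch((err: unknown) => console.debug("agentmon: batch dropped", err));
    } catch (err) {
      console.debug("agentmon: sender threw", err);
    }
  }
}

// Sender that aborts the POST after timeoutMs (500ms by default).
function makeHttpSender(url: string, timeoutMs = 500): SendBatch {
  return async (batch) => {
    const controller = new AbortController();
    const abortTimer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      await fetch(url, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(batch), // assumed wire format: a JSON array of events
        signal: controller.signal,
      });
    } finally {
      clearTimeout(abortTimer);
    }
  };
}
```

Injecting the sender keeps the batching logic testable without a network; in the hook it would be wired as `new EventBuffer(makeHttpSender(url))`.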

## Error Handling

### In the hook

- All HTTP POSTs are wrapped in try/catch: never throw, never block
- Malformed event payloads (missing sessionKey, etc.) are silently dropped with a debug log
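
A guard along the following lines would enforce both rules for any handler. It is a sketch with hypothetical names (`hasSessionKey`, `guarded`), and it assumes the payload exposes `sessionKey` as a non-empty string, which the real OpenClaw payload shape may not match exactly.

```typescript
// Narrowing check: true only when the payload carries a usable sessionKey.
function hasSessionKey(payload: unknown): payload is { sessionKey: string } {
  if (typeof payload !== "object" || payload === null) return false;
  const key = (payload as { sessionKey?: unknown }).sessionKey;
  return typeof key === "string" && key.length > 0;
}

// Wraps a handler so nothing it does can throw into the OpenClaw agent loop.
// Returns true when the handler ran, false when the event was dropped.
function guarded(
  handler: (payload: { sessionKey: string }) => void,
): (payload: unknown) => boolean {
  return (payload) => {
    try {
      if (!hasSessionKey(payload)) {
        console.debug("agentmon: dropping malformed payload");
        return false;
      }
      handler(payload);
      return true;
    } catch (err) {
      console.debug("agentmon: handler failed", err);
      return false;
    }
  };
}
```

The boolean result is only there to make the behavior observable in tests; a hook would normally ignore it.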

### In the pipeline

- Ingest gateway deduplicates by event ID, so it is safe if a hook sends the same event twice
- Events with `framework: "openclaw"` but missing correlation IDs are stored but won't appear in the agents timeline
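
On the UI side, the second rule amounts to a filter predicate. The sketch below assumes that "correlation IDs" means `session_id` and `client_id` (session-level events have no `run_id`); the design does not spell this out, and `StoredEvent` and `showsInTimeline` are hypothetical names.

```typescript
// Subset of the stored event fields relevant to the timeline filter.
interface StoredEvent {
  framework?: string;
  session_id?: string;
  run_id?: string;
  client_id?: string;
}

// Assumption: an event needs a session and a VM name to be placed in the timeline.
function showsInTimeline(e: StoredEvent): boolean {
  return e.framework === "openclaw" && Boolean(e.session_id) && Boolean(e.client_id);
}
```

Events that fail the predicate remain queryable in storage; they are only hidden from the agents page.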
|
||||
|
||||
### Edge cases
|
||||
|
||||
- VM reboots mid-session: no `session.end` emitted — UI shows session as "ongoing" until a new `command:new` arrives
|
||||
- OpenClaw compacts context before hook fires: `session:compact:after` still fires, captured as internal span
|
||||
- Network partition between VM and host: events silently lost, no backfill — acceptable for monitoring

## UI: Agents Page

### Layout

A live activity dashboard at `/agents` with three sections:

1. **Top strip**: Three VM pill indicators (zap / orb / sun) showing online/offline, with a subtle pulse when active
2. **Activity timeline**: Vertical feed of events across all agents (messages, tool calls, errors) with color-coded VM names, monospace timestamps, and collapsible tool call detail rows. Updates in real time over the existing WebSocket.
3. **Side stats panel**: Aggregate metrics: messages/hour, tool calls today, error rate, most-used tools
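
The side panel's numbers could be derived from a window of events roughly as below. This is a sketch under stated assumptions: a "message" is counted as one `run.start`, a tool call as a `span.end` with `span_kind: "tool"`, and the error rate as errors per message. `TimelineEvent`, `Stats`, and `computeStats` are hypothetical names.

```typescript
// Hypothetical sketch of the side stats panel aggregation.
interface TimelineEvent {
  type: string;
  attributes?: { span_kind?: string; name?: string };
}

interface Stats {
  messagesPerHour: number;
  toolCalls: number;
  errorRate: number;
  topTools: [string, number][];
}

// `events` is assumed to be already restricted to the last `windowHours`.
function computeStats(events: TimelineEvent[], windowHours: number, topN = 3): Stats {
  let messages = 0;
  let toolCalls = 0;
  let errors = 0;
  const toolCounts = new Map<string, number>();

  for (const ev of events) {
    if (ev.type === "run.start") messages++;
    if (ev.type === "error") errors++;
    if (ev.type === "span.end" && ev.attributes?.span_kind === "tool") {
      toolCalls++;
      const name = ev.attributes?.name ?? "unknown_tool";
      toolCounts.set(name, (toolCounts.get(name) ?? 0) + 1);
    }
  }

  // Most-used tools first.
  const topTools = [...toolCounts.entries()].sort((a, b) => b[1] - a[1]).slice(0, topN);

  return {
    messagesPerHour: windowHours > 0 ? messages / windowHours : 0,
    toolCalls,
    errorRate: messages > 0 ? errors / messages : 0,
    topTools,
  };
}
```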

### Aesthetic

Matches the refined dark theme already in place:

- Timeline cards with glassmorphism
- Color-coded VM badges
- Monospace timestamps (Fira Code)
- Syne display font for headings
- Fade-in animations on new events
- Status pill badges consistent with existing design system

## Implementation Plan

### Phase 1: OpenClaw Hook

1. Create hook directory structure (`HOOK.md`, `handler.ts`)
2. Implement the event-to-agentmon mapper, which translates each OpenClaw event type to the agentmon envelope schema
3. Implement the HTTP emitter with buffering (batch up to 10 events or 2s, whichever comes first) and a 500ms timeout
4. Unit test the mapper logic locally
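
The type-level part of the mapper in step 2 can be sketched as a lookup table. The pairs below follow the mapping that `handler.ts` in this commit implements; `EVENT_TYPE_MAP` and `mapEventTypes` are hypothetical names used only for this illustration.

```typescript
// Sketch of the OpenClaw -> agentmon event type mapping.
// A reset is modelled as an end followed by a fresh start.
const EVENT_TYPE_MAP = new Map<string, readonly string[]>([
  ["command:new", ["session.start"]],
  ["command:stop", ["session.end"]],
  ["command:reset", ["session.end", "session.start"]],
  ["message:received", ["run.start"]],
  ["message:sent", ["run.end"]],
  ["tool_result_persist", ["span.end"]],
  ["session:compact:before", ["span.start"]],
  ["session:compact:after", ["span.end"]],
]);

// Unknown OpenClaw events map to no agentmon events at all.
function mapEventTypes(openclawEvent: string): readonly string[] {
  return EVENT_TYPE_MAP.get(openclawEvent) ?? [];
}
```

A table like this makes the unit test in step 4 a matter of asserting one row at a time.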

### Phase 2: Agentmon UI (Agents Page)

5. Add `/agents` route to the SPA router in `app.js`
6. Add "Agents" nav link in header
7. Build the top strip: three VM status pills pulling from existing `openclaw.snapshot` data
8. Build the activity timeline: subscribe to the WebSocket, filter for `framework: "openclaw"` events, and render them as a vertical feed with collapsible tool call details
9. Build the side stats panel: aggregate counts from the query API (messages/hour, tool calls, error rate, top tools)
10. Style with the refined dark aesthetic: glassmorphism timeline cards, color-coded VM badges, monospace timestamps, fade-in animations
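
The filter in step 8 might look like the following. It assumes each WebSocket frame is a JSON string carrying the same envelope shape the hook emits (`event.source.framework`, `event.id`, and so on); that assumption, and the names `OpenClawEvent` and `parseOpenClawEvent`, are not confirmed by this document.

```typescript
// Hypothetical sketch: keep only well-formed OpenClaw frames.
type Obj = Record<string, unknown>;

const isObj = (v: unknown): v is Obj =>
  typeof v === "object" && v !== null && !Array.isArray(v);

interface OpenClawEvent {
  id: string;
  type: string;
  ts: string;
  host: string;
}

// Returns the fields the timeline needs, or null if the frame should be skipped.
function parseOpenClawEvent(raw: string): OpenClawEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON: ignore the frame
  }
  const event = isObj(data) ? data.event : undefined;
  if (!isObj(event)) return null;
  const source = event.source;
  if (!isObj(source) || source.framework !== "openclaw") return null;

  const { id, type, ts } = event;
  if (typeof id !== "string" || typeof type !== "string" || typeof ts !== "string") {
    return null;
  }
  const host = typeof source.host === "string" ? source.host : "unknown";
  return { id, type, ts, host };
}
```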

### Phase 3: Deploy

11. Copy the hook to all three VMs with `scp` and verify auto-discovery
12. Send a test message to one agent and confirm events flow end-to-end

## Not in Scope (Future)

- Token/cost dashboard (needs usage data verification in `message:sent` payloads)
- Historical analytics and aggregation queries
- Hook auto-deployment via openclaw-monitor
- Alerting on error rate spikes
@@ -0,0 +1,147 @@
// Package main demonstrates usage of the agentmon SDK.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"agentmon/internal/sdk"
)

func main() {
	emitter, err := sdk.NewEmitter(sdk.Config{
		ServerURL:     "http://localhost:8080",
		Framework:     "example-agent",
		ClientID:      "example-client-001",
		Host:          "localhost",
		BufferSize:    50,
		EnableLogging: true,
	})
	if err != nil {
		log.Fatalf("Failed to create emitter: %v", err)
	}
	defer emitter.Close(context.Background())

	ctx := context.Background()

	sessionID := generateSessionID()
	log.Printf("Starting session: %s", sessionID)

	sessionStart := sdk.NewSessionStart(sessionID, sdk.WithSource(emitter))
	if err := emitter.Emit(ctx, sessionStart); err != nil {
		log.Printf("Failed to emit session.start: %v", err)
	}

	runID := generateRunID()
	log.Printf("Starting run: %s", runID)

	runStart := sdk.NewRunStart(sessionID, runID,
		sdk.WithSource(emitter),
		sdk.WithAttributes(map[string]any{
			"command": "example-command",
			"cwd":     "/home/user/project",
		}),
	)
	if err := emitter.Emit(ctx, runStart); err != nil {
		log.Printf("Failed to emit run.start: %v", err)
	}

	traceID := generateTraceID()

	span1ID := generateSpanID()
	span1Start := sdk.NewSpanStart(sessionID, runID, traceID, span1ID,
		sdk.WithSource(emitter),
		sdk.WithSpanKind("tool"),
		sdk.WithName("ExampleTool"),
		sdk.WithAttributes(map[string]any{
			"tool": "example-tool",
		}),
	)
	if err := emitter.Emit(ctx, span1Start); err != nil {
		log.Printf("Failed to emit span.start: %v", err)
	}

	time.Sleep(100 * time.Millisecond)

	span1End := sdk.NewSpanEnd(sessionID, runID, traceID, span1ID, "success", 100,
		sdk.WithSource(emitter),
		sdk.WithSpanKind("tool"),
		sdk.WithName("ExampleTool"),
	)
	if err := emitter.Emit(ctx, span1End); err != nil {
		log.Printf("Failed to emit span.end: %v", err)
	}

	span2ID := generateSpanID()
	span2Start := sdk.NewSpanStart(sessionID, runID, traceID, span2ID,
		sdk.WithSource(emitter),
		sdk.WithSpanKind("llm"),
		sdk.WithName("ClaudeRequest"),
		sdk.WithAttributes(map[string]any{
			"model": "claude-3-opus",
		}),
	)
	if err := emitter.Emit(ctx, span2Start); err != nil {
		log.Printf("Failed to emit span.start: %v", err)
	}

	time.Sleep(200 * time.Millisecond)

	span2End := sdk.NewSpanEnd(sessionID, runID, traceID, span2ID, "success", 200,
		sdk.WithSource(emitter),
		sdk.WithSpanKind("llm"),
		sdk.WithName("ClaudeRequest"),
		sdk.WithLLMUsage("claude-3-opus", 1000, 500, 0.015),
	)
	if err := emitter.Emit(ctx, span2End); err != nil {
		log.Printf("Failed to emit span.end: %v", err)
	}

	metrics := sdk.NewMetricSnapshot(sessionID, runID, map[string]any{
		"tokens_in":   1000.0,
		"tokens_out":  500.0,
		"cost_usd":    0.015,
		"latency_ms":  300.0,
		"error_count": 0,
	})
	if err := emitter.Emit(ctx, metrics); err != nil {
		log.Printf("Failed to emit metric.snapshot: %v", err)
	}

	runEnd := sdk.NewRunEnd(sessionID, runID, "success", 300,
		sdk.WithSource(emitter),
		sdk.WithLLMUsage("claude-3-opus", 1000, 500, 0.015),
	)
	if err := emitter.Emit(ctx, runEnd); err != nil {
		log.Printf("Failed to emit run.end: %v", err)
	}

	sessionEnd := sdk.NewSessionEnd(sessionID, sdk.WithSource(emitter))
	if err := emitter.Emit(ctx, sessionEnd); err != nil {
		log.Printf("Failed to emit session.end: %v", err)
	}

	if err := emitter.Flush(ctx); err != nil {
		log.Printf("Failed to flush events: %v", err)
	}

	log.Println("Example completed successfully!")
}

func generateSessionID() string {
	return fmt.Sprintf("sess-%d", time.Now().UnixNano())
}

func generateRunID() string {
	return fmt.Sprintf("run-%d", time.Now().UnixNano())
}

func generateTraceID() string {
	return fmt.Sprintf("trace-%d", time.Now().UnixNano())
}

func generateSpanID() string {
	return fmt.Sprintf("span-%d", time.Now().UnixNano())
}
@@ -0,0 +1,38 @@
---
name: agentmon
description: "Emit OpenClaw telemetry events to the agentmon monitoring pipeline"
metadata:
  openclaw:
    events:
      - "command:new"
      - "command:stop"
      - "command:reset"
      - "message:received"
      - "message:sent"
      - "tool_result_persist"
      - "session:compact:before"
      - "session:compact:after"
    export: "default"
    requires:
      env:
        - "AGENTMON_INGEST_URL"
---

# Agentmon Telemetry Hook

Captures OpenClaw agent activity and emits it as `agentmon.event` envelopes to
the agentmon ingest gateway.

## Configuration

Set the ingest gateway URL before enabling the hook:

```bash
export AGENTMON_INGEST_URL=http://192.168.122.1:8080
```

You can optionally override the VM identifier:

```bash
export AGENTMON_VM_NAME=zap
```
@@ -0,0 +1,411 @@
import { randomUUID } from 'node:crypto';
import { hostname } from 'node:os';

type Dict = Record<string, any>;

const INGEST_URL = process.env.AGENTMON_INGEST_URL || 'http://192.168.122.1:8080';
const VM_NAME = process.env.AGENTMON_VM_NAME || hostname();
const BATCH_SIZE = 10;
const FLUSH_MS = 2000;
const FETCH_TIMEOUT_MS = 500;

let buffer: Dict[] = [];
let flushTimer: ReturnType<typeof setTimeout> | null = null;
let isFlushing = false;

const activeRuns = new Map<string, string>();
const activeCompactions = new Map<string, string>();

function isRecord(value: unknown): value is Dict {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function pickString(...values: unknown[]): string | undefined {
  for (const value of values) {
    if (typeof value === 'string' && value.trim() !== '') {
      return value;
    }
  }
  return undefined;
}

function pickNumber(...values: unknown[]): number | undefined {
  for (const value of values) {
    if (typeof value === 'number' && Number.isFinite(value)) {
      return value;
    }
  }
  return undefined;
}

function truncate(value: unknown, limit: number): string | undefined {
  if (value === undefined || value === null) {
    return undefined;
  }

  const text = typeof value === 'string' ? value : safeJSONStringify(value);
  if (!text) {
    return undefined;
  }

  if (text.length <= limit) {
    return text;
  }
  return text.slice(0, limit) + '...';
}

function safeJSONStringify(value: unknown): string {
  try {
    return JSON.stringify(value);
  } catch {
    return String(value);
  }
}

function getEventName(input: Dict): string {
  const direct = pickString(input.name, input.event);
  if (direct) {
    return direct;
  }

  if (typeof input.type === 'string' && input.type.includes(':')) {
    return input.type;
  }

  if (typeof input.type === 'string' && typeof input.action === 'string') {
    return `${input.type}:${input.action}`;
  }

  if (typeof input.type === 'string') {
    return input.type;
  }

  return '';
}

function getContext(input: Dict): Dict {
  return isRecord(input.context) ? input.context : {};
}

function getSessionKey(input: Dict, context: Dict): string | undefined {
  return pickString(
    input.sessionKey,
    context.sessionKey,
    context.session_id,
    input.session_id,
    isRecord(input.session) ? input.session.key : undefined,
    isRecord(context.session) ? context.session.key : undefined,
  );
}

function buildEnvelope(
  type: string,
  sessionKey?: string,
  opts: {
    runId?: string;
    spanId?: string;
    parentSpanId?: string;
    attributes?: Dict;
    payload?: Dict;
  } = {},
): Dict {
  const correlation: Dict = {};
  if (sessionKey) {
    correlation.session_id = sessionKey;
  }
  if (opts.runId) {
    correlation.run_id = opts.runId;
  }
  if (opts.spanId) {
    correlation.span_id = opts.spanId;
  }
  if (opts.parentSpanId) {
    correlation.parent_span_id = opts.parentSpanId;
  }

  const envelope: Dict = {
    schema: { name: 'agentmon.event', version: 1 },
    event: {
      id: randomUUID(),
      type,
      ts: new Date().toISOString(),
      source: {
        framework: 'openclaw',
        client_id: VM_NAME,
        host: VM_NAME,
      },
    },
  };

  if (Object.keys(correlation).length > 0) {
    envelope.correlation = correlation;
  }
  if (opts.attributes && Object.keys(opts.attributes).length > 0) {
    envelope.attributes = opts.attributes;
  }
  if (opts.payload && Object.keys(opts.payload).length > 0) {
    envelope.payload = opts.payload;
  }

  return envelope;
}

function scheduleFlush() {
  if (!flushTimer) {
    flushTimer = setTimeout(() => {
      void flush();
    }, FLUSH_MS);
  }
}

function enqueue(event: Dict) {
  buffer.push(event);
  if (buffer.length >= BATCH_SIZE) {
    void flush();
  } else {
    scheduleFlush();
  }
}

async function postBatch(batch: Dict[]) {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);

  try {
    await fetch(`${INGEST_URL}/v1/events`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(batch),
      signal: controller.signal,
    });
  } finally {
    clearTimeout(timeout);
  }
}

async function flush() {
  if (flushTimer) {
    clearTimeout(flushTimer);
    flushTimer = null;
  }
  if (isFlushing || buffer.length === 0) {
    return;
  }

  isFlushing = true;
  const batch = buffer.splice(0, BATCH_SIZE);

  try {
    await postBatch(batch);
  } catch {
    console.debug(`[agentmon] failed to flush ${batch.length} events`);
  } finally {
    isFlushing = false;
    if (buffer.length > 0) {
      if (buffer.length >= BATCH_SIZE) {
        void flush();
      } else {
        scheduleFlush();
      }
    }
  }
}

function emitError(sessionKey: string | undefined, runId: string | undefined, spanId: string | undefined, errorValue: unknown) {
  if (errorValue === undefined || errorValue === null || errorValue === false) {
    return;
  }

  const errorRecord = isRecord(errorValue) ? errorValue : {};
  const message = pickString(errorRecord.message, errorRecord.error, errorValue) || 'unknown';
  const errType = pickString(errorRecord.type, errorRecord.code) || 'openclaw';

  enqueue(buildEnvelope('error', sessionKey, {
    runId,
    spanId,
    payload: {
      error: {
        type: errType,
        message,
      },
    },
  }));
}

function buildRunPayload(context: Dict, success: boolean): Dict {
  const payload: Dict = {
    status: success ? 'success' : 'error',
  };

  const duration = pickNumber(context.duration_ms, context.durationMs, context.elapsed_ms);
  if (duration !== undefined) {
    payload.duration_ms = duration;
  }

  const usage = isRecord(context.usage) ? context.usage : undefined;
  if (usage) {
    payload.usage = usage;
  }

  const errorMessage = pickString(context.error, isRecord(context.result) ? context.result.error : undefined);
  if (errorMessage) {
    payload.error = errorMessage;
  }

  return payload;
}

const handler = async (rawEvent: unknown) => {
  if (!isRecord(rawEvent)) {
    return;
  }

  const context = getContext(rawEvent);
  const eventName = getEventName(rawEvent);
  const sessionKey = getSessionKey(rawEvent, context);

  try {
    if (eventName === 'command:new') {
      enqueue(buildEnvelope('session.start', sessionKey));
      return;
    }

    if (eventName === 'command:stop') {
      enqueue(buildEnvelope('session.end', sessionKey));
      if (sessionKey) {
        activeRuns.delete(sessionKey);
        activeCompactions.delete(sessionKey);
      }
      return;
    }

    if (eventName === 'command:reset') {
      enqueue(buildEnvelope('session.end', sessionKey));
      enqueue(buildEnvelope('session.start', sessionKey));
      if (sessionKey) {
        activeRuns.delete(sessionKey);
        activeCompactions.delete(sessionKey);
      }
      return;
    }

    if (eventName === 'message:received') {
      const runId = randomUUID();
      if (sessionKey) {
        activeRuns.set(sessionKey, runId);
      }

      enqueue(buildEnvelope('run.start', sessionKey, {
        runId,
        attributes: {
          channel: pickString(context.channelId, context.channel_id),
          from: pickString(context.from, context.sender),
        },
        payload: {
          message_preview: truncate(
            pickString(context.content, context.message, context.text) || context.input,
            200,
          ),
        },
      }));
      return;
    }

    if (eventName === 'message:sent') {
      const runId = sessionKey ? activeRuns.get(sessionKey) : undefined;
      const success = context.success !== false && !context.error;

      enqueue(buildEnvelope('run.end', sessionKey, {
        runId,
        attributes: {
          channel: pickString(context.channelId, context.channel_id),
          to: pickString(context.to, context.recipient),
        },
        payload: buildRunPayload(context, success),
      }));

      if (!success) {
        emitError(sessionKey, runId, undefined, context.error);
      }
      return;
    }

    if (eventName === 'tool_result_persist') {
      const runId = sessionKey ? activeRuns.get(sessionKey) : undefined;
      const spanId = randomUUID();
      const success = context.success !== false && !context.error;
      const toolName = pickString(context.toolName, context.tool_name, context.name) || 'unknown_tool';
      const payload: Dict = {
        status: success ? 'success' : 'error',
      };

      const duration = pickNumber(context.duration_ms, context.durationMs, context.elapsed_ms);
      if (duration !== undefined) {
        payload.duration_ms = duration;
      }

      const resultPreview = truncate(context.result ?? context.output, 500);
      if (resultPreview) {
        payload.result_preview = resultPreview;
      }

      enqueue(buildEnvelope('span.end', sessionKey, {
        runId,
        spanId,
        attributes: {
          span_kind: 'tool',
          name: toolName,
        },
        payload,
      }));

      if (!success) {
        emitError(sessionKey, runId, spanId, context.error);
      }
      return;
    }

    if (eventName === 'session:compact:before') {
      const runId = sessionKey ? activeRuns.get(sessionKey) : undefined;
      const spanId = randomUUID();
      if (sessionKey) {
        activeCompactions.set(sessionKey, spanId);
      }

      enqueue(buildEnvelope('span.start', sessionKey, {
        runId,
        spanId,
        attributes: {
          span_kind: 'internal',
          name: 'context_compaction',
        },
      }));
      return;
    }

    if (eventName === 'session:compact:after') {
      const runId = sessionKey ? activeRuns.get(sessionKey) : undefined;
      const spanId = (sessionKey && activeCompactions.get(sessionKey)) || randomUUID();
      if (sessionKey) {
        activeCompactions.delete(sessionKey);
      }

      enqueue(buildEnvelope('span.end', sessionKey, {
        runId,
        spanId,
        attributes: {
          span_kind: 'internal',
          name: 'context_compaction',
        },
        payload: {
          status: 'success',
          duration_ms: pickNumber(context.duration_ms, context.durationMs, context.elapsed_ms),
        },
      }));
    }
  } catch {
    console.debug('[agentmon] handler error');
  }
};

export default handler;
@@ -0,0 +1,5 @@
{
  "name": "agentmon-openclaw-hook",
  "private": true,
  "type": "module"
}
+26
-23
@@ -6,14 +6,15 @@ import (
 )

 var validTypes = map[string]bool{
-	"session.start":   true,
-	"session.end":     true,
-	"run.start":       true,
-	"run.end":         true,
-	"span.start":      true,
-	"span.end":        true,
-	"error":           true,
-	"metric.snapshot": true,
+	"session.start":     true,
+	"session.end":       true,
+	"run.start":         true,
+	"run.end":           true,
+	"span.start":        true,
+	"span.end":          true,
+	"error":             true,
+	"metric.snapshot":   true,
+	"openclaw.snapshot": true,
 }

 type ValidationError struct {
@@ -31,8 +32,8 @@ func Validate(m map[string]any) error {
 	if !ok {
 		return ValidationError{Field: "schema", Message: "missing or invalid"}
 	}
-	if name, _ := schema["name"].(string); name != "agentmon.event" {
-		return ValidationError{Field: "schema.name", Message: "must be 'agentmon.event'"}
+	if name, _ := schema["name"].(string); name != "agentmon.event" && name != "agentmon.openclaw" {
+		return ValidationError{Field: "schema.name", Message: "must be 'agentmon.event' or 'agentmon.openclaw'"}
 	}
 	if ver, _ := schema["version"].(float64); ver != 1 {
 		return ValidationError{Field: "schema.version", Message: "must be 1"}
@@ -60,20 +61,22 @@ func Validate(m map[string]any) error {
 		return ValidationError{Field: "event.ts", Message: "required"}
 	}

-	// Check source
-	source, ok := event["source"].(map[string]any)
-	if !ok {
-		return ValidationError{Field: "event.source", Message: "missing or invalid"}
-	}
+	// Source is optional for openclaw.snapshot events
+	if eventType != "openclaw.snapshot" {
+		source, ok := event["source"].(map[string]any)
+		if !ok {
+			return ValidationError{Field: "event.source", Message: "missing or invalid"}
+		}

-	if fw, _ := source["framework"].(string); fw == "" {
-		return ValidationError{Field: "event.source.framework", Message: "required"}
-	}
-	if cid, _ := source["client_id"].(string); cid == "" {
-		return ValidationError{Field: "event.source.client_id", Message: "required"}
-	}
-	if host, _ := source["host"].(string); host == "" {
-		return ValidationError{Field: "event.source.host", Message: "required"}
+		if fw, _ := source["framework"].(string); fw == "" {
+			return ValidationError{Field: "event.source.framework", Message: "required"}
+		}
+		if cid, _ := source["client_id"].(string); cid == "" {
+			return ValidationError{Field: "event.source.client_id", Message: "required"}
+		}
+		if host, _ := source["host"].(string); host == "" {
+			return ValidationError{Field: "event.source.host", Message: "required"}
+		}
 	}

 	return nil
@@ -0,0 +1,353 @@
package openclaw

import (
	"encoding/json"
	"fmt"
	"os/exec"
	"regexp"
	"strconv"
	"strings"
	"time"
)

// Health thresholds for guest resources, host disk, and backup age.
const (
	guestDiskWarningPercent    = 80.0
	guestDiskCriticalPercent   = 95.0
	guestMemoryWarningPercent  = 80.0
	guestMemoryCriticalPercent = 95.0
	hostDiskActualGB           = 32.0
	backupWarningAgeHours      = 25.0
	backupCriticalAgeHours     = 48.0
)

// CollectHostMetrics gathers libvirt-level metrics for a domain via virsh.
// If the domain is not running, only State is populated.
func CollectHostMetrics(domain string) (HostMetrics, error) {
	metrics := HostMetrics{}

	state, err := virshCmd("domstate", domain)
	if err != nil {
		return metrics, fmt.Errorf("failed to get VM state: %w", err)
	}
	metrics.State = strings.TrimSpace(state)

	if metrics.State != "running" {
		metrics.VCPUs = 0
		metrics.MemoryKiB = 0
		metrics.Autostart = false
		metrics.Snapshots = 0
		return metrics, nil
	}

	info, err := virshCmd("dominfo", domain)
	if err != nil {
		return metrics, fmt.Errorf("failed to get VM info: %w", err)
	}

	if cpuStr := parseVirshInfo(info, "CPU(s)"); cpuStr != "" {
		if cpu, err := strconv.Atoi(strings.TrimSpace(cpuStr)); err == nil {
			metrics.VCPUs = cpu
		}
	}
	if memStr := parseVirshInfo(info, "Max memory"); memStr != "" {
		if mem, err := parseMemoryKiB(memStr); err == nil {
			metrics.MemoryKiB = mem
		}
	}

	autostartInfo := parseVirshInfo(info, "Autostart")
	metrics.Autostart = strings.TrimSpace(autostartInfo) == "enable"

	cpuTime, err := virshCmd("dominfo", domain)
	if err == nil {
		if cpuStr := parseVirshInfo(cpuTime, "CPU time"); cpuStr != "" {
			if cpu, err := parseCPUTimeNS(cpuStr); err == nil {
				metrics.CPUTime = cpu
			}
		}
	}

	snapshots, err := virshCmd("snapshot-list", domain, "--name")
	if err == nil {
		metrics.Snapshots = len(strings.Fields(strings.TrimSpace(snapshots)))
	}

	diskPath, err := virshCmd("domblklist", domain)
	if err == nil {
		lines := strings.Split(diskPath, "\n")
		for _, line := range lines {
			fields := strings.Fields(line)
			if len(fields) >= 2 && fields[1] != "" && fields[1] != "-" {
				diskActual, _, err := getDiskStats(fields[1])
				if err == nil {
					metrics.DiskActual = diskActual
				}
			}
		}
	}

	return metrics, nil
}

// CollectGuestMetrics gathers in-guest metrics over SSH. Individual probe
// failures are ignored and leave the corresponding fields at their zero values.
func CollectGuestMetrics(host, user string) (*GuestMetrics, error) {
	metrics := &GuestMetrics{}

	serviceStatus, err := sshCmd(host, user, "systemctl --user is-active openclaw-gateway.service")
	if err == nil {
		metrics.ServiceActive = strings.TrimSpace(serviceStatus) == "active"
	}

	if metrics.ServiceActive {
		uptime, err := sshCmd(host, user, "systemctl --user show openclaw-gateway.service -p ActiveEnterTimestamp --value")
		if err == nil {
			uptimeTS := strings.TrimSpace(uptime)
			if ts, err := time.Parse("Mon 2006-01-02 15:04:05 MST", uptimeTS); err == nil {
				duration := time.Since(ts)
				metrics.ServiceUptime = duration.String()
			}
		}
	}

	httpCode, err := sshCmd(host, user, "curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:18789/")
	if err == nil {
		if code, err := strconv.Atoi(strings.TrimSpace(httpCode)); err == nil {
			metrics.HTTPStatus = code
		}
	}

	version, err := sshCmd(host, user, "ls -la ~/.local/share/pnpm/5/node_modules/openclaw | grep -oP 'openclaw@[0-9.]+' | head -1")
	if err == nil {
		metrics.Version = strings.TrimSpace(strings.TrimPrefix(version, "openclaw@"))
		if metrics.Version != "" {
			serviceVersion, err := sshCmd(host, user, "grep OPENCLAW_SERVICE_VERSION ~/.config/systemd/user/openclaw-gateway.service 2>/dev/null | head -1")
			if err == nil && strings.Contains(serviceVersion, metrics.Version) {
				metrics.VersionConsistent = true
			}
		}
	}

	memInfo, err := sshCmd(host, user, "free -b | grep '^Mem:'")
	if err == nil {
		fields := strings.Fields(memInfo)
		if len(fields) >= 3 {
			if total, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
				metrics.MemoryTotal = total
			}
			if used, err := strconv.ParseInt(fields[2], 10, 64); err == nil {
				metrics.MemoryUsed = used
			}
			if metrics.MemoryTotal > 0 {
				metrics.MemoryPercent = float64(metrics.MemoryUsed) / float64(metrics.MemoryTotal) * 100
			}
		}
	}

	diskInfo, err := sshCmd(host, user, "df -B1 / | tail -1")
	if err == nil {
		fields := strings.Fields(diskInfo)
		if len(fields) >= 5 {
			if total, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
				metrics.DiskTotal = total
			}
			if used, err := strconv.ParseInt(fields[2], 10, 64); err == nil {
				metrics.DiskUsed = used
			}
			if metrics.DiskTotal > 0 {
				metrics.DiskPercent = float64(metrics.DiskUsed) / float64(metrics.DiskTotal) * 100
			}
		}
	}

	loadAvg, err := sshCmd(host, user, "awk '{print $1}' /proc/loadavg")
	if err == nil {
		if load, err := strconv.ParseFloat(strings.TrimSpace(loadAvg), 64); err == nil {
			metrics.LoadAverage = load
		}
	}

	swappiness, err := sshCmd(host, user, "cat /proc/sys/vm/swappiness")
	if err == nil {
		if swap, err := strconv.Atoi(strings.TrimSpace(swappiness)); err == nil {
			metrics.Swappiness = swap
		}
	}

	return metrics, nil
}

// CollectBackupStatus reports the age of the backup directory based on its
// modification time. The path is hard-coded and instanceName is not used.
func CollectBackupStatus(instanceName string) (*BackupStatus, error) {
	backupPath := "/home/will/lab/swarm/openclaw"
	fileInfo, err := exec.Command("stat", "-c", "%Y", backupPath).Output()
	if err != nil {
		return nil, fmt.Errorf("failed to get backup timestamp: %w", err)
	}

	timestampStr := strings.TrimSpace(string(fileInfo))
	timestamp, err := strconv.ParseInt(timestampStr, 10, 64)
	if err != nil {
		return nil, fmt.Errorf("failed to parse backup timestamp: %w", err)
	}

	lastBackup := time.Unix(timestamp, 0)
	age := time.Since(lastBackup)
	ageHours := age.Hours()

	return &BackupStatus{
		LastBackup: lastBackup.UTC().Format(time.RFC3339),
		AgeHours:   ageHours,
	}, nil
}

// DetectIssues derives issue flags from collected metrics using the
// critical thresholds.
func DetectIssues(metrics Metrics) Issues {
	issues := Issues{}

	if metrics.Guest != nil {
		if metrics.Guest.DiskPercent > guestDiskCriticalPercent {
			issues.GuestDiskUsageHigh = true
		}
		if metrics.Guest.MemoryPercent > guestMemoryCriticalPercent {
			issues.GuestMemoryUsageHigh = true
		}
		if !metrics.Guest.ServiceActive {
			issues.GatewayDown = true
		}
		if metrics.Guest.HTTPStatus != 200 {
			issues.HTTPUnhealthy = true
		}
		if metrics.Guest.Version != "" && !metrics.Guest.VersionConsistent {
			issues.VersionMismatch = true
		}
	}

	if metrics.Instance.Status == "active" && metrics.Host.State != "running" {
		issues.VMNotRunning = true
	}

	if metrics.Backup != nil && metrics.Backup.AgeHours > backupCriticalAgeHours {
		issues.BackupStale = true
	}

	return issues
}

// LoadInstances reads the JSON registry at registryPath and returns its
// instances. Each raw registry entry is also kept in Additional.
func LoadInstances(registryPath string) ([]Instance, error) {
	data, err := exec.Command("cat", registryPath).Output()
	if err != nil {
		return nil, fmt.Errorf("failed to read registry: %w", err)
	}

	var registry struct {
		Instances []map[string]any `json:"instances"`
	}

	if err := json.Unmarshal(data, &registry); err != nil {
		return nil, fmt.Errorf("failed to parse registry: %w", err)
	}

	instances := make([]Instance, 0, len(registry.Instances))
	for _, rawInst := range registry.Instances {
		inst := Instance{}

		if name, ok := rawInst["name"].(string); ok {
			inst.Name = name
		}
		if domain, ok := rawInst["domain"].(string); ok {
			inst.Domain = domain
		}
		if host, ok := rawInst["host"].(string); ok && host != "" {
			inst.Host = &host
		}
		if user, ok := rawInst["user"].(string); ok {
			inst.User = user
		}
		if status, ok := rawInst["status"].(string); ok {
			inst.Status = status
		}

		inst.Additional = rawInst

		instances = append(instances, inst)
	}

	return instances, nil
}
|
||||
func virshCmd(args ...string) (string, error) {
|
||||
cmd := exec.Command("virsh", append([]string{"-c", "qemu:///system"}, args...)...)
|
||||
output, err := cmd.CombinedOutput()
|
||||
return string(output), err
|
||||
}
|
||||
|
||||
func sshCmd(host, user, command string) (string, error) {
|
||||
cmd := exec.Command("ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", "-o", "StrictHostKeyChecking=no", "-o", "UserKnownHostsFile=/dev/null", "-o", "LogLevel=ERROR", "-q", fmt.Sprintf("%s@%s", user, host), command)
|
||||
output, err := cmd.CombinedOutput()
|
||||
return string(output), err
|
||||
}
|
||||
|
||||
func parseVirshInfo(info, key string) string {
|
||||
re := regexp.MustCompile(fmt.Sprintf(`(?m)^%s:\s*(.*)$`, regexp.QuoteMeta(key)))
|
||||
match := re.FindStringSubmatch(info)
|
||||
if match != nil && len(match) > 1 {
|
||||
return match[1]
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func parseMemoryKiB(memStr string) (int64, error) {
|
||||
memStr = strings.TrimSpace(strings.ToLower(memStr))
|
||||
re := regexp.MustCompile(`^(\d+(?:\.\d+)?)\s*([kmgt]?)i?b$`)
|
||||
match := re.FindStringSubmatch(memStr)
|
||||
if match == nil || len(match) < 3 {
|
||||
return 0, fmt.Errorf("invalid memory format: %s", memStr)
|
||||
}
|
||||
|
||||
value, err := strconv.ParseFloat(match[1], 64)
|
||||
if err != nil {
|
||||
return 0, fmt.Errorf("failed to parse memory value: %w", err)
|
||||
}
|
||||
|
||||
unit := match[2]
|
||||
multiplier := int64(1)
|
||||
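	switch unit {
	case "":
		// No prefix means plain bytes; convert to KiB instead of treating the value as KiB.
		return int64(value / 1024), nil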
|
||||
case "k":
|
||||
multiplier = 1
|
||||
case "m":
|
||||
multiplier = 1024
|
||||
case "g":
|
||||
multiplier = 1024 * 1024
|
||||
case "t":
|
||||
multiplier = 1024 * 1024 * 1024
|
||||
}
|
||||
|
||||
return int64(value * float64(multiplier)), nil
|
||||
}
|
||||
|
||||
func parseCPUTimeNS(cpuStr string) (int64, error) {
|
||||
parts := strings.Fields(cpuStr)
|
||||
	// parts[4] is read below, so at least five fields are required.
	if len(parts) < 5 {
|
||||
return 0, fmt.Errorf("invalid CPU time format")
|
||||
}
|
||||
|
||||
hours, _ := strconv.ParseFloat(parts[0], 64)
|
||||
minutes, _ := strconv.ParseFloat(parts[2], 64)
|
||||
seconds, _ := strconv.ParseFloat(strings.TrimSuffix(parts[4], "s"), 64)
|
||||
|
||||
totalSeconds := hours*3600 + minutes*60 + seconds
|
||||
return int64(totalSeconds * 1e9), nil
|
||||
}
|
||||
|
||||
func getDiskStats(path string) (actual, virtual int64, err error) {
|
||||
	// GNU stat: %B is the size in bytes of each block reported by %b (allocated block count).
	info, err := exec.Command("stat", "-c", "%B %b", path).Output()
|
||||
if err != nil {
|
||||
return 0, 0, err
|
||||
}
|
||||
|
||||
fields := strings.Fields(string(info))
|
||||
if len(fields) < 2 {
|
||||
return 0, 0, fmt.Errorf("invalid stat output")
|
||||
}
|
||||
|
||||
blockSize, _ := strconv.ParseInt(fields[0], 10, 64)
|
||||
blockCount, _ := strconv.ParseInt(fields[1], 10, 64)
|
||||
actual = blockSize * blockCount
|
||||
|
||||
return actual, 0, nil
|
||||
}
|
||||
@@ -0,0 +1,64 @@
|
||||
package openclaw
|
||||
|
||||
import "time"
|
||||
|
||||
type Instance struct {
|
||||
Name string `json:"name"`
|
||||
Domain string `json:"domain"`
|
||||
Host *string `json:"host,omitempty"`
|
||||
User string `json:"user"`
|
||||
Status string `json:"status"`
|
||||
|
||||
Additional map[string]any `json:"-"`
|
||||
}
|
||||
|
||||
type HostMetrics struct {
|
||||
State string `json:"state"`
|
||||
VCPUs int `json:"vcpus"`
|
||||
MemoryKiB int64 `json:"memory_kib"`
|
||||
Autostart bool `json:"autostart"`
|
||||
Snapshots int `json:"snapshots"`
|
||||
DiskActual int64 `json:"disk_actual_bytes"`
|
||||
CPUTime int64 `json:"cpu_time_ns"`
|
||||
}
|
||||
|
||||
type GuestMetrics struct {
|
||||
ServiceActive bool `json:"service_active"`
|
||||
ServiceUptime string `json:"service_uptime"`
|
||||
HTTPStatus int `json:"http_status"`
|
||||
Version string `json:"version"`
|
||||
VersionConsistent bool `json:"version_consistent"`
|
||||
MemoryTotal int64 `json:"memory_total_bytes"`
|
||||
MemoryUsed int64 `json:"memory_used_bytes"`
|
||||
MemoryPercent float64 `json:"memory_percent"`
|
||||
DiskTotal int64 `json:"disk_total_bytes"`
|
||||
DiskUsed int64 `json:"disk_used_bytes"`
|
||||
DiskPercent float64 `json:"disk_percent"`
|
||||
LoadAverage float64 `json:"load_average"`
|
||||
Swappiness int `json:"swappiness"`
|
||||
}
|
||||
|
||||
type BackupStatus struct {
|
||||
LastBackup string `json:"last_backup"`
|
||||
AgeHours float64 `json:"age_hours"`
|
||||
}
|
||||
|
||||
type Metrics struct {
|
||||
Instance Instance `json:"instance"`
|
||||
Host HostMetrics `json:"host"`
|
||||
Guest *GuestMetrics `json:"guest,omitempty"`
|
||||
Backup *BackupStatus `json:"backup,omitempty"`
|
||||
Timestamp time.Time `json:"timestamp"`
|
||||
Error string `json:"error,omitempty"`
|
||||
}
|
||||
|
||||
type Issues struct {
|
||||
GuestDiskUsageHigh bool `json:"guest_disk_usage_high"`
|
||||
GuestMemoryUsageHigh bool `json:"guest_memory_usage_high"`
|
||||
HostDiskUsageHigh bool `json:"host_disk_usage_high"`
|
||||
GatewayDown bool `json:"gateway_down"`
|
||||
HTTPUnhealthy bool `json:"http_unhealthy"`
|
||||
VersionMismatch bool `json:"version_mismatch"`
|
||||
VMNotRunning bool `json:"vm_not_running"`
|
||||
BackupStale bool `json:"backup_stale"`
|
||||
}
|
||||
@@ -0,0 +1,211 @@
|
||||
# Agentmon SDK
|
||||
|
||||
The agentmon SDK provides a Go client for sending telemetry events to the agentmon backend.
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
|
||||
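# The SDK ships inside the agentmon module (import path agentmon/internal/sdk).
# Go's internal/ rule limits imports to code within this module, so there is
# nothing to fetch with `go get`; from the repository root, build it with:
go build ./internal/sdk/...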
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
```go
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"log"
|
||||
|
||||
"agentmon/internal/sdk"
|
||||
)
|
||||
|
||||
func main() {
|
||||
emitter, err := sdk.NewEmitter(sdk.Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
Framework: "my-agent",
|
||||
ClientID: "my-client-001",
|
||||
Host: "localhost",
|
||||
BufferSize: 100,
|
||||
})
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
defer emitter.Close(context.Background())
|
||||
|
||||
ctx := context.Background()
|
||||
sessionID := "session-123"
|
||||
|
||||
// Start a session
|
||||
sessionStart := sdk.NewSessionStart(sessionID, sdk.WithSource(emitter))
|
||||
if err := emitter.Emit(ctx, sessionStart); err != nil {
|
||||
log.Printf("Error: %v", err)
|
||||
}
|
||||
|
||||
// ... do work ...
|
||||
|
||||
// End the session
|
||||
sessionEnd := sdk.NewSessionEnd(sessionID, sdk.WithSource(emitter))
|
||||
if err := emitter.Emit(ctx, sessionEnd); err != nil {
|
||||
log.Printf("Error: %v", err)
|
||||
}
|
||||
|
||||
// Flush buffered events
|
||||
if err := emitter.Flush(ctx); err != nil {
|
||||
log.Printf("Error: %v", err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
The `Config` struct configures the emitter:
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `ServerURL` | string | Yes | - | URL of the ingest gateway (e.g., `http://localhost:8080`) |
|
||||
| `APIKey` | string | No | - | Optional API key for authentication |
|
||||
| `Framework` | string | Yes | - | Name of the framework (e.g., `opencode`, `claude-code`) |
|
||||
| `ClientID` | string | Yes | - | Stable identifier for this emitter instance |
|
||||
| `Host` | string | No | `localhost` | Hostname where events originate |
|
||||
| `BufferSize` | int | No | `100` | Max number of events to buffer before flushing |
|
||||
| `UseWebSocket` | bool | No | `false` | Enable WebSocket streaming mode |
|
||||
| `EnableLogging` | bool | No | `false` | Enable debug logging |
|
||||
|
||||
## Event Types
|
||||
|
||||
### Session Events
|
||||
|
||||
```go
|
||||
// Start a session
|
||||
sessionStart := sdk.NewSessionStart(sessionID,
|
||||
sdk.WithSource(emitter),
|
||||
sdk.WithAttributes(map[string]any{
|
||||
"cwd": "/home/user/project",
|
||||
"repo": "myrepo",
|
||||
"branch": "main",
|
||||
}),
|
||||
)
|
||||
|
||||
// End a session
|
||||
sessionEnd := sdk.NewSessionEnd(sessionID,
|
||||
sdk.WithSource(emitter),
|
||||
)
|
||||
```
|
||||
|
||||
### Run Events
|
||||
|
||||
```go
|
||||
// Start a run
|
||||
runStart := sdk.NewRunStart(sessionID, runID,
|
||||
sdk.WithSource(emitter),
|
||||
sdk.WithAttributes(map[string]any{
|
||||
"command": "my-command",
|
||||
"agent": "my-agent",
|
||||
}),
|
||||
)
|
||||
|
||||
// End a run
|
||||
runEnd := sdk.NewRunEnd(sessionID, runID, "success", 60000,
|
||||
sdk.WithSource(emitter),
|
||||
sdk.WithLLMUsage("claude-3-opus", 1000, 500, 0.015),
|
||||
)
|
||||
```
|
||||
|
||||
### Span Events
|
||||
|
||||
```go
|
||||
// Start a span
|
||||
spanStart := sdk.NewSpanStart(sessionID, runID, traceID, spanID,
|
||||
sdk.WithSource(emitter),
|
||||
sdk.WithSpanKind("tool"),
|
||||
sdk.WithName("Bash"),
|
||||
sdk.WithAttributes(map[string]any{
|
||||
"command": "echo hello",
|
||||
}),
|
||||
)
|
||||
|
||||
// End a span
|
||||
spanEnd := sdk.NewSpanEnd(sessionID, runID, traceID, spanID, "success", 1000,
|
||||
sdk.WithSource(emitter),
|
||||
sdk.WithSpanKind("tool"),
|
||||
sdk.WithName("Bash"),
|
||||
)
|
||||
```
|
||||
|
||||
### Error Events
|
||||
|
||||
```go
|
||||
errEvent := sdk.NewError(sessionID, runID, traceID, spanID,
|
||||
"validation", "invalid input",
|
||||
sdk.WithSource(emitter),
|
||||
sdk.WithErrorDetails("VAL001", false),
|
||||
)
|
||||
```
|
||||
|
||||
### Metric Snapshots
|
||||
|
||||
```go
|
||||
metrics := sdk.NewMetricSnapshot(sessionID, runID, map[string]any{
|
||||
"tokens_in": 1000.0,
|
||||
"tokens_out": 500.0,
|
||||
"cost_usd": 0.015,
|
||||
"latency_ms": 300.0,
|
||||
"error_count": 0,
|
||||
})
|
||||
```
|
||||
|
||||
## Event Options
|
||||
|
||||
Event options are functions that modify events before sending:
|
||||
|
||||
- `WithSource(emitter)` - Add source information (framework, client_id, host)
|
||||
- `WithAttributes(attrs)` - Add arbitrary attributes
|
||||
- `WithSpanKind(kind)` - Set the span_kind attribute (`llm`, `tool`, `skill`, `internal`)
|
||||
- `WithName(name)` - Set the name attribute
|
||||
- `WithParentSpanID(parentID)` - Set the parent span ID
|
||||
- `WithPayload(payload)` - Set custom payload
|
||||
- `WithSeq(seq)` - Set sequence number (for WebSocket mode)
|
||||
- `WithLLMUsage(model, inTokens, outTokens, costUSD)` - Add LLM usage to run.end or span.end
|
||||
- `WithErrorDetails(code, retryable)` - Add error details
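
The options listed above follow the functional-options pattern: each option is a function that mutates the event map, and the constructor applies them in order. The sketch below shows the idea in isolation. The names `EventOption`, `withAttributes`, `withSpanKind`, and `newEvent` are assumptions made for illustration and may not match the SDK's actual definitions.

```go
package main

import "fmt"

// Event mirrors the SDK's map-based event type.
type Event map[string]any

// EventOption mutates an event in place (assumed shape of the SDK's option type).
type EventOption func(Event)

// withAttributes merges attrs into the event's "attributes" map,
// creating the map on first use. Hypothetical helper.
func withAttributes(attrs map[string]any) EventOption {
	return func(e Event) {
		existing, ok := e["attributes"].(map[string]any)
		if !ok {
			existing = map[string]any{}
			e["attributes"] = existing
		}
		for k, v := range attrs {
			existing[k] = v
		}
	}
}

// withSpanKind stores the span kind as an attribute. Hypothetical helper.
func withSpanKind(kind string) EventOption {
	return withAttributes(map[string]any{"span_kind": kind})
}

// newEvent builds a bare event and applies the options in order.
func newEvent(eventType string, opts ...EventOption) Event {
	e := Event{"event": map[string]any{"type": eventType}}
	for _, opt := range opts {
		opt(e)
	}
	return e
}

func main() {
	e := newEvent("span.start",
		withSpanKind("tool"),
		withAttributes(map[string]any{"command": "echo hello"}),
	)
	fmt.Println(e["attributes"])
}
```

Because options run in order, a later option can overwrite a key set by an earlier one; merging into an existing `attributes` map keeps earlier keys that are not overwritten.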
|
||||
|
||||
## Span Kinds
|
||||
|
||||
Common span kinds:
|
||||
|
||||
- `llm` - LLM API calls
|
||||
- `tool` - Tool/function calls
|
||||
- `skill` - Skill execution
|
||||
- `internal` - Internal operations
|
||||
|
||||
## WebSocket Mode
|
||||
|
||||
For real-time streaming, enable WebSocket mode:
|
||||
|
||||
```go
|
||||
emitter, err := sdk.NewEmitter(sdk.Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
Framework: "my-agent",
|
||||
ClientID: "my-client-001",
|
||||
Host: "localhost",
|
||||
UseWebSocket: true,
|
||||
})
|
||||
```
|
||||
|
||||
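In WebSocket mode, `Emit` bypasses the batch buffer and queues each event for immediate delivery over the socket. If the send queue is full, `Emit` returns an error instead of blocking.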
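
The SDK in this commit derives the WebSocket endpoint from `ServerURL` by swapping the scheme (`http` to `ws`, `https` to `wss`) and appending `/v1/ws`. The sketch below shows the same mapping built on `net/url`. `toWSURL` is a hypothetical helper, not the SDK's function, and unlike simple prefix slicing it tolerates a trailing slash and rejects other schemes.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// toWSURL converts an HTTP(S) base URL into the WebSocket endpoint URL.
// Hypothetical helper for illustration only.
func toWSURL(serverURL string) (string, error) {
	u, err := url.Parse(serverURL)
	if err != nil {
		return "", err
	}

	switch u.Scheme {
	case "http":
		u.Scheme = "ws"
	case "https":
		u.Scheme = "wss"
	default:
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}

	// Avoid a double slash when the base URL ends with "/".
	u.Path = strings.TrimSuffix(u.Path, "/") + "/v1/ws"
	return u.String(), nil
}

func main() {
	for _, in := range []string{"http://localhost:8080", "https://example.com/"} {
		out, err := toWSURL(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(in, "->", out)
	}
}
```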
|
||||
|
||||
## Example
|
||||
|
||||
See `examples/sdk-example/main.go` for a complete example.
|
||||
|
||||
## Testing
|
||||
|
||||
Run tests with:
|
||||
|
||||
```bash
|
||||
go test ./internal/sdk/...
|
||||
```
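
When testing code that flushes events, a local stub can stand in for the ingest gateway. The sketch below assumes the wire contract used by the emitter in this commit: a JSON array posted to `/v1/events`, answered with `202 Accepted` and a JSON body containing `accepted` and `rejected` counts. `newStubGateway` and `postEvents` are hypothetical names, and the stub performs no validation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newStubGateway returns a test server that accepts event batches on
// POST /v1/events and replies 202 with the number of events received.
func newStubGateway() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/events", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var events []map[string]any
		if err := json.NewDecoder(r.Body).Decode(&events); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusAccepted)
		_ = json.NewEncoder(w).Encode(map[string]int{
			"accepted": len(events),
			"rejected": 0,
		})
	})
	return httptest.NewServer(mux)
}

// postEvents sends one batch and returns the HTTP status and accepted count.
func postEvents(baseURL string, events []map[string]any) (int, int, error) {
	body, err := json.Marshal(events)
	if err != nil {
		return 0, 0, err
	}
	resp, err := http.Post(baseURL+"/v1/events", "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()

	var result struct {
		Accepted int `json:"accepted"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return resp.StatusCode, 0, err
	}
	return resp.StatusCode, result.Accepted, nil
}

func main() {
	srv := newStubGateway()
	defer srv.Close()

	batch := []map[string]any{
		{"event": map[string]any{"type": "session.start"}},
	}
	status, accepted, err := postEvents(srv.URL, batch)
	if err != nil {
		panic(err)
	}
	fmt.Println(status, accepted)
}
```

Pointing an emitter's `ServerURL` at the stub's URL lets a test exercise `Flush` without a running gateway.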
|
||||
|
||||
## License
|
||||
|
||||
Same license as the agentmon project.
|
||||
@@ -0,0 +1,351 @@
|
||||
// Package sdk provides the agentmon emitter SDK for sending telemetry events.
|
||||
package sdk
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"crypto/rand"
|
||||
"encoding/hex"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"log"
|
||||
"net/http"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/gorilla/websocket"
|
||||
)
|
||||
|
||||
const (
|
||||
schemaName = "agentmon.event"
|
||||
schemaVersion = 1
|
||||
)
|
||||
|
||||
// Emitter is the main client for sending agentmon events.
|
||||
type Emitter struct {
|
||||
config Config
|
||||
httpClient *http.Client
|
||||
wsClient *WSClient
|
||||
buffer []Event
|
||||
bufferSize int
|
||||
mu sync.Mutex
|
||||
closed bool
|
||||
}
|
||||
|
||||
// Config holds emitter configuration.
|
||||
type Config struct {
|
||||
// ServerURL is the base URL of the ingest gateway (e.g., "http://localhost:8080")
|
||||
ServerURL string
|
||||
// APIKey is optional authentication key
|
||||
APIKey string
|
||||
// Framework is the name of the agent framework (e.g., "opencode", "claude-code")
|
||||
Framework string
|
||||
// ClientID is a stable identifier for this emitter instance
|
||||
ClientID string
|
||||
// Host is the hostname where events originate
|
||||
Host string
|
||||
// BufferSize is the max number of events to buffer before flushing
|
||||
BufferSize int
|
||||
// UseWebSocket enables WebSocket streaming mode
|
||||
UseWebSocket bool
|
||||
// EnableLogging enables debug logging
|
||||
EnableLogging bool
|
||||
}
|
||||
|
||||
// Event represents a complete agentmon event.
|
||||
type Event map[string]any
|
||||
|
||||
// WSClient handles WebSocket communication with the ingest gateway.
|
||||
type WSClient struct {
|
||||
conn *websocket.Conn
|
||||
sendChan chan []byte
|
||||
ackChan chan int
|
||||
mu sync.Mutex
|
||||
closed bool
|
||||
}
|
||||
|
||||
// NewWSClient creates a new WebSocket client.
|
||||
func NewWSClient(url string) (*WSClient, error) {
|
||||
conn, _, err := websocket.DefaultDialer.Dial(url, nil)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
return &WSClient{
|
||||
conn: conn,
|
||||
sendChan: make(chan []byte, 100),
|
||||
ackChan: make(chan int, 1),
|
||||
}, nil
|
||||
}
|
||||
|
||||
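// Run starts the read loop in a goroutine and runs the write loop, blocking
// until the send channel is closed or a write fails. ctx is currently unused.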
|
||||
func (w *WSClient) Run(ctx context.Context) {
|
||||
defer w.Close()
|
||||
|
||||
go w.readMessages()
|
||||
w.writeMessages()
|
||||
}
|
||||
|
||||
// Send queues an event to be sent via WebSocket.
|
||||
func (w *WSClient) Send(data []byte) error {
	// Hold the lock across the channel send so Close cannot close sendChan
	// between the closed check and the send (which would panic).
	// The select below never blocks, so the lock is held only briefly.
	w.mu.Lock()
	defer w.mu.Unlock()

	if w.closed {
		return fmt.Errorf("WebSocket client is closed")
	}

	select {
	case w.sendChan <- data:
		return nil
	default:
		return fmt.Errorf("send buffer full")
	}
}
|
||||
|
||||
// Close closes the WebSocket connection.
|
||||
func (w *WSClient) Close() error {
|
||||
w.mu.Lock()
|
||||
defer w.mu.Unlock()
|
||||
|
||||
if w.closed {
|
||||
return nil
|
||||
}
|
||||
w.closed = true
|
||||
|
||||
if w.conn != nil {
|
||||
_ = w.conn.Close()
|
||||
}
|
||||
close(w.sendChan)
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func (w *WSClient) readMessages() {
|
||||
for {
|
||||
_, message, err := w.conn.ReadMessage()
|
||||
if err != nil {
|
||||
if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseNormalClosure) {
|
||||
log.Printf("WebSocket read error: %v", err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
var ack map[string]any
|
||||
if err := json.Unmarshal(message, &ack); err != nil {
|
||||
log.Printf("Failed to unmarshal ack: %v", err)
|
||||
continue
|
||||
}
|
||||
|
||||
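		// Use the two-value assertion so a message without an "ack" object
		// does not panic; indexing a nil map is safe.
		ackBody, _ := ack["ack"].(map[string]any)
		if seq, ok := ackBody["up_to_seq"].(float64); ok {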
|
||||
select {
|
||||
case w.ackChan <- int(seq):
|
||||
default:
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (w *WSClient) writeMessages() {
|
||||
for data := range w.sendChan {
|
||||
w.mu.Lock()
|
||||
if w.closed {
|
||||
w.mu.Unlock()
|
||||
return
|
||||
}
|
||||
|
||||
err := w.conn.WriteMessage(websocket.TextMessage, data)
|
||||
w.mu.Unlock()
|
||||
|
||||
if err != nil {
|
||||
log.Printf("WebSocket write error: %v", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// NewEmitter creates a new emitter with the given configuration.
|
||||
func NewEmitter(cfg Config) (*Emitter, error) {
|
||||
if cfg.ServerURL == "" {
|
||||
return nil, fmt.Errorf("ServerURL is required")
|
||||
}
|
||||
if cfg.Framework == "" {
|
||||
return nil, fmt.Errorf("Framework is required")
|
||||
}
|
||||
if cfg.ClientID == "" {
|
||||
return nil, fmt.Errorf("ClientID is required")
|
||||
}
|
||||
if cfg.Host == "" {
|
||||
cfg.Host = "localhost"
|
||||
}
|
||||
if cfg.BufferSize <= 0 {
|
||||
cfg.BufferSize = 100
|
||||
}
|
||||
|
||||
e := &Emitter{
|
||||
config: cfg,
|
||||
httpClient: &http.Client{Timeout: 30 * time.Second},
|
||||
buffer: make([]Event, 0, cfg.BufferSize),
|
||||
bufferSize: cfg.BufferSize,
|
||||
}
|
||||
|
||||
if cfg.UseWebSocket {
|
||||
wsURL := wsURLFromHTTP(cfg.ServerURL)
|
||||
wsClient, err := NewWSClient(wsURL)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to create WebSocket client: %w", err)
|
||||
}
|
||||
e.wsClient = wsClient
|
||||
go e.wsClient.Run(context.Background())
|
||||
}
|
||||
|
||||
return e, nil
|
||||
}
|
||||
|
||||
// Emit sends a single event.
|
||||
func (e *Emitter) Emit(ctx context.Context, event Event) error {
	e.mu.Lock()
	defer e.mu.Unlock()

	if e.closed {
		return fmt.Errorf("emitter is closed")
	}

	if e.config.UseWebSocket {
		data, err := json.Marshal(event)
		if err != nil {
			return fmt.Errorf("failed to marshal event: %w", err)
		}
		return e.wsClient.Send(data)
	}

	e.buffer = append(e.buffer, event)
	if len(e.buffer) >= e.bufferSize {
		// e.mu is already held and sync.Mutex is not reentrant, so call the
		// helper that assumes the lock instead of Flush (which would deadlock).
		return e.flushLocked(ctx)
	}
	return nil
}

// Flush sends all buffered events to the server.
func (e *Emitter) Flush(ctx context.Context) error {
	e.mu.Lock()
	defer e.mu.Unlock()

	if e.closed {
		return fmt.Errorf("emitter is closed")
	}
	return e.flushLocked(ctx)
}

// flushLocked sends the buffered events. The caller must hold e.mu.
func (e *Emitter) flushLocked(ctx context.Context) error {
	if len(e.buffer) == 0 {
		return nil
	}

	if e.config.EnableLogging {
		log.Printf("Flushing %d events", len(e.buffer))
	}

	events := make([]map[string]any, len(e.buffer))
	for i, ev := range e.buffer {
		events[i] = ev
	}

	resp, err := e.sendEvents(ctx, events)
	if err != nil {
		return fmt.Errorf("failed to send events: %w", err)
	}

	e.buffer = e.buffer[:0]

	if resp.Rejected > 0 && e.config.EnableLogging {
		log.Printf("Rejected %d events", resp.Rejected)
		if len(resp.Errors) > 0 {
			log.Printf("Errors: %v", resp.Errors)
		}
	}

	return nil
}

// Close flushes any remaining events and closes the emitter.
func (e *Emitter) Close(ctx context.Context) error {
	e.mu.Lock()
	defer e.mu.Unlock()

	if e.closed {
		return nil
	}

	e.closed = true

	if e.wsClient != nil {
		_ = e.wsClient.Close()
	}

	// Best-effort final flush. flushLocked skips both the lock and the
	// closed check, so remaining events are still sent after closed is set.
	_ = e.flushLocked(ctx)

	return nil
}
|
||||
|
||||
type sendResponse struct {
|
||||
Accepted int `json:"accepted"`
|
||||
Rejected int `json:"rejected"`
|
||||
Errors []struct {
|
||||
Error string `json:"error"`
|
||||
} `json:"errors,omitempty"`
|
||||
}
|
||||
|
||||
func (e *Emitter) sendEvents(ctx context.Context, events []map[string]any) (*sendResponse, error) {
|
||||
body, err := json.Marshal(events)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
req, err := http.NewRequestWithContext(ctx, "POST", e.config.ServerURL+"/v1/events", bytes.NewReader(body))
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if e.config.APIKey != "" {
|
||||
req.Header.Set("Authorization", "Bearer "+e.config.APIKey)
|
||||
}
|
||||
|
||||
resp, err := e.httpClient.Do(req)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
|
||||
if resp.StatusCode != http.StatusAccepted {
|
||||
return nil, fmt.Errorf("unexpected status code: %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
var result sendResponse
|
||||
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
return &result, nil
|
||||
}
|
||||
|
||||
func wsURLFromHTTP(httpURL string) string {
|
||||
switch {
|
||||
case len(httpURL) >= 8 && httpURL[:8] == "https://":
|
||||
return "wss://" + httpURL[8:] + "/v1/ws"
|
||||
case len(httpURL) >= 7 && httpURL[:7] == "http://":
|
||||
return "ws://" + httpURL[7:] + "/v1/ws"
|
||||
default:
|
||||
return httpURL + "/v1/ws"
|
||||
}
|
||||
}
|
||||
|
||||
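// generateID returns 16 random bytes as a 32-character hex string (not a
// formatted UUID). It falls back to a nanosecond timestamp if the random
// source fails.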
|
||||
func generateID() string {
|
||||
b := make([]byte, 16)
|
||||
if _, err := rand.Read(b); err != nil {
|
||||
log.Printf("Failed to generate random ID: %v", err)
|
||||
return fmt.Sprintf("%d", time.Now().UnixNano())
|
||||
}
|
||||
return hex.EncodeToString(b)
|
||||
}
|
||||
@@ -0,0 +1,348 @@
|
||||
package sdk
|
||||
|
||||
import (
|
||||
"context"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestNewEmitter(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
config Config
|
||||
wantErr bool
|
||||
}{
|
||||
{
|
||||
name: "valid config",
|
||||
config: Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
Framework: "test-framework",
|
||||
ClientID: "test-client",
|
||||
Host: "test-host",
|
||||
},
|
||||
wantErr: false,
|
||||
},
|
||||
{
|
||||
name: "missing server URL",
|
||||
config: Config{
|
||||
Framework: "test-framework",
|
||||
ClientID: "test-client",
|
||||
},
|
||||
wantErr: true,
|
||||
},
|
||||
{
|
||||
name: "missing framework",
|
||||
config: Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
ClientID: "test-client",
|
||||
},
|
||||
wantErr: true,
|
||||
},
|
||||
{
|
||||
name: "missing client ID",
|
||||
config: Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
Framework: "test-framework",
|
||||
},
|
||||
wantErr: true,
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
emitter, err := NewEmitter(tt.config)
|
||||
if (err != nil) != tt.wantErr {
|
||||
t.Errorf("NewEmitter() error = %v, wantErr %v", err, tt.wantErr)
|
||||
return
|
||||
}
|
||||
if !tt.wantErr {
|
||||
_ = emitter.Close(context.Background())
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestGenerateID(t *testing.T) {
|
||||
id1 := generateID()
|
||||
id2 := generateID()
|
||||
|
||||
if id1 == id2 {
|
||||
t.Errorf("generateID() should produce unique IDs, got duplicate: %s", id1)
|
||||
}
|
||||
|
||||
if len(id1) == 0 {
|
||||
t.Error("generateID() should return a non-empty string")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewSessionStart(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
event := NewSessionStart(sessionID)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "session.start" {
|
||||
t.Error("NewSessionStart() should create session.start event")
|
||||
}
|
||||
|
||||
if event["correlation"].(map[string]any)["session_id"] != sessionID {
|
||||
t.Error("NewSessionStart() should set session_id in correlation")
|
||||
}
|
||||
|
||||
if event["schema"].(map[string]any)["name"] != schemaName {
|
||||
t.Error("NewSessionStart() should set correct schema name")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewRunStart(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
event := NewRunStart(sessionID, runID)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "run.start" {
|
||||
t.Error("NewRunStart() should create run.start event")
|
||||
}
|
||||
|
||||
if event["correlation"].(map[string]any)["session_id"] != sessionID {
|
||||
t.Error("NewRunStart() should set session_id in correlation")
|
||||
}
|
||||
|
||||
if event["correlation"].(map[string]any)["run_id"] != runID {
|
||||
t.Error("NewRunStart() should set run_id in correlation")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewSpanStart(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
traceID := "test-trace-001"
|
||||
spanID := "test-span-001"
|
||||
event := NewSpanStart(sessionID, runID, traceID, spanID)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "span.start" {
|
||||
t.Error("NewSpanStart() should create span.start event")
|
||||
}
|
||||
|
||||
if event["correlation"].(map[string]any)["span_id"] != spanID {
|
||||
t.Error("NewSpanStart() should set span_id in correlation")
|
||||
}
|
||||
|
||||
if event["correlation"].(map[string]any)["trace_id"] != traceID {
|
||||
t.Error("NewSpanStart() should set trace_id in correlation")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewRunEnd(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
status := "success"
|
||||
durationMs := int64(60000)
|
||||
event := NewRunEnd(sessionID, runID, status, durationMs)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "run.end" {
|
||||
t.Error("NewRunEnd() should create run.end event")
|
||||
}
|
||||
|
||||
payload := event["payload"].(map[string]any)
|
||||
if payload["status"] != status {
|
||||
t.Errorf("NewRunEnd() should set status to %s", status)
|
||||
}
|
||||
|
||||
if payload["duration_ms"] != durationMs {
|
||||
t.Errorf("NewRunEnd() should set duration_ms to %d", durationMs)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewSpanEnd(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
traceID := "test-trace-001"
|
||||
spanID := "test-span-001"
|
||||
status := "success"
|
||||
durationMs := int64(1000)
|
||||
event := NewSpanEnd(sessionID, runID, traceID, spanID, status, durationMs)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "span.end" {
|
||||
t.Error("NewSpanEnd() should create span.end event")
|
||||
}
|
||||
|
||||
payload := event["payload"].(map[string]any)
|
||||
if payload["status"] != status {
|
||||
t.Errorf("NewSpanEnd() should set status to %s", status)
|
||||
}
|
||||
|
||||
if payload["duration_ms"] != durationMs {
|
||||
t.Errorf("NewSpanEnd() should set duration_ms to %d", durationMs)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewError(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
traceID := "test-trace-001"
|
||||
spanID := "test-span-001"
|
||||
errType := "validation"
|
||||
message := "invalid input"
|
||||
event := NewError(sessionID, runID, traceID, spanID, errType, message)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "error" {
|
||||
t.Error("NewError() should create error event")
|
||||
}
|
||||
|
||||
payload := event["payload"].(map[string]any)
|
||||
err := payload["error"].(map[string]any)
|
||||
if err["type"] != errType {
|
||||
t.Errorf("NewError() should set error type to %s", errType)
|
||||
}
|
||||
|
||||
if err["message"] != message {
|
||||
t.Errorf("NewError() should set error message to %s", message)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewMetricSnapshot(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
metrics := map[string]any{
|
||||
"tokens_in": 1000.0,
|
||||
"tokens_out": 500.0,
|
||||
"cost_usd": 0.002,
|
||||
}
|
||||
event := NewMetricSnapshot(sessionID, runID, metrics)
|
||||
|
||||
if event["event"].(map[string]any)["type"] != "metric.snapshot" {
|
||||
t.Error("NewMetricSnapshot() should create metric.snapshot event")
|
||||
}
|
||||
|
||||
payload := event["payload"].(map[string]any)
|
||||
payloadMetrics := payload["metrics"].(map[string]any)
|
||||
if len(payloadMetrics) != len(metrics) {
|
||||
t.Error("NewMetricSnapshot() should include all metrics")
|
||||
}
|
||||
}
|
||||
|
||||
func TestEventOptions(t *testing.T) {
|
||||
emitter, err := NewEmitter(Config{
|
||||
ServerURL: "http://localhost:8080",
|
||||
Framework: "test-framework",
|
||||
ClientID: "test-client",
|
||||
Host: "test-host",
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatalf("NewEmitter() error = %v", err)
|
||||
}
|
||||
defer emitter.Close(context.Background())
|
||||
|
||||
sessionID := "test-session-001"
|
||||
event := NewSessionStart(sessionID,
|
||||
WithSource(emitter),
|
||||
WithAttributes(map[string]any{"cwd": "/home/user"}),
|
||||
WithSeq(1),
|
||||
)
|
||||
|
||||
if _, ok := event["event"].(map[string]any)["source"]; !ok {
|
||||
t.Error("WithSource() should set source")
|
||||
}
|
||||
|
||||
if _, ok := event["attributes"]; !ok {
|
||||
t.Error("WithAttributes() should set attributes")
|
||||
}
|
||||
|
||||
if event["event"].(map[string]any)["seq"] != 1 {
|
||||
t.Error("WithSeq() should set sequence number")
|
||||
}
|
||||
}
|
||||
|
||||
func TestWithSpanKind(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
traceID := "test-trace-001"
|
||||
spanID := "test-span-001"
|
||||
event := NewSpanStart(sessionID, runID, traceID, spanID, WithSpanKind("tool"))
|
||||
|
||||
attrs := event["attributes"].(map[string]any)
|
||||
if attrs["span_kind"] != "tool" {
|
||||
t.Error("WithSpanKind() should set span_kind attribute")
|
||||
}
|
||||
}
|
||||
|
||||
func TestWithLLMUsage(t *testing.T) {
|
||||
sessionID := "test-session-001"
|
||||
runID := "test-run-001"
|
||||
event := NewRunEnd(sessionID, runID, "success", 60000,
|
||||
WithLLMUsage("claude-3-opus", 1000, 500, 0.015),
|
||||
)
|
||||
|
||||
payload := event["payload"].(map[string]any)
|
||||
llm := payload["llm"].(map[string]any)
|
||||
|
||||
if llm["model"] != "claude-3-opus" {
|
||||
t.Error("WithLLMUsage() should set model")
|
||||
}
|
||||
|
||||
usage := llm["usage"].(map[string]any)
|
||||
if usage["input_tokens"] != 1000 {
|
||||
t.Error("WithLLMUsage() should set input_tokens")
|
||||
}
|
||||
|
||||
if usage["output_tokens"] != 500 {
|
||||
t.Error("WithLLMUsage() should set output_tokens")
|
||||
}
|
||||
|
||||
cost := llm["cost"].(map[string]any)
|
||||
if cost["total_usd"] != 0.015 {
|
||||
t.Error("WithLLMUsage() should set total_usd")
|
||||
}
|
||||
}
|
||||
|
||||
func TestEmit(t *testing.T) {
|
||||
emitter, err := NewEmitter(Config{
|
||||
ServerURL: "http://localhost:9999",
|
||||
Framework: "test-framework",
|
||||
ClientID: "test-client",
|
||||
Host: "test-host",
|
||||
BufferSize: 10,
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatalf("NewEmitter() error = %v", err)
|
||||
}
|
||||
|
||||
sessionID := "test-session-001"
|
||||
event := NewSessionStart(sessionID, WithSource(emitter))
|
||||
|
||||
ctx := context.Background()
|
||||
|
||||
err = emitter.Emit(ctx, event)
|
||||
if err != nil {
|
||||
t.Errorf("Emit() error = %v", err)
|
||||
}
|
||||
|
||||
err = emitter.Emit(ctx, NewSessionStart(sessionID+"-2", WithSource(emitter)))
|
||||
if err != nil {
|
||||
t.Errorf("Emit() error = %v", err)
|
||||
}
|
||||
|
||||
emitter.mu.Lock()
|
||||
buffered := len(emitter.buffer)
|
||||
emitter.mu.Unlock()
|
||||
|
||||
if buffered != 2 {
|
||||
t.Errorf("Expected 2 buffered events, got %d", buffered)
|
||||
}
|
||||
}
|
||||
|
||||
func TestWsURLFromHTTP(t *testing.T) {
|
||||
tests := []struct {
|
||||
httpURL string
|
||||
want string
|
||||
}{
|
||||
{"http://localhost:8080", "ws://localhost:8080/v1/ws"},
|
||||
{"https://example.com", "wss://example.com/v1/ws"},
|
||||
{"http://example.com:8080", "ws://example.com:8080/v1/ws"},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.httpURL, func(t *testing.T) {
|
||||
if got := wsURLFromHTTP(tt.httpURL); got != tt.want {
|
||||
t.Errorf("wsURLFromHTTP() = %v, want %v", got, tt.want)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,332 @@
package sdk

import (
	"time"
)

// NewSessionStart creates a session.start event.
func NewSessionStart(sessionID string, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "session.start",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewSessionEnd creates a session.end event.
func NewSessionEnd(sessionID string, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "session.end",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewRunStart creates a run.start event.
func NewRunStart(sessionID, runID string, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "run.start",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
			"run_id":     runID,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewRunEnd creates a run.end event.
func NewRunEnd(sessionID, runID string, status string, durationMs int64, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "run.end",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
			"run_id":     runID,
		},
		"payload": map[string]any{
			"status":      status,
			"duration_ms": durationMs,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewSpanStart creates a span.start event.
func NewSpanStart(sessionID, runID, traceID, spanID string, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "span.start",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
			"run_id":     runID,
			"trace_id":   traceID,
			"span_id":    spanID,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewSpanEnd creates a span.end event.
func NewSpanEnd(sessionID, runID, traceID, spanID string, status string, durationMs int64, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "span.end",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
			"run_id":     runID,
			"trace_id":   traceID,
			"span_id":    spanID,
		},
		"payload": map[string]any{
			"status":      status,
			"duration_ms": durationMs,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewError creates an error event.
func NewError(sessionID, runID, traceID, spanID string, errType, message string, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "error",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
			"run_id":     runID,
			"trace_id":   traceID,
			"span_id":    spanID,
		},
		"payload": map[string]any{
			"error": map[string]any{
				"type":    errType,
				"message": message,
			},
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// NewMetricSnapshot creates a metric.snapshot event.
func NewMetricSnapshot(sessionID, runID string, metrics map[string]any, opts ...EventOption) Event {
	now := time.Now()
	event := map[string]any{
		"schema": map[string]any{
			"name":    schemaName,
			"version": schemaVersion,
		},
		"event": map[string]any{
			"id":   generateID(),
			"type": "metric.snapshot",
			"ts":   now.UTC().Format(time.RFC3339Nano),
		},
		"correlation": map[string]any{
			"session_id": sessionID,
			"run_id":     runID,
		},
		"payload": map[string]any{
			"metrics": metrics,
		},
	}

	for _, opt := range opts {
		opt(event)
	}

	return event
}

// EventOption is a function that modifies an event.
type EventOption func(Event)

// WithSource sets the source information on an event.
func WithSource(emitter *Emitter) EventOption {
	return func(e Event) {
		if event, ok := e["event"].(map[string]any); ok {
			event["source"] = map[string]any{
				"framework": emitter.config.Framework,
				"client_id": emitter.config.ClientID,
				"host":      emitter.config.Host,
			}
		}
	}
}

// WithAttributes adds attributes to an event.
func WithAttributes(attrs map[string]any) EventOption {
	return func(e Event) {
		if _, ok := e["attributes"]; !ok {
			e["attributes"] = make(map[string]any)
		}
		if attrsMap, ok := e["attributes"].(map[string]any); ok {
			for k, v := range attrs {
				attrsMap[k] = v
			}
		}
	}
}

// WithSpanKind sets the span_kind attribute.
func WithSpanKind(kind string) EventOption {
	return WithAttributes(map[string]any{"span_kind": kind})
}

// WithName sets the name attribute.
func WithName(name string) EventOption {
	return WithAttributes(map[string]any{"name": name})
}

// WithParentSpanID sets the parent_span_id in correlation.
func WithParentSpanID(parentID string) EventOption {
	return func(e Event) {
		if corr, ok := e["correlation"].(map[string]any); ok {
			corr["parent_span_id"] = parentID
		}
	}
}

// WithPayload sets the payload on an event.
func WithPayload(payload map[string]any) EventOption {
	return func(e Event) {
		e["payload"] = payload
	}
}

// WithSeq sets the sequence number on an event.
func WithSeq(seq int) EventOption {
	return func(e Event) {
		if event, ok := e["event"].(map[string]any); ok {
			event["seq"] = seq
		}
	}
}

// WithLLMUsage adds LLM usage information to a span.end or run.end payload.
func WithLLMUsage(model string, inputTokens, outputTokens int, costUSD float64) EventOption {
	return func(e Event) {
		if payload, ok := e["payload"].(map[string]any); ok {
			if _, ok := payload["llm"]; !ok {
				payload["llm"] = make(map[string]any)
			}
			if llm, ok := payload["llm"].(map[string]any); ok {
				llm["model"] = model
				llm["usage"] = map[string]any{
					"input_tokens":  inputTokens,
					"output_tokens": outputTokens,
				}
				llm["cost"] = map[string]any{
					"total_usd": costUSD,
				}
			}
		}
	}
}

// WithErrorDetails adds error details to a payload.
func WithErrorDetails(code string, retryable bool) EventOption {
	return func(e Event) {
		if payload, ok := e["payload"].(map[string]any); ok {
			if err, ok := payload["error"].(map[string]any); ok {
				err["code"] = code
				err["retryable"] = retryable
			}
		}
	}
}
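Each `EventOption` mutates the event map in the order it is passed, so ordering matters: `WithPayload` replaces the whole payload, and `WithLLMUsage` does nothing when the event has no payload yet (as on `span.start`). The sketch below copies those two options from the diff to show the effect in isolation. `newEvent` and `hasLLM` are hypothetical helpers for the demonstration, and `Event` is assumed to be a map type, which the diff implies but does not show.

```go
package main

import "fmt"

// Event is assumed to be a map type; its definition is not part of the diff.
type Event map[string]any

// EventOption matches the type declared in the diff.
type EventOption func(Event)

// WithPayload is copied from the diff: it replaces the payload wholesale.
func WithPayload(payload map[string]any) EventOption {
	return func(e Event) {
		e["payload"] = payload
	}
}

// WithLLMUsage is copied from the diff: it only acts when a payload map exists.
func WithLLMUsage(model string, inputTokens, outputTokens int, costUSD float64) EventOption {
	return func(e Event) {
		if payload, ok := e["payload"].(map[string]any); ok {
			if _, ok := payload["llm"]; !ok {
				payload["llm"] = make(map[string]any)
			}
			if llm, ok := payload["llm"].(map[string]any); ok {
				llm["model"] = model
				llm["usage"] = map[string]any{
					"input_tokens":  inputTokens,
					"output_tokens": outputTokens,
				}
				llm["cost"] = map[string]any{"total_usd": costUSD}
			}
		}
	}
}

// newEvent is a hypothetical helper: it applies options, in order, to an empty event.
func newEvent(opts ...EventOption) Event {
	e := Event{}
	for _, opt := range opts {
		opt(e)
	}
	return e
}

// hasLLM is a hypothetical helper: it reports whether payload.llm was set.
func hasLLM(e Event) bool {
	payload, ok := e["payload"].(map[string]any)
	if !ok {
		return false
	}
	_, ok = payload["llm"]
	return ok
}

func main() {
	usage := WithLLMUsage("example-model", 10, 20, 0.001)
	// A fresh payload map per call, so the cases below do not share state.
	status := func() EventOption { return WithPayload(map[string]any{"status": "ok"}) }

	fmt.Println(hasLLM(newEvent(usage)))                     // false: no payload to attach to
	fmt.Println(hasLLM(newEvent(status(), usage)))           // true: payload exists first
	fmt.Println(hasLLM(newEvent(status(), usage, status()))) // false: payload was replaced
}
```

With the constructors in the diff, this suggests passing `WithPayload` before `WithLLMUsage`, and expecting `WithLLMUsage` to be ignored on events built without a payload.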
@@ -3,6 +3,7 @@ package postgres
 import (
 	"context"
 	"encoding/json"
+	"fmt"
 	"time"
 )
 
@@ -13,20 +14,39 @@ type EventRow struct {
 	Payload json.RawMessage `json:"payload"`
 }
 
-func (d *DB) ListRecentEvents(ctx context.Context, limit int) ([]EventRow, error) {
-	if limit <= 0 {
-		limit = 100
+type EventsFilter struct {
+	Limit     int
+	EventType string
+	Framework string
+}
+
+func (d *DB) ListRecentEvents(ctx context.Context, f EventsFilter) ([]EventRow, error) {
+	if f.Limit <= 0 {
+		f.Limit = 100
 	}
-	if limit > 1000 {
-		limit = 1000
+	if f.Limit > 1000 {
+		f.Limit = 1000
 	}
 
-	rows, err := d.sql.QueryContext(ctx, `
-select event_id, ts, type, payload
-from events
-order by ts desc
-limit $1
-`, limit)
+	query := "SELECT event_id, ts, type, payload FROM events WHERE 1=1"
+	args := []any{}
+	argN := 1
+
+	if f.EventType != "" {
+		query += fmt.Sprintf(" AND type = $%d", argN)
+		args = append(args, f.EventType)
+		argN++
+	}
+	if f.Framework != "" {
+		query += fmt.Sprintf(" AND source_framework = $%d", argN)
+		args = append(args, f.Framework)
+		argN++
+	}
+
+	query += fmt.Sprintf(" ORDER BY ts DESC LIMIT $%d", argN)
+	args = append(args, f.Limit)
+
+	rows, err := d.sql.QueryContext(ctx, query, args...)
 	if err != nil {
 		return nil, err
 	}
@@ -40,8 +60,5 @@ limit $1
 		}
 		out = append(out, r)
 	}
-	if err := rows.Err(); err != nil {
-		return nil, err
-	}
-	return out, nil
+	return out, rows.Err()
 }
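The new `ListRecentEvents` builds its SQL incrementally, giving each optional filter the next positional placeholder and binding the value through `args` instead of interpolating it into the string. As a sketch, the string-building part can be pulled into a pure function and checked without a database. `buildEventsQuery` is a hypothetical helper, not part of the commit, and it derives the placeholder number from `len(args)` rather than the separate `argN` counter used in the diff.

```go
package main

import "fmt"

// EventsFilter mirrors the struct added in the diff.
type EventsFilter struct {
	Limit     int
	EventType string
	Framework string
}

// buildEventsQuery is a hypothetical helper that isolates the query
// construction of ListRecentEvents. It returns the SQL text and the
// arguments to bind, in placeholder order.
func buildEventsQuery(f EventsFilter) (string, []any) {
	// Same clamping as the diff: default 100, maximum 1000.
	if f.Limit <= 0 {
		f.Limit = 100
	}
	if f.Limit > 1000 {
		f.Limit = 1000
	}

	query := "SELECT event_id, ts, type, payload FROM events WHERE 1=1"
	args := []any{}

	// Each filter appends its value first, so len(args) is the number of
	// the placeholder that refers to it.
	if f.EventType != "" {
		args = append(args, f.EventType)
		query += fmt.Sprintf(" AND type = $%d", len(args))
	}
	if f.Framework != "" {
		args = append(args, f.Framework)
		query += fmt.Sprintf(" AND source_framework = $%d", len(args))
	}

	args = append(args, f.Limit)
	query += fmt.Sprintf(" ORDER BY ts DESC LIMIT $%d", len(args))
	return query, args
}

func main() {
	// "example-framework" is an illustrative value, not one taken from the commit.
	q, args := buildEventsQuery(EventsFilter{Framework: "example-framework", Limit: 5000})
	fmt.Println(q)
	fmt.Println(args)
}
```

Only the placeholder numbers are formatted into the query text; user-supplied filter values always travel in `args`, which is what keeps the dynamic query safe from injection.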
Executable
+38
@@ -0,0 +1,38 @@
#!/bin/bash
# Start all agentmon services

set -e

echo "Starting infrastructure..."
make up

echo "Waiting for services to be ready..."
sleep 5

echo "Starting application services..."
make run-query > /tmp/query-api.log 2>&1 &
QUERY_PID=$!
echo "query-api started (PID: $QUERY_PID)"

make run-ingest > /tmp/ingest-gateway.log 2>&1 &
INGEST_PID=$!
echo "ingest-gateway started (PID: $INGEST_PID)"

make run-ui > /tmp/web-ui.log 2>&1 &
UI_PID=$!
echo "web-ui started (PID: $UI_PID)"

echo ""
echo "All services started!"
echo ""
echo "Endpoints:"
echo " - Web UI: http://localhost:8082"
echo " - Query API: http://localhost:8081"
echo " - Ingest GW: http://localhost:8080"
echo ""
echo "Logs:"
echo " - query-api: tail -f /tmp/query-api.log"
echo " - ingest-gw: tail -f /tmp/ingest-gateway.log"
echo " - web-ui: tail -f /tmp/web-ui.log"
echo ""
echo "To stop all services: pkill -f 'go run' && make down"
Executable
+10
@@ -0,0 +1,10 @@
#!/bin/bash
# Stop all agentmon services

echo "Stopping application services..."
pkill -f "go run" || true

echo "Stopping infrastructure..."
make down

echo "All services stopped."