feat(subagents): complete queue, budgets, audit, and inspection controls
@@ -41,7 +41,8 @@ The gateway serialises agent work **per session**, not per WebSocket connection:
- The gateway `agent.send` command path and channel-router path use the same runtime backend-mode command service; `flynn tui` forwards `/runtime ...` through this gateway path for parity.
- Backend routing and fallback outcomes are emitted to audit logs (`backend.route`, `backend.success`, `backend.fallback`) for rollout evaluation; this telemetry is outside JSON-RPC response payloads.
- Session-start memory injection (`user/profile` + `user/working`) is server-side and controlled by `memory.user_namespace`; it does not affect protocol payloads.
- Multi-turn child agents are exposed through tool calls (`subagent.spawn/send/list/cancel/delete/summary`) inside the agent loop; they do not add new JSON-RPC methods.
- Multi-turn child agents are exposed through tool calls (`subagent.spawn/send/list/cancel/delete/summary`) inside the agent loop; child sessions support per-session queue mode and budget guardrails but do not add new JSON-RPC methods.
- Session command fast-path includes `/subagents` (`list|summary|cancel|delete`) for child-session inspection/control without protocol changes.

This is implemented via a per-lane queue (`LaneQueue`) in the gateway server, and used by `agent.send` and `agent.cancel`.
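The `/subagents` fast-path named in the hunk above could be parsed along these lines. This is a minimal sketch, not Flynn's implementation: only the four subcommands come from the diff, while `parseSubagentsCommand`, the `childId` argument position, and the default to `list` are assumptions made for illustration.

```typescript
// Hypothetical parser for the `/subagents` session command.
// Only the subcommand names (list|summary|cancel|delete) come from the diff.
type SubagentsAction = "list" | "summary" | "cancel" | "delete";

interface SubagentsCommand {
  action: SubagentsAction;
  childId?: string;
}

const ACTIONS: readonly string[] = ["list", "summary", "cancel", "delete"];

function isAction(value: string): value is SubagentsAction {
  return ACTIONS.includes(value);
}

function parseSubagentsCommand(input: string): SubagentsCommand | null {
  const parts = input.trim().split(/\s+/);
  if (parts[0] !== "/subagents") return null;

  // Assumption: a bare `/subagents` behaves like `/subagents list`.
  const action = parts.length > 1 ? parts[1] : "list";
  if (!isAction(action)) return null;

  const childId = parts.length > 2 ? parts[2] : undefined;
  // Assumption: every action except `list` needs a target child id.
  if (action !== "list" && childId === undefined) return null;

  return childId === undefined ? { action } : { action, childId };
}
```

Returning `null` for anything unrecognised lets the caller fall through to normal agent routing, which matches the idea of a fast-path that does not change the protocol.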
@@ -137,8 +137,9 @@ Tool Calls (inside NativeAgent loop)
+---------------------------> AuditLogger (redacted)

Subagent sessions (multi-turn child agents)
parent AgentOrchestrator -> subagent.* tools -> SubagentManager (TTL cleanup)
SubagentManager -> child AgentOrchestrator (session namespace: subagent:<parent>:<id>)
parent AgentOrchestrator -> subagent.* tools -> SubagentManager (TTL cleanup + queue/budget controls)
SubagentManager -> child AgentOrchestrator (session namespace: subagent:<parent>:<id>, trace_id)
SubagentManager -> AuditLogger (subagent.lifecycle + subagent.turn events)
child AgentOrchestrator -> NativeAgent/tool loop (same policy engine, recursion tools removed)

Session start (when `memory.user_namespace` is set)
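The `subagent:<parent>:<id>` session namespace in the diagram above can be illustrated with a pair of helpers. This is a sketch under stated assumptions: the function names are hypothetical, and it assumes child ids never contain a colon while parent session ids may (so parsing splits on the last colon).

```typescript
// Hypothetical helpers for the `subagent:<parent>:<id>` session namespace.
// Assumption: child ids contain no ":"; parent session ids may contain one.
interface SubagentSessionRef {
  parentSessionId: string;
  childId: string;
}

const SUBAGENT_PREFIX = "subagent:";

function buildSubagentSessionId(ref: SubagentSessionRef): string {
  if (ref.parentSessionId === "" || ref.childId === "" || ref.childId.includes(":")) {
    throw new Error("invalid subagent session reference");
  }
  return `${SUBAGENT_PREFIX}${ref.parentSessionId}:${ref.childId}`;
}

function parseSubagentSessionId(sessionId: string): SubagentSessionRef | null {
  if (!sessionId.startsWith(SUBAGENT_PREFIX)) return null;
  const rest = sessionId.slice(SUBAGENT_PREFIX.length);
  // Split on the last colon so that colons inside the parent id survive.
  const sep = rest.lastIndexOf(":");
  if (sep <= 0 || sep === rest.length - 1) return null;
  return {
    parentSessionId: rest.slice(0, sep),
    childId: rest.slice(sep + 1),
  };
}
```

Keeping the parent id inside the child's session key means cleanup and audit code can recover the parent from the key alone, without a separate lookup table.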
@@ -17,7 +17,7 @@ If you only want the protocol surface, see `docs/api/PROTOCOL.md`.
- Backend routing outcomes are auditable via `backend.route` / `backend.success` / `backend.fallback`, which enables offline canary evaluation without changing gateway protocol methods.
- Run lifecycle/cancel intent and reaction decisions are emitted to audit logs, and aggregated into `system.metrics` counters (runStates, cancelLatencyMs, reactions) for dashboards.
- Reaction matching is deterministic (priority + cooldown + recursion guard) before intent/agent routing.
- `subagent.*` tools create child orchestrators scoped to the parent conversation (`subagent:<parentSessionId>:<childId>`) with idle TTL cleanup; this is tool-loop behavior, not a separate gateway RPC session lane.
- `subagent.*` tools create child orchestrators scoped to the parent conversation (`subagent:<parentSessionId>:<childId>`) with idle TTL cleanup, per-child queue mode (`followup|interrupt`), and session budgets (turn/token/timeout); this is tool-loop behavior, not a separate gateway RPC session lane.
- Companion `node.*` registration is per WebSocket connection; reconnects must re-register capabilities before invoking node RPC methods.
- Canvas artifacts are persisted per session under the gateway data directory for UI recovery across restarts.
- TTS output is best-effort; synthesis failures fall back to text-only responses.
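The turn/token/timeout budgets mentioned in the hunk above might be enforced roughly as follows. This is a hedged sketch: the field names mirror the `max_turns`, `max_total_tokens`, and `turn_timeout_ms` keys that appear elsewhere in this commit, but the types, the function names, and the check-before-each-turn policy are assumptions, not the actual `SubagentManager` code.

```typescript
// Hypothetical budget guardrail for one child session.
interface SubagentBudget {
  max_turns: number;
  max_total_tokens: number;
  turn_timeout_ms: number;
}

interface SubagentUsage {
  turns: number;
  totalTokens: number;
}

type BudgetDecision =
  | { allowed: true }
  | { allowed: false; reason: "max_turns" | "max_total_tokens" };

// Assumption: the budget is checked before a turn starts, and a session
// that has reached a limit may not start another turn.
function checkBudget(budget: SubagentBudget, usage: SubagentUsage): BudgetDecision {
  if (usage.turns >= budget.max_turns) {
    return { allowed: false, reason: "max_turns" };
  }
  if (usage.totalTokens >= budget.max_total_tokens) {
    return { allowed: false, reason: "max_total_tokens" };
  }
  return { allowed: true };
}

// Returns a new usage record after a completed turn.
function recordTurn(usage: SubagentUsage, tokensUsed: number): SubagentUsage {
  return { turns: usage.turns + 1, totalTokens: usage.totalTokens + tokensUsed };
}

// A turn has timed out once it has run for at least turn_timeout_ms.
function isTurnTimedOut(budget: SubagentBudget, startedAtMs: number, nowMs: number): boolean {
  return nowMs - startedAtMs >= budget.turn_timeout_ms;
}
```

Turn and token limits are cumulative per session, while the timeout applies to each turn separately, which is why it is modelled as a separate check rather than part of `checkBudget`.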
@@ -13,6 +13,7 @@ The following were previously treated as gaps but are already implemented in Fly
3. Browser automation baseline is present (`browser.navigate/click/type/screenshot/content/eval` in `src/tools/builtin/browser/tools.ts`).
4. Companion protocol/runtime foundation is present (`src/companion/runtimeClient.ts`, `src/companion/platformClients.ts`).
5. Talk mode + wake phrase baseline is present (`src/daemon/routing.ts`, `audio.talk_mode` schema support).
6. Subagent sessions now include queue/budget controls, transcript export, and session inspection UX (`subagent.*`, `/subagents`).

## Remaining Product Gaps (Now)

@@ -20,7 +21,6 @@ The following were previously treated as gaps but are already implemented in Fly
2. Voice UX is functional but not yet a polished, end-to-end daily-driver experience across surfaces.
3. Browser tools exist but lack task-level reliability primitives (checkpoints/retries/guardrails) for autonomous workflows.
4. Onboarding lacks a "first success" guided path that validates real integrations live during setup.
5. Subagent sessions are now available (`subagent.*`) with idle TTL cleanup and transcript summary support, but still need budgeting/UI visibility for larger autonomous workflows.

## Product Goal
@@ -1,7 +1,7 @@
# Subagents Support Plan (Flynn)

Date: 2026-02-26
Status: phase 1 implemented, phase 2 partially implemented
Status: phases 1-3 implemented
Scope: add OpenClaw-style multi-turn subagent session support in Flynn without changing channel surface scope (Telegram-first)

## Constraints
@@ -30,22 +30,25 @@ Scope: add OpenClaw-style multi-turn subagent session support in Flynn without c
   - `max_active_sessions`
5. Added policy/profile support so `subagent.*` is controlled through `group:agents` and tool profiles.

## Phase 2 (Next)
## Phase 2 (Implemented)

1. Add per-subagent TTL/idle eviction and auto-cleanup metrics. (implemented: TTL eviction)
2. Add optional transcript export/summarization (`subagent.summary`). (implemented)
3. Add per-subagent tool-profile override (read-only by default for risky workloads). (pending)
4. Add parent-child trace IDs in audit events for easier debugging. (pending)
1. Added idle TTL eviction plus audit lifecycle events for cleanup visibility.
2. Added transcript export/summarization via `subagent.summary`.
3. Added per-subagent tool-profile override (`queue_mode`, `tool_profile` on spawn).
4. Added parent-child trace IDs and subagent lifecycle/turn audit events.

## Phase 3 (Stretch)
## Phase 3 (Implemented)

1. Add queue semantics for child sessions (`followup` vs `interrupt` per subagent).
2. Add explicit resource budgets (token/time) per child session.
3. Add UI affordances in gateway chat for subagent session inspection.
1. Added queue semantics per child session (`followup` FIFO, `interrupt` latest-wins).
2. Added explicit resource budgets (max turns, max total tokens, per-turn timeout).
3. Added operator/UI affordances:
   - `/subagents` command (list/summary/cancel/delete),
   - gateway chat slash suggestion for `/subagents`.

## Acceptance Criteria (Phase 1)
## Final Acceptance Criteria

1. Parent agent can spawn and continue a child subagent across multiple turns.
1. Parent agent can spawn and continue child subagents across multiple turns.
2. Child session state is isolated and delete clears history.
3. Recursion tooling (`agent.delegate`, `council.run`, `subagent.*`) is removed from child registries.
4. Tests cover manager lifecycle, tool behavior, config parsing, and policy profile inclusion.
4. Child sessions support queue policy + budget guardrails + transcript inspection.
5. Tests cover manager lifecycle, tool behavior, config parsing, routing command wiring, and audit event logging.
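The two queue modes named in Phase 3 above (`followup` as FIFO, `interrupt` as latest-wins) can be sketched as a small pending-message queue. This is illustrative only: `ChildTurnQueue` and its methods are hypothetical names, and the sketch models only which pending messages survive, not cancellation of a turn that is already running.

```typescript
// Hypothetical per-child pending-message queue.
// Mode names come from the plan; everything else is an assumption.
type QueueMode = "followup" | "interrupt";

class ChildTurnQueue {
  private readonly mode: QueueMode;
  private pending: string[] = [];

  constructor(mode: QueueMode) {
    this.mode = mode;
  }

  // Adds a message and returns any pending messages it displaced.
  // `followup` never displaces; `interrupt` keeps only the newest message.
  enqueue(message: string): string[] {
    if (this.mode === "interrupt") {
      const dropped = this.pending;
      this.pending = [message];
      return dropped;
    }
    this.pending.push(message);
    return [];
  }

  // Removes and returns the next message to run, if any.
  next(): string | undefined {
    return this.pending.shift();
  }
}
```

Returning the displaced messages from `enqueue` gives the caller something concrete to record, for example in a turn-level audit event, instead of dropping them silently.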
+13
-11
@@ -6800,21 +6800,23 @@
"status": "completed",
"date": "2026-02-26",
"updated": "2026-02-26",
"summary": "Implemented Phase 1 and partial Phase 2 subagent support: added a SubagentManager with multi-turn child sessions, idle TTL cleanup, new `subagent.*` tools (spawn/send/list/cancel/delete/summary), routing wiring, config guardrails, policy/profile integration, docs/diagram updates, and focused test coverage.",
"summary": "Completed subagent phases 1-3: added queue semantics (`followup`/`interrupt`), turn/token/timeout budgets, per-subagent tool-profile overrides, parent-child trace IDs with lifecycle/turn audit events, `/subagents` runtime command surface, and updated docs/diagram coverage.",
"files_modified": [
"src/backends/native/subagents.ts",
"src/backends/native/subagents.test.ts",
"src/backends/native/index.ts",
"src/backends/index.ts",
"src/audit/types.ts",
"src/audit/logger.ts",
"src/audit/logger.test.ts",
"src/tools/builtin/subagents.ts",
"src/tools/builtin/subagents.test.ts",
"src/tools/builtin/index.ts",
"src/tools/index.ts",
"src/tools/policy.ts",
"src/tools/policy.test.ts",
"src/commands/types.ts",
"src/commands/builtin/index.ts",
"src/commands/builtin/index.test.ts",
"src/config/schema.ts",
"src/config/schema.test.ts",
"src/daemon/routing.ts",
"src/daemon/routing.test.ts",
"src/gateway/ui/pages/chat.js",
"config/default.yaml",
"README.md",
"docs/api/PROTOCOL.md",
@@ -6824,11 +6826,11 @@
"docs/plans/2026-02-26-personal-assistant-productization-plan.md",
"docs/plans/state.json"
],
"test_status": "pnpm test:run src/backends/native/subagents.test.ts src/tools/builtin/subagents.test.ts src/tools/policy.test.ts src/config/schema.test.ts src/daemon/routing.test.ts passing + pnpm typecheck"
"test_status": "pnpm test:run src/backends/native/subagents.test.ts src/tools/builtin/subagents.test.ts src/commands/builtin/index.test.ts src/audit/logger.test.ts src/config/schema.test.ts src/daemon/routing.test.ts passing + pnpm typecheck"
}
},
"overall_progress": {
"total_test_count": 2533,
"total_test_count": 2534,
"all_tests_passing": true,
"p0_completion": "3/3 (100%)",
"p1_completion": "4/4 (100%)",
@@ -6843,7 +6845,7 @@
"tier2_completion": "4/4 (100%) \u2014 inbound webhooks, vector memory search, Dockerfile, heartbeat monitor",
"tier3_completion": "5/5 (100%) \u2014 lane queue, credential redaction, web UI token dashboard, xAI (Grok) provider, Voyage AI embeddings",
"tier4_completion": "4/4 (100%) \u2014 gateway lock, shell completion, Tailscale Serve/Funnel, DM pairing codes",
"feature_gap_scorecard": "rebaselined 2026-02-26 — channel breadth, setup wizard, baseline browser automation, and partial phase-2 subagent support (`subagent.*` + idle TTL cleanup + transcript summary) are implemented; remaining high-impact personal-assistant gaps center on shipped companion apps (desktop/mobile), voice UX polish, browser workflow reliability primitives, and first-success onboarding funnel optimization.",
"feature_gap_scorecard": "rebaselined 2026-02-26 — channel breadth, setup wizard, baseline browser automation, and full subagent support (`subagent.*` + queue modes + budgets + trace/audit + `/subagents` inspection) are implemented; remaining high-impact personal-assistant gaps center on shipped companion apps (desktop/mobile), voice UX polish, browser workflow reliability primitives, and first-success onboarding funnel optimization.",
"operator_dx_milestone": "Phase 3 (Live Ops Dashboard): 2/2 plans complete \u2014 milestone done",
"dashboard_observability": "completed \u2014 service health graphs + core service log viewer added to web UI via observability RPCs and bounded backend sampling",
"gmail_auth_cli": "flynn gmail-auth command implemented with OAuth2 flow, doctor check, config routed to Telegram",
@@ -6877,7 +6879,7 @@
"deeper_surfaces_phase4_rollout": "completed \u2014 phase 4 rollout and operator readiness plan documented: canary rollout plan by feature flag/surface, explicit rollback playbook, operator docs and architecture/protocol docs synchronized",
"post_phase_test_fixes": "completed \u2014 fixed 4 test failures introduced by phases 1-3: iOS/Android push listNodes (missing publishHeartbeat before platform-filtered query), server.test agent.send (run_state events now precede done; added sendAndWaitForDone helper), httpBody 413 (req.destroy() closed socket before response could be sent; replaced with Connection: close header on 413 responses)",
"personal_assistant_productization_plan": "proposed \u2014 8-10 week phased roadmap defined (companion MVP surfaces, voice reliability hardening, browser workflow reliability layer, onboarding 2.0 first-success funnel) with measurable exit gates.",
"subagents_support": "completed \u2014 phase-1 plus partial phase-2 subagent runtime support added with `subagent.spawn/send/list/cancel/delete/summary`, per-parent child-session orchestration, idle TTL cleanup (`agents.subagents.idle_ttl_ms`), config guardrails, and focused regression tests."
"subagents_support": "completed \u2014 subagent phases 1-3 shipped with `subagent.spawn/send/list/cancel/delete/summary`, per-child queue mode (`followup|interrupt`), budgets (`max_turns`, `max_total_tokens`, `turn_timeout_ms`), tool-profile overrides, trace-linked audit events, `/subagents` inspection commands, and focused regression tests."
},
"soul_md_and_cron_create": {
"date": "2026-02-11",