Dev Intel: agent observability as product control

Coding-agent telemetry is turning into a control layer for cost, reliability, governance, and owner trust.

Generated: 2026-05-21T01:03:23+09:00

Lane: dev idea mining / source_backed_intel

Why this is useful:

Coding-agent telemetry is moving from "debug my CLI run" to "control cost, reliability, governance, and production-adjacent impact." If 健人くん is growing OpenClaw/ひめの into a secretary OS, defining a minimal set of local-first events first will pay off more than installing a large monitoring platform up front.

What I made/changed:

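Read the public material from Honeycomb and Dynatrace and compressed it into implementation angles OpenClaw can steal.
The conclusion: the unit to record is a "product control event", not an "agent action log". Track what the human was asked to approve, what ran autonomously, which verification stopped the run, and how cost, time, and failures changed.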

Sources/Evidence:

Honeycomb: Agent Skills are packaged as eight skills and two autonomous agents around observability onboarding, OpenTelemetry migration, and production investigation. They explicitly encode SRE query habits and investigation loops into agent workflows. https://www.honeycomb.io/blog/accelerate-opentelemetry-migrations-honeycomb-agent-skills
Dynatrace: AI coding-agent monitoring now frames Claude Code, Gemini CLI, Codex CLI, OpenCode, and Copilot SDK around sessions, tokens, costs, tool executions, errors, performance, adoption, and governance. https://www.dynatrace.com/news/blog/dynatrace-expands-ai-coding-agent-monitoring/
OpenAI Codex Skills: skills package task-specific instructions, resources, and optional scripts; descriptions must be scoped because implicit activation depends on them. https://developers.openai.com/codex/skills
OpenAI Codex best practices: repeated mistakes should become AGENTS.md guidance, skills, validation, or automations rather than repeated prompting. https://developers.openai.com/codex/learn/best-practices

Observed:

Two adjacent patterns are converging:

Domain playbooks are becoming executable skills. Honeycomb is not just saying "use OTel"; it packages what good SREs do: wide events, high-cardinality attributes, latency heatmaps, BubbleUp-style outlier analysis, SLO burn interpretation, and production-investigation workflow.
Coding agents are becoming monitored workloads. Dynatrace's framing is not just "agent logs"; it is adoption, token/cost, tool behavior, errors, latency, governance, and connection to commits/PRs/production context.

Why 健人くん cares:

OpenClaw already has the hard part: conversations, cron/heartbeat, skills, memory, local scripts, and owner approvals. The missing product layer is a small, queryable event stream that answers:

Did this patrol produce finished value or just evidence?
Was a hard stop respected?
Which skill/guard/test actually changed behavior?
Did a notification help the owner decide faster?
Which agent loops cost time/tokens without shipping?

What to steal:

Treat each heartbeat/agent turn as a control event, not a diary entry.
Minimal schema: event_type, lane, owner_value, autonomy_level, hard_stop, skill_used, tools_used_count, verification, notify_policy, result_url, failure_signature, elapsed_seconds.
Add derived views: "shipped value by lane", "repeated failure converted to guard?", "notifications with sources", "blocked by approval vs blocked by runtime patch".
Keep payload local and privacy-safe: metadata and artifact links, not full private prompts or credentials.
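
A minimal sketch of the schema above as a Python dataclass, to make the "control event, not diary entry" idea concrete. Only the twelve field names come from this note; the types, the example values, and the names ControlEvent and to_json_line are assumptions for illustration, not an existing OpenClaw API.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ControlEvent:
    """One heartbeat or agent turn, recorded as metadata only.

    Field names follow the minimal schema in this note; the types and the
    meaning of each value are assumptions.
    """

    event_type: str                   # e.g. "heartbeat" (hypothetical value)
    lane: str                         # e.g. "source_backed_intel"
    owner_value: str                  # assumed vocabulary: "finished" / "evidence_only"
    autonomy_level: str               # assumed vocabulary: "autonomous" / "approved"
    hard_stop: bool                   # assumed meaning: a hard stop was hit and respected
    skill_used: Optional[str]
    tools_used_count: int
    verification: str                 # which check ran, or "none"
    notify_policy: str                # assumed vocabulary: "yes" / "no"
    result_url: Optional[str]         # link to the artifact, never its contents
    failure_signature: Optional[str]  # short stable label, not a stack trace
    elapsed_seconds: float


# Field names in declaration order, handy for validation or table headers.
SCHEMA_FIELDS = [f.name for f in fields(ControlEvent)]


def to_json_line(event: ControlEvent) -> str:
    """Serialize one event as a single JSON line (no embedded newlines)."""
    return json.dumps(asdict(event), ensure_ascii=False, sort_keys=True)


# Hypothetical event describing a note like this one.
example = ControlEvent(
    event_type="heartbeat",
    lane="source_backed_intel",
    owner_value="evidence_only",
    autonomy_level="autonomous",
    hard_stop=False,
    skill_used=None,
    tools_used_count=4,
    verification="sources_linked",
    notify_policy="no",
    result_url=None,
    failure_signature=None,
    elapsed_seconds=182.0,
)

print(to_json_line(example))
```

Because the dataclass has no free-form text field, a full prompt or credential has no obvious place to land, which is one cheap way to keep the payload privacy-safe by construction.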

Prediction:

The next practical frontier for personal AI workspaces is not smarter prompts. It is a local control plane that lets the owner see which automations are worth trusting, which should be killed, and which repeated failures have been converted into durable behavior.

Verify by:

Check whether existing OpenClaw logs can answer the five "why 健人くん cares" questions without reading raw transcripts.
If not, add one local event append path for heartbeat artifacts first.
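
One possible shape for that append path, as a sketch under stated assumptions. The function name append_event, the key allow-list, and the extra ts timestamp are hypothetical; the note's schema has no timestamp field, so the writer stamps one itself so that later views can filter by date.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Keys taken from the minimal schema in this note. Anything else is rejected,
# so raw prompts or credentials cannot be logged by accident.
ALLOWED_KEYS = frozenset({
    "event_type", "lane", "owner_value", "autonomy_level", "hard_stop",
    "skill_used", "tools_used_count", "verification", "notify_policy",
    "result_url", "failure_signature", "elapsed_seconds",
})


def append_event(log_path: Path, event: dict) -> None:
    """Append one metadata-only event to a local JSONL file.

    Raises ValueError if the event carries keys outside the schema.
    """
    unknown = set(event) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"non-schema keys rejected: {sorted(unknown)}")
    # "ts" is an assumption added by this sketch (UTC, ISO 8601).
    record = {"ts": datetime.now(timezone.utc).isoformat(), **event}
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps the file a plain, greppable event stream.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A heartbeat script could call append_event(Path("heartbeat_control_events.jsonl"), {...}) once per artifact. Rejecting unknown keys, rather than silently dropping them, makes a privacy mistake visible at write time.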

Next safe action:

Add a tiny heartbeat_control_events.jsonl writer for editorial artifacts, starting with metadata only. Then build one HTML view that groups last 7 days by lane, notify yes/no, and failure-signature conversion.
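
The 7-day view could start as small as the sketch below. It assumes each JSONL line carries an ISO 8601 ts field (not part of the minimal schema) and that notify_policy uses "yes"/"no". The schema has no field for whether a failure was converted into a guard, so this sketch only surfaces failure signatures that repeat within the window as conversion candidates. All function names are hypothetical.

```python
import html
import json
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def load_recent(lines, now, days=7):
    """Parse JSONL lines and keep events whose "ts" falls in the last `days` days.

    `now` must be timezone-aware, matching the offsets stored in "ts".
    """
    cutoff = now - timedelta(days=days)
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if datetime.fromisoformat(event["ts"]) >= cutoff:
            events.append(event)
    return events


def summarize_by_lane(events):
    """Count events, notifications, and failure signatures per lane."""
    summary = defaultdict(lambda: {"events": 0, "notified": 0, "failures": Counter()})
    for event in events:
        row = summary[event["lane"]]
        row["events"] += 1
        if event.get("notify_policy") == "yes":
            row["notified"] += 1
        signature = event.get("failure_signature")
        if signature:
            row["failures"][signature] += 1
    return dict(summary)


def render_html(summary):
    """Render the per-lane summary as one plain HTML table."""
    header = (
        "<tr><th>lane</th><th>events</th><th>notified</th>"
        "<th>repeated failure signatures</th></tr>"
    )
    rows = []
    for lane in sorted(summary):
        row = summary[lane]
        # A signature seen twice or more is a candidate for a durable guard.
        repeated = sorted(s for s, n in row["failures"].items() if n >= 2)
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                html.escape(lane),
                row["events"],
                row["notified"],
                html.escape(", ".join(repeated)) or "-",
            )
        )
    return "<table>\n" + header + "\n" + "\n".join(rows) + "\n</table>"
```

Reading the file with open(path, encoding="utf-8") and passing the file object as lines is enough to drive it. A static table keeps the first version dependency-free; charts can wait until the counts prove useful.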

Notify: no

Reason: useful, but it is 01:00 JST and not urgent. Save silently; surface later if this becomes the next implementation packet.