Dev Intel: instruction-loading observability

Agent observability should show which instructions were actually loaded, not only tokens.

Generated: 2026-05-25T10:08:00+09:00

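Lane: dev idea scouting (開発ネタ発掘)

Goal: Leave one source-backed dev idea for 健人くん that can be applied right away to agentic coding / OpenClaw operations.

Assumption: This time, prioritize implementation patterns that improve the quality of heartbeat and Codex/Claude Code operations over introducing a new service.

Smallest edit/action: Read a real-world Claude Code hooks + OpenTelemetry example and the official docs, then compress them into a one-page decision memo that can be carried over to OpenClaw.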

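In short

Observing AI coding agents by token volume and cost alone is not enough. The interesting direction right now is recording, as events, which instruction files, rules, and hooks were actually loaded in a given session.

Mae Capozzi's article describes a setup that uses Claude Code hooks to capture SessionStart, InstructionsLoaded, tool use, prompt submit, and similar events, and sends them to Honeycomb as OpenTelemetry spans. Anthropic's official docs likewise say that hooks can handle lifecycle events such as SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, and InstructionsLoaded. According to the monitoring docs, Claude Code itself can also export usage, cost, and tool activity via OpenTelemetry.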
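As a minimal sketch of the hook side, the function below appends one lifecycle event to a local JSONL file. It assumes the hook command receives a JSON payload on stdin and that the payload carries hook_event_name and session_id fields. The function name record_hook_event and the log path are hypothetical, and nothing here is taken from the article's actual setup, which sends spans to Honeycomb instead.

```python
import json
import time
from pathlib import Path


def record_hook_event(stream, log_path):
    """Read one hook payload (JSON) from `stream` and append it to a JSONL log."""
    payload = json.load(stream)
    record = {
        "ts": time.time(),
        # Assumption: these field names match the hook payload shape.
        "event": payload.get("hook_event_name", "unknown"),
        "session_id": payload.get("session_id"),
        # Keep the raw payload so no detail is lost at capture time.
        "payload": payload,
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

A hook command could call record_hook_event(sys.stdin, "memory/hook-events.jsonl") once per event. Keeping the raw payload next to the two extracted fields means the log stays useful even if the assumed field names turn out to differ.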

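Why this matters to 健人くん

Recent OpenClaw/ひめの failures are often not simply a question of "how smart the model is." More often, there is no way to trace afterwards which AGENTS.md, tone playbook, phrasebank, and heartbeat contract were in effect on that turn.

So, to raise heartbeat quality, the next step that pays off is to record the following small items for each run, rather than adding more reflection to the log body.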

Required files read: AGENTS.md / HEARTBEAT_CREATIVE.md / HEARTBEAT_MANAGEMENT.md / tone playbook / phrasebank
Whether owner-facing output was produced, and why
Whether the most recent correction or executive order was visible
Tone gate result before final

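With this in place, when "the notification came out stiff again," the cause can be isolated as "phrasebank not read," "tone gate not run," or "dev-intel notification format broke down," rather than argued as a matter of personality.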
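The cause-splitting described above could be sketched as a small function over one trace record. This is only an illustration: the function name diagnose is hypothetical, "tone playbook" and "phrasebank" are used as placeholder labels rather than real file names, and treating a missing owner_gate_result as "tone gate not run" is an assumption.

```python
# Required context for a heartbeat run, as listed in this memo.
# "tone playbook" and "phrasebank" are placeholder labels, not file names.
REQUIRED_FILES = [
    "AGENTS.md",
    "HEARTBEAT_CREATIVE.md",
    "HEARTBEAT_MANAGEMENT.md",
    "tone playbook",
    "phrasebank",
]


def diagnose(record):
    """Return human-readable causes found in one run-trace record."""
    causes = []
    read = set(record.get("required_files_read", []))
    for name in REQUIRED_FILES:
        if name not in read:
            causes.append(f"{name} not read")
    # Assumption: a missing or None gate result means the gate never ran.
    if record.get("owner_gate_result") is None:
        causes.append("tone gate not run")
    return causes
```

A record that read everything except the phrasebank and has no gate result would yield both causes, which is the kind of answer a stiff notification should map to.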

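What to try next

Rather than sending to an external OTel endpoint right away on the OpenClaw side, it is safer to start by creating memory/heartbeat-run-trace.jsonl inside the workspace. Record one line per run with only run_id / required_files_read / lane / notify_decision / owner_gate_result. If that works well, turn it into a View later and extend to OTel export if needed.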
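A minimal sketch of that local trace, assuming plain JSON Lines with exactly the five fields named above. The helper names append_run_trace and read_run_traces are hypothetical, and the value types are left open because the memo does not fix them.

```python
import json
from pathlib import Path

# Field names come from this memo; everything else is illustrative.
TRACE_FIELDS = (
    "run_id",
    "required_files_read",
    "lane",
    "notify_decision",
    "owner_gate_result",
)


def append_run_trace(trace_path, **fields):
    """Append one run as a single JSON line; reject missing or extra fields."""
    missing = [k for k in TRACE_FIELDS if k not in fields]
    extra = [k for k in fields if k not in TRACE_FIELDS]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    record = {k: fields[k] for k in TRACE_FIELDS}
    path = Path(trace_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_run_traces(trace_path):
    """Load every recorded run, skipping blank lines."""
    path = Path(trace_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Rejecting extra fields keeps the trace small on purpose. If more detail is needed later, widening TRACE_FIELDS is a visible, reviewable change.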

Why this is useful

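The ひめの heartbeat is stronger when it can trace "why value was or was not produced" than when it only serves as "proof that it woke up." For tone, external notifications, and source-backed dev scouting in particular, recurrences are hard to prevent when missed instruction loads and missed judgments are mixed together.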

What I made/changed

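Created this decision memo.
Compressed the notification candidate into a short text for the Discord heartbeat idea stash.
No implementation yet. The next safe step is a minimal local JSONL trace.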

Sources/Evidence

Mae Capozzi, “AI agents removed the friction from writing telemetry” (2026-03-27): https://maecapozzi.com/blog/ai-removes-observability-friction
Anthropic Claude Code hooks reference: https://code.claude.com/docs/en/hooks
Anthropic Claude Code monitoring usage: https://code.claude.com/docs/en/monitoring-usage
OpenAI Codex Agent Skills docs: https://developers.openai.com/codex/skills

Prediction

If OpenClaw heartbeat adds a local run trace for required context reads and owner-facing gate outcomes, repeated “tone/style/notification contract drift” investigations should need fewer manual log reads.

Verify by

Render this artifact into a View.
Run heartbeat_guard after the artifact is saved.
If implementation is chosen later, add a unit test that a heartbeat run trace records required file reads before owner-facing output.
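The unit test mentioned in the last item might look like the sketch below. RunTrace and reads_precede_output are hypothetical stand-ins for whatever the real helper ends up being, and the two-item required list is shortened for illustration.

```python
import unittest


class RunTrace:
    """Hypothetical in-memory trace: an ordered list of (kind, detail) events."""

    def __init__(self):
        self.events = []

    def file_read(self, name):
        self.events.append(("file_read", name))

    def owner_output(self, text):
        self.events.append(("owner_output", text))


def reads_precede_output(trace, required):
    """True if every required file is read before the first owner-facing output."""
    seen = set()
    for kind, detail in trace.events:
        if kind == "file_read":
            seen.add(detail)
        elif kind == "owner_output":
            return set(required) <= seen
    # No owner-facing output happened, so nothing was sent before the reads.
    return True


class HeartbeatTraceOrderTest(unittest.TestCase):
    REQUIRED = ["AGENTS.md", "phrasebank"]

    def test_reads_before_output_passes(self):
        trace = RunTrace()
        for name in self.REQUIRED:
            trace.file_read(name)
        trace.owner_output("hello")
        self.assertTrue(reads_precede_output(trace, self.REQUIRED))

    def test_output_before_read_fails(self):
        trace = RunTrace()
        trace.file_read("AGENTS.md")
        trace.owner_output("hello")
        trace.file_read("phrasebank")
        self.assertFalse(reads_precede_output(trace, self.REQUIRED))
```

Checking order rather than mere presence matters here: a file read after the owner-facing output went out cannot have shaped that output.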

Observed

Guard at run start returned HEARTBEAT_OK.
Source fetches succeeded for Mae Capozzi article, Anthropic hooks, Anthropic monitoring, and OpenAI skills docs.

Next safe action

Add a tiny local-only heartbeat_run_trace helper that records required file reads and final notify decision, then test it with one synthetic heartbeat pass.

Notify

yes — this is a source-backed dev/agent ops pattern with a concrete OpenClaw implementation idea.