Dev Intel: MCP resume needs lifecycle health

Codex MCP resume confusion, stale stdio MCP stacks, and Claude Code observability point to one small OpenClaw health packet.

Generated: 2026-05-24 11:43 JST

Lane: 開発ネタ発掘

Goal: Turn the latest Codex/Claude Code tool-runtime signals into one practical OpenClaw improvement idea without sending another clustered dev-intel notification.

Assumption: The useful owner takeaway is not "MCP is buggy"; it is that long-lived agent sessions need explicit checks for whether tools are still callable and whether their helper processes have returned to a bounded baseline.

Smallest edit/action: Saved this source-backed packet and rendered a View; no runtime or production change.

Why this is useful:

Recent evidence points to the same operational shape from two sides. One Codex issue reports that after resuming a session on v0.121.0, the model could still talk about MCP tools but repeatedly used shell instead; compacting the conversation made direct MCP calls work again. Another Codex issue reports long-lived multi-agent sessions accumulating duplicate stdio MCP launcher stacks after subagents finish. SigNoz's Claude Code OpenTelemetry post shows the parallel trend: coding agents are becoming measurable systems, with token usage, sessions, latency, success rate, model distribution, and accept/reject decisions flowing into dashboards.

For OpenClaw, the stealable idea is a tiny agent_tool_health packet on resume and after subagent close:

tool_visibility: tool is listed
tool_invocation: a harmless direct call actually succeeds
process_baseline: helper process count returns to expected range
last_compaction_at: whether a context transition happened before failure
owner_action: reauth / restart / compact / no action

This would catch the dangerous middle state where the agent "knows" a tool exists but is not actually invoking it, and the quieter state where completed parallel work leaves stale local processes behind.

What I made/changed:

Created this report under projects/heartbeat-editorial-room/.
Rendered a View for later reading.
Chose Notify: no because two dev-intel notifications already shipped this morning; this is good material, but not worth another phone ping right now.

Sources/Evidence:

Codex issue #18233: resumed session can confuse MCP tool invocation until compaction: https://github.com/openai/codex/issues/18233
Codex issue #14233: long-lived multi-agent sessions can accumulate duplicate stdio MCP stacks: https://github.com/openai/codex/issues/14233
SigNoz, "Bringing Observability to Claude Code: OpenTelemetry in Action" (updated 2026-05-19): https://signoz.io/blog/claude-code-monitoring-with-opentelemetry/

Prediction:

If OpenClaw adds a resume/close health packet for tool invocation and helper-process baseline, future heartbeat/runtime failures should become easier to classify as context_resume_confusion, stale_tool_processes, auth_needed, or real_tool_error instead of becoming vague "agent got confused" reports.

Verify by:

Render the View successfully.
Update creative state with notify:false.
Run a small JSON validation for the state file.

Observed:

Web fetch succeeded for all three sources.
The signal overlaps with existing OpenClaw themes: MCP parallelism, outcome traces, and tool-fault classification.
Notification budget argues for silent save rather than another Telegram message.

Next safe action:

Add a small local script later, agent_tool_health.py, that can record harmless direct-tool probes and local helper-process counts into a JSONL event file after resume/subagent close. Keep it read-only at first.

Notify: no — source-backed and useful, but this morning already had nearby dev-intel pings; saving it is the higher-quality move.