Discord meal-log silence RCA — 2026-05-26
View generated by the View/source-discipline backfill of past reports.
What happened
- 19:27:43 JST: a Gateway restart checkpoint was created for a guarded restart.
- 19:28:03 JST: Gateway received SIGTERM and began shutdown.
- 19:28:04 JST: the owner Discord meal-log message was persisted in the main channel transcript.
- 19:28:12-19:28:56 JST: Gateway restarted and providers reconnected; Discord channel resolution completed after the message ingress window.
- 19:34:05 JST: the watchdog posted "OpenClaw見守り - Gateway restart post-check recovered..." into the same Discord session.
- 20:15:29 JST: the owner sent "おい" ("hey"); only then did the meal-log work complete and the owner receive a real answer.
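To make the length of the silence concrete, the gaps between the timeline entries above can be computed directly. This is a small arithmetic sketch only: the variable names are illustrative, and the calendar date is taken from the report title on the assumption that it is the incident date (only the time-of-day differences matter).

```python
from datetime import datetime, timedelta, timezone

# JST is UTC+9 with no daylight saving time.
JST = timezone(timedelta(hours=9))


def at(hms: str) -> datetime:
    """Parse a time of day from the timeline (date assumed from the report title)."""
    parsed = datetime.strptime(f"2026-05-26 {hms}", "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=JST)


sigterm_received = at("19:28:03")
message_persisted = at("19:28:04")
watchdog_posted = at("19:34:05")
owner_nudge = at("20:15:29")

# How close the owner message was to the shutdown signal.
gap_after_sigterm = message_persisted - sigterm_received
# How long until the watchdog status post appeared (not a real answer).
gap_to_watchdog = watchdog_posted - message_persisted
# How long the prompt sat unanswered before the owner followed up.
gap_to_nudge = owner_nudge - message_persisted

print(gap_after_sigterm, gap_to_watchdog, gap_to_nudge)
```

The message was persisted one second after SIGTERM, the watchdog status post followed about six minutes later, and the prompt stayed without a real answer for roughly 47 minutes until the owner followed up.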
Root cause
The owner's message landed at the Gateway restart boundary, one second after SIGTERM. The transcript retained the message, but the normal assistant turn did not produce a response. The post-check watchdog only verified Gateway health and posted a cron/direct-delivery status message. That status message appeared after the owner's prompt but was not a real answer, and the recovery logic treated health recovery as sufficient without checking whether the owner had actually been answered.
The earlier explanation that nutrition lookup/tool work caused the wait was wrong. The local food DB work and View render take from under a second to a few seconds once a live turn is actually running.
Fix applied
- Added scripts/owner_unanswered_watch.py.
  - Scans owner Discord transcripts.
  - Treats cron/direct-delivery mirror messages and "OpenClaw見守り" as non-answers.
  - Flags stale user prompts that have no real assistant text after them.
- Wired scripts/openclaw_watchdog.py to record owner_unanswered_message.
- Added regression tests in scripts/test_owner_unanswered_watch.py.
- Added a durable note in TOOLS.md so future restart/post-check work does not confuse "health recovered" with "owner answered."
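The detection rule described for scripts/owner_unanswered_watch.py can be sketched roughly as follows. This is not the actual script: the Entry shape, the source labels, the 5-minute staleness threshold, and all function names are assumptions made for illustration. Only the "OpenClaw見守り" marker and the cron/direct-delivery distinction come from this report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List

# Marker named in the report; messages containing it are status posts, not answers.
NON_ANSWER_MARKERS = ("OpenClaw見守り",)
# Hypothetical source labels for cron/direct-delivery mirror messages.
NON_ANSWER_SOURCES = frozenset({"cron", "direct-delivery"})


@dataclass(frozen=True)
class Entry:
    """One transcript entry (hypothetical shape, not the real transcript schema)."""
    role: str            # "user" or "assistant"
    text: str
    timestamp: datetime
    source: str = ""     # hypothetical delivery-path label


def is_real_answer(entry: Entry) -> bool:
    """Return True only for assistant text that is not a mirrored status message."""
    if entry.role != "assistant" or entry.source in NON_ANSWER_SOURCES:
        return False
    text = entry.text.strip()
    return bool(text) and not any(marker in text for marker in NON_ANSWER_MARKERS)


def find_stale_unanswered(
    entries: Iterable[Entry],
    now: datetime,
    stale_after: timedelta = timedelta(minutes=5),  # assumed threshold
) -> List[Entry]:
    """Return user prompts older than stale_after with no real answer after them."""
    pending: List[Entry] = []
    for entry in sorted(entries, key=lambda e: e.timestamp):
        if entry.role == "user":
            pending.append(entry)
        elif is_real_answer(entry):
            pending.clear()  # simplification: a real answer settles all earlier prompts
    return [e for e in pending if now - e.timestamp >= stale_after]


JST = timezone(timedelta(hours=9))


def jst(hour: int, minute: int, second: int) -> datetime:
    """Build a JST timestamp on the report date."""
    return datetime(2026, 5, 26, hour, minute, second, tzinfo=JST)


# Timeline modeled on the incident; the meal-log text is a placeholder.
timeline = [
    Entry("user", "(meal log message)", jst(19, 28, 4)),
    Entry("assistant", "OpenClaw見守り - Gateway restart post-check recovered...",
          jst(19, 34, 5), source="cron"),
]
stale = find_stale_unanswered(timeline, now=jst(19, 40, 0))
print([e.text for e in stale])
```

In this sketch the watchdog status post does not clear the pending prompt, so the meal-log entry is still reported as stale at 19:40 JST. That is the distinction between "health recovered" and "owner answered" that the fix is meant to enforce.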
Verification
- python3 -m py_compile scripts/owner_unanswered_watch.py scripts/openclaw_watchdog.py
- python3 -m unittest scripts/test_owner_unanswered_watch.py
The full test_openclaw_watchdog.py run hit the live watchdog lock in this running environment, so I did not count that as a clean full-suite pass.