Telegram Persistent Queue Implementation Design v8 (Codex review feedback applied)

Updated: 2026-04-09 09:14


Created: 2026-04-09 08:00 JST
v3 revised: 2026-04-09 09:00 JST (Codex review round 1 applied)
v4 revised: 2026-04-09 10:00 JST (Codex review round 2 applied)
v5 revised: 2026-04-09 11:00 JST (Codex review round 3 applied)
v6 revised: 2026-04-09 12:00 JST (Codex review round 4, SHOULD items applied)
v7 revised: 2026-04-09 13:00 JST (Codex review round 5, watchdog bug fixes)
v8 revised: 2026-04-09 14:00 JST (Codex review round 6, BSD awk compatibility + signal detection fix)
Author: Kurisu (main, j-20260409-persistent-queue)
Preliminary investigation: 2026-04-09_Telegram連投脱落問題_調査.md (j-038)
Related prior design: ~/plans/telegram-intent-tracking-v2.md
Related ADRs: adr-013 (permission auto approve), adr-015 (telegram outbound recording)
Target: ~/.claude/plugins/marketplaces/claude-plugins-official/external_plugins/telegram/server.ts

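Revision history

Version	Date	Main changes
v1 (draft)	2026-04-09 08:00	Initial version: 8-layer architecture (Layer 0-8)
v3	2026-04-09 09:00	Codex review round 1: removed Layers 6-8 (YAGNI violations), static path resolution (based on marketplace.json), reworked rate-limit handling (dropped the Stop hook dependency; covered by the existing SessionStart hook + 60-second launchd timer)
v4	2026-04-09 10:00	Codex review round 2: implemented path traversal defense (realpath + prefix check), immediate rate-limit fallback (direct ntfy from server.ts + immediate first notification), safety measures for the Phase B1 manual patch (backup + diff verification), safety measures for cache cleanup (strict marker + diff check)
v5	2026-04-09 11:00	Codex review round 3: removed direct ntfy from server.ts (privacy/spam concerns), consolidated ntfy notifications in intents-timer.sh (a row still pending after 30 seconds is taken as an indirect sign of a stalled secretary; the body is masked to an 80-character snippet), spelled out the find approach for cache cleanup (`find -maxdepth 1 -type d -name '[0-9]*' | sort -V | tail -1`), added the last_reminded_at UPDATE SQL to the implementation example
v6	2026-04-09 12:00	Codex review round 4, SHOULD items: added a launchd watchdog (monitors LastExitStatus via launchctl print; on failure, ntfy + automatic reload), exit code checks on notification results (when tmux.py send / curl fails, `last_reminded_at` is not updated and the notification is resent on the next loop), mask changed to a Unicode-aware 40-character cut (done in Python to prevent garbling; byte cut dropped). LGTM had already been given, but these were added in v6 for robustness
v7	2026-04-09 13:00	Codex review round 5, SHOULD items: fixed watchdog false positives (`(never exited)` / first launch treated as normal, `state` used alongside), fixed the log window bug (replaced awk `$0 > cutoff` with regex-based timestamp extraction, which resolves the comparison always being true because log lines start with `[`), reviewed the number of Python launches (only mask generation uses Python; tmux.py is a separate process but accepted under YAGNI)
v8	2026-04-09 14:00	Codex review round 6, MUST/SHOULD items: made the awk log window check BSD awk compatible (replaced the gawk-only 3-argument `match(..., m)` with POSIX `match() + RSTART + substr`, fixing the bug where macOS stock awk always reported 0 hits), added parsing of the `last terminating signal` line (`launchctl print` reports `kill -9` as `last terminating signal = Killed: 9`, not as `last exit code = -9`, so the exit code alone misses signal failures), simplified the false-positive test table from 6 to 3 cases (normal / exit anomaly / signal anomaly; more robust to changes in launchctl output wording)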

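1. Background

1.1 Symptoms identified in j-038

Between 06:57 and 06:59 JST on 2026-04-09, 健人 sent the same text four times in a row in a Telegram DM. server.ts received all four messages, yet only the last one (message_id 3020) reached the secretary session. The three intermediate messages are missing from both the MCP notification log and the secretary session jsonl.

1.2 Hypothesis E (most likely): channel notification coalescing in Claude Code itself

mcp.notification('notifications/claude/channel', ...) is sent over stdio, but Claude Code itself may overwrite a waiting notification with a newer one from the same origin (coalesce) while the session is busy. Binary analysis found traces of a function QO7 that special-cases origin.kind === 'channel'. Proving this hypothesis is out of scope for this design: as long as Layer 1 works, persistence is guaranteed, so coalescing causes no practical harm even if it does happen.

1.3 Relationship to the v2 design (telegram-intent-tracking-v2.md)

v2 already designed and partially implemented a five-layer defense architecture:

Layer 1: server.ts patch → record-intent.sh → intents table (persistence)
Layer 2: close-intent.sh (closes the intent on reply)
Layer 3: check-pending-intents.sh (checks pending intents after Bash)
Layer 4: intents-timer.sh (launchd, 5-minute interval)
Layer 5: bootstrap-pending-intents.sh (SessionStart hook)

This design does not discard v2. It repairs Layer 1, which had effectively stopped working (path mismatch bug), and fine-tunes the existing Layers 3/4/5. No new message-handling layers are added (Layers 6-8 of the v1 draft were judged YAGNI violations in the Codex review and removed); the only addition is the Layer 4w watchdog introduced in v6.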

2. True root cause (missed in v2)

2.1 Patch target mismatch

Item	Value
File patched by `patch-telegram-plugin.sh`	`~/.claude/plugins/cache/claude-plugins-official/telegram/0.0.4/server.ts`
File actually executed as `bun server.ts`	`~/.claude/plugins/marketplaces/claude-plugins-official/external_plugins/telegram/server.ts`

2.2 Evidence

Process cwd: in ps aux | grep 'bun server.ts', the parent process's --cwd points to the marketplaces side
cwd in .mcp.json: ${CLAUDE_PLUGIN_ROOT} is expanded by Claude Code to the marketplaces side
MD5 differences:
- marketplaces: 8894967289778fbb5ebde33c882406af (unpatched)
- cache: 35840e36afeca17af21b443ebeed5d43 (old v1 INTENT_RECORDING_PATCH still present)
When the intents table stopped: the latest record is 2026-04-07 23:07:07 (migration test); there are zero production records from 2026-04-08 onward
patch-telegram-plugin.log: last run 2026-04-07 07:50:21 OK: patch already applied to ...cache... (the patch was applied to the cache side, which was not in use)
Confirmed in Codex review round 1: the old v1 INTENT_RECORDING_PATCH marker exists at ~/.claude/plugins/cache/claude-plugins-official/telegram/0.0.4/server.ts:996 and is absent on the marketplaces side. In other words, patch-telegram-plugin.sh only ever affected the cache side, and the file that actually runs was left unpatched

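2.3 Verification plan (response to the Codex review)

Because this design claims that "reviving Layer 1 alone structurally resolves the burst drop", a live verification is added as Phase B2:

Manually apply a minimal patch to the marketplaces-side server.ts (adds the fire-and-forget record-intent.sh call)
Restart secretary, then send a burst of 4 messages from my own Telegram account
Confirm that 4 rows are inserted into the intents table
Confirm that the secretary replies to all 4 in order
On success, proceed to implementing patch-telegram-plugin-v2.sh (Phase B3)

If this experiment fails, a root cause other than coalescing exists, and the design premises must be revisited.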
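The row-count check in step 3 of this plan could be automated roughly as follows. This is only a sketch: the `intents` schema is not shown in this document, so the column names (`chat_id`, `message_id`, `raw_text`, `status`), the uniqueness constraint, the message IDs, and the helpers `record_intent` and `count_persisted` are all assumptions. It runs against an in-memory database, not the real one.

```python
import sqlite3

# Hypothetical minimal schema; the real intents table may differ.
SCHEMA = """
CREATE TABLE intents (
    id INTEGER PRIMARY KEY,
    chat_id TEXT NOT NULL,
    message_id INTEGER NOT NULL,
    raw_text TEXT,
    status TEXT NOT NULL DEFAULT 'pending',
    UNIQUE (chat_id, message_id)
)
"""


def record_intent(conn, chat_id, message_id, text):
    """Idempotent insert, mirroring the INSERT OR IGNORE behavior named in the design."""
    conn.execute(
        "INSERT OR IGNORE INTO intents (chat_id, message_id, raw_text) VALUES (?, ?, ?)",
        (chat_id, message_id, text),
    )


def count_persisted(conn, chat_id, message_ids):
    """Count how many of the given message IDs were persisted for this chat."""
    marks = ",".join("?" for _ in message_ids)
    sql = f"SELECT COUNT(*) FROM intents WHERE chat_id = ? AND message_id IN ({marks})"
    return conn.execute(sql, [chat_id, *message_ids]).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
burst = [101, 102, 103, 104]  # hypothetical message IDs for a 4-message burst
for mid in burst:
    record_intent(conn, "chat-1", mid, "same text")
record_intent(conn, "chat-1", 104, "same text")  # a duplicate delivery is ignored

persisted = count_persisted(conn, "chat-1", burst)
total = conn.execute("SELECT COUNT(*) FROM intents").fetchone()[0]
print(persisted, total)
```

Against the real database, the same count query would be pointed at the four message IDs observed in the test burst; a result below 4 would mean Layer 1 is still not recording.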

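2.4 Existing bugs found along the way

SQL output contamination in intents-timer.log: in the sqlite3 heredoc, the 5000 from .timeout 5000 leaks into the output stream of SELECT changes(), which breaks the row-count check for ORPHAN_CLEANUP
orphan kill target in run-claude.sh: TELEGRAM_PLUGIN_DIR="cache/..." points to the cache side, so the running bun process cannot be detected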

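2.5 Concrete mechanism of the burst drop (final version)

Burst of 4 messages
    ↓
Telegram Bot API → grammy → handleInbound (marketplaces/server.ts)
    ↓
[Layer 1 dead] record-intent.sh is never called → intents table not updated
    ↓
mcp.notification('notifications/claude/channel', ...) × 4 (fire-and-forget)
    ↓
Coalesced in Claude Code's channel queue while busy → only the latest message survives
    ↓
Only message_id=3020 reaches the secretary session
    ↓
[Layers 2-5 cannot detect anything because the intents table is empty]
    ↓
The 3 intermediate messages are lost permanently

If Layer 1 had been working: all 4 rows land in intents → Layers 3/5 detect the pending rows in turn → Claude notices them → they get handled. In other words, reviving Layer 1 alone structurally resolves the burst drop.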
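The argument above can be illustrated with a toy model. It is a sketch under stated assumptions, not a description of Claude Code internals: `CoalescingChannel` models the suspected (unproven) overwrite behavior as a single slot, and `IntentStore` is a hypothetical in-memory stand-in for the intents table.

```python
class CoalescingChannel:
    """Hypothetical model of the suspected behavior: one slot per origin,
    where a newer notification overwrites the one still waiting."""

    def __init__(self):
        self._slot = None

    def notify(self, msg_id):
        self._slot = msg_id

    def deliver(self):
        msg_id, self._slot = self._slot, None
        return [] if msg_id is None else [msg_id]


class IntentStore:
    """Stand-in for the intents table: every message is recorded on arrival."""

    def __init__(self):
        self._status = {}

    def record(self, msg_id):
        self._status.setdefault(msg_id, "pending")

    def close(self, msg_id):
        self._status[msg_id] = "closed"

    def pending(self):
        return sorted(m for m, s in self._status.items() if s == "pending")


channel, store = CoalescingChannel(), IntentStore()
for msg_id in (1, 2, 3, 4):   # the burst arrives while the session is busy
    store.record(msg_id)      # Layer 1: persist first
    channel.notify(msg_id)    # MCP notification, may be overwritten

delivered = channel.deliver()  # the session becomes idle
for msg_id in delivered:
    store.close(msg_id)        # Layer 2: close what was replied to

print(delivered, store.pending())
```

Only the last message is delivered through the channel, but the three others remain visible as pending rows, which is what Layers 3 to 5 rely on.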

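3. Design goals

#	Requirement	Acceptance criterion
R1	All N messages of a burst are persisted	N rows are inserted into the `intents` table
R2	The secretary processes all N messages of a burst (even under coalescing)	N replies occur and all N rows reach status=closed
R3	After a rate-limit stop is lifted, every pending message is presented	On the next session start, the SessionStart hook (bootstrap) feeds all pending rows into a system-reminder
R4	If a plugin upgrade wipes the patch, it is restored on the next launch	`patch-telegram-plugin-v2.sh` re-applies it automatically when `run-claude.sh` starts
R5	The existing `intents` schema is not broken	No schema changes
R6	The existing MCP channel is kept	`mcp.notification(...)` is still called and runs in parallel
R7	Operable by non-engineers	No manual SQL or manual patching is required
R8	Patch failures are detected immediately	Synchronous post-patch verification at startup; on failure, refuse to start Claude and send an ntfy notification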

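4. Non-goals (YAGNI)

Forking the whole plugin (the cost of tracking upstream is too high)
Creating a separate pending queue table (reusing intents is sufficient)
Full metadata storage for images and attachments (outside the core burst-drop issue)
Settings UI / admin console
Fixing channel notifications in Claude Code itself (depends on upstream)
Support for multiple concurrent secretary sessions (currently a single secretary only)
Solving the upstream coalescing problem (worked around on the plugin side)
Relying on a Stop hook for immediate rate-limit errors (Layer 7 of the v1 draft, removed): there is no guarantee that the Stop hook fires under rate limiting, so the existing SessionStart hook + launchd timer cover this
Forced flush after reply (Layer 6 of the v1 draft, removed): unnecessary because the existing Layer 3 (PostToolUse:Bash) fires automatically on the next Bash call after a reply
Asynchronous patch-healthcheck (Layer 8 of the v1 draft, removed): merged into the synchronous post-patch verification inside run-claude.sh (marker grep immediately after patching)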

5. Architecture

5.1 Layer structure (v3 = v2 with only Layer 1 repaired)

[Telegram Bot API]
    ↓ grammy long-polling
[bun server.ts (marketplaces)]
    ├── Layer 0: MCP stdio transport (existing)
    ├── Layer 1': inbound persistence (← revived by this design / path fix)
    │     handleInbound → record-intent.sh (fire-and-forget spawn)
    │                  → intents table INSERT OR IGNORE
    │     The patch is applied by patch-telegram-plugin-v2.sh, which statically resolves the marketplaces side
    │
    └── mcp.notification(...) → Claude Code itself (may coalesce)
                                  ↓
[secretary Claude Code session (--name secretary)]
    ├── Layer 2 (existing): close-intent.sh (PostToolUse: reply)
    ├── Layer 3 (existing): check-pending-intents.sh (PostToolUse: Bash)
    │     Note: Layer 6 of the v1 draft (immediate flush right after reply) was removed
    │        because this Layer 3 fires automatically on Bash calls
    ├── Layer 5 (existing): bootstrap-pending-intents.sh (SessionStart)
    │     Note: presents all pending rows automatically when Claude restarts after a rate limit
    │
[launchd (separate process)]
    ├── Layer 4 (existing + modified): intents-timer.sh
    │     StartInterval: 300 s → 60 s (faster presentation during rate limits)
    │     SQL contamination bug fix (sqlite3 .timeout moved to -cmd)
    │     v6: exit code checks for tmux.py send / curl, Unicode-aware 40-character mask
    │
    └── Layer 4w (new in v6): intents-timer-watchdog.sh
          StartInterval: 300 s (5 minutes)
          Monitors the last exit code of intents-timer via launchctl print (last terminating signal added in v8)
          On anomaly: bootout → bootstrap to reload automatically + ntfy warning

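5.2 Removed layers (v1 draft → v3)

v1 Layer	Name	Reason for removal
Layer 6	Immediate flush after reply (check-pending-now.sh)	Unnecessary because the existing Layer 3 fires on PostToolUse:Bash. Claude runs some Bash command after a reply in practice, so pending rows are picked up naturally. Even when Layer 3 does not fire, Layer 4 covers it on its 60-second cycle
Layer 7	Rate-limit recovery via Stop hook (flush-on-stop.sh)	There is no guarantee that the Stop hook fires on rate-limit errors (if Claude cannot even generate a response, Stop is never reached). Alternative: recovery from a rate limit is covered by the Claude Code crash/reload loop → run-claude.sh → SessionStart hook (Layer 5). New messages during the rate limit are recorded in intents, and the 60-second launchd timer (Layer 4) detects rows pending past the threshold (10 minutes in v3, 30 seconds from v5; see 6.5) and injects them as the next user input via tmux.py send
Layer 8	Asynchronous patch-healthcheck (PID lookup after `sleep 30`)	`pgrep -f 'bun server.ts'` may falsely match bun processes other than Telegram. Alternative: right after `patch-telegram-plugin-v2.sh` in run-claude.sh, synchronously grep the marketplaces server.ts for INTENT_RECORD_BEGIN. On failure, refuse to start Claude (fail closed)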

5.3 Changed files

Layer	Status	File	Change
0	Unchanged	`server.ts`	No change
1'	Major rework	`patch-telegram-plugin-v2.sh` (new)	Patches the marketplaces side; static PLUGIN_ROOT resolution (based on marketplace.json)
1'	Unchanged	`record-intent.sh`	No change (idempotency of INSERT OR IGNORE confirmed)
2	Unchanged	`close-intent.sh`	No change
3	Unchanged	`check-pending-intents.sh`	No change
4	Bug fix	`intents-timer.sh`, `.plist`	SQL contamination fix + StartInterval set to 60 s + exit code checks and Unicode-aware mask (v6)
4w	New	`intents-timer-watchdog.sh`, `.plist`	Monitors the last exit code via launchctl print and reloads automatically on anomaly (v6); last terminating signal parsing added in v8
5	Unchanged	`bootstrap-pending-intents.sh`	No change
-	Modified	`run-claude.sh`	orphan kill target changed to the marketplaces side + synchronous post-patch verification added
-	Backup	`secretary/settings.json`	No change (no hooks added, since Layers 6/7 were removed)

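6. Implementation details

6.1 patch-telegram-plugin-v2.sh (new / Layer 1' repair)

Design policy:

Static path resolution (per the Codex review): resolve the path deterministically from ~/.claude/secretary/settings.json + marketplace.json, without depending on running processes
Patch the marketplaces side only (the cache side holds stale leftovers; it is never executed, so it is ignored)
Reuse the logic of the existing patch-telegram-plugin.sh (MARKER verification, syntax check, revert)
If the cache side still has the old v1 marker (INTENT_RECORDING_PATCH), remove it at startup for safety (using the multi-step cleanup described later in this section, not a one-line sed)
fail closed: if path resolution, patching, or syntax verification fails, exit with an error (run-claude.sh then stops Claude from starting)

Static path resolution logic: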

resolve_plugin_root() {
  # 1. secretary settings.json から enabledPlugins を読む
  local settings="$HOME/.claude/secretary/settings.json"
  [ -f "$settings" ] || { log "ERROR: settings.json not found"; return 1; }

  # "telegram@claude-plugins-official": true → "telegram" / "claude-plugins-official"
  local key
  key=$(python3 -c "
import json
with open('$settings') as f:
    d = json.load(f)
for k, v in d.get('enabledPlugins', {}).items():
    if v and k.startswith('telegram@'):
        print(k)
        break
")
  [ -n "$key" ] || { log "ERROR: telegram plugin not enabled in secretary settings"; return 1; }

  local plugin_name="${key%@*}"          # "telegram"
  local marketplace_name="${key#*@}"      # "claude-plugins-official"

  # 2. marketplace.json から plugin の source (相対パス) を取得
  local mp_json="$HOME/.claude/plugins/marketplaces/$marketplace_name/.claude-plugin/marketplace.json"
  [ -f "$mp_json" ] || { log "ERROR: marketplace.json not found at $mp_json"; return 1; }

  local source
  source=$(python3 -c "
import json
with open('$mp_json') as f:
    d = json.load(f)
for p in d.get('plugins', []):
    if p.get('name') == '$plugin_name':
        src = p.get('source')
        if isinstance(src, str):
            print(src)
        elif isinstance(src, dict):
            print(src.get('source', ''))
        break
")
  [ -n "$source" ] || { log "ERROR: plugin '$plugin_name' not found in marketplace.json"; return 1; }

  # source は "./external_plugins/telegram" の形式 (相対)
  # 絶対パスに正規化 + symlink 解決 (realpath)
  local marketplace_root="$HOME/.claude/plugins/marketplaces/$marketplace_name"
  [ -d "$marketplace_root" ] || { log "ERROR: marketplace root not found: $marketplace_root"; return 1; }

  local plugin_dir
  plugin_dir=$(cd "$marketplace_root" && cd "$source" 2>/dev/null && pwd -P)
  [ -n "$plugin_dir" ] || { log "ERROR: cannot resolve plugin source: $source"; return 1; }

  # realpath で symlink を完全解決 (macOS/Linux 両対応)
  # macOS の realpath (coreutils) が無い場合の fallback として python3 を使う
  if command -v realpath >/dev/null 2>&1; then
    plugin_dir=$(realpath "$plugin_dir" 2>/dev/null) || { log "ERROR: realpath failed: $plugin_dir"; return 1; }
  else
    plugin_dir=$(python3 -c "import os,sys; print(os.path.realpath(sys.argv[1]))" "$plugin_dir") || { log "ERROR: python realpath failed"; return 1; }
  fi

  # marketplaces 配下チェック (path traversal 防御)
  # $HOME/.claude/plugins/marketplaces/ の realpath を取得して prefix 比較
  local marketplaces_root_real
  if command -v realpath >/dev/null 2>&1; then
    marketplaces_root_real=$(realpath "$HOME/.claude/plugins/marketplaces" 2>/dev/null)
  else
    marketplaces_root_real=$(python3 -c "import os; print(os.path.realpath(os.path.expanduser('~/.claude/plugins/marketplaces')))")
  fi
  [ -n "$marketplaces_root_real" ] || { log "ERROR: cannot resolve marketplaces root"; return 1; }

  case "$plugin_dir" in
    "$marketplaces_root_real"/*)
      : # OK - marketplaces 配下
      ;;
    *)
      log "ERROR: path traversal attempt detected — plugin_dir outside marketplaces: $plugin_dir (expected prefix: $marketplaces_root_real/)"
      # ntfy 警告 (悪意ある marketplace.json の可能性)
      curl -s -d "秘書: plugin path traversal detected. resolved=$plugin_dir" ntfy.sh/aihara-64d1132d60c2 >/dev/null 2>&1 || true
      return 1
      ;;
  esac

  # 3. server.ts が存在し handleInbound を含むか検証
  local server_ts="$plugin_dir/server.ts"
  [ -f "$server_ts" ] || { log "ERROR: server.ts not found at $server_ts"; return 1; }

  if ! grep -q 'handleInbound' "$server_ts"; then
    log "ERROR: server.ts does not contain handleInbound (wrong file?): $server_ts"
    return 1
  fi

  echo "$plugin_dir"
}

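Path traversal defense in detail (Codex review round 2, MUST FIX):

cd && pwd -P converts the source to a physical path (resolves relative components and ..)
realpath fully resolves symlinks (falls back to python3 os.path.realpath when the realpath command is not available)
A case prefix match confirms that the result lies under ~/.claude/plugins/marketplaces/
If the prefix check fails: ntfy warning + fail closed (run-claude.sh refuses to start Claude)

As a result, even if a malicious marketplace.json returns a relative path such as ../../../Library/LaunchAgents/, patch-telegram-plugin-v2.sh cannot rewrite arbitrary files. realpath also closes the symlink loophole.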
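The realpath + prefix check can also be expressed compactly in Python, which may help when reasoning about edge cases. This is a minimal sketch, not the shell implementation above: the helper name `resolve_under_root` and the directory names in the demo are hypothetical, and unlike the `cd`-based shell version it does not require the target directory to exist.

```python
import os
import tempfile


def resolve_under_root(root, relative_source):
    """Resolve relative_source against root and reject anything that escapes root."""
    root_real = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root_real, relative_source))
    # Must be strictly inside root: root itself and anything outside are rejected.
    inside = (
        candidate != root_real
        and os.path.commonpath([root_real, candidate]) == root_real
    )
    if not inside:
        raise ValueError(f"path escapes {root_real}: {candidate}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "marketplaces")
    plugin = os.path.join(root, "official", "external_plugins", "telegram")
    os.makedirs(plugin)
    outside = os.path.join(tmp, "LaunchAgents")
    os.makedirs(outside)
    # A symlink that lives inside root but points outside of it.
    os.symlink(outside, os.path.join(root, "official", "evil_link"))

    ok = resolve_under_root(root, "official/./external_plugins/telegram")
    accepted_ok = ok == os.path.realpath(plugin)

    results = {}
    for source in ("official/../../LaunchAgents", "official/evil_link"):
        try:
            resolve_under_root(root, source)
            results[source] = "accepted"
        except ValueError:
            results[source] = "rejected"

print(accepted_ok, results)
```

Comparing with `os.path.commonpath` instead of a raw string prefix avoids accepting a sibling such as `marketplaces-evil`; the shell version gets the same effect from the trailing `/` in its `case` pattern.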


Patch code (body passed via stdin, compatible with the existing two-argument mode of record-intent.sh; v5 = persistence only):

```typescript
// INTENT_RECORD_BEGIN
try {
  if (msgId != null && typeof text === "string" && text.trim() !== "" &&
      !/^\((photo|video|document|voice|audio|sticker|animation|contact|location|venue|poll|dice)\)$/i.test(text.trim())) {
    // intents DB への永続化のみを行う (ntfy 直通知は廃止 — v5)
    // v4 までは rate limit フォールバックとして全受信メッセージを ntfy に直送していたが、
    // プライバシー/スパム/機密テキスト流出リスクを指摘され撤回。rate limit 時のユーザー通知は
    // intents-timer.sh 側に集約 (セクション 6.5 参照)。pending が 30 秒以上解消しないことを
    // 間接条件にして初めて外部通知する設計に変更。
    const _irp = Bun.spawn(
      ["/bin/bash", "/Users/aiharataketo/.claude/scripts/record-intent.sh", String(chat_id), String(msgId)],
      { stdin: new Blob([text]).stream(), stdout: "ignore", stderr: "pipe" }
    )
    _irp.exited.then(async (code) => {
      if (code !== 0) {
        // 永続化そのものが失敗した時だけ ntfy。本文は一切送らない (メタのみ)
        console.error(`[intent-recording] record-intent.sh failed (exit=${code}) for msg=${msgId}`)
        try { await fetch("https://ntfy.sh/aihara-64d1132d60c2", { method: "POST", body: `[intent-recording] FAIL exit=${code} msg=${msgId}` }) } catch(_){}
      }
    }).catch((e) => {
      console.error(`[intent-recording] spawn error for msg=${msgId}:`, e)
      fetch("https://ntfy.sh/aihara-64d1132d60c2", { method: "POST", body: `[intent-recording] SPAWN ERROR msg=${msgId}` }).catch(()=>{})
    })
  }
} catch(_irpErr) {
  // runtime scope error のときも本文は送らない (メタのみ)
  console.error("[intent-recording] runtime scope error:", _irpErr)
  fetch("https://ntfy.sh/aihara-64d1132d60c2", { method: "POST", body: "[intent-recording] SCOPE ERROR: " + _irpErr.message }).catch(()=>{})
}
// INTENT_RECORD_END
```

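Why ntfy body delivery was removed from server.ts (v5):

In v4, every inbound message was sent straight to ntfy, on the idea that "server.ts runs independently of Claude, so it can call ntfy while Claude is stuck on a rate limit"
Codex review round 3 pointed out: "pushing everything to an external service without checking whether a rate limit is in effect is a new risk", "健人 already receives the official Telegram notification, so this is just a duplicate", and "ad spam, confidential text, and replies meant for third parties all flow to a third-party service (ntfy.sh)"
In fact the v4 pseudocode never checked for a rate limit and sent everything unconditionally (with an 80-character body snippet). As the review says, this is a clear privacy/spam problem
v5 uses "not reaching the secretary = pending unresolved for 30 seconds or more" as an indirect condition, and ntfy is sent only from intents-timer.sh (see 6.5). There is zero ntfy traffic in normal operation; notifications occur only on anomalies
Failures of persistence itself (exit != 0 / spawn error / scope error) still trigger ntfy as before, but never include the message body. Only metadata is sent: the exit code and msg_id (the scope-error path sends the error message instead)

Insertion point: immediately before the comment line // image_path goes in meta only (reuses the existing anchor).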

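Cleanup of v1 leftovers on the cache side (safety measures strengthened per Codex review rounds 2/3, SHOULD):

If the INTENT_RECORDING_PATCH marker remains in ~/.claude/plugins/cache/claude-plugins-official/telegram/<version>/server.ts, it is removed. This is done in the following 6 steps rather than with a one-line sed:

How the version directory is detected (made explicit in v5):

The cache-side version changes with plugin upgrades (currently 0.0.4; later 0.0.5 and so on). Instead of hard-coding it, the latest version is detected automatically with the following command: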

CACHE_TELEGRAM_ROOT="$HOME/.claude/plugins/cache/claude-plugins-official/telegram"
if [ -d "$CACHE_TELEGRAM_ROOT" ]; then
  # 数字で始まるディレクトリ (0.0.4, 0.0.5, 1.2.3 ...) のみを対象にする
  # sort -V (version sort) で semver 順にソート → tail -1 で最新を取る
  CACHE_VER_DIR=$(find "$CACHE_TELEGRAM_ROOT" -maxdepth 1 -type d -name '[0-9]*' 2>/dev/null | sort -V | tail -1)
fi

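-maxdepth 1 limits the search to direct children (no recursion)
-type d restricts matches to directories (stray files are ignored)
-name '[0-9]*' matches only names that start with a digit (excludes metadata files)
sort -V works with both GNU and BSD sort; the sort shipped with recent macOS supports -V without coreutils
If nothing is found, CACHE_VER_DIR is an empty string, so the later file existence check skips automatically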
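Why a version sort is needed rather than a plain lexical sort can be seen in a small Python equivalent. This is a sketch only: `latest_version_dir` is a hypothetical helper, and it is deliberately stricter than `-name '[0-9]*'` because it accepts only purely numeric dotted names.

```python
import os
import re
import tempfile

VERSION_RE = re.compile(r"\d+(?:\.\d+)*")


def latest_version_dir(root):
    """Return the path of the highest version directory directly under root, or None."""
    if not os.path.isdir(root):
        return None
    found = []
    with os.scandir(root) as entries:
        for entry in entries:
            # Directories only, no symlink following, names such as 0.0.4 or 1.2.3.
            if entry.is_dir(follow_symlinks=False) and VERSION_RE.fullmatch(entry.name):
                key = tuple(int(part) for part in entry.name.split("."))
                found.append((key, entry.path))
    return max(found)[1] if found else None


with tempfile.TemporaryDirectory() as root:
    for name in ("0.0.4", "0.0.5", "0.0.10", "metadata"):
        os.mkdir(os.path.join(root, name))
    latest = os.path.basename(latest_version_dir(root))
    missing = latest_version_dir(os.path.join(root, "does-not-exist"))

print(latest, missing)
```

A lexical sort would rank `0.0.5` above `0.0.10`; comparing integer tuples (or `sort -V` in the shell) picks `0.0.10`.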

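Cleanup procedure:

Detect the version directory (see above). Skip if empty
Confirm that the cache file exists ($CACHE_VER_DIR/server.ts; skip if it does not exist, do not fail)
Backup: save temporarily as server.ts.cache-cleanup-backup-YYYYMMDD-HHMMSS
Locate the range with a strict marker:
- Start: // INTENT_RECORDING_PATCH — safety net: record to intents DB before MCP delivery
- End: the next blank line or } line (assumes the structure of the v1 patch)
- If either end is not found, skip + warn

Range deletion in Python (Python rather than sed, to enforce an exact marker match):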

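import re
import sys

path = sys.argv[1]  # cache-side server.ts path, passed in by the caller (assumption)
with open(path, encoding="utf-8") as f:
    src = f.read()
new, n = re.subn(
    r'\n\s*// INTENT_RECORDING_PATCH[\s\S]*?\n\s*\}\s*\n',
    '\n',
    src,
    count=1,
)
if n == 0:
    # Marker not found: skip without failing, as the cleanup procedure specifies
    print("marker not found, skipping", file=sys.stderr)
    sys.exit(0)
# Abort if the removed span is outside the accepted window (50 to 1000 characters)
diff = len(src) - len(new)
if diff < 50 or diff > 1000:
    sys.exit(f"cleanup diff suspicious: {diff} bytes")
with open(path, "w", encoding="utf-8") as f:
    f.write(new)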

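Syntax verification with bun build --no-bundle → if broken, restore from the backup + ntfy warning

Resilience to failure scenarios:

Version directory missing → skip (no harm)
Multiple versions coexist → sort -V | tail -1 targets only the latest (older versions are not referenced, so they are left alone)
Marker mismatch → skip (no harm)
Abnormal range size → abort (backup kept)
Syntax error → revert
Cache file missing → skip

This cache file is not executed today (the marketplaces side is the real one), but it is cleaned up in case a future Claude Code reads from the cache.

Startup flow (called from run-claude.sh):

resolve_plugin_root() obtains the path of the file that actually runs (exit 1 on failure)
Clean the v1 marker on the cache side (if present)
Apply the patch to server.ts at that path (reuses the flow of the existing patch-telegram-plugin.sh)
Verify syntax with bun build --no-bundle (revert if broken)
Synchronous post-patch verification: grep confirms that INTENT_RECORD_BEGIN exists in the marketplaces-side server.ts (equivalent to Layer 8)
No ntfy notification on success; on failure only, ntfy warning + exit 1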

6.2 secretary/settings.json (unchanged)

The v1 draft planned to add Layer 6 (chaining check-pending-now onto the PostToolUse reply matcher) and Layer 7 (Stop hook), but v3 removed both following the Codex review. The existing settings.json is used as is.

Existing hooks (unchanged):

PostToolUse: mcp__plugin_telegram_telegram__reply → close-intent.sh
PostToolUse: Bash → check-pending-intents.sh
SessionStart → bootstrap-pending-intents.sh

6.3 check-pending-intents.sh (unchanged)

The v1 draft planned to add environment variable support, but with Layer 6 removed in v3 this is unnecessary. The existing hard-coded values (PENDING_THRESHOLD_MINUTES=3, REMIND_INTERVAL_MINUTES=5) stay in place.

6.4 flush-on-stop.sh (removed) / check-pending-now.sh (removed)

Removed in v3. The Codex review pointed out that "the Stop hook may not fire under rate limiting" and that "check-pending-now competes with Layer 3 over last_reminded_at", so neither script is created.

6.5 intents-timer.sh bug fix + shorter interval (Layer 4)

SQL contamination bug fix: separate the output of .timeout 5000 from that of SELECT changes()

Before:
ORPHANED=$(sqlite3 "$DB" <<SQL
.timeout 5000
UPDATE intents SET status='ignored', closed_reason='orphaned' WHERE ...;
SELECT changes();
SQL
)

After:
ORPHANED=$(sqlite3 -cmd ".timeout 5000" "$DB" <<SQL
UPDATE intents SET status='ignored', closed_reason='orphaned' WHERE ...;
SELECT changes();
SQL
)

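The same pattern is applied to the PENDING_IDS / SNIPPET / update_reserved / CHANGED sqlite3 calls.

Shorter StartInterval: change StartInterval in com.aiharataketo.intents-timer.plist from 300 to 60. ThrottleInterval is also set to 60.

Revisiting THRESHOLD_MINUTES (Codex review round 2, MUST FIX):

v3 kept THRESHOLD_MINUTES=10, but as Codex pointed out, "no notification or flush happens until a pending intent reaches 10 minutes", which is too slow when a rate limit persists. From v4 onward this is replaced by the following two-stage rule:

Notification decision logic in intents-timer.sh (v5):

  Target: intents with status = 'pending'
  Condition A (first notification): last_reminded_at IS NULL AND received_at is at least 30 seconds in the past
    → tmux.py send (inject into the secretary queue)
    → send ntfy (80-character body snippet + privacy mask; shortened to 40 characters in v6)
  Condition B (re-notification): last_reminded_at IS NOT NULL AND (now - last_reminded_at) >= 5 minutes
    → tmux.py send
    → send ntfy
  Condition C: otherwise → skip

  After notifying: UPDATE last_reminded_at = datetime('now', 'localtime') (from v6, only when both sends succeed)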

Behavior simulation (confirming that repeated notifications are throttled):

T+0 s     健人 → msg (1st) → intents INSERT (last_reminded_at=NULL)
T+60 s    launchd timer fires → condition A (NULL and 30 s elapsed) → tmux.py send + ntfy once
                                                      → last_reminded_at=T+60
T+120 s   launchd timer fires → condition B (60 s elapsed < 5 min) → skip
T+180 s   launchd timer fires → condition B (120 s elapsed < 5 min) → skip
...
T+360 s   launchd timer fires → condition B (300 s elapsed = 5 min) → tmux.py send + ntfy re-notification
                                                      → last_reminded_at=T+360

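Benefits:

A new message reaches the secretary within at most 90 seconds (60 s launchd + 30 s debounce), a large improvement over v3's maximum of 11 minutes
The 5-minute throttle suppresses repeated notifications (same as the existing behavior)
Dropping THRESHOLD_MINUTES also reduces configuration complexity
Zero ntfy traffic in normal operation (no notification occurs if the secretary handles the message within 30 seconds)
ntfy fires only during rate limits or secretary stalls, so notifications map one-to-one to abnormal states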
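The two-stage rule and the timeline above can be checked with a short simulation. This is a sketch that restates conditions A and B in Python; the names `should_notify` and `simulate` and the start time are illustrative and not part of intents-timer.sh.

```python
from datetime import datetime, timedelta

FIRST_NOTIFY_AFTER = timedelta(seconds=30)   # condition A threshold
REMIND_INTERVAL = timedelta(seconds=300)     # condition B threshold


def should_notify(now, received_at, last_reminded_at):
    """Condition A when never reminded, condition B otherwise."""
    if last_reminded_at is None:
        return now - received_at >= FIRST_NOTIFY_AFTER
    return now - last_reminded_at >= REMIND_INTERVAL


def simulate(tick_seconds, received_at):
    """Return the tick offsets (in seconds) at which a notification fires."""
    last_reminded_at = None
    fired = []
    for offset in tick_seconds:
        now = received_at + timedelta(seconds=offset)
        if should_notify(now, received_at, last_reminded_at):
            fired.append(offset)
            last_reminded_at = now   # assumes both sends succeeded
    return fired


received = datetime(2026, 4, 9, 7, 0, 0)          # arbitrary illustrative time
fired = simulate(range(60, 421, 60), received)    # launchd ticks every 60 s
print(fired)
```

With a message received exactly on a tick boundary, the first notification fires at T+60 s and the reminder at T+360 s, matching the timeline; the 90-second worst case arises when a tick lands just under 30 seconds after arrival.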

Implementation change (v5): SELECT condition and notification logic in intents-timer.sh:

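-- v3 (old):
SELECT id, chat_id, raw_text FROM intents
WHERE status = 'pending'
  AND (julianday('now') - julianday(received_at)) * 24 * 60 >= 10;

-- v5 (current):
-- last_reminded_at is written with datetime('now', 'localtime'), so it is compared
-- against julianday('now', 'localtime'); comparing it with UTC 'now' would be off
-- by the UTC offset. The received_at comparison is left as in v3 on the assumption
-- that record-intent.sh stores UTC; if it stores local time, use 'localtime' there too.
SELECT id, chat_id, raw_text, last_reminded_at FROM intents
WHERE status = 'pending'
  AND (
    (last_reminded_at IS NULL AND (julianday('now') - julianday(received_at)) * 86400 >= 30)
    OR
    (last_reminded_at IS NOT NULL AND (julianday('now', 'localtime') - julianday(last_reminded_at)) * 86400 >= 300)
  );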

Notification dispatch and UPDATE SQL (v6 pseudocode):

# intents-timer.sh の通知ブロック (v6)
# $DB = ~/secretary-state/jobs.db
# PENDING_IDS の取得は上記の SELECT で行う
# MAX_SNIPPET_CHARS = 40 (v6 で 80 → 40 に短縮)

while IFS='|' read -r id chat_id raw_text last_reminded_at; do
  [ -z "$id" ] && continue

  # --- 本文のプライバシー マスク処理 (v6: Unicode aware) ---
  # Python で文字単位 (code point) で 40 文字 cut + 制御文字除去 + 改行正規化
  # bash の `cut -c` は byte 単位なので日本語で文字化けする。Python に置き換え
  snippet=$(MAX=40 RAW="$raw_text" python3 - <<'PY'
import os, sys, unicodedata

raw = os.environ.get("RAW", "")
max_chars = int(os.environ.get("MAX", "40"))

# 改行・タブを空白に正規化
normalized = raw.replace("\n", " ").replace("\r", " ").replace("\t", " ")

# 制御文字 (カテゴリ "Cc") を除去
cleaned = "".join(ch for ch in normalized if unicodedata.category(ch) != "Cc")

# 文字単位 (code point) で cut
if len(cleaned) > max_chars:
    snippet = cleaned[:max_chars] + "…"
else:
    snippet = cleaned

sys.stdout.write(snippet)
PY
  )

  # --- 秘書セッションへの tmux.py send (exit code チェック v6) ---
  python3 "$HOME/.claude/skills/tmux/scripts/tmux.py" send secretary:0 \
    "【未対応メッセージ】msg_id=$id chat=$chat_id: $snippet" \
    >>"$LOG_DIR/intents-timer.log" 2>&1
  tmux_rc=$?

  # --- ntfy 送信 (exit code チェック v6) ---
  # pending が 30 秒以上解消しない = 秘書停滞と間接判定
  # 通常運用では launchd 60 秒サイクルで通知されない
  curl -s -m 5 -o /dev/null -w "%{http_code}" \
    -H "Title: Telegram pending (intent-timer)" \
    -H "Priority: default" \
    -H "Tags: envelope" \
    -d "msg_id=$id chat=$chat_id: $snippet" \
    "https://ntfy.sh/aihara-64d1132d60c2" \
    >/tmp/intents-timer-ntfy-http 2>>"$LOG_DIR/intents-timer.log"
  curl_rc=$?
  ntfy_http=$(cat /tmp/intents-timer-ntfy-http 2>/dev/null || echo "000")

  # --- UPDATE last_reminded_at は「両方成功した時だけ」(v6) ---
  # どちらかが失敗していれば UPDATE せず次ループで再送 (最短 60 秒後)
  # これにより初回通知がネットワーク障害で落ちても 5 分待たずに回復
  if [ "$tmux_rc" -eq 0 ] && [ "$curl_rc" -eq 0 ] && [ "${ntfy_http:0:1}" = "2" ]; then
    sqlite3 -cmd ".timeout 5000" "$DB" <<SQL
UPDATE intents
SET last_reminded_at = datetime('now', 'localtime')
WHERE id = $id AND status = 'pending';
SQL
    log "NOTIFY OK id=$id tmux=$tmux_rc curl=$curl_rc http=$ntfy_http"
  else
    # 失敗を WARN ログに残す (次回ループで再送される)
    log "NOTIFY FAIL id=$id tmux=$tmux_rc curl=$curl_rc http=$ntfy_http — will retry next tick"
  fi
done <<EOF
$PENDING_IDS
EOF

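Design intent of the mask (v6):

Unicode-aware cut: a byte-based cut -c splits Japanese text (3 bytes per character in UTF-8) mid-character and garbles it. Python str slicing works per code point, so it is safe
Shortened to 40 characters (v5: 80 → v6: 40): responds to Codex Loop 4's remark that "short personal data still leaks as is / Japanese gets garbled". 40 characters is set as "the minimum needed to tell what kind of message it is"
Control character removal: all Unicode category "Cc" characters are dropped, so control characters other than tab / newline (BEL, NUL, ESC, etc.) are removed as well
Trailing …: tells the user that truncation happened (the original length stays hidden)
chat_id / msg_id are sent: without knowing which conversation and which message is meant, nobody can act on the notification

Design intent of the exit code checks (v6):

Typical tmux.py send failures: secretary session missing (right after kill-session), tmux server itself down, python3 not on the path
Typical curl failures: ntfy.sh DNS failure, network down. A 5xx from ntfy does not change curl's exit code and is caught by the HTTP status check instead
If either one fails, last_reminded_at is not updated → the notification is retried on the next loop 60 seconds later
ntfy also checks the HTTP status (only 2xx counts as success), which catches the case where curl exits 0 but the server returns 500
tmux_rc / curl_rc / http are logged, which makes later investigation easy

SQL injection countermeasure: $id is an integer obtained from the SELECT, so it is safe in principle; as an extra guard the UPDATE also requires status = 'pending' (rows that are already closed are not updated).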

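Reviewing the number of Python launches (v7, Codex Loop 5 SHOULD):

The v6 pseudocode launches Python twice per pending intent (mask generation + tmux.py send). As Codex pointed out, this load matters with a large backlog. v7 settles on the following:

Snippet generation: one Python launch per pending intent. Masking and notification text assembly are combined into one script
tmux.py send: the existing tool is used as is (not merged, to keep it independent). One Python launch is accepted
Batching (YAGNI): "process N intents in a single script" is deferred in v7 as over-engineering. Reasons:
- In normal operation pending is about 0-5 rows (momentarily 10 during a burst)
- 1 Python launch at 50 ms × 10 intents = 500 ms (0.8% of the 60-second launchd cycle); counting both launches per intent, about 1 s (1.7%)
- Batching coarsens the retry granularity on errors (one failure would skip the UPDATE for all rows)
- An abnormal backlog of more than 50 pending rows is expected to be caught by the watchdog (Layer 4w)
Alternative: a Unicode-aware cut could be built from a cut + iconv + tr bash pipeline, but handling UTF-8 multibyte boundaries correctly in bash is hard → Python is the simplest and most reliable option

In conclusion, v7 keeps "two Python launches per pending intent (mask + tmux.py)" and only adds a note to section 6.5 as a "possible measure if pending surges in the future".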

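6.5.1 launchd watchdog (new in v6)

Background: responds to Codex Loop 4's remark that "intents-timer.sh is the only trigger for detecting a stalled secretary, so it is a single point of failure that goes unnoticed if the launchd job dies".

Design policy:

Run a separate launchd job (com.aiharataketo.intents-timer-watchdog) every 5 minutes
Check last exit code / last terminating signal / state with launchctl print gui/$(id -u)/com.aiharataketo.intents-timer
Anomaly rules (revised in v8):
- (i) Job unloaded: launchctl print output is empty → recover with bootstrap
- (ii) last exit code is an integer and non-zero: restart with bootout → bootstrap
- (iii) last terminating signal exists and is not Terminated: restart with bootout → bootstrap (added in v8. A kill -9 shows up as last terminating signal = Killed: 9, not as last exit code = -9, so it is missed unless the signal line is parsed too)
- (iv) (never exited) / - / empty string / non-integer means "has not run yet" and is treated as normal
- (v) state = waiting / running is normal. not running also appears in normal launchd operation, so it is not used for the decision and is kept only as diagnostic information
On anomaly: ntfy warning + restart with launchctl bootout → launchctl bootstrap
The case where the watchdog itself dies is accepted (recursive monitoring is YAGNI)

intents-timer-watchdog.sh (revised in v8):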

#!/bin/bash
# pipefail を設定しないと awk の構文エラー (exit 1) が silent に握りつぶされ
# wc -l が 0 を返して永久に検知されない。将来の回帰防止として必須。
set -eo pipefail
LOG_DIR="$HOME/.claude/logs"
mkdir -p "$LOG_DIR"
LOG="$LOG_DIR/intents-timer-watchdog.log"
JOB="com.aiharataketo.intents-timer"
JOB_LABEL="gui/$(id -u)/$JOB"
PLIST="$HOME/Library/LaunchAgents/$JOB.plist"

log() { echo "[$(date '+%F %T')] $1" >>"$LOG"; }

# --- (i) launchctl print でジョブ状態取得 ---
state_out=$(launchctl print "$JOB_LABEL" 2>/dev/null || true)

if [ -z "$state_out" ]; then
  log "WARN: $JOB not loaded, attempting bootstrap"
  launchctl bootstrap "gui/$(id -u)" "$PLIST" 2>>"$LOG" || {
    log "ERROR: bootstrap failed"
    curl -s -m 5 -d "秘書: intents-timer launchd bootstrap 失敗" "https://ntfy.sh/aihara-64d1132d60c2" >/dev/null || true
    exit 1
  }
  exit 0
fi

# --- (ii) last exit code / last terminating signal / state 抽出 ---
# launchctl print の出力例:
#   last exit code = 0
#   last exit code = (never exited)
#   last exit code = 2
#   last terminating signal = Killed: 9        ← kill -9 時に出る (v8 追加)
#   last terminating signal = Terminated: 15   ← launchctl bootout 時の正常終了
#   state = running / waiting / not running
#
# exit code だけでは `kill -9` を拾えない (`last exit code` は変化しないまま) ので、
# `last terminating signal` 行も併せて解析する。sed で trailing space を除去。
last_exit=$(printf '%s\n' "$state_out" | sed -n 's/.*last exit code = //p' | head -1 | sed 's/[[:space:]]*$//')
last_signal=$(printf '%s\n' "$state_out" | sed -n 's/.*last terminating signal = //p' | head -1 | sed 's/[[:space:]]*$//')
state=$(printf '%s\n' "$state_out" | sed -n 's/.*state = //p' | head -1 | sed 's/[[:space:]]*$//')

log "state='$state' last_exit='$last_exit' last_signal='$last_signal'"

# --- (iii) 異常判定 ---
# 1) last_exit が整数 (0-255 または負の整数) かつ 0 でないとき → 異常
#    `(never exited)`, `-`, 空文字, その他の文字列は「正常」(まだ実行されてない / 情報なし)
# 2) last_signal が存在し、かつ `Terminated` で始まらないとき → 異常
#    `Terminated: 15` は launchctl bootout が送る SIGTERM なので正常処理扱い。
#    `Killed: 9` (SIGKILL), `Bus error: 10`, `Abort trap: 6` 等は異常扱い。
is_abnormal=0
reason=""
if [[ "$last_exit" =~ ^-?[0-9]+$ ]] && [ "$last_exit" != "0" ]; then
  is_abnormal=1
  reason="exit=$last_exit"
  log "WARN: last_exit='$last_exit' (non-zero integer) — abnormal"
fi
if [ -n "$last_signal" ] && ! printf '%s' "$last_signal" | grep -q '^Terminated'; then
  is_abnormal=1
  reason="${reason:+$reason, }signal='$last_signal'"
  log "WARN: last_signal='$last_signal' (non-Terminated) — abnormal"
fi

if [ "$is_abnormal" = "1" ]; then
  curl -s -m 5 -d "秘書: intents-timer 異常検出 ($reason, state='$state')、リロード試行" "https://ntfy.sh/aihara-64d1132d60c2" >/dev/null || true
  launchctl bootout "$JOB_LABEL" 2>>"$LOG" || true
  sleep 1
  launchctl bootstrap "gui/$(id -u)" "$PLIST" 2>>"$LOG" || {
    log "ERROR: bootstrap after bootout failed"
    curl -s -m 5 -d "秘書: intents-timer リロード失敗、手動確認必要" "https://ntfy.sh/aihara-64d1132d60c2" >/dev/null || true
    exit 1
  }
  log "reload OK"
fi

# --- (iv) 直近 10 分の intents-timer ログから ERROR 数をカウント (v8 BSD awk 互換修正) ---
# v6 の `awk '$0 > cutoff'` はログ行先頭 '[' で常に真になるバグ。
# v7 で `match($0, /regex/, m)` の 3 引数形式に変えたが、これは gawk 拡張で
# macOS 標準 BSD awk では構文エラーで即死し、err_count が常に 0 になる。
# v8 では POSIX 準拠の 2 引数 match() + RSTART/RLENGTH + substr() に差し替え、
# さらに awk の stderr を一時ファイルに捕捉して syntax error を明示検知する
# (`err_count=$(...)` の代入は set -e で検知できないため silent に戻る罠)。
err_count=0
if [ -f "$LOG_DIR/intents-timer.log" ]; then
  cutoff=$(date -v-10M '+%F %T' 2>/dev/null || date -d '10 minutes ago' '+%F %T' 2>/dev/null || echo "")
  if [ -n "$cutoff" ]; then
    awk_err=$(mktemp)
    awk_out=$(awk -v cutoff="$cutoff" '
      {
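        # Digits are spelled out instead of {n} intervals, which some older BSD awk builds may not support (assumption)
        if (match($0, /\[[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]\]/)) {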
          # RSTART は 1-origin、先頭 `[` を除いて 19 文字が時刻本体
          ts = substr($0, RSTART + 1, RLENGTH - 2)
          if (ts > cutoff && /ERROR|FAIL/) print
        }
      }
    ' "$LOG_DIR/intents-timer.log" 2>"$awk_err") || {
      log "ERROR: awk failed — $(cat "$awk_err" 2>/dev/null || true)"
      curl -s -m 5 -d "秘書: watchdog awk 構文エラー検出、ログ監視停止中" \
        "https://ntfy.sh/aihara-64d1132d60c2" >/dev/null || true
      awk_out=""
    }
    rm -f "$awk_err"
    if [ -n "$awk_out" ]; then
      err_count=$(printf '%s\n' "$awk_out" | wc -l | tr -d ' ')
    fi

    if [ "${err_count:-0}" -gt 10 ]; then
      log "WARN: $err_count errors in intents-timer.log last 10 min"
      curl -s -m 5 -d "秘書: intents-timer 10分間に $err_count 件エラー発生" "https://ntfy.sh/aihara-64d1132d60c2" >/dev/null || true
    fi
  fi
fi

exit 0

v8 での修正点サマリ:

Codex Loop 6 指摘	v8 対応
MUST: `match($0, /regex/, m)` 3 引数は gawk 拡張、BSD awk で構文エラーになり err_count が常に 0	POSIX 2 引数 `match($0, /regex/)` + `RSTART`/`RLENGTH` + `substr()` に書き換え。macOS 標準 awk で動作
SHOULD: `kill -9` は `last exit code = -9` ではなく `last terminating signal = Killed: 9` として出る (Codex 実機検証済み)、v7 の `-9` 想定は発火せず signal 異常を見逃す	`last terminating signal` 行を追加解析。`Terminated` (bootout の SIGTERM 正常終了) 以外の signal は異常扱いにして bootout→bootstrap
SHOULD (KISS): 誤検知テストケース 6 ケースは親切だが launchctl 出力表記変更時に仕様との乖離を生みやすい	3 ケース (正常 / exit 異常 / signal 異常) に簡略化し、運用実態と表の整合を取りやすくする
Codex Loop 7 SHOULD: `set -o pipefail` 未設定 + `err_count=$(... awk ... \| wc -l)` の代入形式は awk syntax error を silent に通す余地がある	`set -eo pipefail` を設定、かつ awk の stderr を一時ファイルに捕捉して `\|\|` で失敗分岐を設け、syntax error 検出時に ntfy 警告を出す実装に変更

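Test cases for false-positive suppression (v8 simplified version):

Case	`last_exit`	`last_signal`	Verdict
Normal (first launch / clean exit / no information)	`(never exited)` / `0` / `-` / empty	empty / `Terminated: 15`	Normal (no reload)
Abnormal exit code	non-zero integer such as `2`	(any)	Abnormal → reload
Abnormal signal	(any)	`Killed: 9` / `Bus error: 10` / `Abort trap: 6`, etc.	Abnormal → reload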
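
The three cases in the table can be sketched as a single decision function. This is an assumption-laden illustration, not the watchdog's actual code: the function name `classify_launchd_state` is hypothetical, and it expects the two values to have already been extracted from `launchctl print` output.

```shell
#!/bin/bash
# Hypothetical helper: classify_launchd_state LAST_EXIT LAST_SIGNAL
# Prints "normal" or "abnormal" following the 3-case table.
classify_launchd_state() {
  local last_exit="$1" last_signal="$2"

  # Signal check first: anything other than empty or Terminated is abnormal,
  # regardless of the exit code.
  case "$last_signal" in
    ""|Terminated*) ;;
    *) echo "abnormal"; return 0 ;;
  esac

  # Exit code check: only a non-zero integer counts as abnormal.
  case "$last_exit" in
    ""|"-"|"0"|"(never exited)") echo "normal" ;;
    *[!0-9-]*) echo "normal" ;;   # non-numeric text: treat as "no information"
    *) echo "abnormal" ;;
  esac
}
```

Checking the signal before the exit code mirrors the "(any)" cells in the table: either condition alone is enough to trigger a reload.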

com.aiharataketo.intents-timer-watchdog.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.aiharataketo.intents-timer-watchdog</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>/Users/aiharataketo/.claude/scripts/intents-timer-watchdog.sh</string>
  </array>
  <key>StartInterval</key>
  <integer>300</integer>
  <key>ThrottleInterval</key>
  <integer>300</integer>
  <key>RunAtLoad</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/Users/aiharataketo/.claude/logs/intents-timer-watchdog-out.log</string>
  <key>StandardErrorPath</key>
  <string>/Users/aiharataketo/.claude/logs/intents-timer-watchdog-err.log</string>
</dict>
</plist>

Failure scenarios covered:

Scenario	Watchdog reaction
`intents-timer` was unloaded, e.g. by `launchctl bootout`	`launchctl print` returns nothing → recover with `bootstrap`
`intents-timer.sh` dies right after launch due to a syntax error	Detects `last exit code != 0` → retry with `bootout` → `bootstrap`
`intents-timer.sh` dies from `kill -9` / SIGBUS / SIGABRT (added in v8)	`last terminating signal` is anything other than `Terminated` → retry with `bootout` → `bootstrap`
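`intents-timer.sh` hits frequent errors due to DB locks	Counts `ERROR\|FAIL` lines in the last 10 minutes of the log → ntfy warning when there are more than 10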
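Notifications go out but `tmux.py send` fails often	intents-timer itself exits 0, so the watchdog does not detect this. Separate log monitoring is needed (to be considered in the next phase)
The watchdog itself dies	Out of scope (YAGNI). It is a lightweight script run by launchd, so this is rare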

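Safety of the reload operation:

launchctl bootout sends SIGTERM → SIGKILL to the target process, so a running job is interrupted
intents-timer.sh uses a mkdir lock for mutual exclusion, so even after a forced interruption the next launch reacquires the lock
A transaction interrupted midway is rolled back by sqlite3 when the database is next opened, and its lock is released when the process exits; `.timeout 5000` makes the next run wait up to 5 seconds for a remaining lock instead of failing immediately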
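
The mkdir lock with recovery after a forced kill could look roughly like the sketch below. Everything here is an assumption for illustration: the `LOCK_DIR` location, the `pid` file inside it, and the function names are hypothetical and do not describe the real intents-timer.sh.

```shell
#!/bin/bash
# Hypothetical sketch of a mkdir-based lock with stale-lock recovery.
# LOCK_DIR and the pid-file layout are assumptions for this example only.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/intents-timer.lock.demo}"

acquire_lock() {
  # mkdir is atomic: exactly one caller can create the directory.
  if mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "$$" > "$LOCK_DIR/pid"
    return 0
  fi
  # The lock exists: check whether its owner is still alive.
  local owner
  owner=$(cat "$LOCK_DIR/pid" 2>/dev/null || echo "")
  if [ -n "$owner" ] && kill -0 "$owner" 2>/dev/null; then
    return 1   # held by a live process
  fi
  # Stale lock left behind by a killed run: remove it and retry once.
  rm -rf "$LOCK_DIR"
  mkdir "$LOCK_DIR" 2>/dev/null || return 1
  echo "$$" > "$LOCK_DIR/pid"
}

release_lock() {
  rm -rf "$LOCK_DIR"
}
```

The stale-lock path is not race-free if two processes detect the same stale lock at the same moment; with a single launchd job firing every 60 seconds that window is assumed to be acceptable.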

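Rationale for StartInterval = 300 seconds (5 minutes):

The intents-timer itself runs every 60 seconds, so 5 minutes gives it up to 5 chances to fire
5 consecutive failures reliably indicate a real fault (not a one-off network failure)
Keeps the watchdog's own load low (frequent checks add nothing)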

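Side effects, re-evaluated:

60-second interval × conditions A/B normally yields at most one notification per minute (zero if nothing new arrived)
Even while a rate limit persists, re-notification happens only once every 5 minutes (spam prevention)
The old hard-coded THRESHOLD_MINUTES=10 is removed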

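Behavior in the rate limit scenario (v5):

(A) Layer 1' in server.ts only persists to the intents DB (direct ntfy notification was dropped in v5). The message stays as a pending entry
(B) For the secretary session, launchd 60 seconds + 30-second debounce delivers tmux.py send + an ntfy notification within at most 90 seconds
- Normal case (secretary alive): the intent is closed within 90 seconds, so launchd sends no notification (neither condition A nor B holds)
- Rate limit / stall: pending is not resolved for 30 seconds or more, so launchd fires the notification
(C) After the rate limit lifts, Claude reads the text injected by tmux.py send as a UserPromptSubmit on its next turn → check-pending-intents.sh in Layer 3 injects all entries as additionalContext
(D) If Claude crashed and run-claude.sh restarts it, bootstrap-pending-intents.sh in Layer 5 presents all entries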

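Effect of consolidating ntfy into intents-timer.sh:

Scenario	v4 behavior	v5 behavior
Normal message (secretary alive)	ntfy sent (unconditionally, every message)	no ntfy
Message during a rate limit	ntfy sent + tmux.py send after 90 seconds	tmux.py send + ntfy sent after 90 seconds
Message while the secretary is crashed	ntfy sent + tmux.py send after 90 seconds	tmux.py send + ntfy sent after 90 seconds

In v5, most of normal operation produces no ntfy notification. Exposure of message text to the external service is also deferred until the 90-second debounce elapses (if the secretary handles the message in that window, no text leaves the machine at all).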

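No Stop hook is used. Even if the secretary session is unresponsive due to a rate limit, the tmux.py send in (B) queues the message for the secretary within 90 seconds, the ntfy notification alerts the user at the same time, and after the limit lifts (C)/(D) handle it.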

6.6 Changes to run-claude.sh

Orphan-kill path fix:

Before: TELEGRAM_PLUGIN_DIR="/Users/aiharataketo/.claude/plugins/cache/claude-plugins-official/telegram"
After:  TELEGRAM_PLUGIN_DIR="/Users/aiharataketo/.claude/plugins/marketplaces/claude-plugins-official/external_plugins/telegram"

Added synchronous post-patch verification (the synchronous equivalent of Layer 8):

# Apply the patch
if ! bash /Users/aiharataketo/.claude/scripts/patch-telegram-plugin-v2.sh 2>>"$LOG_DIR/patch-telegram-plugin.log"; then
  log "ERROR: patch-telegram-plugin-v2.sh failed — skipping Claude start"
  curl -s -d "秘書: patch-telegram-plugin-v2 失敗。手動確認が必要。" ntfy.sh/aihara-64d1132d60c2 >/dev/null 2>&1 || true
  sleep 60
  continue
fi

# Synchronous post-patch verification: check that INTENT_RECORD_BEGIN is present in the actual server.ts
PLUGIN_TS="${TELEGRAM_PLUGIN_DIR}/server.ts"
if ! grep -q "INTENT_RECORD_BEGIN" "$PLUGIN_TS" 2>/dev/null; then
  log "ERROR: post-patch verify failed - marker missing at $PLUGIN_TS"
  curl -s -d "秘書: post-patch 検証失敗。marker 不在。" ntfy.sh/aihara-64d1132d60c2 >/dev/null 2>&1 || true
  sleep 60
  continue
fi
log "post-patch verify OK: INTENT_RECORD_BEGIN found"

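Advantages over the asynchronous sleep 30 version (Layer 8 in the v1 draft):

On verification failure, Claude startup can be stopped immediately (fail closed)
No need to identify a PID (grep is enough)
No wasted sleep 30 wait
No false matches on bun processes (the path is fixed, so no filtering is needed)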

7. Data model

7.1 Schema changes: none

The existing intents table (id, received_at, message_id, chat_id, raw_text, parsed_intent, status, linked_job_id, closed_reason, closed_at, last_reminded_at) is used as is. No new columns are needed.

7.2 Use of status (unchanged)

status	Meaning	From	To
pending	Not yet handled	Right after INSERT	linked / closed / ignored
linked	Linked to an existing job (reminders suppressed)	pending	closed
closed	Handled	pending / linked	-
ignored	Ignored (manual or migration test)	pending	-
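
The transitions in the table can be expressed as a small guard function. This is only a sketch: `can_transition` is a hypothetical name, and nothing in the design says the scripts enforce transitions this way.

```shell
#!/bin/bash
# Hypothetical guard: can_transition FROM TO
# Returns 0 when the status table allows FROM -> TO, 1 otherwise.
can_transition() {
  local from="$1" to="$2"
  case "$from:$to" in
    pending:linked|pending:closed|pending:ignored) return 0 ;;
    linked:closed) return 0 ;;
    *) return 1 ;;   # closed and ignored are terminal states
  esac
}
```

A caller could check `can_transition "$current" closed` before issuing the UPDATE, so that a terminal row is never reopened by accident.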

8. Flow (after applying this design)

8.1 Normal flow (4 messages in quick succession)

06:57  健人 → msg_id=3016 (message 1)
       server.ts handleInbound() → record-intent.sh → intents INSERT (3016, pending)
       mcp.notification(3016) → Claude (secretary) ← busy (running a tool)
06:57  健人 → msg_id=3017 (message 2)
       record-intent.sh → intents INSERT (3017, pending)
       mcp.notification(3017) → Claude coalesces (overwrites 3016)
06:58  健人 → msg_id=3018 (message 3)
       record-intent.sh → intents INSERT (3018, pending)
       mcp.notification(3018) → Claude