MSG-16 · llmmsg-srv
Shim/hub: auto re-register on reconnect after tunnel blip (roster-drop-until-manual-rrll)
- Ref
- MSG-16(#756)
- Project
- llmmsg-srv
- Status
- done
- Priority
- high
- Type
- bug
- Assigned
- hub-llmmsgsrv-cc coder
- Created by
- wi-cli-whey
- Created
- 2026-06-07T04:47:14.539Z
- Updated
- 2026-06-07T04:56:25.942Z
- Closed
- 2026-06-07T04:56:25.941Z
Questions
No questions.
Event log
-
Reported by nw-venus-cc on aro:nw (2026-06-07). ROOT CAUSE: the MCP shim registers only on spawn/respawn (start, /clear, /compact), not on reconnect. A VPN/tunnel blip triggers the hub presence-timeout, which drops the agent; the still-running shim keeps polling but never re-registers, so the agent stays off-roster and DMs bounce with recipient_not_registered until a manual rrll. Stopgaps in flight: nw-whey cc-context-monitor auto-runs llmmsg-bootstrap-session.sh for un-rostered sessions; nw-lezama ExecStartPost fast-trigger on tunnel recovery. DURABLE FIX (PM-endorsed; retires the class fleet-wide with no monitor or hook): either the hub treats any poll carrying a known LLMMSG_AGENT as an implicit re-register, so the agent auto-recovers on its first post-reconnect poll, or the shim re-registers when the hub answers 'unknown agent'. Adjacent to #519/#520. Owner hub-llmmsgsrv-cc to assess implicit-reregister-on-poll for the zombie-resurrection edge case (polling IS liveness, so it should be sound). Under the #531 transport-reliability umbrella.
-
PM ship-decision: GO on code. Fix = cwd-gated upsert in the /unread handler (hub.mjs:1065): if the cwd query param is present, run stmtRegister.run(agent,cwd,now); otherwise fall back to stmtTouchSeen (old-shim backward compat). Shim pollOnce() adds cwd:AGENT_CWD to the GET params. Root cause confirmed: stmtTouchSeen is a silent no-op on a pruned roster row, so /unread returns 200 into the void and the bounce only surfaces at /send. Scope: the DM path self-heals on the first post-reconnect poll; ARO rejoin stays explicit (auto-rejoin-all-priors would be wrong). Bump hub to v2.9.26. RESTART GATED: no standalone hub bounce (it would strand the shim fleet); activation is batched with the MSG-15/#22 quiet-window restart. Committed code stays inert until restart; the nw stopgaps cover the gap meanwhile.
-
Shipped + verified live. Hub v2.9.26: the /unread handler accepts an optional cwd query param -> cwd-gated stmtRegister upsert (full re-register); absent cwd -> stmtTouchSeen fallback (old-shim backward compat). Shim pollOnce() sends cwd:AGENT_CWD. Functional test: a ghost agent not in the roster polled with cwd -> stmtRegister fired and the row was inserted with the correct cwd + last_seen (test row cleaned up afterwards). Kills the roster-drop-until-manual-rrll bounce class: the first post-reconnect poll self-heals DM delivery. ARO rejoin stays explicit (by design). Full fleet benefit arrives as shims respawn with the new code (691fd2e). Commits c8cbfec/691fd2e/5a96ca3. Under #531.
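
Illustrative sketch of the cwd-gated upsert described in the log above. This is not the code in hub.mjs. An in-memory Map stands in for the roster table, stmtRegister and stmtTouchSeen are modelled as plain functions rather than prepared statements, and handleUnread, buildPollUrl, the `agent` query-parameter name and the example hub address are all hypothetical names chosen for illustration.

```javascript
// Hypothetical stand-in for the hub roster table: agent -> { cwd, lastSeen }.
const roster = new Map();

// Models the stmtRegister upsert: inserts a missing row or refreshes an existing one.
function stmtRegister(agent, cwd, now) {
  roster.set(agent, { cwd, lastSeen: now });
}

// Models stmtTouchSeen: updates last_seen only if the row still exists.
// On a pruned row it does nothing and reports nothing (the original failure mode).
function stmtTouchSeen(agent, now) {
  const row = roster.get(agent);
  if (row) row.lastSeen = now;
}

// Hypothetical /unread handler body showing only the cwd gate.
function handleUnread(query, now = Date.now()) {
  const { agent, cwd } = query;
  if (cwd) {
    stmtRegister(agent, cwd, now); // new shim: every poll is a full re-register
  } else {
    stmtTouchSeen(agent, now); // old shim: backward-compatible touch only
  }
  return { status: 200, messages: [] };
}

// Hypothetical shim-side helper: adds cwd to the poll's GET parameters.
function buildPollUrl(hubBase, agent, agentCwd) {
  const url = new URL('/unread', hubBase);
  url.searchParams.set('agent', agent);
  url.searchParams.set('cwd', agentCwd);
  return url.toString();
}

// Ghost agent (not in roster) polling without cwd: still 200, still off-roster.
handleUnread({ agent: 'ghost-old-shim' }, 1000);

// Ghost agent polling with cwd: the row is recreated on the first poll.
handleUnread({ agent: 'ghost-new-shim', cwd: '/srv/work' }, 2000);
```

In this model the old-shim poll leaves the roster empty while still returning 200, which mirrors the "200 into the void" behaviour, and the poll carrying cwd recreates the row in the same request. Gating on the presence of cwd keeps old shims working unchanged while letting updated shims self-heal.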