MSG-43 · llmmsg-srv
Rationalize errscan-vs-applog overlap fleet-wide (both = error->PM-DM; applog covers all 3 proj+now green, errscan mars-only, mars DOUBLE-covered) - decide consolidate-not-expand; needs PM/Elazar division-of-labor call
- Ref: MSG-43 (#1071)
- Project: llmmsg-srv
- Status: deferred
- Priority: low
- Type: task
- Assigned: pm-llmmsgsrv-cc
- Created by: wi-cli-whey
- Created: 2026-06-14T14:58:00.165Z
- Updated: 2026-06-14T15:04:50.919Z
Questions
No questions.
Event log
-
Rationalize errscan-vs-applog overlap fleet-wide (both = error->PM-DM; applog covers all 3 proj+now green, errscan mars-only, mars DOUBLE-covered) - decide consolidate-not-expand; needs PM/Elazar division-of-labor call
-
Reframed from 'deploy errscan pluto/venus' to OVERLAP RATIONALIZATION (nw-whey insight): errscan + applog are both error->PM-DM pipelines. applog (pull+listen) already covers pluto/mars/venus on venus + is green - so pluto/venus alerting is NOT a gap, deploying errscan there would be largely REDUNDANT. mars currently runs BOTH (applog-pull@mars + errscan@mars) = double-coverage/possible dup alerts. RIGHT MOVE = decide division of labor fleet-wide, likely CONSOLIDATE on applog + retire errscan, NOT expand errscan. nw-whey to produce per-project coverage map of both systems as decision input. Low priority - MSG-42 done, alerting works via applog.
-
COVERAGE MAP (nw-whey):
  mars = BOTH: errscan@mars (whey, 2h Supabase-view batch, error+WARN) + applog (venus, pull+listen, realtime).
  pluto/venus = applog-ONLY.
  errscan-only = nothing.
  DOUBLE-COVERED = mars only.
DECISION RISK: errscan@mars has 4 features applog may NOT have:
  (1) Elazar escalation lane (new-sig / spike>=10 / PROD-DOWN unhandled-5xx);
  (2) extra recipient routing (coder-mars:actual_err);
  (3) suppression table (--suppress);
  (4) spike-only aggregate.
So 'retire errscan, consolidate on applog' could LOSE those for mars unless ported first -> consolidation may be PORT-then-retire, not clean decommission.
NEXT: nw-whey getting applog feature-parity readout from nw-venus (which of the 4 applog already does) to complete the decision packet before Elazar's call.
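The coverage map above can be restated as a small lookup, which makes the double-covered and single-covered buckets easy to recompute if coverage changes. This is only a sketch: the dict layout, the `classify` function, and the bucket names are hypothetical and are not part of errscan or applog; the data is transcribed from the entry above.

```python
# Hypothetical restatement of the coverage map; names are illustrative only.
COVERAGE = {
    "mars": {"errscan", "applog"},
    "pluto": {"applog"},
    "venus": {"applog"},
}


def classify(coverage):
    """Group projects by which error->PM-DM pipelines cover them."""
    groups = {"double": [], "applog_only": [], "errscan_only": [], "uncovered": []}
    for project in sorted(coverage):
        systems = coverage[project]
        has_errscan = "errscan" in systems
        has_applog = "applog" in systems
        if has_errscan and has_applog:
            groups["double"].append(project)
        elif has_applog:
            groups["applog_only"].append(project)
        elif has_errscan:
            groups["errscan_only"].append(project)
        else:
            groups["uncovered"].append(project)
    return groups


if __name__ == "__main__":
    for bucket, projects in classify(COVERAGE).items():
        print(bucket, projects)
```

With the data as recorded, only mars lands in the double bucket, and the errscan-only and uncovered buckets are empty, which matches the "not a gap" reading for pluto/venus.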
-
DECISION PACKET complete (nw-whey + nw-venus parity read).
PARITY: applog does NONE of errscan@mars's 4 (escalation lane / multi-recipient routing / suppression table / spike-aggregate) - applog is deliberately simpler (severity-routed, single PM recipient, cursor watermark). So consolidate = PORT-THEN-RETIRE, not free.
OPTIONS:
  A = keep both (status quo, ZERO work, cost = mars double-DMs PM on warns);
  B = port-then-retire (build the 4 into applog, all 3 projects gain richer alerting, non-trivial build WI);
  C = retire-without-port (cheapest but SECURITY REGRESSION - loses Elazar prod-down escalation + credential-stuffing aggregate for mars, NOT recommended).
INTERIM PM DECISION: default A (zero-work, no security loss, errscan@mars already bearer-fixed+working); B held as future WI GATED on Elazar wanting the 4 features fleet-wide (pluto/venus currently lack them). No live gap (MSG-42 fixed). Packet is Elazar-ready; surface as NON-URGENT (not during cdw cutover).
Possible A-refinement: narrow errscan@mars to its 4 unique features only + let applog own routine warn-digest = kills mars double-DM (future tuning, not forced).
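The parity reasoning behind options A/B/C amounts to a set difference over the four errscan@mars features. The sketch below is illustrative only: the feature labels and the `features_lost_for_mars` helper are hypothetical shorthand for the items named in the packet, not identifiers from either system.

```python
# Hypothetical labels for the four errscan@mars features named in the packet.
ERRSCAN_MARS_FEATURES = frozenset({
    "escalation_lane",
    "multi_recipient_routing",
    "suppression_table",
    "spike_aggregate",
})

# Per the parity read, applog currently implements none of the four.
APPLOG_PARITY = frozenset()


def features_lost_for_mars(option):
    """Return the errscan@mars features mars would lose under option A, B, or C."""
    if option == "A":
        # Keep both: errscan@mars keeps running, nothing is lost.
        return frozenset()
    if option == "B":
        # Port-then-retire: applog gains the four before errscan is retired.
        applog_after_port = APPLOG_PARITY | ERRSCAN_MARS_FEATURES
        return ERRSCAN_MARS_FEATURES - applog_after_port
    if option == "C":
        # Retire-without-port: whatever applog lacks today is lost.
        return ERRSCAN_MARS_FEATURES - APPLOG_PARITY
    raise ValueError(f"unknown option: {option!r}")


if __name__ == "__main__":
    for opt in ("A", "B", "C"):
        print(opt, sorted(features_lost_for_mars(opt)))
```

Under these assumptions only option C drops features for mars, which is why the packet flags it as a regression and leaves the choice between A and B.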
-
Decision packet ready, interim default A (keep both) adopted - zero work, no live gap. Awaits Elazar A-vs-B call (non-urgent), surfaced at a natural moment post-cdw. B=its own build WI if he wants the 4 features fleet-wide.