v12.4.3: one-time pollution cleanup migration + v12.4.1/v12.4.2 fixes (#2133)

* fix: 5 trivial bugs from v12.4.1 issue triage

- #2092: emit CJS-safe banner (no import.meta.url) in worker-service.cjs
- #2100: PreToolUse Read hook timeout 2000s → 60s
- #2131: add "shell": "bash" to every hook for Windows compat
- #2132: Antigravity dir typo .agent → .agents
- #2088: clear inherited MCP servers in worker SDK query() calls

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: stop context overflow loop + block task-notification leak

- SDKAgent: clear memorySessionId on "prompt is too long" so crash-recovery starts a fresh SDK session instead of resuming the same poisoned context forever (was producing 68+ failed pending_messages on a single stuck session in the wild)
- tag-stripping: new isInternalProtocolPayload() predicate; session-init hook + SessionRoutes both skip storage when entire prompt is one of Claude Code's autonomous protocol blocks (currently <task-notification>; conservative deny-list that does NOT touch <command-name>/<command-message>, which wrap real user slash-commands)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version to 12.4.2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: update CHANGELOG.md for v12.4.2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cleanup): one-time v12.4.3 migration purges observer-sessions and stuck pending_messages

Adds CleanupV12_4_3 module that runs once per data dir on worker startup (after migrations apply, before Chroma backfill). Drops accumulated pollution that v12.4.0 (observer-sessions filter) and v12.4.2 (context-overflow guard + task-notification leak block) prevent from recurring:

- DELETE FROM sdk_sessions WHERE project='observer-sessions' (cascades to user_prompts, observations, session_summaries via existing FK ON DELETE CASCADE)
- DELETE FROM pending_messages stuck in 'failed'/'processing' for any session with >=10 such rows (poisoned chains from the pre-v12.4.2 retry loop; threshold spares legitimate transient failures)
- Wipes ~/.claude-mem/chroma and chroma-sync-state.json so backfillAllProjects rebuilds the vector store from cleaned SQLite

Pre-flight checks free disk (1.2x DB size + 100MB) via fs.statfsSync; backs up via VACUUM INTO with copyFileSync fallback; PRAGMA foreign_keys=ON on the cleanup connection (off by default in bun:sqlite). Marker file ~/.claude-mem/.cleanup-v12.4.3-applied records backup path and counts. Opt-out via CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1.

Verified locally: 311MB DB backed up to 277MB in 943ms; 11 observer sessions + 3 cascade rows + 141 stuck pending_messages purged; chroma rebuilt via backfill. Total cleanup time 1.1s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #2133 code review

- SessionRoutes: check isInternalProtocolPayload before stripping tags so internal protocol prompts skip the strip work entirely.
- tag-stripping: bound isInternalProtocolPayload input length to 256KB to prevent ReDoS-class scans on malformed unclosed tags.
- SDKAgent: extract resetSessionForFreshStart helper; both context-overflow paths now share one nullification routine.
- worker-service: drop the per-startup "Checking for one-time v12.4.3 cleanup" info log. It runs every boot even after the marker exists; the function already logs at debug/warn when relevant.
- tests: add isInternalProtocolPayload edge cases (whitespace, attributes, partial tags, unrelated tags, oversize input).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Greptile P2 comments on PR #2133

CleanupV12_4_3.ts: derive backup directory and restore-hint path from effectiveDataDir instead of the module-level BACKUPS_DIR/DB_PATH constants. The dataDirectory override is meant for test isolation; the prior version still wrote backups to the production directory.

SessionRoutes.ts: move isInternalProtocolPayload guard to the top of handleSessionInitByClaudeId, before createSDKSession. The previous position blocked the user_prompts insert but still created an empty sdk_sessions row, asymmetric with the hook-layer guard in session-init.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cleanup): retry on disk-skip; survive chroma wipe failure

CodeRabbit Major + Claude review:

- Disk pre-flight skip no longer writes the marker. A user temporarily low on disk would otherwise have the cleanup permanently disabled even after freeing space. Retry on next startup instead.
- Wrap wipeChromaArtifacts in try/catch and write the marker even on failure (with chromaWipeError captured). Without this, an rmSync permission failure on chroma/ left writeMarker unreached, so every subsequent boot re-ran the SQL purge AND created a fresh backup, consuming disk indefinitely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cleanup): close backup handle before copyFileSync fallback

Claude review:

- backupDb is now closed before falling into the copyFileSync fallback. On Windows an open SQLite handle holds a file lock that can prevent the fallback copy from reading the source. The previous version only closed after both branches completed.
- Add empty-body <task-notification></task-notification> case to the isInternalProtocolPayload tests for completeness.

Cascade-row count queries already match the actual FK columns (content_session_id for user_prompts, memory_session_id for observations / session_summaries), so no fix is needed there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cleanup): accurate session count + add migration tests

Claude review v3:

session-init.ts: filter on rawPrompt before the [media prompt] substitution. Functionally equivalent but explicit: the check no longer depends on the substitution leaving real protocol payloads untouched.

CleanupV12_4_3.ts: counts.observerSessions now comes from a pre-DELETE COUNT(*), not from result.changes. bun:sqlite inflates result.changes with FTS-trigger and cascade row counts (the user_prompts_fts triggers inflate a 3-session purge to 19 changes). The previous code logged a misleading total and wrote it to the marker.

tests/infrastructure/cleanup-v12_4_3.test.ts: happy-path coverage of the migration against a real on-disk SQLite under a tmpdir. Verifies observer-session purge with cascades, stuck pending_messages purge, chroma artifact wipe, marker payload shape, idempotency on re-run, and CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 opt-out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(protocol-filter): close two-block false positive; address review

CodeRabbit + Claude review v5:

tag-stripping.ts: PROTOCOL_ONLY_REGEX rewritten with a negative-lookahead body so a prompt like "<task-notification>x</task-notification> hi <task-notification>y</task-notification>" no longer matches as a single outer block. The prior greedy [\s\S]* spanned the middle user text and would have silently dropped a real prompt. Confirmed via probe.

tag-stripping.test.ts: drop the 50ms wall-clock assertion (CI flake); add the two-block-with-text case as a regression test.

SessionRoutes.ts: filter on req.body.prompt directly, before the [media prompt] substitution and 256KB truncation. Mirrors the session-init.ts hook-layer ordering and ensures a protocol payload that happens to be near the byte limit isn't truncated before the filter runs.

cleanup-v12_4_3.test.ts: add stuckCount=9 below-threshold case verifying pending_messages with <10 stuck rows are preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cleanup): include WAL/SHM in backup fallback; safer rollback

CodeRabbit Major + Claude review v6:

CleanupV12_4_3.ts: when VACUUM INTO fails and copyFileSync runs, also copy any -wal/-shm sidecars. The DB is configured WAL mode, so recent committed pages can live in those files; copying only the .db would miss them. VACUUM INTO already captures everything in one file, so the happy path is unaffected.

CleanupV12_4_3.ts: wrap ROLLBACK in try/catch so a no-op rollback (SQLite already rolled back on a constraint failure) cannot shadow the original purge error.

SDKAgent.ts: align both context-overflow log levels to error. Both branches are fatal-recovery paths; the previous warn/error split was inconsistent and made the throw branch easy to miss in logs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: pre-count stuck pending_messages; document adjacent-block fall-through

Claude review v7:

CleanupV12_4_3.ts: runStuckPendingPurge now uses a SELECT COUNT(*) before the DELETE, matching the pattern in runObserverSessionsPurge. result.changes is reliable today (no FTS on pending_messages) but the explicit count protects against future schema additions, and keeps the two purges symmetric.

tag-stripping.test.ts: add test documenting that adjacent protocol blocks (no user text between) deliberately fall through to storage. The deny-list is per-block; concatenations are out of scope.

Skipped per project rules / Node API constraints:

- frsize fallback in disk check: Node/Bun StatFs doesn't expose frsize
- VACUUM-INTO comment: comment-only suggestion
- Overflow string constant extraction: low value

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
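The disk pre-flight rule in the commit message (free space of at least 1.2x the DB size plus 100MB) can be checked by hand. The sketch below only reproduces that arithmetic; the helper names `requiredFreeBytes` and `hasRoomForBackup` are hypothetical, and it assumes "MB" means 1024 * 1024 bytes, as the `100 * 1024 * 1024` term in the diff suggests.

```typescript
// Hypothetical helpers. Only the "1.2x DB size + 100MB" rule comes from the commit.
const MIB = 1024 * 1024;

/** Bytes that must be free on the data volume before a backup is attempted. */
function requiredFreeBytes(dbSizeBytes: number): number {
  return Math.ceil(dbSizeBytes * 1.2) + 100 * MIB;
}

/** True when the volume has room for the backup plus the fixed headroom. */
function hasRoomForBackup(dbSizeBytes: number, freeBytes: number): boolean {
  return freeBytes >= requiredFreeBytes(dbSizeBytes);
}

// The 311MB database from the local verification run (assumed to be 311 MiB).
const exampleDbSize = 311 * MIB;
console.log(requiredFreeBytes(exampleDbSize)); // 496186164 bytes, about 473.2 MiB
```

Under these assumptions a 311 MiB database needs roughly 473 MiB free, so a volume with 400 MiB available would skip the cleanup and retry on a later startup.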
2026-04-25 16:30:34 -07:00
parent a2e174b90f
commit 703c64c756
24 changed files with 1191 additions and 511 deletions
@@ -10,7 +10,7 @@
  "plugins": [
    {
      "name": "claude-mem",
-      "version": "12.4.1",
+      "version": "12.4.3",
      "source": "./plugin",
      "description": "Persistent memory system for Claude Code - context compression across sessions"
    }
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "12.4.1",
+  "version": "12.4.3",
  "description": "Memory compression system for Claude Code - persist context across sessions",
  "author": {
    "name": "Alex Newman"
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "12.4.1",
+  "version": "12.4.3",
  "description": "Memory compression system for Claude Code - persist context across sessions",
  "author": {
    "name": "Alex Newman",
@@ -4,6 +4,25 @@ All notable changes to this project will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

+## [12.4.2] - 2026-04-25
+
+## Two ship-blockers from yesterday's triage + 5 trivial fixes
+
+### Worker reliability
+- **Context overflow no longer loops forever.** When the Claude SDK throws `Prompt is too long`, `SDKAgent` now clears `session.memorySessionId` and sets `session.forceInit = true` before throwing — so the immediately-following crash-recovery spawn starts a fresh SDK session instead of resuming the same overflowed context. In the wild this had stranded 68+ pending messages on a single poisoned session before the windowed RestartGuard finally abandoned the queue.
+- **`<task-notification>` payloads no longer pollute `user_prompts`.** Claude Code's autonomous protocol blocks (emitted on background `Agent` completion) were being captured as if they were user prompts — 471 such rows in one local DB. New `isInternalProtocolPayload()` predicate in `src/utils/tag-stripping.ts` blocks them at both the hook layer (`session-init.ts`) and the worker boundary (`SessionRoutes.ts`). Conservative deny-list — does NOT touch `<command-name>` / `<command-message>` which wrap real user slash-commands.
+
+### Triage cleanup (from yesterday's open-issue review)
+- **#2092**: `worker-service.cjs` build banner now CJS-safe (no `import.meta.url`); `node -c` passes for the first time in several releases.
+- **#2100**: PreToolUse Read hook timeout reduced from `2000` seconds (plainly a typo) to `60`.
+- **#2131**: `"shell": "bash"` added to every hook in `plugin/hooks/hooks.json` so Claude Code on Windows routes through Git Bash instead of cmd.exe.
+- **#2132**: Antigravity context file path corrected from `.agent/rules` to `.agents/rules`.
+- **#2088**: Worker SDK `query()` calls now pass `mcpServers: {}` to suppress inheritance of the user's global MCP servers (Serena, etc.) into observer/knowledge sessions.
+
+### Notes
+- Cleanup of already-polluted rows is not included in this release; fresh installs are clean. To clean an existing DB: `sqlite3 ~/.claude-mem/claude-mem.db "DELETE FROM user_prompts WHERE prompt_text LIKE '<task-notification>%';"` (the AFTER-DELETE trigger handles FTS).
+- The 5 triage fixes were authored from a multi-agent review of 38 open issues against the v12.3.0–v12.4.1 cleanup arc.
+
 ## [12.4.1] - 2026-04-25

 ## perf(chroma): Cache backfill watermarks to skip per-restart Chroma scans
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "12.4.1",
+  "version": "12.4.3",
  "description": "Memory compression system for Claude Code - persist context across sessions",
  "keywords": [
    "claude",
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "12.4.1",
+  "version": "12.4.3",
  "description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
  "author": {
    "name": "Alex Newman"
@@ -7,6 +7,7 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
            "command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
            "timeout": 300
          }
@@ -19,16 +20,19 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
            "timeout": 300
          },
          {
            "type": "command",
+            "shell": "bash",
 "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; echo '{\"continue\":true,\"suppressOutput\":true}'",
            "timeout": 60
          },
          {
            "type": "command",
+            "shell": "bash",
 "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context",
            "timeout": 60
          }
@@ -40,6 +44,7 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
            "timeout": 60
          }
@@ -52,6 +57,7 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code observation",
            "timeout": 120
          }
@@ -64,8 +70,9 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code file-context",
-            "timeout": 2000
+            "timeout": 60
          }
        ]
      }
@@ -75,6 +82,7 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code summarize",
            "timeout": 120
          }
@@ -86,6 +94,7 @@
        "hooks": [
          {
            "type": "command",
+            "shell": "bash",
 "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-complete",
            "timeout": 30
          }
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem-plugin",
-  "version": "12.4.1",
+  "version": "12.4.3",
  "private": true,
  "description": "Runtime dependencies for claude-mem bundled hooks",
  "type": "module",
@@ -175,8 +175,8 @@ async function buildHooks() {
      banner: {
        js: [
          '#!/usr/bin/env bun',
-          'var __filename = require("node:url").fileURLToPath(import.meta.url);',
-          'var __dirname = require("node:path").dirname(__filename);'
+          'var __filename = __filename || require("node:path").resolve(process.argv[1] || "");',
+          'var __dirname = __dirname || require("node:path").dirname(__filename);'
        ].join('\n')
      }
    });
@@ -12,6 +12,7 @@ import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
 import { shouldTrackProject } from '../../shared/should-track-project.js';
 import { loadFromFileOnce } from '../../shared/hook-settings.js';
 import { normalizePlatformSource } from '../../shared/platform-source.js';
+import { isInternalProtocolPayload } from '../../utils/tag-stripping.js';

 interface SessionInitResponse {
  sessionDbId: number;
@@ -43,6 +44,15 @@ export const sessionInitHandler: EventHandler = {
      return { continue: true, suppressOutput: true };
    }

+    // Filter on the raw prompt so the check is independent of the
+    // [media prompt] substitution below.
+    if (rawPrompt && isInternalProtocolPayload(rawPrompt)) {
+      logger.debug('HOOK', 'session-init: skipping internal protocol payload', {
+        preview: rawPrompt.slice(0, 80),
+      });
+      return { continue: true, suppressOutput: true };
+    }
+
    // Handle image-only prompts (where text prompt is empty/undefined)
    // Use placeholder so sessions still get created and tracked for memory
    const prompt = (!rawPrompt || !rawPrompt.trim()) ? '[media prompt]' : rawPrompt;
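The hunk above calls `isInternalProtocolPayload`, whose implementation in `tag-stripping.ts` is not shown in this excerpt. The following is a minimal sketch of what the commit message describes: a whole-prompt match on a single `<task-notification>` block, a negative-lookahead body, and a 256KB input bound. The regex, the `Sketch` names, and the choice to return `false` for oversize input are assumptions rather than the project's actual code, and the bound is applied to `string.length` (UTF-16 code units) for simplicity.

```typescript
// Assumed bound: the commit message says 256KB; the unit of measurement is a guess.
const MAX_SCAN_LENGTH = 256 * 1024;

// One <task-notification> block spanning the entire prompt. The body advances one
// character at a time, and the negative lookahead refuses to step over another
// opening or closing task-notification tag, so two blocks can never match as one.
const PROTOCOL_ONLY_SKETCH =
  /^\s*<task-notification(?:\s[^>]*)?>(?:(?!<\/?task-notification\b)[\s\S])*<\/task-notification>\s*$/;

function isInternalProtocolPayloadSketch(prompt: string): boolean {
  // Oversize input is treated as a normal prompt instead of being scanned.
  if (prompt.length > MAX_SCAN_LENGTH) return false;
  return PROTOCOL_ONLY_SKETCH.test(prompt);
}

// A lone block is filtered; a prompt with user text between two blocks is kept.
const lone = "<task-notification>done</task-notification>";
const mixed = "<task-notification>x</task-notification> hi <task-notification>y</task-notification>";
console.log(isInternalProtocolPayloadSketch(lone), isInternalProtocolPayloadSketch(mixed));
```

With this shape, adjacent blocks with no text between them also fail to match and fall through to storage, which is the per-block behaviour the commit message documents.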
@@ -0,0 +1,276 @@
+/**
+ * One-time v12.4.3 pollution cleanup.
+ *
+ * Removes accumulated junk that v12.4.0/v12.4.2 fixes prevent from ever recurring:
+ *   1. observer-sessions: rows that polluted user-facing search/timeline before
+ *      the observer-sessions filter shipped. Cascades to user_prompts, observations,
+ *      and session_summaries via existing FK ON DELETE CASCADE.
+ *   2. Stuck pending_messages: poisoned chains where ≥10 rows for a single
+ *      session_db_id are stuck in 'failed' or 'processing'. Threshold spares
+ *      legitimate transient failures while clearing the cascade-failure cases
+ *      from the pre-v12.4.2 context-overflow loop.
+ *
+ * After SQLite is cleaned, ~/.claude-mem/chroma/ and ~/.claude-mem/chroma-sync-state.json
+ * are removed so backfillAllProjects rebuilds the vector store from the cleaned SQLite.
+ *
+ * Marker-file gated. Idempotent. Opt-out via CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1.
+ *
+ * Mirrors the runOneTimeChromaMigration / runOneTimeCwdRemap pattern in
+ * ProcessManager.ts. Must run AFTER dbManager.initialize() (so migrations have
+ * applied) and BEFORE ChromaSync.backfillAllProjects (so backfill sees the
+ * cleaned state).
+ */
+
+import path from 'path';
+import { existsSync, writeFileSync, mkdirSync, rmSync, statSync, copyFileSync, statfsSync } from 'fs';
+import { Database } from 'bun:sqlite';
+import { DATA_DIR, OBSERVER_SESSIONS_PROJECT } from '../../shared/paths.js';
+import { logger } from '../../utils/logger.js';
+
+const MARKER_FILENAME = '.cleanup-v12.4.3-applied';
+const STUCK_PENDING_THRESHOLD = 10;
+
+interface CleanupCounts {
+  observerSessions: number;
+  observerCascadeRows: number;
+  stuckPendingMessages: number;
+}
+
+interface MarkerPayload {
+  appliedAt: string;
+  backupPath: string | null;
+  chromaWiped: boolean;
+  chromaWipeError?: string;
+  counts: CleanupCounts;
+  skipped?: string;
+}
+
+/**
+ * Run the one-time v12.4.3 cleanup. Safe to call on every worker startup;
+ * the marker file ensures the work runs at most once per data directory.
+ *
+ * @param dataDirectory - Override for DATA_DIR (used in tests)
+ */
+export function runOneTimeV12_4_3Cleanup(dataDirectory?: string): void {
+  const effectiveDataDir = dataDirectory ?? DATA_DIR;
+  const markerPath = path.join(effectiveDataDir, MARKER_FILENAME);
+
+  if (existsSync(markerPath)) {
+    logger.debug('SYSTEM', 'v12.4.3 cleanup marker exists, skipping');
+    return;
+  }
+
+  if (process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 === '1') {
+    logger.warn('SYSTEM', 'v12.4.3 cleanup skipped via CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1; marker not written');
+    return;
+  }
+
+  const dbPath = path.join(effectiveDataDir, 'claude-mem.db');
+  if (!existsSync(dbPath)) {
+    mkdirSync(effectiveDataDir, { recursive: true });
+    writeMarker(markerPath, { appliedAt: new Date().toISOString(), backupPath: null, chromaWiped: false, counts: emptyCounts(), skipped: 'no-db' });
+    logger.debug('SYSTEM', 'No DB present, v12.4.3 cleanup marker written without work', { dbPath });
+    return;
+  }
+
+  logger.warn('SYSTEM', 'Running one-time v12.4.3 pollution cleanup', { dbPath });
+
+  try {
+    executeCleanup(dbPath, effectiveDataDir, markerPath);
+  } catch (err: unknown) {
+    const error = err instanceof Error ? err : new Error(String(err));
+    logger.error('SYSTEM', 'v12.4.3 cleanup failed, marker not written (will retry on next startup)', {}, error);
+  }
+}
+
+function executeCleanup(dbPath: string, effectiveDataDir: string, markerPath: string): void {
+  const dbSize = statSync(dbPath).size;
+  const required = Math.ceil(dbSize * 1.2) + 100 * 1024 * 1024;
+
+  let backupPath: string | null = null;
+  try {
+    const fs = statfsSync(effectiveDataDir);
+    const free = Number(fs.bavail) * Number(fs.bsize);
+    if (free < required) {
+      // Don't write the marker — once the user frees disk space, the next
+      // worker startup should retry the cleanup rather than skipping forever.
+      logger.error('SYSTEM', 'Insufficient disk for v12.4.3 backup; skipping cleanup (will retry on next startup)', { dbSize, free, required });
+      return;
+    }
+  } catch (err: unknown) {
+    const error = err instanceof Error ? err : new Error(String(err));
+    logger.warn('SYSTEM', 'statfsSync failed; proceeding without disk-space pre-flight', {}, error);
+  }
+
+  const effectiveBackupsDir = path.join(effectiveDataDir, 'backups');
+  mkdirSync(effectiveBackupsDir, { recursive: true });
+  const ts = new Date().toISOString().replace(/[:.]/g, '-');
+  backupPath = path.join(effectiveBackupsDir, `claude-mem-pre-12.4.3-${ts}.db`);
+
+  const backupDb = new Database(dbPath, { readonly: true });
+  let vacuumFailed = false;
+  let vacuumError: Error | null = null;
+  try {
+    backupDb.run(`VACUUM INTO '${backupPath.replace(/'/g, "''")}'`);
+    logger.info('SYSTEM', 'v12.4.3 backup created via VACUUM INTO', { backupPath, dbSize });
+  } catch (err: unknown) {
+    vacuumFailed = true;
+    vacuumError = err instanceof Error ? err : new Error(String(err));
+  }
+  // Close before any fallback: on Windows an open SQLite handle holds a
+  // file lock that can prevent copyFileSync from reading the source.
+  backupDb.close();
+
+  if (vacuumFailed) {
+    logger.warn('SYSTEM', 'VACUUM INTO failed, falling back to copyFileSync', {}, vacuumError ?? undefined);
+    try {
+      copyFileSync(dbPath, backupPath);
+      // The DB is in WAL mode; recent committed pages may live in -wal/-shm.
+      // VACUUM INTO captures them automatically; copyFileSync does not, so
+      // mirror them alongside so the backup represents the same state.
+      const walPath = `${dbPath}-wal`;
+      const shmPath = `${dbPath}-shm`;
+      if (existsSync(walPath)) copyFileSync(walPath, `${backupPath}-wal`);
+      if (existsSync(shmPath)) copyFileSync(shmPath, `${backupPath}-shm`);
+      logger.info('SYSTEM', 'v12.4.3 backup created via copyFileSync (incl. -wal/-shm if present)', { backupPath, dbSize });
+    } catch (copyErr: unknown) {
+      const copyError = copyErr instanceof Error ? copyErr : new Error(String(copyErr));
+      logger.error('SYSTEM', 'v12.4.3 backup failed via both VACUUM INTO and copyFileSync; aborting cleanup', {}, copyError);
+      return;
+    }
+  }
+
+  const counts = emptyCounts();
+  const db = new Database(dbPath);
+  // PRAGMA foreign_keys must be set OUTSIDE a transaction to take effect on this connection.
+  db.run('PRAGMA foreign_keys = ON');
+
+  try {
+    runObserverSessionsPurge(db, counts);
+    runStuckPendingPurge(db, counts);
+  } finally {
+    db.close();
+  }
+
+  // SQLite purge succeeded; chroma wipe failure must NOT re-run the migration
+  // on the next startup or we accumulate one new backup per boot. Capture the
+  // failure on the marker instead.
+  let chromaWiped = false;
+  let chromaWipeError: string | undefined;
+  try {
+    chromaWiped = wipeChromaArtifacts(effectiveDataDir);
+  } catch (err: unknown) {
+    const error = err instanceof Error ? err : new Error(String(err));
+    chromaWipeError = error.message;
+    logger.error('SYSTEM', 'v12.4.3: Chroma wipe failed; marker still written so cleanup does not re-run', {}, error);
+  }
+
+  writeMarker(markerPath, {
+    appliedAt: new Date().toISOString(),
+    backupPath,
+    chromaWiped,
+    chromaWipeError,
+    counts,
+  });
+
+  logger.info('SYSTEM', 'v12.4.3 cleanup complete', {
+    backupPath,
+    chromaWiped,
+    ...counts,
+  });
+  logger.info('SYSTEM', `To restore: cp '${backupPath}' '${dbPath}'`);
+}
+
+function runObserverSessionsPurge(db: Database, counts: CleanupCounts): void {
+  db.run('BEGIN IMMEDIATE');
+  try {
+    // Count rows before the delete: bun:sqlite's result.changes inflates with
+    // FTS-trigger and cascade row counts, so it can't stand in for a session
+    // count or a cascade-row count on its own.
+    const sessionCount = (db.prepare(`SELECT COUNT(*) AS n FROM sdk_sessions WHERE project = ?`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n;
+    const cascadeRows =
+      (db.prepare(`SELECT COUNT(*) AS n FROM user_prompts WHERE content_session_id IN (SELECT content_session_id FROM sdk_sessions WHERE project = ?)`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n
+      + (db.prepare(`SELECT COUNT(*) AS n FROM observations WHERE memory_session_id IN (SELECT memory_session_id FROM sdk_sessions WHERE project = ? AND memory_session_id IS NOT NULL)`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n
+      + (db.prepare(`SELECT COUNT(*) AS n FROM session_summaries WHERE memory_session_id IN (SELECT memory_session_id FROM sdk_sessions WHERE project = ? AND memory_session_id IS NOT NULL)`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n;
+
+    db.run(`DELETE FROM sdk_sessions WHERE project = ?`, [OBSERVER_SESSIONS_PROJECT]);
+    counts.observerSessions = sessionCount;
+    counts.observerCascadeRows = cascadeRows;
+
+    db.run('COMMIT');
+    logger.info('SYSTEM', 'v12.4.3: observer-sessions purge committed', {
+      sessions: counts.observerSessions,
+      cascadeRows: counts.observerCascadeRows,
+    });
+  } catch (err: unknown) {
+    // Defensive: SQLite may have already auto-rolled back on certain
+    // constraint failures. Don't let a no-op ROLLBACK shadow the real error.
+    try { db.run('ROLLBACK'); } catch { /* already rolled back */ }
+    throw err;
+  }
+}
+
+function runStuckPendingPurge(db: Database, counts: CleanupCounts): void {
+  db.run('BEGIN IMMEDIATE');
+  try {
+    // Pre-count for consistency with runObserverSessionsPurge: result.changes
+    // would be reliable today (no FTS on pending_messages) but the explicit
+    // count protects against future schema changes.
+    const stuckCount = (db.prepare(
+      `SELECT COUNT(*) AS n FROM pending_messages
+         WHERE status IN ('failed', 'processing')
+           AND session_db_id IN (
+             SELECT session_db_id FROM pending_messages
+              WHERE status IN ('failed', 'processing')
+              GROUP BY session_db_id
+              HAVING COUNT(*) >= ?
+           )`
+    ).get(STUCK_PENDING_THRESHOLD) as { n: number }).n;
+
+    db.run(
+      `DELETE FROM pending_messages
+         WHERE status IN ('failed', 'processing')
+           AND session_db_id IN (
+             SELECT session_db_id FROM pending_messages
+              WHERE status IN ('failed', 'processing')
+              GROUP BY session_db_id
+              HAVING COUNT(*) >= ?
+           )`,
+      [STUCK_PENDING_THRESHOLD]
+    );
+    counts.stuckPendingMessages = stuckCount;
+    db.run('COMMIT');
+    logger.info('SYSTEM', 'v12.4.3: stuck pending_messages purge committed', { rows: counts.stuckPendingMessages });
+  } catch (err: unknown) {
+    // Defensive: SQLite may have already auto-rolled back on certain
+    // constraint failures. Don't let a no-op ROLLBACK shadow the real error.
+    try { db.run('ROLLBACK'); } catch { /* already rolled back */ }
+    throw err;
+  }
+}
+
+function wipeChromaArtifacts(effectiveDataDir: string): boolean {
+  const chromaDir = path.join(effectiveDataDir, 'chroma');
+  const stateFile = path.join(effectiveDataDir, 'chroma-sync-state.json');
+  let wiped = false;
+
+  if (existsSync(chromaDir)) {
+    rmSync(chromaDir, { recursive: true, force: true });
+    logger.info('SYSTEM', 'v12.4.3: chroma directory removed (will rebuild via backfill)', { chromaDir });
+    wiped = true;
+  }
+  if (existsSync(stateFile)) {
+    rmSync(stateFile, { force: true });
+    logger.info('SYSTEM', 'v12.4.3: chroma-sync-state.json removed', { stateFile });
+    wiped = true;
+  }
+  return wiped;
+}
+
+function writeMarker(markerPath: string, payload: MarkerPayload): void {
+  writeFileSync(markerPath, JSON.stringify(payload, null, 2));
+}
+
+function emptyCounts(): CleanupCounts {
+  return { observerSessions: 0, observerCascadeRows: 0, stuckPendingMessages: 0 };
+}
@@ -5,3 +5,4 @@
 export * from './ProcessManager.js';
 export * from './HealthMonitor.js';
 export * from './GracefulShutdown.js';
+export * from './CleanupV12_4_3.js';
@@ -185,7 +185,7 @@ const ANTIGRAVITY_CONFIG: McpInstallerConfig = {
  configPath: path.join(homedir(), '.gemini', 'antigravity', 'mcp_config.json'),
  configKey: 'mcpServers',
  contextFile: {
-    path: path.join(process.cwd(), '.agent', 'rules', 'claude-mem-context.md'),
+    path: path.join(process.cwd(), '.agents', 'rules', 'claude-mem-context.md'),
    isWorkspaceRelative: true,
  },
 };
@@ -52,6 +52,7 @@ import {
  spawnDaemon,
  touchPidFile
 } from './infrastructure/ProcessManager.js';
+import { runOneTimeV12_4_3Cleanup } from './infrastructure/CleanupV12_4_3.js';
 import {
  isPortInUse,
  waitForHealth,
@@ -453,6 +454,10 @@ export class WorkerService implements WorkerRef {
        logger.warn('QUEUE', 'Startup GC for failed pending_messages rows failed', {}, err instanceof Error ? err : undefined);
      }

+      // One-time v12.4.3 pollution cleanup. Runs AFTER migrations have applied
+      // and BEFORE backfillAllProjects so the rebuilt Chroma sees a clean SQLite.
+      runOneTimeV12_4_3Cleanup();
+
      // Initialize search services
      logger.info('WORKER', 'Initializing search services...');
      const formattingService = new FormattingService();
@@ -42,6 +42,12 @@ export class SDKAgent {
    this.sessionManager = sessionManager;
  }

+  private resetSessionForFreshStart(session: ActiveSession): void {
+    this.dbManager.getSessionStore().updateMemorySessionId(session.sessionDbId, null);
+    session.memorySessionId = null;
+    session.forceInit = true;
+  }
+
  /**
   * Start SDK agent for a session (event-driven, no polling)
   * @param worker WorkerService reference for spinner control (optional)
@@ -151,7 +157,8 @@ export class SDKAgent {
        // Custom spawn factory: spawns the SDK child in its own POSIX process
        // group so the worker can tear down the whole subtree on shutdown.
        spawnClaudeCodeProcess: createSdkSpawnFactory(session.sessionDbId),
-        env: isolatedEnv  // Use isolated credentials from ~/.claude-mem/.env, not process.env
+        env: isolatedEnv,  // Use isolated credentials from ~/.claude-mem/.env, not process.env
+        mcpServers: {},
      }
    });

@@ -208,7 +215,8 @@ export class SDKAgent {
          // Check for context overflow - prevents infinite retry loops
          if (textContent.includes('prompt is too long') ||
              textContent.includes('context window')) {
-            logger.error('SDK', 'Context overflow detected - terminating session');
+            logger.error('SDK', 'Context overflow detected - terminating session and forcing fresh start');
+            this.resetSessionForFreshStart(session);
            session.abortController.abort();
            return;
          }
@@ -259,6 +267,12 @@ export class SDKAgent {

          // Detect fatal context overflow and terminate gracefully (issue #870)
          if (typeof textContent === 'string' && textContent.includes('Prompt is too long')) {
+            // Resuming this SDK session would overflow again on every retry. Force a fresh
+            // session on the next spawn so crash-recovery can drain the remaining pending messages.
+            this.resetSessionForFreshStart(session);
+            logger.error('SDK', 'Context overflow — cleared memorySessionId so next spawn starts fresh', {
+              sessionDbId: session.sessionDbId
+            });
            throw new Error('Claude session context overflow: prompt is too long');
          }

@@ -11,7 +11,7 @@ import { ingestObservation } from '../shared.js';
 import { validateBody } from '../middleware/validateBody.js';
 import { getWorkerPort } from '../../../../shared/worker-utils.js';
 import { logger } from '../../../../utils/logger.js';
-import { stripMemoryTagsFromJson, stripMemoryTagsFromPrompt } from '../../../../utils/tag-stripping.js';
+import { stripMemoryTagsFromJson, stripMemoryTagsFromPrompt, isInternalProtocolPayload } from '../../../../utils/tag-stripping.js';
 import { SessionManager } from '../../SessionManager.js';
 import { DatabaseManager } from '../../DatabaseManager.js';
 import { SDKAgent } from '../../SDKAgent.js';
@@ -857,10 +857,20 @@ export class SessionRoutes extends BaseRouteHandler {
    // Only contentSessionId is truly required — Cursor and other platforms
    // may omit prompt/project in their payload (#838, #1049)
    const project = req.body.project || 'unknown';
-    let prompt = req.body.prompt || '[media prompt]';
+    const rawPrompt = typeof req.body.prompt === 'string' ? req.body.prompt : undefined;
    const platformSource = normalizePlatformSource(req.body.platformSource);
    const customTitle = req.body.customTitle || undefined;

+    // Filter on the raw prompt before truncation / [media prompt] substitution
+    // so the check is independent of those transforms.
+    if (rawPrompt && isInternalProtocolPayload(rawPrompt)) {
+      logger.debug('HTTP', 'session-init: skipping internal protocol payload before session creation', { contentSessionId });
+      res.json({ skipped: true, reason: 'internal_protocol' });
+      return;
+    }
+
+    let prompt = rawPrompt || '[media prompt]';
+
    const promptByteLength = Buffer.byteLength(prompt, 'utf8');
    if (promptByteLength > MAX_USER_PROMPT_BYTES) {
      logger.warn('HTTP', 'SessionRoutes: oversized prompt truncated at session-init boundary', {
@@ -79,7 +79,8 @@ export class KnowledgeAgent {
        cwd: OBSERVER_SESSIONS_DIR,
        disallowedTools: KNOWLEDGE_AGENT_DISALLOWED_TOOLS,
        pathToClaudeCodeExecutable: claudePath,
-        env: isolatedEnv
+        env: isolatedEnv,
+        mcpServers: {},
      }
    });

@@ -195,7 +196,8 @@ export class KnowledgeAgent {
        cwd: OBSERVER_SESSIONS_DIR,
        disallowedTools: KNOWLEDGE_AGENT_DISALLOWED_TOOLS,
        pathToClaudeCodeExecutable: claudePath,
-        env: isolatedEnv
+        env: isolatedEnv,
+        mcpServers: {},
      }
    });

@@ -104,3 +104,38 @@ export function stripMemoryTagsFromJson(content: string): string {
 export function stripMemoryTagsFromPrompt(content: string): string {
  return stripTags(content).stripped;
 }
+
+/**
+ * Tag names that Claude Code emits autonomously into the prompt stream as
+ * protocol notifications — never authored by the user. When the entire prompt
+ * payload is one of these blocks (with no surrounding user text), the hook
+ * MUST skip storage to keep `user_prompts` clean.
+ *
+ * Conservative deny-list: do NOT add `<command-name>` / `<command-message>`
+ * here — those wrap genuine user slash-command invocations.
+ */
+const PROTOCOL_ONLY_TAGS = ['task-notification'] as const;
+
+// Negative lookahead in the body keeps a payload like
+// "<task-notification>x</task-notification> hi <task-notification>y</task-notification>"
+// from matching as a single outer block (greedy [\s\S]* would otherwise span
+// the middle user text and silently drop a real prompt).
+const PROTOCOL_ONLY_REGEX = new RegExp(
+  `^\\s*<(${PROTOCOL_ONLY_TAGS.join('|')})\\b[^>]*>(?:(?!<\\1\\b|</\\1\\b)[\\s\\S])*</\\1>\\s*$`,
+);
+
+// Caps the input size so a malformed 1MB+ payload that opens a protocol tag
+// and never closes it cannot force the tempered-lookahead body to scan the
+// whole prompt before failing. Compared against text.length (UTF-16 units).
+const MAX_PROTOCOL_PAYLOAD_BYTES = 256 * 1024;
+
+/**
+ * Returns true when `text` is *entirely* a Claude Code protocol payload
+ * (e.g. a `<task-notification>` block emitted on background Agent completion)
+ * with no surrounding user-authored content.
+ */
+export function isInternalProtocolPayload(text: string): boolean {
+  if (!text) return false;
+  if (text.length > MAX_PROTOCOL_PAYLOAD_BYTES) return false;
+  return PROTOCOL_ONLY_REGEX.test(text);
+}
@@ -0,0 +1,196 @@
+/**
+ * Happy-path tests for runOneTimeV12_4_3Cleanup.
+ *
+ * Uses a real on-disk SQLite under a tmpdir so VACUUM INTO, statSync,
+ * statfsSync, and marker-file writes all exercise their real code paths.
+ */
+
+import { describe, it, expect, beforeEach, afterEach, spyOn } from 'bun:test';
+import { mkdtempSync, rmSync, existsSync, writeFileSync, mkdirSync, readFileSync, readdirSync } from 'fs';
+import path from 'path';
+import { tmpdir } from 'os';
+import { Database } from 'bun:sqlite';
+import { runOneTimeV12_4_3Cleanup } from '../../src/services/infrastructure/CleanupV12_4_3.js';
+import { ClaudeMemDatabase } from '../../src/services/sqlite/Database.js';
+import { OBSERVER_SESSIONS_PROJECT } from '../../src/shared/paths.js';
+import { logger } from '../../src/utils/logger.js';
+
+let loggerSpies: ReturnType<typeof spyOn>[] = [];
+
+function silenceLogger(): void {
+  loggerSpies = [
+    spyOn(logger, 'info').mockImplementation(() => {}),
+    spyOn(logger, 'debug').mockImplementation(() => {}),
+    spyOn(logger, 'warn').mockImplementation(() => {}),
+    spyOn(logger, 'error').mockImplementation(() => {}),
+  ];
+}
+
+function restoreLogger(): void {
+  loggerSpies.forEach(s => s.mockRestore());
+  loggerSpies = [];
+}
+
+function seedDatabase(dbPath: string, opts: { observerSessions: number; stuckCount: number }): { observerSessionDbIds: number[]; keepSessionDbId: number } {
+  const seed = new ClaudeMemDatabase(dbPath);
+  const db = seed.db;
+  const now = new Date().toISOString();
+  const epoch = Date.now();
+
+  const insertSession = db.prepare(
+    `INSERT INTO sdk_sessions (content_session_id, memory_session_id, project, started_at, started_at_epoch)
+     VALUES (?, ?, ?, ?, ?)`
+  );
+  const insertPrompt = db.prepare(
+    `INSERT INTO user_prompts (content_session_id, prompt_number, prompt_text, created_at, created_at_epoch)
+     VALUES (?, 1, ?, ?, ?)`
+  );
+  const insertObservation = db.prepare(
+    `INSERT INTO observations (memory_session_id, project, type, text, created_at, created_at_epoch)
+     VALUES (?, ?, 'discovery', ?, ?, ?)`
+  );
+
+  const observerSessionDbIds: number[] = [];
+  for (let i = 0; i < opts.observerSessions; i++) {
+    const result = insertSession.run(`obs-content-${i}`, `obs-memory-${i}`, OBSERVER_SESSIONS_PROJECT, now, epoch);
+    observerSessionDbIds.push(Number(result.lastInsertRowid));
+    insertPrompt.run(`obs-content-${i}`, `prompt ${i}`, now, epoch);
+    insertObservation.run(`obs-memory-${i}`, OBSERVER_SESSIONS_PROJECT, `obs ${i}`, now, epoch);
+  }
+
+  // Real session that should survive
+  const keepResult = insertSession.run('keep-content', 'keep-memory', 'real-project', now, epoch);
+  const keepSessionDbId = Number(keepResult.lastInsertRowid);
+  insertPrompt.run('keep-content', 'survives', now, epoch);
+
+  // Stuck pending_messages tied to the surviving session (so FK passes).
+  const insertPending = db.prepare(
+    `INSERT INTO pending_messages (session_db_id, content_session_id, message_type, status, created_at_epoch)
+     VALUES (?, 'keep-content', 'observation', 'failed', ?)`
+  );
+  for (let i = 0; i < opts.stuckCount; i++) {
+    insertPending.run(keepSessionDbId, epoch);
+  }
+
+  seed.close();
+  return { observerSessionDbIds, keepSessionDbId };
+}
+
+describe('runOneTimeV12_4_3Cleanup', () => {
+  let tmpDataDir: string;
+
+  beforeEach(() => {
+    tmpDataDir = mkdtempSync(path.join(tmpdir(), 'cleanup-v12_4_3-'));
+    silenceLogger();
+  });
+
+  afterEach(() => {
+    restoreLogger();
+    rmSync(tmpDataDir, { recursive: true, force: true });
+  });
+
+  it('writes a no-db marker when the DB is missing', () => {
+    runOneTimeV12_4_3Cleanup(tmpDataDir);
+
+    const markerPath = path.join(tmpDataDir, '.cleanup-v12.4.3-applied');
+    expect(existsSync(markerPath)).toBe(true);
+
+    const payload = JSON.parse(readFileSync(markerPath, 'utf8'));
+    expect(payload.skipped).toBe('no-db');
+    expect(payload.backupPath).toBeNull();
+    expect(payload.counts).toEqual({ observerSessions: 0, observerCascadeRows: 0, stuckPendingMessages: 0 });
+  });
+
+  it('purges observer-sessions and stuck pending_messages, writes marker, wipes chroma', () => {
+    const dbPath = path.join(tmpDataDir, 'claude-mem.db');
+    seedDatabase(dbPath, { observerSessions: 3, stuckCount: 12 });
+
+    // chroma artifacts that should be wiped
+    mkdirSync(path.join(tmpDataDir, 'chroma'), { recursive: true });
+    writeFileSync(path.join(tmpDataDir, 'chroma', 'collection.bin'), 'opaque');
+    writeFileSync(path.join(tmpDataDir, 'chroma-sync-state.json'), '{}');
+
+    runOneTimeV12_4_3Cleanup(tmpDataDir);
+
+    const markerPath = path.join(tmpDataDir, '.cleanup-v12.4.3-applied');
+    expect(existsSync(markerPath)).toBe(true);
+    const payload = JSON.parse(readFileSync(markerPath, 'utf8'));
+
+    expect(payload.counts.observerSessions).toBe(3);
+    expect(payload.counts.observerCascadeRows).toBe(6); // 3 user_prompts + 3 observations
+    expect(payload.counts.stuckPendingMessages).toBe(12);
+    expect(payload.chromaWiped).toBe(true);
+    expect(payload.chromaWipeError).toBeUndefined();
+    expect(payload.backupPath).toBeTruthy();
+
+    // Backup file exists on disk
+    expect(existsSync(payload.backupPath)).toBe(true);
+
+    // Chroma artifacts gone
+    expect(existsSync(path.join(tmpDataDir, 'chroma'))).toBe(false);
+    expect(existsSync(path.join(tmpDataDir, 'chroma-sync-state.json'))).toBe(false);
+
+    // Real session still present; observer rows gone
+    const verify = new Database(dbPath, { readonly: true });
+    const observerCount = (verify.prepare('SELECT COUNT(*) AS n FROM sdk_sessions WHERE project = ?').get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n;
+    const realCount = (verify.prepare(`SELECT COUNT(*) AS n FROM sdk_sessions WHERE project = 'real-project'`).get() as { n: number }).n;
+    const survivingPrompts = (verify.prepare('SELECT COUNT(*) AS n FROM user_prompts').get() as { n: number }).n;
+    const survivingPending = (verify.prepare('SELECT COUNT(*) AS n FROM pending_messages').get() as { n: number }).n;
+    verify.close();
+
+    expect(observerCount).toBe(0);
+    expect(realCount).toBe(1);
+    expect(survivingPrompts).toBe(1); // only the keep-content prompt
+    expect(survivingPending).toBe(0);
+  });
+
+  it('preserves pending_messages when stuck count is below the threshold of 10', () => {
+    const dbPath = path.join(tmpDataDir, 'claude-mem.db');
+    seedDatabase(dbPath, { observerSessions: 0, stuckCount: 9 });
+
+    runOneTimeV12_4_3Cleanup(tmpDataDir);
+
+    const markerPath = path.join(tmpDataDir, '.cleanup-v12.4.3-applied');
+    const payload = JSON.parse(readFileSync(markerPath, 'utf8'));
+    expect(payload.counts.stuckPendingMessages).toBe(0);
+
+    const verify = new Database(dbPath, { readonly: true });
+    const survivingPending = (verify.prepare('SELECT COUNT(*) AS n FROM pending_messages').get() as { n: number }).n;
+    verify.close();
+    expect(survivingPending).toBe(9);
+  });
+
+  it('is idempotent: a second invocation does no work and does not create a second backup', () => {
+    const dbPath = path.join(tmpDataDir, 'claude-mem.db');
+    seedDatabase(dbPath, { observerSessions: 1, stuckCount: 10 });
+
+    runOneTimeV12_4_3Cleanup(tmpDataDir);
+    const backupsAfterFirst = readdirSync(path.join(tmpDataDir, 'backups'));
+    expect(backupsAfterFirst.length).toBe(1);
+
+    runOneTimeV12_4_3Cleanup(tmpDataDir);
+    const backupsAfterSecond = readdirSync(path.join(tmpDataDir, 'backups'));
+    expect(backupsAfterSecond).toEqual(backupsAfterFirst);
+  });
+
+  it('honors CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1 by exiting without writing the marker', () => {
+    const dbPath = path.join(tmpDataDir, 'claude-mem.db');
+    seedDatabase(dbPath, { observerSessions: 1, stuckCount: 10 });
+
+    const original = process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3;
+    process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 = '1';
+    try {
+      runOneTimeV12_4_3Cleanup(tmpDataDir);
+    } finally {
+      if (original === undefined) delete process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3;
+      else process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 = original;
+    }
+
+    expect(existsSync(path.join(tmpDataDir, '.cleanup-v12.4.3-applied'))).toBe(false);
+
+    const verify = new Database(dbPath, { readonly: true });
+    const observerCount = (verify.prepare('SELECT COUNT(*) AS n FROM sdk_sessions WHERE project = ?').get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n;
+    verify.close();
+    expect(observerCount).toBe(1); // untouched
+  });
+});
@@ -10,7 +10,7 @@
 */

 import { describe, it, expect, beforeEach, afterEach, spyOn, mock } from 'bun:test';
-import { stripMemoryTagsFromPrompt, stripMemoryTagsFromJson } from '../../src/utils/tag-stripping.js';
+import { stripMemoryTagsFromPrompt, stripMemoryTagsFromJson, isInternalProtocolPayload } from '../../src/utils/tag-stripping.js';
 import { logger } from '../../src/utils/logger.js';

 // Suppress logger output during tests
@@ -410,4 +410,60 @@ after`;
      expect(cleanedPrompt.trim()).toBe('Please help me with my code');
    });
  });
+
+  describe('isInternalProtocolPayload', () => {
+    it('returns false for empty input', () => {
+      expect(isInternalProtocolPayload('')).toBe(false);
+    });
+
+    it('returns true for a bare task-notification block', () => {
+      expect(isInternalProtocolPayload('<task-notification>agent done</task-notification>')).toBe(true);
+    });
+
+    it('returns true for an empty-body task-notification block', () => {
+      expect(isInternalProtocolPayload('<task-notification></task-notification>')).toBe(true);
+    });
+
+    it('returns true with surrounding whitespace', () => {
+      expect(isInternalProtocolPayload('\n  <task-notification>x</task-notification>\n')).toBe(true);
+    });
+
+    it('returns true for multi-line payload', () => {
+      const payload = '<task-notification>\nline1\nline2\n</task-notification>';
+      expect(isInternalProtocolPayload(payload)).toBe(true);
+    });
+
+    it('returns true when tag has attributes', () => {
+      expect(isInternalProtocolPayload('<task-notification data-id="42">x</task-notification>')).toBe(true);
+    });
+
+    it('returns false for partial / unclosed tag', () => {
+      expect(isInternalProtocolPayload('<task-notification>oops')).toBe(false);
+    });
+
+    it('returns false when surrounded by user text', () => {
+      const text = 'hi <task-notification>x</task-notification> more';
+      expect(isInternalProtocolPayload(text)).toBe(false);
+    });
+
+    it('returns false for unrelated tags', () => {
+      expect(isInternalProtocolPayload('<private>secret</private>')).toBe(false);
+      expect(isInternalProtocolPayload('<system-reminder>hi</system-reminder>')).toBe(false);
+    });
+
+    it('returns false for over-large input', () => {
+      const huge = '<task-notification>' + 'a'.repeat(300 * 1024);
+      expect(isInternalProtocolPayload(huge)).toBe(false);
+    });
+
+    it('returns false for two protocol blocks separated by user text', () => {
+      const text = '<task-notification>a</task-notification> hello <task-notification>b</task-notification>';
+      expect(isInternalProtocolPayload(text)).toBe(false);
+    });
+
+    it('returns false for two adjacent protocol blocks (deliberate: deny-list per single block, not concatenations)', () => {
+      const text = '<task-notification>a</task-notification><task-notification>b</task-notification>';
+      expect(isInternalProtocolPayload(text)).toBe(false);
+    });
+  });
 });
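
The edge cases covered by the `isInternalProtocolPayload` tests above can be tried outside the repository with a minimal standalone sketch. The regex and the 256 * 1024 cap are copied from the tag-stripping hunk. The constant name `MAX_PROTOCOL_PAYLOAD_LENGTH` and the `shouldSkipStorage` helper are hypothetical, introduced only for illustration; the helper loosely mirrors the `rawPrompt` guard in SessionRoutes and is not part of the PR.

```typescript
// Standalone copy of the predicate for experimentation; not the module itself.
const PROTOCOL_ONLY_TAGS = ['task-notification'] as const;

// Tempered body: each consumed character must not begin another opening or
// closing protocol tag, so the match cannot span two separate blocks.
const PROTOCOL_ONLY_REGEX = new RegExp(
  `^\\s*<(${PROTOCOL_ONLY_TAGS.join('|')})\\b[^>]*>(?:(?!<\\1\\b|</\\1\\b)[\\s\\S])*</\\1>\\s*$`,
);

// Hypothetical name: the cap is compared against string length, not bytes.
const MAX_PROTOCOL_PAYLOAD_LENGTH = 256 * 1024;

function isInternalProtocolPayload(text: string): boolean {
  if (!text) return false;
  if (text.length > MAX_PROTOCOL_PAYLOAD_LENGTH) return false;
  return PROTOCOL_ONLY_REGEX.test(text);
}

// Hypothetical helper: decide whether a request's prompt should be skipped
// before any truncation or placeholder substitution is applied.
function shouldSkipStorage(prompt: unknown): boolean {
  return typeof prompt === 'string' && isInternalProtocolPayload(prompt);
}

const samples: string[] = [
  '<task-notification>agent done</task-notification>',
  'hi <task-notification>x</task-notification> more',
  '<task-notification>a</task-notification><task-notification>b</task-notification>',
  '<task-notification>oops',
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), shouldSkipStorage(sample));
}
```

Only the first sample is treated as a pure protocol payload. Prompts with surrounding user text, adjacent blocks, or an unclosed tag fall through to normal storage, which matches the conservative deny-list intent described in the source comments.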