Cynical deletion: close 27 issues by removing defenders + tolerators (#2141)

* fix: mirror migration 28 in SessionStore so pending_messages.tool_use_id and worker_pid columns are created (#2139)

SessionStore's inline migration list jumped from v27 to v29, skipping
rebuildPendingMessagesForSelfHealingClaim. The worker uses SessionStore
directly via worker/DatabaseManager.ts and bypasses the canonical
MigrationRunner, so fresh installs ended up at "max v29" with neither
column present — every queue claim and observation insert failed.

Adds addPendingMessagesToolUseIdAndWorkerPidColumns following the existing
mirror precedent (addObservationSubagentColumns / addObservationsUniqueContentHashIndex).
Uses ALTER TABLE + column-existence guards so already-broken DBs at v29
self-heal on next worker boot.
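The column-existence guard can be sketched as pure logic (hypothetical helper name; the real migration feeds `PRAGMA table_info(pending_messages)` rows in and runs the resulting statements against SQLite):

```typescript
interface ColumnInfo { name: string }

// Given the existing columns, emit only the ALTER TABLE statements for
// columns still missing — so re-running on an already-patched v29 DB is
// a no-op, and a broken v29-without-v28 DB self-heals.
function missingColumnAlters(
  existing: ColumnInfo[],
  wanted: Record<string, string>, // column name -> SQL type
): string[] {
  const have = new Set(existing.map((c) => c.name));
  return Object.entries(wanted)
    .filter(([name]) => !have.has(name))
    .map(([name, type]) => `ALTER TABLE pending_messages ADD COLUMN ${name} ${type}`);
}
```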

Verified on fresh DB and on a synthetic v29-without-v28 broken DB:
both columns and indexes (idx_pending_messages_worker_pid,
ux_pending_session_tool) appear after one boot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: wrap v28 mirror dedup+index creation in transaction

Addresses Greptile P2 review on PR #2140: matches the existing pattern in
addObservationsUniqueContentHashIndex (v29 mirror at SessionStore.ts:1127)
and runner.ts rebuildPendingMessagesForSelfHealingClaim. A crash between
the dedup DELETE and the schema_versions INSERT no longer leaves the DB
in a half-applied state.
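The atomicity requirement can be illustrated with a toy publish-on-success wrapper (illustrative only — the real code wraps the steps in SQLite's BEGIN/COMMIT; the `Db` shape here is invented):

```typescript
type Db = { rows: string[]; version: number };

// Apply every step against a copy and publish the copy only if all steps
// succeed. A throw mid-way (the "crash" between dedup and version stamp)
// leaves the caller's state untouched — never "deduped but unstamped".
function applyAtomically(db: Db, steps: Array<(d: Db) => void>): Db {
  const copy: Db = structuredClone(db);
  for (const step of steps) step(copy);
  return copy;
}
```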

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): cynical-deletion plan for 29 open issues

9-phase plan applying delete-first lens to triaged issue corpus.
Headlines: kill defenders (orphan cleanup, EncodedCommand spawn,
restart-port-steal) and tolerators (silent JSON drops, drifted SSE
filters). Each phase closes a named subset of issues.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: delete process-management theater (Phase 1: DEL-1 + DEL-2)

Delete aggressiveStartupCleanup, the PowerShell -EncodedCommand
spawn branch, and the restart-with-port-steal sequence. Replace
daemon spawning with a single uniform child_process.spawn path
using arg-array form, keeping setsid on Unix when available.

The defenders (orphan cleanup, duplicate-worker probes, port
stealing) bred more bugs than they fixed. PID file with start-time
token already provides correct OS-trust ownership; restart now
requests httpShutdown, waits 5s for the port to free, then exits 1
if it didn't (user resolves). Net -247 lines.

Closes #2090, #2095 (already fixed at session-init.ts:78), #2107,
#2111, #2114, #2117, #2123, #2097, #2135.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: observer-sessions trust boundary via CLAUDE_MEM_INTERNAL env (Phase 2: DEL-9)

Replace the cwd === OBSERVER_SESSIONS_DIR discriminator (which every
consumer must repeat and inevitably drifts) with a single env-var
trust boundary set once at spawn time in buildIsolatedEnv.

- buildIsolatedEnv now sets CLAUDE_MEM_INTERNAL=1, covering all three
  spawn sites (SDKAgent, KnowledgeAgent.prime, KnowledgeAgent.executeQuery)
- shouldTrackProject checks the env var first (cwd check stays as
  belt-and-braces fallback)
- New shared shouldEmitProjectRow predicate — SSE broadcaster and
  pagination filter share the same predicate so they can never drift
  apart (#2118)
- ObservationBroadcaster filters observer rows from SSE stream
- PaginationHelper hardcoded 'observer-sessions' replaced with
  OBSERVER_SESSIONS_PROJECT const
- project-filter basename match pass — *observer-sessions* now matches
  basename, not just full path (globToRegex's [^/]* can't cross /)
  (#2126 item 1)
- New `claude-mem cleanup [--dry-run]` subcommand wires CleanupV12_4_3
  through to the worker for #2126 item 5
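The two predicates above can be sketched roughly as follows (signatures assumed from the commit message, not the actual source):

```typescript
const OBSERVER_SESSIONS_PROJECT = 'observer-sessions'; // assumed const name

// One predicate shared by the SSE broadcaster and the pagination filter,
// so the two consumers can never drift apart.
function shouldEmitProjectRow(project: string): boolean {
  return project !== OBSERVER_SESSIONS_PROJECT;
}

// Env-var trust boundary first; the cwd check stays as a belt-and-braces
// fallback for anything spawned without buildIsolatedEnv.
function shouldTrackProject(
  cwd: string,
  observerDir: string,
  env: Record<string, string | undefined>,
): boolean {
  if (env.CLAUDE_MEM_INTERNAL === '1') return false;
  return cwd !== observerDir && !cwd.startsWith(observerDir + '/');
}
```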

Closes #2118, #2126.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: strip proxy env vars before spawning worker (Phase 4: CON-1)

User's HTTP_PROXY/HTTPS_PROXY config was bleeding into internal AI
calls when claude-mem spawns the claude subprocess, causing
connection failures. Strip unconditionally — no passthrough knob,
which rejects #2099's whitelist proposal.
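A minimal sketch of the unconditional strip (helper name hypothetical; the real change lives in the env sanitizer):

```typescript
// Proxy vars to drop before spawning the claude subprocess. Matching on
// the uppercased key also removes lowercase variants (https_proxy etc.).
const PROXY_VARS = new Set(['HTTP_PROXY', 'HTTPS_PROXY', 'ALL_PROXY', 'NO_PROXY']);

function stripProxyVars(
  env: Record<string, string | undefined>,
): Record<string, string | undefined> {
  const out: Record<string, string | undefined> = {};
  for (const [key, value] of Object.entries(env)) {
    if (PROXY_VARS.has(key.toUpperCase())) continue;
    out[key] = value;
  }
  return out;
}
```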

Closes #2115, #2099.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: fail-fast on silent drops in stdin/file-context/memory-save (Phase 5: FF-1)

Three independent fail-fast fixes:

#2089 — stdin-reader silent drop. Non-empty stdin that fails JSON.parse
now rejects with a clear error instead of resolving undefined. Empty
stdin still resolves undefined.
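The new contract, as a sketch (function name hypothetical; the real reader is promise-based):

```typescript
// Empty stdin is a legitimate "no input" and resolves undefined.
// Non-empty stdin that fails JSON.parse is an error — fail loud
// instead of silently dropping the payload.
function parseHookStdin(raw: string): unknown {
  if (raw.trim() === '') return undefined;
  try {
    return JSON.parse(raw);
  } catch (err) {
    throw new Error(`stdin was non-empty but not valid JSON: ${(err as Error).message}`);
  }
}
```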

#2094 — PreToolUse:Read truncation Edit deadlock. file-context handler
no longer returns a fake truncated Read result via updatedInput.
Removes userOffset/userLimit/truncated machinery; injects the timeline
via additionalContext only and lets the real Read pass through. Read
state and Claude's expectation now stay consistent, eliminating the
infinite Edit retry loop.

#2116 — /api/memory/save metadata drop + project bug. Schema accepts
metadata as a documented JSON column (migration 30 adds
observations.metadata TEXT, mirrored in SessionStore). Schema also
tightened to .strict() so unknown top-level fields fail fast instead
of being silently dropped. Project resolution now consults
metadata.project as a fallback before defaultProject.

Closes #2089, #2094, #2116.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: small deletions — Zod externalize / Gemini fallback / session timeout / installCLI alias (Phase 6)

DEL-4 (#2113): Externalize zod from mcp-server.cjs and context-generator.cjs
hook bundles so OpenCode's runtime resolves a single Zod copy. Worker
keeps Zod bundled (it's a daemon subprocess, not in OpenCode's hook
bundle). Added zod to plugin/package.json so externalized requires
resolve at runtime.

DEL-5 (#2087): Delete the never-wired GeminiAgent → Claude fallback.
fallbackAgent was always null in production. On 429 the agent now
throws cleanly (message stays pending for retry). Removed
setFallbackAgent, FallbackAgent interface, and the 429 fallback
branch from both GeminiAgent and OpenRouterAgent. Updated docs
that claimed automatic Claude fallback.

DEL-6 (#2127, #2098): Raise MAX_SESSION_WALL_CLOCK_MS from 4h to
24h. The timeout is a real guard against runaway-cost loops (per
issue #1590), but 4h kills legitimate long Claude Code days. 24h
preserves the guard while never hitting in normal use. No knob —
a session approaching this age is a bug worth investigating, not
a value worth tuning.

DEL-8 (#2054): Delete installCLI() alias function. Saves 4 keystrokes
at the cost of cross-platform shell-config mutation surface — not
worth it. Canonical entry is npx claude-mem (and bunx). Uninstall
now strips legacy alias/function lines from ~/.bashrc, ~/.zshrc,
and the PowerShell profile.

Closes #2087, #2098, #2113, #2127, #2054.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: de-hardcode worker port + multi-account commit (Phase 3: CON-2 + DEL-7)

Replace hardcoded 37777 fallbacks with SettingsDefaultsManager.get(
'CLAUDE_MEM_WORKER_PORT') in npx-cli (runtime/install/uninstall),
opencode-plugin, OpenClaw installer, SearchRoutes example URLs.
Timeline-report SKILL.md now resolves WORKER_PORT from settings.json
at the top and uses ${WORKER_PORT} in all curl invocations.
Remaining 37777 literals are doc comments + viewer build-time form-
field placeholder (which is replaced by /api/settings on mount).

hooks.json: add cygpath POSIX→Windows path translation between _R
resolution and node invocation. No-op on macOS/Linux. Closes the
Windows + Git Bash MODULE_NOT_FOUND in #2109.
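The translation step is roughly this (helper name invented; `cygpath` exists under Git Bash and Cygwin, so its presence doubles as the platform check):

```shell
# Translate a POSIX path to a Windows path before handing it to node.exe.
# No-op on macOS/Linux, where cygpath is absent.
resolve_for_node() {
  if command -v cygpath >/dev/null 2>&1; then
    cygpath -w "$1"      # /c/Users/... -> C:\Users\...
  else
    printf '%s\n' "$1"   # pass through unchanged
  fi
}
```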

CLAUDE.md gains a Multi-account section documenting CLAUDE_MEM_DATA_DIR
+ optional CLAUDE_MEM_WORKER_PORT — every existing path/port code
path now honors them.
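The skill's port resolution can be sketched like this (assuming `jq` is available, which the real snippet may not require; the legacy default 37777 is the last-resort fallback):

```shell
# Resolve the worker port: settings.json first, then the env var,
# then the legacy default. All paths derive from CLAUDE_MEM_DATA_DIR.
DATA_DIR="${CLAUDE_MEM_DATA_DIR:-$HOME/.claude-mem}"
PORT_FROM_SETTINGS="$(jq -r '.CLAUDE_MEM_WORKER_PORT // empty' "$DATA_DIR/settings.json" 2>/dev/null)"
WORKER_PORT="${PORT_FROM_SETTINGS:-${CLAUDE_MEM_WORKER_PORT:-37777}}"
```

Every subsequent curl invocation then targets `http://localhost:${WORKER_PORT}/...` instead of a literal port.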

Closes #2103, #2109, #2101.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: install/uninstall improvements (Phase 7: #2106)

5 fixes for the install/uninstall flow:

Item 1 — multiselect default. install.ts no longer pre-selects every
detected IDE; user explicitly opts in.

Item 3 — shutdown-before-overwrite. New
src/services/install/shutdown-helper.ts shared by install and
uninstall: POSTs /api/admin/shutdown then polls /api/health until
the worker stops responding. install calls it before
copyPluginToMarketplace so reinstall over a running worker doesn't
conflict; uninstall calls it before deletion.

Item 4 — uninstall path coverage. Removes ~/.npm/_npx/*/node_modules/
claude-mem, ~/.cache/claude-cli-nodejs/*/mcp-logs-plugin-claude-mem-*,
~/.claude/plugins/data/claude-mem-thedotmack/. Best-effort: per-path
try/catch so a single permission failure doesn't abort uninstall.
chroma-mcp shutdown is implicit via the worker's GracefulShutdown
cascade in item 3's helper.

Item 5 — install summary documents "Close all Claude Code sessions
before uninstalling, or ~/.claude-mem will be recreated by active
hooks."

Item 6 — real-port query. After install, fetches /api/health on the
configured port with 3s timeout. Reports actually-bound port if the
response carries it; falls back to requested port. No retry loop.

Closes #2106 (items 1, 3, 4, 5, 6). Items 2, 7 closed separately
as already-fixed and insufficient-detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: pin chroma-mcp to 0.2.6 (Phase 8: DEL-3 lite)

Replace unpinned 'chroma-mcp' arg with chroma-mcp==0.2.6 in both
local and remote modes. Pinning makes installs deterministic across
machines and across time, eliminating the dependency-drift class
of bugs.

Verified 0.2.6 in a clean uv cache: starts cleanly, no httpcore/
httpx ImportError, no --with flags needed. The --with flags removed
in a0dd516c are not required at this pin (transitive deps resolve
correctly when the top-level version is fixed).

#2102's three protections (transport cleanup on failure, stale onclose
handler guard, 10s reconnect backoff) confirmed intact.

Closes #2046, #2085, #2102.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: update stale assertions for per-UID port + migration 30 (Phase 9)

SettingsDefaultsManager.CLAUDE_MEM_WORKER_PORT default is per-UID
(37700 + uid%100), not literal '37777'. Three assertions in
settings-defaults-manager.test.ts now compute the expected value
the same way the source does.
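The per-UID computation the tests now mirror (function name hypothetical):

```typescript
// Default worker port per the plan: 37700 + (uid % 100), so two OS users
// on the same box land on different ports without configuration.
function defaultWorkerPort(uid: number): number {
  return 37700 + (uid % 100);
}
```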

migration-runner.test.ts: drop expect(versions).toContain(19)
(version 19 was a noop never recorded — pre-existing bug at parent),
add expect(versions).toContain(30) for the new observations.metadata
column added in Phase 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Greptile P1/P2 review comments on PR #2141

P1: spawnDaemon return value was unchecked in worker-service.ts restart
case, so a failed spawn silently exited 0 with a misleading "Worker
restart spawned" log. Now error and exit 1 when restartPid is undefined.

P2: shutdown-helper.ts health-poll catch treated AbortError (timeout)
the same as connection-refused, so a slow worker could be reported
confirmedStopped while still holding file locks. Now distinguish:
AbortError continues polling; other errors return confirmedStopped.
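The distinction can be sketched with an injected probe (names assumed; the real helper uses `fetch` with an `AbortController` timeout and sleeps between attempts, omitted here):

```typescript
// probe() resolves if the worker answered /api/health, rejects otherwise.
// AbortError (timeout) means "slow, not gone" — keep polling.
// Any other rejection (e.g. connection refused) means the port is closed:
// the worker is confirmed stopped.
async function pollUntilStopped(
  probe: () => Promise<void>,
  attempts: number,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      await probe(); // still responding: retry
    } catch (err) {
      if ((err as Error).name === 'AbortError') continue;
      return true; // confirmed stopped
    }
  }
  return false; // never confirmed stopped
}
```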

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build: rebuild plugin artifacts after merging main

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address CodeRabbit review comments on PR #2141

- hooks.json: quote $HOME in cache lookup so paths with spaces work
- timeline-report SKILL.md: fall back when process.getuid is unavailable (Windows)
- opencode-plugin: validate CLAUDE_MEM_WORKER_PORT before using
- uninstall.ts: only strip alias lines, not function declarations (multi-line bodies left intact)
- MemoryRoutes: trim whitespace-only project before precedence resolution
- SessionStore migration 21: preserve metadata column if observations already has it
- stdin-reader test: restore full property descriptor to avoid cross-test pollution

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author: Alex Newman (committed by GitHub)
Date: 2026-04-25 21:23:24 -07:00
Parent: 7f255cbc51
Commit: d13662d5d8
52 changed files with 2312 additions and 1222 deletions

+20
@@ -37,6 +37,26 @@ npm run build-and-sync # Build, sync to marketplace, restart worker
 Settings are managed in `~/.claude-mem/settings.json`. The file is auto-created with defaults on first run.
+## Multi-account
+Claude-mem supports running multiple isolated profiles on the same machine (e.g. work vs personal accounts) via environment variables. No CLI subcommand needed — set the env vars in the shell where you run Claude Code.
+- **Switch profiles per shell:** Set `CLAUDE_MEM_DATA_DIR=<path>` and every claude-mem path (database, chroma, logs, settings.json, worker.pid, transcripts config) derives from it. Example:
+```bash
+export CLAUDE_MEM_DATA_DIR="$HOME/.claude-mem-work"
+```
+- **Port collisions are auto-handled:** The default worker port is `37700 + (uid % 100)`, so two different OS users on the same box get different ports for free. If you want fixed ports per profile (e.g. you run two profiles as the same UID), set `CLAUDE_MEM_WORKER_PORT` too:
+```bash
+export CLAUDE_MEM_WORKER_PORT=37800
+```
+- **All paths and ports derive from these two env vars.** Hooks, npx-cli (`install`/`uninstall`/`start`/`search`), the OpenCode plugin, the OpenClaw installer, and the timeline-report skill all honor them. The settings file itself lives at `$CLAUDE_MEM_DATA_DIR/settings.json`.
+- **Closes #2101.** See `src/shared/SettingsDefaultsManager.ts` for the canonical port/data-dir defaults and `plugin/skills/timeline-report/SKILL.md` for the shell snippet that resolves the port for arbitrary skills.
 ## File Locations
 - **Source**: `<project-root>/src/`
+6 -13
@@ -15,7 +15,7 @@ Claude-mem supports Google's Gemini API as an alternative to the Claude Agent SD
 - **Cost savings**: The free tier covers most individual usage patterns
 - **Same quality**: Gemini extracts observations using the same XML format as Claude
-- **Seamless fallback**: Automatically falls back to Claude if Gemini is unavailable
+- **Errors throw clearly**: 429s, 5xx, and network failures throw — leaving messages pending so they can be retried
 - **Hot-swappable**: Switch providers without restarting the worker
 ## Getting a Free API Key
@@ -103,23 +103,16 @@ You can switch between Claude and Gemini at any time:
 }
 ```
-## Fallback Behavior
+## Error Behavior
-If Gemini is selected but encounters errors, claude-mem automatically falls back to the Claude Agent SDK:
+If Gemini is selected and the API errors, claude-mem logs the failure and re-throws so the message stays pending for later retry. There is no Claude SDK fallback — earlier docs claimed automatic Claude fallback, but the wiring was never actually engaged in production (#2087). To switch providers, change `CLAUDE_MEM_PROVIDER` in settings.
-**Triggers fallback:**
+**Throwing conditions:**
 - Rate limiting (HTTP 429)
 - Server errors (HTTP 5xx)
 - Network issues (connection refused, timeout)
-**Does not trigger fallback:**
-- Missing API key (logs warning, uses Claude from start)
-- Invalid API key (fails with error)
-When fallback occurs:
-1. A warning is logged
-2. Any in-progress messages are reset to pending
-3. Claude SDK takes over with the full conversation context
+- 4xx errors other than 429
+- Missing API key
 ## Troubleshooting
+9 -21
@@ -16,7 +16,7 @@ Claude-mem supports [OpenRouter](https://openrouter.ai) as an alternative provid
 - **Access to 100+ models**: Choose from models across multiple providers through one API
 - **Free tier options**: Several high-quality models are completely free to use
 - **Cost flexibility**: Pay-as-you-go pricing on premium models with no commitments
-- **Seamless fallback**: Automatically falls back to Claude if OpenRouter is unavailable
+- **Errors throw clearly**: 429s, 5xx, and network failures throw — leaving messages pending so they can be retried
 - **Hot-swappable**: Switch providers without restarting the worker
 - **Multi-turn conversations**: Full conversation history maintained across API calls
@@ -187,28 +187,16 @@ You can switch between providers at any time:
 }
 ```
-## Fallback Behavior
+## Error Behavior
-If OpenRouter encounters errors, claude-mem automatically falls back to the Claude Agent SDK:
+If OpenRouter errors, claude-mem logs the failure and re-throws so the message stays pending for later retry. There is no Claude SDK fallback — earlier docs claimed automatic Claude fallback, but the wiring was never actually engaged in production (#2087). To switch providers, change `CLAUDE_MEM_PROVIDER` in settings.
-**Triggers fallback:**
+**Throwing conditions:**
 - Rate limiting (HTTP 429)
 - Server errors (HTTP 500, 502, 503)
 - Network issues (connection refused, timeout)
 - Generic fetch failures
-**Does not trigger fallback:**
-- Missing API key (logs warning, uses Claude from start)
-- Invalid API key (fails with error)
-When fallback occurs:
-1. A warning is logged
-2. Any in-progress messages are reset to pending
-3. Claude SDK takes over with the full conversation context
-<Note>
-**Fallback is transparent**: Your observations continue processing without interruption. The fallback preserves all conversation context.
-</Note>
+- 4xx errors other than 429
+- Missing API key
 ## Multi-Turn Conversation Support
@@ -245,7 +233,7 @@ Either:
 ### Rate Limiting
 Free models may have rate limits during peak usage. If you hit rate limits:
-- Claude-mem automatically falls back to Claude SDK
+- The agent throws and leaves the message pending — it will be retried later
 - Consider switching to a different free model
 - Add credits for premium model access
@@ -268,7 +256,7 @@ If you see warnings about high token usage (>50,000 per request):
 If you see connection errors:
 - Check your internet connection
 - Verify OpenRouter service status at [status.openrouter.ai](https://status.openrouter.ai)
-- The agent will automatically fall back to Claude
+- The agent throws and leaves the message pending for later retry
 ## API Details
@@ -305,7 +293,7 @@ Content-Type: application/json
 | **Models** | Claude only | Gemini only | 100+ models |
 | **Quality** | Highest | High | Varies by model |
 | **Rate limits** | Based on tier | 5-4000 RPM | Varies by model |
-| **Fallback** | N/A (primary) | → Claude | → Claude |
+| **On error** | Throws | Throws | Throws |
 | **Setup** | Automatic | API key required | API key required |
 <Tip>
+369
@@ -0,0 +1,369 @@
# Cynical Deletion Plan — 29 issues → ~7 deletions
**Date:** 2026-04-25
**Branch:** `claude-mem-skill-invocation-and-github-issue-2139`
**Source:** Triage of all 29 open issues for `thedotmack/claude-mem` applied with delete-first lens.
## Headline
The codebase has accumulated **defenders** (orphan cleanup → duplicate detection → restart-port-stealing) and **tolerators** (silent JSON drops, drifted SQL/SSE filters, silent metadata drops). Each defender breeds two more bugs; each tolerator hides the bug it tolerates until it explodes as a "regression." The work is **deleting the moats**, not patching them.
## Coverage map (29 issues)
| Phase | Action | Closes |
|---|---|---|
| P1 | DEL-1 + DEL-2: process-management theater + shell-string spawning | #2090, #2095, #2107, #2111, #2114, #2117, #2135, #2123, #2097 |
| P2 | DEL-9: observer-sessions trust boundary (`CLAUDE_MEM_INTERNAL` env) | #2126, #2118 |
| P3 | CON-2 + DEL-7: multi-account commit, port/path de-hardcoding | #2103, #2109, #2101 |
| P4 | CON-1: extend env sanitizer to proxy vars | #2115, #2099 |
| P5 | FF-1: fail-fast cleanup | #2089, #2094, #2116 |
| P6 | DEL-4 + DEL-5 + DEL-6 + DEL-8: small deletions | #2113, #2087, #2127, #2098, #2054 |
| P7 | #2106 install fixes (UX + shutdown-before-overwrite + uninstall coverage + real-port query) | #2106 |
| P8 | DEL-3 lite: pin chroma-mcp deterministically (full sqlite-vec migration deferred) | #2046, #2085, #2102 |
| P9 | Verification + close-as-dup/already-fixed | #2112, #2123 → #2135, #2097 → #2135, #2098 → #2127, #2126 (closed by P2) |
---
## Phase 0 — Documentation Discovery (DONE)
### Allowed APIs (verified)
- `child_process.spawn(cmd, [args], { detached, stdio, env })` — Node API used in `ProcessManager.ts`. Bun.spawn does NOT support `detached:true` (per `process-registry.ts:633-639` comment). Use Node `child_process` for daemon spawning.
- `Bun.spawn([args], { env })` — used for non-detached children (e.g. `chroma-vector-sync.test.ts:25`). Arg-array form bypasses shell on all platforms.
- `Agent SDK query({ cwd, env, spawnClaudeCodeProcess })` — used by `SDKAgent.ts:145-163` and `KnowledgeAgent.ts:75-84`. Custom `spawnClaudeCodeProcess` lets us inject env vars into the spawned `claude` subprocess.
- `sanitizeEnv()` from `src/supervisor/env-sanitizer.ts` — currently strips `CLAUDE_CODE_*` and `CLAUDECODE_*` (preserve list: `CLAUDE_CODE_OAUTH_TOKEN`, `CLAUDE_CODE_GIT_BASH_PATH`).
- `SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT')` — canonical port reader. Default: `37700 + (uid % 100)`.
- `paths.ts` exports: `DATA_DIR`, `OBSERVER_SESSIONS_DIR`, `OBSERVER_SESSIONS_PROJECT`, `USER_SETTINGS_PATH`, `DB_PATH`. All resolve under `CLAUDE_MEM_DATA_DIR` if set.
- Hook exit-code contract (CLAUDE.md:48-58): exit 0 = success, exit 1 = non-blocking error, exit 2 = blocking error. Worker errors should exit 0 to prevent Windows Terminal tab accumulation.
### Anti-patterns to avoid
- **Don't** invent shell-string variants of spawn. Use arg-array form everywhere. PowerShell `-EncodedCommand` and quoting heuristics are deletable once we stop building shell strings.
- **Don't** add new defender code (orphan janitors, duplicate-worker probes, retry-with-backoff loops). The existing defenders are what we're removing.
- **Don't** add new config knobs (env-passthrough whitelist, configurable timeout). Fix the default instead.
- **Don't** add tolerators (`|| true`, silent JSON drops, `.passthrough()` schemas that drop fields). Fail loud or accept the input.
- **Don't** start a sqlite-vec migration in this plan. It's a separate plan with its own discovery.
### Surprising findings worth re-verifying mid-plan
- **#2090/#2095** may already be fixed: `session-init.ts:78` returns `EXIT_CODE.SUCCESS` on worker-unreachable. Verify against the issue's repro before patching.
- **#2115** root cause confirmed: `sanitizeEnv` does NOT strip `HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY`. Extend the sanitizer; don't add a passthrough knob (#2099).
- **#2094** `file-context.ts:184,196` truncation is intentional token economics. The bug is that the truncated Read return value confuses Claude into infinite Edit retries. Fix: don't return a partial Read result from a hook — emit an injected-context note instead, or let the full Read happen.
- **#2126** items 2, 3, 4, 6 collapse into the P2 trust-boundary fix. Items 1 (basename glob) and 5 (cleanup CLI extension) are real but small.
---
## Phase 1 — Delete process-management theater (DEL-1 + DEL-2)
**Closes:** #2090, #2095, #2107, #2111, #2114, #2117, #2135, #2123, #2097
### What to delete
1. **`aggressiveStartupCleanup()`** at `src/services/infrastructure/ProcessManager.ts:659-727`. Including:
- Windows WQL filter block (lines 563-606) — deletable; PowerShell WQL bug (#2114) disappears
- Linux/macOS `ps -eo pid,etime,command | grep` block (lines 607-644)
- `AGGRESSIVE_CLEANUP_PATTERNS` and `AGE_GATED_CLEANUP_PATTERNS` constants
- `ORPHAN_MAX_AGE_MINUTES` constant
- All callers of `aggressiveStartupCleanup` (grep for usage; expected: `worker-service.ts` startup)
2. **PowerShell `-EncodedCommand` wrapper** at `ProcessManager.ts:944-1041`. Replace with `child_process.spawn(cmd, [args], { detached: true, stdio: 'ignore', windowsHide: true })`. Arg-array form bypasses shell on Windows, no quoting needed. The `setsid` Unix wrapper stays (it's correct).
3. **Restart-with-port-steal sequence** at `worker-service.ts:1154-1175`. Replace with: try `httpShutdown(port)` → if port still bound after 5s, log error and exit 1 (let user resolve). Don't loop. Don't kill PID by force. The user sees the error and acts.
4. **Worker-cli duplicate-worker self-detection.** Read `src/cli/worker-cli.js` (or wherever the restart entry-point lives). Find the path that triggers duplicate detection on a `restart` command and remove it. The PID file owns the lock; restart should atomically swap.
### What stays
- **`verifyPidFileOwnership()`** at `process-registry.ts:160-182` and `captureProcessStartToken()` at lines 94-146 — these are correct. PID file with start-time token is exactly the OS-trust pattern we want.
- **The PID file itself** at `~/.claude-mem/worker.pid` (or `$DATA_DIR/worker.pid`). This is the lock.
- **`waitForPortFree()`** with a short timeout — used to confirm shutdown completed. Stays.
### Implementation steps
1. `git grep -n aggressiveStartupCleanup` → list every callsite. Delete the function and every callsite. Run `npm run build-and-sync`.
2. Replace daemon-spawn body in `ProcessManager.ts:944-1041`:
- Single platform-uniform path: `child_process.spawn(execPath, args, { detached: true, stdio: 'ignore', windowsHide: true }).unref()`
- Keep `setsid` wrapper on Unix when available (process-group cleanup on parent death).
- Delete the PowerShell branch entirely.
3. Rewrite `worker-service.ts:1154-1175` restart case:
```
await httpShutdown(port)
const free = await waitForPortFree(port, 5000)
if (!free) {
console.error('Port still bound after shutdown. Resolve manually.')
process.exit(1)
}
removePidFile()
spawnDaemon(__filename, port)
```
4. Re-verify #2090/#2095 are already fixed by reading `session-init.ts:30-80`. If yes, log "no-op" in plan execution notes. If the original repro still fires, add `|| true`-equivalent at the hooks.json shell wrapper layer (NOT in the handler itself).
5. Confirm #2117 (cleanup SIGKILLs own ancestors) goes away once cleanup is deleted.
### Verification
- `git grep aggressiveStartupCleanup` returns zero hits.
- `git grep -E "EncodedCommand|powershell.*Start-Process"` returns zero hits in `src/`.
- Manual: kill worker, restart, confirm clean restart. Spawn 3 workers in parallel from different shells, confirm 2 fail with PID-file-owned errors and the first one wins (no kill cascade).
- Windows VM (or CI): username with space (`C:\Users\Alex Newman\`) — confirm spawn works without quoting drama. Closes #2135/#2123/#2097.
- Manually verify #2094 is NOT regressed (separate concern; covered in P5).
### Anti-pattern guards
- Don't add a "lighter" cleanup. There is no lighter cleanup. The OS owns process lifecycle.
- Don't add a "warn user about orphan workers" branch. If orphans exist, they're someone else's bug.
- Don't add platform branches in the spawn code beyond the existing `setsid` check.
---
## Phase 2 — Observer-sessions trust boundary (DEL-9)
**Closes:** #2126 (items 2, 3, 4, 6 by deletion; items 1, 5 by small fix), #2118
### What to do
Replace the `cwd === OBSERVER_SESSIONS_DIR` discriminator pattern (which has to be repeated by every consumer and inevitably drifts) with a single env-var trust boundary.
### Implementation steps
1. **Set the env var at every spawn site:**
- `src/services/worker/SDKAgent.ts:113` (`buildIsolatedEnv`) — add `CLAUDE_MEM_INTERNAL: '1'` to the returned env.
- `src/services/worker/knowledge/KnowledgeAgent.ts:73` — same.
- Confirm both call `Agent SDK query()` with `env: isolatedEnv` so the spawned `claude` subprocess inherits.
2. **Check the env var first in `shouldTrackProject`:**
- `src/shared/should-track-project.ts:35-44` — first line of function: `if (process.env.CLAUDE_MEM_INTERNAL === '1') return false;`
- Keep the existing `isWithin(cwd, OBSERVER_SESSIONS_DIR)` check as a belt-and-braces fallback.
3. **Delete now-redundant filters:**
- `src/services/worker/PaginationHelper.ts:115-117` — keep (UI hides observer rows; harmless).
- `src/services/worker/PaginationHelper.ts:178` — change hardcoded string `'observer-sessions'` to `OBSERVER_SESSIONS_PROJECT` const for consistency. Tiny fix.
- `src/services/worker/SSEBroadcaster.ts:45-60` — add the SAME filter that SearchManager uses (`SearchManager.ts:194`). Don't invent a new one. Extract the filter predicate to a shared helper used by both. Closes #2118.
4. **#2126 item 1 (basename glob fix):** Read the issue's exact bug. Likely `EXCLUDED_PROJECTS` matches by full path instead of basename. Fix in the matcher; one-liner.
5. **#2126 item 5 (cleanup CLI):** Extend `src/services/infrastructure/CleanupV12_4_3.ts:185-205` to take a `--dry-run` and report counts. Don't write a new CLI; add the flag to existing.
### Verification
- Add a test: spawn `SDKAgent`, verify the spawned subprocess has `CLAUDE_MEM_INTERNAL=1` in its env.
- Add a test: `shouldTrackProject('/any/path')` with `CLAUDE_MEM_INTERNAL=1` set returns `false`.
- Manual: trigger an observer session, confirm zero new rows under user's project in the DB.
- SSE: connect a client to `/api/events`, trigger an observer session, confirm no observer events on the SSE stream.
### Anti-pattern guards
- Don't add a `CLAUDE_MEM_OBSERVER_SESSION_DIR` env override (#2126 item 2). `CLAUDE_MEM_DATA_DIR` already overrides; the observer dir is derived.
- Don't add per-consumer filter knobs. One trust boundary, two existing filters (PaginationHelper, SSE), shared helper.
---
## Phase 3 — Multi-account commit + port/path de-hardcoding (CON-2 + DEL-7)
**Closes:** #2103, #2109, #2101
Discovery showed multi-account is ~80% there: `DATA_DIR` is fully overridable, per-UID port already exists, PID files are DATA_DIR-relative. The remaining gap is 8 hardcoded `37777` literals + hooks.json bare-port assumption.
### What to do
1. **Eliminate every hardcoded `37777`:**
- `src/ui/viewer/constants/settings.ts:8` — change to read from settings/env at runtime if possible; otherwise leave as build-time default (least bad).
- `src/npx-cli/commands/runtime.ts:154`, `install.ts:545`, `uninstall.ts:109` — replace fallback with `SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT')`.
- `src/integrations/opencode-plugin/index.ts:97` — same. Read from settings.
- `src/services/integrations/OpenClawInstaller.ts:171` — drop the default; require the caller to pass it.
- `plugin/skills/timeline-report/SKILL.md:23,53` — replace literal with `${CLAUDE_MEM_WORKER_PORT:-37700}` or instruct the skill to read from settings.json. Closes #2103.
2. **Fix hooks.json port handling for #2109:**
- `plugin/hooks/hooks.json` — every hook command needs to either (a) inherit the port from env or (b) read from settings.json. Update the `bun-runner.js` wrapper to do this once.
- On Windows + Git Bash, ensure POSIX path → Windows path conversion happens before passing to `node.exe`. The `bun-runner.js` wrapper is the right place.
3. **Multi-account commit:**
- Document in CLAUDE.md: multi-account works by setting `CLAUDE_MEM_DATA_DIR=/path/to/account-N` per shell. All paths derive from it. Per-UID port collision is handled automatically.
- Add a one-line CLI command: `claude-mem profile use <name>` that exports the right env vars (or just print the export command for user to eval).
- Close #2101 with documentation pointing at the above.
### Verification
- `git grep -nE "37777" src/ plugin/` returns only the build-time default in `settings.ts`.
- Run two workers in parallel under different `CLAUDE_MEM_DATA_DIR` values; both bind successfully on different ports; both have separate PID files; both serve separate SSE streams.
- Run timeline-report skill against a non-default port; it picks up the right port from settings.
### Anti-pattern guards
- Don't add a "discover running workers on common ports" probe. The settings.json port is the source of truth.
- Don't add a `--port` flag to every CLI command. The env / settings.json owns it.
---
## Phase 4 — Extend env sanitizer (CON-1)
**Closes:** #2115, #2099
### What to do
1. `src/supervisor/env-sanitizer.ts` — extend `ENV_PREFIXES` and/or add a `PROXY_VARS` set that strips:
- `HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, `NO_PROXY` (and lowercase variants)
- Optionally: `npm_config_proxy`, `npm_config_https_proxy`
2. Decide whether the strip should be unconditional or opt-in. Default: unconditional. Worker spawns `claude` for internal AI calls; the user's proxy config should not bleed in.
3. **Reject #2099's passthrough-whitelist feature.** Close with: "we now strip proxy vars by default; if you have a real use case for letting them through, file a new issue with details."
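The strip itself is small — a sketch assuming env-sanitizer exposes a function returning a cleaned copy; the `stripProxyVars` name is hypothetical:

```typescript
// Lowercase membership set + toLowerCase() covers HTTP_PROXY, http_proxy,
// Http_Proxy, etc. in one check.
const PROXY_VARS = new Set([
  "http_proxy", "https_proxy", "all_proxy", "no_proxy",
  "npm_config_proxy", "npm_config_https_proxy",
]);

function stripProxyVars(
  env: Record<string, string | undefined>
): Record<string, string | undefined> {
  const out: Record<string, string | undefined> = {};
  for (const [key, value] of Object.entries(env)) {
    if (PROXY_VARS.has(key.toLowerCase())) continue; // drop proxy vars
    out[key] = value; // everything else passes through untouched
  }
  return out;
}
```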
### Verification
- Set `HTTPS_PROXY=http://bad-proxy:1234` in the worker shell. Spawn an SDK subprocess. Confirm the subprocess's env does NOT contain `HTTPS_PROXY`. Add a test for this.
- `git grep -n "HTTP_PROXY\|HTTPS_PROXY"` shows the sanitizer is the only place that knows about them.
---
## Phase 5 — Fail-fast cleanup (FF-1)
**Closes:** #2089, #2094, #2116. **#2118 is closed by P2.**
### #2089 — stdin-reader silent drop
`src/cli/stdin-reader.ts:156-164` — `onEnd` resolves with `undefined` even on parse failure. Change to: if input is non-empty AND parse fails, throw or call the safety-timeout error path. Match what the issue asks for: distinguish "no input" from "malformed input." Document in the function header.
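The three-way split looks like this — a sketch where `parseStdin` is a hypothetical stand-in for the `onEnd` resolution path:

```typescript
// "empty" is legitimate (resolve undefined upstream); "malformed" must surface
// as an error, never as a silent undefined.
type StdinResult =
  | { kind: "empty" }
  | { kind: "parsed"; value: unknown }
  | { kind: "malformed"; error: string };

function parseStdin(raw: string): StdinResult {
  if (raw.trim().length === 0) return { kind: "empty" };
  try {
    return { kind: "parsed", value: JSON.parse(raw) };
  } catch (err) {
    return { kind: "malformed", error: err instanceof Error ? err.message : String(err) };
  }
}
```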
### #2094 — PreToolUse:Read truncation causes Edit deadlock
`src/cli/handlers/file-context.ts:141-143, 184, 196` — the truncation is intentional (token economics), but returning a truncated Read result confuses Claude. Fix:
- Hooks should not return modified Read results. They can inject context as `additionalContext` or skip entirely.
- Audit what the handler returns to Claude Code. If it returns a fake Read response with 1 line, that's the bug. It should either return `{ continue: true }` (let the real Read happen) or inject context via `additionalContext` field.
- Read Claude Code's PreToolUse hook contract for what fields are allowed in the response.
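For the audit, the context-injection shape looks roughly like the sketch below. The `{ "continue": true }` no-op is the safe baseline; the `hookSpecificOutput.additionalContext` field follows the issue's suggestion and is an assumption until the contract read above confirms PreToolUse supports it:

```json
{
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "additionalContext": "claude-mem file summary (truncated to save tokens): ..."
  }
}
```

Either way, the real Read result reaches Claude unmodified — the handler never returns a fake one-line Read response.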
### #2116 — `/api/memory/save` silently drops metadata
`src/services/worker/http/routes/MemoryRoutes.ts:16-20, 38-67` — the schema uses `.passthrough()` which keeps unknown fields, but discovery suggests fields are dropped at insert time. Audit:
- Where do the schema's accepted fields get inserted? If only `text/title/project` are in the INSERT statement, the metadata is dropped silently.
- Fix: either accept arbitrary metadata into a `metadata` JSON column, or reject requests with unknown fields (`.strict()` instead of `.passthrough()`). Pick one. Default: accept into a JSON column.
- The "force project to plugin's own project" line at `MemoryRoutes.ts:40` (`const targetProject = project || this.defaultProject`) is fine. It uses caller's value if provided. Verify the issue reporter wasn't omitting `project` field.
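The "accept into a JSON column" option, sketched without Zod so it runs standalone — `toSaveRow`, the `SaveRow` shape, and the `metadata` column are assumptions; only `text`/`title`/`project` are known columns from the route:

```typescript
// Split the body into known columns + a catch-all metadata JSON column.
// Nothing is silently dropped.
interface SaveRow { text: string; title?: string; project?: string; metadata: string }

const KNOWN_FIELDS = new Set(["text", "title", "project"]);

function toSaveRow(body: Record<string, unknown>): SaveRow {
  if (typeof body.text !== "string") throw new Error("text is required");
  const extra: Record<string, unknown> = {};
  for (const [k, v] of Object.entries(body)) {
    if (!KNOWN_FIELDS.has(k)) extra[k] = v; // unknown fields survive here
  }
  return {
    text: body.text,
    title: typeof body.title === "string" ? body.title : undefined,
    project: typeof body.project === "string" ? body.project : undefined,
    metadata: JSON.stringify(extra), // lands in the metadata JSON column
  };
}
```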
### Verification
- Test: `POST /api/memory/save` with `metadata: { foo: 'bar' }` — confirm the data is retrievable.
- Test: malformed JSON to stdin-reader fires error, not silent undefined.
- Manual: trigger PreToolUse:Read on a large file — confirm Edit succeeds afterward (no deadlock).
---
## Phase 6 — Small deletions (DEL-4 + DEL-5 + DEL-6 + DEL-8)
### DEL-4 — Un-bundle Zod from hook scripts (#2113)
- `scripts/build-hooks.js:163-171, 203-230, 294` — add `'zod'` to the `external` list for hook builds.
- If hooks need validation, write a 20-line shape check (`typeof x.foo === 'string'` etc.). Don't reach for Zod for hook input.
- Audit `src/hooks/` for Zod imports; replace with hand-rolled checks.
- Worker (`worker-service.cjs`) can still bundle Zod — the conflict is only in hook-bundled scripts loaded by OpenCode.
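The "20-line shape check" in practice — a hypothetical validator replacing Zod; the field names mirror common hook payload fields and should be adjusted to the actual input:

```typescript
// Hand-rolled type guard: no dependency, no bundling conflict with OpenCode.
interface HookInput {
  session_id: string;
  tool_name: string;
  tool_input: Record<string, unknown>;
}

function isHookInput(x: unknown): x is HookInput {
  if (typeof x !== "object" || x === null) return false;
  const o = x as Record<string, unknown>;
  return (
    typeof o.session_id === "string" &&
    typeof o.tool_name === "string" &&
    typeof o.tool_input === "object" && o.tool_input !== null
  );
}
```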
**Verification:** `node -e "require('./plugin/scripts/<hook>.js')"` loads cleanly, and grepping the bundle for Zod internals (e.g. `grep -ci zoderror plugin/scripts/<hook>.js`) shows no bundled Zod runtime. Run with OpenCode hook environment; #2113's TypeError doesn't reproduce.
### DEL-5 — Delete GeminiAgent fallback (#2087)
- `src/services/worker/GeminiAgent.ts:130-132` — delete `setFallbackAgent`.
- `src/services/worker/GeminiAgent.ts:365` — delete the `if (this.fallbackAgent)` branch. On 429: log + throw.
- `src/services/worker/OpenRouterAgent.ts:79-81` — same.
- `tests/gemini_agent.test.ts:279, 313` — delete the fallback tests; add an explicit "429 throws" test.
- Update docs anywhere that mentions Gemini-falls-back-to-Claude (it never did in production).
### DEL-6 — Delete the 4-hour session timeout knob request (#2127, #2098)
- Find `MAX_SESSION_WALL_CLOCK_MS` (likely `src/services/worker/sessions/SessionManager.ts` or similar). Read the surrounding code: what does the timeout do? (Likely cleanup of stale sessions.)
- If the timeout is arbitrary: raise to 24h or remove. Document why.
- If the timeout exists for a real reason (memory pressure, abandoned sessions): document the reason in code, raise to a value nobody hits in practice, and close both issues with the explanation.
- Close #2098 as dup of #2127.
### DEL-8 — Delete `installCLI()` alias (#2054)
- `plugin/scripts/smart-install.js:345-395` — delete `installCLI` function.
- `plugin/scripts/smart-install.js:633` — delete the call.
- `src/npx-cli/commands/uninstall.ts` — add a one-time legacy-alias-strip pass:
- Read `~/.bashrc`, `~/.zshrc`, `~/Documents/PowerShell/Microsoft.PowerShell_profile.ps1`.
- Remove any line matching `^alias claude-mem=` or `^function claude-mem`.
- Print "Removed legacy claude-mem alias from <file>" so users know.
- Update README + docs: canonical entry points are `npx claude-mem <cmd>` and `bunx claude-mem <cmd>`.
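The strip pass is easiest to test as a pure function over file contents — a sketch where `stripLegacyAlias` is a hypothetical helper (the real uninstall code would read each rc file, call this, and write back only when `removed` is true); leading whitespace tolerance is an addition beyond the two anchored patterns above:

```typescript
// Remove lines matching the legacy alias/function, keep everything else.
function stripLegacyAlias(content: string): { content: string; removed: boolean } {
  const lines = content.split("\n");
  const kept = lines.filter(
    (line) => !/^\s*(alias claude-mem=|function claude-mem\b)/.test(line)
  );
  return { content: kept.join("\n"), removed: kept.length !== lines.length };
}
```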
**Verification:** Fresh install creates no shell-config mutations. Existing user with the alias runs uninstall — alias is gone. `which claude-mem` after uninstall returns nothing.
---
## Phase 7 — #2106 install fixes (modest scope)
**Closes:** #2106 (items 1, 3, 4, 6 by fix; items 2, 7 by close-as-already-fixed/insufficient-detail; item 5 by documentation).
### Fixes
1. **Item 1 — multiselect default:** `src/npx-cli/commands/install.ts:275-277` — change `initialValues: detected.filter(...).map(...)` to `initialValues: []`. Force explicit opt-in.
2. **Item 3 — install-shutdown-before-overwrite:** Extract `uninstall.ts:109-132` (HTTP shutdown + poll) to `src/services/install/shutdown-helper.ts`. Call it from both `uninstall.ts` and `install.ts` before `copyPluginToMarketplace`.
3. **Item 4 — uninstall path coverage:** `src/npx-cli/commands/uninstall.ts` — add removal of:
- `~/.npm/_npx/*/node_modules/claude-mem`
- `~/.cache/claude-cli-nodejs/*/mcp-logs-plugin-claude-mem-*`
- `~/.claude/plugins/data/claude-mem-thedotmack/`
- Cascade shutdown to chroma-mcp (call its shutdown endpoint or kill PID).
4. **Item 6 — real port query:** `install.ts:545` — after `smart-install.js` completes, hit `http://127.0.0.1:<settingsPort>/api/health` and report the actually-bound port. If health fails, just print "worker not yet ready" and exit cleanly.
5. **Item 5 — documentation:** Add to install summary output: "Close all Claude Code sessions before uninstalling, or `~/.claude-mem` will be recreated by active hooks."
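Item 6's health query, sketched with an injectable fetch so it stays testable — the `/api/health` path comes from this plan; `reportBoundPort`, `FetchLike`, and the message strings are assumptions:

```typescript
// Hit the settings-derived port once; report reality, never guess.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

async function reportBoundPort(settingsPort: number, fetchImpl: FetchLike): Promise<string> {
  try {
    const res = await fetchImpl(`http://127.0.0.1:${settingsPort}/api/health`);
    if (res.ok) return `worker listening on port ${settingsPort}`;
  } catch {
    // network error: treat the same as a failed health check
  }
  return "worker not yet ready";
}
```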
### Close
- Item 2 (SQLite migration race): closed as already fixed by `ba37b2b2`/`68e92edc`.
- Item 7 (vague SessionStart errors): closed as insufficient detail.
### Verification
- Fresh install on a clean VM: only the IDEs the user explicitly checks are installed.
- Reinstall while worker is running: install succeeds, no "overwrite" loop.
- Uninstall + `find ~/.npm ~/.cache ~/.claude -name "*claude-mem*"` returns empty.
- Install summary prints the actual port when the user has overridden via env or settings.
---
## Phase 8 — Chroma deterministic pinning (DEL-3 lite)
**Closes:** #2046, #2085, #2102
Full sqlite-vec migration is a separate plan (it would require replacing the embedding pipeline currently owned by chroma-mcp's bundled SBERT). For this plan: stop passing ad-hoc `uvx --with` flags and pin chroma-mcp to a specific version with locked deps.
### Implementation
1. **Pin chroma-mcp version.** `src/services/sync/ChromaMcpManager.ts:200-244` — change `buildCommandArgs()` to invoke a specific pinned version: `uvx --python 3.11 chroma-mcp==<X.Y.Z>` (pick a known-good version that bundles its own deps).
2. **Re-add `--with httpcore --with httpx` ONLY if the pinned version requires them.** Verify by running the pinned command in a clean uvx cache. If the deps are declared properly upstream, the `--with` flags are unnecessary.
3. **Verify #2102 fix is intact:** commit `05114bec` added transport cleanup on timeout, stale onclose handler guard, and 10s reconnect backoff. Read `ChromaMcpManager.ts` to confirm these are still present.
### Decision deferred to a separate plan
- Replacing chroma-mcp with sqlite-vec or a different vector store. This requires picking an embedding strategy (OpenAI? local model?) and rewriting `ChromaSync.ts`. Not in this plan.
### Verification
- Fresh install on a clean machine: `~/.claude-mem/chroma/` populates, `chroma_query_documents` returns results without errors.
- No "No module named 'httpcore'" error in worker logs (closes #2046, #2085).
- Force a chroma-mcp timeout (e.g. kill the subprocess); confirm the worker reconnects after backoff without spawning duplicate subprocesses (closes #2102).
---
## Phase 9 — Verification + close-as-dup
### Cross-cutting verification
1. `git grep -nE "aggressiveStartupCleanup|EncodedCommand|setFallbackAgent|installCLI"` — all return zero hits.
2. `git grep -nE "37777" src/ plugin/` — only the build-time default in `viewer/constants/settings.ts`.
3. Full test suite passes.
4. `npm run build-and-sync` completes; worker starts; SessionStart context injection works (manual test: open a new session, confirm memory recap appears).
5. CI runs on Windows (or manual VM): username with space spawns successfully.
### Close issues
- #2112: already fixed → close with link to fix commit.
- #2123: dup of #2135.
- #2097: dup of #2135.
- #2098: dup of #2127.
- #2126: closed by P2 trust-boundary fix.
- #2099: closed by P4 (proxy strip is the right fix; passthrough whitelist not needed).
- #2101: closed by P3 documentation + multi-account commit.
- #2117: closed by P1 (deletion of aggressive cleanup).
- #2087: closed by P6 (DEL-5).
All other issues close as part of their respective phase verification.
---
## Plan execution order
P1 first (highest leverage; closes 9 issues; reverses regression treadmill). Then P2 (single trust boundary closes 2 issues + prevents future leaks). P3-P8 are independent and can run in parallel by different sessions. P9 last.
If time-constrained, the high-value subset is **P1 + P2 + P5**: kills the two structural patterns (defenders, tolerators) plus the trust-boundary leak. That alone closes 14 of 29 issues with mostly deletions.
# Plan: Fix Issue #2139 — Missing migration for `pending_messages.tool_use_id` and `pending_messages.worker_pid`
## Root Cause (verified)
There are **two parallel migration code paths** in this repo:
1. `src/services/sqlite/migrations/runner.ts::MigrationRunner.runAllMigrations()` — the canonical runner. It includes `rebuildPendingMessagesForSelfHealingClaim()` (v28) which adds `tool_use_id` + `worker_pid` columns and the `idx_pending_messages_worker_pid` + `ux_pending_session_tool` indexes.
2. `src/services/sqlite/SessionStore.ts` constructor (lines 56-77) — a **duplicated** inline migration list. **It is missing migration 28 entirely** — it calls `addObservationSubagentColumns()` (v27) directly followed by `addObservationsUniqueContentHashIndex()` (v29).
The worker bypasses `Database.ts → MigrationRunner` and instantiates `SessionStore` directly via `src/services/worker/DatabaseManager.ts:34` (`this.sessionStore = new SessionStore(this.db);`). So in a fresh worker boot, the worker only runs SessionStore's incomplete list, leaves v28 unapplied, marks v29 as applied, and the bundled `plugin/scripts/worker-service.cjs` ships without v28's logic (verified: `grep -c "rebuildPendingMessagesForSelfHealingClaim" plugin/scripts/worker-service.cjs` returns 0; `.run(28,` is absent while `.run(27,` and `.run(29,` are present).
Result: `pending_messages` is created from `createPendingMessagesTable()` (v16) which has neither column, no later step adds them, and every queue claim and observation insert fails as the issue describes.
## Fix Strategy
Mirror `MigrationRunner.rebuildPendingMessagesForSelfHealingClaim` into `SessionStore.ts` following the **exact mirror precedent already established** in that file at `SessionStore.ts:1003-1039` (`addObservationSubagentColumns`) and `SessionStore.ts:1041-…` (`addObservationsUniqueContentHashIndex`). Each existing mirror's docstring explicitly says: "Mirrors `MigrationRunner.<name>` so bundled artifacts that embed SessionStore (e.g. worker-service.cjs, context-generator.cjs) stay schema-consistent."
We do **not** need a new schema_versions number. The existing migration is v28; we just need SessionStore to apply it. The mirror should be **column-existence driven** (not version-trust driven) per the SessionStore convention at line 952: *"Cannot trust schema_versions alone — the old MigrationRunner may have recorded version 26 without the ALTER TABLE actually succeeding. Always check column existence directly."* This matters because real-world affected DBs already have v29 recorded (per the issue) — checking version alone would skip the fix.
We should use the **simple `ALTER TABLE` approach** the issue suggests rather than the full table-rebuild from runner.ts, because:
- ALTER TABLE is safe to run on DBs that already reached v29 with rows present.
- The runner.ts rebuild's only extra work was dropping a legacy stale-reset epoch column that hasn't existed since v20 in DBs created by the SessionStore path.
- Idempotency is achieved by `PRAGMA table_info` + column-name guards.
## Phase 0: Documentation Discovery (already done inline above)
Sources consulted:
- `src/services/sqlite/SessionStore.ts:30-77` (constructor migration list)
- `src/services/sqlite/SessionStore.ts:949-1100` (existing mirror methods + docstrings)
- `src/services/sqlite/migrations/runner.ts:22-43` (canonical migration order)
- `src/services/sqlite/migrations/runner.ts:1005-1153` (canonical v28 logic)
- `src/services/sqlite/PendingMessageStore.ts:106-194` (consumer SQL using both columns)
- `src/services/sqlite/schema.sql:121-156` (canonical fresh-DB schema — already has both columns + indexes)
- `src/services/worker/DatabaseManager.ts:31-35` (worker uses SessionStore directly)
- `plugin/scripts/worker-service.cjs` — confirmed bundled artifact has `.run(27,` and `.run(29,` but no `.run(28,` and no `rebuildPendingMessagesForSelfHealingClaim` symbol.
Allowed APIs (verified to exist):
- `this.db.query('PRAGMA table_info(pending_messages)').all() as TableColumnInfo[]` — used at SessionStore.ts:1024.
- `this.db.run('ALTER TABLE pending_messages ADD COLUMN <col> <type>')` — used at SessionStore.ts:1029, 1032.
- `this.db.run('CREATE INDEX IF NOT EXISTS …')` — used throughout.
- `this.db.run('CREATE UNIQUE INDEX IF NOT EXISTS …')` — used at runner.ts:1134.
- `this.db.prepare('INSERT OR IGNORE INTO schema_versions …').run(28, new Date().toISOString())` — same pattern as v27, v29 mirrors.
- `TableColumnInfo` is already imported at SessionStore.ts top.
Anti-patterns to avoid:
- Do NOT trust `schema_versions.version = 28` alone — check `PRAGMA table_info` for column existence first (real-world DBs from issue #2139 already have v29 recorded with no v28 logic ever applied).
- Do NOT do a full table rebuild in SessionStore — risky on populated DBs and unnecessary; use ALTER TABLE.
- Do NOT add a new version number (e.g. v30). The migration is v28 — we are completing what was already specified, not creating new schema.
- Do NOT modify `runner.ts` — its v28 is correct already; the bug is only that SessionStore doesn't mirror it.
- Do NOT remove the duplicated migration system. That's a larger refactor (see observation 71512). For this fix, just complete the mirror.
## Phase 1: Add the mirror method to SessionStore.ts
**File:** `src/services/sqlite/SessionStore.ts`
### 1A. Add the call site
In the constructor migration list, insert one line between line 75 (`this.addObservationSubagentColumns();`) and line 76 (`this.addObservationsUniqueContentHashIndex();`):
```ts
this.addObservationSubagentColumns();
this.addPendingMessagesToolUseIdAndWorkerPidColumns(); // ← new
this.addObservationsUniqueContentHashIndex();
```
This places the call in the same relative position as `rebuildPendingMessagesForSelfHealingClaim` in `runner.ts:41`.
### 1B. Add the method body
Insert immediately before `addObservationsUniqueContentHashIndex` (around SessionStore.ts:1041), following the docstring pattern of the two adjacent mirrors:
```ts
/**
* Add tool_use_id and worker_pid columns + indexes to pending_messages (migration 28).
*
* Mirrors MigrationRunner.rebuildPendingMessagesForSelfHealingClaim so bundled
* artifacts that embed SessionStore (e.g. worker-service.cjs, context-generator.cjs)
* stay schema-consistent. Without this, every queue-claim cycle fails with
* "no such column: worker_pid" and every observation insert fails with
* "table pending_messages has no column named tool_use_id" (issue #2139).
*
* Uses ALTER TABLE rather than the full table rebuild from MigrationRunner because:
* - It's safe on populated DBs that already reached v29 without ever applying v28.
* - The legacy stale-reset epoch column the rebuild dropped never existed in
* pending_messages tables created by the SessionStore migration path.
*
* Column existence is checked directly — schema_versions cannot be trusted because
* affected DBs may already have v29 recorded with neither column present (#2139).
*/
private addPendingMessagesToolUseIdAndWorkerPidColumns(): void {
// pending_messages may not exist yet on freshly-created DBs at this point in
// the migration order — createPendingMessagesTable (v16) has already run by
// the time we get here, so this guard is defensive only.
const tables = this.db.query(
"SELECT name FROM sqlite_master WHERE type='table' AND name='pending_messages'"
).all() as TableNameRow[];
if (tables.length === 0) {
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(28, new Date().toISOString());
return;
}
const cols = this.db.query('PRAGMA table_info(pending_messages)').all() as TableColumnInfo[];
const hasToolUseId = cols.some(c => c.name === 'tool_use_id');
const hasWorkerPid = cols.some(c => c.name === 'worker_pid');
if (!hasToolUseId) {
this.db.run('ALTER TABLE pending_messages ADD COLUMN tool_use_id TEXT');
}
if (!hasWorkerPid) {
this.db.run('ALTER TABLE pending_messages ADD COLUMN worker_pid INTEGER');
}
// Indexes are idempotent — match runner.ts:1117-1120 + 1134-1138.
this.db.run('CREATE INDEX IF NOT EXISTS idx_pending_messages_worker_pid ON pending_messages(worker_pid)');
// The UNIQUE partial index requires no duplicate (content_session_id, tool_use_id)
// pairs. Dedup before creating it (matches runner.ts:1124-1132). Safe to run
// unconditionally — if tool_use_id was just added, every row has it as NULL
// and the WHERE filter excludes them.
this.db.run(`
DELETE FROM pending_messages
WHERE tool_use_id IS NOT NULL
AND id NOT IN (
SELECT MIN(id) FROM pending_messages
WHERE tool_use_id IS NOT NULL
GROUP BY content_session_id, tool_use_id
)
`);
this.db.run(`
CREATE UNIQUE INDEX IF NOT EXISTS ux_pending_session_tool
ON pending_messages(content_session_id, tool_use_id)
WHERE tool_use_id IS NOT NULL
`);
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(28, new Date().toISOString());
}
```
`TableNameRow` is not currently imported in SessionStore.ts. **Check the existing imports**; if absent, either:
- Add `TableNameRow` to the existing `import { TableColumnInfo, … } from '../../types/database.js';` line, or
- Inline the cast as `as Array<{ name: string }>` (matches the inline pattern used elsewhere in the file).
### 1C. Anti-pattern guards
- ❌ Do **not** wrap in `BEGIN TRANSACTION` — the surrounding constructor doesn't, and `ALTER TABLE … ADD COLUMN` is auto-committed in SQLite.
- ❌ Do **not** call `PRAGMA foreign_keys = OFF` — only needed for table rebuilds, not ALTER TABLE.
- ❌ Do **not** key off `SELECT version FROM schema_versions WHERE version = 28` to early-return — affected DBs have v29 recorded without v28 columns. Always inspect `PRAGMA table_info` first.
### 1D. Verification (Phase 1)
```bash
# Source-side smoke checks
grep -n "addPendingMessagesToolUseIdAndWorkerPidColumns" src/services/sqlite/SessionStore.ts
# Should show 2 matches (call site + method definition)
# Confirm relative ordering is correct
grep -n "addObservationSubagentColumns\|addPendingMessagesToolUseIdAndWorkerPid\|addObservationsUniqueContentHashIndex" src/services/sqlite/SessionStore.ts | head -3
# Should print three lines in order: subagent, pending-messages, unique-hash
```
## Phase 2: Build and verify the bundle
```bash
npm run build-and-sync
```
Verification:
```bash
# Bundled artifact must now contain v28 logic.
grep -c "addPendingMessagesToolUseIdAndWorkerPidColumns\|tool_use_id" plugin/scripts/worker-service.cjs
# tool_use_id count should rise from 6 to >=10 (CREATE INDEX strings + new ALTERs).
grep -on "\.run(2[7-9]," plugin/scripts/worker-service.cjs
# Must now include .run(28, in addition to existing .run(27, and .run(29,
```
## Phase 3: End-to-end verification on a real worker
1. Move the existing DB aside to simulate a fresh install:
```bash
mv ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.preissue2139
mv ~/.claude-mem/claude-mem.db-wal ~/.claude-mem/claude-mem.db-wal.preissue2139 2>/dev/null
mv ~/.claude-mem/claude-mem.db-shm ~/.claude-mem/claude-mem.db-shm.preissue2139 2>/dev/null
```
2. Restart the worker (kill PID from `~/.claude-mem/supervisor.json`; the supervisor respawns it).
3. Confirm the schema:
```bash
sqlite3 ~/.claude-mem/claude-mem.db "PRAGMA table_info(pending_messages);" | grep -E 'tool_use_id|worker_pid'
# Both rows must appear.
sqlite3 ~/.claude-mem/claude-mem.db "SELECT version FROM schema_versions ORDER BY version;"
# Must include 28 and 29.
sqlite3 ~/.claude-mem/claude-mem.db ".indexes pending_messages" | grep -E 'worker_pid|session_tool'
# idx_pending_messages_worker_pid and ux_pending_session_tool must appear.
```
4. Run a tool call in Claude Code so PostToolUse fires.
5. `tail -n 200 ~/.claude-mem/logs/<latest>.log | grep -E 'no such column|has no column'` — must be empty.
6. `sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"` — must be > 0 after a real session.
7. Restore the original DB so the test isn't destructive:
```bash
mv ~/.claude-mem/claude-mem.db.preissue2139 ~/.claude-mem/claude-mem.db
# (and the -wal/-shm if they existed)
```
## Phase 4: Existing-DB upgrade verification
The user's reported scenario (v29 already applied, columns missing) must also self-heal once the bundle ships. To prove that without waiting for an external user:
1. Copy the current dev DB to a scratch path.
2. Force the broken state:
```bash
cp ~/.claude-mem/claude-mem.db /tmp/issue2139-test.db
sqlite3 /tmp/issue2139-test.db "
ALTER TABLE pending_messages DROP COLUMN tool_use_id;
ALTER TABLE pending_messages DROP COLUMN worker_pid;
DROP INDEX IF EXISTS idx_pending_messages_worker_pid;
DROP INDEX IF EXISTS ux_pending_session_tool;
INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (29, datetime('now'));
"
# If DROP COLUMN errors on an older sqlite3 build, simulate via a fresh DB
# at a 12.4.4-equivalent state instead.
```
3. Point a one-off SessionStore at it (a tiny `bun run` script invoking `new SessionStore('/tmp/issue2139-test.db')`).
4. Re-run `PRAGMA table_info(pending_messages)` — both columns must be present, and `schema_versions` must contain `28`.
## Phase 5: Issue follow-through
1. Reply on issue #2139:
- Confirm the diagnosis (SessionStore mirror missing v28).
- Note the fix is shipping — give the version number after `version-bump`.
- Thank the reporter (offer was already in their post; we don't need a PR from them).
2. After the next claude-mem release, the affected user's worker will self-heal on next boot via the column-existence guards.
## Anti-Pattern Audit (final)
- [x] No new schema_versions number invented (we use existing v28).
- [x] No version-trust early returns added — column-existence is the source of truth.
- [x] No table rebuild — straight `ALTER TABLE` to keep the existing rows safe.
- [x] No edits to `runner.ts` (already correct).
- [x] Mirror docstring follows the exact precedent at SessionStore.ts:1003 + :1041.
- [x] Bundle rebuilt and grep-verified to include `.run(28,`.
## Risk Assessment
- **Low risk**: ALTER TABLE ADD COLUMN with a NULLable type cannot fail on a non-empty table; CREATE INDEX IF NOT EXISTS is no-op on subsequent boots; the dedup DELETE is bounded by `tool_use_id IS NOT NULL`, which is empty immediately after the first ALTER.
- **No data loss**: Adding columns and partial unique indexes is non-destructive. The dedup DELETE only fires if duplicate `(content_session_id, tool_use_id)` pairs already exist — an impossibility in the broken-DB scenario where `tool_use_id` was never persisted.
- **Idempotent**: Repeated boots are safe — `PRAGMA table_info` + `IF NOT EXISTS` + `INSERT OR IGNORE`.
@@ -8,7 +8,7 @@
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
"command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/smart-install.js\"",
"timeout": 300
}
]
@@ -21,19 +21,19 @@
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/smart-install.js\"",
"timeout": 300
},
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; echo '{\"continue\":true,\"suppressOutput\":true}'",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; echo '{\"continue\":true,\"suppressOutput\":true}'",
"timeout": 60
},
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context",
"timeout": 60
}
]
@@ -45,7 +45,7 @@
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
"timeout": 60
}
]
@@ -58,7 +58,7 @@
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code observation",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code observation",
"timeout": 120
}
]
@@ -71,7 +71,7 @@
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code file-context",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code file-context",
"timeout": 60
}
]
@@ -83,7 +83,7 @@
{
"type": "command",
"shell": "bash",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code summarize",
"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt \"$HOME/.claude/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; command -v cygpath >/dev/null 2>&1 && { _W=$(cygpath -w \"$_R\" 2>/dev/null); [ -n \"$_W\" ] && _R=\"$_W\"; }; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code summarize",
"timeout": 120
}
]
+1
View File
@@ -5,6 +5,7 @@
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",
"dependencies": {
"zod": "^4.3.6",
"tree-sitter-cli": "^0.26.5",
"tree-sitter-c": "^0.24.1",
"tree-sitter-cpp": "^0.23.4",
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
+2 -57
View File
@@ -339,61 +339,6 @@ function installUv() {
}
}
/**
* Add shell alias for claude-mem command
*/
function installCLI() {
const WORKER_CLI = join(ROOT, 'scripts', 'worker-service.cjs');
const bunPath = getBunPath() || 'bun';
const aliasLine = `alias claude-mem='${bunPath} "${WORKER_CLI}"'`;
const markerPath = join(ROOT, '.cli-installed');
// Skip if already installed
if (existsSync(markerPath)) return;
try {
if (IS_WINDOWS) {
// Windows: Add to PATH via PowerShell profile
const profilePath = join(process.env.USERPROFILE || homedir(), 'Documents', 'PowerShell', 'Microsoft.PowerShell_profile.ps1');
const profileDir = join(process.env.USERPROFILE || homedir(), 'Documents', 'PowerShell');
const functionDef = `function claude-mem { & "${bunPath}" "${WORKER_CLI}" $args }\n`;
if (!existsSync(profileDir)) {
execSync(`mkdir "${profileDir}"`, { stdio: 'ignore', shell: true });
}
const existingContent = existsSync(profilePath) ? readFileSync(profilePath, 'utf-8') : '';
if (!existingContent.includes('function claude-mem')) {
writeFileSync(profilePath, existingContent + '\n' + functionDef);
console.error(`✅ PowerShell function added to profile`);
console.error(' Restart your terminal to use: claude-mem <command>');
}
} else {
// Unix: Add alias to shell configs
const shellConfigs = [
join(homedir(), '.bashrc'),
join(homedir(), '.zshrc')
];
for (const config of shellConfigs) {
if (existsSync(config)) {
const content = readFileSync(config, 'utf-8');
if (!content.includes('alias claude-mem=')) {
writeFileSync(config, content + '\n' + aliasLine + '\n');
console.error(`✅ Alias added to ${config}`);
}
}
}
console.error(' Restart your terminal to use: claude-mem <command>');
}
writeFileSync(markerPath, new Date().toISOString());
} catch (error) {
console.error(`⚠️ Could not add shell alias: ${error.message}`);
console.error(` Use directly: ${bunPath} "${WORKER_CLI}" <command>`);
}
}
/**
* Check if dependencies need to be installed
*/
@@ -629,8 +574,8 @@ try {
// Worker will be started fresh by next hook in chain (worker-service.cjs start)
}
// Step 4: Install CLI to PATH
installCLI();
// Step 4 (removed in #2054): legacy `claude-mem` shell alias was deleted.
// Users invoke the CLI via `npx claude-mem <cmd>` or `bunx claude-mem <cmd>`.
// Step 5: Warn if the bundled native binary is incompatible with this platform
checkBinaryPlatformCompatibility();
File diff suppressed because one or more lines are too long
+15 -7
View File
@@ -20,7 +20,15 @@ Use when users ask for:
## Prerequisites
The claude-mem worker must be running on localhost:37777. The project must have claude-mem observations recorded.
The claude-mem worker must be running. The project must have claude-mem observations recorded.
**Resolve the worker port** (do this once at the start and reuse `$WORKER_PORT` in every curl call below):
```bash
WORKER_PORT="${CLAUDE_MEM_WORKER_PORT:-$(node -e "const fs=require('fs'),p=require('path'),os=require('os');const uid=(typeof process.getuid==='function'?process.getuid():77);const fallback=String(37700+(uid%100));try{const s=JSON.parse(fs.readFileSync(p.join(os.homedir(),'.claude-mem','settings.json'),'utf-8'));process.stdout.write(String(s.CLAUDE_MEM_WORKER_PORT||fallback));}catch{process.stdout.write(fallback);}" 2>/dev/null)}"
```
This honors `CLAUDE_MEM_WORKER_PORT` env, then `~/.claude-mem/settings.json`, then falls back to the per-UID default `37700 + (uid % 100)` — matching how the worker itself picks its port. Required for multi-account setups (#2101) and any user who has overridden the default port (#2103).
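Assuming the worker follows the same scheme described above, the resolution order can be sketched in plain Node (illustrative names, not the worker's actual code; the `settings.json` layer is omitted because the bash one-liner above already covers it):

```javascript
// Illustrative sketch of the port-resolution order: an explicit env
// override wins, otherwise fall back to the per-UID default.
function resolveWorkerPort(envValue, uid) {
  const parsed = envValue ? Number.parseInt(envValue.trim(), 10) : NaN;
  if (Number.isInteger(parsed) && parsed >= 1 && parsed <= 65535) {
    return parsed; // explicit override wins
  }
  return 37700 + (uid % 100); // per-UID default keeps multi-account workers apart
}
```

Two accounts with uids 501 and 502 thus default to ports 37701 and 37702, so their workers never collide.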
## Workflow
@@ -50,7 +58,7 @@ If a worktree is detected, use `$parent_project` (the basename of the parent rep
Use Bash to fetch the complete timeline from the claude-mem worker API:
```bash
curl -s "http://localhost:37777/api/context/inject?project=PROJECT_NAME&full=true"
curl -s "http://localhost:${WORKER_PORT}/api/context/inject?project=PROJECT_NAME&full=true"
```
This returns the entire compressed timeline -- every observation, session boundary, and summary across the project's full history. The response is pre-formatted markdown optimized for LLM consumption.
@@ -60,7 +68,7 @@ This returns the entire compressed timeline -- every observation, session bounda
- Medium project (1,000-10,000 observations): ~50-300K tokens
- Large project (10,000-35,000 observations): ~300-750K tokens
If the response is empty or returns an error, the worker may not be running or the project name may be wrong. Try `curl -s "http://localhost:37777/api/search?query=*&limit=1"` to verify the worker is healthy.
If the response is empty or returns an error, the worker may not be running or the project name may be wrong. Try `curl -s "http://localhost:${WORKER_PORT}/api/search?query=*&limit=1"` to verify the worker is healthy.
### Step 3: Estimate Token Count
@@ -187,15 +195,15 @@ Tell the user:
## Error Handling
- **Empty timeline:** "No observations found for project 'X'. Check the project name with: `curl -s 'http://localhost:37777/api/search?query=*&limit=1'`"
- **Worker not running:** "The claude-mem worker is not responding on port 37777. Start it with your usual method or check `ps aux | grep worker-service`."
- **Timeline too large:** For projects with 50,000+ observations, the timeline may exceed context limits. Suggest using date range filtering: `curl -s "http://localhost:37777/api/context/inject?project=X&full=true"` -- the current endpoint returns all observations; for extremely large projects, the user may want to analyze in time-windowed segments.
- **Empty timeline:** "No observations found for project 'X'. Check the project name with: `curl -s \"http://localhost:${WORKER_PORT}/api/search?query=*&limit=1\"`"
- **Worker not running:** "The claude-mem worker is not responding on port ${WORKER_PORT}. Start it with your usual method or check `ps aux | grep worker-service`."
- **Timeline too large:** For projects with 50,000+ observations, the timeline may exceed context limits. The current endpoint (`curl -s "http://localhost:${WORKER_PORT}/api/context/inject?project=X&full=true"`) always returns all observations and does not support date filters; for extremely large projects, suggest analyzing the history in time-windowed segments instead.
## Example
User: "Write a journey report for the tokyo project"
1. Fetch: `curl -s "http://localhost:37777/api/context/inject?project=tokyo&full=true"`
1. Fetch: `curl -s "http://localhost:${WORKER_PORT}/api/context/inject?project=tokyo&full=true"`
2. Estimate: "Timeline fetched: ~34,722 observations, estimated ~718K tokens. Proceed?"
3. User confirms
4. Deploy analysis agent with full timeline
File diff suppressed because one or more lines are too long
+11 -1
View File
@@ -101,6 +101,11 @@ async function buildHooks() {
description: 'Runtime dependencies for claude-mem bundled hooks',
type: 'module',
dependencies: {
// Externalized from mcp-server.cjs to avoid Zod version conflicts when
// OpenCode's Bun bundler assembles hook scripts (#2113). MCP SDK
// transitively imports Zod; loading it via node_modules at runtime
// ensures OpenCode controls the version.
'zod': '^4.3.6',
'tree-sitter-cli': '^0.26.5',
'tree-sitter-c': '^0.24.1',
'tree-sitter-cpp': '^0.23.4',
@@ -202,6 +207,11 @@ async function buildHooks() {
logLevel: 'error',
external: [
'bun:sqlite',
// Externalize Zod to avoid version conflicts when OpenCode's Bun bundler
// assembles hook scripts (see #2113). The MCP server transitively imports
// Zod via @modelcontextprotocol/sdk; bundling it caused two Zod versions
// to coexist at runtime and the v4 ↔ v3 _zod.def access crashed.
'zod',
'tree-sitter-cli',
'tree-sitter-javascript',
'tree-sitter-typescript',
@@ -291,7 +301,7 @@ async function buildHooks() {
outfile: `${hooksDir}/${CONTEXT_GENERATOR.name}.cjs`,
minify: true,
logLevel: 'error',
external: ['bun:sqlite'],
external: ['bun:sqlite', 'zod'],
define: {
'__DEFAULT_PACKAGE_VERSION__': `"${version}"`
},
+17 -33
View File
@@ -106,8 +106,7 @@ function deduplicateObservations(
function formatFileTimeline(
observations: ObservationRow[],
filePath: string,
truncated: boolean
filePath: string
): string {
// Escape filePath for safe interpolation into recovery hints (quotes, backslashes, newlines)
const safePath = filePath.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
@@ -138,17 +137,14 @@ function formatFileTimeline(
}).toLowerCase().replace(' ', '');
const currentTimezone = now.toLocaleTimeString('en-US', { timeZoneName: 'short' }).split(' ').pop();
const headerLine = truncated
? `This file has prior observations. Only line 1 was read to save tokens.`
: `This file has prior observations. The requested section was read normally.`;
// The hook never modifies the Read call (#2094) — Claude always sees the
// full requested section. The timeline below is supplementary priming, not
// a replacement for the file contents.
const lines: string[] = [
`Current: ${currentDate} ${currentTime} ${currentTimezone}`,
headerLine,
`- **Already know enough?** The timeline below may be all you need (semantic priming).`,
`- **Need details?** get_observations([IDs]) — ~300 tokens each.`,
`- **Need full file?** Read again with offset/limit for the section you need.`,
`- **Need to edit?** Edit works — the file is registered as read. Use smart_outline("${safePath}") for line numbers.`,
`This file has prior observations — supplementary context follows. The Read result below is the full requested section.`,
`- **Need details on a past observation?** get_observations([IDs]) — ~300 tokens each.`,
`- **Need a structural map first?** smart_outline("${safePath}") — line numbers only, cheaper than re-reading.`,
];
for (const [day, dayObservations] of sortedDays) {
@@ -176,15 +172,8 @@ export const fileContextHandler: EventHandler = {
return { continue: true, suppressOutput: true };
}
// Preserve user-supplied offset/limit to avoid read-dedup collisions (fixes #1719)
const userOffset = typeof toolInput?.offset === 'number' && Number.isFinite(toolInput.offset) && toolInput.offset >= 0
? Math.floor(toolInput.offset) : undefined;
const userLimit = typeof toolInput?.limit === 'number' && Number.isFinite(toolInput.limit) && toolInput.limit > 0
? Math.floor(toolInput.limit) : undefined;
const isTargetedRead = userOffset !== undefined || userLimit !== undefined;
// Stat the file once: size (gate) + mtime (cache invalidation).
// 0 = stat failed non-fatally (e.g. EPERM) — skip mtime check, fall through to truncation.
// 0 = stat failed non-fatally (e.g. EPERM) — skip mtime check, fall through to context injection.
let fileMtimeMs = 0;
try {
const statPath = path.isAbsolute(filePath)
@@ -241,12 +230,12 @@ export const fileContextHandler: EventHandler = {
return { continue: true, suppressOutput: true };
}
// mtime invalidation: bypass truncation when the file is newer than the latest observation.
// Uses >= to handle same-millisecond edits (cost: one extra full read vs risk of stuck truncation).
// mtime invalidation: skip the timeline injection when the file is newer than the latest
// observation — past observations are stale and adding them risks misleading the model.
if (fileMtimeMs > 0) {
const newestObservationMs = Math.max(...data.observations.map(o => o.created_at_epoch));
if (fileMtimeMs >= newestObservationMs) {
logger.debug('HOOK', 'File modified since last observation, skipping truncation', {
logger.debug('HOOK', 'File modified since last observation, skipping context injection', {
filePath: relativePath,
fileMtimeMs,
newestObservationMs,
@@ -261,23 +250,18 @@ export const fileContextHandler: EventHandler = {
return { continue: true, suppressOutput: true };
}
// Unconstrained → truncate to 1 line; targeted → preserve offset/limit.
const truncated = !isTargetedRead;
const timeline = formatFileTimeline(dedupedObservations, filePath, truncated);
const updatedInput: Record<string, unknown> = { file_path: filePath };
if (isTargetedRead) {
if (userOffset !== undefined) updatedInput.offset = userOffset;
if (userLimit !== undefined) updatedInput.limit = userLimit;
} else {
updatedInput.limit = 1;
}
// #2094: never modify the Read call. Returning `updatedInput` with `limit: 1` previously
// truncated unconstrained reads, leaving Claude with a stale 1-line snapshot in context
// while the timeline told it not to re-read. Subsequent Edit calls then deadlocked because
// Claude Code's read-state tracker reported the file as "read" but the actual content was
// missing. The hook now only injects supplementary context — the Read proceeds unmodified.
const timeline = formatFileTimeline(dedupedObservations, filePath);
return {
hookSpecificOutput: {
hookEventName: 'PreToolUse',
additionalContext: timeline,
permissionDecision: 'allow',
updatedInput,
},
};
},
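The mtime-invalidation rule in this hunk reduces to a small predicate; a hedged sketch with a hypothetical helper name, not code from the handler:

```javascript
// Inject the timeline only when the observations are at least as fresh as
// the file on disk. fileMtimeMs <= 0 encodes "stat failed non-fatally",
// which falls through to injection, matching the handler's comment.
function shouldInjectTimeline(fileMtimeMs, observationEpochsMs) {
  if (observationEpochsMs.length === 0) return false; // nothing to inject
  if (fileMtimeMs <= 0) return true;                  // stat failed: fall through
  const newestObservationMs = Math.max(...observationEpochsMs);
  return fileMtimeMs < newestObservationMs;           // >= means observations are stale
}
```

The `>=` comparison (rather than `>`) handles same-millisecond edits at the cost of occasionally skipping one injection.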
+18 -2
View File
@@ -6,6 +6,16 @@
// Solution: JSON is self-delimiting. We detect complete JSON by attempting
// to parse after each chunk. Once we have valid JSON, we resolve immediately
// without waiting for EOF. This is the proper fix, not a timeout workaround.
//
// Resolve/reject contract:
// - Resolves with parsed JSON value when stdin yields valid JSON.
// - Resolves with `undefined` when stdin is unavailable, closes empty,
// or emits a stream error.
// - Rejects with an Error when stdin closes (or the safety timeout fires)
// after non-empty bytes that never form valid JSON. Malformed input is
// a handler/client bug — surfacing it lets the upstream exit-code
// strategy treat it as a blocking error (exit 2) rather than silently
// proceeding as if no input was given. (#2089)
import { logger } from '../utils/logger.js';
@@ -157,8 +167,14 @@ export async function readJsonFromStdin(): Promise<unknown> {
// stdin closed - parse whatever we have
if (!resolved) {
if (!tryResolveWithJson()) {
// Empty or invalid - resolve with undefined
resolveWith(input.trim() ? undefined : undefined);
// Mirror the safety-timeout semantics (#2089):
// non-empty bytes that never parsed = malformed input, surface it.
// Empty stdin = "no input given", resolve undefined.
if (input.trim()) {
rejectWith(new Error(`Malformed JSON at stdin EOF: ${input.slice(0, 100)}...`));
} else {
resolveWith(undefined);
}
}
}
};
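The parse-after-each-chunk idea the header comment describes can be sketched as follows (illustrative, not the module's actual internals):

```javascript
// Accumulate stdin chunks and attempt a parse after each one; the first
// successful parse means the JSON value is complete, with no need to wait
// for EOF. (Caveat the real module must still handle: a bare number like
// `12` is not self-delimiting, so EOF handling remains necessary.)
function makeJsonAccumulator() {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    try {
      return { done: true, value: JSON.parse(buffer) };
    } catch {
      return { done: false }; // incomplete (or malformed) so far, keep reading
    }
  };
}
```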
+18 -1
View File
@@ -94,7 +94,24 @@ interface SessionDeletedEvent {
// Constants
// ============================================================================
const WORKER_BASE_URL = "http://127.0.0.1:37777";
/**
* Resolve the worker port matching SettingsDefaultsManager's algorithm:
* process.env.CLAUDE_MEM_WORKER_PORT, else 37700 + (uid % 100).
* Required for multi-account isolation (#2101) and so this plugin talks to
* the same worker the rest of claude-mem (hooks, npx-cli) connects to.
* Inlined rather than imported to keep this OpenCode plugin standalone.
*/
function resolveWorkerPort(): string {
const fromEnv = process.env.CLAUDE_MEM_WORKER_PORT;
const parsed = fromEnv ? Number.parseInt(fromEnv.trim(), 10) : NaN;
if (Number.isInteger(parsed) && parsed >= 1 && parsed <= 65535) {
return String(parsed);
}
const uid = typeof process.getuid === "function" ? process.getuid() : 77;
return String(37700 + (uid % 100));
}
const WORKER_BASE_URL = `http://127.0.0.1:${resolveWorkerPort()}`;
const MAX_TOOL_RESPONSE_LENGTH = 1000;
// ============================================================================
+55 -5
View File
@@ -55,6 +55,8 @@ import {
writeJsonFileAtomic,
} from '../utils/paths.js';
import { readJsonSafe } from '../../utils/json-utils.js';
import { SettingsDefaultsManager } from '../../shared/SettingsDefaultsManager.js';
import { shutdownWorkerAndWait } from '../../services/install/shutdown-helper.js';
import { detectInstalledIDEs } from './ide-detection.js';
// ---------------------------------------------------------------------------
@@ -272,9 +274,9 @@ async function promptForIDESelection(): Promise<string[]> {
const result = await p.multiselect({
message: 'Which IDEs do you use?',
options,
initialValues: detected
.filter((ide) => ide.supported)
.map((ide) => ide.id),
// No pre-selection — users must explicitly opt in to each IDE so we
// never wire up an integration the user did not actually request (#2106).
initialValues: [],
required: true,
});
@@ -458,6 +460,19 @@ export async function runInstallCommand(options: InstallOptions = {}): Promise<v
const needsManualInstall = selectedIDEs.some((id) => id !== 'claude-code');
if (needsManualInstall) {
// Shut down any running worker FIRST so it isn't holding open file
// handles when we overwrite plugin files (#2106 item 3). Best-effort:
// helper swallows its own errors when no worker is running.
const installPort = SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT');
try {
const result = await shutdownWorkerAndWait(installPort, 10000);
if (result.workerWasRunning) {
log.info('Stopped running worker before overwrite.');
}
} catch (error: unknown) {
console.warn('[install] Pre-overwrite worker shutdown failed:', error instanceof Error ? error.message : String(error));
}
await runTasks([
{
title: 'Copying plugin files',
@@ -542,12 +557,47 @@ export async function runInstallCommand(options: InstallOptions = {}): Promise<v
summaryLines.forEach(l => console.log(` ${l}`));
}
const workerPort = process.env.CLAUDE_MEM_WORKER_PORT || '37777';
// Resolve port via SettingsDefaultsManager so CLAUDE_MEM_WORKER_PORT env
// takes priority and the per-UID default (37700 + uid % 100) is used
// otherwise. Required for multi-account isolation (#2101).
const workerPort = SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT');
// Probe the actually-bound port (#2106 item 6). smart-install just
// started the worker; if it's reachable we report the real port the
// worker bound to. If the probe fails, the worker is still spinning
// up — say so plainly and exit cleanly. Don't loop, don't block.
let actualPort: number | string = workerPort;
let workerReady = false;
try {
const healthResponse = await fetch(`http://127.0.0.1:${workerPort}/api/health`, {
signal: AbortSignal.timeout(3000),
});
if (healthResponse.ok) {
workerReady = true;
try {
const body = await healthResponse.json() as { port?: number | string };
if (body && (typeof body.port === 'number' || typeof body.port === 'string')) {
actualPort = body.port;
}
} catch {
// Health endpoint returned non-JSON — keep using the requested port.
}
}
} catch {
// Health probe failed — worker may still be starting.
}
const portLine = workerReady
? `Worker port: ${pc.cyan(String(actualPort))}`
: `Worker port: ${pc.cyan(String(workerPort))} (worker not yet ready -- still starting up; check ${pc.bold('claude-mem status')} later)`;
const nextSteps = [
'Open Claude Code and start a conversation -- memory is automatic!',
`View your memories: ${pc.underline(`http://localhost:${workerPort}`)}`,
portLine,
`View your memories: ${pc.underline(`http://localhost:${actualPort}`)}`,
`Search past work: use ${pc.bold('/mem-search')} in Claude Code`,
`Start worker: ${pc.bold('npx claude-mem start')}`,
`Note: Close all Claude Code sessions before uninstalling, or ${pc.cyan('~/.claude-mem')} will be recreated by active hooks.`,
];
if (isInteractive) {
+14 -1
View File
@@ -12,6 +12,7 @@ import { join } from 'path';
import pc from 'picocolors';
import { resolveBunBinaryPath } from '../utils/bun-resolver.js';
import { isPluginInstalled, marketplaceDirectory } from '../utils/paths.js';
import { SettingsDefaultsManager } from '../../shared/SettingsDefaultsManager.js';
// ---------------------------------------------------------------------------
// Installation guard
@@ -139,6 +140,15 @@ export function runAdoptCommand(extraArgs: string[] = []): void {
});
}
/**
* Run the one-time v12.4.3 pollution cleanup, or preview it via --dry-run.
* Delegates to the worker-service.cjs `cleanup` subcommand so the scan and
* (optional) deletion run in Bun (needed for bun:sqlite). (#2126 item 5)
*/
export function runCleanupCommand(extraArgs: string[] = []): void {
spawnBunWorkerCommand('cleanup', extraArgs);
}
/**
* Search the worker API at `GET /api/search?query=<query>`.
*/
@@ -151,7 +161,10 @@ export async function runSearchCommand(queryParts: string[]): Promise<void> {
process.exit(1);
}
const workerPort = process.env.CLAUDE_MEM_WORKER_PORT || '37777';
// Resolve port via SettingsDefaultsManager so CLAUDE_MEM_WORKER_PORT env
// takes priority and the per-UID default (37700 + uid % 100) is used
// otherwise. Required for multi-account isolation (#2101).
const workerPort = SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT');
const searchUrl = `http://127.0.0.1:${workerPort}/api/search?query=${encodeURIComponent(query)}`;
let response: Response;
+161 -23
View File
@@ -9,7 +9,8 @@
*/
import * as p from '@clack/prompts';
import pc from 'picocolors';
import { existsSync, rmSync } from 'fs';
import { existsSync, readFileSync, readdirSync, rmSync, writeFileSync } from 'fs';
import { homedir } from 'os';
import { join } from 'path';
import {
claudeSettingsPath,
@@ -21,6 +22,8 @@ import {
writeJsonFileAtomic,
} from '../utils/paths.js';
import { readJsonSafe } from '../../utils/json-utils.js';
import { SettingsDefaultsManager } from '../../shared/SettingsDefaultsManager.js';
import { shutdownWorkerAndWait } from '../../services/install/shutdown-helper.js';
// ---------------------------------------------------------------------------
// Cleanup helpers
@@ -60,6 +63,48 @@ function removeFromInstalledPlugins(): void {
}
}
/**
 * Strip the legacy `claude-mem` shell alias from common shell rc files
 * (#2054). The alias used to be added by `installCLI()` in smart-install.js;
 * that function was deleted, but existing users still have the line. This is
 * a one-time, best-effort, idempotent cleanup (a no-op if the line is
 * absent) that matches only lines BEGINNING with `alias claude-mem=`, so
 * unrelated code is never mangled; multi-line `function claude-mem`
 * declarations are left for the user to remove manually.
*/
function stripLegacyClaudeMemAlias(): void {
const home = homedir();
const candidateFiles = [
join(home, '.bashrc'),
join(home, '.zshrc'),
join(home, 'Documents', 'PowerShell', 'Microsoft.PowerShell_profile.ps1'),
];
// Only strip simple aliases. A function declaration would span multiple
// lines and can't be safely removed by a line filter — leave it for the
// user to remove manually.
const aliasLineRegex = /^\s*alias\s+claude-mem\s*=/;
for (const filePath of candidateFiles) {
if (!existsSync(filePath)) continue;
let content: string;
try {
content = readFileSync(filePath, 'utf-8');
} catch (error: unknown) {
console.warn(`[uninstall] Could not read ${filePath}:`, error instanceof Error ? error.message : String(error));
continue;
}
const lines = content.split('\n');
const filtered = lines.filter((line) => !aliasLineRegex.test(line));
if (filtered.length === lines.length) continue; // no match — leave file untouched
try {
writeFileSync(filePath, filtered.join('\n'));
console.error(`Removed legacy claude-mem alias from ${filePath}`);
} catch (error: unknown) {
console.warn(`[uninstall] Could not rewrite ${filePath}:`, error instanceof Error ? error.message : String(error));
}
}
}
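The line filter above is deliberately narrow; these checks illustrate which lines it does and does not touch:

```javascript
// Same pattern as stripLegacyClaudeMemAlias: only lines that BEGIN with a
// simple alias definition are dropped; comments and function declarations
// pass through untouched.
const aliasLineRegex = /^\s*alias\s+claude-mem\s*=/;
const keepLine = (line) => !aliasLineRegex.test(line);
```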
function removeFromClaudeSettings(): void {
const settings = readJsonSafe<Record<string, any>>(claudeSettingsPath(), {});
if (settings.enabledPlugins?.['claude-mem@thedotmack'] !== undefined) {
@@ -68,6 +113,90 @@ function removeFromClaudeSettings(): void {
}
}
/**
* Best-effort cleanup of stray claude-mem residue (#2106 item 4) that
* accumulates outside of `~/.claude/plugins/marketplaces/thedotmack/`:
*
* - `~/.npm/_npx/<hash>/node_modules/claude-mem` (npx install caches)
* - `~/.cache/claude-cli-nodejs/<project>/mcp-logs-plugin-claude-mem-*`
* - `~/.claude/plugins/data/claude-mem-thedotmack/`
*
 * Each step is wrapped in its own try/catch: a failure on one path
 * (e.g. permission denied on a single npx hash dir) must not abort
 * the rest. We log the failure and continue.
*
* Returns the count of paths actually removed (purely for reporting).
*/
function removeStrayClaudeMemPaths(): number {
const home = homedir();
let removedCount = 0;
// 1. ~/.npm/_npx/*/node_modules/claude-mem
const npxRoot = join(home, '.npm', '_npx');
if (existsSync(npxRoot)) {
let hashDirs: string[] = [];
try {
hashDirs = readdirSync(npxRoot);
} catch (error: unknown) {
console.warn(`[uninstall] Could not read ${npxRoot}:`, error instanceof Error ? error.message : String(error));
}
for (const hashDir of hashDirs) {
const candidate = join(npxRoot, hashDir, 'node_modules', 'claude-mem');
if (!existsSync(candidate)) continue;
try {
rmSync(candidate, { recursive: true, force: true });
removedCount++;
} catch (error: unknown) {
console.warn(`[uninstall] Could not remove ${candidate}:`, error instanceof Error ? error.message : String(error));
}
}
}
// 2. ~/.cache/claude-cli-nodejs/*/mcp-logs-plugin-claude-mem-*
const cacheRoot = join(home, '.cache', 'claude-cli-nodejs');
if (existsSync(cacheRoot)) {
let projectDirs: string[] = [];
try {
projectDirs = readdirSync(cacheRoot);
} catch (error: unknown) {
console.warn(`[uninstall] Could not read ${cacheRoot}:`, error instanceof Error ? error.message : String(error));
}
for (const projectDir of projectDirs) {
const projectPath = join(cacheRoot, projectDir);
let logEntries: string[] = [];
try {
logEntries = readdirSync(projectPath);
} catch (error: unknown) {
console.warn(`[uninstall] Could not read ${projectPath}:`, error instanceof Error ? error.message : String(error));
continue;
}
for (const entry of logEntries) {
if (!entry.startsWith('mcp-logs-plugin-claude-mem-')) continue;
const logPath = join(projectPath, entry);
try {
rmSync(logPath, { recursive: true, force: true });
removedCount++;
} catch (error: unknown) {
console.warn(`[uninstall] Could not remove ${logPath}:`, error instanceof Error ? error.message : String(error));
}
}
}
}
// 3. ~/.claude/plugins/data/claude-mem-thedotmack/
const pluginDataDir = join(home, '.claude', 'plugins', 'data', 'claude-mem-thedotmack');
if (existsSync(pluginDataDir)) {
try {
rmSync(pluginDataDir, { recursive: true, force: true });
removedCount++;
} catch (error: unknown) {
console.warn(`[uninstall] Could not remove ${pluginDataDir}:`, error instanceof Error ? error.message : String(error));
}
}
return removedCount;
}
// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------
@@ -105,30 +234,23 @@ export async function runUninstallCommand(): Promise<void> {
}
}
// Stop the worker and wait for it to exit before deleting files
const workerPort = process.env.CLAUDE_MEM_WORKER_PORT || '37777';
// Stop the worker and wait for it to exit before deleting files.
// Resolve port via SettingsDefaultsManager so CLAUDE_MEM_WORKER_PORT env
// takes priority and the per-UID default (37700 + uid % 100) is used
// otherwise. Required for multi-account isolation (#2101).
//
// The worker's graceful shutdown also stops chroma-mcp via
// GracefulShutdown -> ChromaMcpManager.stop(), so this single call
// cascades to the chroma-mcp subprocess as well.
const workerPort = SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT');
try {
await fetch(`http://127.0.0.1:${workerPort}/api/admin/shutdown`, {
method: 'POST',
signal: AbortSignal.timeout(5000),
});
// Poll health endpoint until worker is gone (max 10s)
for (let attempt = 0; attempt < 20; attempt++) {
await new Promise((resolve) => setTimeout(resolve, 500));
try {
await fetch(`http://127.0.0.1:${workerPort}/api/health`, {
signal: AbortSignal.timeout(1000),
});
// Still alive — keep waiting
} catch (error: unknown) {
// Connection refused = worker is gone (expected shutdown behavior)
console.error('[uninstall] Worker health check failed (worker stopped):', error instanceof Error ? error.message : String(error));
break;
}
const result = await shutdownWorkerAndWait(workerPort, 10000);
if (result.workerWasRunning) {
p.log.info('Worker service stopped.');
}
p.log.info('Worker service stopped.');
} catch {
// Worker may not be running — that is fine
} catch (error: unknown) {
// shutdownWorkerAndWait swallows its own errors, but guard anyway.
console.warn('[uninstall] Worker shutdown attempt failed:', error instanceof Error ? error.message : String(error));
}
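`shutdownWorkerAndWait` is imported from shutdown-helper.ts and its real signature may differ; a minimal sketch of the shape such a helper can take (POST the shutdown endpoint, then poll health until the connection is refused or the deadline passes) is:

```javascript
// Hypothetical sketch of a shutdown-and-wait helper; `fetchImpl` is
// injected so the polling logic can be exercised without a live worker.
async function shutdownWorkerAndWait(port, timeoutMs, fetchImpl = fetch) {
  try {
    await fetchImpl(`http://127.0.0.1:${port}/api/admin/shutdown`, { method: 'POST' });
  } catch {
    return { workerWasRunning: false }; // nothing listening on the port
  }
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      await fetchImpl(`http://127.0.0.1:${port}/api/health`);
      await new Promise((resolve) => setTimeout(resolve, 100)); // still alive
    } catch {
      break; // connection refused: the worker has exited
    }
  }
  return { workerWasRunning: true };
}
```

Swallowing the initial POST failure is what lets the install and uninstall call sites treat "no worker running" as a non-error.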
await p.tasks([
@@ -171,6 +293,22 @@ export async function runUninstallCommand(): Promise<void> {
return `Claude settings updated ${pc.green('OK')}`;
},
},
{
title: 'Removing legacy claude-mem shell alias',
task: async () => {
stripLegacyClaudeMemAlias();
return `Legacy alias check complete ${pc.green('OK')}`;
},
},
{
title: 'Removing stray claude-mem caches and logs',
task: async () => {
const removed = removeStrayClaudeMemPaths();
return removed > 0
? `Stray paths removed: ${removed} ${pc.green('OK')}`
: `No stray paths found ${pc.dim('skipped')}`;
},
},
]);
// Remove IDE-specific hooks and config (best-effort, each is independent)
+8
View File
@@ -53,6 +53,7 @@ ${pc.bold('Runtime Commands')} (requires Bun, delegates to installed plugin):
${pc.cyan('npx claude-mem status')} Show worker status
${pc.cyan('npx claude-mem search <query>')} Search observations
${pc.cyan('npx claude-mem adopt [--dry-run] [--branch <name>]')} Stamp merged worktrees into parent project
${pc.cyan('npx claude-mem cleanup [--dry-run]')} Run one-time v12.4.3 pollution cleanup (or preview counts)
${pc.cyan('npx claude-mem transcript watch')} Start transcript watcher
${pc.bold('IDE Identifiers')}:
@@ -153,6 +154,13 @@ async function main(): Promise<void> {
break;
}
// -- One-time v12.4.3 cleanup ------------------------------------------
case 'cleanup': {
const { runCleanupCommand } = await import('./commands/runtime.js');
runCleanupCommand(args.slice(1));
break;
}
// -- Transcript --------------------------------------------------------
case 'transcript': {
const subCommand = args[1]?.toLowerCase();
+63 -3
View File
@@ -50,29 +50,52 @@ interface MarkerPayload {
* the marker file ensures the work runs at most once per data directory.
*
* @param dataDirectory - Override for DATA_DIR (used in tests)
* @param options.dryRun - When true, scans + reports counts but performs NO
* DB writes, NO backup, NO chroma wipe, and does NOT write the marker.
* Used by `claude-mem cleanup --dry-run` to preview what would happen
* without mutating user state. (#2126 item 5)
*/
export function runOneTimeV12_4_3Cleanup(dataDirectory?: string): void {
export function runOneTimeV12_4_3Cleanup(
dataDirectory?: string,
options: { dryRun?: boolean } = {},
): CleanupCounts | undefined {
const dryRun = options.dryRun === true;
const effectiveDataDir = dataDirectory ?? DATA_DIR;
const markerPath = path.join(effectiveDataDir, MARKER_FILENAME);
if (existsSync(markerPath)) {
if (existsSync(markerPath) && !dryRun) {
logger.debug('SYSTEM', 'v12.4.3 cleanup marker exists, skipping');
return;
}
if (process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 === '1') {
if (process.env.CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 === '1' && !dryRun) {
logger.warn('SYSTEM', 'v12.4.3 cleanup skipped via CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1; marker not written');
return;
}
const dbPath = path.join(effectiveDataDir, 'claude-mem.db');
if (!existsSync(dbPath)) {
if (dryRun) {
logger.info('SYSTEM', 'v12.4.3 cleanup --dry-run: no DB present, nothing to scan', { dbPath });
return emptyCounts();
}
mkdirSync(effectiveDataDir, { recursive: true });
writeMarker(markerPath, { appliedAt: new Date().toISOString(), backupPath: null, chromaWiped: false, counts: emptyCounts(), skipped: 'no-db' });
logger.debug('SYSTEM', 'No DB present, v12.4.3 cleanup marker written without work', { dbPath });
return;
}
if (dryRun) {
logger.info('SYSTEM', 'Running v12.4.3 cleanup --dry-run (read-only scan, no writes)', { dbPath });
try {
return scanCleanupCounts(dbPath);
} catch (err: unknown) {
const error = err instanceof Error ? err : new Error(String(err));
logger.error('SYSTEM', 'v12.4.3 cleanup --dry-run scan failed', {}, error);
return undefined;
}
}
logger.warn('SYSTEM', 'Running one-time v12.4.3 pollution cleanup', { dbPath });
try {
@@ -83,6 +106,43 @@ export function runOneTimeV12_4_3Cleanup(dataDirectory?: string): void {
}
}
/**
* Read-only scan: count what runOneTimeV12_4_3Cleanup *would* delete.
* Mirrors the COUNT(*) queries from runObserverSessionsPurge and
 * runStuckPendingPurge. Opens the DB read-only and never mutates.
*/
function scanCleanupCounts(dbPath: string): CleanupCounts {
const counts = emptyCounts();
const db = new Database(dbPath, { readonly: true });
try {
counts.observerSessions = (
db.prepare(`SELECT COUNT(*) AS n FROM sdk_sessions WHERE project = ?`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }
).n;
counts.observerCascadeRows =
(db.prepare(`SELECT COUNT(*) AS n FROM user_prompts WHERE content_session_id IN (SELECT content_session_id FROM sdk_sessions WHERE project = ?)`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n
+ (db.prepare(`SELECT COUNT(*) AS n FROM observations WHERE memory_session_id IN (SELECT memory_session_id FROM sdk_sessions WHERE project = ? AND memory_session_id IS NOT NULL)`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n
+ (db.prepare(`SELECT COUNT(*) AS n FROM session_summaries WHERE memory_session_id IN (SELECT memory_session_id FROM sdk_sessions WHERE project = ? AND memory_session_id IS NOT NULL)`).get(OBSERVER_SESSIONS_PROJECT) as { n: number }).n;
counts.stuckPendingMessages = (db.prepare(
`SELECT COUNT(*) AS n FROM pending_messages
WHERE status IN ('failed', 'processing')
AND session_db_id IN (
SELECT session_db_id FROM pending_messages
WHERE status IN ('failed', 'processing')
GROUP BY session_db_id
HAVING COUNT(*) >= ?
)`
).get(STUCK_PENDING_THRESHOLD) as { n: number }).n;
} finally {
db.close();
}
logger.info('SYSTEM', 'v12.4.3 cleanup --dry-run scan complete', {
observerSessions: counts.observerSessions,
observerCascadeRows: counts.observerCascadeRows,
stuckPendingMessages: counts.stuckPendingMessages,
});
return counts;
}
function executeCleanup(dbPath: string, effectiveDataDir: string, markerPath: string): void {
const dbSize = statSync(dbPath).size;
const required = Math.ceil(dbSize * 1.2) + 100 * 1024 * 1024;
+22 -250
@@ -541,191 +541,6 @@ export async function cleanupOrphanedProcesses(): Promise<void> {
logger.info('SYSTEM', 'Orphaned processes cleaned up', { count: pidsToKill.length });
}
// Patterns that should be killed immediately at startup (no age gate)
// These are child processes that should not outlive their parent worker
const AGGRESSIVE_CLEANUP_PATTERNS = ['worker-service.cjs', 'chroma-mcp'];
// Patterns that keep the age-gated threshold (may be legitimately running)
const AGE_GATED_CLEANUP_PATTERNS = ['mcp-server.cjs'];
/**
* Enumerate processes for aggressive startup cleanup. Aggressive patterns are
* killed immediately; age-gated patterns only if older than ORPHAN_MAX_AGE_MINUTES.
*/
async function enumerateAggressiveCleanupProcesses(
isWindows: boolean,
currentPid: number,
protectedPids: Set<number>,
allPatterns: string[]
): Promise<number[]> {
const pidsToKill: number[] = [];
if (isWindows) {
// Use WQL -Filter for server-side filtering (no $_ pipeline syntax).
// Avoids Git Bash $_ interpretation (#1062) and PowerShell syntax errors (#1024).
const wqlPatternConditions = allPatterns
.map(p => `CommandLine LIKE '%${p}%'`)
.join(' OR ');
const cmd = `powershell -NoProfile -NonInteractive -Command "Get-CimInstance Win32_Process -Filter '(${wqlPatternConditions}) AND ProcessId != ${currentPid}' | Select-Object ProcessId, CommandLine, CreationDate | ConvertTo-Json"`;
const { stdout } = await execAsync(cmd, { timeout: HOOK_TIMEOUTS.POWERSHELL_COMMAND, windowsHide: true });
if (!stdout.trim() || stdout.trim() === 'null') {
logger.debug('SYSTEM', 'No orphaned claude-mem processes found (Windows)');
return [];
}
const processes = JSON.parse(stdout);
const processList = Array.isArray(processes) ? processes : [processes];
const now = Date.now();
for (const proc of processList) {
const pid = proc.ProcessId;
if (!Number.isInteger(pid) || pid <= 0 || protectedPids.has(pid)) continue;
const commandLine = proc.CommandLine || '';
const isAggressive = AGGRESSIVE_CLEANUP_PATTERNS.some(p => commandLine.includes(p));
if (isAggressive) {
// Kill immediately — no age check
pidsToKill.push(pid);
logger.debug('SYSTEM', 'Found orphaned process (aggressive)', { pid, commandLine: commandLine.substring(0, 80) });
} else {
// Age-gated: only kill if older than threshold
const creationMatch = proc.CreationDate?.match(/\/Date\((\d+)\)\//);
if (creationMatch) {
const creationTime = parseInt(creationMatch[1], 10);
const ageMinutes = (now - creationTime) / (1000 * 60);
if (ageMinutes >= ORPHAN_MAX_AGE_MINUTES) {
pidsToKill.push(pid);
logger.debug('SYSTEM', 'Found orphaned process (age-gated)', { pid, ageMinutes: Math.round(ageMinutes) });
}
}
}
}
} else {
// Unix: Use ps with elapsed time
const patternRegex = allPatterns.join('|');
const { stdout } = await execAsync(
`ps -eo pid,etime,command | grep -E "${patternRegex}" | grep -v grep || true`
);
if (!stdout.trim()) {
logger.debug('SYSTEM', 'No orphaned claude-mem processes found (Unix)');
return [];
}
const lines = stdout.trim().split('\n');
for (const line of lines) {
const match = line.trim().match(/^(\d+)\s+(\S+)\s+(.*)$/);
if (!match) continue;
const pid = parseInt(match[1], 10);
const etime = match[2];
const command = match[3];
if (!Number.isInteger(pid) || pid <= 0 || protectedPids.has(pid)) continue;
const isAggressive = AGGRESSIVE_CLEANUP_PATTERNS.some(p => command.includes(p));
if (isAggressive) {
// Kill immediately — no age check
pidsToKill.push(pid);
logger.debug('SYSTEM', 'Found orphaned process (aggressive)', { pid, command: command.substring(0, 80) });
} else {
// Age-gated: only kill if older than threshold
const ageMinutes = parseElapsedTime(etime);
if (ageMinutes >= ORPHAN_MAX_AGE_MINUTES) {
pidsToKill.push(pid);
logger.debug('SYSTEM', 'Found orphaned process (age-gated)', { pid, ageMinutes, command: command.substring(0, 80) });
}
}
}
}
return pidsToKill;
}
/**
* Aggressive startup cleanup for orphaned claude-mem processes.
*
* Unlike cleanupOrphanedProcesses() which age-gates everything at 30 minutes,
* this function kills worker-service.cjs and chroma-mcp processes immediately
* (they should not outlive their parent worker). Only mcp-server.cjs keeps
* the age threshold since it may be legitimately running.
*
* Called once at daemon startup.
*/
export async function aggressiveStartupCleanup(): Promise<void> {
const isWindows = process.platform === 'win32';
const currentPid = process.pid;
const allPatterns = [...AGGRESSIVE_CLEANUP_PATTERNS, ...AGE_GATED_CLEANUP_PATTERNS];
// Protect parent process (the hook that spawned us) from being killed.
// Without this, a new daemon kills its own parent hook process (#1426).
//
// Note: readPidFile() is not used here because start() writes the new PID
// before initializeBackground() calls this function, so readPidFile() would
// just return process.pid (already protected). If a pre-existing worker needs
// protection, ensureWorkerStarted() handles that by returning early when a
// healthy worker is detected — we never reach this code in that case.
const protectedPids = new Set<number>([currentPid]);
if (process.ppid && process.ppid > 0) {
protectedPids.add(process.ppid);
}
let pidsToKill: number[];
try {
pidsToKill = await enumerateAggressiveCleanupProcesses(isWindows, currentPid, protectedPids, allPatterns);
} catch (error: unknown) {
if (error instanceof Error) {
logger.error('SYSTEM', 'Failed to enumerate orphaned processes during aggressive cleanup', {}, error);
} else {
logger.error('SYSTEM', 'Failed to enumerate orphaned processes during aggressive cleanup', {}, new Error(String(error)));
}
return;
}
if (pidsToKill.length === 0) {
return;
}
logger.info('SYSTEM', 'Aggressive startup cleanup: killing orphaned processes', {
platform: isWindows ? 'Windows' : 'Unix',
count: pidsToKill.length,
pids: pidsToKill
});
if (isWindows) {
for (const pid of pidsToKill) {
if (!Number.isInteger(pid) || pid <= 0) continue;
try {
execSync(`taskkill /PID ${pid} /T /F`, { timeout: HOOK_TIMEOUTS.POWERSHELL_COMMAND, stdio: 'ignore', windowsHide: true });
} catch (error: unknown) {
if (error instanceof Error) {
logger.debug('SYSTEM', 'Failed to kill process, may have already exited', { pid }, error);
} else {
logger.debug('SYSTEM', 'Failed to kill process, may have already exited', { pid }, new Error(String(error)));
}
}
}
} else {
for (const pid of pidsToKill) {
try {
process.kill(pid, 'SIGKILL');
} catch (error: unknown) {
if (error instanceof Error) {
logger.debug('SYSTEM', 'Process already exited', { pid }, error);
} else {
logger.debug('SYSTEM', 'Process already exited', { pid }, new Error(String(error)));
}
}
}
}
logger.info('SYSTEM', 'Aggressive startup cleanup complete', { count: pidsToKill.length });
}
const CHROMA_MIGRATION_MARKER_FILENAME = '.chroma-cleaned-v10.3';
/**
@@ -929,14 +744,20 @@ function executeCwdRemap(dbPath: string, effectiveDataDir: string, markerPath: s
}
/**
* Spawn a detached daemon process
* Returns the child PID or undefined if spawn failed
* Spawn a detached daemon process.
*
* On Windows, uses PowerShell Start-Process with -WindowStyle Hidden to spawn
* a truly independent process without console popups. Unlike WMIC, PowerShell
* inherits environment variables from the parent process.
* Uses Node's child_process.spawn with the arg-array form on every platform.
* The arg-array form bypasses the shell entirely on Windows, so no quoting
* heuristics or PowerShell wrappers are needed (handles paths with spaces
* like `C:\Users\Alex Newman\...` natively).
*
* On Unix, uses standard detached spawn.
* On Unix, prefer setsid to detach from the controlling terminal so SIGHUP
* can't reach the daemon even if the in-process handler fails. The
* `detached: true` option already creates a new process group on POSIX;
* setsid is the belt-and-suspenders extra.
*
* Bun.spawn is intentionally NOT used here: it does not support detached
* spawning (see comment in process-registry.ts:633-639).
*
* PID file is written by the worker itself after listen() succeeds,
* not by the spawner (race-free, works on all platforms).
@@ -946,7 +767,6 @@ export function spawnDaemon(
port: number,
extraEnv: Record<string, string> = {}
): number | undefined {
const isWindows = process.platform === 'win32';
getSupervisor().assertCanSpawn('worker daemon');
const env = sanitizeEnv({
@@ -957,9 +777,7 @@ export function spawnDaemon(
// worker-service.cjs imports `bun:sqlite`, so the spawned runtime MUST be
// Bun on every platform — never the current process.execPath, which may be
// Node when the caller is the MCP server. Resolve once before the OS branch
// split so we don't pay for a duplicate PATH lookup if Bun isn't found at a
// well-known path. See resolveWorkerRuntimePath() for the candidate list.
// Node when the caller is the MCP server.
const runtimePath = resolveWorkerRuntimePath();
if (!runtimePath) {
logger.error(
@@ -969,65 +787,20 @@ export function spawnDaemon(
return undefined;
}
if (isWindows) {
// Use PowerShell Start-Process to spawn a hidden, independent process
// Unlike WMIC, PowerShell inherits environment variables from parent
// -WindowStyle Hidden prevents console popup
// Use -EncodedCommand to avoid all shell quoting issues with spaces in paths
const psScript = `Start-Process -FilePath '${runtimePath.replace(/'/g, "''")}' -ArgumentList @('${scriptPath.replace(/'/g, "''")}','--daemon') -WindowStyle Hidden`;
const encodedCommand = Buffer.from(psScript, 'utf16le').toString('base64');
try {
execSync(`powershell -NoProfile -EncodedCommand ${encodedCommand}`, {
stdio: 'ignore',
windowsHide: true,
env
});
// Windows success sentinel: PowerShell `Start-Process` does not return
// the spawned PID, and we don't want to pay for an extra `Get-Process`
// round-trip just to discover it. Return 0 (a conventionally invalid
// Unix PID) so callers can distinguish "spawn dispatched" from "spawn
// failed". Callers MUST use `pid === undefined` to detect failure —
// never falsy checks like `if (!pid)`, which would silently treat
// success as failure here.
return 0;
} catch (error: unknown) {
// APPROVED OVERRIDE: Windows daemon spawn is best-effort; log and let callers fall back to health checks/retry flow.
if (error instanceof Error) {
logger.error('SYSTEM', 'Failed to spawn worker daemon on Windows', { runtimePath }, error);
} else {
logger.error('SYSTEM', 'Failed to spawn worker daemon on Windows', { runtimePath }, new Error(String(error)));
}
return undefined;
}
}
// Unix: Use setsid to create a new session, fully detaching from the
// controlling terminal. This prevents SIGHUP from reaching the daemon
// even if the in-process SIGHUP handler somehow fails (belt-and-suspenders).
// Fall back to standard detached spawn if setsid is not available.
// `runtimePath` was resolved at the top of this function (see comment there).
// On Unix, prefer setsid to fully detach from the controlling terminal.
// On Windows or systems without setsid, spawn the runtime directly.
const setsidPath = '/usr/bin/setsid';
if (existsSync(setsidPath)) {
const child = spawn(setsidPath, [runtimePath, scriptPath, '--daemon'], {
detached: true,
stdio: 'ignore',
env
});
const useSetsid = process.platform !== 'win32' && existsSync(setsidPath);
if (child.pid === undefined) {
return undefined;
}
const execPath = useSetsid ? setsidPath : runtimePath;
const args = useSetsid
? [runtimePath, scriptPath, '--daemon']
: [scriptPath, '--daemon'];
child.unref();
return child.pid;
}
// Fallback: standard detached spawn (macOS, systems without setsid)
const child = spawn(runtimePath, [scriptPath, '--daemon'], {
const child = spawn(execPath, args, {
detached: true,
stdio: 'ignore',
windowsHide: true,
env
});
@@ -1036,7 +809,6 @@ export function spawnDaemon(
}
child.unref();
return child.pid;
}
+58
@@ -0,0 +1,58 @@
/**
* Shared worker-shutdown helper used by both `install` (to clear out a
* running worker before overwriting plugin files) and `uninstall` (to
* release file locks before deletion).
*
* Posts to `/api/admin/shutdown`, then polls `/api/health` until the
* connection is refused (= worker is gone) or the timeout elapses.
*
* Best-effort: if the worker is not running, the POST throws and we
* return immediately. Callers should never depend on this throwing.
*/
export interface ShutdownResult {
/** True if we actively shut down a worker; false if none was running. */
workerWasRunning: boolean;
/** True if we observed the worker stop responding before the timeout. */
confirmedStopped: boolean;
}
export async function shutdownWorkerAndWait(
port: number | string,
timeoutMs: number = 10000,
): Promise<ShutdownResult> {
const baseUrl = `http://127.0.0.1:${port}`;
let workerWasRunning = false;
try {
await fetch(`${baseUrl}/api/admin/shutdown`, {
method: 'POST',
signal: AbortSignal.timeout(5000),
});
workerWasRunning = true;
} catch {
// Worker not running (connection refused) or shutdown POST timed out.
// Either way, nothing more to do.
return { workerWasRunning: false, confirmedStopped: true };
}
const pollIntervalMs = 500;
const maxAttempts = Math.ceil(timeoutMs / pollIntervalMs);
for (let attempt = 0; attempt < maxAttempts; attempt++) {
await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
try {
await fetch(`${baseUrl}/api/health`, {
signal: AbortSignal.timeout(1000),
});
// Health endpoint still responding — worker is still alive, keep waiting.
} catch (err) {
// AbortError = health endpoint timed out (worker still accepting
// connections but slow). Keep polling. Any other error
// (ECONNREFUSED, ECONNRESET) means the worker is gone.
if (err instanceof Error && err.name === 'AbortError') continue;
return { workerWasRunning, confirmedStopped: true };
}
}
return { workerWasRunning, confirmedStopped: false };
}
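The poll loop's error handling above is the subtle part: a health-check timeout means the worker is still alive (just slow), while a connection-level failure means it is gone. A minimal sketch of that classification as a standalone predicate (the name `workerIsGone` is hypothetical; the real code keeps the check inline in the catch block):

```typescript
// Sketch of the poll-loop error classification described above.
// An AbortError means the /api/health fetch timed out: the worker is
// still accepting connections, just slow, so the caller keeps polling.
// Any other error (ECONNREFUSED, ECONNRESET) means the worker stopped.
function workerIsGone(err: unknown): boolean {
  if (err instanceof Error && err.name === 'AbortError') {
    return false; // slow but alive — keep waiting
  }
  return true; // connection-level failure: worker is down
}
```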
@@ -26,6 +26,7 @@ import {
unlinkSync,
} from 'fs';
import { logger } from '../../utils/logger.js';
import { SettingsDefaultsManager } from '../../shared/SettingsDefaultsManager.js';
// ============================================================================
// Path Resolution
@@ -168,7 +169,7 @@ function writeOpenClawConfig(config: Record<string, any>): void {
* and the memory slot.
*/
function registerPluginInOpenClawConfig(
workerPort: number = 37777,
workerPort: number,
project: string = 'openclaw',
syncMemoryFile: boolean = true,
): void {
@@ -305,7 +306,11 @@ function copyPluginFilesAndRegister(
'utf-8',
);
registerPluginInOpenClawConfig();
// Resolve port via SettingsDefaultsManager so CLAUDE_MEM_WORKER_PORT env
// takes priority and the per-UID default (37700 + uid % 100) is used
// otherwise. Required for multi-account isolation (#2101).
const workerPort = SettingsDefaultsManager.getInt('CLAUDE_MEM_WORKER_PORT');
registerPluginInOpenClawConfig(workerPort);
console.log(` Registered in openclaw.json`);
logger.info('OPENCLAW', 'Plugin installed', { destination: extensionDirectory });
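The comment above describes the resolution order: the `CLAUDE_MEM_WORKER_PORT` env var wins, otherwise the per-UID default `37700 + uid % 100` applies. A hedged sketch of that logic (the name `defaultWorkerPort` is hypothetical; the real implementation lives in `SettingsDefaultsManager.getInt`):

```typescript
// Hypothetical sketch of the port resolution described above: a valid
// CLAUDE_MEM_WORKER_PORT env value takes priority; otherwise each UNIX
// uid gets a stable slot in 37700-37799, so two accounts on the same
// machine never collide on one worker port (#2101).
function defaultWorkerPort(uid: number, envOverride?: string): number {
  if (envOverride !== undefined) {
    const parsed = Number.parseInt(envOverride, 10);
    if (Number.isInteger(parsed) && parsed > 0 && parsed <= 65535) {
      return parsed;
    }
  }
  return 37700 + (uid % 100);
}
```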
+41 -5
@@ -75,6 +75,7 @@ export class SessionStore {
this.addObservationSubagentColumns();
this.addPendingMessagesToolUseIdAndWorkerPidColumns();
this.addObservationsUniqueContentHashIndex();
this.addObservationsMetadataColumn();
}
/**
@@ -715,6 +716,14 @@ export class SessionStore {
// Clean up leftover temp table from a previously-crashed run
this.db.run('DROP TABLE IF EXISTS observations_new');
// If the live observations table already has metadata (added in v30 or
// by an older bundled artifact that ran v30 before v21 was recorded),
// preserve it so this rebuild doesn't silently drop the column's data.
const observationsCols = this.db.query('PRAGMA table_info(observations)').all() as TableColumnInfo[];
const observationsHasMetadata = observationsCols.some(c => c.name === 'metadata');
const metadataColumnSQL = observationsHasMetadata ? ',\n metadata TEXT' : '';
const metadataSelectSQL = observationsHasMetadata ? ', metadata' : '';
const observationsNewSQL = `
CREATE TABLE observations_new (
id INTEGER PRIMARY KEY AUTOINCREMENT,
@@ -732,7 +741,7 @@ export class SessionStore {
prompt_number INTEGER,
discovery_tokens INTEGER DEFAULT 0,
created_at TEXT NOT NULL,
created_at_epoch INTEGER NOT NULL,
created_at_epoch INTEGER NOT NULL${metadataColumnSQL},
FOREIGN KEY(memory_session_id) REFERENCES sdk_sessions(memory_session_id) ON DELETE CASCADE ON UPDATE CASCADE
)
`;
@@ -740,7 +749,7 @@ export class SessionStore {
INSERT INTO observations_new
SELECT id, memory_session_id, project, text, type, title, subtitle, facts,
narrative, concepts, files_read, files_modified, prompt_number,
discovery_tokens, created_at, created_at_epoch
discovery_tokens, created_at, created_at_epoch${metadataSelectSQL}
FROM observations
`;
const observationsIndexesSQL = `
@@ -1156,6 +1165,29 @@ export class SessionStore {
}
}
/**
* Add metadata TEXT column to observations (migration 30).
*
* Mirrors MigrationRunner.addObservationsMetadataColumn so bundled artifacts
* that embed SessionStore (e.g. worker-service.cjs, context-generator.cjs)
* stay schema-consistent. Without this, INSERT (..., metadata, ...) raises
* "table observations has no column named metadata" and POST /api/memory/save
* starts failing on every call once it begins persisting metadata (#2116).
*
* Idempotent via PRAGMA table_info guard.
*/
private addObservationsMetadataColumn(): void {
const cols = this.db.query('PRAGMA table_info(observations)').all() as TableColumnInfo[];
const hasColumn = cols.some(c => c.name === 'metadata');
if (!hasColumn) {
this.db.run('ALTER TABLE observations ADD COLUMN metadata TEXT');
logger.debug('DB', 'Added metadata column to observations table (#2116)');
}
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(30, new Date().toISOString());
}
/**
* Update the memory session ID for a session
* Called by SDKAgent when it captures the session ID from the first SDK message
@@ -2009,6 +2041,9 @@ export class SessionStore {
files_modified: string[];
agent_type?: string | null;
agent_id?: string | null;
// Caller-supplied JSON metadata, stored verbatim in the metadata column (#2116).
// Pre-stringified by the caller so we don't double-encode an already-JSON value.
metadata?: string | null;
},
promptNumber?: number,
discoveryTokens: number = 0,
@@ -2027,8 +2062,8 @@ export class SessionStore {
INSERT INTO observations
(memory_session_id, project, type, title, subtitle, facts, narrative, concepts,
files_read, files_modified, prompt_number, discovery_tokens, agent_type, agent_id, content_hash, created_at, created_at_epoch,
generated_by_model)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
generated_by_model, metadata)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(memory_session_id, content_hash) DO NOTHING
RETURNING id, created_at_epoch
`);
@@ -2051,7 +2086,8 @@ export class SessionStore {
contentHash,
timestampIso,
timestampEpoch,
generatedByModel || null
generatedByModel || null,
observation.metadata ?? null
) as { id: number; created_at_epoch: number } | null;
if (inserted) {
+24
@@ -40,6 +40,7 @@ export class MigrationRunner {
this.addObservationSubagentColumns();
this.rebuildPendingMessagesForSelfHealingClaim();
this.addObservationsUniqueContentHashIndex();
this.addObservationsMetadataColumn();
}
/**
@@ -1204,4 +1205,27 @@ export class MigrationRunner {
throw new Error(`Migration 29 failed: ${String(error)}`);
}
}
/**
* Add metadata TEXT column to observations (migration 30).
*
* Backward-compatible: nullable, no default. Holds JSON-encoded arbitrary
* metadata supplied by callers of POST /api/memory/save (#2116). Without
* this column, the route's Zod `.passthrough()` accepted unknown fields
 * but the INSERT silently dropped them, a quiet contract violation.
*
* Idempotent via PRAGMA table_info guard so cross-machine DB sync that
* leaves schema_versions ahead of actual schema still self-heals.
*/
private addObservationsMetadataColumn(): void {
const cols = this.db.query('PRAGMA table_info(observations)').all() as TableColumnInfo[];
const hasColumn = cols.some(c => c.name === 'metadata');
if (!hasColumn) {
this.db.run('ALTER TABLE observations ADD COLUMN metadata TEXT');
logger.debug('DB', 'Added metadata column to observations table (#2116)');
}
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(30, new Date().toISOString());
}
}
+1
@@ -74,6 +74,7 @@ CREATE TABLE IF NOT EXISTS observations (
agent_id TEXT,
merged_into_project TEXT,
generated_by_model TEXT,
metadata TEXT,
created_at TEXT NOT NULL,
created_at_epoch INTEGER NOT NULL,
FOREIGN KEY(memory_session_id) REFERENCES sdk_sessions(memory_session_id)
+20 -2
@@ -31,6 +31,24 @@ const RECONNECT_BACKOFF_MS = 10_000; // Don't retry connections faster than this
const DEFAULT_CHROMA_DATA_DIR = path.join(os.homedir(), '.claude-mem', 'chroma');
const CHROMA_SUPERVISOR_ID = 'chroma-mcp';
/**
* Pinned chroma-mcp version for deterministic installs.
*
* Why pin: `uvx chroma-mcp` (unpinned) resolves whatever version PyPI happens
* to serve at install time. That has bitten us multiple ways:
* - #2046: transient missing httpcore/httpx after dependency resolver shifts
* - #2085: surprise breaking changes between point releases
* - #2102: subprocess spawn storms triggered by version drift in chromadb deps
*
* Pinning to a specific known-good version makes installs reproducible across
* machines and across time. Bump deliberately, not accidentally.
*
* Verified 2026-04-25 with `uvx --python 3.13 chroma-mcp==0.2.6 --help` in a
* clean uv cache: starts cleanly, no httpcore/httpx ImportError, no `--with`
* flags required. If that changes on a future bump, re-add the flags here.
*/
const CHROMA_MCP_PINNED_VERSION = '0.2.6';
export class ChromaMcpManager {
private static instance: ChromaMcpManager | null = null;
private client: Client | null = null;
@@ -212,7 +230,7 @@ export class ChromaMcpManager {
const args = [
'--python', pythonVersion,
'chroma-mcp',
`chroma-mcp==${CHROMA_MCP_PINNED_VERSION}`,
'--client-type', 'http',
'--host', chromaHost,
'--port', chromaPort
@@ -238,7 +256,7 @@ export class ChromaMcpManager {
// Local mode: persistent client with data directory
return [
'--python', pythonVersion,
'chroma-mcp',
`chroma-mcp==${CHROMA_MCP_PINNED_VERSION}`,
'--client-type', 'persistent',
'--data-dir', DEFAULT_CHROMA_DATA_DIR.replace(/\\/g, '/')
];
+31 -26
@@ -44,7 +44,6 @@ import {
readPidFile,
removePidFile,
getPlatformTimeout,
aggressiveStartupCleanup,
runOneTimeChromaMigration,
runOneTimeCwdRemap,
cleanStalePidFile,
@@ -386,7 +385,6 @@ export class WorkerService implements WorkerRef {
private async initializeBackground(): Promise<void> {
try {
logger.info('WORKER', 'Background initialization starting...');
await aggressiveStartupCleanup();
// Load mode configuration
const { ModeManager } = await import('./domain/ModeManager.js');
@@ -1154,34 +1152,21 @@ async function main() {
case 'restart': {
logger.info('SYSTEM', 'Restarting worker');
await httpShutdown(port);
const restartFreed = await waitForPortFree(port, getPlatformTimeout(15000));
const restartFreed = await waitForPortFree(port, 5000);
if (!restartFreed) {
logger.error('SYSTEM', 'Port did not free up after shutdown, aborting restart', { port });
process.exit(0);
// Don't loop, don't force-kill, don't steal the port. The PID file
// owns the lock; if the previous worker won't release the port the
// user resolves it manually.
console.error('Port still bound after shutdown. Resolve manually.');
process.exit(1);
}
removePidFile();
const pid = spawnDaemon(__filename, port);
if (pid === undefined) {
logger.error('SYSTEM', 'Failed to spawn worker daemon during restart');
// Exit gracefully: Windows Terminal won't keep tab open on exit 0
// The wrapper/plugin will handle restart logic if needed
process.exit(0);
const restartPid = spawnDaemon(__filename, port);
if (restartPid === undefined) {
console.error('Failed to spawn worker daemon during restart.');
process.exit(1);
}
// PID file is written by the worker itself after listen() succeeds
// This is race-free and works correctly on Windows where cmd.exe PID is useless
const healthy = await waitForHealth(port, getPlatformTimeout(HOOK_TIMEOUTS.POST_SPAWN_WAIT));
if (!healthy) {
removePidFile();
logger.error('SYSTEM', 'Worker failed to restart');
// Exit gracefully: Windows Terminal won't keep tab open on exit 0
// The wrapper/plugin will handle restart logic if needed
process.exit(0);
}
logger.info('SYSTEM', 'Worker restarted successfully');
logger.info('SYSTEM', 'Worker restart spawned', { pid: restartPid });
process.exit(0);
break;
}
@@ -1298,6 +1283,26 @@ async function main() {
process.exit(0);
}
case 'cleanup': {
// CLI surface for the v12.4.3 pollution cleanup. Shares its scan logic
// with the auto-run-on-startup path so --dry-run reports counts that
// exactly match what the next startup would delete. (#2126 item 5)
const dryRun = process.argv.includes('--dry-run');
const counts = runOneTimeV12_4_3Cleanup(undefined, { dryRun });
const tag = dryRun ? '(dry-run, no changes made)' : '(applied)';
console.log(`\nv12.4.3 cleanup ${tag}`);
if (counts) {
console.log(` Observer sessions: ${counts.observerSessions}`);
console.log(` Observer cascade rows: ${counts.observerCascadeRows}`);
console.log(` Stuck pending_messages: ${counts.stuckPendingMessages}`);
} else if (dryRun) {
console.log(' Scan failed — see worker log for details.');
} else {
console.log(' Already applied (marker present) or skipped.');
}
process.exit(0);
}
case '--daemon':
default: {
// GUARD 1: Refuse to start if another worker is already alive.
+7 -27
@@ -25,10 +25,8 @@ import { ModeManager } from '../domain/ModeManager.js';
import type { ModeConfig } from '../domain/types.js';
import {
processAgentResponse,
shouldFallbackToClaude,
isAbortError,
type WorkerRef,
type FallbackAgent
type WorkerRef
} from './agents/index.js';
// Gemini API endpoint — use v1 (stable), not v1beta.
@@ -116,21 +114,12 @@ interface GeminiContent {
export class GeminiAgent {
private dbManager: DatabaseManager;
private sessionManager: SessionManager;
private fallbackAgent: FallbackAgent | null = null;
constructor(dbManager: DatabaseManager, sessionManager: SessionManager) {
this.dbManager = dbManager;
this.sessionManager = sessionManager;
}
/**
* Set the fallback agent (Claude SDK) for when Gemini API fails
* Must be set after construction to avoid circular dependency
*/
setFallbackAgent(agent: FallbackAgent): void {
this.fallbackAgent = agent;
}
/**
* Start Gemini agent for a session
* Uses multi-turn conversation to maintain context across messages
@@ -352,28 +341,19 @@ export class GeminiAgent {
}
/**
* Handle errors from Gemini API calls with abort detection and Claude fallback.
* Handle errors from Gemini API calls with abort detection.
* Shared by init query and message processing try blocks.
*
 * Note: The previous Claude-SDK fallback path was removed in #2087; it was
* never wired in production (`fallbackAgent` was always null), so 429s
* already threw in practice. The throw is now explicit.
*/
private handleGeminiError(error: unknown, session: ActiveSession, worker?: WorkerRef): Promise<void> | never {
private handleGeminiError(error: unknown, session: ActiveSession, _worker?: WorkerRef): never {
if (isAbortError(error)) {
logger.warn('SDK', 'Gemini agent aborted', { sessionId: session.sessionDbId });
throw error;
}
// Check if we should fall back to Claude
if (shouldFallbackToClaude(error) && this.fallbackAgent) {
logger.warn('SDK', 'Gemini API failed, falling back to Claude SDK', {
sessionDbId: session.sessionDbId,
error: error instanceof Error ? error.message : String(error),
historyLength: session.conversationHistory.length
});
// Fall back to Claude - it will use the same session with shared conversationHistory
// Note: With claim-and-delete queue pattern, messages are already deleted on claim
return this.fallbackAgent.startSession(session, worker);
}
logger.failure('SDK', 'Gemini agent error', { sessionDbId: session.sessionDbId }, error instanceof Error ? error : new Error(String(error)));
throw error;
}
+6 -26
@@ -24,8 +24,6 @@ import { SessionManager } from './SessionManager.js';
import {
isAbortError,
processAgentResponse,
shouldFallbackToClaude,
type FallbackAgent,
type WorkerRef
} from './agents/index.js';
@@ -65,21 +63,12 @@ interface OpenRouterResponse {
export class OpenRouterAgent {
private dbManager: DatabaseManager;
private sessionManager: SessionManager;
private fallbackAgent: FallbackAgent | null = null;
constructor(dbManager: DatabaseManager, sessionManager: SessionManager) {
this.dbManager = dbManager;
this.sessionManager = sessionManager;
}
/**
* Set the fallback agent (Claude SDK) for when OpenRouter API fails
* Must be set after construction to avoid circular dependency
*/
setFallbackAgent(agent: FallbackAgent): void {
this.fallbackAgent = agent;
}
/**
* Start OpenRouter agent for a session
* Uses multi-turn conversation to maintain context across messages
@@ -327,27 +316,18 @@ export class OpenRouterAgent {
}
/**
* Handle errors from session processing: abort re-throw, fallback to Claude, or log and re-throw.
* Handle errors from session processing: abort re-throw or log and re-throw.
*
* Note: The previous Claude-SDK fallback path was removed in #2087: it was
* never wired in production (`fallbackAgent` was always null), so 429s
* already threw in practice. The throw is now explicit.
*/
private async handleSessionError(error: unknown, session: ActiveSession, worker?: WorkerRef): Promise<never | void> {
private async handleSessionError(error: unknown, session: ActiveSession, _worker?: WorkerRef): Promise<never> {
if (isAbortError(error)) {
logger.warn('SDK', 'OpenRouter agent aborted', { sessionId: session.sessionDbId });
throw error;
}
if (shouldFallbackToClaude(error) && this.fallbackAgent) {
logger.warn('SDK', 'OpenRouter API failed, falling back to Claude SDK', {
sessionDbId: session.sessionDbId,
error: error instanceof Error ? error.message : String(error),
historyLength: session.conversationHistory.length
});
// Fall back to Claude - it will use the same session with shared conversationHistory
// Note: With claim-and-delete queue pattern, messages are already deleted on claim
await this.fallbackAgent.startSession(session, worker);
return;
}
logger.failure('SDK', 'OpenRouter agent error', { sessionDbId: session.sessionDbId }, error instanceof Error ? error : new Error(String(error)));
throw error;
}
+4 -2
@@ -175,7 +175,8 @@ export class PaginationHelper {
params.push(project, project);
} else {
// Hide internal observer-session rows from the unfiltered UI list.
conditions.push("ss.project != 'observer-sessions'");
conditions.push('ss.project != ?');
params.push(OBSERVER_SESSIONS_PROJECT);
}
if (platformSource) {
@@ -229,7 +230,8 @@ export class PaginationHelper {
params.push(project);
} else {
// Hide internal observer-session rows from the unfiltered UI list.
conditions.push("s.project != 'observer-sessions'");
conditions.push('s.project != ?');
params.push(OBSERVER_SESSIONS_PROJECT);
}
if (platformSource) {
@@ -13,6 +13,7 @@
import type { WorkerRef, ObservationSSEPayload, SummarySSEPayload } from './types.js';
import { logger } from '../../../utils/logger.js';
import { shouldEmitProjectRow } from '../../../shared/should-track-project.js';
/**
* Broadcast a new observation to SSE clients
@@ -28,6 +29,18 @@ export function broadcastObservation(
return;
}
// Parity with PaginationHelper's unfiltered-list SQL filter (#2118):
// observer-session rows are internal and must not stream to viewer clients.
// Same predicate used by both filters via shouldEmitProjectRow so they
// can never drift apart.
if (!shouldEmitProjectRow(payload.project)) {
logger.debug('WORKER', 'SSE observation broadcast skipped (internal project)', {
project: payload.project,
id: payload.id,
});
return;
}
worker.sseBroadcaster.broadcast({
type: 'new_observation',
observation: payload
@@ -48,6 +61,15 @@ export function broadcastSummary(
return;
}
// Parity with PaginationHelper's unfiltered-list SQL filter (#2118).
if (!shouldEmitProjectRow(payload.project)) {
logger.debug('WORKER', 'SSE summary broadcast skipped (internal project)', {
project: payload.project,
id: payload.id,
});
return;
}
worker.sseBroadcaster.broadcast({
type: 'new_summary',
summary: payload
+1 -2
@@ -6,7 +6,7 @@
*
* Usage:
* ```typescript
* import { processAgentResponse, shouldFallbackToClaude } from './agents/index.js';
* import { processAgentResponse, isAbortError } from './agents/index.js';
* ```
*/
@@ -19,7 +19,6 @@ export type {
StorageResult,
ResponseProcessingContext,
ParsedResponse,
FallbackAgent,
BaseAgentConfig,
} from './types.js';
-11
@@ -98,17 +98,6 @@ export interface ParsedResponse {
summary: ParsedSummary | null;
}
// ============================================================================
// Fallback Agent Interface
// ============================================================================
/**
* Interface for fallback agent (used by Gemini/OpenRouter to fall back to Claude)
*/
export interface FallbackAgent {
startSession(session: ActiveSession, worker?: WorkerRef): Promise<void>;
}
// ============================================================================
// Agent Configuration Types
// ============================================================================
@@ -13,11 +13,22 @@ import { logger } from '../../../../utils/logger.js';
import type { DatabaseManager } from '../../DatabaseManager.js';
// Plan 06 Phase 3 — per-route Zod schema.
//
// `metadata` is an arbitrary JSON object the caller can use to attach
// integration-specific provenance (e.g. obsidian_note, claude_mem_version,
// custom_key). It is stored verbatim in the observations.metadata column
// (migration 30) — no schema enforcement on its keys (#2116).
//
// `metadata.project`, when present and the top-level `project` is omitted,
// is honored as the project assignment. This lets integrating plugins file
// observations under a project other than their own without having to know
// the top-level field name.
const saveMemorySchema = z.object({
text: z.string().trim().min(1),
title: z.string().optional(),
project: z.string().optional(),
}).passthrough();
metadata: z.record(z.string(), z.unknown()).optional(),
}).strict();
export class MemoryRoutes extends BaseRouteHandler {
constructor(
@@ -33,11 +44,26 @@ export class MemoryRoutes extends BaseRouteHandler {
/**
* POST /api/memory/save - Save a manual memory/observation
* Body: { text: string, title?: string, project?: string }
* Body: {
* text: string,
* title?: string,
* project?: string,
* metadata?: Record<string, unknown> // arbitrary JSON, persisted verbatim (#2116)
* }
*
* Project resolution order: top-level `project`, then `metadata.project` (string),
* then this.defaultProject. Unknown top-level fields are now rejected (400):
* `.strict()` replaced `.passthrough()` so silent drops can't recur.
*/
private handleSaveMemory = this.wrapHandler(async (req: Request, res: Response): Promise<void> => {
const { text, title, project } = req.body as z.infer<typeof saveMemorySchema>;
const targetProject = project || this.defaultProject;
const { text, title, project, metadata } = req.body as z.infer<typeof saveMemorySchema>;
const explicitProject = typeof project === 'string' && project.trim()
? project.trim()
: undefined;
const metadataProject = typeof metadata?.project === 'string' && metadata.project.trim()
? metadata.project.trim()
: undefined;
const targetProject = explicitProject || metadataProject || this.defaultProject;
const sessionStore = this.dbManager.getSessionStore();
const chromaSync = this.dbManager.getChromaSync();
@@ -54,7 +80,10 @@ export class MemoryRoutes extends BaseRouteHandler {
narrative: text,
concepts: [] as string[],
files_read: [] as string[],
files_modified: [] as string[]
files_modified: [] as string[],
// Stringify here so the storage layer doesn't need to know about JSON shape.
// Preserved verbatim, including nested objects.
metadata: metadata ? JSON.stringify(metadata) : null,
};
// 3. Store to SQLite
@@ -449,6 +449,10 @@ export class SearchRoutes extends BaseRouteHandler {
* GET /api/search/help
*/
private handleSearchHelp = this.wrapHandler((req: Request, res: Response): void => {
// Use the actual host:port the request came in on so example URLs always
// round-trip back to this same worker — matters for multi-account / non-
// default-port setups (#2101, #2103).
const baseUrl = `http://${req.headers.host ?? 'localhost'}`;
res.json({
title: 'Claude-Mem Search API',
description: 'HTTP API for searching persistent memory',
@@ -551,10 +555,10 @@ export class SearchRoutes extends BaseRouteHandler {
}
],
examples: [
'curl "http://localhost:37777/api/search/observations?query=authentication&limit=5"',
'curl "http://localhost:37777/api/search/by-type?type=bugfix&limit=10"',
'curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=3"',
'curl "http://localhost:37777/api/context/timeline?anchor=123&depth_before=5&depth_after=5"'
`curl "${baseUrl}/api/search/observations?query=authentication&limit=5"`,
`curl "${baseUrl}/api/search/by-type?type=bugfix&limit=10"`,
`curl "${baseUrl}/api/context/recent?project=claude-mem&limit=3"`,
`curl "${baseUrl}/api/context/timeline?anchor=123&depth_before=5&depth_after=5"`
]
});
});
@@ -95,7 +95,18 @@ export class SessionRoutes extends BaseRouteHandler {
* The next generator will use the new provider with shared conversationHistory.
*/
private static readonly STALE_GENERATOR_THRESHOLD_MS = 30_000; // 30 seconds (#1099)
private static readonly MAX_SESSION_WALL_CLOCK_MS = 4 * 60 * 60 * 1000; // 4 hours (#1590)
// Wall-clock cap on a single in-memory session — exists to prevent runaway
// API costs from a session that is somehow stuck in a re-activation loop
// (#1590, #2127, #2098). 4h was the original value, picked when bugs in the
// re-activation path made cost runaways more plausible; users in practice
// have legitimate long-running sessions (24h+ Claude Code days) that this
// killed without warning. 24h is the new ceiling — long enough that
// a real human workday never hits it, short enough that a runaway loop is
// still bounded. We deliberately do NOT expose this as a config knob: a
// session approaching this age is almost certainly a bug worth investigating,
// not a knob worth tuning.
private static readonly MAX_SESSION_WALL_CLOCK_MS = 24 * 60 * 60 * 1000; // 24 hours (#1590, #2127)
public ensureGeneratorRunning(sessionDbId: number, source: string): void {
const session = this.sessionManager.getSession(sessionDbId);
+7
@@ -217,6 +217,13 @@ export function buildIsolatedEnv(includeCredentials: boolean = true): Record<str
// 2. Override SDK entrypoint marker
isolatedEnv.CLAUDE_CODE_ENTRYPOINT = 'sdk-ts';
// 2a. Mark this as an internal claude-mem subprocess so spawned hooks can
// skip tracking unconditionally. This is the single trust boundary for
// observer-session detection — every consumer can check
// process.env.CLAUDE_MEM_INTERNAL instead of repeating cwd-based exclusion
// checks (which inevitably drift; see #2118 / #2126).
isolatedEnv.CLAUDE_MEM_INTERNAL = '1';
// 3. Re-inject managed credentials from claude-mem's .env file
if (includeCredentials) {
const credentials = loadClaudeMemEnv();
+24 -4
@@ -14,7 +14,7 @@
import { relative, isAbsolute } from 'path';
import { isProjectExcluded } from '../utils/project-filter.js';
import { loadFromFileOnce } from './hook-settings.js';
import { OBSERVER_SESSIONS_DIR } from './paths.js';
import { OBSERVER_SESSIONS_DIR, OBSERVER_SESSIONS_PROJECT } from './paths.js';
function isWithin(child: string, parent: string): boolean {
if (child === parent) return true;
@@ -27,12 +27,18 @@ function isWithin(child: string, parent: string): boolean {
* tracking, i.e., the hook should proceed; false when the project
* matches one of the exclusion globs.
*
* Hard-excludes OBSERVER_SESSIONS_DIR: the SDK agent spawns Claude Code with
* that cwd, and its hooks must never feed the worker otherwise the observer's
* Single trust boundary: when the spawning worker set CLAUDE_MEM_INTERNAL=1
* (see EnvManager.buildIsolatedEnv), the spawned subprocess is an internal
* claude-mem agent and must never feed the worker otherwise the observer's
* own init/continuation/summary prompts end up stored as `user_prompts` and
* leak into the viewer (meta-observation).
* leak into the viewer (meta-observation; see #2118, #2126).
*
* The cwd-based OBSERVER_SESSIONS_DIR check stays as belt-and-braces for any
* pre-env-var spawn path (e.g., user manually launching `claude` inside the
* observer dir) and for tests that don't exercise the env var.
*/
export function shouldTrackProject(cwd: string): boolean {
if (process.env.CLAUDE_MEM_INTERNAL === '1') return false;
if (!cwd) return true;
// path.relative handles separator differences (Windows '\\' vs POSIX '/')
// and trailing-slash variance, which a literal startsWith would miss.
@@ -42,3 +48,17 @@ export function shouldTrackProject(cwd: string): boolean {
const settings = loadFromFileOnce();
return !isProjectExcluded(cwd, settings.CLAUDE_MEM_EXCLUDED_PROJECTS);
}
/**
* Shared predicate: should a row tagged with `project` be emitted to user-facing
* surfaces (SSE stream, viewer UI list)? Used by both PaginationHelper SQL
* filters and SSEBroadcaster payload filters so they can never drift.
*
* Internal claude-mem rows (project === OBSERVER_SESSIONS_PROJECT) are hidden
* from the unfiltered list view and the live SSE stream. They remain queryable
* by id and by explicit `project=observer-sessions` filter for diagnostics.
*/
export function shouldEmitProjectRow(project: string | null | undefined): boolean {
if (!project) return true;
return project !== OBSERVER_SESSIONS_PROJECT;
}
+2 -1
View File
@@ -55,7 +55,8 @@ let cachedHost: string | null = null;
/**
* Get the worker port number from settings
* Uses CLAUDE_MEM_WORKER_PORT from settings file or default (37777)
* Uses CLAUDE_MEM_WORKER_PORT from settings file, or the per-UID default
* (37700 + uid % 100) defined in SettingsDefaultsManager.
* Caches the port value to avoid repeated file reads
*/
export function getWorkerPort(): number {
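The per-UID rule referenced above can be sketched in isolation. This is a minimal sketch assuming the formula the SettingsDefaultsManager tests exercise (`37700 + uid % 100`, with 77 as the fallback when `getuid` is unavailable); `defaultWorkerPort` is a hypothetical helper name, not the real implementation:

```typescript
// Sketch of the per-UID default-port rule; assumes the formula exercised by
// the SettingsDefaultsManager tests. `defaultWorkerPort` is a hypothetical
// stand-in, not the actual server-side code.
function defaultWorkerPort(uid: number | undefined): number {
  // 77 mirrors the tests' `process.getuid?.() ?? 77` fallback (e.g. Windows,
  // where getuid is unavailable); 37700 + 77 = 37777, the historical default.
  return 37700 + ((uid ?? 77) % 100);
}

defaultWorkerPort(501);       // 37701: two users on one machine get distinct ports
defaultWorkerPort(undefined); // 37777: the fallback reproduces the old fixed default
```

Note the fallback is chosen so that machines without `process.getuid` (and all pre-existing single-user setups) keep resolving to the familiar 37777.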
+19
@@ -6,6 +6,24 @@ export const ENV_EXACT_MATCHES = new Set([
'MCP_SESSION_ID',
]);
/**
* Proxy-related env vars stripped before spawning the worker / `claude` subprocess.
* The user's proxy config bleeding into internal AI calls causes connection failures
* (see issues #2115, #2099). Stripped unconditionally; no opt-in flag.
*/
export const ENV_PROXY_VARS = new Set([
'HTTP_PROXY',
'HTTPS_PROXY',
'ALL_PROXY',
'NO_PROXY',
'http_proxy',
'https_proxy',
'all_proxy',
'no_proxy',
'npm_config_proxy',
'npm_config_https_proxy',
]);
/** Vars that start with CLAUDE_CODE_ but must be preserved for subprocess auth/tooling */
export const ENV_PRESERVE = new Set([
'CLAUDE_CODE_OAUTH_TOKEN',
@@ -19,6 +37,7 @@ export function sanitizeEnv(env: NodeJS.ProcessEnv = process.env): NodeJS.Proces
if (value === undefined) continue;
if (ENV_PRESERVE.has(key)) { sanitized[key] = value; continue; }
if (ENV_EXACT_MATCHES.has(key)) continue;
if (ENV_PROXY_VARS.has(key)) continue;
if (ENV_PREFIXES.some(prefix => key.startsWith(prefix))) continue;
sanitized[key] = value;
}
+6
@@ -5,6 +5,12 @@
export const DEFAULT_SETTINGS = {
CLAUDE_MEM_MODEL: 'claude-sonnet-4-6',
CLAUDE_MEM_CONTEXT_OBSERVATIONS: '50',
// Build-time placeholder only. The viewer runs in-browser served by the
// worker itself, so actual API calls use window.location and the real port
// is fetched from /api/settings into useSettings(). This literal is just a
// form-field fallback rendered for the brief moment before the API response
// arrives. Multi-account / per-UID port resolution lives in
// SettingsDefaultsManager (server-side); see CLAUDE.md → Multi-account.
CLAUDE_MEM_WORKER_PORT: '37777',
CLAUDE_MEM_WORKER_HOST: '127.0.0.1',
+7 -1
@@ -6,6 +6,7 @@
*/
import { homedir } from 'os';
import { basename } from 'path';
/**
* Convert a glob pattern to a regular expression
@@ -50,6 +51,11 @@ export function isProjectExcluded(projectPath: string, exclusionPatterns: string
// Normalize cwd path separators
const normalizedProjectPath = projectPath.replace(/\\/g, '/');
// Basename match pass: users intuitively expect `observer-sessions` or
// `*observer-sessions*` to match any cwd whose final segment matches, but
// globToRegex translates `*` → `[^/]*` which can't cross `/`. Without this,
// both bare names and basename globs silently fail (#2126 item 1).
const projectBasename = basename(normalizedProjectPath);
// Parse comma-separated patterns
const patternList = exclusionPatterns
@@ -60,7 +66,7 @@ export function isProjectExcluded(projectPath: string, exclusionPatterns: string
for (const pattern of patternList) {
try {
const regex = globToRegex(pattern);
if (regex.test(normalizedProjectPath)) {
if (regex.test(normalizedProjectPath) || regex.test(projectBasename)) {
return true;
}
} catch (error: unknown) {
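The basename pass can be illustrated with a minimal sketch. `globToRegexSketch` is a hypothetical stand-in assuming only the `*` to `[^/]*` translation described in the comment above (the real globToRegex handles more):

```typescript
// Hypothetical sketch of globToRegex's core translation: `*` becomes `[^/]*`,
// which cannot cross a `/` separator. Not the real implementation.
function globToRegexSketch(pattern: string): RegExp {
  // Escape regex metacharacters except `*`, then translate `*`.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '[^/]*') + '$');
}

const regex = globToRegexSketch('*observer-sessions*');
const fullPath = '/home/user/.claude-mem/observer-sessions';
const base = 'observer-sessions'; // basename(fullPath)

regex.test(fullPath); // false: `[^/]*` cannot span the `/` separators in the path
regex.test(base);     // true: the extra basename pass is what catches this case
```

This is exactly the #2126 failure mode: without the second `regex.test(projectBasename)` call, the intuitive pattern silently matches nothing.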
+59
@@ -0,0 +1,59 @@
// Tests for readJsonFromStdin's onEnd contract (#2089).
//
// The previous implementation silently dropped malformed JSON when stdin
// closed, returning undefined just like the empty-input case. The fix mirrors
// the safety-timeout path: non-empty + unparseable = reject.
import { describe, it, expect, afterEach } from 'bun:test';
import { Readable } from 'stream';
import { readJsonFromStdin } from '../../src/cli/stdin-reader.js';
const realStdin = process.stdin;
const realStdinDescriptor = Object.getOwnPropertyDescriptor(process, 'stdin');
function installFakeStdin(payload: string): void {
// Build a Readable that emits the payload, then ends — matches the
// shape of a process.stdin pipe closing after a single write.
const fake = Readable.from([payload], { objectMode: false }) as unknown as NodeJS.ReadStream;
// The reader checks isTTY (must be falsy) and `.readable` access.
Object.defineProperty(fake, 'isTTY', { value: false, configurable: true });
Object.defineProperty(process, 'stdin', {
configurable: true,
enumerable: realStdinDescriptor?.enumerable ?? true,
writable: true,
value: fake,
});
}
afterEach(() => {
if (realStdinDescriptor) {
Object.defineProperty(process, 'stdin', realStdinDescriptor);
} else {
Object.defineProperty(process, 'stdin', { value: realStdin, configurable: true, writable: true });
}
});
describe('readJsonFromStdin — onEnd contract (#2089)', () => {
it('resolves with parsed JSON when stdin yields a complete object', async () => {
installFakeStdin('{"hello":"world"}');
const result = await readJsonFromStdin();
expect(result).toEqual({ hello: 'world' });
});
it('resolves with undefined when stdin closes empty', async () => {
installFakeStdin('');
const result = await readJsonFromStdin();
expect(result).toBeUndefined();
});
it('rejects when stdin closes with non-empty but unparseable bytes', async () => {
installFakeStdin('{"truncated":');
await expect(readJsonFromStdin()).rejects.toThrow(/Malformed JSON at stdin EOF/);
});
it('rejects when stdin closes with junk that is clearly not JSON', async () => {
installFakeStdin('not json at all');
await expect(readJsonFromStdin()).rejects.toThrow(/Malformed JSON at stdin EOF/);
});
});
+6 -18
@@ -251,7 +251,10 @@ describe('GeminiAgent', () => {
expect(session.cumulativeInputTokens).toBeGreaterThan(0);
});
it('should fallback to Claude on rate limit error', async () => {
it('should throw on rate limit (429) error — no Claude fallback (#2087)', async () => {
// The Claude-SDK fallback path was removed in #2087: it was never wired in
// production (`fallbackAgent` was always null) so 429s already threw.
// This test pins the new explicit behavior.
const session = {
sessionDbId: 1,
contentSessionId: 'test-session',
@@ -273,19 +276,10 @@ describe('GeminiAgent', () => {
global.fetch = mock(() => Promise.resolve(new Response('Resource has been exhausted (e.g. check quota).', { status: 429 })));
const fallbackAgent = {
startSession: mock(() => Promise.resolve())
};
agent.setFallbackAgent(fallbackAgent);
await agent.startSession(session);
// Verify fallback to Claude was triggered
expect(fallbackAgent.startSession).toHaveBeenCalledWith(session, undefined);
// Note: resetStuckMessages is called by worker-service.ts, not by GeminiAgent
await expect(agent.startSession(session)).rejects.toThrow(/429/);
});
it('should NOT fallback on other errors', async () => {
it('should throw on other errors', async () => {
const session = {
sessionDbId: 1,
contentSessionId: 'test-session',
@@ -307,13 +301,7 @@ describe('GeminiAgent', () => {
global.fetch = mock(() => Promise.resolve(new Response('Invalid argument', { status: 400 })));
const fallbackAgent = {
startSession: mock(() => Promise.resolve())
};
agent.setFallbackAgent(fallbackAgent);
await expect(agent.startSession(session)).rejects.toThrow('Gemini API error: 400 - Invalid argument');
expect(fallbackAgent.startSession).not.toHaveBeenCalled();
});
it('should respect rate limits when rate limiting enabled', async () => {
+24 -65
@@ -1,4 +1,10 @@
// Tests for file-context cache validation fix (#1719)
// Tests for file-context cache validation and the #2094 deadlock fix.
//
// The hook used to truncate Reads to limit:1 and inject "you have enough info"
// guidance — that combination broke Edit-after-Read because Claude Code's
// read-state tracker saw a "read" but content was missing. Behavior now:
// inject the timeline as supplementary context only; never set updatedInput.
import { describe, it, expect, beforeEach, afterEach, spyOn, mock } from 'bun:test';
import { mkdtempSync, writeFileSync, utimesSync, rmSync } from 'fs';
import { tmpdir, homedir } from 'os';
@@ -89,8 +95,8 @@ afterEach(() => {
try { rmSync(tmpDir, { recursive: true, force: true }); } catch {}
});
describe('fileContextHandler — cache validation fix (#1719)', () => {
it('truncates to limit:1 for an unconstrained Read (existing behavior)', async () => {
describe('fileContextHandler — #2094 (no Read mutation)', () => {
it('injects timeline context but never sets updatedInput on an unconstrained Read', async () => {
// File mtime is "now" (just written). Make observations newer to avoid mtime bypass.
const future = Date.now() + 60_000;
fetchSpy = spyOn(globalThis, 'fetch').mockResolvedValue(
@@ -105,13 +111,12 @@ describe('fileContextHandler — cache validation fix (#1719)', () => {
});
expect(result.hookSpecificOutput).toBeDefined();
expect(result.hookSpecificOutput!.updatedInput).toEqual({
file_path: testFile,
limit: 1,
});
expect(result.hookSpecificOutput!.additionalContext).toContain('prior observations');
// The whole point of #2094: do not rewrite the Read call.
expect((result.hookSpecificOutput as any).updatedInput).toBeUndefined();
});
it('preserves user-supplied offset/limit on a targeted Read (#1719 fix)', async () => {
it('does not set updatedInput on a targeted Read either', async () => {
const future = Date.now() + 60_000;
fetchSpy = spyOn(globalThis, 'fetch').mockResolvedValue(
makeObservationsResponse([{ id: 1, created_at_epoch: future }])
@@ -125,55 +130,10 @@ describe('fileContextHandler — cache validation fix (#1719)', () => {
});
expect(result.hookSpecificOutput).toBeDefined();
expect(result.hookSpecificOutput!.updatedInput).toEqual({
file_path: testFile,
offset: 289,
limit: 140,
});
expect((result.hookSpecificOutput as any).updatedInput).toBeUndefined();
});
it('preserves user-supplied offset only', async () => {
const future = Date.now() + 60_000;
fetchSpy = spyOn(globalThis, 'fetch').mockResolvedValue(
makeObservationsResponse([{ id: 1, created_at_epoch: future }])
);
const result = await fileContextHandler.execute({
sessionId: 'sess',
cwd: tmpDir,
toolName: 'Read',
toolInput: { file_path: testFile, offset: 100 },
});
expect(result.hookSpecificOutput!.updatedInput).toEqual({
file_path: testFile,
offset: 100,
});
expect((result.hookSpecificOutput!.updatedInput as any).limit).toBeUndefined();
});
it('preserves user-supplied limit only', async () => {
const future = Date.now() + 60_000;
fetchSpy = spyOn(globalThis, 'fetch').mockResolvedValue(
makeObservationsResponse([{ id: 1, created_at_epoch: future }])
);
const result = await fileContextHandler.execute({
sessionId: 'sess',
cwd: tmpDir,
toolName: 'Read',
toolInput: { file_path: testFile, limit: 50 },
});
expect(result.hookSpecificOutput!.updatedInput).toEqual({
file_path: testFile,
limit: 50,
});
// offset must NOT be present
expect((result.hookSpecificOutput!.updatedInput as any).offset).toBeUndefined();
});
it('bypasses truncation when file mtime is newer than newest observation (#1719 fix)', async () => {
it('skips entirely when file mtime is newer than newest observation (#1719 still honored)', async () => {
// Backdate observations 1 hour into the past so the just-written file is newer.
const stale = Date.now() - 3_600_000;
fetchSpy = spyOn(globalThis, 'fetch').mockResolvedValue(
@@ -190,12 +150,12 @@ describe('fileContextHandler — cache validation fix (#1719)', () => {
toolInput: { file_path: testFile },
});
// Pass-through: no hookSpecificOutput, no updatedInput rewrite
// Pass-through: no hookSpecificOutput
expect(result.continue).toBe(true);
expect(result.hookSpecificOutput).toBeUndefined();
});
it('still truncates when file mtime is older than newest observation', async () => {
it('still injects context when file mtime is older than newest observation', async () => {
// Backdate the file by 1 hour, observations stamped "now"
const past = (Date.now() - 3_600_000) / 1000;
utimesSync(testFile, past, past);
@@ -213,13 +173,11 @@ describe('fileContextHandler — cache validation fix (#1719)', () => {
});
expect(result.hookSpecificOutput).toBeDefined();
expect(result.hookSpecificOutput!.updatedInput).toEqual({
file_path: testFile,
limit: 1,
});
expect(result.hookSpecificOutput!.additionalContext).toContain('prior observations');
expect((result.hookSpecificOutput as any).updatedInput).toBeUndefined();
});
it('targeted-read header line reflects that the section was read normally', async () => {
it('header text no longer claims the file was truncated', async () => {
const future = Date.now() + 60_000;
fetchSpy = spyOn(globalThis, 'fetch').mockResolvedValue(
makeObservationsResponse([{ id: 1, created_at_epoch: future }])
@@ -229,11 +187,12 @@ describe('fileContextHandler — cache validation fix (#1719)', () => {
sessionId: 'sess',
cwd: tmpDir,
toolName: 'Read',
toolInput: { file_path: testFile, offset: 10, limit: 20 },
toolInput: { file_path: testFile },
});
const ctx = result.hookSpecificOutput!.additionalContext;
expect(ctx).toContain('The requested section was read normally');
const ctx = result.hookSpecificOutput!.additionalContext as string;
expect(ctx).not.toContain('Only line 1 was read');
// The new copy explicitly states the Read result is the full requested section.
expect(ctx).toContain('full requested section');
});
});
@@ -132,10 +132,10 @@ describe('MigrationRunner', () => {
expect(versions).toContain(11); // discovery_tokens
expect(versions).toContain(16); // pending_messages
expect(versions).toContain(17); // rename columns
expect(versions).toContain(19); // repair (noop)
expect(versions).toContain(20); // failed_at_epoch
expect(versions).toContain(21); // ON UPDATE CASCADE
expect(versions).toContain(22); // content_hash
expect(versions).toContain(30); // observations.metadata
});
});
@@ -310,13 +310,16 @@ describe('SettingsDefaultsManager', () => {
describe('get', () => {
it('should return default value for key', () => {
expect(SettingsDefaultsManager.get('CLAUDE_MEM_MODEL')).toBe('claude-sonnet-4-6');
expect(SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT')).toBe('37777');
// Per-UID port: 37700 + (uid % 100). See SettingsDefaultsManager.ts.
const expectedPort = String(37700 + ((process.getuid?.() ?? 77) % 100));
expect(SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT')).toBe(expectedPort);
});
});
describe('getInt', () => {
it('should return integer value for numeric string', () => {
expect(SettingsDefaultsManager.getInt('CLAUDE_MEM_WORKER_PORT')).toBe(37777);
const expectedPort = 37700 + ((process.getuid?.() ?? 77) % 100);
expect(SettingsDefaultsManager.getInt('CLAUDE_MEM_WORKER_PORT')).toBe(expectedPort);
expect(SettingsDefaultsManager.getInt('CLAUDE_MEM_CONTEXT_OBSERVATIONS')).toBe(50);
});
});
@@ -438,9 +441,10 @@ describe('SettingsDefaultsManager', () => {
const result = SettingsDefaultsManager.loadFromFile(settingsPath);
// Priority check:
// Default is 37777, file is 22222, env is 33333
// Default is per-UID (37700 + uid%100), file is 22222, env is 33333
// Result should be env (33333) because env > file > default
expect(defaults.CLAUDE_MEM_WORKER_PORT).toBe('37777'); // Confirm default
const expectedDefault = String(37700 + ((process.getuid?.() ?? 77) % 100));
expect(defaults.CLAUDE_MEM_WORKER_PORT).toBe(expectedDefault); // Confirm default
expect(result.CLAUDE_MEM_WORKER_PORT).toBe('33333'); // Env wins
});
});
+28
@@ -133,6 +133,34 @@ describe('sanitizeEnv', () => {
expect(result.HOME).toBe('/home/user');
});
it('strips proxy env vars (uppercase and lowercase) so the worker subprocess is not routed through the user proxy', () => {
const result = sanitizeEnv({
HTTP_PROXY: 'http://bad-proxy:1234',
HTTPS_PROXY: 'http://bad-proxy:1234',
ALL_PROXY: 'socks5://bad-proxy:1080',
NO_PROXY: 'localhost,127.0.0.1',
http_proxy: 'http://bad-proxy:1234',
https_proxy: 'http://bad-proxy:1234',
all_proxy: 'socks5://bad-proxy:1080',
no_proxy: 'localhost,127.0.0.1',
npm_config_proxy: 'http://bad-proxy:1234',
npm_config_https_proxy: 'http://bad-proxy:1234',
PATH: '/usr/bin'
});
expect(result.HTTP_PROXY).toBeUndefined();
expect(result.HTTPS_PROXY).toBeUndefined();
expect(result.ALL_PROXY).toBeUndefined();
expect(result.NO_PROXY).toBeUndefined();
expect(result.http_proxy).toBeUndefined();
expect(result.https_proxy).toBeUndefined();
expect(result.all_proxy).toBeUndefined();
expect(result.no_proxy).toBeUndefined();
expect(result.npm_config_proxy).toBeUndefined();
expect(result.npm_config_https_proxy).toBeUndefined();
expect(result.PATH).toBe('/usr/bin');
});
it('selectively preserves only allowed CLAUDE_CODE_* vars while stripping others', () => {
const result = sanitizeEnv({
CLAUDE_CODE_OAUTH_TOKEN: 'my-oauth-token',
@@ -0,0 +1,187 @@
/**
* MemoryRoutes Tests: POST /api/memory/save (#2116)
*
* Asserts:
* - `metadata` is persisted verbatim (no silent drop)
* - top-level `project` wins; `metadata.project` used as fallback
* - unknown top-level fields are rejected (400); no silent drop
* - chromaSync is invoked when present, skipped when absent
*/
import { describe, it, expect, mock, beforeEach, afterEach, spyOn } from 'bun:test';
import type { Request, Response } from 'express';
import { logger } from '../../../../src/utils/logger.js';
mock.module('../../../../src/shared/paths.js', () => ({
getPackageRoot: () => '/tmp/test',
}));
mock.module('../../../../src/shared/worker-utils.js', () => ({
getWorkerPort: () => 37777,
}));
import { MemoryRoutes } from '../../../../src/services/worker/http/routes/MemoryRoutes.js';
let loggerSpies: ReturnType<typeof spyOn>[] = [];
function createMockReqRes(body: any): { req: Partial<Request>; res: Partial<Response>; jsonSpy: ReturnType<typeof mock>; statusSpy: ReturnType<typeof mock> } {
const jsonSpy = mock(() => {});
const statusSpy = mock(() => ({ json: jsonSpy }));
return {
req: { body, path: '/api/memory/save', query: {} } as Partial<Request>,
res: { json: jsonSpy, status: statusSpy } as unknown as Partial<Response>,
jsonSpy,
statusSpy,
};
}
function captureChain(mockApp: any, targetPath: string): (req: Request, res: Response) => void {
let middleware: ((req: Request, res: Response, next: () => void) => void) | undefined;
let handler: ((req: Request, res: Response) => void) | undefined;
mockApp.post = mock((path: string, ...rest: any[]) => {
if (path !== targetPath) return;
if (rest.length === 1) {
handler = rest[0];
} else {
middleware = rest[0];
handler = rest[1];
}
});
return (req: Request, res: Response): void => {
if (!middleware) {
handler!(req, res);
return;
}
let nextCalled = false;
middleware(req, res, () => {
nextCalled = true;
});
if (nextCalled) handler!(req, res);
};
}
describe('MemoryRoutes — POST /api/memory/save (#2116)', () => {
  let routes: MemoryRoutes;
  let mockStoreObservation: ReturnType<typeof mock>;
  let mockGetOrCreateManualSession: ReturnType<typeof mock>;
  let storeObservationCalls: any[][] = [];

  beforeEach(() => {
    loggerSpies = [
      spyOn(logger, 'info').mockImplementation(() => {}),
      spyOn(logger, 'debug').mockImplementation(() => {}),
      spyOn(logger, 'warn').mockImplementation(() => {}),
      spyOn(logger, 'error').mockImplementation(() => {}),
      spyOn(logger, 'failure').mockImplementation(() => {}),
    ];
    storeObservationCalls = [];
    mockStoreObservation = mock((...args: any[]) => {
      storeObservationCalls.push(args);
      return { id: 42, createdAtEpoch: 1234567890 };
    });
    mockGetOrCreateManualSession = mock((project: string) => `manual-${project}`);
    const mockDbManager = {
      getSessionStore: () => ({
        storeObservation: mockStoreObservation,
        getOrCreateManualSession: mockGetOrCreateManualSession,
      }),
      // Return null so we skip the chroma path in tests
      getChromaSync: () => null,
    };
    routes = new MemoryRoutes(mockDbManager as any, 'claude-mem');
  });

  afterEach(() => {
    loggerSpies.forEach(spy => spy.mockRestore());
    mock.restore();
  });
  // captureChain must install the post spy before setupRoutes registers routes.
  function buildHandler(): (req: Request, res: Response) => void {
    const mockApp: any = {
      get: mock(() => {}),
      delete: mock(() => {}),
      use: mock(() => {}),
    };
    const handler = captureChain(mockApp, '/api/memory/save');
    routes.setupRoutes(mockApp as any);
    return handler;
  }
  it('persists arbitrary metadata as JSON-encoded string', () => {
    const handler = buildHandler();
    const metadata = {
      obsidian_note: 'Atom — Test',
      claude_mem_version: '12.4.4',
      custom_key: 'value',
    };
    const { req, res } = createMockReqRes({ text: 'hello', metadata });
    handler(req as Request, res as Response);
    expect(mockStoreObservation).toHaveBeenCalledTimes(1);
    const observationArg = storeObservationCalls[0][2];
    expect(observationArg.metadata).toBe(JSON.stringify(metadata));
  });
  it('passes metadata: null when none provided', () => {
    const handler = buildHandler();
    const { req, res } = createMockReqRes({ text: 'hello' });
    handler(req as Request, res as Response);
    const observationArg = storeObservationCalls[0][2];
    expect(observationArg.metadata).toBeNull();
  });
  it('uses top-level project when present', () => {
    const handler = buildHandler();
    const { req, res } = createMockReqRes({
      text: 'hello',
      project: 'top-level-project',
      metadata: { project: 'metadata-project' },
    });
    handler(req as Request, res as Response);
    expect(mockGetOrCreateManualSession).toHaveBeenCalledWith('top-level-project');
    expect(storeObservationCalls[0][1]).toBe('top-level-project');
  });
  it('falls back to metadata.project when top-level project is omitted (#2116)', () => {
    const handler = buildHandler();
    const { req, res } = createMockReqRes({
      text: 'hello',
      metadata: { project: 'my-custom-project' },
    });
    handler(req as Request, res as Response);
    expect(mockGetOrCreateManualSession).toHaveBeenCalledWith('my-custom-project');
    expect(storeObservationCalls[0][1]).toBe('my-custom-project');
  });
  it('falls back to defaultProject when no project supplied anywhere', () => {
    const handler = buildHandler();
    const { req, res } = createMockReqRes({ text: 'hello' });
    handler(req as Request, res as Response);
    expect(mockGetOrCreateManualSession).toHaveBeenCalledWith('claude-mem');
    expect(storeObservationCalls[0][1]).toBe('claude-mem');
  });
  it('rejects unknown top-level fields with HTTP 400 (no silent drop)', () => {
    const handler = buildHandler();
    const { req, res, statusSpy } = createMockReqRes({ text: 'hello', foo: 'bar' });
    handler(req as Request, res as Response);
    expect(statusSpy).toHaveBeenCalledWith(400);
    expect(mockStoreObservation).not.toHaveBeenCalled();
  });
  it('rejects empty/missing text with HTTP 400', () => {
    const handler = buildHandler();
    const { req, res, statusSpy } = createMockReqRes({});
    handler(req as Request, res as Response);
    expect(statusSpy).toHaveBeenCalledWith(400);
    expect(mockStoreObservation).not.toHaveBeenCalled();
  });
});