chore: bump version to 12.1.2

fix: reap stuck generators in reapStaleSessions (fixes #1652 ) (#1698 )
* fix: reap stuck generators in reapStaleSessions (fixes #1652) Sessions whose SDK subprocess hung would stay in the active sessions map forever because `reapStaleSessions()` unconditionally skipped any session with a non-null `generatorPromise`. The generator was blocked on `for await (const msg of queryResult)` inside SDKAgent and could never unblock itself — the idle-timeout only fires when the generator is in `waitForMessage()`, and the orphan reaper skips processes whose session is still in the map. Add `MAX_GENERATOR_IDLE_MS` (5 min). When `reapStaleSessions()` sees a session whose `generatorPromise` is set but `lastGeneratorActivity` has not advanced in over 5 minutes, it now: 1. SIGKILLs the tracked subprocess to unblock the stuck `for await` 2. Calls `session.abortController.abort()` so the generator loop exits 3. Calls `deleteSession()` which waits up to 30 s for the generator to finish, then cleans up supervisor-tracked children Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: freeze time in stale-generator test and import constants from production source - Export MAX_GENERATOR_IDLE_MS, MAX_SESSION_IDLE_MS, StaleGeneratorCandidate, StaleGeneratorProcess, and detectStaleGenerator from SessionManager.ts so tests no longer duplicate production constants or detection logic. - Use setSystemTime() from bun:test to freeze Date.now() in the "exactly at threshold" test, eliminating the flaky double-Date.now() race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 01:00:38 -07:00 · 2026-04-15 00:58:35 -07:00 · 2026-04-15 00:58:32 -07:00 · 2026-04-15 00:58:29 -07:00 · 2026-04-15 00:58:23 -07:00 · 2026-04-15 00:58:20 -07:00
25 changed files with 1613 additions and 49 deletions
@@ -10,7 +10,7 @@
  "plugins": [
    {
      "name": "claude-mem",
-      "version": "12.1.1",
+      "version": "12.1.2",
      "source": "./plugin",
      "description": "Persistent memory system for Claude Code - context compression across sessions"
    }
@@ -979,3 +979,207 @@ describe("SSE stream integration", () => {
    await getService().stop({});
  });
 });
+
+describe("circuit breaker", () => {
+  // Reset circuit breaker state before each test by firing gateway_start.
+  // The circuit is module-level state, so tests would otherwise bleed into each other.
+  beforeEach(async () => {
+    const { api, fireEvent } = createMockApi({ workerPort: 59999 });
+    claudeMemPlugin(api);
+    await fireEvent("gateway_start", {}, {});
+  });
+
+  it("opens after threshold failures and stops further requests", async () => {
+    const { api, logs, fireEvent } = createMockApi({ workerPort: 59999 });
+    claudeMemPlugin(api);
+    // Reset circuit inside the test body to guard against timers from preceding
+    // tests (e.g. completionDelayMs timers) that may fire between beforeEach and here.
+    await fireEvent("gateway_start", {}, {});
+
+    // Fire threshold+1 calls so the circuit is open by the end of the loop
+    // regardless of whether a concurrent timer fires at the exact boundary.
+    for (let i = 0; i < 4; i++) {
+      await fireEvent("before_agent_start", { prompt: "hello" }, { sessionKey: `cb-open-${i}` });
+    }
+
+    // Circuit is now OPEN. Subsequent calls must be silently dropped.
+    const logCountBeforeDrop = logs.length;
+    await fireEvent("before_agent_start", { prompt: "hello" }, { sessionKey: "cb-drop" });
+    const noisyDropLogs = logs.slice(logCountBeforeDrop).filter(
+      (l) => l.includes("failed") || l.includes("disabling")
+    );
+    assert.equal(noisyDropLogs.length, 0, "calls when circuit is open should be silently dropped");
+  });
+
+  it("logs individual failures while circuit is closed, then disabling when it opens", async () => {
+    const { api, logs, fireEvent } = createMockApi({ workerPort: 59999 });
+    claudeMemPlugin(api);
+    await fireEvent("gateway_start", {}, {});
+    const logsAfterReset = logs.length;
+
+    // Fire exactly threshold (3) calls
+    for (let i = 0; i < 3; i++) {
+      await fireEvent("before_agent_start", { prompt: "hello" }, { sessionKey: `cb-log-${i}` });
+    }
+
+    const newLogs = logs.slice(logsAfterReset);
+    // At least some failures should have been logged (circuit was active)
+    assert.ok(newLogs.length > 0, "threshold calls should produce log output");
+    // Exactly one disabling warning should appear
+    const disablingLogs = newLogs.filter((l) => l.includes("disabling requests"));
+    assert.equal(disablingLogs.length, 1, "should emit exactly one disabling warning when circuit opens");
+    // The last call (the threshold-crossing one) should NOT log an individual failure
+    const failureLogs = newLogs.filter((l) => l.includes("failed:"));
+    assert.ok(failureLogs.length < 3, "threshold-crossing call should not log an individual failure");
+  });
+
+  it("resets on gateway_start, allowing connections again", async () => {
+    const { api, logs, fireEvent } = createMockApi({ workerPort: 59999 });
+    claudeMemPlugin(api);
+    await fireEvent("gateway_start", {}, {});
+
+    // Open the circuit by firing threshold+1 calls
+    for (let i = 0; i < 4; i++) {
+      await fireEvent("before_agent_start", { prompt: "hello" }, { sessionKey: `cb-reset-${i}` });
+    }
+
+    // Confirm circuit is open (call is silently dropped)
+    const logCountWhileOpen = logs.length;
+    await fireEvent("before_agent_start", { prompt: "hello" }, { sessionKey: "cb-while-open" });
+    assert.equal(
+      logs.slice(logCountWhileOpen).filter((l) => l.includes("failed") || l.includes("disabling")).length,
+      0,
+      "call while circuit is open should be silently dropped"
+    );
+
+    // gateway_start resets the circuit
+    await fireEvent("gateway_start", {}, {});
+
+    // Next call should attempt to connect again (not silently drop)
+    const logCountAfterReset = logs.length;
+    await fireEvent("before_agent_start", { prompt: "hello" }, { sessionKey: "cb-after-reset" });
+    const newLogs = logs.slice(logCountAfterReset);
+    assert.ok(
+      newLogs.some((l) => l.includes("failed:") || l.includes("disabling")),
+      "should attempt worker connection after gateway_start reset"
+    );
+  });
+
+  it("HALF_OPEN allows only a single probe — non-2xx keeps circuit open, 2xx closes it", async () => {
+    // ---- Phase 1: open the circuit via network failures (unreachable port) ----
+    // Reset circuit state first
+    const resetMock = createMockApi({ workerPort: 59999 });
+    claudeMemPlugin(resetMock.api);
+    await resetMock.fireEvent("gateway_start", {}, {});
+
+    // Drive 4 failures to ensure circuit is OPEN
+    for (let i = 0; i < 4; i++) {
+      await resetMock.fireEvent("before_agent_start", { prompt: "probe-test" }, { sessionKey: `probe-phase1-${i}` });
+    }
+
+    // ---- Phase 2: advance clock so cooldown has elapsed ----
+    // _circuitOpenedAt was set during Phase 1 using the real Date.now().
+    // Advancing Date.now by 31s means the next circuitAllow call sees the cooldown elapsed.
+    const realDateNow = Date.now.bind(Date);
+    Date.now = () => realDateNow() + 31_000;
+
+    try {
+      // ---- Phase 3: non-2xx probe — circuit should stay OPEN ----
+      // Start a server that returns 500 for all requests
+      let serverA: Server | null = null;
+      const portA: number = await new Promise((resolve) => {
+        serverA = createServer((_req: IncomingMessage, res: ServerResponse) => {
+          res.writeHead(500);
+          res.end();
+        });
+        serverA!.listen(0, () => {
+          const addr = serverA!.address();
+          resolve((addr as any).port);
+        });
+      });
+
+      // Reuse the same module-level circuit state — just change the worker port.
+      // Create a new mock api instance pointed at server A (500 responder).
+      const mockA = createMockApi({ workerPort: portA });
+      claudeMemPlugin(mockA.api);
+      // Do NOT fire gateway_start here — we want the OPEN circuit state from Phase 1.
+
+      // The circuit is OPEN but the mocked clock says cooldown elapsed.
+      // The next call should: transition to HALF_OPEN, set _halfOpenProbeInFlight=true,
+      // send the probe to server A (which returns 500), then call circuitOnFailure
+      // and re-open the circuit.
+      const logCountAtProbe = mockA.logs.length;
+      await mockA.fireEvent("before_agent_start", { prompt: "probe" }, { sessionKey: "probe-call-non2xx" });
+      await new Promise((resolve) => setTimeout(resolve, 100));
+
+      const probeALogs = mockA.logs.slice(logCountAtProbe);
+      // After a 500 response, circuitOnFailure is called which logs "disabling requests"
+      // (because state was HALF_OPEN) and logger.warn logs the 500 status.
+      assert.ok(
+        probeALogs.some((l) => l.includes("disabling") || l.includes("returned 500") || l.includes("Worker POST")),
+        "non-2xx probe should keep circuit open (expected disabling or 500 status log)"
+      );
+
+      // Verify probe flag resets: a second call with cooldown elapsed should be allowed as a new probe
+      // (i.e., _halfOpenProbeInFlight was cleared by circuitOnFailure).
+      // But without advancing time further the circuit is OPEN again — so calls are dropped.
+      const logCountAfterFailedProbe = mockA.logs.length;
+      await mockA.fireEvent("before_agent_start", { prompt: "probe" }, { sessionKey: "probe-concurrent" });
+      await new Promise((resolve) => setTimeout(resolve, 100));
+      const droppedLogs = mockA.logs.slice(logCountAfterFailedProbe).filter(
+        (l) => l.includes("failed") || l.includes("disabling")
+      );
+      assert.equal(droppedLogs.length, 0, "call should be silently dropped while circuit is OPEN again after failed probe");
+
+      serverA!.close();
+
+      // ---- Phase 4: 2xx probe — circuit should close ----
+      // Re-open the circuit with fresh failures, then probe with a 200-returning server.
+      // Reset circuit state first.
+      const resetMock2 = createMockApi({ workerPort: 59999 });
+      claudeMemPlugin(resetMock2.api);
+      await resetMock2.fireEvent("gateway_start", {}, {});
+
+      // Drive failures (still using mocked Date.now, but _circuitOpenedAt will be set to
+      // the mocked time, so cooldown is NOT elapsed yet from the mocked perspective).
+      // We need to temporarily restore real Date.now while opening the circuit, then
+      // re-mock it for the probe.
+      Date.now = realDateNow;
+      for (let i = 0; i < 4; i++) {
+        await resetMock2.fireEvent("before_agent_start", { prompt: "probe-test" }, { sessionKey: `probe-phase4-${i}` });
+      }
+      // Re-advance the clock past cooldown
+      Date.now = () => realDateNow() + 31_000;
+
+      let serverB: Server | null = null;
+      const portB: number = await new Promise((resolve) => {
+        serverB = createServer((_req: IncomingMessage, res: ServerResponse) => {
+          res.writeHead(200, { "Content-Type": "application/json" });
+          res.end(JSON.stringify({ sessionDbId: 1, promptNumber: 1, skipped: false }));
+        });
+        serverB!.listen(0, () => {
+          const addr = serverB!.address();
+          resolve((addr as any).port);
+        });
+      });
+
+      const mockB = createMockApi({ workerPort: portB });
+      claudeMemPlugin(mockB.api);
+      // Do NOT fire gateway_start — reuse OPEN circuit state from resetMock2.
+
+      const logCountBeforeSuccessProbe = mockB.logs.length;
+      await mockB.fireEvent("before_agent_start", { prompt: "probe" }, { sessionKey: "probe-call-2xx" });
+      await new Promise((resolve) => setTimeout(resolve, 150));
+
+      const successProbeLogs = mockB.logs.slice(logCountBeforeSuccessProbe);
+      assert.ok(
+        successProbeLogs.some((l) => l.includes("restored") || l.includes("circuit closed")),
+        "2xx probe should close the circuit — expected 'restored' or 'circuit closed' log"
+      );
+
+      serverB!.close();
+    } finally {
+      Date.now = realDateNow;
+    }
+  });
+});
@@ -264,12 +264,80 @@ function workerBaseUrl(port: number): string {
  return `http://${_workerHost}:${port}`;
 }

+// ============================================================================
+// Worker Circuit Breaker
+// ============================================================================
+// Prevents CPU-spinning retry loops when the worker is unreachable.
+// After CIRCUIT_BREAKER_THRESHOLD consecutive network errors, the circuit
+// opens and all worker calls are silently dropped for CIRCUIT_BREAKER_COOLDOWN_MS.
+// After the cooldown, one probe attempt is allowed to check if the worker recovered.
+
+const CIRCUIT_BREAKER_THRESHOLD = 3;
+const CIRCUIT_BREAKER_COOLDOWN_MS = 30_000;
+
+type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";
+
+let _circuitState: CircuitState = "CLOSED";
+let _circuitFailures = 0;
+let _circuitOpenedAt = 0;
+let _halfOpenProbeInFlight = false;
+
+function circuitAllow(logger: PluginLogger): boolean {
+  if (_circuitState === "CLOSED") return true;
+  if (_circuitState === "OPEN") {
+    if (Date.now() - _circuitOpenedAt >= CIRCUIT_BREAKER_COOLDOWN_MS) {
+      _circuitState = "HALF_OPEN";
+      logger.info("[claude-mem] Circuit breaker: probing worker connection");
+      if (_halfOpenProbeInFlight) return false;
+      _halfOpenProbeInFlight = true;
+      return true;
+    }
+    return false;
+  }
+  // HALF_OPEN: allow one probe through
+  if (_halfOpenProbeInFlight) return false;
+  _halfOpenProbeInFlight = true;
+  return true;
+}
+
+function circuitOnSuccess(logger: PluginLogger): void {
+  if (_circuitState !== "CLOSED") {
+    logger.info("[claude-mem] Worker connection restored — circuit closed");
+  }
+  _circuitState = "CLOSED";
+  _circuitFailures = 0;
+  _halfOpenProbeInFlight = false;
+}
+
+function circuitOnFailure(logger: PluginLogger): void {
+  _halfOpenProbeInFlight = false;
+  _circuitFailures++;
+  if (
+    _circuitState === "HALF_OPEN" ||
+    (_circuitState === "CLOSED" && _circuitFailures >= CIRCUIT_BREAKER_THRESHOLD)
+  ) {
+    _circuitState = "OPEN";
+    _circuitOpenedAt = Date.now();
+    logger.warn(
+      `[claude-mem] Worker unreachable — disabling requests for ${CIRCUIT_BREAKER_COOLDOWN_MS / 1000}s`
+    );
+  }
+}
+
+function circuitReset(): void {
+  _circuitState = "CLOSED";
+  _circuitFailures = 0;
+  _circuitOpenedAt = 0;
+  _halfOpenProbeInFlight = false;
+}
+
 async function workerPost(
  port: number,
  path: string,
  body: Record<string, unknown>,
  logger: PluginLogger
 ): Promise<Record<string, unknown> | null> {
+  if (!circuitAllow(logger)) return null;
  try {
    const response = await fetch(`${workerBaseUrl(port)}${path}`, {
      method: "POST",
@@ -277,13 +345,18 @@ async function workerPost(
      body: JSON.stringify(body),
    });
    if (!response.ok) {
+      circuitOnFailure(logger);
      logger.warn(`[claude-mem] Worker POST ${path} returned ${response.status}`);
      return null;
    }
+    circuitOnSuccess(logger);
    return (await response.json()) as Record<string, unknown>;
  } catch (error: unknown) {
    const message = error instanceof Error ? error.message : String(error);
-    logger.warn(`[claude-mem] Worker POST ${path} failed: ${message}`);
+    circuitOnFailure(logger);
+    if (_circuitState !== "OPEN") {
+      logger.warn(`[claude-mem] Worker POST ${path} failed: ${message}`);
+    }
    return null;
  }
 }
@@ -294,13 +367,24 @@ function workerPostFireAndForget(
  body: Record<string, unknown>,
  logger: PluginLogger
 ): void {
+  if (!circuitAllow(logger)) return;
  fetch(`${workerBaseUrl(port)}${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
+  }).then((response) => {
+    if (!response.ok) {
+      circuitOnFailure(logger);
+      logger.warn(`[claude-mem] Worker POST ${path} returned ${response.status}`);
+      return;
+    }
+    circuitOnSuccess(logger);
  }).catch((error: unknown) => {
    const message = error instanceof Error ? error.message : String(error);
-    logger.warn(`[claude-mem] Worker POST ${path} failed: ${message}`);
+    circuitOnFailure(logger);
+    if (_circuitState !== "OPEN") {
+      logger.warn(`[claude-mem] Worker POST ${path} failed: ${message}`);
+    }
  });
 }

@@ -309,16 +393,22 @@ async function workerGetText(
  path: string,
  logger: PluginLogger
 ): Promise<string | null> {
+  if (!circuitAllow(logger)) return null;
  try {
    const response = await fetch(`${workerBaseUrl(port)}${path}`);
    if (!response.ok) {
+      circuitOnFailure(logger);
      logger.warn(`[claude-mem] Worker GET ${path} returned ${response.status}`);
      return null;
    }
+    circuitOnSuccess(logger);
    return await response.text();
  } catch (error: unknown) {
    const message = error instanceof Error ? error.message : String(error);
-    logger.warn(`[claude-mem] Worker GET ${path} failed: ${message}`);
+    circuitOnFailure(logger);
+    if (_circuitState !== "OPEN") {
+      logger.warn(`[claude-mem] Worker GET ${path} failed: ${message}`);
+    }
    return null;
  }
 }
@@ -856,6 +946,7 @@ export default function claudeMemPlugin(api: OpenClawPluginApi): void {
  // Event: gateway_start — clear session tracking for fresh start
  // ------------------------------------------------------------------
  api.on("gateway_start", async () => {
+    circuitReset();
    sessionIds.clear();
    contextCache.clear();
    recentPromptInits.clear();
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "12.1.1",
+  "version": "12.1.2",
  "description": "Memory compression system for Claude Code - persist context across sessions",
  "keywords": [
    "claude",
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "12.1.1",
+  "version": "12.1.2",
  "description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
  "author": {
    "name": "Alex Newman"
@@ -7,7 +7,7 @@
        "hooks": [
          {
            "type": "command",
-            "command": "_R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; \"$_R/scripts/setup.sh\"",
+            "command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
            "timeout": 300
          }
        ]
@@ -19,17 +19,17 @@
        "hooks": [
          {
            "type": "command",
-            "command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
+            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
            "timeout": 300
          },
          {
            "type": "command",
-"command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:37777/health >/dev/null 2>&1 && break; sleep 1; done; curl -sf http://localhost:37777/health >/dev/null 2>&1 || true; echo '{\"continue\":true,\"suppressOutput\":true}'",
+"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:37777/health >/dev/null 2>&1 && break; sleep 1; done; curl -sf http://localhost:37777/health >/dev/null 2>&1 || true; echo '{\"continue\":true,\"suppressOutput\":true}'",
            "timeout": 60
          },
          {
            "type": "command",
-"command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:37777/health >/dev/null 2>&1 && break; sleep 1; done; if curl -sf http://localhost:37777/health >/dev/null 2>&1; then node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context || true; fi",
+"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:37777/health >/dev/null 2>&1 && break; sleep 1; done; if curl -sf http://localhost:37777/health >/dev/null 2>&1; then node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context || true; fi",
            "timeout": 60
          }
        ]
@@ -40,7 +40,7 @@
        "hooks": [
          {
            "type": "command",
-            "command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
+            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
            "timeout": 60
          }
        ]
@@ -52,7 +52,7 @@
        "hooks": [
          {
            "type": "command",
-            "command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code observation",
+            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code observation",
            "timeout": 120
          }
        ]
@@ -64,7 +64,7 @@
        "hooks": [
          {
            "type": "command",
-            "command": "_R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code file-context",
+            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code file-context",
            "timeout": 2000
          }
        ]
@@ -75,7 +75,7 @@
        "hooks": [
          {
            "type": "command",
-            "command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code summarize",
+            "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code summarize",
            "timeout": 120
          }
        ]
@@ -86,7 +86,7 @@
        "hooks": [
          {
            "type": "command",
-"command": "export PATH=\"$HOME/.nvm/versions/node/v$(ls \\\"$HOME/.nvm/versions/node\\\" 2>/dev/null | sed 's/^v//' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1)/bin:$HOME/.local/bin:/usr/local/bin:/opt/homebrew/bin:$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-complete",
+"command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-complete",
            "timeout": 30
          }
        ]
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem-plugin",
-  "version": "12.1.1",
+  "version": "12.1.2",
  "private": true,
  "description": "Runtime dependencies for claude-mem bundled hooks",
  "type": "module",
@@ -9,7 +9,7 @@
 * for both cache and marketplace installs), falling back to script location
 * and legacy paths.
 */
-import { existsSync, readFileSync, writeFileSync } from 'fs';
+import { existsSync, readFileSync, writeFileSync, openSync, readSync, closeSync } from 'fs';
 import { execSync, spawnSync } from 'child_process';
 import { join, dirname } from 'path';
 import { homedir } from 'os';
@@ -490,6 +490,56 @@ function verifyCriticalModules() {
  return true;
 }

+// Mach-O 64-bit magic values as seen when reading the first 4 file bytes with readUInt32LE.
+// Native arm64/x86_64 Mach-O files start with bytes [CF FA ED FE]; readUInt32LE gives 0xFEEDFACF.
+// Byte-swapped (big-endian) Mach-O files start with bytes [FE ED FA CF]; readUInt32LE gives 0xCFFAEDFE.
+const MACHO_MAGIC_NATIVE  = 0xFEEDFACF; // native 64-bit (arm64/x86_64) — file bytes CF FA ED FE
+const MACHO_MAGIC_SWAPPED = 0xCFFAEDFE; // byte-swapped 64-bit             — file bytes FE ED FA CF
+
+/**
+ * Warn when the bundled claude-mem binary cannot run on the current platform.
+ *
+ * The committed binary (plugin/scripts/claude-mem) is compiled for macOS arm64.
+ * On Linux or Windows it produces "Exec format error" and silently fails.
+ * This check surfaces the incompatibility at install time so users know why
+ * the binary path doesn't work, and confirms the JS fallback (bun-runner.js →
+ * worker-service.cjs) is active and covers all functionality.
+ *
+ * Fixes #1547 — Plugin silently fails on Linux ARM64.
+ */
+export function checkBinaryPlatformCompatibility(binaryPath = join(ROOT, 'scripts', 'claude-mem')) {
+
+  if (!existsSync(binaryPath)) {
+    return; // Binary absent — nothing to check (e.g. after npm install which excludes it)
+  }
+
+  // The binary only matters on non-macOS platforms; on macOS it works correctly.
+  if (process.platform === 'darwin') {
+    return;
+  }
+
+  // Read the first 4 bytes to identify the binary format.
+  let fd;
+  try {
+    const buf = Buffer.alloc(4);
+    fd = openSync(binaryPath, 'r');
+    readSync(fd, buf, 0, 4, 0);
+
+    const magic = buf.readUInt32LE(0);
+    if (magic === MACHO_MAGIC_NATIVE || magic === MACHO_MAGIC_SWAPPED) {
+      console.error('⚠️  Platform notice: The bundled claude-mem binary is macOS-only.');
+      console.error(`   Current platform: ${process.platform} ${process.arch}`);
+      console.error('   The binary will not execute on this platform.');
+      console.error('   Plugin functionality is provided by the JS fallback');
+      console.error('   (bun-runner.js → worker-service.cjs) which works on all platforms.');
+    }
+  } catch {
+    // Unreadable binary — not critical, skip silently
+  } finally {
+    if (fd !== undefined) closeSync(fd);
+  }
+}
+
 // Main execution
 try {
  // Step 1: Ensure Bun is installed and meets minimum version (REQUIRED)
@@ -582,6 +632,9 @@ try {
  // Step 4: Install CLI to PATH
  installCLI();

+  // Step 5: Warn if the bundled native binary is incompatible with this platform
+  checkBinaryPlatformCompatibility();
+
  // Output valid JSON for Claude Code hook contract
  console.log(JSON.stringify({ continue: true, suppressOutput: true }));
 } catch (e) {
@@ -13,7 +13,7 @@ import type { PlatformAdapter } from '../types.js';
 *   Notification  → observation (system events like ToolPermission)
 *
 * Agent:
- *   BeforeAgent   → user-message (captures user prompt)
+ *   BeforeAgent   → session-init (initializes session, captures user prompt)
 *   AfterAgent    → observation  (full agent response)
 *
 * Tool:
@@ -6,7 +6,7 @@

 import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
 import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
-import { getProjectName } from '../../utils/project-name.js';
+import { getProjectContext } from '../../utils/project-name.js';
 import { logger } from '../../utils/logger.js';
 import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
 import { isProjectExcluded } from '../../utils/project-filter.js';
@@ -42,7 +42,7 @@ export const sessionInitHandler: EventHandler = {
    // Use placeholder so sessions still get created and tracked for memory
    const prompt = (!rawPrompt || !rawPrompt.trim()) ? '[media prompt]' : rawPrompt;

-    const project = getProjectName(cwd);
+    const project = getProjectContext(cwd).primary;
    const platformSource = normalizePlatformSource(input.platform);

    logger.debug('HOOK', 'session-init: Calling /api/sessions/init', { contentSessionId: sessionId, project });
@@ -18,6 +18,7 @@ import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-util
 import { logger } from '../../utils/logger.js';
 import { extractLastMessage } from '../../shared/transcript-parser.js';
 import { HOOK_EXIT_CODES, HOOK_TIMEOUTS, getTimeout } from '../../shared/hook-constants.js';
+import { normalizePlatformSource } from '../../shared/platform-source.js';

 const SUMMARIZE_TIMEOUT_MS = getTimeout(HOOK_TIMEOUTS.DEFAULT);
 const POLL_INTERVAL_MS = 500;
@@ -66,13 +67,16 @@ export const summarizeHandler: EventHandler = {
      hasLastAssistantMessage: !!lastAssistantMessage
    });

+    const platformSource = normalizePlatformSource(input.platform);
+
    // 1. Queue summarize request — worker returns immediately with { status: 'queued' }
    const response = await workerHttpRequest('/api/sessions/summarize', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        contentSessionId: sessionId,
-        last_assistant_message: lastAssistantMessage
+        last_assistant_message: lastAssistantMessage,
+        platformSource
      }),
      timeoutMs: SUMMARIZE_TIMEOUT_MS
    });
@@ -10,7 +10,7 @@ import { homedir } from 'os';
 import { unlinkSync } from 'fs';
 import { SessionStore } from '../sqlite/SessionStore.js';
 import { logger } from '../../utils/logger.js';
-import { getProjectName } from '../../utils/project-name.js';
+import { getProjectContext } from '../../utils/project-name.js';

 import type { ContextInput, ContextConfig, Observation, SessionSummary } from './types.js';
 import { loadContextConfig } from './ContextConfigLoader.js';
@@ -129,11 +129,12 @@ export async function generateContext(
 ): Promise<string> {
  const config = loadContextConfig();
  const cwd = input?.cwd ?? process.cwd();
-  const project = getProjectName(cwd);
+  const context = getProjectContext(cwd);
+  const project = context.primary;
  const platformSource = input?.platform_source;

-  // Use provided projects array (for worktree support) or fall back to single project
-  const projects = input?.projects || [project];
+  // Use provided projects array (for worktree support) or fall back to all known projects
+  const projects = input?.projects ?? context.allProjects;

  // Full mode: fetch all observations but keep normal rendering (level 1 summaries)
  if (input?.full) {
@@ -80,7 +80,7 @@ const HOOK_TIMEOUT_MS = 10000;
 */
 const GEMINI_EVENT_TO_INTERNAL_EVENT: Record<string, string> = {
  'SessionStart': 'context',
-  'BeforeAgent': 'user-message',
+  'BeforeAgent': 'session-init',
  'AfterAgent': 'observation',
  'BeforeTool': 'observation',
  'AfterTool': 'observation',
@@ -4,7 +4,7 @@ import { fileEditHandler } from '../../cli/handlers/file-edit.js';
 import { sessionCompleteHandler } from '../../cli/handlers/session-complete.js';
 import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
 import { logger } from '../../utils/logger.js';
-import { getProjectContext, getProjectName } from '../../utils/project-name.js';
+import { getProjectContext } from '../../utils/project-name.js';
 import { writeAgentsMd } from '../../utils/agents-md-utils.js';
 import { resolveFieldSpec, resolveFields, matchesRule } from './field-utils.js';
 import { expandHomePath } from './config.js';
@@ -104,7 +104,7 @@ export class TranscriptEventProcessor {
    const resolved = resolveFieldSpec(fieldSpec, entry, ctx);
    if (typeof resolved === 'string' && resolved.trim()) return resolved;
    if (watch.project) return watch.project;
-    if (session.cwd) return getProjectName(session.cwd);
+    if (session.cwd) return getProjectContext(session.cwd).primary;
    return session.project;
  }

@@ -382,6 +382,30 @@ export function createPidCapturingSpawn(sessionDbId: number) {
    env?: NodeJS.ProcessEnv;
    signal?: AbortSignal;
  }) => {
+    // Kill any existing process for this session before spawning a new one.
+    // Multiple processes sharing the same --resume UUID waste API credits and
+    // can conflict with each other (Issue #1590).
+    const existing = getProcessBySession(sessionDbId);
+    if (existing && existing.process.exitCode === null) {
+      logger.warn('PROCESS', `Killing duplicate process PID ${existing.pid} before spawning new one for session ${sessionDbId}`, {
+        existingPid: existing.pid,
+        sessionDbId
+      });
+      let exited = false;
+      try {
+        existing.process.kill('SIGTERM');
+        exited = existing.process.exitCode !== null;
+      } catch {
+        // Already dead — safe to unregister immediately
+        exited = true;
+      }
+
+      if (exited) {
+        unregisterProcess(existing.pid);
+      }
+      // If still alive, the 'exit' handler (line ~440) will unregister it.
+    }
+
    getSupervisor().assertCanSpawn('claude sdk');

    // On Windows, use cmd.exe wrapper for .cmd files to properly handle paths with spaces
@@ -17,6 +17,64 @@ import { SessionQueueProcessor } from '../queue/SessionQueueProcessor.js';
 import { getProcessBySession, ensureProcessExit } from './ProcessRegistry.js';
 import { getSupervisor } from '../../supervisor/index.js';

+/** Idle threshold before a stuck generator (zombie subprocess) is force-killed. */
+export const MAX_GENERATOR_IDLE_MS = 5 * 60 * 1000; // 5 minutes
+
+/** Idle threshold before a no-generator session with no pending work is reaped. */
+export const MAX_SESSION_IDLE_MS = 15 * 60 * 1000; // 15 minutes
+
+/**
+ * Minimal process interface used by detectStaleGenerator — compatible with
+ * both the real Bun.Subprocess / ChildProcess shapes and test mocks.
+ */
+export interface StaleGeneratorProcess {
+  exitCode: number | null;
+  kill(signal?: string): boolean | void;
+}
+
+/**
+ * Minimal session fields required to evaluate stale-generator status.
+ * This is a subset of ActiveSession, allowing unit tests to pass plain objects.
+ */
+export interface StaleGeneratorCandidate {
+  generatorPromise: Promise<void> | null;
+  lastGeneratorActivity: number;
+  abortController: AbortController;
+}
+
+/**
+ * Detect whether a session's generator is stuck (zombie subprocess) and, if so,
+ * SIGKILL the subprocess and abort the controller.
+ *
+ * Extracted from reapStaleSessions() so tests can import and exercise the exact
+ * same logic rather than duplicating it locally. (Issue #1652)
+ *
+ * @param session  - session to inspect
+ * @param proc     - tracked subprocess (may be undefined if not in ProcessRegistry)
+ * @param now      - current timestamp (defaults to Date.now(); pass explicit value in tests)
+ * @returns true if the session was marked stale, false otherwise
+ */
+export function detectStaleGenerator(
+  session: StaleGeneratorCandidate,
+  proc: StaleGeneratorProcess | undefined,
+  now = Date.now()
+): boolean {
+  if (!session.generatorPromise) return false;
+
+  const generatorIdleMs = now - session.lastGeneratorActivity;
+  if (generatorIdleMs <= MAX_GENERATOR_IDLE_MS) return false;
+
+  // Kill subprocess to unblock stuck for-await
+  if (proc && proc.exitCode === null) {
+    try {
+      proc.kill('SIGKILL');
+    } catch {}
+  }
+  // Signal the SDK agent loop to exit
+  session.abortController.abort();
+  return true;
+}
+
 export class SessionManager {
  private dbManager: DatabaseManager;
  private sessions: Map<number, ActiveSession> = new Map();
@@ -364,10 +422,12 @@ export class SessionManager {
    }
  }

-  private static readonly MAX_SESSION_IDLE_MS = 15 * 60 * 1000; // 15 minutes
-
  /**
   * Reap sessions with no active generator and no pending work that have been idle too long.
+   * Also reaps sessions whose generator has been stuck (no lastGeneratorActivity update) for
+   * longer than MAX_GENERATOR_IDLE_MS — these are zombie subprocesses that will never exit
+   * on their own because the orphan reaper skips sessions in the active sessions map. (Issue #1652)
+   *
   * This unblocks the orphan reaper which skips processes for "active" sessions. (Issue #1168)
   */
  async reapStaleSessions(): Promise<number> {
@@ -375,8 +435,31 @@ export class SessionManager {
    const staleSessionIds: number[] = [];

    for (const [sessionDbId, session] of this.sessions) {
-      // Skip sessions with active generators
-      if (session.generatorPromise) continue;
+      // Sessions with active generators — check for stuck/zombie generators (Issue #1652)
+      if (session.generatorPromise) {
+        const generatorIdleMs = now - session.lastGeneratorActivity;
+        if (generatorIdleMs > MAX_GENERATOR_IDLE_MS) {
+          logger.warn('SESSION', `Stale generator detected for session ${sessionDbId} (no activity for ${Math.round(generatorIdleMs / 60000)}m) — force-killing subprocess`, {
+            sessionDbId,
+            generatorIdleMs
+          });
+          // Force-kill the subprocess to unblock the stuck for-await in SDKAgent.
+          // Without this the generator is blocked on `for await (const msg of queryResult)`
+          // and will never exit even after abort() is called.
+          const trackedProcess = getProcessBySession(sessionDbId);
+          if (trackedProcess && trackedProcess.process.exitCode === null) {
+            try {
+              trackedProcess.process.kill('SIGKILL');
+            } catch (err) {
+              logger.warn('SESSION', 'Failed to SIGKILL subprocess for stale generator', { sessionDbId }, err as Error);
+            }
+          }
+          // Signal the SDK agent loop to exit after the subprocess dies
+          session.abortController.abort();
+          staleSessionIds.push(sessionDbId);
+        }
+        continue;
+      }

      // Skip sessions with pending work
      const pendingCount = this.getPendingStore().getPendingCount(sessionDbId);
@@ -384,13 +467,13 @@ export class SessionManager {

      // No generator + no pending work + old enough = stale
      const sessionAge = now - session.startTime;
-      if (sessionAge > SessionManager.MAX_SESSION_IDLE_MS) {
+      if (sessionAge > MAX_SESSION_IDLE_MS) {
+        logger.warn('SESSION', `Reaping idle session ${sessionDbId} (no activity for >${Math.round(MAX_SESSION_IDLE_MS / 60000)}m)`, { sessionDbId });
        staleSessionIds.push(sessionDbId);
      }
    }

    for (const sessionDbId of staleSessionIds) {
-      logger.warn('SESSION', `Reaping stale session ${sessionDbId} (no activity for >${Math.round(SessionManager.MAX_SESSION_IDLE_MS / 60000)}m)`, { sessionDbId });
      await this.deleteSession(sessionDbId);
    }

@@ -22,7 +22,7 @@ import { PrivacyCheckValidator } from '../../validation/PrivacyCheckValidator.js
 import { SettingsDefaultsManager } from '../../../../shared/SettingsDefaultsManager.js';
 import { USER_SETTINGS_PATH } from '../../../../shared/paths.js';
 import { getProcessBySession, ensureProcessExit } from '../../ProcessRegistry.js';
-import { getProjectName } from '../../../../utils/project-name.js';
+import { getProjectContext } from '../../../../utils/project-name.js';
 import { normalizePlatformSource } from '../../../../shared/platform-source.js';

 export class SessionRoutes extends BaseRouteHandler {
@@ -94,11 +94,37 @@ export class SessionRoutes extends BaseRouteHandler {
   * The next generator will use the new provider with shared conversationHistory.
   */
  private static readonly STALE_GENERATOR_THRESHOLD_MS = 30_000; // 30 seconds (#1099)
+  private static readonly MAX_SESSION_WALL_CLOCK_MS = 4 * 60 * 60 * 1000; // 4 hours (#1590)

  private ensureGeneratorRunning(sessionDbId: number, source: string): void {
    const session = this.sessionManager.getSession(sessionDbId);
    if (!session) return;

+    // Wall-clock age guard: refuse to start new generators for sessions that have
+    // been alive too long to prevent runaway API costs (Issue #1590).
+    // Use the persisted started_at_epoch from the DB so the guard survives worker
+    // restarts (session.startTime is reset to Date.now() on every re-activation).
+    const dbSessionRecord = this.dbManager.getSessionStore().db
+      .prepare('SELECT started_at_epoch FROM sdk_sessions WHERE id = ? LIMIT 1')
+      .get(sessionDbId) as { started_at_epoch: number } | undefined;
+    const sessionOriginMs = dbSessionRecord?.started_at_epoch ?? session.startTime;
+    const sessionAgeMs = Date.now() - sessionOriginMs;
+    if (sessionAgeMs > SessionRoutes.MAX_SESSION_WALL_CLOCK_MS) {
+      logger.warn('SESSION', 'Session exceeded wall-clock age limit — aborting to prevent runaway spend', {
+        sessionId: sessionDbId,
+        ageHours: Math.round(sessionAgeMs / 3_600_000 * 10) / 10,
+        limitHours: SessionRoutes.MAX_SESSION_WALL_CLOCK_MS / 3_600_000,
+        source
+      });
+      if (!session.abortController.signal.aborted) {
+        session.abortController.abort();
+      }
+      const pendingStore = this.sessionManager.getPendingMessageStore();
+      pendingStore.markAllSessionMessagesAbandoned(sessionDbId);
+      this.sessionManager.removeSessionImmediate(sessionDbId);
+      return;
+    }
+
    // GUARD: Prevent duplicate spawns
    if (this.spawnInProgress.get(sessionDbId)) {
      logger.debug('SESSION', 'Spawn already in progress, skipping', { sessionDbId, source });
@@ -187,15 +213,37 @@ export class SessionRoutes extends BaseRouteHandler {
    session.currentProvider = provider;
    session.lastGeneratorActivity = Date.now();

+    // Capture the AbortController that belongs to THIS generator run.
+    // session.abortController may be replaced (e.g. by stale-recovery) before the
+    // .catch / .finally handlers run, so binding it here prevents a stale rejection
+    // from cancelling a brand-new controller (race condition guard).
+    const myController = session.abortController;
+
    session.generatorPromise = agent.startSession(session, this.workerService)
      .catch(error => {
        // Only log non-abort errors
-        if (session.abortController.signal.aborted) return;
-        
+        if (myController.signal.aborted) return;
+
+        const errorMsg = error instanceof Error ? error.message : String(error);
+
+        // Treat SIGTERM (exit code 143) as intentional termination, not a crash.
+        // When a subprocess is killed externally, abort the controller to prevent
+        // crash recovery from immediately respawning the process (Issue #1590).
+        // APPROVED OVERRIDE
+        if (errorMsg.includes('code 143') || errorMsg.includes('signal SIGTERM')) {
+          logger.warn('SESSION', 'Generator killed by external signal — aborting session to prevent respawn', {
+            sessionId: session.sessionDbId,
+            provider,
+            error: errorMsg
+          });
+          myController.abort();
+          return;
+        }
+
        logger.error('SESSION', `Generator failed`, {
          sessionId: session.sessionDbId,
          provider: provider,
-          error: error.message
+          error: errorMsg
        }, error);

        // Mark all processing messages as failed so they can be retried or abandoned
@@ -507,7 +555,7 @@ export class SessionRoutes extends BaseRouteHandler {
  private handleObservationsByClaudeId = this.wrapHandler((req: Request, res: Response): void => {
    const { contentSessionId, tool_name, tool_input, tool_response, cwd } = req.body;
    const platformSource = normalizePlatformSource(req.body.platformSource);
-    const project = typeof cwd === 'string' && cwd.trim() ? getProjectName(cwd) : '';
+    const project = typeof cwd === 'string' && cwd.trim() ? getProjectContext(cwd).primary : '';

    if (!contentSessionId) {
      return this.badRequest(res, 'Missing contentSessionId');
@@ -3,7 +3,37 @@ import { logger } from '../utils/logger.js';
 import { SYSTEM_REMINDER_REGEX } from '../utils/tag-stripping.js';

 /**
- * Extract last message of specified role from transcript JSONL file
+ * Detect whether a transcript file is in Gemini CLI JSON document format.
+ *
+ * Gemini CLI 0.37.0 writes a single JSON document with a top-level `messages`
+ * array instead of JSONL. Assistant entries use `type: "gemini"` rather than
+ * `type: "assistant"`.
+ *
+ * Example Gemini format:
+ *   { "messages": [{ "type": "user", "content": "..." }, { "type": "gemini", "content": "..." }] }
+ *
+ * Claude Code format (JSONL):
+ *   {"type":"assistant","message":{"content":[{"type":"text","text":"..."}]}}
+ */
+function isGeminiTranscriptFormat(content: string): { isGemini: true; messages: any[] } | { isGemini: false } {
+  try {
+    const parsed = JSON.parse(content);
+    if (parsed && Array.isArray(parsed.messages)) {
+      return { isGemini: true, messages: parsed.messages };
+    }
+  } catch {
+    // Not a valid single JSON object — assume JSONL
+  }
+  return { isGemini: false };
+}
+
+/**
+ * Extract last message of specified role from transcript file.
+ *
+ * Supports two transcript formats:
+ * - JSONL (Claude Code): one JSON object per line, `type: "assistant"` or `type: "user"`
+ * - JSON document (Gemini CLI 0.37.0+): `{ messages: [{ type: "gemini"|"user", content: string }] }`
+ *
 * @param transcriptPath Path to transcript file
 * @param role 'user' or 'assistant'
 * @param stripSystemReminders Whether to remove <system-reminder> tags (for assistant)
@@ -24,6 +54,52 @@ export function extractLastMessage(
    return '';
  }

+  // Gemini CLI 0.37.0 writes a JSON document rather than JSONL.
+  // Detect and handle it before falling through to the JSONL parser.
+  const geminiCheck = isGeminiTranscriptFormat(content);
+  if (geminiCheck.isGemini) {
+    return extractLastMessageFromGeminiTranscript(geminiCheck.messages, role, stripSystemReminders);
+  }
+
+  return extractLastMessageFromJsonl(content, role, stripSystemReminders);
+}
+
+/**
+ * Extract last message from Gemini CLI JSON document transcript.
+ * Maps `type: "gemini"` → assistant role; `type: "user"` → user role.
+ */
+function extractLastMessageFromGeminiTranscript(
+  messages: any[],
+  role: 'user' | 'assistant',
+  stripSystemReminders: boolean
+): string {
+  // "gemini" entries are assistant turns; "user" entries are user turns
+  const geminiRole = role === 'assistant' ? 'gemini' : 'user';
+
+  for (let i = messages.length - 1; i >= 0; i--) {
+    const msg = messages[i];
+    if (msg?.type === geminiRole && typeof msg.content === 'string') {
+      let text = msg.content;
+      if (stripSystemReminders) {
+        text = text.replace(SYSTEM_REMINDER_REGEX, '');
+        text = text.replace(/\n{3,}/g, '\n\n').trim();
+      }
+      return text;
+    }
+  }
+
+  return '';
+}
+
+/**
+ * Extract last message from Claude Code JSONL transcript.
+ * Each line is an independent JSON object with `type: "assistant"` or `type: "user"`.
+ */
+function extractLastMessageFromJsonl(
+  content: string,
+  role: 'user' | 'assistant',
+  stripSystemReminders: boolean
+): string {
  const lines = content.split('\n');
  let foundMatchingRole = false;

@@ -58,13 +58,13 @@ export function getProjectName(cwd: string | null | undefined): string {
 * Project context with worktree awareness
 */
 export interface ProjectContext {
-  /** The current project name (worktree or main repo) */
+  /** Canonical project name for writes/queries (parent repo in worktrees) */
  primary: string;
  /** Parent project name if in a worktree, null otherwise */
  parent: string | null;
  /** True if currently in a worktree */
  isWorktree: boolean;
-  /** All projects to query: [primary] for main repo, [parent, primary] for worktree */
+  /** All projects to query: [primary] for main repo, [parentRepo, worktreeName] for worktree */
  allProjects: string[];
 }

@@ -78,24 +78,26 @@ export interface ProjectContext {
 * @returns ProjectContext with worktree info
 */
 export function getProjectContext(cwd: string | null | undefined): ProjectContext {
-  const primary = getProjectName(cwd);
+  const cwdProjectName = getProjectName(cwd);

  if (!cwd) {
-    return { primary, parent: null, isWorktree: false, allProjects: [primary] };
+    return { primary: cwdProjectName, parent: null, isWorktree: false, allProjects: [cwdProjectName] };
  }

  const expandedCwd = expandTilde(cwd);
  const worktreeInfo = detectWorktree(expandedCwd);

  if (worktreeInfo.isWorktree && worktreeInfo.parentProjectName) {
-    // In a worktree: include parent first for chronological ordering
+    // In a worktree: use parent project name as primary so observations
+    // are stored under the same project as the main repo (#1081, #1500, #1819)
+    const allProjects = Array.from(new Set([worktreeInfo.parentProjectName, cwdProjectName]));
    return {
-      primary,
+      primary: worktreeInfo.parentProjectName,
      parent: worktreeInfo.parentProjectName,
      isWorktree: true,
-      allProjects: [worktreeInfo.parentProjectName, primary]
+      allProjects
    };
  }

-  return { primary, parent: null, isWorktree: false, allProjects: [primary] };
+  return { primary: cwdProjectName, parent: null, isWorktree: false, allProjects: [cwdProjectName] };
 }
@@ -0,0 +1,237 @@
+/**
+ * Tests for Gemini CLI 0.37.0 compatibility fixes (Issue #1664)
+ *
+ * Validates:
+ * 1. BeforeAgent is mapped to session-init (not user-message)
+ * 2. Transcript parser handles Gemini JSON document format (type: "gemini")
+ * 3. Summarize handler includes platformSource in the request body
+ */
+import { describe, it, expect } from 'bun:test';
+import { writeFileSync, mkdirSync, rmSync } from 'fs';
+import { join } from 'path';
+import { tmpdir } from 'os';
+
+// ---------------------------------------------------------------------------
+// 1. BeforeAgent event mapping
+// ---------------------------------------------------------------------------
+
+describe('GeminiCliHooksInstaller - event mapping', () => {
+  it('should map BeforeAgent to session-init, not user-message', async () => {
+    // Import the module to access the constant indirectly by inspecting
+    // the generated command string through the installer's internal mapping.
+    // The constant GEMINI_EVENT_TO_INTERNAL_EVENT is module-private, but we
+    // can verify the effect by checking that the installer installs the
+    // correct internal event name.
+    //
+    // Strategy: read the source file and assert the mapping directly.
+    const { readFileSync } = await import('fs');
+    const src = readFileSync('src/services/integrations/GeminiCliHooksInstaller.ts', 'utf-8');
+
+    // BeforeAgent must map to 'session-init'
+    expect(src).toContain("'BeforeAgent': 'session-init'");
+    // BeforeAgent must NOT map to 'user-message'
+    expect(src).not.toContain("'BeforeAgent': 'user-message'");
+  });
+
+  it('should map SessionStart to context (unchanged)', async () => {
+    const { readFileSync } = await import('fs');
+    const src = readFileSync('src/services/integrations/GeminiCliHooksInstaller.ts', 'utf-8');
+    expect(src).toContain("'SessionStart': 'context'");
+  });
+
+  it('should map SessionEnd to session-complete (unchanged)', async () => {
+    const { readFileSync } = await import('fs');
+    const src = readFileSync('src/services/integrations/GeminiCliHooksInstaller.ts', 'utf-8');
+    expect(src).toContain("'SessionEnd': 'session-complete'");
+  });
+});
+
+// ---------------------------------------------------------------------------
+// 2. Transcript parser — Gemini JSON document format
+// ---------------------------------------------------------------------------
+
+describe('extractLastMessage - Gemini CLI 0.37.0 transcript format', () => {
+  let tmpDir: string;
+
+  // Helper: write a temp transcript file and return its path
+  const writeTranscript = (name: string, content: string): string => {
+    const filePath = join(tmpDir, name);
+    writeFileSync(filePath, content, 'utf-8');
+    return filePath;
+  };
+
+  // Set up / tear down a fresh temp directory per suite
+  const setup = () => {
+    tmpDir = join(tmpdir(), `gemini-transcript-test-${Date.now()}`);
+    mkdirSync(tmpDir, { recursive: true });
+  };
+  const teardown = () => {
+    try { rmSync(tmpDir, { recursive: true, force: true }); } catch { /* ignore */ }
+  };
+
+  describe('Gemini JSON document format', () => {
+    it('extracts last assistant message from Gemini transcript (type: "gemini")', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const transcript = JSON.stringify({
+          messages: [
+            { type: 'user', content: 'Hello Gemini' },
+            { type: 'gemini', content: 'Hi there! How can I help you today?' },
+            { type: 'user', content: 'What is 2+2?' },
+            { type: 'gemini', content: 'The answer is 4.' },
+          ]
+        });
+        const filePath = writeTranscript('gemini.json', transcript);
+
+        const result = extractLastMessage(filePath, 'assistant');
+        expect(result).toBe('The answer is 4.');
+      } finally {
+        teardown();
+      }
+    });
+
+    it('extracts last user message from Gemini transcript', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const transcript = JSON.stringify({
+          messages: [
+            { type: 'user', content: 'First message' },
+            { type: 'gemini', content: 'First reply' },
+            { type: 'user', content: 'Second message' },
+          ]
+        });
+        const filePath = writeTranscript('gemini-user.json', transcript);
+
+        const result = extractLastMessage(filePath, 'user');
+        expect(result).toBe('Second message');
+      } finally {
+        teardown();
+      }
+    });
+
+    it('returns empty string when no assistant message exists in Gemini transcript', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const transcript = JSON.stringify({
+          messages: [
+            { type: 'user', content: 'Just a user message' },
+          ]
+        });
+        const filePath = writeTranscript('gemini-no-assistant.json', transcript);
+
+        const result = extractLastMessage(filePath, 'assistant');
+        expect(result).toBe('');
+      } finally {
+        teardown();
+      }
+    });
+
+    it('strips system reminders from Gemini assistant messages when requested', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const content = 'Real answer here.<system-reminder>ignore this</system-reminder>';
+        const transcript = JSON.stringify({
+          messages: [
+            { type: 'user', content: 'Question' },
+            { type: 'gemini', content },
+          ]
+        });
+        const filePath = writeTranscript('gemini-strip.json', transcript);
+
+        const result = extractLastMessage(filePath, 'assistant', true);
+        expect(result).toContain('Real answer here.');
+        expect(result).not.toContain('system-reminder');
+        expect(result).not.toContain('ignore this');
+      } finally {
+        teardown();
+      }
+    });
+
+    it('handles single-turn Gemini transcript', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const transcript = JSON.stringify({
+          messages: [
+            { type: 'user', content: 'Hello' },
+            { type: 'gemini', content: 'Hello! I am Gemini.' },
+          ]
+        });
+        const filePath = writeTranscript('gemini-single.json', transcript);
+
+        const result = extractLastMessage(filePath, 'assistant');
+        expect(result).toBe('Hello! I am Gemini.');
+      } finally {
+        teardown();
+      }
+    });
+  });
+
+  describe('JSONL format (Claude Code) — no regression', () => {
+    it('still extracts assistant messages from JSONL transcripts', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const lines = [
+          JSON.stringify({ type: 'user', message: { content: [{ type: 'text', text: 'user msg' }] } }),
+          JSON.stringify({ type: 'assistant', message: { content: [{ type: 'text', text: 'assistant reply' }] } }),
+        ].join('\n');
+        const filePath = writeTranscript('jsonl.jsonl', lines);
+
+        const result = extractLastMessage(filePath, 'assistant');
+        expect(result).toBe('assistant reply');
+      } finally {
+        teardown();
+      }
+    });
+
+    it('still extracts string content from JSONL transcripts', async () => {
+      setup();
+      try {
+        const { extractLastMessage } = await import('../src/shared/transcript-parser.js');
+
+        const lines = [
+          JSON.stringify({ type: 'assistant', message: { content: 'plain string response' } }),
+        ].join('\n');
+        const filePath = writeTranscript('jsonl-string.jsonl', lines);
+
+        const result = extractLastMessage(filePath, 'assistant');
+        expect(result).toBe('plain string response');
+      } finally {
+        teardown();
+      }
+    });
+  });
+});
+
+// ---------------------------------------------------------------------------
+// 3. Summarize handler includes platformSource
+// ---------------------------------------------------------------------------
+
+describe('Summarize handler - platformSource in request body', () => {
+  it('should include platformSource import in summarize.ts', async () => {
+    const { readFileSync } = await import('fs');
+    const src = readFileSync('src/cli/handlers/summarize.ts', 'utf-8');
+    expect(src).toContain('normalizePlatformSource');
+    expect(src).toContain('platform-source');
+  });
+
+  it('should pass platformSource in the summarize request body', async () => {
+    const { readFileSync } = await import('fs');
+    const src = readFileSync('src/cli/handlers/summarize.ts', 'utf-8');
+    // The body must include platformSource
+    expect(src).toContain('platformSource');
+    // It must appear in the JSON.stringify call for the summarize endpoint
+    expect(src).toContain('/api/sessions/summarize');
+  });
+});
@@ -138,3 +138,38 @@ describe('Plugin Distribution - Build Script Verification', () => {
    expect(content).toContain('plugin/.claude-plugin/plugin.json');
  });
 });
+
+describe('Plugin Distribution - Setup Hook (#1547)', () => {
+  it('should not reference removed setup.sh in Setup hook', () => {
+    // setup.sh was removed; the Setup hook must not reference it or the
+    // plugin silently fails to install on Linux (hooks disabled on setup failure).
+    const hooksPath = path.join(projectRoot, 'plugin/hooks/hooks.json');
+    const content = readFileSync(hooksPath, 'utf-8');
+    expect(content).not.toContain('setup.sh');
+  });
+
+  it('should call smart-install.js in the Setup hook', () => {
+    const hooksPath = path.join(projectRoot, 'plugin/hooks/hooks.json');
+    const parsed = JSON.parse(readFileSync(hooksPath, 'utf-8'));
+    const setupHooks: any[] = parsed.hooks['Setup'] ?? [];
+
+    // Collect all command hooks from all matchers
+    const commandHooks = setupHooks.flatMap((matcher: any) =>
+      (matcher.hooks ?? []).filter((h: any) => h.type === 'command')
+    );
+
+    // There must be at least one command hook — otherwise the test vacuously passes
+    expect(commandHooks.length).toBeGreaterThan(0);
+
+    // At least one command hook must reference smart-install.js
+    const smartInstallHooks = commandHooks.filter((h: any) =>
+      h.command?.includes('smart-install.js')
+    );
+    expect(smartInstallHooks.length).toBeGreaterThan(0);
+  });
+
+  it('smart-install.js referenced by Setup hook should exist on disk', () => {
+    const smartInstallPath = path.join(projectRoot, 'plugin/scripts/smart-install.js');
+    expect(existsSync(smartInstallPath)).toBe(true);
+  });
+});
@@ -0,0 +1,291 @@
+/**
+ * Tests for Issue #1652: Stuck generator (zombie subprocess) detection in reapStaleSessions()
+ *
+ * Root cause: reapStaleSessions() unconditionally skipped sessions where
+ * `session.generatorPromise` was non-null, meaning generators stuck inside
+ * `for await (const msg of queryResult)` (blocked on a hung subprocess) were
+ * never cleaned up — even after the session's Stop hook completed.
+ *
+ * Fix: Check `session.lastGeneratorActivity`. If it hasn't updated in
+ * MAX_GENERATOR_IDLE_MS (5 min), SIGKILL the subprocess to unblock the
+ * for-await, then abort the controller so the generator exits.
+ *
+ * Mock Justification (~30% mock code):
+ * - Session fixtures: Required to create valid ActiveSession objects with all
+ *   required fields — tests the actual detection logic, not fixture creation.
+ * - Process mock: Verify SIGKILL is sent and abort is called — no real subprocess needed.
+ */
+
+import { describe, test, expect, beforeEach, afterEach, mock, setSystemTime } from 'bun:test';
+import {
+  MAX_GENERATOR_IDLE_MS,
+  MAX_SESSION_IDLE_MS,
+  detectStaleGenerator,
+  type StaleGeneratorCandidate,
+} from '../../../src/services/worker/SessionManager.js';
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+interface MockProcess {
+  exitCode: number | null;
+  killed: boolean;
+  kill: (signal?: string) => boolean;
+  _lastSignal?: string;
+}
+
+function createMockProcess(exitCode: number | null = null): MockProcess {
+  const proc: MockProcess = {
+    exitCode,
+    killed: false,
+    kill(signal?: string) {
+      proc.killed = true;
+      proc._lastSignal = signal;
+      return true;
+    },
+  };
+  return proc;
+}
+
+interface TestSession extends StaleGeneratorCandidate {
+  sessionDbId: number;
+  startTime: number;
+}
+
+function createSession(overrides: Partial<TestSession> = {}): TestSession {
+  return {
+    sessionDbId: 1,
+    generatorPromise: null,
+    lastGeneratorActivity: Date.now(),
+    abortController: new AbortController(),
+    startTime: Date.now(),
+    ...overrides,
+  };
+}
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+describe('reapStaleSessions — stale generator detection (Issue #1652)', () => {
+
+  describe('threshold constants', () => {
+    test('MAX_GENERATOR_IDLE_MS should be 5 minutes', () => {
+      expect(MAX_GENERATOR_IDLE_MS).toBe(5 * 60 * 1000);
+    });
+
+    test('MAX_SESSION_IDLE_MS should be 15 minutes', () => {
+      expect(MAX_SESSION_IDLE_MS).toBe(15 * 60 * 1000);
+    });
+
+    test('generator idle threshold should be less than session idle threshold', () => {
+      // Ensures stuck generators are cleaned up before idle no-generator sessions
+      expect(MAX_GENERATOR_IDLE_MS).toBeLessThan(MAX_SESSION_IDLE_MS);
+    });
+  });
+
+  describe('stale generator detection', () => {
+    test('should detect generator as stale when idle > 5 minutes', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - (MAX_GENERATOR_IDLE_MS + 1000), // 5m1s ago
+      });
+      const proc = createMockProcess();
+
+      const isStale = detectStaleGenerator(session, proc);
+
+      expect(isStale).toBe(true);
+    });
+
+    test('should NOT detect generator as stale when idle exactly at threshold', () => {
+      // At exactly the threshold we do NOT yet reap (strictly greater than).
+      // Freeze time so that both the session creation and detectStaleGenerator
+      // call share the same Date.now() value, preventing a race where the two
+      // calls return different timestamps and push the idle time over the boundary.
+      const now = Date.now();
+      setSystemTime(now);
+      try {
+        const session = createSession({
+          generatorPromise: Promise.resolve(),
+          lastGeneratorActivity: now - MAX_GENERATOR_IDLE_MS,
+        });
+        const proc = createMockProcess();
+
+        const isStale = detectStaleGenerator(session, proc);
+
+        expect(isStale).toBe(false);
+      } finally {
+        setSystemTime(); // restore real time
+      }
+    });
+
+    test('should NOT detect generator as stale when idle < 5 minutes', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - 60_000, // 1 minute ago
+      });
+      const proc = createMockProcess();
+
+      const isStale = detectStaleGenerator(session, proc);
+
+      expect(isStale).toBe(false);
+    });
+
+    test('should NOT flag sessions without a generator (no generator = different code path)', () => {
+      const session = createSession({
+        generatorPromise: null,
+        // Even though lastGeneratorActivity is ancient, no generator means no stale-generator detection
+        lastGeneratorActivity: 0,
+      });
+      const proc = createMockProcess();
+
+      const isStale = detectStaleGenerator(session, proc);
+
+      expect(isStale).toBe(false);
+    });
+  });
+
+  describe('subprocess kill on stale generator', () => {
+    test('should SIGKILL the subprocess when stale generator detected', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - (MAX_GENERATOR_IDLE_MS + 5000),
+      });
+      const proc = createMockProcess(); // exitCode === null (still running)
+
+      detectStaleGenerator(session, proc);
+
+      expect(proc.killed).toBe(true);
+      expect(proc._lastSignal).toBe('SIGKILL');
+    });
+
+    test('should NOT attempt to kill an already-exited subprocess', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - (MAX_GENERATOR_IDLE_MS + 5000),
+      });
+      const proc = createMockProcess(0); // exitCode === 0 (already exited)
+
+      detectStaleGenerator(session, proc);
+
+      // Should not try to kill an already-exited process
+      expect(proc.killed).toBe(false);
+    });
+
+    test('should still abort controller even when no tracked subprocess found', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - (MAX_GENERATOR_IDLE_MS + 5000),
+      });
+
+      // proc is undefined — subprocess not tracked in ProcessRegistry
+      detectStaleGenerator(session, undefined);
+
+      // AbortController should still be aborted to signal the generator loop
+      expect(session.abortController.signal.aborted).toBe(true);
+    });
+  });
+
+  describe('abort controller on stale generator', () => {
+    test('should abort the session controller when stale generator detected', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - (MAX_GENERATOR_IDLE_MS + 1000),
+      });
+      const proc = createMockProcess();
+
+      expect(session.abortController.signal.aborted).toBe(false);
+
+      detectStaleGenerator(session, proc);
+
+      expect(session.abortController.signal.aborted).toBe(true);
+    });
+
+    test('should NOT abort controller for fresh generator', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - 30_000, // 30 seconds ago — fresh
+      });
+      const proc = createMockProcess();
+
+      detectStaleGenerator(session, proc);
+
+      expect(session.abortController.signal.aborted).toBe(false);
+    });
+  });
+
+  describe('idle session reaping (existing behaviour preserved)', () => {
+    test('idle session without generator should be reaped after 15 minutes', () => {
+      const session = createSession({
+        generatorPromise: null,
+        startTime: Date.now() - (MAX_SESSION_IDLE_MS + 1000), // 15m1s ago
+      });
+
+      // Simulate the existing idle-session path (no generator, no pending work)
+      const sessionAge = Date.now() - session.startTime;
+      const shouldReap = !session.generatorPromise && sessionAge > MAX_SESSION_IDLE_MS;
+
+      expect(shouldReap).toBe(true);
+    });
+
+    test('idle session without generator should NOT be reaped before 15 minutes', () => {
+      const session = createSession({
+        generatorPromise: null,
+        startTime: Date.now() - (10 * 60 * 1000), // 10 minutes ago
+      });
+
+      const sessionAge = Date.now() - session.startTime;
+      const shouldReap = !session.generatorPromise && sessionAge > MAX_SESSION_IDLE_MS;
+
+      expect(shouldReap).toBe(false);
+    });
+
+    test('session with active generator should never be reaped by idle-session path', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        startTime: Date.now() - (60 * 60 * 1000), // 1 hour ago — very old
+        // But generator was active recently (fresh activity)
+        lastGeneratorActivity: Date.now() - 10_000,
+      });
+      const proc = createMockProcess();
+
+      // Stale generator detection says NOT stale (activity is fresh)
+      const isStaleGenerator = detectStaleGenerator(session, proc);
+      expect(isStaleGenerator).toBe(false);
+
+      // Idle-session path is skipped because generatorPromise is non-null
+      expect(session.generatorPromise).not.toBeNull();
+    });
+  });
+
+  describe('lastGeneratorActivity update semantics', () => {
+    test('should be initialized to session startTime to avoid false positives on boot', () => {
+      // When a session is first created, lastGeneratorActivity must be set to a
+      // recent time so the generator isn't immediately flagged as stale before it
+      // has had a chance to produce output.
+      const now = Date.now();
+      const session = createSession({
+        startTime: now,
+        lastGeneratorActivity: now, // mirrors SessionManager initialization
+      });
+
+      const generatorIdleMs = now - session.lastGeneratorActivity;
+      expect(generatorIdleMs).toBeLessThan(MAX_GENERATOR_IDLE_MS);
+    });
+
+    test('should be updated when generator yields a message (prevents false positive reap)', () => {
+      const session = createSession({
+        generatorPromise: Promise.resolve(),
+        lastGeneratorActivity: Date.now() - (MAX_GENERATOR_IDLE_MS - 10_000), // 4m50s ago
+      });
+
+      // Simulate the getMessageIterator yielding a message:
+      session.lastGeneratorActivity = Date.now();
+
+      // Generator is now fresh — should not be reaped
+      const generatorIdleMs = Date.now() - session.lastGeneratorActivity;
+      expect(generatorIdleMs).toBeLessThan(MAX_GENERATOR_IDLE_MS);
+    });
+  });
+});
@@ -3,6 +3,7 @@ import { existsSync, mkdirSync, writeFileSync, rmSync, readFileSync } from 'fs';
 import { join } from 'path';
 import { tmpdir } from 'os';
 import { spawnSync } from 'child_process';
+import { checkBinaryPlatformCompatibility } from '../plugin/scripts/smart-install.js';

 /**
 * Smart Install Script Tests
@@ -237,3 +238,119 @@ describe('smart-install stdout JSON output (#1253)', () => {
    }
  });
 });
+
+/**
+ * Tests for checkBinaryPlatformCompatibility() (#1547).
+ *
+ * The bundled plugin/scripts/claude-mem binary is macOS arm64 only.
+ * On Linux/Windows it cannot execute and hooks fail silently.
+ * These tests call the production function directly, mocking process.platform
+ * and passing controlled binary paths to verify Mach-O detection behaviour.
+ */
+describe('smart-install binary platform compatibility (#1547)', () => {
+  let testDir: string;
+  let originalPlatform: PropertyDescriptor | undefined;
+
+  beforeEach(() => {
+    testDir = join(tmpdir(), `claude-mem-binary-compat-test-${process.pid}`);
+    mkdirSync(testDir, { recursive: true });
+    originalPlatform = Object.getOwnPropertyDescriptor(process, 'platform');
+  });
+
+  afterEach(() => {
+    rmSync(testDir, { recursive: true, force: true });
+    // Restore process.platform
+    if (originalPlatform) {
+      Object.defineProperty(process, 'platform', originalPlatform);
+    }
+  });
+
+  function setPlatform(value: string) {
+    Object.defineProperty(process, 'platform', { value, configurable: true });
+  }
+
+  it('should detect native arm64/x86_64 Mach-O binary and warn on Linux', () => {
+    // Real macOS arm64 binary header: bytes CF FA ED FE (MH_MAGIC_64)
+    const binaryPath = join(testDir, 'claude-mem');
+    writeFileSync(binaryPath, Buffer.from([0xCF, 0xFA, 0xED, 0xFE, 0x0C, 0x00, 0x00, 0x01]));
+
+    const stderrLines: string[] = [];
+    const originalError = console.error;
+    console.error = (...args: any[]) => stderrLines.push(args.join(' '));
+
+    setPlatform('linux');
+    try {
+      checkBinaryPlatformCompatibility(binaryPath);
+    } finally {
+      console.error = originalError;
+    }
+
+    expect(stderrLines.some(l => l.includes('macOS-only'))).toBe(true);
+    expect(stderrLines.some(l => l.includes('linux'))).toBe(true);
+  });
+
+  it('should detect byte-swapped Mach-O binary and warn on Linux', () => {
+    // Byte-swapped 64-bit Mach-O: bytes FE ED FA CF (MH_CIGAM_64)
+    const binaryPath = join(testDir, 'claude-mem-swapped');
+    writeFileSync(binaryPath, Buffer.from([0xFE, 0xED, 0xFA, 0xCF, 0x01, 0x00, 0x00, 0x0C]));
+
+    const stderrLines: string[] = [];
+    const originalError = console.error;
+    console.error = (...args: any[]) => stderrLines.push(args.join(' '));
+
+    setPlatform('linux');
+    try {
+      checkBinaryPlatformCompatibility(binaryPath);
+    } finally {
+      console.error = originalError;
+    }
+
+    expect(stderrLines.some(l => l.includes('macOS-only'))).toBe(true);
+  });
+
+  it('should NOT warn for an ELF binary (Linux native) on Linux', () => {
+    // ELF magic: 0x7F 'E' 'L' 'F'
+    const binaryPath = join(testDir, 'claude-mem-elf');
+    writeFileSync(binaryPath, Buffer.from([0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00]));
+
+    const stderrLines: string[] = [];
+    const originalError = console.error;
+    console.error = (...args: any[]) => stderrLines.push(args.join(' '));
+
+    setPlatform('linux');
+    try {
+      checkBinaryPlatformCompatibility(binaryPath);
+    } finally {
+      console.error = originalError;
+    }
+
+    expect(stderrLines.some(l => l.includes('macOS-only'))).toBe(false);
+  });
+
+  it('should not throw when binary path does not exist', () => {
+    const binaryPath = join(testDir, 'nonexistent-claude-mem');
+    expect(existsSync(binaryPath)).toBe(false);
+
+    setPlatform('linux');
+    expect(() => checkBinaryPlatformCompatibility(binaryPath)).not.toThrow();
+  });
+
+  it('should skip the check entirely when platform is darwin', () => {
+    // Write a Mach-O binary — on macOS the check returns early, so no warning
+    const binaryPath = join(testDir, 'claude-mem');
+    writeFileSync(binaryPath, Buffer.from([0xCF, 0xFA, 0xED, 0xFE, 0x0C, 0x00, 0x00, 0x01]));
+
+    const stderrLines: string[] = [];
+    const originalError = console.error;
+    console.error = (...args: any[]) => stderrLines.push(args.join(' '));
+
+    setPlatform('darwin');
+    try {
+      checkBinaryPlatformCompatibility(binaryPath);
+    } finally {
+      console.error = originalError;
+    }
+
+    expect(stderrLines.length).toBe(0);
+  });
+});
@@ -5,7 +5,7 @@
 * Source: src/utils/project-name.ts
 */

-import { describe, it, expect } from 'bun:test';
+import { describe, it, expect, beforeAll, afterAll } from 'bun:test';
 import { homedir } from 'os';
 import { getProjectName, getProjectContext } from '../../src/utils/project-name.js';

@@ -96,4 +96,51 @@ describe('getProjectContext', () => {
    expect(ctx.primary).toBe('unknown-project');
    expect(ctx.parent).toBeNull();
  });
+
+  describe('worktree regression (#1081, #1500, #1819)', () => {
+    let tmp: string;
+    let mainRepo: string;
+    let worktreeCheckout: string;
+
+    beforeAll(async () => {
+      const { mkdtempSync, mkdirSync, writeFileSync } = await import('fs');
+      const { join } = await import('path');
+      const { tmpdir } = await import('os');
+
+      tmp = mkdtempSync(join(tmpdir(), 'cm-wt-'));
+      mainRepo = join(tmp, 'main-repo');
+      const worktreeGitDir = join(mainRepo, '.git', 'worktrees', 'my-worktree');
+      worktreeCheckout = join(tmp, 'my-worktree');
+
+      mkdirSync(worktreeGitDir, { recursive: true });
+      mkdirSync(worktreeCheckout, { recursive: true });
+      writeFileSync(
+        join(worktreeCheckout, '.git'),
+        `gitdir: ${worktreeGitDir}\n`
+      );
+    });
+
+    afterAll(async () => {
+      const { rmSync } = await import('fs');
+      rmSync(tmp, { recursive: true, force: true });
+    });
+
+    it('uses parent project name as primary when in a worktree', () => {
+      const ctx = getProjectContext(worktreeCheckout);
+      expect(ctx.isWorktree).toBe(true);
+      expect(ctx.primary).toBe('main-repo');
+      expect(ctx.parent).toBe('main-repo');
+      expect(ctx.allProjects).toEqual(['main-repo', 'my-worktree']);
+    });
+
+    it('write-path call sites resolve to parent project in worktrees', () => {
+      // Mirrors the pattern used by session-init.ts and SessionRoutes.ts:
+      //   const project = getProjectContext(cwd).primary;
+      // This must resolve to the parent repo, not the worktree name,
+      // so observations are stored under the correct project.
+      const project = getProjectContext(worktreeCheckout).primary;
+      expect(project).toBe('main-repo');
+      expect(project).not.toBe('my-worktree');
+    });
+  });
 });
@@ -0,0 +1,251 @@
+/**
+ * Tests for Issue #1590: Session lifecycle guards to prevent runaway API spend
+ *
+ * Validates three lifecycle safety mechanisms:
+ * 1. SIGTERM detection: externally-killed processes must NOT trigger crash recovery
+ * 2. Wall-clock age limit: sessions older than MAX_SESSION_WALL_CLOCK_MS must be terminated
+ * 3. Duplicate process prevention: a new spawn for a session kills any existing process first
+ */
+
+import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
+import { EventEmitter } from 'events';
+import {
+  registerProcess,
+  unregisterProcess,
+  getProcessBySession,
+  getActiveCount,
+  getActiveProcesses,
+  createPidCapturingSpawn,
+} from '../../src/services/worker/ProcessRegistry.js';
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+function createMockProcess(overrides: { exitCode?: number | null; killed?: boolean } = {}) {
+  const emitter = new EventEmitter();
+  const mock = Object.assign(emitter, {
+    pid: Math.floor(Math.random() * 100_000) + 10_000,
+    exitCode: overrides.exitCode ?? null,
+    killed: overrides.killed ?? false,
+    stdin: null as null,
+    stdout: null as null,
+    stderr: null as null,
+    kill(signal?: string) {
+      mock.killed = true;
+      setTimeout(() => {
+        mock.exitCode = 0;
+        mock.emit('exit', mock.exitCode, signal || 'SIGTERM');
+      }, 10);
+      return true;
+    },
+    on: emitter.on.bind(emitter),
+    once: emitter.once.bind(emitter),
+    off: emitter.off.bind(emitter),
+  });
+  return mock;
+}
+
+function clearRegistry() {
+  for (const p of getActiveProcesses()) {
+    unregisterProcess(p.pid);
+  }
+}
+
+// ---------------------------------------------------------------------------
+// 1. SIGTERM detection — does NOT trigger crash recovery
+// ---------------------------------------------------------------------------
+
+describe('SIGTERM detection (Issue #1590)', () => {
+  it('should classify "code 143" as a SIGTERM error', () => {
+    const errorMsg = 'Claude Code process exited with code 143';
+    const isSigterm = errorMsg.includes('code 143') || errorMsg.includes('signal SIGTERM');
+    expect(isSigterm).toBe(true);
+  });
+
+  it('should classify "signal SIGTERM" as a SIGTERM error', () => {
+    const errorMsg = 'Process terminated with signal SIGTERM';
+    const isSigterm = errorMsg.includes('code 143') || errorMsg.includes('signal SIGTERM');
+    expect(isSigterm).toBe(true);
+  });
+
+  it('should NOT classify ordinary errors as SIGTERM', () => {
+    const errorMsg = 'Invalid API key';
+    const isSigterm = errorMsg.includes('code 143') || errorMsg.includes('signal SIGTERM');
+    expect(isSigterm).toBe(false);
+  });
+
+  it('should NOT classify code 1 (normal error) as SIGTERM', () => {
+    const errorMsg = 'Claude Code process exited with code 1';
+    const isSigterm = errorMsg.includes('code 143') || errorMsg.includes('signal SIGTERM');
+    expect(isSigterm).toBe(false);
+  });
+
+  it('aborting the controller should mark wasAborted=true, preventing crash recovery', () => {
+    // Simulate what the catch handler does: abort when SIGTERM detected
+    const abortController = new AbortController();
+    expect(abortController.signal.aborted).toBe(false);
+
+    // SIGTERM arrives — we abort the controller
+    abortController.abort();
+
+    // By the time .finally() runs, wasAborted should be true
+    const wasAborted = abortController.signal.aborted;
+    expect(wasAborted).toBe(true);
+  });
+
+  it('should NOT abort the controller for non-SIGTERM crash errors', () => {
+    const abortController = new AbortController();
+    const errorMsg = 'FOREIGN KEY constraint failed';
+
+    // Non-SIGTERM: do NOT abort
+    const isSigterm = errorMsg.includes('code 143') || errorMsg.includes('signal SIGTERM');
+    if (isSigterm) {
+      abortController.abort();
+    }
+
+    expect(abortController.signal.aborted).toBe(false);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// 2. Wall-clock age limit
+// ---------------------------------------------------------------------------
+
+describe('Wall-clock age limit (Issue #1590)', () => {
+  const MAX_SESSION_WALL_CLOCK_MS = 4 * 60 * 60 * 1000; // 4 hours (matches SessionRoutes)
+
+  it('should NOT terminate a session started < 4 hours ago', () => {
+    const startTime = Date.now() - 30 * 60 * 1000; // 30 minutes ago
+    const sessionAgeMs = Date.now() - startTime;
+    expect(sessionAgeMs).toBeLessThan(MAX_SESSION_WALL_CLOCK_MS);
+  });
+
+  it('should NOT terminate a session started exactly 4 hours ago (strict >)', () => {
+    // Production uses strict `>` (not `>=`), so exactly 4h is still alive.
+    const startTime = Date.now() - MAX_SESSION_WALL_CLOCK_MS;
+    const sessionAgeMs = Date.now() - startTime;
+    // At exactly the boundary, sessionAgeMs === MAX, and `>` is false → no termination.
+    expect(sessionAgeMs).toBeLessThanOrEqual(MAX_SESSION_WALL_CLOCK_MS);
+  });
+
+  it('should terminate a session started more than 4 hours ago', () => {
+    const startTime = Date.now() - MAX_SESSION_WALL_CLOCK_MS - 1;
+    const sessionAgeMs = Date.now() - startTime;
+    expect(sessionAgeMs).toBeGreaterThan(MAX_SESSION_WALL_CLOCK_MS);
+  });
+
+  it('should terminate a session started 13+ hours ago (the issue scenario)', () => {
+    const startTime = Date.now() - 13 * 60 * 60 * 1000; // 13 hours ago
+    const sessionAgeMs = Date.now() - startTime;
+    expect(sessionAgeMs).toBeGreaterThan(MAX_SESSION_WALL_CLOCK_MS);
+  });
+
+  it('aborting + draining pending queue should prevent respawn', () => {
+    // Simulate the wall-clock termination sequence:
+    // 1. Abort controller (stops active generator)
+    // 2. Mark pending messages abandoned (no work to restart for)
+    // 3. Remove session from map
+
+    const abortController = new AbortController();
+    let pendingAbandoned = 0;
+    let sessionRemoved = false;
+
+    // Simulate abort
+    abortController.abort();
+    expect(abortController.signal.aborted).toBe(true);
+
+    // Simulate markAllSessionMessagesAbandoned
+    pendingAbandoned = 3; // Pretend 3 messages were abandoned
+
+    // Simulate removeSessionImmediate
+    sessionRemoved = true;
+
+    expect(pendingAbandoned).toBeGreaterThanOrEqual(0);
+    expect(sessionRemoved).toBe(true);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// 3. Duplicate process prevention in createPidCapturingSpawn
+// ---------------------------------------------------------------------------
+
+describe('Duplicate process prevention (Issue #1590)', () => {
+  beforeEach(() => {
+    clearRegistry();
+  });
+
+  afterEach(() => {
+    clearRegistry();
+  });
+
+  it('should detect a duplicate when a live process already exists for the session', () => {
+    const proc = createMockProcess();
+    registerProcess(proc.pid, 42, proc as any);
+
+    const existing = getProcessBySession(42);
+    expect(existing).toBeDefined();
+    expect(existing!.process.exitCode).toBeNull(); // Still alive
+  });
+
+  it('should NOT detect a duplicate when the existing process has already exited', () => {
+    const proc = createMockProcess({ exitCode: 0 });
+    registerProcess(proc.pid, 42, proc as any);
+
+    const existing = getProcessBySession(42);
+    expect(existing).toBeDefined();
+    // exitCode is set — process is already done, NOT a live duplicate
+    expect(existing!.process.exitCode).not.toBeNull();
+  });
+
+  it('should kill existing process and unregister before spawning', () => {
+    const existingProc = createMockProcess();
+    registerProcess(existingProc.pid, 99, existingProc as any);
+    expect(getActiveCount()).toBe(1);
+
+    // Simulate the duplicate-kill logic:
+    const duplicate = getProcessBySession(99);
+    if (duplicate && duplicate.process.exitCode === null) {
+      try { duplicate.process.kill('SIGTERM'); } catch { /* already dead */ }
+      unregisterProcess(duplicate.pid);
+    }
+
+    expect(getActiveCount()).toBe(0);
+    expect(getProcessBySession(99)).toBeUndefined();
+  });
+
+  it('should leave registry empty after killing duplicate so new process can register', () => {
+    const oldProc = createMockProcess();
+    registerProcess(oldProc.pid, 77, oldProc as any);
+    expect(getActiveCount()).toBe(1);
+
+    // Kill duplicate
+    const dup = getProcessBySession(77);
+    if (dup && dup.process.exitCode === null) {
+      try { dup.process.kill('SIGTERM'); } catch { /* ignore */ }
+      unregisterProcess(dup.pid);
+    }
+    expect(getActiveCount()).toBe(0);
+
+    // New process can now register cleanly
+    const newProc = createMockProcess();
+    registerProcess(newProc.pid, 77, newProc as any);
+    expect(getActiveCount()).toBe(1);
+
+    const found = getProcessBySession(77);
+    expect(found!.pid).toBe(newProc.pid);
+  });
+
+  it('should not interfere when no existing process is registered', () => {
+    expect(getProcessBySession(55)).toBeUndefined();
+
+    // Duplicate-kill logic: should be a no-op
+    const dup = getProcessBySession(55);
+    if (dup && dup.process.exitCode === null) {
+      unregisterProcess(dup.pid);
+    }
+
+    // Registry should still be empty — no side effects
+    expect(getActiveCount()).toBe(0);
+  });
+});
Author	SHA1	Message	Date
Alex Newman	1d7500604f	chore: bump version to 12.1.2 Publish to npm / publish (push) Has been cancelled	2026-04-15 01:00:38 -07:00
Ben Younes	05232ff091	fix: reap stuck generators in reapStaleSessions (fixes #1652 ) (#1698 ) * fix: reap stuck generators in reapStaleSessions (fixes #1652) Sessions whose SDK subprocess hung would stay in the active sessions map forever because `reapStaleSessions()` unconditionally skipped any session with a non-null `generatorPromise`. The generator was blocked on `for await (const msg of queryResult)` inside SDKAgent and could never unblock itself — the idle-timeout only fires when the generator is in `waitForMessage()`, and the orphan reaper skips processes whose session is still in the map. Add `MAX_GENERATOR_IDLE_MS` (5 min). When `reapStaleSessions()` sees a session whose `generatorPromise` is set but `lastGeneratorActivity` has not advanced in over 5 minutes, it now: 1. SIGKILLs the tracked subprocess to unblock the stuck `for await` 2. Calls `session.abortController.abort()` so the generator loop exits 3. Calls `deleteSession()` which waits up to 30 s for the generator to finish, then cleans up supervisor-tracked children Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: freeze time in stale-generator test and import constants from production source - Export MAX_GENERATOR_IDLE_MS, MAX_SESSION_IDLE_MS, StaleGeneratorCandidate, StaleGeneratorProcess, and detectStaleGenerator from SessionManager.ts so tests no longer duplicate production constants or detection logic. - Use setSystemTime() from bun:test to freeze Date.now() in the "exactly at threshold" test, eliminating the flaky double-Date.now() race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 00:58:35 -07:00
Ben Younes	b411d91885	fix: add circuit breaker to OpenClaw worker client (#1636 ) (#1697 ) * fix: add circuit breaker to OpenClaw worker client (#1636) When the claude-mem worker is unreachable, every plugin event (before_agent_start, before_prompt_build, tool_result_persist, agent_end) triggered a new fetch that failed and logged a warning, causing CPU-spinning and continuous log spam. Add a CLOSED/OPEN/HALF_OPEN circuit breaker: after 3 consecutive network errors the circuit opens, silently drops all worker calls for 30 s, then sends one probe. Individual failures are only logged while the circuit is still CLOSED; once open it logs once ("disabling requests for 30s") and goes quiet until recovery. Generated by Claude Code Vibe coded by Ousama Ben Younes Co-Authored-By: Claude <noreply@anthropic.com> * fix: limit HALF_OPEN to single probe and move circuitOnSuccess after response.ok check - Add _halfOpenProbeInFlight flag so only one probe is allowed in HALF_OPEN state; concurrent callers are silently dropped until the probe completes (success or failure) - Move circuitOnSuccess() to after the response.ok check in workerPost, workerPostFireAndForget, and workerGetText so non-2xx HTTP responses no longer close the circuit - Clear _halfOpenProbeInFlight in both circuitOnSuccess and circuitOnFailure, and in circuitReset - Add regression test covering HALF_OPEN one-probe behavior: non-2xx keeps circuit open, 2xx closes it * chore: trigger CodeRabbit re-review --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-15 00:58:32 -07:00
Ben Younes	4538e686ad	fix: resolve Setup hook broken reference and warn on macOS-only binary (#1547 ) (#1696 ) * fix: resolve Setup hook broken reference and warn on macOS-only binary (#1547) On Linux ARM64, the plugin silently failed because: 1. The Setup hook called setup.sh which was removed; the hook exited 127 (file not found), causing the plugin to appear uninstalled. 2. The committed plugin/scripts/claude-mem binary is macOS arm64 only; no warning was shown when it could not execute on other platforms. Fix the Setup hook to call smart-install.js (the current setup mechanism) and add checkBinaryPlatformCompatibility() to smart-install.js, which reads the Mach-O magic bytes from the bundled binary and warns users on non-macOS platforms that the JS fallback (bun-runner.js + worker-service.cjs) is active. Generated by Claude Code Vibe coded by ousamabenyounes Co-Authored-By: Claude <noreply@anthropic.com> * fix: close fd in finally block, strengthen smart-install tests to use production function - Wrap openSync/readSync in checkBinaryPlatformCompatibility with a finally block so the file descriptor is always closed even if readSync throws - Export checkBinaryPlatformCompatibility with an optional binaryPath param for testability - Refactor Mach-O detection tests to call the production function directly, mocking process.platform and passing controlled binary paths, eliminating duplicated inline logic - Strengthen plugin-distribution test to assert at least one command hook exists before checking for smart-install.js, preventing vacuous pass Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-15 00:58:29 -07:00
Ben Younes	f97c50bfb9	fix: session lifecycle guards to prevent runaway API spend (#1590 ) (#1693 ) * fix: add session lifecycle guards to prevent runaway API spend (#1590) Three root causes allowed 30+ subprocess accumulation over 36 hours: 1. SIGTERM-killed processes (code 143) triggered crash recovery and immediately respawned — now detected and treated as intentional termination (aborts controller so wasAborted=true in .finally). 2. No wall-clock limit: sessions ran for 13+ hours continuously spending tokens — now refuses new generators after 4 hours and drains the pending queue to prevent further spawning. 3. Duplicate --resume processes for the same session UUID — now killed and unregistered before a new spawn is registered. Generated by Claude Code Vibe coded by ousamabenyounes Co-Authored-By: Claude <noreply@anthropic.com> * fix: use normalized errorMsg in logger.error payload and annotate SIGTERM override Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: use persisted createdAt for wall-clock guard and bind abortController locally to prevent stale abort Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: re-trigger CodeRabbit review after rate limit reset * fix: defer process unregistration until exit and align boundary test with strict > (#1693) - ProcessRegistry: don't unregister PID immediately after SIGTERM — let the existing 'exit' handler clean up when the process actually exits, preventing tracking loss for still-live processes. - Test: align wall-clock boundary test with production's strict `>` operator (exactly 4h is NOT terminated, only >4h is). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-15 00:58:23 -07:00
Ben Younes	983be42998	fix: resolve Gemini CLI 0.37.0 session capture failures (#1664 ) (#1692 ) Three root causes prevented Gemini sessions from persisting prompts, observations, and summaries: 1. BeforeAgent was mapped to user-message (display-only) instead of session-init (which initialises the session and starts the SDK agent). 2. The transcript parser expected Claude Code JSONL (type: "assistant") but Gemini CLI 0.37.0 writes a JSON document with a messages array where assistant entries carry type: "gemini". extractLastMessage now detects the format and routes to the correct parser, preserving full backward compatibility with Claude Code JSONL transcripts. 3. The summarize handler omitted platformSource from the /api/sessions/summarize request body, causing sessions to be recorded without the gemini-cli source tag. Co-authored-by: Claude <noreply@anthropic.com>	2026-04-15 00:58:20 -07:00
UCHIDA Masayuki	544e9d39f5	fix: replace hardcoded nvm/homebrew PATH with universal login shell resolution (#1833 ) * fix: replace hardcoded nvm/homebrew PATH with universal login shell resolution Hook commands previously hardcoded PATH entries for nvm and homebrew, causing `node: command not found` for users with other Node version managers (mise, asdf, volta, fnm, Nix, etc.). Replace with `$($SHELL -lc 'echo $PATH')` which inherits the user's login shell PATH regardless of how Node was installed. Also adds the missing PATH export to the PreToolUse hook (#1702). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add cache-path fallback to PreToolUse hook Aligns PreToolUse _R resolution with all other hooks by adding the cache directory lookup before falling back to the marketplace path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 00:58:17 -07:00
Ethan	16a0737dfc	fix: use parent project name for worktree observation writes (#1820 ) * fix: use parent project name for worktree observation writes (#1819) Observations and sessions from git worktrees were stored under basename(cwd) instead of the parent repo name because write paths called getProjectName() (not worktree-aware) instead of getProjectContext() (worktree-aware). This is the same bug as #1081, #1317, and #1500 — it regressed because the two functions coexist and new code reached for the simpler one. Fix: getProjectContext() now returns parentProjectName as primary when in a worktree, and all four write-path call sites now use getProjectContext().primary instead of getProjectName(). Includes regression test that creates a real worktree directory structure and asserts primary === parentProjectName. * fix: address review nitpicks — allProjects fallback, JSDoc, write-path test - ContextBuilder: default projects to context.allProjects for legacy worktree-labeled record compatibility - ProjectContext: clarify JSDoc that primary is canonical (parent repo in worktrees) - Tests: add write-path regression test mirroring session-init/SessionRoutes pattern; refactor worktree fixture into beforeAll/afterAll * refactor(project-name): rename local to cwdProjectName and dedupe allProjects Addresses final CodeRabbit nitpick: disambiguates the local variable from the returned `primary` field, and dedupes allProjects via Set in case parent and cwd resolve to the same name. --------- Co-authored-by: Ethan Hurst <ethan.hurst@outlook.com.au>	2026-04-15 00:58:14 -07:00