chore: bump version to 10.1.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merge pull request #1125 from thedotmack/feat/session-start-system-message
16 changed files with 387 additions and 368 deletions
@@ -10,7 +10,7 @@
  "plugins": [
    {
      "name": "claude-mem",
-      "version": "10.0.7",
+      "version": "10.1.0",
      "source": "./plugin",
      "description": "Persistent memory system for Claude Code - context compression across sessions"
    }
@@ -0,0 +1,21 @@
+name: Publish to npm
+
+on:
+  push:
+    tags:
+      - 'v*'
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          registry-url: 'https://registry.npmjs.org'
+      - run: npm install --ignore-scripts
+      - run: npm run build
+      - run: npm publish
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
@@ -2,6 +2,45 @@

 All notable changes to claude-mem.

+## [v10.0.8] - 2026-02-16
+
+## Bug Fixes
+
+### Orphaned Subprocess Cleanup
+- Add explicit subprocess cleanup after SDK query loop using existing `ProcessRegistry` infrastructure (`getProcessBySession` + `ensureProcessExit`), preventing orphaned Claude subprocesses from accumulating
+- Closes #1010, #1089, #1090, #1068
+
+### Chroma Binary Resolution
+- Replace `npx chroma run` with absolute binary path resolution via `require.resolve`, falling back to `npx` with explicit `cwd` when the binary isn't found directly
+- Closes #1120
+
+### Cross-Platform Embedding Fix
+- Remove `@chroma-core/default-embed` which pulled in `onnxruntime` + `sharp` native binaries that fail on many platforms
+- Use WASM backend for Chroma embeddings, eliminating native binary compilation issues
+- Closes #1104, #1105, #1110
+
+## [v10.0.7] - 2026-02-14
+
+## Chroma HTTP Server Architecture
+
+- **Persistent HTTP server**: Switched from in-process Chroma to a persistent HTTP server managed by the new `ChromaServerManager` for better reliability and performance
+- **Local embeddings**: Added `DefaultEmbeddingFunction` for local vector embeddings — no external API required
+- **Pinned chromadb v3.2.2**: Fixed compatibility with v2 API heartbeat endpoint
+- **Server lifecycle improvements**: Addressed PR review feedback for proper start/stop/health check handling
+
+## Bug Fixes
+
+- Fixed SDK spawn failures and sharp native binary crashes
+- Added `plugin.json` to root `.claude-plugin` directory for proper plugin structure
+- Removed duplicate else block from merge artifact
+
+## Infrastructure
+
+- Added multi-tenancy support for claude-mem Pro
+- Updated OpenClaw install URLs to `install.cmem.ai`
+- Added Vercel deploy workflow for install scripts
+- Added `.claude/plans` and `.claude/worktrees` to `.gitignore`
+
 ## [v10.0.6] - 2026-02-13

 ## Bug Fixes
@@ -1463,107 +1502,3 @@ Huge thanks to **Alexander Knigge** ([@AlexanderKnigge](https://x.com/AlexanderK

 **Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v8.1.0...v8.2.0

-## [v8.1.0] - 2025-12-25
-
-## The 3-Month Battle Against Complexity
-
-**TL;DR:** For three months, Claude's instinct to add code instead of delete it caused the same bugs to recur. What should have been 5 lines of code became ~1000 lines, 11 useless methods, and 7+ failed "fixes." The timestamp corruption that finally broke things was just a symptom. The real achievement: **984 lines of code deleted.**
-
---
-
-## What Actually Happened
-
-Every Claude Code hook receives a session ID. That's all you need.
-
-But Claude built an entire redundant session management system on top:
- An `sdk_sessions` table with status tracking, port assignment, and prompt counting
- 11 methods in `SessionStore` to manage this artificial complexity
- Auto-creation logic scattered across 3 locations
- A cleanup hook that "completed" sessions at the end
-
-**Why?** Because it seemed "robust." Because "what if the session doesn't exist?" 
-
-But the edge cases didn't exist. Hooks ALWAYS provide session IDs. The "defensive" code was solving imaginary problems while creating real ones.
-
---
-
-## The Pattern of Failure
-
-Every time a bug appeared, Claude's instinct was to **ADD** more code:
-
-| Bug | What Claude Added | What Should Have Happened |
-|-----|------------------|--------------------------|
-| Race conditions | Auto-create fallbacks | Delete the auto-create logic |
-| Duplicate observations | Validation layers | Delete the code path allowing duplicates |
-| UNIQUE constraint violations | Try-catch with fallbacks | Use `INSERT OR IGNORE` (5 characters) |
-| Session not found | Silent auto-creation | **FAIL LOUDLY** (it's a hook bug) |
-
---
-
-## The 7+ Failed Attempts
-
- **Nov 4**: "Always store session data regardless of pre-existence." Complexity planted.
- **Nov 11**: `INSERT OR IGNORE` recognized. But complexity documented, not removed.
- **Nov 21**: Duplicate observations bug. Fixed. Then broken again by endless mode.
- **Dec 5**: "6 hours of work delivered zero value." User requests self-audit.
- **Dec 20**: "Phase 2: Eliminated Race Conditions" — felt like progress. Complexity remained.
- **Dec 24**: Finally, forced deletion.
-
-The user stated "hooks provide session IDs, no extra management needed" **seven times** across months. Claude didn't listen.
-
---
-
-## The Fix
-
-### Deleted (984 lines):
- 11 `SessionStore` methods: `incrementPromptCounter`, `getPromptCounter`, `setWorkerPort`, `getWorkerPort`, `markSessionCompleted`, `markSessionFailed`, `reactivateSession`, `findActiveSDKSession`, `findAnySDKSession`, `updateSDKSessionId`
- Auto-create logic from `storeObservation` and `storeSummary`
- The entire cleanup hook (was aborting SDK agent and causing data loss)
- 117 lines from `worker-utils.ts`
-
-### What remains (~10 lines):
-```javascript
-createSDKSession(sessionId) {
-  db.run('INSERT OR IGNORE INTO sdk_sessions (...) VALUES (...)');
-  return db.query('SELECT id FROM sdk_sessions WHERE ...').get(sessionId);
-}
-```
-
-**That's it.**
-
---
-
-## Behavior Change
-
- **Before:** Missing session? Auto-create silently. Bug hidden.
- **After:** Missing session? Storage fails. Bug visible immediately.
-
---
-
-## New Tools
-
-Since we're now explicit about recovery instead of silently papering over problems:
-
- `GET /api/pending-queue` - See what's stuck
- `POST /api/pending-queue/process` - Manually trigger recovery  
- `npm run queue:check` / `npm run queue:process` - CLI equivalents
-
---
-
-## Dependencies
- Upgraded `@anthropic-ai/claude-agent-sdk` from `^0.1.67` to `^0.1.76`
-
---
-
-**PR #437:** https://github.com/thedotmack/claude-mem/pull/437
-
-*The evidence: Observations #3646, #6738, #7598, #12860, #12866, #13046, #15259, #20995, #21055, #30524, #31080, #32114, #32116, #32125, #32126, #32127, #32146, #32324—the complete record of a 3-month battle.*
-
-## [v8.0.6] - 2025-12-24
-
-## Bug Fixes
-
- Add error handlers to Chroma sync operations to prevent worker crashes on timeout (#428)
-
-This patch release improves stability by adding proper error handling to Chroma vector database sync operations, preventing worker crashes when sync operations timeout.
-
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "10.0.7",
+  "version": "10.1.0",
  "description": "Memory compression system for Claude Code - persist context across sessions",
  "keywords": [
    "claude",
@@ -97,8 +97,8 @@
  },
  "dependencies": {
    "@anthropic-ai/claude-agent-sdk": "^0.1.76",
-    "@chroma-core/default-embed": "^0.1.9",
    "@modelcontextprotocol/sdk": "^1.25.1",
+    "@chroma-core/default-embed": "^0.1.9",
    "ansi-to-html": "^0.7.2",
    "chromadb": "^3.2.2",
    "dompurify": "^3.3.1",
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem",
-  "version": "10.0.7",
+  "version": "10.1.0",
  "description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
  "author": {
    "name": "Alex Newman"
@@ -1,6 +1,6 @@
 {
  "name": "claude-mem-plugin",
-  "version": "10.0.7",
+  "version": "10.1.0",
  "private": true,
  "description": "Runtime dependencies for claude-mem bundled hooks",
  "type": "module",
@@ -17,7 +17,11 @@ export const claudeCodeAdapter: PlatformAdapter = {
  },
  formatOutput(result) {
    if (result.hookSpecificOutput) {
-      return { hookSpecificOutput: result.hookSpecificOutput };
+      const output: Record<string, unknown> = { hookSpecificOutput: result.hookSpecificOutput };
+      if (result.systemMessage) {
+        output.systemMessage = result.systemMessage;
+      }
+      return output;
    }
    return { continue: result.continue ?? true, suppressOutput: result.suppressOutput ?? true };
  }
@@ -37,7 +37,12 @@ export const contextHandler: EventHandler = {
    // Note: Removed AbortSignal.timeout due to Windows Bun cleanup issue (libuv assertion)
    // Worker service has its own timeouts, so client-side timeout is redundant
    try {
-      const response = await fetch(url);
+      // Fetch both markdown (for Claude context) and colored (for user display) truly in parallel
+      const colorUrl = `${url}&colors=true`;
+      const [response, colorResponse] = await Promise.all([
+        fetch(url),
+        fetch(colorUrl).catch(() => null)
+      ]);

      if (!response.ok) {
        // Log but don't throw — context fetch failure should not block session start
@@ -48,14 +53,23 @@ export const contextHandler: EventHandler = {
        };
      }

-      const result = await response.text();
-      const additionalContext = result.trim();
+      const [contextResult, colorResult] = await Promise.all([
+        response.text(),
+        colorResponse?.ok ? colorResponse.text() : Promise.resolve('')
+      ]);
+
+      const additionalContext = contextResult.trim();
+      const coloredTimeline = colorResult.trim();
+      const systemMessage = coloredTimeline
+        ? `${coloredTimeline}\n\nView Observations Live @ http://localhost:${port}`
+        : undefined;

      return {
        hookSpecificOutput: {
          hookEventName: 'SessionStart',
          additionalContext
-        }
+        },
+        systemMessage
      };
    } catch (error) {
      // Worker unreachable — return empty context gracefully
@@ -16,6 +16,7 @@ export interface HookResult {
  continue?: boolean;
  suppressOutput?: boolean;
  hookSpecificOutput?: { hookEventName: string; additionalContext: string };
+  systemMessage?: string;
  exitCode?: number;
 }

@@ -11,7 +11,7 @@
 import { spawn, ChildProcess, execSync } from 'child_process';
 import path from 'path';
 import os from 'os';
-import fs from 'fs';
+import fs, { existsSync } from 'fs';
 import { logger } from '../../utils/logger.js';

 export interface ChromaServerConfig {
@@ -108,14 +108,35 @@ export class ChromaServerManager {

    // Cross-platform: use npx.cmd on Windows
    const isWindows = process.platform === 'win32';
-    const command = isWindows ? 'npx.cmd' : 'npx';

-    const args = [
-      'chroma', 'run',
-      '--path', this.config.dataDir,
-      '--host', this.config.host,
-      '--port', String(this.config.port)
-    ];
+    // Resolve chroma binary absolutely — npx fails when spawned from cache dirs (#1120)
+    let command: string;
+    let args: string[];
+    try {
+      // chromadb package installs a 'chroma' bin entry
+      const chromaBinDir = path.dirname(require.resolve('chromadb/package.json'));
+      // Check project-level .bin first (most common npm/bun installation layout)
+      const projectBin = path.join(chromaBinDir, '..', '.bin', isWindows ? 'chroma.cmd' : 'chroma');
+      // Fallback: nested node_modules .bin (rare — pnpm or workspace hoisting)
+      const nestedBin = path.join(chromaBinDir, 'node_modules', '.bin', isWindows ? 'chroma.cmd' : 'chroma');
+
+      if (existsSync(projectBin)) {
+        command = projectBin;
+      } else if (existsSync(nestedBin)) {
+        command = nestedBin;
+      } else {
+        // Last resort: npx with explicit cwd
+        command = isWindows ? 'npx.cmd' : 'npx';
+      }
+    } catch {
+      command = isWindows ? 'npx.cmd' : 'npx';
+    }
+
+    if (path.basename(command).startsWith('npx')) {
+      args = ['chroma', 'run', '--path', this.config.dataDir, '--host', this.config.host, '--port', String(this.config.port)];
+    } else {
+      args = ['run', '--path', this.config.dataDir, '--host', this.config.host, '--port', String(this.config.port)];
+    }

    logger.info('CHROMA_SERVER', 'Starting Chroma server', {
      command,
@@ -125,11 +146,20 @@ export class ChromaServerManager {

    const spawnEnv = this.getSpawnEnv();

+    // Resolve cwd for npx fallback — ensures node_modules is findable (#1120)
+    let spawnCwd: string | undefined;
+    try {
+      spawnCwd = path.dirname(require.resolve('chromadb/package.json'));
+    } catch {
+      // If chromadb isn't resolvable, omit cwd and let npx handle it
+    }
+
    this.serverProcess = spawn(command, args, {
      stdio: ['ignore', 'pipe', 'pipe'],
      detached: !isWindows,  // Don't detach on Windows (no process groups)
      windowsHide: true,     // Hide console window on Windows
-      env: spawnEnv
+      env: spawnEnv,
+      ...(spawnCwd && { cwd: spawnCwd })
    });

    // Log server output for debugging
@@ -189,17 +189,20 @@ export class ChromaSync {
    }

    try {
-      // getOrCreateCollection handles both cases
-      // Lazy-load DefaultEmbeddingFunction to avoid eagerly pulling in
-      // @huggingface/transformers → sharp native binaries at bundle startup
+      // Use WASM backend to avoid native ONNX binary issues (#1104, #1105, #1110).
+      // Same model (all-MiniLM-L6-v2), same embeddings, but runs in WASM —
+      // no native binary loading, no segfaults, no ENOENT errors.
      const { DefaultEmbeddingFunction } = await import('@chroma-core/default-embed');
-      const embeddingFunction = new DefaultEmbeddingFunction();
+      const embeddingFunction = new DefaultEmbeddingFunction({ wasm: true });
+
      this.collection = await this.chromaClient.getOrCreateCollection({
        name: this.collectionName,
        embeddingFunction
      });

-      logger.debug('CHROMA_SYNC', 'Collection ready', { collection: this.collectionName });
+      logger.debug('CHROMA_SYNC', 'Collection ready', {
+        collection: this.collectionName
+      });
    } catch (error) {
      logger.error('CHROMA_SYNC', 'Failed to get/create collection', { collection: this.collectionName }, error as Error);
      throw new Error(`Collection setup failed: ${error instanceof Error ? error.message : String(error)}`);
@@ -141,134 +141,143 @@ export class SDKAgent {
      }
    });

-    // Process SDK messages
-    for await (const message of queryResult) {
-      // Capture or update memory session ID from SDK message
-      // IMPORTANT: The SDK may return a DIFFERENT session_id on resume than what we sent!
-      // We must always sync the DB to match what the SDK actually uses.
-      //
-      // MULTI-TERMINAL COLLISION FIX (FK constraint bug):
-      // Use ensureMemorySessionIdRegistered() instead of updateMemorySessionId() because:
-      // 1. It's idempotent - safe to call multiple times
-      // 2. It verifies the update happened (SELECT before UPDATE)
-      // 3. Consistent with ResponseProcessor's usage pattern
-      // This ensures FK constraint compliance BEFORE any observations are stored.
-      if (message.session_id && message.session_id !== session.memorySessionId) {
-        const previousId = session.memorySessionId;
-        session.memorySessionId = message.session_id;
-        // Persist to database IMMEDIATELY for FK constraint compliance
-        // This must happen BEFORE any observations referencing this ID are stored
-        this.dbManager.getSessionStore().ensureMemorySessionIdRegistered(
-          session.sessionDbId,
-          message.session_id
-        );
-        // Verify the update by reading back from DB
-        const verification = this.dbManager.getSessionStore().getSessionById(session.sessionDbId);
-        const dbVerified = verification?.memory_session_id === message.session_id;
-        const logMessage = previousId
-          ? `MEMORY_ID_CHANGED | sessionDbId=${session.sessionDbId} | from=${previousId} | to=${message.session_id} | dbVerified=${dbVerified}`
-          : `MEMORY_ID_CAPTURED | sessionDbId=${session.sessionDbId} | memorySessionId=${message.session_id} | dbVerified=${dbVerified}`;
-        logger.info('SESSION', logMessage, {
-          sessionId: session.sessionDbId,
-          memorySessionId: message.session_id,
-          previousId
-        });
-        if (!dbVerified) {
-          logger.error('SESSION', `MEMORY_ID_MISMATCH | sessionDbId=${session.sessionDbId} | expected=${message.session_id} | got=${verification?.memory_session_id}`, {
-            sessionId: session.sessionDbId
+    // Process SDK messages — cleanup in finally ensures subprocess termination
+    // even if the loop throws (e.g., context overflow, invalid API key)
+    try {
+      for await (const message of queryResult) {
+        // Capture or update memory session ID from SDK message
+        // IMPORTANT: The SDK may return a DIFFERENT session_id on resume than what we sent!
+        // We must always sync the DB to match what the SDK actually uses.
+        //
+        // MULTI-TERMINAL COLLISION FIX (FK constraint bug):
+        // Use ensureMemorySessionIdRegistered() instead of updateMemorySessionId() because:
+        // 1. It's idempotent - safe to call multiple times
+        // 2. It verifies the update happened (SELECT before UPDATE)
+        // 3. Consistent with ResponseProcessor's usage pattern
+        // This ensures FK constraint compliance BEFORE any observations are stored.
+        if (message.session_id && message.session_id !== session.memorySessionId) {
+          const previousId = session.memorySessionId;
+          session.memorySessionId = message.session_id;
+          // Persist to database IMMEDIATELY for FK constraint compliance
+          // This must happen BEFORE any observations referencing this ID are stored
+          this.dbManager.getSessionStore().ensureMemorySessionIdRegistered(
+            session.sessionDbId,
+            message.session_id
+          );
+          // Verify the update by reading back from DB
+          const verification = this.dbManager.getSessionStore().getSessionById(session.sessionDbId);
+          const dbVerified = verification?.memory_session_id === message.session_id;
+          const logMessage = previousId
+            ? `MEMORY_ID_CHANGED | sessionDbId=${session.sessionDbId} | from=${previousId} | to=${message.session_id} | dbVerified=${dbVerified}`
+            : `MEMORY_ID_CAPTURED | sessionDbId=${session.sessionDbId} | memorySessionId=${message.session_id} | dbVerified=${dbVerified}`;
+          logger.info('SESSION', logMessage, {
+            sessionId: session.sessionDbId,
+            memorySessionId: message.session_id,
+            previousId
          });
-        }
-        // Debug-level alignment log for detailed tracing
-        logger.debug('SDK', `[ALIGNMENT] ${previousId ? 'Updated' : 'Captured'} | contentSessionId=${session.contentSessionId} → memorySessionId=${message.session_id} | Future prompts will resume with this ID`);
-      }
-
-      // Handle assistant messages
-      if (message.type === 'assistant') {
-        const content = message.message.content;
-        const textContent = Array.isArray(content)
-          ? content.filter((c: any) => c.type === 'text').map((c: any) => c.text).join('\n')
-          : typeof content === 'string' ? content : '';
-
-        // Check for context overflow - prevents infinite retry loops
-        if (textContent.includes('prompt is too long') ||
-            textContent.includes('context window')) {
-          logger.error('SDK', 'Context overflow detected - terminating session');
-          session.abortController.abort();
-          return;
+          if (!dbVerified) {
+            logger.error('SESSION', `MEMORY_ID_MISMATCH | sessionDbId=${session.sessionDbId} | expected=${message.session_id} | got=${verification?.memory_session_id}`, {
+              sessionId: session.sessionDbId
+            });
+          }
+          // Debug-level alignment log for detailed tracing
+          logger.debug('SDK', `[ALIGNMENT] ${previousId ? 'Updated' : 'Captured'} | contentSessionId=${session.contentSessionId} → memorySessionId=${message.session_id} | Future prompts will resume with this ID`);
        }

-        const responseSize = textContent.length;
+        // Handle assistant messages
+        if (message.type === 'assistant') {
+          const content = message.message.content;
+          const textContent = Array.isArray(content)
+            ? content.filter((c: any) => c.type === 'text').map((c: any) => c.text).join('\n')
+            : typeof content === 'string' ? content : '';

-        // Capture token state BEFORE updating (for delta calculation)
-        const tokensBeforeResponse = session.cumulativeInputTokens + session.cumulativeOutputTokens;
-
-        // Extract and track token usage
-        const usage = message.message.usage;
-        if (usage) {
-          session.cumulativeInputTokens += usage.input_tokens || 0;
-          session.cumulativeOutputTokens += usage.output_tokens || 0;
-
-          // Cache creation counts as discovery, cache read doesn't
-          if (usage.cache_creation_input_tokens) {
-            session.cumulativeInputTokens += usage.cache_creation_input_tokens;
+          // Check for context overflow - prevents infinite retry loops
+          if (textContent.includes('prompt is too long') ||
+              textContent.includes('context window')) {
+            logger.error('SDK', 'Context overflow detected - terminating session');
+            session.abortController.abort();
+            return;
          }

-          logger.debug('SDK', 'Token usage captured', {
-            sessionId: session.sessionDbId,
-            inputTokens: usage.input_tokens,
-            outputTokens: usage.output_tokens,
-            cacheCreation: usage.cache_creation_input_tokens || 0,
-            cacheRead: usage.cache_read_input_tokens || 0,
-            cumulativeInput: session.cumulativeInputTokens,
-            cumulativeOutput: session.cumulativeOutputTokens
-          });
+          const responseSize = textContent.length;
+
+          // Capture token state BEFORE updating (for delta calculation)
+          const tokensBeforeResponse = session.cumulativeInputTokens + session.cumulativeOutputTokens;
+
+          // Extract and track token usage
+          const usage = message.message.usage;
+          if (usage) {
+            session.cumulativeInputTokens += usage.input_tokens || 0;
+            session.cumulativeOutputTokens += usage.output_tokens || 0;
+
+            // Cache creation counts as discovery, cache read doesn't
+            if (usage.cache_creation_input_tokens) {
+              session.cumulativeInputTokens += usage.cache_creation_input_tokens;
+            }
+
+            logger.debug('SDK', 'Token usage captured', {
+              sessionId: session.sessionDbId,
+              inputTokens: usage.input_tokens,
+              outputTokens: usage.output_tokens,
+              cacheCreation: usage.cache_creation_input_tokens || 0,
+              cacheRead: usage.cache_read_input_tokens || 0,
+              cumulativeInput: session.cumulativeInputTokens,
+              cumulativeOutput: session.cumulativeOutputTokens
+            });
+          }
+
+          // Calculate discovery tokens (delta for this response only)
+          const discoveryTokens = (session.cumulativeInputTokens + session.cumulativeOutputTokens) - tokensBeforeResponse;
+
+          // Process response (empty or not) and mark messages as processed
+          // Capture earliest timestamp BEFORE processing (will be cleared after)
+          const originalTimestamp = session.earliestPendingTimestamp;
+
+          if (responseSize > 0) {
+            const truncatedResponse = responseSize > 100
+              ? textContent.substring(0, 100) + '...'
+              : textContent;
+            logger.dataOut('SDK', `Response received (${responseSize} chars)`, {
+              sessionId: session.sessionDbId,
+              promptNumber: session.lastPromptNumber
+            }, truncatedResponse);
+          }
+
+          // Detect fatal context overflow and terminate gracefully (issue #870)
+          if (typeof textContent === 'string' && textContent.includes('Prompt is too long')) {
+            throw new Error('Claude session context overflow: prompt is too long');
+          }
+
+          // Detect invalid API key — SDK returns this as response text, not an error.
+          // Throw so it surfaces in health endpoint and prevents silent failures.
+          if (typeof textContent === 'string' && textContent.includes('Invalid API key')) {
+            throw new Error('Invalid API key: check your API key configuration in ~/.claude-mem/settings.json or ~/.claude-mem/.env');
+          }
+
+          // Parse and process response using shared ResponseProcessor
+          await processAgentResponse(
+            textContent,
+            session,
+            this.dbManager,
+            this.sessionManager,
+            worker,
+            discoveryTokens,
+            originalTimestamp,
+            'SDK',
+            cwdTracker.lastCwd
+          );
        }

-        // Calculate discovery tokens (delta for this response only)
-        const discoveryTokens = (session.cumulativeInputTokens + session.cumulativeOutputTokens) - tokensBeforeResponse;
-
-        // Process response (empty or not) and mark messages as processed
-        // Capture earliest timestamp BEFORE processing (will be cleared after)
-        const originalTimestamp = session.earliestPendingTimestamp;
-
-        if (responseSize > 0) {
-          const truncatedResponse = responseSize > 100
-            ? textContent.substring(0, 100) + '...'
-            : textContent;
-          logger.dataOut('SDK', `Response received (${responseSize} chars)`, {
-            sessionId: session.sessionDbId,
-            promptNumber: session.lastPromptNumber
-          }, truncatedResponse);
+        // Log result messages
+        if (message.type === 'result' && message.subtype === 'success') {
+          // Usage telemetry is captured at SDK level
        }
-
-        // Detect fatal context overflow and terminate gracefully (issue #870)
-        if (typeof textContent === 'string' && textContent.includes('Prompt is too long')) {
-          throw new Error('Claude session context overflow: prompt is too long');
-        }
-
-        // Detect invalid API key — SDK returns this as response text, not an error.
-        // Throw so it surfaces in health endpoint and prevents silent failures.
-        if (typeof textContent === 'string' && textContent.includes('Invalid API key')) {
-          throw new Error('Invalid API key: check your API key configuration in ~/.claude-mem/settings.json or ~/.claude-mem/.env');
-        }
-
-        // Parse and process response using shared ResponseProcessor
-        await processAgentResponse(
-          textContent,
-          session,
-          this.dbManager,
-          this.sessionManager,
-          worker,
-          discoveryTokens,
-          originalTimestamp,
-          'SDK',
-          cwdTracker.lastCwd
-        );
      }
-
-      // Log result messages
-      if (message.type === 'result' && message.subtype === 'success') {
-        // Usage telemetry is captured at SDK level
+    } finally {
+      // Ensure subprocess is terminated after query completes (or on error)
+      const tracked = getProcessBySession(session.sessionDbId);
+      if (tracked && !tracked.process.killed && tracked.process.exitCode === null) {
+        await ensureProcessExit(tracked, 5000);
      }
    }

@@ -95,15 +95,15 @@ export class SettingsDefaultsManager {
    CLAUDE_CODE_PATH: '', // Empty means auto-detect via 'which claude'
    CLAUDE_MEM_MODE: 'code', // Default mode profile
    // Token Economics
-    CLAUDE_MEM_CONTEXT_SHOW_READ_TOKENS: 'true',
-    CLAUDE_MEM_CONTEXT_SHOW_WORK_TOKENS: 'true',
-    CLAUDE_MEM_CONTEXT_SHOW_SAVINGS_AMOUNT: 'true',
+    CLAUDE_MEM_CONTEXT_SHOW_READ_TOKENS: 'false',
+    CLAUDE_MEM_CONTEXT_SHOW_WORK_TOKENS: 'false',
+    CLAUDE_MEM_CONTEXT_SHOW_SAVINGS_AMOUNT: 'false',
    CLAUDE_MEM_CONTEXT_SHOW_SAVINGS_PERCENT: 'true',
    // Observation Filtering
    CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES: DEFAULT_OBSERVATION_TYPES_STRING,
    CLAUDE_MEM_CONTEXT_OBSERVATION_CONCEPTS: DEFAULT_OBSERVATION_CONCEPTS_STRING,
    // Display Configuration
-    CLAUDE_MEM_CONTEXT_FULL_COUNT: '5',
+    CLAUDE_MEM_CONTEXT_FULL_COUNT: '0',
    CLAUDE_MEM_CONTEXT_FULL_FIELD: 'narrative',
    CLAUDE_MEM_CONTEXT_SESSION_COUNT: '10',
    // Feature Toggles
Author	SHA1	Message	Date
Alex Newman	327dd44992	chore: bump version to 10.1.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 00:17:34 -05:00
Alex Newman	0e11d4812a	Merge pull request #1125 from thedotmack/feat/session-start-system-message feat: SessionStart systemMessage + cleaner defaults	2026-02-16 00:15:52 -05:00
Alex Newman	676a3d175e	fix: make context and colored timeline fetches truly parallel Address PR #1125 review feedback - both fetches now start simultaneously via Promise.all instead of sequential-then-parallel. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 00:11:25 -05:00
Alex Newman	34358ab33d	feat: add systemMessage support for SessionStart hook and tune defaults Add systemMessage field to HookResult so SessionStart can display a colored timeline directly to the user in the CLI. The handler now parallel-fetches both markdown (for Claude context) and ANSI-colored (for user display) timelines, appending a viewer URL link. Also update default settings to hide verbose token columns (read/work tokens, savings amount) and disable full observation expansion, keeping the cleaner index-only view by default. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 00:05:13 -05:00
Alex Newman	5ccaf40ad0	docs: update CHANGELOG.md for v10.0.8 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:33:51 -05:00
Alex Newman	51abe5d1ff	chore: bump version to 10.0.8 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:33:20 -05:00
Alex Newman	2dea824cc0	Merge pull request #1122 from thedotmack/claude/friendly-pascal fix: resolve orphaned subprocesses and Chroma HTTP regressions	2026-02-15 23:31:10 -05:00
Alex Newman	055888e181	fix: address PR review feedback for subprocess cleanup and binary resolution Wrap SDK query loop in try/finally so subprocess cleanup runs on error paths. Swap Chroma binary check order to try project-level .bin first (common case). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:24:00 -05:00
Alex Newman	67ba17cc8a	fix: use WASM backend for Chroma embeddings to fix cross-platform issues Chroma requires client-side embeddings — the server is storage only. The previous commit incorrectly removed @chroma-core/default-embed. Uses DefaultEmbeddingFunction({ wasm: true }) which forces the WASM backend instead of native ONNX binaries. Same model (all-MiniLM-L6-v2), same embeddings, but works on all platforms without segfaults or ENOENT errors (#1104, #1105, #1110). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 23:14:21 -05:00
Alex Newman	e1ef14dbcc	fix: resolve orphaned subprocesses and Chroma HTTP regressions - Add subprocess cleanup after SDK query loop completes, using existing ProcessRegistry infrastructure (getProcessBySession + ensureProcessExit) - Replace npx-based Chroma binary spawning with absolute path resolution via require.resolve, falling back to npx with explicit cwd (#1120) - Remove @chroma-core/default-embed client-side dependency; let Chroma HTTP server handle embeddings server-side (#1104, #1105, #1110) Closes #1010, #1089, #1090, #1068, #1120, #1104, #1105, #1110 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 22:04:52 -05:00
Alex Newman	685d54f2cb	ci: add npm publish workflow on tag push Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 17:06:24 -05:00
Alex Newman	490f36099f	docs: update CHANGELOG.md for v10.0.7 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 16:53:19 -05:00
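
Commits e1ef14dbcc and 055888e181 describe resolving the Chroma binary by absolute path, trying the project-level `.bin` first and falling back to `npx`. The sketch below is one way to express that decision as a pure function. It is an illustration, not the code in this commit: `LaunchPlan`, `planChromaLaunch`, and the injected `exists` callback are hypothetical names introduced here. It keeps an explicit flag for the `npx` fallback, so the argument list does not depend on what the resolved path happens to contain (for example, a cache directory whose name includes `npx`).

```typescript
// Hypothetical types and helper for illustration only; not part of the commit.
interface LaunchPlan {
  command: string;   // executable to spawn
  args: string[];    // arguments passed to that executable
  viaNpx: boolean;   // true only when no candidate binary was found
}

function planChromaLaunch(
  candidates: string[],            // absolute binary paths, highest priority first
  exists: (p: string) => boolean,  // injected so the sketch needs no real filesystem
  isWindows: boolean,
  serverArgs: string[]             // e.g. ['--path', dataDir, '--host', host, '--port', port]
): LaunchPlan {
  // Take the first candidate that exists (project-level .bin before nested .bin).
  const found = candidates.find((p) => exists(p));
  if (found !== undefined) {
    // A direct binary already is `chroma`, so only the subcommand is needed.
    return { command: found, args: ['run', ...serverArgs], viaNpx: false };
  }
  // Last resort: let npx locate the `chroma` bin entry.
  return {
    command: isWindows ? 'npx.cmd' : 'npx',
    args: ['chroma', 'run', ...serverArgs],
    viaNpx: true
  };
}
```

A caller could pass `existsSync` from `fs` as the `exists` argument and hand `plan.command` and `plan.args` to `spawn`, setting `cwd` only when `plan.viaNpx` is true.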