diff --git a/docs/reports/2026-01-04--issue-514-orphaned-sessions-analysis.md b/docs/reports/2026-01-04--issue-514-orphaned-sessions-analysis.md new file mode 100644 index 00000000..f61e0378 --- /dev/null +++ b/docs/reports/2026-01-04--issue-514-orphaned-sessions-analysis.md @@ -0,0 +1,292 @@ +# Issue #514: Orphaned Observer Session Files Analysis + +**Date:** January 4, 2026 +**Status:** PARTIALLY RESOLVED - Root cause understood, fix was made but reverted +**Original Issue:** 13,000+ orphaned .jsonl session files created over 2 days + +--- + +## Executive Summary + +Issue #514 reported that the plugin created 13,000+ orphaned session .jsonl files in `~/.claude/projects//`. Each file contained only an initialization message with no actual observations. The hypothesis was that `startSessionProcessor()` in startup-recovery created new observer sessions in a loop. + +**Current State:** The issue was **fixed in commit 9a7f662** with a deterministic `mem-${contentSessionId}` prefix approach, but this fix was **reverted in commit f9197b5** due to the SDK not accepting custom session IDs. The current code uses a NULL-based initialization pattern that can still create orphaned sessions under certain conditions. + +--- + +## Evidence: Current File Analysis + +Filesystem analysis of `~/.claude/projects/-Users-alexnewman-Scripts-claude-mem/`: + +| Line Count | Number of Files | +|------------|-----------------| +| 0 lines (empty) | 407 | +| 1 line | **12,562** | +| 2 lines | 3,199 | +| 3+ lines | 3,546 | +| **Total** | **~19,714** | + +The 12,562 single-line files are consistent with the issue description - sessions that initialized but never received observations. + +Sample single-line file content: +```json +{"type":"queue-operation","operation":"dequeue","timestamp":"2025-12-28T20:41:25.484Z","sessionId":"00081a3b-9485-48a4-89f0-fd4dfccd3ac9"} +``` + +--- + +## Root Cause Analysis + +### The Problem Chain + +1. 
**Worker startup calls `processPendingQueues()`** (line 281 in worker-service.ts) +2. For each session with pending messages, it calls `initializeSession()` then `startSessionProcessor()` +3. `startSessionProcessor()` invokes `sdkAgent.startSession()` which calls the Claude Agent SDK `query()` function +4. **If `memorySessionId` is NULL**, no `resume` parameter is passed to `query()` +5. **The SDK creates a NEW .jsonl file** for each query call without a resume parameter +6. **If the query aborts before receiving a response** (timeout, crash, abort signal), the `memorySessionId` is never captured +7. On next startup, the cycle repeats - creating yet another orphaned file + +### Why Sessions Abort Before Capturing memorySessionId + +Looking at `startSessionProcessor()` flow: + +```typescript +// worker-service.ts lines 301-321 +private startSessionProcessor(session, source) { + session.generatorPromise = this.sdkAgent.startSession(session, this) + .catch(error => { /* error handling */ }) + .finally(() => { + session.generatorPromise = null; + this.broadcastProcessingStatus(); + }); +} +``` + +And `processPendingQueues()`: + +```typescript +// worker-service.ts lines 347-371 +for (const sessionDbId of orphanedSessionIds) { + const session = this.sessionManager.initializeSession(sessionDbId); + this.startSessionProcessor(session, 'startup-recovery'); + await new Promise(resolve => setTimeout(resolve, 100)); // 100ms delay between sessions +} +``` + +The problem: Starting 50 sessions rapidly (100ms delay) with pending messages means: +- All 50 SDK queries start nearly simultaneously +- The SDK creates 50 new .jsonl files (since none have memorySessionId yet) +- If any query fails/aborts before the first response, its memorySessionId is never captured +- On next startup, those sessions get new files again + +--- + +## Code Flow: Where .jsonl Files Are Created + +The .jsonl files are created by the **Claude Agent SDK** (`@anthropic-ai/claude-agent-sdk`), not by 
claude-mem directly. + +When `query()` is called in SDKAgent.ts: + +```typescript +// SDKAgent.ts lines 89-99 +const queryResult = query({ + prompt: messageGenerator, + options: { + model: modelId, + // Resume with captured memorySessionId (null on first prompt, real ID on subsequent) + ...(hasRealMemorySessionId && { resume: session.memorySessionId }), + disallowedTools, + abortController: session.abortController, + pathToClaudeCodeExecutable: claudePath + } +}); +``` + +**Key insight:** If `hasRealMemorySessionId` is false (memorySessionId is null), no `resume` parameter is passed. The SDK then generates a new UUID and creates a new file at: +`~/.claude/projects//.jsonl` + +--- + +## Fix History + +### Commit 9a7f662: The Original Fix (Reverted) + +``` +fix(sdk): always pass deterministic session ID to prevent orphaned files + +Fixes #514 - Excessive observer sessions created during startup-recovery + +Root cause: When memorySessionId was null, no `resume` parameter was passed +to the SDK's query(). This caused the SDK to create a NEW session file on +every call. If queries aborted before capturing the SDK's session_id, the +placeholder remained, leading to cascading creation of 13,000+ orphaned files. + +Fix: +- Generate deterministic ID `mem-${contentSessionId}` upfront +- Always pass it to `resume` parameter +- Persist immediately to database before query starts +- If SDK returns different ID, capture and use that going forward +``` + +**This fix was correct in approach** - always passing a resume parameter prevents new file creation. + +### Commit f9197b5: The Revert + +``` +fix(sdk): restore session continuity via robust capture-and-resume strategy + +Replaces the deterministic 'mem-' ID approach with a capture-based strategy: +1. Passes 'resume' parameter ONLY when a verified memory session ID exists +2. Captures SDK-generated session ID when it differs from current ID +3. 
Ensures subsequent prompts resume the correctly captured session ID + +This resolves the issue where new sessions were created for every message +due to failure to capture/resume the initial session ID, without introducing +potentially invalid deterministic IDs. +``` + +**The revert explanation suggests the SDK rejected the `mem-` prefix IDs.** + +### Commit 005b0f8: Current NULL-based Pattern + +Changed `memory_session_id` initialization from `contentSessionId` (placeholder) to `NULL`: +- Simpler logic: `!!session.memorySessionId` instead of `memorySessionId !== contentSessionId` +- But still creates new files on first prompt of each session + +--- + +## Relationship with Issue #520 (Stuck Messages) + +**Issue #520 is related but distinct:** + +| Aspect | Issue #514 (Orphaned Files) | Issue #520 (Stuck Messages) | +|--------|-----------------------------|-----------------------------| +| Problem | Too many .jsonl files | Messages never processed | +| Root Cause | SDK creates new file per query without resume | Old claim-process-mark pattern left messages in 'processing' state | +| Status | Partially resolved | **Fully resolved** | +| Fix | Need deterministic resume IDs | Changed to claim-and-delete pattern | + +**Connection:** Both issues relate to startup-recovery. Issue #520's fix (claim-and-delete pattern) doesn't create the loop that #514 describes, but #514 can still occur when: +1. Sessions have pending messages +2. Recovery starts the generator +3. Generator aborts before capturing memorySessionId +4. Next startup repeats the cycle + +--- + +## v8.5.7 Status + +**v8.5.7 did NOT fully address Issue #514.** The major changes were: +- Modular architecture refactor +- NULL-based initialization pattern +- Comprehensive test coverage + +The deterministic `mem-` prefix fix (9a7f662) was reverted before v8.5.7. 
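The recurrence described in the Problem Chain can be made concrete with a toy model. The sketch below is not claude-mem code: `ToySession`, `simulateRestarts`, and the `abortsBeforeCapture` callback are hypothetical names, and the only behaviour modeled is what this report describes (a query without `resume` creates a new file, and a query that aborts early never captures its session ID).

```typescript
// Toy model of the startup-recovery loop; all names here are hypothetical.
interface ToySession {
  sessionDbId: number;
  memorySessionId: string | null; // null until a query captures the SDK's ID
}

// Runs `restarts` worker startups over the given sessions and returns
// how many session files the modeled SDK would have created in total.
function simulateRestarts(
  sessions: ToySession[],
  restarts: number,
  abortsBeforeCapture: (session: ToySession, restart: number) => boolean
): number {
  let filesCreated = 0;
  for (let restart = 0; restart < restarts; restart++) {
    for (const session of sessions) {
      if (session.memorySessionId !== null) {
        continue; // resume path: the existing file is reused
      }
      filesCreated++; // no resume parameter: a new .jsonl file appears
      if (!abortsBeforeCapture(session, restart)) {
        // The query got far enough to report an ID, so it is captured.
        session.memorySessionId = `sdk-${session.sessionDbId}-${restart}`;
      }
    }
  }
  return filesCreated;
}

const toySessions: ToySession[] = Array.from({ length: 50 }, (_, i) => ({
  sessionDbId: i + 1,
  memorySessionId: null,
}));

// Worst case: every query aborts before its session ID is captured.
const worstCase = simulateRestarts(toySessions, 10, () => true);
console.log(worstCase); // 500
```

In this model, 50 sessions that abort on each of 10 restarts produce 500 files, one per session per restart, while sessions that capture an ID on the first attempt produce exactly one file each.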
+ +--- + +## Recommended Fix + +### Option 1: Reintroduce Deterministic IDs with SDK Validation + +```typescript +// SDKAgent.ts - In startSession() +async startSession(session: ActiveSession, worker?: WorkerRef): Promise<void> { + // Generate deterministic ID based on database session ID (not UUID-based contentSessionId) + // Format: "mem-" is short and unlikely to conflict + const deterministicMemoryId = session.memorySessionId || `mem-${session.sessionDbId}`; + + // Always pass resume to prevent orphaned sessions + const queryResult = query({ + prompt: messageGenerator, + options: { + model: modelId, + resume: deterministicMemoryId, // ALWAYS pass, even if SDK might reject + disallowedTools, + abortController: session.abortController, + pathToClaudeCodeExecutable: claudePath + } + }); + + // Capture whatever ID the SDK actually uses + for await (const message of queryResult) { + if (message.session_id && message.session_id !== session.memorySessionId) { + session.memorySessionId = message.session_id; + this.dbManager.getSessionStore().updateMemorySessionId( + session.sessionDbId, + message.session_id + ); + } + // ... rest of processing + } +} +``` + +### Option 2: Limit Recovery Scope + +Prevent the recovery loop by limiting how many times a session can be recovered: + +```typescript +// In processPendingQueues() +for (const sessionDbId of orphanedSessionIds) { + // Skip sessions that have already used up their recovery attempts + const dbSession = this.dbManager.getSessionById(sessionDbId); + const recoveryAttempts = dbSession.recovery_attempts || 0; + + if (recoveryAttempts >= 3) { + logger.warn('SYSTEM', 'Session exceeded max recovery attempts, skipping', { + sessionDbId, + recoveryAttempts + }); + continue; + } + + // Increment recovery counter + this.dbManager.getSessionStore().incrementRecoveryAttempts(sessionDbId); + + // ... 
rest of recovery +} +``` + +### Option 3: Cleanup Old Files (Mitigation, Not Fix) + +Add a cleanup script that removes orphaned .jsonl files: + +```bash +# Find files with only 1 line older than 7 days +find ~/.claude/projects/ -name "*.jsonl" -mtime +7 \ + -exec sh -c '[ $(wc -l < "$1") -le 1 ] && rm "$1"' _ {} \; +``` + +--- + +## Files Involved + +| File | Role | +|------|------| +| `src/services/worker-service.ts` | `startSessionProcessor()`, `processPendingQueues()` | +| `src/services/worker/SDKAgent.ts` | `startSession()`, `query()` call with `resume` parameter | +| `src/services/worker/SessionManager.ts` | `initializeSession()`, session lifecycle | +| `src/services/sqlite/sessions/create.ts` | `createSDKSession()`, NULL-based initialization | +| `src/services/sqlite/PendingMessageStore.ts` | `getSessionsWithPendingMessages()` | + +--- + +## Conclusion + +Issue #514 was correctly diagnosed. The fix in commit 9a7f662 was the right approach but was reverted because the SDK may not accept arbitrary custom IDs. The current NULL-based pattern (005b0f8) is cleaner but doesn't prevent orphaned files when queries abort before capturing the SDK's session ID. + +**Recommendation:** Reintroduce the deterministic ID approach with proper handling of SDK rejections (Option 1). If the SDK rejects the ID and returns a different one, capture and persist that ID immediately. This ensures at most one .jsonl file per database session, even across crashes and restarts. 
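Before deleting anything as in Option 3, it may help to list the candidates first. The following is a minimal dry-run sketch in TypeScript, assuming Node's built-in `fs`, `os`, and `path` modules; `findOrphanCandidates` is a hypothetical helper, not part of claude-mem. Unlike the `find` command above, it scans a single directory without recursion and reads each file fully, which is acceptable for one-line files but wasteful for large ones.

```typescript
import { mkdtempSync, readdirSync, readFileSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Returns paths of .jsonl files directly inside `dir` that contain at most
// one non-empty line and were last modified more than `minAgeMs` ago.
// Nothing is deleted; the caller decides what to do with the list.
function findOrphanCandidates(dir: string, minAgeMs: number, now: number = Date.now()): string[] {
  const candidates: string[] = [];
  for (const name of readdirSync(dir)) {
    if (!name.endsWith(".jsonl")) continue;
    const filePath = join(dir, name);
    const stats = statSync(filePath);
    if (!stats.isFile() || now - stats.mtimeMs <= minAgeMs) continue;
    const lineCount = readFileSync(filePath, "utf8")
      .split("\n")
      .filter((line) => line.trim() !== "").length;
    if (lineCount <= 1) candidates.push(filePath);
  }
  return candidates.sort();
}

// Demo on a throwaway directory so no real session files are touched.
const demoDir = mkdtempSync(join(tmpdir(), "orphan-demo-"));
writeFileSync(join(demoDir, "init-only.jsonl"), '{"type":"queue-operation"}\n');
writeFileSync(join(demoDir, "active.jsonl"), '{"a":1}\n{"b":2}\n{"c":3}\n');
writeFileSync(join(demoDir, "notes.txt"), "not a session file\n");

// Pretend "now" is 8 days ahead so the 7-day age filter is satisfied.
const DAY_MS = 24 * 60 * 60 * 1000;
const found = findOrphanCandidates(demoDir, 7 * DAY_MS, Date.now() + 8 * DAY_MS);
console.log(found.map((p) => p.slice(demoDir.length + 1))); // ["init-only.jsonl"]
```

Reviewing the returned list before removing files keeps the mitigation reversible, which matters because a one-line file younger than the threshold may belong to a session that is still starting.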
+ +--- + +## Appendix: Git Commit References + +| Commit | Description | +|--------|-------------| +| 9a7f662 | Original fix: deterministic `mem-` prefix IDs (REVERTED) | +| f9197b5 | Revert: capture-based strategy without deterministic IDs | +| 005b0f8 | NULL-based initialization pattern (current) | +| d72a81e | Queue refactoring (related to #520) | +| eb1a78b | Claim-and-delete pattern (fixes #520) | diff --git a/docs/reports/2026-01-04--issue-517-windows-powershell-analysis.md b/docs/reports/2026-01-04--issue-517-windows-powershell-analysis.md new file mode 100644 index 00000000..56567691 --- /dev/null +++ b/docs/reports/2026-01-04--issue-517-windows-powershell-analysis.md @@ -0,0 +1,87 @@ +# Issue #517 Analysis: Windows PowerShell Escaping in cleanupOrphanedProcesses() + +**Date:** 2026-01-04 +**Version Analyzed:** 8.5.7 +**Status:** NOT FIXED - Issue still present + +## Summary + +The reported issue involves PowerShell's `$_` variable being interpreted by Bash before PowerShell receives it when running in Git Bash or WSL environments on Windows. This causes `cleanupOrphanedProcesses()` to fail during worker initialization. + +## Current State + +The `cleanupOrphanedProcesses()` function is located in: +- **File:** `/Users/alexnewman/Scripts/claude-mem/src/services/infrastructure/ProcessManager.ts` +- **Lines:** 164-251 + +### Problematic Code (Lines 170-172) + +```typescript +if (isWindows) { + // Windows: Use PowerShell Get-CimInstance to find chroma-mcp processes + const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | Select-Object -ExpandProperty ProcessId"`; + const { stdout } = await execAsync(cmd, { timeout: 60000 }); +``` + +The `$_.Name` and `$_.CommandLine` contain `$_` which is a special variable in both PowerShell and Bash. 
When this command string is executed via Node.js `child_process.exec()` in a Git Bash or WSL environment, Bash may interpret `$_` as its own special variable (the last argument of the previous command) before passing it to PowerShell. + +### Additional Occurrence (Lines 91-92) + +A similar issue exists in `getChildProcesses()`: + +```typescript +const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq ${parentPid} } | Select-Object -ExpandProperty ProcessId"`; +``` + +## Error Handling Analysis + +Both functions have try-catch blocks with non-blocking error handling: +- Line 208-212: `cleanupOrphanedProcesses()` catches errors and logs a warning, then returns +- Line 98-102: `getChildProcesses()` catches errors and logs a warning, returning empty array + +While this prevents worker initialization from crashing, it means orphaned process cleanup silently fails on affected Windows environments. + +## Recommended Fix + +Replace PowerShell commands with WMIC (Windows Management Instrumentation Command-line), which does not use `$_` syntax: + +### For cleanupOrphanedProcesses() (Line 171): + +**Current:** +```typescript +const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | Select-Object -ExpandProperty ProcessId"`; +``` + +**Recommended:** +```typescript +const cmd = `wmic process where "name like '%python%' and commandline like '%chroma-mcp%'" get processid /format:list`; +``` + +### For getChildProcesses() (Line 91): + +**Current:** +```typescript +const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq ${parentPid} } | Select-Object -ExpandProperty ProcessId"`; +``` + +**Recommended:** +```typescript +const cmd = `wmic process where "parentprocessid=${parentPid}" get processid /format:list`; +``` + +### Implementation Notes + +1. 
WMIC output format differs from PowerShell - parse the `ProcessId=12345` format +2. WMIC is deprecated and may not be installed by default on recent Windows releases, so its availability should be checked before relying on it +3. Alternative: keep PowerShell and escape `$_` so Bash does not expand it (for example `\$_`; the exact form depends on how many quoting layers the string passes through) +4. Consider using `powershell -NoProfile -NonInteractive` flags for faster execution + +## Impact Assessment + +- **Severity:** Medium - orphaned process cleanup fails silently +- **Scope:** Windows users running in Git Bash, WSL, or mixed shell environments +- **Workaround:** None currently - users must manually kill orphaned chroma-mcp processes + +## Files to Modify + +1. `/src/services/infrastructure/ProcessManager.ts` (lines 91-92, 171-172) diff --git a/docs/reports/2026-01-04--issue-520-stuck-messages-analysis.md b/docs/reports/2026-01-04--issue-520-stuck-messages-analysis.md new file mode 100644 index 00000000..216a2ed4 --- /dev/null +++ b/docs/reports/2026-01-04--issue-520-stuck-messages-analysis.md @@ -0,0 +1,210 @@ +# Issue #520: Stuck Messages Analysis + +**Date:** January 4, 2026 +**Status:** RESOLVED - Issue no longer exists in current codebase +**Original Issue:** Messages in 'processing' status never recovered after worker crash + +--- + +## Executive Summary + +The issue described in GitHub #520 has been **fully resolved** in the current codebase through a fundamental architectural change. The system now uses a **claim-and-delete** pattern instead of the old **claim-process-mark** pattern, which eliminates the stuck 'processing' state problem entirely. + +--- + +## Original Issue Description + +The issue claimed that after a worker crash: + +1. `getSessionsWithPendingMessages()` returns sessions with `status IN ('pending', 'processing')` +2. But `claimNextMessage()` only looks for `status = 'pending'` +3. 
So 'processing' messages are orphaned + +**Proposed Fix:** Add `resetStuckMessages(0)` at start of `processPendingQueues()` + +--- + +## Current Code Analysis + +### 1. 
Queue Processing Pattern: Claim-and-Delete + +The current architecture uses `claimAndDelete()` instead of `claimNextMessage()`: + +**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/sqlite/PendingMessageStore.ts` + +```typescript +// Lines 85-104 +claimAndDelete(sessionDbId: number): PersistentPendingMessage | null { + const claimTx = this.db.transaction((sessionId: number) => { + const peekStmt = this.db.prepare(` + SELECT * FROM pending_messages + WHERE session_db_id = ? AND status = 'pending' + ORDER BY id ASC + LIMIT 1 + `); + const msg = peekStmt.get(sessionId) as PersistentPendingMessage | null; + + if (msg) { + // Delete immediately - no "processing" state needed + const deleteStmt = this.db.prepare('DELETE FROM pending_messages WHERE id = ?'); + deleteStmt.run(msg.id); + } + return msg; + }); + + return claimTx(sessionDbId) as PersistentPendingMessage | null; +} +``` + +**Key insight:** Messages are atomically selected and deleted in a single transaction. There is no 'processing' state for messages being actively worked on - they simply don't exist in the database anymore. + +### 2. Iterator Uses claimAndDelete + +**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/queue/SessionQueueProcessor.ts` + +```typescript +// Lines 18-38 +async *createIterator(sessionDbId: number, signal: AbortSignal): AsyncIterableIterator { + while (!signal.aborted) { + try { + // Atomically claim AND DELETE next message from DB + // Message is now in memory only - no "processing" state tracking needed + const persistentMessage = this.store.claimAndDelete(sessionDbId); + + if (persistentMessage) { + // Yield the message for processing (it's already deleted from queue) + yield this.toPendingMessageWithId(persistentMessage); + } else { + // Queue empty - wait for wake-up event + await this.waitForMessage(signal); + } + } catch (error) { + // ... error handling + } + } +} +``` + +### 3. 
getSessionsWithPendingMessages Still Checks Both States + +**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/sqlite/PendingMessageStore.ts` + +```typescript +// Lines 319-326 +getSessionsWithPendingMessages(): number[] { + const stmt = this.db.prepare(` + SELECT DISTINCT session_db_id FROM pending_messages + WHERE status IN ('pending', 'processing') + `); + const results = stmt.all() as { session_db_id: number }[]; + return results.map(r => r.session_db_id); +} +``` + +**This is technically vestigial code** - with the claim-and-delete pattern, messages should never be in 'processing' state. However, it provides backward compatibility and defense-in-depth. + +### 4. Startup Recovery Still Exists + +**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/worker-service.ts` + +```typescript +// Lines 236-242 +// Recover stuck messages from previous crashes +const { PendingMessageStore } = await import('./sqlite/PendingMessageStore.js'); +const pendingStore = new PendingMessageStore(this.dbManager.getSessionStore().db, 3); +const STUCK_THRESHOLD_MS = 5 * 60 * 1000; +const resetCount = pendingStore.resetStuckMessages(STUCK_THRESHOLD_MS); +if (resetCount > 0) { + logger.info('SYSTEM', `Recovered ${resetCount} stuck messages from previous session`, { thresholdMinutes: 5 }); +} +``` + +This runs BEFORE `processPendingQueues()` is called (line 281), which addresses the original fix request. + +--- + +## Verification of Issue Status + +### Does the Issue Exist? + +**NO** - The issue as described no longer exists because: + +1. **No 'processing' state during normal operation**: With claim-and-delete, messages go directly from 'pending' to 'deleted'. They never enter a 'processing' state. + +2. **Startup recovery handles legacy stuck messages**: Even if 'processing' messages exist (from old code or edge cases), `resetStuckMessages()` is called BEFORE `processPendingQueues()` in `initializeBackground()` (lines 236-241 run before line 281). + +3. 
**Architecture fundamentally changed**: The old `claimNextMessage()` function that only looked for `status = 'pending'` no longer exists. It was replaced with `claimAndDelete()`. + +### GeminiAgent and OpenRouterAgent Behavior + +Both agents use the same `SessionManager.getMessageIterator()` which calls `SessionQueueProcessor.createIterator()` which uses `claimAndDelete()`. All three agents (SDKAgent, GeminiAgent, OpenRouterAgent) use identical queue processing: + +```typescript +// GeminiAgent.ts:174, OpenRouterAgent.ts:134 +for await (const message of this.sessionManager.getMessageIterator(session.sessionDbId)) { + // ... +} +``` + +They do NOT handle recovery differently - they all rely on the shared infrastructure. + +### What v8.5.7 Changed + +Looking at the git history: + +``` +v8.5.7 (ac03901): +- Minor ESM/CommonJS compatibility fix for isMainModule detection +- No queue-related changes + +v8.5.6 -> v8.5.7: +- f21ea97 refactor: decompose monolith into modular architecture with comprehensive test suite (#538) +``` + +The major refactor happened before v8.5.7. The claim-and-delete pattern was already in place. + +--- + +## Timeline of Resolution + +Based on git history, the issue was likely resolved through these commits: + +1. **b8ce27b** - `feat(queue): Simplify queue processing and enhance reliability` +2. **eb1a78b** - `fix: eliminate duplicate observations by simplifying message queue` +3. **d72a81e** - `Refactor session queue processing and database interactions` + +These commits appear to have introduced the claim-and-delete pattern that eliminates the original bug. + +--- + +## Conclusion + +**Issue #520 should be closed as resolved.** + +The described bug (`claimNextMessage()` only checking `status = 'pending'`) no longer exists because: + +1. `claimNextMessage()` was replaced with `claimAndDelete()` which atomically removes messages +2. `resetStuckMessages()` is already called at startup BEFORE `processPendingQueues()` +3. 
The 'processing' status is now only used for legacy compatibility and edge cases + +### No Fix Needed + +The proposed fix ("Add `resetStuckMessages(0)` at start of `processPendingQueues()`") is: + +1. **Unnecessary** - The recovery happens in `initializeBackground()` before `processPendingQueues()` is called +2. **Using wrong threshold** - `resetStuckMessages(0)` would reset ALL processing messages immediately, which could cause issues if called during normal operation (not just startup) + +The current implementation with a 5-minute threshold is more robust - it only recovers truly stuck messages, not messages that are actively being processed. + +--- + +## Appendix: File References + +| Component | File | Key Lines | +|-----------|------|-----------| +| claimAndDelete | `src/services/sqlite/PendingMessageStore.ts` | 85-104 | +| Queue Iterator | `src/services/queue/SessionQueueProcessor.ts` | 18-38 | +| Startup Recovery | `src/services/worker-service.ts` | 236-242 | +| processPendingQueues | `src/services/worker-service.ts` | 326-375 | +| getSessionsWithPendingMessages | `src/services/sqlite/PendingMessageStore.ts` | 319-326 | +| resetStuckMessages | `src/services/sqlite/PendingMessageStore.ts` | 279-290 | diff --git a/docs/reports/2026-01-04--issue-527-uv-homebrew-analysis.md b/docs/reports/2026-01-04--issue-527-uv-homebrew-analysis.md new file mode 100644 index 00000000..27fc897f --- /dev/null +++ b/docs/reports/2026-01-04--issue-527-uv-homebrew-analysis.md @@ -0,0 +1,112 @@ +# Issue #527: uv Detection Fails on Apple Silicon Macs with Homebrew Installation + +**Date**: 2026-01-04 +**Issue**: GitHub Issue #527 +**Status**: Confirmed - Fix Required + +## Summary + +The `isUvInstalled()` function fails to detect uv when installed via Homebrew on Apple Silicon Macs because it does not check the `/opt/homebrew/bin/uv` path. + +## Analysis + +### Files Affected + +Two copies of `smart-install.js` exist in the codebase: + +1. 
**Source file**: `/Users/alexnewman/Scripts/claude-mem/scripts/smart-install.js` +2. **Built/deployed file**: `/Users/alexnewman/Scripts/claude-mem/plugin/scripts/smart-install.js` + +### Current uv Path Detection + +**Source file (`scripts/smart-install.js`)** - Lines 22-24: +```javascript +const UV_COMMON_PATHS = IS_WINDOWS + ? [join(homedir(), '.local', 'bin', 'uv.exe'), join(homedir(), '.cargo', 'bin', 'uv.exe')] + : [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv']; +``` + +**Plugin file (`plugin/scripts/smart-install.js`)** - Lines 103-105: +```javascript +const uvPaths = IS_WINDOWS + ? [join(homedir(), '.local', 'bin', 'uv.exe'), join(homedir(), '.cargo', 'bin', 'uv.exe')] + : [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv']; +``` + +### Paths Currently Checked (Unix/macOS) + +| Path | Installer | Architecture | +|------|-----------|--------------| +| `~/.local/bin/uv` | Official installer | Any | +| `~/.cargo/bin/uv` | Cargo/Rust install | Any | +| `/usr/local/bin/uv` | Homebrew (Intel) | x86_64 | + +### Missing Path + +| Path | Installer | Architecture | +|------|-----------|--------------| +| `/opt/homebrew/bin/uv` | Homebrew (Apple Silicon) | arm64 | + +## Root Cause + +Homebrew installs to different prefixes depending on architecture: +- **Intel Macs (x86_64)**: `/usr/local/bin/` +- **Apple Silicon Macs (arm64)**: `/opt/homebrew/bin/` + +The current implementation only includes the Intel Homebrew path, causing detection to fail on Apple Silicon when: +1. uv is installed via `brew install uv` +2. The user's shell PATH is not available during script execution (common in non-interactive contexts) + +## Impact + +Users on Apple Silicon Macs who installed uv via Homebrew will: +1. See "uv not found" errors +2. Have uv unnecessarily reinstalled via the official installer +3. 
End up with duplicate installations + +## Recommended Fix + +Add `/opt/homebrew/bin/uv` to the Unix paths array. + +### Source file (`scripts/smart-install.js`) - Line 24 + +**Before:** +```javascript +: [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv']; +``` + +**After:** +```javascript +: [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv', '/opt/homebrew/bin/uv']; +``` + +### Plugin file (`plugin/scripts/smart-install.js`) - Lines 103-105 and 222-224 + +The same fix should be applied in both locations where `uvPaths` is defined: +- Line 105 in `isUvInstalled()` +- Line 224 in `installUv()` + +### Note: Bun Has the Same Issue + +The Bun detection has the same gap: + +**Current (`scripts/smart-install.js` line 20):** +```javascript +: [join(homedir(), '.bun', 'bin', 'bun'), '/usr/local/bin/bun']; +``` + +**Should also add:** +```javascript +: [join(homedir(), '.bun', 'bin', 'bun'), '/usr/local/bin/bun', '/opt/homebrew/bin/bun']; +``` + +## Verification + +After the fix, verify by: +1. Installing uv via Homebrew on an Apple Silicon Mac +2. Running the smart-install script +3. Confirming uv is detected without attempting reinstallation + +## Conclusion + +**Fix is required.** The `/opt/homebrew/bin/uv` path is missing from both files. This is a simple one-line addition to the path arrays. The same fix should also be applied to Bun detection paths for consistency. diff --git a/docs/reports/2026-01-04--issue-532-memory-leak-analysis.md b/docs/reports/2026-01-04--issue-532-memory-leak-analysis.md new file mode 100644 index 00000000..23339c03 --- /dev/null +++ b/docs/reports/2026-01-04--issue-532-memory-leak-analysis.md @@ -0,0 +1,324 @@ +# Issue #532: Memory Leak in SessionManager - Analysis Report + +**Date**: 2026-01-04 +**Issue**: Memory leak causing 54GB+ VS Code memory consumption after several days of use +**Reported Root Causes**: +1. 
Sessions never auto-cleanup after SDK agent completes +2. `conversationHistory` array grows unbounded (never trimmed) + +--- + +## Executive Summary + +This analysis confirms **both issues exist in the current codebase** (v8.5.7). While v8.5.7 included a major modular refactor, it did **not address either memory leak issue**. The `SessionManager` holds sessions indefinitely in memory with no TTL/cleanup mechanism, and `conversationHistory` arrays grow unbounded within each session (with only OpenRouter implementing partial mitigation). + +--- + +## 1. SessionManager Session Storage Analysis + +### Location +`/Users/alexnewman/Scripts/claude-mem/src/services/worker/SessionManager.ts` + +### Current Implementation + +```typescript +export class SessionManager { + private sessions: Map = new Map(); + private sessionQueues: Map = new Map(); + // ... +} +``` + +Sessions are stored in an in-memory `Map` with the session database ID as the key. + +### Session Lifecycle + +| Event | Method | Behavior | +|-------|--------|----------| +| Session created | `initializeSession()` | Added to `this.sessions` Map (line 152) | +| Session deleted | `deleteSession()` | Removed from `this.sessions` Map (line 293) | +| Worker shutdown | `shutdownAll()` | Calls `deleteSession()` on all sessions | + +### The Problem: No Automatic Cleanup + +Looking at `/Users/alexnewman/Scripts/claude-mem/src/services/worker/http/routes/SessionRoutes.ts` (lines 213-216), the session completion handling has this comment: + +```typescript +// NOTE: We do NOT delete the session here anymore. +// The generator waits for events, so if it exited, it's either aborted or crashed. +// Idle sessions stay in memory (ActiveSession is small) to listen for future events. +``` + +**Critical Finding**: Sessions are **intentionally never deleted** after the SDK agent completes. They persist indefinitely "to listen for future events." + +### When Sessions ARE Deleted + +Sessions are only deleted when: +1. 
Explicit `DELETE /sessions/:sessionDbId` HTTP request (manual cleanup) +2. `POST /sessions/:sessionDbId/complete` HTTP request (cleanup-hook callback) +3. Worker service shutdown (`shutdownAll()`) + +There is **NO automatic cleanup mechanism** based on: +- Session age/TTL +- Session inactivity timeout +- Memory pressure +- Completed/failed status + +--- + +## 2. conversationHistory Analysis + +### Location +`/Users/alexnewman/Scripts/claude-mem/src/services/worker-types.ts` (line 34) + +### Type Definition + +```typescript +export interface ActiveSession { + // ... + conversationHistory: ConversationMessage[]; // Shared conversation history for provider switching + // ... +} +``` + +### Usage Pattern + +The `conversationHistory` array is populated by three agent implementations and by the shared `ResponseProcessor`: + +1. **SDKAgent** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/SDKAgent.ts`) + - Adds user messages at lines 247, 280, 302 + - Assistant responses added via `ResponseProcessor` + +2. **GeminiAgent** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/GeminiAgent.ts`) + - Adds user messages at lines 143, 196, 232 + - Adds assistant responses at lines 148, 202, 238 + +3. **OpenRouterAgent** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/OpenRouterAgent.ts`) + - Adds user messages at lines 103, 155, 191 + - Adds assistant responses at lines 108, 161, 197 + - **Implements truncation**: See `truncateHistory()` at lines 262-301 + +4. **ResponseProcessor** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/agents/ResponseProcessor.ts`) + - Adds assistant responses at line 57 + +### The Problem: Unbounded Growth + +**For Claude SDK and Gemini agents**, there is **no limit or trimming** of `conversationHistory`. Every message is `push()`ed without checking array size. 
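For contrast, a bounded append could look like the sketch below. It is an illustration under assumptions rather than claude-mem code: `pushBounded` and `maxMessages` are hypothetical names, and `ConversationMessage` is reduced to the `role` and `content` fields this report describes. Whether some entries (such as an init prompt) would need to be pinned is left open.

```typescript
// Minimal stand-in for the ConversationMessage shape described in this report.
interface ConversationMessage {
  role: "user" | "assistant";
  content: string;
}

// Appends `message`, then drops the oldest entries in place so the array
// never holds more than `maxMessages` items.
function pushBounded(
  history: ConversationMessage[],
  message: ConversationMessage,
  maxMessages: number
): void {
  history.push(message);
  const overflow = history.length - maxMessages;
  if (overflow > 0) {
    history.splice(0, overflow); // mutates the original array, not a copy
  }
}

const demoHistory: ConversationMessage[] = [];
for (let i = 1; i <= 5; i++) {
  pushBounded(demoHistory, { role: "user", content: `prompt ${i}` }, 3);
}
console.log(demoHistory.map((m) => m.content)); // ["prompt 3", "prompt 4", "prompt 5"]
```

Because `splice` edits the array that the session object already references, every holder of that reference sees the trimmed history, which is the property a copy-based truncation lacks.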
+
+**OpenRouter ONLY** has mitigation via `truncateHistory()`:
+
+```typescript
+private truncateHistory(history: ConversationMessage[]): ConversationMessage[] {
+  const MAX_CONTEXT_MESSAGES = parseInt(settings.CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES) || 20;
+  const MAX_ESTIMATED_TOKENS = parseInt(settings.CLAUDE_MEM_OPENROUTER_MAX_TOKENS) || 100000;
+
+  // Sliding window: keep most recent messages within limits
+  // ...
+}
+```
+
+However, this only truncates the copy sent to the OpenRouter API - **it does NOT truncate the actual `session.conversationHistory` array**. The original array still grows unbounded.
+
+### Memory Impact Calculation
+
+Each `ConversationMessage` contains:
+- `role`: 'user' | 'assistant' (small string)
+- `content`: string (can be very large - full prompts/responses)
+
+A typical session with 100 tool uses could have:
+- 1 init prompt (~2KB)
+- 100 observation prompts (~5KB each = 500KB)
+- 100 responses (~1KB each = 100KB)
+- 1 summary prompt + response (~5KB)
+
+**Per session**: ~600KB in `conversationHistory` alone
+
+At ~600KB per session, roughly 1,700 retained sessions amount to 1GB, so a worker that runs for several days with many sessions can accumulate gigabytes.
+
+---
+
+## 3.
v8.5.7 Refactor Assessment
+
+The v8.5.7 release (2026-01-04) focused on modular architecture refactoring:
+
+### What v8.5.7 DID:
+- Extracted SQLite repositories into `/src/services/sqlite/`
+- Extracted worker agents into `/src/services/worker/agents/`
+- Extracted search strategies into `/src/services/worker/search/`
+- Extracted context generation into `/src/services/context/`
+- Extracted infrastructure into `/src/services/infrastructure/`
+- Added 595 tests across 36 test files
+
+### What v8.5.7 DID NOT address:
+- No session TTL or automatic cleanup mechanism
+- No `conversationHistory` size limits for Claude SDK or Gemini
+- No memory pressure monitoring for sessions
+- The "sessions stay in memory" design comment was already present before the refactor and was left unchanged
+
+**Relevant v8.5.2 Note**: There was a related fix for an SDK Agent child process memory leak (orphaned Claude processes), but that addressed process cleanup, not in-memory session state.
+
+---
+
+## 4. Specific Code Locations Requiring Fixes
+
+### Fix Location 1: SessionManager needs cleanup mechanism
+**File**: `/Users/alexnewman/Scripts/claude-mem/src/services/worker/SessionManager.ts`
+
+Add automatic session cleanup based on:
+- Session completion (when generator finishes and no pending work)
+- Session age TTL (e.g., 1 hour after last activity)
+- Memory pressure (configurable max sessions)
+
+### Fix Location 2: conversationHistory needs bounds
+**Files**:
+- `/Users/alexnewman/Scripts/claude-mem/src/services/worker/SDKAgent.ts`
+- `/Users/alexnewman/Scripts/claude-mem/src/services/worker/GeminiAgent.ts`
+- `/Users/alexnewman/Scripts/claude-mem/src/services/worker/agents/ResponseProcessor.ts`
+
+Apply sliding window truncation similar to OpenRouterAgent's approach, but mutate the original array.
+
+### Fix Location 3: Session cleanup on completion
+**File**: `/Users/alexnewman/Scripts/claude-mem/src/services/worker/http/routes/SessionRoutes.ts`
+
+Remove the design decision to keep idle sessions in memory.
Add a cleanup timer after the generator completes.
+
+---
+
+## 5. Recommended Fixes
+
+### Fix 1: Add Session TTL and Cleanup Timer
+
+```typescript
+// In SessionManager.ts
+
+private readonly SESSION_TTL_MS = 60 * 60 * 1000; // 1 hour
+private cleanupTimers: Map<number, NodeJS.Timeout> = new Map();
+
+/**
+ * Schedule automatic cleanup for idle sessions
+ */
+scheduleSessionCleanup(sessionDbId: number): void {
+  // Clear existing timer if any
+  const existingTimer = this.cleanupTimers.get(sessionDbId);
+  if (existingTimer) {
+    clearTimeout(existingTimer);
+  }
+
+  // Schedule cleanup after TTL
+  const timer = setTimeout(() => {
+    // The timer has fired, so drop its handle to keep cleanupTimers itself bounded
+    this.cleanupTimers.delete(sessionDbId);
+
+    const session = this.sessions.get(sessionDbId);
+    if (session && !session.generatorPromise) {
+      // Only delete if no active generator
+      this.deleteSession(sessionDbId);
+      logger.info('SESSION', 'Session auto-cleaned due to TTL', { sessionDbId });
+    }
+  }, this.SESSION_TTL_MS);
+
+  this.cleanupTimers.set(sessionDbId, timer);
+}
+
+/**
+ * Cancel cleanup timer (call when session receives new work)
+ */
+cancelSessionCleanup(sessionDbId: number): void {
+  const timer = this.cleanupTimers.get(sessionDbId);
+  if (timer) {
+    clearTimeout(timer);
+    this.cleanupTimers.delete(sessionDbId);
+  }
+}
+```
+
+### Fix 2: Add conversationHistory Bounds
+
+```typescript
+// In src/services/worker/SessionManager.ts or new utility file
+
+const MAX_CONVERSATION_HISTORY_LENGTH = 50; // Configurable
+
+/**
+ * Trim conversation history to prevent unbounded growth
+ * Keeps the most recent messages
+ */
+export function trimConversationHistory(session: ActiveSession): void {
+  if (session.conversationHistory.length > MAX_CONVERSATION_HISTORY_LENGTH) {
+    const toRemove = session.conversationHistory.length - MAX_CONVERSATION_HISTORY_LENGTH;
+    session.conversationHistory.splice(0, toRemove);
+    logger.debug('SESSION', 'Trimmed conversation history', {
+      sessionDbId: session.sessionDbId,
+      removed: toRemove,
+      remaining: session.conversationHistory.length
+    });
+  }
+}
+```
+
+Then call this
after each message is added in SDKAgent, GeminiAgent, and ResponseProcessor.
+
+### Fix 3: Update SessionRoutes Generator Completion
+
+```typescript
+// In SessionRoutes.ts, update the finally block (around line 164)
+
+.finally(() => {
+  const sessionDbId = session.sessionDbId;
+  const wasAborted = session.abortController.signal.aborted;
+
+  if (wasAborted) {
+    logger.info('SESSION', `Generator aborted`, { sessionId: sessionDbId });
+  } else {
+    logger.info('SESSION', `Generator completed naturally`, { sessionId: sessionDbId });
+  }
+
+  session.generatorPromise = null;
+  session.currentProvider = null;
+  this.workerService.broadcastProcessingStatus();
+
+  // Check for pending work
+  const pendingStore = this.sessionManager.getPendingMessageStore();
+  const pendingCount = pendingStore.getPendingCount(sessionDbId);
+
+  if (pendingCount > 0 && !wasAborted) {
+    // Restart for pending work
+    // ... existing restart logic ...
+  } else {
+    // No pending work - schedule cleanup instead of keeping forever
+    this.sessionManager.scheduleSessionCleanup(sessionDbId);
+  }
+});
+```
+
+---
+
+## 6. Configuration Recommendations
+
+Add these to `settings.json` defaults:
+
+```json
+{
+  "CLAUDE_MEM_SESSION_TTL_MINUTES": 60,
+  "CLAUDE_MEM_MAX_CONVERSATION_HISTORY": 50,
+  "CLAUDE_MEM_MAX_ACTIVE_SESSIONS": 100
+}
+```
+
+---
+
+## 7. Testing Recommendations
+
+Add tests for:
+1. Session cleanup after TTL expires
+2. `conversationHistory` trimming at various sizes
+3. Memory monitoring under sustained load
+4. Cleanup timer cancellation on new work
+
+---
+
+## Summary
+
+| Issue | Status in v8.5.7 | Fix Required |
+|-------|------------------|--------------|
+| Sessions never auto-cleanup | NOT FIXED | Yes - add TTL/cleanup mechanism |
+| conversationHistory unbounded | NOT FIXED (except partial OpenRouter mitigation) | Yes - add trimming to all agents |
+
+Both memory leaks are confirmed to exist in the current codebase and require the fixes outlined above.
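As a sketch of testing recommendation 2, and of the distinction section 2 draws between truncating a copy and truncating the stored array, the following self-contained TypeScript may serve as a starting point. It is an illustration under assumptions, not code from the repository: `truncatedCopy`, `trimInPlace`, and `makeHistory` are hypothetical helpers, the `ConversationMessage` shape is reduced to the two fields described in the report, and the limit of 50 mirrors the suggested `CLAUDE_MEM_MAX_CONVERSATION_HISTORY` default.

```typescript
// Simplified message shape for illustration; the real type is defined in the codebase.
interface ConversationMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Assumed limit, mirroring the suggested CLAUDE_MEM_MAX_CONVERSATION_HISTORY default.
const MAX_HISTORY = 50;

// Copy-based truncation: returns the newest `max` messages as a new array.
// The array passed in is left untouched, so it keeps growing elsewhere.
// `max` is assumed to be a positive integer.
function truncatedCopy(history: ConversationMessage[], max: number): ConversationMessage[] {
  return history.slice(-max);
}

// In-place truncation: removes the oldest messages from the array itself.
function trimInPlace(history: ConversationMessage[], max: number): void {
  if (history.length > max) {
    history.splice(0, history.length - max);
  }
}

// Hypothetical helper that builds `count` alternating user/assistant messages.
function makeHistory(count: number): ConversationMessage[] {
  return Array.from({ length: count }, (_, i): ConversationMessage => ({
    role: i % 2 === 0 ? 'user' : 'assistant',
    content: `message ${i}`,
  }));
}

const stored = makeHistory(120);

// Truncating a copy bounds what is sent, not what is retained.
const sent = truncatedCopy(stored, MAX_HISTORY);
console.log(sent.length, stored.length); // 50 120

// Truncating in place bounds the retained array as well.
trimInPlace(stored, MAX_HISTORY);
console.log(stored.length, stored[0].content); // 50 message 70
```

The first call corresponds to the behavior the report attributes to OpenRouter's `truncateHistory()`, where the stored history is unaffected. The second corresponds to the `splice`-based approach proposed in Fix 2.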
diff --git a/plugin/package.json b/plugin/package.json index 1e1acaa4..8ded9012 100644 --- a/plugin/package.json +++ b/plugin/package.json @@ -1,6 +1,6 @@ { "name": "claude-mem-plugin", - "version": "8.5.6", + "version": "8.5.7", "private": true, "description": "Runtime dependencies for claude-mem bundled hooks", "type": "module", diff --git a/plugin/scripts/worker-service.cjs b/plugin/scripts/worker-service.cjs index 0fddc241..4f85c3d3 100755 --- a/plugin/scripts/worker-service.cjs +++ b/plugin/scripts/worker-service.cjs @@ -705,7 +705,7 @@ Set the \`cycles\` parameter to \`"ref"\` to resolve cyclical schemas with defs. `}var m7=ll.default.platform==="win32"?["APPDATA","HOMEDRIVE","HOMEPATH","LOCALAPPDATA","PATH","PROCESSOR_ARCHITECTURE","SYSTEMDRIVE","SYSTEMROOT","TEMP","USERNAME","USERPROFILE","PROGRAMFILES"]:["HOME","LOGNAME","PATH","SHELL","TERM","USER"];function h7(){let t={};for(let e of m7){let r=ll.default.env[e];r!==void 0&&(r.startsWith("()")||(t[e]=r))}return t}var vs=class{constructor(e){this._readBuffer=new Gf,this._stderrStream=null,this._serverParams=e,(e.stderr==="pipe"||e.stderr==="overlapped")&&(this._stderrStream=new dR.PassThrough)}async start(){if(this._process)throw new Error("StdioClientTransport already started! 
If using Client class, note that connect() calls start() automatically.");return new Promise((e,r)=>{this._process=(0,lR.default)(this._serverParams.command,this._serverParams.args??[],{env:{...h7(),...this._serverParams.env},stdio:["pipe","pipe",this._serverParams.stderr??"inherit"],shell:!1,windowsHide:ll.default.platform==="win32"&&g7(),cwd:this._serverParams.cwd}),this._process.on("error",n=>{r(n),this.onerror?.(n)}),this._process.on("spawn",()=>{e()}),this._process.on("close",n=>{this._process=void 0,this.onclose?.()}),this._process.stdin?.on("error",n=>{this.onerror?.(n)}),this._process.stdout?.on("data",n=>{this._readBuffer.append(n),this.processReadBuffer()}),this._process.stdout?.on("error",n=>{this.onerror?.(n)}),this._stderrStream&&this._process.stderr&&this._process.stderr.pipe(this._stderrStream)})}get stderr(){return this._stderrStream?this._stderrStream:this._process?.stderr??null}get pid(){return this._process?.pid??null}processReadBuffer(){for(;;)try{let e=this._readBuffer.readMessage();if(e===null)break;this.onmessage?.(e)}catch(e){this.onerror?.(e)}}async close(){if(this._process){let e=this._process;this._process=void 0;let r=new Promise(n=>{e.once("close",()=>{n()})});try{e.stdin?.end()}catch{}if(await Promise.race([r,new Promise(n=>setTimeout(n,2e3).unref())]),e.exitCode===null){try{e.kill("SIGTERM")}catch{}await Promise.race([r,new Promise(n=>setTimeout(n,2e3).unref())])}if(e.exitCode===null)try{e.kill("SIGKILL")}catch{}}this._readBuffer.clear()}send(e){return new Promise(r=>{if(!this._process?.stdin)throw new Error("Not connected");let n=uR(e);this._process.stdin.write(n)?r():this._process.stdin.once("drain",r)})}};function g7(){return"type"in ll.default}var Kf=yt(require("path"),1),_R=require("os");Ne();var C0={DEFAULT:3e5,HEALTH_CHECK:3e4,WORKER_STARTUP_WAIT:1e3,WORKER_STARTUP_RETRIES:300,PRE_RESTART_SETTLE_DELAY:2e3,WINDOWS_MULTIPLIER:1.5};function yR(t){return process.platform==="win32"?Math.round(t*C0.WINDOWS_MULTIPLIER):t}on();var 
Xwe=Kf.default.join((0,_R.homedir)(),".claude","plugins","marketplaces","thedotmack"),Ywe=yR(C0.HEALTH_CHECK),dl=null,pl=null;function Ln(){if(dl!==null)return dl;let t=Kf.default.join(Ke.get("CLAUDE_MEM_DATA_DIR"),"settings.json"),e=Ke.loadFromFile(t);return dl=parseInt(e.CLAUDE_MEM_WORKER_PORT,10),dl}function bR(){if(pl!==null)return pl;let t=Kf.default.join(Ke.get("CLAUDE_MEM_DATA_DIR"),"settings.json");return pl=Ke.loadFromFile(t).CLAUDE_MEM_WORKER_HOST,pl}function xR(){dl=null,pl=null}Ne();var N0=yt(require("path"),1),SR=require("os"),Fn=require("fs"),bs=require("child_process"),wR=require("util");Ne();var Jf=(0,wR.promisify)(bs.exec),$R=N0.default.join((0,SR.homedir)(),".claude-mem"),Ka=N0.default.join($R,"worker.pid");function j0(t){(0,Fn.mkdirSync)($R,{recursive:!0}),(0,Fn.writeFileSync)(Ka,JSON.stringify(t,null,2))}function ER(){if(!(0,Fn.existsSync)(Ka))return null;try{return JSON.parse((0,Fn.readFileSync)(Ka,"utf-8"))}catch(t){return k.warn("SYSTEM","Failed to parse PID file",{path:Ka},t),null}}function Ni(){if((0,Fn.existsSync)(Ka))try{(0,Fn.unlinkSync)(Ka)}catch(t){k.warn("SYSTEM","Failed to remove PID file",{path:Ka},t)}}function Ja(t){return process.platform==="win32"?Math.round(t*2):t}async function kR(t){if(process.platform!=="win32")return[];if(!Number.isInteger(t)||t<=0)return k.warn("SYSTEM","Invalid parent PID for child process enumeration",{parentPid:t}),[];try{let e=`powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq ${t} } | Select-Object -ExpandProperty ProcessId"`,{stdout:r}=await Jf(e,{timeout:6e4});return r.trim().split(` `).map(n=>parseInt(n.trim(),10)).filter(n=>!isNaN(n)&&Number.isInteger(n)&&n>0)}catch(e){return k.warn("SYSTEM","Failed to enumerate child processes",{parentPid:t},e),[]}}async function TR(t){if(!Number.isInteger(t)||t<=0){k.warn("SYSTEM","Invalid PID for force kill",{pid:t});return}try{process.platform==="win32"?await Jf(`taskkill /PID ${t} /T 
/F`,{timeout:6e4}):process.kill(t,"SIGKILL"),k.info("SYSTEM","Killed process",{pid:t})}catch(e){k.debug("SYSTEM","Process already exited during force kill",{pid:t},e)}}async function IR(t,e){let r=Date.now();for(;Date.now()-r{try{return process.kill(i,0),!0}catch{return!1}});if(n.length===0){k.info("SYSTEM","All child processes exited");return}k.debug("SYSTEM","Waiting for processes to exit",{stillAlive:n}),await new Promise(i=>setTimeout(i,100))}k.warn("SYSTEM","Timeout waiting for child processes to exit")}async function PR(){let t=process.platform==="win32",e=[];try{if(t){let r=`powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | Select-Object -ExpandProperty ProcessId"`,{stdout:n}=await Jf(r,{timeout:6e4});if(!n.trim()){k.debug("SYSTEM","No orphaned chroma-mcp processes found (Windows)");return}let i=n.trim().split(` `);for(let a of i){let o=parseInt(a.trim(),10);!isNaN(o)&&Number.isInteger(o)&&o>0&&e.push(o)}}else{let{stdout:r}=await Jf('ps aux | grep "chroma-mcp" | grep -v grep || true');if(!r.trim()){k.debug("SYSTEM","No orphaned chroma-mcp processes found (Unix)");return}let n=r.trim().split(` -`);for(let i of n){let a=i.trim().split(/\s+/);if(a.length>1){let o=parseInt(a[1],10);!isNaN(o)&&Number.isInteger(o)&&o>0&&e.push(o)}}}}catch(r){k.warn("SYSTEM","Failed to enumerate orphaned processes",{},r);return}if(e.length!==0){if(k.info("SYSTEM","Cleaning up orphaned chroma-mcp processes",{platform:t?"Windows":"Unix",count:e.length,pids:e}),t)for(let r of e){if(!Number.isInteger(r)||r<=0){k.warn("SYSTEM","Skipping invalid PID",{pid:r});continue}try{(0,bs.execSync)(`taskkill /PID ${r} /T /F`,{timeout:6e4,stdio:"ignore"})}catch(n){k.debug("SYSTEM","Failed to kill process, may have already exited",{pid:r},n)}}else for(let r of e)try{process.kill(r,"SIGKILL")}catch(n){k.debug("SYSTEM","Process already exited",{pid:r},n)}k.info("SYSTEM","Orphaned processes cleaned 
up",{count:e.length})}}function A0(t,e,r={}){let n=(0,bs.spawn)(process.execPath,[t,"--daemon"],{detached:!0,stdio:"ignore",windowsHide:!0,env:{...process.env,CLAUDE_MEM_WORKER_PORT:String(e),...r}});if(n.pid!==void 0)return n.unref(),n.pid}function OR(t,e){return async r=>{if(e.value){k.warn("SYSTEM",`Received ${r} but shutdown already in progress`);return}e.value=!0,k.info("SYSTEM",`Received ${r}, shutting down...`);try{await t(),process.exit(0)}catch(n){k.error("SYSTEM","Error during shutdown",{},n),process.exit(1)}}}var M0=yt(require("path"),1),RR=require("os"),CR=require("fs");Ne();async function Xf(t){try{return(await fetch(`http://127.0.0.1:${t}/api/health`)).ok}catch{return!1}}async function fl(t,e=3e4){let r=Date.now();for(;Date.now()-rsetTimeout(n,500))}return!1}async function Yf(t,e=1e4){let r=Date.now();for(;Date.now()-rsetTimeout(n,500))}return!1}async function Qf(t){try{let e=await fetch(`http://127.0.0.1:${t}/api/admin/shutdown`,{method:"POST"});return e.ok?!0:(k.warn("SYSTEM","Shutdown request returned error",{port:t,status:e.status}),!1)}catch(e){return e instanceof Error&&e.message?.includes("ECONNREFUSED")?(k.debug("SYSTEM","Worker already stopped",{port:t},e),!1):(k.warn("SYSTEM","Shutdown request failed unexpectedly",{port:t},e),!1)}}function v7(){let t=M0.default.join((0,RR.homedir)(),".claude","plugins","marketplaces","thedotmack"),e=M0.default.join(t,"package.json");return JSON.parse((0,CR.readFileSync)(e,"utf-8")).version}async function y7(t){try{let e=await fetch(`http://127.0.0.1:${t}/api/version`);return e.ok?(await e.json()).version:null}catch{return k.debug("SYSTEM","Could not fetch worker version",{port:t}),null}}async function NR(t){let e=v7(),r=await y7(t);return r?{matches:e===r,pluginVersion:e,workerVersion:r}:{matches:!0,pluginVersion:e,workerVersion:r}}Ne();async function jR(t){k.info("SYSTEM","Shutdown initiated"),Ni();let e=await kR(process.pid);if(k.info("SYSTEM","Found child 
processes",{count:e.length,pids:e}),t.server&&(await _7(t.server),k.info("SYSTEM","HTTP server closed")),await t.sessionManager.shutdownAll(),t.mcpClient&&(await t.mcpClient.close(),k.info("SYSTEM","MCP client closed")),t.dbManager&&await t.dbManager.close(),e.length>0){k.info("SYSTEM","Force killing remaining children");for(let r of e)await TR(r);await IR(e,5e3)}k.info("SYSTEM","Worker shutdown complete")}async function _7(t){t.closeAllConnections(),process.platform==="win32"&&await new Promise(e=>setTimeout(e,500)),await new Promise((e,r)=>{t.close(n=>n?r(n):e())}),process.platform==="win32"&&(await new Promise(e=>setTimeout(e,500)),k.info("SYSTEM","Waited for Windows port cleanup"))}var Qz=yt(sh(),1),Hw=yt(require("fs"),1),Bw=yt(require("path"),1);Ne();var Lw=yt(sh(),1),Wz=yt(zz(),1),Kz=yt(require("path"),1);cn();Ne();function Fw(t){let e=[];e.push(Lw.default.json({limit:"50mb"})),e.push((0,Wz.default)()),e.push((i,a,o)=>{let c=[".html",".js",".css",".svg",".png",".jpg",".jpeg",".webp",".woff",".woff2",".ttf",".eot"].some(h=>i.path.endsWith(h)),u=i.path==="/api/logs";if(i.path.startsWith("/health")||i.path==="/"||c||u)return o();let l=Date.now(),d=`${i.method}-${Date.now()}`,p=t(i.method,i.path,i.body);k.info("HTTP",`\u2192 ${i.method} ${i.path}`,{requestId:d},p);let f=a.send.bind(a);a.send=function(h){let _=Date.now()-l;return k.info("HTTP",`\u2190 ${a.statusCode} ${i.path}`,{requestId:d,duration:`${_}ms`}),f(h)},o()});let r=Gr(),n=Kz.default.join(r,"plugin","ui");return e.push(Lw.default.static(n)),e}function ch(t,e,r){let n=t.ip||t.connection.remoteAddress||"";if(!(n==="127.0.0.1"||n==="::1"||n==="::ffff:127.0.0.1"||n==="localhost")){k.warn("SECURITY","Admin endpoint access denied - not localhost",{endpoint:t.path,clientIp:n,method:t.method}),e.status(403).json({error:"Forbidden",message:"Admin endpoints are only accessible from localhost"});return}r()}function 
Zw(t,e,r){if(!r||Object.keys(r).length===0||e.includes("/init"))return"";if(e.includes("/observations")){let n=r.tool_name||"?",i=r.tool_input;return`tool=${k.formatTool(n,i)}`}return e.includes("/summarize")?"requesting summary":""}Ne();var Gs=class extends Error{constructor(r,n=500,i,a){super(r);this.statusCode=n;this.code=i;this.details=a;this.name="AppError"}};function Jz(t,e,r,n){let i={error:t,message:e};return r&&(i.code=r),n&&(i.details=n),i}var Xz=(t,e,r,n)=>{let i=t instanceof Gs?t.statusCode:500;k.error("HTTP",`Error handling ${e.method} ${e.path}`,{statusCode:i,error:t.message,code:t instanceof Gs?t.code:void 0},t);let a=Jz(t.name||"Error",t.message,t instanceof Gs?t.code:void 0,t instanceof Gs?t.details:void 0);r.status(i).json(a)};function Yz(t,e){e.status(404).json(Jz("NotFound",`Cannot ${t.method} ${t.path}`))}var Nre="8.5.6",uh=class{app;server=null;options;startTime=Date.now();constructor(e){this.options=e,this.app=(0,Qz.default)(),this.setupMiddleware(),this.setupCoreRoutes()}getHttpServer(){return this.server}async listen(e,r){return new Promise((n,i)=>{this.server=this.app.listen(e,r,()=>{k.info("SYSTEM","HTTP server started",{host:r,port:e,pid:process.pid}),n()}),this.server.on("error",i)})}async close(){this.server&&(this.server.closeAllConnections(),process.platform==="win32"&&await new Promise(e=>setTimeout(e,500)),await new Promise((e,r)=>{this.server.close(n=>n?r(n):e())}),process.platform==="win32"&&await new Promise(e=>setTimeout(e,500)),this.server=null,k.info("SYSTEM","HTTP server closed"))}registerRoutes(e){e.setupRoutes(this.app)}finalizeRoutes(){this.app.use(Yz),this.app.use(Xz)}setupMiddleware(){Fw(Zw).forEach(r=>this.app.use(r))}setupCoreRoutes(){let e="TEST-008-wrapper-ipc";this.app.get("/api/health",(r,n)=>{n.status(200).json({status:"ok",build:e,managed:process.env.CLAUDE_MEM_MANAGED==="true",hasIpc:typeof 
process.send=="function",platform:process.platform,pid:process.pid,initialized:this.options.getInitializationComplete(),mcpReady:this.options.getMcpReady()})}),this.app.get("/api/readiness",(r,n)=>{this.options.getInitializationComplete()?n.status(200).json({status:"ready",mcpReady:this.options.getMcpReady()}):n.status(503).json({status:"initializing",message:"Worker is still initializing, please retry"})}),this.app.get("/api/version",(r,n)=>{n.status(200).json({version:Nre})}),this.app.get("/api/instructions",async(r,n)=>{let i=r.query.topic||"all",a=r.query.operation;try{let o;if(a){let s=Bw.default.join(__dirname,"../skills/mem-search/operations",`${a}.md`);o=await Hw.promises.readFile(s,"utf-8")}else{let s=Bw.default.join(__dirname,"../skills/mem-search/SKILL.md"),c=await Hw.promises.readFile(s,"utf-8");o=this.extractInstructionSection(c,i)}n.json({content:[{type:"text",text:o}]})}catch{n.status(404).json({error:"Instruction not found"})}}),this.app.post("/api/admin/restart",ch,async(r,n)=>{n.json({status:"restarting"}),process.platform==="win32"&&process.env.CLAUDE_MEM_MANAGED==="true"&&process.send?(k.info("SYSTEM","Sending restart request to wrapper"),process.send({type:"restart"})):setTimeout(async()=>{await this.options.onRestart()},100)}),this.app.post("/api/admin/shutdown",ch,async(r,n)=>{n.json({status:"shutting_down"}),process.platform==="win32"&&process.env.CLAUDE_MEM_MANAGED==="true"&&process.send?(k.info("SYSTEM","Sending shutdown request to wrapper"),process.send({type:"shutdown"})):setTimeout(async()=>{await this.options.onShutdown()},100)})}extractInstructionSection(e,r){let n={workflow:this.extractBetween(e,"## The Workflow","## Search Parameters"),search_params:this.extractBetween(e,"## Search Parameters","## Examples"),examples:this.extractBetween(e,"## Examples","## Why This Workflow"),all:e};return n[r]||n.all}extractBetween(e,r,n){let i=e.indexOf(r),a=e.indexOf(n);return i===-1?e:a===-1?e.substring(i):e.substring(i,a).trim()}};var 
ct=yt(require("path"),1),Yl=require("os"),Mt=require("fs"),r4=require("child_process"),n4=require("util");Ne();var $n=require("fs"),Xl=require("path");Ne();function e4(t){try{return(0,$n.existsSync)(t)?JSON.parse((0,$n.readFileSync)(t,"utf-8")):{}}catch(e){return k.warn("CONFIG","Failed to read Cursor registry, using empty registry",{file:t,error:e instanceof Error?e.message:String(e)}),{}}}function t4(t,e){let r=(0,Xl.join)(t,"..");(0,$n.mkdirSync)(r,{recursive:!0}),(0,$n.writeFileSync)(t,JSON.stringify(e,null,2))}function Vw(t,e){let r=(0,Xl.join)(t,".cursor","rules"),n=(0,Xl.join)(r,"claude-mem-context.mdc"),i=`${n}.tmp`;(0,$n.mkdirSync)(r,{recursive:!0});let a=`--- +`);for(let i of n){let a=i.trim().split(/\s+/);if(a.length>1){let o=parseInt(a[1],10);!isNaN(o)&&Number.isInteger(o)&&o>0&&e.push(o)}}}}catch(r){k.warn("SYSTEM","Failed to enumerate orphaned processes",{},r);return}if(e.length!==0){if(k.info("SYSTEM","Cleaning up orphaned chroma-mcp processes",{platform:t?"Windows":"Unix",count:e.length,pids:e}),t)for(let r of e){if(!Number.isInteger(r)||r<=0){k.warn("SYSTEM","Skipping invalid PID",{pid:r});continue}try{(0,bs.execSync)(`taskkill /PID ${r} /T /F`,{timeout:6e4,stdio:"ignore"})}catch(n){k.debug("SYSTEM","Failed to kill process, may have already exited",{pid:r},n)}}else for(let r of e)try{process.kill(r,"SIGKILL")}catch(n){k.debug("SYSTEM","Process already exited",{pid:r},n)}k.info("SYSTEM","Orphaned processes cleaned up",{count:e.length})}}function A0(t,e,r={}){let n=(0,bs.spawn)(process.execPath,[t,"--daemon"],{detached:!0,stdio:"ignore",windowsHide:!0,env:{...process.env,CLAUDE_MEM_WORKER_PORT:String(e),...r}});if(n.pid!==void 0)return n.unref(),n.pid}function OR(t,e){return async r=>{if(e.value){k.warn("SYSTEM",`Received ${r} but shutdown already in progress`);return}e.value=!0,k.info("SYSTEM",`Received ${r}, shutting down...`);try{await t(),process.exit(0)}catch(n){k.error("SYSTEM","Error during shutdown",{},n),process.exit(1)}}}var 
M0=yt(require("path"),1),RR=require("os"),CR=require("fs");Ne();async function Xf(t){try{return(await fetch(`http://127.0.0.1:${t}/api/health`)).ok}catch{return!1}}async function fl(t,e=3e4){let r=Date.now();for(;Date.now()-rsetTimeout(n,500))}return!1}async function Yf(t,e=1e4){let r=Date.now();for(;Date.now()-rsetTimeout(n,500))}return!1}async function Qf(t){try{let e=await fetch(`http://127.0.0.1:${t}/api/admin/shutdown`,{method:"POST"});return e.ok?!0:(k.warn("SYSTEM","Shutdown request returned error",{port:t,status:e.status}),!1)}catch(e){return e instanceof Error&&e.message?.includes("ECONNREFUSED")?(k.debug("SYSTEM","Worker already stopped",{port:t},e),!1):(k.warn("SYSTEM","Shutdown request failed unexpectedly",{port:t},e),!1)}}function v7(){let t=M0.default.join((0,RR.homedir)(),".claude","plugins","marketplaces","thedotmack"),e=M0.default.join(t,"package.json");return JSON.parse((0,CR.readFileSync)(e,"utf-8")).version}async function y7(t){try{let e=await fetch(`http://127.0.0.1:${t}/api/version`);return e.ok?(await e.json()).version:null}catch{return k.debug("SYSTEM","Could not fetch worker version",{port:t}),null}}async function NR(t){let e=v7(),r=await y7(t);return r?{matches:e===r,pluginVersion:e,workerVersion:r}:{matches:!0,pluginVersion:e,workerVersion:r}}Ne();async function jR(t){k.info("SYSTEM","Shutdown initiated"),Ni();let e=await kR(process.pid);if(k.info("SYSTEM","Found child processes",{count:e.length,pids:e}),t.server&&(await _7(t.server),k.info("SYSTEM","HTTP server closed")),await t.sessionManager.shutdownAll(),t.mcpClient&&(await t.mcpClient.close(),k.info("SYSTEM","MCP client closed")),t.dbManager&&await t.dbManager.close(),e.length>0){k.info("SYSTEM","Force killing remaining children");for(let r of e)await TR(r);await IR(e,5e3)}k.info("SYSTEM","Worker shutdown complete")}async function _7(t){t.closeAllConnections(),process.platform==="win32"&&await new Promise(e=>setTimeout(e,500)),await new 
Promise((e,r)=>{t.close(n=>n?r(n):e())}),process.platform==="win32"&&(await new Promise(e=>setTimeout(e,500)),k.info("SYSTEM","Waited for Windows port cleanup"))}var Qz=yt(sh(),1),Hw=yt(require("fs"),1),Bw=yt(require("path"),1);Ne();var Lw=yt(sh(),1),Wz=yt(zz(),1),Kz=yt(require("path"),1);cn();Ne();function Fw(t){let e=[];e.push(Lw.default.json({limit:"50mb"})),e.push((0,Wz.default)()),e.push((i,a,o)=>{let c=[".html",".js",".css",".svg",".png",".jpg",".jpeg",".webp",".woff",".woff2",".ttf",".eot"].some(h=>i.path.endsWith(h)),u=i.path==="/api/logs";if(i.path.startsWith("/health")||i.path==="/"||c||u)return o();let l=Date.now(),d=`${i.method}-${Date.now()}`,p=t(i.method,i.path,i.body);k.info("HTTP",`\u2192 ${i.method} ${i.path}`,{requestId:d},p);let f=a.send.bind(a);a.send=function(h){let _=Date.now()-l;return k.info("HTTP",`\u2190 ${a.statusCode} ${i.path}`,{requestId:d,duration:`${_}ms`}),f(h)},o()});let r=Gr(),n=Kz.default.join(r,"plugin","ui");return e.push(Lw.default.static(n)),e}function ch(t,e,r){let n=t.ip||t.connection.remoteAddress||"";if(!(n==="127.0.0.1"||n==="::1"||n==="::ffff:127.0.0.1"||n==="localhost")){k.warn("SECURITY","Admin endpoint access denied - not localhost",{endpoint:t.path,clientIp:n,method:t.method}),e.status(403).json({error:"Forbidden",message:"Admin endpoints are only accessible from localhost"});return}r()}function Zw(t,e,r){if(!r||Object.keys(r).length===0||e.includes("/init"))return"";if(e.includes("/observations")){let n=r.tool_name||"?",i=r.tool_input;return`tool=${k.formatTool(n,i)}`}return e.includes("/summarize")?"requesting summary":""}Ne();var Gs=class extends Error{constructor(r,n=500,i,a){super(r);this.statusCode=n;this.code=i;this.details=a;this.name="AppError"}};function Jz(t,e,r,n){let i={error:t,message:e};return r&&(i.code=r),n&&(i.details=n),i}var Xz=(t,e,r,n)=>{let i=t instanceof Gs?t.statusCode:500;k.error("HTTP",`Error handling ${e.method} ${e.path}`,{statusCode:i,error:t.message,code:t instanceof Gs?t.code:void 
0},t);let a=Jz(t.name||"Error",t.message,t instanceof Gs?t.code:void 0,t instanceof Gs?t.details:void 0);r.status(i).json(a)};function Yz(t,e){e.status(404).json(Jz("NotFound",`Cannot ${t.method} ${t.path}`))}var Nre="8.5.7",uh=class{app;server=null;options;startTime=Date.now();constructor(e){this.options=e,this.app=(0,Qz.default)(),this.setupMiddleware(),this.setupCoreRoutes()}getHttpServer(){return this.server}async listen(e,r){return new Promise((n,i)=>{this.server=this.app.listen(e,r,()=>{k.info("SYSTEM","HTTP server started",{host:r,port:e,pid:process.pid}),n()}),this.server.on("error",i)})}async close(){this.server&&(this.server.closeAllConnections(),process.platform==="win32"&&await new Promise(e=>setTimeout(e,500)),await new Promise((e,r)=>{this.server.close(n=>n?r(n):e())}),process.platform==="win32"&&await new Promise(e=>setTimeout(e,500)),this.server=null,k.info("SYSTEM","HTTP server closed"))}registerRoutes(e){e.setupRoutes(this.app)}finalizeRoutes(){this.app.use(Yz),this.app.use(Xz)}setupMiddleware(){Fw(Zw).forEach(r=>this.app.use(r))}setupCoreRoutes(){let e="TEST-008-wrapper-ipc";this.app.get("/api/health",(r,n)=>{n.status(200).json({status:"ok",build:e,managed:process.env.CLAUDE_MEM_MANAGED==="true",hasIpc:typeof process.send=="function",platform:process.platform,pid:process.pid,initialized:this.options.getInitializationComplete(),mcpReady:this.options.getMcpReady()})}),this.app.get("/api/readiness",(r,n)=>{this.options.getInitializationComplete()?n.status(200).json({status:"ready",mcpReady:this.options.getMcpReady()}):n.status(503).json({status:"initializing",message:"Worker is still initializing, please retry"})}),this.app.get("/api/version",(r,n)=>{n.status(200).json({version:Nre})}),this.app.get("/api/instructions",async(r,n)=>{let i=r.query.topic||"all",a=r.query.operation;try{let o;if(a){let s=Bw.default.join(__dirname,"../skills/mem-search/operations",`${a}.md`);o=await Hw.promises.readFile(s,"utf-8")}else{let 
s=Bw.default.join(__dirname,"../skills/mem-search/SKILL.md"),c=await Hw.promises.readFile(s,"utf-8");o=this.extractInstructionSection(c,i)}n.json({content:[{type:"text",text:o}]})}catch{n.status(404).json({error:"Instruction not found"})}}),this.app.post("/api/admin/restart",ch,async(r,n)=>{n.json({status:"restarting"}),process.platform==="win32"&&process.env.CLAUDE_MEM_MANAGED==="true"&&process.send?(k.info("SYSTEM","Sending restart request to wrapper"),process.send({type:"restart"})):setTimeout(async()=>{await this.options.onRestart()},100)}),this.app.post("/api/admin/shutdown",ch,async(r,n)=>{n.json({status:"shutting_down"}),process.platform==="win32"&&process.env.CLAUDE_MEM_MANAGED==="true"&&process.send?(k.info("SYSTEM","Sending shutdown request to wrapper"),process.send({type:"shutdown"})):setTimeout(async()=>{await this.options.onShutdown()},100)})}extractInstructionSection(e,r){let n={workflow:this.extractBetween(e,"## The Workflow","## Search Parameters"),search_params:this.extractBetween(e,"## Search Parameters","## Examples"),examples:this.extractBetween(e,"## Examples","## Why This Workflow"),all:e};return n[r]||n.all}extractBetween(e,r,n){let i=e.indexOf(r),a=e.indexOf(n);return i===-1?e:a===-1?e.substring(i):e.substring(i,a).trim()}};var ct=yt(require("path"),1),Yl=require("os"),Mt=require("fs"),r4=require("child_process"),n4=require("util");Ne();var $n=require("fs"),Xl=require("path");Ne();function e4(t){try{return(0,$n.existsSync)(t)?JSON.parse((0,$n.readFileSync)(t,"utf-8")):{}}catch(e){return k.warn("CONFIG","Failed to read Cursor registry, using empty registry",{file:t,error:e instanceof Error?e.message:String(e)}),{}}}function t4(t,e){let r=(0,Xl.join)(t,"..");(0,$n.mkdirSync)(r,{recursive:!0}),(0,$n.writeFileSync)(t,JSON.stringify(e,null,2))}function Vw(t,e){let r=(0,Xl.join)(t,".cursor","rules"),n=(0,Xl.join)(r,"claude-mem-context.mdc"),i=`${n}.tmp`;(0,$n.mkdirSync)(r,{recursive:!0});let a=`--- alwaysApply: true description: "Claude-mem 
context from past sessions (auto-updated)" ---