claude-mem

Author	SHA1	Message	Date
Rod Boev	a3f9e7f638	fix: prevent chroma-mcp spawn storm with 5-layer defense (641 processes → max 2) During SIGHUP testing with 6+ active sessions, ChromaSync.ensureConnection() had no mutex — concurrent fire-and-forget syncObservation() calls each spawned a chroma-mcp subprocess via StdioClientTransport, creating 641 orphans in ~5min. Error-driven reconnection formed a positive feedback loop amplifying the storm. Defense layers: - Layer 0: Connection mutex via promise memoization (prevents concurrent spawns) - Layer 1: Pre-spawn process count guard using execFileSync('ps') (kills excess) - Layer 2: Hardened close() with try-finally + Unix pkill in GracefulShutdown - Layer 3: Count-based orphan reaper in ProcessManager (not age-based) - Layer 4: Circuit breaker stops retries after 3 consecutive failures for 60s Closes #1063, closes #695 Relates to #1010, #707	2026-02-11 07:19:28 -05:00
Rod Boev	4e67393d27	fix: prevent daemon silent death from SIGHUP + unhandled errors Root cause: registerSignalHandlers() handled SIGTERM/SIGINT but not SIGHUP. When the parent hook process exits, the kernel sends SIGHUP to the daemon, causing immediate termination (default signal action). Belt-and-suspenders fix: 1. SIGHUP handler: ignore in daemon mode, graceful shutdown otherwise 2. setsid: spawn daemon in new session on Linux (prevents SIGHUP delivery) 3. Global unhandledRejection/uncaughtException guards in daemon mode	2026-02-11 00:35:53 -05:00
Alex Newman	af95461a70	Merge branch 'main' into fix/hook-resilience-worker-lifecycle # Conflicts: # plugin/scripts/mcp-server.cjs # plugin/scripts/worker-service.cjs	2026-02-10 23:37:33 -05:00
Rod Boev	418e38ee46	fix: hook resilience and worker lifecycle improvements (#957, #923, #984, #987, #1042) Reduce timeouts to eliminate 10-30s startup delay when worker is dead (common on WSL2 after hibernate). Add stale PID detection, graceful error handling across all handlers, and error classification that distinguishes worker unavailability from handler bugs. - HEALTH_CHECK 30s→3s, new POST_SPAWN_WAIT (5s), PORT_IN_USE_WAIT (3s) - isProcessAlive() with EPERM handling, cleanStalePidFile() - getPluginVersion() try-catch for shutdown race (#1042) - isWorkerUnavailableError: transport+5xx+429→exit 0, 4xx→exit 2 - No-op handler for unknown event types (#984) - Wrap all handler fetch calls in try-catch for graceful degradation - CLAUDE_MEM_HEALTH_TIMEOUT_MS env var override with validation	2026-02-10 15:34:35 -05:00
xingyu	e4e1d3fb92	fix: Windows platform improvements — re-enable Chroma, fix DB race, simplify env isolation 1. ProcessManager: Migrate spawnDaemon() from WMIC to PowerShell Start-Process - WMIC deprecated in Windows 11, PowerShell inherits env vars properly - Use -WindowStyle Hidden to prevent console popups - Fix redundant backslash escaping in PowerShell $_ variables 2. ChromaSync: Re-enable vector search on Windows - Remove overly defensive platform check that disabled all semantic search - Worker daemon starts with -WindowStyle Hidden; child processes inherit - MCP SDK's StdioClientTransport uses shell:false, no new console created 3. worker-service: Unified DB-ready gate middleware - Replace single-endpoint /api/sessions/init wait with global middleware - Hold all DB-dependent requests until database is initialized (30s timeout) - Whitelist static assets, /health, and viewer page for immediate response - Separate dbReadyPromise (DB only) from initializationComplete (full init) - Fixes "Database not initialized" errors on /stream, /summarize, /init 4. EnvManager: Switch from allowlist to blocklist for subprocess env - Only strip ANTHROPIC_API_KEY to prevent Issue #733 billing hijack - Pass through all other vars (ANTHROPIC_AUTH_TOKEN, ANTHROPIC_BASE_URL, etc.) - Simpler, less fragile than maintaining an exhaustive system vars allowlist	2026-02-07 18:30:57 +08:00
Alex Newman	5969d670d0	chore: bump version to 9.1.1 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 02:18:44 -05:00
Alex Newman	8dfcb5e612	chore: bump version to 9.1.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 01:05:38 -05:00
Alex Newman	ff503d08a7	MAESTRO: Merge PR #657 - Add generate/clean CLI commands for CLAUDE.md management Cherry-picked source changes from PR #657 (224 commits behind main). Adds `claude-mem generate` and `claude-mem clean` CLI commands: - New src/cli/claude-md-commands.ts with generateClaudeMd() and cleanClaudeMd() - Worker service generate/clean case handlers with --dry-run support - CLAUDE_MD logger component type - Uses shared isDirectChild from path-utils.ts (DRY improvement over PR original) Skipped from PR: 91 CLAUDE.md file deletions (stale), build artifacts, .claude/plans/ dev artifact, smart-install.js shell alias auto-injection (aggressive profile modification without consent). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 05:52:54 -05:00
Alex Newman	98920bd860	MAESTRO: Merge PR #662 - Add save_memory MCP tool for manual memory storage Adds save_memory MCP tool allowing users to manually save observations for semantic search. Source changes cherry-picked from PR #662 by @darconada (build artifact conflicts resolved by direct application). Closes #645. Co-Authored-By: darconadalabarga <darconada@arsys.es> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 04:13:44 -05:00
Alex Newman	5dffb1ebb0	MAESTRO: fix(hooks): add session-complete handler to enable orphan reaper cleanup Cherry-picked from PR #844 by @thusdigital. Sessions stayed in active sessions map forever after summarize, causing the orphan reaper to think all processes were still active. Adds session-complete as Stop phase 2 hook that calls POST /api/sessions/complete to remove sessions from the active map, allowing the reaper to correctly identify and clean up orphaned worker processes. Fixes #842. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 03:23:13 -05:00
Alex Newman	da1d2cd36a	MAESTRO: fix(db): prevent FK constraint failures on worker restart Cherry-picked source changes from PR #889 by @Et9797. Fixes #846. Key changes: - Add ensureMemorySessionIdRegistered() guard in SessionStore.ts - Add ON UPDATE CASCADE migration (schema v21) for observations and session_summaries FK constraints - Change message queue from claim-and-delete to claim-confirm pattern (PendingMessageStore.ts) - Add spawn deduplication and unrecoverable error detection in SessionRoutes.ts and worker-service.ts - Add forceInit flag to SDKAgent for stale session recovery Build artifacts skipped (pre-existing dompurify dep issue). Path fixes (HealthMonitor.ts, worker-utils.ts) already merged via PR #634. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 03:16:17 -05:00
Alex Newman	8030c44af4	MAESTRO: fix(cursor): use bun runtime and fix hooks directory detection Cherry-picked source changes from PR #721. Fixes two Cursor standalone setup bugs: 1. findCursorHooksDir() now checks for hooks.json (unified CLI mode) in addition to legacy common.sh/common.ps1 scripts 2. installCursorHooks() now uses bun instead of node for hook commands since worker-service.cjs depends on bun:sqlite 3. Added findBunPath() to detect bun executable across platforms Build artifacts skipped (pre-existing dompurify viewer dep issue). Source-only cherry-pick, TypeScript compilation clean for modified file. Co-Authored-By: polux0 <aleksaprosperitylabs@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 03:09:26 -05:00
Alex Newman	75a0f2e981	fix: respect CLAUDE_CONFIG_DIR for plugin paths (#626) Add MARKETPLACE_ROOT constant to paths.ts and update 5 source files to use centralized path constants instead of hardcoded ~/.claude paths. Preserves backwards compatibility when CLAUDE_CONFIG_DIR is not set. Based on PR #634 by @Kuroakira, cherry-picked onto main due to build artifact merge conflicts (source changes applied cleanly). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 03:00:08 -05:00
Alex Newman	91e1d5baad	fix: correct Gemini model name from gemini-3-flash to gemini-3-flash-preview The Gemini API requires the -preview suffix for the Gemini 3 Flash model. gemini-3-flash does not exist - only gemini-3-flash-preview is available. This was causing 404 errors when users selected this model option. Closes #831 Co-Authored-By: Glucksberg <markuscontasul@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 02:55:30 -05:00
Alex Newman	3ad53733e8	MAESTRO: Merge PR #884 adding Zscaler SSL certificate support for ChromaDB vector search Adds automatic detection and handling of Zscaler enterprise security certificates on macOS. Combines standard certifi CA certificates with Zscaler certificates into a single bundle, passed via SSL_CERT_FILE/REQUESTS_CA_BUNDLE/CURL_CA_BUNDLE env vars to the chroma-mcp subprocess. Certificate bundle is cached for 24 hours. Falls back gracefully when Zscaler is not present, with no impact on non-Zscaler environments. Co-Authored-By: RClark4958 <rickdclark48@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 02:15:19 -05:00
Jenha Poyarkov	9f2a237aaf	fix: close transport on connection error to prevent chroma-mcp zombie processes Fixes #761 Root cause: When connection errors occur (MCP error -32000, Connection closed), the code was resetting `connected` and `client` but NOT calling `transport.close()`, leaving the chroma-mcp subprocess alive. Each reconnection attempt spawned a NEW process while old ones accumulated. Changes: - Close transport before resetting state in ensureCollection() error handler - Close transport before resetting state in queryChroma() error handler - Set transport = null after closing to match close() method behavior - Add regression tests for Issue #761 with source code verification Tested on macOS - no more zombie processes after the fix.	2026-02-06 02:10:18 -05:00
Abdelkarim Mateos Sanchez	9bd56c993c	fix: align IDs with metadatas in ChromaSearchStrategy ChromaSync.queryChroma() returns deduplicated sqlite_ids but the metadatas array contains multiple entries per observation (narrative + facts). The filterByRecency() method was iterating over metadatas and using the index to access ids, causing array out-of-bounds access. The fix builds a Map from sqlite_id to metadata, then iterates over the deduplicated ids array to ensure proper alignment. Symptoms before fix: - Semantic search returning incorrect/empty results - Search only working with near-exact queries - Recent items (same day) not being found Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-06 02:07:03 -05:00
Alex Newman	711f5455df	fix: generate synthetic memorySessionId for stateless providers (PR #615) Gemini and OpenRouter are stateless APIs that never return session IDs. Without synthetic IDs, PR #693's defensive memorySessionId checks throw errors on every observation processing call for these providers. Generates provider-prefixed IDs (gemini-/openrouter-{contentSessionId}-{timestamp}) before the first API call, persisted to the database via updateMemorySessionId(). Applied from PR #615 (closed due to staleness). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 02:04:49 -05:00
Alex Newman	2d40afe7ef	fix: provider-aware recovery and stale session cleanup (PR #741) Applies PR #741 by @licutis onto main, resolving conflicts with recently merged PRs #693, #937, and #627. Adds getActiveAgent() to WorkerService so startup-recovery uses the correct provider instead of hardcoding SDKAgent. Also cleans up sessions stuck 'active' for 6+ hours and their pending messages before processing orphaned queues. Co-Authored-By: licutis <43884712+licutis@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 01:58:00 -05:00
TranslateMe	ea38601564	fix: Reset AbortController before starting generator to prevent infinite abort loop When a generator exits with wasAborted=true, the AbortController remains in aborted state but generatorPromise is set to null. When a new observation arrives, ensureGeneratorRunning() sees generatorPromise=null and tries to start a new generator, but the new generator immediately sees signal.aborted=true and exits, causing an infinite "Generator aborted" loop. This fix resets the AbortController if it's already aborted before starting a new generator, allowing the session to recover from the stuck state. Bug reproduction: 1. Session receives observations 2. Something causes the generator to be aborted 3. generatorPromise = null, but abortController.signal.aborted = true 4. New observation arrives → starts generator → immediately aborted → loop Fix: Check if abortController.signal.aborted before starting generator, and create a new AbortController if needed.	2026-02-06 01:53:17 -05:00
jayvenn21	f24bba21e9	fix(worker): gracefully process orphaned pending messages after session termination	2026-02-06 01:47:43 -05:00
Michael Lipscombe	af308ea5c8	fix: Backfill project field on SDK session creation to prevent race condition PostToolUse hook can create the session before UserPromptSubmit's session-init sets the project, leaving it empty. Add an UPDATE after INSERT OR IGNORE to backfill the project when a later hook provides it. Closes #939 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-06 01:44:42 -05:00
Alex Newman	6382d6f9c7	MAESTRO: Merge PR #693 - prevent infinite restart loop that causes runaway API costs Add restart limit (max 3 consecutive restarts) with exponential backoff to prevent infinite generator restart loops. Also add defensive memorySessionId checks in GeminiAgent and OpenRouterAgent before expensive LLM calls to fail fast when session ID hasn't been captured. Based on PR #693 by @ajbmachon (applied to current main). Co-Authored-By: Andre Machon <ajbmachon2@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 01:43:12 -05:00
jayvenn21	93bad99d79	fix: terminate session on prompt-too-long instead of retrying indefinitely	2026-02-06 01:39:55 -05:00
Michel Tomas	b07130acc6	fix: handle both boolean and string types for settings JSON.parse preserves native types, so boolean true/false stay as booleans rather than strings. The previous check only handled string 'true', meaning users who set `"ENABLED": true` (boolean) wouldn't have the feature work. Now both `getBool()` helper and ResponseProcessor check handle: - String 'true' → enabled - Boolean true → enabled - Any other value → disabled Tested: Confirmed feature stays disabled with string "false" and no new CLAUDE.md files are created when accessing directories.	2026-02-06 01:36:45 -05:00
Michel Tomas	bb96092d74	fix: add FOLDER_CLAUDEMD_ENABLED to settingKeys for API/UI access	2026-02-06 01:36:45 -05:00
Michel Tomas	9907df1db8	fix: respect CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED setting The CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED setting was documented but never actually checked in code. The folder CLAUDE.md generation ran unconditionally whenever files were touched. Changes: - Add CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED to SettingsDefaults interface - Add default value 'false' to DEFAULTS object - Check setting in ResponseProcessor before calling updateFolderClaudeMdFiles Fixes the issue identified in issue-600-documentation-audit-features-not-implemented.md: "The setting CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED is never read."	2026-02-06 01:36:45 -05:00
jayvenn21	5d1ee20076	fix: prevent duplicate generator spawns in handleSessionInit Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-05 23:48:56 -05:00
Rajiv Sinclair	27aa98c269	fix: wait for database initialization before processing session-init requests Fixes a race condition where the UserPromptSubmit hook could call /api/sessions/init before the database is fully initialized, resulting in a 500 error with "Database not initialized". Root cause: - Worker starts HTTP server immediately for fast startup - Health endpoint (/api/health) returns OK when server is listening - session-init hook waits for health check, then calls /api/sessions/init - Database initialization happens in background, creating a race window Solution: - Add early handler for /api/sessions/init that waits for initialization - Uses same pattern as existing /api/context/inject handler - Returns 503 with retry message if initialization times out The fix ensures hooks receive proper responses even during the brief startup window when the server is listening but DB isn't ready yet. Co-Authored-By: Claude Code <noreply@anthropic.com>	2026-02-05 22:23:52 -05:00
Alex Newman	d333c7dc08	MAESTRO: Expand startup orphan cleanup to target mcp-server and worker-service processes The startup cleanupOrphanedProcesses() only targeted chroma-mcp, leaving orphaned mcp-server.cjs and worker-service.cjs processes undetected after daemon crashes. Expanded to target all claude-mem process types with 30-minute age filtering and current PID exclusion. Closes PR #687 (which had a spawnDaemon regression removing Windows WMIC support). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-05 22:06:46 -05:00
david718	cdb0e823aa	fix: add daemon children cleanup to orphan reaper Add killIdleDaemonChildren() to clean up SDK-spawned claude processes that don't terminate after completing their work. Problem: - Worker-service daemon spawns Claude SDK processes - These processes remain alive after work completes - They accumulate over time, consuming significant memory - Existing killSystemOrphans() only handles PPID=1 orphans Solution: - Add killIdleDaemonChildren() that finds claude processes where: - Parent PID = daemon's PID (children of worker-service) - CPU = 0% (idle, not actively working) - Running > 2 minutes (completed their work) - Call it from reapOrphanedProcesses() (runs every 5 minutes) Testing: - Verified locally: 15+ zombie processes cleaned up - Memory saved: ~2GB - Normal processes (MCP server, Chroma) unaffected Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-05 19:50:46 -05:00
Alex Newman	0ecb387f58	MAESTRO: Implement Windows spawn guard to prevent repeated worker popups (closes #921) Adds file-based cooldown lock in ensureWorkerStarted() so failed worker spawn attempts on Windows don't produce repeated bun.exe terminal popups. Based on the approach from PR #931, integrated into the refactored code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-05 18:59:53 -05:00
Rod Boev	b8b88d998c	fix: fail open on /api/context/inject during initialization The /api/context/inject endpoint previously blocked for up to 5 minutes (300s timeout) waiting for initializationComplete, which includes MCP connection setup. On Windows, the MCP connection can hang indefinitely, causing the context hook to never return and blocking Claude Code startup. This change makes the endpoint fail open: if the worker hasn't finished initializing, return empty context immediately instead of blocking. The hook completes fast, and context becomes available on subsequent prompts once initialization finishes in the background. Closes #958	2026-02-05 18:22:08 -05:00
OpenCode User	86b1d7fad9	fix: restrict CORS to localhost origins only Prevents cross-origin attacks from malicious websites by restricting CORS to only allow: - Requests without Origin header (hooks, curl, CLI tools) - Requests from localhost / 127.0.0.1 origins Previously, CORS was completely open (cors() without configuration), allowing any website to access the local API and read session data.	2026-02-05 18:10:50 -05:00
bigphoot	74f6b75db2	fix: use /api/health instead of /api/readiness for hook health checks Fixes the "Worker did not become ready within 15 seconds" timeout issue. Root cause: isWorkerHealthy() and waitForHealth() were checking /api/readiness which returns 503 until full initialization completes (including MCP connection which can take 5+ minutes). Hooks only have 15 seconds timeout. Solution: Use /api/health (liveness check) which returns 200 as soon as the HTTP server is listening. This is sufficient for hook communication since the worker can accept requests while background initialization continues. Changes: - src/shared/worker-utils.ts: Change /api/readiness to /api/health in isWorkerHealthy() - src/services/infrastructure/HealthMonitor.ts: Change /api/readiness to /api/health in waitForHealth() - tests/infrastructure/health-monitor.test.ts: Update test to expect /api/health Fixes #811, #772, #729 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 20:28:38 -05:00
bigphoot	006ff40175	fix: use centralized credentials from ~/.claude-mem/.env to prevent API key hijacking (#733) This fixes Issue #733 where claude-mem would incorrectly use ANTHROPIC_API_KEY from random project .env files instead of the user's configured Claude Code CLI subscription. Root cause: The SDK's `query()` function inherits from `process.env` when no `env` option is passed. When users work in projects with their own .env files containing API keys, the SDK would discover and use those keys, billing the wrong account. Solution: Centralized credential management via ~/.claude-mem/.env Changes: - Add EnvManager.ts: Centralized credential storage and isolated env builder - SDKAgent: Pass isolated env to SDK query() that only includes credentials from ~/.claude-mem/.env, not random keys from process.env inheritance - GeminiAgent/OpenRouterAgent: Use getCredential() instead of process.env fallback - SettingsDefaultsManager: Add CLAUDE_MEM_CLAUDE_AUTH_METHOD setting ('cli' | 'api') How it works: 1. buildIsolatedEnv() creates a clean environment with only essential system vars (PATH, HOME, etc.) and credentials explicitly configured in ~/.claude-mem/.env 2. SDK subprocess runs with this isolated env, never seeing random API keys 3. If no ANTHROPIC_API_KEY is in ~/.claude-mem/.env, Claude Code CLI billing is used 4. Same pattern applied to Gemini/OpenRouter agents for consistency This ensures claude-mem always uses the user's intended billing method, regardless of what .env files exist in their working directory. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 20:09:41 -05:00
Alex Newman	4df9f61347	refactor: implement in-process worker architecture for hooks (#722) * fix: stop generating empty CLAUDE.md files - Return empty string instead of "No recent activity" when no observations exist - Skip writing CLAUDE.md files when formatted content is empty - Remove redundant "auto-generated by claude-mem" HTML comment - Clean up 98 existing empty CLAUDE.md files across the codebase - Update tests to expect empty string for empty input Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * build assets * refactor: implement in-process worker architecture for hooks Replaces spawn-based worker startup with in-process architecture: - Hook processes now become the worker when port 37777 is free - Eliminates Windows spawn issues (NO SPAWN rule) - SessionStart chains: smart-install && stop && context Key changes: - worker-service.ts: hook case starts WorkerService in-process - hook-command.ts: skipExit option prevents process.exit() when hosting worker - hooks.json: single chained command replaces separate start/hook commands - worker-utils.ts: ensureWorkerRunning() returns boolean, doesn't block - handlers: graceful fallback when worker unavailable All 761 tests pass. Manual verification confirms hook stays alive as worker. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * context * a * MAESTRO: Mark PR #722 test verification task complete All 797 tests passed (3 skipped, 0 failed) after merge conflict resolution. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * MAESTRO: Mark PR #722 build verification task complete * MAESTRO: Mark PR #722 code review task complete Code review verified: - worker-service.ts hook case starts WorkerService in-process - hook-command.ts has skipExit option - hooks.json uses single chained command - worker-utils.ts ensureWorkerRunning() returns boolean Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * MAESTRO: Mark PR #722 conflict resolution push task complete Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 19:49:15 -05:00
Alex Newman	57a60c1309	chore: bump version to 9.0.13 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 19:41:07 -05:00
Alex Newman	7566b8c650	fix: add idle timeout to prevent zombie observer processes (#856) * fix: add idle timeout to prevent zombie observer processes Root cause fix for zombie observer accumulation. The SessionQueueProcessor iterator now exits gracefully after 3 minutes of inactivity instead of waiting forever for messages. Changes: - Add IDLE_TIMEOUT_MS constant (3 minutes) - waitForMessage() now returns boolean and accepts timeout parameter - createIterator() tracks lastActivityTime and exits on idle timeout - Graceful exit via return (not throw) allows SDK to complete cleanly This addresses the root cause that PR #848 worked around with pattern matching. Observer processes now self-terminate, preventing accumulation when session-complete hooks don't fire. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: trigger abort on idle timeout to actually kill subprocess The previous implementation only returned from the iterator on idle timeout, but this doesn't terminate the Claude subprocess - it just stops yielding messages. The subprocess stays alive as a zombie because: 1. Returning from createIterator() ends the generator 2. The SDK closes stdin via transport.endInput() 3. But the subprocess may not exit on stdin EOF 4. No abort signal is sent to kill it Fix: Add onIdleTimeout callback that SessionManager uses to call session.abortController.abort(). This sends SIGTERM to the subprocess via the SDK's ProcessTransport abort handler. Verified by Codex analysis of the SDK internals: - abort() triggers ProcessTransport abort handler → SIGTERM - transport.close() sends SIGTERM → escalates to SIGKILL after 5s - Just closing stdin is NOT sufficient to guarantee subprocess exit Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: add idle timeout to prevent zombie observer processes Also cleaned up hooks.json to remove redundant start commands. The hook command handler now auto-starts the worker if not running, which is how it should have been since we changed to auto-start. This maintenance change was done manually. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: resolve race condition in session queue idle timeout detection - Reset timer on spurious wakeup when queue is empty but duration check fails - Use optional chaining for onIdleTimeout callback - Include threshold value in idle timeout log message for better diagnostics - Add comprehensive unit tests for SessionQueueProcessor Fixes PR #856 review feedback. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat: migrate installer to Setup hook - Add plugin/scripts/setup.sh for one-time dependency setup - Add Setup hook to hooks.json (triggers via claude --init) - Remove smart-install.js from SessionStart hook - Keep smart-install.js as manual fallback for Windows/auto-install Setup hook handles: - Bun detection with fallback locations - uv detection (optional, for Chroma) - Version marker to skip redundant installs - Clear error messages with install instructions * feat: add np for one-command npm releases - Add np as dev dependency - Add release, release:patch, release:minor, release:major scripts - Add prepublishOnly hook to run build before publish - Configure np (no yarn, include all contents, run tests) * fix: reduce PostToolUse hook timeout to 30s PostToolUse runs on every tool call, 120s was excessive and could cause hangs. Reduced to 30s for responsive behavior. * docs: add PR shipping report Analyzed 6 PRs for shipping readiness: - #856: Ready to merge (idle timeout fix) - #700, #722, #657: Have conflicts, need rebase - #464: Contributor PR, too large (15K+ lines) - #863: Needs manual review Includes shipping strategy and conflict resolution order. * MAESTRO: Verify PR #856 test suite passes All 797 tests pass (3 skipped, 0 failures). The 11 SessionQueueProcessor idle timeout tests all pass with 20 expect() assertions verified. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * MAESTRO: Verify PR #856 build passes - Ran npm run build successfully with no TypeScript errors - All artifacts generated (worker-service, mcp-server, context-generator, viewer UI) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * MAESTRO: Code review PR #856 implementation verified Verified all requirements in SessionQueueProcessor.ts: - IDLE_TIMEOUT_MS = 180000ms (3 minutes) - waitForMessage() accepts timeout parameter - lastActivityTime reset on spurious wakeup (race condition fix) - Graceful exit logs include thresholdMs parameter - 11 comprehensive test cases in SessionQueueProcessor.test.ts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: bigph00t <166455923+bigph00t@users.noreply.github.com> Co-authored-by: root <root@srv1317155.hstgr.cloud>	2026-02-04 19:31:24 -05:00
Alex Newman	abffce6424	fix: use cwd instead of CLAUDE_CONFIG_DIR for observer session isolation (#845) The previous approach (PR #837) set CLAUDE_CONFIG_DIR to isolate observer sessions from `claude --resume`. However, this broke authentication because Claude Code stores credentials in the config directory. This fix uses the SDK's `cwd` option instead: - Observer sessions run with cwd=~/.claude-mem/observer-sessions/ - Project name = path.basename(cwd) = "observer-sessions" - Sessions won't appear when running `claude --resume` from actual projects - Authentication works because ~/.claude/ config is preserved Changes: - ProcessRegistry.ts: Remove CLAUDE_CONFIG_DIR override from spawn - SDKAgent.ts: Add cwd option to query() pointing to observer dir - paths.ts: Rename OBSERVER_CONFIG_DIR to OBSERVER_SESSIONS_DIR Fixes regression from #837 Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 16:18:15 -05:00
Glucksberg	6791069bca	fix: isolate observer sessions to prevent polluting claude --resume list (#837) Observer sessions created by claude-mem were appearing in the main Claude Code session picker (`claude --resume`), cluttering the list with internal plugin sessions that users never intend to resume. In one user's case: 74 observer sessions out of ~220 total (34% noise). ## Solution Set `CLAUDE_CONFIG_DIR` to `~/.claude-mem/observer-config/` when spawning observer Claude processes. This stores observer session files in a separate location, isolating them from user sessions. ## Changes 1. Added `OBSERVER_CONFIG_DIR` to paths.ts 2. Modified `createPidCapturingSpawn()` in ProcessRegistry.ts to inject `CLAUDE_CONFIG_DIR` environment variable Observer sessions now write their `.jsonl` files to: `~/.claude-mem/observer-config/projects//` Instead of the user's: `~/.claude/projects//` Fixes #832 Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 13:47:29 -05:00
Alexander Knigge	3e6add90de	fix: prevent stale memory_session_id resume crash after worker restart (Issue #817) (#839) When the worker restarts, the SDK context is lost but the database still contains memory_session_id values from the previous worker instance. The existing guard (lastPromptNumber > 1) doesn't protect against this because lastPromptNumber is also loaded from the database. This fix: - Clears memory_session_id when initializing a session from DB (not from cache) - Adds warning log when discarding stale session IDs - Lets SDK agent capture fresh memory_session_id on first response The key insight: if a session is not in memory, we're in a new worker instance, and any database memory_session_id is definitely stale. Fixes #817 Related to #825 Co-authored-by: bigphoot <bigphoot@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 02:40:19 -05:00
bigphoot	2a83e530e9	feat: Add multi-tenancy support for claude-mem pro Wire tenant, database, and API key settings into ChromaSync for remote/pro mode. In remote mode: - Passes tenant and database to ChromaClient for data isolation - Adds Authorization header when API key is configured - Logs tenant isolation connection details Local mode unchanged - uses default_tenant without explicit params. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 13:02:04 -08:00
bigphoot	e5d763860c	fix: Remove duplicate else block from merge Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 13:02:04 -08:00
Alexander Knigge	9e4b401f9b	Update src/services/sync/ChromaServerManager.ts Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-01-26 13:02:04 -08:00
bigphoot	2c304eafad	feat: Add DefaultEmbeddingFunction for local vector embeddings - Added @chroma-core/default-embed dependency for local embeddings - Updated ChromaSync to use DefaultEmbeddingFunction with collections - Added isServerReachable() async method for reliable server detection - Fixed start() to detect and reuse existing Chroma servers - Updated build script to externalize native ONNX binaries - Added runtime dependency to plugin/package.json The embedding function uses all-MiniLM-L6-v2 model locally via ONNX, eliminating need for external embedding API calls. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 13:02:04 -08:00
bigphoot	70d6ac9daf	fix: Use chromadb v3.2.2 with v2 API heartbeat endpoint - Updated chromadb from ^1.9.2 to ^3.2.2 (includes CLI binary) - Changed heartbeat endpoint from /api/v1 to /api/v2 The 1.9.x version did not include the CLI, causing `npx chroma run` to fail. Version 3.2.2 includes the chroma CLI and uses the v2 API. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 13:02:04 -08:00
bigphoot	5b3804ac08	feat: Switch to persistent Chroma HTTP server Replace MCP subprocess approach with persistent Chroma HTTP server for improved performance and reliability. This re-enables Chroma on Windows by eliminating the subprocess spawning that caused console popups. Changes: - NEW: ChromaServerManager.ts - Manages local Chroma server lifecycle via `npx chroma run` - REFACTOR: ChromaSync.ts - Uses chromadb npm package's ChromaClient instead of MCP subprocess (removes Windows disabling) - UPDATE: worker-service.ts - Starts Chroma server on initialization - UPDATE: GracefulShutdown.ts - Stops Chroma server on shutdown - UPDATE: SettingsDefaultsManager.ts - New Chroma configuration options - UPDATE: build-hooks.js - Mark optional chromadb deps as external Benefits: - Eliminates subprocess spawn latency on first query - Single server process instead of per-operation subprocesses - No Python/uvx dependency for local mode - Re-enables Chroma vector search on Windows - Future-ready for cloud-hosted Chroma (claude-mem pro) - Cross-platform: Linux, macOS, Windows Configuration: CLAUDE_MEM_CHROMA_MODE=local|remote CLAUDE_MEM_CHROMA_HOST=127.0.0.1 CLAUDE_MEM_CHROMA_PORT=8000 CLAUDE_MEM_CHROMA_SSL=false Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 13:02:04 -08:00
Alexander Knigge	182097ef1c	fix: resolve path format mismatch in folder CLAUDE.md generation (#794) (#813) The isDirectChild() function failed to match files when the API used absolute paths (/Users/x/project/app/api) but the database stored relative paths (app/api/router.py). This caused all folder CLAUDE.md files to incorrectly show "No recent activity". Changes: - Create shared path-utils module with proper path normalization - Implement suffix matching strategy for mixed path formats - Update SessionSearch.ts to use shared utilities - Update regenerate-claude-md.ts to use shared utilities (was using outdated broken logic) - Prevent spurious directory creation from malformed paths - Add comprehensive test coverage for path matching edge cases This is the proper fix for #794, replacing PR #809 which only masked the bug by skipping file creation when "no activity" was shown. Co-authored-by: bigphoot <bigphoot@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 15:48:31 -05:00
Alexander Knigge	c1b5b2a783	fix: prevent zombie process accumulation via PID registry and signal propagation (Issue #737) (#806) * Fix zombie process accumulation (Issue #737) Problem: Claude haiku subprocesses spawned by the SDK weren't terminating properly, causing zombie process accumulation (user reported 155 processes consuming 51GB RAM). Root causes: 1. SDK's SpawnedProcess interface hides subprocess PIDs 2. deleteSession() didn't verify subprocess exit 3. abort() was fire-and-forget with no confirmation 4. No mechanism to track or clean up orphaned processes Solution: - Add ProcessRegistry module to track spawned Claude subprocesses - Use SDK's spawnClaudeCodeProcess option to capture PIDs via custom spawn - Pass signal parameter to enable AbortController integration - Wait for subprocess exit in deleteSession() with 5s timeout - Escalate to SIGKILL if graceful exit fails - Add orphan reaper running every 5 minutes as safety net Files changed: - src/services/worker/ProcessRegistry.ts (new): PID registry and reaper - src/services/worker/SDKAgent.ts: Use custom spawn to capture PIDs - src/services/worker/SessionManager.ts: Verify subprocess exit on delete - src/services/worker-service.ts: Start/stop orphan reaper Fixes #737 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: address code review feedback - Replace busy-wait polling with event-based proc.once('exit') - Detect and warn about multiple processes per session (race condition) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: bigphoot <bigphoot@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-25 20:10:11 -05:00
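
Commit 182097ef1c describes a "suffix matching strategy" for comparing an absolute folder path against a relative file path. The following is a minimal sketch of how such a check might look, not the repository's actual code: only the name isDirectChild comes from the commit message, while normalizeSegments, the argument order, and the exact matching rule are assumptions made for illustration.

```typescript
// Hypothetical helper (not from the repository): split a path into
// segments, tolerating Windows separators, leading slashes and "." parts.
function normalizeSegments(p: string): string[] {
  return p
    .replace(/\\/g, "/")
    .split("/")
    .filter((segment) => segment !== "" && segment !== ".");
}

// Sketch of a suffix-matching isDirectChild. The signature is assumed.
// Returns true when the directory that contains `filePath` lines up with
// the tail of `folderPath` (or the other way round), so an absolute path
// and a project-relative path can still be compared.
function isDirectChild(filePath: string, folderPath: string): boolean {
  const file = normalizeSegments(filePath);
  const folder = normalizeSegments(folderPath);

  // A file with no parent directory cannot be matched by suffix.
  if (file.length < 2 || folder.length === 0) return false;

  // Drop the file name; what is left is the file's parent directory.
  const parent = file.slice(0, -1);

  // Compare the shorter segment list against the tail of the longer one.
  const parentIsShorter = parent.length <= folder.length;
  const shorter = parentIsShorter ? parent : folder;
  const longer = parentIsShorter ? folder : parent;
  const offset = longer.length - shorter.length;

  return shorter.every((segment, i) => segment === longer[offset + i]);
}

// Relative DB path against absolute API path (the case from the commit).
console.log(isDirectChild("app/api/router.py", "/Users/x/project/app/api"));
// A file nested one level deeper is not a direct child.
console.log(isDirectChild("app/api/v1/router.py", "/Users/x/project/app/api"));
```

Suffix matching of this kind is deliberately loose: two unrelated folders that share a trailing name (for example two different `api` directories) would both match a relative path such as `api/router.py`, so a real implementation would likely need a project root or additional context to disambiguate.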

352 Commits