The parser correctly strips observation types from concepts arrays when the
LLM ignores the prompt instruction. This is routine data normalization, not
an error — downgrade to debug to reduce log noise.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SessionEnd has a 1.5s hardcoded cap from Claude Code (CLAUDE_CODE_SESSIONEND_HOOKS_TIMEOUT_MS),
making it unsuitable for waiting on async work. Previously, the Stop hook would fire-and-forget
the summarize request, then SessionEnd would immediately call deleteSession — aborting the SDK
agent mid-summary.
Now the Stop hook (120s timeout, no cap) owns the full lifecycle:
1. Queue summarize request
2. Poll new GET /api/sessions/status endpoint until queue drains
3. Call /api/sessions/complete after summary finishes
SessionEnd is now a true fire-and-forget fallback (process.exit(0) immediately).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve merge conflicts in adapter index, gemini-cli adapter, and rebuilt CJS artifacts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes CursorHooksInstaller ESM compatibility, updates install command
with improved path resolution, and refreshes built plugin artifacts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
System reminders (CLAUDE.md contents, deferred tool lists) were being
stored in memory observations. Add system-reminder to the tag stripping
pipeline alongside <private> and <system_instruction>, and extract the
duplicated regex into a shared SYSTEM_REMINDER_REGEX constant.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of using shell:true with spawn(), use cmd.exe as the command
with /c flag to properly execute bun.cmd on Windows.
Without this, spawn() with shell:true fails because cmd.exe doesn't
know how to handle the bun shell script directly.
Fixes: Stop hook "Failed to start Bun: spawn bun ENOENT"
Windows npm installs both bun (shell script) and bun.cmd (batch file).
When spawning bun, cmd.exe cannot execute the shell script directly.
This change makes findBun() return the full path to bun.cmd on Windows.
Fixes: Stop hook "spawn bun ENOENT" on Windows
Windows npm installs bun as a shell script (C:\Users\...\AppData\Roaming\npm\bun),
not a native executable. Without shell:true, spawn() fails with ENOENT
when trying to execute it.
Fixes Stop hook failure: "Failed to start Bun: spawn bun ENOENT"
- Update POST_SPAWN_WAIT test assertion from 5000 to 15000 to match
the constant change in hook-constants.ts
- Remove redundant readPidFile() from aggressiveStartupCleanup() —
start() writes the new PID before this runs, so it always returns
process.pid (already protected)
- Add waitForReadiness() to the reused-worker path in
ensureWorkerStarted() to prevent concurrent hooks from racing
past a cold-starting worker's initialization guard
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three independent fixes for worker daemon instability:
1. Remove version mismatch auto-restart from ensureWorkerStarted() (#1435).
The marketplace bundle ships with __DEFAULT_PACKAGE_VERSION__ unbaked,
causing BUILT_IN_VERSION to fall back to "development". This creates a
100% reproducible mismatch on every hook call, killing a healthy worker
and often failing to restart. Same pattern across #566, #665, #667,
#669, #689, #1124, #1145 (8+ releases).
2. Add process.ppid and PID-file PID to aggressiveStartupCleanup()
exclusions (#1426). Without this, a newly spawned daemon SIGKILLs
the hook process that spawned it and any already-running worker
the PID file points to.
3. Increase POST_SPAWN_WAIT from 5s to 15s (#1423). The 5s timeout was
sized for Linux (<1s startup) but macOS ARM64 cold starts take 6-8s
with Chroma enabled.
The MCP server (#!/usr/bin/env node) and context generator run under
Node.js, where import.meta.url throws SyntaxError in CJS mode. Only
the worker-service needs the banner since it runs under Bun.
CJS files under Node.js already have __dirname/__filename natively.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Fix esbuild inlining build-machine __dirname as string literal — use
CJS-compatible runtime banner with require("node:url").fileURLToPath
across worker-service, mcp-server, and context-generator builds.
2. Fix isMainModule check missing .cjs extension and Windows backslash
path normalization.
3. Wrap extractLastMessage in try-catch to prevent infinite Stop hook
feedback loop on malformed transcripts (exit 0 instead of exit 2).
4. Replace heavy SessionEnd hook (Node→Bun→1.7MB CJS→HTTP) with
lightweight inline node -e one-liner (~200ms vs >1s).
5. Add 7 Gemini/OpenRouter error patterns to unrecoverablePatterns
circuit breaker to prevent 77K+ retry loops on expired API keys.
6. Preserve CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_CODE_GIT_BASH_PATH in
sanitizeEnv instead of stripping them with the CLAUDE_CODE_ prefix.
7. Use PowerShell -EncodedCommand for spawnDaemon to fix path quoting
when Windows usernames contain spaces.
Closes#1515, #1495, #1475, #1465, #1500, #1513, #1512, #1450, #1460,
#1486, #1449, #1481, #1451, #1480, #1453, #1445
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Defense-in-depth for #1505. When the 'start' subcommand forks a daemon,
the parent bun process may be killed by signal (exit > 128). If the
close handler fires, treat this as success since the daemon started fine.
Note: the primary fix is in hooks.json since the SIGKILL often kills
the entire process group before this handler fires.
Tighten platform source persistence so legacy callers cannot silently relabel existing sessions, repair migration 24 when schema_versions drifts from the real schema, and polish the follow-up UI/error-handler review nits.
- only backfill platform_source when it is blank and raise on explicit source conflicts for an existing session
- make migration 24 verify both the sdk_sessions column and its index before treating it as applied
- expose platform_source from the functional session getters and add regression tests for source preservation and schema drift recovery
- add the required APPROVED OVERRIDE annotation for centralized HTTP error translation
- keep mobile source pills on a single horizontal row
Persist platform_source across session creation, transcript ingestion, API query paths, and viewer state so Claude and Codex data can coexist without bleeding into each other.
- add platform-source normalization helpers and persist platform_source in sdk_sessions via migration 24 with backfill and indexing
- thread platformSource through CLI hooks, transcript processing, context generation, pagination, search routes, SSE payloads, and session management
- expose source-aware project catalogs, viewer tabs, context preview selectors, and source badges for observations, prompts, and summaries
- start the transcript watcher from the worker for transcript-based clients and preserve platform source during Codex ingestion
- auto-start the worker from the MCP server for MCP-only clients and tighten stdio-driven cleanup during shutdown
- keep createSDKSession backward compatible with existing custom-title callers while allowing explicit platform source forwarding
GeminiAgent sends the full conversation history with every API call,
causing quadratic token growth per session. A 100-observation session
sends ~30M cumulative input tokens. This ports the proven truncateHistory()
sliding window from OpenRouterAgent to GeminiAgent.
- Add CLAUDE_MEM_GEMINI_MAX_CONTEXT_MESSAGES (default: 20) and
CLAUDE_MEM_GEMINI_MAX_TOKENS (default: 100000) settings
- Add truncateHistory() to GeminiAgent using shared estimateTokens()
- Always preserve at least the newest message to avoid empty API requests
- Add settings validation in SettingsRoutes (1-100 messages, 1K-1M tokens)
- Add regression tests for truncation and oversized single-prompt edge case
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: stop spinner from spinning forever due to orphaned DB messages
The activity spinner never stopped because isAnySessionProcessing() queried
ALL pending/processing messages in the database, including orphaned messages
from dead sessions that no generator would ever process.
Root cause: isAnySessionProcessing() used hasAnyPendingWork() which is a
global DB scan. Changed it to use getTotalQueueDepth() which only checks
sessions in the active in-memory Map.
Additional fixes:
- Add terminateSession() to enforce restart-or-terminate invariant
- Fix 3 zombie paths in .finally() handler that left sessions alive
- Clean up idle sessions from memory on successful completion
- Remove redundant bare isProcessing:true broadcast
- Replace inline require() with proper accessor
- Add 8 regression tests for session termination invariant
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address review findings — idle-timeout race, double broadcast, query amplification
- Move pendingCount check before idle-timeout termination to prevent
abandoning fresh messages that arrive between idle abort and .finally()
- Move broadcastProcessingStatus() inside restart branch only — the else
branch already broadcasts via removeSessionImmediate callback
- Compute queueDepth once in broadcastProcessingStatus() and derive
isProcessing from it, eliminating redundant double iteration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: strip <system_instruction> tags before database storage
Extends the existing tag-stripping mechanism (used for <private> and
<claude-mem-context>) to also filter Conductor-injected system instructions,
preventing them from being persisted in the observation database.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: also strip <system-instruction> (hyphen variant) before DB storage
Conductor uses both <system_instruction> and <system-instruction> tag
formats. This adds the hyphen variant to the same stripping mechanism.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add embedded Process Supervisor for unified process lifecycle management
Consolidates scattered process management (ProcessManager, GracefulShutdown,
HealthMonitor, ProcessRegistry) into a unified src/supervisor/ module.
New: ProcessRegistry with JSON persistence, env sanitizer (strips CLAUDECODE_*
vars), graceful shutdown cascade (SIGTERM → 5s wait → SIGKILL with tree-kill
on Windows), PID file liveness validation, and singleton Supervisor API.
Fixes#1352 (worker inherits CLAUDECODE env causing nested sessions)
Fixes#1356 (zombie TCP socket after Windows reboot)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add session-scoped process reaping to supervisor
Adds reapSession(sessionId) to ProcessRegistry for killing session-tagged
processes on session end. SessionManager.deleteSession() now triggers reaping.
Tightens orphan reaper interval from 60s to 30s.
Fixes#1351 (MCP server processes leak on session end)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add Unix domain socket support for worker communication
Introduces socket-manager.ts for UDS-based worker communication, eliminating
port 37777 collisions between concurrent sessions. Worker listens on
~/.claude-mem/sockets/worker.sock by default with TCP fallback.
All hook handlers, MCP server, health checks, and admin commands updated to
use socket-aware workerHttpRequest(). Backwards compatible — settings can
force TCP mode via CLAUDE_MEM_WORKER_TRANSPORT=tcp.
Fixes#1346 (port 37777 collision across concurrent sessions)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove in-process worker fallback from hook command
Removes the fallback path where hook scripts started WorkerService in-process,
making the worker a grandchild of Claude Code (killed by sandbox). Hooks now
always delegate to ensureWorkerStarted() which spawns a fully detached daemon.
Fixes#1249 (grandchild process killed by sandbox)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add health checker and /api/admin/doctor endpoint
Adds 30-second periodic health sweep that prunes dead processes from the
supervisor registry and cleans stale socket files. Adds /api/admin/doctor
endpoint exposing supervisor state, process liveness, and environment health.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add comprehensive supervisor test suite
64 tests covering all supervisor modules: process registry (18 tests),
env sanitizer (8), shutdown cascade (10), socket manager (15), health
checker (5), and supervisor API (6). Includes persistence, isolation,
edge cases, and cross-module integration scenarios.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert Unix domain socket transport, restore TCP on port 37777
The socket-manager introduced UDS as default transport, but this broke
the HTTP server's TCP accessibility (viewer UI, curl, external monitoring).
Since there's only ever one worker process handling all sessions, the
port collision rationale for UDS doesn't apply. Reverts to TCP-only,
removing ~900 lines of unnecessary complexity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove dead code found in pre-landing review
Remove unused `acceptingSpawns` field from Supervisor class (written but
never read — assertCanSpawn uses stopPromise instead) and unused
`buildWorkerUrl` import from context handler.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* updated gitignore
* fix: address PR review feedback - downgrade HTTP logging, clean up gitignore, harden supervisor
- Downgrade request/response HTTP logging from info to debug to reduce noise
- Remove unused getWorkerPort imports, use buildWorkerUrl helper
- Export ENV_PREFIXES/ENV_EXACT_MATCHES from env-sanitizer, reuse in Server.ts
- Fix isPidAlive(0) returning true (should be false)
- Add shutdownInitiated flag to prevent signal handler race condition
- Make validateWorkerPidFile testable with pidFilePath option
- Remove unused dataDir from ShutdownCascadeOptions
- Upgrade reapSession log from debug to warn
- Rename zombiePidFiles to deadProcessPids (returns actual PIDs)
- Clean up gitignore: remove duplicate datasets/, stale ~*/ and http*/ patterns
- Fix tests to use temp directories instead of relying on real PID file
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: always pass --ssl flag to chroma-mcp in remote mode
The chroma-mcp CLI defaults to SSL when using --client-type http.
When CLAUDE_MEM_CHROMA_SSL is false (the common case for local
ChromaDB servers), buildCommandArgs() omitted --ssl entirely,
causing chroma-mcp to attempt an SSL connection to a plain HTTP
server and fail with "Could not connect to a Chroma server".
Always pass --ssl with an explicit true/false value so the user's
CLAUDE_MEM_CHROMA_SSL setting is faithfully forwarded.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: add regression tests for ChromaMcpManager SSL flag fix
Adds 4 focused test cases verifying buildCommandArgs() produces correct
--ssl args, covering SSL=false, SSL=true, unset (defaults to false), and
local mode (no --ssl flag). Requested by @xkonjin in PR #1286 review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: rebuild checked-in bundles to include SSL flag fix
Rebuild all bundles against upstream/main so the --ssl <true|false>
fix is present in the runtime artifacts that hooks and the marketplace
plugin actually execute.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
smart-install.js used stdio: 'inherit' for execSync calls, leaking
plain-text install output (bun/npm progress) to stdout. Claude Code
expects hook output to start with '{' (valid JSON), so this caused
a confusing "SessionStart:startup hook error" on every session start.
Changes:
- Pipe stdout on all execSync calls to prevent non-JSON stdout leak
- Output {"continue":true,"suppressOutput":true} on both success and
error paths so Claude Code always receives valid JSON
- Add tests verifying no stdio:'inherit' remains and JSON is output
Fixes#1253
Vibe-coded by Ousama Ben Younes
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: unify mode type/concept loading to always use mode definition
Code mode previously read observation types/concepts from settings.json
while non-code modes read from their mode JSON definition. This caused
stale filters to persist when switching between modes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: remove dead observation type/concept settings constants
CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES and OBSERVATION_CONCEPTS are no
longer read by ContextConfigLoader since all modes now use their mode
definition. Removes the constants, defaults, UI controls, and the
now-empty observation-metadata.ts file.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add smart-file-read module for token-optimized semantic code search
- Created package.json for the smart-file-read module with dependencies and scripts.
- Implemented parser.ts for code structure parsing using tree-sitter, supporting multiple languages.
- Developed search.ts for searching code files and symbols with grep-style and structural matching.
- Added test-run.mjs for testing search and outline functionalities.
- Configured TypeScript with tsconfig.json for strict type checking and module resolution.
* fix: update .gitignore to include _tree-sitter and remove unused subproject
* feat: add preliminary results and skill recommendation for smart-explore module
* chore: remove outdated plan.md file detailing session start hook issues
* feat: update Smart File Read integration plan and skill documentation for smart-explore
* feat: migrate Smart File Read to web-tree-sitter WASM for cross-platform compatibility
* refactor: switch to tree-sitter CLI for parsing and enhance search functionality
- Updated `parser.ts` to utilize the tree-sitter CLI for AST extraction instead of native bindings, improving compatibility and performance.
- Removed grammar loading logic and replaced it with a path resolution for grammar packages.
- Implemented batch parsing in `parseFilesBatch` to handle multiple files in a single CLI call, enhancing search speed.
- Refactored `searchCodebase` to collect files and parse them in batches, streamlining the search process.
- Adjusted symbol extraction logic to accommodate the new parsing method and ensure accurate symbol matching.
* feat: update Smart File Read integration plan to utilize tree-sitter CLI for improved performance and cross-platform compatibility
* feat: add smart-file-read parser and search to src/services
Copy validated tree-sitter CLI-based parser and search modules from
smart-file-read prototype into the claude-mem source tree for MCP
tool integration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: register smart_search, smart_unfold, smart_outline MCP tools
Add 3 tree-sitter AST-based code exploration tools to the MCP server.
Direct execution (no HTTP delegation) — they call parser/search
functions directly for sub-second response times.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add tree-sitter CLI deps to build system and plugin runtime
Externalize tree-sitter packages in esbuild MCP server build. Add
10 grammar packages + CLI to plugin package.json for runtime install.
Remove unused @chroma-core/default-embed from plugin deps.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: create smart-explore skill with 3-layer workflow docs
Progressive disclosure workflow: search -> outline -> unfold.
Documents all 3 MCP tools with parameters and token economics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add comprehensive documentation for the smart-explore feature
- Introduced a detailed technical reference covering the architecture, parser, search engine, and tool registration for the smart-explore feature in claude-mem.
- Documented the three-layer workflow: search, outline, and unfold, along with their respective MCP tools.
- Explained the parsing process using tree-sitter, including language support, query patterns, and symbol extraction.
- Outlined the search module's functionality, including file discovery, batch parsing, and relevance scoring.
- Provided insights into build system integration and token economics for efficient code exploration.
* chore: remove experiment artifacts, prototypes, and plan files
Remove A/B test docs, prototype smart-file-read directory, and
implementation plans. Keep only production code.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: simplify hooks configuration and remove setup script
* fix: use execFileSync to prevent command injection in tree-sitter parser
Replaces execSync shell string with execFileSync + argument array,
eliminating shell interpretation of file paths. Also corrects
file_pattern description from "Glob pattern" to "Substring filter".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve PostToolUse hook crashes and 5s latency (#1220)
Three compounding bugs caused hook failures:
1. Missing break statements in worker-service.ts switch — if async
code threw before process.exit(), execution fell through to
subsequent cases. Added break to all 7 cases missing them.
2. Unhandled promise rejection on main() — added .catch() that logs
the error and exits 0 (per project exit code strategy: don't block
Claude Code or leave Windows Terminal tabs open).
3. Redundant start commands in hooks.json — PostToolUse,
UserPromptSubmit, and Stop groups each had a standalone start
command that was redundant (the hook case already calls
ensureWorkerStarted internally). The redundant start also caused
5s latency via bun-runner.js collectStdin() timeout since Claude
Code never closes stdin.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add CLAUDE_PLUGIN_ROOT fallback for Stop hooks (#1215)
Upstream Claude Code bug (anthropics/claude-code#24529) leaves
CLAUDE_PLUGIN_ROOT unset for Stop hooks on macOS and ALL hooks
on Linux. Two-layer defense:
1. Shell-level: hooks.json commands now use inline fallback
_R="${CLAUDE_PLUGIN_ROOT}"; [ -z "$_R" ] && _R="$HOME/...";
falling back to the known marketplace install path.
2. Script-level: bun-runner.js self-resolves plugin root from
its own filesystem location via import.meta.url, and fixes
broken /scripts/... paths that result from empty expansion.
Added test to verify all hook commands include the fallback path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>