Compare commits

19 Commits

Author SHA1 Message Date
Alex Newman 5ccd81b8a3 chore: bump version to 10.5.6
Publish to npm / publish (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:54:32 -07:00
Alex Newman 678ae1e7d3 chore: rebuild worker-service.cjs with latest source changes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:53:03 -07:00
Alex Newman 80a8c90a1a feat: add embedded Process Supervisor for unified process lifecycle (#1370)
* feat: add embedded Process Supervisor for unified process lifecycle management

Consolidates scattered process management (ProcessManager, GracefulShutdown,
HealthMonitor, ProcessRegistry) into a unified src/supervisor/ module.

New: ProcessRegistry with JSON persistence, env sanitizer (strips CLAUDECODE_*
vars), graceful shutdown cascade (SIGTERM → 5s wait → SIGKILL with tree-kill
on Windows), PID file liveness validation, and singleton Supervisor API.

Fixes #1352 (worker inherits CLAUDECODE env causing nested sessions)
Fixes #1356 (zombie TCP socket after Windows reboot)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add session-scoped process reaping to supervisor

Adds reapSession(sessionId) to ProcessRegistry for killing session-tagged
processes on session end. SessionManager.deleteSession() now triggers reaping.
Tightens orphan reaper interval from 60s to 30s.

Fixes #1351 (MCP server processes leak on session end)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Unix domain socket support for worker communication

Introduces socket-manager.ts for UDS-based worker communication, eliminating
port 37777 collisions between concurrent sessions. Worker listens on
~/.claude-mem/sockets/worker.sock by default with TCP fallback.

All hook handlers, MCP server, health checks, and admin commands updated to
use socket-aware workerHttpRequest(). Backwards compatible — settings can
force TCP mode via CLAUDE_MEM_WORKER_TRANSPORT=tcp.

Fixes #1346 (port 37777 collision across concurrent sessions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove in-process worker fallback from hook command

Removes the fallback path where hook scripts started WorkerService in-process,
making the worker a grandchild of Claude Code (killed by sandbox). Hooks now
always delegate to ensureWorkerStarted() which spawns a fully detached daemon.

Fixes #1249 (grandchild process killed by sandbox)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add health checker and /api/admin/doctor endpoint

Adds 30-second periodic health sweep that prunes dead processes from the
supervisor registry and cleans stale socket files. Adds /api/admin/doctor
endpoint exposing supervisor state, process liveness, and environment health.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add comprehensive supervisor test suite

64 tests covering all supervisor modules: process registry (18 tests),
env sanitizer (8), shutdown cascade (10), socket manager (15), health
checker (5), and supervisor API (6). Includes persistence, isolation,
edge cases, and cross-module integration scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert Unix domain socket transport, restore TCP on port 37777

The socket-manager introduced UDS as default transport, but this broke
the HTTP server's TCP accessibility (viewer UI, curl, external monitoring).
Since there's only ever one worker process handling all sessions, the
port collision rationale for UDS doesn't apply. Reverts to TCP-only,
removing ~900 lines of unnecessary complexity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove dead code found in pre-landing review

Remove unused `acceptingSpawns` field from Supervisor class (written but
never read — assertCanSpawn uses stopPromise instead) and unused
`buildWorkerUrl` import from context handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* updated gitignore

* fix: address PR review feedback - downgrade HTTP logging, clean up gitignore, harden supervisor

- Downgrade request/response HTTP logging from info to debug to reduce noise
- Remove unused getWorkerPort imports, use buildWorkerUrl helper
- Export ENV_PREFIXES/ENV_EXACT_MATCHES from env-sanitizer, reuse in Server.ts
- Fix isPidAlive(0) returning true (should be false)
- Add shutdownInitiated flag to prevent signal handler race condition
- Make validateWorkerPidFile testable with pidFilePath option
- Remove unused dataDir from ShutdownCascadeOptions
- Upgrade reapSession log from debug to warn
- Rename zombiePidFiles to deadProcessPids (returns actual PIDs)
- Clean up gitignore: remove duplicate datasets/, stale ~*/ and http*/ patterns
- Fix tests to use temp directories instead of relying on real PID file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:49:23 -07:00
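The env sanitizer described in 80a8c90a1a can be pictured as a filter over the parent's environment before a worker is spawned. This is a minimal sketch, not the repository's code: the names `ENV_PREFIXES` and `ENV_EXACT_MATCHES` appear in the commit message, but their contents and the `sanitizeEnv` signature below are assumptions.

```typescript
// Assumed contents: the commit message only says CLAUDECODE_* vars are stripped.
const ENV_PREFIXES: readonly string[] = ["CLAUDECODE_"];
const ENV_EXACT_MATCHES: readonly string[] = ["CLAUDECODE"];

type Env = Record<string, string | undefined>;

// Hypothetical helper: return a copy of `env` without the variables that
// would make a spawned worker look like a nested Claude Code session.
function sanitizeEnv(env: Env): Env {
  const clean: Env = {};
  for (const key of Object.keys(env)) {
    const blockedByPrefix = ENV_PREFIXES.some((p) => key.indexOf(p) === 0);
    const blockedExactly = ENV_EXACT_MATCHES.indexOf(key) !== -1;
    if (!blockedByPrefix && !blockedExactly) {
      clean[key] = env[key];
    }
  }
  return clean;
}
```

Returning a copy rather than mutating the input keeps the parent's own environment untouched, which matters if the same process later spawns something that does need those variables.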
Vincent Leraitre 237a4c37f8 fix: always pass --ssl flag to chroma-mcp in remote mode (#1286)
* fix: always pass --ssl flag to chroma-mcp in remote mode

The chroma-mcp CLI defaults to SSL when using --client-type http.
When CLAUDE_MEM_CHROMA_SSL is false (the common case for local
ChromaDB servers), buildCommandArgs() omitted --ssl entirely,
causing chroma-mcp to attempt an SSL connection to a plain HTTP
server and fail with "Could not connect to a Chroma server".

Always pass --ssl with an explicit true/false value so the user's
CLAUDE_MEM_CHROMA_SSL setting is faithfully forwarded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add regression tests for ChromaMcpManager SSL flag fix

Adds 4 focused test cases verifying buildCommandArgs() produces correct
--ssl args, covering SSL=false, SSL=true, unset (defaults to false), and
local mode (no --ssl flag). Requested by @xkonjin in PR #1286 review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: rebuild checked-in bundles to include SSL flag fix

Rebuild all bundles against upstream/main so the --ssl <true|false>
fix is present in the runtime artifacts that hooks and the marketplace
plugin actually execute.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:03:58 -07:00
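The four regression cases listed in 237a4c37f8 (SSL=false, SSL=true, unset, local mode) can be illustrated with a reduced argument builder. This is a sketch under assumptions: the `ChromaSettings` shape is hypothetical, only the flags named in the commit message are emitted, and the real `buildCommandArgs()` presumably passes further arguments that are omitted here.

```typescript
// Hypothetical settings shape; `ssl` stands in for CLAUDE_MEM_CHROMA_SSL.
interface ChromaSettings {
  mode: "local" | "remote";
  ssl?: boolean; // unset is treated as false
}

// Reduced sketch of buildCommandArgs(): remote mode always carries an
// explicit --ssl value so the CLI never falls back to its own default.
function buildCommandArgs(settings: ChromaSettings): string[] {
  if (settings.mode === "local") {
    return []; // local mode: no --ssl flag (other local args omitted)
  }
  const ssl = settings.ssl === true;
  return ["--client-type", "http", "--ssl", String(ssl)];
}
```

The point of the fix is visible in the remote branch: the flag is never omitted, so a `false` setting is forwarded as an explicit `--ssl false`.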
laihenyi 626654f816 fix: prevent infinite restart loop on FOREIGN KEY constraint errors (#1334)
The pending-work-restart logic had no retry limit, causing infinite loops
when sessions encountered FOREIGN KEY constraint failures. This led to
2000+ error log entries per minute and eventual worker crash via SIGTERM.

Two fixes:
1. Add 'FOREIGN KEY constraint failed' to unrecoverable error patterns
   so it short-circuits immediately instead of falling through to restart
2. Add MAX_PENDING_RESTARTS (3) limit to pending-work-restart path as a
   safety net for any future unhandled persistent errors

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:03:48 -07:00
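The two guards described in 626654f816 combine into a single restart decision. The sketch below assumes a hypothetical `decideRestart` helper and a plain substring match; only `MAX_PENDING_RESTARTS` (3) and the error text come from the commit message.

```typescript
const MAX_PENDING_RESTARTS = 3;

// Error texts that should never trigger a restart (list contents beyond
// the FOREIGN KEY entry are not known from the commit message).
const UNRECOVERABLE_PATTERNS: readonly string[] = ["FOREIGN KEY constraint failed"];

type RestartDecision = "restart" | "give-up";

// Hypothetical helper: short-circuit on unrecoverable errors first, then
// fall back to the retry cap as a safety net for anything else.
function decideRestart(errorMessage: string, restartsSoFar: number): RestartDecision {
  const unrecoverable = UNRECOVERABLE_PATTERNS.some((p) => errorMessage.indexOf(p) !== -1);
  if (unrecoverable) return "give-up";
  if (restartsSoFar >= MAX_PENDING_RESTARTS) return "give-up";
  return "restart";
}
```

Checking the pattern before the counter means a FOREIGN KEY failure stops on the first attempt instead of burning all three restarts.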
secyunshu ed5189ebe9 fix: merge SessionStart hooks to run sequentially (#1341)
The SessionStart hook was incorrectly split into two separate matchers
with the same pattern "startup|clear|compact", causing them to run
in parallel per Claude Code's hook execution model. This resulted in
a race condition where both hooks tried to bind to port 37777
simultaneously, causing "port in use" errors on first startup.

This fix consolidates all SessionStart commands into a single
matcher, ensuring they execute sequentially.

Fixes regression introduced in commit d93bde0.

Co-authored-by: yunshu <yunshu@moresec.cn>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:03:44 -07:00
enzoricciulli e7ba9acaa7 fix: add content-hash dedup to batch observation store methods (#1302)
storeObservations() and storeObservationsAndMarkComplete() were missing
the content-hash deduplication that storeObservation() (singular) already
had via computeObservationContentHash() and findDuplicateObservation().

This caused the Gemini provider (and potentially others that return
multiple observations per response) to insert 2-10x duplicate rows per
tool use, since the batch methods inserted unconditionally without
checking content_hash.

The fix adds the same dedup pattern from storeObservation() to both
batch methods:
1. Compute content hash via computeObservationContentHash()
2. Check for existing observation within 30s window via findDuplicateObservation()
3. Skip insert and reuse existing ID if duplicate found
4. Include content_hash column in INSERT statement

Fixes #1158 (duplicate observations with Gemini provider)

Co-authored-by: Enzo Ricciulli <e.ricciulli@systhema.ai>
2026-03-12 20:01:53 -07:00
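The four dedup steps listed in e7ba9acaa7 can be sketched against an in-memory store. Everything here is illustrative: the `Observation` fields, the `ObservationStore` class, and the FNV-1a hash are assumptions standing in for the real schema and for `computeObservationContentHash()`, whose implementation the commit does not show.

```typescript
interface Observation { sessionId: string; title: string; narrative: string; }
interface StoredObservation extends Observation { id: number; contentHash: string; createdAtMs: number; }

const DEDUP_WINDOW_MS = 30000; // the 30s window named in the commit message

// Stand-in hash (32-bit FNV-1a), not the project's actual hash function.
function computeObservationContentHash(o: Observation): string {
  const text = [o.sessionId, o.title, o.narrative].join("\u0000");
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16);
}

class ObservationStore {
  private rows: StoredObservation[] = [];
  private nextId = 1;

  private findDuplicateObservation(hash: string, nowMs: number): StoredObservation | undefined {
    for (const row of this.rows) {
      if (row.contentHash === hash && nowMs - row.createdAtMs <= DEDUP_WINDOW_MS) return row;
    }
    return undefined;
  }

  // Batch insert that reuses the existing ID when a recent duplicate exists.
  storeObservations(batch: Observation[], nowMs: number): number[] {
    return batch.map((o) => {
      const contentHash = computeObservationContentHash(o);
      const existing = this.findDuplicateObservation(contentHash, nowMs);
      if (existing) return existing.id;
      const row: StoredObservation = { ...o, id: this.nextId++, contentHash, createdAtMs: nowMs };
      this.rows.push(row);
      return row.id;
    });
  }

  get size(): number { return this.rows.length; }
}
```

Because the duplicate check runs per item inside the batch, two identical observations in the same response collapse to one row, which is the Gemini case the commit describes.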
antmid ad902bedd9 fix: auto-repair malformed database schema from cross-version sync (#1308)
When a claude-mem DB is synced between machines running different versions,
orphaned indexes can reference non-existent columns (e.g. idx_observations_content_hash
referencing content_hash). This causes SQLite to throw "malformed database schema"
on ALL queries, including PRAGMAs, creating a silent 503 failure loop.

The fix detects this on startup, uses Python's sqlite3 module to drop the
orphaned schema objects (bun:sqlite doesn't support writable_schema modifications),
resets migration versions, and lets the idempotent migration system recreate
everything properly.

Fixes #1307

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:01:51 -07:00
GigiTiti-Kai b88566dcdd fix(ui): include SSE live data when project filter is active (#1315)
When a project filter was selected in the Web UI, all SSE live data
(observations, summaries, prompts) was completely discarded. Only
paginated API data was shown, meaning new real-time events were
invisible until the user refreshed the page.

Fix: filter SSE data by project before merging with paginated data,
instead of discarding it entirely.

Fixes #1313

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:01:48 -07:00
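The fix in b88566dcdd amounts to filtering live items before the merge rather than dropping them. A minimal sketch follows, assuming a hypothetical `FeedItem` shape and a `mergeFeed` helper; the de-duplication by `id` is an added assumption, since the commit message does not say how overlapping items are handled.

```typescript
// Hypothetical item shape shared by SSE events and paginated API rows.
interface FeedItem { id: number; project: string; }

// Keep live (SSE) items that match the active project filter, then merge
// them ahead of the paginated items, skipping IDs already seen.
function mergeFeed(live: FeedItem[], paginated: FeedItem[], projectFilter: string | null): FeedItem[] {
  const visibleLive = projectFilter === null
    ? live
    : live.filter((item) => item.project === projectFilter);
  const seen = new Set<number>();
  const merged: FeedItem[] = [];
  for (const item of visibleLive.concat(paginated)) {
    if (!seen.has(item.id)) {
      seen.add(item.id);
      merged.push(item);
    }
  }
  return merged;
}
```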
Rajiv Sinclair 1fac57535e fix: gracefully handle missing transcript files in worktree sessions (#1326)
When Claude Code runs in a worktree (via Agent tool with isolation: "worktree"),
the transcript path points to the worktree's project directory. After the
worktree is cleaned up, the Stop hook fires but the transcript file no longer
exists, causing extractLastMessage() to throw. This error triggers Claude to
respond, which fires another Stop hook, creating an infinite error loop.

Changed throws to warn-and-return-empty so the summarize hook exits cleanly
with exit 0 instead of cascading errors.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:59:47 -07:00
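The warn-and-return-empty behaviour from 1fac57535e can be sketched as follows. This is a simplification under stated assumptions: the file reader and logger are injected as hypothetical parameters, and the "last message" is reduced to the last non-empty line, which is not how the real `extractLastMessage()` necessarily parses a transcript.

```typescript
// Hypothetical injected dependencies, so the sketch has no filesystem coupling.
type ReadFile = (path: string) => string; // expected to throw when the file is missing
type Warn = (message: string) => void;

// Return "" instead of throwing when the transcript is gone, so the
// calling hook can still exit cleanly.
function extractLastMessage(transcriptPath: string, readFile: ReadFile, warn: Warn): string {
  let raw: string;
  try {
    raw = readFile(transcriptPath);
  } catch {
    warn(`transcript not found, skipping: ${transcriptPath}`);
    return "";
  }
  const lines = raw.split("\n").filter((line) => line.trim().length > 0);
  return lines.pop() || "";
}
```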
AlexWorland 10e980cd69 fix: remove unrecognized fields from Claude Code Stop hook output (#1291)
* fix: remove unrecognized fields from Claude Code Stop hook output

Claude Code validates Stop hook JSON output against its hook contract
schema which only accepts {decision?, reason?, systemMessage?}. The
formatOutput() function was returning {continue, suppressOutput} which
are not part of the Claude Code hook API, causing "JSON validation
failed" errors on every session stop.

Return an empty object {} for the default case (no hookSpecificOutput),
preserving only systemMessage when present. This is valid for all hook
event types and eliminates the schema validation error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add unhappy-path tests for formatOutput per PR review

Add edge case coverage for malformed input (undefined/null), falsy
systemMessage values, non-contract field stripping, and contract key
allowlist. Also add defensive null guard to formatOutput matching
normalizeInput pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Alex Worland <alexworland@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:59:45 -07:00
Nir Alfasi 38d9ac7adb fix: prevent zombie subprocess accumulation by only trusting exitCode (#1226) (#1325)
proc.killed only means Node sent a signal — the process can still be alive.
This caused premature pool slot release, allowing unbounded process spawning.

- ensureProcessExit: remove proc.killed from early-exit checks, only trust exitCode
- Fix 3 call-site guards that skipped cleanup for signaled-but-alive processes
- Add TOTAL_PROCESS_HARD_CAP=10 safety net in waitForSlot()
- After SIGKILL, wait up to 1s via exit event instead of blind 200ms sleep
- Reduce reaper interval from 5min to 1min, idle threshold from 2min to 1min

Closes #1226

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:59:42 -07:00
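The distinction drawn in 38d9ac7adb between "signaled" and "exited" can be shown with a reduced process shape. The `ProcLike` interface and both helpers are hypothetical; they mirror only the two `ChildProcess` fields the commit message discusses.

```typescript
// Minimal stand-in for the two ChildProcess fields discussed in the commit.
interface ProcLike {
  killed: boolean;          // a signal was sent; the process may still be alive
  exitCode: number | null;  // null while the process is still running
}

// Following the commit's rule: a pool slot is released only once an
// exit code has been observed, never on `killed` alone.
function hasExited(proc: ProcLike): boolean {
  return proc.exitCode !== null;
}

// Count processes that still occupy a slot.
function countLive(procs: ProcLike[]): number {
  return procs.filter((p) => !hasExited(p)).length;
}
```

With this rule a process that ignored SIGTERM keeps counting against the pool, which is what prevents the unbounded spawning the commit describes.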
GigiTiti-Kai 23058d4b0c fix: move session-complete from Stop to SessionEnd hook (#1330)
The `session-complete` command in the Stop hook runs every turn,
killing the SDK agent via SIGTERM. This causes the next summarize
call to receive an empty response because the agent needs to restart.

By moving `session-complete` to SessionEnd (which only fires on
actual session termination), the SDK agent stays alive between turns
and summarize works reliably.

Fixes #1314

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:59:39 -07:00
Ben Younes 503bda4868 fix: add null guards for getChromaSync() when Chroma is disabled (#1336)
When CLAUDE_MEM_CHROMA_ENABLED=false, getChromaSync() returns null.
Two call sites were missing null guards, causing "null is not an object"
errors on every UserPromptSubmit / session init.

Fixes #1294

Vibe-coded by Ousama Ben Younes

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:58:03 -07:00
Ben Younes 4616f7ab1c fix: output valid JSON from smart-install.js hook to prevent SessionStart error (#1337)
smart-install.js used stdio: 'inherit' for execSync calls, leaking
plain-text install output (bun/npm progress) to stdout. Claude Code
expects hook output to start with '{' (valid JSON), so this caused
a confusing "SessionStart:startup hook error" on every session start.

Changes:
- Pipe stdout on all execSync calls to prevent non-JSON stdout leak
- Output {"continue":true,"suppressOutput":true} on both success and
  error paths so Claude Code always receives valid JSON
- Add tests verifying no stdio:'inherit' remains and JSON is output

Fixes #1253

Vibe-coded by Ousama Ben Younes

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:58:00 -07:00
Umut Polat 73113321a1 fix: respect dateStart/dateEnd filters in Chroma search path (#1343)
When a search query includes dateStart/dateEnd parameters, the Chroma
semantic search path (PATH 2) ignored them entirely and only applied a
hardcoded 90-day recency window. This meant date-filtered searches
returned results from outside the requested range.

Now the Chroma path checks for a user-provided dateRange first. If
present, it filters results by the requested start/end bounds. The
90-day window is only used as the default when no date filter is
specified, preserving backward compatibility.

Fixes #1324

Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
2026-03-12 19:57:58 -07:00
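The precedence described in 73113321a1 (user-supplied range first, 90-day window only as the default) can be sketched as a small window resolver. The `DateRange` and `Hit` shapes and both function names are assumptions; the real search path works on Chroma results whose structure is not shown here.

```typescript
const DEFAULT_RECENCY_DAYS = 90;
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical shapes for the request's date filter and a search hit.
interface DateRange { startMs?: number; endMs?: number; }
interface Hit { id: number; createdAtMs: number; }

// A user-provided bound wins; the 90-day window applies only when no
// date filter was given at all.
function resolveWindow(range: DateRange | undefined, nowMs: number): { start: number; end: number } {
  if (range && (range.startMs !== undefined || range.endMs !== undefined)) {
    return {
      start: range.startMs !== undefined ? range.startMs : -Infinity,
      end: range.endMs !== undefined ? range.endMs : Infinity,
    };
  }
  return { start: nowMs - DEFAULT_RECENCY_DAYS * DAY_MS, end: Infinity };
}

function filterByDate(hits: Hit[], range: DateRange | undefined, nowMs: number): Hit[] {
  const { start, end } = resolveWindow(range, nowMs);
  return hits.filter((h) => h.createdAtMs >= start && h.createdAtMs <= end);
}
```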
Umut Polat 88be01910b fix: respect env vars and settings.json for DATA_DIR resolution (#1344)
SettingsDefaultsManager.get() returned only the hardcoded default,
ignoring both environment variables and settings.json overrides. This
meant CLAUDE_MEM_DATA_DIR set via env or settings file had no effect
on paths.ts (and other callers using get()).

Two changes:
- get() now checks process.env before falling back to the default
- paths.ts resolves DATA_DIR with full priority: env var > settings.json
  at the default location > hardcoded default

The settings file read uses a synchronous bootstrap pattern to avoid
circular dependencies (DATA_DIR is needed to locate the settings file,
so we check the default path only).

Fixes #1303

Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
2026-03-12 19:57:55 -07:00
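The priority order from 88be01910b (env var, then settings.json at the default location, then the hardcoded default) reduces to a short lookup chain. The sketch assumes a hypothetical `resolveDataDir` helper that receives already-parsed inputs, and it assumes the settings file uses the same `CLAUDE_MEM_DATA_DIR` key as the environment variable.

```typescript
type EnvVars = Record<string, string | undefined>;
type SettingsFile = Record<string, unknown>;

// Hypothetical resolver. `settingsAtDefaultPath` is the parsed settings.json
// found at the default location, or null when that file does not exist.
function resolveDataDir(env: EnvVars, settingsAtDefaultPath: SettingsFile | null, hardcodedDefault: string): string {
  const fromEnv = env["CLAUDE_MEM_DATA_DIR"];
  if (fromEnv) return fromEnv;

  const fromSettings = settingsAtDefaultPath ? settingsAtDefaultPath["CLAUDE_MEM_DATA_DIR"] : undefined;
  if (typeof fromSettings === "string" && fromSettings.length > 0) return fromSettings;

  return hardcodedDefault;
}
```

Reading the settings file only from the default location is what breaks the circular dependency the commit mentions: the lookup never needs DATA_DIR to find the file that defines DATA_DIR.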
Umut Polat 9dbf63f5d4 fix: prevent LLM from using <observation> tags in summary responses (#1345)
The SDK agent's conversation history is heavily biased toward
<observation> output. By the time a summarize prompt arrives, the
in-context conditioning can cause the LLM to respond with <observation>
tags instead of the expected <summary> tags. parseSummary() then returns
null and the summary is silently lost.

Two changes:
- Add explicit mode-switch instructions at the top of the summary prompt
  telling the LLM not to use <observation> tags and that only <summary>
  output will be accepted
- Add a warning log in parseSummary() when <observation> tags are found
  in a response that has no <summary> block, making the issue visible
  in logs instead of silently discarding

Fixes #1312

Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
2026-03-12 19:57:52 -07:00
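The logging change in 9dbf63f5d4 can be illustrated with a reduced parser. This is a sketch, not the project's `parseSummary()`: the `ParseResult` shape is hypothetical, and the warning is returned as a value instead of being written to a logger so the behaviour is easy to check.

```typescript
// Hypothetical result shape; the real function returns null on failure
// and logs the warning itself.
interface ParseResult { summary: string | null; warning: string | null; }

function parseSummary(response: string): ParseResult {
  const match = /<summary>([\s\S]*?)<\/summary>/.exec(response);
  if (match) {
    return { summary: (match[1] || "").trim(), warning: null };
  }
  // No <summary> block: surface the mode-confusion case instead of
  // discarding the response silently.
  if (response.indexOf("<observation>") !== -1) {
    return { summary: null, warning: "response used <observation> tags but no <summary> block" };
  }
  return { summary: null, warning: null };
}
```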
Alex Newman 3651a34e96 docs: update CHANGELOG.md for v10.5.5
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 03:02:35 -07:00
66 changed files with 3399 additions and 724 deletions
+1 -1
@@ -10,7 +10,7 @@
"plugins": [
{
"name": "claude-mem",
-"version": "10.5.5",
+"version": "10.5.6",
"source": "./plugin",
"description": "Persistent memory system for Claude Code - context compression across sessions"
}
+5 -8
@@ -20,7 +20,6 @@ plugin/data.backup/
package-lock.json
bun.lock
private/
datasets/
Auto Run Docs/
# Generated UI files (built from viewer-template.html)
@@ -30,12 +29,10 @@ src/ui/viewer.html
.mcp.json
.cursor/
# Prevent literal tilde directories (path validation bug artifacts)
~*/
# Prevent other malformed path directories
http*/
https*/
# Ignore WebStorm project files (for dinosaur IDE users)
.idea/
.claude-octopus/
.claude/session-intent.md
.claude/session-plan.md
.octo/
+12 -19
@@ -2,6 +2,18 @@
All notable changes to claude-mem.
## [v10.5.5] - 2026-03-09
### Bug Fix
- **Fixed empty context queries after mode switching**: Switching from a non-code mode (e.g., law-study) back to code mode left stale observation type/concept filters in `settings.json`, causing all context queries to return empty results. All modes now read types/concepts from their mode JSON definition uniformly.
### Cleanup
- Removed dead `CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES` and `CLAUDE_MEM_CONTEXT_OBSERVATION_CONCEPTS` settings constants
- Deleted `src/constants/observation-metadata.ts` (no longer needed)
- Removed observation type/concept filter UI controls from the viewer's Context Settings modal
## [v10.5.4] - 2026-03-09
## Bug Fixes
@@ -1168,22 +1180,3 @@ Version 9.0.0 introduces the **Live Context System** - a major new capability th
🤖 Generated with [Claude Code](https://claude.com/claude-code)
## [v8.5.10] - 2026-01-06
## Bug Fixes
- **#545**: Fixed `formatTool` crash when parsing non-JSON tool inputs (e.g., raw Bash commands)
- **#544**: Fixed terminology in context hints - changed "mem-search skill" to "MCP tools"
- **#557**: Settings file now auto-creates with defaults on first run (no more "module loader" errors)
- **#543**: Fixed hook execution by switching runtime from `node` to `bun` (resolves `bun:sqlite` issues)
## Code Quality
- Fixed circular dependency between Logger and SettingsDefaultsManager
- Added 72 integration tests for critical coverage gaps
- Cleaned up mock-heavy tests causing module cache pollution
## Full Changelog
See PR #558 for complete details and diagnostic reports.
+4 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "10.5.5",
+"version": "10.5.6",
"description": "Memory compression system for Claude Code - persist context across sessions",
"keywords": [
"claude",
@@ -129,5 +129,8 @@
"tree-sitter-typescript": "^0.23.2",
"tsx": "^4.20.6",
"typescript": "^5.3.0"
},
"optionalDependencies": {
"tree-kill": "^1.2.2"
}
}
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "10.5.5",
+"version": "10.5.6",
"description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
"author": {
"name": "Alex Newman"
+8 -7
@@ -21,12 +21,7 @@
"type": "command",
"command": "_R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/smart-install.js\"",
"timeout": 300
}
]
},
{
"matcher": "startup|clear|compact",
"hooks": [
},
{
"type": "command",
"command": "_R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start",
@@ -70,7 +65,13 @@
"type": "command",
"command": "_R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code summarize",
"timeout": 120
},
}
]
}
],
"SessionEnd": [
{
"hooks": [
{
"type": "command",
"command": "_R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-complete",
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem-plugin",
-"version": "10.5.5",
+"version": "10.5.6",
"private": true,
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
+18 -9
@@ -220,14 +220,14 @@ function installBun() {
// Windows: Use PowerShell installer
console.error(' Installing via PowerShell...');
execSync('powershell -c "irm bun.sh/install.ps1 | iex"', {
stdio: 'inherit',
stdio: ['pipe', 'pipe', 'inherit'],
shell: true
});
} else {
// Unix/macOS: Use curl installer
console.error(' Installing via curl...');
execSync('curl -fsSL https://bun.sh/install | bash', {
stdio: 'inherit',
stdio: ['pipe', 'pipe', 'inherit'],
shell: true
});
}
@@ -285,14 +285,14 @@ function installUv() {
// Windows: Use PowerShell installer
console.error(' Installing via PowerShell...');
execSync('powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"', {
stdio: 'inherit',
stdio: ['pipe', 'pipe', 'inherit'],
shell: true
});
} else {
// Unix/macOS: Use curl installer
console.error(' Installing via curl...');
execSync('curl -LsSf https://astral.sh/uv/install.sh | sh', {
stdio: 'inherit',
stdio: ['pipe', 'pipe', 'inherit'],
shell: true
});
}
@@ -426,14 +426,18 @@ function installDeps() {
// Quote path for Windows paths with spaces
const bunCmd = IS_WINDOWS && bunPath.includes(' ') ? `"${bunPath}"` : bunPath;
// Use pipe for stdout to prevent non-JSON output leaking to Claude Code hooks.
// stderr is inherited so progress/errors are still visible to the user.
const installStdio = ['pipe', 'pipe', 'inherit'];
let bunSucceeded = false;
try {
execSync(`${bunCmd} install`, { cwd: ROOT, stdio: 'inherit', shell: IS_WINDOWS });
execSync(`${bunCmd} install`, { cwd: ROOT, stdio: installStdio, shell: IS_WINDOWS });
bunSucceeded = true;
} catch {
// First attempt failed, try with force flag
try {
execSync(`${bunCmd} install --force`, { cwd: ROOT, stdio: 'inherit', shell: IS_WINDOWS });
execSync(`${bunCmd} install --force`, { cwd: ROOT, stdio: installStdio, shell: IS_WINDOWS });
bunSucceeded = true;
} catch {
// Bun failed completely, will try npm fallback
@@ -445,7 +449,7 @@ function installDeps() {
console.error('⚠️ Bun install failed, falling back to npm...');
console.error(' (This can happen with npm alias packages like *-cjs)');
try {
execSync('npm install', { cwd: ROOT, stdio: 'inherit', shell: IS_WINDOWS });
execSync('npm install', { cwd: ROOT, stdio: installStdio, shell: IS_WINDOWS });
} catch (npmError) {
throw new Error('Both bun and npm install failed: ' + npmError.message);
}
@@ -506,7 +510,7 @@ try {
console.error(`⚠️ Bun ${currentVersion} is outdated. Minimum required: ${MIN_BUN_VERSION}`);
console.error(' Upgrading bun...');
try {
execSync('bun upgrade', { stdio: 'inherit', shell: IS_WINDOWS });
execSync('bun upgrade', { stdio: ['pipe', 'pipe', 'inherit'], shell: IS_WINDOWS });
if (!isBunVersionSufficient()) {
console.error(`❌ Bun upgrade failed. Please manually upgrade: bun upgrade`);
process.exit(1);
@@ -542,7 +546,7 @@ try {
if (!verifyCriticalModules()) {
console.error('⚠️ Retrying install with npm...');
try {
execSync('npm install --production', { cwd: ROOT, stdio: 'inherit', shell: IS_WINDOWS });
execSync('npm install --production', { cwd: ROOT, stdio: ['pipe', 'pipe', 'inherit'], shell: IS_WINDOWS });
} catch {
// npm also failed
}
@@ -577,7 +581,12 @@ try {
// Step 4: Install CLI to PATH
installCLI();
// Output valid JSON for Claude Code hook contract
console.log(JSON.stringify({ continue: true, suppressOutput: true }));
} catch (e) {
console.error('❌ Installation failed:', e.message);
// Still output valid JSON so Claude Code doesn't show a confusing error
console.log(JSON.stringify({ continue: true, suppressOutput: true }));
process.exit(1);
}
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
+11 -4
@@ -16,13 +16,20 @@ export const claudeCodeAdapter: PlatformAdapter = {
     };
   },
   formatOutput(result) {
-    if (result.hookSpecificOutput) {
+    const r = result ?? ({} as HookResult);
+    if (r.hookSpecificOutput) {
       const output: Record<string, unknown> = { hookSpecificOutput: result.hookSpecificOutput };
-      if (result.systemMessage) {
-        output.systemMessage = result.systemMessage;
+      if (r.systemMessage) {
+        output.systemMessage = r.systemMessage;
       }
       return output;
     }
-    return { continue: result.continue ?? true, suppressOutput: result.suppressOutput ?? true };
+    // Only emit fields in the Claude Code hook contract — unrecognized fields
+    // cause "JSON validation failed" in Stop hooks.
+    const output: Record<string, unknown> = {};
+    if (r.systemMessage) {
+      output.systemMessage = r.systemMessage;
+    }
+    return output;
   }
 };
+5 -5
@@ -6,7 +6,7 @@
*/
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, getWorkerPort, workerHttpRequest } from '../../shared/worker-utils.js';
import { getProjectContext } from '../../utils/project-name.js';
import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
import { logger } from '../../utils/logger.js';
@@ -38,16 +38,16 @@ export const contextHandler: EventHandler = {
// Pass all projects (parent + worktree if applicable) for unified timeline
const projectsParam = context.allProjects.join(',');
const url = `http://127.0.0.1:${port}/api/context/inject?projects=${encodeURIComponent(projectsParam)}`;
const apiPath = `/api/context/inject?projects=${encodeURIComponent(projectsParam)}`;
const colorApiPath = `${apiPath}&colors=true`;
// Note: Removed AbortSignal.timeout due to Windows Bun cleanup issue (libuv assertion)
// Worker service has its own timeouts, so client-side timeout is redundant
try {
// Fetch markdown (for Claude context) and optionally colored (for user display)
const colorUrl = `${url}&colors=true`;
const [response, colorResponse] = await Promise.all([
fetch(url),
showTerminalOutput ? fetch(colorUrl).catch(() => null) : Promise.resolve(null)
workerHttpRequest(apiPath),
showTerminalOutput ? workerHttpRequest(colorApiPath).catch(() => null) : Promise.resolve(null)
]);
if (!response.ok) {
+2 -6
@@ -6,7 +6,7 @@
*/
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
import { logger } from '../../utils/logger.js';
import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
@@ -25,10 +25,7 @@ export const fileEditHandler: EventHandler = {
throw new Error('fileEditHandler requires filePath');
}
const port = getWorkerPort();
logger.dataIn('HOOK', `FileEdit: ${filePath}`, {
workerPort: port,
editCount: edits?.length ?? 0
});
@@ -40,7 +37,7 @@ export const fileEditHandler: EventHandler = {
// Send to worker as an observation with file edit metadata
// The observation handler on the worker will process this appropriately
try {
const response = await fetch(`http://127.0.0.1:${port}/api/sessions/observations`, {
const response = await workerHttpRequest('/api/sessions/observations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
@@ -50,7 +47,6 @@ export const fileEditHandler: EventHandler = {
tool_response: { success: true },
cwd
})
// Note: Removed signal to avoid Windows Bun cleanup issue (libuv assertion)
});
if (!response.ok) {
+3 -8
@@ -5,7 +5,7 @@
*/
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
import { logger } from '../../utils/logger.js';
import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
import { isProjectExcluded } from '../../utils/project-filter.js';
@@ -28,13 +28,9 @@ export const observationHandler: EventHandler = {
return { continue: true, suppressOutput: true, exitCode: HOOK_EXIT_CODES.SUCCESS };
}
const port = getWorkerPort();
const toolStr = logger.formatTool(toolName, toolInput);
logger.dataIn('HOOK', `PostToolUse: ${toolStr}`, {
workerPort: port
});
logger.dataIn('HOOK', `PostToolUse: ${toolStr}`, {});
// Validate required fields before sending to worker
if (!cwd) {
@@ -50,7 +46,7 @@ export const observationHandler: EventHandler = {
// Send to worker - worker handles privacy check and database operations
try {
const response = await fetch(`http://127.0.0.1:${port}/api/sessions/observations`, {
const response = await workerHttpRequest('/api/sessions/observations', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
@@ -60,7 +56,6 @@ export const observationHandler: EventHandler = {
tool_response: toolResponse,
cwd
})
// Note: Removed signal to avoid Windows Bun cleanup issue (libuv assertion)
});
if (!response.ok) {
+2 -4
@@ -10,7 +10,7 @@
*/
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
import { logger } from '../../utils/logger.js';
export const sessionCompleteHandler: EventHandler = {
@@ -23,7 +23,6 @@ export const sessionCompleteHandler: EventHandler = {
}
const { sessionId } = input;
const port = getWorkerPort();
if (!sessionId) {
logger.warn('HOOK', 'session-complete: Missing sessionId, skipping');
@@ -31,13 +30,12 @@ export const sessionCompleteHandler: EventHandler = {
}
logger.info('HOOK', '→ session-complete: Removing session from active map', {
workerPort: port,
contentSessionId: sessionId
});
try {
// Call the session complete endpoint by contentSessionId
const response = await fetch(`http://127.0.0.1:${port}/api/sessions/complete`, {
const response = await workerHttpRequest('/api/sessions/complete', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
+3 -6
@@ -5,7 +5,7 @@
*/
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
import { getProjectName } from '../../utils/project-name.js';
import { logger } from '../../utils/logger.js';
import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
@@ -42,12 +42,11 @@ export const sessionInitHandler: EventHandler = {
const prompt = (!rawPrompt || !rawPrompt.trim()) ? '[media prompt]' : rawPrompt;
const project = getProjectName(cwd);
const port = getWorkerPort();
logger.debug('HOOK', 'session-init: Calling /api/sessions/init', { contentSessionId: sessionId, project });
// Initialize session via HTTP - handles DB operations and privacy checks
const initResponse = await fetch(`http://127.0.0.1:${port}/api/sessions/init`, {
const initResponse = await workerHttpRequest('/api/sessions/init', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
@@ -55,7 +54,6 @@ export const sessionInitHandler: EventHandler = {
project,
prompt
})
// Note: Removed signal to avoid Windows Bun cleanup issue (libuv assertion)
});
if (!initResponse.ok) {
@@ -107,11 +105,10 @@ export const sessionInitHandler: EventHandler = {
logger.debug('HOOK', 'session-init: Calling /sessions/{sessionDbId}/init', { sessionDbId, promptNumber });
// Initialize SDK agent session via HTTP (starts the agent!)
const response = await fetch(`http://127.0.0.1:${port}/sessions/${sessionDbId}/init`, {
const response = await workerHttpRequest(`/sessions/${sessionDbId}/init`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ userPrompt: cleanedPrompt, promptNumber })
// Note: Removed signal to avoid Windows Bun cleanup issue (libuv assertion)
});
if (!response.ok) {
+10 -16
@@ -7,7 +7,7 @@
*/
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort, fetchWithTimeout } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
import { logger } from '../../utils/logger.js';
import { extractLastMessage } from '../../shared/transcript-parser.js';
import { HOOK_EXIT_CODES, HOOK_TIMEOUTS, getTimeout } from '../../shared/hook-constants.js';
@@ -25,8 +25,6 @@ export const summarizeHandler: EventHandler = {
const { sessionId, transcriptPath } = input;
const port = getWorkerPort();
// Validate required fields before processing
if (!transcriptPath) {
// No transcript available - skip summary gracefully (not an error)
@@ -40,23 +38,19 @@ export const summarizeHandler: EventHandler = {
const lastAssistantMessage = extractLastMessage(transcriptPath, 'assistant', true);
logger.dataIn('HOOK', 'Stop: Requesting summary', {
workerPort: port,
hasLastAssistantMessage: !!lastAssistantMessage
});
// Send to worker - worker handles privacy check and database operations
const response = await fetchWithTimeout(
`http://127.0.0.1:${port}/api/sessions/summarize`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
contentSessionId: sessionId,
last_assistant_message: lastAssistantMessage
}),
},
SUMMARIZE_TIMEOUT_MS
);
const response = await workerHttpRequest('/api/sessions/summarize', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
contentSessionId: sessionId,
last_assistant_message: lastAssistantMessage
}),
timeoutMs: SUMMARIZE_TIMEOUT_MS
});
if (!response.ok) {
// Return standard response even on failure (matches original behavior)
+3 -5
@@ -7,7 +7,7 @@
import { basename } from 'path';
import type { EventHandler, NormalizedHookInput, HookResult } from '../types.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, getWorkerPort, workerHttpRequest } from '../../shared/worker-utils.js';
import { HOOK_EXIT_CODES } from '../../shared/hook-constants.js';
export const userMessageHandler: EventHandler = {
@@ -23,11 +23,9 @@ export const userMessageHandler: EventHandler = {
const project = basename(input.cwd ?? process.cwd());
// Fetch formatted context directly from worker API
// Note: Removed AbortSignal.timeout to avoid Windows Bun cleanup issue (libuv assertion)
try {
const response = await fetch(
`http://127.0.0.1:${port}/api/context/inject?project=${encodeURIComponent(project)}&colors=true`,
{ method: 'GET' }
const response = await workerHttpRequest(
`/api/context/inject?project=${encodeURIComponent(project)}&colors=true`
);
if (!response.ok) {
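The handlers above all replace a hard-coded `http://127.0.0.1:${port}` URL with a call to `workerHttpRequest(path, options)`. The helper's implementation is not part of this excerpt, so the following is only a minimal sketch of how such a helper might pick between a Unix domain socket and TCP. `resolveWorkerTarget`, `WorkerTarget`, `WorkerTransportSettings`, and the injected `socketExists` callback are hypothetical names; the default port 37777 and the socket-first, TCP-fallback order are taken from the commit message.

```typescript
// Hypothetical sketch: NOT the actual workerHttpRequest implementation.
type WorkerTarget =
  | { kind: 'socket'; socketPath: string; path: string }
  | { kind: 'tcp'; host: string; port: number; path: string };

interface WorkerTransportSettings {
  socketPath?: string; // assumed to be an already-expanded absolute path
  host?: string;
  port?: number;
}

function resolveWorkerTarget(
  apiPath: string,
  settings: WorkerTransportSettings,
  socketExists: (socketPath: string) => boolean
): WorkerTarget {
  if (apiPath.charAt(0) !== '/') {
    throw new Error(`API path must start with "/": ${apiPath}`);
  }
  // Prefer the socket when it is configured and present on disk.
  if (settings.socketPath && socketExists(settings.socketPath)) {
    return { kind: 'socket', socketPath: settings.socketPath, path: apiPath };
  }
  // Otherwise fall back to loopback TCP (defaults are assumptions).
  return {
    kind: 'tcp',
    host: settings.host ?? '127.0.0.1',
    port: settings.port ?? 37777,
    path: apiPath,
  };
}
```

Because callers pass only a path, the transport decision lives in one place, which is why the handlers no longer need `getWorkerPort()`.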
+5
@@ -120,6 +120,11 @@ export function parseSummary(text: string, sessionId?: number): ParsedSummary |
const summaryMatch = summaryRegex.exec(text);
if (!summaryMatch) {
// Log when the response contains <observation> instead of <summary>
// to help diagnose prompt conditioning issues (see #1312)
if (/<observation>/.test(text)) {
logger.warn('PARSER', 'Summary response contained <observation> tags instead of <summary> — prompt conditioning may need strengthening', { sessionId });
}
return null;
}
+5 -1
@@ -130,7 +130,11 @@ export function buildSummaryPrompt(session: SDKSession, mode: ModeConfig): strin
return '';
})();
return `${mode.prompts.header_summary_checkpoint}
return `--- MODE SWITCH: PROGRESS SUMMARY ---
Do NOT output <observation> tags. This is a summary request, not an observation request.
Your response MUST use <summary> tags ONLY. Any <observation> output will be discarded.
${mode.prompts.header_summary_checkpoint}
${mode.prompts.summary_instruction}
${mode.prompts.summary_context_label}
+9 -19
@@ -27,19 +27,12 @@ import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { getWorkerPort, getWorkerHost } from '../shared/worker-utils.js';
import { workerHttpRequest } from '../shared/worker-utils.js';
import { searchCodebase, formatSearchResults } from '../services/smart-file-read/search.js';
import { parseFile, formatFoldedView, unfoldSymbol } from '../services/smart-file-read/parser.js';
import { readFile } from 'node:fs/promises';
import { resolve } from 'node:path';
/**
* Worker HTTP API configuration
*/
const WORKER_PORT = getWorkerPort();
const WORKER_HOST = getWorkerHost();
const WORKER_BASE_URL = `http://${WORKER_HOST}:${WORKER_PORT}`;
/**
* Map tool names to Worker HTTP endpoints
*/
@@ -49,7 +42,7 @@ const TOOL_ENDPOINT_MAP: Record<string, string> = {
};
/**
* Call Worker HTTP API endpoint
* Call Worker HTTP API endpoint (uses socket or TCP automatically)
*/
async function callWorkerAPI(
endpoint: string,
@@ -67,8 +60,8 @@ async function callWorkerAPI(
}
}
const url = `${WORKER_BASE_URL}${endpoint}?${searchParams}`;
const response = await fetch(url);
const apiPath = `${endpoint}?${searchParams}`;
const response = await workerHttpRequest(apiPath);
if (!response.ok) {
const errorText = await response.text();
@@ -103,12 +96,9 @@ async function callWorkerAPIPost(
logger.debug('HTTP', 'Worker API request (POST)', undefined, { endpoint });
try {
const url = `${WORKER_BASE_URL}${endpoint}`;
const response = await fetch(url, {
const response = await workerHttpRequest(endpoint, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body)
});
@@ -145,7 +135,7 @@ async function callWorkerAPIPost(
*/
async function verifyWorkerConnection(): Promise<boolean> {
try {
const response = await fetch(`${WORKER_BASE_URL}/api/health`);
const response = await workerHttpRequest('/api/health');
return response.ok;
} catch (error) {
// Expected during worker startup or if worker is down
@@ -448,11 +438,11 @@ async function main() {
setTimeout(async () => {
const workerAvailable = await verifyWorkerConnection();
if (!workerAvailable) {
logger.error('SYSTEM', 'Worker not available', undefined, { workerUrl: WORKER_BASE_URL });
logger.error('SYSTEM', 'Worker not available', undefined, {});
logger.error('SYSTEM', 'Tools will fail until Worker is started');
logger.error('SYSTEM', 'Start Worker with: npm run worker:restart');
} else {
logger.info('SYSTEM', 'Worker available', undefined, { workerUrl: WORKER_BASE_URL });
logger.info('SYSTEM', 'Worker available', undefined, {});
}
}, 0);
}
@@ -10,12 +10,7 @@
import http from 'http';
import { logger } from '../../utils/logger.js';
import {
getChildProcesses,
forceKillProcess,
waitForProcessesExit,
removePidFile
} from './ProcessManager.js';
import { stopSupervisor } from '../../supervisor/index.js';
export interface ShutdownableService {
shutdownAll(): Promise<void>;
@@ -57,49 +52,35 @@ export interface GracefulShutdownConfig {
export async function performGracefulShutdown(config: GracefulShutdownConfig): Promise<void> {
logger.info('SYSTEM', 'Shutdown initiated');
// Clean up PID file on shutdown
removePidFile();
// STEP 1: Enumerate all child processes BEFORE we start closing things
const childPids = await getChildProcesses(process.pid);
logger.info('SYSTEM', 'Found child processes', { count: childPids.length, pids: childPids });
// STEP 2: Close HTTP server first
// STEP 1: Close HTTP server first
if (config.server) {
await closeHttpServer(config.server);
logger.info('SYSTEM', 'HTTP server closed');
}
// STEP 3: Shutdown active sessions
// STEP 2: Shutdown active sessions
await config.sessionManager.shutdownAll();
// STEP 4: Close MCP client connection (signals child to exit gracefully)
// STEP 3: Close MCP client connection (signals child to exit gracefully)
if (config.mcpClient) {
await config.mcpClient.close();
logger.info('SYSTEM', 'MCP client closed');
}
// STEP 5: Stop Chroma MCP connection
// STEP 4: Stop Chroma MCP connection
if (config.chromaMcpManager) {
logger.info('SHUTDOWN', 'Stopping Chroma MCP connection...');
await config.chromaMcpManager.stop();
logger.info('SHUTDOWN', 'Chroma MCP connection stopped');
}
// STEP 6: Close database connection (includes ChromaSync cleanup)
// STEP 5: Close database connection (includes ChromaSync cleanup)
if (config.dbManager) {
await config.dbManager.close();
}
// STEP 7: Force kill any remaining child processes (Windows zombie port fix)
if (childPids.length > 0) {
logger.info('SYSTEM', 'Force killing remaining children');
for (const pid of childPids) {
await forceKillProcess(pid);
}
// Wait for children to fully exit
await waitForProcessesExit(childPids, 5000);
}
// STEP 6: Supervisor handles tracked child termination, PID cleanup, and stale sockets.
await stopSupervisor();
logger.info('SYSTEM', 'Worker shutdown complete');
}
+34 -19
@@ -14,6 +14,26 @@ import { readFileSync } from 'fs';
import { logger } from '../../utils/logger.js';
import { MARKETPLACE_ROOT } from '../../shared/paths.js';
/**
* Make an HTTP request to the worker via TCP.
* Returns { ok, statusCode, body } or throws on transport error.
*/
async function httpRequestToWorker(
port: number,
endpointPath: string,
method: string = 'GET'
): Promise<{ ok: boolean; statusCode: number; body: string }> {
const response = await fetch(`http://127.0.0.1:${port}${endpointPath}`, { method });
// Gracefully handle cases where response body isn't available (e.g., test mocks)
let body = '';
try {
body = await response.text();
} catch {
// Body unavailable — health/readiness checks only need .ok
}
return { ok: response.ok, statusCode: response.status, body };
}
/**
* Check if a port is in use by querying the health endpoint
*/
@@ -29,7 +49,7 @@ export async function isPortInUse(port: number): Promise<boolean> {
}
/**
* Poll a localhost endpoint until it returns 200 OK or timeout.
* Poll a worker endpoint until it returns 200 OK or timeout.
* Shared implementation for liveness and readiness checks.
*/
async function pollEndpointUntilOk(
@@ -41,12 +61,11 @@ async function pollEndpointUntilOk(
const start = Date.now();
while (Date.now() - start < timeoutMs) {
try {
// Note: Removed AbortSignal.timeout to avoid Windows Bun cleanup issue (libuv assertion)
const response = await fetch(`http://127.0.0.1:${port}${endpointPath}`);
if (response.ok) return true;
const result = await httpRequestToWorker(port, endpointPath);
if (result.ok) return true;
} catch (error) {
// [ANTI-PATTERN IGNORED]: Retry loop - expected failures during startup, will retry
logger.debug('SYSTEM', retryLogMessage, { port }, error as Error);
logger.debug('SYSTEM', retryLogMessage, {}, error as Error);
}
await new Promise(r => setTimeout(r, 500));
}
@@ -87,28 +106,24 @@ export async function waitForPortFree(port: number, timeoutMs: number = 10000):
/**
* Send HTTP shutdown request to a running worker
* @param port Worker port
* @returns true if shutdown request was acknowledged, false otherwise
*/
export async function httpShutdown(port: number): Promise<boolean> {
try {
// Note: Removed AbortSignal.timeout to avoid Windows Bun cleanup issue (libuv assertion)
const response = await fetch(`http://127.0.0.1:${port}/api/admin/shutdown`, {
method: 'POST'
});
if (!response.ok) {
logger.warn('SYSTEM', 'Shutdown request returned error', { port, status: response.status });
const result = await httpRequestToWorker(port, '/api/admin/shutdown', 'POST');
if (!result.ok) {
logger.warn('SYSTEM', 'Shutdown request returned error', { status: result.statusCode });
return false;
}
return true;
} catch (error) {
// Connection refused is expected if worker already stopped
if (error instanceof Error && error.message?.includes('ECONNREFUSED')) {
logger.debug('SYSTEM', 'Worker already stopped', { port }, error);
logger.debug('SYSTEM', 'Worker already stopped', {}, error);
return false;
}
// Unexpected error - log full details
logger.error('SYSTEM', 'Shutdown request failed unexpectedly', { port }, error as Error);
logger.error('SYSTEM', 'Shutdown request failed unexpectedly', {}, error as Error);
return false;
}
}
@@ -135,17 +150,17 @@ export function getInstalledPluginVersion(): string {
/**
* Get the running worker's version via API
* This is the "actual" version currently running
* This is the "actual" version currently running.
*/
export async function getRunningWorkerVersion(port: number): Promise<string | null> {
try {
const response = await fetch(`http://127.0.0.1:${port}/api/version`);
if (!response.ok) return null;
const data = await response.json() as { version: string };
const result = await httpRequestToWorker(port, '/api/version');
if (!result.ok) return null;
const data = JSON.parse(result.body) as { version: string };
return data.version;
} catch {
// Expected: worker not running or version endpoint unavailable
logger.debug('SYSTEM', 'Could not fetch worker version', { port });
logger.debug('SYSTEM', 'Could not fetch worker version', {});
return null;
}
}
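The `pollEndpointUntilOk` change above keeps the same retry loop but routes each attempt through `httpRequestToWorker`. A minimal sketch of that loop, with the probe and the sleep injected so it can run without a worker, might look like the following. `pollUntilOk` and `EndpointProbe` are hypothetical names; the 500 ms default interval mirrors the `setTimeout(r, 500)` in the diff.

```typescript
// Hypothetical sketch: probe and sleep are injected for testability.
type EndpointProbe = () => Promise<{ ok: boolean }>;

async function pollUntilOk(
  probe: EndpointProbe,
  timeoutMs: number,
  intervalMs: number = 500,
  sleep: (ms: number) => Promise<void> = ms =>
    new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<boolean> {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    try {
      const result = await probe();
      if (result.ok) return true;
    } catch {
      // Transport errors are expected while the worker is still starting;
      // swallow them and retry on the next iteration.
    }
    await sleep(intervalMs);
  }
  // Deadline passed without a successful response.
  return false;
}
```

A caller would pass something like `() => httpRequestToWorker(port, '/api/health')` as the probe, since that helper already returns an object with an `ok` field.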
+8 -14
@@ -15,6 +15,8 @@ import { exec, execSync, spawn } from 'child_process';
import { promisify } from 'util';
import { logger } from '../../utils/logger.js';
import { HOOK_TIMEOUTS } from '../../shared/hook-constants.js';
import { sanitizeEnv } from '../../supervisor/env-sanitizer.js';
import { getSupervisor, validateWorkerPidFile, type ValidateWorkerPidStatus } from '../../supervisor/index.js';
const execAsync = promisify(exec);
@@ -625,11 +627,13 @@ export function spawnDaemon(
extraEnv: Record<string, string> = {}
): number | undefined {
const isWindows = process.platform === 'win32';
const env = {
getSupervisor().assertCanSpawn('worker daemon');
const env = sanitizeEnv({
...process.env,
CLAUDE_MEM_WORKER_PORT: String(port),
...extraEnv
};
});
if (isWindows) {
// Use PowerShell Start-Process to spawn a hidden, independent process
@@ -764,18 +768,8 @@ export function touchPidFile(): void {
* Called at the top of ensureWorkerStarted() to clean up after WSL2
* hibernate, OOM kills, or other ungraceful worker deaths.
*/
export function cleanStalePidFile(): void {
const pidInfo = readPidFile();
if (!pidInfo) return;
if (!isProcessAlive(pidInfo.pid)) {
logger.info('SYSTEM', 'Removing stale PID file (worker process is dead)', {
pid: pidInfo.pid,
port: pidInfo.port,
startedAt: pidInfo.startedAt
});
removePidFile();
}
export function cleanStalePidFile(): ValidateWorkerPidStatus {
return validateWorkerPidFile({ logAlive: false });
}
/**
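`spawnDaemon` now passes its environment through `sanitizeEnv` before spawning. The sanitizer itself is not shown in this diff; a minimal sketch of the idea, assuming it filters keys by an exact-match list and a prefix list, could be written as below. `sanitizeEnvSketch`, `SKETCH_ENV_PREFIXES`, and `SKETCH_ENV_EXACT` are hypothetical, and their contents are guesses based on the commit message's mention of `CLAUDECODE_*` variables.

```typescript
// Hypothetical sketch of an env sanitizer; the real lists live in the
// supervisor's env-sanitizer module and are not shown in this diff.
const SKETCH_ENV_PREFIXES: string[] = ['CLAUDECODE_']; // assumed value
const SKETCH_ENV_EXACT: string[] = ['CLAUDECODE'];     // assumed value

function sanitizeEnvSketch(
  env: Record<string, string | undefined>
): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    if (value === undefined) continue; // drop unset entries
    const blockedExact = SKETCH_ENV_EXACT.indexOf(key) !== -1;
    const blockedPrefix = SKETCH_ENV_PREFIXES.some(
      prefix => key.indexOf(prefix) === 0
    );
    if (blockedExact || blockedPrefix) continue; // strip session markers
    clean[key] = value;
  }
  return clean;
}
```

Note that `CLAUDE_MEM_WORKER_PORT` survives this filter because it does not start with `CLAUDECODE_`, so the port set by `spawnDaemon` still reaches the child.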
@@ -15,7 +15,7 @@ import { existsSync, readFileSync, writeFileSync, unlinkSync, mkdirSync } from '
import { exec } from 'child_process';
import { promisify } from 'util';
import { logger } from '../../utils/logger.js';
import { getWorkerPort } from '../../shared/worker-utils.js';
import { getWorkerPort, workerHttpRequest } from '../../shared/worker-utils.js';
import { DATA_DIR, MARKETPLACE_ROOT, CLAUDE_CONFIG_DIR } from '../../shared/paths.js';
import {
readCursorRegistry as readCursorRegistryFromFile,
@@ -95,16 +95,16 @@ export function unregisterCursorProject(projectName: string): void {
* Update Cursor context files for all registered projects matching this project name.
* Called by SDK agents after saving a summary.
*/
export async function updateCursorContextForProject(projectName: string, port: number): Promise<void> {
export async function updateCursorContextForProject(projectName: string, _port: number): Promise<void> {
const registry = readCursorRegistry();
const entry = registry[projectName];
if (!entry) return; // Project doesn't have Cursor hooks installed
try {
// Fetch fresh context from worker
const response = await fetch(
`http://127.0.0.1:${port}/api/context/inject?project=${encodeURIComponent(projectName)}`
// Fetch fresh context from worker (uses socket or TCP automatically)
const response = await workerHttpRequest(
`/api/context/inject?project=${encodeURIComponent(projectName)}`
);
if (!response.ok) return;
@@ -398,19 +398,18 @@ async function setupProjectContext(targetDir: string, workspaceRoot: string): Pr
const rulesDir = path.join(targetDir, 'rules');
mkdirSync(rulesDir, { recursive: true });
const port = getWorkerPort();
const projectName = path.basename(workspaceRoot);
let contextGenerated = false;
console.log(` Generating initial context...`);
try {
// Check if worker is running
const healthResponse = await fetch(`http://127.0.0.1:${port}/api/readiness`);
// Check if worker is running (uses socket or TCP automatically)
const healthResponse = await workerHttpRequest('/api/readiness');
if (healthResponse.ok) {
// Fetch context
const contextResponse = await fetch(
`http://127.0.0.1:${port}/api/context/inject?project=${encodeURIComponent(projectName)}`
const contextResponse = await workerHttpRequest(
`/api/context/inject?project=${encodeURIComponent(projectName)}`
);
if (contextResponse.ok) {
const context = await contextResponse.text();
+47
@@ -17,6 +17,9 @@ import { ALLOWED_OPERATIONS, ALLOWED_TOPICS } from './allowed-constants.js';
import { logger } from '../../utils/logger.js';
import { createMiddleware, summarizeRequestBody, requireLocalhost } from './Middleware.js';
import { errorHandler, notFoundHandler } from './ErrorHandler.js';
import { getSupervisor } from '../../supervisor/index.js';
import { isPidAlive } from '../../supervisor/process-registry.js';
import { ENV_PREFIXES, ENV_EXACT_MATCHES } from '../../supervisor/env-sanitizer.js';
// Build-time injected version constant (set by esbuild define)
declare const __DEFAULT_PACKAGE_VERSION__: string;
@@ -285,6 +288,50 @@ export class Server {
}, 100);
}
});
// Doctor endpoint - diagnostic view of supervisor, processes, and health
this.app.get('/api/admin/doctor', requireLocalhost, (_req: Request, res: Response) => {
const supervisor = getSupervisor();
const registry = supervisor.getRegistry();
const allRecords = registry.getAll();
// Check each process liveness
const processes = allRecords.map(record => ({
id: record.id,
pid: record.pid,
type: record.type,
status: isPidAlive(record.pid) ? 'alive' as const : 'dead' as const,
startedAt: record.startedAt,
}));
// Check for dead processes still in registry
const deadProcessPids = processes.filter(p => p.status === 'dead').map(p => p.pid);
// Check if CLAUDECODE_* env vars are leaking into this process
const envClean = !Object.keys(process.env).some(key =>
ENV_EXACT_MATCHES.has(key) || ENV_PREFIXES.some(prefix => key.startsWith(prefix))
);
// Format uptime
const uptimeMs = Date.now() - this.startTime;
const uptimeSeconds = Math.floor(uptimeMs / 1000);
const hours = Math.floor(uptimeSeconds / 3600);
const minutes = Math.floor((uptimeSeconds % 3600) / 60);
const formattedUptime = hours > 0 ? `${hours}h ${minutes}m` : `${minutes}m`;
res.json({
supervisor: {
running: true,
pid: process.pid,
uptime: formattedUptime,
},
processes,
health: {
deadProcessPids,
envClean,
},
});
});
}
/**
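The `/api/admin/doctor` handler above mixes liveness checks and uptime formatting into the route body. As a sketch of the same logic pulled into pure functions (so it can be exercised without Express or a real process table), one might write the following. `formatUptimeSketch`, `buildDoctorReportSketch`, and `ProcessRecordSketch` are hypothetical names, and the `isAlive` callback stands in for `isPidAlive`.

```typescript
// Hypothetical sketch mirroring the doctor endpoint's calculations.
interface ProcessRecordSketch {
  id: string;
  pid: number;
  type: string;
  startedAt: string;
}

function formatUptimeSketch(uptimeMs: number): string {
  const uptimeSeconds = Math.floor(uptimeMs / 1000);
  const hours = Math.floor(uptimeSeconds / 3600);
  const minutes = Math.floor((uptimeSeconds % 3600) / 60);
  // Hours are omitted entirely when the worker has run for under an hour.
  return hours > 0 ? `${hours}h ${minutes}m` : `${minutes}m`;
}

function buildDoctorReportSketch(
  records: ProcessRecordSketch[],
  isAlive: (pid: number) => boolean,
  uptimeMs: number
) {
  const processes = records.map(record => ({
    id: record.id,
    pid: record.pid,
    type: record.type,
    status: isAlive(record.pid) ? ('alive' as const) : ('dead' as const),
    startedAt: record.startedAt,
  }));
  // Dead entries still present in the registry are reported separately.
  const deadProcessPids = processes
    .filter(p => p.status === 'dead')
    .map(p => p.pid);
  return { uptime: formatUptimeSketch(uptimeMs), processes, deadProcessPids };
}
```

One consequence of the formatting rule is that any uptime under 60 seconds is reported as `0m`.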
+125
@@ -1,4 +1,8 @@
import { Database } from 'bun:sqlite';
import { execFileSync } from 'child_process';
import { existsSync, unlinkSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';
import { DATA_DIR, DB_PATH, ensureDir } from '../../shared/paths.js';
import { logger } from '../../utils/logger.js';
import { MigrationRunner } from './migrations/runner.js';
@@ -15,6 +19,118 @@ export interface Migration {
let dbInstance: Database | null = null;
/**
* Repair malformed database schema before migrations run.
*
* This handles the case where a database is synced between machines running
* different claude-mem versions. A newer version may have added columns and
* indexes that an older version (or even the same version on a fresh install)
* cannot process. SQLite throws "malformed database schema" when it encounters
* an index referencing a non-existent column, which prevents ALL queries
* including the migrations that would fix the schema.
*
* The fix: use Python's sqlite3 module (which supports writable_schema) to
* drop the orphaned schema objects, then let the migration system recreate
* them properly. bun:sqlite doesn't allow DELETE FROM sqlite_master even
* with writable_schema = ON.
*/
function repairMalformedSchema(db: Database): void {
try {
// Quick test: if we can query sqlite_master, the schema is fine
db.query("SELECT name FROM sqlite_master WHERE type = 'table' LIMIT 1").all();
return;
} catch (error: unknown) {
const message = error instanceof Error ? error.message : String(error);
if (!message.includes('malformed database schema')) {
throw error;
}
logger.warn('DB', 'Detected malformed database schema, attempting repair', { error: message });
// Extract the problematic object name from the error message
// Format: "malformed database schema (object_name) - details"
const match = message.match(/malformed database schema \(([^)]+)\)/);
if (!match) {
logger.error('DB', 'Could not parse malformed schema error, cannot auto-repair', { error: message });
throw error;
}
const objectName = match[1];
logger.info('DB', `Dropping malformed schema object: ${objectName}`);
// Get the DB file path. For file-based DBs, we can use Python to repair.
// For in-memory DBs, we can't shell out — just re-throw.
const dbPath = db.filename;
if (!dbPath || dbPath === ':memory:' || dbPath === '') {
logger.error('DB', 'Cannot auto-repair in-memory database');
throw error;
}
// Close the connection so Python can safely modify the file
db.close();
// Use Python's sqlite3 module to drop the orphaned object and reset
// related migration versions so they re-run and recreate things properly.
// bun:sqlite doesn't support DELETE FROM sqlite_master even with writable_schema.
//
// We write a temp script rather than using -c to avoid shell escaping issues
// with paths containing spaces or special characters. execFileSync passes
// args directly without a shell, so dbPath and objectName are safe.
const scriptPath = join(tmpdir(), `claude-mem-repair-${Date.now()}.py`);
try {
writeFileSync(scriptPath, `
import sqlite3, sys
db_path = sys.argv[1]
obj_name = sys.argv[2]
c = sqlite3.connect(db_path)
c.execute('PRAGMA writable_schema = ON')
c.execute('DELETE FROM sqlite_master WHERE name = ?', (obj_name,))
c.execute('PRAGMA writable_schema = OFF')
# Reset migration versions so affected migrations re-run.
# Guard with existence check: schema_versions may not exist on a very fresh DB.
has_sv = c.execute(
"SELECT count(*) FROM sqlite_master WHERE type='table' AND name='schema_versions'"
).fetchone()[0]
if has_sv:
c.execute('DELETE FROM schema_versions')
c.commit()
c.close()
`);
execFileSync('python3', [scriptPath, dbPath, objectName], { timeout: 10000 });
logger.info('DB', `Dropped orphaned schema object "${objectName}" and reset migration versions via Python sqlite3. All migrations will re-run (they are idempotent).`);
} catch (pyError: unknown) {
const pyMessage = pyError instanceof Error ? pyError.message : String(pyError);
logger.error('DB', 'Python sqlite3 repair failed', { error: pyMessage });
throw new Error(`Schema repair failed: ${message}. Python repair error: ${pyMessage}`);
} finally {
if (existsSync(scriptPath)) unlinkSync(scriptPath);
}
}
}
/**
* Wrapper that handles the close/reopen cycle needed for schema repair.
* Returns a (possibly new) Database connection.
*/
function repairMalformedSchemaWithReopen(dbPath: string, db: Database): Database {
try {
db.query("SELECT name FROM sqlite_master WHERE type = 'table' LIMIT 1").all();
return db;
} catch (error: unknown) {
const message = error instanceof Error ? error.message : String(error);
if (!message.includes('malformed database schema')) {
throw error;
}
// repairMalformedSchema closes the DB internally for Python access
repairMalformedSchema(db);
// Reopen and check for additional malformed objects
const newDb = new Database(dbPath, { create: true, readwrite: true });
return repairMalformedSchemaWithReopen(dbPath, newDb);
}
}
/**
* ClaudeMemDatabase - New entry point for the sqlite module
*
@@ -38,6 +154,11 @@ export class ClaudeMemDatabase {
// Create database connection
this.db = new Database(dbPath, { create: true, readwrite: true });
// Repair any malformed schema before applying settings or running migrations.
// Must happen first — even PRAGMA calls can fail on a corrupted schema.
// This may close and reopen the connection if repair is needed.
this.db = repairMalformedSchemaWithReopen(dbPath, this.db);
// Apply optimized SQLite settings
this.db.run('PRAGMA journal_mode = WAL');
this.db.run('PRAGMA synchronous = NORMAL');
@@ -97,6 +218,10 @@ export class DatabaseManager {
this.db = new Database(DB_PATH, { create: true, readwrite: true });
// Repair any malformed schema before applying settings or running migrations.
// Must happen first — even PRAGMA calls can fail on a corrupted schema.
this.db = repairMalformedSchemaWithReopen(DB_PATH, this.db);
// Apply optimized SQLite settings
this.db.run('PRAGMA journal_mode = WAL');
this.db.run('PRAGMA synchronous = NORMAL');
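The repair path above depends on pulling the offending object name out of SQLite's error text, which the code comment describes as `malformed database schema (object_name) - details`. A small standalone sketch of that parsing step, using the same regular expression as the diff, is shown here. The function name `extractMalformedObjectName` is hypothetical.

```typescript
// Hypothetical helper isolating the error-message parsing used by the repair.
function extractMalformedObjectName(message: string): string | null {
  // Capture everything between the first "(" and the next ")".
  const match = message.match(/malformed database schema \(([^)]+)\)/);
  const objectName = match ? match[1] : undefined;
  return objectName ?? null;
}
```

Returning `null` instead of throwing lets the caller decide whether an unparseable message should abort the repair, which is what `repairMalformedSchema` does by re-throwing the original error.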
+34 -14
@@ -839,19 +839,21 @@ export class SessionStore {
* Add content_hash column to observations for deduplication (migration 22)
*/
private addObservationContentHashColumn(): void {
const applied = this.db.prepare('SELECT version FROM schema_versions WHERE version = ?').get(22) as SchemaVersion | undefined;
if (applied) return;
// Check actual schema first — cross-machine DB sync can leave schema_versions
// claiming this migration ran while the column is actually missing.
const tableInfo = this.db.query('PRAGMA table_info(observations)').all() as TableColumnInfo[];
const hasColumn = tableInfo.some(col => col.name === 'content_hash');
if (!hasColumn) {
this.db.run('ALTER TABLE observations ADD COLUMN content_hash TEXT');
this.db.run("UPDATE observations SET content_hash = substr(hex(randomblob(8)), 1, 16) WHERE content_hash IS NULL");
this.db.run('CREATE INDEX IF NOT EXISTS idx_observations_content_hash ON observations(content_hash, created_at_epoch)');
logger.debug('DB', 'Added content_hash column to observations table with backfill and index');
if (hasColumn) {
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(22, new Date().toISOString());
return;
}
this.db.run('ALTER TABLE observations ADD COLUMN content_hash TEXT');
this.db.run("UPDATE observations SET content_hash = substr(hex(randomblob(8)), 1, 16) WHERE content_hash IS NULL");
this.db.run('CREATE INDEX IF NOT EXISTS idx_observations_content_hash ON observations(content_hash, created_at_epoch)');
logger.debug('DB', 'Added content_hash column to observations table with backfill and index');
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(22, new Date().toISOString());
}
@@ -1659,15 +1661,23 @@ export class SessionStore {
const storeTx = this.db.transaction(() => {
const observationIds: number[] = [];
// 1. Store all observations
// 1. Store all observations (with content-hash deduplication)
const obsStmt = this.db.prepare(`
INSERT INTO observations
(memory_session_id, project, type, title, subtitle, facts, narrative, concepts,
files_read, files_modified, prompt_number, discovery_tokens, created_at, created_at_epoch)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
files_read, files_modified, prompt_number, discovery_tokens, content_hash, created_at, created_at_epoch)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
for (const observation of observations) {
// Content-hash deduplication (same logic as storeObservation singular)
const contentHash = computeObservationContentHash(memorySessionId, observation.title, observation.narrative);
const existing = findDuplicateObservation(this.db, contentHash, timestampEpoch);
if (existing) {
observationIds.push(existing.id);
continue;
}
const result = obsStmt.run(
memorySessionId,
project,
@@ -1681,6 +1691,7 @@ export class SessionStore {
JSON.stringify(observation.files_modified),
promptNumber || null,
discoveryTokens,
contentHash,
timestampIso,
timestampEpoch
);
@@ -1779,15 +1790,23 @@ export class SessionStore {
const storeAndMarkTx = this.db.transaction(() => {
const observationIds: number[] = [];
// 1. Store all observations
// 1. Store all observations (with content-hash deduplication)
const obsStmt = this.db.prepare(`
INSERT INTO observations
(memory_session_id, project, type, title, subtitle, facts, narrative, concepts,
files_read, files_modified, prompt_number, discovery_tokens, created_at, created_at_epoch)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
files_read, files_modified, prompt_number, discovery_tokens, content_hash, created_at, created_at_epoch)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
for (const observation of observations) {
// Content-hash deduplication (same logic as storeObservation singular)
const contentHash = computeObservationContentHash(memorySessionId, observation.title, observation.narrative);
const existing = findDuplicateObservation(this.db, contentHash, timestampEpoch);
if (existing) {
observationIds.push(existing.id);
continue;
}
const result = obsStmt.run(
memorySessionId,
project,
@@ -1801,6 +1820,7 @@ export class SessionStore {
JSON.stringify(observation.files_modified),
promptNumber || null,
discoveryTokens,
contentHash,
timestampIso,
timestampEpoch
);
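Both batch-store transactions above now skip the insert when `findDuplicateObservation` returns a row, and push the existing row's id instead so the result still has one id per input observation. The sketch below models that behavior with an in-memory array in place of the `observations` table. All names here are hypothetical, and the 30-second window is an assumption: the diff shows that the lookup takes `timestampEpoch`, but not the window it uses.

```typescript
// Hypothetical in-memory model of content-hash deduplication.
interface StoredObservationSketch {
  id: number;
  contentHash: string;
  createdAtEpoch: number;
}

const SKETCH_DEDUP_WINDOW_MS = 30000; // assumed; real window not shown

function findDuplicateSketch(
  rows: StoredObservationSketch[],
  contentHash: string,
  nowEpoch: number,
  windowMs: number = SKETCH_DEDUP_WINDOW_MS
): StoredObservationSketch | undefined {
  for (const row of rows) {
    const age = nowEpoch - row.createdAtEpoch;
    if (row.contentHash === contentHash && age >= 0 && age <= windowMs) {
      return row;
    }
  }
  return undefined;
}

function storeBatchSketch(
  rows: StoredObservationSketch[],
  hashes: string[],
  nowEpoch: number
): number[] {
  const ids: number[] = [];
  for (const hash of hashes) {
    const existing = findDuplicateSketch(rows, hash, nowEpoch);
    if (existing) {
      ids.push(existing.id); // reuse the id, skip the insert
      continue;
    }
    const id = rows.length + 1;
    rows.push({ id, contentHash: hash, createdAtEpoch: nowEpoch });
    ids.push(id);
  }
  return ids;
}
```

Because the duplicate check runs inside the loop, a repeated observation within the same batch is also caught, not only one stored by an earlier call.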
+14 -10
@@ -823,21 +823,25 @@ export class MigrationRunner {
* Backfills existing rows with unique random hashes so they don't block new inserts.
*/
private addObservationContentHashColumn(): void {
const applied = this.db.prepare('SELECT version FROM schema_versions WHERE version = ?').get(22) as SchemaVersion | undefined;
if (applied) return;
// Check actual schema first — cross-machine DB sync can leave schema_versions
// claiming this migration ran while the column is actually missing (e.g. migration 21
// recreated the table without content_hash on the synced machine).
const tableInfo = this.db.query('PRAGMA table_info(observations)').all() as TableColumnInfo[];
const hasColumn = tableInfo.some(col => col.name === 'content_hash');
if (!hasColumn) {
this.db.run('ALTER TABLE observations ADD COLUMN content_hash TEXT');
// Backfill existing rows with unique random hashes
this.db.run("UPDATE observations SET content_hash = substr(hex(randomblob(8)), 1, 16) WHERE content_hash IS NULL");
// Index for fast dedup lookups
this.db.run('CREATE INDEX IF NOT EXISTS idx_observations_content_hash ON observations(content_hash, created_at_epoch)');
logger.debug('DB', 'Added content_hash column to observations table with backfill and index');
if (hasColumn) {
// Column exists — just ensure version record is present
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(22, new Date().toISOString());
return;
}
this.db.run('ALTER TABLE observations ADD COLUMN content_hash TEXT');
// Backfill existing rows with unique random hashes
this.db.run("UPDATE observations SET content_hash = substr(hex(randomblob(8)), 1, 16) WHERE content_hash IS NULL");
// Index for fast dedup lookups
this.db.run('CREATE INDEX IF NOT EXISTS idx_observations_content_hash ON observations(content_hash, created_at_epoch)');
logger.debug('DB', 'Added content_hash column to observations table with backfill and index');
this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(22, new Date().toISOString());
}
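The reordered migration above inspects the real table schema before trusting `schema_versions`. A minimal sketch of that decision order follows, assuming a simplified in-memory description of the database state. `SchemaState`, `MigrationStep`, and `planContentHashMigration` are hypothetical names used only for illustration; nothing here touches SQLite.

```typescript
// Hypothetical, simplified model of what the migration inspects.
interface SchemaState {
  columns: string[];         // column names, as PRAGMA table_info(observations) would report them
  appliedVersions: number[]; // versions recorded in schema_versions
}

type MigrationStep = 'add-column' | 'backfill' | 'create-index' | 'record-version';

const CONTENT_HASH_MIGRATION_VERSION = 22;

// Look at the actual schema first, then make sure the version record exists.
// The diff always issues INSERT OR IGNORE; this sketch models the same outcome
// by planning 'record-version' only when the record is missing.
function planContentHashMigration(state: SchemaState): MigrationStep[] {
  const steps: MigrationStep[] = [];
  const hasColumn = state.columns.indexOf('content_hash') !== -1;
  if (!hasColumn) {
    steps.push('add-column', 'backfill', 'create-index');
  }
  if (state.appliedVersions.indexOf(CONTENT_HASH_MIGRATION_VERSION) === -1) {
    steps.push('record-version');
  }
  return steps;
}
```

The case the change targets is the first one a version-only check would miss: `schema_versions` already lists 22 but the column is absent, so the column, backfill, and index steps still run.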
+26 -4
@@ -21,12 +21,15 @@ import fs from 'fs';
import { logger } from '../../utils/logger.js';
import { SettingsDefaultsManager } from '../../shared/SettingsDefaultsManager.js';
import { USER_SETTINGS_PATH } from '../../shared/paths.js';
import { sanitizeEnv } from '../../supervisor/env-sanitizer.js';
import { getSupervisor } from '../../supervisor/index.js';
const CHROMA_MCP_CLIENT_NAME = 'claude-mem-chroma';
const CHROMA_MCP_CLIENT_VERSION = '1.0.0';
const MCP_CONNECTION_TIMEOUT_MS = 30_000;
const RECONNECT_BACKOFF_MS = 10_000; // Don't retry connections faster than this after failure
const DEFAULT_CHROMA_DATA_DIR = path.join(os.homedir(), '.claude-mem', 'chroma');
const CHROMA_SUPERVISOR_ID = 'chroma-mcp';
export class ChromaMcpManager {
private static instance: ChromaMcpManager | null = null;
@@ -101,6 +104,7 @@ export class ChromaMcpManager {
const commandArgs = this.buildCommandArgs();
const spawnEnvironment = this.getSpawnEnv();
getSupervisor().assertCanSpawn('chroma mcp');
// On Windows, .cmd files require shell resolution. Since MCP SDK's
// StdioClientTransport doesn't support `shell: true`, route through
@@ -155,6 +159,7 @@ export class ChromaMcpManager {
clearTimeout(timeoutId!);
this.connected = true;
this.registerManagedProcess();
logger.info('CHROMA_MCP', 'Connected to chroma-mcp successfully');
@@ -169,6 +174,7 @@ export class ChromaMcpManager {
}
logger.warn('CHROMA_MCP', 'chroma-mcp subprocess closed unexpectedly, applying reconnect backoff');
this.connected = false;
getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
this.client = null;
this.transport = null;
this.lastConnectionFailureTimestamp = Date.now();
@@ -201,9 +207,7 @@ export class ChromaMcpManager {
'--port', chromaPort
];
if (chromaSsl) {
args.push('--ssl');
}
args.push('--ssl', chromaSsl ? 'true' : 'false');
if (chromaTenant !== 'default_tenant') {
args.push('--tenant', chromaTenant);
@@ -335,6 +339,7 @@ export class ChromaMcpManager {
logger.debug('CHROMA_MCP', 'Error during client close (subprocess may already be dead)', {}, error as Error);
}
getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
this.client = null;
this.transport = null;
this.connected = false;
@@ -430,7 +435,7 @@ export class ChromaMcpManager {
*/
private getSpawnEnv(): Record<string, string> {
const baseEnv: Record<string, string> = {};
for (const [key, value] of Object.entries(process.env)) {
for (const [key, value] of Object.entries(sanitizeEnv(process.env))) {
if (value !== undefined) {
baseEnv[key] = value;
}
@@ -453,4 +458,21 @@ export class ChromaMcpManager {
NODE_EXTRA_CA_CERTS: combinedCertPath
};
}
private registerManagedProcess(): void {
const chromaProcess = (this.transport as unknown as { _process?: import('child_process').ChildProcess })._process;
if (!chromaProcess?.pid) {
return;
}
getSupervisor().registerProcess(CHROMA_SUPERVISOR_ID, {
pid: chromaProcess.pid,
type: 'chroma',
startedAt: new Date().toISOString()
}, chromaProcess);
chromaProcess.once('exit', () => {
getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
});
}
}
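`getSpawnEnv()` now passes `process.env` through `sanitizeEnv()` before building the child environment. The implementation of `sanitizeEnv()` is not part of this diff, so the following is only a sketch under one assumption taken from the commit message: variables prefixed with `CLAUDECODE_` are stripped. `sanitizeEnvSketch` and `BLOCKED_PREFIX` are hypothetical names.

```typescript
// Assumption: variables with this prefix must not reach child processes.
const BLOCKED_PREFIX = 'CLAUDECODE_';

// Hypothetical stand-in for sanitizeEnv(); the real function may filter more.
function sanitizeEnvSketch(env: Record<string, string | undefined>): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    if (value === undefined) continue;               // drop unset entries
    if (key.indexOf(BLOCKED_PREFIX) === 0) continue; // drop CLAUDECODE_* variables
    clean[key] = value;
  }
  return clean;
}
```

Returning a plain `Record<string, string>` would let callers such as `getSpawnEnv()` skip their own `undefined` check, although the diff keeps that check in place.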
+4 -6
@@ -2,7 +2,7 @@ import { sessionInitHandler } from '../../cli/handlers/session-init.js';
import { observationHandler } from '../../cli/handlers/observation.js';
import { fileEditHandler } from '../../cli/handlers/file-edit.js';
import { sessionCompleteHandler } from '../../cli/handlers/session-complete.js';
import { ensureWorkerRunning, getWorkerPort } from '../../shared/worker-utils.js';
import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
import { logger } from '../../utils/logger.js';
import { getProjectContext, getProjectName } from '../../utils/project-name.js';
import { writeAgentsMd } from '../../utils/agents-md-utils.js';
@@ -317,11 +317,10 @@ export class TranscriptEventProcessor {
const workerReady = await ensureWorkerRunning();
if (!workerReady) return;
const port = getWorkerPort();
const lastAssistantMessage = session.lastAssistantMessage ?? '';
try {
await fetch(`http://127.0.0.1:${port}/api/sessions/summarize`, {
await workerHttpRequest('/api/sessions/summarize', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
@@ -348,11 +347,10 @@ export class TranscriptEventProcessor {
const context = getProjectContext(cwd);
const projectsParam = context.allProjects.join(',');
const port = getWorkerPort();
try {
const response = await fetch(
`http://127.0.0.1:${port}/api/context/inject?projects=${encodeURIComponent(projectsParam)}`
const response = await workerHttpRequest(
`/api/context/inject?projects=${encodeURIComponent(projectsParam)}`
);
if (!response.ok) return;
+101 -74
@@ -20,6 +20,8 @@ import { getAuthMethodDescription } from '../shared/EnvManager.js';
import { logger } from '../utils/logger.js';
import { ChromaMcpManager } from './sync/ChromaMcpManager.js';
import { ChromaSync } from './sync/ChromaSync.js';
import { configureSupervisorSignalHandlers, getSupervisor, startSupervisor } from '../supervisor/index.js';
import { sanitizeEnv } from '../supervisor/env-sanitizer.js';
// Windows: avoid repeated spawn popups when startup fails (issue #921)
const WINDOWS_SPAWN_COOLDOWN_MS = 2 * 60 * 1000;
@@ -78,7 +80,6 @@ import {
cleanStalePidFile,
isProcessAlive,
spawnDaemon,
createSignalHandler,
isPidFileRecent,
touchPidFile
} from './infrastructure/ProcessManager.js';
@@ -263,33 +264,10 @@ export class WorkerService {
* Register signal handlers for graceful shutdown
*/
private registerSignalHandlers(): void {
const shutdownRef = { value: this.isShuttingDown };
const handler = createSignalHandler(() => this.shutdown(), shutdownRef);
process.on('SIGTERM', () => {
this.isShuttingDown = shutdownRef.value;
handler('SIGTERM');
configureSupervisorSignalHandlers(async () => {
this.isShuttingDown = true;
await this.shutdown();
});
process.on('SIGINT', () => {
this.isShuttingDown = shutdownRef.value;
handler('SIGINT');
});
// SIGHUP: sent by kernel when controlling terminal closes.
// Daemon mode: ignore it (survive parent shell exit).
// Interactive mode: treat like SIGTERM (graceful shutdown).
if (process.platform !== 'win32') {
if (process.argv.includes('--daemon')) {
process.on('SIGHUP', () => {
logger.debug('SYSTEM', 'Ignoring SIGHUP in daemon mode');
});
} else {
process.on('SIGHUP', () => {
this.isShuttingDown = shutdownRef.value;
handler('SIGHUP');
});
}
}
}
/**
@@ -351,7 +329,9 @@ export class WorkerService {
const port = getWorkerPort();
const host = getWorkerHost();
// Start HTTP server FIRST - make port available immediately
await startSupervisor();
// Start HTTP server FIRST - make it available immediately
await this.server.listen(port, host);
// Worker writes its own PID - reliable on all platforms
@@ -363,6 +343,12 @@ export class WorkerService {
startedAt: new Date().toISOString()
});
getSupervisor().registerProcess('worker', {
pid: process.pid,
type: 'worker',
startedAt: new Date().toISOString()
});
logger.info('SYSTEM', 'Worker started', { host, port, pid: process.pid });
// Do slow initialization in background (non-blocking)
@@ -446,19 +432,50 @@ export class WorkerService {
// Connect to MCP server
const mcpServerPath = path.join(__dirname, 'mcp-server.cjs');
getSupervisor().assertCanSpawn('mcp server');
const transport = new StdioClientTransport({
command: 'node',
args: [mcpServerPath],
env: process.env
env: sanitizeEnv(process.env)
});
const MCP_INIT_TIMEOUT_MS = 300000;
const mcpConnectionPromise = this.mcpClient.connect(transport);
const timeoutPromise = new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error('MCP connection timeout after 5 minutes')), MCP_INIT_TIMEOUT_MS)
);
let timeoutId: ReturnType<typeof setTimeout>;
const timeoutPromise = new Promise<never>((_, reject) => {
timeoutId = setTimeout(
() => reject(new Error('MCP connection timeout after 5 minutes')),
MCP_INIT_TIMEOUT_MS
);
});
await Promise.race([mcpConnectionPromise, timeoutPromise]);
try {
await Promise.race([mcpConnectionPromise, timeoutPromise]);
} catch (connectionError) {
clearTimeout(timeoutId!);
logger.warn('WORKER', 'MCP server connection failed, cleaning up subprocess', {
error: connectionError instanceof Error ? connectionError.message : String(connectionError)
});
try {
await transport.close();
} catch {
// Best effort: the supervisor handles later process cleanup for survivors.
}
throw connectionError;
}
clearTimeout(timeoutId!);
const mcpProcess = (transport as unknown as { _process?: import('child_process').ChildProcess })._process;
if (mcpProcess?.pid) {
getSupervisor().registerProcess('mcp-server', {
pid: mcpProcess.pid,
type: 'mcp',
startedAt: new Date().toISOString()
}, mcpProcess);
mcpProcess.once('exit', () => {
getSupervisor().unregisterProcess('mcp-server');
});
}
this.mcpReady = true;
logger.success('WORKER', 'MCP server connected');
@@ -470,7 +487,7 @@ export class WorkerService {
}
return activeIds;
});
logger.info('SYSTEM', 'Started orphan reaper (runs every 5 minutes)');
logger.info('SYSTEM', 'Started orphan reaper (runs every 30 seconds)');
// Reap stale sessions to unblock orphan process cleanup (Issue #1168)
this.staleSessionReaperInterval = setInterval(async () => {
@@ -561,6 +578,7 @@ export class WorkerService {
'ENOENT',
'spawn',
'Invalid API key',
'FOREIGN KEY constraint failed',
];
if (unrecoverablePatterns.some(pattern => errorMessage.includes(pattern))) {
hadUnrecoverableError = true;
@@ -618,7 +636,7 @@ export class WorkerService {
.finally(async () => {
// CRITICAL: Verify subprocess exit to prevent zombie accumulation (Issue #1168)
const trackedProcess = getProcessBySession(session.sessionDbId);
if (trackedProcess && !trackedProcess.process.killed && trackedProcess.process.exitCode === null) {
if (trackedProcess && trackedProcess.process.exitCode === null) {
await ensureProcessExit(trackedProcess, 5000);
}
@@ -659,16 +677,35 @@ export class WorkerService {
// Check if there's pending work that needs processing with a fresh AbortController
const pendingCount = pendingStore.getPendingCount(session.sessionDbId);
const MAX_PENDING_RESTARTS = 3;
if (pendingCount > 0) {
// Track consecutive pending-work restarts to prevent infinite loops (e.g. FK errors)
session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1;
if (session.consecutiveRestarts > MAX_PENDING_RESTARTS) {
logger.error('SYSTEM', 'Exceeded max pending-work restarts, stopping to prevent infinite loop', {
sessionId: session.sessionDbId,
pendingCount,
consecutiveRestarts: session.consecutiveRestarts
});
session.consecutiveRestarts = 0;
this.broadcastProcessingStatus();
return;
}
logger.info('SYSTEM', 'Pending work remains after generator exit, restarting with fresh AbortController', {
sessionId: session.sessionDbId,
pendingCount
pendingCount,
attempt: session.consecutiveRestarts
});
// Reset AbortController for restart
session.abortController = new AbortController();
// Restart processor
this.startSessionProcessor(session, 'pending-work-restart');
} else {
// Successful completion with no pending work — reset counter
session.consecutiveRestarts = 0;
}
this.broadcastProcessingStatus();
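The pending-work restart cap above can be read as a small state machine over `consecutiveRestarts`. The sketch below isolates that logic from the worker; `RestartState`, `RestartDecision`, and `decideRestart` are hypothetical names, and the real code also logs, resets the `AbortController`, and restarts the processor.

```typescript
const MAX_PENDING_RESTARTS = 3;

interface RestartState {
  consecutiveRestarts: number;
}

type RestartDecision = 'restart' | 'give-up' | 'idle';

// Decide what to do after a generator exits, given how much work is still pending.
function decideRestart(state: RestartState, pendingCount: number): RestartDecision {
  if (pendingCount <= 0) {
    state.consecutiveRestarts = 0; // clean completion resets the counter
    return 'idle';
  }
  state.consecutiveRestarts += 1;
  if (state.consecutiveRestarts > MAX_PENDING_RESTARTS) {
    state.consecutiveRestarts = 0; // reset so a later run starts from zero
    return 'give-up';
  }
  return 'restart';
}
```

With a cap of 3, a session whose pending count never drains is restarted three times and abandoned on the fourth exit, which is what breaks the loop for persistent failures such as foreign-key errors.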
@@ -896,12 +933,22 @@ export class WorkerService {
* Ensures the worker is started and healthy.
* This function can be called by both 'start' and 'hook' commands.
*
* @param port - The port the worker should run on
* @param port - The TCP port (used for port-in-use checks and daemon spawn)
* @returns true if worker is healthy (existing or newly started), false on failure
*/
async function ensureWorkerStarted(port: number): Promise<boolean> {
// Clean stale PID file first (cheap: 1 fs read + 1 signal-0 check)
cleanStalePidFile();
const pidFileStatus = cleanStalePidFile();
if (pidFileStatus === 'alive') {
logger.info('SYSTEM', 'Worker PID file points to a live process, skipping duplicate spawn');
const healthy = await waitForHealth(port, getPlatformTimeout(HOOK_TIMEOUTS.PORT_IN_USE_WAIT));
if (healthy) {
logger.info('SYSTEM', 'Worker became healthy while waiting on live PID');
return true;
}
logger.warn('SYSTEM', 'Live PID detected but worker did not become healthy before timeout');
return false;
}
// Check if worker is already running and healthy
if (await waitForHealth(port, 1000)) {
@@ -1045,11 +1092,9 @@ async function main() {
case 'restart': {
logger.info('SYSTEM', 'Restarting worker');
await httpShutdown(port);
const freed = await waitForPortFree(port, getPlatformTimeout(15000));
if (!freed) {
const restartFreed = await waitForPortFree(port, getPlatformTimeout(15000));
if (!restartFreed) {
logger.error('SYSTEM', 'Port did not free up after shutdown, aborting restart', { port });
// Exit gracefully: Windows Terminal won't keep tab open on exit 0
// The wrapper/plugin will handle restart logic if needed
process.exit(0);
}
removePidFile();
@@ -1080,9 +1125,9 @@ async function main() {
}
case 'status': {
const running = await isPortInUse(port);
const portInUse = await isPortInUse(port);
const pidInfo = readPidFile();
if (running && pidInfo) {
if (portInUse && pidInfo) {
console.log('Worker is running');
console.log(` PID: ${pidInfo.pid}`);
console.log(` Port: ${pidInfo.port}`);
@@ -1102,13 +1147,7 @@ async function main() {
}
case 'hook': {
// Auto-start worker if not running
const workerReady = await ensureWorkerStarted(port);
if (!workerReady) {
logger.warn('SYSTEM', 'Worker failed to start before hook, handler will retry');
}
// Existing logic unchanged
// Validate CLI args first (before any I/O)
const platform = process.argv[3];
const event = process.argv[4];
if (!platform || !event) {
@@ -1118,32 +1157,20 @@ async function main() {
process.exit(1);
}
// Check if worker is already running on port
const portInUse = await isPortInUse(port);
let startedWorkerInProcess = false;
if (!portInUse) {
// Port free - start worker IN THIS PROCESS (no spawn!)
// This process becomes the worker and stays alive
try {
logger.info('SYSTEM', 'Starting worker in-process for hook', { event });
const worker = new WorkerService();
await worker.start();
startedWorkerInProcess = true;
// Worker is now running in this process on the port
} catch (error) {
logger.failure('SYSTEM', 'Worker failed to start in hook', {}, error as Error);
removePidFile();
process.exit(0);
}
// Ensure worker is running as a detached daemon (#1249).
//
// IMPORTANT: The hook process MUST NOT become the worker. Starting the
// worker in-process makes it a grandchild of Claude Code, which the
// sandbox kills. Instead, ensureWorkerStarted() spawns a fully detached
// daemon (detached: true, stdio: 'ignore', child.unref()) that survives
// the hook process's exit and is invisible to Claude Code's sandbox.
const workerReady = await ensureWorkerStarted(port);
if (!workerReady) {
logger.warn('SYSTEM', 'Worker failed to start before hook, handler will proceed gracefully');
}
// If port in use, we'll use HTTP to the existing worker
const { hookCommand } = await import('../cli/hook-command.js');
// If we started the worker in this process, skip process.exit() so we stay alive as the worker
await hookCommand(platform, event, { skipExit: startedWorkerInProcess });
// Note: if we started worker in-process, this process stays alive as the worker
// The break allows the event loop to continue serving requests
await hookCommand(platform, event);
break;
}
+82 -30
@@ -19,6 +19,8 @@
import { spawn, exec, ChildProcess } from 'child_process';
import { promisify } from 'util';
import { logger } from '../../utils/logger.js';
import { sanitizeEnv } from '../../supervisor/env-sanitizer.js';
import { getSupervisor } from '../../supervisor/index.js';
const execAsync = promisify(exec);
@@ -29,14 +31,36 @@ interface TrackedProcess {
process: ChildProcess;
}
// PID Registry - tracks spawned Claude subprocesses
const processRegistry = new Map<number, TrackedProcess>();
function getTrackedProcesses(): TrackedProcess[] {
return getSupervisor().getRegistry()
.getAll()
.filter(record => record.type === 'sdk')
.map((record) => {
const processRef = getSupervisor().getRegistry().getRuntimeProcess(record.id);
if (!processRef) {
return null;
}
return {
pid: record.pid,
sessionDbId: Number(record.sessionId),
spawnedAt: Date.parse(record.startedAt),
process: processRef
};
})
.filter((value): value is TrackedProcess => value !== null);
}
/**
* Register a spawned process in the registry
*/
export function registerProcess(pid: number, sessionDbId: number, process: ChildProcess): void {
processRegistry.set(pid, { pid, sessionDbId, spawnedAt: Date.now(), process });
getSupervisor().registerProcess(`sdk:${sessionDbId}:${pid}`, {
pid,
type: 'sdk',
sessionId: sessionDbId,
startedAt: new Date().toISOString()
}, process);
logger.info('PROCESS', `Registered PID ${pid} for session ${sessionDbId}`, { pid, sessionDbId });
}
@@ -44,7 +68,11 @@ export function registerProcess(pid: number, sessionDbId: number, process: Child
* Unregister a process from the registry and notify pool waiters
*/
export function unregisterProcess(pid: number): void {
processRegistry.delete(pid);
for (const record of getSupervisor().getRegistry().getByPid(pid)) {
if (record.type === 'sdk') {
getSupervisor().unregisterProcess(record.id);
}
}
logger.debug('PROCESS', `Unregistered PID ${pid}`, { pid });
// Notify waiters that a pool slot may be available
notifySlotAvailable();
@@ -55,10 +83,7 @@ export function unregisterProcess(pid: number): void {
* Warns if multiple processes found (indicates race condition)
*/
export function getProcessBySession(sessionDbId: number): TrackedProcess | undefined {
const matches: TrackedProcess[] = [];
for (const [, info] of processRegistry) {
if (info.sessionDbId === sessionDbId) matches.push(info);
}
const matches = getTrackedProcesses().filter(info => info.sessionDbId === sessionDbId);
if (matches.length > 1) {
logger.warn('PROCESS', `Multiple processes found for session ${sessionDbId}`, {
count: matches.length,
@@ -72,7 +97,7 @@ export function getProcessBySession(sessionDbId: number): TrackedProcess | undef
* Get count of active processes in the registry
*/
export function getActiveCount(): number {
return processRegistry.size;
return getSupervisor().getRegistry().getAll().filter(record => record.type === 'sdk').length;
}
// Waiters for pool slots - resolved when a process exits and frees a slot
@@ -91,10 +116,18 @@ function notifySlotAvailable(): void {
* @param maxConcurrent Max number of concurrent agents
* @param timeoutMs Max time to wait before giving up
*/
export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_000): Promise<void> {
if (processRegistry.size < maxConcurrent) return;
const TOTAL_PROCESS_HARD_CAP = 10;
logger.info('PROCESS', `Pool limit reached (${processRegistry.size}/${maxConcurrent}), waiting for slot...`);
export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_000): Promise<void> {
// Hard cap: refuse to spawn if too many processes exist regardless of pool accounting
const activeCount = getActiveCount();
if (activeCount >= TOTAL_PROCESS_HARD_CAP) {
throw new Error(`Hard cap exceeded: ${activeCount} processes in registry (cap=${TOTAL_PROCESS_HARD_CAP}). Refusing to spawn more.`);
}
if (activeCount < maxConcurrent) return;
logger.info('PROCESS', `Pool limit reached (${activeCount}/${maxConcurrent}), waiting for slot...`);
return new Promise<void>((resolve, reject) => {
const timeout = setTimeout(() => {
@@ -105,7 +138,7 @@ export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_
const onSlot = () => {
clearTimeout(timeout);
if (processRegistry.size < maxConcurrent) {
if (getActiveCount() < maxConcurrent) {
resolve();
} else {
// Still full, re-queue
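The `waitForSlot` change layers a hard cap on top of the per-pool limit. A minimal sketch of the two checks, assuming the active count is passed in rather than read from the supervisor registry (`checkSlot` and `SlotCheck` are hypothetical names):

```typescript
const TOTAL_PROCESS_HARD_CAP = 10;

type SlotCheck = 'proceed' | 'wait';

// The hard cap is checked first and fails loudly; the pool limit only causes a wait.
function checkSlot(activeCount: number, maxConcurrent: number): SlotCheck {
  if (activeCount >= TOTAL_PROCESS_HARD_CAP) {
    throw new Error(
      `Hard cap exceeded: ${activeCount} processes in registry (cap=${TOTAL_PROCESS_HARD_CAP}).`
    );
  }
  return activeCount < maxConcurrent ? 'proceed' : 'wait';
}
```

Checking the cap before the pool limit means a misconfigured `maxConcurrent` larger than the cap still cannot push the process count past it.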
@@ -122,7 +155,7 @@ export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_
*/
export function getActiveProcesses(): Array<{ pid: number; sessionDbId: number; ageMs: number }> {
const now = Date.now();
return Array.from(processRegistry.values()).map(info => ({
return getTrackedProcesses().map(info => ({
pid: info.pid,
sessionDbId: info.sessionDbId,
ageMs: now - info.spawnedAt
@@ -136,8 +169,9 @@ export function getActiveProcesses(): Array<{ pid: number; sessionDbId: number;
export async function ensureProcessExit(tracked: TrackedProcess, timeoutMs: number = 5000): Promise<void> {
const { pid, process: proc } = tracked;
// Already exited?
if (proc.killed || proc.exitCode !== null) {
// Already exited? Only trust exitCode, NOT proc.killed
// proc.killed only means Node sent a signal — the process can still be alive
if (proc.exitCode !== null) {
unregisterProcess(pid);
return;
}
@@ -153,8 +187,8 @@ export async function ensureProcessExit(tracked: TrackedProcess, timeoutMs: numb
await Promise.race([exitPromise, timeoutPromise]);
// Check if exited gracefully
if (proc.killed || proc.exitCode !== null) {
// Check if exited gracefully — only trust exitCode
if (proc.exitCode !== null) {
unregisterProcess(pid);
return;
}
@@ -167,8 +201,14 @@ export async function ensureProcessExit(tracked: TrackedProcess, timeoutMs: numb
// Already dead
}
// Brief wait for SIGKILL to take effect
await new Promise(resolve => setTimeout(resolve, 200));
// Wait for SIGKILL to take effect — use exit event with 1s timeout instead of blind sleep
const sigkillExitPromise = new Promise<void>((resolve) => {
proc.once('exit', () => resolve());
});
const sigkillTimeout = new Promise<void>((resolve) => {
setTimeout(resolve, 1000);
});
await Promise.race([sigkillExitPromise, sigkillTimeout]);
unregisterProcess(pid);
}
@@ -234,8 +274,8 @@ async function killIdleDaemonChildren(): Promise<number> {
minutes = parseInt(minMatch[1], 10);
}
// Kill if idle for more than 2 minutes
if (minutes >= 2) {
// Kill if idle for at least 1 minute
if (minutes >= 1) {
logger.info('PROCESS', `Killing idle daemon child PID ${pid} (idle ${minutes}m)`, { pid, minutes });
try {
process.kill(pid, 'SIGKILL');
@@ -294,17 +334,26 @@ export async function reapOrphanedProcesses(activeSessionIds: Set<number>): Prom
let killed = 0;
// Registry-based: kill processes for dead sessions
for (const [pid, info] of processRegistry) {
if (activeSessionIds.has(info.sessionDbId)) continue; // Active = safe
for (const record of getSupervisor().getRegistry().getAll().filter(entry => entry.type === 'sdk')) {
const pid = record.pid;
const sessionDbId = Number(record.sessionId);
const processRef = getSupervisor().getRegistry().getRuntimeProcess(record.id);
logger.warn('PROCESS', `Killing orphan PID ${pid} (session ${info.sessionDbId} gone)`, { pid, sessionDbId: info.sessionDbId });
if (activeSessionIds.has(sessionDbId)) continue; // Active = safe
logger.warn('PROCESS', `Killing orphan PID ${pid} (session ${sessionDbId} gone)`, { pid, sessionDbId });
try {
info.process.kill('SIGKILL');
if (processRef) {
processRef.kill('SIGKILL');
} else {
process.kill(pid, 'SIGKILL');
}
killed++;
} catch {
// Already dead
}
unregisterProcess(pid);
getSupervisor().unregisterProcess(record.id);
notifySlotAvailable();
}
// System-level: find ppid=1 orphans
@@ -333,20 +382,23 @@ export function createPidCapturingSpawn(sessionDbId: number) {
env?: NodeJS.ProcessEnv;
signal?: AbortSignal;
}) => {
getSupervisor().assertCanSpawn('claude sdk');
// On Windows, use cmd.exe wrapper for .cmd files to properly handle paths with spaces
const useCmdWrapper = process.platform === 'win32' && spawnOptions.command.endsWith('.cmd');
const env = sanitizeEnv(spawnOptions.env ?? process.env);
const child = useCmdWrapper
? spawn('cmd.exe', ['/d', '/c', spawnOptions.command, ...spawnOptions.args], {
cwd: spawnOptions.cwd,
env: spawnOptions.env,
env,
stdio: ['pipe', 'pipe', 'pipe'],
signal: spawnOptions.signal,
windowsHide: true
})
: spawn(spawnOptions.command, spawnOptions.args, {
cwd: spawnOptions.cwd,
env: spawnOptions.env,
env,
stdio: ['pipe', 'pipe', 'pipe'],
signal: spawnOptions.signal, // CRITICAL: Pass signal for AbortController integration
windowsHide: true
@@ -393,7 +445,7 @@ export function createPidCapturingSpawn(sessionDbId: number) {
* Start the orphan reaper interval
* Returns cleanup function to stop the interval
*/
export function startOrphanReaper(getActiveSessionIds: () => Set<number>, intervalMs: number = 5 * 60 * 1000): () => void {
export function startOrphanReaper(getActiveSessionIds: () => Set<number>, intervalMs: number = 30 * 1000): () => void {
const interval = setInterval(async () => {
try {
const activeIds = getActiveSessionIds();
+3 -2
@@ -22,6 +22,7 @@ import type { ActiveSession, SDKUserMessage } from '../worker-types.js';
import { ModeManager } from '../domain/ModeManager.js';
import { processAgentResponse, type WorkerRef } from './agents/index.js';
import { createPidCapturingSpawn, getProcessBySession, ensureProcessExit, waitForSlot } from './ProcessRegistry.js';
import { sanitizeEnv } from '../../supervisor/env-sanitizer.js';
// Import Agent SDK (assumes it's installed)
// @ts-ignore - Agent SDK types may not be available
@@ -96,7 +97,7 @@ export class SDKAgent {
// Build isolated environment from ~/.claude-mem/.env
// This prevents Issue #733: random ANTHROPIC_API_KEY from project .env files
// being used instead of the configured auth method (CLI subscription or explicit API key)
const isolatedEnv = buildIsolatedEnv();
const isolatedEnv = sanitizeEnv(buildIsolatedEnv());
const authMethod = getAuthMethodDescription();
logger.info('SDK', 'Starting SDK query', {
@@ -281,7 +282,7 @@ export class SDKAgent {
} finally {
// Ensure subprocess is terminated after query completes (or on error)
const tracked = getProcessBySession(session.sessionDbId);
if (tracked && !tracked.process.killed && tracked.process.exitCode === null) {
if (tracked && tracked.process.exitCode === null) {
await ensureProcessExit(tracked, 5000);
}
}
+29 -4
@@ -61,6 +61,9 @@ export class SearchManager {
limit: number,
whereFilter?: Record<string, any>
): Promise<{ ids: number[]; distances: number[]; metadatas: any[] }> {
if (!this.chromaSync) {
return { ids: [], distances: [], metadatas: [] };
}
return await this.chromaSync.queryChroma(query, limit, whereFilter);
}
@@ -180,15 +183,37 @@ export class SearchManager {
logger.debug('SEARCH', 'ChromaDB returned semantic matches', { matchCount: chromaResults.ids.length });
if (chromaResults.ids.length > 0) {
// Step 2: Filter by recency (90 days)
const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
// Step 2: Filter by date range
// Use user-provided dateRange if available, otherwise fall back to 90-day recency window
const { dateRange } = options;
let startEpoch: number | undefined;
let endEpoch: number | undefined;
if (dateRange) {
if (dateRange.start) {
startEpoch = typeof dateRange.start === 'number'
? dateRange.start
: new Date(dateRange.start).getTime();
}
if (dateRange.end) {
endEpoch = typeof dateRange.end === 'number'
? dateRange.end
: new Date(dateRange.end).getTime();
}
} else {
// Default: 90-day recency window
startEpoch = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
}
const recentMetadata = chromaResults.metadatas.map((meta, idx) => ({
id: chromaResults.ids[idx],
meta,
isRecent: meta && meta.created_at_epoch > ninetyDaysAgo
isRecent: meta && meta.created_at_epoch != null
&& (!startEpoch || meta.created_at_epoch >= startEpoch)
&& (!endEpoch || meta.created_at_epoch <= endEpoch)
})).filter(item => item.isRecent);
logger.debug('SEARCH', 'Results within 90-day window', { count: recentMetadata.length });
logger.debug('SEARCH', dateRange ? 'Results within user date range' : 'Results within 90-day window', { count: recentMetadata.length });
// Step 3: Categorize IDs by document type
const obsIds: number[] = [];
+13 -1
@@ -15,6 +15,7 @@ import type { ActiveSession, PendingMessage, PendingMessageWithId, ObservationDa
import { PendingMessageStore } from '../sqlite/PendingMessageStore.js';
import { SessionQueueProcessor } from '../queue/SessionQueueProcessor.js';
import { getProcessBySession, ensureProcessExit } from './ProcessRegistry.js';
import { getSupervisor } from '../../supervisor/index.js';
export class SessionManager {
private dbManager: DatabaseManager;
@@ -302,7 +303,7 @@ export class SessionManager {
// 3. Verify subprocess exit with 5s timeout (Issue #737 fix)
const tracked = getProcessBySession(sessionDbId);
if (tracked && !tracked.process.killed && tracked.process.exitCode === null) {
if (tracked && tracked.process.exitCode === null) {
logger.debug('SESSION', `Waiting for subprocess PID ${tracked.pid} to exit`, {
sessionId: sessionDbId,
pid: tracked.pid
@@ -310,6 +311,17 @@ export class SessionManager {
await ensureProcessExit(tracked, 5000);
}
// 3b. Reap all supervisor-tracked processes for this session (#1351)
// This catches MCP servers and other child processes not tracked by the
// in-memory ProcessRegistry (e.g. processes registered only in supervisor.json).
try {
await getSupervisor().getRegistry().reapSession(sessionDbId);
} catch (error) {
logger.warn('SESSION', 'Supervisor reapSession failed (non-blocking)', {
sessionId: sessionDbId
}, error as Error);
}
// 4. Cleanup
this.sessions.delete(sessionDbId);
this.sessionQueues.delete(sessionDbId);
+2 -2
@@ -57,13 +57,13 @@ export function createMiddleware(
// Log incoming request with body summary
const bodySummary = summarizeRequestBody(req.method, req.path, req.body);
logger.info('HTTP', `${req.method} ${req.path}`, { requestId }, bodySummary);
logger.debug('HTTP', `${req.method} ${req.path}`, { requestId }, bodySummary);
// Capture response
const originalSend = res.send.bind(res);
res.send = function(body: any) {
const duration = Date.now() - start;
logger.info('HTTP', `${res.statusCode} ${req.path}`, { requestId, duration: `${duration}ms` });
logger.debug('HTTP', `${res.statusCode} ${req.path}`, { requestId, duration: `${duration}ms` });
return originalSend(body);
};
@@ -356,7 +356,7 @@ export class SessionRoutes extends BaseRouteHandler {
// Sync user prompt to Chroma
const chromaStart = Date.now();
const promptText = latestPrompt.prompt_text;
this.dbManager.getChromaSync().syncUserPrompt(
this.dbManager.getChromaSync()?.syncUserPrompt(
latestPrompt.id,
latestPrompt.memory_session_id,
latestPrompt.project,
+7 -2
@@ -133,10 +133,15 @@ export class SettingsDefaultsManager {
}
/**
* Get a default value from defaults (no environment variable override)
* Get a setting value with environment variable override.
* Priority: process.env > hardcoded default
*
* For full priority (env > settings file > default), use loadFromFile().
* This method is safe to call at module-load time (no file I/O) and still
* respects environment variable overrides that were previously ignored.
*/
static get(key: keyof SettingsDefaults): string {
return this.DEFAULTS[key];
return process.env[key] ?? this.DEFAULTS[key];
}
/**
+31 -1
@@ -24,7 +24,37 @@ const _dirname = getDirname();
*/
// Base directories
export const DATA_DIR = SettingsDefaultsManager.get('CLAUDE_MEM_DATA_DIR');
// Resolve DATA_DIR with full priority: env var > settings.json > default.
// SettingsDefaultsManager.get() handles env > default. For settings file
// support, we do a one-time synchronous read of the default settings path
// to check if the user configured a custom DATA_DIR there.
function resolveDataDir(): string {
// 1. Environment variable (highest priority) — already handled by get()
if (process.env.CLAUDE_MEM_DATA_DIR) {
return process.env.CLAUDE_MEM_DATA_DIR;
}
// 2. Settings file at the default location
const defaultDataDir = join(homedir(), '.claude-mem');
const settingsPath = join(defaultDataDir, 'settings.json');
try {
if (existsSync(settingsPath)) {
const { readFileSync } = require('fs');
const raw = JSON.parse(readFileSync(settingsPath, 'utf-8'));
const settings = raw.env ?? raw; // handle legacy nested schema
if (settings.CLAUDE_MEM_DATA_DIR) {
return settings.CLAUDE_MEM_DATA_DIR;
}
}
} catch {
// settings file missing or corrupt — fall through to default
}
// 3. Hardcoded default
return defaultDataDir;
}
export const DATA_DIR = resolveDataDir();
// Note: CLAUDE_CONFIG_DIR is a Claude Code setting, not claude-mem, so leave as env var
export const CLAUDE_CONFIG_DIR = process.env.CLAUDE_CONFIG_DIR || join(homedir(), '.claude');
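The resolution order used by `resolveDataDir()` above (environment variable, then settings file, then hardcoded default) can be sketched as a pure function. This is only an illustration under stated assumptions: `resolveWithPriority` and its parameters are hypothetical names that do not appear in this change, and the settings content is passed in as a string so the sketch needs no file I/O. Unlike the code above, it also checks that the file value is a string.

```typescript
// Hypothetical helper for illustration only; not part of this diff.
type EnvMap = Record<string, string | undefined>;

function resolveWithPriority(
  key: string,
  env: EnvMap,
  settingsJson: string | null, // raw settings.json content, or null if the file is absent
  fallback: string
): string {
  // 1. Environment variable wins when set to a non-empty value.
  const fromEnv = env[key];
  if (fromEnv) return fromEnv;

  // 2. Settings file, tolerating the legacy nested { env: {...} } schema.
  if (settingsJson !== null) {
    try {
      const raw = JSON.parse(settingsJson);
      const settings = raw.env ?? raw;
      const fromFile = settings[key];
      if (typeof fromFile === 'string' && fromFile) return fromFile;
    } catch {
      // Corrupt or unexpected JSON: fall through to the default.
    }
  }

  // 3. Hardcoded default.
  return fallback;
}

const key = 'CLAUDE_MEM_DATA_DIR';
const fromEnvWins = resolveWithPriority(key, { [key]: '/env' }, '{"CLAUDE_MEM_DATA_DIR":"/file"}', '/default');
const fromFileWins = resolveWithPriority(key, {}, '{"env":{"CLAUDE_MEM_DATA_DIR":"/file"}}', '/default');
const defaultWins = resolveWithPriority(key, {}, 'not json', '/default');
```

Passing the environment and file content as arguments keeps the priority logic testable without touching the real home directory.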
+4 -2
@@ -13,12 +13,14 @@ export function extractLastMessage(
stripSystemReminders: boolean = false
): string {
if (!transcriptPath || !existsSync(transcriptPath)) {
throw new Error(`Transcript path missing or file does not exist: ${transcriptPath}`);
logger.warn('PARSER', `Transcript path missing or file does not exist: ${transcriptPath}`);
return '';
}
const content = readFileSync(transcriptPath, 'utf-8').trim();
if (!content) {
throw new Error(`Transcript file exists but is empty: ${transcriptPath}`);
logger.warn('PARSER', `Transcript file exists but is empty: ${transcriptPath}`);
return '';
}
const lines = content.split('\n');
+44 -11
@@ -78,8 +78,8 @@ export function getWorkerHost(): string {
}
/**
* Clear the cached port and host values
* Call this when settings are updated to force re-reading from file
* Clear the cached port and host values.
* Call this when settings are updated to force re-reading from file.
*/
export function clearPortCache(): void {
cachedPort = null;
@@ -87,7 +87,46 @@ export function clearPortCache(): void {
}
/**
* Check if worker HTTP server is responsive
* Build a full URL for a given API path.
*/
export function buildWorkerUrl(apiPath: string): string {
return `http://${getWorkerHost()}:${getWorkerPort()}${apiPath}`;
}
/**
* Make an HTTP request to the worker over TCP.
*
* This is the preferred way for hooks to communicate with the worker.
*/
export function workerHttpRequest(
apiPath: string,
options: {
method?: string;
headers?: Record<string, string>;
body?: string;
timeoutMs?: number;
} = {}
): Promise<Response> {
const method = options.method ?? 'GET';
const timeoutMs = options.timeoutMs ?? HEALTH_CHECK_TIMEOUT_MS;
const url = buildWorkerUrl(apiPath);
const init: RequestInit = { method };
if (options.headers) {
init.headers = options.headers;
}
if (options.body) {
init.body = options.body;
}
if (timeoutMs > 0) {
return fetchWithTimeout(url, init, timeoutMs);
}
return fetch(url, init);
}
/**
* Check if worker HTTP server is responsive.
* Uses /api/health (liveness) instead of /api/readiness because:
* - Hooks have 15-second timeout, but full initialization can take 5+ minutes (MCP connection)
* - /api/health returns 200 as soon as HTTP server is up (sufficient for hook communication)
@@ -95,10 +134,7 @@ export function clearPortCache(): void {
* See: https://github.com/thedotmack/claude-mem/issues/811
*/
async function isWorkerHealthy(): Promise<boolean> {
const port = getWorkerPort();
const response = await fetchWithTimeout(
`http://127.0.0.1:${port}/api/health`, {}, HEALTH_CHECK_TIMEOUT_MS
);
const response = await workerHttpRequest('/api/health', { timeoutMs: HEALTH_CHECK_TIMEOUT_MS });
return response.ok;
}
@@ -125,10 +161,7 @@ function getPluginVersion(): string {
* Get the running worker's version from the API
*/
async function getWorkerVersion(): Promise<string> {
const port = getWorkerPort();
const response = await fetchWithTimeout(
`http://127.0.0.1:${port}/api/version`, {}, HEALTH_CHECK_TIMEOUT_MS
);
const response = await workerHttpRequest('/api/version', { timeoutMs: HEALTH_CHECK_TIMEOUT_MS });
if (!response.ok) {
throw new Error(`Failed to get worker version: ${response.status}`);
}
+20
@@ -0,0 +1,20 @@
export const ENV_PREFIXES = ['CLAUDECODE_', 'CLAUDE_CODE_'];
export const ENV_EXACT_MATCHES = new Set([
'CLAUDECODE',
'CLAUDE_CODE_SESSION',
'CLAUDE_CODE_ENTRYPOINT',
'MCP_SESSION_ID',
]);
export function sanitizeEnv(env: NodeJS.ProcessEnv = process.env): NodeJS.ProcessEnv {
const sanitized: NodeJS.ProcessEnv = {};
for (const [key, value] of Object.entries(env)) {
if (value === undefined) continue;
if (ENV_EXACT_MATCHES.has(key)) continue;
if (ENV_PREFIXES.some(prefix => key.startsWith(prefix))) continue;
sanitized[key] = value;
}
return sanitized;
}
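A brief sketch of what the sanitizer above does to a parent environment before a child process is spawned. The filter is restated here with a plain record type (instead of `NodeJS.ProcessEnv`) so the snippet is self-contained; the names `sanitize`, `parentEnv`, and `childEnv` and all sample values are made up for illustration.

```typescript
// Restatement of the filter above for illustration; sample values are invented.
type Env = Record<string, string | undefined>;

const PREFIXES = ['CLAUDECODE_', 'CLAUDE_CODE_'];
const EXACT = new Set(['CLAUDECODE', 'CLAUDE_CODE_SESSION', 'CLAUDE_CODE_ENTRYPOINT', 'MCP_SESSION_ID']);

function sanitize(env: Env): Env {
  const out: Env = {};
  for (const [name, value] of Object.entries(env)) {
    if (value === undefined) continue;                       // drop unset entries
    if (EXACT.has(name)) continue;                           // drop exact matches
    if (PREFIXES.some(p => name.startsWith(p))) continue;    // drop prefixed names
    out[name] = value;
  }
  return out;
}

const parentEnv: Env = {
  PATH: '/usr/bin',
  CLAUDECODE: '1',
  CLAUDE_CODE_ENTRYPOINT: 'cli',
  CLAUDECODE_FOO: 'x',
  CLAUDE_MEM_DATA_DIR: '/tmp/mem', // kept: matches neither prefix
  UNSET: undefined,
};
const childEnv = sanitize(parentEnv);
const keptNames = Object.keys(childEnv).sort();
```

Note that `CLAUDE_MEM_*` settings pass through, since only the `CLAUDECODE_` and `CLAUDE_CODE_` prefixes are stripped.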
+40
@@ -0,0 +1,40 @@
/**
* Health Checker - Periodic background cleanup of dead processes
*
* Runs every 30 seconds to prune dead processes from the supervisor registry.
* The interval is unref'd so it does not keep the process alive.
*/
import { logger } from '../utils/logger.js';
import { getProcessRegistry } from './process-registry.js';
const HEALTH_CHECK_INTERVAL_MS = 30_000;
let healthCheckInterval: ReturnType<typeof setInterval> | null = null;
function runHealthCheck(): void {
const registry = getProcessRegistry();
const removedProcessCount = registry.pruneDeadEntries();
if (removedProcessCount > 0) {
logger.info('SYSTEM', `Health check: pruned ${removedProcessCount} dead process(es) from registry`);
}
}
export function startHealthChecker(): void {
if (healthCheckInterval !== null) return;
healthCheckInterval = setInterval(runHealthCheck, HEALTH_CHECK_INTERVAL_MS);
healthCheckInterval.unref();
logger.debug('SYSTEM', 'Health checker started', { intervalMs: HEALTH_CHECK_INTERVAL_MS });
}
export function stopHealthChecker(): void {
if (healthCheckInterval === null) return;
clearInterval(healthCheckInterval);
healthCheckInterval = null;
logger.debug('SYSTEM', 'Health checker stopped');
}
+188
@@ -0,0 +1,188 @@
import { existsSync, readFileSync, rmSync } from 'fs';
import { homedir } from 'os';
import path from 'path';
import { logger } from '../utils/logger.js';
import { getProcessRegistry, isPidAlive, type ManagedProcessInfo, type ProcessRegistry } from './process-registry.js';
import { runShutdownCascade } from './shutdown.js';
import { startHealthChecker, stopHealthChecker } from './health-checker.js';
const DATA_DIR = path.join(homedir(), '.claude-mem');
const PID_FILE = path.join(DATA_DIR, 'worker.pid');
interface PidInfo {
pid: number;
port: number;
startedAt: string;
}
interface ValidateWorkerPidOptions {
logAlive?: boolean;
pidFilePath?: string;
}
export type ValidateWorkerPidStatus = 'missing' | 'alive' | 'stale' | 'invalid';
class Supervisor {
private readonly registry: ProcessRegistry;
private started = false;
private stopPromise: Promise<void> | null = null;
private signalHandlersRegistered = false;
private shutdownInitiated = false;
private shutdownHandler: (() => Promise<void>) | null = null;
constructor(registry: ProcessRegistry) {
this.registry = registry;
}
async start(): Promise<void> {
if (this.started) return;
this.registry.initialize();
const pidStatus = validateWorkerPidFile({ logAlive: false });
if (pidStatus === 'alive') {
throw new Error('Worker already running');
}
this.started = true;
startHealthChecker();
}
configureSignalHandlers(shutdownHandler: () => Promise<void>): void {
this.shutdownHandler = shutdownHandler;
if (this.signalHandlersRegistered) return;
this.signalHandlersRegistered = true;
const handleSignal = async (signal: string): Promise<void> => {
if (this.shutdownInitiated) {
logger.warn('SYSTEM', `Received ${signal} but shutdown already in progress`);
return;
}
this.shutdownInitiated = true;
logger.info('SYSTEM', `Received ${signal}, shutting down...`);
try {
if (this.shutdownHandler) {
await this.shutdownHandler();
} else {
await this.stop();
}
} catch (error) {
logger.error('SYSTEM', 'Error during shutdown', {}, error as Error);
try {
await this.stop();
} catch (stopError) {
logger.debug('SYSTEM', 'Supervisor shutdown fallback failed', {}, stopError as Error);
}
}
process.exit(0);
};
process.on('SIGTERM', () => void handleSignal('SIGTERM'));
process.on('SIGINT', () => void handleSignal('SIGINT'));
if (process.platform !== 'win32') {
if (process.argv.includes('--daemon')) {
process.on('SIGHUP', () => {
logger.debug('SYSTEM', 'Ignoring SIGHUP in daemon mode');
});
} else {
process.on('SIGHUP', () => void handleSignal('SIGHUP'));
}
}
}
async stop(): Promise<void> {
if (this.stopPromise) {
await this.stopPromise;
return;
}
stopHealthChecker();
this.stopPromise = runShutdownCascade({
registry: this.registry,
currentPid: process.pid
}).finally(() => {
this.started = false;
this.stopPromise = null;
});
await this.stopPromise;
}
assertCanSpawn(type: string): void {
if (this.stopPromise !== null) {
throw new Error(`Supervisor is shutting down, refusing to spawn ${type}`);
}
}
registerProcess(id: string, processInfo: ManagedProcessInfo, processRef?: Parameters<ProcessRegistry['register']>[2]): void {
this.registry.register(id, processInfo, processRef);
}
unregisterProcess(id: string): void {
this.registry.unregister(id);
}
getRegistry(): ProcessRegistry {
return this.registry;
}
}
const supervisorSingleton = new Supervisor(getProcessRegistry());
export async function startSupervisor(): Promise<void> {
await supervisorSingleton.start();
}
export async function stopSupervisor(): Promise<void> {
await supervisorSingleton.stop();
}
export function getSupervisor(): Supervisor {
return supervisorSingleton;
}
export function configureSupervisorSignalHandlers(shutdownHandler: () => Promise<void>): void {
supervisorSingleton.configureSignalHandlers(shutdownHandler);
}
export function validateWorkerPidFile(options: ValidateWorkerPidOptions = {}): ValidateWorkerPidStatus {
const pidFilePath = options.pidFilePath ?? PID_FILE;
if (!existsSync(pidFilePath)) {
return 'missing';
}
let pidInfo: PidInfo | null = null;
try {
pidInfo = JSON.parse(readFileSync(pidFilePath, 'utf-8')) as PidInfo;
} catch (error) {
logger.warn('SYSTEM', 'Failed to parse worker PID file, removing it', { path: pidFilePath }, error as Error);
rmSync(pidFilePath, { force: true });
return 'invalid';
}
if (isPidAlive(pidInfo.pid)) {
if (options.logAlive ?? true) {
logger.info('SYSTEM', 'Worker already running (PID alive)', {
existingPid: pidInfo.pid,
existingPort: pidInfo.port,
startedAt: pidInfo.startedAt
});
}
return 'alive';
}
logger.info('SYSTEM', 'Removing stale PID file (worker process is dead)', {
pid: pidInfo.pid,
port: pidInfo.port,
startedAt: pidInfo.startedAt
});
rmSync(pidFilePath, { force: true });
return 'stale';
}
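The four outcomes of `validateWorkerPidFile()` above can be summarized with a pure variant that takes the file content and a liveness probe as inputs. This is a hedged sketch: `classifyPidFile` is a hypothetical name, and unlike the function above it does no logging and does not delete stale or unparsable files.

```typescript
// Hypothetical pure variant for illustration; the function in this diff
// reads the file itself, logs, and removes stale or unparsable PID files.
type PidStatus = 'missing' | 'alive' | 'stale' | 'invalid';

function classifyPidFile(
  content: string | null,            // null models "file does not exist"
  isAlive: (pid: number) => boolean  // injected liveness probe
): PidStatus {
  if (content === null) return 'missing';
  let pid: unknown;
  try {
    pid = JSON.parse(content).pid;
  } catch {
    return 'invalid'; // content is not usable JSON
  }
  // A parsed file whose PID is absent or dead counts as stale.
  return typeof pid === 'number' && isAlive(pid) ? 'alive' : 'stale';
}

const sample = '{"pid":123,"port":37777,"startedAt":"2026-03-16T10:00:00Z"}';
const whenRunning = classifyPidFile(sample, pid => pid === 123);
const whenDead = classifyPidFile(sample, () => false);
```

Only the `'alive'` result blocks `Supervisor.start()`; the other three states let startup proceed.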
+253
@@ -0,0 +1,253 @@
import { ChildProcess } from 'child_process';
import { existsSync, mkdirSync, readFileSync, writeFileSync } from 'fs';
import { homedir } from 'os';
import path from 'path';
import { logger } from '../utils/logger.js';
const REAP_SESSION_SIGTERM_TIMEOUT_MS = 5_000;
const REAP_SESSION_SIGKILL_TIMEOUT_MS = 1_000;
const DATA_DIR = path.join(homedir(), '.claude-mem');
const DEFAULT_REGISTRY_PATH = path.join(DATA_DIR, 'supervisor.json');
export interface ManagedProcessInfo {
pid: number;
type: string;
sessionId?: string | number;
startedAt: string;
}
export interface ManagedProcessRecord extends ManagedProcessInfo {
id: string;
}
interface PersistedRegistry {
processes: Record<string, ManagedProcessInfo>;
}
export function isPidAlive(pid: number): boolean {
if (!Number.isInteger(pid) || pid < 0) return false;
if (pid === 0) return false;
try {
process.kill(pid, 0);
return true;
} catch (error: unknown) {
const code = (error as NodeJS.ErrnoException).code;
return code === 'EPERM';
}
}
export class ProcessRegistry {
private readonly registryPath: string;
private readonly entries = new Map<string, ManagedProcessInfo>();
private readonly runtimeProcesses = new Map<string, ChildProcess>();
private initialized = false;
constructor(registryPath: string = DEFAULT_REGISTRY_PATH) {
this.registryPath = registryPath;
}
initialize(): void {
if (this.initialized) return;
this.initialized = true;
mkdirSync(path.dirname(this.registryPath), { recursive: true });
if (!existsSync(this.registryPath)) {
this.persist();
return;
}
try {
const raw = JSON.parse(readFileSync(this.registryPath, 'utf-8')) as PersistedRegistry;
const processes = raw.processes ?? {};
for (const [id, info] of Object.entries(processes)) {
this.entries.set(id, info);
}
} catch (error) {
logger.warn('SYSTEM', 'Failed to parse supervisor registry, rebuilding', {
path: this.registryPath
}, error as Error);
this.entries.clear();
}
const removed = this.pruneDeadEntries();
if (removed > 0) {
logger.info('SYSTEM', 'Removed dead processes from supervisor registry', { removed });
}
this.persist();
}
register(id: string, processInfo: ManagedProcessInfo, processRef?: ChildProcess): void {
this.initialize();
this.entries.set(id, processInfo);
if (processRef) {
this.runtimeProcesses.set(id, processRef);
}
this.persist();
}
unregister(id: string): void {
this.initialize();
this.entries.delete(id);
this.runtimeProcesses.delete(id);
this.persist();
}
clear(): void {
this.entries.clear();
this.runtimeProcesses.clear();
this.persist();
}
getAll(): ManagedProcessRecord[] {
this.initialize();
return Array.from(this.entries.entries())
.map(([id, info]) => ({ id, ...info }))
.sort((a, b) => {
const left = Date.parse(a.startedAt);
const right = Date.parse(b.startedAt);
return (Number.isNaN(left) ? 0 : left) - (Number.isNaN(right) ? 0 : right);
});
}
getBySession(sessionId: string | number): ManagedProcessRecord[] {
const normalized = String(sessionId);
return this.getAll().filter(record => record.sessionId !== undefined && String(record.sessionId) === normalized);
}
getRuntimeProcess(id: string): ChildProcess | undefined {
return this.runtimeProcesses.get(id);
}
getByPid(pid: number): ManagedProcessRecord[] {
return this.getAll().filter(record => record.pid === pid);
}
pruneDeadEntries(): number {
this.initialize();
let removed = 0;
for (const [id, info] of this.entries) {
if (isPidAlive(info.pid)) continue;
this.entries.delete(id);
this.runtimeProcesses.delete(id);
removed += 1;
}
if (removed > 0) {
this.persist();
}
return removed;
}
/**
* Kill and unregister all processes tagged with the given sessionId.
* Sends SIGTERM first, waits up to 5s, then SIGKILL for survivors.
* Called when a session is deleted to prevent leaked child processes (#1351).
*/
async reapSession(sessionId: string | number): Promise<number> {
this.initialize();
const sessionRecords = this.getBySession(sessionId);
if (sessionRecords.length === 0) {
return 0;
}
const sessionIdNum = typeof sessionId === 'number' ? sessionId : Number(sessionId) || undefined;
logger.info('SYSTEM', `Reaping ${sessionRecords.length} process(es) for session ${sessionId}`, {
sessionId: sessionIdNum,
pids: sessionRecords.map(r => r.pid)
});
// Phase 1: SIGTERM all alive processes
const aliveRecords = sessionRecords.filter(r => isPidAlive(r.pid));
for (const record of aliveRecords) {
try {
process.kill(record.pid, 'SIGTERM');
} catch (error: unknown) {
const code = (error as NodeJS.ErrnoException).code;
if (code !== 'ESRCH') {
logger.debug('SYSTEM', `Failed to SIGTERM session process PID ${record.pid}`, {
pid: record.pid
}, error as Error);
}
}
}
// Phase 2: Wait for processes to exit
const deadline = Date.now() + REAP_SESSION_SIGTERM_TIMEOUT_MS;
while (Date.now() < deadline) {
const survivors = aliveRecords.filter(r => isPidAlive(r.pid));
if (survivors.length === 0) break;
await new Promise(resolve => setTimeout(resolve, 100));
}
// Phase 3: SIGKILL any survivors
const survivors = aliveRecords.filter(r => isPidAlive(r.pid));
for (const record of survivors) {
logger.warn('SYSTEM', `Session process PID ${record.pid} did not exit after SIGTERM, sending SIGKILL`, {
pid: record.pid,
sessionId: sessionIdNum
});
try {
process.kill(record.pid, 'SIGKILL');
} catch (error: unknown) {
const code = (error as NodeJS.ErrnoException).code;
if (code !== 'ESRCH') {
logger.debug('SYSTEM', `Failed to SIGKILL session process PID ${record.pid}`, {
pid: record.pid
}, error as Error);
}
}
}
// Brief wait for SIGKILL to take effect
if (survivors.length > 0) {
const sigkillDeadline = Date.now() + REAP_SESSION_SIGKILL_TIMEOUT_MS;
while (Date.now() < sigkillDeadline) {
const remaining = survivors.filter(r => isPidAlive(r.pid));
if (remaining.length === 0) break;
await new Promise(resolve => setTimeout(resolve, 100));
}
}
// Phase 4: Unregister all session records
for (const record of sessionRecords) {
this.entries.delete(record.id);
this.runtimeProcesses.delete(record.id);
}
this.persist();
logger.info('SYSTEM', `Reaped ${sessionRecords.length} process(es) for session ${sessionId}`, {
sessionId: sessionIdNum,
reaped: sessionRecords.length
});
return sessionRecords.length;
}
private persist(): void {
const payload: PersistedRegistry = {
processes: Object.fromEntries(this.entries.entries())
};
mkdirSync(path.dirname(this.registryPath), { recursive: true });
writeFileSync(this.registryPath, JSON.stringify(payload, null, 2));
}
}
let registrySingleton: ProcessRegistry | null = null;
export function getProcessRegistry(): ProcessRegistry {
if (!registrySingleton) {
registrySingleton = new ProcessRegistry();
}
return registrySingleton;
}
export function createProcessRegistry(registryPath: string): ProcessRegistry {
return new ProcessRegistry(registryPath);
}
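One detail of `getBySession()` and `reapSession()` above is easy to miss: session IDs are compared as strings, so a process registered with the number `42` is reaped by a request for `'42'`, while untagged processes are never matched. A minimal sketch, with a hypothetical `SessionRecord` type, a hypothetical `bySession` helper, and invented sample records:

```typescript
// Illustration only; record shape trimmed to the fields that matter here.
interface SessionRecord {
  id: string;
  pid: number;
  sessionId?: string | number;
}

function bySession(records: SessionRecord[], sessionId: string | number): SessionRecord[] {
  const normalized = String(sessionId);
  return records.filter(
    r => r.sessionId !== undefined && String(r.sessionId) === normalized
  );
}

const registered: SessionRecord[] = [
  { id: 'mcp-a', pid: 1001, sessionId: 42 },    // numeric tag
  { id: 'mcp-b', pid: 1002, sessionId: '42' },  // string tag, same session
  { id: 'chroma', pid: 1003 },                  // untagged: never session-reaped
  { id: 'mcp-c', pid: 1004, sessionId: 7 },
];
const matchedIds = bySession(registered, '42').map(r => r.id);
```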
+157
@@ -0,0 +1,157 @@
import { execFile } from 'child_process';
import { rmSync } from 'fs';
import { homedir } from 'os';
import path from 'path';
import { promisify } from 'util';
import { logger } from '../utils/logger.js';
import { HOOK_TIMEOUTS } from '../shared/hook-constants.js';
import { isPidAlive, type ManagedProcessRecord, type ProcessRegistry } from './process-registry.js';
const execFileAsync = promisify(execFile);
const DATA_DIR = path.join(homedir(), '.claude-mem');
const PID_FILE = path.join(DATA_DIR, 'worker.pid');
type TreeKillFn = (pid: number, signal?: string, callback?: (error?: Error | null) => void) => void;
export interface ShutdownCascadeOptions {
registry: ProcessRegistry;
currentPid?: number;
pidFilePath?: string;
}
export async function runShutdownCascade(options: ShutdownCascadeOptions): Promise<void> {
const currentPid = options.currentPid ?? process.pid;
const pidFilePath = options.pidFilePath ?? PID_FILE;
const allRecords = options.registry.getAll();
const childRecords = [...allRecords]
.filter(record => record.pid !== currentPid)
.sort((a, b) => Date.parse(b.startedAt) - Date.parse(a.startedAt));
for (const record of childRecords) {
if (!isPidAlive(record.pid)) {
options.registry.unregister(record.id);
continue;
}
try {
await signalProcess(record.pid, 'SIGTERM');
} catch (error) {
logger.debug('SYSTEM', 'Failed to send SIGTERM to child process', {
pid: record.pid,
type: record.type
}, error as Error);
}
}
await waitForExit(childRecords, 5000);
const survivors = childRecords.filter(record => isPidAlive(record.pid));
for (const record of survivors) {
try {
await signalProcess(record.pid, 'SIGKILL');
} catch (error) {
logger.debug('SYSTEM', 'Failed to force kill child process', {
pid: record.pid,
type: record.type
}, error as Error);
}
}
await waitForExit(survivors, 1000);
for (const record of childRecords) {
options.registry.unregister(record.id);
}
for (const record of allRecords.filter(record => record.pid === currentPid)) {
options.registry.unregister(record.id);
}
try {
rmSync(pidFilePath, { force: true });
} catch (error) {
logger.debug('SYSTEM', 'Failed to remove PID file during shutdown', { pidFilePath }, error as Error);
}
options.registry.pruneDeadEntries();
}
async function waitForExit(records: ManagedProcessRecord[], timeoutMs: number): Promise<void> {
const deadline = Date.now() + timeoutMs;
while (Date.now() < deadline) {
const survivors = records.filter(record => isPidAlive(record.pid));
if (survivors.length === 0) {
return;
}
await new Promise(resolve => setTimeout(resolve, 100));
}
}
async function signalProcess(pid: number, signal: 'SIGTERM' | 'SIGKILL'): Promise<void> {
if (signal === 'SIGTERM') {
try {
process.kill(pid, signal);
} catch (error) {
const errno = (error as NodeJS.ErrnoException).code;
if (errno === 'ESRCH') {
return;
}
throw error;
}
return;
}
if (process.platform === 'win32') {
const treeKill = await loadTreeKill();
if (treeKill) {
await new Promise<void>((resolve, reject) => {
treeKill(pid, signal, (error) => {
if (!error) {
resolve();
return;
}
const errno = (error as NodeJS.ErrnoException).code;
if (errno === 'ESRCH') {
resolve();
return;
}
reject(error);
});
});
return;
}
const args = ['/PID', String(pid), '/T'];
if (signal === 'SIGKILL') {
args.push('/F');
}
await execFileAsync('taskkill', args, {
timeout: HOOK_TIMEOUTS.POWERSHELL_COMMAND,
windowsHide: true
});
return;
}
try {
process.kill(pid, signal);
} catch (error) {
const errno = (error as NodeJS.ErrnoException).code;
if (errno === 'ESRCH') {
return;
}
throw error;
}
}
async function loadTreeKill(): Promise<TreeKillFn | null> {
const moduleName = 'tree-kill';
try {
const treeKillModule = await import(moduleName);
return (treeKillModule.default ?? treeKillModule) as TreeKillFn;
} catch {
return null;
}
}
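The ordering step at the top of `runShutdownCascade()` above can be shown in isolation: the supervisor's own PID is excluded, and the remaining processes are signalled newest first. The sketch below assumes valid ISO timestamps; `TimedRecord`, `shutdownOrder`, and the sample data are hypothetical.

```typescript
// Illustration only; mirrors the filter + sort at the start of the cascade.
interface TimedRecord {
  id: string;
  pid: number;
  startedAt: string; // ISO-8601 timestamp
}

function shutdownOrder(all: TimedRecord[], currentPid: number): TimedRecord[] {
  return all
    .filter(r => r.pid !== currentPid) // never signal ourselves
    .sort((a, b) => Date.parse(b.startedAt) - Date.parse(a.startedAt)); // newest first
}

const running: TimedRecord[] = [
  { id: 'worker', pid: 100, startedAt: '2026-03-16T10:00:00Z' },
  { id: 'chroma', pid: 200, startedAt: '2026-03-16T10:01:00Z' },
  { id: 'mcp', pid: 300, startedAt: '2026-03-16T10:02:00Z' },
];
const order = shutdownOrder(running, 100).map(r => r.id);
```

Stopping the most recently started processes first means later-spawned helpers are signalled before the longer-lived processes that were started ahead of them.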
+7
@@ -0,0 +1,7 @@
declare module 'tree-kill' {
export default function treeKill(
pid: number,
signal?: string,
callback?: (error?: Error | null) => void
): void;
}
+13 -16
@@ -25,29 +25,26 @@ export function App() {
const { preference, resolvedTheme, setThemePreference } = useTheme();
const pagination = usePagination(currentFilter);
// When filtering by project: ONLY use paginated data (API-filtered)
// When showing all projects: merge SSE live data with paginated data
// Merge SSE live data with paginated data, filtering by project when active
const allObservations = useMemo(() => {
if (currentFilter) {
// Project filter active: API handles filtering, ignore SSE items
return paginatedObservations;
}
// No filter: merge SSE + paginated, deduplicate by ID
return mergeAndDeduplicateByProject(observations, paginatedObservations);
const live = currentFilter
? observations.filter(o => o.project === currentFilter)
: observations;
return mergeAndDeduplicateByProject(live, paginatedObservations);
}, [observations, paginatedObservations, currentFilter]);
const allSummaries = useMemo(() => {
if (currentFilter) {
return paginatedSummaries;
}
return mergeAndDeduplicateByProject(summaries, paginatedSummaries);
const live = currentFilter
? summaries.filter(s => s.project === currentFilter)
: summaries;
return mergeAndDeduplicateByProject(live, paginatedSummaries);
}, [summaries, paginatedSummaries, currentFilter]);
const allPrompts = useMemo(() => {
if (currentFilter) {
return paginatedPrompts;
}
return mergeAndDeduplicateByProject(prompts, paginatedPrompts);
const live = currentFilter
? prompts.filter(p => p.project === currentFilter)
: prompts;
return mergeAndDeduplicateByProject(live, paginatedPrompts);
}, [prompts, paginatedPrompts, currentFilter]);
// Toggle context preview modal
+2 -3
@@ -5,10 +5,9 @@
/**
* Merge real-time SSE items with paginated items, removing duplicates by ID
* NOTE: This should ONLY be used when no project filter is active.
* When filtering, use ONLY paginated data (API-filtered).
* Callers should pre-filter liveItems by project when a filter is active.
*
* @param liveItems - Items from SSE stream (unfiltered)
* @param liveItems - Items from SSE stream (pre-filtered if needed)
* @param paginatedItems - Items from pagination API
* @returns Merged and deduplicated array
*/
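The new contract described in the doc comment above (callers pre-filter live items, then merge) can be sketched as follows. The body of `mergeAndDeduplicateByProject` is not shown in this diff, so `mergeById` below is an assumed implementation (live items first, later duplicates dropped), and `Item` and `visibleItems` are hypothetical names.

```typescript
// Illustration only; the real merge helper's body is not part of this diff.
interface Item {
  id: number;
  project: string;
}

// Assumed behavior: keep the first occurrence of each id, live items first.
function mergeById<T extends { id: number }>(live: T[], paginated: T[]): T[] {
  const seen = new Set<number>();
  const merged: T[] = [];
  for (const item of [...live, ...paginated]) {
    if (seen.has(item.id)) continue;
    seen.add(item.id);
    merged.push(item);
  }
  return merged;
}

// Pre-filter the SSE items when a project filter is active, then merge.
function visibleItems(live: Item[], paginated: Item[], filter: string | null): Item[] {
  const scoped = filter ? live.filter(i => i.project === filter) : live;
  return mergeById(scoped, paginated);
}

const liveItems: Item[] = [{ id: 1, project: 'a' }, { id: 2, project: 'b' }];
const pagedItems: Item[] = [{ id: 1, project: 'a' }, { id: 3, project: 'a' }];
const filteredIds = visibleItems(liveItems, pagedItems, 'a').map(i => i.id);
const unfilteredIds = visibleItems(liveItems, pagedItems, null).map(i => i.id);
```

With the pre-filter in place, newly streamed items for the selected project appear immediately instead of waiting for the next paginated fetch.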
+6 -7
@@ -12,7 +12,7 @@ import os from 'os';
import { logger } from './logger.js';
import { formatDate, groupByDate } from '../shared/timeline-formatting.js';
import { SettingsDefaultsManager } from '../shared/SettingsDefaultsManager.js';
import { getWorkerHost } from '../shared/worker-utils.js';
import { workerHttpRequest } from '../shared/worker-utils.js';
const SETTINGS_PATH = path.join(os.homedir(), '.claude-mem', 'settings.json');
@@ -321,12 +321,12 @@ function isExcludedFolder(folderPath: string, excludePaths: string[]): boolean {
*
* @param filePaths - Array of absolute file paths (modified or read)
* @param project - Project identifier for API query
* @param port - Worker API port
* @param _port - Worker API port (legacy, now resolved automatically via socket/TCP)
*/
export async function updateFolderClaudeMdFiles(
filePaths: string[],
project: string,
port: number,
_port: number,
projectRoot?: string
): Promise<void> {
// Load settings to get configurable observation limit and exclude list
@@ -417,10 +417,9 @@ export async function updateFolderClaudeMdFiles(
// Process each folder
for (const folderPath of folderPaths) {
try {
// Fetch timeline via existing API
const host = getWorkerHost();
const response = await fetch(
`http://${host}:${port}/api/search/by-file?filePath=${encodeURIComponent(folderPath)}&limit=${limit}&project=${encodeURIComponent(project)}&isFolder=true`
// Fetch timeline via existing API (uses socket or TCP automatically)
const response = await workerHttpRequest(
`/api/search/by-file?filePath=${encodeURIComponent(folderPath)}&limit=${limit}&project=${encodeURIComponent(project)}&isFolder=true`
);
if (!response.ok) {
+58 -25
@@ -256,41 +256,74 @@ describe('Cursor IDE Compatibility (#838, #1049)', () => {
// --- Platform Adapter Tests ---
describe('Hook Lifecycle - Claude Code Adapter', () => {
it('should default suppressOutput to true when not explicitly set', async () => {
const fmt = async (input: any) => {
const { claudeCodeAdapter } = await import('../src/cli/adapters/claude-code.js');
return claudeCodeAdapter.formatOutput(input);
};
// Result with no suppressOutput field
const output = claudeCodeAdapter.formatOutput({ continue: true });
expect(output).toEqual({ continue: true, suppressOutput: true });
// --- Happy paths ---
it('should return empty object for empty result', async () => {
expect(await fmt({})).toEqual({});
});
it('should default both continue and suppressOutput to true for empty result', async () => {
const { claudeCodeAdapter } = await import('../src/cli/adapters/claude-code.js');
const output = claudeCodeAdapter.formatOutput({});
expect(output).toEqual({ continue: true, suppressOutput: true });
it('should include systemMessage when present', async () => {
expect(await fmt({ systemMessage: 'test message' })).toEqual({ systemMessage: 'test message' });
});
it('should respect explicit suppressOutput: false', async () => {
const { claudeCodeAdapter } = await import('../src/cli/adapters/claude-code.js');
const output = claudeCodeAdapter.formatOutput({ continue: true, suppressOutput: false });
expect(output).toEqual({ continue: true, suppressOutput: false });
});
it('should use hookSpecificOutput format for context injection', async () => {
const { claudeCodeAdapter } = await import('../src/cli/adapters/claude-code.js');
const result = {
it('should use hookSpecificOutput format with systemMessage', async () => {
const output = await fmt({
hookSpecificOutput: { hookEventName: 'SessionStart', additionalContext: 'test context' },
systemMessage: 'test message'
};
const output = claudeCodeAdapter.formatOutput(result) as Record<string, unknown>;
}) as Record<string, unknown>;
expect(output.hookSpecificOutput).toEqual({ hookEventName: 'SessionStart', additionalContext: 'test context' });
expect(output.systemMessage).toBe('test message');
// Should NOT have continue/suppressOutput when using hookSpecificOutput
expect(output.continue).toBeUndefined();
expect(output.suppressOutput).toBeUndefined();
});
it('should return hookSpecificOutput without systemMessage when absent', async () => {
expect(await fmt({
hookSpecificOutput: { hookEventName: 'SessionStart', additionalContext: 'ctx' },
})).toEqual({
hookSpecificOutput: { hookEventName: 'SessionStart', additionalContext: 'ctx' },
});
});
// --- Edge cases / unhappy paths (addresses PR #1291 review) ---
it('should return empty object for malformed input (undefined/null)', async () => {
expect(await fmt(undefined)).toEqual({});
expect(await fmt(null)).toEqual({});
});
it('should exclude falsy systemMessage values', async () => {
expect(await fmt({ systemMessage: '' })).toEqual({});
expect(await fmt({ systemMessage: null })).toEqual({});
expect(await fmt({ systemMessage: 0 })).toEqual({});
});
it('should strip all non-contract fields', async () => {
expect(await fmt({
continue: false,
suppressOutput: false,
systemMessage: 'msg',
exitCode: 2,
hookSpecificOutput: undefined,
})).toEqual({ systemMessage: 'msg' });
});
it('should only emit keys from the Claude Code hook contract', async () => {
const allowedKeys = new Set(['hookSpecificOutput', 'systemMessage', 'decision', 'reason']);
const cases = [
{},
{ systemMessage: 'x' },
{ continue: true, suppressOutput: true, systemMessage: 'x', exitCode: 1 },
{ hookSpecificOutput: { hookEventName: 'E', additionalContext: 'C' }, systemMessage: 'x' },
];
for (const input of cases) {
for (const key of Object.keys(await fmt(input) as object)) {
expect(allowedKeys.has(key)).toBe(true);
}
}
});
});
@@ -27,6 +27,15 @@ mock.module('../../src/shared/SettingsDefaultsManager.js', () => ({
mock.module('../../src/shared/worker-utils.js', () => ({
ensureWorkerRunning: () => Promise.resolve(true),
getWorkerPort: () => 37777,
workerHttpRequest: (apiPath: string, options?: any) => {
// Delegate to global fetch so tests can mock fetch behavior
const url = `http://127.0.0.1:37777${apiPath}`;
return globalThis.fetch(url, {
method: options?.method ?? 'GET',
headers: options?.headers,
body: options?.body,
});
},
}));
mock.module('../../src/utils/project-name.js', () => ({
+26 -8
@@ -59,7 +59,11 @@ describe('HealthMonitor', () => {
describe('waitForHealth', () => {
it('should succeed immediately when server responds', async () => {
global.fetch = mock(() => Promise.resolve({ ok: true } as Response));
global.fetch = mock(() => Promise.resolve({
ok: true,
status: 200,
text: () => Promise.resolve('')
} as unknown as Response));
const start = Date.now();
const result = await waitForHealth(37777, 5000);
@@ -91,7 +95,11 @@ describe('HealthMonitor', () => {
if (callCount < 3) {
return Promise.reject(new Error('ECONNREFUSED'));
}
return Promise.resolve({ ok: true } as Response);
return Promise.resolve({
ok: true,
status: 200,
text: () => Promise.resolve('')
} as unknown as Response);
});
const result = await waitForHealth(37777, 5000);
@@ -101,7 +109,11 @@ describe('HealthMonitor', () => {
});
it('should check health endpoint for liveness', async () => {
const fetchMock = mock(() => Promise.resolve({ ok: true } as Response));
const fetchMock = mock(() => Promise.resolve({
ok: true,
status: 200,
text: () => Promise.resolve('')
} as unknown as Response));
global.fetch = fetchMock;
await waitForHealth(37777, 1000);
@@ -115,7 +127,11 @@ describe('HealthMonitor', () => {
});
it('should use default timeout when not specified', async () => {
global.fetch = mock(() => Promise.resolve({ ok: true } as Response));
global.fetch = mock(() => Promise.resolve({
ok: true,
status: 200,
text: () => Promise.resolve('')
} as unknown as Response));
// Just verify it doesn't throw and returns quickly
const result = await waitForHealth(37777);
@@ -154,8 +170,9 @@ describe('HealthMonitor', () => {
it('should detect version mismatch', async () => {
global.fetch = mock(() => Promise.resolve({
ok: true,
json: () => Promise.resolve({ version: '0.0.0-definitely-wrong' })
} as Response));
status: 200,
text: () => Promise.resolve(JSON.stringify({ version: '0.0.0-definitely-wrong' }))
} as unknown as Response));
const result = await checkVersionMatch(37777);
@@ -172,8 +189,9 @@ describe('HealthMonitor', () => {
global.fetch = mock(() => Promise.resolve({
ok: true,
json: () => Promise.resolve({ version: pluginVersion })
} as Response));
status: 200,
text: () => Promise.resolve(JSON.stringify({ version: pluginVersion }))
} as unknown as Response));
const result = await checkVersionMatch(37777);
+253
@@ -0,0 +1,253 @@
/**
* Tests for malformed schema repair in Database.ts
*
* Mock Justification: NONE (0% mock code)
* - Uses real SQLite with a temp file to test the actual schema repair logic
* - Uses Python sqlite3 to simulate cross-version schema corruption
* (bun:sqlite doesn't allow writable_schema modifications)
* - Covers the cross-machine sync scenario from issue #1307
*
* Value: Prevents the silent 503 failure loop when a DB is synced between
* machines running different claude-mem versions
*/
import { describe, it, expect } from 'bun:test';
import { Database } from 'bun:sqlite';
import { ClaudeMemDatabase } from '../../../src/services/sqlite/Database.js';
import { MigrationRunner } from '../../../src/services/sqlite/migrations/runner.js';
import { existsSync, unlinkSync, writeFileSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';
import { execFileSync, execSync } from 'child_process';
function tempDbPath(): string {
return join(tmpdir(), `claude-mem-test-${Date.now()}-${Math.random().toString(36).slice(2)}.db`);
}
function cleanup(path: string): void {
for (const suffix of ['', '-wal', '-shm']) {
const p = path + suffix;
if (existsSync(p)) unlinkSync(p);
}
}
function hasPython(): boolean {
try {
execSync('python3 --version', { stdio: 'pipe' });
return true;
} catch {
return false;
}
}
/**
* Use Python's sqlite3 to corrupt a DB by removing the content_hash column
* from the observations table definition while leaving the index intact.
* This simulates what happens when a DB from a newer version is synced.
*/
function corruptDbViaPython(dbPath: string): void {
const script = join(tmpdir(), `corrupt-${Date.now()}.py`);
writeFileSync(script, `
import sqlite3, re, sys
c = sqlite3.connect(sys.argv[1])
c.execute("PRAGMA writable_schema = ON")
row = c.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='observations'").fetchone()
if row:
new_sql = re.sub(r',\\s*content_hash\\s+TEXT', '', row[0])
c.execute("UPDATE sqlite_master SET sql = ? WHERE type='table' AND name='observations'", (new_sql,))
c.execute("PRAGMA writable_schema = OFF")
c.commit()
c.close()
`);
try {
execFileSync('python3', [script, dbPath], { timeout: 10000 });
} finally {
if (existsSync(script)) unlinkSync(script);
}
}
describe('Schema repair on malformed database', () => {
it('should repair a database with an orphaned index referencing a non-existent column', () => {
if (!hasPython()) {
console.log('Python3 not available, skipping test');
return;
}
const dbPath = tempDbPath();
try {
// Step 1: Create a valid database with all migrations
const db = new Database(dbPath, { create: true, readwrite: true });
db.run('PRAGMA journal_mode = WAL');
db.run('PRAGMA foreign_keys = ON');
const runner = new MigrationRunner(db);
runner.runAllMigrations();
// Verify content_hash column and index exist
const hasContentHash = db.prepare('PRAGMA table_info(observations)').all()
.some((col: any) => col.name === 'content_hash');
expect(hasContentHash).toBe(true);
// Checkpoint WAL so all data is in the main file
db.run('PRAGMA wal_checkpoint(TRUNCATE)');
db.close();
// Step 2: Corrupt the DB
corruptDbViaPython(dbPath);
// Step 3: Verify the DB is actually corrupted
const corruptDb = new Database(dbPath, { readwrite: true });
let threw = false;
try {
corruptDb.query('SELECT name FROM sqlite_master WHERE type = "table" LIMIT 1').all();
} catch (e: any) {
threw = true;
expect(e.message).toContain('malformed database schema');
expect(e.message).toContain('idx_observations_content_hash');
}
corruptDb.close();
expect(threw).toBe(true);
// Step 4: Open via ClaudeMemDatabase — it should auto-repair
const repaired = new ClaudeMemDatabase(dbPath);
// Verify the DB is functional
const tables = repaired.db.prepare("SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%' ORDER BY name")
.all() as { name: string }[];
const tableNames = tables.map(t => t.name);
expect(tableNames).toContain('observations');
expect(tableNames).toContain('sdk_sessions');
// Verify the index was recreated by the migration runner
const indexes = repaired.db.prepare("SELECT name FROM sqlite_master WHERE type='index' AND name='idx_observations_content_hash'")
.all() as { name: string }[];
expect(indexes.length).toBe(1);
// Verify the content_hash column was re-added by the migration
const columns = repaired.db.prepare('PRAGMA table_info(observations)').all() as { name: string }[];
expect(columns.some(c => c.name === 'content_hash')).toBe(true);
repaired.close();
} finally {
cleanup(dbPath);
}
});
it('should handle a fresh database without triggering repair', () => {
const dbPath = tempDbPath();
try {
const db = new ClaudeMemDatabase(dbPath);
const tables = db.db.prepare("SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'")
.all() as { name: string }[];
expect(tables.length).toBeGreaterThan(0);
db.close();
} finally {
cleanup(dbPath);
}
});
it('should repair a corrupted DB that has no schema_versions table', () => {
if (!hasPython()) {
console.log('Python3 not available, skipping test');
return;
}
const dbPath = tempDbPath();
const scriptPath = join(tmpdir(), `corrupt-nosv-${Date.now()}.py`);
try {
// Build a minimal DB containing only an orphaned index entry: no observations
// table and no schema_versions table. This simulates a partially initialized DB
// that was synced before migrations ever ran.
writeFileSync(scriptPath, `
import sqlite3, sys
c = sqlite3.connect(sys.argv[1])
c.execute('PRAGMA writable_schema = ON')
# Inject an orphaned index into sqlite_master without any backing table.
# This simulates a partially-synced DB where index metadata arrived but
# the table schema is incomplete or missing columns.
idx_sql = 'CREATE INDEX idx_observations_content_hash ON observations(content_hash, created_at_epoch)'
c.execute(
"INSERT INTO sqlite_master (type, name, tbl_name, rootpage, sql) VALUES ('index', 'idx_observations_content_hash', 'observations', 0, ?)",
(idx_sql,)
)
c.execute('PRAGMA writable_schema = OFF')
c.commit()
c.close()
`);
execFileSync('python3', [scriptPath, dbPath], { timeout: 10000 });
// Verify it's corrupted
const corruptDb = new Database(dbPath, { readwrite: true });
let threw = false;
try {
corruptDb.query('SELECT name FROM sqlite_master WHERE type = "table" LIMIT 1').all();
} catch (e: any) {
threw = true;
expect(e.message).toContain('malformed database schema');
}
corruptDb.close();
expect(threw).toBe(true);
// ClaudeMemDatabase must repair and fully initialize despite missing schema_versions
const repaired = new ClaudeMemDatabase(dbPath);
const tables = repaired.db.prepare("SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%' ORDER BY name")
.all() as { name: string }[];
const tableNames = tables.map(t => t.name);
expect(tableNames).toContain('schema_versions');
expect(tableNames).toContain('observations');
expect(tableNames).toContain('sdk_sessions');
repaired.close();
} finally {
cleanup(dbPath);
if (existsSync(scriptPath)) unlinkSync(scriptPath);
}
});
it('should preserve existing data through repair and re-migration', () => {
if (!hasPython()) {
console.log('Python3 not available, skipping test');
return;
}
const dbPath = tempDbPath();
try {
// Step 1: Create a fully migrated DB and insert a session + observation
const db = new Database(dbPath, { create: true, readwrite: true });
db.run('PRAGMA journal_mode = WAL');
db.run('PRAGMA foreign_keys = ON');
const runner = new MigrationRunner(db);
runner.runAllMigrations();
const now = new Date().toISOString();
const epoch = Date.now();
db.prepare(`
INSERT INTO sdk_sessions (content_session_id, memory_session_id, project, started_at, started_at_epoch, status)
VALUES (?, ?, ?, ?, ?, ?)
`).run('test-content-1', 'test-memory-1', 'test-project', now, epoch, 'active');
db.prepare(`
INSERT INTO observations (memory_session_id, project, type, created_at, created_at_epoch)
VALUES (?, ?, ?, ?, ?)
`).run('test-memory-1', 'test-project', 'discovery', now, epoch);
db.run('PRAGMA wal_checkpoint(TRUNCATE)');
db.close();
// Step 2: Corrupt the DB
corruptDbViaPython(dbPath);
// Step 3: Repair via ClaudeMemDatabase
const repaired = new ClaudeMemDatabase(dbPath);
// Data must survive the repair + re-migration
const sessions = repaired.db.prepare('SELECT COUNT(*) as count FROM sdk_sessions').get() as { count: number };
const observations = repaired.db.prepare('SELECT COUNT(*) as count FROM observations').get() as { count: number };
expect(sessions.count).toBe(1);
expect(observations.count).toBe(1);
repaired.close();
} finally {
cleanup(dbPath);
}
});
});
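The tests above assert that the SQLite error message contains both `malformed database schema` and the name of the orphaned index. A repair routine could use that name to decide which schema entry to drop. The sketch below shows only the message-parsing step; `extractMalformedObject` is a hypothetical helper, and the real repair logic in Database.ts may work differently.

```typescript
// Hypothetical helper; the actual repair code in Database.ts is not shown in this diff.
// Extracts the object name that SQLite reports inside the parentheses of a
// "malformed database schema (<name>)" error message.
function extractMalformedObject(message: string): string | null {
  const match = /malformed database schema \(([^)]+)\)/.exec(message);
  return match?.[1] ?? null;
}
```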
@@ -0,0 +1,115 @@
/**
* Regression tests for ChromaMcpManager SSL flag handling (PR #1286)
*
* Validates that buildCommandArgs() always emits the correct `--ssl` flag
* based on CLAUDE_MEM_CHROMA_SSL, and omits it entirely in local mode.
*
* Strategy: mock StdioClientTransport to capture the spawned args without
* actually launching a subprocess, then inspect the captured args array.
*/
import { describe, it, expect, beforeEach, mock } from 'bun:test';
// ── Mutable settings closure (updated per test) ────────────────────────
let currentSettings: Record<string, string> = {};
// ── Mock modules BEFORE importing the module under test ────────────────
// Capture the args passed to StdioClientTransport constructor
let capturedTransportOpts: { command: string; args: string[] } | null = null;
mock.module('@modelcontextprotocol/sdk/client/stdio.js', () => ({
StdioClientTransport: class FakeTransport {
// Required: ChromaMcpManager assigns transport.onclose after connect()
onclose: (() => void) | null = null;
constructor(opts: { command: string; args: string[] }) {
capturedTransportOpts = { command: opts.command, args: opts.args };
}
async close() {}
},
}));
mock.module('@modelcontextprotocol/sdk/client/index.js', () => ({
Client: class FakeClient {
constructor() {}
async connect() {}
async callTool() {
return { content: [{ type: 'text', text: '{}' }] };
}
async close() {}
},
}));
mock.module('../../../src/shared/SettingsDefaultsManager.js', () => ({
SettingsDefaultsManager: {
get: (key: string) => currentSettings[key] ?? '',
getInt: () => 0,
loadFromFile: () => currentSettings,
},
}));
mock.module('../../../src/shared/paths.js', () => ({
USER_SETTINGS_PATH: '/tmp/fake-settings.json',
}));
mock.module('../../../src/utils/logger.js', () => ({
logger: {
info: () => {},
debug: () => {},
warn: () => {},
error: () => {},
failure: () => {},
},
}));
// ── Now import the module under test ───────────────────────────────────
import { ChromaMcpManager } from '../../../src/services/sync/ChromaMcpManager.js';
// ── Helpers ────────────────────────────────────────────────────────────
async function assertSslFlag(sslSetting: string | undefined, expectedValue: string) {
currentSettings = { CLAUDE_MEM_CHROMA_MODE: 'remote' };
if (sslSetting !== undefined) currentSettings.CLAUDE_MEM_CHROMA_SSL = sslSetting;
await mgr.callTool('chroma_list_collections', {});
expect(capturedTransportOpts).not.toBeNull();
const sslIdx = capturedTransportOpts!.args.indexOf('--ssl');
expect(sslIdx).not.toBe(-1);
expect(capturedTransportOpts!.args[sslIdx + 1]).toBe(expectedValue);
}
let mgr: ChromaMcpManager;
// ── Test suite ─────────────────────────────────────────────────────────
describe('ChromaMcpManager SSL flag regression (#1286)', () => {
beforeEach(async () => {
await ChromaMcpManager.reset();
capturedTransportOpts = null;
currentSettings = {};
mgr = ChromaMcpManager.getInstance();
});
it('emits --ssl false when CLAUDE_MEM_CHROMA_SSL=false', async () => {
await assertSslFlag('false', 'false');
});
it('emits --ssl true when CLAUDE_MEM_CHROMA_SSL=true', async () => {
await assertSslFlag('true', 'true');
});
it('defaults --ssl false when CLAUDE_MEM_CHROMA_SSL is not set', async () => {
await assertSslFlag(undefined, 'false');
});
it('omits --ssl entirely in local mode', async () => {
currentSettings = {
CLAUDE_MEM_CHROMA_MODE: 'local',
};
await mgr.callTool('chroma_list_collections', {});
expect(capturedTransportOpts).not.toBeNull();
const args = capturedTransportOpts!.args;
expect(args).not.toContain('--ssl');
expect(args).toContain('--client-type');
expect(args[args.indexOf('--client-type') + 1]).toBe('persistent');
});
});
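The four cases above pin down a small decision table: remote mode always emits `--ssl` with an explicit value, defaulting to `false`, while local mode omits it and uses a persistent client. A minimal sketch of that logic follows. `buildSslArgs` is a hypothetical name, the real `buildCommandArgs()` emits further arguments not shown here, and treating every non-`remote` mode as local is an assumption.

```typescript
type ChromaSettings = Record<string, string | undefined>;

// Hypothetical sketch of the flag logic the tests describe. Only the flags
// that appear in the tests are produced; other arguments are omitted.
function buildSslArgs(settings: ChromaSettings): string[] {
  const mode = settings['CLAUDE_MEM_CHROMA_MODE'];
  if (mode !== 'remote') {
    // Assumption: anything other than 'remote' behaves like local mode.
    return ['--client-type', 'persistent'];
  }
  // Always pass an explicit value so the flag is never silently dropped.
  const ssl = settings['CLAUDE_MEM_CHROMA_SSL'] === 'true' ? 'true' : 'false';
  return ['--ssl', ssl];
}
```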
+74
@@ -2,6 +2,7 @@ import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
import { existsSync, mkdirSync, writeFileSync, rmSync, readFileSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';
import { spawnSync } from 'child_process';
/**
* Smart Install Script Tests
@@ -163,3 +164,76 @@ describe('smart-install verifyCriticalModules logic', () => {
expect(missing).toEqual(['@chroma-core/other-pkg']);
});
});
describe('smart-install stdout JSON output (#1253)', () => {
const SCRIPT_PATH = join(__dirname, '..', 'plugin', 'scripts', 'smart-install.js');
it('should not have any execSync with stdio: inherit (prevents stdout leak)', () => {
const content = readFileSync(SCRIPT_PATH, 'utf-8');
// stdio: 'inherit' would leak non-JSON output to stdout, breaking Claude Code hooks
expect(content).not.toContain("stdio: 'inherit'");
expect(content).not.toContain('stdio: "inherit"');
});
it('should output valid JSON to stdout on success path', () => {
const content = readFileSync(SCRIPT_PATH, 'utf-8');
// The script must print JSON to stdout for the Claude Code hook contract
expect(content).toContain('console.log(JSON.stringify(');
expect(content).toContain('continue');
expect(content).toContain('suppressOutput');
});
it('should output valid JSON to stdout even in error catch block', () => {
const content = readFileSync(SCRIPT_PATH, 'utf-8');
// Find the catch block and verify it also outputs JSON
const catchIndex = content.lastIndexOf('catch (e)');
expect(catchIndex).toBeGreaterThan(0);
const catchBlock = content.slice(catchIndex, catchIndex + 300);
expect(catchBlock).toContain('console.log(JSON.stringify(');
});
it('should use piped stdout for all execSync calls', () => {
const content = readFileSync(SCRIPT_PATH, 'utf-8');
// Every execSync call must keep its output off stdout: stdio should be 'ignore',
// an array form, or the installStdio variable, never bare 'inherit', which would
// leak non-JSON output into the hook's stdout.
expect(content).not.toContain("stdio: 'inherit'");
expect(content).not.toContain('stdio: "inherit"');
// Verify the installStdio variable is defined with the correct pipe config
expect(content).toContain("const installStdio = ['pipe', 'pipe', 'inherit']");
});
it('should produce valid JSON when run with plugin disabled', () => {
// Run the actual script with the plugin forcefully disabled via settings
// This exercises the early exit path
const settingsDir = join(tmpdir(), `claude-mem-test-settings-${process.pid}`);
const settingsFile = join(settingsDir, 'settings.json');
mkdirSync(settingsDir, { recursive: true });
writeFileSync(settingsFile, JSON.stringify({
enabledPlugins: { 'claude-mem@thedotmack': false }
}));
try {
const result = spawnSync('node', [SCRIPT_PATH], {
encoding: 'utf-8',
env: {
...process.env,
CLAUDE_CONFIG_DIR: settingsDir,
},
timeout: 10000,
});
// When the plugin is disabled, the script exits with 0 and produces no stdout
// (the early exit at lines 31-33 calls process.exit(0) before any output)
expect(result.status).toBe(0);
// stdout should be empty or valid JSON (not plain text install messages)
const stdout = (result.stdout || '').trim();
if (stdout.length > 0) {
expect(() => JSON.parse(stdout)).not.toThrow();
}
} finally {
rmSync(settingsDir, { recursive: true, force: true });
}
});
});
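These tests guard one contract: stdout carries only JSON, on the success path and in the catch block alike. A minimal sketch of that shape follows. The field names `continue` and `suppressOutput` come from the assertions above, but their values, and the helpers `hookResponse` and `runInstall`, are assumptions for illustration.

```typescript
// Hypothetical sketch of the hook output contract; not the real smart-install.js.
function hookResponse(): string {
  // Assumed values: the tests only check that these field names are present.
  return JSON.stringify({ continue: true, suppressOutput: true });
}

function runInstall(install: () => void): string {
  try {
    install();
  } catch (e) {
    // Diagnostics go to stderr so stdout stays machine-readable.
    console.error(`install failed: ${e instanceof Error ? e.message : String(e)}`);
  }
  // The same JSON is returned on both paths, so the caller can always parse it.
  return hookResponse();
}
```

A caller would print the result once, for example `console.log(runInstall(doInstall))`, where `doInstall` stands in for the real installation steps.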
+123
@@ -0,0 +1,123 @@
import { describe, expect, it } from 'bun:test';
import { sanitizeEnv } from '../../src/supervisor/env-sanitizer.js';
describe('sanitizeEnv', () => {
it('strips variables with CLAUDECODE_ prefix', () => {
const result = sanitizeEnv({
CLAUDECODE_FOO: 'bar',
CLAUDECODE_SOMETHING: 'value',
PATH: '/usr/bin'
});
expect(result.CLAUDECODE_FOO).toBeUndefined();
expect(result.CLAUDECODE_SOMETHING).toBeUndefined();
expect(result.PATH).toBe('/usr/bin');
});
it('strips variables with CLAUDE_CODE_ prefix', () => {
const result = sanitizeEnv({
CLAUDE_CODE_BAR: 'baz',
CLAUDE_CODE_OAUTH_TOKEN: 'token',
HOME: '/home/user'
});
expect(result.CLAUDE_CODE_BAR).toBeUndefined();
expect(result.CLAUDE_CODE_OAUTH_TOKEN).toBeUndefined();
expect(result.HOME).toBe('/home/user');
});
it('strips exact-match variables (CLAUDECODE, CLAUDE_CODE_SESSION, CLAUDE_CODE_ENTRYPOINT, MCP_SESSION_ID)', () => {
const result = sanitizeEnv({
CLAUDECODE: '1',
CLAUDE_CODE_SESSION: 'session-123',
CLAUDE_CODE_ENTRYPOINT: 'hook',
MCP_SESSION_ID: 'mcp-abc',
NODE_PATH: '/usr/local/lib'
});
expect(result.CLAUDECODE).toBeUndefined();
expect(result.CLAUDE_CODE_SESSION).toBeUndefined();
expect(result.CLAUDE_CODE_ENTRYPOINT).toBeUndefined();
expect(result.MCP_SESSION_ID).toBeUndefined();
expect(result.NODE_PATH).toBe('/usr/local/lib');
});
it('preserves allowed variables like PATH, HOME, NODE_PATH', () => {
const result = sanitizeEnv({
PATH: '/usr/bin:/usr/local/bin',
HOME: '/home/user',
NODE_PATH: '/usr/local/lib/node_modules',
SHELL: '/bin/zsh',
USER: 'developer',
LANG: 'en_US.UTF-8'
});
expect(result.PATH).toBe('/usr/bin:/usr/local/bin');
expect(result.HOME).toBe('/home/user');
expect(result.NODE_PATH).toBe('/usr/local/lib/node_modules');
expect(result.SHELL).toBe('/bin/zsh');
expect(result.USER).toBe('developer');
expect(result.LANG).toBe('en_US.UTF-8');
});
it('returns a new object and does not mutate the original', () => {
const original: NodeJS.ProcessEnv = {
PATH: '/usr/bin',
CLAUDECODE_FOO: 'bar',
KEEP: 'yes'
};
const originalCopy = { ...original };
const result = sanitizeEnv(original);
// Result should be a different object
expect(result).not.toBe(original);
// Original should be unchanged
expect(original).toEqual(originalCopy);
// Result should not contain stripped vars
expect(result.CLAUDECODE_FOO).toBeUndefined();
expect(result.PATH).toBe('/usr/bin');
});
it('handles empty env gracefully', () => {
const result = sanitizeEnv({});
expect(result).toEqual({});
});
it('skips entries with undefined values', () => {
const env: NodeJS.ProcessEnv = {
DEFINED: 'value',
UNDEFINED_KEY: undefined
};
const result = sanitizeEnv(env);
expect(result.DEFINED).toBe('value');
expect('UNDEFINED_KEY' in result).toBe(false);
});
it('combines prefix and exact match removal in a single pass', () => {
const result = sanitizeEnv({
PATH: '/usr/bin',
CLAUDECODE: '1',
CLAUDECODE_FOO: 'bar',
CLAUDE_CODE_BAR: 'baz',
CLAUDE_CODE_OAUTH_TOKEN: 'oauth-token',
CLAUDE_CODE_SESSION: 'session',
CLAUDE_CODE_ENTRYPOINT: 'entry',
MCP_SESSION_ID: 'mcp',
KEEP_ME: 'yes'
});
expect(result.PATH).toBe('/usr/bin');
expect(result.KEEP_ME).toBe('yes');
expect(result.CLAUDECODE).toBeUndefined();
expect(result.CLAUDECODE_FOO).toBeUndefined();
expect(result.CLAUDE_CODE_BAR).toBeUndefined();
expect(result.CLAUDE_CODE_OAUTH_TOKEN).toBeUndefined();
expect(result.CLAUDE_CODE_SESSION).toBeUndefined();
expect(result.CLAUDE_CODE_ENTRYPOINT).toBeUndefined();
expect(result.MCP_SESSION_ID).toBeUndefined();
});
});
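Taken together, these tests describe a filter with two prefix rules, four exact-match rules, and no mutation of the input. The sketch below satisfies those cases. It is reconstructed from the test expectations only, so the real `sanitizeEnv` in src/supervisor/env-sanitizer may be structured differently; `sanitizeEnvSketch` and the constant names are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Rules taken from the test names above; the constant names are hypothetical.
const BLOCKED_PREFIXES = ['CLAUDECODE_', 'CLAUDE_CODE_'];
const BLOCKED_EXACT = new Set([
  'CLAUDECODE',
  'CLAUDE_CODE_SESSION',
  'CLAUDE_CODE_ENTRYPOINT',
  'MCP_SESSION_ID',
]);

// Returns a new object; the input is never modified.
function sanitizeEnvSketch(env: Env): Env {
  const result: Env = {};
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined) continue; // skip unset entries entirely
    if (BLOCKED_EXACT.has(key)) continue;
    if (BLOCKED_PREFIXES.some((prefix) => key.startsWith(prefix))) continue;
    result[key] = value;
  }
  return result;
}
```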
+73
@@ -0,0 +1,73 @@
import { afterEach, describe, expect, it, mock } from 'bun:test';
import { startHealthChecker, stopHealthChecker } from '../../src/supervisor/health-checker.js';
describe('health-checker', () => {
afterEach(() => {
// Always stop the checker to avoid leaking intervals between tests
stopHealthChecker();
});
it('startHealthChecker sets up an interval without throwing', () => {
expect(() => startHealthChecker()).not.toThrow();
});
it('stopHealthChecker clears the interval without throwing', () => {
startHealthChecker();
expect(() => stopHealthChecker()).not.toThrow();
});
it('stopHealthChecker is safe to call when no checker is running', () => {
expect(() => stopHealthChecker()).not.toThrow();
});
it('multiple startHealthChecker calls do not create multiple intervals', () => {
// Track setInterval calls
const originalSetInterval = globalThis.setInterval;
let setIntervalCallCount = 0;
globalThis.setInterval = ((...args: Parameters<typeof setInterval>) => {
setIntervalCallCount++;
return originalSetInterval(...args);
}) as typeof setInterval;
try {
// Stop any existing checker first to ensure clean state
stopHealthChecker();
setIntervalCallCount = 0;
startHealthChecker();
startHealthChecker();
startHealthChecker();
// Only one interval should have been created due to the guard
expect(setIntervalCallCount).toBe(1);
} finally {
globalThis.setInterval = originalSetInterval;
}
});
it('stopHealthChecker after start allows restarting', () => {
const originalSetInterval = globalThis.setInterval;
let setIntervalCallCount = 0;
globalThis.setInterval = ((...args: Parameters<typeof setInterval>) => {
setIntervalCallCount++;
return originalSetInterval(...args);
}) as typeof setInterval;
try {
stopHealthChecker();
setIntervalCallCount = 0;
startHealthChecker();
expect(setIntervalCallCount).toBe(1);
stopHealthChecker();
startHealthChecker();
expect(setIntervalCallCount).toBe(2);
} finally {
globalThis.setInterval = originalSetInterval;
}
});
});
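The last two tests rely on a guard: starting twice creates one interval, and stopping clears the guard so a later start creates a new one. A minimal sketch of that guard follows, assuming a module-level timer handle. The function names, the injected `check` callback, and the default interval are all hypothetical, since the real health-checker module is not shown in this diff.

```typescript
// Hypothetical sketch of an idempotent start/stop pair; not the real health-checker.
let timer: ReturnType<typeof setInterval> | null = null;
let timersCreated = 0; // counter kept only so the behaviour can be observed

function startCheckerSketch(check: () => void, intervalMs = 30000): void {
  if (timer !== null) return; // guard: a checker is already running
  timer = setInterval(check, intervalMs);
  timersCreated++;
}

function stopCheckerSketch(): void {
  if (timer === null) return; // safe to call when nothing is running
  clearInterval(timer);
  timer = null; // resetting the handle is what allows a later restart
}
```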
+111
@@ -0,0 +1,111 @@
import { afterEach, describe, expect, it } from 'bun:test';
import { mkdirSync, rmSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import path from 'path';
import { validateWorkerPidFile, type ValidateWorkerPidStatus } from '../../src/supervisor/index.js';
function makeTempDir(): string {
const dir = path.join(tmpdir(), `claude-mem-index-${Date.now()}-${Math.random().toString(36).slice(2)}`);
mkdirSync(dir, { recursive: true });
return dir;
}
const tempDirs: string[] = [];
describe('validateWorkerPidFile', () => {
afterEach(() => {
while (tempDirs.length > 0) {
const dir = tempDirs.pop();
if (dir) {
rmSync(dir, { recursive: true, force: true });
}
}
});
it('returns "missing" when PID file does not exist', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const pidFilePath = path.join(tempDir, 'worker.pid');
const status = validateWorkerPidFile({ logAlive: false, pidFilePath });
expect(status).toBe('missing');
});
it('returns "invalid" when PID file contains bad JSON', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const pidFilePath = path.join(tempDir, 'worker.pid');
writeFileSync(pidFilePath, 'not-json!!!');
const status = validateWorkerPidFile({ logAlive: false, pidFilePath });
expect(status).toBe('invalid');
});
it('returns "stale" when PID file references a dead process', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const pidFilePath = path.join(tempDir, 'worker.pid');
writeFileSync(pidFilePath, JSON.stringify({
pid: 2147483647,
port: 37777,
startedAt: new Date().toISOString()
}));
const status = validateWorkerPidFile({ logAlive: false, pidFilePath });
expect(status).toBe('stale');
});
it('returns "alive" when PID file references the current process', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const pidFilePath = path.join(tempDir, 'worker.pid');
writeFileSync(pidFilePath, JSON.stringify({
pid: process.pid,
port: 37777,
startedAt: new Date().toISOString()
}));
const status = validateWorkerPidFile({ logAlive: false, pidFilePath });
expect(status).toBe('alive');
});
});
describe('Supervisor assertCanSpawn behavior', () => {
it('assertCanSpawn does not throw when no shutdown is in progress', () => {
const { getSupervisor } = require('../../src/supervisor/index.js');
const supervisor = getSupervisor();
// When not shutting down, assertCanSpawn should not throw
expect(() => supervisor.assertCanSpawn('test')).not.toThrow();
});
it('registerProcess and unregisterProcess delegate to the registry', () => {
const { getSupervisor } = require('../../src/supervisor/index.js');
const supervisor = getSupervisor();
const registry = supervisor.getRegistry();
const testId = `test-${Date.now()}`;
supervisor.registerProcess(testId, {
pid: process.pid,
type: 'test',
startedAt: new Date().toISOString()
});
const found = registry.getAll().find((r: { id: string }) => r.id === testId);
expect(found).toBeDefined();
expect(found?.type).toBe('test');
supervisor.unregisterProcess(testId);
const afterUnregister = registry.getAll().find((r: { id: string }) => r.id === testId);
expect(afterUnregister).toBeUndefined();
});
});
describe('Supervisor singleton', () => {
it('getSupervisor returns the same instance', () => {
const { getSupervisor } = require('../../src/supervisor/index.js');
const s1 = getSupervisor();
const s2 = getSupervisor();
expect(s1).toBe(s2);
});
});
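The `validateWorkerPidFile` tests distinguish four outcomes: `missing`, `invalid`, `stale`, and `alive`. The decision order can be sketched as a pure function that takes the file contents and a liveness predicate. `classifyPidFile` and `PidStatus` are hypothetical names, the real function reads the file and checks the process itself, and treating a non-positive or non-integer `pid` as `invalid` is an assumption.

```typescript
type PidStatus = 'missing' | 'invalid' | 'stale' | 'alive';

// Hypothetical sketch: `contents` is null when the PID file does not exist,
// and `isAlive` stands in for a real process liveness check.
function classifyPidFile(
  contents: string | null,
  isAlive: (pid: number) => boolean
): PidStatus {
  if (contents === null) return 'missing';
  let parsed: unknown;
  try {
    parsed = JSON.parse(contents);
  } catch {
    return 'invalid'; // unparseable file
  }
  const pid = (parsed as { pid?: unknown } | null)?.pid;
  // Assumption: a missing, non-integer, or non-positive pid counts as invalid.
  if (typeof pid !== 'number' || !Number.isInteger(pid) || pid <= 0) return 'invalid';
  return isAlive(pid) ? 'alive' : 'stale';
}
```

Injecting the liveness check keeps the classification testable without depending on which PIDs happen to exist on the machine.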
+423
@@ -0,0 +1,423 @@
import { afterEach, describe, expect, it } from 'bun:test';
import { existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import path from 'path';
import { createProcessRegistry, isPidAlive } from '../../src/supervisor/process-registry.js';
function makeTempDir(): string {
return path.join(tmpdir(), `claude-mem-supervisor-${Date.now()}-${Math.random().toString(36).slice(2)}`);
}
const tempDirs: string[] = [];
describe('supervisor ProcessRegistry', () => {
afterEach(() => {
while (tempDirs.length > 0) {
const dir = tempDirs.pop();
if (dir) {
rmSync(dir, { recursive: true, force: true });
}
}
});
describe('isPidAlive', () => {
it('treats current process as alive', () => {
expect(isPidAlive(process.pid)).toBe(true);
});
it('treats an impossibly high PID as dead', () => {
expect(isPidAlive(2147483647)).toBe(false);
});
it('treats negative PID as dead', () => {
expect(isPidAlive(-1)).toBe(false);
});
it('treats non-integer PID as dead', () => {
expect(isPidAlive(3.14)).toBe(false);
});
});
describe('persistence', () => {
it('persists entries to disk and reloads them on initialize', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registryPath = path.join(tempDir, 'supervisor.json');
// Create a registry, register an entry, and let it persist
const registry1 = createProcessRegistry(registryPath);
registry1.register('worker:1', {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
});
// Verify file exists on disk
expect(existsSync(registryPath)).toBe(true);
const diskData = JSON.parse(readFileSync(registryPath, 'utf-8'));
expect(diskData.processes['worker:1']).toBeDefined();
// Create a second registry from the same path — it should load the persisted entry
const registry2 = createProcessRegistry(registryPath);
registry2.initialize();
const records = registry2.getAll();
expect(records).toHaveLength(1);
expect(records[0]?.id).toBe('worker:1');
expect(records[0]?.pid).toBe(process.pid);
});
it('prunes dead processes on initialize', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registryPath = path.join(tempDir, 'supervisor.json');
writeFileSync(registryPath, JSON.stringify({
processes: {
alive: {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
},
dead: {
pid: 2147483647,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
}
}
}));
const registry = createProcessRegistry(registryPath);
registry.initialize();
const records = registry.getAll();
expect(records).toHaveLength(1);
expect(records[0]?.id).toBe('alive');
expect(existsSync(registryPath)).toBe(true);
});
it('handles corrupted registry file gracefully', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registryPath = path.join(tempDir, 'supervisor.json');
writeFileSync(registryPath, '{ not valid json!!!');
const registry = createProcessRegistry(registryPath);
registry.initialize();
// Should recover with an empty registry
expect(registry.getAll()).toHaveLength(0);
});
});
describe('register and unregister', () => {
it('register adds an entry retrievable by getAll', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
expect(registry.getAll()).toHaveLength(0);
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
const records = registry.getAll();
expect(records).toHaveLength(1);
expect(records[0]?.id).toBe('sdk:1');
expect(records[0]?.type).toBe('sdk');
});
it('unregister removes an entry', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
expect(registry.getAll()).toHaveLength(1);
registry.unregister('sdk:1');
expect(registry.getAll()).toHaveLength(0);
});
it('unregister is a no-op for unknown IDs', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.unregister('nonexistent');
expect(registry.getAll()).toHaveLength(1);
});
});
describe('getAll', () => {
it('returns records sorted by startedAt ascending', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('newest', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:02.000Z'
});
registry.register('oldest', {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('middle', {
pid: process.pid,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
});
const records = registry.getAll();
expect(records).toHaveLength(3);
expect(records[0]?.id).toBe('oldest');
expect(records[1]?.id).toBe('middle');
expect(records[2]?.id).toBe('newest');
});
it('returns empty array when no entries exist', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
expect(registry.getAll()).toEqual([]);
});
});
describe('getBySession', () => {
it('filters records by session id', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
sessionId: 42,
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('sdk:2', {
pid: process.pid,
type: 'sdk',
sessionId: 'other',
startedAt: '2026-03-15T00:00:01.000Z'
});
const records = registry.getBySession(42);
expect(records).toHaveLength(1);
expect(records[0]?.id).toBe('sdk:1');
});
it('returns empty array when no processes match the session', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
sessionId: 42,
startedAt: '2026-03-15T00:00:00.000Z'
});
expect(registry.getBySession(999)).toHaveLength(0);
});
it('matches string and numeric session IDs by string comparison', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
sessionId: '42',
startedAt: '2026-03-15T00:00:00.000Z'
});
// Querying with number should find string "42"
expect(registry.getBySession(42)).toHaveLength(1);
});
});
describe('pruneDeadEntries', () => {
it('removes entries with dead PIDs and preserves live ones', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registryPath = path.join(tempDir, 'supervisor.json');
const registry = createProcessRegistry(registryPath);
registry.register('alive', {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('dead', {
pid: 2147483647,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
});
const removed = registry.pruneDeadEntries();
expect(removed).toBe(1);
expect(registry.getAll()).toHaveLength(1);
expect(registry.getAll()[0]?.id).toBe('alive');
});
it('returns 0 when all entries are alive', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('alive', {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
});
const removed = registry.pruneDeadEntries();
expect(removed).toBe(0);
expect(registry.getAll()).toHaveLength(1);
});
it('persists changes to disk after pruning', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registryPath = path.join(tempDir, 'supervisor.json');
const registry = createProcessRegistry(registryPath);
registry.register('dead', {
pid: 2147483647,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
});
registry.pruneDeadEntries();
const diskData = JSON.parse(readFileSync(registryPath, 'utf-8'));
expect(Object.keys(diskData.processes)).toHaveLength(0);
});
});
describe('clear', () => {
it('removes all entries', () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registryPath = path.join(tempDir, 'supervisor.json');
const registry = createProcessRegistry(registryPath);
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('sdk:2', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:01.000Z'
});
expect(registry.getAll()).toHaveLength(2);
registry.clear();
expect(registry.getAll()).toHaveLength(0);
// Verify persisted to disk
const diskData = JSON.parse(readFileSync(registryPath, 'utf-8'));
expect(Object.keys(diskData.processes)).toHaveLength(0);
});
});
describe('createProcessRegistry', () => {
it('creates an isolated instance with a custom path', () => {
const tempDir1 = makeTempDir();
const tempDir2 = makeTempDir();
tempDirs.push(tempDir1, tempDir2);
const registry1 = createProcessRegistry(path.join(tempDir1, 'supervisor.json'));
const registry2 = createProcessRegistry(path.join(tempDir2, 'supervisor.json'));
registry1.register('sdk:1', {
pid: process.pid,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
// registry2 should be independent
expect(registry1.getAll()).toHaveLength(1);
expect(registry2.getAll()).toHaveLength(0);
});
});
describe('reapSession', () => {
it('unregisters dead processes for the given session', async () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:99:50001', {
pid: 2147483640,
type: 'sdk',
sessionId: 99,
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('mcp:99:50002', {
pid: 2147483641,
type: 'mcp',
sessionId: 99,
startedAt: '2026-03-15T00:00:01.000Z'
});
// Register a process for a different session (should survive)
registry.register('sdk:100:50003', {
pid: process.pid,
type: 'sdk',
sessionId: 100,
startedAt: '2026-03-15T00:00:02.000Z'
});
const reaped = await registry.reapSession(99);
expect(reaped).toBe(2);
expect(registry.getBySession(99)).toHaveLength(0);
expect(registry.getBySession(100)).toHaveLength(1);
});
it('returns 0 when no processes match the session', async () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('sdk:1', {
pid: process.pid,
type: 'sdk',
sessionId: 42,
startedAt: '2026-03-15T00:00:00.000Z'
});
const reaped = await registry.reapSession(999);
expect(reaped).toBe(0);
expect(registry.getAll()).toHaveLength(1);
});
});
});
+186
@@ -0,0 +1,186 @@
import { afterEach, describe, expect, it } from 'bun:test';
import { mkdirSync, readFileSync, rmSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import path from 'path';
import { createProcessRegistry } from '../../src/supervisor/process-registry.js';
import { runShutdownCascade } from '../../src/supervisor/shutdown.js';
function makeTempDir(): string {
return path.join(tmpdir(), `claude-mem-shutdown-${Date.now()}-${Math.random().toString(36).slice(2)}`);
}
const tempDirs: string[] = [];
describe('supervisor shutdown cascade', () => {
afterEach(() => {
while (tempDirs.length > 0) {
const dir = tempDirs.pop();
if (dir) {
rmSync(dir, { recursive: true, force: true });
}
}
});
it('removes child records and pid file', async () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registryPath = path.join(tempDir, 'supervisor.json');
const pidFilePath = path.join(tempDir, 'worker.pid');
writeFileSync(pidFilePath, JSON.stringify({
pid: process.pid,
port: 37777,
startedAt: new Date().toISOString()
}));
const registry = createProcessRegistry(registryPath);
registry.register('worker', {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('dead-child', {
pid: 2147483647,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
});
await runShutdownCascade({
registry,
currentPid: process.pid,
pidFilePath
});
const persisted = JSON.parse(readFileSync(registryPath, 'utf-8'));
expect(Object.keys(persisted.processes)).toHaveLength(0);
expect(() => readFileSync(pidFilePath, 'utf-8')).toThrow();
});
it('terminates tracked children in reverse spawn order', async () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registry = createProcessRegistry(path.join(tempDir, 'supervisor.json'));
registry.register('oldest', {
pid: 41001,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('middle', {
pid: 41002,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
});
registry.register('newest', {
pid: 41003,
type: 'chroma',
startedAt: '2026-03-15T00:00:02.000Z'
});
const originalKill = process.kill;
const alive = new Set([41001, 41002, 41003]);
const calls: Array<{ pid: number; signal: NodeJS.Signals | number }> = [];
process.kill = ((pid: number, signal?: NodeJS.Signals | number) => {
const normalizedSignal = signal ?? 'SIGTERM';
if (normalizedSignal === 0) {
if (!alive.has(pid)) {
const error = new Error(`kill ESRCH ${pid}`) as NodeJS.ErrnoException;
error.code = 'ESRCH';
throw error;
}
return true;
}
calls.push({ pid, signal: normalizedSignal });
alive.delete(pid);
return true;
}) as typeof process.kill;
try {
await runShutdownCascade({
registry,
currentPid: process.pid,
pidFilePath: path.join(tempDir, 'worker.pid')
});
} finally {
process.kill = originalKill;
}
expect(calls).toEqual([
{ pid: 41003, signal: 'SIGTERM' },
{ pid: 41002, signal: 'SIGTERM' },
{ pid: 41001, signal: 'SIGTERM' }
]);
});
it('handles already-dead processes gracefully without throwing', async () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registryPath = path.join(tempDir, 'supervisor.json');
const registry = createProcessRegistry(registryPath);
// Register processes with PIDs near the 32-bit maximum, which should not correspond to live processes
registry.register('dead:1', {
pid: 2147483640,
type: 'sdk',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('dead:2', {
pid: 2147483641,
type: 'mcp',
startedAt: '2026-03-15T00:00:01.000Z'
});
// Should not throw
await runShutdownCascade({
registry,
currentPid: process.pid,
pidFilePath: path.join(tempDir, 'worker.pid')
});
// All entries should be unregistered
const persisted = JSON.parse(readFileSync(registryPath, 'utf-8'));
expect(Object.keys(persisted.processes)).toHaveLength(0);
});
it('unregisters all children from registry after cascade', async () => {
const tempDir = makeTempDir();
tempDirs.push(tempDir);
mkdirSync(tempDir, { recursive: true });
const registryPath = path.join(tempDir, 'supervisor.json');
const registry = createProcessRegistry(registryPath);
registry.register('worker', {
pid: process.pid,
type: 'worker',
startedAt: '2026-03-15T00:00:00.000Z'
});
registry.register('child:1', {
pid: 2147483640,
type: 'sdk',
startedAt: '2026-03-15T00:00:01.000Z'
});
registry.register('child:2', {
pid: 2147483641,
type: 'mcp',
startedAt: '2026-03-15T00:00:02.000Z'
});
await runShutdownCascade({
registry,
currentPid: process.pid,
pidFilePath: path.join(tempDir, 'worker.pid')
});
// All records (including the current process one) should be removed
expect(registry.getAll()).toHaveLength(0);
});
});
+18
@@ -14,6 +14,24 @@ mock.module('../../src/utils/logger.js', () => ({
},
}));
// Mock worker-utils to delegate workerHttpRequest to global.fetch
mock.module('../../src/shared/worker-utils.js', () => ({
getWorkerPort: () => 37777,
getWorkerHost: () => '127.0.0.1',
workerHttpRequest: (apiPath: string, options?: any) => {
const url = `http://127.0.0.1:37777${apiPath}`;
return globalThis.fetch(url, {
method: options?.method ?? 'GET',
headers: options?.headers,
body: options?.body,
});
},
clearPortCache: () => {},
ensureWorkerRunning: () => Promise.resolve(true),
fetchWithTimeout: (url: string, init: any, _timeoutMs: number) => globalThis.fetch(url, init),
buildWorkerUrl: (apiPath: string) => `http://127.0.0.1:37777${apiPath}`,
}));
// Import after mocks
import {
replaceTaggedContent,
+204
@@ -0,0 +1,204 @@
import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
import { EventEmitter } from 'events';
import {
registerProcess,
unregisterProcess,
getProcessBySession,
getActiveCount,
getActiveProcesses,
waitForSlot,
ensureProcessExit,
} from '../../src/services/worker/ProcessRegistry.js';
/**
* Create a mock ChildProcess that behaves like a real one for testing.
* Supports exitCode, killed, kill(), and event emission.
*/
function createMockProcess(overrides: { exitCode?: number | null; killed?: boolean } = {}) {
const emitter = new EventEmitter();
const mock = Object.assign(emitter, {
pid: Math.floor(Math.random() * 100000) + 1000,
exitCode: overrides.exitCode ?? null,
killed: overrides.killed ?? false,
kill(signal?: string) {
mock.killed = true;
// Simulate async exit after kill
setTimeout(() => {
mock.exitCode = signal === 'SIGKILL' ? null : 0;
mock.emit('exit', mock.exitCode, signal || 'SIGTERM');
}, 10);
return true;
},
stdin: null,
stdout: null,
stderr: null,
});
return mock;
}
// Helper to clear registry between tests by unregistering all
function clearRegistry() {
for (const p of getActiveProcesses()) {
unregisterProcess(p.pid);
}
}
describe('ProcessRegistry', () => {
beforeEach(() => {
clearRegistry();
});
afterEach(() => {
clearRegistry();
});
describe('registerProcess / unregisterProcess', () => {
it('should register and track a process', () => {
const proc = createMockProcess();
registerProcess(proc.pid, 1, proc as any);
expect(getActiveCount()).toBe(1);
expect(getProcessBySession(1)).toBeDefined();
});
it('should unregister a process and free the slot', () => {
const proc = createMockProcess();
registerProcess(proc.pid, 1, proc as any);
unregisterProcess(proc.pid);
expect(getActiveCount()).toBe(0);
expect(getProcessBySession(1)).toBeUndefined();
});
});
describe('getProcessBySession', () => {
it('should return undefined for unknown session', () => {
expect(getProcessBySession(999)).toBeUndefined();
});
it('should find process by session ID', () => {
const proc = createMockProcess();
registerProcess(proc.pid, 42, proc as any);
const found = getProcessBySession(42);
expect(found).toBeDefined();
expect(found!.pid).toBe(proc.pid);
});
});
describe('waitForSlot', () => {
it('should resolve immediately when under limit', async () => {
await waitForSlot(2); // 0 processes, limit 2
});
it('should wait until a slot opens', async () => {
const proc1 = createMockProcess();
const proc2 = createMockProcess();
registerProcess(proc1.pid, 1, proc1 as any);
registerProcess(proc2.pid, 2, proc2 as any);
// Start waiting for slot (limit=2, both slots full)
const waitPromise = waitForSlot(2, 5000);
// Free a slot after 50ms
setTimeout(() => unregisterProcess(proc1.pid), 50);
await waitPromise; // Should resolve once slot freed
expect(getActiveCount()).toBe(1);
});
it('should throw on timeout when no slot opens', async () => {
const proc1 = createMockProcess();
const proc2 = createMockProcess();
registerProcess(proc1.pid, 1, proc1 as any);
registerProcess(proc2.pid, 2, proc2 as any);
await expect(waitForSlot(2, 100)).rejects.toThrow('Timed out waiting for agent pool slot');
});
it('should throw when hard cap (10) is exceeded', async () => {
// Register 10 processes to hit the hard cap
const procs = [];
for (let i = 0; i < 10; i++) {
const proc = createMockProcess();
registerProcess(proc.pid, i + 100, proc as any);
procs.push(proc);
}
await expect(waitForSlot(20)).rejects.toThrow('Hard cap exceeded');
});
});
describe('ensureProcessExit', () => {
it('should unregister immediately if exitCode is set', async () => {
const proc = createMockProcess({ exitCode: 0 });
registerProcess(proc.pid, 1, proc as any);
await ensureProcessExit({ pid: proc.pid, sessionDbId: 1, spawnedAt: Date.now(), process: proc as any });
expect(getActiveCount()).toBe(0);
});
it('should NOT treat proc.killed as exited — must wait for actual exit', async () => {
// This is the core bug fix: proc.killed=true only means a signal was sent; while exitCode is null the process has NOT exited
const proc = createMockProcess({ killed: true, exitCode: null });
registerProcess(proc.pid, 1, proc as any);
// Override kill to simulate SIGKILL + delayed exit
proc.kill = (signal?: string) => {
proc.killed = true;
setTimeout(() => {
proc.exitCode = 0;
proc.emit('exit', 0, signal);
}, 20);
return true;
};
// ensureProcessExit should NOT short-circuit on proc.killed
// It should wait for exit event or timeout, then escalate to SIGKILL
await ensureProcessExit({ pid: proc.pid, sessionDbId: 1, spawnedAt: Date.now(), process: proc as any }, 100);
expect(getActiveCount()).toBe(0);
});
it('should escalate to SIGKILL after timeout', async () => {
const proc = createMockProcess();
registerProcess(proc.pid, 1, proc as any);
// Override kill: only respond to SIGKILL
let sigkillSent = false;
proc.kill = (signal?: string) => {
proc.killed = true;
if (signal === 'SIGKILL') {
sigkillSent = true;
setTimeout(() => {
proc.exitCode = -1;
proc.emit('exit', -1, 'SIGKILL');
}, 10);
}
// Don't emit exit for non-SIGKILL signals (simulates stuck process)
return true;
};
await ensureProcessExit({ pid: proc.pid, sessionDbId: 1, spawnedAt: Date.now(), process: proc as any }, 100);
expect(sigkillSent).toBe(true);
expect(getActiveCount()).toBe(0);
});
it('should unregister even if process ignores SIGKILL (after 1s timeout)', async () => {
const proc = createMockProcess();
registerProcess(proc.pid, 1, proc as any);
// Override kill to never emit exit (completely stuck process)
proc.kill = () => {
proc.killed = true;
return true;
};
const start = Date.now();
await ensureProcessExit({ pid: proc.pid, sessionDbId: 1, spawnedAt: Date.now(), process: proc as any }, 100);
const elapsed = Date.now() - start;
// Waits ~100ms for graceful exit, then up to ~1000ms for the SIGKILL timeout; only the lower bound is asserted
expect(elapsed).toBeGreaterThan(90);
// Process is unregistered regardless (safety net)
expect(getActiveCount()).toBe(0);
});
});
});