fix: sequoia-territory bug-fix bundle (chroma, env, build, MCP, worker) (#2394)
* fix(mcp): drop ${_R%/} parameter-expansion trim that trips Claude Code MCP validator
Claude Code's MCP-config validator misreads the POSIX suffix-removal
expansion ${_R%/} as a required env var named "_R%/", causing /doctor to flag
mcp-search as invalid on every install. Path resolution treats repeated
slashes as a single slash, so the trim was cosmetic. Drop it and the
validator passes.
Fixes #2350, #2354, #2356.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(env): block ANTHROPIC_BASE_URL leak + three-branch OAuth-skip predicate
Issue #2375: parent-shell ANTHROPIC_BASE_URL leaked through to subprocess
isolatedEnv, while ANTHROPIC_AUTH_TOKEN was blocked. The OAuth-skip
predicate fired on bare BASE_URL, but no auth credential reached the
subprocess -> "Not logged in". Add ANTHROPIC_BASE_URL to BLOCKED_ENV_VARS
so it can only enter isolatedEnv via ~/.claude-mem/.env.
Replace the OAuth-skip predicate with three branches to prevent a
second-order security regression: a user with a tokenless gateway
configured in .env (BASE_URL only, no token) would otherwise have their
Anthropic OAuth token fetched and sent to their gateway, leaking the
token to a third party. Three-branch predicate:
1. BASE_URL set -> return without OAuth (custom gateway, never leak token)
2. API_KEY or AUTH_TOKEN set -> return without OAuth (explicit credentials)
3. Otherwise -> OAuth lookup for api.anthropic.com
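The three branches above can be sketched as a small pure function. This is an illustrative sketch only, not the EnvManager code: `decideAuth`, `AuthDecision`, and `IsolatedEnv` are hypothetical names, and the real predicate may check additional conditions.

```typescript
// Hypothetical types and names, introduced for illustration only.
type IsolatedEnv = Record<string, string | undefined>;
type AuthDecision = "custom-gateway" | "explicit-credentials" | "oauth-lookup";

function decideAuth(env: IsolatedEnv): AuthDecision {
  // Branch 1: a custom gateway is configured, so the OAuth token is never fetched for it.
  if (env.ANTHROPIC_BASE_URL) return "custom-gateway";
  // Branch 2: explicit credentials are present, so OAuth is not needed.
  if (env.ANTHROPIC_API_KEY || env.ANTHROPIC_AUTH_TOKEN) return "explicit-credentials";
  // Branch 3: nothing configured, fall back to OAuth for api.anthropic.com.
  return "oauth-lookup";
}
```

Checking BASE_URL first is what closes the leak: a tokenless gateway never reaches the OAuth branch, whatever else is set.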
Adds tests/env-isolation.test.ts.
Fixes #2375.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(worker): classify Claude SDK HTTP 400 as unrecoverable
ClaudeProvider previously had no explicit HTTP 400 handling — the
default branch classified all errors as `transient`, so a permanent
400 (e.g., model rejecting an `effort` parameter forwarded from a
leaked CLAUDE_CODE_EFFORT_LEVEL) would be retried indefinitely
(#1874+ retries observed in one session per #2357).
Mirror GeminiProvider/OpenRouterProvider's pattern: classify 400 as
`unrecoverable`, 401/403 as `auth_invalid`, 429 as `rate_limit`,
default to `transient`. When the 400 body matches the
"effort parameter" signature, emit a one-time SDK warn log pointing
at the env-leak fix in ~/.claude-mem/.env.
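A minimal sketch of the status mapping described above, assuming the provider can see the HTTP status and response body. `classifyHttpStatus` and `shouldWarnAboutEffort` are hypothetical names, and the substring check stands in for whatever signature match the real classifier uses.

```typescript
type ErrorClass = "unrecoverable" | "auth_invalid" | "rate_limit" | "transient";

// Map an HTTP status to a retry class; anything unrecognised stays retryable.
function classifyHttpStatus(status: number | undefined): ErrorClass {
  switch (status) {
    case 400:
      return "unrecoverable"; // permanent client error: retrying cannot help
    case 401:
    case 403:
      return "auth_invalid";
    case 429:
      return "rate_limit";
    default:
      return "transient";
  }
}

// One-time warning gate. The "effort" substring test is an assumption,
// not the actual signature used by ClaudeProvider.
let effortWarningEmitted = false;
function shouldWarnAboutEffort(status: number | undefined, body: string): boolean {
  if (effortWarningEmitted || status !== 400) return false;
  if (!body.toLowerCase().includes("effort")) return false;
  effortWarningEmitted = true;
  return true;
}
```

The module-level flag keeps the hint to a single log line even if several requests fail the same way before the user fixes ~/.claude-mem/.env.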
Adds tests/claude-provider-error-classifier.test.ts.
Fixes #2357.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chroma): pin onnxruntime>=1.20 + protobuf<7 to fix INVALID_PROTOBUF on macOS arm64
The shipped all-MiniLM-L6-v2 model has pytorch-2.0 IR. chroma-mcp 0.2.6
transitively depends on `chromadb>=1.0.16` which only requires
`onnxruntime>=1.14.1` — uv can therefore resolve to an onnxruntime old
enough to fail every embedding add with `[ONNXRuntimeError] : 7 :
INVALID_PROTOBUF` on macOS arm64 / Python 3.13. Semantic search silently
degraded to FTS-only and smart backfill broke (#2371).
Path B (override) was required because chroma-mcp 0.2.6 is the latest
PyPI release — no upstream bump exists.
Inject `--with onnxruntime>=1.20 --with protobuf<7` into the uvx spawn
args (both persistent and remote modes). The protobuf cap is essential:
forcing only `onnxruntime>=1.20` causes uv to re-resolve and land on
protobuf 7.x, which trips opentelemetry's `_pb2` stubs with `TypeError:
Descriptors cannot be created directly` because they were generated
with protoc <3.19. Capping below 7 lands on protobuf 6.x which
opentelemetry tolerates.
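The injection can be sketched as follows. The `flatMap` construction mirrors the diff in this commit; `buildUvxArgs` is a hypothetical wrapper used only to make the sketch self-contained.

```typescript
// Dependency overrides passed to uvx as repeated `--with <spec>` pairs.
const DEP_OVERRIDES: ReadonlyArray<string> = ["onnxruntime>=1.20", "protobuf<7"];

// Hypothetical helper: builds the leading uvx arguments shared by both modes.
function buildUvxArgs(pythonVersion: string, chromaMcpVersion: string): string[] {
  const overrideFlags = DEP_OVERRIDES.flatMap((spec) => ["--with", spec]);
  return ["--python", pythonVersion, ...overrideFlags, `chroma-mcp==${chromaMcpVersion}`];
}
```

Because the specs travel as separate argv entries rather than through a shell, the `>=` and `<` characters need no quoting here, unlike in the manual command line.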
Verified end-to-end: ONNX model loads, embeddings produce a 384-dim
vector, PersistentClient init / add / query roundtrip succeeds:
uvx --python 3.13 --with "onnxruntime>=1.20" --with "protobuf<7" \
chroma-mcp==0.2.6 --help # clean
# programmatic test: onnxruntime 1.26.0, protobuf 6.33.6,
# embedding ok 384, query ok ids=[['1']]
Fixes #2371.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chroma): enforce single chroma-mcp subprocess per worker (#2313)
Root cause: every reconnect path in ChromaMcpManager — connectInternal's
re-entry, the connect-timeout catch, callTool's transport-error retry, and
the transport.onclose handler — used to abandon `this.transport`/`this.client`
by calling at most `transport.close()` and nulling the handles. The MCP SDK's
StdioClientTransport.close() only signals the direct child (uvx); on Linux the
grandchildren (uv -> python -> chroma-mcp) re-parent to init and survive
because the SDK does not put the subprocess in its own process group. Each
reconnect therefore leaked a full chroma-mcp tree, accumulating 20+ instances
per session.
Fix: introduce a private disposeCurrentSubprocess() helper that always tree-
kills via the existing killProcessTree primitive before nulling the transport
reference, and route every "abandon current transport" path (reconnect,
connect-timeout, transport error, onclose, stop) through it. The existing
`connecting: Promise<void> | null` lock continues to serialize concurrent
ensureConnected() callers into a single spawn.
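How a `connecting` promise serializes concurrent callers can be shown with a stripped-down sketch. `SingleFlightConnector` and `spawnCount` are hypothetical; the timer stands in for spawning and handshaking with the subprocess, and no tree-kill logic is modelled.

```typescript
class SingleFlightConnector {
  private connecting: Promise<void> | null = null;
  private connected = false;
  public spawnCount = 0; // counts simulated subprocess spawns

  async ensureConnected(): Promise<void> {
    if (this.connected) return;
    // A connect is already in flight: join it instead of spawning again.
    if (this.connecting) return this.connecting;
    const inFlight = this.connectInternal().finally(() => {
      this.connecting = null; // release the lock whether connect succeeded or failed
    });
    this.connecting = inFlight;
    return inFlight;
  }

  private async connectInternal(): Promise<void> {
    this.spawnCount += 1; // stand-in for spawning the subprocess
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    this.connected = true;
  }
}
```

Five calls made back to back all receive the same in-flight promise, so only the first one reaches `connectInternal()`.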
Adds tests/services/sync/chroma-mcp-manager-singleton.test.ts covering:
- 5 parallel ensureConnected() calls produce exactly one spawn
- a transport-error reconnect tree-kills the prior subprocess pid before
spawning a replacement
- stop() disposes state including any pending connecting promise
Manual verification needed on Linux: after a long session with multiple
tool uses, `ps aux | grep chroma-mcp | wc -l` should return 1, not 20+.
Fixes #2313.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(build): polyfill import.meta.url to __filename in CJS worker bundle
The worker bundles ESM dependencies (notably @anthropic-ai/claude-agent-sdk's
*.mjs files) into CJS output. Those modules call createRequire(import.meta.url)
at module-load time. esbuild's CJS output left this as createRequire(ute.url)
— where `ute` is its `import.meta` polyfill `{}` — so `ute.url` was undefined
and module-load crashed with:
TypeError: The argument 'filename' must be a file URL object, file URL
string, or absolute path string. Received undefined
code: ERR_INVALID_ARG_VALUE
Every Stop hook and every worker subprocess invocation hit this. Fix is the
esbuild `define` option mapping `import.meta.url` to `__filename` (provided as
a real absolute path by the existing CJS prelude in the banner).
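The failure mode can be reproduced in isolation. This is a sketch assuming a Node.js-compatible runtime; `tryCreateRequire` is a hypothetical helper and the bundle path is made up (createRequire does not need the file to exist).

```typescript
import { createRequire } from "node:module";
import { join } from "node:path";

// Returns "ok" when createRequire accepts the argument, otherwise the error code.
function tryCreateRequire(filename: unknown): string {
  try {
    createRequire(filename as string);
    return "ok";
  } catch (error) {
    return (error as { code?: string }).code ?? "unknown";
  }
}

// What the broken bundle effectively did: import.meta.url evaluated to undefined.
const withUndefined = tryCreateRequire(undefined);

// What the polyfill provides: an absolute path string.
const withAbsolutePath = tryCreateRequire(join(process.cwd(), "worker-service.cjs"));

console.log({ withUndefined, withAbsolutePath });
```

On Node.js the first call is the one that surfaces the ERR_INVALID_ARG_VALUE code quoted in the stack trace above.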
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: daily dep bump per CLAUDE.md maintenance policy
Root: @anthropic-ai/claude-agent-sdk, @clack/prompts, @types/node,
dompurify, postcss, react, react-dom, yaml, zod.
plugin/: tree-sitter-cli, zod.
openclaw/: @types/node.
All patch/minor bumps; no major version changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* build: regenerate plugin artifacts after env/chroma/mcp fixes
Built artifacts are committed so the marketplace-installable plugin
ships with the runtime bundles. Picks up:
- d7b145e9 .mcp.json shell-prelude trim drop
- a8cbd651 EnvManager BASE_URL block + 3-branch predicate
- 8cb73b8c ClaudeProvider HTTP 400 unrecoverable classifier
- ecd5b802 ChromaMcpManager onnxruntime/protobuf overrides
- c79324ea ChromaMcpManager singleton enforcement
- e8376f46 esbuild import.meta.url -> __filename polyfill
- a7541d71 daily dep bump
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* build: regenerate plugin artifacts after main merge
Bundles now include both v13.0.0 server-beta runtime (server-beta-service.cjs
+ updated mcp-server.cjs / worker-service.cjs) and this branch's chroma /
env / build / Claude SDK fixes.
Verified: bun test tests/env-isolation.test.ts \
tests/claude-provider-error-classifier.test.ts \
tests/services/sync/chroma-mcp-manager-singleton.test.ts
→ 13/13 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(review): address CodeRabbit findings on PR #2394
1. scripts/build-hooks.js — `import.meta.url` now maps to a file:// URL
(via pathToFileURL(__filename).href in the CJS banner) instead of the
raw __filename path. Preserves URL semantics for any bundled ESM dep
that does `new URL(rel, import.meta.url)`. createRequire still works.
2. src/shared/EnvManager.ts — added envFilePath() that resolves
CLAUDE_MEM_ENV_FILE lazily (falling back to paths.envFile()), and
switched internal load/save call sites to use it. ENV_FILE_PATH is
kept as a deprecated snapshot for back-compat. Lets tests target a
temp file without depending on module-load order.
3. tests/env-isolation.test.ts — redirects to a temp dir via
CLAUDE_MEM_ENV_FILE in beforeAll, removes all mutation of the real
~/.claude-mem/.env, and wraps the OAuth-spy assertion in try/finally
so the spy is always restored even if the test fails.
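Why finding 1 matters can be seen by resolving a relative URL against each kind of base. A sketch assuming a POSIX-style working directory; `resolveSibling` and the file names are hypothetical.

```typescript
import { pathToFileURL } from "node:url";
import { join } from "node:path";

// Hypothetical bundle location, used only as a base for resolution.
const bundlePath = join(process.cwd(), "scripts", "worker-service.cjs");
const importMetaUrl = pathToFileURL(bundlePath).href;

// Resolve a sibling file the way bundled ESM code does with import.meta.url.
function resolveSibling(base: string): string | null {
  try {
    return new URL("./asset.json", base).href;
  } catch {
    return null; // the base was not a valid absolute URL
  }
}

// On POSIX a raw path such as /home/me/scripts/worker-service.cjs is not a URL base.
const fromRawPath = resolveSibling(bundlePath);
// A file:// base resolves to the neighbouring file as expected.
const fromFileUrl = resolveSibling(importMetaUrl);

console.log({ fromRawPath, fromFileUrl });
```

createRequire accepts either form, which is why only the `new URL(rel, import.meta.url)` call sites were affected by the raw-path mapping.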
Verified:
bun test tests/env-isolation.test.ts \
tests/claude-provider-error-classifier.test.ts \
tests/services/sync/chroma-mcp-manager-singleton.test.ts
→ 13/13 pass
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -5,7 +5,7 @@
"command": "sh",
"args": [
"-c",
"_C=\"${CLAUDE_CONFIG_DIR:-$HOME/.claude}\"; _E=\"${CLAUDE_PLUGIN_ROOT:-${PLUGIN_ROOT:-}}\"; _P=$({ [ -n \"$_E\" ] && printf '%s\\n' \"$_E\"; printf '%s\\n' \"$PWD/plugin\" \"$PWD\"; ls -dt \"$HOME/.codex/plugins/cache/claude-mem-local/claude-mem\"/[0-9]*/ \"$HOME/.codex/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ \"$_C/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null; printf '%s\\n' \"$_C/plugins/marketplaces/thedotmack/plugin\"; } | while IFS= read -r _R; do _R=\"${_R%/}\"; [ -d \"$_R/plugin/scripts\" ] && _Q=\"$_R/plugin\" || _Q=\"$_R\"; [ -f \"$_Q/scripts/mcp-server.cjs\" ] && { printf '%s\\n' \"$_Q\"; break; }; done); [ -n \"$_P\" ] || { echo \"claude-mem: mcp server not found\" >&2; exit 1; }; exec node \"$_P/scripts/mcp-server.cjs\""
"_C=\"${CLAUDE_CONFIG_DIR:-$HOME/.claude}\"; _E=\"${CLAUDE_PLUGIN_ROOT:-${PLUGIN_ROOT:-}}\"; _P=$({ [ -n \"$_E\" ] && printf '%s\\n' \"$_E\"; printf '%s\\n' \"$PWD/plugin\" \"$PWD\"; ls -dt \"$HOME/.codex/plugins/cache/claude-mem-local/claude-mem\"/[0-9]*/ \"$HOME/.codex/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ \"$_C/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null; printf '%s\\n' \"$_C/plugins/marketplaces/thedotmack/plugin\"; } | while IFS= read -r _R; do [ -d \"$_R/plugin/scripts\" ] && _Q=\"$_R/plugin\" || _Q=\"$_R\"; [ -f \"$_Q/scripts/mcp-server.cjs\" ] && { printf '%s\\n' \"$_Q\"; break; }; done); [ -n \"$_P\" ] || { echo \"claude-mem: mcp server not found\" >&2; exit 1; }; exec node \"$_P/scripts/mcp-server.cjs\""
]
}
}
@@ -10,7 +10,7 @@
     "test": "tsc && node --test dist/index.test.js"
   },
   "devDependencies": {
-    "@types/node": "^25.6.0",
+    "@types/node": "^25.6.2",
     "typescript": "^6.0.3"
   },
   "openclaw": {
+9
-9
@@ -123,26 +123,26 @@
     "2fa": false
   },
   "dependencies": {
-    "@anthropic-ai/claude-agent-sdk": "^0.2.119",
+    "@anthropic-ai/claude-agent-sdk": "^0.2.138",
     "@better-auth/api-key": "^1.6.9",
-    "@clack/prompts": "^1.2.0",
+    "@clack/prompts": "^1.3.0",
     "@modelcontextprotocol/sdk": "^1.29.0",
     "ansi-to-html": "^0.7.2",
     "better-auth": "^1.6.9",
     "bullmq": "^5.76.6",
     "cors": "^2.8.6",
-    "dompurify": "^3.4.1",
+    "dompurify": "^3.4.2",
     "express": "^5.2.1",
     "glob": "^13.0.6",
     "handlebars": "^4.7.9",
     "ioredis": "^5.10.1",
     "pg": "^8.20.0",
     "picocolors": "^1.1.1",
-    "react": "^19.2.5",
-    "react-dom": "^19.2.5",
+    "react": "^19.2.6",
+    "react-dom": "^19.2.6",
     "shell-quote": "^1.8.3",
-    "yaml": "^2.8.3",
-    "zod": "^4.3.6",
+    "yaml": "^2.8.4",
+    "zod": "^4.4.3",
     "zod-to-json-schema": "^3.25.2"
   },
   "devDependencies": {
@@ -156,7 +156,7 @@
     "@types/cors": "^2.8.19",
     "@types/dompurify": "^3.2.0",
     "@types/express": "^5.0.6",
-    "@types/node": "^25.6.0",
+    "@types/node": "^25.6.2",
     "@types/pg": "^8.20.0",
     "@types/react": "^19.2.14",
     "@types/react-dom": "^19.2.3",
@@ -164,7 +164,7 @@
     "jimp": "^1.6.1",
     "np": "^11.2.0",
     "parse5": "^8.0.1",
-    "postcss": "^8.5.13",
+    "postcss": "^8.5.14",
     "remark-mdx": "^3.1.1",
     "remark-parse": "^11.0.0",
     "tree-sitter-bash": "^0.25.1",
+1
-1
@@ -5,7 +5,7 @@
"command": "sh",
"args": [
"-c",
"_C=\"${CLAUDE_CONFIG_DIR:-$HOME/.claude}\"; _E=\"${CLAUDE_PLUGIN_ROOT:-${PLUGIN_ROOT:-}}\"; _P=$({ [ -n \"$_E\" ] && printf '%s\\n' \"$_E\"; printf '%s\\n' \"$PWD/plugin\" \"$PWD\"; ls -dt \"$HOME/.codex/plugins/cache/claude-mem-local/claude-mem\"/[0-9]*/ \"$HOME/.codex/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ \"$_C/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null; printf '%s\\n' \"$_C/plugins/marketplaces/thedotmack/plugin\"; } | while IFS= read -r _R; do _R=\"${_R%/}\"; [ -d \"$_R/plugin/scripts\" ] && _Q=\"$_R/plugin\" || _Q=\"$_R\"; [ -f \"$_Q/scripts/mcp-server.cjs\" ] && { printf '%s\\n' \"$_Q\"; break; }; done); [ -n \"$_P\" ] || { echo \"claude-mem: mcp server not found\" >&2; exit 1; }; exec node \"$_P/scripts/mcp-server.cjs\""
"_C=\"${CLAUDE_CONFIG_DIR:-$HOME/.claude}\"; _E=\"${CLAUDE_PLUGIN_ROOT:-${PLUGIN_ROOT:-}}\"; _P=$({ [ -n \"$_E\" ] && printf '%s\\n' \"$_E\"; printf '%s\\n' \"$PWD/plugin\" \"$PWD\"; ls -dt \"$HOME/.codex/plugins/cache/claude-mem-local/claude-mem\"/[0-9]*/ \"$HOME/.codex/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ \"$_C/plugins/cache/thedotmack/claude-mem\"/[0-9]*/ 2>/dev/null; printf '%s\\n' \"$_C/plugins/marketplaces/thedotmack/plugin\"; } | while IFS= read -r _R; do [ -d \"$_R/plugin/scripts\" ] && _Q=\"$_R/plugin\" || _Q=\"$_R\"; [ -f \"$_Q/scripts/mcp-server.cjs\" ] && { printf '%s\\n' \"$_Q\"; break; }; done); [ -n \"$_P\" ] || { echo \"claude-mem: mcp server not found\" >&2; exit 1; }; exec node \"$_P/scripts/mcp-server.cjs\""
]
}
}
@@ -212,7 +212,7 @@ ${f}`}let a=s.lineStart;for(let u=s.lineStart-1;u>=0;u--){let l=i[u].trim();if(l
|
||||
${c}`}var u_=new Set([".js",".jsx",".ts",".tsx",".mjs",".cjs",".py",".pyw",".go",".rs",".rb",".java",".cs",".cpp",".cc",".cxx",".c",".h",".hpp",".hh",".swift",".kt",".kts",".php",".vue",".svelte",".ex",".exs",".lua",".scala",".sc",".sh",".bash",".zsh",".hs",".zig",".css",".scss",".toml",".yml",".yaml",".sql",".md",".mdx"]),Gx=new Set(["node_modules",".git","dist","build",".next","__pycache__",".venv","venv","env",".env","target","vendor",".cache",".turbo","coverage",".nyc_output",".claude",".smart-file-read"]),Kx=512*1024;async function*l_(t,e,r=20,n){if(r<=0)return;let o;try{o=await(0,Ar.readdir)(t,{withFileTypes:!0})}catch(s){S.debug("WORKER",`walkDir: failed to read directory ${t}`,void 0,s instanceof Error?s:void 0);return}for(let s of o){if(s.name.startsWith(".")&&s.name!=="."||Gx.has(s.name))continue;let i=(0,Vn.join)(t,s.name);if(s.isDirectory())yield*l_(i,e,r-1,n);else if(s.isFile()){let a=s.name.slice(s.name.lastIndexOf("."));(u_.has(a)||n&&n.has(a))&&(yield i)}}}async function Jx(t){try{let e=await(0,Ar.stat)(t);if(e.size>Kx||e.size===0)return null;let r=await(0,Ar.readFile)(t,"utf-8");return r.slice(0,1e3).includes("\0")?null:r}catch(e){return S.debug("WORKER",`safeReadFile: failed to read ${t}`,void 0,e instanceof Error?e:void 0),null}}async function d_(t,e,r={}){let n=r.maxResults||20,o=e.toLowerCase(),s=o.split(/[\s_\-./]+/).filter(w=>w.length>0),i=r.projectRoot||t,a=Wn(i),c=new Set;for(let w of Object.values(a.grammars))for(let v of w.extensions)u_.has(v)||c.add(v);let u=[];for await(let w of l_(t,t,20,c.size>0?c:void 0)){if(r.filePattern&&!(0,Vn.relative)(t,w).toLowerCase().includes(r.filePattern.toLowerCase()))continue;let v=await Jx(w);v&&u.push({absolutePath:w,relativePath:(0,Vn.relative)(t,w),content:v})}let l=i_(u,i),d=[],p=[],f=0;for(let[w,v]of l){f+=Bx(v);let k=Ps(w.toLowerCase(),s)>0,_e=[],Ee=(Dt,Qt)=>{for(let ae of Dt){let St=0,Ve="",Cr=Ps(ae.name.toLowerCase(),s);Cr>0&&(St+=Cr*3,Ve="name 
match"),ae.signature.toLowerCase().includes(o)&&(St+=2,Ve=Ve?`${Ve} + signature`:"signature match"),ae.jsdoc&&ae.jsdoc.toLowerCase().includes(o)&&(St+=1,Ve=Ve?`${Ve} + jsdoc`:"jsdoc match"),St>0&&(k=!0,_e.push({filePath:w,symbolName:Qt?`${Qt}.${ae.name}`:ae.name,kind:ae.kind,signature:ae.signature,jsdoc:ae.jsdoc,lineStart:ae.lineStart,lineEnd:ae.lineEnd,matchReason:Ve})),ae.children&&Ee(ae.children,ae.name)}};Ee(v.symbols),k&&(d.push(v),p.push(..._e))}p.sort((w,v)=>{let x=Ps(w.symbolName.toLowerCase(),s);return Ps(v.symbolName.toLowerCase(),s)-x});let m=p.slice(0,n),_=new Set(m.map(w=>w.filePath)),y=d.filter(w=>_.has(w.filePath)).slice(0,n),b=y.reduce((w,v)=>w+v.foldedTokenEstimate,0);return{foldedFiles:y,matchingSymbols:m,totalFilesScanned:u.length,totalSymbolsFound:f,tokenEstimate:b}}function Ps(t,e){let r=0;for(let n of e)if(t===n)r+=10;else if(t.includes(n))r+=5;else{let o=0,s=0;for(let i of n){let a=t.indexOf(i,o);a!==-1&&(s++,o=a+1)}s===n.length&&(r+=1)}return r}function Bx(t){let e=t.symbols.length;for(let r of t.symbols)r.children&&(e+=r.children.length);return e}function p_(t,e){let r=[];if(r.push(`\u{1F50D} Smart Search: "${e}"`),r.push(` Scanned ${t.totalFilesScanned} files, found ${t.totalSymbolsFound} symbols`),r.push(` ${t.matchingSymbols.length} matches across ${t.foldedFiles.length} files (~${t.tokenEstimate} tokens for folded view)`),r.push(""),t.matchingSymbols.length===0)return r.push(" No matching symbols found."),r.join(`
|
||||
`);r.push("\u2500\u2500 Matching Symbols \u2500\u2500"),r.push("");for(let n of t.matchingSymbols){if(r.push(` ${n.kind} ${n.symbolName} (${n.filePath}:${n.lineStart+1})`),r.push(` ${n.signature}`),n.jsdoc){let o=n.jsdoc.split(`
|
||||
`).find(s=>s.replace(/^[\s*/]+/,"").trim().length>0);o&&r.push(` \u{1F4AC} ${o.replace(/^[\s*/]+/,"").trim()}`)}r.push("")}r.push("\u2500\u2500 Folded File Views \u2500\u2500"),r.push("");for(let n of t.foldedFiles)r.push(Ir(n)),r.push("");return r.push("\u2500\u2500 Actions \u2500\u2500"),r.push(" To see full implementation: use smart_unfold with file path and symbol name"),r.join(`
|
||||
`)}var iu=require("node:fs/promises"),zs=require("node:fs"),Qe=require("node:path"),h_=require("node:os"),g_=require("node:url"),cP={},Yx="12.7.5";console.log=(...t)=>{S.error("CONSOLE","Intercepted console output (MCP protocol protection)",void 0,{args:t})};var __=!1,y_=(()=>{if(typeof __dirname<"u")return __dirname;try{return(0,Qe.dirname)((0,g_.fileURLToPath)(cP.url))}catch{return __=!0,process.cwd()}})(),au=(0,Qe.resolve)(y_,"worker-service.cjs");function Xx(){__&&((0,zs.existsSync)(au)||S.error("SYSTEM","mcp-server: dirname resolution failed (both __dirname and import.meta.url are unavailable). Fell back to process.cwd() and the resolved WORKER_SCRIPT_PATH does not exist. This is the actual problem \u2014 the worker bundle is fine, but mcp-server cannot locate it. Worker auto-start will fail until the dirname-resolution path is fixed.",{workerScriptPath:au,mcpServerDir:y_}))}var f_={search:"/api/search",timeline:"/api/timeline"};async function su(t,e){S.debug("SYSTEM","\u2192 Worker API",void 0,{endpoint:t,params:e});let r=new URLSearchParams;for(let[o,s]of Object.entries(e))s!=null&&r.append(o,String(s));let n=`${t}?${r}`;try{let o=await $s(n);if(!o.ok){let i=await o.text();throw new Error(`Worker API error (${o.status}): ${i}`)}let s=await o.json();return S.debug("SYSTEM","\u2190 Worker API success",void 0,{endpoint:t}),s}catch(o){return S.error("SYSTEM","\u2190 Worker API error",{endpoint:t},o instanceof Error?o:new Error(String(o))),{content:[{type:"text",text:`Error calling Worker API: ${o instanceof Error?o.message:String(o)}`}],isError:!0}}}async function Qx(t,e){let r=await $s(t,{method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify(e)});if(!r.ok){let o=await r.text();throw new Error(`Worker API error (${r.status}): ${o}`)}let n=await r.json();return S.debug("HTTP","Worker API success (POST)",void 0,{endpoint:t}),{content:[{type:"text",text:JSON.stringify(n,null,2)}]}}async function Mr(t,e){S.debug("HTTP","Worker API request 
(POST)",void 0,{endpoint:t});try{return await Qx(t,e)}catch(r){return S.error("HTTP","Worker API error (POST)",{endpoint:t},r instanceof Error?r:new Error(String(r))),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function eP(){try{return(await $s("/api/health")).ok}catch(t){return S.debug("SYSTEM","Worker health check failed",{},t instanceof Error?t:new Error(String(t))),!1}}async function tP(){if(await eP())return!0;S.warn("SYSTEM","Worker not available, attempting auto-start for MCP client"),Xx();try{let t=Jc(),e=await Kg(t,au);return e==="dead"&&S.error("SYSTEM","Worker auto-start failed \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running. Check earlier log lines for the specific failure reason (Bun not found, missing worker bundle, port conflict, etc.)."),e!=="dead"}catch(t){return S.error("SYSTEM","Worker auto-start threw \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running.",void 0,t instanceof Error?t:new Error(String(t))),!1}}var S_=[{name:"__IMPORTANT",description:`3-LAYER WORKFLOW (ALWAYS FOLLOW):
|
||||
`)}var iu=require("node:fs/promises"),zs=require("node:fs"),Qe=require("node:path"),h_=require("node:os"),g_=require("node:url"),cP={},Yx="13.0.0";console.log=(...t)=>{S.error("CONSOLE","Intercepted console output (MCP protocol protection)",void 0,{args:t})};var __=!1,y_=(()=>{if(typeof __dirname<"u")return __dirname;try{return(0,Qe.dirname)((0,g_.fileURLToPath)(cP.url))}catch{return __=!0,process.cwd()}})(),au=(0,Qe.resolve)(y_,"worker-service.cjs");function Xx(){__&&((0,zs.existsSync)(au)||S.error("SYSTEM","mcp-server: dirname resolution failed (both __dirname and import.meta.url are unavailable). Fell back to process.cwd() and the resolved WORKER_SCRIPT_PATH does not exist. This is the actual problem \u2014 the worker bundle is fine, but mcp-server cannot locate it. Worker auto-start will fail until the dirname-resolution path is fixed.",{workerScriptPath:au,mcpServerDir:y_}))}var f_={search:"/api/search",timeline:"/api/timeline"};async function su(t,e){S.debug("SYSTEM","\u2192 Worker API",void 0,{endpoint:t,params:e});let r=new URLSearchParams;for(let[o,s]of Object.entries(e))s!=null&&r.append(o,String(s));let n=`${t}?${r}`;try{let o=await $s(n);if(!o.ok){let i=await o.text();throw new Error(`Worker API error (${o.status}): ${i}`)}let s=await o.json();return S.debug("SYSTEM","\u2190 Worker API success",void 0,{endpoint:t}),s}catch(o){return S.error("SYSTEM","\u2190 Worker API error",{endpoint:t},o instanceof Error?o:new Error(String(o))),{content:[{type:"text",text:`Error calling Worker API: ${o instanceof Error?o.message:String(o)}`}],isError:!0}}}async function Qx(t,e){let r=await $s(t,{method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify(e)});if(!r.ok){let o=await r.text();throw new Error(`Worker API error (${r.status}): ${o}`)}let n=await r.json();return S.debug("HTTP","Worker API success (POST)",void 0,{endpoint:t}),{content:[{type:"text",text:JSON.stringify(n,null,2)}]}}async function Mr(t,e){S.debug("HTTP","Worker API request 
(POST)",void 0,{endpoint:t});try{return await Qx(t,e)}catch(r){return S.error("HTTP","Worker API error (POST)",{endpoint:t},r instanceof Error?r:new Error(String(r))),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function eP(){try{return(await $s("/api/health")).ok}catch(t){return S.debug("SYSTEM","Worker health check failed",{},t instanceof Error?t:new Error(String(t))),!1}}async function tP(){if(await eP())return!0;S.warn("SYSTEM","Worker not available, attempting auto-start for MCP client"),Xx();try{let t=Jc(),e=await Kg(t,au);return e==="dead"&&S.error("SYSTEM","Worker auto-start failed \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running. Check earlier log lines for the specific failure reason (Bun not found, missing worker bundle, port conflict, etc.)."),e!=="dead"}catch(t){return S.error("SYSTEM","Worker auto-start threw \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running.",void 0,t instanceof Error?t:new Error(String(t))),!1}}var S_=[{name:"__IMPORTANT",description:`3-LAYER WORKFLOW (ALWAYS FOLLOW):
|
||||
1. search(query) \u2192 Get index with IDs (~50-100 tokens/result)
|
||||
2. timeline(anchor=ID) \u2192 Get context around interesting results
|
||||
3. get_observations([IDs]) \u2192 Fetch full details ONLY for filtered IDs
|
||||
|
||||
File diff suppressed because one or more lines are too long
+349
-341
File diff suppressed because one or more lines are too long
@@ -151,13 +151,19 @@ async function buildHooks() {
       'onnxruntime-node'
     ],
     define: {
-      '__DEFAULT_PACKAGE_VERSION__': `"${version}"`
+      '__DEFAULT_PACKAGE_VERSION__': `"${version}"`,
+      // Polyfill import.meta.url for ESM deps bundled into CJS output.
+      // @anthropic-ai/claude-agent-sdk's *.mjs files use createRequire(import.meta.url)
+      // and `new URL(rel, import.meta.url)`. We map import.meta.url to a file:// URL
+      // (not the raw __filename path) so URL construction preserves its semantics.
+      'import.meta.url': '__IMPORT_META_URL__'
     },
     banner: {
       js: [
         '#!/usr/bin/env bun',
         'var __filename = __filename || require("node:path").resolve(process.argv[1] || "");',
-        'var __dirname = __dirname || require("node:path").dirname(__filename);'
+        'var __dirname = __dirname || require("node:path").dirname(__filename);',
+        'var __IMPORT_META_URL__ = require("node:url").pathToFileURL(__filename).href;'
       ].join('\n')
     }
   });
@@ -23,6 +23,26 @@ const CHROMA_SUPERVISOR_ID = 'chroma-mcp';
 
 const CHROMA_MCP_PINNED_VERSION = '0.2.6';

+// Override transitive dep resolutions for chroma-mcp 0.2.6 (issue #2371).
+//
+// Why onnxruntime>=1.20: the shipped all-MiniLM-L6-v2 model has pytorch-2.0
+// IR. Older onnxruntime versions can't parse it and fail every embedding
+// add with `[ONNXRuntimeError] : 7 : INVALID_PROTOBUF`. uv may otherwise
+// resolve to a too-old onnxruntime on macOS arm64 / Python 3.13 depending
+// on cache state, so we force a floor.
+//
+// Why protobuf<7: protobuf 7.x's stricter generated-file check rejects
+// opentelemetry's _pb2 stubs (generated with protoc <3.19), throwing
+// `TypeError: Descriptors cannot be created directly` at chromadb import.
+// Capping below 7 lands on protobuf 6.x which opentelemetry tolerates.
+//
+// These pins are runtime-only (uvx --with) so we don't have to fork
+// chroma-mcp upstream — they apply only to claude-mem's spawned subprocess.
+const CHROMA_MCP_DEP_OVERRIDES: ReadonlyArray<string> = [
+  'onnxruntime>=1.20',
+  'protobuf<7',
+];
+
 export class ChromaMcpManager {
   private static instance: ChromaMcpManager | null = null;
   private client: Client | null = null;
@@ -72,15 +92,14 @@ export class ChromaMcpManager {
   }

   private async connectInternal(): Promise<void> {
-    if (this.transport) {
-      try { await this.transport.close(); } catch { /* already dead */ }
-    }
-    if (this.client) {
-      try { await this.client.close(); } catch { /* already dead */ }
-    }
-    this.client = null;
-    this.transport = null;
-    this.connected = false;
+    // Singleton invariant (#2313): kill any pre-existing chroma-mcp subprocess
+    // tree before spawning a new one. The MCP SDK's transport.close() only
+    // signals the direct child (uvx); on Linux the grandchildren (uv, python,
+    // chroma-mcp) get re-parented to init and survive, accumulating 20+
+    // instances per session if reconnects fire repeatedly. Reuse the same
+    // tree-kill primitive used by stop() so reconnect can never leave
+    // orphans behind.
+    await this.disposeCurrentSubprocess();

     const commandArgs = this.buildCommandArgs();
     const spawnEnvironment = this.getSpawnEnv();
@@ -121,14 +140,12 @@ export class ChromaMcpManager {
       await Promise.race([mcpConnectionPromise, timeoutPromise]);
     } catch (connectionError) {
       clearTimeout(timeoutId!);
-      logger.warn('CHROMA_MCP', 'Connection failed, killing subprocess to prevent zombie', {
+      logger.warn('CHROMA_MCP', 'Connection failed, killing subprocess tree to prevent zombie', {
         error: connectionError instanceof Error ? connectionError.message : String(connectionError)
       });
-      try { await this.transport.close(); } catch { /* best effort */ }
-      try { await this.client.close(); } catch { /* best effort */ }
-      this.client = null;
-      this.transport = null;
-      this.connected = false;
+      // Tree-kill (not just transport.close) so failed-connect descendants
+      // can't survive on Linux (#2313).
+      await this.disposeCurrentSubprocess();
       throw connectionError;
     }
     clearTimeout(timeoutId!);
@@ -139,6 +156,7 @@ export class ChromaMcpManager {
     logger.info('CHROMA_MCP', 'Connected to chroma-mcp successfully');

     const currentTransport = this.transport;
+    const currentTrackedPid = (this.transport as unknown as { _process?: ChildProcess })._process?.pid;
     this.transport.onclose = () => {
       if (this.transport !== currentTransport) {
         logger.debug('CHROMA_MCP', 'Ignoring stale onclose from previous transport');
@@ -150,6 +168,20 @@ export class ChromaMcpManager {
       this.client = null;
       this.transport = null;
       this.lastConnectionFailureTimestamp = Date.now();
+
+      // Direct child (uvx) emitted close, but on Linux the grandchildren
+      // (uv/python/chroma-mcp) often outlive their parent because MCP SDK
+      // does not use process groups. Sweep the descendant tree using the
+      // captured PID — best-effort; pgrep returns nothing if everything
+      // already exited (#2313).
+      if (currentTrackedPid) {
+        ChromaMcpManager.killProcessTree(currentTrackedPid).catch((error) => {
+          logger.debug('CHROMA_MCP', 'Background tree-kill after onclose finished (best-effort)', {
+            pid: currentTrackedPid,
+            error: error instanceof Error ? error.message : String(error)
+          });
+        });
+      }
     };
   }

@@ -158,6 +190,8 @@ export class ChromaMcpManager {
     const chromaMode = settings.CLAUDE_MEM_CHROMA_MODE || 'local';
     const pythonVersion = process.env.CLAUDE_MEM_PYTHON_VERSION || settings.CLAUDE_MEM_PYTHON_VERSION || '3.13';

+    const depOverrideFlags = CHROMA_MCP_DEP_OVERRIDES.flatMap(spec => ['--with', spec]);
+
     if (chromaMode === 'remote') {
       const chromaHost = settings.CLAUDE_MEM_CHROMA_HOST || '127.0.0.1';
       const chromaPort = settings.CLAUDE_MEM_CHROMA_PORT || '8000';
@@ -168,6 +202,7 @@ export class ChromaMcpManager {
 
       const args = [
         '--python', pythonVersion,
+        ...depOverrideFlags,
         `chroma-mcp==${CHROMA_MCP_PINNED_VERSION}`,
         '--client-type', 'http',
         '--host', chromaHost,
@@ -193,6 +228,7 @@ export class ChromaMcpManager {
 
     return [
       '--python', pythonVersion,
+      ...depOverrideFlags,
       `chroma-mcp==${CHROMA_MCP_PINNED_VERSION}`,
       '--client-type', 'persistent',
       '--data-dir', DEFAULT_CHROMA_DATA_DIR.replace(/\\/g, '/')
@@ -213,14 +249,15 @@ export class ChromaMcpManager {
|
||||
arguments: toolArguments
|
||||
});
|
||||
} catch (transportError) {
|
||||
this.connected = false;
|
||||
this.client = null;
|
||||
this.transport = null;
|
||||
|
||||
logger.warn('CHROMA_MCP', `Transport error during "${toolName}", reconnecting and retrying once`, {
|
||||
error: transportError instanceof Error ? transportError.message : String(transportError)
|
||||
});
|
||||
|
||||
// Tree-kill the dying subprocess before reconnect. Previously this path
|
||||
// just nulled the handle, which on Linux leaks the uv/python/chroma-mcp
|
||||
// descendants every time a transport error happens (#2313).
|
||||
await this.disposeCurrentSubprocess();
|
||||
|
||||
try {
|
||||
await this.ensureConnected();
|
||||
result = await this.client!.callTool({
|
||||
@@ -328,6 +365,53 @@ export class ChromaMcpManager {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Singleton enforcement helper (#2313): tree-kill the currently tracked
|
||||
* chroma-mcp subprocess and reset all state so the next spawn starts clean.
|
||||
*
|
||||
* Why this is the singleton invariant: every code path that intends to
|
||||
* abandon `this.transport` / `this.client` (reconnect, transport error,
|
||||
* connect-timeout, onclose, stop()) MUST funnel through here. The MCP
|
||||
* SDK's transport.close() only signals the direct child (uvx); on Linux
|
||||
* the grandchildren (uv, python, chroma-mcp) re-parent to init and
|
||||
* accumulate. Calling killProcessTree() against the captured PID before
|
||||
* we drop the reference is the only way to guarantee at most one
|
||||
* chroma-mcp subprocess tree exists per worker process.
|
||||
*
|
||||
* Idempotent and best-effort — safe to call when there is no active
|
||||
* subprocess (no-op in that case).
|
||||
*/
|
||||
private async disposeCurrentSubprocess(): Promise<void> {
|
||||
const chromaProcess = (this.transport as unknown as { _process?: ChildProcess })?._process;
|
||||
const trackedPid = chromaProcess?.pid;
|
||||
|
||||
if (trackedPid) {
|
||||
try {
|
||||
await ChromaMcpManager.killProcessTree(trackedPid);
|
||||
} catch (error) {
|
||||
logger.warn('CHROMA_MCP', 'failed to kill prior chroma-mcp tree (best-effort)', {
|
||||
pid: trackedPid,
|
||||
error: error instanceof Error ? error.message : String(error)
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
if (this.transport) {
|
||||
try { await this.transport.close(); } catch { /* already dead */ }
|
||||
}
|
||||
if (this.client) {
|
||||
try { await this.client.close(); } catch { /* already dead */ }
|
||||
}
|
||||
|
||||
if (trackedPid) {
|
||||
getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
|
||||
}
|
||||
|
||||
this.client = null;
|
||||
this.transport = null;
|
||||
this.connected = false;
|
||||
}
|
||||
|
||||
/**
|
||||
* Gracefully stop the MCP connection and kill the chroma-mcp subprocess tree.
|
||||
*
|
||||
@@ -341,34 +425,15 @@ export class ChromaMcpManager {
|
||||
* pattern from shutdown.ts (Principle 5: OS-supervised teardown).
|
||||
*/
|
||||
async stop(): Promise<void> {
|
||||
- if (!this.client) {
+ if (!this.client && !this.transport) {
|
||||
logger.debug('CHROMA_MCP', 'No active MCP connection to stop');
|
||||
this.connecting = null;
|
||||
return;
|
||||
}
|
||||
|
||||
logger.info('CHROMA_MCP', 'Stopping chroma-mcp MCP connection');
|
||||
|
||||
- // Kill the entire process tree before closing the MCP client so
- // descendants (uv, python, chroma-mcp) don't become orphans.
- const chromaProcess = (this.transport as unknown as { _process?: ChildProcess })?._process;
- if (chromaProcess?.pid) {
-   await ChromaMcpManager.killProcessTree(chromaProcess.pid);
- }
-
- try {
-   await this.client.close();
- } catch (error) {
-   if (error instanceof Error) {
-     logger.debug('CHROMA_MCP', 'Error during client close (subprocess may already be dead)', {}, error);
-   } else {
-     logger.debug('CHROMA_MCP', 'Error during client close (subprocess may already be dead)', { error: String(error) });
-   }
- }
-
- getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
- this.client = null;
- this.transport = null;
- this.connected = false;
+ await this.disposeCurrentSubprocess();
|
||||
this.connecting = null;
|
||||
|
||||
logger.info('CHROMA_MCP', 'chroma-mcp MCP connection stopped');
|
||||
|
||||
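The comments in the ChromaMcpManager hunks describe sweeping the descendant tree (uvx, uv, python, chroma-mcp) from a captured PID. The body of `killProcessTree()` is not part of this diff, so the following is only a sketch of one way such a sweep could order its targets, assuming a parent-to-children table like the one `pgrep -P <pid>` lookups would yield. The names `ChildTable` and `collectTreeDeepestFirst` and the example PIDs are hypothetical.

```typescript
// Hypothetical process table: parent PID -> direct child PIDs.
type ChildTable = ReadonlyMap<number, readonly number[]>;

// Returns the root and all of its descendants, deepest first, so that
// children can be signaled before the parent they belong to.
function collectTreeDeepestFirst(rootPid: number, table: ChildTable): number[] {
  const order: number[] = [];
  const seen = new Set<number>();
  const visit = (pid: number): void => {
    if (seen.has(pid)) return; // guard against a malformed or cyclic table
    seen.add(pid);
    for (const child of table.get(pid) ?? []) visit(child);
    order.push(pid); // post-order: descendants are recorded before their parent
  };
  visit(rootPid);
  return order;
}

// Example chain mirroring the comment in the diff: uvx -> uv -> python -> chroma-mcp.
const exampleTable: ChildTable = new Map([
  [100, [101]],
  [101, [102]],
  [102, [103]],
]);
const killOrder = collectTreeDeepestFirst(100, exampleTable);
console.log(killOrder);
```

An empty table simply yields the root PID alone, which matches the "best-effort" wording in the diff: if every descendant has already exited, only the root remains to be signaled.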
@@ -27,6 +27,19 @@ import {
|
||||
import { query } from '@anthropic-ai/claude-agent-sdk';
|
||||
import { ClassifiedProviderError } from './provider-errors.js';
|
||||
|
||||
/**
|
||||
* Module-scoped guard so the "effort parameter" hint only fires once per
|
||||
* worker process. The underlying cause (a leaked CLAUDE_CODE_EFFORT_LEVEL in
|
||||
* ~/.claude-mem/.env, see #2357) is environmental — re-logging it on every
|
||||
* SDK call would spam the logs without adding signal.
|
||||
*
|
||||
* Exported solely for tests to reset the latch between cases.
|
||||
*/
|
||||
let effortHintLogged = false;
|
||||
export function __resetEffortHintLatchForTesting(): void {
|
||||
effortHintLogged = false;
|
||||
}
|
||||
|
||||
/**
|
||||
* Classify a ClaudeProvider error (executable spawn failures, SDK errors,
|
||||
* Anthropic API errors). Provider-specific because it relies on:
|
||||
@@ -36,7 +49,7 @@ import { ClassifiedProviderError } from './provider-errors.js';
|
||||
*/
|
||||
export function classifyClaudeError(err: unknown): ClassifiedProviderError {
|
||||
const message = err instanceof Error ? err.message : String(err);
|
||||
- const errAny = err as { name?: string; status?: number; error?: { type?: string } };
+ const errAny = err as { name?: string; status?: number; error?: { type?: string }; body?: unknown };
|
||||
|
||||
// Executable / spawn issues — unrecoverable, no point retrying.
|
||||
if (
|
||||
@@ -88,6 +101,39 @@ export function classifyClaudeError(err: unknown): ClassifiedProviderError {
|
||||
return new ClassifiedProviderError(message, { kind: 'unrecoverable', cause: err });
|
||||
}
|
||||
|
||||
// HTTP 400 from the Anthropic SDK — bad request, never recoverable. Mirrors
|
||||
// the pattern in GeminiProvider.classifyGeminiError / classifyOpenRouterError
|
||||
// (see #2357: the SDK forwards `effort` to the Messages API when
|
||||
// CLAUDE_CODE_EFFORT_LEVEL leaks into the subprocess env, and models like
|
||||
// Haiku/Sonnet 4.5 reject with 400 — without this branch the default
|
||||
// `transient` classification retried indefinitely).
|
||||
if (errAny.status === 400) {
|
||||
// Inspect both the message and any structured body for the effort marker.
|
||||
const bodyText = (() => {
|
||||
const body = errAny.body;
|
||||
if (typeof body === 'string') return body;
|
||||
if (body && typeof body === 'object') {
|
||||
try { return JSON.stringify(body); } catch { return ''; }
|
||||
}
|
||||
return '';
|
||||
})();
|
||||
const haystack = `${message}\n${bodyText}`;
|
||||
if (/effort parameter/i.test(haystack) && !effortHintLogged) {
|
||||
effortHintLogged = true;
|
||||
logger.warn(
|
||||
'SDK',
|
||||
'Anthropic API rejected request with HTTP 400: this model does not support the `effort` parameter. ' +
|
||||
'CLAUDE_CODE_EFFORT_LEVEL is likely leaking into the SDK subprocess env via ~/.claude-mem/.env — ' +
|
||||
'remove it or scope it to models that support effort. See https://github.com/thedotmack/claude-mem/issues/2357.',
|
||||
{ status: 400 }
|
||||
);
|
||||
}
|
||||
return new ClassifiedProviderError(
|
||||
message || 'Anthropic bad request (status 400)',
|
||||
{ kind: 'unrecoverable', cause: err },
|
||||
);
|
||||
}
|
||||
|
||||
// Server errors → transient.
|
||||
if (typeof errAny.status === 'number' && errAny.status >= 500 && errAny.status < 600) {
|
||||
return new ClassifiedProviderError(message, { kind: 'transient', cause: err });
|
||||
|
||||
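The HTTP 400 branch above combines two ideas: every 400 is classified as unrecoverable, and the `effort` hint is logged at most once per process through a module-scoped latch. A reduced, standalone sketch of that combination is shown below. The names `classifyStatus`, `bodyToText`, and `hintLogged` are illustrative, the `warn` callback stands in for the project's logger, and the fallback to `transient` compresses the other branches of `classifyClaudeError`, which are not reproduced here.

```typescript
type ErrorKind = "unrecoverable" | "transient";

// Module-scoped latch: flips to true the first time the hint is emitted.
let hintLogged = false;

// Turns a string or structured error body into searchable text.
function bodyToText(body: unknown): string {
  if (typeof body === "string") return body;
  if (body && typeof body === "object") {
    try { return JSON.stringify(body); } catch { return ""; }
  }
  return "";
}

function classifyStatus(
  err: { status?: number; message: string; body?: unknown },
  warn: (msg: string) => void,
): ErrorKind {
  if (err.status === 400) {
    const haystack = `${err.message}\n${bodyToText(err.body)}`;
    if (/effort parameter/i.test(haystack) && !hintLogged) {
      hintLogged = true;
      warn("model rejected the `effort` parameter");
    }
    return "unrecoverable"; // a bad request will not succeed on retry
  }
  return "transient"; // simplification: all other cases are collapsed here
}

const warnings: string[] = [];
const record = (msg: string): void => { warnings.push(msg); };
const effortErr = { status: 400, message: "This model does not support the effort parameter." };
const kinds = [1, 2, 3].map(() => classifyStatus(effortErr, record));
console.log(kinds, warnings.length);
```

Keeping the latch outside the function is what makes the hint fire once per process rather than once per call, which is also why the diff exports a reset helper for tests.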
+35
-18
@@ -9,7 +9,16 @@ import {
|
||||
type OAuthTokenResult,
|
||||
} from './oauth-token.js';
|
||||
|
||||
- export const ENV_FILE_PATH = paths.envFile();
+ // Resolved lazily so tests (and any rare runtime path-overrides) can target a
+ // temp file via CLAUDE_MEM_ENV_FILE without depending on module-load order.
+ // Production callers see the canonical ~/.claude-mem/.env path through
+ // paths.envFile() unchanged.
+ export function envFilePath(): string {
+   return process.env.CLAUDE_MEM_ENV_FILE ?? paths.envFile();
+ }
+
+ /** @deprecated Prefer envFilePath(); kept as a snapshot for back-compat. */
+ export const ENV_FILE_PATH = envFilePath();
|
||||
|
||||
const BLOCKED_ENV_VARS = [
|
||||
'ANTHROPIC_API_KEY', // Issue #733: Prevent auto-discovery from project .env files
|
||||
@@ -17,6 +26,10 @@ const BLOCKED_ENV_VARS = [
|
||||
// shell would otherwise short-circuit OAuth lookup at spawn time.
|
||||
// The fresh token from ~/.claude-mem/.env is re-injected below
|
||||
// when explicit gateway credentials are configured.
|
||||
'ANTHROPIC_BASE_URL', // Issue #2375: same leak class as AUTH_TOKEN. A leaked BASE_URL
|
||||
// alone (no token) was enough to trigger the OAuth-skip path,
|
||||
// sending the subprocess to a proxy with no credentials.
|
||||
// Re-injected from ~/.claude-mem/.env when configured.
|
||||
'CLAUDECODE', // Prevent "cannot be launched inside another Claude Code session" error
|
||||
'CLAUDE_CODE_OAUTH_TOKEN', // Issue #2215: prevent stale parent-process token from leaking into
|
||||
// isolated env. The fresh token is read from the keychain at spawn
|
||||
@@ -77,12 +90,13 @@ function serializeEnvFile(env: Record<string, string>): string {
|
||||
}
|
||||
|
||||
export function loadClaudeMemEnv(): ClaudeMemEnv {
|
||||
- if (!existsSync(ENV_FILE_PATH)) {
+ const envFile = envFilePath();
+ if (!existsSync(envFile)) {
|
||||
return {};
|
||||
}
|
||||
|
||||
try {
|
||||
- const content = readFileSync(ENV_FILE_PATH, 'utf-8');
+ const content = readFileSync(envFile, 'utf-8');
|
||||
const parsed = parseEnvFile(content);
|
||||
|
||||
const result: ClaudeMemEnv = {};
|
||||
@@ -94,12 +108,13 @@ export function loadClaudeMemEnv(): ClaudeMemEnv {
|
||||
|
||||
return result;
|
||||
} catch (error: unknown) {
|
||||
- logger.warn('ENV', 'Failed to load .env file', { path: ENV_FILE_PATH }, error instanceof Error ? error : new Error(String(error)));
+ logger.warn('ENV', 'Failed to load .env file', { path: envFile }, error instanceof Error ? error : new Error(String(error)));
|
||||
return {};
|
||||
}
|
||||
}
|
||||
|
||||
export function saveClaudeMemEnv(env: ClaudeMemEnv): void {
|
||||
const envFile = envFilePath();
|
||||
let existing: Record<string, string> = {};
|
||||
try {
|
||||
if (!existsSync(paths.dataDir())) {
|
||||
@@ -107,8 +122,8 @@ export function saveClaudeMemEnv(env: ClaudeMemEnv): void {
|
||||
}
|
||||
chmodSync(paths.dataDir(), 0o700);
|
||||
|
||||
- existing = existsSync(ENV_FILE_PATH)
-   ? parseEnvFile(readFileSync(ENV_FILE_PATH, 'utf-8'))
+ existing = existsSync(envFile)
+   ? parseEnvFile(readFileSync(envFile, 'utf-8'))
|
||||
: {};
|
||||
} catch (error) {
|
||||
const normalizedError = error instanceof Error ? error : new Error(String(error));
|
||||
@@ -155,10 +170,10 @@ export function saveClaudeMemEnv(env: ClaudeMemEnv): void {
|
||||
}
|
||||
|
||||
try {
|
||||
- writeFileSync(ENV_FILE_PATH, serializeEnvFile(updated), { encoding: 'utf-8', mode: 0o600 });
- chmodSync(ENV_FILE_PATH, 0o600);
+ writeFileSync(envFile, serializeEnvFile(updated), { encoding: 'utf-8', mode: 0o600 });
+ chmodSync(envFile, 0o600);
  } catch (error: unknown) {
- logger.error('ENV', 'Failed to save .env file', { path: ENV_FILE_PATH }, error instanceof Error ? error : new Error(String(error)));
+ logger.error('ENV', 'Failed to save .env file', { path: envFile }, error instanceof Error ? error : new Error(String(error)));
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
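The hunks above replace reads of the `ENV_FILE_PATH` constant with calls to `envFilePath()`, so the `CLAUDE_MEM_ENV_FILE` override is read on every call rather than captured once at module load. The sketch below illustrates the difference between a snapshot and a lazy lookup. It uses a plain object in place of `process.env`, and the default path and the name `resolveEnvFile` are hypothetical.

```typescript
// Hypothetical default; the real value comes from paths.envFile() in the project.
const DEFAULT_ENV_FILE = "/home/example/.claude-mem/.env";

// Lazy resolver: consults the override each time it is called.
function resolveEnvFile(env: Record<string, string | undefined>): string {
  return env.CLAUDE_MEM_ENV_FILE ?? DEFAULT_ENV_FILE;
}

const fakeEnv: Record<string, string | undefined> = {};

// A snapshot taken "at module load", before any override exists.
const snapshotPath = resolveEnvFile(fakeEnv);

// A test later redirects the env file to a temp location.
fakeEnv.CLAUDE_MEM_ENV_FILE = "/tmp/example-suite/.env";

// The lazy call sees the override; the snapshot still holds the old value.
const lazyPath = resolveEnvFile(fakeEnv);
console.log(snapshotPath, lazyPath);
```

This is why the deprecated `ENV_FILE_PATH` constant is described as a snapshot: code that still reads it would not observe an override set after import.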
@@ -230,15 +245,17 @@ export async function buildIsolatedEnvWithFreshOAuth(
|
||||
|
||||
if (!includeCredentials) return isolatedEnv;
|
||||
|
||||
- // If the user already configured explicit Anthropic/gateway credentials in
- // ~/.claude-mem/.env, honor those and skip OAuth lookup entirely. A bare
- // ANTHROPIC_BASE_URL counts because gateways may be tokenless, and falling
- // back to OAuth would silently route requests to api.anthropic.com.
- if (
-   isolatedEnv.ANTHROPIC_API_KEY ||
-   isolatedEnv.ANTHROPIC_BASE_URL ||
-   isolatedEnv.ANTHROPIC_AUTH_TOKEN
- ) {
+ // Custom gateway: never inject OAuth (would leak the user's Anthropic OAuth
+ // token to a third-party gateway). The user must explicitly configure a
+ // gateway-appropriate token in ~/.claude-mem/.env if their gateway requires
+ // one. A bare BASE_URL with no token = tokenless gateway (e.g. mTLS at the
+ // network boundary).
+ if (isolatedEnv.ANTHROPIC_BASE_URL) {
    clearStaleMarker();
    return isolatedEnv;
  }
+ // Direct API with explicit credentials: skip OAuth lookup.
+ if (isolatedEnv.ANTHROPIC_API_KEY || isolatedEnv.ANTHROPIC_AUTH_TOKEN) {
+   clearStaleMarker();
+   return isolatedEnv;
+ }
|
||||
|
||||
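The ordering of the branches in the hunk above is what prevents the OAuth token from reaching a custom gateway: the `ANTHROPIC_BASE_URL` check runs first, explicit credentials second, and OAuth lookup is the fallback. A small pure-function sketch of that decision order follows. The type `CredentialDecision`, the interface `IsolatedEnvLike`, and the function `decideCredentialSource` are illustrative names, not part of the project.

```typescript
type CredentialDecision = "custom-gateway" | "explicit-credentials" | "oauth-lookup";

interface IsolatedEnvLike {
  ANTHROPIC_BASE_URL?: string;
  ANTHROPIC_API_KEY?: string;
  ANTHROPIC_AUTH_TOKEN?: string;
}

function decideCredentialSource(env: IsolatedEnvLike): CredentialDecision {
  // 1. A configured gateway wins, with or without a token: never fetch OAuth.
  if (env.ANTHROPIC_BASE_URL) return "custom-gateway";
  // 2. Explicit credentials for the direct API: skip OAuth lookup.
  if (env.ANTHROPIC_API_KEY || env.ANTHROPIC_AUTH_TOKEN) return "explicit-credentials";
  // 3. Nothing configured: fall back to OAuth lookup.
  return "oauth-lookup";
}

const tokenlessGateway = decideCredentialSource({ ANTHROPIC_BASE_URL: "https://gateway.example" });
const gatewayWithToken = decideCredentialSource({
  ANTHROPIC_BASE_URL: "https://gateway.example",
  ANTHROPIC_AUTH_TOKEN: "test-token",
});
const apiKeyOnly = decideCredentialSource({ ANTHROPIC_API_KEY: "example-key" });
const nothingSet = decideCredentialSource({});
console.log(tokenlessGateway, gatewayWithToken, apiKeyOnly, nothingSet);
```

In this sketch, as in the diff, both early branches return the environment unchanged; splitting them mainly documents that a bare base URL is a deliberate, supported configuration rather than an incomplete one.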
@@ -0,0 +1,122 @@
|
||||
import { describe, it, expect, beforeEach, afterEach, spyOn } from 'bun:test';
|
||||
|
||||
import {
|
||||
classifyClaudeError,
|
||||
__resetEffortHintLatchForTesting,
|
||||
} from '../src/services/worker/ClaudeProvider.js';
|
||||
import { isClassified } from '../src/services/worker/provider-errors.js';
|
||||
import { logger } from '../src/utils/logger.js';
|
||||
|
||||
/**
|
||||
* Tests for HTTP 400 classification in ClaudeProvider's classifyClaudeError.
|
||||
*
|
||||
* Regression coverage for #2357: ClaudeProvider previously had no explicit
|
||||
* HTTP 400 handling, so the default branch classified all 400s as `transient`
|
||||
* and the retry loop would hammer a permanent error indefinitely (e.g. when
|
||||
* CLAUDE_CODE_EFFORT_LEVEL leaks into the SDK subprocess and the model
|
||||
* rejects the `effort` parameter).
|
||||
*/
|
||||
describe('classifyClaudeError — HTTP 400 handling (#2357)', () => {
|
||||
let warnSpy: ReturnType<typeof spyOn>;
|
||||
|
||||
beforeEach(() => {
|
||||
__resetEffortHintLatchForTesting();
|
||||
warnSpy = spyOn(logger, 'warn').mockImplementation(() => {});
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
warnSpy.mockRestore();
|
||||
__resetEffortHintLatchForTesting();
|
||||
});
|
||||
|
||||
it('classifies 400 with "effort parameter" body as unrecoverable AND logs an SDK warn once', () => {
|
||||
const sdkErr = Object.assign(
|
||||
new Error('This model does not support the effort parameter.'),
|
||||
{ status: 400 },
|
||||
);
|
||||
|
||||
const classified = classifyClaudeError(sdkErr);
|
||||
|
||||
expect(isClassified(classified)).toBe(true);
|
||||
expect(classified.kind).toBe('unrecoverable');
|
||||
expect(warnSpy).toHaveBeenCalledTimes(1);
|
||||
// First positional arg of logger.warn is the component category.
|
||||
const [component, hintMessage] = warnSpy.mock.calls[0] as [string, string, ...unknown[]];
|
||||
expect(component).toBe('SDK');
|
||||
expect(hintMessage).toMatch(/effort/i);
|
||||
expect(hintMessage).toMatch(/2357/);
|
||||
});
|
||||
|
||||
it('classifies 400 with effort marker in a structured body field', () => {
|
||||
const sdkErr = Object.assign(
|
||||
new Error('Bad request'),
|
||||
{
|
||||
status: 400,
|
||||
body: { error: { message: 'This model does not support the effort parameter.' } },
|
||||
},
|
||||
);
|
||||
|
||||
const classified = classifyClaudeError(sdkErr);
|
||||
|
||||
expect(classified.kind).toBe('unrecoverable');
|
||||
expect(warnSpy).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it('classifies 400 without effort body as unrecoverable WITHOUT firing the effort hint', () => {
|
||||
const sdkErr = Object.assign(
|
||||
new Error('some other 400 error'),
|
||||
{ status: 400 },
|
||||
);
|
||||
|
||||
const classified = classifyClaudeError(sdkErr);
|
||||
|
||||
expect(classified.kind).toBe('unrecoverable');
|
||||
expect(warnSpy).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('throttles the effort hint to one log per process even on repeated 400s', () => {
|
||||
const sdkErr = Object.assign(
|
||||
new Error('This model does not support the effort parameter.'),
|
||||
{ status: 400 },
|
||||
);
|
||||
|
||||
for (let i = 0; i < 5; i++) {
|
||||
const classified = classifyClaudeError(sdkErr);
|
||||
expect(classified.kind).toBe('unrecoverable');
|
||||
}
|
||||
|
||||
expect(warnSpy).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
});
|
||||
|
||||
describe('classifyClaudeError — sibling status codes (regression sanity)', () => {
|
||||
let warnSpy: ReturnType<typeof spyOn>;
|
||||
|
||||
beforeEach(() => {
|
||||
__resetEffortHintLatchForTesting();
|
||||
warnSpy = spyOn(logger, 'warn').mockImplementation(() => {});
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
warnSpy.mockRestore();
|
||||
__resetEffortHintLatchForTesting();
|
||||
});
|
||||
|
||||
it('classifies status=401 as auth_invalid', () => {
|
||||
const sdkErr = Object.assign(new Error('unauthorized'), { status: 401 });
|
||||
const classified = classifyClaudeError(sdkErr);
|
||||
expect(classified.kind).toBe('auth_invalid');
|
||||
});
|
||||
|
||||
it('classifies status=429 as rate_limit', () => {
|
||||
const sdkErr = Object.assign(new Error('rate limited'), { status: 429 });
|
||||
const classified = classifyClaudeError(sdkErr);
|
||||
expect(classified.kind).toBe('rate_limit');
|
||||
});
|
||||
|
||||
it('classifies a network error with no status as transient', () => {
|
||||
const networkErr = new Error('ECONNRESET: socket hang up');
|
||||
const classified = classifyClaudeError(networkErr);
|
||||
expect(classified.kind).toBe('transient');
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,155 @@
|
||||
import { describe, it, expect, beforeAll, afterAll, beforeEach, afterEach, spyOn } from 'bun:test';
|
||||
import * as fs from 'fs';
|
||||
import { tmpdir } from 'os';
|
||||
import { join } from 'path';
|
||||
import {
|
||||
envFilePath,
|
||||
buildIsolatedEnv,
|
||||
buildIsolatedEnvWithFreshOAuth,
|
||||
} from '../src/shared/EnvManager.js';
|
||||
import * as oauthToken from '../src/shared/oauth-token.js';
|
||||
|
||||
/**
|
||||
* Tests for issue #2375: ANTHROPIC_BASE_URL must not leak from the parent
|
||||
* shell into the spawned worker's isolatedEnv, AND the OAuth-skip predicate
|
||||
* must not inject the user's Anthropic OAuth token onto a custom gateway URL
|
||||
* (which would be a token leak to a third party).
|
||||
*
|
||||
* Redirect EnvManager to a per-suite temp file via CLAUDE_MEM_ENV_FILE so
|
||||
* the user's real ~/.claude-mem/.env is never read or mutated even if a test
|
||||
* fails mid-flight. envFilePath() resolves the override on every call, so
|
||||
* this works regardless of the order other tests imported the module.
|
||||
*/
|
||||
|
||||
const TEST_DATA_DIR = fs.mkdtempSync(join(tmpdir(), 'claude-mem-env-isolation-'));
|
||||
const TEST_ENV_FILE = join(TEST_DATA_DIR, '.env');
|
||||
const ORIGINAL_ENV_FILE = process.env.CLAUDE_MEM_ENV_FILE;
|
||||
|
||||
const ORIGINAL_BASE_URL = process.env.ANTHROPIC_BASE_URL;
|
||||
const ORIGINAL_API_KEY = process.env.ANTHROPIC_API_KEY;
|
||||
const ORIGINAL_AUTH_TOKEN = process.env.ANTHROPIC_AUTH_TOKEN;
|
||||
const ORIGINAL_OAUTH_TOKEN = process.env.CLAUDE_CODE_OAUTH_TOKEN;
|
||||
|
||||
function clearEnvFile(): void {
|
||||
if (fs.existsSync(TEST_ENV_FILE)) {
|
||||
fs.unlinkSync(TEST_ENV_FILE);
|
||||
}
|
||||
}
|
||||
|
||||
function clearAnthropicEnv(): void {
|
||||
delete process.env.ANTHROPIC_BASE_URL;
|
||||
delete process.env.ANTHROPIC_API_KEY;
|
||||
delete process.env.ANTHROPIC_AUTH_TOKEN;
|
||||
delete process.env.CLAUDE_CODE_OAUTH_TOKEN;
|
||||
}
|
||||
|
||||
function restoreOriginalEnv(): void {
|
||||
if (ORIGINAL_BASE_URL === undefined) {
|
||||
delete process.env.ANTHROPIC_BASE_URL;
|
||||
} else {
|
||||
process.env.ANTHROPIC_BASE_URL = ORIGINAL_BASE_URL;
|
||||
}
|
||||
if (ORIGINAL_API_KEY === undefined) {
|
||||
delete process.env.ANTHROPIC_API_KEY;
|
||||
} else {
|
||||
process.env.ANTHROPIC_API_KEY = ORIGINAL_API_KEY;
|
||||
}
|
||||
if (ORIGINAL_AUTH_TOKEN === undefined) {
|
||||
delete process.env.ANTHROPIC_AUTH_TOKEN;
|
||||
} else {
|
||||
process.env.ANTHROPIC_AUTH_TOKEN = ORIGINAL_AUTH_TOKEN;
|
||||
}
|
||||
if (ORIGINAL_OAUTH_TOKEN === undefined) {
|
||||
delete process.env.CLAUDE_CODE_OAUTH_TOKEN;
|
||||
} else {
|
||||
process.env.CLAUDE_CODE_OAUTH_TOKEN = ORIGINAL_OAUTH_TOKEN;
|
||||
}
|
||||
}
|
||||
|
||||
describe('Issue #2375: ANTHROPIC_BASE_URL env-var isolation', () => {
|
||||
beforeAll(() => {
|
||||
fs.mkdirSync(TEST_DATA_DIR, { recursive: true, mode: 0o700 });
|
||||
process.env.CLAUDE_MEM_ENV_FILE = TEST_ENV_FILE;
|
||||
expect(envFilePath()).toBe(TEST_ENV_FILE);
|
||||
});
|
||||
|
||||
afterAll(() => {
|
||||
fs.rmSync(TEST_DATA_DIR, { recursive: true, force: true });
|
||||
if (ORIGINAL_ENV_FILE === undefined) {
|
||||
delete process.env.CLAUDE_MEM_ENV_FILE;
|
||||
} else {
|
||||
process.env.CLAUDE_MEM_ENV_FILE = ORIGINAL_ENV_FILE;
|
||||
}
|
||||
});
|
||||
|
||||
beforeEach(() => {
|
||||
clearEnvFile();
|
||||
clearAnthropicEnv();
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
clearEnvFile();
|
||||
restoreOriginalEnv();
|
||||
});
|
||||
|
||||
it('leaked ANTHROPIC_BASE_URL is stripped from isolatedEnv', () => {
|
||||
// No .env file exists. The parent shell sets a stray ANTHROPIC_BASE_URL —
|
||||
// this MUST NOT propagate into the subprocess isolatedEnv, because doing
|
||||
// so used to trigger the OAuth-skip path and leave the worker with no
|
||||
// credentials at all.
|
||||
process.env.ANTHROPIC_BASE_URL = 'https://shouldnotleak.example';
|
||||
|
||||
const result = buildIsolatedEnv();
|
||||
|
||||
expect(result.ANTHROPIC_BASE_URL).toBeUndefined();
|
||||
});
|
||||
|
||||
it('~/.claude-mem/.env BASE_URL + AUTH_TOKEN reaches isolatedEnv', () => {
|
||||
// User intentionally configured a gateway with a gateway-appropriate
|
||||
// auth token. Both must be re-injected into isolatedEnv.
|
||||
fs.writeFileSync(
|
||||
TEST_ENV_FILE,
|
||||
'ANTHROPIC_BASE_URL=https://gateway.example\nANTHROPIC_AUTH_TOKEN=test-token\n',
|
||||
{ mode: 0o600 },
|
||||
);
|
||||
|
||||
const result = buildIsolatedEnv();
|
||||
|
||||
expect(result.ANTHROPIC_BASE_URL).toBe('https://gateway.example');
|
||||
expect(result.ANTHROPIC_AUTH_TOKEN).toBe('test-token');
|
||||
});
|
||||
|
||||
it('bare .env BASE_URL alone does not trigger OAuth fetch', async () => {
|
||||
// A user with a tokenless gateway (e.g. mTLS at the network boundary)
|
||||
// configures BASE_URL only. The three-branch predicate must hit the
|
||||
// BASE_URL-set branch BEFORE OAuth lookup, so CLAUDE_CODE_OAUTH_TOKEN
|
||||
// must NOT appear in the result. This is the security-regression guard
|
||||
// against a token leak to a third-party gateway.
|
||||
//
|
||||
// Note: EnvManager captures readClaudeOAuthToken via a named import at
|
||||
// module load, so spyOn on the namespace export only weakly observes
|
||||
// the call (the binding inside EnvManager is independent). The
|
||||
// behavioral assertions (BASE_URL re-injected AND OAuth token NOT
|
||||
// injected) are the load-bearing checks: in the no-OAuth-injection
|
||||
// outcome, the only execution path that produces this combination is
|
||||
// the new BASE_URL-first branch returning early.
|
||||
fs.writeFileSync(
|
||||
TEST_ENV_FILE,
|
||||
'ANTHROPIC_BASE_URL=https://gateway.example\n',
|
||||
{ mode: 0o600 },
|
||||
);
|
||||
|
||||
const oauthSpy = spyOn(oauthToken, 'readClaudeOAuthToken');
|
||||
|
||||
try {
|
||||
const result = await buildIsolatedEnvWithFreshOAuth();
|
||||
|
||||
expect(result.ANTHROPIC_BASE_URL).toBe('https://gateway.example');
|
||||
expect(result.CLAUDE_CODE_OAUTH_TOKEN).toBeUndefined();
|
||||
// Best-effort sanity check; see note above.
|
||||
expect(oauthSpy).not.toHaveBeenCalled();
|
||||
} finally {
|
||||
oauthSpy.mockRestore();
|
||||
}
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,228 @@
|
||||
import { describe, it, expect, beforeEach, mock } from 'bun:test';
|
||||
|
||||
// Singleton enforcement regression coverage for issue #2313.
|
||||
//
|
||||
// Hypothesis under test: prior to the fix, ChromaMcpManager could leak its
|
||||
// chroma-mcp subprocess tree on every reconnect / transport error, accumulating
|
||||
// 20+ instances per session on Linux because the MCP SDK's transport.close()
|
||||
// only signals the direct child (uvx). The fix routes every "abandon current
|
||||
// transport" path through disposeCurrentSubprocess(), which tree-kills via
|
||||
// killProcessTree() before nulling the handles.
|
||||
|
||||
let transportCount = 0;
|
||||
const transportInstances: Array<FakeTransport> = [];
|
||||
|
||||
interface FakeChildProcess {
|
||||
pid: number;
|
||||
once: (event: string, _cb: (...args: unknown[]) => void) => FakeChildProcess;
|
||||
on: (event: string, _cb: (...args: unknown[]) => void) => FakeChildProcess;
|
||||
}
|
||||
|
||||
class FakeTransport {
|
||||
static nextPid = 100_000;
|
||||
onclose: (() => void) | null = null;
|
||||
closed = false;
|
||||
// Mimic StdioClientTransport's internal `_process` field that the manager
|
||||
// pokes into via `(this.transport as unknown as { _process })._process`.
|
||||
_process: FakeChildProcess;
|
||||
|
||||
constructor(_opts: { command: string; args: string[] }) {
|
||||
transportCount += 1;
|
||||
const pid = FakeTransport.nextPid++;
|
||||
const child: FakeChildProcess = {
|
||||
pid,
|
||||
once: function (this: FakeChildProcess) { return this; },
|
||||
on: function (this: FakeChildProcess) { return this; },
|
||||
};
|
||||
this._process = child;
|
||||
transportInstances.push(this);
|
||||
}
|
||||
|
||||
async close(): Promise<void> {
|
||||
this.closed = true;
|
||||
}
|
||||
}
|
||||
|
||||
mock.module('@modelcontextprotocol/sdk/client/stdio.js', () => ({
|
||||
StdioClientTransport: FakeTransport,
|
||||
}));
|
||||
|
||||
let connectImpl: () => Promise<void> = async () => {};
|
||||
let callToolImpl: () => Promise<unknown> = async () => ({
|
||||
content: [{ type: 'text', text: '{}' }],
|
||||
});
|
||||
|
||||
class FakeClient {
|
||||
closed = false;
|
||||
async connect(): Promise<void> {
|
||||
await connectImpl();
|
||||
}
|
||||
async callTool(): Promise<unknown> {
|
||||
return await callToolImpl();
|
||||
}
|
||||
async close(): Promise<void> {
|
||||
this.closed = true;
|
||||
}
|
||||
}
|
||||
|
||||
mock.module('@modelcontextprotocol/sdk/client/index.js', () => ({
|
||||
Client: FakeClient,
|
||||
}));
|
||||
|
||||
mock.module('../../../src/shared/SettingsDefaultsManager.js', () => ({
|
||||
SettingsDefaultsManager: {
|
||||
get: () => '',
|
||||
getInt: () => 0,
|
||||
loadFromFile: () => ({}),
|
||||
},
|
||||
}));
|
||||
|
||||
mock.module('../../../src/shared/paths.js', () => ({
|
||||
USER_SETTINGS_PATH: '/tmp/fake-settings.json',
|
||||
paths: {
|
||||
chroma: () => '/tmp/fake-chroma',
|
||||
combinedCerts: () => '/tmp/fake-combined-certs.pem',
|
||||
},
|
||||
}));
|
||||
|
||||
mock.module('../../../src/utils/logger.js', () => ({
|
||||
logger: {
|
||||
info: () => {},
|
||||
debug: () => {},
|
||||
warn: () => {},
|
||||
error: () => {},
|
||||
failure: () => {},
|
||||
},
|
||||
}));
|
||||
|
||||
// Track tree-kill invocations and the transport whose subprocess was killed.
|
||||
const killTreeCalls: number[] = [];
|
||||
|
||||
mock.module('../../../src/supervisor/index.ts', () => ({
|
||||
getSupervisor: () => ({
|
||||
assertCanSpawn: () => {},
|
||||
registerProcess: () => {},
|
||||
unregisterProcess: () => {},
|
||||
}),
|
||||
}));
|
||||
|
||||
mock.module('../../../src/supervisor/env-sanitizer.js', () => ({
|
||||
sanitizeEnv: (env: NodeJS.ProcessEnv) => env,
|
||||
}));
|
||||
|
||||
// Replace child_process.execFile so the static killProcessTree implementation
|
||||
// can be observed without actually shelling out. We feed pgrep an empty stdout
|
||||
// (no descendants) so the only signal target is the root pid.
|
||||
mock.module('child_process', () => {
|
||||
const original = require('node:child_process');
|
||||
return {
|
||||
...original,
|
||||
execFile: (
|
||||
cmd: string,
|
||||
args: string[],
|
||||
_opts: unknown,
|
||||
cb: (err: Error | null, stdout: { stdout: string; stderr: string }) => void
|
||||
) => {
|
||||
// Bun's promisify path will call this as if it were a Node-style callback.
|
||||
if (cmd === 'pgrep') {
|
||||
cb(null, { stdout: '', stderr: '' } as any);
|
||||
} else {
|
||||
cb(null, { stdout: '', stderr: '' } as any);
|
||||
}
|
||||
},
|
||||
execSync: () => '',
|
||||
};
|
||||
});
|
||||
|
||||
// Stub process.kill so the tree-kill path can record targets without crashing
|
||||
// the test runner if the synthetic PID happens to collide with a real one.
|
||||
const realProcessKill = process.kill.bind(process);
|
||||
const stubbedProcessKill = ((pid: number, _signal?: string | number) => {
|
||||
killTreeCalls.push(pid);
|
||||
return true;
|
||||
}) as typeof process.kill;
|
||||
process.kill = stubbedProcessKill;
|
||||
|
||||
import { ChromaMcpManager } from '../../../src/services/sync/ChromaMcpManager.js';
|
||||
|
||||
function resetState(): void {
|
||||
transportCount = 0;
|
||||
transportInstances.length = 0;
|
||||
killTreeCalls.length = 0;
|
||||
connectImpl = async () => {};
|
||||
callToolImpl = async () => ({ content: [{ type: 'text', text: '{}' }] });
|
||||
}
|
||||
|
||||
describe('ChromaMcpManager singleton enforcement (#2313)', () => {
|
||||
beforeEach(async () => {
|
||||
await ChromaMcpManager.reset();
|
||||
resetState();
|
||||
});
|
||||
|
||||
it('serializes concurrent ensureConnected() calls into one spawn', async () => {
|
||||
const mgr = ChromaMcpManager.getInstance();
|
||||
|
||||
// Five parallel callers race ensureConnected via callTool — only one
|
||||
// chroma-mcp subprocess (one transport) should be spawned.
|
||||
await Promise.all(
|
||||
Array.from({ length: 5 }, () =>
|
||||
mgr.callTool('chroma_list_collections', { limit: 1 })
|
||||
)
|
||||
);
|
||||
|
||||
expect(transportCount).toBe(1);
|
||||
});
|
||||
|
||||
it('kills the prior subprocess tree before a reconnect spawn', async () => {
|
||||
const mgr = ChromaMcpManager.getInstance();
|
||||
|
||||
// First call: opens transport #1.
|
||||
await mgr.callTool('chroma_list_collections', { limit: 1 });
|
||||
expect(transportInstances.length).toBe(1);
|
||||
const firstPid = transportInstances[0]._process.pid;
|
||||
|
||||
// Second call: rig callTool to throw a transport error on the FIRST attempt
|
||||
// so the manager runs its reconnect-and-retry path. The retry should
|
||||
// dispose the prior subprocess tree (firstPid) before spawning a new one.
|
||||
let invocations = 0;
|
||||
callToolImpl = async () => {
|
||||
invocations += 1;
|
||||
if (invocations === 1) {
|
||||
throw new Error('Connection closed');
|
||||
}
|
||||
return { content: [{ type: 'text', text: '{}' }] };
|
||||
};
|
||||
|
||||
await mgr.callTool('chroma_list_collections', { limit: 1 });
|
||||
|
||||
expect(transportInstances.length).toBe(2);
|
||||
// The first transport's pid must have been signaled by killProcessTree
|
||||
// before the second transport spawned.
|
||||
expect(killTreeCalls).toContain(firstPid);
|
||||
});
|
||||
|
||||
it('stop() disposes state including any pending connecting promise', async () => {
|
||||
const mgr = ChromaMcpManager.getInstance();
|
||||
|
||||
await mgr.callTool('chroma_list_collections', { limit: 1 });
|
||||
expect(transportInstances.length).toBe(1);
|
||||
const subprocessPid = transportInstances[0]._process.pid;
|
||||
|
||||
await mgr.stop();
|
||||
|
||||
// After stop(), every internal handle should be cleared and the prior
|
||||
// subprocess tree must have been signaled.
|
||||
expect(killTreeCalls).toContain(subprocessPid);
|
||||
|
||||
// A subsequent ensureConnected must spawn a fresh transport (not reuse
|
||||
// a stale one).
|
||||
await mgr.callTool('chroma_list_collections', { limit: 1 });
|
||||
expect(transportInstances.length).toBe(2);
|
||||
});
|
||||
});
|
||||
|
||||
// Restore the real process.kill once the test module finishes evaluating any
|
||||
// late-arriving microtasks.
|
||||
process.on('exit', () => {
|
||||
process.kill = realProcessKill;
|
||||
});
|
||||