fix: sequoia-territory bug-fix bundle (chroma, env, build, MCP, worker) (#2394)

* fix(mcp): drop ${_R%/} parameter-expansion trim that trips Claude Code MCP validator

The POSIX substring trim ${_R%/} is misread by Claude Code's MCP-config
validator as a required env var named "_R%/", causing /doctor to flag
mcp-search as invalid on every install. POSIX collapses // in paths, so
the trim was cosmetic — drop it and the validator passes.

Fixes #2350, #2354, #2356.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(env): block ANTHROPIC_BASE_URL leak + three-branch OAuth-skip predicate

Issue #2375: parent-shell ANTHROPIC_BASE_URL leaked through to subprocess
isolatedEnv, while ANTHROPIC_AUTH_TOKEN was blocked. The OAuth-skip
predicate fired on bare BASE_URL, but no auth credential reached the
subprocess -> "Not logged in". Add ANTHROPIC_BASE_URL to BLOCKED_ENV_VARS
so it can only enter isolatedEnv via ~/.claude-mem/.env.

Replace the OAuth-skip predicate with three branches to prevent a
second-order security regression: a user with a tokenless gateway
configured in .env (BASE_URL only, no token) would otherwise have their
Anthropic OAuth token fetched and sent to their gateway. Token leak to
third party. Three-branch predicate:

1. BASE_URL set -> return without OAuth (custom gateway, never leak token)
2. API_KEY or AUTH_TOKEN set -> return without OAuth (explicit credentials)
3. Otherwise -> OAuth lookup for api.anthropic.com

Adds tests/env-isolation.test.ts.

Fixes #2375.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(worker): classify Claude SDK HTTP 400 as unrecoverable

ClaudeProvider previously had no explicit HTTP 400 handling — the
default branch classified all errors as `transient`, so a permanent
400 (e.g., model rejecting an `effort` parameter forwarded from a
leaked CLAUDE_CODE_EFFORT_LEVEL) would be retried indefinitely
(#1874+ retries observed in one session per #2357).

Mirror GeminiProvider/OpenRouterProvider's pattern: classify 400 as
`unrecoverable`, 401/403 as `auth_invalid`, 429 as `rate_limit`,
default to `transient`. When the 400 body matches the
"effort parameter" signature, emit a one-time SDK warn log pointing
at the env-leak fix in ~/.claude-mem/.env.

Adds tests/claude-provider-error-classifier.test.ts.

Fixes #2357.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroma): pin onnxruntime>=1.20 + protobuf<7 to fix INVALID_PROTOBUF on macOS arm64

The shipped all-MiniLM-L6-v2 model has pytorch-2.0 IR. chroma-mcp 0.2.6
transitively depends on `chromadb>=1.0.16` which only requires
`onnxruntime>=1.14.1` — uv can therefore resolve to an onnxruntime old
enough to fail every embedding add with `[ONNXRuntimeError] : 7 :
INVALID_PROTOBUF` on macOS arm64 / Python 3.13. Semantic search silently
degraded to FTS-only and smart backfill broke (#2371).

Path B (override) was required because chroma-mcp 0.2.6 is the latest
PyPI release — no upstream bump exists.

Inject `--with onnxruntime>=1.20 --with protobuf<7` into the uvx spawn
args (both persistent and remote modes). The protobuf cap is essential:
forcing only `onnxruntime>=1.20` causes uv to re-resolve and land on
protobuf 7.x, which trips opentelemetry's `_pb2` stubs with `TypeError:
Descriptors cannot be created directly` because they were generated
with protoc <3.19. Capping below 7 lands on protobuf 6.x which
opentelemetry tolerates.

Verified end-to-end: ONNX model loads, embeddings produce a 384-dim
vector, PersistentClient init / add / query roundtrip succeeds:

    uvx --python 3.13 --with "onnxruntime>=1.20" --with "protobuf<7" \
        chroma-mcp==0.2.6 --help     # clean
    # programmatic test: onnxruntime 1.26.0, protobuf 6.33.6,
    # embedding ok 384, query ok ids=[['1']]

Fixes #2371.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroma): enforce single chroma-mcp subprocess per worker (#2313)

Root cause: every reconnect path in ChromaMcpManager — connectInternal's
re-entry, the connect-timeout catch, callTool's transport-error retry, and
the transport.onclose handler — used to abandon `this.transport`/`this.client`
by calling at most `transport.close()` and nulling the handles. The MCP SDK's
StdioClientTransport.close() only signals the direct child (uvx); on Linux the
grandchildren (uv -> python -> chroma-mcp) re-parent to init and survive
because the SDK does not put the subprocess in its own process group. Each
reconnect therefore leaked a full chroma-mcp tree, accumulating 20+ instances
per session.

Fix: introduce a private disposeCurrentSubprocess() helper that always tree-
kills via the existing killProcessTree primitive before nulling the transport
reference, and route every "abandon current transport" path (reconnect,
connect-timeout, transport error, onclose, stop) through it. The existing
`connecting: Promise<void> | null` lock continues to serialize concurrent
ensureConnected() callers into a single spawn.

Adds tests/services/sync/chroma-mcp-manager-singleton.test.ts covering:
- 5 parallel ensureConnected() calls produce exactly one spawn
- a transport-error reconnect tree-kills the prior subprocess pid before
  spawning a replacement
- stop() disposes state including any pending connecting promise

Manual verification needed on Linux: after a long session with multiple
tool uses, `ps aux | grep chroma-mcp | wc -l` should return 1, not 20+.

Fixes #2313.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(build): polyfill import.meta.url to __filename in CJS worker bundle

The worker bundles ESM dependencies (notably @anthropic-ai/claude-agent-sdk's
*.mjs files) into CJS output. Those modules call createRequire(import.meta.url)
at module-load time. esbuild's CJS output left this as createRequire(ute.url)
— where `ute` is its `import.meta` polyfill `{}` — so `ute.url` was undefined
and module-load crashed with:

  TypeError: The argument 'filename' must be a file URL object, file URL
  string, or absolute path string. Received undefined
  code: ERR_INVALID_ARG_VALUE

Every Stop hook and every worker subprocess invocation hit this. Fix is the
esbuild `define` option mapping `import.meta.url` to `__filename` (provided as
a real absolute path by the existing CJS prelude in the banner).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: daily dep bump per CLAUDE.md maintenance policy

Root: @anthropic-ai/claude-agent-sdk, @clack/prompts, @types/node,
dompurify, postcss, react, react-dom, yaml, zod.
plugin/: tree-sitter-cli, zod.
openclaw/: @types/node.

All patch/minor bumps; no major version changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build: regenerate plugin artifacts after env/chroma/mcp fixes

Built artifacts are committed so the marketplace-installable plugin
ships with the runtime bundles. Picks up:
- d7b145e9 .mcp.json shell-prelude trim drop
- a8cbd651 EnvManager BASE_URL block + 3-branch predicate
- 8cb73b8c ClaudeProvider HTTP 400 unrecoverable classifier
- ecd5b802 ChromaMcpManager onnxruntime/protobuf overrides
- c79324ea ChromaMcpManager singleton enforcement
- e8376f46 esbuild import.meta.url -> __filename polyfill
- a7541d71 daily dep bump

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build: regenerate plugin artifacts after main merge

Bundles now include both v13.0.0 server-beta runtime (server-beta-service.cjs
+ updated mcp-server.cjs / worker-service.cjs) and this branch's chroma /
env / build / Claude SDK fixes.

Verified: bun test tests/env-isolation.test.ts \\
  tests/claude-provider-error-classifier.test.ts \\
  tests/services/sync/chroma-mcp-manager-singleton.test.ts
→ 13/13 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(review): address CodeRabbit findings on PR #2394

1. scripts/build-hooks.js — `import.meta.url` now maps to a file:// URL
   (via pathToFileURL(__filename).href in the CJS banner) instead of the
   raw __filename path. Preserves URL semantics for any bundled ESM dep
   that does `new URL(rel, import.meta.url)`. createRequire still works.

2. src/shared/EnvManager.ts — added envFilePath() that resolves
   CLAUDE_MEM_ENV_FILE lazily (falling back to paths.envFile()), and
   switched internal load/save call sites to use it. ENV_FILE_PATH is
   kept as a deprecated snapshot for back-compat. Lets tests target a
   temp file without depending on module-load order.

3. tests/env-isolation.test.ts — redirects to a temp dir via
   CLAUDE_MEM_ENV_FILE in beforeAll, removes all mutation of the real
   ~/.claude-mem/.env, and wraps the OAuth-spy assertion in try/finally
   so the spy is always restored even if the test fails.

Verified:
  bun test tests/env-isolation.test.ts \
    tests/claude-provider-error-classifier.test.ts \
    tests/services/sync/chroma-mcp-manager-singleton.test.ts
  → 13/13 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Alex Newman
2026-05-09 18:05:48 -07:00
committed by GitHub
parent 13d5fa71c2
commit 5533412984
14 changed files with 1064 additions and 417 deletions
+1 -1
View File
@@ -212,7 +212,7 @@ ${f}`}let a=s.lineStart;for(let u=s.lineStart-1;u>=0;u--){let l=i[u].trim();if(l
${c}`}var u_=new Set([".js",".jsx",".ts",".tsx",".mjs",".cjs",".py",".pyw",".go",".rs",".rb",".java",".cs",".cpp",".cc",".cxx",".c",".h",".hpp",".hh",".swift",".kt",".kts",".php",".vue",".svelte",".ex",".exs",".lua",".scala",".sc",".sh",".bash",".zsh",".hs",".zig",".css",".scss",".toml",".yml",".yaml",".sql",".md",".mdx"]),Gx=new Set(["node_modules",".git","dist","build",".next","__pycache__",".venv","venv","env",".env","target","vendor",".cache",".turbo","coverage",".nyc_output",".claude",".smart-file-read"]),Kx=512*1024;async function*l_(t,e,r=20,n){if(r<=0)return;let o;try{o=await(0,Ar.readdir)(t,{withFileTypes:!0})}catch(s){S.debug("WORKER",`walkDir: failed to read directory ${t}`,void 0,s instanceof Error?s:void 0);return}for(let s of o){if(s.name.startsWith(".")&&s.name!=="."||Gx.has(s.name))continue;let i=(0,Vn.join)(t,s.name);if(s.isDirectory())yield*l_(i,e,r-1,n);else if(s.isFile()){let a=s.name.slice(s.name.lastIndexOf("."));(u_.has(a)||n&&n.has(a))&&(yield i)}}}async function Jx(t){try{let e=await(0,Ar.stat)(t);if(e.size>Kx||e.size===0)return null;let r=await(0,Ar.readFile)(t,"utf-8");return r.slice(0,1e3).includes("\0")?null:r}catch(e){return S.debug("WORKER",`safeReadFile: failed to read ${t}`,void 0,e instanceof Error?e:void 0),null}}async function d_(t,e,r={}){let n=r.maxResults||20,o=e.toLowerCase(),s=o.split(/[\s_\-./]+/).filter(w=>w.length>0),i=r.projectRoot||t,a=Wn(i),c=new Set;for(let w of Object.values(a.grammars))for(let v of w.extensions)u_.has(v)||c.add(v);let u=[];for await(let w of l_(t,t,20,c.size>0?c:void 0)){if(r.filePattern&&!(0,Vn.relative)(t,w).toLowerCase().includes(r.filePattern.toLowerCase()))continue;let v=await Jx(w);v&&u.push({absolutePath:w,relativePath:(0,Vn.relative)(t,w),content:v})}let l=i_(u,i),d=[],p=[],f=0;for(let[w,v]of l){f+=Bx(v);let k=Ps(w.toLowerCase(),s)>0,_e=[],Ee=(Dt,Qt)=>{for(let ae of Dt){let St=0,Ve="",Cr=Ps(ae.name.toLowerCase(),s);Cr>0&&(St+=Cr*3,Ve="name match"),ae.signature.toLowerCase().includes(o)&&(St+=2,Ve=Ve?`${Ve} + signature`:"signature match"),ae.jsdoc&&ae.jsdoc.toLowerCase().includes(o)&&(St+=1,Ve=Ve?`${Ve} + jsdoc`:"jsdoc match"),St>0&&(k=!0,_e.push({filePath:w,symbolName:Qt?`${Qt}.${ae.name}`:ae.name,kind:ae.kind,signature:ae.signature,jsdoc:ae.jsdoc,lineStart:ae.lineStart,lineEnd:ae.lineEnd,matchReason:Ve})),ae.children&&Ee(ae.children,ae.name)}};Ee(v.symbols),k&&(d.push(v),p.push(..._e))}p.sort((w,v)=>{let x=Ps(w.symbolName.toLowerCase(),s);return Ps(v.symbolName.toLowerCase(),s)-x});let m=p.slice(0,n),_=new Set(m.map(w=>w.filePath)),y=d.filter(w=>_.has(w.filePath)).slice(0,n),b=y.reduce((w,v)=>w+v.foldedTokenEstimate,0);return{foldedFiles:y,matchingSymbols:m,totalFilesScanned:u.length,totalSymbolsFound:f,tokenEstimate:b}}function Ps(t,e){let r=0;for(let n of e)if(t===n)r+=10;else if(t.includes(n))r+=5;else{let o=0,s=0;for(let i of n){let a=t.indexOf(i,o);a!==-1&&(s++,o=a+1)}s===n.length&&(r+=1)}return r}function Bx(t){let e=t.symbols.length;for(let r of t.symbols)r.children&&(e+=r.children.length);return e}function p_(t,e){let r=[];if(r.push(`\u{1F50D} Smart Search: "${e}"`),r.push(` Scanned ${t.totalFilesScanned} files, found ${t.totalSymbolsFound} symbols`),r.push(` ${t.matchingSymbols.length} matches across ${t.foldedFiles.length} files (~${t.tokenEstimate} tokens for folded view)`),r.push(""),t.matchingSymbols.length===0)return r.push(" No matching symbols found."),r.join(`
`);r.push("\u2500\u2500 Matching Symbols \u2500\u2500"),r.push("");for(let n of t.matchingSymbols){if(r.push(` ${n.kind} ${n.symbolName} (${n.filePath}:${n.lineStart+1})`),r.push(` ${n.signature}`),n.jsdoc){let o=n.jsdoc.split(`
`).find(s=>s.replace(/^[\s*/]+/,"").trim().length>0);o&&r.push(` \u{1F4AC} ${o.replace(/^[\s*/]+/,"").trim()}`)}r.push("")}r.push("\u2500\u2500 Folded File Views \u2500\u2500"),r.push("");for(let n of t.foldedFiles)r.push(Ir(n)),r.push("");return r.push("\u2500\u2500 Actions \u2500\u2500"),r.push(" To see full implementation: use smart_unfold with file path and symbol name"),r.join(`
`)}var iu=require("node:fs/promises"),zs=require("node:fs"),Qe=require("node:path"),h_=require("node:os"),g_=require("node:url"),cP={},Yx="12.7.5";console.log=(...t)=>{S.error("CONSOLE","Intercepted console output (MCP protocol protection)",void 0,{args:t})};var __=!1,y_=(()=>{if(typeof __dirname<"u")return __dirname;try{return(0,Qe.dirname)((0,g_.fileURLToPath)(cP.url))}catch{return __=!0,process.cwd()}})(),au=(0,Qe.resolve)(y_,"worker-service.cjs");function Xx(){__&&((0,zs.existsSync)(au)||S.error("SYSTEM","mcp-server: dirname resolution failed (both __dirname and import.meta.url are unavailable). Fell back to process.cwd() and the resolved WORKER_SCRIPT_PATH does not exist. This is the actual problem \u2014 the worker bundle is fine, but mcp-server cannot locate it. Worker auto-start will fail until the dirname-resolution path is fixed.",{workerScriptPath:au,mcpServerDir:y_}))}var f_={search:"/api/search",timeline:"/api/timeline"};async function su(t,e){S.debug("SYSTEM","\u2192 Worker API",void 0,{endpoint:t,params:e});let r=new URLSearchParams;for(let[o,s]of Object.entries(e))s!=null&&r.append(o,String(s));let n=`${t}?${r}`;try{let o=await $s(n);if(!o.ok){let i=await o.text();throw new Error(`Worker API error (${o.status}): ${i}`)}let s=await o.json();return S.debug("SYSTEM","\u2190 Worker API success",void 0,{endpoint:t}),s}catch(o){return S.error("SYSTEM","\u2190 Worker API error",{endpoint:t},o instanceof Error?o:new Error(String(o))),{content:[{type:"text",text:`Error calling Worker API: ${o instanceof Error?o.message:String(o)}`}],isError:!0}}}async function Qx(t,e){let r=await $s(t,{method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify(e)});if(!r.ok){let o=await r.text();throw new Error(`Worker API error (${r.status}): ${o}`)}let n=await r.json();return S.debug("HTTP","Worker API success (POST)",void 0,{endpoint:t}),{content:[{type:"text",text:JSON.stringify(n,null,2)}]}}async function Mr(t,e){S.debug("HTTP","Worker API request (POST)",void 0,{endpoint:t});try{return await Qx(t,e)}catch(r){return S.error("HTTP","Worker API error (POST)",{endpoint:t},r instanceof Error?r:new Error(String(r))),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function eP(){try{return(await $s("/api/health")).ok}catch(t){return S.debug("SYSTEM","Worker health check failed",{},t instanceof Error?t:new Error(String(t))),!1}}async function tP(){if(await eP())return!0;S.warn("SYSTEM","Worker not available, attempting auto-start for MCP client"),Xx();try{let t=Jc(),e=await Kg(t,au);return e==="dead"&&S.error("SYSTEM","Worker auto-start failed \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running. Check earlier log lines for the specific failure reason (Bun not found, missing worker bundle, port conflict, etc.)."),e!=="dead"}catch(t){return S.error("SYSTEM","Worker auto-start threw \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running.",void 0,t instanceof Error?t:new Error(String(t))),!1}}var S_=[{name:"__IMPORTANT",description:`3-LAYER WORKFLOW (ALWAYS FOLLOW):
`)}var iu=require("node:fs/promises"),zs=require("node:fs"),Qe=require("node:path"),h_=require("node:os"),g_=require("node:url"),cP={},Yx="13.0.0";console.log=(...t)=>{S.error("CONSOLE","Intercepted console output (MCP protocol protection)",void 0,{args:t})};var __=!1,y_=(()=>{if(typeof __dirname<"u")return __dirname;try{return(0,Qe.dirname)((0,g_.fileURLToPath)(cP.url))}catch{return __=!0,process.cwd()}})(),au=(0,Qe.resolve)(y_,"worker-service.cjs");function Xx(){__&&((0,zs.existsSync)(au)||S.error("SYSTEM","mcp-server: dirname resolution failed (both __dirname and import.meta.url are unavailable). Fell back to process.cwd() and the resolved WORKER_SCRIPT_PATH does not exist. This is the actual problem \u2014 the worker bundle is fine, but mcp-server cannot locate it. Worker auto-start will fail until the dirname-resolution path is fixed.",{workerScriptPath:au,mcpServerDir:y_}))}var f_={search:"/api/search",timeline:"/api/timeline"};async function su(t,e){S.debug("SYSTEM","\u2192 Worker API",void 0,{endpoint:t,params:e});let r=new URLSearchParams;for(let[o,s]of Object.entries(e))s!=null&&r.append(o,String(s));let n=`${t}?${r}`;try{let o=await $s(n);if(!o.ok){let i=await o.text();throw new Error(`Worker API error (${o.status}): ${i}`)}let s=await o.json();return S.debug("SYSTEM","\u2190 Worker API success",void 0,{endpoint:t}),s}catch(o){return S.error("SYSTEM","\u2190 Worker API error",{endpoint:t},o instanceof Error?o:new Error(String(o))),{content:[{type:"text",text:`Error calling Worker API: ${o instanceof Error?o.message:String(o)}`}],isError:!0}}}async function Qx(t,e){let r=await $s(t,{method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify(e)});if(!r.ok){let o=await r.text();throw new Error(`Worker API error (${r.status}): ${o}`)}let n=await r.json();return S.debug("HTTP","Worker API success (POST)",void 0,{endpoint:t}),{content:[{type:"text",text:JSON.stringify(n,null,2)}]}}async function Mr(t,e){S.debug("HTTP","Worker API request (POST)",void 0,{endpoint:t});try{return await Qx(t,e)}catch(r){return S.error("HTTP","Worker API error (POST)",{endpoint:t},r instanceof Error?r:new Error(String(r))),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function eP(){try{return(await $s("/api/health")).ok}catch(t){return S.debug("SYSTEM","Worker health check failed",{},t instanceof Error?t:new Error(String(t))),!1}}async function tP(){if(await eP())return!0;S.warn("SYSTEM","Worker not available, attempting auto-start for MCP client"),Xx();try{let t=Jc(),e=await Kg(t,au);return e==="dead"&&S.error("SYSTEM","Worker auto-start failed \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running. Check earlier log lines for the specific failure reason (Bun not found, missing worker bundle, port conflict, etc.)."),e!=="dead"}catch(t){return S.error("SYSTEM","Worker auto-start threw \u2014 MCP tools that require the worker (search, timeline, get_observations) will fail until the worker is running.",void 0,t instanceof Error?t:new Error(String(t))),!1}}var S_=[{name:"__IMPORTANT",description:`3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) \u2192 Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) \u2192 Get context around interesting results
3. get_observations([IDs]) \u2192 Fetch full details ONLY for filtered IDs
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long