fix: sequoia-territory bug-fix bundle (chroma, env, build, MCP, worker) (#2394)

* fix(mcp): drop ${_R%/} parameter-expansion trim that trips Claude Code MCP validator

The POSIX parameter expansion ${_R%/} (strip one trailing slash) is misread
by Claude Code's MCP-config validator as a required env var named "_R%/",
causing /doctor to flag mcp-search as invalid on every install. POSIX path
resolution treats consecutive slashes inside a path as a single slash, so
the trim was cosmetic. Drop it and the validator passes.
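
As a rough illustration of why the trim was cosmetic, the sketch below joins
a root that ends in a slash with a subpath and normalizes the result. The
root and script path are hypothetical placeholders, not the values used in
.mcp.json.

```typescript
import { posix } from 'node:path';

// Hypothetical plugin root with a trailing slash (the case ${_R%/} guarded against).
const root = '/opt/claude-mem/plugin/';

// Without the trim, joining produces a doubled slash in the middle of the path.
const joined = `${root}/scripts/mcp-server.cjs`;

// Normalization collapses the doubled slash, so both spellings name the same file.
const normalized = posix.normalize(joined);

console.log(joined);
console.log(normalized);
```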

Fixes #2350, #2354, #2356.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(env): block ANTHROPIC_BASE_URL leak + three-branch OAuth-skip predicate

Issue #2375: parent-shell ANTHROPIC_BASE_URL leaked through to subprocess
isolatedEnv, while ANTHROPIC_AUTH_TOKEN was blocked. The OAuth-skip
predicate fired on bare BASE_URL, but no auth credential reached the
subprocess -> "Not logged in". Add ANTHROPIC_BASE_URL to BLOCKED_ENV_VARS
so it can only enter isolatedEnv via ~/.claude-mem/.env.

Replace the OAuth-skip predicate with three branches to prevent a
second-order security regression: a user with a tokenless gateway
configured in .env (BASE_URL only, no token) would otherwise have their
Anthropic OAuth token fetched and sent to their gateway, leaking the
token to a third party. The three-branch predicate:

1. BASE_URL set -> return without OAuth (custom gateway, never leak token)
2. API_KEY or AUTH_TOKEN set -> return without OAuth (explicit credentials)
3. Otherwise -> OAuth lookup for api.anthropic.com
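
The three branches above can be sketched as a pure function. This is a
simplified illustration, not the EnvManager code: the `CredentialSource`
type and `chooseCredentialSource` name are hypothetical, and the real
function also clears a stale marker and performs the OAuth lookup itself.

```typescript
type IsolatedEnv = Record<string, string | undefined>;

// Hypothetical labels for the three outcomes of the predicate.
type CredentialSource = 'gateway' | 'explicit' | 'oauth';

function chooseCredentialSource(env: IsolatedEnv): CredentialSource {
  // 1. Custom gateway: never fetch OAuth, even when no token is configured.
  if (env.ANTHROPIC_BASE_URL) return 'gateway';
  // 2. Explicit credentials for the direct API: skip the OAuth lookup.
  if (env.ANTHROPIC_API_KEY || env.ANTHROPIC_AUTH_TOKEN) return 'explicit';
  // 3. Nothing configured: fall back to OAuth for api.anthropic.com.
  return 'oauth';
}

// A tokenless gateway must not trigger an OAuth lookup.
console.log(chooseCredentialSource({ ANTHROPIC_BASE_URL: 'https://gateway.example' }));
```

Checking BASE_URL first is the point of the ordering: a gateway entry wins
even when a token is also present, so the OAuth branch is reachable only
when no gateway is configured.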

Adds tests/env-isolation.test.ts.

Fixes #2375.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(worker): classify Claude SDK HTTP 400 as unrecoverable

ClaudeProvider previously had no explicit HTTP 400 handling: the
default branch classified all errors as `transient`, so a permanent
400 (e.g., a model rejecting an `effort` parameter forwarded from a
leaked CLAUDE_CODE_EFFORT_LEVEL) would be retried indefinitely
(1874+ retries observed in one session, per #2357).

Mirror GeminiProvider/OpenRouterProvider's pattern: classify 400 as
`unrecoverable`, 401/403 as `auth_invalid`, 429 as `rate_limit`,
default to `transient`. When the 400 body matches the
"effort parameter" signature, emit a one-time SDK warn log pointing
at the env-leak fix in ~/.claude-mem/.env.
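
A minimal sketch of the status mapping and the one-time hint, assuming the
caller already has a numeric status and a message. The `classifyStatus`
name, the injected `warn` callback, and the hint text are hypothetical; the
real classifier also handles spawn failures and 5xx responses.

```typescript
type ErrorKind = 'unrecoverable' | 'auth_invalid' | 'rate_limit' | 'transient';

// Module-scoped latch so the hint is emitted at most once per process.
let hintLogged = false;

function classifyStatus(
  status: number | undefined,
  message: string,
  warn: (text: string) => void,
): ErrorKind {
  if (status === 400) {
    // A bad request never succeeds on retry; surface the likely cause once.
    if (/effort parameter/i.test(message) && !hintLogged) {
      hintLogged = true;
      warn('HTTP 400 mentions the effort parameter; check ~/.claude-mem/.env');
    }
    return 'unrecoverable';
  }
  if (status === 401 || status === 403) return 'auth_invalid';
  if (status === 429) return 'rate_limit';
  return 'transient';
}

const warnings: string[] = [];
const record = (text: string) => { warnings.push(text); };
console.log(classifyStatus(400, 'unsupported effort parameter', record));
console.log(classifyStatus(400, 'unsupported effort parameter', record));
console.log(warnings.length);
```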

Adds tests/claude-provider-error-classifier.test.ts.

Fixes #2357.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroma): pin onnxruntime>=1.20 + protobuf<7 to fix INVALID_PROTOBUF on macOS arm64

The shipped all-MiniLM-L6-v2 model has pytorch-2.0 IR. chroma-mcp 0.2.6
transitively depends on `chromadb>=1.0.16` which only requires
`onnxruntime>=1.14.1` — uv can therefore resolve to an onnxruntime old
enough to fail every embedding add with `[ONNXRuntimeError] : 7 :
INVALID_PROTOBUF` on macOS arm64 / Python 3.13. Semantic search silently
degraded to FTS-only and smart backfill broke (#2371).

The override approach ("Path B") was required because chroma-mcp 0.2.6 is
the latest PyPI release, so there is no upstream version to bump to.

Inject `--with onnxruntime>=1.20 --with protobuf<7` into the uvx spawn
args (both persistent and remote modes). The protobuf cap is essential:
forcing only `onnxruntime>=1.20` causes uv to re-resolve and land on
protobuf 7.x, which trips opentelemetry's `_pb2` stubs with `TypeError:
Descriptors cannot be created directly` because they were generated
with protoc <3.19. Capping below 7 lands on protobuf 6.x which
opentelemetry tolerates.
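
A sketch of how the override list expands into uvx arguments for the
persistent mode, assuming the argument order shown in the command below.
The `buildPersistentArgs` name and the data directory are hypothetical.

```typescript
const PINNED_VERSION = '0.2.6';
const DEP_OVERRIDES: ReadonlyArray<string> = ['onnxruntime>=1.20', 'protobuf<7'];

function buildPersistentArgs(pythonVersion: string, dataDir: string): string[] {
  // Each override becomes its own `--with <spec>` pair placed before the package.
  const overrideFlags = DEP_OVERRIDES.flatMap(spec => ['--with', spec]);
  return [
    '--python', pythonVersion,
    ...overrideFlags,
    `chroma-mcp==${PINNED_VERSION}`,
    '--client-type', 'persistent',
    '--data-dir', dataDir,
  ];
}

// Hypothetical data directory, for illustration only.
const args = buildPersistentArgs('3.13', '/tmp/chroma-data');
console.log(args.join(' '));
```

Because the arguments are passed as an array to the spawned process, the
`>=` and `<` characters need no shell quoting here, unlike in the manual
command line.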

Verified end-to-end: ONNX model loads, embeddings produce a 384-dim
vector, PersistentClient init / add / query roundtrip succeeds:

    uvx --python 3.13 --with "onnxruntime>=1.20" --with "protobuf<7" \
        chroma-mcp==0.2.6 --help     # clean
    # programmatic test: onnxruntime 1.26.0, protobuf 6.33.6,
    # embedding ok 384, query ok ids=[['1']]

Fixes #2371.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroma): enforce single chroma-mcp subprocess per worker (#2313)

Root cause: every reconnect path in ChromaMcpManager — connectInternal's
re-entry, the connect-timeout catch, callTool's transport-error retry, and
the transport.onclose handler — used to abandon `this.transport`/`this.client`
by calling at most `transport.close()` and nulling the handles. The MCP SDK's
StdioClientTransport.close() only signals the direct child (uvx); on Linux the
grandchildren (uv -> python -> chroma-mcp) re-parent to init and survive
because the SDK does not put the subprocess in its own process group. Each
reconnect therefore leaked a full chroma-mcp tree, accumulating 20+ instances
per session.

Fix: introduce a private disposeCurrentSubprocess() helper that always tree-
kills via the existing killProcessTree primitive before nulling the transport
reference, and route every "abandon current transport" path (reconnect,
connect-timeout, transport error, onclose, stop) through it. The existing
`connecting: Promise<void> | null` lock continues to serialize concurrent
ensureConnected() callers into a single spawn.
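
The serialization provided by the `connecting` lock can be sketched as
follows. This is a simplified stand-in, not ChromaMcpManager itself: the
`SingleSpawnConnector` class and its `spawnCount` counter are hypothetical,
and the real manager also handles timeouts and disposal.

```typescript
class SingleSpawnConnector {
  private connecting: Promise<void> | null = null;
  private connected = false;
  spawnCount = 0;

  constructor(private readonly spawn: () => Promise<void>) {}

  async ensureConnected(): Promise<void> {
    if (this.connected) return;
    if (!this.connecting) {
      // First caller starts the connect; the lock is cleared once it settles.
      this.connecting = this.connectOnce().finally(() => {
        this.connecting = null;
      });
    }
    // Every concurrent caller awaits the same in-flight promise.
    return this.connecting;
  }

  private async connectOnce(): Promise<void> {
    this.spawnCount += 1;
    await this.spawn();
    this.connected = true;
  }
}

// Five parallel callers share a single simulated spawn.
const connector = new SingleSpawnConnector(
  () => new Promise<void>(resolve => setTimeout(resolve, 10)),
);
const allConnected = Promise.all(
  Array.from({ length: 5 }, () => connector.ensureConnected()),
);
allConnected.then(() => console.log(connector.spawnCount));
```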

Adds tests/services/sync/chroma-mcp-manager-singleton.test.ts covering:
- 5 parallel ensureConnected() calls produce exactly one spawn
- a transport-error reconnect tree-kills the prior subprocess pid before
  spawning a replacement
- stop() disposes state including any pending connecting promise

Manual verification needed on Linux: after a long session with multiple
tool uses, `ps aux | grep chroma-mcp | wc -l` should return 1, not 20+.

Fixes #2313.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(build): polyfill import.meta.url to __filename in CJS worker bundle

The worker bundles ESM dependencies (notably @anthropic-ai/claude-agent-sdk's
*.mjs files) into CJS output. Those modules call createRequire(import.meta.url)
at module-load time. esbuild's CJS output left this as createRequire(ute.url)
— where `ute` is its `import.meta` polyfill `{}` — so `ute.url` was undefined
and module-load crashed with:

  TypeError: The argument 'filename' must be a file URL object, file URL
  string, or absolute path string. Received undefined
  code: ERR_INVALID_ARG_VALUE

Every Stop hook and every worker subprocess invocation hit this. Fix is the
esbuild `define` option mapping `import.meta.url` to `__filename` (provided as
a real absolute path by the existing CJS prelude in the banner).
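
The failure and the fix can be reproduced in isolation with a small probe.
The bundle path is a hypothetical placeholder for `__filename`, and
`probeCreateRequire` is an illustrative helper, not part of the build script.

```typescript
import { createRequire } from 'node:module';

// Hypothetical absolute path standing in for the bundle's __filename.
const bundlePath = '/tmp/claude-mem/worker-service.cjs';

// Returns 'ok' when createRequire accepts the argument, else the error code.
function probeCreateRequire(filename: unknown): string {
  try {
    createRequire(filename as string);
    return 'ok';
  } catch (error) {
    return (error as { code?: string }).code ?? 'error';
  }
}

// Before the fix: `import.meta` was `{}`, so `.url` evaluated to undefined.
const withoutDefine = probeCreateRequire(undefined);
// After the fix: the define substitutes an absolute path.
const withDefine = probeCreateRequire(bundlePath);

console.log({ withoutDefine, withDefine });
```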

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: daily dep bump per CLAUDE.md maintenance policy

Root: @anthropic-ai/claude-agent-sdk, @clack/prompts, @types/node,
dompurify, postcss, react, react-dom, yaml, zod.
plugin/: tree-sitter-cli, zod.
openclaw/: @types/node.

All patch/minor bumps; no major version changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build: regenerate plugin artifacts after env/chroma/mcp fixes

Built artifacts are committed so the marketplace-installable plugin
ships with the runtime bundles. Picks up:
- d7b145e9 .mcp.json shell-prelude trim drop
- a8cbd651 EnvManager BASE_URL block + 3-branch predicate
- 8cb73b8c ClaudeProvider HTTP 400 unrecoverable classifier
- ecd5b802 ChromaMcpManager onnxruntime/protobuf overrides
- c79324ea ChromaMcpManager singleton enforcement
- e8376f46 esbuild import.meta.url -> __filename polyfill
- a7541d71 daily dep bump

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build: regenerate plugin artifacts after main merge

Bundles now include both v13.0.0 server-beta runtime (server-beta-service.cjs
+ updated mcp-server.cjs / worker-service.cjs) and this branch's chroma /
env / build / Claude SDK fixes.

Verified: bun test tests/env-isolation.test.ts \
  tests/claude-provider-error-classifier.test.ts \
  tests/services/sync/chroma-mcp-manager-singleton.test.ts
→ 13/13 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(review): address CodeRabbit findings on PR #2394

1. scripts/build-hooks.js — `import.meta.url` now maps to a file:// URL
   (via pathToFileURL(__filename).href in the CJS banner) instead of the
   raw __filename path. Preserves URL semantics for any bundled ESM dep
   that does `new URL(rel, import.meta.url)`. createRequire still works.

2. src/shared/EnvManager.ts — added envFilePath() that resolves
   CLAUDE_MEM_ENV_FILE lazily (falling back to paths.envFile()), and
   switched internal load/save call sites to use it. ENV_FILE_PATH is
   kept as a deprecated snapshot for back-compat. Lets tests target a
   temp file without depending on module-load order.

3. tests/env-isolation.test.ts — redirects to a temp dir via
   CLAUDE_MEM_ENV_FILE in beforeAll, removes all mutation of the real
   ~/.claude-mem/.env, and wraps the OAuth-spy assertion in try/finally
   so the spy is always restored even if the test fails.
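
To illustrate the first finding, the sketch below resolves a relative asset
against a raw path and against a file:// URL. The bundle location and asset
name are hypothetical, and `resolveAgainst` is an illustrative helper.

```typescript
// Hypothetical bundle location, as a raw path and as a file:// URL.
const rawPath = '/tmp/claude-mem/worker-service.cjs';
const fileUrl = 'file:///tmp/claude-mem/worker-service.cjs';

// Hypothetical asset a bundled ESM dependency might resolve next to itself.
const relativeAsset = './assets/model.json';

function resolveAgainst(base: string): string {
  try {
    return new URL(relativeAsset, base).href;
  } catch {
    // A bare filesystem path is not a valid base URL.
    return 'invalid base';
  }
}

console.log(resolveAgainst(rawPath));
console.log(resolveAgainst(fileUrl));
```

A raw path is enough for createRequire but not for the URL constructor,
which is why the mapping was changed to a file:// URL.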

Verified:
  bun test tests/env-isolation.test.ts \
    tests/claude-provider-error-classifier.test.ts \
    tests/services/sync/chroma-mcp-manager-singleton.test.ts
  → 13/13 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Alex Newman
2026-05-09 18:05:48 -07:00
committed by GitHub
parent 13d5fa71c2
commit 5533412984
14 changed files with 1064 additions and 417 deletions
+106 -41
@@ -23,6 +23,26 @@ const CHROMA_SUPERVISOR_ID = 'chroma-mcp';
const CHROMA_MCP_PINNED_VERSION = '0.2.6';
// Override transitive dep resolutions for chroma-mcp 0.2.6 (issue #2371).
//
// Why onnxruntime>=1.20: the shipped all-MiniLM-L6-v2 model has pytorch-2.0
// IR. Older onnxruntime versions can't parse it and fail every embedding
// add with `[ONNXRuntimeError] : 7 : INVALID_PROTOBUF`. uv may otherwise
// resolve to a too-old onnxruntime on macOS arm64 / Python 3.13 depending
// on cache state, so we force a floor.
//
// Why protobuf<7: protobuf 7.x's stricter generated-file check rejects
// opentelemetry's _pb2 stubs (generated with protoc <3.19), throwing
// `TypeError: Descriptors cannot be created directly` at chromadb import.
// Capping below 7 lands on protobuf 6.x which opentelemetry tolerates.
//
// These pins are runtime-only (uvx --with) so we don't have to fork
// chroma-mcp upstream — they apply only to claude-mem's spawned subprocess.
const CHROMA_MCP_DEP_OVERRIDES: ReadonlyArray<string> = [
'onnxruntime>=1.20',
'protobuf<7',
];
export class ChromaMcpManager {
private static instance: ChromaMcpManager | null = null;
private client: Client | null = null;
@@ -72,15 +92,14 @@ export class ChromaMcpManager {
}
private async connectInternal(): Promise<void> {
if (this.transport) {
try { await this.transport.close(); } catch { /* already dead */ }
}
if (this.client) {
try { await this.client.close(); } catch { /* already dead */ }
}
this.client = null;
this.transport = null;
this.connected = false;
// Singleton invariant (#2313): kill any pre-existing chroma-mcp subprocess
// tree before spawning a new one. The MCP SDK's transport.close() only
// signals the direct child (uvx); on Linux the grandchildren (uv, python,
// chroma-mcp) get re-parented to init and survive, accumulating 20+
// instances per session if reconnects fire repeatedly. Reuse the same
// tree-kill primitive used by stop() so reconnect can never leave
// orphans behind.
await this.disposeCurrentSubprocess();
const commandArgs = this.buildCommandArgs();
const spawnEnvironment = this.getSpawnEnv();
@@ -121,14 +140,12 @@ export class ChromaMcpManager {
await Promise.race([mcpConnectionPromise, timeoutPromise]);
} catch (connectionError) {
clearTimeout(timeoutId!);
logger.warn('CHROMA_MCP', 'Connection failed, killing subprocess to prevent zombie', {
logger.warn('CHROMA_MCP', 'Connection failed, killing subprocess tree to prevent zombie', {
error: connectionError instanceof Error ? connectionError.message : String(connectionError)
});
try { await this.transport.close(); } catch { /* best effort */ }
try { await this.client.close(); } catch { /* best effort */ }
this.client = null;
this.transport = null;
this.connected = false;
// Tree-kill (not just transport.close) so failed-connect descendants
// can't survive on Linux (#2313).
await this.disposeCurrentSubprocess();
throw connectionError;
}
clearTimeout(timeoutId!);
@@ -139,6 +156,7 @@ export class ChromaMcpManager {
logger.info('CHROMA_MCP', 'Connected to chroma-mcp successfully');
const currentTransport = this.transport;
const currentTrackedPid = (this.transport as unknown as { _process?: ChildProcess })._process?.pid;
this.transport.onclose = () => {
if (this.transport !== currentTransport) {
logger.debug('CHROMA_MCP', 'Ignoring stale onclose from previous transport');
@@ -150,6 +168,20 @@ export class ChromaMcpManager {
this.client = null;
this.transport = null;
this.lastConnectionFailureTimestamp = Date.now();
// Direct child (uvx) emitted close, but on Linux the grandchildren
// (uv/python/chroma-mcp) often outlive their parent because MCP SDK
// does not use process groups. Sweep the descendant tree using the
// captured PID — best-effort; pgrep returns nothing if everything
// already exited (#2313).
if (currentTrackedPid) {
ChromaMcpManager.killProcessTree(currentTrackedPid).catch((error) => {
logger.debug('CHROMA_MCP', 'Background tree-kill after onclose finished (best-effort)', {
pid: currentTrackedPid,
error: error instanceof Error ? error.message : String(error)
});
});
}
};
}
@@ -158,6 +190,8 @@ export class ChromaMcpManager {
const chromaMode = settings.CLAUDE_MEM_CHROMA_MODE || 'local';
const pythonVersion = process.env.CLAUDE_MEM_PYTHON_VERSION || settings.CLAUDE_MEM_PYTHON_VERSION || '3.13';
const depOverrideFlags = CHROMA_MCP_DEP_OVERRIDES.flatMap(spec => ['--with', spec]);
if (chromaMode === 'remote') {
const chromaHost = settings.CLAUDE_MEM_CHROMA_HOST || '127.0.0.1';
const chromaPort = settings.CLAUDE_MEM_CHROMA_PORT || '8000';
@@ -168,6 +202,7 @@ export class ChromaMcpManager {
const args = [
'--python', pythonVersion,
...depOverrideFlags,
`chroma-mcp==${CHROMA_MCP_PINNED_VERSION}`,
'--client-type', 'http',
'--host', chromaHost,
@@ -193,6 +228,7 @@ export class ChromaMcpManager {
return [
'--python', pythonVersion,
...depOverrideFlags,
`chroma-mcp==${CHROMA_MCP_PINNED_VERSION}`,
'--client-type', 'persistent',
'--data-dir', DEFAULT_CHROMA_DATA_DIR.replace(/\\/g, '/')
@@ -213,14 +249,15 @@ export class ChromaMcpManager {
arguments: toolArguments
});
} catch (transportError) {
this.connected = false;
this.client = null;
this.transport = null;
logger.warn('CHROMA_MCP', `Transport error during "${toolName}", reconnecting and retrying once`, {
error: transportError instanceof Error ? transportError.message : String(transportError)
});
// Tree-kill the dying subprocess before reconnect. Previously this path
// just nulled the handle, which on Linux leaks the uv/python/chroma-mcp
// descendants every time a transport error happens (#2313).
await this.disposeCurrentSubprocess();
try {
await this.ensureConnected();
result = await this.client!.callTool({
@@ -328,6 +365,53 @@ export class ChromaMcpManager {
}
}
/**
* Singleton enforcement helper (#2313): tree-kill the currently tracked
* chroma-mcp subprocess and reset all state so the next spawn starts clean.
*
* Why this is the singleton invariant: every code path that intends to
* abandon `this.transport` / `this.client` (reconnect, transport error,
* connect-timeout, onclose, stop()) MUST funnel through here. The MCP
* SDK's transport.close() only signals the direct child (uvx); on Linux
* the grandchildren (uv, python, chroma-mcp) re-parent to init and
* accumulate. Calling killProcessTree() against the captured PID before
* we drop the reference is the only way to guarantee at most one
* chroma-mcp subprocess tree exists per worker process.
*
* Idempotent and best-effort — safe to call when there is no active
* subprocess (no-op in that case).
*/
private async disposeCurrentSubprocess(): Promise<void> {
const chromaProcess = (this.transport as unknown as { _process?: ChildProcess })?._process;
const trackedPid = chromaProcess?.pid;
if (trackedPid) {
try {
await ChromaMcpManager.killProcessTree(trackedPid);
} catch (error) {
logger.warn('CHROMA_MCP', 'failed to kill prior chroma-mcp tree (best-effort)', {
pid: trackedPid,
error: error instanceof Error ? error.message : String(error)
});
}
}
if (this.transport) {
try { await this.transport.close(); } catch { /* already dead */ }
}
if (this.client) {
try { await this.client.close(); } catch { /* already dead */ }
}
if (trackedPid) {
getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
}
this.client = null;
this.transport = null;
this.connected = false;
}
/**
* Gracefully stop the MCP connection and kill the chroma-mcp subprocess tree.
*
@@ -341,34 +425,15 @@ export class ChromaMcpManager {
* pattern from shutdown.ts (Principle 5: OS-supervised teardown).
*/
async stop(): Promise<void> {
if (!this.client) {
if (!this.client && !this.transport) {
logger.debug('CHROMA_MCP', 'No active MCP connection to stop');
this.connecting = null;
return;
}
logger.info('CHROMA_MCP', 'Stopping chroma-mcp MCP connection');
// Kill the entire process tree before closing the MCP client so
// descendants (uv, python, chroma-mcp) don't become orphans.
const chromaProcess = (this.transport as unknown as { _process?: ChildProcess })?._process;
if (chromaProcess?.pid) {
await ChromaMcpManager.killProcessTree(chromaProcess.pid);
}
try {
await this.client.close();
} catch (error) {
if (error instanceof Error) {
logger.debug('CHROMA_MCP', 'Error during client close (subprocess may already be dead)', {}, error);
} else {
logger.debug('CHROMA_MCP', 'Error during client close (subprocess may already be dead)', { error: String(error) });
}
}
getSupervisor().unregisterProcess(CHROMA_SUPERVISOR_ID);
this.client = null;
this.transport = null;
this.connected = false;
await this.disposeCurrentSubprocess();
this.connecting = null;
logger.info('CHROMA_MCP', 'chroma-mcp MCP connection stopped');
+47 -1
@@ -27,6 +27,19 @@ import {
import { query } from '@anthropic-ai/claude-agent-sdk';
import { ClassifiedProviderError } from './provider-errors.js';
/**
* Module-scoped guard so the "effort parameter" hint only fires once per
* worker process. The underlying cause (a leaked CLAUDE_CODE_EFFORT_LEVEL in
* ~/.claude-mem/.env, see #2357) is environmental — re-logging it on every
* SDK call would spam the logs without adding signal.
*
* Exported solely for tests to reset the latch between cases.
*/
let effortHintLogged = false;
export function __resetEffortHintLatchForTesting(): void {
effortHintLogged = false;
}
/**
* Classify a ClaudeProvider error (executable spawn failures, SDK errors,
* Anthropic API errors). Provider-specific because it relies on:
@@ -36,7 +49,7 @@ import { ClassifiedProviderError } from './provider-errors.js';
*/
export function classifyClaudeError(err: unknown): ClassifiedProviderError {
const message = err instanceof Error ? err.message : String(err);
const errAny = err as { name?: string; status?: number; error?: { type?: string } };
const errAny = err as { name?: string; status?: number; error?: { type?: string }; body?: unknown };
// Executable / spawn issues — unrecoverable, no point retrying.
if (
@@ -88,6 +101,39 @@ export function classifyClaudeError(err: unknown): ClassifiedProviderError {
return new ClassifiedProviderError(message, { kind: 'unrecoverable', cause: err });
}
// HTTP 400 from the Anthropic SDK — bad request, never recoverable. Mirrors
// the pattern in GeminiProvider.classifyGeminiError / classifyOpenRouterError
// (see #2357: the SDK forwards `effort` to the Messages API when
// CLAUDE_CODE_EFFORT_LEVEL leaks into the subprocess env, and models like
// Haiku/Sonnet 4.5 reject with 400 — without this branch the default
// `transient` classification retried indefinitely).
if (errAny.status === 400) {
// Inspect both the message and any structured body for the effort marker.
const bodyText = (() => {
const body = errAny.body;
if (typeof body === 'string') return body;
if (body && typeof body === 'object') {
try { return JSON.stringify(body); } catch { return ''; }
}
return '';
})();
const haystack = `${message}\n${bodyText}`;
if (/effort parameter/i.test(haystack) && !effortHintLogged) {
effortHintLogged = true;
logger.warn(
'SDK',
'Anthropic API rejected request with HTTP 400: this model does not support the `effort` parameter. ' +
'CLAUDE_CODE_EFFORT_LEVEL is likely leaking into the SDK subprocess env via ~/.claude-mem/.env — ' +
'remove it or scope it to models that support effort. See https://github.com/thedotmack/claude-mem/issues/2357.',
{ status: 400 }
);
}
return new ClassifiedProviderError(
message || 'Anthropic bad request (status 400)',
{ kind: 'unrecoverable', cause: err },
);
}
// Server errors → transient.
if (typeof errAny.status === 'number' && errAny.status >= 500 && errAny.status < 600) {
return new ClassifiedProviderError(message, { kind: 'transient', cause: err });
+35 -18
@@ -9,7 +9,16 @@ import {
type OAuthTokenResult,
} from './oauth-token.js';
export const ENV_FILE_PATH = paths.envFile();
// Resolved lazily so tests (and any rare runtime path-overrides) can target a
// temp file via CLAUDE_MEM_ENV_FILE without depending on module-load order.
// Production callers see the canonical ~/.claude-mem/.env path through
// paths.envFile() unchanged.
export function envFilePath(): string {
return process.env.CLAUDE_MEM_ENV_FILE ?? paths.envFile();
}
/** @deprecated Prefer envFilePath(); kept as a snapshot for back-compat. */
export const ENV_FILE_PATH = envFilePath();
const BLOCKED_ENV_VARS = [
'ANTHROPIC_API_KEY', // Issue #733: Prevent auto-discovery from project .env files
@@ -17,6 +26,10 @@ const BLOCKED_ENV_VARS = [
// shell would otherwise short-circuit OAuth lookup at spawn time.
// The fresh token from ~/.claude-mem/.env is re-injected below
// when explicit gateway credentials are configured.
'ANTHROPIC_BASE_URL', // Issue #2375: same leak class as AUTH_TOKEN. A leaked BASE_URL
// alone (no token) was enough to trigger the OAuth-skip path,
// sending the subprocess to a proxy with no credentials.
// Re-injected from ~/.claude-mem/.env when configured.
'CLAUDECODE', // Prevent "cannot be launched inside another Claude Code session" error
'CLAUDE_CODE_OAUTH_TOKEN', // Issue #2215: prevent stale parent-process token from leaking into
// isolated env. The fresh token is read from the keychain at spawn
@@ -77,12 +90,13 @@ function serializeEnvFile(env: Record<string, string>): string {
}
export function loadClaudeMemEnv(): ClaudeMemEnv {
if (!existsSync(ENV_FILE_PATH)) {
const envFile = envFilePath();
if (!existsSync(envFile)) {
return {};
}
try {
const content = readFileSync(ENV_FILE_PATH, 'utf-8');
const content = readFileSync(envFile, 'utf-8');
const parsed = parseEnvFile(content);
const result: ClaudeMemEnv = {};
@@ -94,12 +108,13 @@ export function loadClaudeMemEnv(): ClaudeMemEnv {
return result;
} catch (error: unknown) {
logger.warn('ENV', 'Failed to load .env file', { path: ENV_FILE_PATH }, error instanceof Error ? error : new Error(String(error)));
logger.warn('ENV', 'Failed to load .env file', { path: envFile }, error instanceof Error ? error : new Error(String(error)));
return {};
}
}
export function saveClaudeMemEnv(env: ClaudeMemEnv): void {
const envFile = envFilePath();
let existing: Record<string, string> = {};
try {
if (!existsSync(paths.dataDir())) {
@@ -107,8 +122,8 @@ export function saveClaudeMemEnv(env: ClaudeMemEnv): void {
}
chmodSync(paths.dataDir(), 0o700);
existing = existsSync(ENV_FILE_PATH)
? parseEnvFile(readFileSync(ENV_FILE_PATH, 'utf-8'))
existing = existsSync(envFile)
? parseEnvFile(readFileSync(envFile, 'utf-8'))
: {};
} catch (error) {
const normalizedError = error instanceof Error ? error : new Error(String(error));
@@ -155,10 +170,10 @@ export function saveClaudeMemEnv(env: ClaudeMemEnv): void {
}
try {
writeFileSync(ENV_FILE_PATH, serializeEnvFile(updated), { encoding: 'utf-8', mode: 0o600 });
chmodSync(ENV_FILE_PATH, 0o600);
writeFileSync(envFile, serializeEnvFile(updated), { encoding: 'utf-8', mode: 0o600 });
chmodSync(envFile, 0o600);
} catch (error: unknown) {
logger.error('ENV', 'Failed to save .env file', { path: ENV_FILE_PATH }, error instanceof Error ? error : new Error(String(error)));
logger.error('ENV', 'Failed to save .env file', { path: envFile }, error instanceof Error ? error : new Error(String(error)));
throw error;
}
}
@@ -230,15 +245,17 @@ export async function buildIsolatedEnvWithFreshOAuth(
if (!includeCredentials) return isolatedEnv;
// If the user already configured explicit Anthropic/gateway credentials in
// ~/.claude-mem/.env, honor those and skip OAuth lookup entirely. A bare
// ANTHROPIC_BASE_URL counts because gateways may be tokenless, and falling
// back to OAuth would silently route requests to api.anthropic.com.
if (
isolatedEnv.ANTHROPIC_API_KEY ||
isolatedEnv.ANTHROPIC_BASE_URL ||
isolatedEnv.ANTHROPIC_AUTH_TOKEN
) {
// Custom gateway: never inject OAuth (would leak the user's Anthropic OAuth
// token to a third-party gateway). The user must explicitly configure a
// gateway-appropriate token in ~/.claude-mem/.env if their gateway requires
// one. A bare BASE_URL with no token = tokenless gateway (e.g. mTLS at the
// network boundary).
if (isolatedEnv.ANTHROPIC_BASE_URL) {
clearStaleMarker();
return isolatedEnv;
}
// Direct API with explicit credentials: skip OAuth lookup.
if (isolatedEnv.ANTHROPIC_API_KEY || isolatedEnv.ANTHROPIC_AUTH_TOKEN) {
clearStaleMarker();
return isolatedEnv;
}