fix: remove ONNX/OpenBLAS thread cap from chroma-mcp spawn env
The 2-thread cap was a band-aid for the CPU runaway reports on v12.4.9: #2220 (Windows) and #2253 (macOS Intel). The actual root causes (watermark stuck at 0 → continuous re-embed, orphan process trees, fire-and-forget backfill across 80+ projects) were fixed structurally in #2282: per-batch watermark persistence, killProcessTree() + pgid registration, max-3 concurrent backfills with re-entrancy guard, kernel-enforced child cleanup (#2216).

With the structural fixes in place, capping ONNX/OpenBLAS/MKL at 2 threads slows initial backfill 3–6× on multi-core machines and provides no steady-state benefit. Defer to the OS scheduler and the user's environment.

ANONYMIZED_TELEMETRY=false stays: it is unrelated to the storm and blocks background HTTP from the embedding subprocess.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -588,15 +588,6 @@ export class ChromaMcpManager {
       }
     }
 
-    // Cap embedding-thread fanout. ONNX Runtime / OpenBLAS / MKL all default to
-    // cpu_count(), so a 12-core box runs 12 threads burning embeddings in
-    // parallel — the dominant cause of the chroma-mcp CPU storm on Windows
-    // (#2220). Two threads keeps backfill latency reasonable without saturating
-    // the box. Only set if the user hasn't pinned them explicitly.
-    const threadCap = '2';
-    for (const key of ['OMP_NUM_THREADS', 'ONNX_NUM_THREADS', 'OPENBLAS_NUM_THREADS', 'MKL_NUM_THREADS']) {
-      if (!baseEnv[key]) baseEnv[key] = threadCap;
-    }
     // Disable Chroma's anonymous telemetry — it issues background HTTP from
     // the embedding subprocess on every collection touch.
     if (!baseEnv.ANONYMIZED_TELEMETRY) baseEnv.ANONYMIZED_TELEMETRY = 'false';
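
For illustration, here is a minimal sketch of the env-building behavior after this change. It assumes the spawn env starts as a copy of the parent environment; `buildSpawnEnv` and the `Env` type are hypothetical names, since the diff does not show the surrounding method in ChromaMcpManager.

```typescript
// Hypothetical helper: the real method in ChromaMcpManager is not shown in this diff.
type Env = Record<string, string | undefined>;

function buildSpawnEnv(parentEnv: Env): Env {
  // Copy so the caller's environment object is never mutated.
  const baseEnv: Env = { ...parentEnv };

  // Thread-count variables (OMP_NUM_THREADS, ONNX_NUM_THREADS,
  // OPENBLAS_NUM_THREADS, MKL_NUM_THREADS) are intentionally left alone:
  // whatever the user exported passes through, otherwise they stay unset.

  // Default telemetry off, but only when the user has not set it explicitly.
  if (!baseEnv.ANONYMIZED_TELEMETRY) baseEnv.ANONYMIZED_TELEMETRY = 'false';

  return baseEnv;
}

// Usage: a value the user pinned survives, and no cap is injected.
const pinned = buildSpawnEnv({ OMP_NUM_THREADS: '4' });
console.log(pinned.OMP_NUM_THREADS, pinned.ANONYMIZED_TELEMETRY);
```

Under this sketch, a user who still wants the old behavior can export `OMP_NUM_THREADS=2` (and the related variables) before launching, and the values are inherited unchanged.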