Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5.9 KiB
Phase 09: Chroma/Vector Search Fixes
These PRs address Chroma vector database stability, zombie processes, and enterprise compatibility.
Bug Fixes
-
Review PR #887 (
fix: align IDs with metadatas in ChromaSearchStrategyby @abkrim). Files:src/services/worker/search/strategies/ChromaSearchStrategy.ts, tests. IDs and metadata arrays are misaligned causing incorrect search results. Steps: (1)gh pr checkout 887(2) Review — array alignment between document IDs and their metadata is critical for correct results (3) Check test coverage (4) Runnpm run build(5) If clean:gh pr merge 887 --rebase --delete-branch- Merged (2026-02-06): Confirmed the bug —
queryChroma()deduplicates IDs but returns raw metadatas array, causing index misalignment infilterByRecency(). Fix builds a Map from sqlite_id→metadata then iterates deduplicated IDs. 20 tests pass (including 2 new). Build clean.
- Merged (2026-02-06): Confirmed the bug —
-
Review PR #769 (
fix: close transport on connection error to prevent chroma-mcp zombie processesby @jenyapoyarkov). Files:src/services/sync/ChromaSync.ts, tests. Transport left open on connection failure creates zombies. Steps: (1)gh pr checkout 769(2) Review — should close/dispose transport in the error path (3) Runnpm run build(4) If clean:gh pr merge 769 --rebase --delete-branch- Merged (2026-02-06): Confirmed the bug — both
ensureCollection()(~line 202) andqueryChroma()(~line 862) error handlers resetconnectedandclienton connection errors but never calledtransport.close(), leaving chroma-mcp subprocesses alive as zombies. Fix addstransport.close()(wrapped in try/catch for already-dead transports) andtransport = nullbefore resetting state, mirroring theclose()method pattern. 3 new regression tests added. All 19 integration tests + 20 ChromaSearchStrategy tests pass. Build clean.
- Merged (2026-02-06): Confirmed the bug — both
-
Review PR #884 (
fix: add Zscaler SSL certificate support for ChromaDB vector searchby @RClark4958). Files:src/services/sync/ChromaSync.ts+ build artifacts. Enterprise environments use Zscaler SSL inspection which breaks Chroma HTTPS connections. Steps: (1)gh pr checkout 884(2) Review — should respect NODE_EXTRA_CA_CERTS or custom CA cert configuration (3) Verify this doesn't weaken SSL for non-Zscaler users (4) Runnpm run build(5) If appropriate for enterprise support:gh pr merge 884 --rebase --delete-branch- Merged (2026-02-06): Manually applied to main (compiled output conflicts from PR #769 merge). Adds
getCombinedCertPath()method that detects Zscaler certificates in the macOS system keychain viasecurity find-certificate, combines them with certifi CA certs into~/.claude-mem/combined_certs.pem(cached 24h, atomic write), and passesSSL_CERT_FILE/REQUESTS_CA_BUNDLE/CURL_CA_BUNDLEenv vars to the chroma-mcp subprocess. Falls back gracefully when Zscaler is absent — no impact on non-Zscaler environments.NODE_EXTRA_CA_CERTSnot needed since the Python subprocess uses Python SSL env vars, not Node.js ones. macOS-only cert extraction with cross-platform cache reuse. All 39 Chroma tests pass. Build clean.
- Merged (2026-02-06): Manually applied to main (compiled output conflicts from PR #769 merge). Adds
-
Review PR #830 (
fix: Chroma stability + additional process management layersby @michelhelsdingen). Files: 7 files including ChromaSync.ts, ProcessManager.ts, worker-service.ts. Steps: (1)gh pr checkout 830(2) This is a broader stability PR — review scope carefully (3) Check for overlap with ProcessRegistry (v9.0.8) (4) If too broad, request scope reduction to just the Chroma stability portions (5) Runnpm run build- Closed (2026-02-06): PR scope too broad (7 files, 7 distinct concerns) with severe merge conflicts against already-merged PRs #769 (transport cleanup) and #884 (Zscaler SSL). The
ensureConnection()rewrite in this PR predates both merges and would require extensive conflict resolution. Left detailed review comment requesting rebase on current main and split into 8 focused PRs: CHROMA_DISABLED setting, connection timeout, reconnection mutex, parent heartbeat, chroma watchdog, stale session recovery, subprocess pool limit, and MCP orphan cleanup. Individual ideas are solid — the connection timeout, parent heartbeat, and CHROMA_DISABLED setting are particularly welcome as standalone PRs.
- Closed (2026-02-06): PR scope too broad (7 files, 7 distinct concerns) with severe merge conflicts against already-merged PRs #769 (transport cleanup) and #884 (Zscaler SSL). The
Feature
- Evaluate PR #792 (
feat: Replace MCP subprocess with persistent Chroma HTTP serverby @bigph00t). Files: 12 files. Major architectural change — replaces the MCP subprocess model for Chroma with a persistent HTTP server. Steps: (1)gh pr checkout 792(2) This is a significant architecture change. Review the approach: persistent HTTP server vs. MCP subprocess (3) Check resource implications (always-running server vs. on-demand subprocess) (4) Verify graceful shutdown and cross-platform support (5) Runnpm run build(6) If the architecture is sound and well-tested:gh pr merge 792 --rebase --delete-branch. If concerns exist, leave detailed review comments.- Changes Requested (2026-02-06): Architecture direction is sound — persistent HTTP server via
npx chroma runwith ChromaServerManager singleton is the right long-term approach. Eliminates per-operation subprocess overhead, re-enables Windows support, clean separation between server lifecycle and data operations. Build succeeds and all 34 existing Chroma tests pass. However, 4 critical blockers prevent merge: (1) Zscaler SSL regression — PR #884'sgetCombinedCertPath()and SSL env var injection are completely removed, breaking enterprise users behind corporate proxies; (2) No tests for 290-line ChromaServerManager (process spawn, heartbeat polling, cross-platform shutdown); (3) Merge conflicts with PRs #769, #884, #887 already merged to main; (4) macOS untested despite being primary dev platform. Additional concerns: ~180MB memory overhead, port 8000 default conflicts, Python dependency not actually removed (shifted to npx chroma run). Left detailed review comment requesting rebase, Zscaler SSL port, ChromaServerManager tests, and macOS verification.
- Changes Requested (2026-02-06): Architecture direction is sound — persistent HTTP server via