Compare commits

...

6 Commits

Author SHA1 Message Date
Alex Newman b1cfc85333 chore: bump version to 10.2.4
Publish to npm / publish (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:49:10 -05:00
Alex Newman ca8421611c fix: backfill Chroma vector DB for all projects on startup (#1154)
* fix: backfill all Chroma projects on worker startup

ChromaSync.ensureBackfilled() existed but was never called. After
v10.2.2's bun cache clear destroyed the ONNX model cache, Chroma only
had ~2 days of embeddings while SQLite had 49k+ observations.

- Add static backfillAllProjects() to ChromaSync — iterates all projects
  in SQLite, creates temporary ChromaSync per project, runs smart diff
- Call backfillAllProjects() fire-and-forget on worker startup
- Add 'CHROMA_SYNC' to logger Component type (pre-existing gap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: sanitize project names for Chroma collection naming

Replace characters outside [a-zA-Z0-9._-] with underscores so projects
like "YC Stuff" map to collection "cm__YC_Stuff" instead of failing
Chroma's collection name validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: route backfill to shared cm__claude-mem collection, harden sanitization

- Use single ChromaSync('claude-mem') in backfillAllProjects() instead of
  per-project instances, matching how DatabaseManager and SearchManager
  operate — fixes critical bug where backfilled data landed in orphaned
  collections that no search path reads from
- Strip trailing non-alphanumeric chars from sanitized collection names
  to satisfy Chroma's end-character constraint
- Guard backfill behind Chroma server readiness to avoid N spurious error
  logs when Chroma failed to start
- Use CHROMA_SYNC log component consistently for backfill messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: pass project as parameter to ensureBackfilled instead of mutating instance state

Eliminates shared mutable state in backfillAllProjects() loop. Project
scoping is now passed explicitly via parameter to both ensureBackfilled()
and getExistingChromaIds(), keeping a single Chroma connection while
avoiding fragile instance property mutation across iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:47:46 -05:00
Alex Newman eea4f599c0 docs: update CHANGELOG.md for v10.2.3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:55:51 -05:00
Alex Newman b446f2630e chore: bump version to 10.2.3
Publish to npm / publish (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:55:09 -05:00
Alex Newman 224567f980 fix: prevent ONNX model cache corruption from bun cache clears
Remove nuclear `bun pm cache rm` from smart-install.js and
sync-marketplace.cjs (only needed for removed sharp dependency).
Add `bun install` in cache version directory after sync so worker
can resolve dependencies. Move HuggingFace model cache to
~/.claude-mem/models/ so reinstalls don't corrupt it. Add self-healing
retry for Protobuf parsing failures.

Fixes recurring issues #1104, #1105, #1110.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:54:17 -05:00
Alex Newman 2b31792f06 docs: update CHANGELOG.md for v10.2.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:22:09 -05:00
12 changed files with 504 additions and 403 deletions
+1 -1
View File
@@ -10,7 +10,7 @@
"plugins": [
{
"name": "claude-mem",
"version": "10.2.2",
"version": "10.2.4",
"source": "./plugin",
"description": "Persistent memory system for Claude Code - context compression across sessions"
}
+27 -31
View File
@@ -2,6 +2,33 @@
All notable changes to claude-mem.
## [v10.2.3] - 2026-02-17
## Fix Chroma ONNX Model Cache Corruption
Addresses the persistent embedding pipeline failures reported across #1104, #1105, #1110, and subsequent sessions. Three root causes identified and fixed:
### Changes
- **Removed nuclear `bun pm cache rm`** from both `smart-install.js` and `sync-marketplace.cjs`. This was added in v10.2.2 for the now-removed sharp dependency but destroyed all cached packages, breaking the ONNX resolution chain.
- **Added `bun install` in plugin cache directory** after marketplace sync. The cache directory had a `package.json` with `@chroma-core/default-embed` as a dependency but never ran install, so the worker couldn't resolve it at runtime.
- **Moved HuggingFace model cache to `~/.claude-mem/models/`** outside `node_modules`. The ~23MB ONNX model was stored inside `node_modules/@huggingface/transformers/.cache/`, so any reinstall or cache clear corrupted it.
- **Added self-healing retry** for Protobuf parsing failures. If the downloaded model is corrupted, the cache is cleared and re-downloaded automatically on next use.
### Files Changed
- `scripts/smart-install.js` — removed `bun pm cache rm`
- `scripts/sync-marketplace.cjs` — removed `bun pm cache rm`, added `bun install` in cache dir
- `src/services/sync/ChromaSync.ts` — moved model cache, added corruption recovery
## [v10.2.2] - 2026-02-17
## Bug Fixes
- **Removed `node-addon-api` dev dependency** — was only needed for `sharp`, which was already removed in v10.2.1
- **Simplified native module cache clearing** in `smart-install.js` and `sync-marketplace.cjs` — replaced targeted `@img/sharp` directory deletion and lockfile removal with `bun pm cache rm`
- Reduced ~30 lines of brittle file system manipulation to a clean Bun CLI command
## [v10.2.1] - 2026-02-16
## Bug Fixes
@@ -1392,34 +1419,3 @@ Full changelog: https://github.com/thedotmack/claude-mem/compare/v8.2.4...v8.2.5
Patch release v8.2.4
## [v8.2.3] - 2025-12-27
## Bug Fixes
- Fix worker port environment variable in smart-install script
- Implement file-based locking mechanism for worker operations to prevent race conditions
- Fix restart command references in documentation (changed from `claude-mem restart` to `npm run worker:restart`)
## [v8.2.2] - 2025-12-27
## What's Changed
### Features
- Add OpenRouter provider settings and documentation
- Add modal footer with save button and status indicators
- Implement self-spawn pattern for background worker execution
### Bug Fixes
- Resolve critical error handling issues in worker lifecycle
- Handle Windows/Unix kill errors in orphaned process cleanup
- Validate spawn pid before writing PID file
- Handle process exit in waitForProcessesExit filter
- Use readiness endpoint for health checks instead of port check
- Add missing OpenRouter and Gemini settings to settingKeys array
### Other Changes
- Enhance error handling and validation in agents and routes
- Delete obsolete process management files (ProcessManager, worker-wrapper, worker-cli)
- Update hooks.json to use worker-service.cjs CLI
- Add comprehensive tests for hook constants and worker spawn functionality
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
"version": "10.2.2",
"version": "10.2.4",
"description": "Memory compression system for Claude Code - persist context across sessions",
"keywords": [
"claude",
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
"version": "10.2.2",
"version": "10.2.4",
"description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
"author": {
"name": "Alex Newman"
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "claude-mem-plugin",
"version": "10.2.2",
"version": "10.2.4",
"private": true,
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
-8
View File
@@ -263,14 +263,6 @@ function installDeps() {
// Quote path for Windows paths with spaces
const bunCmd = IS_WINDOWS && bunPath.includes(' ') ? `"${bunPath}"` : bunPath;
// Clear Bun's package cache to prevent stale native module artifacts
try {
execSync(`${bunCmd} pm cache rm`, { cwd: ROOT, stdio: 'pipe', shell: IS_WINDOWS });
console.error(' Cleared Bun package cache');
} catch {
// Cache may not exist yet on first install
}
execSync(`${bunCmd} install`, { cwd: ROOT, stdio: 'inherit', shell: IS_WINDOWS });
// Write version marker
+4 -8
View File
@@ -80,14 +80,6 @@ try {
{ stdio: 'inherit' }
);
// Clear Bun's package cache to prevent stale native module artifacts
try {
execSync('bun pm cache rm', { cwd: INSTALLED_PATH, stdio: 'pipe' });
console.log('Cleared Bun package cache');
} catch {
// Cache may not exist yet on first install
}
console.log('Running bun install in marketplace...');
execSync(
'cd ~/.claude/plugins/marketplaces/thedotmack/ && bun install',
@@ -107,6 +99,10 @@ try {
{ stdio: 'inherit' }
);
// Install dependencies in cache directory so worker can resolve them
console.log(`Running bun install in cache folder (version ${version})...`);
execSync(`bun install`, { cwd: CACHE_VERSION_PATH, stdio: 'inherit' });
console.log('\x1b[32m%s\x1b[0m', 'Sync complete!');
// Trigger worker restart after file sync
+86 -26
View File
@@ -81,10 +81,16 @@ export class ChromaSync {
private collectionName: string;
private readonly VECTOR_DB_DIR: string;
private readonly BATCH_SIZE = 100;
private modelCacheCorruptionRetried = false;
constructor(project: string) {
this.project = project;
this.collectionName = `cm__${project}`;
// Chroma collection names only allow [a-zA-Z0-9._-], 3-512 chars,
// must start/end with [a-zA-Z0-9]
const sanitized = project
.replace(/[^a-zA-Z0-9._-]/g, '_')
.replace(/[^a-zA-Z0-9]+$/, ''); // strip trailing non-alphanumeric
this.collectionName = `cm__${sanitized || 'unknown'}`;
this.VECTOR_DB_DIR = path.join(os.homedir(), '.claude-mem', 'vector-db');
}
@@ -189,6 +195,10 @@ export class ChromaSync {
}
try {
// Store model cache outside node_modules so reinstalls don't corrupt it
const { env } = await import('@huggingface/transformers');
env.cacheDir = path.join(os.homedir(), '.claude-mem', 'models');
// Use WASM backend to avoid native ONNX binary issues (#1104, #1105, #1110).
// Same model (all-MiniLM-L6-v2), same embeddings, but runs in WASM —
// no native binary loading, no segfaults, no ENOENT errors.
@@ -204,8 +214,22 @@ export class ChromaSync {
collection: this.collectionName
});
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error);
// Self-heal: corrupted model cache → clear and retry once
if (errorMessage.includes('Protobuf parsing failed') && !this.modelCacheCorruptionRetried) {
this.modelCacheCorruptionRetried = true;
logger.warn('CHROMA_SYNC', 'Corrupted model cache detected, clearing and retrying...');
const modelCacheDir = path.join(os.homedir(), '.claude-mem', 'models');
const fs = await import('fs');
if (fs.existsSync(modelCacheDir)) {
fs.rmSync(modelCacheDir, { recursive: true, force: true });
}
return this.ensureCollection(); // retry once
}
logger.error('CHROMA_SYNC', 'Failed to get/create collection', { collection: this.collectionName }, error as Error);
throw new Error(`Collection setup failed: ${error instanceof Error ? error.message : String(error)}`);
throw new Error(`Collection setup failed: ${errorMessage}`);
}
}
@@ -524,17 +548,18 @@ export class ChromaSync {
* Fetch all existing document IDs from Chroma collection
* Returns Sets of SQLite IDs for observations, summaries, and prompts
*/
private async getExistingChromaIds(): Promise<{
private async getExistingChromaIds(projectOverride?: string): Promise<{
observations: Set<number>;
summaries: Set<number>;
prompts: Set<number>;
}> {
const targetProject = projectOverride ?? this.project;
await this.ensureCollection();
if (!this.collection) {
throw new Error(
'Chroma collection not initialized. Call ensureCollection() before using collection methods.' +
` Project: ${this.project}`
` Project: ${targetProject}`
);
}
@@ -545,14 +570,14 @@ export class ChromaSync {
let offset = 0;
const limit = 1000; // Large batches, metadata only = fast
logger.info('CHROMA_SYNC', 'Fetching existing Chroma document IDs...', { project: this.project });
logger.info('CHROMA_SYNC', 'Fetching existing Chroma document IDs...', { project: targetProject });
while (true) {
try {
const result = await this.collection.get({
limit,
offset,
where: { project: this.project },
where: { project: targetProject },
include: ['metadatas']
});
@@ -579,18 +604,18 @@ export class ChromaSync {
offset += limit;
logger.debug('CHROMA_SYNC', 'Fetched batch of existing IDs', {
project: this.project,
project: targetProject,
offset,
batchSize: metadatas.length
});
} catch (error) {
logger.error('CHROMA_SYNC', 'Failed to fetch existing IDs', { project: this.project }, error as Error);
logger.error('CHROMA_SYNC', 'Failed to fetch existing IDs', { project: targetProject }, error as Error);
throw error;
}
}
logger.info('CHROMA_SYNC', 'Existing IDs fetched', {
project: this.project,
project: targetProject,
observations: observationIds.size,
summaries: summaryIds.size,
prompts: promptIds.size
@@ -602,15 +627,18 @@ export class ChromaSync {
/**
* Backfill: Sync all observations missing from Chroma
* Reads from SQLite and syncs in batches
* @param projectOverride - If provided, backfill this project instead of this.project.
* Used by backfillAllProjects() to iterate projects without mutating instance state.
* Throws error if backfill fails
*/
async ensureBackfilled(): Promise<void> {
logger.info('CHROMA_SYNC', 'Starting smart backfill', { project: this.project });
async ensureBackfilled(projectOverride?: string): Promise<void> {
const backfillProject = projectOverride ?? this.project;
logger.info('CHROMA_SYNC', 'Starting smart backfill', { project: backfillProject });
await this.ensureCollection();
// Fetch existing IDs from Chroma (fast, metadata only)
const existing = await this.getExistingChromaIds();
const existing = await this.getExistingChromaIds(backfillProject);
const db = new SessionStore();
@@ -626,14 +654,14 @@ export class ChromaSync {
SELECT * FROM observations
WHERE project = ? ${obsExclusionClause}
ORDER BY id ASC
`).all(this.project) as StoredObservation[];
`).all(backfillProject) as StoredObservation[];
const totalObsCount = db.db.prepare(`
SELECT COUNT(*) as count FROM observations WHERE project = ?
`).get(this.project) as { count: number };
`).get(backfillProject) as { count: number };
logger.info('CHROMA_SYNC', 'Backfilling observations', {
project: this.project,
project: backfillProject,
missing: observations.length,
existing: existing.observations.size,
total: totalObsCount.count
@@ -651,7 +679,7 @@ export class ChromaSync {
await this.addDocuments(batch);
logger.debug('CHROMA_SYNC', 'Backfill progress', {
project: this.project,
project: backfillProject,
progress: `${Math.min(i + this.BATCH_SIZE, allDocs.length)}/${allDocs.length}`
});
}
@@ -667,14 +695,14 @@ export class ChromaSync {
SELECT * FROM session_summaries
WHERE project = ? ${summaryExclusionClause}
ORDER BY id ASC
`).all(this.project) as StoredSummary[];
`).all(backfillProject) as StoredSummary[];
const totalSummaryCount = db.db.prepare(`
SELECT COUNT(*) as count FROM session_summaries WHERE project = ?
`).get(this.project) as { count: number };
`).get(backfillProject) as { count: number };
logger.info('CHROMA_SYNC', 'Backfilling summaries', {
project: this.project,
project: backfillProject,
missing: summaries.length,
existing: existing.summaries.size,
total: totalSummaryCount.count
@@ -692,7 +720,7 @@ export class ChromaSync {
await this.addDocuments(batch);
logger.debug('CHROMA_SYNC', 'Backfill progress', {
project: this.project,
project: backfillProject,
progress: `${Math.min(i + this.BATCH_SIZE, summaryDocs.length)}/${summaryDocs.length}`
});
}
@@ -713,17 +741,17 @@ export class ChromaSync {
JOIN sdk_sessions s ON up.content_session_id = s.content_session_id
WHERE s.project = ? ${promptExclusionClause}
ORDER BY up.id ASC
`).all(this.project) as StoredUserPrompt[];
`).all(backfillProject) as StoredUserPrompt[];
const totalPromptCount = db.db.prepare(`
SELECT COUNT(*) as count
FROM user_prompts up
JOIN sdk_sessions s ON up.content_session_id = s.content_session_id
WHERE s.project = ?
`).get(this.project) as { count: number };
`).get(backfillProject) as { count: number };
logger.info('CHROMA_SYNC', 'Backfilling user prompts', {
project: this.project,
project: backfillProject,
missing: prompts.length,
existing: existing.prompts.size,
total: totalPromptCount.count
@@ -741,13 +769,13 @@ export class ChromaSync {
await this.addDocuments(batch);
logger.debug('CHROMA_SYNC', 'Backfill progress', {
project: this.project,
project: backfillProject,
progress: `${Math.min(i + this.BATCH_SIZE, promptDocs.length)}/${promptDocs.length}`
});
}
logger.info('CHROMA_SYNC', 'Smart backfill complete', {
project: this.project,
project: backfillProject,
synced: {
observationDocs: allDocs.length,
summaryDocs: summaryDocs.length,
@@ -761,7 +789,7 @@ export class ChromaSync {
});
} catch (error) {
logger.error('CHROMA_SYNC', 'Backfill failed', { project: this.project }, error as Error);
logger.error('CHROMA_SYNC', 'Backfill failed', { project: backfillProject }, error as Error);
throw new Error(`Backfill failed: ${error instanceof Error ? error.message : String(error)}`);
} finally {
db.close();
@@ -848,6 +876,38 @@ export class ChromaSync {
}
}
/**
* Backfill all projects that have observations in SQLite but may be missing from Chroma.
* Uses a single shared ChromaSync('claude-mem') instance and Chroma connection.
* Per-project scoping is passed as a parameter to ensureBackfilled(), avoiding
* instance state mutation. All documents land in the cm__claude-mem collection
* with project scoped via metadata, matching how DatabaseManager and SearchManager operate.
* Designed to be called fire-and-forget on worker startup.
*/
static async backfillAllProjects(): Promise<void> {
const db = new SessionStore();
const sync = new ChromaSync('claude-mem');
try {
const projects = db.db.prepare(
'SELECT DISTINCT project FROM observations WHERE project IS NOT NULL AND project != ?'
).all('') as { project: string }[];
logger.info('CHROMA_SYNC', `Backfill check for ${projects.length} projects`);
for (const { project } of projects) {
try {
await sync.ensureBackfilled(project);
} catch (error) {
logger.error('CHROMA_SYNC', `Backfill failed for project: ${project}`, {}, error as Error);
// Continue to next project — don't let one failure stop others
}
}
} finally {
await sync.close();
db.close();
}
}
/**
* Close the Chroma client connection
* Server lifecycle is managed by ChromaServerManager, not here
+10
View File
@@ -19,6 +19,7 @@ import { SettingsDefaultsManager } from '../shared/SettingsDefaultsManager.js';
import { getAuthMethodDescription } from '../shared/EnvManager.js';
import { logger } from '../utils/logger.js';
import { ChromaServerManager } from './sync/ChromaServerManager.js';
import { ChromaSync } from './sync/ChromaSync.js';
// Windows: avoid repeated spawn popups when startup fails (issue #921)
const WINDOWS_SPAWN_COOLDOWN_MS = 2 * 60 * 1000;
@@ -423,6 +424,15 @@ export class WorkerService {
this.server.registerRoutes(this.searchRoutes);
logger.info('WORKER', 'SearchManager initialized and search routes registered');
// Auto-backfill Chroma for all projects if out of sync with SQLite (fire-and-forget)
if (this.chromaServer !== null || chromaMode !== 'local') {
ChromaSync.backfillAllProjects().then(() => {
logger.info('CHROMA_SYNC', 'Backfill check complete for all projects');
}).catch(error => {
logger.error('CHROMA_SYNC', 'Backfill failed (non-blocking)', {}, error as Error);
});
}
// Connect to MCP server
const mcpServerPath = path.join(__dirname, 'mcp-server.cjs');
const transport = new StdioClientTransport({
+1 -1
View File
@@ -15,7 +15,7 @@ export enum LogLevel {
SILENT = 4
}
export type Component = 'HOOK' | 'WORKER' | 'SDK' | 'PARSER' | 'DB' | 'SYSTEM' | 'HTTP' | 'SESSION' | 'CHROMA' | 'FOLDER_INDEX' | 'CLAUDE_MD';
export type Component = 'HOOK' | 'WORKER' | 'SDK' | 'PARSER' | 'DB' | 'SYSTEM' | 'HTTP' | 'SESSION' | 'CHROMA' | 'CHROMA_SYNC' | 'FOLDER_INDEX' | 'CLAUDE_MD';
interface LogContext {
sessionId?: number;