- Changed user prompt formatting to use full text instead of truncated version. - Updated date filtering logic to use milliseconds instead of seconds for 90-day cutoff. - Renamed doc_type values in ChromaSync to ensure consistency and prevent deduplication issues. - Improved documentation for concept tags in input schema.
9.9 KiB
CodeRabbit Review - Issue Validation
Analysis Date: 2025-11-03 Analyzed By: Claude (Sonnet 4.5) Priority: 🔴 Critical | 🟡 Medium | 🟢 Low
Issue 1: Chroma Search False Positives
Location: experiment/chroma-search-test.ts:135-166
Priority: 🟢 Low
Status: ✅ CONFIRMED - Real bug, correct fix
Severity: Low (experiment file only, not production code)
Problem
The code marks chromaFound = true if the raw text contains the string 'ids', even for empty results like 'ids': [[]].
Current code (line 137):
testResult.chromaFound = resultText.includes('ids') && resultText.length > 50;
This creates false positives by checking for string containment rather than validating actual result content.
Validation
Confirmed by reading the actual code. The logic uses simple string matching which would match both:
- Real results:
'ids': [['obs_123', 'obs_456']]✓ - Empty results:
'ids': [[]]✗ (incorrectly marked as success)
Recommended Fix
Parse and validate the actual content of the ids and/or documents arrays:
// Extract and parse the 'ids' array
const idsMatch = resultText.match(/'ids':\s*\[(.*?)\]/s);
if (idsMatch) {
try {
// Check if there's at least one non-empty inner array
const idsContent = idsMatch[1];
const hasResults = idsContent.includes('[') &&
!idsContent.match(/\[\s*\]/); // Not just empty arrays
testResult.chromaFound = hasResults;
} catch {
testResult.chromaFound = false;
}
}
Decision
DEFER - This is an experiment file, not production code. The bug doesn't affect actual functionality. Can be fixed as a cleanup task when working in this area.
Issue 2: 90-Day Cutoff Units Mismatch
Location: src/servers/search-server.ts:374-381 (and 3 other hybrid search handlers)
Priority: 🔴 Critical
Status: ✅ CONFIRMED - Critical bug, MUST FIX IMMEDIATELY
Severity: High (breaks 90-day temporal filtering entirely)
Problem
The 90-day cutoff is computed in seconds but created_at_epoch is stored in milliseconds, causing the filter to never exclude anything.
Current code (line 374):
const ninetyDaysAgo = Math.floor(Date.now() / 1000) - (90 * 24 * 60 * 60);
// ...
return meta && meta.created_at_epoch > ninetyDaysAgo;
Validation
Database verification:
$ sqlite3 ~/.claude-mem/claude-mem.db "SELECT created_at_epoch FROM observations LIMIT 1"
1762212399087 # This is in MILLISECONDS
Comparison breakdown:
ninetyDaysAgo= ~1,754,000,000 (seconds, 10 digits)created_at_epoch= ~1,762,212,399,087 (milliseconds, 13 digits)
The millisecond value is ALWAYS larger than the second value, so the filter created_at_epoch > ninetyDaysAgo ALWAYS passes, accepting ALL documents regardless of age.
Impact
- 90-day temporal boundary completely non-functional
- Performance degradation (processes all historical data)
- Incorrect search results (includes very old observations)
- Affects 4 handlers:
search_observations,search_sessions,search_user_prompts,get_timeline_by_query
Recommended Fix
Keep milliseconds throughout (remove the /1000 division):
File: src/servers/search-server.ts
Find and replace in all 4 hybrid search handlers:
// OLD (WRONG - converts to seconds)
const ninetyDaysAgo = Math.floor(Date.now() / 1000) - (90 * 24 * 60 * 60);
// NEW (CORRECT - stays in milliseconds)
const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
Locations to fix:
search_observationshandler (~line 374)search_sessionshandlersearch_user_promptshandlerget_timeline_by_queryhandler
Decision
FIX IMMEDIATELY - This is a critical bug that breaks core functionality.
Issue 3: Chroma Collection Name Mismatch
Location: src/services/sync/ChromaSync.ts:77-81 and src/servers/search-server.ts:26
Priority: 🟡 Medium
Status: ⚠️ CURRENTLY WORKS but architectural risk
Severity: Medium (maintainability issue, potential future breakage)
Problem
ChromaSync builds collection names as cm__${project} (parameterized) while search-server uses a hard-coded 'cm__claude-mem', creating maintainability risk.
ChromaSync.ts (line 79):
this.collectionName = `cm__${project}`;
search-server.ts (line 26):
const COLLECTION_NAME = 'cm__claude-mem';
worker-service.ts (line 94):
this.chromaSync = new ChromaSync('claude-mem');
Validation
Current state: WORKS (both resolve to 'cm__claude-mem')
Risk: If anyone changes the ChromaSync instantiation parameter or creates another instance, collections won't match.
Recommended Fix
Create a shared constant in a common config location:
New file: src/shared/config.ts
export const CHROMA_COLLECTION_NAME = 'cm__claude-mem';
// OR for dynamic project support:
export function getCollectionName(project: string = 'claude-mem'): string {
return `cm__${project}`;
}
Update ChromaSync.ts:
import { CHROMA_COLLECTION_NAME } from '../shared/config';
// ...
this.collectionName = CHROMA_COLLECTION_NAME;
Update search-server.ts:
import { CHROMA_COLLECTION_NAME } from '../shared/config';
// ...
const COLLECTION_NAME = CHROMA_COLLECTION_NAME;
Decision
RECOMMENDED FIX - Good architectural improvement, prevents future bugs. Not urgent since it currently works, but should be included in the next refactoring pass.
Issue 4: doc_type Value Mismatch in ChromaSync
Location: src/services/sync/ChromaSync.ts:523-532 (read) vs lines 240, 429 (write)
Priority: 🔴 Critical
Status: ✅ CONFIRMED - Critical bug, MUST FIX
Severity: High (breaks deduplication, causes duplicate insert failures)
Problem
Documents are written with 'session_summary' and 'user_prompt' but the deduplication logic looks for 'summary' and 'prompt', causing existing documents to not be detected.
Write side (formatSummaryDocs, line 240):
doc_type: 'session_summary',
Write side (formatUserPromptDoc, line 429):
doc_type: 'user_prompt',
Read side (getExistingChromaIds, lines 526-529):
} else if (meta.doc_type === 'summary') {
summaryIds.add(meta.sqlite_id);
} else if (meta.doc_type === 'prompt') {
promptIds.add(meta.sqlite_id);
}
Validation
Confirmed by code inspection. The mismatch causes:
getExistingChromaIdsdoesn't find existing summaries/prompts- They're not added to the deduplication sets
- System tries to insert them again
- Chroma rejects with duplicate ID errors
Impact
- Deduplication completely broken for summaries and prompts
- Backfill operations fail (see Issue 5)
- Duplicate insert errors in production
- Observations work fine (they use 'observation' consistently)
Recommended Fix
PREFERRED APPROACH: Fix the read side (backward compatible with existing Chroma data)
File: src/services/sync/ChromaSync.ts
Lines: 526-529
} else if (meta.doc_type === 'session_summary') { // Changed from 'summary'
summaryIds.add(meta.sqlite_id);
} else if (meta.doc_type === 'user_prompt') { // Changed from 'prompt'
promptIds.add(meta.sqlite_id);
}
Why this approach:
- ✅ Backward compatible with existing Chroma data
- ✅ No data migration required
- ✅ Safer than changing write side
- ✅ Works immediately
Alternative approach (NOT recommended): Change write side to use 'summary'/'prompt'
- ❌ Requires Chroma data migration
- ❌ Orphans existing documents
- ❌ Higher risk
Decision
FIX IMMEDIATELY - Critical bug affecting deduplication. Use the backward-compatible fix (change read side).
Issue 5: doc_type Mismatch Causing Backfill Failures
Location: src/services/worker-service.ts:120-128 (manifestation of Issue 4)
Priority: 🔴 Critical
Status: ✅ CONFIRMED - Same root cause as Issue 4
Severity: High (duplicate of Issue 4)
Problem
Backfill operations fail because of the doc_type mismatch described in Issue 4.
Validation
This is not a separate bug - it's a symptom of Issue 4. The backfill process:
- Queries SQLite for summaries/prompts to sync
- Calls
getExistingChromaIdsto avoid duplicates - Due to doc_type mismatch, existing IDs aren't found
- Tries to insert documents that already exist
- Chroma rejects with duplicate ID errors
- Backfill fails
Decision
AUTOMATICALLY RESOLVED by fixing Issue 4. Not a separate fix needed.
Summary & Action Plan
Critical Issues (Fix Immediately)
-
✅ Issue 2 - 90-day units mismatch
- Fix: Change all 4 handlers to use milliseconds
- Impact: Restores temporal filtering functionality
-
✅ Issue 4 - doc_type mismatch
- Fix: Change getExistingChromaIds to use 'session_summary'/'user_prompt'
- Impact: Fixes deduplication and backfill
-
✅ Issue 5 - Automatically resolved by fixing Issue 4
Medium Priority (Include in Next Refactor)
- ⚠️ Issue 3 - Collection name consistency
- Fix: Create shared constant
- Impact: Better maintainability, prevents future bugs
Low Priority (Defer)
- 🟢 Issue 1 - False positives in experiment
- Fix: Parse and validate arrays
- Impact: More accurate test results (experiment only)
Files Requiring Changes
High Priority:
src/servers/search-server.ts(Issue 2 - 4 locations)src/services/sync/ChromaSync.ts(Issue 4 - lines 526-529)
Medium Priority:
src/shared/config.ts(Issue 3 - new file)src/services/sync/ChromaSync.ts(Issue 3 - import)src/servers/search-server.ts(Issue 3 - import)
Low Priority:
experiment/chroma-search-test.ts(Issue 1)
Testing Recommendations
After fixes:
- Test 90-day filtering with dates before/after cutoff
- Run backfill operation to verify deduplication
- Verify no duplicate ID errors in logs
- Test hybrid search with temporal boundaries