# Hybrid Search Architecture: Problem-Solution Document **Date:** 2025-01-15 **Author:** Claude Code (Session handoff document) **Purpose:** Comprehensive fix guide for hybrid search architecture documentation and implementation --- ## Executive Summary The claude-mem hybrid search architecture is **correctly implemented in code** but **incorrectly documented** in skill guides. Additionally, the workflow is missing the final "instant context timeline" step that completes the human memory analogy. **Quick Status:** - ✅ Backend code (`search-server.ts`): ChromaDB first, SQLite temporal sort - ❌ Skill operation guides: Describe FTS5 as primary search method - ❌ Missing feature: Automatic timeline context retrieval (before/after observations) - ✅ Landing page: Recently corrected - ⚠️ Documentation: Needs validation and potential refinement --- ## The Intended Architecture (User's Vision) ### Storage Flow ``` User Action ↓ 1. SQLite Insert (FAST, synchronous) - Immediate persistence - Available for querying instantly ↓ 2. ChromaDB Sync (BACKGROUND, asynchronous) - Worker generates embeddings - Takes time but doesn't block user - Uses OpenAI text-embedding-3-small ``` **Why this design:** - Users don't wait for embedding generation - SQLite provides immediate access - ChromaDB catches up in background for semantic search ### Search Flow (3-Layer Sequential Architecture) ``` User Query: "How did we implement authentication?" ↓ LAYER 1: Semantic Retrieval (ChromaDB) - Vector similarity search - Returns observation IDs (not full records) - Top 100 semantic matches - 90-day recency filter applied ↓ LAYER 2: Temporal Ordering (SQLite) - Takes IDs from Layer 1 - Hydrates full records from SQLite - Sorts by created_at_epoch DESC - Returns NEWEST relevant observation ↓ LAYER 3: Instant Context Timeline (SQLite) [MISSING IN CURRENT IMPLEMENTATION] - Takes top observation ID from Layer 2 - Retrieves N observations BEFORE that point - Retrieves N observations AFTER that point - Provides temporal context: "what led here" + "what happened next" ↓ Present to User - Most relevant observation - Timeline showing before/after context - Mimics human memory ``` **Why ChromaDB can't do it alone:** - ChromaDB doesn't efficiently support date range queries sorted by time - SQLite excels at temporal operations (ORDER BY created_at_epoch) - Need both: ChromaDB for semantic, SQLite for temporal **Why the timeline matters:** > LLMs don't experience time linearly like humans do. Humans remember: "I did X, which led to Y, then Z happened." The instant context timeline gives LLMs this temporal awareness that humans experience naturally. ### Fallback Behavior ``` IF ChromaDB unavailable OR no results: ↓ FTS5 Keyword Search (SQLite) - Full-text search on observations_fts - Basic keyword matching - Ensures backward compatibility - Fallback for older systems ``` **FTS5 is NOT "optional"** - it's the fallback mechanism for when ChromaDB isn't available or returns no results. --- ## Current State Analysis ### ✅ What's Correct: Backend Implementation **File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` **Lines:** 360-396 (search_observations handler) The code DOES implement Layers 1 & 2 correctly: ```typescript // Step 1: ChromaDB semantic search (top 100) if (chromaClient) { const chromaResults = await queryChroma(query, 100); // Step 2: Filter by 90-day recency const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000); const recentIds = chromaResults.ids.filter((_id, idx) => { const meta = chromaResults.metadatas[idx]; return meta && meta.created_at_epoch > ninetyDaysAgo; }); // Step 3: Hydrate from SQLite with temporal ordering results = store.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit }); } // Fallback to FTS5 if ChromaDB unavailable if (results.length === 0) { results = search.searchObservations(query, options); // FTS5 } ``` **What this gets right:** - ChromaDB semantic search FIRST (not FTS5) - 90-day recency filter - SQLite temporal ordering (`orderBy: 'date_desc'`) - FTS5 fallback for reliability ### ❌ What's Wrong: Skill Operation Guides **File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md` **Current Title:** "Search Observations (Full-Text)" **Current Description:** "Search all observations using natural language queries." **Current Line 351:** `query: z.string().describe('Search query for FTS5 full-text search')` **The Problem:** - Describes FTS5 as the search method - No mention of ChromaDB semantic search - Misleading title "Full-Text" implies keyword-only - Examples don't show the ChromaDB → SQLite flow **Impact:** - Claude thinks it's doing FTS5 keyword search - Doesn't understand it's semantic vector search - Can't explain the architecture to users correctly ### ⚠️ What's Missing: Layer 3 (Instant Context Timeline) The current implementation stops at Layer 2 (temporal ordering). It doesn't automatically: 1. Identify the MOST relevant observation (it returns a sorted list) 2. Retrieve observations BEFORE that point in time 3. Retrieve observations AFTER that point in time 4. Present the timeline context to the user **Why this matters:** The timeline is the **killer feature** that mimics human memory. Without it, users get: - ❌ A sorted list of relevant observations - ❌ No context about what led there - ❌ No context about what happened next With timeline, users get: - ✅ The MOST relevant observation - ✅ Context: "You did A and B before this" - ✅ Context: "After this, you did C and D" - ✅ Complete narrative like human memory ### 📋 Documentation Status **Recently Fixed (✅):** - `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md` - Now describes 3-layer sequential flow - Includes human memory analogy - Positions ChromaDB as primary **Landing Page (✅):** - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx` - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx` - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx` - All updated to describe ChromaDB-first architecture - "Remember Like a Human" messaging added - Timeline feature highlighted **Needs Review:** - SKILL.md technical notes (line 172) - All operation guides in `/operations/` directory - Common workflows documentation --- ## Required Fixes ### Fix 1: Update Skill Operation Guides **Files to modify:** - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md` - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md` **Changes needed:** 1. **observations.md:** - Change title: "Search Observations (Full-Text)" → "Search Observations (Semantic + Temporal)" - Update description: Explain ChromaDB semantic search as primary - Update command examples to explain hybrid flow - Add note: "Uses ChromaDB vector search with SQLite temporal ordering. FTS5 used as fallback." 2. **common-workflows.md:** - Update "Workflow 2: Finding Specific Bug Fixes" to explain ChromaDB → SQLite flow - Add new workflow: "Workflow N: Getting Timeline Context Around Relevant Observations" **Example of corrected observations.md header:** ```markdown # Search Observations (Semantic + Temporal) Search observations using ChromaDB vector similarity with SQLite temporal ordering. ## Architecture **3-Layer Hybrid Search:** 1. **ChromaDB semantic retrieval** - Finds what's semantically relevant (vector similarity) 2. **90-day recency filter** - Prioritizes recent work 3. **SQLite temporal ordering** - Sorts by time, returns newest relevant **Fallback:** If ChromaDB unavailable, falls back to FTS5 keyword search. ## When to Use - User asks: "How did we implement authentication?" - User asks: "What bugs did we fix?" - Looking for past work by meaning/topic (not just keywords) ``` ### Fix 2: Implement Layer 3 (Instant Context Timeline) **Option A: Add to existing search_observations handler** Modify `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` line ~396: ```typescript // After getting sorted results, if user wants timeline context if (results.length > 0 && options.includeTimeline) { const topObservation = results[0]; const depth_before = options.timelineDepthBefore || 5; const depth_after = options.timelineDepthAfter || 5; // Get observations before and after const timeline = store.getTimelineContext( topObservation.id, depth_before, depth_after ); return { topResult: topObservation, timeline: timeline, format: format }; } ``` **Option B: Use existing timeline-by-query operation** The `/api/timeline/by-query` endpoint already implements search + timeline. Could: 1. Make it the DEFAULT recommended operation in skill guides 2. Update operation guides to emphasize this as primary workflow 3. Position observations search as "timeline-less" alternative **Recommendation:** Option B is faster - leverage existing `timeline-by-query` endpoint and update skill guides to make it the primary workflow. ### Fix 3: Update SKILL.md Technical Notes **File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md` **Line 172:** **Current:** ```markdown - **Search engine:** FTS5 full-text search + structured filters ``` **Change to:** ```markdown - **Search engine:** ChromaDB vector search (primary) + SQLite temporal ordering + instant context timeline (3-layer sequential architecture) ``` ### Fix 4: Update search_observations Description **File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` **Line 349:** **Current:** ```typescript description: 'Search observations using full-text search across titles, narratives...' ``` **Change to:** ```typescript description: 'Search observations using hybrid semantic search (ChromaDB vector similarity + SQLite temporal ordering). Falls back to FTS5 keyword search if ChromaDB unavailable. IMPORTANT: Always use index format first...' ``` **Line 351:** **Current:** ```typescript query: z.string().describe('Search query for FTS5 full-text search'), ``` **Change to:** ```typescript query: z.string().describe('Search query (semantic vector search via ChromaDB, falls back to FTS5 if unavailable)'), ``` --- ## Implementation Checklist Use this checklist when executing fixes: ### Phase 1: Core Documentation - [ ] Update `observations.md` title and description - [ ] Update `observations.md` architecture explanation - [ ] Update `observations.md` examples to mention ChromaDB - [ ] Update `common-workflows.md` to explain hybrid flow - [ ] Update `SKILL.md` line 172 technical notes - [ ] Verify all operation guides mention ChromaDB correctly ### Phase 2: Backend Updates - [ ] Update `search-server.ts` search_observations description (line 349) - [ ] Update `search-server.ts` query parameter description (line 351) - [ ] Add code comments explaining 3-layer flow - [ ] Consider adding `includeTimeline` option to search_observations ### Phase 3: Timeline Integration - [ ] Review timeline-by-query operation - [ ] Update skill guides to recommend timeline-by-query as primary workflow - [ ] Add example: "When you need context, use timeline-by-query instead of observations search" - [ ] Update quick reference table in SKILL.md to highlight timeline-by-query ### Phase 4: Validation - [ ] Test search behavior with ChromaDB enabled - [ ] Test fallback behavior with ChromaDB disabled - [ ] Verify skill guides accurately describe behavior - [ ] Ensure landing page messaging aligns with skill guides - [ ] Check that human memory analogy is consistent everywhere --- ## Key Messaging (Use Consistently) ### Value Proposition "3-layer hybrid search mimics human memory: ChromaDB semantic retrieval finds what's relevant → SQLite temporal ordering identifies when → instant context timeline shows what led there and what came next." ### Technical Architecture "ChromaDB vector search handles semantic understanding (what's relevant), SQLite handles temporal queries (when it happened, what's newest), and timeline context provides before/after observations (what led there, what happened next)." ### Why It Matters "LLMs don't experience time linearly like humans do. Claude-mem gives them temporal context: not just 'you implemented authentication,' but 'you researched OAuth libraries, then implemented JWT auth, then fixed a token expiration bug.' Complete narrative, like human memory." ### ChromaDB Role "ChromaDB is the PRIMARY search mechanism for semantic understanding. FTS5 is the FALLBACK for backward compatibility and reliability when ChromaDB is unavailable." --- ## Files Reference **Skill Guides (Primary Fixes):** - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md` - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md` - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/timeline-by-query.md` - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md` **Backend Code (Minor Updates):** - `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` **Documentation (Validation):** - `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md` **Landing Page (Already Fixed):** - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx` - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx` - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx` --- ## Questions for User (If Needed) 1. **Timeline Integration Approach:** - Option A: Modify search_observations to add `includeTimeline` parameter - Option B: Emphasize timeline-by-query as primary workflow in guides - User preference? 2. **Backward Compatibility:** - Should FTS5 fallback be MORE prominent in docs for older systems? - Or keep it as "implementation detail"? 3. **Progressive Disclosure:** - Should timeline context ALWAYS be included? - Or only when user explicitly asks for context? --- ## Success Criteria When these fixes are complete: 1. ✅ Skill operation guides accurately describe ChromaDB-first architecture 2. ✅ No references to "FTS5 as primary search method" 3. ✅ Timeline feature integrated into standard workflow 4. ✅ Human memory analogy present in key documentation 5. ✅ Consistent messaging across skill guides, docs, and landing page 6. ✅ Backend code comments explain 3-layer flow clearly 7. ✅ Users understand: "This is semantic search with temporal context, not just keyword search" --- ## Notes for Next Claude - The user has already clarified the architecture thoroughly - Backend code is already correct - focus on documentation/guides - Landing page recently updated - validate for consistency - Timeline-by-query endpoint already exists - leverage it - Key insight: This mimics human memory through temporal context - ChromaDB is PRIMARY, not optional. FTS5 is FALLBACK, not primary. **Start with:** Reading this document fully, then update skill operation guides first (highest impact).