Files
claude-mem/HYBRID_SEARCH_ARCHITECTURE_FIX.md
T
Alex Newman c0778bef00 docs: Align search documentation with hybrid ChromaDB architecture (#116)
* feat: Add discovery_tokens for ROI tracking in observations and session summaries

- Introduced `discovery_tokens` column in `observations` and `session_summaries` tables to track token costs associated with discovering and creating each observation and summary.
- Updated relevant services and hooks to calculate and display ROI metrics based on discovery tokens.
- Enhanced context economics reporting to include savings from reusing previous observations.
- Implemented migration to ensure the new column is added to existing tables.
- Adjusted data models and sync processes to accommodate the new `discovery_tokens` field.

* refactor: streamline context hook by removing unused functions and updating terminology

- Removed the estimateTokens and getObservations helper functions as they were not utilized.
- Updated the legend and output messages to replace "discovery" with "work" for clarity.
- Changed the emoji representation for different observation types to better reflect their purpose.
- Enhanced output formatting for improved readability and understanding of token usage.

* Refactor user-message-hook and context-hook for improved clarity and functionality

- Updated user-message-hook.js to enhance error messaging and improve variable naming for clarity.
- Modified context-hook.ts to include a new column key section, improved context index instructions, and added emoji icons for observation types.
- Adjusted footer messages in context-hook.ts to emphasize token savings and access to past research.
- Changed user-message-hook.ts to update the feedback and support message for clarity.

* fix: Critical ROI tracking fixes from PR review

Addresses critical findings from PR #111 review:

1. **Fixed incorrect discovery token calculation** (src/services/worker/SDKAgent.ts)
   - Changed from passing cumulative total to per-response delta
   - Now correctly tracks token cost for each observation/summary
   - Captures token state before/after response processing
   - Prevents all observations getting inflated cumulative values

2. **Fixed schema version mismatch** (src/services/sqlite/SessionStore.ts)
   - Changed ensureDiscoveryTokensColumn() from version 11 to version 7
   - Now matches migration007 definition in migrations.ts
   - Ensures consistent version tracking across migration system

These fixes ensure ROI metrics accurately reflect token costs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Update search documentation to reflect hybrid ChromaDB architecture

The backend correctly implements ChromaDB-first semantic search with SQLite
temporal ordering and FTS5 fallback, but documentation incorrectly described
it as "FTS5 full-text search". This fix aligns all skill guides and tool
descriptions with the actual implementation.

Changes:
- Update SKILL.md to describe hybrid architecture with ChromaDB primary
- Update observations.md title and query parameter descriptions
- Update all three search tool descriptions in search-server.ts:
  * search_observations
  * search_sessions
  * search_user_prompts

All tools now correctly document:
- ChromaDB semantic search (primary ranking)
- 90-day recency filter
- SQLite temporal ordering
- FTS5 fallback (when ChromaDB unavailable)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Add discovery_tokens column to observations and session_summaries tables

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-16 13:36:17 -05:00

435 lines
15 KiB
Markdown

# Hybrid Search Architecture: Problem-Solution Document
**Date:** 2025-01-15
**Author:** Claude Code (Session handoff document)
**Purpose:** Comprehensive fix guide for hybrid search architecture documentation and implementation
---
## Executive Summary
The claude-mem hybrid search architecture is **correctly implemented in code** but **incorrectly documented** in skill guides. Additionally, the workflow is missing the final "instant context timeline" step that completes the human memory analogy.
**Quick Status:**
- ✅ Backend code (`search-server.ts`): ChromaDB first, SQLite temporal sort
- ❌ Skill operation guides: Describe FTS5 as primary search method
- ❌ Missing feature: Automatic timeline context retrieval (before/after observations)
- ✅ Landing page: Recently corrected
- ⚠️ Documentation: Needs validation and potential refinement
---
## The Intended Architecture (User's Vision)
### Storage Flow
```
User Action
1. SQLite Insert (FAST, synchronous)
- Immediate persistence
- Available for querying instantly
2. ChromaDB Sync (BACKGROUND, asynchronous)
- Worker generates embeddings
- Takes time but doesn't block user
- Uses OpenAI text-embedding-3-small
```
**Why this design:**
- Users don't wait for embedding generation
- SQLite provides immediate access
- ChromaDB catches up in background for semantic search
### Search Flow (3-Layer Sequential Architecture)
```
User Query: "How did we implement authentication?"
LAYER 1: Semantic Retrieval (ChromaDB)
- Vector similarity search
- Returns observation IDs (not full records)
- Top 100 semantic matches
- 90-day recency filter applied
LAYER 2: Temporal Ordering (SQLite)
- Takes IDs from Layer 1
- Hydrates full records from SQLite
- Sorts by created_at_epoch DESC
- Returns NEWEST relevant observation
LAYER 3: Instant Context Timeline (SQLite) [MISSING IN CURRENT IMPLEMENTATION]
- Takes top observation ID from Layer 2
- Retrieves N observations BEFORE that point
- Retrieves N observations AFTER that point
- Provides temporal context: "what led here" + "what happened next"
Present to User
- Most relevant observation
- Timeline showing before/after context
- Mimics human memory
```
**Why ChromaDB can't do it alone:**
- ChromaDB doesn't efficiently support date range queries sorted by time
- SQLite excels at temporal operations (ORDER BY created_at_epoch)
- Need both: ChromaDB for semantic, SQLite for temporal
**Why the timeline matters:**
> LLMs don't experience time linearly like humans do. Humans remember: "I did X, which led to Y, then Z happened." The instant context timeline gives LLMs this temporal awareness that humans experience naturally.
### Fallback Behavior
```
IF ChromaDB unavailable OR no results:
FTS5 Keyword Search (SQLite)
- Full-text search on observations_fts
- Basic keyword matching
- Ensures backward compatibility
- Fallback for older systems
```
**FTS5 is NOT "optional"** - it's the fallback mechanism for when ChromaDB isn't available or returns no results.
---
## Current State Analysis
### ✅ What's Correct: Backend Implementation
**File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
**Lines:** 360-396 (search_observations handler)
The code DOES implement Layers 1 & 2 correctly:
```typescript
// Step 1: ChromaDB semantic search (top 100)
if (chromaClient) {
const chromaResults = await queryChroma(query, 100);
// Step 2: Filter by 90-day recency
const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
const recentIds = chromaResults.ids.filter((_id, idx) => {
const meta = chromaResults.metadatas[idx];
return meta && meta.created_at_epoch > ninetyDaysAgo;
});
// Step 3: Hydrate from SQLite with temporal ordering
results = store.getObservationsByIds(recentIds, {
orderBy: 'date_desc',
limit
});
}
// Fallback to FTS5 if ChromaDB unavailable
if (results.length === 0) {
results = search.searchObservations(query, options); // FTS5
}
```
**What this gets right:**
- ChromaDB semantic search FIRST (not FTS5)
- 90-day recency filter
- SQLite temporal ordering (`orderBy: 'date_desc'`)
- FTS5 fallback for reliability
### ❌ What's Wrong: Skill Operation Guides
**File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
**Current Title:** "Search Observations (Full-Text)"
**Current Description:** "Search all observations using natural language queries."
**Current Line 351:** `query: z.string().describe('Search query for FTS5 full-text search')`
**The Problem:**
- Describes FTS5 as the search method
- No mention of ChromaDB semantic search
- Misleading title "Full-Text" implies keyword-only
- Examples don't show the ChromaDB → SQLite flow
**Impact:**
- Claude thinks it's doing FTS5 keyword search
- Doesn't understand it's semantic vector search
- Can't explain the architecture to users correctly
### ⚠️ What's Missing: Layer 3 (Instant Context Timeline)
The current implementation stops at Layer 2 (temporal ordering). It doesn't automatically:
1. Identify the MOST relevant observation (it returns a sorted list)
2. Retrieve observations BEFORE that point in time
3. Retrieve observations AFTER that point in time
4. Present the timeline context to the user
**Why this matters:**
The timeline is the **killer feature** that mimics human memory. Without it, users get:
- ❌ A sorted list of relevant observations
- ❌ No context about what led there
- ❌ No context about what happened next
With timeline, users get:
- ✅ The MOST relevant observation
- ✅ Context: "You did A and B before this"
- ✅ Context: "After this, you did C and D"
- ✅ Complete narrative like human memory
### 📋 Documentation Status
**Recently Fixed (✅):**
- `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md`
- Now describes 3-layer sequential flow
- Includes human memory analogy
- Positions ChromaDB as primary
**Landing Page (✅):**
- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx`
- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx`
- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx`
- All updated to describe ChromaDB-first architecture
- "Remember Like a Human" messaging added
- Timeline feature highlighted
**Needs Review:**
- SKILL.md technical notes (line 172)
- All operation guides in `/operations/` directory
- Common workflows documentation
---
## Required Fixes
### Fix 1: Update Skill Operation Guides
**Files to modify:**
- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md`
**Changes needed:**
1. **observations.md:**
- Change title: "Search Observations (Full-Text)" → "Search Observations (Semantic + Temporal)"
- Update description: Explain ChromaDB semantic search as primary
- Update command examples to explain hybrid flow
- Add note: "Uses ChromaDB vector search with SQLite temporal ordering. FTS5 used as fallback."
2. **common-workflows.md:**
- Update "Workflow 2: Finding Specific Bug Fixes" to explain ChromaDB → SQLite flow
- Add new workflow: "Workflow N: Getting Timeline Context Around Relevant Observations"
**Example of corrected observations.md header:**
```markdown
# Search Observations (Semantic + Temporal)
Search observations using ChromaDB vector similarity with SQLite temporal ordering.
## Architecture
**3-Layer Hybrid Search:**
1. **ChromaDB semantic retrieval** - Finds what's semantically relevant (vector similarity)
2. **90-day recency filter** - Prioritizes recent work
3. **SQLite temporal ordering** - Sorts by time, returns newest relevant
**Fallback:** If ChromaDB unavailable, falls back to FTS5 keyword search.
## When to Use
- User asks: "How did we implement authentication?"
- User asks: "What bugs did we fix?"
- Looking for past work by meaning/topic (not just keywords)
```
### Fix 2: Implement Layer 3 (Instant Context Timeline)
**Option A: Add to existing search_observations handler**
Modify `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` line ~396:
```typescript
// After getting sorted results, if user wants timeline context
if (results.length > 0 && options.includeTimeline) {
const topObservation = results[0];
const depth_before = options.timelineDepthBefore || 5;
const depth_after = options.timelineDepthAfter || 5;
// Get observations before and after
const timeline = store.getTimelineContext(
topObservation.id,
depth_before,
depth_after
);
return {
topResult: topObservation,
timeline: timeline,
format: format
};
}
```
**Option B: Use existing timeline-by-query operation**
The `/api/timeline/by-query` endpoint already implements search + timeline. Could:
1. Make it the DEFAULT recommended operation in skill guides
2. Update operation guides to emphasize this as primary workflow
3. Position observations search as "timeline-less" alternative
**Recommendation:** Option B is faster - leverage existing `timeline-by-query` endpoint and update skill guides to make it the primary workflow.
### Fix 3: Update SKILL.md Technical Notes
**File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md`
**Line 172:**
**Current:**
```markdown
- **Search engine:** FTS5 full-text search + structured filters
```
**Change to:**
```markdown
- **Search engine:** ChromaDB vector search (primary) + SQLite temporal ordering + instant context timeline (3-layer sequential architecture)
```
### Fix 4: Update search_observations Description
**File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
**Line 349:**
**Current:**
```typescript
description: 'Search observations using full-text search across titles, narratives...'
```
**Change to:**
```typescript
description: 'Search observations using hybrid semantic search (ChromaDB vector similarity + SQLite temporal ordering). Falls back to FTS5 keyword search if ChromaDB unavailable. IMPORTANT: Always use index format first...'
```
**Line 351:**
**Current:**
```typescript
query: z.string().describe('Search query for FTS5 full-text search'),
```
**Change to:**
```typescript
query: z.string().describe('Search query (semantic vector search via ChromaDB, falls back to FTS5 if unavailable)'),
```
---
## Implementation Checklist
Use this checklist when executing fixes:
### Phase 1: Core Documentation
- [ ] Update `observations.md` title and description
- [ ] Update `observations.md` architecture explanation
- [ ] Update `observations.md` examples to mention ChromaDB
- [ ] Update `common-workflows.md` to explain hybrid flow
- [ ] Update `SKILL.md` line 172 technical notes
- [ ] Verify all operation guides mention ChromaDB correctly
### Phase 2: Backend Updates
- [ ] Update `search-server.ts` search_observations description (line 349)
- [ ] Update `search-server.ts` query parameter description (line 351)
- [ ] Add code comments explaining 3-layer flow
- [ ] Consider adding `includeTimeline` option to search_observations
### Phase 3: Timeline Integration
- [ ] Review timeline-by-query operation
- [ ] Update skill guides to recommend timeline-by-query as primary workflow
- [ ] Add example: "When you need context, use timeline-by-query instead of observations search"
- [ ] Update quick reference table in SKILL.md to highlight timeline-by-query
### Phase 4: Validation
- [ ] Test search behavior with ChromaDB enabled
- [ ] Test fallback behavior with ChromaDB disabled
- [ ] Verify skill guides accurately describe behavior
- [ ] Ensure landing page messaging aligns with skill guides
- [ ] Check that human memory analogy is consistent everywhere
---
## Key Messaging (Use Consistently)
### Value Proposition
"3-layer hybrid search mimics human memory: ChromaDB semantic retrieval finds what's relevant → SQLite temporal ordering identifies when → instant context timeline shows what led there and what came next."
### Technical Architecture
"ChromaDB vector search handles semantic understanding (what's relevant), SQLite handles temporal queries (when it happened, what's newest), and timeline context provides before/after observations (what led there, what happened next)."
### Why It Matters
"LLMs don't experience time linearly like humans do. Claude-mem gives them temporal context: not just 'you implemented authentication,' but 'you researched OAuth libraries, then implemented JWT auth, then fixed a token expiration bug.' Complete narrative, like human memory."
### ChromaDB Role
"ChromaDB is the PRIMARY search mechanism for semantic understanding. FTS5 is the FALLBACK for backward compatibility and reliability when ChromaDB is unavailable."
---
## Files Reference
**Skill Guides (Primary Fixes):**
- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md`
- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/timeline-by-query.md`
- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md`
**Backend Code (Minor Updates):**
- `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
**Documentation (Validation):**
- `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md`
**Landing Page (Already Fixed):**
- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx`
- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx`
- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx`
---
## Questions for User (If Needed)
1. **Timeline Integration Approach:**
- Option A: Modify search_observations to add `includeTimeline` parameter
- Option B: Emphasize timeline-by-query as primary workflow in guides
- User preference?
2. **Backward Compatibility:**
- Should FTS5 fallback be MORE prominent in docs for older systems?
- Or keep it as "implementation detail"?
3. **Progressive Disclosure:**
- Should timeline context ALWAYS be included?
- Or only when user explicitly asks for context?
---
## Success Criteria
When these fixes are complete:
1. ✅ Skill operation guides accurately describe ChromaDB-first architecture
2. ✅ No references to "FTS5 as primary search method"
3. ✅ Timeline feature integrated into standard workflow
4. ✅ Human memory analogy present in key documentation
5. ✅ Consistent messaging across skill guides, docs, and landing page
6. ✅ Backend code comments explain 3-layer flow clearly
7. ✅ Users understand: "This is semantic search with temporal context, not just keyword search"
---
## Notes for Next Claude
- The user has already clarified the architecture thoroughly
- Backend code is already correct - focus on documentation/guides
- Landing page recently updated - validate for consistency
- Timeline-by-query endpoint already exists - leverage it
- Key insight: This mimics human memory through temporal context
- ChromaDB is PRIMARY, not optional. FTS5 is FALLBACK, not primary.
**Start with:** Reading this document fully, then update skill operation guides first (highest impact).