* feat: Add discovery_tokens for ROI tracking in observations and session summaries - Introduced `discovery_tokens` column in `observations` and `session_summaries` tables to track token costs associated with discovering and creating each observation and summary. - Updated relevant services and hooks to calculate and display ROI metrics based on discovery tokens. - Enhanced context economics reporting to include savings from reusing previous observations. - Implemented migration to ensure the new column is added to existing tables. - Adjusted data models and sync processes to accommodate the new `discovery_tokens` field. * refactor: streamline context hook by removing unused functions and updating terminology - Removed the estimateTokens and getObservations helper functions as they were not utilized. - Updated the legend and output messages to replace "discovery" with "work" for clarity. - Changed the emoji representation for different observation types to better reflect their purpose. - Enhanced output formatting for improved readability and understanding of token usage. * Refactor user-message-hook and context-hook for improved clarity and functionality - Updated user-message-hook.js to enhance error messaging and improve variable naming for clarity. - Modified context-hook.ts to include a new column key section, improved context index instructions, and added emoji icons for observation types. - Adjusted footer messages in context-hook.ts to emphasize token savings and access to past research. - Changed user-message-hook.ts to update the feedback and support message for clarity. * fix: Critical ROI tracking fixes from PR review Addresses critical findings from PR #111 review: 1. **Fixed incorrect discovery token calculation** (src/services/worker/SDKAgent.ts) - Changed from passing cumulative total to per-response delta - Now correctly tracks token cost for each observation/summary - Captures token state before/after response processing - Prevents all observations getting inflated cumulative values 2. **Fixed schema version mismatch** (src/services/sqlite/SessionStore.ts) - Changed ensureDiscoveryTokensColumn() from version 11 to version 7 - Now matches migration007 definition in migrations.ts - Ensures consistent version tracking across migration system These fixes ensure ROI metrics accurately reflect token costs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Update search documentation to reflect hybrid ChromaDB architecture The backend correctly implements ChromaDB-first semantic search with SQLite temporal ordering and FTS5 fallback, but documentation incorrectly described it as "FTS5 full-text search". This fix aligns all skill guides and tool descriptions with the actual implementation. Changes: - Update SKILL.md to describe hybrid architecture with ChromaDB primary - Update observations.md title and query parameter descriptions - Update all three search tool descriptions in search-server.ts: * search_observations * search_sessions * search_user_prompts All tools now correctly document: - ChromaDB semantic search (primary ranking) - 90-day recency filter - SQLite temporal ordering - FTS5 fallback (when ChromaDB unavailable) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Add discovery_tokens column to observations and session_summaries tables --------- Co-authored-by: Claude <noreply@anthropic.com>
15 KiB
Hybrid Search Architecture: Problem-Solution Document
Date: 2025-01-15 Author: Claude Code (Session handoff document) Purpose: Comprehensive fix guide for hybrid search architecture documentation and implementation
Executive Summary
The claude-mem hybrid search architecture is correctly implemented in code but incorrectly documented in skill guides. Additionally, the workflow is missing the final "instant context timeline" step that completes the human memory analogy.
Quick Status:
- ✅ Backend code (
search-server.ts): ChromaDB first, SQLite temporal sort - ❌ Skill operation guides: Describe FTS5 as primary search method
- ❌ Missing feature: Automatic timeline context retrieval (before/after observations)
- ✅ Landing page: Recently corrected
- ⚠️ Documentation: Needs validation and potential refinement
The Intended Architecture (User's Vision)
Storage Flow
User Action
↓
1. SQLite Insert (FAST, synchronous)
- Immediate persistence
- Available for querying instantly
↓
2. ChromaDB Sync (BACKGROUND, asynchronous)
- Worker generates embeddings
- Takes time but doesn't block user
- Uses OpenAI text-embedding-3-small
Why this design:
- Users don't wait for embedding generation
- SQLite provides immediate access
- ChromaDB catches up in background for semantic search
Search Flow (3-Layer Sequential Architecture)
User Query: "How did we implement authentication?"
↓
LAYER 1: Semantic Retrieval (ChromaDB)
- Vector similarity search
- Returns observation IDs (not full records)
- Top 100 semantic matches
- 90-day recency filter applied
↓
LAYER 2: Temporal Ordering (SQLite)
- Takes IDs from Layer 1
- Hydrates full records from SQLite
- Sorts by created_at_epoch DESC
- Returns NEWEST relevant observation
↓
LAYER 3: Instant Context Timeline (SQLite) [MISSING IN CURRENT IMPLEMENTATION]
- Takes top observation ID from Layer 2
- Retrieves N observations BEFORE that point
- Retrieves N observations AFTER that point
- Provides temporal context: "what led here" + "what happened next"
↓
Present to User
- Most relevant observation
- Timeline showing before/after context
- Mimics human memory
Why ChromaDB can't do it alone:
- ChromaDB doesn't efficiently support date range queries sorted by time
- SQLite excels at temporal operations (ORDER BY created_at_epoch)
- Need both: ChromaDB for semantic, SQLite for temporal
Why the timeline matters:
LLMs don't experience time linearly like humans do. Humans remember: "I did X, which led to Y, then Z happened." The instant context timeline gives LLMs this temporal awareness that humans experience naturally.
Fallback Behavior
IF ChromaDB unavailable OR no results:
↓
FTS5 Keyword Search (SQLite)
- Full-text search on observations_fts
- Basic keyword matching
- Ensures backward compatibility
- Fallback for older systems
FTS5 is NOT "optional" - it's the fallback mechanism for when ChromaDB isn't available or returns no results.
Current State Analysis
✅ What's Correct: Backend Implementation
File: /Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts
Lines: 360-396 (search_observations handler)
The code DOES implement Layers 1 & 2 correctly:
// Step 1: ChromaDB semantic search (top 100)
if (chromaClient) {
const chromaResults = await queryChroma(query, 100);
// Step 2: Filter by 90-day recency
const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
const recentIds = chromaResults.ids.filter((_id, idx) => {
const meta = chromaResults.metadatas[idx];
return meta && meta.created_at_epoch > ninetyDaysAgo;
});
// Step 3: Hydrate from SQLite with temporal ordering
results = store.getObservationsByIds(recentIds, {
orderBy: 'date_desc',
limit
});
}
// Fallback to FTS5 if ChromaDB unavailable
if (results.length === 0) {
results = search.searchObservations(query, options); // FTS5
}
What this gets right:
- ChromaDB semantic search FIRST (not FTS5)
- 90-day recency filter
- SQLite temporal ordering (
orderBy: 'date_desc') - FTS5 fallback for reliability
❌ What's Wrong: Skill Operation Guides
File: /Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md
Current Title: "Search Observations (Full-Text)"
Current Description: "Search all observations using natural language queries."
Current Line 351: query: z.string().describe('Search query for FTS5 full-text search')
The Problem:
- Describes FTS5 as the search method
- No mention of ChromaDB semantic search
- Misleading title "Full-Text" implies keyword-only
- Examples don't show the ChromaDB → SQLite flow
Impact:
- Claude thinks it's doing FTS5 keyword search
- Doesn't understand it's semantic vector search
- Can't explain the architecture to users correctly
⚠️ What's Missing: Layer 3 (Instant Context Timeline)
The current implementation stops at Layer 2 (temporal ordering). It doesn't automatically:
- Identify the MOST relevant observation (it returns a sorted list)
- Retrieve observations BEFORE that point in time
- Retrieve observations AFTER that point in time
- Present the timeline context to the user
Why this matters: The timeline is the killer feature that mimics human memory. Without it, users get:
- ❌ A sorted list of relevant observations
- ❌ No context about what led there
- ❌ No context about what happened next
With timeline, users get:
- ✅ The MOST relevant observation
- ✅ Context: "You did A and B before this"
- ✅ Context: "After this, you did C and D"
- ✅ Complete narrative like human memory
📋 Documentation Status
Recently Fixed (✅):
/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md- Now describes 3-layer sequential flow
- Includes human memory analogy
- Positions ChromaDB as primary
Landing Page (✅):
/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx- All updated to describe ChromaDB-first architecture
- "Remember Like a Human" messaging added
- Timeline feature highlighted
Needs Review:
- SKILL.md technical notes (line 172)
- All operation guides in
/operations/directory - Common workflows documentation
Required Fixes
Fix 1: Update Skill Operation Guides
Files to modify:
/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md
Changes needed:
-
observations.md:
- Change title: "Search Observations (Full-Text)" → "Search Observations (Semantic + Temporal)"
- Update description: Explain ChromaDB semantic search as primary
- Update command examples to explain hybrid flow
- Add note: "Uses ChromaDB vector search with SQLite temporal ordering. FTS5 used as fallback."
-
common-workflows.md:
- Update "Workflow 2: Finding Specific Bug Fixes" to explain ChromaDB → SQLite flow
- Add new workflow: "Workflow N: Getting Timeline Context Around Relevant Observations"
Example of corrected observations.md header:
# Search Observations (Semantic + Temporal)
Search observations using ChromaDB vector similarity with SQLite temporal ordering.
## Architecture
**3-Layer Hybrid Search:**
1. **ChromaDB semantic retrieval** - Finds what's semantically relevant (vector similarity)
2. **90-day recency filter** - Prioritizes recent work
3. **SQLite temporal ordering** - Sorts by time, returns newest relevant
**Fallback:** If ChromaDB unavailable, falls back to FTS5 keyword search.
## When to Use
- User asks: "How did we implement authentication?"
- User asks: "What bugs did we fix?"
- Looking for past work by meaning/topic (not just keywords)
Fix 2: Implement Layer 3 (Instant Context Timeline)
Option A: Add to existing search_observations handler
Modify /Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts line ~396:
// After getting sorted results, if user wants timeline context
if (results.length > 0 && options.includeTimeline) {
const topObservation = results[0];
const depth_before = options.timelineDepthBefore || 5;
const depth_after = options.timelineDepthAfter || 5;
// Get observations before and after
const timeline = store.getTimelineContext(
topObservation.id,
depth_before,
depth_after
);
return {
topResult: topObservation,
timeline: timeline,
format: format
};
}
Option B: Use existing timeline-by-query operation
The /api/timeline/by-query endpoint already implements search + timeline. Could:
- Make it the DEFAULT recommended operation in skill guides
- Update operation guides to emphasize this as primary workflow
- Position observations search as "timeline-less" alternative
Recommendation: Option B is faster - leverage existing timeline-by-query endpoint and update skill guides to make it the primary workflow.
Fix 3: Update SKILL.md Technical Notes
File: /Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md
Line 172:
Current:
- **Search engine:** FTS5 full-text search + structured filters
Change to:
- **Search engine:** ChromaDB vector search (primary) + SQLite temporal ordering + instant context timeline (3-layer sequential architecture)
Fix 4: Update search_observations Description
File: /Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts
Line 349:
Current:
description: 'Search observations using full-text search across titles, narratives...'
Change to:
description: 'Search observations using hybrid semantic search (ChromaDB vector similarity + SQLite temporal ordering). Falls back to FTS5 keyword search if ChromaDB unavailable. IMPORTANT: Always use index format first...'
Line 351:
Current:
query: z.string().describe('Search query for FTS5 full-text search'),
Change to:
query: z.string().describe('Search query (semantic vector search via ChromaDB, falls back to FTS5 if unavailable)'),
Implementation Checklist
Use this checklist when executing fixes:
Phase 1: Core Documentation
- Update
observations.mdtitle and description - Update
observations.mdarchitecture explanation - Update
observations.mdexamples to mention ChromaDB - Update
common-workflows.mdto explain hybrid flow - Update
SKILL.mdline 172 technical notes - Verify all operation guides mention ChromaDB correctly
Phase 2: Backend Updates
- Update
search-server.tssearch_observations description (line 349) - Update
search-server.tsquery parameter description (line 351) - Add code comments explaining 3-layer flow
- Consider adding
includeTimelineoption to search_observations
Phase 3: Timeline Integration
- Review timeline-by-query operation
- Update skill guides to recommend timeline-by-query as primary workflow
- Add example: "When you need context, use timeline-by-query instead of observations search"
- Update quick reference table in SKILL.md to highlight timeline-by-query
Phase 4: Validation
- Test search behavior with ChromaDB enabled
- Test fallback behavior with ChromaDB disabled
- Verify skill guides accurately describe behavior
- Ensure landing page messaging aligns with skill guides
- Check that human memory analogy is consistent everywhere
Key Messaging (Use Consistently)
Value Proposition
"3-layer hybrid search mimics human memory: ChromaDB semantic retrieval finds what's relevant → SQLite temporal ordering identifies when → instant context timeline shows what led there and what came next."
Technical Architecture
"ChromaDB vector search handles semantic understanding (what's relevant), SQLite handles temporal queries (when it happened, what's newest), and timeline context provides before/after observations (what led there, what happened next)."
Why It Matters
"LLMs don't experience time linearly like humans do. Claude-mem gives them temporal context: not just 'you implemented authentication,' but 'you researched OAuth libraries, then implemented JWT auth, then fixed a token expiration bug.' Complete narrative, like human memory."
ChromaDB Role
"ChromaDB is the PRIMARY search mechanism for semantic understanding. FTS5 is the FALLBACK for backward compatibility and reliability when ChromaDB is unavailable."
Files Reference
Skill Guides (Primary Fixes):
/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/timeline-by-query.md/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md
Backend Code (Minor Updates):
/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts
Documentation (Validation):
/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md
Landing Page (Already Fixed):
/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx
Questions for User (If Needed)
-
Timeline Integration Approach:
- Option A: Modify search_observations to add
includeTimelineparameter - Option B: Emphasize timeline-by-query as primary workflow in guides
- User preference?
- Option A: Modify search_observations to add
-
Backward Compatibility:
- Should FTS5 fallback be MORE prominent in docs for older systems?
- Or keep it as "implementation detail"?
-
Progressive Disclosure:
- Should timeline context ALWAYS be included?
- Or only when user explicitly asks for context?
Success Criteria
When these fixes are complete:
- ✅ Skill operation guides accurately describe ChromaDB-first architecture
- ✅ No references to "FTS5 as primary search method"
- ✅ Timeline feature integrated into standard workflow
- ✅ Human memory analogy present in key documentation
- ✅ Consistent messaging across skill guides, docs, and landing page
- ✅ Backend code comments explain 3-layer flow clearly
- ✅ Users understand: "This is semantic search with temporal context, not just keyword search"
Notes for Next Claude
- The user has already clarified the architecture thoroughly
- Backend code is already correct - focus on documentation/guides
- Landing page recently updated - validate for consistency
- Timeline-by-query endpoint already exists - leverage it
- Key insight: This mimics human memory through temporal context
- ChromaDB is PRIMARY, not optional. FTS5 is FALLBACK, not primary.
Start with: Reading this document fully, then update skill operation guides first (highest impact).