docs: Align search documentation with hybrid ChromaDB architecture (#116)

* feat: Add discovery_tokens for ROI tracking in observations and session summaries

  - Introduced `discovery_tokens` column in `observations` and `session_summaries` tables to track token costs associated with discovering and creating each observation and summary.
  - Updated relevant services and hooks to calculate and display ROI metrics based on discovery tokens.
  - Enhanced context economics reporting to include savings from reusing previous observations.
  - Implemented migration to ensure the new column is added to existing tables.
  - Adjusted data models and sync processes to accommodate the new `discovery_tokens` field.

* refactor: streamline context hook by removing unused functions and updating terminology

  - Removed the estimateTokens and getObservations helper functions as they were not utilized.
  - Updated the legend and output messages to replace "discovery" with "work" for clarity.
  - Changed the emoji representation for different observation types to better reflect their purpose.
  - Enhanced output formatting for improved readability and understanding of token usage.

* Refactor user-message-hook and context-hook for improved clarity and functionality

  - Updated user-message-hook.js to enhance error messaging and improve variable naming for clarity.
  - Modified context-hook.ts to include a new column key section, improved context index instructions, and added emoji icons for observation types.
  - Adjusted footer messages in context-hook.ts to emphasize token savings and access to past research.
  - Changed user-message-hook.ts to update the feedback and support message for clarity.

* fix: Critical ROI tracking fixes from PR review

  Addresses critical findings from PR #111 review:

  1. **Fixed incorrect discovery token calculation** (src/services/worker/SDKAgent.ts)
     - Changed from passing cumulative total to per-response delta
     - Now correctly tracks token cost for each observation/summary
     - Captures token state before/after response processing
     - Prevents all observations getting inflated cumulative values

  2. **Fixed schema version mismatch** (src/services/sqlite/SessionStore.ts)
     - Changed ensureDiscoveryTokensColumn() from version 11 to version 7
     - Now matches migration007 definition in migrations.ts
     - Ensures consistent version tracking across migration system

  These fixes ensure ROI metrics accurately reflect token costs.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Update search documentation to reflect hybrid ChromaDB architecture

  The backend correctly implements ChromaDB-first semantic search with SQLite temporal ordering and FTS5 fallback, but documentation incorrectly described it as "FTS5 full-text search". This fix aligns all skill guides and tool descriptions with the actual implementation.

  Changes:
  - Update SKILL.md to describe hybrid architecture with ChromaDB primary
  - Update observations.md title and query parameter descriptions
  - Update all three search tool descriptions in search-server.ts:
    * search_observations
    * search_sessions
    * search_user_prompts

  All tools now correctly document:
  - ChromaDB semantic search (primary ranking)
  - 90-day recency filter
  - SQLite temporal ordering
  - FTS5 fallback (when ChromaDB unavailable)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Add discovery_tokens column to observations and session_summaries tables

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-16 13:36:17 -05:00
parent 3cbc041c8b
commit c0778bef00
10 changed files with 541 additions and 64 deletions
@@ -0,0 +1,434 @@
+# Hybrid Search Architecture: Problem-Solution Document
+
+**Date:** 2025-01-15
+**Author:** Claude Code (Session handoff document)
+**Purpose:** Comprehensive fix guide for hybrid search architecture documentation and implementation
+
+---
+
+## Executive Summary
+
+The claude-mem hybrid search architecture is **correctly implemented in code** but **incorrectly documented** in skill guides. Additionally, the workflow is missing the final "instant context timeline" step that completes the human memory analogy.
+
+**Quick Status:**
+- ✅ Backend code (`search-server.ts`): ChromaDB first, SQLite temporal sort
+- ❌ Skill operation guides: Describe FTS5 as primary search method
+- ❌ Missing feature: Automatic timeline context retrieval (before/after observations)
+- ✅ Landing page: Recently corrected
+- ⚠️ Documentation: Needs validation and potential refinement
+
+---
+
+## The Intended Architecture (User's Vision)
+
+### Storage Flow
+
+```
+User Action
+    ↓
+1. SQLite Insert (FAST, synchronous)
+    - Immediate persistence
+    - Available for querying instantly
+    ↓
+2. ChromaDB Sync (BACKGROUND, asynchronous)
+    - Worker generates embeddings
+    - Takes time but doesn't block user
+    - Uses OpenAI text-embedding-3-small
+```
+
+**Why this design:**
+- Users don't wait for embedding generation
+- SQLite provides immediate access
+- ChromaDB catches up in background for semantic search
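
The write path above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claude-mem implementation: `WritePath`, `Observation`, `syncPending`, and the injected `embed` function are hypothetical names, and in-memory collections stand in for SQLite and ChromaDB.

```typescript
// Hypothetical stand-ins; none of these names come from the claude-mem codebase.
interface Observation {
  id: number;
  text: string;
  createdAtEpoch: number;
}

class WritePath {
  private rows = new Map<number, Observation>(); // stands in for SQLite
  private pending: Observation[] = [];           // rows still waiting for embeddings
  private embedded = new Set<number>();          // stands in for ChromaDB

  // Step 1: synchronous insert; the row is queryable as soon as this returns.
  insert(obs: Observation): void {
    this.rows.set(obs.id, obs);
    this.pending.push(obs);
  }

  // Step 2: background sync; a worker would call this off the request path.
  // Returns the number of observations that were embedded.
  async syncPending(embed: (text: string) => Promise<number[]>): Promise<number> {
    let synced = 0;
    while (this.pending.length > 0) {
      const obs = this.pending.shift()!;
      await embed(obs.text); // slow call; insert() never waits for it
      this.embedded.add(obs.id);
      synced++;
    }
    return synced;
  }

  isQueryable(id: number): boolean {
    return this.rows.has(id);
  }

  isEmbedded(id: number): boolean {
    return this.embedded.has(id);
  }
}
```

The point of the split is that `insert` has no dependency on the embedding call, so a slow or failing embedding provider delays only semantic search, never persistence.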
+
+### Search Flow (3-Layer Sequential Architecture)
+
+```
+User Query: "How did we implement authentication?"
+    ↓
+LAYER 1: Semantic Retrieval (ChromaDB)
+    - Vector similarity search
+    - Returns observation IDs (not full records)
+    - Top 100 semantic matches
+    - 90-day recency filter applied
+    ↓
+LAYER 2: Temporal Ordering (SQLite)
+    - Takes IDs from Layer 1
+    - Hydrates full records from SQLite
+    - Sorts by created_at_epoch DESC
+    - Returns NEWEST relevant observation
+    ↓
+LAYER 3: Instant Context Timeline (SQLite) [MISSING IN CURRENT IMPLEMENTATION]
+    - Takes top observation ID from Layer 2
+    - Retrieves N observations BEFORE that point
+    - Retrieves N observations AFTER that point
+    - Provides temporal context: "what led here" + "what happened next"
+    ↓
+Present to User
+    - Most relevant observation
+    - Timeline showing before/after context
+    - Mimics human memory
+```
+
+**Why ChromaDB can't do it alone:**
+- ChromaDB doesn't efficiently support date range queries sorted by time
+- SQLite excels at temporal operations (ORDER BY created_at_epoch)
+- Need both: ChromaDB for semantic, SQLite for temporal
+
+**Why the timeline matters:**
+> LLMs don't experience time linearly like humans do. Humans remember: "I did X, which led to Y, then Z happened." The instant context timeline gives LLMs this temporal awareness that humans experience naturally.
+
+### Fallback Behavior
+
+```
+IF ChromaDB unavailable OR no results:
+    ↓
+FTS5 Keyword Search (SQLite)
+    - Full-text search on observations_fts
+    - Basic keyword matching
+    - Ensures backward compatibility
+    - Fallback for older systems
+```
+
+**FTS5 is NOT "optional"** - it's the fallback mechanism for when ChromaDB isn't available or returns no results.
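
A minimal sketch of that fallback rule, assuming both searches are injected as plain async functions. `searchWithFallback` and `SearchFn` are hypothetical names, and treating a thrown semantic query like an unavailable backend is an assumption of this sketch, not something the document specifies.

```typescript
type SearchFn<T> = (query: string) => Promise<T[]>;

// Hypothetical helper: try semantic search first, fall back to keyword search.
async function searchWithFallback<T>(
  query: string,
  semantic: SearchFn<T> | null, // null models "ChromaDB unavailable"
  keyword: SearchFn<T>          // stands in for the FTS5 keyword search
): Promise<{ results: T[]; source: "semantic" | "keyword" }> {
  if (semantic) {
    try {
      const results = await semantic(query);
      if (results.length > 0) {
        return { results, source: "semantic" };
      }
    } catch {
      // Assumption: a failed semantic query is handled like an unavailable backend.
    }
  }
  // Reached when the semantic backend is missing, failed, or returned nothing.
  return { results: await keyword(query), source: "keyword" };
}
```

Returning the `source` alongside the results lets callers (or logs) tell which path actually answered a query, which helps when validating the fallback behavior.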
+
+---
+
+## Current State Analysis
+
+### ✅ What's Correct: Backend Implementation
+
+**File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
+**Lines:** 360-396 (search_observations handler)
+
+The code DOES implement Layers 1 & 2 correctly:
+
+```typescript
+// Step 1: ChromaDB semantic search (top 100)
+if (chromaClient) {
+  const chromaResults = await queryChroma(query, 100);
+
+  // Step 2: Filter by 90-day recency
+  const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
+  const recentIds = chromaResults.ids.filter((_id, idx) => {
+    const meta = chromaResults.metadatas[idx];
+    return meta && meta.created_at_epoch > ninetyDaysAgo;
+  });
+
+  // Step 3: Hydrate from SQLite with temporal ordering
+  results = store.getObservationsByIds(recentIds, {
+    orderBy: 'date_desc',
+    limit
+  });
+}
+
+// Fallback to FTS5 if ChromaDB is unavailable or returned no results
+if (results.length === 0) {
+  results = search.searchObservations(query, options); // FTS5
+}
+```
+
+**What this gets right:**
+- ChromaDB semantic search FIRST (not FTS5)
+- 90-day recency filter
+- SQLite temporal ordering (`orderBy: 'date_desc'`)
+- FTS5 fallback for reliability
+
+### ❌ What's Wrong: Skill Operation Guides
+
+**File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
+
+**Current Title:** "Search Observations (Full-Text)"
+**Current Description:** "Search all observations using natural language queries."
+**Current `search-server.ts` Line 351:** `query: z.string().describe('Search query for FTS5 full-text search')`
+
+**The Problem:**
+- Describes FTS5 as the search method
+- No mention of ChromaDB semantic search
+- Misleading title "Full-Text" implies keyword-only
+- Examples don't show the ChromaDB → SQLite flow
+
+**Impact:**
+- Claude thinks it's doing FTS5 keyword search
+- Doesn't understand it's semantic vector search
+- Can't explain the architecture to users correctly
+
+### ⚠️ What's Missing: Layer 3 (Instant Context Timeline)
+
+The current implementation stops at Layer 2 (temporal ordering). It doesn't automatically:
+
+1. Identify the MOST relevant observation (it returns a sorted list)
+2. Retrieve observations BEFORE that point in time
+3. Retrieve observations AFTER that point in time
+4. Present the timeline context to the user
+
+**Why this matters:**
+The timeline is the **killer feature** that mimics human memory. Without it, users get:
+- ❌ A sorted list of relevant observations
+- ❌ No context about what led there
+- ❌ No context about what happened next
+
+With timeline, users get:
+- ✅ The MOST relevant observation
+- ✅ Context: "You did A and B before this"
+- ✅ Context: "After this, you did C and D"
+- ✅ Complete narrative like human memory
+
+### 📋 Documentation Status
+
+**Recently Fixed (✅):**
+- `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md`
+  - Now describes 3-layer sequential flow
+  - Includes human memory analogy
+  - Positions ChromaDB as primary
+
+**Landing Page (✅):**
+- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx`
+- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx`
+- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx`
+  - All updated to describe ChromaDB-first architecture
+  - "Remember Like a Human" messaging added
+  - Timeline feature highlighted
+
+**Needs Review:**
+- SKILL.md technical notes (line 172)
+- All operation guides in `/operations/` directory
+- Common workflows documentation
+
+---
+
+## Required Fixes
+
+### Fix 1: Update Skill Operation Guides
+
+**Files to modify:**
+- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
+- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md`
+
+**Changes needed:**
+
+1. **observations.md:**
+   - Change title: "Search Observations (Full-Text)" → "Search Observations (Semantic + Temporal)"
+   - Update description: Explain ChromaDB semantic search as primary
+   - Update command examples to explain hybrid flow
+   - Add note: "Uses ChromaDB vector search with SQLite temporal ordering. FTS5 used as fallback."
+
+2. **common-workflows.md:**
+   - Update "Workflow 2: Finding Specific Bug Fixes" to explain ChromaDB → SQLite flow
+   - Add new workflow: "Workflow N: Getting Timeline Context Around Relevant Observations"
+
+**Example of corrected observations.md header:**
+
+```markdown
+# Search Observations (Semantic + Temporal)
+
+Search observations using ChromaDB vector similarity with SQLite temporal ordering.
+
+## Architecture
+
+**3-Layer Hybrid Search:**
+1. **ChromaDB semantic retrieval** - Finds what's semantically relevant (vector similarity)
+2. **90-day recency filter** - Prioritizes recent work
+3. **SQLite temporal ordering** - Sorts by time, returns newest relevant
+
+**Fallback:** If ChromaDB unavailable, falls back to FTS5 keyword search.
+
+## When to Use
+
+- User asks: "How did we implement authentication?"
+- User asks: "What bugs did we fix?"
+- Looking for past work by meaning/topic (not just keywords)
+```
+
+### Fix 2: Implement Layer 3 (Instant Context Timeline)
+
+**Option A: Add to existing search_observations handler**
+
+Modify `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` line ~396:
+
+```typescript
+// After getting sorted results, if the caller wants timeline context
+if (results.length > 0 && options.includeTimeline) {
+  const topObservation = results[0];
+  // `??` keeps an explicit depth of 0; `||` would silently replace it with 5
+  const depthBefore = options.timelineDepthBefore ?? 5;
+  const depthAfter = options.timelineDepthAfter ?? 5;
+
+  // Get observations before and after the top result
+  const timeline = store.getTimelineContext(
+    topObservation.id,
+    depthBefore,
+    depthAfter
+  );
+
+  return {
+    topResult: topObservation,
+    timeline,
+    format
+  };
+}
+```
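
Option A calls `store.getTimelineContext`, whose behavior is not shown in this document. The sketch below illustrates what such a lookup could do, assuming an in-memory array in place of SQLite; the function signature, the `Obs` shape, and the returned `TimelineContext` structure are all hypothetical.

```typescript
// Hypothetical record shape; only id and timestamp matter for the timeline.
interface Obs {
  id: number;
  createdAtEpoch: number;
}

interface TimelineContext<T> {
  before: T[]; // oldest first, ending just before the anchor
  anchor: T;
  after: T[];  // oldest first, starting just after the anchor
}

// Returns up to depthBefore observations preceding the anchor and up to
// depthAfter following it, or null when the anchor ID is unknown.
function getTimelineContext<T extends Obs>(
  all: T[],
  anchorId: number,
  depthBefore: number,
  depthAfter: number
): TimelineContext<T> | null {
  // Sort a copy chronologically so the caller's array is left untouched.
  const sorted = [...all].sort((a, b) => a.createdAtEpoch - b.createdAtEpoch);
  const i = sorted.findIndex((o) => o.id === anchorId);
  if (i === -1) return null;
  return {
    before: sorted.slice(Math.max(0, i - depthBefore), i),
    anchor: sorted[i],
    after: sorted.slice(i + 1, i + 1 + depthAfter),
  };
}
```

Against a real SQLite store the same result would more likely come from two bounded queries around the anchor's timestamp than from sorting everything in memory; the array version only shows the intended shape of the output.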
+
+**Option B: Use existing timeline-by-query operation**
+
+The `/api/timeline/by-query` endpoint already implements search + timeline. We could:
+1. Make it the DEFAULT recommended operation in skill guides
+2. Update operation guides to emphasize this as primary workflow
+3. Position observations search as "timeline-less" alternative
+
+**Recommendation:** Option B is faster - leverage existing `timeline-by-query` endpoint and update skill guides to make it the primary workflow.
+
+### Fix 3: Update SKILL.md Technical Notes
+
+**File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md`
+**Line 172:**
+
+**Current:**
+```markdown
+- **Search engine:** FTS5 full-text search + structured filters
+```
+
+**Change to:**
+```markdown
+- **Search engine:** ChromaDB vector search (primary) + SQLite temporal ordering + instant context timeline (3-layer sequential architecture)
+```
+
+### Fix 4: Update search_observations Description
+
+**File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
+**Line 349:**
+
+**Current:**
+```typescript
+description: 'Search observations using full-text search across titles, narratives...'
+```
+
+**Change to:**
+```typescript
+description: 'Search observations using hybrid semantic search (ChromaDB vector similarity + SQLite temporal ordering). Falls back to FTS5 keyword search if ChromaDB unavailable. IMPORTANT: Always use index format first...'
+```
+
+**Line 351:**
+
+**Current:**
+```typescript
+query: z.string().describe('Search query for FTS5 full-text search'),
+```
+
+**Change to:**
+```typescript
+query: z.string().describe('Search query (semantic vector search via ChromaDB, falls back to FTS5 if unavailable)'),
+```
+
+---
+
+## Implementation Checklist
+
+Use this checklist when executing fixes:
+
+### Phase 1: Core Documentation
+- [ ] Update `observations.md` title and description
+- [ ] Update `observations.md` architecture explanation
+- [ ] Update `observations.md` examples to mention ChromaDB
+- [ ] Update `common-workflows.md` to explain hybrid flow
+- [ ] Update `SKILL.md` line 172 technical notes
+- [ ] Verify all operation guides mention ChromaDB correctly
+
+### Phase 2: Backend Updates
+- [ ] Update `search-server.ts` search_observations description (line 349)
+- [ ] Update `search-server.ts` query parameter description (line 351)
+- [ ] Add code comments explaining 3-layer flow
+- [ ] Consider adding `includeTimeline` option to search_observations
+
+### Phase 3: Timeline Integration
+- [ ] Review timeline-by-query operation
+- [ ] Update skill guides to recommend timeline-by-query as primary workflow
+- [ ] Add example: "When you need context, use timeline-by-query instead of observations search"
+- [ ] Update quick reference table in SKILL.md to highlight timeline-by-query
+
+### Phase 4: Validation
+- [ ] Test search behavior with ChromaDB enabled
+- [ ] Test fallback behavior with ChromaDB disabled
+- [ ] Verify skill guides accurately describe behavior
+- [ ] Ensure landing page messaging aligns with skill guides
+- [ ] Check that human memory analogy is consistent everywhere
+
+---
+
+## Key Messaging (Use Consistently)
+
+### Value Proposition
+"3-layer hybrid search mimics human memory: ChromaDB semantic retrieval finds what's relevant → SQLite temporal ordering identifies when → instant context timeline shows what led there and what came next."
+
+### Technical Architecture
+"ChromaDB vector search handles semantic understanding (what's relevant), SQLite handles temporal queries (when it happened, what's newest), and timeline context provides before/after observations (what led there, what happened next)."
+
+### Why It Matters
+"LLMs don't experience time linearly like humans do. Claude-mem gives them temporal context: not just 'you implemented authentication,' but 'you researched OAuth libraries, then implemented JWT auth, then fixed a token expiration bug.' Complete narrative, like human memory."
+
+### ChromaDB Role
+"ChromaDB is the PRIMARY search mechanism for semantic understanding. FTS5 is the FALLBACK for backward compatibility and reliability when ChromaDB is unavailable."
+
+---
+
+## Files Reference
+
+**Skill Guides (Primary Fixes):**
+- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md`
+- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
+- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/timeline-by-query.md`
+- `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md`
+
+**Backend Code (Minor Updates):**
+- `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
+
+**Documentation (Validation):**
+- `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md`
+
+**Landing Page (Already Fixed):**
+- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx`
+- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx`
+- `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx`
+
+---
+
+## Questions for User (If Needed)
+
+1. **Timeline Integration Approach:**
+   - Option A: Modify search_observations to add `includeTimeline` parameter
+   - Option B: Emphasize timeline-by-query as primary workflow in guides
+   - User preference?
+
+2. **Backward Compatibility:**
+   - Should FTS5 fallback be MORE prominent in docs for older systems?
+   - Or keep it as "implementation detail"?
+
+3. **Progressive Disclosure:**
+   - Should timeline context ALWAYS be included?
+   - Or only when user explicitly asks for context?
+
+---
+
+## Success Criteria
+
+When these fixes are complete:
+
+1. ✅ Skill operation guides accurately describe ChromaDB-first architecture
+2. ✅ No references to "FTS5 as primary search method"
+3. ✅ Timeline feature integrated into standard workflow
+4. ✅ Human memory analogy present in key documentation
+5. ✅ Consistent messaging across skill guides, docs, and landing page
+6. ✅ Backend code comments explain 3-layer flow clearly
+7. ✅ Users understand: "This is semantic search with temporal context, not just keyword search"
+
+---
+
+## Notes for Next Claude
+
+- The user has already clarified the architecture thoroughly
+- Backend code is already correct - focus on documentation/guides
+- Landing page recently updated - validate for consistency
+- Timeline-by-query endpoint already exists - leverage it
+- Key insight: This mimics human memory through temporal context
+- ChromaDB is PRIMARY, not optional. FTS5 is FALLBACK, not primary.
+
+**Start with:** Reading this document fully, then update skill operation guides first (highest impact).
@@ -148,16 +148,19 @@ When Claude invokes the skill:

 ## Search Architecture

-### Hybrid Search System
+### 3-Layer Hybrid Search System

-claude-mem uses a **hybrid search architecture** combining:
+claude-mem uses a **3-layer sequential search architecture** that mimics human long-term memory:

-1. **SQLite FTS5 (Full-Text Search)** - Keyword-based search
-2. **ChromaDB (Vector Search)** - Semantic similarity search
+**Storage Flow (Write Path):**
+1. **SQLite First** - Data written synchronously to SQLite (fast, immediate access)
+2. **ChromaDB Background Sync** - Worker asynchronously generates embeddings and syncs to ChromaDB
+
+**Search Flow (Read Path - Sequential, NOT parallel):**

 ```
 ┌─────────────────────────────────────────────────────────────┐
-│                   Search Request Flow                        │
+│                3-Layer Sequential Search Flow                │
 └─────────────────────────────────────────────────────────────┘
                            │
                            ▼
@@ -166,61 +169,70 @@ claude-mem uses a **hybrid search architecture** combining:
              │  /api/search/*          │
              └─────────────────────────┘
                            │
-              ┌─────────────┴─────────────┐
-              ▼                           ▼
-┌──────────────────────────┐  ┌──────────────────────────┐
-│  SessionSearch (FTS5)    │  │  ChromaSync (Vector DB)  │
-│                          │  │                          │
-│  Full-text keyword       │  │  Semantic similarity     │
-│  search on:              │  │  search on:              │
-│  - titles                │  │  - narratives            │
-│  - narratives            │  │  - facts                 │
-│  - facts                 │  │  - file content          │
-│  - concepts              │  │                          │
-│                          │  │  Embeddings:             │
-│  SQLite DB:              │  │  - text-embedding-3-small│
-│  observations_fts        │  │  - 90-day recency filter │
-│  sessions_fts            │  │                          │
-│  prompts_fts             │  │  ChromaDB:               │
-│                          │  │  observations collection │
-└──────────────────────────┘  └──────────────────────────┘
-              │                           │
-              └─────────────┬─────────────┘
                            ▼
-              ┌─────────────────────────┐
-              │  Merged Results         │
-              │  - Deduplicated         │
-              │  - Sorted by relevance  │
-              │  - Formatted (index/full)│
-              └─────────────────────────┘
+┌─────────────────────────────────────────────────────────────┐
+│  LAYER 1: Semantic Retrieval (ChromaDB)                     │
+│  ─────────────────────────────────────────────────────────  │
+│  Vector similarity search finds semantically relevant items  │
+│  Returns: observation IDs in index format (~50-100 tokens)  │
+│  Filter: 90-day recency prioritizes recent work             │
+│  Output: List of relevant observation IDs                   │
+└─────────────────────────────────────────────────────────────┘
+                            │
+                            ▼
+┌─────────────────────────────────────────────────────────────┐
+│  LAYER 2: Temporal Ordering (SQLite)                        │
+│  ─────────────────────────────────────────────────────────  │
+│  Takes observation IDs from Layer 1                         │
+│  Sorts by created_at timestamp (fast SQLite temporal query) │
+│  Identifies: MOST RECENT relevant observation               │
+│  Why: ChromaDB doesn't easily query by date range sorted    │
+│  Output: Top observation ID by time                         │
+└─────────────────────────────────────────────────────────────┘
+                            │
+                            ▼
+┌─────────────────────────────────────────────────────────────┐
+│  LAYER 3: Instant Context Timeline (SQLite)                 │
+│  ─────────────────────────────────────────────────────────  │
+│  Uses top observation ID from Layer 2 as anchor             │
+│  Retrieves N observations BEFORE and AFTER that point       │
+│  Provides: "what led here" + "what happened next" context   │
+│  This is the KILLER FEATURE: mimics human memory            │
+│  Output: Timeline with temporal context                     │
+└─────────────────────────────────────────────────────────────┘
 ```

+**Why This Architecture Exists:**
+
+The problem: LLMs don't experience time linearly like humans do. Finding semantically relevant information isn't enough—you need temporal context.
+
+The solution:
+- **ChromaDB** for "what's relevant" (semantic understanding)
+- **SQLite** for "when did it happen" (temporal ordering with fast date-range queries)
+- **Timeline** for "what was the context" (before/after observations)
+
+Together, they mimic how humans recall: "I did X, which led to Y, then Z happened."
+
+**Human Memory Analogy:**
+
+Humans don't just remember isolated facts. They remember sequences: what they did before something, what happened after. The instant context timeline gives LLMs this same temporal awareness that humans experience naturally.
+
 ### Search Types

-#### 1. Full-Text Search (FTS5)
+#### 1. Vector Search (ChromaDB) - PRIMARY Search Layer

-**How it works:**
- Uses SQLite FTS5 virtual tables for instant keyword matching
- Supports boolean operators: `AND`, `OR`, `NOT`, `NEAR`, `*` (wildcard)
- Ranks results by BM25 relevance scoring
- Sub-100ms performance on 8,000+ observations
-
-**Example query:**
-```sql
-- User asks: "How did we implement JWT authentication?"
-SELECT * FROM observations_fts
-WHERE observations_fts MATCH 'JWT AND authentication'
-ORDER BY rank
-LIMIT 20;
-```
-
-#### 2. Vector Search (ChromaDB)
+**Role:** Layer 1 - Semantic Retrieval

 **How it works:**
 - Text is embedded using OpenAI's `text-embedding-3-small` model
- Vector similarity search finds semantically related content
+- Vector similarity search finds semantically related content, not just keyword matches
 - 90-day recency filter prioritizes recent work
- Combined with keyword search for hybrid results
+- Returns observation IDs for temporal processing in Layer 2
+
+**Why it's primary:**
+- Understands meaning, not just keywords ("auth flow" matches "JWT implementation")
+- Finds relevant work even when you don't know exact terms used
+- Semantic understanding crucial for LLM memory retrieval

 **Example query:**
 ```python
@@ -230,6 +242,37 @@ collection.query(
    n_results=20,
    where={"created_at": {"$gte": ninety_days_ago}}
 )
+# Returns: observation IDs semantically related to login/auth
+```
+
+#### 2. Full-Text Search (FTS5) - Supporting Layer
+
+**Role:** Layers 2 & 3 - Temporal Ordering and Timeline Context (plain SQLite queries); FTS5 keyword matching is the fallback when ChromaDB is unavailable
+
+**How it works:**
+- Uses SQLite FTS5 virtual tables for instant keyword matching
+- Supports boolean operators: `AND`, `OR`, `NOT`, `NEAR`, `*` (wildcard)
+- Fast temporal queries with date-range sorting
+- Sub-100ms performance on 8,000+ observations
+
+**Why it's supporting:**
+- ChromaDB handles semantic "what's relevant"
+- SQLite/FTS5 handles temporal "when did it happen" and "what came before/after"
+- Optimized for timeline queries and date-based sorting
+
+**Example query:**
+```sql
+-- Takes observation IDs from ChromaDB, sorts by time
+SELECT * FROM observations
+WHERE id IN (/* IDs from ChromaDB */)
+ORDER BY created_at_epoch DESC
+LIMIT 1;
+
+-- Then retrieves timeline context around that observation
+-- (:anchor_timestamp is a bound parameter: the anchor's created_at_epoch)
+SELECT * FROM observations
+WHERE created_at_epoch < :anchor_timestamp
+ORDER BY created_at_epoch DESC
+LIMIT 10; -- "what led here"
 ```

 #### 3. Structured Filters
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
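
Note on the bundle hunks above: each one changes the version that `ensureDiscoveryTokensColumn()` checks and records from 11 to 7, which the commit message says matches `migration007`. The guard pattern itself can be sketched as below. This is a minimal illustration, not the project's `SessionStore`: `SchemaState` and `ensureColumn` are hypothetical names, and an in-memory structure stands in for the SQLite `schema_versions` table and `table_info` pragma.

```typescript
// Hypothetical in-memory stand-in for schema state; not the project's SessionStore.
interface SchemaState {
  appliedVersions: Set<number>;
  columns: Map<string, Set<string>>; // table name -> column names
}

function ensureColumn(
  state: SchemaState,
  version: number,
  tables: string[],
  column: string,
): string[] {
  const added: string[] = [];
  // Guard 1: skip entirely if this migration version is already recorded.
  if (state.appliedVersions.has(version)) return added;
  for (const table of tables) {
    const cols = state.columns.get(table) ?? new Set<string>();
    // Guard 2: add the column only where it is missing (mirrors the table_info check).
    if (!cols.has(column)) {
      cols.add(column);
      state.columns.set(table, cols);
      added.push(table);
    }
  }
  // Record the version so later runs take the early return (mirrors INSERT OR IGNORE).
  state.appliedVersions.add(version);
  return added;
}
```

Because the early return keys on the recorded version, the number used here has to agree with the one the migration system records elsewhere. Otherwise the two code paths disagree about whether the migration has run.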
@@ -169,7 +169,7 @@ For guidelines on presenting search results to users, see [operations/formatting

 - **Port:** Default 37777 (configurable via `CLAUDE_MEM_WORKER_PORT`)
 - **Response format:** Always JSON
- **Search engine:** FTS5 full-text search + structured filters
+- **Search engine:** ChromaDB semantic search (primary ranking) + SQLite FTS5 (fallback) + 90-day recency filter + temporal ordering (hybrid architecture)
 - **All operations:** HTTP GET with query parameters
 - **Worker:** PM2-managed background process
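
Note on the **Search engine** bullet: one way the hybrid flow could be composed is sketched below. This is an illustration under stated assumptions, not the worker's actual code. `Hit`, `SearchFn`, and `hybridSearch` are hypothetical names; the sketch assumes the fallback triggers when the semantic backend throws, that the 90-day filter applies on the semantic path, and that temporal ordering means newest first.

```typescript
// Hypothetical record shape; field names are illustrative.
interface Hit {
  id: number;
  createdAtEpochMs: number;
}

// Hypothetical backend signature: takes a query, returns matching hits.
type SearchFn = (query: string) => Hit[];

const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

function hybridSearch(
  query: string,
  semanticSearch: SearchFn, // stands in for ChromaDB
  ftsSearch: SearchFn, // stands in for SQLite FTS5
  now: number = Date.now(),
): Hit[] {
  let hits: Hit[];
  try {
    // Primary path: semantic matches, restricted to the last 90 days (assumed placement).
    hits = semanticSearch(query).filter(
      (h) => now - h.createdAtEpochMs <= NINETY_DAYS_MS,
    );
  } catch {
    // Fallback path: used only when the semantic backend is unavailable (assumed trigger).
    hits = ftsSearch(query);
  }
  // Temporal ordering: newest first (assumed direction).
  return [...hits].sort((a, b) => b.createdAtEpochMs - a.createdAtEpochMs);
}
```

Passing the two backends in as functions keeps the sketch independent of any particular ChromaDB or SQLite client.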

@@ -1,4 +1,4 @@
-# Search Observations (Full-Text)
+# Search Observations (Semantic + Full-Text Hybrid)

 Search all observations using natural language queries.

@@ -17,7 +17,7 @@ curl -s "http://localhost:37777/api/search/observations?query=authentication&for

 ## Parameters

- **query** (required): Search terms (e.g., "authentication", "bug fix", "database migration")
+- **query** (required): Natural language search query - uses semantic search (ChromaDB) for ranking with SQLite FTS5 fallback (e.g., "authentication", "bug fix", "database migration")
 - **format**: "index" (summary) or "full" (complete details). Default: "full"
 - **limit**: Number of results (default: 20, max: 100)
 - **project**: Filter by project name (optional)
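
Note on the parameter list: a request URL for this endpoint could be assembled as in the sketch below. `buildObservationSearchUrl` and `ObservationSearchOptions` are hypothetical names. The client-side clamp to 1..100 is an assumption: the docs state only the maximum of 100 and do not say how the server treats out-of-range values.

```typescript
// Hypothetical helper; the name and option shape are illustrative.
interface ObservationSearchOptions {
  query: string;
  format?: "index" | "full";
  limit?: number;
  project?: string;
}

function buildObservationSearchUrl(
  opts: ObservationSearchOptions,
  port: number = 37777, // documented default worker port
): string {
  const params = new URLSearchParams({ query: opts.query });
  if (opts.format) params.set("format", opts.format);
  if (opts.limit !== undefined) {
    // Clamp to the documented maximum; the lower bound of 1 is an assumption.
    const clamped = Math.min(100, Math.max(1, Math.floor(opts.limit)));
    params.set("limit", String(clamped));
  }
  if (opts.project) params.set("project", opts.project);
  return `http://localhost:${port}/api/search/observations?${params.toString()}`;
}

// Usage: request the lightweight index format first, as the tool descriptions recommend.
const exampleUrl = buildObservationSearchUrl({ query: "bug fix", format: "index", limit: 500 });
```

`URLSearchParams` handles the encoding, so a natural language query containing spaces or punctuation needs no manual escaping.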
@@ -346,9 +346,9 @@ const filterSchema = z.object({
 const tools = [
  {
    name: 'search_observations',
-    description: 'Search observations using full-text search across titles, narratives, facts, and concepts. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
+    description: 'Search observations using hybrid semantic + full-text search (ChromaDB primary, SQLite FTS5 fallback). IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
    inputSchema: z.object({
-      query: z.string().describe('Search query for FTS5 full-text search'),
+      query: z.string().describe('Natural language search query (semantic ranking via ChromaDB, FTS5 fallback)'),
      format: z.enum(['index', 'full']).default('index').describe('Output format: "index" for titles/dates only (default, RECOMMENDED for initial search), "full" for complete details (use only after reviewing index results)'),
      ...filterSchema.shape
    }),
@@ -434,9 +434,9 @@ const tools = [
  },
  {
    name: 'search_sessions',
-    description: 'Search session summaries using full-text search across requests, completions, learnings, and notes. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
+    description: 'Search session summaries using hybrid semantic + full-text search (ChromaDB primary, SQLite FTS5 fallback). IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
    inputSchema: z.object({
-      query: z.string().describe('Search query for FTS5 full-text search'),
+      query: z.string().describe('Natural language search query (semantic ranking via ChromaDB, FTS5 fallback)'),
      format: z.enum(['index', 'full']).default('index').describe('Output format: "index" for titles/dates only (default, RECOMMENDED for initial search), "full" for complete details (use only after reviewing index results)'),
      project: z.string().optional().describe('Filter by project name'),
      dateRange: z.object({
@@ -1000,9 +1000,9 @@ const tools = [
  },
  {
    name: 'search_user_prompts',
-    description: 'Search raw user prompts with full-text search. Use this to find what the user actually said/requested across all sessions. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
+    description: 'Search raw user prompts using hybrid semantic + full-text search (ChromaDB primary, SQLite FTS5 fallback). Use this to find what the user actually said/requested across all sessions. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
    inputSchema: z.object({
-      query: z.string().describe('Search query for FTS5 full-text search'),
+      query: z.string().describe('Natural language search query (semantic ranking via ChromaDB, FTS5 fallback)'),
      format: z.enum(['index', 'full']).default('index').describe('Output format: "index" for truncated prompts/dates (default, RECOMMENDED for initial search), "full" for complete prompt text (use only after reviewing index results)'),
      project: z.string().optional().describe('Filter by project name'),
      dateRange: z.object({