docs: Align search documentation with hybrid ChromaDB architecture (#116)

* feat: Add discovery_tokens for ROI tracking in observations and session summaries

- Introduced `discovery_tokens` column in `observations` and `session_summaries` tables to track token costs associated with discovering and creating each observation and summary.
- Updated relevant services and hooks to calculate and display ROI metrics based on discovery tokens.
- Enhanced context economics reporting to include savings from reusing previous observations.
- Implemented migration to ensure the new column is added to existing tables.
- Adjusted data models and sync processes to accommodate the new `discovery_tokens` field.

* refactor: streamline context hook by removing unused functions and updating terminology

- Removed the estimateTokens and getObservations helper functions as they were not utilized.
- Updated the legend and output messages to replace "discovery" with "work" for clarity.
- Changed the emoji representation for different observation types to better reflect their purpose.
- Enhanced output formatting for improved readability and understanding of token usage.

* Refactor user-message-hook and context-hook for improved clarity and functionality

- Updated user-message-hook.js to enhance error messaging and improve variable naming for clarity.
- Modified context-hook.ts to include a new column key section, improved context index instructions, and added emoji icons for observation types.
- Adjusted footer messages in context-hook.ts to emphasize token savings and access to past research.
- Changed user-message-hook.ts to update the feedback and support message for clarity.

* fix: Critical ROI tracking fixes from PR review

Addresses critical findings from PR #111 review:

1. **Fixed incorrect discovery token calculation** (src/services/worker/SDKAgent.ts)
   - Changed from passing cumulative total to per-response delta
   - Now correctly tracks token cost for each observation/summary
   - Captures token state before/after response processing
   - Prevents all observations getting inflated cumulative values

2. **Fixed schema version mismatch** (src/services/sqlite/SessionStore.ts)
   - Changed ensureDiscoveryTokensColumn() from version 11 to version 7
   - Now matches migration007 definition in migrations.ts
   - Ensures consistent version tracking across migration system

These fixes ensure ROI metrics accurately reflect token costs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Update search documentation to reflect hybrid ChromaDB architecture

The backend correctly implements ChromaDB-first semantic search with SQLite temporal ordering and FTS5 fallback, but documentation incorrectly described it as "FTS5 full-text search". This fix aligns all skill guides and tool descriptions with the actual implementation.

Changes:
- Update SKILL.md to describe hybrid architecture with ChromaDB primary
- Update observations.md title and query parameter descriptions
- Update all three search tool descriptions in search-server.ts:
  * search_observations
  * search_sessions
  * search_user_prompts

All tools now correctly document:
- ChromaDB semantic search (primary ranking)
- 90-day recency filter
- SQLite temporal ordering
- FTS5 fallback (when ChromaDB unavailable)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Add discovery_tokens column to observations and session_summaries tables

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-16 13:36:17 -05:00
parent 3cbc041c8b
commit c0778bef00
10 changed files with 541 additions and 64 deletions
@@ -0,0 +1,434 @@
 # Hybrid Search Architecture: Problem-Solution Document
 **Date:** 2025-01-15
 **Author:** Claude Code (Session handoff document)
 **Purpose:** Comprehensive fix guide for hybrid search architecture documentation and implementation
 ---
 ## Executive Summary
 The claude-mem hybrid search architecture is **correctly implemented in code** but **incorrectly documented** in skill guides. Additionally, the workflow is missing the final "instant context timeline" step that completes the human memory analogy.
 **Quick Status:**
 - ✅ Backend code (`search-server.ts`): ChromaDB first, SQLite temporal sort
 - ❌ Skill operation guides: Describe FTS5 as primary search method
 - ❌ Missing feature: Automatic timeline context retrieval (before/after observations)
 - ✅ Landing page: Recently corrected
 - ⚠️ Documentation: Needs validation and potential refinement
 ---
 ## The Intended Architecture (User's Vision)
 ### Storage Flow
 ```
 User Action
    ↓
 1. SQLite Insert (FAST, synchronous)
    - Immediate persistence
    - Available for querying instantly
    ↓
 2. ChromaDB Sync (BACKGROUND, asynchronous)
    - Worker generates embeddings
    - Takes time but doesn't block user
    - Uses OpenAI text-embedding-3-small
 ```
 **Why this design:**
 - Users don't wait for embedding generation
 - SQLite provides immediate access
 - ChromaDB catches up in background for semantic search
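The write path above can be sketched in a few lines. This is a minimal illustration, not the project's code: the `WritePath` class, the `StoredObservation` type, and the injected `embed` function are all hypothetical, and plain `Map`s stand in for the SQLite table and the ChromaDB collection.

```typescript
// Hypothetical stand-ins: Maps replace SQLite and ChromaDB so the sketch runs on its own.
interface StoredObservation {
  id: number;
  text: string;
  created_at_epoch: number;
}

class WritePath {
  readonly rows = new Map<number, StoredObservation>(); // stand-in for the SQLite table
  readonly vectors = new Map<number, number[]>();       // stand-in for the ChromaDB collection
  readonly failures: unknown[] = [];                    // embedding errors, kept for inspection
  private pending: Promise<void> = Promise.resolve();
  private readonly embed: (text: string) => Promise<number[]>;

  constructor(embed: (text: string) => Promise<number[]>) {
    this.embed = embed;
  }

  // Step 1: synchronous insert, so the row is queryable as soon as save() returns.
  // Step 2: embedding is chained onto a background queue and is never awaited here.
  save(obs: StoredObservation): void {
    this.rows.set(obs.id, obs);
    this.pending = this.pending
      .then(() => this.embed(obs.text))
      .then((vector) => { this.vectors.set(obs.id, vector); })
      .catch((err: unknown) => { this.failures.push(err); });
  }

  // Resolves once every queued embedding has been processed.
  flush(): Promise<void> {
    return this.pending;
  }
}
```

Right after `save()` returns, the row is in `rows` but not yet in `vectors`; that gap is the window in which only SQLite can answer queries.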
 ### Search Flow (3-Layer Sequential Architecture)
 ```
 User Query: "How did we implement authentication?"
    ↓
 LAYER 1: Semantic Retrieval (ChromaDB)
    - Vector similarity search
    - Returns observation IDs (not full records)
    - Top 100 semantic matches
    - 90-day recency filter applied
    ↓
 LAYER 2: Temporal Ordering (SQLite)
    - Takes IDs from Layer 1
    - Hydrates full records from SQLite
    - Sorts by created_at_epoch DESC
    - Returns NEWEST relevant observation
    ↓
 LAYER 3: Instant Context Timeline (SQLite) [MISSING IN CURRENT IMPLEMENTATION]
    - Takes top observation ID from Layer 2
    - Retrieves N observations BEFORE that point
    - Retrieves N observations AFTER that point
    - Provides temporal context: "what led here" + "what happened next"
    ↓
 Present to User
    - Most relevant observation
    - Timeline showing before/after context
    - Mimics human memory
 ```
 **Why ChromaDB can't do it alone:**
 - ChromaDB doesn't efficiently support date range queries sorted by time
 - SQLite excels at temporal operations (ORDER BY created_at_epoch)
 - Need both: ChromaDB for semantic, SQLite for temporal
 **Why the timeline matters:**
 > LLMs don't experience time linearly like humans do. Humans remember: "I did X, which led to Y, then Z happened." The instant context timeline gives LLMs this temporal awareness that humans experience naturally.
 ### Fallback Behavior
 ```
 IF ChromaDB unavailable OR no results:
    ↓
 FTS5 Keyword Search (SQLite)
    - Full-text search on observations_fts
    - Basic keyword matching
    - Ensures backward compatibility
    - Fallback for older systems
 ```
 **FTS5 is NOT "optional"** - it's the fallback mechanism for when ChromaDB isn't available or returns no results.
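A minimal sketch of that fallback rule, assuming both searches are exposed as async functions returning observation IDs. `SearchFn` and `searchWithFallback` are hypothetical names, and treating a thrown error the same as "unavailable" is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical signature: a search takes a query and resolves to observation IDs.
type SearchFn = (query: string) => Promise<number[]>;

async function searchWithFallback(
  query: string,
  semantic: SearchFn | null, // null models "ChromaDB unavailable"
  keyword: SearchFn,         // models the FTS5 keyword search
): Promise<{ ids: number[]; source: 'semantic' | 'keyword' }> {
  if (semantic) {
    try {
      const ids = await semantic(query);
      // Only accept the semantic path when it actually found something.
      if (ids.length > 0) return { ids, source: 'semantic' };
    } catch {
      // Assumption: a failing vector store is handled like an unavailable one.
    }
  }
  return { ids: await keyword(query), source: 'keyword' };
}
```

Returning the `source` alongside the IDs makes it easy to tell in tests or logs which path answered a given query.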
 ---
 ## Current State Analysis
 ### ✅ What's Correct: Backend Implementation
 **File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
 **Lines:** 360-396 (search_observations handler)
 The code DOES implement Layers 1 & 2 correctly:
 ```typescript
 // Step 1: ChromaDB semantic search (top 100)
 if (chromaClient) {
  const chromaResults = await queryChroma(query, 100);
  // Step 2: Filter by 90-day recency
  const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
  const recentIds = chromaResults.ids.filter((_id, idx) => {
    const meta = chromaResults.metadatas[idx];
    return meta && meta.created_at_epoch > ninetyDaysAgo;
  });
  // Step 3: Hydrate from SQLite with temporal ordering
  results = store.getObservationsByIds(recentIds, {
    orderBy: 'date_desc',
    limit
  });
 }
 // Fallback to FTS5 if ChromaDB is unavailable or returned no results
 if (results.length === 0) {
  results = search.searchObservations(query, options); // FTS5
 }
 ```
 **What this gets right:**
 - ChromaDB semantic search FIRST (not FTS5)
 - 90-day recency filter
 - SQLite temporal ordering (`orderBy: 'date_desc'`)
 - FTS5 fallback for reliability
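The recency filter and temporal ordering can also be exercised in isolation. The sketch below is a hypothetical pure function (`recentNewestFirst` and `Hit` are illustrative names, not part of `search-server.ts`) that takes the current time as a parameter so the 90-day cutoff is deterministic in tests.

```typescript
// Hypothetical shape: one semantic match plus the timestamp stored in its metadata.
interface Hit {
  id: number;
  created_at_epoch: number; // milliseconds since the Unix epoch
}

const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

// Keeps matches newer than the 90-day cutoff, sorts newest first, then applies the limit.
function recentNewestFirst(hits: Hit[], now: number, limit: number): Hit[] {
  const cutoff = now - NINETY_DAYS_MS;
  return hits
    .filter((h) => h.created_at_epoch > cutoff) // filter() copies, so the input is not mutated
    .sort((a, b) => b.created_at_epoch - a.created_at_epoch)
    .slice(0, limit);
}
```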
 ### ❌ What's Wrong: Skill Operation Guides
 **File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
 **Current Title:** "Search Observations (Full-Text)"
 **Current Description:** "Search all observations using natural language queries."
 **Related tool schema (`search-server.ts`, line 351):** `query: z.string().describe('Search query for FTS5 full-text search')`
 **The Problem:**
 - Describes FTS5 as the search method
 - No mention of ChromaDB semantic search
 - Misleading title "Full-Text" implies keyword-only
 - Examples don't show the ChromaDB → SQLite flow
 **Impact:**
 - Claude thinks it's doing FTS5 keyword search
 - Doesn't understand it's semantic vector search
 - Can't explain the architecture to users correctly
 ### ⚠️ What's Missing: Layer 3 (Instant Context Timeline)
 The current implementation stops at Layer 2 (temporal ordering). It doesn't automatically:
 1. Identify the MOST relevant observation (it returns a sorted list)
 2. Retrieve observations BEFORE that point in time
 3. Retrieve observations AFTER that point in time
 4. Present the timeline context to the user
 **Why this matters:**
 The timeline is the **killer feature** that mimics human memory. Without it, users get:
 - ❌ A sorted list of relevant observations
 - ❌ No context about what led there
 - ❌ No context about what happened next
 With timeline, users get:
 - ✅ The MOST relevant observation
 - ✅ Context: "You did A and B before this"
 - ✅ Context: "After this, you did C and D"
 - ✅ Complete narrative like human memory
 ### 📋 Documentation Status
 **Recently Fixed (✅):**
 - `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md`
  - Now describes 3-layer sequential flow
  - Includes human memory analogy
  - Positions ChromaDB as primary
 **Landing Page (✅):**
 - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx`
 - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx`
 - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx`
  - All updated to describe ChromaDB-first architecture
  - "Remember Like a Human" messaging added
  - Timeline feature highlighted
 **Needs Review:**
 - SKILL.md technical notes (line 172)
 - All operation guides in `/operations/` directory
 - Common workflows documentation
 ---
 ## Required Fixes
 ### Fix 1: Update Skill Operation Guides
 **Files to modify:**
 - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
 - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md`
 **Changes needed:**
 1. **observations.md:**
   - Change title: "Search Observations (Full-Text)" → "Search Observations (Semantic + Temporal)"
   - Update description: Explain ChromaDB semantic search as primary
   - Update command examples to explain hybrid flow
   - Add note: "Uses ChromaDB vector search with SQLite temporal ordering. FTS5 used as fallback."
 2. **common-workflows.md:**
   - Update "Workflow 2: Finding Specific Bug Fixes" to explain ChromaDB → SQLite flow
   - Add new workflow: "Workflow N: Getting Timeline Context Around Relevant Observations"
 **Example of corrected observations.md header:**
 ```markdown
 # Search Observations (Semantic + Temporal)
 Search observations using ChromaDB vector similarity with SQLite temporal ordering.
 ## Architecture
 **3-Layer Hybrid Search:**
 1. **ChromaDB semantic retrieval** - Finds what's semantically relevant (vector similarity)
 2. **90-day recency filter** - Prioritizes recent work
 3. **SQLite temporal ordering** - Sorts by time, returns newest relevant
 **Fallback:** If ChromaDB unavailable, falls back to FTS5 keyword search.
 ## When to Use
 - User asks: "How did we implement authentication?"
 - User asks: "What bugs did we fix?"
 - Looking for past work by meaning/topic (not just keywords)
 ```
 ### Fix 2: Implement Layer 3 (Instant Context Timeline)
 **Option A: Add to existing search_observations handler**
 Modify `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts` line ~396:
 ```typescript
 // After getting sorted results, attach timeline context if the caller asked for it
 if (results.length > 0 && options.includeTimeline) {
   // results are sorted newest-first, so results[0] is the newest relevant observation
   const topObservation = results[0];
   // `??` keeps an explicit depth of 0 instead of silently replacing it with 5
   const depthBefore = options.timelineDepthBefore ?? 5;
   const depthAfter = options.timelineDepthAfter ?? 5;
   // Get observations before and after the anchor
   const timeline = store.getTimelineContext(
     topObservation.id,
     depthBefore,
     depthAfter
   );
   return {
     topResult: topObservation,
     timeline,
     format
   };
 }
 ```
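For illustration, here is one way a `getTimelineContext` helper could behave. This is a sketch under stated assumptions: the function signature, the `TimelineObservation` and `TimelineContext` types, and the in-memory array (standing in for a SQLite query on `created_at_epoch`) are all hypothetical, not the existing store API.

```typescript
// Hypothetical record shape; the real observations table has more columns.
interface TimelineObservation {
  id: number;
  created_at_epoch: number;
  title: string;
}

interface TimelineContext {
  before: TimelineObservation[]; // oldest first, ending just before the anchor
  anchor: TimelineObservation;
  after: TimelineObservation[];  // oldest first, starting just after the anchor
}

// Returns up to depthBefore/depthAfter neighbours around the anchor, or null if it is unknown.
function getTimelineContext(
  all: TimelineObservation[], // stand-in for the observations table
  anchorId: number,
  depthBefore: number,
  depthAfter: number,
): TimelineContext | null {
  const ordered = [...all].sort((a, b) => a.created_at_epoch - b.created_at_epoch);
  const idx = ordered.findIndex((o) => o.id === anchorId);
  const anchor = ordered[idx]; // undefined when idx is -1
  if (anchor === undefined) return null;
  return {
    before: ordered.slice(Math.max(0, idx - depthBefore), idx),
    anchor,
    after: ordered.slice(idx + 1, idx + 1 + depthAfter),
  };
}
```

Returning `before` and `after` in chronological order lets the caller print the timeline top to bottom as "what led here", the anchor, then "what happened next".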
 **Option B: Use existing timeline-by-query operation**
 The `/api/timeline/by-query` endpoint already implements search + timeline. Could:
 1. Make it the DEFAULT recommended operation in skill guides
 2. Update operation guides to emphasize this as primary workflow
 3. Position observations search as "timeline-less" alternative
 **Recommendation:** Option B is faster to implement: leverage the existing `timeline-by-query` endpoint and update the skill guides to make it the primary workflow.
 ### Fix 3: Update SKILL.md Technical Notes
 **File:** `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md`
 **Line 172:**
 **Current:**
 ```markdown
 - **Search engine:** FTS5 full-text search + structured filters
 ```
 **Change to:**
 ```markdown
 - **Search engine:** ChromaDB vector search (primary) + SQLite temporal ordering + instant context timeline (3-layer sequential architecture)
 ```
 ### Fix 4: Update search_observations Description
 **File:** `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
 **Line 349:**
 **Current:**
 ```typescript
 description: 'Search observations using full-text search across titles, narratives...'
 ```
 **Change to:**
 ```typescript
 description: 'Search observations using hybrid semantic search (ChromaDB vector similarity + SQLite temporal ordering). Falls back to FTS5 keyword search if ChromaDB unavailable. IMPORTANT: Always use index format first...'
 ```
 **Line 351:**
 **Current:**
 ```typescript
 query: z.string().describe('Search query for FTS5 full-text search'),
 ```
 **Change to:**
 ```typescript
 query: z.string().describe('Search query (semantic vector search via ChromaDB, falls back to FTS5 if unavailable)'),
 ```
 ---
 ## Implementation Checklist
 Use this checklist when executing fixes:
 ### Phase 1: Core Documentation
 - [ ] Update `observations.md` title and description
 - [ ] Update `observations.md` architecture explanation
 - [ ] Update `observations.md` examples to mention ChromaDB
 - [ ] Update `common-workflows.md` to explain hybrid flow
 - [ ] Update `SKILL.md` line 172 technical notes
 - [ ] Verify all operation guides mention ChromaDB correctly
 ### Phase 2: Backend Updates
 - [ ] Update `search-server.ts` search_observations description (line 349)
 - [ ] Update `search-server.ts` query parameter description (line 351)
 - [ ] Add code comments explaining 3-layer flow
 - [ ] Consider adding `includeTimeline` option to search_observations
 ### Phase 3: Timeline Integration
 - [ ] Review timeline-by-query operation
 - [ ] Update skill guides to recommend timeline-by-query as primary workflow
 - [ ] Add example: "When you need context, use timeline-by-query instead of observations search"
 - [ ] Update quick reference table in SKILL.md to highlight timeline-by-query
 ### Phase 4: Validation
 - [ ] Test search behavior with ChromaDB enabled
 - [ ] Test fallback behavior with ChromaDB disabled
 - [ ] Verify skill guides accurately describe behavior
 - [ ] Ensure landing page messaging aligns with skill guides
 - [ ] Check that human memory analogy is consistent everywhere
 ---
 ## Key Messaging (Use Consistently)
 ### Value Proposition
 "3-layer hybrid search mimics human memory: ChromaDB semantic retrieval finds what's relevant → SQLite temporal ordering identifies when → instant context timeline shows what led there and what came next."
 ### Technical Architecture
 "ChromaDB vector search handles semantic understanding (what's relevant), SQLite handles temporal queries (when it happened, what's newest), and timeline context provides before/after observations (what led there, what happened next)."
 ### Why It Matters
 "LLMs don't experience time linearly like humans do. Claude-mem gives them temporal context: not just 'you implemented authentication,' but 'you researched OAuth libraries, then implemented JWT auth, then fixed a token expiration bug.' Complete narrative, like human memory."
 ### ChromaDB Role
 "ChromaDB is the PRIMARY search mechanism for semantic understanding. FTS5 is the FALLBACK for backward compatibility and reliability when ChromaDB is unavailable."
 ---
 ## Files Reference
 **Skill Guides (Primary Fixes):**
 - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md`
 - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md`
 - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/timeline-by-query.md`
 - `/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md`
 **Backend Code (Minor Updates):**
 - `/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts`
 **Documentation (Validation):**
 - `/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md`
 **Landing Page (Already Fixed):**
 - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx`
 - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx`
 - `/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx`
 ---
 ## Questions for User (If Needed)
 1. **Timeline Integration Approach:**
   - Option A: Modify search_observations to add `includeTimeline` parameter
   - Option B: Emphasize timeline-by-query as primary workflow in guides
   - User preference?
 2. **Backward Compatibility:**
   - Should FTS5 fallback be MORE prominent in docs for older systems?
   - Or keep it as "implementation detail"?
 3. **Progressive Disclosure:**
   - Should timeline context ALWAYS be included?
   - Or only when user explicitly asks for context?
 ---
 ## Success Criteria
 When these fixes are complete:
 1. ✅ Skill operation guides accurately describe ChromaDB-first architecture
 2. ✅ No references to "FTS5 as primary search method"
 3. ✅ Timeline feature integrated into standard workflow
 4. ✅ Human memory analogy present in key documentation
 5. ✅ Consistent messaging across skill guides, docs, and landing page
 6. ✅ Backend code comments explain 3-layer flow clearly
 7. ✅ Users understand: "This is semantic search with temporal context, not just keyword search"
 ---
 ## Notes for Next Claude
 - The user has already clarified the architecture thoroughly
 - Backend code is already correct - focus on documentation/guides
 - Landing page recently updated - validate for consistency
 - Timeline-by-query endpoint already exists - leverage it
 - Key insight: This mimics human memory through temporal context
 - ChromaDB is PRIMARY, not optional. FTS5 is FALLBACK, not primary.
 **Start with:** Reading this document fully, then update skill operation guides first (highest impact).
@@ -148,16 +148,19 @@ When Claude invokes the skill:
 ## Search Architecture
-### Hybrid Search System
+### 3-Layer Hybrid Search System
-claude-mem uses a **hybrid search architecture** combining:
+claude-mem uses a **3-layer sequential search architecture** that mimics human long-term memory:
-1. **SQLite FTS5 (Full-Text Search)** - Keyword-based search
+**Storage Flow (Write Path):**
-2. **ChromaDB (Vector Search)** - Semantic similarity search
+1. **SQLite First** - Data written synchronously to SQLite (fast, immediate access)
 2. **ChromaDB Background Sync** - Worker asynchronously generates embeddings and syncs to ChromaDB
 **Search Flow (Read Path - Sequential, NOT parallel):**
 ```
 ┌─────────────────────────────────────────────────────────────┐
-│                   Search Request Flow                        │
+│                3-Layer Sequential Search Flow                │
 └─────────────────────────────────────────────────────────────┘
                            │
                            ▼
@@ -166,61 +169,70 @@ claude-mem uses a **hybrid search architecture** combining:
              │  /api/search/*          │
              └─────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              ▼                           ▼
 ┌──────────────────────────┐  ┌──────────────────────────┐
 │  SessionSearch (FTS5)    │  │  ChromaSync (Vector DB)  │
 │                          │  │                          │
 │  Full-text keyword       │  │  Semantic similarity     │
 │  search on:              │  │  search on:              │
 │  - titles                │  │  - narratives            │
 │  - narratives            │  │  - facts                 │
 │  - facts                 │  │  - file content          │
 │  - concepts              │  │                          │
 │                          │  │  Embeddings:             │
 │  SQLite DB:              │  │  - text-embedding-3-small│
 │  observations_fts        │  │  - 90-day recency filter │
 │  sessions_fts            │  │                          │
 │  prompts_fts             │  │  ChromaDB:               │
 │                          │  │  observations collection │
 └──────────────────────────┘  └──────────────────────────┘
              │                           │
              └─────────────┬─────────────┘
                            ▼
-              ┌─────────────────────────┐
+┌─────────────────────────────────────────────────────────────┐
-              │  Merged Results         │
+│  LAYER 1: Semantic Retrieval (ChromaDB)                     │
-              │  - Deduplicated         │
+│  ─────────────────────────────────────────────────────────  │
-              │  - Sorted by relevance  │
+│  Vector similarity search finds semantically relevant items  │
-              │  - Formatted (index/full)│
+│  Returns: observation IDs in index format (~50-100 tokens)  │
-              └─────────────────────────┘
+│  Filter: 90-day recency prioritizes recent work             │
 │  Output: List of relevant observation IDs                   │
 └─────────────────────────────────────────────────────────────┘
                            │
                            ▼
 ┌─────────────────────────────────────────────────────────────┐
 │  LAYER 2: Temporal Ordering (SQLite)                        │
 │  ─────────────────────────────────────────────────────────  │
 │  Takes observation IDs from Layer 1                         │
 │  Sorts by created_at timestamp (fast SQLite temporal query) │
 │  Identifies: MOST RECENT relevant observation               │
 │  Why: ChromaDB doesn't easily query by date range sorted    │
 │  Output: Top observation ID by time                         │
 └─────────────────────────────────────────────────────────────┘
                            │
                            ▼
 ┌─────────────────────────────────────────────────────────────┐
 │  LAYER 3: Instant Context Timeline (SQLite)                 │
 │  ─────────────────────────────────────────────────────────  │
 │  Uses top observation ID from Layer 2 as anchor             │
 │  Retrieves N observations BEFORE and AFTER that point       │
 │  Provides: "what led here" + "what happened next" context   │
 │  This is the KILLER FEATURE: mimics human memory            │
 │  Output: Timeline with temporal context                     │
 └─────────────────────────────────────────────────────────────┘
 ```
 **Why This Architecture Exists:**
 The problem: LLMs don't experience time linearly like humans do. Finding semantically relevant information isn't enough—you need temporal context.
 The solution:
 - **ChromaDB** for "what's relevant" (semantic understanding)
 - **SQLite** for "when did it happen" (temporal ordering with fast date-range queries)
 - **Timeline** for "what was the context" (before/after observations)
 Together, they mimic how humans recall: "I did X, which led to Y, then Z happened."
 **Human Memory Analogy:**
 Humans don't just remember isolated facts. They remember sequences: what they did before something, what happened after. The instant context timeline gives LLMs this same temporal awareness that humans experience naturally.
 ### Search Types
-#### 1. Full-Text Search (FTS5)
+#### 1. Vector Search (ChromaDB) - PRIMARY Search Layer
-**How it works:**
+**Role:** Layer 1 - Semantic Retrieval
 - Uses SQLite FTS5 virtual tables for instant keyword matching
 - Supports boolean operators: `AND`, `OR`, `NOT`, `NEAR`, `*` (wildcard)
 - Ranks results by BM25 relevance scoring
 - Sub-100ms performance on 8,000+ observations
 **Example query:**
 ```sql
 -- User asks: "How did we implement JWT authentication?"
 SELECT * FROM observations_fts
 WHERE observations_fts MATCH 'JWT AND authentication'
 ORDER BY rank
 LIMIT 20;
 ```
 #### 2. Vector Search (ChromaDB)
 **How it works:**
 - Text is embedded using OpenAI's `text-embedding-3-small` model
- Vector similarity search finds semantically related content
+- Vector similarity search finds semantically related content, not just keyword matches
 - 90-day recency filter prioritizes recent work
- Combined with keyword search for hybrid results
+- Returns observation IDs for temporal processing in Layer 2
 **Why it's primary:**
 - Understands meaning, not just keywords ("auth flow" matches "JWT implementation")
 - Finds relevant work even when you don't know exact terms used
 - Semantic understanding crucial for LLM memory retrieval
 **Example query:**
 ```python
@@ -230,6 +242,37 @@ collection.query(
    n_results=20,
    where={"created_at": {"$gte": ninety_days_ago}}
 )
 # Returns: observation IDs semantically related to login/auth
 ```
 #### 2. Full-Text Search (FTS5) - Supporting Layer
 **Role:** Layer 2 & 3 - Temporal Ordering and Timeline Context
 **How it works:**
 - Uses SQLite FTS5 virtual tables for instant keyword matching
 - Supports boolean operators: `AND`, `OR`, `NOT`, `NEAR`, `*` (wildcard)
 - Fast temporal queries with date-range sorting
 - Sub-100ms performance on 8,000+ observations
 **Why it's supporting:**
 - ChromaDB handles semantic "what's relevant"
 - SQLite/FTS5 handles temporal "when did it happen" and "what came before/after"
 - Optimized for timeline queries and date-based sorting
 **Example query:**
 ```sql
 -- Takes observation IDs from ChromaDB, sorts by time
 SELECT * FROM observations
 WHERE id IN (/* IDs from ChromaDB */)
 ORDER BY created_at_epoch DESC
 LIMIT 1;
 -- Then retrieves timeline context around that observation
 SELECT * FROM observations
 WHERE created_at_epoch < :anchor_timestamp -- bound to the anchor observation's created_at_epoch
 ORDER BY created_at_epoch DESC
 LIMIT 10; -- "what led here"
 ```
 #### 3. Structured Filters
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(a=>a.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(o=>o.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
@@ -166,7 +166,7 @@ ${e.stack}`:e.message;if(Array.isArray(e))return`[${e.length} items]`;let s=Obje
            INSERT INTO user_prompts_fts(rowid, prompt_text)
            VALUES (new.id, new.prompt_text);
          END;
-        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(11))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(11,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
+        `),this.db.exec("COMMIT"),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(10,new Date().toISOString()),console.error("[SessionStore] Successfully created user_prompts table with FTS5 support")}catch(t){throw this.db.exec("ROLLBACK"),t}}catch(e){console.error("[SessionStore] Migration error (create user_prompts table):",e.message)}}ensureDiscoveryTokensColumn(){try{if(this.db.prepare("SELECT version FROM schema_versions WHERE version = ?").get(7))return;this.db.pragma("table_info(observations)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE observations ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to observations table")),this.db.pragma("table_info(session_summaries)").some(i=>i.name==="discovery_tokens")||(this.db.exec("ALTER TABLE session_summaries ADD COLUMN discovery_tokens INTEGER DEFAULT 0"),console.error("[SessionStore] Added discovery_tokens column to session_summaries table")),this.db.prepare("INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)").run(7,new Date().toISOString())}catch(e){console.error("[SessionStore] Discovery tokens migration error:",e.message)}}getRecentSummaries(e,s=10){return this.db.prepare(`
      SELECT
        request, investigated, learned, completed, next_steps,
        files_read, files_edited, notes, prompt_number, created_at
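
Reviewer note: the minified hunks above are hard to read, so here is a minimal, unminified sketch of the guard logic in `ensureDiscoveryTokensColumn()`. It is an in-memory model, not the real implementation: `SchemaState` and `DISCOVERY_TOKENS_VERSION` are hypothetical names, and the sets stand in for the `schema_versions` table and the `PRAGMA table_info(...)` checks. The point it illustrates is that the version checked in the guard and the version recorded at the end must be the same number (7 after this change).

```typescript
// Hypothetical in-memory stand-in for the SQLite schema state.
interface SchemaState {
  appliedVersions: Set<number>;        // models rows in schema_versions
  columns: Map<string, Set<string>>;   // models PRAGMA table_info(<table>)
}

// Assumption: a named constant; the bundled code inlines the literal 7.
const DISCOVERY_TOKENS_VERSION = 7;

// Returns the tables that were altered, so callers can log or test the outcome.
function ensureDiscoveryTokensColumn(state: SchemaState): string[] {
  // Guard: skip entirely if this migration version is already recorded.
  if (state.appliedVersions.has(DISCOVERY_TOKENS_VERSION)) return [];

  const altered: string[] = [];
  for (const table of ["observations", "session_summaries"]) {
    const cols = state.columns.get(table) ?? new Set<string>();
    // Only add the column when it is missing (models ALTER TABLE ... ADD COLUMN).
    if (!cols.has("discovery_tokens")) {
      cols.add("discovery_tokens");
      state.columns.set(table, cols);
      altered.push(table);
    }
  }

  // Record the same version the guard checks (models INSERT OR IGNORE).
  state.appliedVersions.add(DISCOVERY_TOKENS_VERSION);
  return altered;
}
```

Because the column check is independent of the version check, the function stays safe to re-run even on databases where an earlier build already added the column.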
@@ -169,7 +169,7 @@ For guidelines on presenting search results to users, see [operations/formatting
 - **Port:** Default 37777 (configurable via `CLAUDE_MEM_WORKER_PORT`)
 - **Response format:** Always JSON
- **Search engine:** FTS5 full-text search + structured filters
+- **Search engine:** ChromaDB semantic search (primary ranking) + SQLite FTS5 (fallback) + 90-day recency filter + temporal ordering (hybrid architecture)
 - **All operations:** HTTP GET with query parameters
 - **Worker:** PM2-managed background process
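
Reviewer note: the hybrid flow described in the new "Search engine" bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the worker's actual code: `Hit`, `Searcher`, and `hybridSearch` are hypothetical names, the searchers are synchronous for brevity, results are ordered newest first, and the 90-day filter is applied only on the semantic path.

```typescript
// Hypothetical result shape; not taken from the codebase.
interface Hit {
  id: number;
  createdAtEpochMs: number;
}

// Hypothetical searcher signature (e.g. a ChromaDB query or an FTS5 query).
type Searcher = (query: string) => Hit[];

const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

function hybridSearch(
  query: string,
  semantic: Searcher,  // primary: semantic ranking
  fullText: Searcher,  // fallback: full-text search
  nowMs: number,
  limit = 20,
): Hit[] {
  let hits: Hit[];
  try {
    // Primary path: semantic candidates, restricted to the last 90 days.
    hits = semantic(query).filter(
      (h) => nowMs - h.createdAtEpochMs <= NINETY_DAYS_MS,
    );
  } catch {
    // Fallback path: used when the semantic backend is unavailable.
    hits = fullText(query);
  }
  // Temporal ordering (assumed newest first), then apply the result limit.
  return [...hits]
    .sort((a, b) => b.createdAtEpochMs - a.createdAtEpochMs)
    .slice(0, limit);
}
```

Passing `nowMs` explicitly keeps the recency filter deterministic, which makes the fallback behavior straightforward to test.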
@@ -1,4 +1,4 @@
-# Search Observations (Full-Text)
+# Search Observations (Semantic + Full-Text Hybrid)
 Search all observations using natural language queries.
@@ -17,7 +17,7 @@ curl -s "http://localhost:37777/api/search/observations?query=authentication&for
 ## Parameters
- **query** (required): Search terms (e.g., "authentication", "bug fix", "database migration")
+- **query** (required): Natural language search query - uses semantic search (ChromaDB) for ranking with SQLite FTS5 fallback (e.g., "authentication", "bug fix", "database migration")
 - **format**: "index" (summary) or "full" (complete details). Default: "full"
 - **limit**: Number of results (default: 20, max: 100)
 - **project**: Filter by project name (optional)
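
Reviewer note: for anyone calling the endpoint from code rather than `curl`, the documented parameters map onto a query string as in the sketch below. `buildObservationSearchUrl` is a hypothetical helper, not part of the project; the client-side clamping of `limit` to 1..100 is an assumption that mirrors the documented maximum, and the default port follows the documented 37777.

```typescript
// Hypothetical options type mirroring the documented parameters.
interface ObservationSearchOptions {
  query: string;              // required natural language query
  format?: "index" | "full";
  limit?: number;             // documented max: 100
  project?: string;
  port?: number;              // documented default: 37777
}

function buildObservationSearchUrl(opts: ObservationSearchOptions): string {
  const params = new URLSearchParams({ query: opts.query });
  if (opts.format) params.set("format", opts.format);
  if (opts.limit !== undefined) {
    // Assumption: clamp on the client so requests stay within the documented range.
    const clamped = Math.min(Math.max(1, Math.floor(opts.limit)), 100);
    params.set("limit", String(clamped));
  }
  if (opts.project) params.set("project", opts.project);
  const port = opts.port ?? 37777;
  return `http://localhost:${port}/api/search/observations?${params.toString()}`;
}
```

`URLSearchParams` handles the escaping, so multi-word queries such as "database migration" do not need manual encoding.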
@@ -346,9 +346,9 @@ const filterSchema = z.object({
 const tools = [
  {
    name: 'search_observations',
-    description: 'Search observations using full-text search across titles, narratives, facts, and concepts. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
+    description: 'Search observations using hybrid semantic + full-text search (ChromaDB primary, SQLite FTS5 fallback). IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
    inputSchema: z.object({
-      query: z.string().describe('Search query for FTS5 full-text search'),
+      query: z.string().describe('Natural language search query (semantic ranking via ChromaDB, FTS5 fallback)'),
      format: z.enum(['index', 'full']).default('index').describe('Output format: "index" for titles/dates only (default, RECOMMENDED for initial search), "full" for complete details (use only after reviewing index results)'),
      ...filterSchema.shape
    }),
@@ -434,9 +434,9 @@ const tools = [
  },
  {
    name: 'search_sessions',
-    description: 'Search session summaries using full-text search across requests, completions, learnings, and notes. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
+    description: 'Search session summaries using hybrid semantic + full-text search (ChromaDB primary, SQLite FTS5 fallback). IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
    inputSchema: z.object({
-      query: z.string().describe('Search query for FTS5 full-text search'),
+      query: z.string().describe('Natural language search query (semantic ranking via ChromaDB, FTS5 fallback)'),
      format: z.enum(['index', 'full']).default('index').describe('Output format: "index" for titles/dates only (default, RECOMMENDED for initial search), "full" for complete details (use only after reviewing index results)'),
      project: z.string().optional().describe('Filter by project name'),
      dateRange: z.object({
@@ -1000,9 +1000,9 @@ const tools = [
  },
  {
    name: 'search_user_prompts',
-    description: 'Search raw user prompts with full-text search. Use this to find what the user actually said/requested across all sessions. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
+    description: 'Search raw user prompts using hybrid semantic + full-text search (ChromaDB primary, SQLite FTS5 fallback). Use this to find what the user actually said/requested across all sessions. IMPORTANT: Always use index format first (default) to get an overview with minimal token usage, then use format: "full" only for specific items of interest.',
    inputSchema: z.object({
-      query: z.string().describe('Search query for FTS5 full-text search'),
+      query: z.string().describe('Natural language search query (semantic ranking via ChromaDB, FTS5 fallback)'),
      format: z.enum(['index', 'full']).default('index').describe('Output format: "index" for truncated prompts/dates (default, RECOMMENDED for initial search), "full" for complete prompt text (use only after reviewing index results)'),
      project: z.string().optional().describe('Filter by project name'),
      dateRange: z.object({