Alex Newman c0778bef00 docs: Align search documentation with hybrid ChromaDB architecture (#116)

* feat: Add discovery_tokens for ROI tracking in observations and session summaries

- Introduced `discovery_tokens` column in `observations` and `session_summaries` tables to track token costs associated with discovering and creating each observation and summary.
- Updated relevant services and hooks to calculate and display ROI metrics based on discovery tokens.
- Enhanced context economics reporting to include savings from reusing previous observations.
- Implemented migration to ensure the new column is added to existing tables.
- Adjusted data models and sync processes to accommodate the new `discovery_tokens` field.

* refactor: streamline context hook by removing unused functions and updating terminology

- Removed the estimateTokens and getObservations helper functions as they were not utilized.
- Updated the legend and output messages to replace "discovery" with "work" for clarity.
- Changed the emoji representation for different observation types to better reflect their purpose.
- Enhanced output formatting for improved readability and understanding of token usage.

* Refactor user-message-hook and context-hook for improved clarity and functionality

- Updated user-message-hook.js to enhance error messaging and improve variable naming for clarity.
- Modified context-hook.ts to include a new column key section, improved context index instructions, and added emoji icons for observation types.
- Adjusted footer messages in context-hook.ts to emphasize token savings and access to past research.
- Changed user-message-hook.ts to update the feedback and support message for clarity.

* fix: Critical ROI tracking fixes from PR review

Addresses critical findings from PR #111 review:

1. **Fixed incorrect discovery token calculation** (src/services/worker/SDKAgent.ts)
   - Changed from passing cumulative total to per-response delta
   - Now correctly tracks token cost for each observation/summary
   - Captures token state before/after response processing
   - Prevents all observations getting inflated cumulative values

2. **Fixed schema version mismatch** (src/services/sqlite/SessionStore.ts)
   - Changed ensureDiscoveryTokensColumn() from version 11 to version 7
   - Now matches migration007 definition in migrations.ts
   - Ensures consistent version tracking across migration system

These fixes ensure ROI metrics accurately reflect token costs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Update search documentation to reflect hybrid ChromaDB architecture

The backend correctly implements ChromaDB-first semantic search with SQLite
temporal ordering and FTS5 fallback, but documentation incorrectly described
it as "FTS5 full-text search". This fix aligns all skill guides and tool
descriptions with the actual implementation.

Changes:
- Update SKILL.md to describe hybrid architecture with ChromaDB primary
- Update observations.md title and query parameter descriptions
- Update all three search tool descriptions in search-server.ts:
  * search_observations
  * search_sessions
  * search_user_prompts

All tools now correctly document:
- ChromaDB semantic search (primary ranking)
- 90-day recency filter
- SQLite temporal ordering
- FTS5 fallback (when ChromaDB unavailable)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Add discovery_tokens column to observations and session_summaries tables

---------

Co-authored-by: Claude <noreply@anthropic.com>

2025-11-16 13:36:17 -05:00


Hybrid Search Architecture: Problem-Solution Document

Date: 2025-01-15
Author: Claude Code (Session handoff document)
Purpose: Comprehensive fix guide for hybrid search architecture documentation and implementation

Executive Summary

The claude-mem hybrid search architecture is correctly implemented in code but incorrectly documented in skill guides. Additionally, the workflow is missing the final "instant context timeline" step that completes the human memory analogy.

Quick Status:

✅ Backend code (search-server.ts): ChromaDB first, SQLite temporal sort
❌ Skill operation guides: Describe FTS5 as primary search method
❌ Missing feature: Automatic timeline context retrieval (before/after observations)
✅ Landing page: Recently corrected
⚠️ Documentation: Needs validation and potential refinement

The Intended Architecture (User's Vision)

Storage Flow

User Action
    ↓
1. SQLite Insert (FAST, synchronous)
    - Immediate persistence
    - Available for querying instantly
    ↓
2. ChromaDB Sync (BACKGROUND, asynchronous)
    - Worker generates embeddings
    - Takes time but doesn't block user
    - Uses OpenAI text-embedding-3-small

Why this design:

Users don't wait for embedding generation
SQLite provides immediate access
ChromaDB catches up in background for semantic search
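
The storage flow above can be sketched roughly as follows. This is a minimal in-memory illustration, not the claude-mem implementation: the `WritePath` class, the `Observation` shape, and the injected `Embedder` function are all hypothetical stand-ins for the SQLite store, the ChromaDB collection, and the embedding call.

```typescript
// Hypothetical shapes for illustration only; not the actual claude-mem types.
interface Observation {
  id: number;
  text: string;
  created_at_epoch: number;
}

// Stand-in for whatever produces embeddings (assumed to be asynchronous).
type Embedder = (text: string) => Promise<number[]>;

class WritePath {
  private rows = new Map<number, Observation>(); // stand-in for SQLite
  private vectors = new Map<number, number[]>(); // stand-in for ChromaDB
  private pending: number[] = [];                // IDs awaiting embedding

  // Step 1: synchronous insert. The row is readable as soon as this returns.
  insert(obs: Observation): void {
    this.rows.set(obs.id, obs);
    this.pending.push(obs.id);
  }

  get(id: number): Observation | undefined {
    return this.rows.get(id);
  }

  isIndexed(id: number): boolean {
    return this.vectors.has(id);
  }

  pendingCount(): number {
    return this.pending.length;
  }

  // Step 2: background sync. Embeds queued rows without blocking insert().
  async syncPending(embed: Embedder): Promise<number> {
    let synced = 0;
    while (this.pending.length > 0) {
      const id = this.pending.shift() as number;
      const row = this.rows.get(id);
      if (!row) continue; // row removed before it could be embedded
      this.vectors.set(id, await embed(row.text));
      synced++;
    }
    return synced;
  }
}
```

In this sketch a caller would invoke `insert()` on the hot path and let a worker call `syncPending()` later, so a freshly written observation is readable before it becomes semantically searchable.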

Search Flow (3-Layer Sequential Architecture)

User Query: "How did we implement authentication?"
    ↓
LAYER 1: Semantic Retrieval (ChromaDB)
    - Vector similarity search
    - Returns observation IDs (not full records)
    - Top 100 semantic matches
    - 90-day recency filter applied
    ↓
LAYER 2: Temporal Ordering (SQLite)
    - Takes IDs from Layer 1
    - Hydrates full records from SQLite
    - Sorts by created_at_epoch DESC
    - Returns NEWEST relevant observation
    ↓
LAYER 3: Instant Context Timeline (SQLite) [MISSING IN CURRENT IMPLEMENTATION]
    - Takes top observation ID from Layer 2
    - Retrieves N observations BEFORE that point
    - Retrieves N observations AFTER that point
    - Provides temporal context: "what led here" + "what happened next"
    ↓
Present to User
    - Most relevant observation
    - Timeline showing before/after context
    - Mimics human memory

Why ChromaDB can't do it alone:

ChromaDB doesn't efficiently support date range queries sorted by time
SQLite excels at temporal operations (ORDER BY created_at_epoch)
Need both: ChromaDB for semantic, SQLite for temporal
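
The handoff from Layer 1 to Layer 2 can be illustrated with a small sketch. It assumes the semantic layer returns numeric IDs in similarity order and uses a plain `Map` in place of the SQLite table; `ObservationRow` and `hydrateNewestFirst` are hypothetical names chosen for this example.

```typescript
// Hypothetical row shape; only created_at_epoch is taken from the document.
interface ObservationRow {
  id: number;
  title: string;
  created_at_epoch: number;
}

// Layer 2 sketch: hydrate full rows for the IDs from Layer 1, then order
// them newest first (the equivalent of ORDER BY created_at_epoch DESC).
function hydrateNewestFirst(
  ids: number[],
  table: Map<number, ObservationRow>,
  limit: number
): ObservationRow[] {
  const rows: ObservationRow[] = [];
  for (const id of ids) {
    const row = table.get(id);
    if (row) rows.push(row); // skip IDs with no matching row
  }
  rows.sort((a, b) => b.created_at_epoch - a.created_at_epoch);
  return rows.slice(0, limit);
}
```

Note that the similarity order from Layer 1 is discarded here: semantic search decides which observations qualify, and time decides how they are ranked.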

Why the timeline matters:

LLMs don't experience time linearly like humans do. Humans remember: "I did X, which led to Y, then Z happened." The instant context timeline gives LLMs this temporal awareness that humans experience naturally.

Fallback Behavior

IF ChromaDB unavailable OR no results:
    ↓
FTS5 Keyword Search (SQLite)
    - Full-text search on observations_fts
    - Basic keyword matching
    - Ensures backward compatibility
    - Fallback for older systems

FTS5 is NOT "optional" - it's the fallback mechanism for when ChromaDB isn't available or returns no results.
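
The fallback rule can be sketched as a small function. This is an illustration under assumptions: both backends are modeled as synchronous functions returning IDs (real calls would likely be asynchronous), `null` stands for "ChromaDB not configured", and treating a thrown error as "unavailable" is a choice made for this sketch rather than documented behavior.

```typescript
// Hypothetical search signature: takes a query, returns matching IDs.
type IdSearch = (query: string) => number[];

interface SearchOutcome {
  ids: number[];
  source: "semantic" | "keyword";
}

// Try semantic search first; use keyword search when the semantic backend
// is missing, fails, or returns nothing.
function searchWithFallback(
  query: string,
  semantic: IdSearch | null,
  keyword: IdSearch
): SearchOutcome {
  if (semantic !== null) {
    try {
      const ids = semantic(query);
      if (ids.length > 0) {
        return { ids, source: "semantic" };
      }
    } catch {
      // Assumption: a throwing semantic backend counts as "unavailable".
    }
  }
  return { ids: keyword(query), source: "keyword" };
}
```

Returning the `source` alongside the IDs is optional, but it makes it easy to tell in logs or tests which path served a given query.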

Current State Analysis

✅ What's Correct: Backend Implementation

File: /Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts
Lines: 360-396 (search_observations handler)

The code DOES implement Layers 1 & 2 correctly:

// Step 1: ChromaDB semantic search (top 100)
if (chromaClient) {
  const chromaResults = await queryChroma(query, 100);

  // Step 2: Filter by 90-day recency
  const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
  const recentIds = chromaResults.ids.filter((_id, idx) => {
    const meta = chromaResults.metadatas[idx];
    return meta && meta.created_at_epoch > ninetyDaysAgo;
  });

  // Step 3: Hydrate from SQLite with temporal ordering
  results = store.getObservationsByIds(recentIds, {
    orderBy: 'date_desc',
    limit
  });
}

// Fallback to FTS5 if ChromaDB is unavailable or returned no results
if (results.length === 0) {
  results = search.searchObservations(query, options); // FTS5
}

What this gets right:

ChromaDB semantic search FIRST (not FTS5)
90-day recency filter
SQLite temporal ordering (orderBy: 'date_desc')
FTS5 fallback for reliability
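
The 90-day recency filter is easier to test when the current time is passed in rather than read from `Date.now()`. The sketch below is one way to isolate it; `filterRecentIds` and `ChromaMeta` are hypothetical names, and it assumes `created_at_epoch` is stored in milliseconds, as the comparison against `Date.now()` in the excerpt above suggests.

```typescript
// Hypothetical metadata shape; created_at_epoch is assumed to be in ms.
interface ChromaMeta {
  created_at_epoch?: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Keep only IDs whose metadata timestamp falls inside the recency window.
// ids[i] and metadatas[i] are assumed to describe the same result.
function filterRecentIds(
  ids: number[],
  metadatas: (ChromaMeta | null)[],
  nowMs: number,
  windowDays: number = 90
): number[] {
  const cutoff = nowMs - windowDays * MS_PER_DAY;
  return ids.filter((_id, idx) => {
    const meta = metadatas[idx];
    return (
      meta != null &&
      typeof meta.created_at_epoch === "number" &&
      meta.created_at_epoch > cutoff
    );
  });
}
```

Results with missing metadata are dropped rather than kept, which matches the `meta && ...` guard in the handler excerpt.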

❌ What's Wrong: Skill Operation Guides

File: /Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md

Current Title: "Search Observations (Full-Text)"
Current Description: "Search all observations using natural language queries."
Related parameter description (search-server.ts, line 351): query: z.string().describe('Search query for FTS5 full-text search')

The Problem:

Describes FTS5 as the search method
No mention of ChromaDB semantic search
Misleading title "Full-Text" implies keyword-only
Examples don't show the ChromaDB → SQLite flow

Impact:

Claude thinks it's doing FTS5 keyword search
Doesn't understand it's semantic vector search
Can't explain the architecture to users correctly

⚠️ What's Missing: Layer 3 (Instant Context Timeline)

The current implementation stops at Layer 2 (temporal ordering). It doesn't automatically:

Identify the MOST relevant observation (it returns a sorted list)
Retrieve observations BEFORE that point in time
Retrieve observations AFTER that point in time
Present the timeline context to the user

Why this matters: The timeline is the killer feature that mimics human memory. Without it, users get:

❌ A sorted list of relevant observations
❌ No context about what led there
❌ No context about what happened next

With timeline, users get:

✅ The MOST relevant observation
✅ Context: "You did A and B before this"
✅ Context: "After this, you did C and D"
✅ Complete narrative like human memory

📋 Documentation Status

Recently Fixed (✅):

/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md
- Now describes 3-layer sequential flow
- Includes human memory analogy
- Positions ChromaDB as primary

Landing Page (✅):

/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx
/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx
/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx
- All updated to describe ChromaDB-first architecture
- "Remember Like a Human" messaging added
- Timeline feature highlighted

Needs Review:

SKILL.md technical notes (line 172)
All operation guides in /operations/ directory
Common workflows documentation

Required Fixes

Fix 1: Update Skill Operation Guides

Files to modify:

/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md
/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md

Changes needed:

observations.md:
- Change title: "Search Observations (Full-Text)" → "Search Observations (Semantic + Temporal)"
- Update description: Explain ChromaDB semantic search as primary
- Update command examples to explain hybrid flow
- Add note: "Uses ChromaDB vector search with SQLite temporal ordering. FTS5 used as fallback."
common-workflows.md:
- Update "Workflow 2: Finding Specific Bug Fixes" to explain ChromaDB → SQLite flow
- Add new workflow: "Workflow N: Getting Timeline Context Around Relevant Observations"

Example of corrected observations.md header:

# Search Observations (Semantic + Temporal)

Search observations using ChromaDB vector similarity with SQLite temporal ordering.

## Architecture

**3-Layer Hybrid Search:**
1. **ChromaDB semantic retrieval** - Finds what's semantically relevant (vector similarity)
2. **90-day recency filter** - Prioritizes recent work
3. **SQLite temporal ordering** - Sorts by time, returns newest relevant

**Fallback:** If ChromaDB unavailable, falls back to FTS5 keyword search.

## When to Use

- User asks: "How did we implement authentication?"
- User asks: "What bugs did we fix?"
- Looking for past work by meaning/topic (not just keywords)

Fix 2: Implement Layer 3 (Instant Context Timeline)

Option A: Add to existing search_observations handler

Modify /Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts line ~396:

// After getting sorted results, attach timeline context when requested
if (results.length > 0 && options.includeTimeline) {
  const topObservation = results[0];
  // Use ?? so an explicit depth of 0 is respected (|| would replace it with 5)
  const depthBefore = options.timelineDepthBefore ?? 5;
  const depthAfter = options.timelineDepthAfter ?? 5;

  // Get observations before and after the top result
  const timeline = store.getTimelineContext(
    topObservation.id,
    depthBefore,
    depthAfter
  );

  return {
    topResult: topObservation,
    timeline,
    format
  };
}
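
To make the intent of the `store.getTimelineContext(...)` call in Option A concrete, here is a minimal in-memory sketch of what such a helper might return. It is an assumption about the behavior, not the store's actual method: a real version would query SQLite, and the `TimelineRow` and `TimelineContext` shapes are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface TimelineRow {
  id: number;
  created_at_epoch: number;
}

interface TimelineContext {
  before: TimelineRow[]; // oldest to newest, ending just before the anchor
  anchor: TimelineRow;
  after: TimelineRow[];  // oldest to newest, starting just after the anchor
}

// Return up to depthBefore rows preceding the anchor and up to depthAfter
// rows following it, in chronological order. Returns null if the anchor
// ID is not present.
function getTimelineContext(
  rows: TimelineRow[],
  anchorId: number,
  depthBefore: number,
  depthAfter: number
): TimelineContext | null {
  const ordered = rows.slice().sort((a, b) => a.created_at_epoch - b.created_at_epoch);
  const idx = ordered.findIndex((r) => r.id === anchorId);
  if (idx === -1) return null;
  return {
    before: ordered.slice(Math.max(0, idx - depthBefore), idx),
    anchor: ordered[idx],
    after: ordered.slice(idx + 1, idx + 1 + depthAfter),
  };
}
```

The `before` and `after` arrays map directly onto the "what led here" and "what happened next" halves of the timeline described in Layer 3.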

Option B: Use existing timeline-by-query operation

The /api/timeline/by-query endpoint already implements search plus timeline. We could:

Make it the DEFAULT recommended operation in skill guides
Update operation guides to emphasize this as primary workflow
Position observations search as "timeline-less" alternative

Recommendation: Option B is faster to deliver. It leverages the existing timeline-by-query endpoint and only requires updating the skill guides to make it the primary workflow.

Fix 3: Update SKILL.md Technical Notes

File: /Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md
Line 172:

Current:

- **Search engine:** FTS5 full-text search + structured filters

Change to:

- **Search engine:** ChromaDB vector search (primary) + SQLite temporal ordering + instant context timeline (3-layer sequential architecture)

Fix 4: Update search_observations Description

File: /Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts
Line 349:

Current:

description: 'Search observations using full-text search across titles, narratives...'

Change to:

description: 'Search observations using hybrid semantic search (ChromaDB vector similarity + SQLite temporal ordering). Falls back to FTS5 keyword search if ChromaDB unavailable. IMPORTANT: Always use index format first...'

Line 351:

Current:

query: z.string().describe('Search query for FTS5 full-text search'),

Change to:

query: z.string().describe('Search query (semantic vector search via ChromaDB, falls back to FTS5 if unavailable)'),

Implementation Checklist

Use this checklist when executing fixes:

Phase 1: Core Documentation

Update observations.md title and description
Update observations.md architecture explanation
Update observations.md examples to mention ChromaDB
Update common-workflows.md to explain hybrid flow
Update SKILL.md line 172 technical notes
Verify all operation guides mention ChromaDB correctly

Phase 2: Backend Updates

Update search-server.ts search_observations description (line 349)
Update search-server.ts query parameter description (line 351)
Add code comments explaining 3-layer flow
Consider adding includeTimeline option to search_observations

Phase 3: Timeline Integration

Review timeline-by-query operation
Update skill guides to recommend timeline-by-query as primary workflow
Add example: "When you need context, use timeline-by-query instead of observations search"
Update quick reference table in SKILL.md to highlight timeline-by-query

Phase 4: Validation

Test search behavior with ChromaDB enabled
Test fallback behavior with ChromaDB disabled
Verify skill guides accurately describe behavior
Ensure landing page messaging aligns with skill guides
Check that human memory analogy is consistent everywhere

Key Messaging (Use Consistently)

Value Proposition

"3-layer hybrid search mimics human memory: ChromaDB semantic retrieval finds what's relevant → SQLite temporal ordering identifies when → instant context timeline shows what led there and what came next."

Technical Architecture

"ChromaDB vector search handles semantic understanding (what's relevant), SQLite handles temporal queries (when it happened, what's newest), and timeline context provides before/after observations (what led there, what happened next)."

Why It Matters

"LLMs don't experience time linearly like humans do. Claude-mem gives them temporal context: not just 'you implemented authentication,' but 'you researched OAuth libraries, then implemented JWT auth, then fixed a token expiration bug.' Complete narrative, like human memory."

ChromaDB Role

"ChromaDB is the PRIMARY search mechanism for semantic understanding. FTS5 is the FALLBACK for backward compatibility and reliability when ChromaDB is unavailable."

Files Reference

Skill Guides (Primary Fixes):

/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/SKILL.md
/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/observations.md
/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/timeline-by-query.md
/Users/alexnewman/Scripts/claude-mem/plugin/skills/mem-search/operations/common-workflows.md

Backend Code (Minor Updates):

/Users/alexnewman/Scripts/claude-mem/src/servers/search-server.ts

Documentation (Validation):

/Users/alexnewman/Scripts/claude-mem/docs/context/mem-search-technical-architecture.md

Landing Page (Already Fixed):

/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Features.tsx
/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/QuickBenefits.tsx
/Users/alexnewman/Scripts/claude-mem-pro/src/components/landing/Architecture.tsx

Questions for User (If Needed)

1. Timeline Integration Approach:
   - Option A: Modify search_observations to add includeTimeline parameter
   - Option B: Emphasize timeline-by-query as primary workflow in guides
   - User preference?
2. Backward Compatibility:
   - Should FTS5 fallback be MORE prominent in docs for older systems?
   - Or keep it as "implementation detail"?
3. Progressive Disclosure:
   - Should timeline context ALWAYS be included?
   - Or only when user explicitly asks for context?

Success Criteria

When these fixes are complete:

✅ Skill operation guides accurately describe ChromaDB-first architecture
✅ No references to "FTS5 as primary search method"
✅ Timeline feature integrated into standard workflow
✅ Human memory analogy present in key documentation
✅ Consistent messaging across skill guides, docs, and landing page
✅ Backend code comments explain 3-layer flow clearly
✅ Users understand: "This is semantic search with temporal context, not just keyword search"

Notes for Next Claude

The user has already clarified the architecture thoroughly
Backend code is already correct - focus on documentation/guides
Landing page recently updated - validate for consistency
Timeline-by-query endpoint already exists - leverage it
Key insight: This mimics human memory through temporal context
ChromaDB is PRIMARY, not optional. FTS5 is FALLBACK, not primary.

Start by reading this document fully, then update the skill operation guides first (highest impact).
