Files
claude-mem/docs/context/CHROMA.md
T
basher83 97d565e3cd Replace search skill with mem-search (#91)
* feat: add mem-search skill with progressive disclosure architecture

Add comprehensive mem-search skill for accessing claude-mem's persistent
cross-session memory database. Implements progressive disclosure workflow
and token-efficient search patterns.

Features:
- 12 search operations (observations, sessions, prompts, by-type, by-concept, by-file, timelines, etc.)
- Progressive disclosure principles to minimize token usage
- Anti-patterns documentation to guide LLM behavior
- HTTP API integration for all search functionality
- Common workflows with composition examples

Structure:
- SKILL.md: Entry point with temporal trigger patterns
- principles/: Progressive disclosure + anti-patterns
- operations/: 12 search operation files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add CHANGELOG entry for mem-search skill

Document mem-search skill addition in Unreleased section with:
- 100% effectiveness compliance metrics
- Comparison to previous search skill implementation
- Progressive disclosure architecture details
- Reference to audit report documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add mem-search skill audit report

Add comprehensive audit report validating mem-search skill against
Anthropic's official skill-creator documentation.

Report includes:
- Effectiveness metrics comparison (search vs mem-search)
- Critical issues analysis for production readiness
- Compliance validation across 6 key dimensions
- Reference implementation guidance

Result: mem-search achieves 100% compliance vs search's 67%

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add comprehensive search architecture analysis document

- Document current state of dual search architectures (HTTP API and MCP)
- Analyze HTTP endpoints and MCP search server architectures
- Identify DRY violations across search implementations
- Evaluate the use of curl as the optimal approach for search
- Provide architectural recommendations for immediate and long-term improvements
- Outline action plan for cleanup, feature parity, DRY refactoring

* refactor: Remove deprecated search skill documentation and operations

* refactor: Reorganize documentation into public and context directories

Changes:
- Created docs/public/ for Mintlify documentation (.mdx files)
- Created docs/context/ for internal planning and implementation docs
- Moved all .mdx files and assets to docs/public/
- Moved all internal .md files to docs/context/
- Added CLAUDE.md to both directories explaining their purpose
- Updated docs.json paths to work with new structure

Benefits:
- Clear separation between user-facing and internal documentation
- Easier to maintain Mintlify docs in dedicated directory
- Internal context files organized separately

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Enhance session management and continuity in hooks

- Updated new-hook.ts to clarify session_id threading and idempotent session creation.
- Modified prompts.ts to require claudeSessionId for continuation prompts, ensuring session context is maintained.
- Improved SessionStore.ts documentation on createSDKSession to emphasize idempotent behavior and session connection.
- Refined SDKAgent.ts to detail continuation prompt logic and its reliance on session.claudeSessionId for unified session handling.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Alex Newman <thedotmack@gmail.com>
2025-11-11 16:15:07 -05:00

543 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Chroma Vector Database - Hybrid Semantic Search
## Overview
Claude-Mem v5.0.0 introduced **Chroma**, a vector database that enables semantic search across your memory stream. Combined with SQLite's FTS5 keyword search, this creates a powerful **hybrid search architecture** that finds contextually relevant observations using both meaning and keywords.
**Key Benefits:**
- 🧠 **Semantic Search** - Find observations by meaning, not just keywords
- 🔍 **Hybrid Architecture** - Combines semantic similarity with keyword matching
- ⏱️ **Recency Filtering** - Focus on recent 90 days for relevant context
-**Fast Performance** - Semantic search under 200ms with 8,000+ documents
- 🔄 **Auto-Sync** - ChromaSync service keeps vectors updated automatically
## What is Chroma?
[ChromaDB](https://www.trychroma.com/) is an open-source vector database designed for AI applications. It stores text as **vector embeddings** - mathematical representations that capture semantic meaning.
**Example:**
```
Query: "authentication bug"
Keyword Match: Must contain both "authentication" AND "bug"
Semantic Match: Also finds "login error", "auth failure", "sign-in issue"
```
Semantic search understands that "authentication bug" is conceptually similar to "login error" even though they share no keywords.
## Architecture
### Hybrid Search Flow
```
┌──────────────────────────────────────────────────────────────┐
│ User Query: "How does authentication work?" │
└──────────────────────────────────────────────────────────────┘
┌─────────────────┴─────────────────┐
↓ ↓
┌──────────────────────┐ ┌──────────────────────┐
│ Chroma Semantic │ │ SQLite FTS5 │
│ Vector Similarity │ │ Keyword Search │
│ │ │ │
│ Finds conceptually │ │ Finds exact/fuzzy │
│ similar observations │ │ keyword matches │
└──────────────────────┘ └──────────────────────┘
↓ ↓
└─────────────────┬─────────────────┘
┌─────────────────────────────────┐
│ Merge Results │
│ - Deduplicate by ID │
│ - Sort by relevance + recency │
│ - Filter by 90-day window │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ Return Top Matches │
│ Semantic + Keyword combined │
└─────────────────────────────────┘
```
### ChromaSync Service
The **ChromaSync** service (`src/services/sync/ChromaSync.ts`) automatically synchronizes observations to Chroma:
**When Observations Are Synced:**
1. **Session Summary** - After each session completes, all new observations synced
2. **Worker Startup** - On initialization, checks for unsynced observations
3. **Manual Trigger** - Can force sync via internal API (development only)
**What Gets Embedded:**
- Observation ID (unique identifier)
- Title (compressed learning statement)
- Narrative (detailed explanation)
- Project path (for project-specific filtering)
- Timestamp (for recency filtering)
- Concepts (semantic tags)
- File references (associated code files)
**Embedding Model:**
- Currently using Chroma's default embedding function
- Future: Configurable embedding models (e.g., OpenAI, sentence-transformers)
### Data Structure
**SQLite (Source of Truth):**
```sql
CREATE TABLE observations (
id INTEGER PRIMARY KEY,
title TEXT,
narrative TEXT,
facts TEXT,
concepts TEXT,
files TEXT,
type TEXT,
projectPath TEXT,
createdAt INTEGER
);
```
**Chroma (Vector Embeddings):**
```typescript
{
ids: ["obs_12345"],
embeddings: [[0.123, -0.456, ...]], // 384-dimensional vector
documents: ["Title: Authentication flow\nNarrative: Implemented..."],
metadatas: [{
type: "feature",
project: "claude-mem",
timestamp: 1698765432000,
concepts: "pattern,architecture"
}]
}
```
## How Semantic Search Works
### Vector Embeddings
Text converted to high-dimensional vectors that capture meaning:
```
"user authentication" → [0.12, -0.34, 0.56, ..., 0.78]
"login system" → [0.15, -0.32, 0.54, ..., 0.81]
"database schema" → [-0.45, 0.67, -0.23, ..., 0.12]
```
Notice: "user authentication" and "login system" have similar vectors (close in vector space), while "database schema" is distant.
### Similarity Search
Chroma uses **cosine similarity** to find nearest neighbors:
```typescript
// Query embedding
query: "authentication bug"
query_vector: [0.14, -0.33, 0.55, ..., 0.79]
// Find observations with similar vectors
results = chroma.query(
query_vector,
n_results: 10,
where: { timestamp: { $gte: now - 90_days } }
)
```
**Result Ranking:**
- Higher cosine similarity = more semantically similar
- Filtered by 90-day recency window
- Combined with keyword matches from FTS5
## 90-Day Recency Filtering
Why 90 days?
**Rationale:**
- Recent context more likely relevant to current work
- Prevents very old observations from diluting results
- Balances completeness with relevance
- Reduces vector search space for faster queries
**Implementation:**
```typescript
const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
// Chroma metadata filter
where: {
timestamp: { $gte: ninetyDaysAgo }
}
// SQLite WHERE clause
WHERE createdAt >= ?
```
**Configurable?**
- Not currently user-configurable
- Hard-coded in `src/servers/search-server.ts`
- Future: Add `CLAUDE_MEM_RECENCY_DAYS` environment variable
## MCP Tool Integration
All 9 MCP search tools benefit from hybrid search:
### search_observations (Hybrid)
```typescript
// Keyword-only (v4.x)
search_observations(query: "authentication")
// Returns: Observations containing "authentication"
// Hybrid semantic + keyword (v5.x)
search_observations(query: "authentication")
// Returns: Observations with "authentication" PLUS semantically similar:
// - "login system"
// - "user credentials"
// - "session management"
```
### get_timeline_by_query (Semantic-First)
```typescript
// Uses Chroma to find best match, then builds timeline
get_timeline_by_query(
query: "when did we implement the viewer UI?",
mode: "auto",
depth_before: 10,
depth_after: 10
)
// Chroma finds: Observation #4057 "Web-Based Viewer UI for Real-Time Memory Stream"
// Returns: Timeline with 10 observations before + anchor + 10 after
```
### Benefits Across All Tools
- **find_by_concept**: Semantic similarity finds related concepts
- **find_by_file**: Finds semantically similar code changes
- **find_by_type**: Better relevance ranking within type
- **get_recent_context**: Prioritizes semantically relevant recent context
## Performance
### Benchmarks (8,279 vector documents)
| Operation | Time | Notes |
|-----------|------|-------|
| **Semantic Query** | 150-200ms | 90-day window, top 10 results |
| **Keyword Query (FTS5)** | 5-10ms | Full-text search |
| **Hybrid Query** | 160-220ms | Combined semantic + keyword |
| **Initial Sync** | 2-5 min | First-time embedding of all observations |
| **Incremental Sync** | 100-500ms | 1-10 new observations per session |
### Memory Usage
- **Chroma DB Size**: ~50MB for 8,000 observations
- **Embeddings**: 384 dimensions × 4 bytes = 1.5KB per observation
- **Metadata**: ~500 bytes per observation (project, type, timestamp)
- **Total**: ~2KB per observation in Chroma
### Optimization Tips
1. **Reduce vector dimensions**: Use smaller embedding models (future)
2. **Adjust recency window**: Narrow to 30/60 days for faster queries
3. **Limit result count**: Request fewer results (n_results=5 vs 10)
4. **Project filtering**: Add project filter to metadata query
## Installation & Dependencies
### Python Requirement
Chroma requires Python 3.7+ installed:
**Check Python:**
```bash
python3 --version
# Should show: Python 3.7.x or higher
```
**Install Python (if needed):**
- **macOS**: `brew install python3`
- **Windows**: Download from [python.org](https://www.python.org/downloads/)
- **Linux**: `apt-get install python3` or `yum install python3`
### ChromaDB Installation
Chroma installed automatically as npm dependency:
```bash
npm install
# Installs: chromadb (Python package via node-gyp bindings)
```
**Manual Installation (if auto-install fails):**
```bash
pip3 install chromadb
```
### Troubleshooting Installation
**Error: "Python not found"**
```bash
# Set Python path explicitly
export PYTHON=/usr/local/bin/python3
npm install
```
**Error: "chromadb module not found"**
```bash
# Reinstall chromadb
pip3 install --upgrade chromadb
# Verify installation
python3 -c "import chromadb; print(chromadb.__version__)"
```
**Error: "node-gyp build failed"**
```bash
# Install build tools
# macOS: xcode-select --install
# Windows: npm install --global windows-build-tools
# Linux: apt-get install build-essential
```
## Configuration
### Environment Variables
Currently no user-configurable settings. Future options:
```json
// Proposed for future versions
{
"env": {
"CLAUDE_MEM_CHROMA_ENABLED": "true", // Enable/disable Chroma
"CLAUDE_MEM_CHROMA_PATH": "~/.claude-mem/chroma", // DB location
"CLAUDE_MEM_EMBEDDING_MODEL": "default", // Embedding model choice
"CLAUDE_MEM_RECENCY_DAYS": "90", // Recency window
"CLAUDE_MEM_VECTOR_DIM": "384" // Embedding dimensions
}
}
```
### Disabling Chroma (Future)
To disable semantic search and use keyword-only:
```json
{
"env": {
"CLAUDE_MEM_CHROMA_ENABLED": "false"
}
}
```
Falls back to SQLite FTS5 keyword search only.
## Database Maintenance
### Location
```
~/.claude-mem/chroma/
├── chroma.sqlite3 # Chroma metadata database
└── index/ # Vector index files
└── *.bin # Binary vector data
```
### Backup
```bash
# Backup entire Chroma directory
cp -r ~/.claude-mem/chroma ~/.claude-mem/chroma.backup
# Restore from backup
rm -rf ~/.claude-mem/chroma
cp -r ~/.claude-mem/chroma.backup ~/.claude-mem/chroma
```
### Reset Chroma (Force Resync)
```bash
# Delete Chroma database
rm -rf ~/.claude-mem/chroma
# Restart worker to trigger full resync
npm run worker:restart
# Check logs for sync progress
npm run worker:logs
```
**Note**: Resync can take 2-5 minutes for thousands of observations.
### Disk Space Management
**Chroma grows with observations:**
- 1,000 observations ≈ 5MB
- 10,000 observations ≈ 50MB
- 100,000 observations ≈ 500MB
**Cleanup old observations:**
```sql
-- Delete observations older than 1 year
-- This will trigger Chroma resync on next startup
sqlite3 ~/.claude-mem/claude-mem.db \
"DELETE FROM observations WHERE createdAt < strftime('%s', 'now', '-1 year') * 1000;"
```
## Advanced Usage
### Direct Chroma Queries (Development)
For debugging or custom queries:
```typescript
import { ChromaSync } from './services/sync/ChromaSync';
const sync = new ChromaSync();
await sync.initialize();
// Query Chroma directly
const results = await sync.query({
queryTexts: ["authentication implementation"],
nResults: 10,
where: {
type: "feature",
timestamp: { $gte: Date.now() - 90_days }
}
});
console.log(results.ids, results.distances, results.documents);
```
### Custom Embedding Models (Future)
Chroma supports multiple embedding models:
```typescript
// Future configuration
const sync = new ChromaSync({
embeddingModel: "sentence-transformers/all-MiniLM-L6-v2", // Smaller, faster
// or: "text-embedding-ada-002" (OpenAI, requires API key)
// or: "all-mpnet-base-v2" (Higher quality, slower)
});
```
### Metadata Filtering
Chroma supports advanced metadata queries:
```typescript
// Find observations by type and project
results = await sync.query({
queryTexts: ["API design"],
where: {
$and: [
{ type: { $in: ["decision", "feature"] } },
{ project: "claude-mem" }
]
}
});
// Find recent observations
results = await sync.query({
queryTexts: ["database schema"],
where: {
timestamp: { $gte: Date.now() - 30_days }
}
});
```
## Comparison: Semantic vs Keyword Search
| Aspect | Semantic (Chroma) | Keyword (FTS5) |
|--------|-------------------|----------------|
| **Speed** | 150-200ms | 5-10ms |
| **Accuracy** | High (meaning-based) | Medium (exact match) |
| **Storage** | ~2KB per observation | ~500 bytes per observation |
| **Conceptual Matching** | ✅ Yes | ❌ No |
| **Exact Match** | ❌ Not guaranteed | ✅ Always |
| **Typo Tolerance** | ✅ High | ⚠️ Limited (fuzzy) |
| **Dependencies** | Python + chromadb | None (SQLite built-in) |
| **Recency Bias** | ✅ Built-in (90 days) | Manual filtering |
**Best Practice:** Use hybrid search (both) for optimal results.
## Troubleshooting
### "Chroma not found" Error
**Symptom:** Worker logs show "Chroma not available, using keyword-only search"
**Solution:**
```bash
# Check Python installation
python3 --version
# Reinstall chromadb
pip3 install chromadb
# Restart worker
npm run worker:restart
```
### Slow Query Performance
**Symptom:** Searches taking >1 second
**Solutions:**
1. Reduce recency window (edit `src/servers/search-server.ts`)
2. Limit result count (`nResults: 5` instead of 10)
3. Add project filter to narrow search space
4. Check Chroma index size (may need rebuild)
### Out of Memory Errors
**Symptom:** Worker crashes with "JavaScript heap out of memory"
**Solution:**
```bash
# Increase Node.js heap size
export NODE_OPTIONS="--max-old-space-size=4096"
# Restart worker
npm run worker:restart
```
### Sync Taking Too Long
**Symptom:** Initial Chroma sync takes >10 minutes
**Possible Causes:**
- Large number of observations (>10,000)
- Slow embedding model
- Limited CPU resources
**Solutions:**
1. Let it complete (one-time cost)
2. Delete very old observations to reduce count
3. Close resource-intensive apps during sync
## Future Enhancements
Potential improvements for future versions:
- **Configurable Recency**: User-defined recency window (30/60/90/365 days)
- **Custom Embeddings**: Choose embedding model (quality vs speed trade-off)
- **Incremental Updates**: Update existing vectors instead of full resync
- **Semantic Filters**: Search by semantic concept ("all architectural decisions")
- **Multi-Language Support**: Embeddings optimized for non-English code/docs
- **Clustering**: Auto-cluster related observations for discovery
- **Visualization**: 2D/3D visualization of vector space (similar observations near each other)
## Resources
- **ChromaDB Documentation**: https://docs.trychroma.com/
- **Source Code**: `src/services/sync/ChromaSync.ts`
- **Search Server**: `src/servers/search-server.ts`
- **Python Package**: https://pypi.org/project/chromadb/
---
**Powered by ChromaDB** | **Hybrid Semantic + Keyword Search** | **90-Day Recency Window**