Chroma Vector Database - Hybrid Semantic Search
Overview
Claude-Mem v5.0.0 introduced Chroma, a vector database that enables semantic search across your memory stream. Combined with SQLite's FTS5 keyword search, this creates a powerful hybrid search architecture that finds contextually relevant observations using both meaning and keywords.
Key Benefits:
- 🧠 Semantic Search - Find observations by meaning, not just keywords
- 🔍 Hybrid Architecture - Combines semantic similarity with keyword matching
- ⏱️ Recency Filtering - Focus on recent 90 days for relevant context
- ⚡ Fast Performance - Semantic search under 200ms with 8,000+ documents
- 🔄 Auto-Sync - ChromaSync service keeps vectors updated automatically
What is Chroma?
ChromaDB is an open-source vector database designed for AI applications. It stores text as vector embeddings - mathematical representations that capture semantic meaning.
Example:
Query: "authentication bug"
Keyword Match: Must contain both "authentication" AND "bug"
Semantic Match: Also finds "login error", "auth failure", "sign-in issue"
Semantic search understands that "authentication bug" is conceptually similar to "login error" even though they share no keywords.
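To make the keyword side of this comparison concrete, here is a minimal sketch of strict AND matching over tokens. The `tokenize` and `keywordMatch` helpers are hypothetical and much simpler than a real full-text tokenizer; they only illustrate why "login error" cannot match the query without semantic search.

```typescript
// Split text into lowercase alphanumeric tokens (a simplification of real FTS tokenizers).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 0);
}

// True only if every query term appears as a token in the text (AND semantics).
function keywordMatch(text: string, query: string): boolean {
  const tokens = new Set(tokenize(text));
  return tokenize(query).every((term) => tokens.has(term));
}

console.log(keywordMatch("Fixed authentication bug in token refresh", "authentication bug")); // true
console.log(keywordMatch("Login error after password reset", "authentication bug")); // false
```

The second call returns false because the text shares no tokens with the query, which is the gap that vector similarity is meant to close.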
Architecture
Hybrid Search Flow
┌──────────────────────────────────────────────────────────────┐
│ User Query: "How does authentication work?" │
└──────────────────────────────────────────────────────────────┘
↓
┌─────────────────┴─────────────────┐
↓ ↓
┌──────────────────────┐ ┌──────────────────────┐
│ Chroma Semantic │ │ SQLite FTS5 │
│ Vector Similarity │ │ Keyword Search │
│ │ │ │
│ Finds conceptually │ │ Finds exact/fuzzy │
│ similar observations │ │ keyword matches │
└──────────────────────┘ └──────────────────────┘
↓ ↓
└─────────────────┬─────────────────┘
↓
┌─────────────────────────────────┐
│ Merge Results │
│ - Deduplicate by ID │
│ - Sort by relevance + recency │
│ - Filter by 90-day window │
└─────────────────────────────────┘
↓
┌─────────────────────────────────┐
│ Return Top Matches │
│ Semantic + Keyword combined │
└─────────────────────────────────┘
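The merge step in the diagram can be sketched as follows. This is an illustration under assumptions, not the code in search-server.ts: the `SearchHit` shape, the `mergeHybridResults` name, and the choice to treat recency as a tie-breaker after relevance are all hypothetical.

```typescript
// Hypothetical result shape; field names are assumptions for illustration.
interface SearchHit {
  id: number;        // observation ID
  score: number;     // relevance, higher is better
  createdAt: number; // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

function mergeHybridResults(
  semantic: SearchHit[],
  keyword: SearchHit[],
  now: number,
  windowDays = 90,
  limit = 10,
): SearchHit[] {
  const cutoff = now - windowDays * DAY_MS;
  const byId = new Map<number, SearchHit>();

  // Deduplicate by ID, keeping the higher-scoring copy of each observation.
  for (const hit of semantic.concat(keyword)) {
    const existing = byId.get(hit.id);
    if (existing === undefined || hit.score > existing.score) {
      byId.set(hit.id, hit);
    }
  }

  return Array.from(byId.values())
    // Drop anything older than the recency window.
    .filter((hit) => hit.createdAt >= cutoff)
    // Sort by relevance, using recency to break ties (an assumed policy).
    .sort((a, b) => b.score - a.score || b.createdAt - a.createdAt)
    .slice(0, limit);
}
```

Passing `now` in explicitly keeps the function deterministic, which makes the 90-day cutoff easy to test.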
ChromaSync Service
The ChromaSync service (src/services/sync/ChromaSync.ts) automatically synchronizes observations to Chroma:
When Observations Are Synced:
- Session Summary - After each session completes, all new observations are synced
- Worker Startup - On initialization, checks for unsynced observations
- Manual Trigger - Can force sync via internal API (development only)
What Gets Embedded:
- Observation ID (unique identifier)
- Title (compressed learning statement)
- Narrative (detailed explanation)
- Project path (for project-specific filtering)
- Timestamp (for recency filtering)
- Concepts (semantic tags)
- File references (associated code files)
Embedding Model:
- Currently using Chroma's default embedding function
- Future: Configurable embedding models (e.g., OpenAI, sentence-transformers)
Data Structure
SQLite (Source of Truth):
CREATE TABLE observations (
id INTEGER PRIMARY KEY,
title TEXT,
narrative TEXT,
facts TEXT,
concepts TEXT,
files TEXT,
type TEXT,
projectPath TEXT,
createdAt INTEGER
);
Chroma (Vector Embeddings):
{
ids: ["obs_12345"],
embeddings: [[0.123, -0.456, ...]], // 384-dimensional vector
documents: ["Title: Authentication flow\nNarrative: Implemented..."],
metadatas: [{
type: "feature",
project: "claude-mem",
timestamp: 1698765432000,
concepts: "pattern,architecture"
}]
}
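The mapping between the two structures above might look like the sketch below. `toChromaRecord` is a hypothetical helper, and the assumptions are loud ones: that `concepts` is stored as a comma-separated string, and that the `project` metadata is taken directly from `projectPath`. The embedding itself is left out because it is produced by Chroma's embedding function.

```typescript
// Row shape mirrors a subset of the SQLite columns listed above.
interface ObservationRow {
  id: number;
  title: string;
  narrative: string;
  concepts: string;    // assumed to be a comma-separated string
  type: string;
  projectPath: string;
  createdAt: number;   // epoch milliseconds
}

interface ChromaRecord {
  id: string;
  document: string;
  metadata: { type: string; project: string; timestamp: number; concepts: string };
}

// Hypothetical helper: build the document text and metadata for one observation.
function toChromaRecord(row: ObservationRow): ChromaRecord {
  return {
    id: `obs_${row.id}`,
    document: `Title: ${row.title}\nNarrative: ${row.narrative}`,
    metadata: {
      type: row.type,
      project: row.projectPath, // assumption: project is derived from projectPath
      timestamp: row.createdAt,
      concepts: row.concepts,
    },
  };
}
```

Keeping the metadata flat (strings and numbers only) matches the shape shown in the Chroma example and keeps the fields usable in `where` filters.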
How Semantic Search Works
Vector Embeddings
Text is converted to high-dimensional vectors that capture meaning:
"user authentication" → [0.12, -0.34, 0.56, ..., 0.78]
"login system" → [0.15, -0.32, 0.54, ..., 0.81]
"database schema" → [-0.45, 0.67, -0.23, ..., 0.12]
Notice: "user authentication" and "login system" have similar vectors (close in vector space), while "database schema" is distant.
Similarity Search
Chroma uses cosine similarity to find nearest neighbors:
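// Query text: "authentication bug"
const queryVector = [0.14, -0.33, 0.55, /* ... */ 0.79];
const ninetyDaysAgo = Date.now() - 90 * 24 * 60 * 60 * 1000;
// Find observations with similar vectors (`collection` is a Chroma collection handle)
const results = await collection.query({
  queryEmbeddings: [queryVector],
  nResults: 10,
  where: { timestamp: { $gte: ninetyDaysAgo } },
});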
Result Ranking:
- Higher cosine similarity = more semantically similar
- Filtered by 90-day recency window
- Combined with keyword matches from FTS5
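For intuition, cosine similarity and top-N ranking can be computed by hand. The sketch below is a brute-force illustration, not how Chroma searches its index, and the 4-dimensional vectors are toy values built only from the visible components of the examples above; the helper names are hypothetical.

```typescript
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force nearest neighbours: score every document, keep the best nResults.
function nearestNeighbors(query: number[], docs: EmbeddedDoc[], nResults: number) {
  return docs
    .map((doc) => ({ id: doc.id, similarity: cosineSimilarity(query, doc.vector) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, nResults);
}

// Toy vectors, not real embeddings.
const toyDocs: EmbeddedDoc[] = [
  { id: "user authentication", vector: [0.12, -0.34, 0.56, 0.78] },
  { id: "login system", vector: [0.15, -0.32, 0.54, 0.81] },
  { id: "database schema", vector: [-0.45, 0.67, -0.23, 0.12] },
];

const top = nearestNeighbors([0.14, -0.33, 0.55, 0.79], toyDocs, 2);
console.log(top.map((hit) => hit.id));
```

With these toy values the two authentication-related entries score close to 1 and "database schema" scores below 0, so it is excluded from the top two.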
90-Day Recency Filtering
Why 90 days?
Rationale:
- Recent context more likely relevant to current work
- Prevents very old observations from diluting results
- Balances completeness with relevance
- Reduces vector search space for faster queries
Implementation:
const ninetyDaysAgo = Date.now() - (90 * 24 * 60 * 60 * 1000);
// Chroma metadata filter
where: {
timestamp: { $gte: ninetyDaysAgo }
}
// SQLite WHERE clause
WHERE createdAt >= ?
Configurable?
- Not currently user-configurable
- Hard-coded in src/servers/search-server.ts
- Future: Add CLAUDE_MEM_RECENCY_DAYS environment variable
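If the proposed CLAUDE_MEM_RECENCY_DAYS variable were added, reading it could look roughly like this. The variable is not implemented today, and `resolveRecencyDays` and `recencyCutoff` are hypothetical names; the sketch only shows a safe fallback to the current 90-day default.

```typescript
const DEFAULT_RECENCY_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type Env = Record<string, string | undefined>;

// Read the (proposed, not yet implemented) variable, falling back to 90 days
// when it is missing, non-numeric, or not a positive integer.
function resolveRecencyDays(env: Env): number {
  const raw = env["CLAUDE_MEM_RECENCY_DAYS"];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_RECENCY_DAYS;
}

// Convert a window in days into an epoch-millisecond cutoff.
function recencyCutoff(now: number, days: number): number {
  return now - days * MS_PER_DAY;
}

console.log(resolveRecencyDays({})); // 90
console.log(resolveRecencyDays({ CLAUDE_MEM_RECENCY_DAYS: "30" })); // 30
```

In a Node.js process the argument would typically be `process.env`, and the resulting cutoff would feed both the Chroma `where` filter and the SQLite `WHERE createdAt >= ?` parameter.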
MCP Tool Integration
All 9 MCP search tools benefit from hybrid search:
search_observations (Hybrid)
// Keyword-only (v4.x)
search_observations(query: "authentication")
// Returns: Observations containing "authentication"
// Hybrid semantic + keyword (v5.x)
search_observations(query: "authentication")
// Returns: Observations with "authentication" PLUS semantically similar:
// - "login system"
// - "user credentials"
// - "session management"
get_timeline_by_query (Semantic-First)
// Uses Chroma to find best match, then builds timeline
get_timeline_by_query(
query: "when did we implement the viewer UI?",
mode: "auto",
depth_before: 10,
depth_after: 10
)
// Chroma finds: Observation #4057 "Web-Based Viewer UI for Real-Time Memory Stream"
// Returns: Timeline with 10 observations before + anchor + 10 after
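Once an anchor observation has been found, building the timeline is a windowing problem. The sketch below assumes the observations are already sorted chronologically; `buildTimeline` is a hypothetical helper, not the tool's actual implementation.

```typescript
// Return up to depthBefore items before the anchor, the anchor itself,
// and up to depthAfter items after it, clamped to the array bounds.
function buildTimeline<T extends { id: number }>(
  chronological: T[],
  anchorId: number,
  depthBefore: number,
  depthAfter: number,
): T[] {
  const anchorIndex = chronological.findIndex((item) => item.id === anchorId);
  if (anchorIndex === -1) return []; // anchor not found
  const start = Math.max(0, anchorIndex - depthBefore);
  const end = Math.min(chronological.length, anchorIndex + depthAfter + 1);
  return chronological.slice(start, end);
}

// Thirty observations with IDs 1..30, anchored on ID 15.
const observations = Array.from({ length: 30 }, (_, i) => ({ id: i + 1 }));
console.log(buildTimeline(observations, 15, 10, 10).length); // 21
```

Near the start or end of the history the window is simply shorter, so an anchor at ID 3 with the same depths returns 13 entries rather than 21.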
Benefits Across All Tools
- find_by_concept: Semantic similarity finds related concepts
- find_by_file: Finds semantically similar code changes
- find_by_type: Better relevance ranking within type
- get_recent_context: Prioritizes semantically relevant recent context
Performance
Benchmarks (8,279 vector documents)
| Operation | Time | Notes |
|---|---|---|
| Semantic Query | 150-200ms | 90-day window, top 10 results |
| Keyword Query (FTS5) | 5-10ms | Full-text search |
| Hybrid Query | 160-220ms | Combined semantic + keyword |
| Initial Sync | 2-5 min | First-time embedding of all observations |
| Incremental Sync | 100-500ms | 1-10 new observations per session |
Memory Usage
- Chroma DB Size: ~50MB for 8,000 observations
- Embeddings: 384 dimensions × 4 bytes = 1.5KB per observation
- Metadata: ~500 bytes per observation (project, type, timestamp)
- Total: ~2KB per observation in Chroma
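The per-observation arithmetic above can be checked with a short calculation. The function names are hypothetical, and the estimate covers only raw float32 vectors plus metadata; it does not account for index or storage overhead, so it comes out below the on-disk size quoted above.

```typescript
const BYTES_PER_FLOAT32 = 4;

// Raw payload per observation: one float32 per dimension plus metadata.
function estimateBytesPerObservation(dimensions = 384, metadataBytes = 500): number {
  return dimensions * BYTES_PER_FLOAT32 + metadataBytes;
}

// Raw payload for a whole collection, in MiB (no index or storage overhead).
function estimateRawMegabytes(
  observationCount: number,
  dimensions = 384,
  metadataBytes = 500,
): number {
  const totalBytes = observationCount * estimateBytesPerObservation(dimensions, metadataBytes);
  return totalBytes / (1024 * 1024);
}

console.log(estimateBytesPerObservation()); // 2036
console.log(estimateRawMegabytes(8000).toFixed(1)); // "15.5"
```

The 2036-byte result corresponds to the ~2KB figure: 1536 bytes of vector data plus roughly 500 bytes of metadata.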
Optimization Tips
- Reduce vector dimensions: Use smaller embedding models (future)
- Adjust recency window: Narrow to 30/60 days for faster queries
- Limit result count: Request fewer results (n_results=5 vs 10)
- Project filtering: Add project filter to metadata query
Installation & Dependencies
Python Requirement
Chroma requires Python 3.7 or later to be installed:
Check Python:
python3 --version
# Should show: Python 3.7.x or higher
Install Python (if needed):
- macOS: brew install python3
- Windows: Download from python.org
- Linux: apt-get install python3 or yum install python3
ChromaDB Installation
Chroma is installed automatically as an npm dependency:
npm install
# Installs: chromadb (Python package via node-gyp bindings)
Manual Installation (if auto-install fails):
pip3 install chromadb
Troubleshooting Installation
Error: "Python not found"
# Set Python path explicitly
export PYTHON=/usr/local/bin/python3
npm install
Error: "chromadb module not found"
# Reinstall chromadb
pip3 install --upgrade chromadb
# Verify installation
python3 -c "import chromadb; print(chromadb.__version__)"
Error: "node-gyp build failed"
# Install build tools
# macOS: xcode-select --install
# Windows: npm install --global windows-build-tools
# Linux: apt-get install build-essential
Configuration
Environment Variables
Currently no user-configurable settings. Future options:
// Proposed for future versions
{
"env": {
"CLAUDE_MEM_CHROMA_ENABLED": "true", // Enable/disable Chroma
"CLAUDE_MEM_CHROMA_PATH": "~/.claude-mem/chroma", // DB location
"CLAUDE_MEM_EMBEDDING_MODEL": "default", // Embedding model choice
"CLAUDE_MEM_RECENCY_DAYS": "90", // Recency window
"CLAUDE_MEM_VECTOR_DIM": "384" // Embedding dimensions
}
}
Disabling Chroma (Future)
To disable semantic search and use keyword-only:
{
"env": {
"CLAUDE_MEM_CHROMA_ENABLED": "false"
}
}
Falls back to SQLite FTS5 keyword search only.
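The fallback decision could be expressed as a small pure function. This is a sketch under assumptions: CLAUDE_MEM_CHROMA_ENABLED is a proposed setting that does not exist yet, and `selectSearchMode` and its `chromaAvailable` flag are hypothetical names.

```typescript
type SearchMode = "hybrid" | "keyword-only";

// Use hybrid search only when Chroma is both enabled and actually reachable;
// otherwise fall back to SQLite FTS5 keyword search.
function selectSearchMode(
  env: Record<string, string | undefined>,
  chromaAvailable: boolean,
): SearchMode {
  // Proposed variable (not implemented yet); anything other than "false" counts as enabled.
  const enabled = (env["CLAUDE_MEM_CHROMA_ENABLED"] ?? "true").toLowerCase() !== "false";
  return enabled && chromaAvailable ? "hybrid" : "keyword-only";
}

console.log(selectSearchMode({}, true)); // "hybrid"
console.log(selectSearchMode({ CLAUDE_MEM_CHROMA_ENABLED: "false" }, true)); // "keyword-only"
console.log(selectSearchMode({}, false)); // "keyword-only"
```

Treating an unavailable Chroma the same way as a disabled one means a missing Python install degrades search quality instead of breaking search.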
Database Maintenance
Location
~/.claude-mem/chroma/
├── chroma.sqlite3 # Chroma metadata database
└── index/ # Vector index files
└── *.bin # Binary vector data
Backup
# Backup entire Chroma directory
cp -r ~/.claude-mem/chroma ~/.claude-mem/chroma.backup
# Restore from backup
rm -rf ~/.claude-mem/chroma
cp -r ~/.claude-mem/chroma.backup ~/.claude-mem/chroma
Reset Chroma (Force Resync)
# Delete Chroma database
rm -rf ~/.claude-mem/chroma
# Restart worker to trigger full resync
npm run worker:restart
# Check logs for sync progress
npm run worker:logs
Note: Resync can take 2-5 minutes for thousands of observations.
Disk Space Management
Chroma grows with observations:
- 1,000 observations ≈ 5MB
- 10,000 observations ≈ 50MB
- 100,000 observations ≈ 500MB
Cleanup old observations:
# Delete observations older than 1 year
# This will trigger Chroma resync on next startup
sqlite3 ~/.claude-mem/claude-mem.db \
"DELETE FROM observations WHERE createdAt < strftime('%s', 'now', '-1 year') * 1000;"
Advanced Usage
Direct Chroma Queries (Development)
For debugging or custom queries:
import { ChromaSync } from './services/sync/ChromaSync';
const sync = new ChromaSync();
await sync.initialize();
// Query Chroma directly
const results = await sync.query({
queryTexts: ["authentication implementation"],
nResults: 10,
where: {
type: "feature",
timestamp: { $gte: Date.now() - 90 * 24 * 60 * 60 * 1000 } // last 90 days
}
});
console.log(results.ids, results.distances, results.documents);
Custom Embedding Models (Future)
Chroma supports multiple embedding models:
// Future configuration
const sync = new ChromaSync({
embeddingModel: "sentence-transformers/all-MiniLM-L6-v2", // Smaller, faster
// or: "text-embedding-ada-002" (OpenAI, requires API key)
// or: "all-mpnet-base-v2" (Higher quality, slower)
});
Metadata Filtering
Chroma supports advanced metadata queries:
// Find observations by type and project
let results = await sync.query({
queryTexts: ["API design"],
where: {
$and: [
{ type: { $in: ["decision", "feature"] } },
{ project: "claude-mem" }
]
}
});
// Find recent observations
results = await sync.query({
queryTexts: ["database schema"],
where: {
timestamp: { $gte: Date.now() - 30 * 24 * 60 * 60 * 1000 } // last 30 days
}
});
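When several optional filters are combined, it can help to assemble the `where` object in one place. The sketch below is a hypothetical builder (`buildWhere` and `FilterOptions` are not part of ChromaSync); it emits the same `$in`, `$gte`, and `$and` operators used in the examples above.

```typescript
type Where = Record<string, unknown>;

interface FilterOptions {
  types?: string[];  // e.g. ["decision", "feature"]
  project?: string;  // e.g. "claude-mem"
  sinceMs?: number;  // epoch-millisecond lower bound on timestamp
}

// Build a metadata filter from whichever options are present.
function buildWhere(options: FilterOptions): Where | undefined {
  const clauses: Where[] = [];
  if (options.types !== undefined && options.types.length > 0) {
    clauses.push({ type: { $in: options.types } });
  }
  if (options.project !== undefined) {
    clauses.push({ project: options.project });
  }
  if (options.sinceMs !== undefined) {
    clauses.push({ timestamp: { $gte: options.sinceMs } });
  }
  if (clauses.length === 0) return undefined; // no filter at all
  if (clauses.length === 1) return clauses[0]; // a lone clause needs no $and wrapper
  return { $and: clauses };
}

console.log(JSON.stringify(buildWhere({ types: ["decision", "feature"], project: "claude-mem" })));
```

Returning `undefined` when no options are set lets the caller omit `where` entirely instead of sending an empty filter.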
Comparison: Semantic vs Keyword Search
| Aspect | Semantic (Chroma) | Keyword (FTS5) |
|---|---|---|
| Speed | 150-200ms | 5-10ms |
| Accuracy | High (meaning-based) | Medium (exact match) |
| Storage | ~2KB per observation | ~500 bytes per observation |
| Conceptual Matching | ✅ Yes | ❌ No |
| Exact Match | ❌ Not guaranteed | ✅ Always |
| Typo Tolerance | ✅ High | ⚠️ Limited (fuzzy) |
| Dependencies | Python + chromadb | None (SQLite built-in) |
| Recency Bias | ✅ Built-in (90 days) | Manual filtering |
Best Practice: Use hybrid search (both) for optimal results.
Troubleshooting
"Chroma not found" Error
Symptom: Worker logs show "Chroma not available, using keyword-only search"
Solution:
# Check Python installation
python3 --version
# Reinstall chromadb
pip3 install chromadb
# Restart worker
npm run worker:restart
Slow Query Performance
Symptom: Searches taking >1 second
Solutions:
- Reduce recency window (edit src/servers/search-server.ts)
- Limit result count (nResults: 5 instead of 10)
- Add project filter to narrow search space
- Check Chroma index size (may need rebuild)
Out of Memory Errors
Symptom: Worker crashes with "JavaScript heap out of memory"
Solution:
# Increase Node.js heap size
export NODE_OPTIONS="--max-old-space-size=4096"
# Restart worker
npm run worker:restart
Sync Taking Too Long
Symptom: Initial Chroma sync takes >10 minutes
Possible Causes:
- Large number of observations (>10,000)
- Slow embedding model
- Limited CPU resources
Solutions:
- Let it complete (one-time cost)
- Delete very old observations to reduce count
- Close resource-intensive apps during sync
Future Enhancements
Potential improvements for future versions:
- Configurable Recency: User-defined recency window (30/60/90/365 days)
- Custom Embeddings: Choose embedding model (quality vs speed trade-off)
- Incremental Updates: Update existing vectors instead of full resync
- Semantic Filters: Search by semantic concept ("all architectural decisions")
- Multi-Language Support: Embeddings optimized for non-English code/docs
- Clustering: Auto-cluster related observations for discovery
- Visualization: 2D/3D visualization of vector space (similar observations near each other)
Resources
- ChromaDB Documentation: https://docs.trychroma.com/
- Source Code: src/services/sync/ChromaSync.ts
- Search Server: src/servers/search-server.ts
- Python Package: https://pypi.org/project/chromadb/
Powered by ChromaDB | Hybrid Semantic + Keyword Search | 90-Day Recency Window