Refactor search documentation to implement a 3-layer workflow for memory retrieval; update tool names and usage examples for clarity and efficiency. Enhance troubleshooting section with new error handling and token management strategies.

Alex Newman
2025-12-29 00:26:06 -05:00
parent f1aa4c3943
commit 00d0bc51e0
6 changed files with 1024 additions and 732 deletions
@@ -248,6 +248,164 @@ search_observations({
---
## MCP Architecture Simplification (December 2025)
### The Problem: Complex MCP Implementation
**Before:**
```
9+ MCP tools registered at session start:
- search_observations
- find_by_type
- find_by_file
- find_by_concept
- get_recent_context
- get_observation
- get_session
- get_prompt
- help
Problems:
- Overlapping operations (search_observations vs find_by_type)
- Complex parameter schemas (~2,500 tokens in tool definitions)
- No built-in workflow guidance
- High cognitive load for Claude (which tool to use?)
- Code size: ~2,718 lines in mcp-server.ts
```
**The Insight:** Progressive disclosure should be built into tool design itself, not something Claude has to remember.
### The Solution: 3-Layer Workflow
**After:**
```
4 MCP tools following 3-layer workflow:
1. __IMPORTANT - Workflow documentation (always visible)
"3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs
2. timeline(anchor=ID) → Get context
3. get_observations([IDs]) → Fetch details
NEVER fetch full details without filtering first."
2. search - Layer 1: Get index with IDs (~50-100 tokens/result)
3. timeline - Layer 2: Get chronological context
4. get_observations - Layer 3: Fetch full details (~500-1,000 tokens/result)
Benefits:
- Progressive disclosure enforced by tool structure
- No overlapping operations
- Simple schemas (additionalProperties: true)
- Clear workflow pattern
- Code size: ~312 lines in mcp-server.ts (88% reduction)
- ~10x token savings
```
### Migration: Skill-Based Search Removed
**Previously:** Used skill-based search
- mem-search skill invoked via natural language
- HTTP API called directly via curl
- Progressive disclosure through skill loading
- 17 skill documentation files
**Now:** Removed skill-based approach
- MCP-only architecture
- Native MCP protocol (better Claude integration)
- Works with both Claude Desktop and Claude Code
- Simpler to maintain (no skill files)
- All 19 mem-search skill files removed (~2,744 lines)
### Key Architectural Changes
**MCP Server Refactor:**
Before:
```typescript
// Complex parameter schemas
{
name: "search_observations",
inputSchema: {
type: "object",
properties: {
query: { type: "string", description: "..." },
type: { type: "array", items: { enum: [...] } },
format: { enum: ["index", "full"] },
limit: { type: "number", minimum: 1, maximum: 100 },
// ... many more parameters
}
}
}
```
After:
```typescript
// Simple schemas with workflow guidance
{
name: "search",
description: "Step 1: Search memory. Returns index with IDs.",
inputSchema: {
type: "object",
properties: {},
additionalProperties: true // Accept any parameters
}
}
```
**Workflow Enforcement:**
Before: Claude had to remember progressive disclosure pattern
After: Tool structure makes it impossible to skip steps
- Can't get details without IDs from search
- Can't search without seeing __IMPORTANT reminder
- Timeline provides middle ground (context without full details)
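The filtering step at the heart of this enforcement can be sketched in a few lines. This is an illustrative sketch only — the entry shape and keyword filter are hypothetical, not the actual tool output:

```typescript
// Illustrative sketch: Layer 1 returns a compact index; only IDs that
// survive filtering are passed to Layer 3. Shapes here are hypothetical.
interface IndexEntry { id: number; title: string; type: string; }

function pickRelevant(index: IndexEntry[], keyword: string): number[] {
  return index
    .filter((e) => e.title.toLowerCase().includes(keyword))
    .map((e) => e.id);
}

const index: IndexEntry[] = [
  { id: 123, title: "Fixed auth token expiry", type: "bugfix" },
  { id: 456, title: "Refactored CSS grid", type: "refactor" },
  { id: 789, title: "Auth session race condition", type: "bugfix" },
];

// Only these IDs would be sent to get_observations (Layer 3)
console.log(pickRelevant(index, "auth")); // → [ 123, 789 ]
```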
### Impact
**Token Efficiency:**
```
Traditional: Fetch 20 observations upfront
→ 10,000-20,000 tokens
→ Only 2 observations relevant (90% waste)
3-Layer Workflow:
→ search (20 results): ~1,000-2,000 tokens
→ Review index, identify 3 relevant IDs
→ get_observations (3 IDs): ~1,500-3,000 tokens
→ Total: 2,500-5,000 tokens (50-75% savings)
```
**Code Simplicity:**
- MCP server: 2,718 lines → 312 lines (88% reduction)
- Removed: 19 skill files (~2,744 lines)
- Net reduction: ~5,150 lines of code removed
**User Experience:**
- Same natural language interaction
- Better token efficiency
- Clearer architecture
- Works identically on Claude Desktop and Claude Code
### Design Philosophy
**Progressive Disclosure Through Structure:**
The 3-layer workflow embodies progressive disclosure at the architectural level:
1. **Layer 1 (Index)** - "What exists?" - Cheap survey of options
2. **Layer 2 (Timeline)** - "What was happening?" - Context around specific points
3. **Layer 3 (Details)** - "Tell me everything" - Full details only when justified
Each layer provides a decision point where Claude can:
- Stop if irrelevant
- Get more context if uncertain
- Dive deep if confident
This makes it structurally difficult to waste tokens.
---
## v1-v2: The Naive Approach
### The First Attempt: Dump Everything
@@ -1,448 +1,497 @@
---
title: "Search Architecture"
description: "MCP tools with 3-layer workflow for token-efficient memory retrieval"
---
# Search Architecture
Claude-mem uses an **MCP-based search architecture** that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern.
## Overview
**Architecture**: MCP Tools → MCP Protocol → HTTP API → Worker Service
**Key Components**:
1. **MCP Tools** (4 tools) - `search`, `timeline`, `get_observations`, `__IMPORTANT`
2. **MCP Server** (`plugin/scripts/mcp-server.cjs`) - Thin wrapper over HTTP API
3. **HTTP API Endpoints** - Fast search operations on Worker Service (port 37777)
4. **Worker Service** - Express.js server with FTS5 full-text search
5. **SQLite Database** - Persistent storage with FTS5 virtual tables
6. **Chroma Vector DB** - Semantic search with hybrid retrieval
**Token Efficiency**: ~10x savings through 3-layer workflow pattern
## How It Works
### 1. User Query
Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:
```
User: "What bugs did we fix last session?"
Step 1: search(query="authentication bug", type="bugfix", limit=10)
Step 2: timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
Step 3: get_observations(ids=[123, 456, 789])
```
### 2. MCP Protocol
MCP server receives tool call via JSON-RPC over stdio:
```json
{
"method": "tools/call",
"params": {
"name": "search",
"arguments": {
"query": "authentication bug",
"type": "bugfix",
"limit": 10
}
}
}
```
### 3. HTTP API Call
MCP server translates to HTTP request:
```typescript
const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
const response = await fetch(url);
```
### 4. Worker Processing
Worker service executes FTS5 query:
```sql
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = 'bugfix'
ORDER BY rank
LIMIT 10
```
### 5. Results Returned
Worker returns structured data → MCP server → Claude:
```json
{
  "content": [{
    "type": "text",
    "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |"
  }]
}
```
### 6. Claude Processes Results
Claude reviews the index, decides which observations are relevant, and can:
- Use `timeline` to get context
- Use `get_observations` to fetch full details for selected IDs
## The 4 MCP Tools
### `__IMPORTANT` - Workflow Documentation
Always visible to Claude. Explains the 3-layer workflow pattern.
**Description:**
```
3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) → Get context around interesting results
3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs
NEVER fetch full details without filtering first. 10x token savings.
```
**Purpose:** Ensures Claude follows token-efficient pattern
### `search` - Search Memory Index
**Tool Definition:**
```typescript
{
name: 'search',
description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true // Accepts any parameters
}
}
```
**HTTP Endpoint:** `GET /api/search`
**Parameters:**
- `query` - Full-text search query
- `limit` - Maximum results (default: 20)
- `type` - Filter by observation type
- `project` - Filter by project name
- `dateStart`, `dateEnd` - Date range filters
- `offset` - Pagination offset
- `orderBy` - Sort order
**Returns:** Compact index with IDs, titles, dates, types (~50-100 tokens per result)
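As a sketch, a client could assemble the Layer 1 request like this. Parameter names follow the list above; the query values are illustrative:

```typescript
// Build the /api/search query string; URLSearchParams handles encoding.
const params = new URLSearchParams({
  query: "authentication bug",
  type: "bugfix",
  limit: "10",
});
const url = `http://localhost:37777/api/search?${params}`;
console.log(url);
// → http://localhost:37777/api/search?query=authentication+bug&type=bugfix&limit=10
```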
### `timeline` - Get Chronological Context
**Tool Definition:**
```typescript
{
name: 'timeline',
description: 'Step 2: Get context around results. Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
}
}
```
**HTTP Endpoint:** `GET /api/timeline`
**Parameters:**
- `anchor` - Observation ID to center timeline around (optional if query provided)
- `query` - Search query to find anchor automatically (optional if anchor provided)
- `depth_before` - Number of observations before anchor (default: 3)
- `depth_after` - Number of observations after anchor (default: 3)
- `project` - Filter by project name
**Returns:** Chronological view showing what happened before/during/after
### `get_observations` - Fetch Full Details
**Tool Definition:**
```typescript
{
name: 'get_observations',
description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project',
inputSchema: {
type: 'object',
properties: {
ids: {
type: 'array',
items: { type: 'number' },
description: 'Array of observation IDs to fetch (required)'
}
},
required: ['ids'],
additionalProperties: true
}
}
```
**HTTP Endpoint:** `POST /api/observations/batch`
**Body:**
```json
{
"ids": [123, 456, 789],
"orderBy": "date_desc",
"project": "my-app"
}
```
**Returns:** Complete observation details (~500-1,000 tokens per observation)
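A minimal sketch of assembling that request body. Field names come from the example above; `buildBatchRequest` is a hypothetical helper, not part of the plugin:

```typescript
// Hypothetical helper: shape the POST /api/observations/batch request.
function buildBatchRequest(ids: number[], project?: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ ids, orderBy: "date_desc", ...(project ? { project } : {}) }),
  };
}

const req = buildBatchRequest([123, 456, 789], "my-app");
console.log(req.body);
// → {"ids":[123,456,789],"orderBy":"date_desc","project":"my-app"}
```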
## MCP Server Implementation
**Location:** `/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`
**Role:** Thin wrapper that translates MCP protocol to HTTP API calls
**Key Characteristics:**
- ~312 lines of code (reduced from ~2,718 lines in old implementation)
- No business logic - just protocol translation
- Single source of truth: Worker HTTP API
- Simple schemas with `additionalProperties: true`
**Handler Example:**
```typescript
{
name: 'search',
handler: async (args: any) => {
const endpoint = '/api/search';
const searchParams = new URLSearchParams();
for (const [key, value] of Object.entries(args)) {
searchParams.append(key, String(value));
}
const url = `http://localhost:37777${endpoint}?${searchParams}`;
const response = await fetch(url);
return await response.json();
}
}
```
## Worker HTTP API
**Location:** `src/services/worker-service.ts`
**Port:** 37777
**Search Endpoints:**
```typescript
GET /api/search # Main search (used by MCP search tool)
GET /api/timeline # Timeline context (used by MCP timeline tool)
POST /api/observations/batch # Fetch by IDs (used by MCP get_observations tool)
GET /api/health # Health check
```
**Database Access:**
- Uses `SessionSearch` service for FTS5 queries
- Uses `SessionStore` for structured queries
- Hybrid search with ChromaDB for semantic similarity
**FTS5 Full-Text Search:**
```sql
-- search tool → HTTP GET → FTS5 query
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = ?
AND date >= ? AND date <= ?
ORDER BY rank
LIMIT ? OFFSET ?
```
## The 3-Layer Workflow Pattern
### Design Philosophy
The 3-layer workflow embodies **progressive disclosure** - a core principle of claude-mem's architecture.
**Layer 1: Index (Search)**
- **What:** Compact table with IDs, titles, dates, types
- **Cost:** ~50-100 tokens per result
- **Purpose:** Survey what exists before committing tokens
- **Decision Point:** "Which observations are relevant?"
**Layer 2: Context (Timeline)**
- **What:** Chronological view of observations around a point
- **Cost:** Variable based on depth
- **Purpose:** Understand narrative arc, see what led to/from a point
- **Decision Point:** "Do I need full details?"
**Layer 3: Details (Get Observations)**
- **What:** Complete observation data (narrative, facts, files, concepts)
- **Cost:** ~500-1,000 tokens per observation
- **Purpose:** Deep dive on validated, relevant observations
- **Decision Point:** "Apply knowledge to current task"
### Token Efficiency
**Traditional RAG Approach:**
```
Fetch 20 observations upfront: 10,000-20,000 tokens
Relevance: ~10% (only 2 observations actually useful)
Waste: 18,000 tokens on irrelevant context
```
**3-Layer Workflow:**
```
Step 1: search (20 results) ~1,000-2,000 tokens
Step 2: Review index, filter to 3 relevant IDs
Step 3: get_observations (3 IDs) ~1,500-3,000 tokens
Total: 2,500-5,000 tokens (50-75% savings)
```
**10x Savings:** By filtering at index level before fetching full details
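The arithmetic behind the 10x figure, using the optimistic ends of the per-result costs quoted above and the typical "2 relevant results" case:

```typescript
// Token estimates from this doc: index rows ~50 tokens (lower bound),
// full observations ~1,000 tokens (upper bound), 2 truly relevant results.
const traditional = 20 * 1000;      // fetch 20 full observations upfront
const layered = 20 * 50 + 2 * 500;  // index of 20, then 2 full fetches

console.log(traditional, layered, traditional / layered);
// → 20000 2000 10
```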
## Architecture Evolution
### Before: Complex MCP Implementation
**Approach:** 9 MCP tools with detailed parameter schemas
**Token Cost:** ~2,500 tokens in tool definitions per session
- `search_observations` - Full-text search
- `find_by_type` - Filter by type
- `find_by_file` - Filter by file
- `find_by_concept` - Filter by concept
- `get_recent_context` - Recent sessions
- `get_observation` - Fetch single observation
- `get_session` - Fetch session
- `get_prompt` - Fetch prompt
- `help` - API documentation
**Problems:**
- Overlapping operations (search_observations vs find_by_type)
- Complex parameter schemas
- No built-in workflow guidance
- High token cost at session start
**Code Size:** ~2,718 lines in mcp-server.ts
### After: Streamlined MCP Implementation
**Approach:** 4 MCP tools following 3-layer workflow
**Token Cost:** Minimal - 4 simplified tool definitions instead of 9 detailed schemas
**Tools:**
1. `__IMPORTANT` - Workflow guidance (always visible)
2. `search` - Step 1 (index)
3. `timeline` - Step 2 (context)
4. `get_observations` - Step 3 (details)
**Benefits:**
- Progressive disclosure built into tool design
- No overlapping operations
- Simple schemas (`additionalProperties: true`)
- Clear workflow pattern
- ~10x token savings
**Code Size:** ~312 lines in mcp-server.ts (88% reduction)
### Key Insight
**Before:** Progressive disclosure was something Claude had to remember
**After:** Progressive disclosure is enforced by tool design itself
The 3-layer workflow pattern makes it structurally difficult to waste tokens:
- Can't fetch details without first getting IDs from search
- Can't search without seeing workflow reminder (`__IMPORTANT`)
- Timeline provides middle ground between index and full details
## Configuration
### Claude Desktop
Add to `claude_desktop_config.json`:
```json
{
"mcpServers": {
"mcp-search": {
"command": "node",
"args": [
"/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs"
]
}
}
}
```
### Claude Code
MCP server is automatically configured via plugin installation. No manual setup required.
**Both clients use the same MCP tools** - the architecture works identically for Claude Desktop and Claude Code.
## Security
### FTS5 Injection Prevention
All search queries are escaped before FTS5 processing:
```typescript
function escapeFTS5Query(query: string): string {
return query.replace(/"/g, '""');
}
```
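For instance, quotes in user input are doubled so they stay inside the FTS5 string literal (function reproduced from above to make the sketch runnable):

```typescript
// Doubling embedded quotes keeps user input inside the FTS5 string literal.
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}

console.log(escapeFTS5Query('fix "auth" bug'));
// → fix ""auth"" bug
```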
**Testing:** 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators.
### MCP Protocol Security
- Stdio transport (no network exposure)
- Local-only HTTP API (localhost:37777)
- No authentication needed (local development only)
## Performance
**FTS5 Full-Text Search:** <10ms for typical queries
**MCP Overhead:** Minimal - simple protocol translation
**Caching:** HTTP layer allows response caching (future enhancement)
**Pagination:** Efficient with offset/limit
**Batching:** `get_observations` accepts multiple IDs in single call
## Benefits Over Alternative Approaches
### vs. Traditional RAG
**Traditional RAG:**
- Fetches everything upfront
- High token cost
- Low relevance ratio
**3-Layer MCP:**
- Fetches only what's needed
- ~10x token savings
- 100% relevance (Claude chooses what to fetch)
### vs. Previous MCP Implementation (v5.x)
**Previous (9 tools):**
- Complex schemas
- Overlapping operations
- No workflow guidance
- ~2,500 tokens in definitions
**Current (4 tools):**
- Simple schemas
- Clear workflow
- Built-in guidance
- ~312 lines of code
### vs. Skill-Based Approach (Previously)
**Skill approach:**
- Required separate skill files
- HTTP API called directly via curl
- Progressive disclosure through skill loading
**MCP approach:**
- Native MCP protocol (better Claude integration)
- Cleaner architecture (protocol translation layer)
- Works with both Claude Desktop and Claude Code
- Simpler to maintain (no skill files)
**Migration:** Skill-based search was removed in favor of streamlined MCP architecture.
## Troubleshooting
### MCP Server Not Connected
**Symptoms:** Tools not appearing in Claude
**Solution:**
1. Check MCP server path in configuration
2. Verify worker service is running: `curl http://localhost:37777/api/health`
3. Restart Claude Desktop/Code
### Worker Service Not Running
**Symptoms:** MCP tools fail with connection errors
**Solution:**
```bash
npm run worker:status # Check status
npm run worker:restart # Restart worker
npm run worker:logs # View logs
```
### Empty Search Results
**Symptoms:** search() returns no results
**Troubleshooting:**
1. Test API directly: `curl "http://localhost:37777/api/search?query=test"`
2. Check database: `ls ~/.claude-mem/claude-mem.db`
3. Verify observations exist: `curl "http://localhost:37777/api/health"`
## Next Steps
- [Memory Search Usage](/usage/search-tools) - User guide with examples
- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
- [Worker Service Architecture](/architecture/worker-service) - HTTP API details
- [Database Schema](/architecture/database) - FTS5 tables and indexes
@@ -260,14 +260,12 @@ The index is useless without retrieval mechanisms:
*Use claude-mem MCP search to access records with the given ID*
```
**Available MCP tools:**
- `search` - Search memory index (Layer 1: Get IDs)
- `timeline` - Get chronological context (Layer 2: See narrative arc)
- `get_observations` - Fetch full details (Layer 3: Deep dive)
The 3-layer workflow ensures progressive disclosure: index → context → details.
---
@@ -318,16 +316,18 @@ Is my task related to npm? → YES
---
## The Three-Layer Workflow
Claude-Mem implements progressive disclosure through a 3-layer workflow pattern:
### Layer 1: Search (Index)
Start by searching to get a compact index with IDs:
```typescript
search({
query: "hook timeout",
  limit: 10
})
```
@@ -335,23 +335,40 @@ search_observations({
```
Found 3 observations matching "hook timeout":
| ID | Date | Type | Title |
|----|------|------|-------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI |
```
**Cost:** ~50-100 tokens per result
**Value:** Agent can scan and decide which observations are relevant
### Layer 2: Timeline (Context)
Get chronological context around interesting observations:
```typescript
timeline({
anchor: 2543, // Observation ID from search
depth_before: 3,
depth_after: 3
})
```
**Returns:** Chronological view showing what happened before/during/after observation #2543
**Cost:** Variable based on depth
**Value:** Understand narrative arc and context
### Layer 3: Get Observations (Details)
Fetch full details only for relevant observations:
```typescript
get_observations({
ids: [2543, 2102] // Selected from search results
})
```
@@ -463,29 +480,30 @@ Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```
### ❌ Skipping the Index Layer
**Bad:**
```typescript
// Fetching full details immediately
get_observations({
ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] // Guessing which are relevant
})
```
**Good:**
```typescript
// Follow the 3-layer workflow
// Layer 1: Search for index
search({
  query: "hooks",
  limit: 20
})

// Layer 2: Review index, identify 2-3 relevant IDs
// Layer 3: Fetch only relevant observations
get_observations({
ids: [2543, 2891] // Just the most relevant
})
```
```typescript
// Use embeddings to pre-sort index by relevance
search({
query: "authentication bug",
orderBy: "relevance" // Based on semantic similarity (future enhancement)
})
```
3. Test simple query:
```bash
# Test MCP search tool
search(query="test", limit=5)
```
4. Check query syntax:
```bash
# Bad: Special characters may cause issues
search(query="[test]")
# Good: Simple words
search(query="test")
```
### Token Limit Errors
**Solutions**:
1. Follow 3-layer workflow (don't skip to get_observations):
```bash
# Start with search to get index
search(query="...", limit=10)
# Review IDs, then fetch only relevant ones
get_observations(ids=[<2-3 relevant IDs>])
```
2. Reduce limit in search:
```bash
search(query="...", limit=3)
```
3. Use filters to narrow results:
```bash
search(query="...", type="decision", limit=5)
```
4. Paginate results:
```bash
# First page
search(query="...", limit=5, offset=0)
# Second page
search(query="...", limit=5, offset=5)
```
5. Batch IDs in get_observations:
```bash
# Always batch multiple IDs in one call
get_observations(ids=[123, 456, 789])
# Don't make separate calls per ID
```
## Performance Issues
---
title: "Memory Search"
description: "Search your project history with MCP tools"
---
# Memory Search with MCP Tools
Claude-mem provides persistent memory across sessions through **4 MCP tools** that follow a token-efficient **3-layer workflow pattern**.
## Overview
Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach:
1. **Search** → Get a compact index with IDs (~50-100 tokens/result)
2. **Timeline** → Get context around interesting results
3. **Get Observations** → Fetch full details ONLY for filtered IDs
This achieves **~10x token savings** compared to traditional RAG approaches.
## The 3-Layer Workflow
### Layer 1: Search (Index)
Start by searching to get a lightweight index of results:
```
search(query="authentication bug", type="bugfix", limit=10)
```
**Returns:** Compact table with IDs, titles, dates, types
**Cost:** ~50-100 tokens per result
**Purpose:** Survey what exists before fetching details
### Layer 2: Timeline (Context)
Get chronological context around specific observations:
```
timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
```
Or search and get timeline in one step:
```
timeline(query="authentication", depth_before=2, depth_after=2)
```
**Returns:** Chronological view showing what was happening before/after
**Cost:** Variable, depends on depth
**Purpose:** Understand narrative arc and context
### Layer 3: Get Observations (Details)
Fetch full details only for relevant observations:
```
get_observations(ids=[123, 456, 789])
```
**Returns:** Complete observation details (narrative, facts, files, concepts)
**Cost:** ~500-1000 tokens per observation
**Purpose:** Deep dive on specific, validated items
### Why This Works
**Traditional Approach:**
- Fetch everything upfront: 20,000 tokens
- Relevance: ~10% (2,000 tokens actually useful)
- Waste: 18,000 tokens on irrelevant context
**3-Layer Approach:**
- Search index: 1,000 tokens (10 results)
- Timeline context: 500 tokens (around 2 key results)
- Fetch details: 1,500 tokens (3 observations)
- **Total: 3,000 tokens, 100% relevant**
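The comparison above is simple arithmetic. As an illustrative sketch (the per-item costs are rough midpoints of the estimates quoted on this page, not measured values):

```python
# Illustrative token budgets; per-item costs are rough midpoints
# of the estimates quoted in this documentation, not measured values.

def budget_traditional(n_items: int, tokens_per_item: int = 1000) -> int:
    """Fetch full details for every observation upfront."""
    return n_items * tokens_per_item

def budget_three_layer(index_rows: int, timeline_items: int, detail_items: int) -> int:
    """Search index (~100/row), timeline (~250/item), full details (~500/item)."""
    return index_rows * 100 + timeline_items * 250 + detail_items * 500

print(budget_traditional(20))        # 20000 tokens, mostly irrelevant
print(budget_three_layer(10, 2, 3))  # 3000 tokens, all relevant
```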
## Available Tools
### `__IMPORTANT` - Workflow Documentation
Always visible reminder of the 3-layer workflow pattern. Helps Claude understand how to use the search tools efficiently.
**Usage:** Automatically shown, no need to invoke
### `search` - Search Memory Index
Search your memory and get a compact index with IDs.
**Parameters:**
- `query` - Full-text search query (supports AND, OR, NOT, phrase searches)
- `limit` - Maximum results (default: 20)
- `offset` - Skip first N results for pagination
- `type` - Filter by observation type (bugfix, feature, decision, discovery, refactor, change)
- `obs_type` - Filter by record type (observation, session, prompt)
- `project` - Filter by project name
- `dateStart` - Filter by start date (YYYY-MM-DD)
- `dateEnd` - Filter by end date (YYYY-MM-DD)
- `orderBy` - Sort order (date_desc, date_asc, relevance)
**Returns:** Compact index table with IDs, titles, dates, types
**Example:**
```
search(query="database migration", type="bugfix", limit=5, orderBy="date_desc")
```
### `timeline` - Get Chronological Context
Get a chronological view of observations around a specific point or query.
**Parameters:**
- `anchor` - Observation ID to center timeline around (optional if query provided)
- `query` - Search query to find anchor automatically (optional if anchor provided)
- `depth_before` - Number of observations before anchor (default: 3)
- `depth_after` - Number of observations after anchor (default: 3)
- `project` - Filter by project name
**Returns:** Chronological list showing what happened before/during/after
**Example:**
```
timeline(anchor=12345, depth_before=5, depth_after=5)
```
Or search-based:
```
timeline(query="implemented JWT auth", depth_before=3, depth_after=3)
```
### `get_observations` - Fetch Full Details
Fetch complete observation details by IDs. **Always batch multiple IDs in a single call for efficiency.**
**Parameters:**
- `ids` - Array of observation IDs (required)
- `orderBy` - Sort order (date_desc, date_asc)
- `limit` - Maximum observations to return
- `project` - Filter by project name
**Returns:** Complete observation details including narrative, facts, files, concepts
**Example:**
```
get_observations(ids=[123, 456, 789, 1011])
```
**Important:** Always batch IDs instead of making separate calls per observation.
## Common Use Cases
### Debugging Issues
**Scenario:** Find what went wrong with database connections
```
Step 1: search(query="error database connection", type="bugfix", limit=10)
→ Review index, identify observations #245, #312, #489
Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
→ See what was happening around the fix
Step 3: get_observations(ids=[312, 489])
→ Get full details on relevant fixes
```
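The same flow can be sketched in Python. The stub functions below are hypothetical stand-ins for the MCP tools (the real tools are invoked by Claude, not imported from a library) and return canned data, included only to show the shape of the three layers:

```python
# Hypothetical stand-ins for the MCP tools, for illustration only.
def search(query, type=None, limit=20):
    # Layer 1: would return a compact index of {id, title} rows.
    return [{"id": 245, "title": "Connection pool exhaustion"},
            {"id": 312, "title": "Fixed retry on connection reset"},
            {"id": 489, "title": "Timeout handling in driver"}]

def timeline(anchor, depth_before=3, depth_after=3):
    # Layer 2: would return observations surrounding the anchor, in order.
    return ([f"before #{anchor}"] * depth_before
            + [f"#{anchor}"]
            + [f"after #{anchor}"] * depth_after)

def get_observations(ids):
    # Layer 3: would return full narrative/facts/files for each ID.
    return [{"id": i, "narrative": "..."} for i in ids]

index = search("error database connection", type="bugfix", limit=10)
relevant = [312, 489]  # chosen by reviewing the index titles
context = timeline(anchor=312, depth_before=3, depth_after=3)
details = get_observations(relevant)  # one batched call, not one call per ID
print(len(index), len(context), len(details))  # 3 7 2
```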
### Understanding Decisions
**Scenario:** Review architectural choices about authentication
```
Step 1: search(query="authentication", type="decision", limit=5)
→ Find decision observations
Step 2: get_observations(ids=[<relevant_ids>])
→ Get full decision rationale, trade-offs, facts
```
### Code Archaeology
**Scenario:** Find when a specific file was modified
```
Step 1: search(query="worker-service.ts", limit=20)
→ Get all observations mentioning that file
Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
→ See what led to and followed from the refactor
Step 3: get_observations(ids=[<specific_observation_ids>])
→ Get implementation details
```
### Feature History
**Scenario:** Track how a feature evolved
```
Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
→ Chronological view of feature work
Step 2: timeline(anchor=<first_observation_id>, depth_after=10)
→ See the full development timeline
Step 3: get_observations(ids=[<key_milestones>])
→ Deep dive on critical implementation points
```
### Learning from Past Work
**Scenario:** Review refactoring patterns
```
Step 1: search(type="refactor", limit=10, orderBy="date_desc")
→ Recent refactoring work
Step 2: get_observations(ids=[<interesting_ids>])
→ Study the patterns and approaches used
```
### Context Recovery
**Scenario:** Restore context after time away from project
```
Step 1: search(query="project-name", limit=10, orderBy="date_desc")
→ See recent work
Step 2: timeline(anchor=<most_recent_id>, depth_before=10)
→ Understand what led to current state
Step 3: get_observations(ids=[<critical_observations>])
→ Refresh memory on key decisions
```
## Search Query Syntax
The `query` parameter supports SQLite FTS5 full-text search syntax:
### Boolean Operators
```
query="authentication AND JWT" # Both terms must appear
query="OAuth OR JWT" # Either term can appear
query="security NOT deprecated" # Exclude deprecated items
```
### Phrase Searches
```
query='"database migration"' # Exact phrase match
```
### Column-Specific Searches
```
query="title:authentication" # Search in title only
query="content:database" # Search in content only
query="concepts:security" # Search in concepts only
```
### Combining Operators
```
query='"user auth" AND (JWT OR session) NOT deprecated'
```
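These operators are standard SQLite FTS5 syntax, so you can experiment with them outside claude-mem. A minimal sketch using Python's built-in `sqlite3` module with a throwaway in-memory table (assumes your SQLite build ships with FTS5 enabled, as most modern builds do; the table and rows here are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes USING fts5(title, content)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [
        ("JWT auth decision", "chose JWT authentication over sessions"),
        ("Deprecated OAuth flow", "old deprecated security approach"),
        ("Database migration", "ran the database migration script"),
    ],
)

# AND: both terms must appear somewhere in the row.
rows = conn.execute(
    "SELECT title FROM notes WHERE notes MATCH ?", ("authentication AND JWT",)
).fetchall()
print(rows)  # [('JWT auth decision',)]

# NOT: exclude rows containing the second term.
rows = conn.execute(
    "SELECT title FROM notes WHERE notes MATCH ?", ("security NOT deprecated",)
).fetchall()
print(rows)  # [] -- the only "security" row is deprecated

# Phrase search: double quotes inside the query string.
rows = conn.execute(
    "SELECT title FROM notes WHERE notes MATCH ?", ('"database migration"',)
).fetchall()
print(rows)  # [('Database migration',)]
```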
## Token Management
### Token Efficiency Best Practices
1. **Always start with search** - Get index first (~50-100 tokens/result)
2. **Use small limits** - Start with 3-5 results, increase if needed
3. **Filter before fetching** - Use type, date, project filters
4. **Batch get_observations** - Always group multiple IDs in one call
5. **Use timeline strategically** - Get context only when narrative matters
### Token Cost Estimates
| Operation | Tokens per Result |
|-----------|-------------------|
| search (index) | 50-100 |
| timeline (per observation) | 100-200 |
| get_observations (full details) | 500-1,000 |
**Example Comparison:**
**Inefficient:**
```
# Fetching 20 full observations upfront: 10,000-20,000 tokens
get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
```
**Efficient:**
```
# Search index: ~1,000 tokens
search(query="bug fix", limit=20)
# Review IDs, identify 3 relevant observations
# Fetch only relevant: ~1,500-3,000 tokens
get_observations(ids=[5, 12, 18])
# Total: 2,500-4,000 tokens (vs 10,000-20,000)
```
## Advanced Filtering
Refine searches using the `search` tool's filter parameters:
### Date Ranges
```
search(
query="performance optimization",
dateStart="2025-10-01",
dateEnd="2025-10-31"
)
```
### Multiple Types
To cover observations of multiple types, run one search per type or use a broader query:
```
search(query="database", type="bugfix", limit=10)
search(query="database", type="feature", limit=10)
```
### Project-Specific
```
search(query="API", project="my-app", limit=15)
```
### Pagination
```
# First page
search(query="refactor", limit=10, offset=0)
# Second page
search(query="refactor", limit=10, offset=10)
# Third page
search(query="refactor", limit=10, offset=20)
```
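When a result set spans more than a couple of pages, the offset pattern above generalizes to a loop. A sketch using a hypothetical `search` stub that mimics the tool's limit/offset behavior over a fake result set:

```python
# Hypothetical stand-in mimicking the search tool's limit/offset paging.
ALL_RESULTS = [f"#{i} refactor note" for i in range(23)]

def search(query, limit=10, offset=0):
    return ALL_RESULTS[offset:offset + limit]

pages = []
offset = 0
while True:
    page = search("refactor", limit=10, offset=offset)
    if not page:  # empty page means we've paged past the end
        break
    pages.append(page)
    offset += 10

print(len(pages), sum(len(p) for p in pages))  # 3 23
```

Stop as soon as the index rows you need have appeared; there is no reason to page through everything before moving to `get_observations`.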
## Result Metadata
All observations include rich metadata:
```
## JWT authentication decision
**Type**: decision
**Date**: 2025-10-21 14:23:45
**Concepts**: authentication, security, architecture
**Files Read**: src/auth/middleware.ts, src/utils/jwt.ts
**Files Modified**: src/auth/jwt-strategy.ts
**Narrative**:
Decided to implement JWT-based authentication instead of session-based
authentication for better scalability and stateless design...
**Facts**:
• JWT tokens expire after 1 hour
• Refresh tokens stored in httpOnly cookies
• Token signing uses RS256 algorithm
• Public keys rotated every 30 days
```
- **ID** - Unique observation identifier
- **Type** - bugfix, feature, decision, discovery, refactor, change
- **Date** - When the work occurred
- **Title** - Concise description
- **Concepts** - Tagged themes (e.g., security, performance, architecture)
- **Files Read** - Files examined during work
- **Files Modified** - Files changed during work
- **Narrative** - Story of what happened and why
- **Facts** - Key factual points (decisions made, patterns used, metrics)
## Troubleshooting
### No Results Found
1. **Broaden your search:**
```
# Too specific
search(query="JWT authentication implementation with RS256")
# Better
search(query="authentication")
```
2. **Check database has data:**
```bash
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
curl "http://localhost:37777/api/search?query=test"
```
3. **Try without filters:**
```
# Remove type/date filters to see if data exists
search(query="your-search-term")
```
### IDs Not Found in get_observations
**Error:** "Observation IDs not found: [123, 456]"
**Causes:**
- IDs from different project (use `project` parameter)
- IDs were deleted
- Typo in ID numbers
**Solution:**
```
# Verify IDs exist
search(query="<related-search>")
# Use correct project filter
get_observations(ids=[123, 456], project="correct-project-name")
```
### Token Limit Errors
**Error:** Response exceeds token limits
**Solution:** Use the 3-layer workflow to reduce upfront costs:
```
# Instead of fetching 50 full observations:
# get_observations(ids=[1,2,3,...,50]) # 25,000-50,000 tokens!
# Do this:
search(query="<your-query>", limit=50) # ~2,500-5,000 tokens
# Review index, identify 5 relevant observations
get_observations(ids=[<5-most-relevant>]) # ~2,500-5,000 tokens
# Total: 5,000-10,000 tokens (50-80% savings)
```
### Search Performance
If searches seem slow:
1. Be more specific in queries (helps FTS5 index)
2. Use date range filters to narrow scope
3. Specify project filter when possible
4. Use smaller limit values
## Best Practices
1. **Index First, Details Later** - Always start with search to survey options
2. **Filter Before Fetching** - Use search parameters to narrow results
3. **Batch ID Fetches** - Group multiple IDs in one get_observations call
4. **Use Timeline for Context** - When narrative matters, timeline shows the story
5. **Specific Queries** - More specific = better relevance
6. **Small Limits Initially** - Start with 3-5 results, expand if needed
7. **Review Before Deep Dive** - Check index before fetching full details
## Technical Details
**Architecture:** MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search.
**MCP Server:** Located at `~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`
**Worker Service:** Express API on port 37777, managed by Bun
**Database:** SQLite FTS5 full-text search on `~/.claude-mem/claude-mem.db`
**Vector Search:** Chroma embeddings for semantic search (underlying implementation)
## Next Steps
- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
- [Architecture Overview](/architecture/overview) - System components
- [Getting Started](/usage/getting-started) - Automatic operation
- [Database Schema](/architecture/database) - Understanding the data structure
- [Claude Desktop Setup](/usage/claude-desktop) - Installation and configuration