Refactor search documentation to implement a 3-layer workflow for memory retrieval; update tool names and usage examples for clarity and efficiency. Enhance troubleshooting section with new error handling and token management strategies.
@@ -248,6 +248,164 @@ search_observations({

---

## MCP Architecture Simplification (December 2025)

### The Problem: Complex MCP Implementation

**Before:**
```
9+ MCP tools registered at session start:
- search_observations
- find_by_type
- find_by_file
- find_by_concept
- get_recent_context
- get_observation
- get_session
- get_prompt
- help

Problems:
- Overlapping operations (search_observations vs find_by_type)
- Complex parameter schemas (~2,500 tokens in tool definitions)
- No built-in workflow guidance
- High cognitive load for Claude (which tool to use?)
- Code size: ~2,718 lines in mcp-server.ts
```

**The Insight:** Progressive disclosure should be built into tool design itself, not something Claude has to remember.

### The Solution: 3-Layer Workflow

**After:**
```
4 MCP tools following 3-layer workflow:

1. __IMPORTANT - Workflow documentation (always visible)
   "3-LAYER WORKFLOW (ALWAYS FOLLOW):
   1. search(query) → Get index with IDs
   2. timeline(anchor=ID) → Get context
   3. get_observations([IDs]) → Fetch details
   NEVER fetch full details without filtering first."

2. search - Layer 1: Get index with IDs (~50-100 tokens/result)
3. timeline - Layer 2: Get chronological context
4. get_observations - Layer 3: Fetch full details (~500-1,000 tokens/result)

Benefits:
- Progressive disclosure enforced by tool structure
- No overlapping operations
- Simple schemas (additionalProperties: true)
- Clear workflow pattern
- Code size: ~312 lines in mcp-server.ts (88% reduction)
- ~10x token savings
```

### Migration: Skill-Based Search Removed

**Previously:** Used skill-based search
- mem-search skill invoked via natural language
- HTTP API called directly via curl
- Progressive disclosure through skill loading
- 17 skill documentation files

**Now:** Removed skill-based approach
- MCP-only architecture
- Native MCP protocol (better Claude integration)
- Works with both Claude Desktop and Claude Code
- Simpler to maintain (no skill files)
- All 19 mem-search skill files removed (~2,744 lines)

### Key Architectural Changes

**MCP Server Refactor:**

Before:
```typescript
// Complex parameter schemas
{
  name: "search_observations",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "..." },
      type: { type: "array", items: { enum: [...] } },
      format: { enum: ["index", "full"] },
      limit: { type: "number", minimum: 1, maximum: 100 },
      // ... many more parameters
    }
  }
}
```

After:
```typescript
// Simple schemas with workflow guidance
{
  name: "search",
  description: "Step 1: Search memory. Returns index with IDs.",
  inputSchema: {
    type: "object",
    properties: {},
    additionalProperties: true  // Accept any parameters
  }
}
```

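A schema with `additionalProperties: true` no longer rejects malformed arguments, so any checking has to happen where the arguments are consumed. The following is a minimal sketch of that idea, not code from `mcp-server.ts`: `normalizeSearchArgs` and its types are hypothetical names, the default of 20 comes from the documented `limit` default, and the 1..100 clamp is borrowed from the old schema as an assumption.

```typescript
// Hypothetical helper: these names do not come from mcp-server.ts.
type RawArgs = Record<string, unknown>;

interface SearchArgs {
  query?: string;
  type?: string;
  limit: number;
}

// Coerce loosely typed tool arguments into a predictable shape.
function normalizeSearchArgs(raw: RawArgs): SearchArgs {
  const limit = Number(raw.limit ?? 20);
  return {
    query: typeof raw.query === "string" ? raw.query : undefined,
    type: typeof raw.type === "string" ? raw.type : undefined,
    // Assumed range 1..100 (taken from the old schema); fall back to 20 on bad input.
    limit: Number.isFinite(limit)
      ? Math.min(100, Math.max(1, Math.trunc(limit)))
      : 20,
  };
}

// Example: a string limit far above the cap is coerced and clamped.
const normalized = normalizeSearchArgs({ query: "authentication bug", limit: "500" });
```

The trade-off is that the tool definition stays small for Claude while malformed input is still handled somewhere predictable.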
**Workflow Enforcement:**

Before: Claude had to remember the progressive disclosure pattern

After: Tool structure makes it hard to skip steps
- Can't get details without IDs from search
- Can't search without seeing __IMPORTANT reminder
- Timeline provides middle ground (context without full details)

### Impact

**Token Efficiency:**
```
Traditional: Fetch 20 observations upfront
  → 10,000-20,000 tokens
  → Only 2 observations relevant (90% waste)

3-Layer Workflow:
  → search (20 results): ~1,000-2,000 tokens
  → Review index, identify 3 relevant IDs
  → get_observations (3 IDs): ~1,500-3,000 tokens
  → Total: 2,500-5,000 tokens (50-75% savings)
```

**Code Simplicity:**
- MCP server: 2,718 lines → 312 lines (88% reduction)
- Removed: 19 skill files (~2,744 lines)
- Net reduction: ~5,150 lines of code removed

**User Experience:**
- Same natural language interaction
- Better token efficiency
- Clearer architecture
- Works identically on Claude Desktop and Claude Code

### Design Philosophy

**Progressive Disclosure Through Structure:**

The 3-layer workflow embodies progressive disclosure at the architectural level:

1. **Layer 1 (Index)** - "What exists?" - Cheap survey of options
2. **Layer 2 (Timeline)** - "What was happening?" - Context around specific points
3. **Layer 3 (Details)** - "Tell me everything" - Full details only when justified

Each layer provides a decision point where Claude can:
- Stop if irrelevant
- Get more context if uncertain
- Dive deep if confident

This makes it structurally difficult to waste tokens.

---

## v1-v2: The Naive Approach

### The First Attempt: Dump Everything

@@ -1,448 +1,497 @@
---
title: "Search Architecture"
description: "mem-search skill with HTTP API and progressive disclosure"
description: "MCP tools with 3-layer workflow for token-efficient memory retrieval"
---

# Search Architecture

Claude-Mem uses a skill-based search architecture that provides intelligent memory retrieval through natural language queries. This replaced the MCP-based approach in v5.4.0 with a more efficient implementation. The skill was enhanced and renamed to "mem-search" in v5.5.0 for better scope differentiation.
Claude-mem uses an **MCP-based search architecture** that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern.

## Overview

**Architecture**: Skill-Based Search + HTTP API + Progressive Disclosure
**Architecture**: MCP Tools → MCP Protocol → HTTP API → Worker Service

**Key Components**:
1. **mem-search Skill** (`plugin/skills/mem-search/SKILL.md`) - Auto-invoked when users ask about past work
2. **HTTP API Endpoints** (10 routes) - Fast, efficient search operations on port 37777
3. **Worker Service** - Express.js server with FTS5 full-text search
4. **SQLite Database** - Persistent storage with FTS5 virtual tables
5. **Chroma Vector DB** - Semantic search with hybrid retrieval
1. **MCP Tools** (4 tools) - `search`, `timeline`, `get_observations`, `__IMPORTANT`
2. **MCP Server** (`plugin/scripts/mcp-server.cjs`) - Thin wrapper over HTTP API
3. **HTTP API Endpoints** - Fast search operations on Worker Service (port 37777)
4. **Worker Service** - Express.js server with FTS5 full-text search
5. **SQLite Database** - Persistent storage with FTS5 virtual tables
6. **Chroma Vector DB** - Semantic search with hybrid retrieval

**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with:
- Effectiveness increased from 67% to 100%
- Concrete triggers increased from 44% to 85%
- 5+ unique identifiers for better scope differentiation
- Comprehensive documentation (17 files, 12 operation guides)
**Token Efficiency**: ~10x savings through 3-layer workflow pattern

## How It Works

### 1. User Query (Natural Language)
### 1. User Query

Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:

```
User: "What bugs did we fix last session?"
Step 1: search(query="authentication bug", type="bugfix", limit=10)
Step 2: timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
Step 3: get_observations(ids=[123, 456, 789])
```

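The index-then-details part of this workflow can be sketched against an in-memory store. This is only an illustration under stated assumptions: the `store` contents are sample data, `search` and `getObservations` are hypothetical local stand-ins for the MCP tools of the same purpose, and the real tools query the worker service rather than an array.

```typescript
// Sample data and helper names are illustrative, not taken from claude-mem.
interface Observation {
  id: number;
  type: string;
  title: string;
  narrative: string;
}

interface IndexRow {
  id: number;
  type: string;
  title: string;
}

const store: Observation[] = [
  { id: 123, type: "bugfix", title: "Fixed auth token expiry", narrative: "Full write-up of the token fix" },
  { id: 456, type: "bugfix", title: "Resolved database connection leak", narrative: "Full write-up of the leak fix" },
  { id: 789, type: "feature", title: "Added auth rate limiting", narrative: "Full write-up of the rate limiter" },
];

// Layer 1: return a compact index (no narrative) for matching titles.
function search(query: string, limit = 20): IndexRow[] {
  const q = query.toLowerCase();
  return store
    .filter((o) => o.title.toLowerCase().includes(q))
    .slice(0, limit)
    .map(({ id, type, title }) => ({ id, type, title }));
}

// Layer 3: fetch full records only for the IDs chosen from the index.
function getObservations(ids: number[]): Observation[] {
  const wanted = new Set(ids);
  return store.filter((o) => wanted.has(o.id));
}

const index = search("auth");
const selected = index.filter((row) => row.type === "bugfix").map((row) => row.id);
const details = getObservations(selected);
```

Only the rows that survive the index review reach `getObservations`, which is where the larger per-result cost is paid.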
### 2. Skill Invocation
### 2. MCP Protocol

Claude recognizes the intent and invokes the mem-search skill:
- Skill frontmatter (~250 tokens) loaded at session start
- Full skill instructions loaded on-demand when skill is invoked
- Progressive disclosure pattern minimizes context overhead
- "mem-search" naming provides clear scope differentiation from native memory
MCP server receives tool call via JSON-RPC over stdio:

```json
{
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "authentication bug",
      "type": "bugfix",
      "limit": 10
    }
  }
}
```

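A request of this shape can be routed to a handler by tool name. The sketch below is an assumption about how such routing might look, not the server's actual code: `ToolCallRequest`, `handlers`, and `dispatch` are hypothetical names, and the handlers return descriptive strings instead of calling the worker.

```typescript
// Hypothetical dispatch table; real handlers would call the worker HTTP API.
interface ToolCallRequest {
  method: "tools/call";
  params: { name: string; arguments?: Record<string, unknown> };
}

type ToolHandler = (args: Record<string, unknown>) => string;

const handlers = new Map<string, ToolHandler>([
  ["search", (args) => `GET /api/search (${Object.keys(args).join(", ")})`],
  ["timeline", (args) => `GET /api/timeline (${Object.keys(args).join(", ")})`],
  ["get_observations", () => "POST /api/observations/batch"],
]);

// Look up the handler by tool name and pass the arguments through unchanged.
function dispatch(request: ToolCallRequest): string {
  const handler = handlers.get(request.params.name);
  if (!handler) {
    throw new Error(`Unknown tool: ${request.params.name}`);
  }
  return handler(request.params.arguments ?? {});
}

const routed = dispatch({
  method: "tools/call",
  params: {
    name: "search",
    arguments: { query: "authentication bug", type: "bugfix", limit: 10 },
  },
});
```

A `Map` is used rather than a plain object so that names such as `toString` cannot resolve to inherited properties.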
### 3. HTTP API Call

The skill uses `curl` to call the HTTP API:
MCP server translates to HTTP request:

```bash
curl "http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5"
```typescript
const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
const response = await fetch(url);
```

### 4. FTS5 Search
### 4. Worker Processing

Worker service queries SQLite FTS5 virtual tables:
Worker service executes FTS5 query:

```sql
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = 'bugfix'
ORDER BY rank
LIMIT 5
LIMIT 10
```

### 5. Results Formatted
### 5. Results Returned

Skill formats results and returns to Claude:
Worker returns structured data → MCP server → Claude:

```
## Recent Bugfixes

1. [bugfix] Fixed authentication token expiry
   Date: 2025-11-08 14:23:45
   Files: src/auth/jwt.ts

2. [bugfix] Resolved database connection leak
   Date: 2025-11-08 13:15:22
   Files: src/services/database.ts
```

### 6. User Sees Answer

Claude presents the formatted results naturally in conversation.

## Architecture Change (v5.4.0)

### Before: MCP-Based Search

**Approach**: 9 MCP tools registered at session start

**Token Cost**: ~2,500 tokens in tool definitions per session
- Each tool's schema, parameters, descriptions loaded
- All 9 tools available whether needed or not
- No progressive disclosure

**Example MCP Tool**:
```json
{
  "name": "search_observations",
  "description": "Full-text search across observations...",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "..." },
      "type": { "type": "array", "items": { "enum": [...] } },
      "format": { "enum": ["index", "full"] },
      // ... many more parameters
  "content": [{
    "type": "text",
    "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |"
  }]
}
```

### 6. Claude Processes Results

Claude reviews the index, decides which observations are relevant, and can:
- Use `timeline` to get context
- Use `get_observations` to fetch full details for selected IDs

## The 4 MCP Tools

### `__IMPORTANT` - Workflow Documentation

Always visible to Claude. Explains the 3-layer workflow pattern.

**Description:**
```
3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) → Get context around interesting results
3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs
NEVER fetch full details without filtering first. 10x token savings.
```

**Purpose:** Ensures Claude follows token-efficient pattern

### `search` - Search Memory Index

**Tool Definition:**
```typescript
{
  name: 'search',
  description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: true  // Accepts any parameters
  }
}
```

**HTTP Endpoint:** `GET /api/search`

**Parameters:**
- `query` - Full-text search query
- `limit` - Maximum results (default: 20)
- `type` - Filter by observation type
- `project` - Filter by project name
- `dateStart`, `dateEnd` - Date range filters
- `offset` - Pagination offset
- `orderBy` - Sort order

**Returns:** Compact index with IDs, titles, dates, types (~50-100 tokens per result)

### `timeline` - Get Chronological Context

**Tool Definition:**
```typescript
{
  name: 'timeline',
  description: 'Step 2: Get context around results. Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: true
  }
}
```

**HTTP Endpoint:** `GET /api/timeline`

**Parameters:**
- `anchor` - Observation ID to center timeline around (optional if query provided)
- `query` - Search query to find anchor automatically (optional if anchor provided)
- `depth_before` - Number of observations before anchor (default: 3)
- `depth_after` - Number of observations after anchor (default: 3)
- `project` - Filter by project name

**Returns:** Chronological view showing what happened before/during/after

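The `anchor`, `depth_before`, and `depth_after` parameters describe a window around one observation. A minimal sketch of that windowing follows, assuming the entries are already in chronological order; `timelineWindow` and `TimelineEntry` are hypothetical names, and the worker's actual query may differ.

```typescript
// Illustrative only: the worker service may select its window differently.
interface TimelineEntry {
  id: number;
  title: string;
}

// Return the anchor plus up to depthBefore earlier and depthAfter later entries.
function timelineWindow(
  entries: TimelineEntry[],
  anchor: number,
  depthBefore = 3,
  depthAfter = 3,
): TimelineEntry[] {
  const position = entries.findIndex((entry) => entry.id === anchor);
  if (position === -1) {
    return []; // Unknown anchor: nothing to show.
  }
  const start = Math.max(0, position - depthBefore);
  return entries.slice(start, position + depthAfter + 1);
}

// Ten sample entries with IDs 1..10, oldest first.
const sampleEntries: TimelineEntry[] = Array.from({ length: 10 }, (_, i) => ({
  id: i + 1,
  title: `Observation ${i + 1}`,
}));

const windowIds = timelineWindow(sampleEntries, 5, 2, 1).map((entry) => entry.id);
// windowIds is [3, 4, 5, 6]
```

Near the start or end of the history the window is simply shorter, since the slice is clipped to the available entries.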
### `get_observations` - Fetch Full Details

**Tool Definition:**
```typescript
{
  name: 'get_observations',
  description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project',
  inputSchema: {
    type: 'object',
    properties: {
      ids: {
        type: 'array',
        items: { type: 'number' },
        description: 'Array of observation IDs to fetch (required)'
      }
    },
    required: ['ids'],
    additionalProperties: true
  }
}
```

**HTTP Endpoint:** `POST /api/observations/batch`

**Body:**
```json
{
  "ids": [123, 456, 789],
  "orderBy": "date_desc",
  "project": "my-app"
}
```

**Returns:** Complete observation details (~500-1,000 tokens per observation)

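Because `ids` is the only required field, a caller can build the batch request from the filtered IDs and a few optional settings. The sketch below assumes the endpoint and body shape shown above; `buildBatchRequest` and its types are hypothetical, and the de-duplication step is an added convenience rather than documented behavior.

```typescript
// Hypothetical request builder for POST /api/observations/batch.
interface BatchOptions {
  orderBy?: string;
  project?: string;
}

interface BatchRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildBatchRequest(ids: number[], options: BatchOptions = {}): BatchRequest {
  // Drop duplicates and non-integer values before sending (an assumption, not documented behavior).
  const unique = Array.from(new Set(ids)).filter((id) => Number.isInteger(id));
  if (unique.length === 0) {
    throw new Error("ids must contain at least one integer ID");
  }
  return {
    url: "http://localhost:37777/api/observations/batch",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ ids: unique, ...options }),
  };
}

const batchRequest = buildBatchRequest([123, 456, 123], { project: "my-app" });
```

Sending one request for several IDs matches the batching behavior described for `get_observations`.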
## MCP Server Implementation

**Location:** `/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`

**Role:** Thin wrapper that translates MCP protocol to HTTP API calls

**Key Characteristics:**
- ~312 lines of code (reduced from ~2,718 lines in old implementation)
- No business logic - just protocol translation
- Single source of truth: Worker HTTP API
- Simple schemas with `additionalProperties: true`

**Handler Example:**
```typescript
{
  name: 'search',
  handler: async (args: any) => {
    const endpoint = '/api/search';
    const searchParams = new URLSearchParams();

    for (const [key, value] of Object.entries(args)) {
      searchParams.append(key, String(value));
    }

    const url = `http://localhost:37777${endpoint}?${searchParams}`;
    const response = await fetch(url);
    return await response.json();
  }
}
```

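A slightly more defensive variant of the handler above might skip missing values and surface HTTP errors instead of parsing an error body as results. This is a sketch under assumptions, not the shipped handler: `buildQueryUrl`, `callWorker`, and the `FetchLike` type are hypothetical, and the fetch function is injected so the URL logic can be exercised without a running worker.

```typescript
// Hypothetical hardening of the handler; names are not from mcp-server.ts.
type Args = Record<string, unknown>;

// Minimal shape of the fetch API this sketch relies on.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (url: string) => Promise<HttpResponse>;

const WORKER_BASE = "http://localhost:37777";

// Build the query string, skipping undefined/null so "limit=undefined" is never sent.
function buildQueryUrl(endpoint: string, args: Args): string {
  const pairs: string[] = [];
  for (const [key, value] of Object.entries(args)) {
    if (value === undefined || value === null) continue;
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  const base = `${WORKER_BASE}${endpoint}`;
  return pairs.length > 0 ? `${base}?${pairs.join("&")}` : base;
}

// Call the worker and fail loudly on non-2xx responses.
async function callWorker(fetchImpl: FetchLike, endpoint: string, args: Args): Promise<unknown> {
  const response = await fetchImpl(buildQueryUrl(endpoint, args));
  if (!response.ok) {
    throw new Error(`Worker returned HTTP ${response.status} for ${endpoint}`);
  }
  return response.json();
}

const exampleUrl = buildQueryUrl("/api/search", {
  query: "authentication bug",
  type: "bugfix",
  limit: 10,
  project: undefined,
});
```

In a Node runtime that provides a global `fetch`, it could be passed as `fetchImpl`; that wiring is left out here.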
## Worker HTTP API

**Location:** `src/services/worker-service.ts`

**Port:** 37777

**Search Endpoints:**
```
GET  /api/search              # Main search (used by MCP search tool)
GET  /api/timeline            # Timeline context (used by MCP timeline tool)
POST /api/observations/batch  # Fetch by IDs (used by MCP get_observations tool)
GET  /api/health              # Health check
```

**Database Access:**
- Uses `SessionSearch` service for FTS5 queries
- Uses `SessionStore` for structured queries
- Hybrid search with ChromaDB for semantic similarity

**FTS5 Full-Text Search:**
```sql
-- search tool → HTTP GET → FTS5 query
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
  AND type = ?
  AND date >= ? AND date <= ?
ORDER BY rank
LIMIT ? OFFSET ?
```

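Since `type` and the date range are optional filters, the statement above is presumably assembled conditionally. The following sketch shows one way to do that while keeping every value in a bound parameter; `buildSearchSql` and `SearchFilters` are hypothetical names, and the real `SessionSearch` service may build its query differently.

```typescript
// Illustrative query builder; not the SessionSearch implementation.
interface SearchFilters {
  query: string;
  type?: string;
  dateStart?: string;
  dateEnd?: string;
  limit?: number;
  offset?: number;
}

interface BuiltQuery {
  sql: string;
  params: Array<string | number>;
}

// Add one "column op ?" clause per supplied filter; values never enter the SQL text.
function buildSearchSql(filters: SearchFilters): BuiltQuery {
  const where: string[] = ["observations_fts MATCH ?"];
  const params: Array<string | number> = [filters.query];

  if (filters.type !== undefined) {
    where.push("type = ?");
    params.push(filters.type);
  }
  if (filters.dateStart !== undefined) {
    where.push("date >= ?");
    params.push(filters.dateStart);
  }
  if (filters.dateEnd !== undefined) {
    where.push("date <= ?");
    params.push(filters.dateEnd);
  }

  // Defaults: 20 results (the documented limit default), starting at offset 0.
  params.push(filters.limit ?? 20, filters.offset ?? 0);

  const sql = `SELECT * FROM observations_fts WHERE ${where.join(" AND ")} ORDER BY rank LIMIT ? OFFSET ?`;
  return { sql, params };
}

const built = buildSearchSql({ query: "auth", type: "bugfix" });
```

The returned `sql` and `params` would then be handed to whatever SQLite binding the worker uses.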
## The 3-Layer Workflow Pattern

### Design Philosophy

The 3-layer workflow embodies **progressive disclosure** - a core principle of claude-mem's architecture.

**Layer 1: Index (Search)**
- **What:** Compact table with IDs, titles, dates, types
- **Cost:** ~50-100 tokens per result
- **Purpose:** Survey what exists before committing tokens
- **Decision Point:** "Which observations are relevant?"

**Layer 2: Context (Timeline)**
- **What:** Chronological view of observations around a point
- **Cost:** Variable based on depth
- **Purpose:** Understand narrative arc, see what led to/from a point
- **Decision Point:** "Do I need full details?"

**Layer 3: Details (Get Observations)**
- **What:** Complete observation data (narrative, facts, files, concepts)
- **Cost:** ~500-1,000 tokens per observation
- **Purpose:** Deep dive on validated, relevant observations
- **Decision Point:** "Apply knowledge to current task"

### Token Efficiency

**Traditional RAG Approach:**
```
Fetch 20 observations upfront: 10,000-20,000 tokens
Relevance: ~10% (only 2 observations actually useful)
Waste: 9,000-18,000 tokens on irrelevant context
```

**3-Layer Workflow:**
```
Step 1: search (20 results)           ~1,000-2,000 tokens
Step 2: Review index, filter to 3 relevant IDs
Step 3: get_observations (3 IDs)      ~1,500-3,000 tokens
Total: 2,500-5,000 tokens (50-75% savings)
```

**~10x Per-Result Savings:** An index entry costs ~50-100 tokens versus ~500-1,000 for full details, so filtering at the index level before fetching full details avoids most of that cost.

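The figures in this comparison follow from the per-result costs quoted for each layer. The short calculation below reproduces them; the `Range` type and helper names are illustrative, and the inputs are the document's own estimates (50-100 tokens per index row, 500-1,000 per full observation, 20 search results, 3 selected IDs).

```typescript
// Reproduces the token arithmetic above from the per-result estimates.
interface Range {
  min: number;
  max: number;
}

const INDEX_COST: Range = { min: 50, max: 100 };    // tokens per index row
const DETAIL_COST: Range = { min: 500, max: 1000 }; // tokens per full observation

function scale(range: Range, count: number): Range {
  return { min: range.min * count, max: range.max * count };
}

function add(a: Range, b: Range): Range {
  return { min: a.min + b.min, max: a.max + b.max };
}

// Fraction of tokens saved when `after` replaces `before`.
function savings(before: number, after: number): number {
  return 1 - after / before;
}

const upfront = scale(DETAIL_COST, 20);                            // 10,000-20,000
const layered = add(scale(INDEX_COST, 20), scale(DETAIL_COST, 3)); // 2,500-5,000

const worstCase = savings(upfront.min, layered.max); // 0.5
const bestCase = savings(upfront.max, layered.max);  // 0.75
const perResultRatio = DETAIL_COST.min / INDEX_COST.min; // 10
```

The 50-75% figure is the end-to-end saving for this scenario, while the roughly 10x figure is the cost ratio between one full observation and one index row.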
## Architecture Evolution

### Before: Complex MCP Implementation

**Approach:** 9 MCP tools with detailed parameter schemas

**Token Cost:** ~2,500 tokens in tool definitions per session
- `search_observations` - Full-text search
- `find_by_type` - Filter by type
- `find_by_file` - Filter by file
- `find_by_concept` - Filter by concept
- `get_recent_context` - Recent sessions
- `get_observation` - Fetch single observation
- `get_session` - Fetch session
- `get_prompt` - Fetch prompt
- `help` - API documentation

**Problems:**
- Overlapping operations (search_observations vs find_by_type)
- Complex parameter schemas
- No built-in workflow guidance
- High token cost at session start

**Code Size:** ~2,718 lines in mcp-server.ts

### After: Streamlined MCP Implementation

**Approach:** 4 MCP tools following 3-layer workflow

**Token Cost:** Simplified tool definitions (the server itself is ~312 lines of code)

**Tools:**
1. `__IMPORTANT` - Workflow guidance (always visible)
2. `search` - Step 1 (index)
3. `timeline` - Step 2 (context)
4. `get_observations` - Step 3 (details)

**Benefits:**
- Progressive disclosure built into tool design
- No overlapping operations
- Simple schemas (`additionalProperties: true`)
- Clear workflow pattern
- ~10x token savings

**Code Size:** ~312 lines in mcp-server.ts (88% reduction)

### Key Insight

**Before:** Progressive disclosure was something Claude had to remember

**After:** Progressive disclosure is enforced by tool design itself

The 3-layer workflow pattern makes it structurally difficult to waste tokens:
- Can't fetch details without first getting IDs from search
- Can't search without seeing workflow reminder (`__IMPORTANT`)
- Timeline provides middle ground between index and full details

## Configuration

### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "mcp-search": {
      "command": "node",
      "args": [
        "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs"
      ]
    }
  }
}
```

### After: Skill-Based Search
### Claude Code

**Approach**: 1 mem-search skill with progressive disclosure
MCP server is automatically configured via plugin installation. No manual setup required.

**Token Cost**: ~250 tokens in skill frontmatter per session
- Only skill description loaded at session start
- Full instructions loaded on-demand when skill is invoked
- HTTP API endpoints instead of MCP protocol
**Both clients use the same MCP tools** - the architecture works identically for Claude Desktop and Claude Code.

**Example Skill Frontmatter**:
```markdown
# Claude-Mem mem-search Skill
## Security

Access claude-mem's persistent memory through a comprehensive HTTP API.
Search for past work, understand context, and learn from previous decisions.
### FTS5 Injection Prevention

## When to Use This Skill
All search queries are escaped before FTS5 processing:

Invoke this skill when users ask about:
- Past work: "What did we do last session?"
- Bug fixes: "Did we fix this before?"
- Features: "How did we implement authentication?"
...
```

**Token Efficiency**: Minimal frontmatter at session start with progressive disclosure

## HTTP API Endpoints

The worker service exposes 10 search endpoints:

### Full-Text Search

```
GET /api/search/observations
GET /api/search/sessions
GET /api/search/prompts
```

**Parameters**:
- `query` - FTS5 search query (required)
- `type` - Filter by type (bugfix, feature, refactor, etc.)
- `project` - Filter by project name
- `limit` - Maximum results (default: 20)
- `offset` - Pagination offset
- `format` - Response format (index or full)

**Example**:
```bash
curl "http://localhost:37777/api/search/observations?query=authentication&type=decision&limit=5"
```

### Filtered Search

```
GET /api/search/by-type
GET /api/search/by-concept
GET /api/search/by-file
```

**Parameters**:
- `type` / `concept` / `filePath` - Filter criteria (required)
- `project` - Filter by project
- `limit` - Maximum results
- `format` - Response format

**Example**:
```bash
curl "http://localhost:37777/api/search/by-file?filePath=worker-service.ts&limit=10"
```

### Context Retrieval

```
GET /api/context/recent
GET /api/context/timeline
GET /api/timeline/by-query
```

**Parameters**:
- `project` - Filter by project
- `limit` - Number of sessions/records
- `anchor` - Timeline anchor point (ID or timestamp)
- `depth_before` - Records before anchor
- `depth_after` - Records after anchor

**Example**:
```bash
curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=5"
```

### Documentation

```
GET /api/search/help
```

Returns API documentation in JSON format.

## Progressive Disclosure Pattern

The mem-search skill uses progressive disclosure to minimize token usage:

### Layer 1: Skill Frontmatter (Session Start)

**What's Loaded**: Skill description and when to use it (~250 tokens)

**Purpose**: Claude can recognize when to invoke the skill

**Example**:
```markdown
# Claude-Mem mem-search Skill

Access claude-mem's persistent memory through a comprehensive HTTP API.

## When to Use This Skill
Invoke this skill when users ask about:
- Past work: "What did we do last session?"
- Bug fixes: "Did we fix this before?"
...
```

### Layer 2: Full Skill Instructions (On-Demand)

**What's Loaded**: Complete operation documentation (~2,500 tokens)

**Purpose**: Detailed instructions for each search operation

**When Loaded**: Only when Claude invokes the skill

**Example Structure**:
```
/skills/search/
├── SKILL.md (main frontmatter)
├── operations/
│   ├── observations.md (detailed instructions)
│   ├── sessions.md
│   ├── prompts.md
│   ├── by-type.md
│   ├── by-concept.md
│   ├── by-file.md
│   ├── recent-context.md
│   ├── timeline.md
│   ├── timeline-by-query.md
│   ├── help.md
│   ├── formatting.md
│   └── common-workflows.md
```

### Layer 3: API Response

**What's Returned**: Search results in requested format

**Format Options**:
- `index` - Titles, dates, IDs only (~50-100 tokens per result)
- `full` - Complete details (~500-1000 tokens per result)

**Progressive Usage**: Start with `index`, drill down with `full` as needed

## Implementation Details

### mem-search Skill Structure

```
plugin/skills/mem-search/
├── SKILL.md                    # Main frontmatter (~250 tokens)
├── operations/
│   ├── observations.md         # Search observations
│   ├── sessions.md             # Search sessions
│   ├── prompts.md              # Search prompts
│   ├── by-type.md              # Filter by type
│   ├── by-concept.md           # Filter by concept
│   ├── by-file.md              # Filter by file
│   ├── recent-context.md       # Get recent context
│   ├── timeline.md             # Timeline around point
│   ├── timeline-by-query.md    # Search + timeline
│   ├── help.md                 # API documentation
│   ├── formatting.md           # Result formatting guide
│   └── common-workflows.md     # Usage patterns
```

### Worker Service Integration

**File**: `src/services/worker-service.ts`

**Search Routes**:
```typescript
// Full-text search
app.get('/api/search/observations', handleSearchObservations);
app.get('/api/search/sessions', handleSearchSessions);
app.get('/api/search/prompts', handleSearchPrompts);

// Filtered search
app.get('/api/search/by-type', handleSearchByType);
app.get('/api/search/by-concept', handleSearchByConcept);
app.get('/api/search/by-file', handleSearchByFile);

// Context retrieval
app.get('/api/context/recent', handleRecentContext);
app.get('/api/context/timeline', handleTimeline);
app.get('/api/timeline/by-query', handleTimelineByQuery);

// Documentation
app.get('/api/search/help', handleHelp);
```

**Database Access**:
- Uses `SessionSearch` service for FTS5 queries
- Uses `SessionStore` for structured queries
- Hybrid search with ChromaDB for semantic similarity

### Security

**FTS5 Injection Prevention** (v4.2.3):
```typescript
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}
```

All user-provided search queries are properly escaped to prevent SQL injection.
**Testing:** 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators.

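Doubling embedded quotes is the SQL-style escape that FTS5 accepts inside a double-quoted string, so it is most useful when each term is also wrapped in quotes. The sketch below illustrates that pairing; `toFts5Match` is a hypothetical helper and not necessarily how claude-mem applies `escapeFTS5Query`.

```typescript
// Same escaping rule as shown above: double any embedded double quotes.
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}

// Hypothetical wrapper: quote each whitespace-separated term so FTS5 reads it
// as a string literal rather than as an operator such as OR or NOT.
function toFts5Match(query: string): string {
  return query
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${escapeFTS5Query(term)}"`)
    .join(" ");
}

const matchExpression = toFts5Match('auth OR "x');
// matchExpression is: "auth" "OR" """x"
```

The resulting expression is still passed as a bound parameter to `MATCH ?`, so the quoting guards the FTS5 query syntax while parameter binding guards the SQL statement.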
**Comprehensive Testing**: 332 injection attack tests covering:
- Special characters
- SQL keywords
- Quote escaping
- Boolean operators
### MCP Protocol Security

## Benefits
- Stdio transport (no network exposure)
- Local-only HTTP API (localhost:37777)
- No authentication needed (local development only)

### 1. Token Efficiency
## Performance

**Before (MCP)**:
- Session start: All tool definitions loaded upfront
- Every session pays this cost
- No progressive disclosure
**FTS5 Full-Text Search:** <10ms for typical queries

**After (Skill)**:
- Session start: Minimal token cost for skill frontmatter
- Full instructions loaded only when invoked (progressive disclosure)
- More efficient than loading all tool definitions upfront
**MCP Overhead:** Minimal - simple protocol translation

### 2. Natural Language Interface
**Caching:** HTTP layer allows response caching (future enhancement)

**Before**: Users needed to learn MCP tool syntax
```
search_observations with query="authentication" and type="decision"
```
**Pagination:** Efficient with offset/limit

**After**: Users ask naturally
```
"What decisions did we make about authentication?"
```
**Batching:** `get_observations` accepts multiple IDs in single call

Claude translates to appropriate API call.
## Benefits Over Alternative Approaches

### 3. Flexibility
### vs. Traditional RAG

**HTTP API Benefits**:
- Can be called from skills, MCP tools, or other clients
- Easy to test with curl
- Standard REST conventions
- JSON responses
**Traditional RAG:**
- Fetches everything upfront
- High token cost
- Low relevance ratio

**Progressive Disclosure**:
- Loads only what's needed
- Can add more operations without increasing base cost
- Documentation co-located with operations
**3-Layer MCP:**
- Fetches only what's needed
- ~10x token savings
- 100% relevance (Claude chooses what to fetch)

### 4. Performance
### vs. Previous MCP Implementation (v5.x)

**Fast Queries**: FTS5 full-text search under 10ms for typical queries
**Previous (9 tools):**
- Complex schemas
- Overlapping operations
- No workflow guidance
- ~2,500 tokens in definitions

**Caching**: HTTP layer allows response caching
**Current (4 tools):**
- Simple schemas
- Clear workflow
- Built-in guidance
- ~312 lines of code

**Pagination**: Efficient result pagination with offset/limit
### vs. Skill-Based Approach (Previously)

## Migration Notes
|
||||
**Skill approach:**
|
||||
- Required separate skill files
|
||||
- HTTP API called directly via curl
|
||||
- Progressive disclosure through skill loading
|
||||
|
||||
### For Users
|
||||
**MCP approach:**
|
||||
- Native MCP protocol (better Claude integration)
|
||||
- Cleaner architecture (protocol translation layer)
|
||||
- Works with both Claude Desktop and Claude Code
|
||||
- Simpler to maintain (no skill files)
|
||||
|
||||
**No Action Required**: The migration from MCP to skill-based search is transparent.
|
||||
|
||||
**Same Questions Work**: Natural language queries work exactly the same way.
|
||||
|
||||
**Invisible Change**: Users won't notice any difference except better performance.
|
||||
|
||||
### For Developers
|
||||
|
||||
**Renamed**: MCP server (formerly `search-server.ts`, now `src/servers/mcp-server.ts`)
|
||||
- Source file kept for reference
|
||||
- No longer built or registered
|
||||
- MCP configuration removed from `plugin/.mcp.json`
|
||||
|
||||
**New Implementation**: Skill-based search
|
||||
- Skill files: `plugin/skills/mem-search/`
|
||||
- HTTP endpoints: `src/services/worker-service.ts` (lines 200-400)
|
||||
- Build script: `npm run build` includes skill files
|
||||
- Sync script: `npm run sync-marketplace` copies to plugin directory
|
||||
**Migration:** Skill-based search was removed in favor of streamlined MCP architecture.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### MCP Server Not Connected
|
||||
|
||||
**Symptoms:** Tools not appearing in Claude
|
||||
|
||||
**Solution:**
|
||||
1. Check MCP server path in configuration
|
||||
2. Verify worker service is running: `curl http://localhost:37777/api/health`
|
||||
3. Restart Claude Desktop/Code
|
||||
|
||||
### Worker Service Not Running
|
||||
|
||||
If searches fail, check worker service:
|
||||
**Symptoms:** MCP tools fail with connection errors
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
npm run worker:status   # Check status
npm run worker:restart  # Restart worker
npm run worker:logs     # View logs
|
||||
```
|
||||
|
||||
### HTTP Endpoints Not Responding
|
||||
### Empty Search Results
|
||||
|
||||
Test endpoints directly:
|
||||
**Symptoms:** search() returns no results
|
||||
|
||||
```bash
|
||||
# Health check
|
||||
curl http://localhost:37777/health
|
||||
|
||||
# Search test
|
||||
curl "http://localhost:37777/api/search/observations?query=test&limit=1"
|
||||
```
|
||||
|
||||
### Skill Not Invoking
|
||||
|
||||
If Claude doesn't invoke the mem-search skill automatically:
|
||||
|
||||
1. Check skill files exist: `ls ~/.claude/plugins/marketplaces/thedotmack/plugin/skills/mem-search/`
|
||||
2. Restart Claude Code session to reload skill definitions
|
||||
3. Try more explicit phrasing: "Search past sessions for bug fixes" or "What did we do in yesterday's session?"
|
||||
4. Ensure your question is about previous sessions (not current conversation context)
|
||||
**Troubleshooting:**
|
||||
1. Test API directly: `curl "http://localhost:37777/api/search?query=test"`
|
||||
2. Check database: `ls ~/.claude-mem/claude-mem.db`
|
||||
3. Verify observations exist: `curl "http://localhost:37777/api/health"`
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Search Tools Usage](/usage/search-tools) - User guide with examples
|
||||
- [Memory Search Usage](/usage/search-tools) - User guide with examples
|
||||
- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
|
||||
- [Worker Service Architecture](/architecture/worker-service) - HTTP API details
|
||||
- [Database Schema](/architecture/database) - FTS5 tables and indexes
|
||||
|
||||
@@ -260,14 +260,12 @@ The index is useless without retrieval mechanisms:
|
||||
*Use claude-mem MCP search to access records with the given ID*
|
||||
```
|
||||
|
||||
**Available tools:**
|
||||
- `search_observations` - Full-text search
|
||||
- `find_by_concept` - Concept-based retrieval
|
||||
- `find_by_file` - File-based retrieval
|
||||
- `find_by_type` - Type-based retrieval
|
||||
- `get_recent_context` - Recent session summaries
|
||||
**Available MCP tools:**
|
||||
- `search` - Search memory index (Layer 1: Get IDs)
|
||||
- `timeline` - Get chronological context (Layer 2: See narrative arc)
|
||||
- `get_observations` - Fetch full details (Layer 3: Deep dive)
|
||||
|
||||
Each tool supports `format: "index"` (default) and `format: "full"`.
|
||||
The 3-layer workflow ensures progressive disclosure: index → context → details.
|
||||
|
||||
---
|
||||
|
||||
@@ -318,16 +316,18 @@ Is my task related to npm? → YES
|
||||
|
||||
---
|
||||
|
||||
## The Two-Tier Search Strategy
|
||||
## The Three-Layer Workflow
|
||||
|
||||
Claude-Mem implements progressive disclosure in search results too:
|
||||
Claude-Mem implements progressive disclosure through a 3-layer workflow pattern:
|
||||
|
||||
### Tier 1: Index Format (Default)
|
||||
### Layer 1: Search (Index)
|
||||
|
||||
Start by searching to get a compact index with IDs:
|
||||
|
||||
```typescript
|
||||
search_observations({
|
||||
search({
|
||||
query: "hook timeout",
|
||||
format: "index" // Default
|
||||
limit: 10
|
||||
})
|
||||
```
|
||||
|
||||
@@ -335,23 +335,40 @@ search_observations({
|
||||
```
|
||||
Found 3 observations matching "hook timeout":
|
||||
|
||||
| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
|
||||
| ID | Date | Type | Title |
|----|------|------|-------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI |
|
||||
```
|
||||
|
||||
**Cost:** ~100 tokens for 3 results
|
||||
**Value:** Agent can scan and decide which to fetch
|
||||
**Cost:** ~50-100 tokens per result
|
||||
**Value:** Agent can scan and decide which observations are relevant
|
||||
|
||||
### Tier 2: Full Format (On-Demand)
|
||||
### Layer 2: Timeline (Context)
|
||||
|
||||
Get chronological context around interesting observations:
|
||||
|
||||
```typescript
|
||||
search_observations({
|
||||
query: "hook timeout",
|
||||
format: "full",
|
||||
limit: 1 // Fetch just the most relevant
|
||||
timeline({
|
||||
anchor: 2543, // Observation ID from search
|
||||
depth_before: 3,
|
||||
depth_after: 3
|
||||
})
|
||||
```
|
||||
|
||||
**Returns:** Chronological view showing what happened before/during/after observation #2543
|
||||
|
||||
**Cost:** Variable based on depth
|
||||
**Value:** Understand narrative arc and context
|
||||
|
||||
### Layer 3: Get Observations (Details)
|
||||
|
||||
Fetch full details only for relevant observations:
|
||||
|
||||
```typescript
|
||||
get_observations({
  ids: [2543, 2102]  // Selected from search results
})
|
||||
```
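
The three layers can be chained in one helper. The following is a minimal sketch, not the actual claude-mem client: `MemoryTools`, `IndexRow`, `Observation`, and `threeLayerLookup` are hypothetical names, the tool stand-ins are synchronous for brevity (real MCP calls would be asynchronous), and the response shapes are assumptions.

```typescript
// Hypothetical shapes for illustration; the real tool responses may differ.
interface IndexRow {
  id: number;
  type: string;
  title: string;
}

interface Observation {
  id: number;
  narrative: string;
}

// Synchronous stand-ins for the three MCP tools (real calls would be async).
interface MemoryTools {
  search(args: { query: string; limit?: number }): IndexRow[];
  timeline(args: { anchor: number; depth_before?: number; depth_after?: number }): IndexRow[];
  get_observations(args: { ids: number[] }): Observation[];
}

interface LookupResult {
  index: IndexRow[];
  context: IndexRow[];
  details: Observation[];
}

function threeLayerLookup(
  tools: MemoryTools,
  query: string,
  isRelevant: (row: IndexRow) => boolean,
  maxDetails: number = 3
): LookupResult {
  // Layer 1: cheap index of candidates.
  const index = tools.search({ query, limit: 10 });
  const selected = index.filter(isRelevant).slice(0, maxDetails);
  const first = selected[0];
  if (first === undefined) {
    // Nothing relevant: stop before spending tokens on context or details.
    return { index, context: [], details: [] };
  }
  // Layer 2: chronological context around the top candidate.
  const context = tools.timeline({ anchor: first.id, depth_before: 3, depth_after: 3 });
  // Layer 3: one batched call, for the selected IDs only.
  const details = tools.get_observations({ ids: selected.map((row) => row.id) });
  return { index, context, details };
}
```

The early return is the point of the design: when the index contains nothing relevant, the more expensive layers are never called.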
|
||||
|
||||
@@ -463,29 +480,30 @@ Here are 10 observations.
|
||||
*Use MCP search tools to fetch full observation details on-demand*
|
||||
```
|
||||
|
||||
### ❌ Defaulting to Full Format
|
||||
### ❌ Skipping the Index Layer
|
||||
|
||||
**Bad:**
|
||||
```typescript
|
||||
search_observations({
|
||||
query: "hooks",
|
||||
format: "full" // Fetches everything
|
||||
// Fetching full details immediately
|
||||
get_observations({
|
||||
ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] // Guessing which are relevant
|
||||
})
|
||||
```
|
||||
|
||||
**Good:**
|
||||
```typescript
|
||||
search_observations({
|
||||
// Follow the 3-layer workflow
|
||||
// Layer 1: Search for index
|
||||
search({
|
||||
query: "hooks",
|
||||
format: "index", // Scan first
|
||||
limit: 20
|
||||
})
|
||||
|
||||
// Then, if needed:
|
||||
search_observations({
|
||||
query: "hooks",
|
||||
format: "full",
|
||||
limit: 1 // Just the most relevant
|
||||
// Review the index and identify the 2-3 relevant IDs (Layer 2, timeline, is optional here)
|
||||
|
||||
// Layer 3: Fetch only relevant observations
|
||||
get_observations({
|
||||
ids: [2543, 2891] // Just the most relevant
|
||||
})
|
||||
```
|
||||
|
||||
@@ -595,10 +613,9 @@ SessionStart({ source: "compact" }):
|
||||
|
||||
```typescript
|
||||
// Use embeddings to pre-sort index by relevance
|
||||
search_observations({
|
||||
search({
|
||||
query: "authentication bug",
|
||||
format: "index",
|
||||
sort: "relevance" // Based on semantic similarity
|
||||
orderBy: "relevance" // Based on semantic similarity (future enhancement)
|
||||
})
|
||||
```
|
||||
|
||||
|
||||
@@ -742,17 +742,17 @@ sqlite3 ~/.claude-mem/claude-mem.db "
|
||||
|
||||
3. Test simple query:
|
||||
```bash
|
||||
# In Claude Code
|
||||
search_observations with query="test"
|
||||
# Test MCP search tool
|
||||
search(query="test", limit=5)
|
||||
```
|
||||
|
||||
4. Check query syntax:
|
||||
```bash
|
||||
# Bad: Special characters
|
||||
search_observations with query="[test]"
|
||||
# Bad: Special characters may cause issues
|
||||
search(query="[test]")
|
||||
|
||||
# Good: Simple words
|
||||
search_observations with query="test"
|
||||
search(query="test")
|
||||
```
|
||||
|
||||
### Token Limit Errors
|
||||
@@ -761,28 +761,40 @@ sqlite3 ~/.claude-mem/claude-mem.db "
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. Use index format:
|
||||
1. Follow the 3-layer workflow (don't skip straight to get_observations):
|
||||
```bash
|
||||
search_observations with query="..." and format="index"
|
||||
# Start with search to get index
|
||||
search(query="...", limit=10)
|
||||
|
||||
# Review IDs, then fetch only relevant ones
|
||||
get_observations(ids=[<2-3 relevant IDs>])
|
||||
```
|
||||
|
||||
2. Reduce limit:
|
||||
2. Reduce limit in search:
|
||||
```bash
|
||||
search_observations with query="..." and limit=3
|
||||
search(query="...", limit=3)
|
||||
```
|
||||
|
||||
3. Use filters to narrow results:
|
||||
```bash
|
||||
search_observations with query="..." and type="decision" and limit=5
|
||||
search(query="...", type="decision", limit=5)
|
||||
```
|
||||
|
||||
4. Paginate results:
|
||||
```bash
|
||||
# First page
|
||||
search_observations with query="..." and limit=5 and offset=0
|
||||
search(query="...", limit=5, offset=0)
|
||||
|
||||
# Second page
|
||||
search_observations with query="..." and limit=5 and offset=5
|
||||
search(query="...", limit=5, offset=5)
|
||||
```
|
||||
|
||||
5. Batch IDs in get_observations:
|
||||
```bash
|
||||
# Always batch multiple IDs in one call
|
||||
get_observations(ids=[123, 456, 789])
|
||||
|
||||
# Don't make separate calls per ID
|
||||
```
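
When IDs come from several searches, they can be merged into one request before calling `get_observations`. A minimal sketch, assuming the `{ ids }` argument shape used in the examples here; `buildBatchedRequest` is a hypothetical helper, not part of claude-mem:

```typescript
// Merge ID lists gathered from several searches into one batched request.
// Duplicates are dropped and first-seen order is preserved.
function buildBatchedRequest(...idLists: ReadonlyArray<number>[]): { ids: number[] } {
  const seen = new Set<number>();
  const ids: number[] = [];
  for (const list of idLists) {
    for (const id of list) {
      if (!seen.has(id)) {
        seen.add(id);
        ids.push(id);
      }
    }
  }
  return { ids };
}

// One call instead of three or four:
const request = buildBatchedRequest([123, 456], [456, 789]);
// request.ids is [123, 456, 789]
```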
|
||||
|
||||
## Performance Issues
|
||||
|
||||
+357
-306
@@ -1,403 +1,454 @@
|
||||
---
|
||||
title: "mem-search Skill"
|
||||
description: "Query your project history with natural language"
|
||||
title: "Memory Search"
|
||||
description: "Search your project history with MCP tools"
|
||||
---
|
||||
|
||||
# mem-search Skill Usage
|
||||
# Memory Search with MCP Tools
|
||||
|
||||
Once claude-mem is installed as a plugin, you can search your project history using natural language. Claude automatically invokes the mem-search skill when you ask about past work.
|
||||
Claude-mem provides persistent memory across sessions through **4 MCP tools** that follow a token-efficient **3-layer workflow pattern**.
|
||||
|
||||
## How It Works
|
||||
## Overview
|
||||
|
||||
**v5.5.0 Enhancement**: The search skill was renamed to "mem-search" for better scope differentiation, with effectiveness increased from 67% to 100% and enhanced concrete triggers (85% vs 44%).
|
||||
Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach:
|
||||
|
||||
**v5.4.0 Architecture**: Claude-Mem uses a skill-based search architecture instead of MCP tools, saving ~2,250 tokens per session start through progressive disclosure.
|
||||
1. **Search** → Get a compact index with IDs (~50-100 tokens/result)
|
||||
2. **Timeline** → Get context around interesting results
|
||||
3. **Get Observations** → Fetch full details ONLY for filtered IDs
|
||||
|
||||
**Simple Usage:**
|
||||
- Just ask naturally: *"What did we do last session?"*
|
||||
- Claude recognizes the intent and invokes the mem-search skill
|
||||
- The skill uses HTTP API endpoints to query your memory
|
||||
- Results are formatted and presented to you
|
||||
This achieves **~10x token savings** compared to traditional RAG approaches.
|
||||
|
||||
**Benefits:**
|
||||
- **Token Efficient**: ~250 tokens (skill frontmatter) vs ~2,500 tokens (MCP tool definitions)
|
||||
- **Natural Language**: No need to learn specific tool syntax
|
||||
- **Progressive Disclosure**: Only loads detailed instructions when needed
|
||||
- **Auto-Invoked**: Claude knows when to search based on your questions
|
||||
- **Scope Differentiation**: "mem-search" clearly distinguishes from native conversation memory
|
||||
## The 3-Layer Workflow
|
||||
|
||||
## Quick Reference
|
||||
### Layer 1: Search (Index)
|
||||
|
||||
| Operation | Purpose |
|-------------------------|----------------------------------------------|
| Search Observations | Full-text search across observations |
| Search Sessions | Full-text search across session summaries |
| Search Prompts | Full-text search across raw user prompts |
| By Concept | Find observations tagged with concepts |
| By File | Find observations referencing files |
| By Type | Find observations by type |
| Recent Context | Get recent session context |
| Timeline | Get unified timeline around a specific point |
| Timeline by Query | Search and get timeline context in one step |
| API Help | Get search API documentation |
|
||||
|
||||
## Example Queries
|
||||
|
||||
### Natural Language Queries
|
||||
|
||||
**Search Observations:**
|
||||
```
|
||||
"What bugs did we fix related to authentication?"
|
||||
"Show me all decisions about the build system"
|
||||
"Find refactoring work on the database"
|
||||
```
|
||||
|
||||
**Search Sessions:**
|
||||
```
|
||||
"What did we learn about hooks?"
|
||||
"What was accomplished in the API implementation?"
|
||||
"Show me recent work on this project"
|
||||
```
|
||||
|
||||
**Search Prompts:**
|
||||
```
|
||||
"When did I ask about authentication features?"
|
||||
"Find all my requests about dark mode"
|
||||
```
|
||||
|
||||
**Note**: Claude automatically translates your natural language queries into the appropriate search operations.
|
||||
|
||||
### Search by File
|
||||
Start by searching to get a lightweight index of results:
|
||||
|
||||
```
|
||||
"Show me everything related to worker-service.ts"
|
||||
"What changes were made to migrations.ts?"
|
||||
"Find all work on the database file"
|
||||
search(query="authentication bug", type="bugfix", limit=10)
|
||||
```
|
||||
|
||||
### Search by Concept
|
||||
**Returns:** Compact table with IDs, titles, dates, types
|
||||
**Cost:** ~50-100 tokens per result
|
||||
**Purpose:** Survey what exists before fetching details
|
||||
|
||||
### Layer 2: Timeline (Context)
|
||||
|
||||
Get chronological context around specific observations:
|
||||
|
||||
```
|
||||
"Show observations tagged with architecture"
|
||||
"Find all security-related observations"
|
||||
"What patterns have we used?"
|
||||
timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
|
||||
```
|
||||
|
||||
### Search by Type
|
||||
Or search and get timeline in one step:
|
||||
|
||||
```
|
||||
"Find all feature implementations"
|
||||
"Show me all decisions and discoveries"
|
||||
"What bugs have we fixed?"
|
||||
timeline(query="authentication", depth_before=2, depth_after=2)
|
||||
```
|
||||
|
||||
### Recent Context
|
||||
**Returns:** Chronological view showing what was happening before/after
|
||||
**Cost:** Variable, depends on depth
|
||||
**Purpose:** Understand narrative arc and context
|
||||
|
||||
### Layer 3: Get Observations (Details)
|
||||
|
||||
Fetch full details only for relevant observations:
|
||||
|
||||
```
|
||||
"Show me what we've been working on"
|
||||
"Get context from the last 5 sessions"
|
||||
"What happened recently on this project?"
|
||||
get_observations(ids=[123, 456, 789])
|
||||
```
|
||||
|
||||
### Timeline Queries
|
||||
**Returns:** Complete observation details (narrative, facts, files, concepts)
|
||||
**Cost:** ~500-1000 tokens per observation
|
||||
**Purpose:** Deep dive on specific, validated items
|
||||
|
||||
**Get timeline around a specific point:**
|
||||
### Why This Works
|
||||
|
||||
**Traditional Approach:**
|
||||
- Fetch everything upfront: 20,000 tokens
|
||||
- Relevance: ~10% (2,000 tokens actually useful)
|
||||
- Waste: 18,000 tokens on irrelevant context
|
||||
|
||||
**3-Layer Approach:**
|
||||
- Search index: 1,000 tokens (10 results)
|
||||
- Timeline context: 500 tokens (around 2 key results)
|
||||
- Fetch details: 1,500 tokens (3 observations)
|
||||
- **Total: 3,000 tokens, 100% relevant**
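
The arithmetic behind this comparison, as a small sketch. The numbers are the illustrative figures from the two lists above, and the names (`LayerCosts`, `totalTokens`, `savingsFactor`) are hypothetical:

```typescript
interface LayerCosts {
  search: number;
  timeline: number;
  details: number;
}

function totalTokens(costs: LayerCosts): number {
  return costs.search + costs.timeline + costs.details;
}

function savingsFactor(baselineTokens: number, actualTokens: number): number {
  return baselineTokens / actualTokens;
}

const layered: LayerCosts = { search: 1000, timeline: 500, details: 1500 };
const layeredTotal = totalTokens(layered);          // 3000
const factor = savingsFactor(20000, layeredTotal);  // about 6.67
const traditionalRelevance = 2000 / 20000;          // 0.1, i.e. ~10% useful
```

For these particular figures the saving is roughly 6.7x; the actual factor depends on how many results are searched and how many observations are fetched in full.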
|
||||
|
||||
## Available Tools
|
||||
|
||||
### `__IMPORTANT` - Workflow Documentation
|
||||
|
||||
An always-visible reminder of the 3-layer workflow pattern. It helps Claude use the search tools efficiently.
|
||||
|
||||
**Usage:** Automatically shown, no need to invoke
|
||||
|
||||
### `search` - Search Memory Index
|
||||
|
||||
Search your memory and get a compact index with IDs.
|
||||
|
||||
**Parameters:**
|
||||
- `query` - Full-text search query (supports AND, OR, NOT, phrase searches)
|
||||
- `limit` - Maximum results (default: 20)
|
||||
- `offset` - Skip first N results for pagination
|
||||
- `type` - Filter by observation type (bugfix, feature, decision, discovery, refactor, change)
|
||||
- `obs_type` - Filter by record type (observation, session, prompt)
|
||||
- `project` - Filter by project name
|
||||
- `dateStart` - Filter by start date (YYYY-MM-DD)
|
||||
- `dateEnd` - Filter by end date (YYYY-MM-DD)
|
||||
- `orderBy` - Sort order (date_desc, date_asc, relevance)
|
||||
|
||||
**Returns:** Compact index table with IDs, titles, dates, types
|
||||
|
||||
**Example:**
|
||||
```
|
||||
"What was happening when we implemented authentication?"
|
||||
"Show me the context around that bug fix"
|
||||
"What led to the decision to refactor the database?"
|
||||
search(query="database migration", type="bugfix", limit=5, orderBy="date_desc")
|
||||
```
|
||||
|
||||
**Timeline by query:**
|
||||
### `timeline` - Get Chronological Context
|
||||
|
||||
Get a chronological view of observations around a specific point or query.
|
||||
|
||||
**Parameters:**
|
||||
- `anchor` - Observation ID to center timeline around (optional if query provided)
|
||||
- `query` - Search query to find anchor automatically (optional if anchor provided)
|
||||
- `depth_before` - Number of observations before anchor (default: 3)
|
||||
- `depth_after` - Number of observations after anchor (default: 3)
|
||||
- `project` - Filter by project name
|
||||
|
||||
**Returns:** Chronological list showing what happened before/during/after
|
||||
|
||||
**Example:**
|
||||
```
|
||||
"Find when we added the viewer UI and show what happened around that time"
|
||||
"Search for authentication work and show the timeline"
|
||||
timeline(anchor=12345, depth_before=5, depth_after=5)
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- See the complete narrative arc around key events
|
||||
- All record types (observations, sessions, prompts) in chronological view
|
||||
- Understand what was happening before and after important changes
|
||||
|
||||
## Search Strategy
|
||||
|
||||
The mem-search skill uses a progressive disclosure pattern to efficiently retrieve information:
|
||||
|
||||
### 1. Ask Naturally
|
||||
|
||||
Start with a natural language question:
|
||||
Or search-based:
|
||||
```
|
||||
"What bugs did we fix related to authentication?"
|
||||
timeline(query="implemented JWT auth", depth_before=3, depth_after=3)
|
||||
```
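
To make `depth_before` and `depth_after` concrete, here is a sketch of the windowing they describe. It assumes each depth counts records on one side of the anchor; the `TimelineEntry` shape, the `createdAt` field, and `timelineWindow` are hypothetical, and the real tool may select records differently:

```typescript
interface TimelineEntry {
  id: number;
  createdAt: number; // epoch milliseconds (assumed field)
  title: string;
}

function timelineWindow(
  entries: ReadonlyArray<TimelineEntry>,
  anchorId: number,
  depthBefore: number = 3,
  depthAfter: number = 3
): TimelineEntry[] {
  // Order chronologically without mutating the input.
  const ordered = entries.slice().sort((a, b) => a.createdAt - b.createdAt);
  const anchorIndex = ordered.findIndex((entry) => entry.id === anchorId);
  if (anchorIndex === -1) {
    return []; // Unknown anchor: nothing to show.
  }
  // Clamp at the start; slice() clamps at the end on its own.
  const start = Math.max(0, anchorIndex - depthBefore);
  return ordered.slice(start, anchorIndex + depthAfter + 1);
}
```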
|
||||
|
||||
### 2. Claude Invokes mem-search Skill
|
||||
### `get_observations` - Fetch Full Details
|
||||
|
||||
Claude recognizes your intent and loads the mem-search skill (~250 tokens for skill frontmatter).
|
||||
Fetch complete observation details by IDs. **Always batch multiple IDs in a single call for efficiency.**
|
||||
|
||||
### 3. Skill Uses HTTP API
|
||||
**Parameters:**
|
||||
- `ids` - Array of observation IDs (required)
|
||||
- `orderBy` - Sort order (date_desc, date_asc)
|
||||
- `limit` - Maximum observations to return
|
||||
- `project` - Filter by project name
|
||||
|
||||
The skill calls the appropriate HTTP endpoint (e.g., `/api/search/observations`) with the query.
|
||||
**Returns:** Complete observation details including narrative, facts, files, concepts
|
||||
|
||||
### 4. Results Formatted
|
||||
|
||||
Results are formatted and presented to you, usually starting with an index/summary format.
|
||||
|
||||
### 5. Deep Dive if Needed
|
||||
|
||||
If you need more details, ask follow-up questions:
|
||||
**Example:**
|
||||
```
|
||||
"Tell me more about observation #123"
|
||||
"Show me the full details of that decision"
|
||||
get_observations(ids=[123, 456, 789, 1011])
|
||||
```
|
||||
|
||||
**Benefits of This Approach:**
|
||||
- **Token Efficient**: Only loads what you need, when you need it
|
||||
- **Natural**: No syntax to learn
|
||||
- **Progressive**: Start with overview, drill down as needed
|
||||
- **Automatic**: Claude handles the search invocation
|
||||
**Important:** Always batch IDs instead of making separate calls per observation.
|
||||
|
||||
## Common Use Cases
|
||||
|
||||
### Debugging Issues
|
||||
|
||||
**Scenario:** Find what went wrong with database connections
|
||||
|
||||
```
|
||||
Step 1: search(query="error database connection", type="bugfix", limit=10)
  → Review index, identify observations #245, #312, #489

Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
  → See what was happening around the fix

Step 3: get_observations(ids=[312, 489])
  → Get full details on relevant fixes
|
||||
```
|
||||
|
||||
### Understanding Decisions
|
||||
|
||||
**Scenario:** Review architectural choices about authentication
|
||||
|
||||
```
|
||||
Step 1: search(query="authentication", type="decision", limit=5)
|
||||
→ Find decision observations
|
||||
|
||||
Step 2: get_observations(ids=[<relevant_ids>])
|
||||
→ Get full decision rationale, trade-offs, facts
|
||||
```
|
||||
|
||||
### Code Archaeology
|
||||
|
||||
**Scenario:** Find when a specific file was modified
|
||||
|
||||
```
|
||||
Step 1: search(query="worker-service.ts", limit=20)
|
||||
→ Get all observations mentioning that file
|
||||
|
||||
Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
|
||||
→ See what led to and followed from the refactor
|
||||
|
||||
Step 3: get_observations(ids=[<specific_observation_ids>])
|
||||
→ Get implementation details
|
||||
```
|
||||
|
||||
### Feature History
|
||||
|
||||
**Scenario:** Track how a feature evolved
|
||||
|
||||
```
|
||||
Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
|
||||
→ Chronological view of feature work
|
||||
|
||||
Step 2: timeline(anchor=<first_observation_id>, depth_after=10)
|
||||
→ See the full development timeline
|
||||
|
||||
Step 3: get_observations(ids=[<key_milestones>])
|
||||
→ Deep dive on critical implementation points
|
||||
```
|
||||
|
||||
### Learning from Past Work
|
||||
|
||||
**Scenario:** Review refactoring patterns
|
||||
|
||||
```
|
||||
Step 1: search(type="refactor", limit=10, orderBy="date_desc")
|
||||
→ Recent refactoring work
|
||||
|
||||
Step 2: get_observations(ids=[<interesting_ids>])
|
||||
→ Study the patterns and approaches used
|
||||
```
|
||||
|
||||
### Context Recovery
|
||||
|
||||
**Scenario:** Restore context after time away from the project
|
||||
|
||||
```
|
||||
Step 1: search(query="project-name", limit=10, orderBy="date_desc")
|
||||
→ See recent work
|
||||
|
||||
Step 2: timeline(anchor=<most_recent_id>, depth_before=10)
|
||||
→ Understand what led to current state
|
||||
|
||||
Step 3: get_observations(ids=[<critical_observations>])
|
||||
→ Refresh memory on key decisions
|
||||
```
|
||||
|
||||
## Search Query Syntax
|
||||
|
||||
The `query` parameter supports SQLite FTS5 full-text search syntax:
|
||||
|
||||
### Boolean Operators
|
||||
|
||||
```
|
||||
query="authentication AND JWT"      # Both terms must appear
query="OAuth OR JWT"                # Either term can appear
query="security NOT deprecated"     # Exclude deprecated items
|
||||
```
|
||||
|
||||
### Phrase Searches
|
||||
|
||||
```
|
||||
query='"database migration"' # Exact phrase match
|
||||
```
|
||||
|
||||
### Column-Specific Searches
|
||||
|
||||
```
|
||||
query="title:authentication"        # Search in title only
query="content:database"            # Search in content only
query="concepts:security"           # Search in concepts only
|
||||
```
|
||||
|
||||
### Combining Operators
|
||||
|
||||
```
|
||||
query='"user auth" AND (JWT OR session) NOT deprecated'
|
||||
```
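
Queries like the one above can be assembled programmatically. The sketch below is a hypothetical helper (`quoteTerm`, `buildFtsQuery`, and `QueryParts` are not part of claude-mem). It relies on two FTS5 rules: anything that is not a plain bareword can be wrapped in double quotes, with embedded quotes doubled, and `NOT` is a binary operator, so exclusions need something on their left-hand side.

```typescript
const FTS5_KEYWORDS = ["AND", "OR", "NOT", "NEAR"];

// Quote a term unless it is a plain bareword that is not an FTS5 keyword.
function quoteTerm(term: string): string {
  const isBareword = /^[A-Za-z0-9_]+$/.test(term) && FTS5_KEYWORDS.indexOf(term) === -1;
  return isBareword ? term : '"' + term.replace(/"/g, '""') + '"';
}

interface QueryParts {
  all?: string[];  // every term must match
  any?: string[];  // at least one term must match
  none?: string[]; // no term may match
}

function group(terms: string[]): string {
  return terms.length === 1 ? terms.join("") : "(" + terms.join(" OR ") + ")";
}

function buildFtsQuery(parts: QueryParts): string {
  const clauses = (parts.all || []).map(quoteTerm);
  const any = (parts.any || []).map(quoteTerm);
  if (any.length > 0) {
    clauses.push(group(any));
  }
  let query = clauses.join(" AND ");
  const none = (parts.none || []).map(quoteTerm);
  // NOT is binary in FTS5, so exclusions are only added to a non-empty query.
  if (none.length > 0 && query !== "") {
    query += " NOT " + group(none);
  }
  return query;
}

const combined = buildFtsQuery({ all: ["user auth"], any: ["JWT", "session"], none: ["deprecated"] });
// combined is: "user auth" AND (JWT OR session) NOT deprecated
```

Quoting also gives a way to search for input containing special characters: `quoteTerm("[test]")` yields `"[test]"`, which FTS5 treats as a string rather than as query syntax.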
|
||||
|
||||
## Token Management
|
||||
|
||||
### Token Efficiency Best Practices
|
||||
|
||||
1. **Always start with search** - Get index first (~50-100 tokens/result)
|
||||
2. **Use small limits** - Start with 3-5 results, increase if needed
|
||||
3. **Filter before fetching** - Use type, date, project filters
|
||||
4. **Batch get_observations** - Always group multiple IDs in one call
|
||||
5. **Use timeline strategically** - Get context only when narrative matters
|
||||
|
||||
### Token Cost Estimates
|
||||
|
||||
| Operation | Tokens per Result |
|-----------|-------------------|
| search (index) | 50-100 |
| timeline (per observation) | 100-200 |
| get_observations (full details) | 500-1,000 |
|
||||
|
||||
**Example Comparison:**
|
||||
|
||||
**Inefficient:**
|
||||
```
|
||||
# Fetching 20 full observations upfront: 10,000-20,000 tokens
|
||||
get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
|
||||
```
|
||||
|
||||
**Efficient:**
|
||||
```
|
||||
# Search index: ~1,000-2,000 tokens (20 results at 50-100 tokens each)
search(query="bug fix", limit=20)

# Review IDs, identify 3 relevant observations

# Fetch only relevant: ~1,500-3,000 tokens
get_observations(ids=[5, 12, 18])

# Total: 2,500-5,000 tokens (vs 10,000-20,000)
|
||||
```
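
The per-result figures from the cost table can be turned into a rough budget check before running a plan. This is a sketch only: the ranges are the estimates quoted in the table, and `TOKENS_PER_RESULT` and `estimateTokens` are hypothetical names.

```typescript
// Per-result token ranges, copied from the estimates table.
const TOKENS_PER_RESULT = {
  search: { min: 50, max: 100 },
  timeline: { min: 100, max: 200 },
  get_observations: { min: 500, max: 1000 },
};

type Operation = keyof typeof TOKENS_PER_RESULT;

// `plan` maps each operation to the number of results it is expected to return.
function estimateTokens(plan: Partial<Record<Operation, number>>): { min: number; max: number } {
  let min = 0;
  let max = 0;
  for (const op of Object.keys(TOKENS_PER_RESULT) as Operation[]) {
    const count = plan[op] || 0;
    min += count * TOKENS_PER_RESULT[op].min;
    max += count * TOKENS_PER_RESULT[op].max;
  }
  return { min, max };
}

// Index of 20 results plus 3 full observations, versus 20 full observations.
const efficient = estimateTokens({ search: 20, get_observations: 3 });
const inefficient = estimateTokens({ get_observations: 20 });
```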
|
||||
|
||||
## Advanced Filtering
|
||||
|
||||
You can refine searches using natural language filters:
|
||||
|
||||
### Date Ranges
|
||||
|
||||
```
|
||||
"What bugs did we fix in October?"
|
||||
"Show me work from last week"
|
||||
"Find decisions made between October 1-31"
|
||||
search(
  query="performance optimization",
  dateStart="2025-10-01",
  dateEnd="2025-10-31"
)
|
||||
```
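
A question such as "what did we fix in October?" has to be turned into explicit `dateStart` and `dateEnd` values in `YYYY-MM-DD` form. A minimal sketch of that conversion for a whole calendar month; `monthRange` and `pad2` are hypothetical helpers:

```typescript
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// month is 1-12. Returns the first and last day of that month as YYYY-MM-DD.
function monthRange(year: number, month: number): { dateStart: string; dateEnd: string } {
  // Day 0 of the following month is the last day of `month`.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const prefix = year + "-" + pad2(month);
  return { dateStart: prefix + "-01", dateEnd: prefix + "-" + pad2(lastDay) };
}

const october = monthRange(2025, 10);
// october is { dateStart: "2025-10-01", dateEnd: "2025-10-31" }
```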
|
||||
|
||||
### Multiple Types
|
||||
|
||||
```
|
||||
"Show me all decisions and features"
|
||||
"Find bugfixes and refactorings"
|
||||
```
|
||||
|
||||
### Concepts
|
||||
For observations of multiple types, make multiple searches or use a broader query:
|
||||
|
||||
```
|
||||
"Find database work related to architecture and performance"
|
||||
"Show security observations"
|
||||
search(query="database", type="bugfix", limit=10)
|
||||
search(query="database", type="feature", limit=10)
|
||||
```
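
The results of several per-type searches can then be combined into one index before choosing IDs to fetch. A sketch under assumptions: the `IndexRow` shape and ISO-formatted `date` field are hypothetical, and de-duplication is only a guard for overlapping queries.

```typescript
interface IndexRow {
  id: number;
  date: string; // ISO date, e.g. "2025-10-05" (assumed format)
  type: string;
  title: string;
}

// Merge several index results, keep the first copy of each ID, newest first.
function mergeIndexRows(...lists: ReadonlyArray<IndexRow>[]): IndexRow[] {
  const byId = new Map<number, IndexRow>();
  for (const list of lists) {
    for (const row of list) {
      if (!byId.has(row.id)) {
        byId.set(row.id, row);
      }
    }
  }
  // ISO dates compare correctly as plain strings.
  return Array.from(byId.values()).sort((a, b) =>
    a.date < b.date ? 1 : a.date > b.date ? -1 : 0
  );
}
```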
|
||||
|
||||
### File-Specific
|
||||
### Project-Specific
|
||||
|
||||
```
|
||||
"Show refactoring work that touched worker-service.ts"
|
||||
"Find changes to auth files"
|
||||
search(query="API", project="my-app", limit=15)
|
||||
```
|
||||
|
||||
### Project Filtering
|
||||
### Pagination
|
||||
|
||||
```
|
||||
"Show authentication work on my-app project"
|
||||
"What have we done on this codebase?"
|
||||
# First page
search(query="refactor", limit=10, offset=0)

# Second page
search(query="refactor", limit=10, offset=10)

# Third page
search(query="refactor", limit=10, offset=20)
|
||||
```
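
The same paging pattern as a loop. This is a sketch: `PageFetcher` stands in for a call to `search` with `limit` and `offset`, it is synchronous for brevity, and `collectPages` is a hypothetical helper. A page shorter than `limit` is taken to mean there are no more results.

```typescript
type PageFetcher<T> = (args: { limit: number; offset: number }) => T[];

function collectPages<T>(fetchPage: PageFetcher<T>, limit: number, maxPages: number): T[] {
  const all: T[] = [];
  for (let page = 0; page < maxPages; page++) {
    const rows = fetchPage({ limit, offset: page * limit });
    for (const row of rows) {
      all.push(row);
    }
    // A short page means the result set is exhausted.
    if (rows.length < limit) {
      break;
    }
  }
  return all;
}
```

Capping the loop with `maxPages` keeps the token cost bounded even when a query matches far more rows than expected.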
|
||||
|
||||
**Note**: Claude translates your natural language into the appropriate API filters automatically.
|
||||
|
||||
## Under the Hood: HTTP API
|
||||
|
||||
The mem-search skill uses HTTP endpoints on the worker service (port 37777):
|
||||
|
||||
- `GET /api/search/observations` - Full-text search observations
|
||||
- `GET /api/search/sessions` - Full-text search session summaries
|
||||
- `GET /api/search/prompts` - Full-text search user prompts
|
||||
- `GET /api/search/by-concept` - Find observations by concept tag
|
||||
- `GET /api/search/by-file` - Find work related to specific files
|
||||
- `GET /api/search/by-type` - Find observations by type
|
||||
- `GET /api/context/recent` - Get recent session context
|
||||
- `GET /api/context/timeline` - Get timeline around specific point
|
||||
- `GET /api/timeline/by-query` - Search + timeline in one call
|
||||
- `GET /api/search/help` - API documentation
|
||||
|
||||
These endpoints use FTS5 full-text search with support for:
|
||||
- Boolean operators (AND, OR, NOT)
|
||||
- Phrase searches
|
||||
- Column-specific searches
|
||||
- Date range filtering
|
||||
- Project filtering
|
||||
|
||||
## Result Metadata
|
||||
|
||||
All results include rich metadata:
|
||||
All observations include rich metadata:
|
||||
|
||||
```
|
||||
## JWT authentication decision
|
||||
|
||||
**Type**: decision
|
||||
**Date**: 2025-10-21 14:23:45
|
||||
**Concepts**: authentication, security, architecture
|
||||
**Files Read**: src/auth/middleware.ts, src/utils/jwt.ts
|
||||
**Files Modified**: src/auth/jwt-strategy.ts
|
||||
|
||||
**Narrative**:
|
||||
Decided to implement JWT-based authentication instead of session-based
|
||||
authentication for better scalability and stateless design...
|
||||
|
||||
**Facts**:
|
||||
• JWT tokens expire after 1 hour
|
||||
• Refresh tokens stored in httpOnly cookies
|
||||
• Token signing uses RS256 algorithm
|
||||
• Public keys rotated every 30 days
|
||||
```
|
||||
|
||||
## Citations
|
||||
|
||||
All search results include observation IDs that can be accessed via the HTTP API:
|
||||
|
||||
- `http://localhost:37777/api/observation/{id}` - Get specific observation by ID
|
||||
- View all observations in the web viewer at `http://localhost:37777`
|
||||
|
||||
These citations enable referencing specific historical context in your work.
|
||||
|
||||
## Token Management
|
||||
|
||||
### Token Efficiency Tips
|
||||
|
||||
1. **Start with index format**: ~50-100 tokens per result
|
||||
2. **Use small limits**: Start with 3-5 results
|
||||
3. **Apply filters**: Narrow results before searching
|
||||
4. **Paginate**: Use offset to browse results in batches
|
||||
|
||||
### Token Estimates
|
||||
|
||||
| Format | Tokens per Result |
|--------|-------------------|
| Index | 50-100 |
| Full | 500-1000 |
|
||||
|
||||
**Example**:
|
||||
- 20 results in index format: ~1,000-2,000 tokens
|
||||
- 20 results in full format: ~10,000-20,000 tokens

## Common Use Cases

### 1. Debugging Issues

Find what went wrong:
```
search_observations with query="error database connection" and type="bugfix"
```

### 2. Understanding Decisions

Review architectural choices:
```
find_by_type with type="decision" and format="index"
```

Then deep dive on specific decisions:
```
search_observations with query="[DECISION TITLE]" and format="full"
```

### 3. Code Archaeology

Find when a file was modified:
```
find_by_file with filePath="worker-service.ts"
```

### 4. Feature History

Track feature development:
```
search_sessions with query="authentication feature"
search_user_prompts with query="add authentication"
```

### 5. Learning from Past Work

Review refactoring patterns:
```
find_by_type with type="refactor" and limit=10
```

### 6. Context Recovery

Restore context after time away:
```
get_recent_context with limit=5
search_sessions with query="[YOUR PROJECT NAME]" and orderBy="date_desc"
```

## Best Practices

1. **Index first, full later**: Always start with index format
2. **Small limits**: Start with 3-5 results to avoid token limits
3. **Use filters**: Narrow results before searching
4. **Specific queries**: More specific = better results
5. **Review citations**: Use citations to reference past decisions
6. **Date filtering**: Use date ranges for time-based searches
7. **Type filtering**: Use types to categorize searches
8. **Concept tags**: Use concepts for thematic searches
- **ID** - Unique observation identifier
- **Type** - bugfix, feature, decision, discovery, refactor, change
- **Date** - When the work occurred
- **Title** - Concise description
- **Concepts** - Tagged themes (e.g., security, performance, architecture)
- **Files Read** - Files examined during work
- **Files Modified** - Files changed during work
- **Narrative** - Story of what happened and why
- **Facts** - Key factual points (decisions made, patterns used, metrics)

## Troubleshooting

### No Results Found

1. Check database has data:
1. **Broaden your search:**
   ```
   # Too specific
   search(query="JWT authentication implementation with RS256")

   # Better
   search(query="authentication")
   ```

2. **Check database has data:**
```bash
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
curl "http://localhost:37777/api/search?query=test"
```
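
The row-count check can also be sketched in Python with the standard `sqlite3` module. The database path and the `observations` table name come from the command above; `DB_PATH` and `count_observations` are illustrative names, and the script assumes the file is a readable SQLite database.

```python
import sqlite3
from pathlib import Path

# Default database location, taken from the sqlite3 command above.
DB_PATH = Path.home() / ".claude-mem" / "claude-mem.db"


def count_observations(db_path) -> int:
    """Return the number of rows in the observations table."""
    conn = sqlite3.connect(str(db_path))
    try:
        (count,) = conn.execute("SELECT COUNT(*) FROM observations").fetchone()
        return count
    finally:
        conn.close()


if __name__ == "__main__":
    # sqlite3.connect would create a missing file, so check for it first.
    if DB_PATH.exists():
        print(f"{count_observations(DB_PATH)} observations stored")
    else:
        print(f"No database found at {DB_PATH}")
```

A count of zero suggests that nothing has been captured yet, in which case no query will return results regardless of how it is phrased.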

2. Try broader natural language query:
3. **Try without filters:**
```
"Show me anything about authentication" # Broader
vs
"Find exact JWT authentication implementation" # Too specific
   # Remove type/date filters to see if data exists
   search(query="your-search-term")
```

3. Ask without filters first:
```
"What do we have about auth?"
# Then narrow down
"Show me auth-related decisions"
```
### IDs Not Found in get_observations

### Worker Service Not Running
**Error:** "Observation IDs not found: [123, 456]"

If search isn't working, check the worker service:
**Causes:**
- IDs from a different project (use the `project` parameter)
- IDs were deleted
- Typo in ID numbers

```bash
npm run worker:status # Check worker status
npm run worker:restart # Restart if needed
npm run worker:logs # View logs
```
**Solution:**
```
# Verify IDs exist
search(query="<related-search>")

# Use correct project filter
get_observations(ids=[123, 456], project="correct-project-name")
```

Or describe the issue to Claude and the troubleshoot skill will automatically activate to provide a diagnosis.
### Token Limit Errors

### Performance Issues
**Error:** Response exceeds token limits

**Solution:** Use the 3-layer workflow to reduce upfront costs:

```
# Instead of fetching 50 full observations:
# get_observations(ids=[1,2,3,...,50])  # 25,000-50,000 tokens!

# Do this:
search(query="<your-query>", limit=50)  # ~2,500-5,000 tokens
# Review index, identify 5 relevant observations
get_observations(ids=[<5-most-relevant>])  # ~2,500-5,000 tokens
# Total: 5,000-10,000 tokens (roughly 80% savings)
```
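
The comparison above can be worked through with a short calculation. This is a sketch that reuses the per-result ranges stated in this document (50-100 tokens per index entry, 500-1,000 per full observation); the function names are hypothetical.

```python
# Per-result token ranges stated in this document.
INDEX_TOKENS = (50, 100)
FULL_TOKENS = (500, 1000)


def direct_fetch_cost(total: int) -> tuple[int, int]:
    """(low, high) cost of fetching every observation in full."""
    return total * FULL_TOKENS[0], total * FULL_TOKENS[1]


def workflow_cost(total: int, selected: int) -> tuple[int, int]:
    """(low, high) cost of an index search plus `selected` full fetches."""
    low = total * INDEX_TOKENS[0] + selected * FULL_TOKENS[0]
    high = total * INDEX_TOKENS[1] + selected * FULL_TOKENS[1]
    return low, high


def savings_percent(direct: int, workflow: int) -> float:
    """Percentage of tokens saved relative to the direct fetch."""
    return 100.0 * (direct - workflow) / direct


if __name__ == "__main__":
    direct = direct_fetch_cost(50)
    staged = workflow_cost(50, 5)
    print(f"Direct fetch: {direct[0]:,}-{direct[1]:,} tokens")
    print(f"3-layer workflow: {staged[0]:,}-{staged[1]:,} tokens")
    print(f"Savings at the low end: {savings_percent(direct[0], staged[0]):.0f}%")
    print(f"Savings at the high end: {savings_percent(direct[1], staged[1]):.0f}%")
```

The saving depends mostly on how few observations are selected for a full fetch, so a tighter review of the index pays off more than a smaller search limit.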

### Search Performance

If searches seem slow:
1. Be more specific in your queries
2. Ask for recent work (naturally filters by date)
3. Specify the project you're interested in
4. Ask for fewer results initially
1. Be more specific in queries (helps FTS5 index)
2. Use date range filters to narrow scope
3. Specify project filter when possible
4. Use smaller limit values

## Best Practices

1. **Index First, Details Later** - Always start with search to survey options
2. **Filter Before Fetching** - Use search parameters to narrow results
3. **Batch ID Fetches** - Group multiple IDs in one get_observations call
4. **Use Timeline for Context** - When narrative matters, timeline shows the story
5. **Specific Queries** - More specific = better relevance
6. **Small Limits Initially** - Start with 3-5 results, expand if needed
7. **Review Before Deep Dive** - Check index before fetching full details

## Technical Details

**Architecture Change (v5.4.0)**:
- **Before**: 9 MCP tools (~2,500 tokens in tool definitions per session start)
- **After**: 1 mem-search skill (~250 tokens in frontmatter, full instructions loaded on-demand)
- **Savings**: ~2,250 tokens per session start
- **Migration**: Transparent - users don't need to change how they ask questions

**Architecture:** MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search.
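
To illustrate what "thin wrapper" could mean in practice, the sketch below translates a `search` tool call into a worker API URL. Only the `/api/search?query=...` form is confirmed by the `curl` example in this document; the function name `search_request_url` and the idea that extra filters such as `limit` or `project` map to identically named query-string parameters are assumptions.

```python
from urllib.parse import urlencode

# Worker address as stated in this document.
WORKER_BASE_URL = "http://localhost:37777"


def search_request_url(query: str, **filters) -> str:
    """Translate a `search` tool call into a worker API URL.

    Assumption: optional filters (e.g. limit, project) are passed through
    as query-string parameters with the same names.
    """
    params = {"query": query}
    # Drop unset filters so they do not appear as "None" in the URL.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{WORKER_BASE_URL}/api/search?{urlencode(params)}"


if __name__ == "__main__":
    print(search_request_url("authentication", limit=5))
```

Keeping the wrapper this small means the tool layer holds no business logic of its own, which matches the description of the worker service handling queries and vector search.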

**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with improved effectiveness (67% → 100%) and enhanced triggers (44% → 85%).
**MCP Server:** Located at `~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`

**How the Skill Works:**
1. User asks a question about past work
2. Claude recognizes the intent matches the mem-search skill description
3. Skill loads full instructions from `plugin/skills/mem-search/SKILL.md`
4. Skill uses `curl` to call HTTP API endpoints
5. Results formatted and returned to Claude
6. Claude presents results to user
**Worker Service:** Express API on port 37777, managed by Bun

**Database:** SQLite FTS5 full-text search on `~/.claude-mem/claude-mem.db`

**Vector Search:** Chroma embeddings for semantic search (underlying implementation)

## Next Steps

- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
- [Architecture Overview](/architecture/overview) - System components
- [Database Schema](/architecture/database) - Understanding the data
- [Getting Started](/usage/getting-started) - Automatic operation
- [Database Schema](/architecture/database) - Understanding the data structure
- [Claude Desktop Setup](/usage/claude-desktop) - Installation and configuration