diff --git a/README.md b/README.md index b0db0c52..c69256bd 100644 --- a/README.md +++ b/README.md @@ -172,35 +172,40 @@ See [Architecture Overview](https://docs.claude-mem.ai/architecture/overview) fo --- -## mem-search Skill +## MCP Search Tools -Claude-Mem provides intelligent search through the mem-search skill that auto-invokes when you ask about past work: +Claude-Mem provides intelligent memory search through **4 MCP tools** following a token-efficient **3-layer workflow pattern**: + +**The 3-Layer Workflow:** + +1. **`search`** - Get compact index with IDs (~50-100 tokens/result) +2. **`timeline`** - Get chronological context around interesting results +3. **`get_observations`** - Fetch full details ONLY for filtered IDs (~500-1,000 tokens/result) **How It Works:** -- Just ask naturally: *"What did we do last session?"* or *"Did we fix this bug before?"* -- Claude automatically invokes the mem-search skill to find relevant context +- Claude uses MCP tools to search your memory +- Start with `search` to get an index of results +- Use `timeline` to see what was happening around specific observations +- Use `get_observations` to fetch full details for relevant IDs +- **~10x token savings** by filtering before fetching details -**Available Search Operations:** +**Available MCP Tools:** -1. **Search Observations** - Full-text search across observations -2. **Search Sessions** - Full-text search across session summaries -3. **Search Prompts** - Search raw user requests -4. **By Concept** - Find by concept tags (discovery, problem-solution, pattern, etc.) -5. **By File** - Find observations referencing specific files -6. **By Type** - Find by type (decision, bugfix, feature, refactor, discovery, change) -7. **Recent Context** - Get recent session context for a project -8. **Timeline** - Get unified timeline of context around a specific point in time -9. **Timeline by Query** - Search for observations and get timeline context around best match -10. 
**API Help** - Get search API documentation
+1. **`search`** - Search memory index with full-text queries, filters by type/date/project
+2. **`timeline`** - Get chronological context around a specific observation or query
+3. **`get_observations`** - Fetch full observation details by IDs (always batch multiple IDs)
+4. **`__IMPORTANT`** - Workflow documentation (always visible to Claude)

-**Example Natural Language Queries:**
+**Example Usage:**

-```
-"What bugs did we fix last session?"
-"How did we implement authentication?"
-"What changes were made to worker-service.ts?"
-"Show me recent work on this project"
-"What was happening when we added the viewer UI?"
+```typescript
+// Step 1: Search for index
+search({ query: "authentication bug", type: "bugfix", limit: 10 })
+
+// Step 2: Review index, identify relevant IDs (e.g., #123, #456)
+
+// Step 3: Fetch full details
+get_observations({ ids: [123, 456] })
```

See [Search Tools Guide](https://docs.claude-mem.ai/usage/search-tools) for detailed examples.
diff --git a/docs/public/architecture-evolution.mdx b/docs/public/architecture-evolution.mdx
index 67685d12..9a53ddd4 100644
--- a/docs/public/architecture-evolution.mdx
+++ b/docs/public/architecture-evolution.mdx
@@ -248,6 +248,164 @@ search_observations({

---

+## MCP Architecture Simplification (December 2025)
+
+### The Problem: Complex MCP Implementation
+
+**Before:**
+```
+9+ MCP tools registered at session start:
+- search_observations
+- find_by_type
+- find_by_file
+- find_by_concept
+- get_recent_context
+- get_observation
+- get_session
+- get_prompt
+- help
+
+Problems:
+- Overlapping operations (search_observations vs find_by_type)
+- Complex parameter schemas (~2,500 tokens in tool definitions)
+- No built-in workflow guidance
+- High cognitive load for Claude (which tool to use?)
+- Code size: ~2,718 lines in mcp-server.ts
+```
+
+**The Insight:** Progressive disclosure should be built into tool design itself, not something Claude has to remember.
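+The arithmetic behind that ~2,500-token figure can be sketched. This is a rough illustration only: a hypothetical schema object and a crude ~4 characters/token heuristic, not the actual claude-mem tool definitions or a real tokenizer.

```typescript
// Hypothetical sketch: why registering many verbose tool schemas costs
// tokens at session start, whether or not the tools are ever called.
// estimateTokens uses a crude ~4 chars/token heuristic, not a tokenizer.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object;
}

const estimateTokens = (def: ToolDefinition): number =>
  Math.ceil(JSON.stringify(def).length / 4);

// Shaped like the old search_observations definition (abbreviated).
const searchObservations: ToolDefinition = {
  name: "search_observations",
  description: "Full-text search across observations with filters",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "FTS5 query" },
      type: { type: "array", items: { type: "string" } },
      format: { enum: ["index", "full"] },
      limit: { type: "number", minimum: 1, maximum: 100 },
    },
  },
};

// One tool is cheap; nine of them, paid on every session, adds up.
const perTool = estimateTokens(searchObservations);
const sessionStartCost = perTool * 9;
console.log({ perTool, sessionStartCost });
```

+With real schemas (longer descriptions, more parameters, enum lists) the per-tool cost is far higher, which is where the session-start figure above comes from.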
+ +### The Solution: 3-Layer Workflow + +**After:** +``` +4 MCP tools following 3-layer workflow: + +1. __IMPORTANT - Workflow documentation (always visible) + "3-LAYER WORKFLOW (ALWAYS FOLLOW): + 1. search(query) → Get index with IDs + 2. timeline(anchor=ID) → Get context + 3. get_observations([IDs]) → Fetch details + NEVER fetch full details without filtering first." + +2. search - Layer 1: Get index with IDs (~50-100 tokens/result) +3. timeline - Layer 2: Get chronological context +4. get_observations - Layer 3: Fetch full details (~500-1,000 tokens/result) + +Benefits: +- Progressive disclosure enforced by tool structure +- No overlapping operations +- Simple schemas (additionalProperties: true) +- Clear workflow pattern +- Code size: ~312 lines in mcp-server.ts (88% reduction) +- ~10x token savings +``` + +### Migration: Skill-Based Search Removed + +**Previously:** Used skill-based search +- mem-search skill invoked via natural language +- HTTP API called directly via curl +- Progressive disclosure through skill loading +- 17 skill documentation files + +**Now:** Removed skill-based approach +- MCP-only architecture +- Native MCP protocol (better Claude integration) +- Works with both Claude Desktop and Claude Code +- Simpler to maintain (no skill files) +- All 19 mem-search skill files removed (~2,744 lines) + +### Key Architectural Changes + +**MCP Server Refactor:** + +Before: +```typescript +// Complex parameter schemas +{ + name: "search_observations", + inputSchema: { + type: "object", + properties: { + query: { type: "string", description: "..." }, + type: { type: "array", items: { enum: [...] } }, + format: { enum: ["index", "full"] }, + limit: { type: "number", minimum: 1, maximum: 100 }, + // ... many more parameters + } + } +} +``` + +After: +```typescript +// Simple schemas with workflow guidance +{ + name: "search", + description: "Step 1: Search memory. 
Returns index with IDs.", + inputSchema: { + type: "object", + properties: {}, + additionalProperties: true // Accept any parameters + } +} +``` + +**Workflow Enforcement:** + +Before: Claude had to remember progressive disclosure pattern + +After: Tool structure makes it impossible to skip steps +- Can't get details without IDs from search +- Can't search without seeing __IMPORTANT reminder +- Timeline provides middle ground (context without full details) + +### Impact + +**Token Efficiency:** +``` +Traditional: Fetch 20 observations upfront +→ 10,000-20,000 tokens +→ Only 2 observations relevant (90% waste) + +3-Layer Workflow: +→ search (20 results): ~1,000-2,000 tokens +→ Review index, identify 3 relevant IDs +→ get_observations (3 IDs): ~1,500-3,000 tokens +→ Total: 2,500-5,000 tokens (50-75% savings) +``` + +**Code Simplicity:** +- MCP server: 2,718 lines → 312 lines (88% reduction) +- Removed: 19 skill files (~2,744 lines) +- Net reduction: ~5,150 lines of code removed + +**User Experience:** +- Same natural language interaction +- Better token efficiency +- Clearer architecture +- Works identically on Claude Desktop and Claude Code + +### Design Philosophy + +**Progressive Disclosure Through Structure:** + +The 3-layer workflow embodies progressive disclosure at the architectural level: + +1. **Layer 1 (Index)** - "What exists?" - Cheap survey of options +2. **Layer 2 (Timeline)** - "What was happening?" - Context around specific points +3. **Layer 3 (Details)** - "Tell me everything" - Full details only when justified + +Each layer provides a decision point where Claude can: +- Stop if irrelevant +- Get more context if uncertain +- Dive deep if confident + +This makes it structurally difficult to waste tokens. 
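+The decision points above can be made concrete with a small simulation. This is a minimal sketch with hypothetical in-memory data and illustrative per-result token costs (real figures vary; the point is the shape of the filtering effect, not the exact numbers).

```typescript
// Minimal simulation of the 3-layer workflow's token accounting.
// Data and costs are hypothetical, chosen only to show the pattern:
// cheap index first, expensive details only for filtered IDs.
type IndexEntry = { id: number; title: string; type: string };

const index: IndexEntry[] = [
  { id: 123, title: "Fixed auth token expiry", type: "bugfix" },
  { id: 456, title: "Refactored login form", type: "refactor" },
  { id: 789, title: "Resolved session cookie bug", type: "bugfix" },
];

const INDEX_TOKENS_PER_RESULT = 75;   // ~50-100 in practice
const DETAIL_TOKENS_PER_RESULT = 750; // ~500-1,000 in practice

// Layer 1: survey everything at index cost
const layer1Cost = index.length * INDEX_TOKENS_PER_RESULT;

// Layer 3: fetch full details only for the filtered IDs
const relevant = index.filter((e) => e.type === "bugfix").map((e) => e.id);
const layer3Cost = relevant.length * DETAIL_TOKENS_PER_RESULT;

// Naive approach: full details for every result up front
const naiveCost = index.length * DETAIL_TOKENS_PER_RESULT;

console.log({ relevant, workflowCost: layer1Cost + layer3Cost, naiveCost });
```

+The naive path pays detail cost for every result; the workflow pays index cost for everything but detail cost only for the IDs that survive filtering, and the gap widens as result sets grow.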
+ +--- + ## v1-v2: The Naive Approach ### The First Attempt: Dump Everything diff --git a/docs/public/architecture/search-architecture.mdx b/docs/public/architecture/search-architecture.mdx index b3314e99..76a14692 100644 --- a/docs/public/architecture/search-architecture.mdx +++ b/docs/public/architecture/search-architecture.mdx @@ -1,448 +1,497 @@ --- title: "Search Architecture" -description: "mem-search skill with HTTP API and progressive disclosure" +description: "MCP tools with 3-layer workflow for token-efficient memory retrieval" --- # Search Architecture -Claude-Mem uses a skill-based search architecture that provides intelligent memory retrieval through natural language queries. This replaced the MCP-based approach in v5.4.0 with a more efficient implementation. The skill was enhanced and renamed to "mem-search" in v5.5.0 for better scope differentiation. +Claude-mem uses an **MCP-based search architecture** that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern. ## Overview -**Architecture**: Skill-Based Search + HTTP API + Progressive Disclosure +**Architecture**: MCP Tools → MCP Protocol → HTTP API → Worker Service **Key Components**: -1. **mem-search Skill** (`plugin/skills/mem-search/SKILL.md`) - Auto-invoked when users ask about past work -2. **HTTP API Endpoints** (10 routes) - Fast, efficient search operations on port 37777 -3. **Worker Service** - Express.js server with FTS5 full-text search -4. **SQLite Database** - Persistent storage with FTS5 virtual tables -5. **Chroma Vector DB** - Semantic search with hybrid retrieval +1. **MCP Tools** (4 tools) - `search`, `timeline`, `get_observations`, `__IMPORTANT` +2. **MCP Server** (`plugin/scripts/mcp-server.cjs`) - Thin wrapper over HTTP API +3. **HTTP API Endpoints** - Fast search operations on Worker Service (port 37777) +4. **Worker Service** - Express.js server with FTS5 full-text search +5. 
**SQLite Database** - Persistent storage with FTS5 virtual tables
+6. **Chroma Vector DB** - Semantic search with hybrid retrieval

-**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with:
-- Effectiveness increased from 67% to 100%
-- Concrete triggers increased from 44% to 85%
-- 5+ unique identifiers for better scope differentiation
-- Comprehensive documentation (17 files, 12 operation guides)
+**Token Efficiency**: ~10x savings through 3-layer workflow pattern

## How It Works

-### 1. User Query (Natural Language)
+### 1. User Query
+
+Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:

```
-User: "What bugs did we fix last session?"
+Step 1: search(query="authentication bug", type="bugfix", limit=10)
+Step 2: timeline(anchor=123, depth_before=3, depth_after=3)
+Step 3: get_observations(ids=[123, 456, 789])
```

-### 2. Skill Invocation
+### 2. MCP Protocol

-Claude recognizes the intent and invokes the mem-search skill:
-- Skill frontmatter (~250 tokens) loaded at session start
-- Full skill instructions loaded on-demand when skill is invoked
-- Progressive disclosure pattern minimizes context overhead
-- "mem-search" naming provides clear scope differentiation from native memory
+MCP server receives tool call via JSON-RPC over stdio:
+
+```json
+{
+  "method": "tools/call",
+  "params": {
+    "name": "search",
+    "arguments": {
+      "query": "authentication bug",
+      "type": "bugfix",
+      "limit": 10
+    }
+  }
+}
+```

### 3. HTTP API Call

-The skill uses `curl` to call the HTTP API:
+MCP server translates to HTTP request:

-```bash
-curl "http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5"
+```typescript
+const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
+const response = await fetch(url);
```

-### 4. FTS5 Search
+### 4. 
Worker Processing -Worker service queries SQLite FTS5 virtual tables: +Worker service executes FTS5 query: ```sql SELECT * FROM observations_fts WHERE observations_fts MATCH ? AND type = 'bugfix' ORDER BY rank -LIMIT 5 +LIMIT 10 ``` -### 5. Results Formatted +### 5. Results Returned -Skill formats results and returns to Claude: +Worker returns structured data → MCP server → Claude: -``` -## Recent Bugfixes - -1. [bugfix] Fixed authentication token expiry - Date: 2025-11-08 14:23:45 - Files: src/auth/jwt.ts - -2. [bugfix] Resolved database connection leak - Date: 2025-11-08 13:15:22 - Files: src/services/database.ts -``` - -### 6. User Sees Answer - -Claude presents the formatted results naturally in conversation. - -## Architecture Change (v5.4.0) - -### Before: MCP-Based Search - -**Approach**: 9 MCP tools registered at session start - -**Token Cost**: ~2,500 tokens in tool definitions per session -- Each tool's schema, parameters, descriptions loaded -- All 9 tools available whether needed or not -- No progressive disclosure - -**Example MCP Tool**: ```json { - "name": "search_observations", - "description": "Full-text search across observations...", - "inputSchema": { - "type": "object", - "properties": { - "query": { "type": "string", "description": "..." }, - "type": { "type": "array", "items": { "enum": [...] } }, - "format": { "enum": ["index", "full"] }, - // ... many more parameters + "content": [{ + "type": "text", + "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |" + }] +} +``` + +### 6. Claude Processes Results + +Claude reviews the index, decides which observations are relevant, and can: +- Use `timeline` to get context +- Use `get_observations` to fetch full details for selected IDs + +## The 4 MCP Tools + +### `__IMPORTANT` - Workflow Documentation + +Always visible to Claude. Explains the 3-layer workflow pattern. + +**Description:** +``` +3-LAYER WORKFLOW (ALWAYS FOLLOW): +1. 
search(query) → Get index with IDs (~50-100 tokens/result) +2. timeline(anchor=ID) → Get context around interesting results +3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs +NEVER fetch full details without filtering first. 10x token savings. +``` + +**Purpose:** Ensures Claude follows token-efficient pattern + +### `search` - Search Memory Index + +**Tool Definition:** +```typescript +{ + name: 'search', + description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy', + inputSchema: { + type: 'object', + properties: {}, + additionalProperties: true // Accepts any parameters + } +} +``` + +**HTTP Endpoint:** `GET /api/search` + +**Parameters:** +- `query` - Full-text search query +- `limit` - Maximum results (default: 20) +- `type` - Filter by observation type +- `project` - Filter by project name +- `dateStart`, `dateEnd` - Date range filters +- `offset` - Pagination offset +- `orderBy` - Sort order + +**Returns:** Compact index with IDs, titles, dates, types (~50-100 tokens per result) + +### `timeline` - Get Chronological Context + +**Tool Definition:** +```typescript +{ + name: 'timeline', + description: 'Step 2: Get context around results. 
Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project', + inputSchema: { + type: 'object', + properties: {}, + additionalProperties: true + } +} +``` + +**HTTP Endpoint:** `GET /api/timeline` + +**Parameters:** +- `anchor` - Observation ID to center timeline around (optional if query provided) +- `query` - Search query to find anchor automatically (optional if anchor provided) +- `depth_before` - Number of observations before anchor (default: 3) +- `depth_after` - Number of observations after anchor (default: 3) +- `project` - Filter by project name + +**Returns:** Chronological view showing what happened before/during/after + +### `get_observations` - Fetch Full Details + +**Tool Definition:** +```typescript +{ + name: 'get_observations', + description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project', + inputSchema: { + type: 'object', + properties: { + ids: { + type: 'array', + items: { type: 'number' }, + description: 'Array of observation IDs to fetch (required)' + } + }, + required: ['ids'], + additionalProperties: true + } +} +``` + +**HTTP Endpoint:** `POST /api/observations/batch` + +**Body:** +```json +{ + "ids": [123, 456, 789], + "orderBy": "date_desc", + "project": "my-app" +} +``` + +**Returns:** Complete observation details (~500-1,000 tokens per observation) + +## MCP Server Implementation + +**Location:** `/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs` + +**Role:** Thin wrapper that translates MCP protocol to HTTP API calls + +**Key Characteristics:** +- ~312 lines of code (reduced from ~2,718 lines in old implementation) +- No business logic - just protocol translation +- Single source of truth: Worker HTTP API +- Simple schemas with `additionalProperties: true` + +**Handler Example:** +```typescript +{ + name: 'search', + handler: async (args: any) => { + const endpoint = 
'/api/search';
+    const searchParams = new URLSearchParams();
+
+    for (const [key, value] of Object.entries(args)) {
+      searchParams.append(key, String(value));
+    }
+
+    const url = `http://localhost:37777${endpoint}?${searchParams}`;
+    const response = await fetch(url);
+    return await response.json();
+  }
+}
+```
+
+## Worker HTTP API
+
+**Location:** `src/services/worker-service.ts`
+
+**Port:** 37777
+
+**Search Endpoints:**
+```
+GET /api/search # Main search (used by MCP search tool)
+GET /api/timeline # Timeline context (used by MCP timeline tool)
+POST /api/observations/batch # Fetch by IDs (used by MCP get_observations tool)
+GET /api/health # Health check
+```
+
+**Database Access:**
+- Uses `SessionSearch` service for FTS5 queries
+- Uses `SessionStore` for structured queries
+- Hybrid search with ChromaDB for semantic similarity
+
+**FTS5 Full-Text Search:**
+```sql
+-- search tool → HTTP GET → FTS5 query
+SELECT * FROM observations_fts
+WHERE observations_fts MATCH ?
+AND type = ?
+AND date >= ? AND date <= ?
+ORDER BY rank
+LIMIT ? OFFSET ?
+```
+
+## The 3-Layer Workflow Pattern
+
+### Design Philosophy
+
+The 3-layer workflow embodies **progressive disclosure** - a core principle of claude-mem's architecture.
+
+**Layer 1: Index (Search)**
+- **What:** Compact table with IDs, titles, dates, types
+- **Cost:** ~50-100 tokens per result
+- **Purpose:** Survey what exists before committing tokens
+- **Decision Point:** "Which observations are relevant?"
+
+**Layer 2: Context (Timeline)**
+- **What:** Chronological view of observations around a point
+- **Cost:** Variable based on depth
+- **Purpose:** Understand narrative arc, see what led to/from a point
+- **Decision Point:** "Do I need full details?"
+
+**Layer 3: Details (Get Observations)**
+- **What:** Complete observation data (narrative, facts, files, concepts)
+- **Cost:** ~500-1,000 tokens per observation
+- **Purpose:** Deep dive on validated, relevant observations
+- **Decision Point:** "Apply knowledge to current task"
+
+### Token Efficiency
+
+**Traditional RAG Approach:**
+```
+Fetch 20 observations upfront: 10,000-20,000 tokens
+Relevance: ~10% (only 2 observations actually useful)
+Waste: 18,000 tokens on irrelevant context
+```
+
+**3-Layer Workflow:**
+```
+Step 1: search (20 results) ~1,000-2,000 tokens
+Step 2: Review index, filter to 3 relevant IDs
+Step 3: get_observations (3 IDs) ~1,500-3,000 tokens
+Total: 2,500-5,000 tokens (50-75% savings)
+```
+
+**10x Savings:** By filtering at index level before fetching full details
+
+## Architecture Evolution
+
+### Before: Complex MCP Implementation
+
+**Approach:** 9 MCP tools with detailed parameter schemas
+
+**Token Cost:** ~2,500 tokens in tool definitions per session
+- `search_observations` - Full-text search
+- `find_by_type` - Filter by type
+- `find_by_file` - Filter by file
+- `find_by_concept` - Filter by concept
+- `get_recent_context` - Recent sessions
+- `get_observation` - Fetch single observation
+- `get_session` - Fetch session
+- `get_prompt` - Fetch prompt
+- `help` - API documentation
+
+**Problems:**
+- Overlapping operations (search_observations vs find_by_type)
+- Complex parameter schemas
+- No built-in workflow guidance
+- High token cost at session start
+
+**Code Size:** ~2,718 lines in mcp-server.ts
+
+### After: Streamlined MCP Implementation
+
+**Approach:** 4 MCP tools following 3-layer workflow
+
+**Token Cost:** Minimal - 4 simplified tool definitions (the server itself is ~312 lines of code)
+
+**Tools:**
+1. `__IMPORTANT` - Workflow guidance (always visible)
+2. `search` - Step 1 (index)
+3. `timeline` - Step 2 (context)
+4. 
`get_observations` - Step 3 (details) + +**Benefits:** +- Progressive disclosure built into tool design +- No overlapping operations +- Simple schemas (`additionalProperties: true`) +- Clear workflow pattern +- ~10x token savings + +**Code Size:** ~312 lines in mcp-server.ts (88% reduction) + +### Key Insight + +**Before:** Progressive disclosure was something Claude had to remember + +**After:** Progressive disclosure is enforced by tool design itself + +The 3-layer workflow pattern makes it structurally difficult to waste tokens: +- Can't fetch details without first getting IDs from search +- Can't search without seeing workflow reminder (`__IMPORTANT`) +- Timeline provides middle ground between index and full details + +## Configuration + +### Claude Desktop + +Add to `claude_desktop_config.json`: + +```json +{ + "mcpServers": { + "mcp-search": { + "command": "node", + "args": [ + "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs" + ] } } } ``` -### After: Skill-Based Search +### Claude Code -**Approach**: 1 mem-search skill with progressive disclosure +MCP server is automatically configured via plugin installation. No manual setup required. -**Token Cost**: ~250 tokens in skill frontmatter per session -- Only skill description loaded at session start -- Full instructions loaded on-demand when skill is invoked -- HTTP API endpoints instead of MCP protocol +**Both clients use the same MCP tools** - the architecture works identically for Claude Desktop and Claude Code. -**Example Skill Frontmatter**: -```markdown -# Claude-Mem mem-search Skill +## Security -Access claude-mem's persistent memory through a comprehensive HTTP API. -Search for past work, understand context, and learn from previous decisions. +### FTS5 Injection Prevention -## When to Use This Skill +All search queries are escaped before FTS5 processing: -Invoke this skill when users ask about: -- Past work: "What did we do last session?" 
-- Bug fixes: "Did we fix this before?" -- Features: "How did we implement authentication?" -... -``` - -**Token Efficiency**: Minimal frontmatter at session start with progressive disclosure - -## HTTP API Endpoints - -The worker service exposes 10 search endpoints: - -### Full-Text Search - -``` -GET /api/search/observations -GET /api/search/sessions -GET /api/search/prompts -``` - -**Parameters**: -- `query` - FTS5 search query (required) -- `type` - Filter by type (bugfix, feature, refactor, etc.) -- `project` - Filter by project name -- `limit` - Maximum results (default: 20) -- `offset` - Pagination offset -- `format` - Response format (index or full) - -**Example**: -```bash -curl "http://localhost:37777/api/search/observations?query=authentication&type=decision&limit=5" -``` - -### Filtered Search - -``` -GET /api/search/by-type -GET /api/search/by-concept -GET /api/search/by-file -``` - -**Parameters**: -- `type` / `concept` / `filePath` - Filter criteria (required) -- `project` - Filter by project -- `limit` - Maximum results -- `format` - Response format - -**Example**: -```bash -curl "http://localhost:37777/api/search/by-file?filePath=worker-service.ts&limit=10" -``` - -### Context Retrieval - -``` -GET /api/context/recent -GET /api/context/timeline -GET /api/timeline/by-query -``` - -**Parameters**: -- `project` - Filter by project -- `limit` - Number of sessions/records -- `anchor` - Timeline anchor point (ID or timestamp) -- `depth_before` - Records before anchor -- `depth_after` - Records after anchor - -**Example**: -```bash -curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=5" -``` - -### Documentation - -``` -GET /api/search/help -``` - -Returns API documentation in JSON format. 
- -## Progressive Disclosure Pattern - -The mem-search skill uses progressive disclosure to minimize token usage: - -### Layer 1: Skill Frontmatter (Session Start) - -**What's Loaded**: Skill description and when to use it (~250 tokens) - -**Purpose**: Claude can recognize when to invoke the skill - -**Example**: -```markdown -# Claude-Mem mem-search Skill - -Access claude-mem's persistent memory through a comprehensive HTTP API. - -## When to Use This Skill -Invoke this skill when users ask about: -- Past work: "What did we do last session?" -- Bug fixes: "Did we fix this before?" -... -``` - -### Layer 2: Full Skill Instructions (On-Demand) - -**What's Loaded**: Complete operation documentation (~2,500 tokens) - -**Purpose**: Detailed instructions for each search operation - -**When Loaded**: Only when Claude invokes the skill - -**Example Structure**: -``` -/skills/search/ -├── SKILL.md (main frontmatter) -├── operations/ -│ ├── observations.md (detailed instructions) -│ ├── sessions.md -│ ├── prompts.md -│ ├── by-type.md -│ ├── by-concept.md -│ ├── by-file.md -│ ├── recent-context.md -│ ├── timeline.md -│ ├── timeline-by-query.md -│ ├── help.md -│ ├── formatting.md -│ └── common-workflows.md -``` - -### Layer 3: API Response - -**What's Returned**: Search results in requested format - -**Format Options**: -- `index` - Titles, dates, IDs only (~50-100 tokens per result) -- `full` - Complete details (~500-1000 tokens per result) - -**Progressive Usage**: Start with `index`, drill down with `full` as needed - -## Implementation Details - -### mem-search Skill Structure - -``` -plugin/skills/mem-search/ -├── SKILL.md # Main frontmatter (~250 tokens) -├── operations/ -│ ├── observations.md # Search observations -│ ├── sessions.md # Search sessions -│ ├── prompts.md # Search prompts -│ ├── by-type.md # Filter by type -│ ├── by-concept.md # Filter by concept -│ ├── by-file.md # Filter by file -│ ├── recent-context.md # Get recent context -│ ├── timeline.md # Timeline 
around point -│ ├── timeline-by-query.md # Search + timeline -│ ├── help.md # API documentation -│ ├── formatting.md # Result formatting guide -│ └── common-workflows.md # Usage patterns -``` - -### Worker Service Integration - -**File**: `src/services/worker-service.ts` - -**Search Routes**: -```typescript -// Full-text search -app.get('/api/search/observations', handleSearchObservations); -app.get('/api/search/sessions', handleSearchSessions); -app.get('/api/search/prompts', handleSearchPrompts); - -// Filtered search -app.get('/api/search/by-type', handleSearchByType); -app.get('/api/search/by-concept', handleSearchByConcept); -app.get('/api/search/by-file', handleSearchByFile); - -// Context retrieval -app.get('/api/context/recent', handleRecentContext); -app.get('/api/context/timeline', handleTimeline); -app.get('/api/timeline/by-query', handleTimelineByQuery); - -// Documentation -app.get('/api/search/help', handleHelp); -``` - -**Database Access**: -- Uses `SessionSearch` service for FTS5 queries -- Uses `SessionStore` for structured queries -- Hybrid search with ChromaDB for semantic similarity - -### Security - -**FTS5 Injection Prevention** (v4.2.3): ```typescript function escapeFTS5Query(query: string): string { return query.replace(/"/g, '""'); } ``` -All user-provided search queries are properly escaped to prevent SQL injection. +**Testing:** 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators. -**Comprehensive Testing**: 332 injection attack tests covering: -- Special characters -- SQL keywords -- Quote escaping -- Boolean operators +### MCP Protocol Security -## Benefits +- Stdio transport (no network exposure) +- Local-only HTTP API (localhost:37777) +- No authentication needed (local development only) -### 1. 
Token Efficiency +## Performance -**Before (MCP)**: -- Session start: All tool definitions loaded upfront -- Every session pays this cost -- No progressive disclosure +**FTS5 Full-Text Search:** <10ms for typical queries -**After (Skill)**: -- Session start: Minimal token cost for skill frontmatter -- Full instructions loaded only when invoked (progressive disclosure) -- More efficient than loading all tool definitions upfront +**MCP Overhead:** Minimal - simple protocol translation -### 2. Natural Language Interface +**Caching:** HTTP layer allows response caching (future enhancement) -**Before**: Users needed to learn MCP tool syntax -``` -search_observations with query="authentication" and type="decision" -``` +**Pagination:** Efficient with offset/limit -**After**: Users ask naturally -``` -"What decisions did we make about authentication?" -``` +**Batching:** `get_observations` accepts multiple IDs in single call -Claude translates to appropriate API call. +## Benefits Over Alternative Approaches -### 3. Flexibility +### vs. Traditional RAG -**HTTP API Benefits**: -- Can be called from skills, MCP tools, or other clients -- Easy to test with curl -- Standard REST conventions -- JSON responses +**Traditional RAG:** +- Fetches everything upfront +- High token cost +- Low relevance ratio -**Progressive Disclosure**: -- Loads only what's needed -- Can add more operations without increasing base cost -- Documentation co-located with operations +**3-Layer MCP:** +- Fetches only what's needed +- ~10x token savings +- 100% relevance (Claude chooses what to fetch) -### 4. Performance +### vs. 
Previous MCP Implementation (v5.x) -**Fast Queries**: FTS5 full-text search under 10ms for typical queries +**Previous (9 tools):** +- Complex schemas +- Overlapping operations +- No workflow guidance +- ~2,500 tokens in definitions -**Caching**: HTTP layer allows response caching +**Current (4 tools):** +- Simple schemas +- Clear workflow +- Built-in guidance +- ~312 lines of code -**Pagination**: Efficient result pagination with offset/limit +### vs. Skill-Based Approach (Previously) -## Migration Notes +**Skill approach:** +- Required separate skill files +- HTTP API called directly via curl +- Progressive disclosure through skill loading -### For Users +**MCP approach:** +- Native MCP protocol (better Claude integration) +- Cleaner architecture (protocol translation layer) +- Works with both Claude Desktop and Claude Code +- Simpler to maintain (no skill files) -**No Action Required**: The migration from MCP to skill-based search is transparent. - -**Same Questions Work**: Natural language queries work exactly the same way. - -**Invisible Change**: Users won't notice any difference except better performance. - -### For Developers - -**Renamed**: MCP server (formerly `search-server.ts`, now `src/servers/mcp-server.ts`) -- Source file kept for reference -- No longer built or registered -- MCP configuration removed from `plugin/.mcp.json` - -**New Implementation**: Skill-based search -- Skill files: `plugin/skills/mem-search/` -- HTTP endpoints: `src/services/worker-service.ts` (lines 200-400) -- Build script: `npm run build` includes skill files -- Sync script: `npm run sync-marketplace` copies to plugin directory +**Migration:** Skill-based search was removed in favor of streamlined MCP architecture. ## Troubleshooting +### MCP Server Not Connected + +**Symptoms:** Tools not appearing in Claude + +**Solution:** +1. Check MCP server path in configuration +2. Verify worker service is running: `curl http://localhost:37777/api/health` +3. 
Restart Claude Desktop/Code + ### Worker Service Not Running -If searches fail, check worker service: +**Symptoms:** MCP tools fail with connection errors +**Solution:** ```bash npm run worker:status # Check status npm run worker:restart # Restart worker npm run worker:logs # View logs ``` -### HTTP Endpoints Not Responding +### Empty Search Results -Test endpoints directly: +**Symptoms:** search() returns no results -```bash -# Health check -curl http://localhost:37777/health - -# Search test -curl "http://localhost:37777/api/search/observations?query=test&limit=1" -``` - -### Skill Not Invoking - -If Claude doesn't invoke the mem-search skill automatically: - -1. Check skill files exist: `ls ~/.claude/plugins/marketplaces/thedotmack/plugin/skills/mem-search/` -2. Restart Claude Code session to reload skill definitions -3. Try more explicit phrasing: "Search past sessions for bug fixes" or "What did we do in yesterday's session?" -4. Ensure your question is about previous sessions (not current conversation context) +**Troubleshooting:** +1. Test API directly: `curl "http://localhost:37777/api/search?query=test"` +2. Check database: `ls ~/.claude-mem/claude-mem.db` +3. 
Verify observations exist: `curl "http://localhost:37777/api/health"` ## Next Steps -- [Search Tools Usage](/usage/search-tools) - User guide with examples +- [Memory Search Usage](/usage/search-tools) - User guide with examples +- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow - [Worker Service Architecture](/architecture/worker-service) - HTTP API details - [Database Schema](/architecture/database) - FTS5 tables and indexes diff --git a/docs/public/progressive-disclosure.mdx b/docs/public/progressive-disclosure.mdx index 47188a25..f3f4c703 100644 --- a/docs/public/progressive-disclosure.mdx +++ b/docs/public/progressive-disclosure.mdx @@ -260,14 +260,12 @@ The index is useless without retrieval mechanisms: *Use claude-mem MCP search to access records with the given ID* ``` -**Available tools:** -- `search_observations` - Full-text search -- `find_by_concept` - Concept-based retrieval -- `find_by_file` - File-based retrieval -- `find_by_type` - Type-based retrieval -- `get_recent_context` - Recent session summaries +**Available MCP tools:** +- `search` - Search memory index (Layer 1: Get IDs) +- `timeline` - Get chronological context (Layer 2: See narrative arc) +- `get_observations` - Fetch full details (Layer 3: Deep dive) -Each tool supports `format: "index"` (default) and `format: "full"`. +The 3-layer workflow ensures progressive disclosure: index → context → details. --- @@ -318,16 +316,18 @@ Is my task related to npm? 
→ YES --- -## The Two-Tier Search Strategy +## The Three-Layer Workflow -Claude-Mem implements progressive disclosure in search results too: +Claude-Mem implements progressive disclosure through a 3-layer workflow pattern: -### Tier 1: Index Format (Default) +### Layer 1: Search (Index) + +Start by searching to get a compact index with IDs: ```typescript -search_observations({ +search({ query: "hook timeout", - format: "index" // Default + limit: 10 }) ``` @@ -335,23 +335,40 @@ search_observations({ ``` Found 3 observations matching "hook timeout": -| ID | Date | Type | Title | Tokens | -|----|------|------|-------|--------| -| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 | -| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 | -| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 | +| ID | Date | Type | Title | +|----|------|------|-------| +| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | +| #2891 | Oct 25 | how-it-works | Hook timeout configuration | +| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ``` -**Cost:** ~100 tokens for 3 results -**Value:** Agent can scan and decide which to fetch +**Cost:** ~50-100 tokens per result +**Value:** Agent can scan and decide which observations are relevant -### Tier 2: Full Format (On-Demand) +### Layer 2: Timeline (Context) + +Get chronological context around interesting observations: ```typescript -search_observations({ - query: "hook timeout", - format: "full", - limit: 1 // Fetch just the most relevant +timeline({ + anchor: 2543, // Observation ID from search + depth_before: 3, + depth_after: 3 +}) +``` + +**Returns:** Chronological view showing what happened before/during/after observation #2543 + +**Cost:** Variable based on depth +**Value:** Understand narrative arc and context + +### Layer 3: Get Observations (Details) + +Fetch full details only for relevant observations: + +```typescript +get_observations({ + ids: [2543, 2102] // Selected from search 
results }) ``` @@ -463,29 +480,30 @@ Here are 10 observations. *Use MCP search tools to fetch full observation details on-demand* ``` -### ❌ Defaulting to Full Format +### ❌ Skipping the Index Layer **Bad:** ```typescript -search_observations({ - query: "hooks", - format: "full" // Fetches everything +// Fetching full details immediately +get_observations({ + ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] // Guessing which are relevant }) ``` **Good:** ```typescript -search_observations({ +// Follow the 3-layer workflow +// Layer 1: Search for index +search({ query: "hooks", - format: "index", // Scan first limit: 20 }) -// Then, if needed: -search_observations({ - query: "hooks", - format: "full", - limit: 1 // Just the most relevant +// Layer 2: Review index, identify 2-3 relevant IDs + +// Layer 3: Fetch only relevant observations +get_observations({ + ids: [2543, 2891] // Just the most relevant }) ``` @@ -595,10 +613,9 @@ SessionStart({ source: "compact" }): ```typescript // Use embeddings to pre-sort index by relevance -search_observations({ +search({ query: "authentication bug", - format: "index", - sort: "relevance" // Based on semantic similarity + orderBy: "relevance" // Based on semantic similarity (future enhancement) }) ``` diff --git a/docs/public/troubleshooting.mdx b/docs/public/troubleshooting.mdx index 51e7c7d7..826a7793 100644 --- a/docs/public/troubleshooting.mdx +++ b/docs/public/troubleshooting.mdx @@ -742,17 +742,17 @@ sqlite3 ~/.claude-mem/claude-mem.db " 3. Test simple query: ```bash - # In Claude Code - search_observations with query="test" + # Test MCP search tool + search(query="test", limit=5) ``` 4. 
Check query syntax: ```bash - # Bad: Special characters - search_observations with query="[test]" + # Bad: Special characters may cause issues + search(query="[test]") # Good: Simple words - search_observations with query="test" + search(query="test") ``` ### Token Limit Errors @@ -761,28 +761,40 @@ sqlite3 ~/.claude-mem/claude-mem.db " **Solutions**: -1. Use index format: +1. Follow 3-layer workflow (don't skip to get_observations): ```bash - search_observations with query="..." and format="index" + # Start with search to get index + search(query="...", limit=10) + + # Review IDs, then fetch only relevant ones + get_observations(ids=[<2-3 relevant IDs>]) ``` -2. Reduce limit: +2. Reduce limit in search: ```bash - search_observations with query="..." and limit=3 + search(query="...", limit=3) ``` 3. Use filters to narrow results: ```bash - search_observations with query="..." and type="decision" and limit=5 + search(query="...", type="decision", limit=5) ``` 4. Paginate results: ```bash # First page - search_observations with query="..." and limit=5 and offset=0 + search(query="...", limit=5, offset=0) # Second page - search_observations with query="..." and limit=5 and offset=5 + search(query="...", limit=5, offset=5) + ``` + +5. 
Batch IDs in get_observations: + ```bash + # Always batch multiple IDs in one call + get_observations(ids=[123, 456, 789]) + + # Don't make separate calls per ID ``` ## Performance Issues diff --git a/docs/public/usage/search-tools.mdx b/docs/public/usage/search-tools.mdx index 32343138..6ddf7b6d 100644 --- a/docs/public/usage/search-tools.mdx +++ b/docs/public/usage/search-tools.mdx @@ -1,403 +1,454 @@ --- -title: "mem-search Skill" -description: "Query your project history with natural language" +title: "Memory Search" +description: "Search your project history with MCP tools" --- -# mem-search Skill Usage +# Memory Search with MCP Tools -Once claude-mem is installed as a plugin, you can search your project history using natural language. Claude automatically invokes the mem-search skill when you ask about past work. +Claude-mem provides persistent memory across sessions through **4 MCP tools** that follow a token-efficient **3-layer workflow pattern**. -## How It Works +## Overview -**v5.5.0 Enhancement**: The search skill was renamed to "mem-search" for better scope differentiation, with effectiveness increased from 67% to 100% and enhanced concrete triggers (85% vs 44%). +Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach: -**v5.4.0 Architecture**: Claude-Mem uses a skill-based search architecture instead of MCP tools, saving ~2,250 tokens per session start through progressive disclosure. +1. **Search** → Get a compact index with IDs (~50-100 tokens/result) +2. **Timeline** → Get context around interesting results +3. **Get Observations** → Fetch full details ONLY for filtered IDs -**Simple Usage:** -- Just ask naturally: *"What did we do last session?"* -- Claude recognizes the intent and invokes the mem-search skill -- The skill uses HTTP API endpoints to query your memory -- Results are formatted and presented to you +This achieves **~10x token savings** compared to traditional RAG approaches. 
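The arithmetic behind that savings claim can be sanity-checked directly. A minimal sketch, using midpoints of the per-result token estimates quoted in this guide (~50-100 tokens per index row, ~500-1,000 per full observation) — these are illustrative assumptions, not measured values, and the actual ratio depends on how many index hits turn out to be relevant:

```python
# Rough token accounting: 3-layer workflow vs. fetching everything upfront.
# Per-result figures are midpoints of the estimates in this guide
# (illustrative assumptions, not measurements).

INDEX_TOKENS = 75    # one row in the search() index (~50-100)
DETAIL_TOKENS = 750  # one full observation from get_observations() (~500-1,000)

def naive_cost(total_results: int) -> int:
    """Fetch full details for every search hit upfront."""
    return total_results * DETAIL_TOKENS

def layered_cost(total_results: int, relevant: int, timeline_tokens: int = 500) -> int:
    """Layer 1 index + Layer 2 timeline context + Layer 3 details for filtered IDs."""
    return total_results * INDEX_TOKENS + timeline_tokens + relevant * DETAIL_TOKENS

naive = naive_cost(20)         # 20 full observations upfront
layered = layered_cost(20, 3)  # index of 20, one timeline, 3 full fetches
print(naive, layered, round(naive / layered, 1))  # → 15000 4250 3.5
```

The savings grow as the result set widens while the relevant fraction shrinks: with a fixed number of relevant hits, the ratio approaches the index-to-detail cost ratio (~10x), which is where the headline figure comes from.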
-**Benefits:** -- **Token Efficient**: ~250 tokens (skill frontmatter) vs ~2,500 tokens (MCP tool definitions) -- **Natural Language**: No need to learn specific tool syntax -- **Progressive Disclosure**: Only loads detailed instructions when needed -- **Auto-Invoked**: Claude knows when to search based on your questions -- **Scope Differentiation**: "mem-search" clearly distinguishes from native conversation memory +## The 3-Layer Workflow -## Quick Reference +### Layer 1: Search (Index) -| Operation | Purpose | -|-------------------------|----------------------------------------------| -| Search Observations | Full-text search across observations | -| Search Sessions | Full-text search across session summaries | -| Search Prompts | Full-text search across raw user prompts | -| By Concept | Find observations tagged with concepts | -| By File | Find observations referencing files | -| By Type | Find observations by type | -| Recent Context | Get recent session context | -| Timeline | Get unified timeline around a specific point | -| Timeline by Query | Search and get timeline context in one step | -| API Help | Get search API documentation | - -## Example Queries - -### Natural Language Queries - -**Search Observations:** -``` -"What bugs did we fix related to authentication?" -"Show me all decisions about the build system" -"Find refactoring work on the database" -``` - -**Search Sessions:** -``` -"What did we learn about hooks?" -"What was accomplished in the API implementation?" -"Show me recent work on this project" -``` - -**Search Prompts:** -``` -"When did I ask about authentication features?" -"Find all my requests about dark mode" -``` - -**Note**: Claude automatically translates your natural language queries into the appropriate search operations. - -### Search by File +Start by searching to get a lightweight index of results: ``` -"Show me everything related to worker-service.ts" -"What changes were made to migrations.ts?" 
-"Find all work on the database file" +search(query="authentication bug", type="bugfix", limit=10) ``` -### Search by Concept +**Returns:** Compact table with IDs, titles, dates, types +**Cost:** ~50-100 tokens per result +**Purpose:** Survey what exists before fetching details + +### Layer 2: Timeline (Context) + +Get chronological context around specific observations: ``` -"Show observations tagged with architecture" -"Find all security-related observations" -"What patterns have we used?" +timeline(anchor=, depth_before=3, depth_after=3) ``` -### Search by Type +Or search and get timeline in one step: ``` -"Find all feature implementations" -"Show me all decisions and discoveries" -"What bugs have we fixed?" +timeline(query="authentication", depth_before=2, depth_after=2) ``` -### Recent Context +**Returns:** Chronological view showing what was happening before/after +**Cost:** Variable, depends on depth +**Purpose:** Understand narrative arc and context + +### Layer 3: Get Observations (Details) + +Fetch full details only for relevant observations: ``` -"Show me what we've been working on" -"Get context from the last 5 sessions" -"What happened recently on this project?" 
+get_observations(ids=[123, 456, 789]) ``` -### Timeline Queries +**Returns:** Complete observation details (narrative, facts, files, concepts) +**Cost:** ~500-1000 tokens per observation +**Purpose:** Deep dive on specific, validated items -**Get timeline around a specific point:** +### Why This Works + +**Traditional Approach:** +- Fetch everything upfront: 20,000 tokens +- Relevance: ~10% (2,000 tokens actually useful) +- Waste: 18,000 tokens on irrelevant context + +**3-Layer Approach:** +- Search index: 1,000 tokens (10 results) +- Timeline context: 500 tokens (around 2 key results) +- Fetch details: 1,500 tokens (3 observations) +- **Total: 3,000 tokens, 100% relevant** + +## Available Tools + +### `__IMPORTANT` - Workflow Documentation + +Always visible reminder of the 3-layer workflow pattern. Helps Claude understand how to use the search tools efficiently. + +**Usage:** Automatically shown, no need to invoke + +### `search` - Search Memory Index + +Search your memory and get a compact index with IDs. + +**Parameters:** +- `query` - Full-text search query (supports AND, OR, NOT, phrase searches) +- `limit` - Maximum results (default: 20) +- `offset` - Skip first N results for pagination +- `type` - Filter by observation type (bugfix, feature, decision, discovery, refactor, change) +- `obs_type` - Filter by record type (observation, session, prompt) +- `project` - Filter by project name +- `dateStart` - Filter by start date (YYYY-MM-DD) +- `dateEnd` - Filter by end date (YYYY-MM-DD) +- `orderBy` - Sort order (date_desc, date_asc, relevance) + +**Returns:** Compact index table with IDs, titles, dates, types + +**Example:** ``` -"What was happening when we implemented authentication?" -"Show me the context around that bug fix" -"What led to the decision to refactor the database?" 
+search(query="database migration", type="bugfix", limit=5, orderBy="date_desc") ``` -**Timeline by query:** +### `timeline` - Get Chronological Context + +Get a chronological view of observations around a specific point or query. + +**Parameters:** +- `anchor` - Observation ID to center timeline around (optional if query provided) +- `query` - Search query to find anchor automatically (optional if anchor provided) +- `depth_before` - Number of observations before anchor (default: 3) +- `depth_after` - Number of observations after anchor (default: 3) +- `project` - Filter by project name + +**Returns:** Chronological list showing what happened before/during/after + +**Example:** ``` -"Find when we added the viewer UI and show what happened around that time" -"Search for authentication work and show the timeline" +timeline(anchor=12345, depth_before=5, depth_after=5) ``` -**Benefits:** -- See the complete narrative arc around key events -- All record types (observations, sessions, prompts) in chronological view -- Understand what was happening before and after important changes - -## Search Strategy - -The mem-search skill uses a progressive disclosure pattern to efficiently retrieve information: - -### 1. Ask Naturally - -Start with a natural language question: +Or search-based: ``` -"What bugs did we fix related to authentication?" +timeline(query="implemented JWT auth", depth_before=3, depth_after=3) ``` -### 2. Claude Invokes mem-search Skill +### `get_observations` - Fetch Full Details -Claude recognizes your intent and loads the mem-search skill (~250 tokens for skill frontmatter). +Fetch complete observation details by IDs. **Always batch multiple IDs in a single call for efficiency.** -### 3. 
Skill Uses HTTP API +**Parameters:** +- `ids` - Array of observation IDs (required) +- `orderBy` - Sort order (date_desc, date_asc) +- `limit` - Maximum observations to return +- `project` - Filter by project name -The skill calls the appropriate HTTP endpoint (e.g., `/api/search/observations`) with the query. +**Returns:** Complete observation details including narrative, facts, files, concepts -### 4. Results Formatted - -Results are formatted and presented to you, usually starting with an index/summary format. - -### 5. Deep Dive if Needed - -If you need more details, ask follow-up questions: +**Example:** ``` -"Tell me more about observation #123" -"Show me the full details of that decision" +get_observations(ids=[123, 456, 789, 1011]) ``` -**Benefits of This Approach:** -- **Token Efficient**: Only loads what you need, when you need it -- **Natural**: No syntax to learn -- **Progressive**: Start with overview, drill down as needed -- **Automatic**: Claude handles the search invocation +**Important:** Always batch IDs instead of making separate calls per observation. 
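The batching rule above can be sketched as a small client-side helper: collect candidate IDs while scanning the search index, then issue one call. This is a hypothetical sketch, not part of claude-mem — `call_tool` stands in for whatever MCP client invocation your environment provides, and the return shape is assumed:

```python
# Hypothetical client-side helper illustrating the "always batch" rule.
# `call_tool` is a stand-in for your MCP client's tool-invocation function;
# it is NOT part of claude-mem itself, and the result shape is assumed.

def fetch_observations(call_tool, ids):
    """Deduplicate IDs (preserving first-seen order) and fetch them in ONE call."""
    unique_ids = list(dict.fromkeys(ids))  # drop repeats, keep order
    if not unique_ids:
        return []
    # One round trip instead of len(unique_ids) separate get_observations calls.
    return call_tool("get_observations", {"ids": unique_ids})

# Stub client for demonstration: records each call and echoes the request.
calls = []
def stub_call(name, args):
    calls.append((name, args))
    return [{"id": i} for i in args["ids"]]

result = fetch_observations(stub_call, [123, 456, 123, 789])
print(len(calls), calls[0][1]["ids"])  # → 1 [123, 456, 789]
```

Per-ID calls pay the tool-call overhead (and a model round trip) once per observation; a single batched call pays it once in total.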
+
+## Common Use Cases
+
+### Debugging Issues
+
+**Scenario:** Find what went wrong with database connections
+
+```
+Step 1: search(query="error database connection", type="bugfix", limit=10)
+  → Review index, identify observations #245, #312, #489
+
+Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
+  → See what was happening around the fix
+
+Step 3: get_observations(ids=[312, 489])
+  → Get full details on relevant fixes
+```
+
+### Understanding Decisions
+
+**Scenario:** Review architectural choices about authentication
+
+```
+Step 1: search(query="authentication", type="decision", limit=5)
+  → Find decision observations
+
+Step 2: get_observations(ids=[<relevant-ids>])
+  → Get full decision rationale, trade-offs, facts
+```
+
+### Code Archaeology
+
+**Scenario:** Find when a specific file was modified
+
+```
+Step 1: search(query="worker-service.ts", limit=20)
+  → Get all observations mentioning that file
+
+Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
+  → See what led to and followed from the refactor
+
+Step 3: get_observations(ids=[<relevant-ids>])
+  → Get implementation details
+```
+
+### Feature History
+
+**Scenario:** Track how a feature evolved
+
+```
+Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
+  → Chronological view of feature work
+
+Step 2: timeline(anchor=<earliest-feature-id>, depth_after=10)
+  → See the full development timeline
+
+Step 3: get_observations(ids=[<relevant-ids>])
+  → Deep dive on critical implementation points
+```
+
+### Learning from Past Work
+
+**Scenario:** Review refactoring patterns
+
+```
+Step 1: search(type="refactor", limit=10, orderBy="date_desc")
+  → Recent refactoring work
+
+Step 2: get_observations(ids=[<relevant-ids>])
+  → Study the patterns and approaches used
+```
+
+### Context Recovery
+
+**Scenario:** Restore context after time away from a project
+
+```
+Step 1: search(project="<project-name>", limit=10, orderBy="date_desc")
+  → See recent work
+
+Step 2: timeline(anchor=<most-recent-id>, depth_before=10)
+  → Understand what
led to current state
+
+Step 3: get_observations(ids=[<relevant-ids>])
+  → Refresh memory on key decisions
+```
+
+## Search Query Syntax
+
+The `query` parameter supports SQLite FTS5 full-text search syntax:
+
+### Boolean Operators
+
+```
+query="authentication AND JWT"     # Both terms must appear
+query="OAuth OR JWT"               # Either term can appear
+query="security NOT deprecated"    # Exclude deprecated items
+```
+
+### Phrase Searches
+
+```
+query='"database migration"'       # Exact phrase match
+```
+
+### Column-Specific Searches
+
+```
+query="title:authentication"       # Search in title only
+query="content:database"           # Search in content only
+query="concepts:security"          # Search in concepts only
+```
+
+### Combining Operators
+
+```
+query='"user auth" AND (JWT OR session) NOT deprecated'
+```
+
+## Token Management
+
+### Token Efficiency Best Practices
+
+1. **Always start with search** - Get index first (~50-100 tokens/result)
+2. **Use small limits** - Start with 3-5 results, increase if needed
+3. **Filter before fetching** - Use type, date, project filters
+4. **Batch get_observations** - Always group multiple IDs in one call
+5. 
**Use timeline strategically** - Get context only when narrative matters + +### Token Cost Estimates + +| Operation | Tokens per Result | +|-----------|-------------------| +| search (index) | 50-100 | +| timeline (per observation) | 100-200 | +| get_observations (full details) | 500-1,000 | + +**Example Comparison:** + +**Inefficient:** +``` +# Fetching 20 full observations upfront: 10,000-20,000 tokens +get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]) +``` + +**Efficient:** +``` +# Search index: ~1,000 tokens +search(query="bug fix", limit=20) + +# Review IDs, identify 3 relevant observations + +# Fetch only relevant: ~1,500-3,000 tokens +get_observations(ids=[5, 12, 18]) + +# Total: 2,500-4,000 tokens (vs 10,000-20,000) +``` ## Advanced Filtering -You can refine searches using natural language filters: - ### Date Ranges ``` -"What bugs did we fix in October?" -"Show me work from last week" -"Find decisions made between October 1-31" +search( + query="performance optimization", + dateStart="2025-10-01", + dateEnd="2025-10-31" +) ``` ### Multiple Types -``` -"Show me all decisions and features" -"Find bugfixes and refactorings" -``` - -### Concepts +For observations of multiple types, make multiple searches or use broader query: ``` -"Find database work related to architecture and performance" -"Show security observations" +search(query="database", type="bugfix", limit=10) +search(query="database", type="feature", limit=10) ``` -### File-Specific +### Project-Specific ``` -"Show refactoring work that touched worker-service.ts" -"Find changes to auth files" +search(query="API", project="my-app", limit=15) ``` -### Project Filtering +### Pagination ``` -"Show authentication work on my-app project" -"What have we done on this codebase?" 
+# First page +search(query="refactor", limit=10, offset=0) + +# Second page +search(query="refactor", limit=10, offset=10) + +# Third page +search(query="refactor", limit=10, offset=20) ``` -**Note**: Claude translates your natural language into the appropriate API filters automatically. - -## Under the Hood: HTTP API - -The mem-search skill uses HTTP endpoints on the worker service (port 37777): - -- `GET /api/search/observations` - Full-text search observations -- `GET /api/search/sessions` - Full-text search session summaries -- `GET /api/search/prompts` - Full-text search user prompts -- `GET /api/search/by-concept` - Find observations by concept tag -- `GET /api/search/by-file` - Find work related to specific files -- `GET /api/search/by-type` - Find observations by type -- `GET /api/context/recent` - Get recent session context -- `GET /api/context/timeline` - Get timeline around specific point -- `GET /api/timeline/by-query` - Search + timeline in one call -- `GET /api/search/help` - API documentation - -These endpoints use FTS5 full-text search with support for: -- Boolean operators (AND, OR, NOT) -- Phrase searches -- Column-specific searches -- Date range filtering -- Project filtering - ## Result Metadata -All results include rich metadata: +All observations include rich metadata: -``` -## JWT authentication decision - -**Type**: decision -**Date**: 2025-10-21 14:23:45 -**Concepts**: authentication, security, architecture -**Files Read**: src/auth/middleware.ts, src/utils/jwt.ts -**Files Modified**: src/auth/jwt-strategy.ts - -**Narrative**: -Decided to implement JWT-based authentication instead of session-based -authentication for better scalability and stateless design... 
- -**Facts**: -• JWT tokens expire after 1 hour -• Refresh tokens stored in httpOnly cookies -• Token signing uses RS256 algorithm -• Public keys rotated every 30 days -``` - -## Citations - -All search results include observation IDs that can be accessed via the HTTP API: - -- `http://localhost:37777/api/observation/{id}` - Get specific observation by ID -- View all observations in the web viewer at `http://localhost:37777` - -These citations enable referencing specific historical context in your work. - -## Token Management - -### Token Efficiency Tips - -1. **Start with index format**: ~50-100 tokens per result -2. **Use small limits**: Start with 3-5 results -3. **Apply filters**: Narrow results before searching -4. **Paginate**: Use offset to browse results in batches - -### Token Estimates - -| Format | Tokens per Result | -|--------|-------------------| -| Index | 50-100 | -| Full | 500-1000 | - -**Example**: -- 20 results in index format: ~1,000-2,000 tokens -- 20 results in full format: ~10,000-20,000 tokens - -## Common Use Cases - -### 1. Debugging Issues - -Find what went wrong: -``` -search_observations with query="error database connection" and type="bugfix" -``` - -### 2. Understanding Decisions - -Review architectural choices: -``` -find_by_type with type="decision" and format="index" -``` - -Then deep dive on specific decisions: -``` -search_observations with query="[DECISION TITLE]" and format="full" -``` - -### 3. Code Archaeology - -Find when a file was modified: -``` -find_by_file with filePath="worker-service.ts" -``` - -### 4. Feature History - -Track feature development: -``` -search_sessions with query="authentication feature" -search_user_prompts with query="add authentication" -``` - -### 5. Learning from Past Work - -Review refactoring patterns: -``` -find_by_type with type="refactor" and limit=10 -``` - -### 6. 
Context Recovery - -Restore context after time away: -``` -get_recent_context with limit=5 -search_sessions with query="[YOUR PROJECT NAME]" and orderBy="date_desc" -``` - -## Best Practices - -1. **Index first, full later**: Always start with index format -2. **Small limits**: Start with 3-5 results to avoid token limits -3. **Use filters**: Narrow results before searching -4. **Specific queries**: More specific = better results -5. **Review citations**: Use citations to reference past decisions -6. **Date filtering**: Use date ranges for time-based searches -7. **Type filtering**: Use types to categorize searches -8. **Concept tags**: Use concepts for thematic searches +- **ID** - Unique observation identifier +- **Type** - bugfix, feature, decision, discovery, refactor, change +- **Date** - When the work occurred +- **Title** - Concise description +- **Concepts** - Tagged themes (e.g., security, performance, architecture) +- **Files Read** - Files examined during work +- **Files Modified** - Files changed during work +- **Narrative** - Story of what happened and why +- **Facts** - Key factual points (decisions made, patterns used, metrics) ## Troubleshooting ### No Results Found -1. Check database has data: +1. **Broaden your search:** + ``` + # Too specific + search(query="JWT authentication implementation with RS256") + + # Better + search(query="authentication") + ``` + +2. **Check database has data:** ```bash - sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;" + curl "http://localhost:37777/api/search?query=test" ``` -2. Try broader natural language query: +3. **Try without filters:** ``` - "Show me anything about authentication" # Broader - vs - "Find exact JWT authentication implementation" # Too specific + # Remove type/date filters to see if data exists + search(query="your-search-term") ``` -3. Ask without filters first: - ``` - "What do we have about auth?" 
-   # Then narrow down
-   "Show me auth-related decisions"
-   ```

+### IDs Not Found in get_observations

-### Worker Service Not Running
+**Error:** "Observation IDs not found: [123, 456]"

-If search isn't working, check the worker service:
+**Causes:**
+- IDs from different project (use `project` parameter)
+- IDs were deleted
+- Typo in ID numbers

-```bash
-npm run worker:status   # Check worker status
-npm run worker:restart  # Restart if needed
-npm run worker:logs     # View logs
-```
+**Solution:**
+```
+# Verify IDs exist
+search(query="<keywords-from-title>")
+
+# Use correct project filter
+get_observations(ids=[123, 456], project="correct-project-name")
+```

-Or describe the issue to Claude and the troubleshoot skill will automatically activate to provide diagnosis.
+### Token Limit Errors

-### Performance Issues
+**Error:** Response exceeds token limits
+
+**Solution:** Use the 3-layer workflow to reduce upfront costs:
+
+```
+# Instead of fetching 50 full observations:
+# get_observations(ids=[1,2,3,...,50])  # 25,000-50,000 tokens!
+
+# Do this:
+search(query="<your-query>", limit=50)  # ~2,500-5,000 tokens
+# Review index, identify 5 relevant observations
+get_observations(ids=[<5-most-relevant>])  # ~2,500-5,000 tokens
+# Total: 5,000-10,000 tokens (50-80% savings)
+```
+
+### Search Performance

 If searches seem slow:

-1. Be more specific in your queries
-2. Ask for recent work (naturally filters by date)
-3. Specify the project you're interested in
-4. Ask for fewer results initially
+1. Be more specific in queries (helps FTS5 index)
+2. Use date range filters to narrow scope
+3. Specify project filter when possible
+4. Use smaller limit values
+
+## Best Practices
+
+1. **Index First, Details Later** - Always start with search to survey options
+2. **Filter Before Fetching** - Use search parameters to narrow results
+3. **Batch ID Fetches** - Group multiple IDs in one get_observations call
+4. **Use Timeline for Context** - When narrative matters, timeline shows the story
+5. 
**Specific Queries** - More specific = better relevance +6. **Small Limits Initially** - Start with 3-5 results, expand if needed +7. **Review Before Deep Dive** - Check index before fetching full details ## Technical Details -**Architecture Change (v5.4.0)**: -- **Before**: 9 MCP tools (~2,500 tokens in tool definitions per session start) -- **After**: 1 mem-search skill (~250 tokens in frontmatter, full instructions loaded on-demand) -- **Savings**: ~2,250 tokens per session start -- **Migration**: Transparent - users don't need to change how they ask questions +**Architecture:** MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search. -**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with improved effectiveness (67% → 100%) and enhanced triggers (44% → 85%). +**MCP Server:** Located at `~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs` -**How the Skill Works:** -1. User asks a question about past work -2. Claude recognizes the intent matches the mem-search skill description -3. Skill loads full instructions from `plugin/skills/mem-search/SKILL.md` -4. Skill uses `curl` to call HTTP API endpoints -5. Results formatted and returned to Claude -6. 
Claude presents results to user +**Worker Service:** Express API on port 37777, managed by Bun + +**Database:** SQLite FTS5 full-text search on `~/.claude-mem/claude-mem.db` + +**Vector Search:** Chroma embeddings for semantic search (underlying implementation) ## Next Steps +- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow - [Architecture Overview](/architecture/overview) - System components -- [Database Schema](/architecture/database) - Understanding the data -- [Getting Started](/usage/getting-started) - Automatic operation +- [Database Schema](/architecture/database) - Understanding the data structure +- [Claude Desktop Setup](/usage/claude-desktop) - Installation and configuration