Refactor search documentation to implement a 3-layer workflow for memory retrieval; update tool names and usage examples for clarity and efficiency. Enhance troubleshooting section with new error handling and token management strategies.

Alex Newman
2025-12-29 00:26:06 -05:00
parent f1aa4c3943
commit 00d0bc51e0
6 changed files with 1024 additions and 732 deletions
+357 -306
@@ -1,403 +1,454 @@
---
title: "Memory Search"
description: "Search your project history with MCP tools"
---
# Memory Search with MCP Tools
Claude-mem provides persistent memory across sessions through **4 MCP tools** that follow a token-efficient **3-layer workflow pattern**.
## Overview
Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach:
1. **Search** → Get a compact index with IDs (~50-100 tokens/result)
2. **Timeline** → Get context around interesting results
3. **Get Observations** → Fetch full details ONLY for filtered IDs
This achieves **~10x token savings** compared to traditional RAG approaches.
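The three layers can be sketched end-to-end with stub data. Note this is a hypothetical illustration: the wrapper functions and records below are stand-ins for the real MCP tool calls, not claude-mem's actual API.

```python
# Illustrative sketch of the 3-layer workflow. OBSERVATIONS and the
# wrapper functions are hypothetical stand-ins for the real MCP tools.

OBSERVATIONS = {
    101: {"title": "Fix auth token refresh bug", "type": "bugfix",
          "narrative": "Refresh tokens were expiring early..."},
    102: {"title": "Add dark mode toggle", "type": "feature",
          "narrative": "Implemented a theme switcher..."},
    103: {"title": "Fix login redirect loop", "type": "bugfix",
          "narrative": "OAuth callback redirected twice..."},
}

def search(query, type=None, limit=10):
    """Layer 1: return a compact index (IDs + titles), never full details."""
    hits = [(oid, o["title"]) for oid, o in OBSERVATIONS.items()
            if type is None or o["type"] == type]
    return hits[:limit]

def get_observations(ids):
    """Layer 3: fetch full details, batched in a single call."""
    return [OBSERVATIONS[i] for i in ids if i in OBSERVATIONS]

# Layer 1: survey what exists (cheap index)
index = search("auth", type="bugfix")
# Filter the index down to what actually looks relevant
relevant_ids = [oid for oid, title in index if "auth" in title.lower()]
# Layer 3: fetch full details only for the filtered IDs, in one batch
details = get_observations(relevant_ids)
```

The key design point: the expensive fetch happens last, against a list of IDs that has already been filtered using cheap index data.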
## The 3-Layer Workflow
### Layer 1: Search (Index)
Start by searching to get a lightweight index of results:
```
search(query="authentication bug", type="bugfix", limit=10)
```
**Returns:** Compact table with IDs, titles, dates, types
**Cost:** ~50-100 tokens per result
**Purpose:** Survey what exists before fetching details
### Layer 2: Timeline (Context)
Get chronological context around specific observations:
```
timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
```
Or search and get timeline in one step:
```
timeline(query="authentication", depth_before=2, depth_after=2)
```
**Returns:** Chronological view showing what was happening before/after
**Cost:** Variable, depends on depth
**Purpose:** Understand narrative arc and context
### Layer 3: Get Observations (Details)
Fetch full details only for relevant observations:
```
get_observations(ids=[123, 456, 789])
```
**Returns:** Complete observation details (narrative, facts, files, concepts)
**Cost:** ~500-1000 tokens per observation
**Purpose:** Deep dive on specific, validated items
### Why This Works
**Traditional Approach:**
- Fetch everything upfront: 20,000 tokens
- Relevance: ~10% (2,000 tokens actually useful)
- Waste: 18,000 tokens on irrelevant context
**3-Layer Approach:**
- Search index: 1,000 tokens (10 results)
- Timeline context: 500 tokens (around 2 key results)
- Fetch details: 1,500 tokens (3 observations)
- **Total: 3,000 tokens, 100% relevant**
## Available Tools
### `__IMPORTANT` - Workflow Documentation
Always visible reminder of the 3-layer workflow pattern. Helps Claude understand how to use the search tools efficiently.
**Usage:** Automatically shown, no need to invoke
### `search` - Search Memory Index
Search your memory and get a compact index with IDs.
**Parameters:**
- `query` - Full-text search query (supports AND, OR, NOT, phrase searches)
- `limit` - Maximum results (default: 20)
- `offset` - Skip first N results for pagination
- `type` - Filter by observation type (bugfix, feature, decision, discovery, refactor, change)
- `obs_type` - Filter by record type (observation, session, prompt)
- `project` - Filter by project name
- `dateStart` - Filter by start date (YYYY-MM-DD)
- `dateEnd` - Filter by end date (YYYY-MM-DD)
- `orderBy` - Sort order (date_desc, date_asc, relevance)
**Returns:** Compact index table with IDs, titles, dates, types
**Example:**
```
search(query="database migration", type="bugfix", limit=5, orderBy="date_desc")
```
### `timeline` - Get Chronological Context
Get a chronological view of observations around a specific point or query.
**Parameters:**
- `anchor` - Observation ID to center timeline around (optional if query provided)
- `query` - Search query to find anchor automatically (optional if anchor provided)
- `depth_before` - Number of observations before anchor (default: 3)
- `depth_after` - Number of observations after anchor (default: 3)
- `project` - Filter by project name
**Returns:** Chronological list showing what happened before/during/after
**Example:**
```
timeline(anchor=12345, depth_before=5, depth_after=5)
```
Or search-based:
```
timeline(query="implemented JWT auth", depth_before=3, depth_after=3)
```
### `get_observations` - Fetch Full Details
Fetch complete observation details by IDs. **Always batch multiple IDs in a single call for efficiency.**
**Parameters:**
- `ids` - Array of observation IDs (required)
- `orderBy` - Sort order (date_desc, date_asc)
- `limit` - Maximum observations to return
- `project` - Filter by project name
**Returns:** Complete observation details including narrative, facts, files, concepts
**Example:**
```
get_observations(ids=[123, 456, 789, 1011])
```
**Important:** Always batch IDs instead of making separate calls per observation.
## Common Use Cases
### Debugging Issues
**Scenario:** Find what went wrong with database connections
```
Step 1: search(query="error database connection", type="bugfix", limit=10)
→ Review index, identify observations #245, #312, #489
Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
→ See what was happening around the fix
Step 3: get_observations(ids=[312, 489])
→ Get full details on relevant fixes
```
### Understanding Decisions
**Scenario:** Review architectural choices about authentication
```
Step 1: search(query="authentication", type="decision", limit=5)
→ Find decision observations
Step 2: get_observations(ids=[<relevant_ids>])
→ Get full decision rationale, trade-offs, facts
```
### Code Archaeology
**Scenario:** Find when a specific file was modified
```
Step 1: search(query="worker-service.ts", limit=20)
→ Get all observations mentioning that file
Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
→ See what led to and followed from the refactor
Step 3: get_observations(ids=[<specific_observation_ids>])
→ Get implementation details
```
### Feature History
**Scenario:** Track how a feature evolved
```
Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
→ Chronological view of feature work
Step 2: timeline(anchor=<first_observation_id>, depth_after=10)
→ See the full development timeline
Step 3: get_observations(ids=[<key_milestones>])
→ Deep dive on critical implementation points
```
### Learning from Past Work
**Scenario:** Review refactoring patterns
```
Step 1: search(type="refactor", limit=10, orderBy="date_desc")
→ Recent refactoring work
Step 2: get_observations(ids=[<interesting_ids>])
→ Study the patterns and approaches used
```
### Context Recovery
**Scenario:** Restore context after time away from project
```
Step 1: search(query="project-name", limit=10, orderBy="date_desc")
→ See recent work
Step 2: timeline(anchor=<most_recent_id>, depth_before=10)
→ Understand what led to current state
Step 3: get_observations(ids=[<critical_observations>])
→ Refresh memory on key decisions
```
## Search Query Syntax
The `query` parameter supports SQLite FTS5 full-text search syntax:
### Boolean Operators
```
query="authentication AND JWT" # Both terms must appear
query="OAuth OR JWT" # Either term can appear
query="security NOT deprecated" # Exclude deprecated items
```
### Phrase Searches
```
query='"database migration"' # Exact phrase match
```
### Column-Specific Searches
```
query="title:authentication" # Search in title only
query="content:database" # Search in content only
query="concepts:security" # Search in concepts only
```
### Combining Operators
```
query='"user auth" AND (JWT OR session) NOT deprecated'
```
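These operators can be tried directly against a local SQLite FTS5 table. The sketch below is standalone and illustrative (the table and rows are made up, not claude-mem's actual schema):

```python
import sqlite3

# FTS5 is bundled with standard CPython sqlite3 builds
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE obs USING fts5(title, content, concepts)")
db.executemany("INSERT INTO obs VALUES (?, ?, ?)", [
    ("JWT authentication decision", "Chose JWT over session auth", "security"),
    ("Deprecated session auth removed", "Dropped deprecated session authentication", "cleanup"),
    ("Database migration fix", "Fixed database migration ordering", "database"),
])

def titles(match):
    """Run an FTS5 MATCH query and return matching titles."""
    return [r[0] for r in db.execute(
        "SELECT title FROM obs WHERE obs MATCH ?", (match,))]

boolean = titles("authentication NOT deprecated")  # boolean operators
phrase = titles('"database migration"')            # exact phrase
column = titles("concepts:security")               # column-specific
```

Matching is case-insensitive with the default tokenizer, so `deprecated` also excludes rows containing `Deprecated`.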
## Token Management
### Token Efficiency Best Practices
1. **Always start with search** - Get index first (~50-100 tokens/result)
2. **Use small limits** - Start with 3-5 results, increase if needed
3. **Filter before fetching** - Use type, date, project filters
4. **Batch get_observations** - Always group multiple IDs in one call
5. **Use timeline strategically** - Get context only when narrative matters
### Token Cost Estimates
| Operation | Tokens per Result |
|-----------|-------------------|
| search (index) | 50-100 |
| timeline (per observation) | 100-200 |
| get_observations (full details) | 500-1,000 |
**Example Comparison:**
**Inefficient:**
```
# Fetching 20 full observations upfront: 10,000-20,000 tokens
get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
```
**Efficient:**
```
# Search index: ~1,000 tokens
search(query="bug fix", limit=20)
# Review IDs, identify 3 relevant observations
# Fetch only relevant: ~1,500-3,000 tokens
get_observations(ids=[5, 12, 18])
# Total: 2,500-4,000 tokens (vs 10,000-20,000)
```
## Advanced Filtering
Refine searches using the available filter parameters:
### Date Ranges
```
search(
query="performance optimization",
dateStart="2025-10-01",
dateEnd="2025-10-31"
)
```
### Multiple Types
For observations of multiple types, make multiple searches or use a broader query:
```
search(query="database", type="bugfix", limit=10)
search(query="database", type="feature", limit=10)
```
### Project-Specific
```
search(query="API", project="my-app", limit=15)
```
### Pagination
```
# First page
search(query="refactor", limit=10, offset=0)
# Second page
search(query="refactor", limit=10, offset=10)
# Third page
search(query="refactor", limit=10, offset=20)
```
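Offset-based paging like this can be wrapped in a simple loop. A hypothetical sketch, with `fake_search` standing in for the real search tool and illustrative data:

```python
# Illustrative offset-based pagination. `fake_search` is a stand-in for
# the real search tool; RESULTS simulates a 25-item result set.

RESULTS = [f"observation-{i}" for i in range(25)]

def fake_search(query, limit=10, offset=0):
    """Return one page of results, like search(query, limit, offset)."""
    return RESULTS[offset:offset + limit]

pages = []
offset = 0
while True:
    page = fake_search("refactor", limit=10, offset=offset)
    if not page:          # an empty page means we've run out of results
        break
    pages.append(page)
    offset += 10
# pages -> 3 pages of sizes 10, 10, 5
```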
## Under the Hood: HTTP API
The MCP tools are backed by HTTP endpoints on the worker service (port 37777):
- `GET /api/search/observations` - Full-text search observations
- `GET /api/search/sessions` - Full-text search session summaries
- `GET /api/search/prompts` - Full-text search user prompts
- `GET /api/search/by-concept` - Find observations by concept tag
- `GET /api/search/by-file` - Find work related to specific files
- `GET /api/search/by-type` - Find observations by type
- `GET /api/context/recent` - Get recent session context
- `GET /api/context/timeline` - Get timeline around specific point
- `GET /api/timeline/by-query` - Search + timeline in one call
- `GET /api/search/help` - API documentation
These endpoints use FTS5 full-text search with support for:
- Boolean operators (AND, OR, NOT)
- Phrase searches
- Column-specific searches
- Date range filtering
- Project filtering
## Result Metadata
All observations include rich metadata:
```
## JWT authentication decision
**Type**: decision
**Date**: 2025-10-21 14:23:45
**Concepts**: authentication, security, architecture
**Files Read**: src/auth/middleware.ts, src/utils/jwt.ts
**Files Modified**: src/auth/jwt-strategy.ts
**Narrative**:
Decided to implement JWT-based authentication instead of session-based
authentication for better scalability and stateless design...
**Facts**:
• JWT tokens expire after 1 hour
• Refresh tokens stored in httpOnly cookies
• Token signing uses RS256 algorithm
• Public keys rotated every 30 days
```
## Citations
All search results include observation IDs that can be accessed via the HTTP API:
- `http://localhost:37777/api/observation/{id}` - Get specific observation by ID
- View all observations in the web viewer at `http://localhost:37777`
These citations enable referencing specific historical context in your work.
Each observation includes the following fields:
- **ID** - Unique observation identifier
- **Type** - bugfix, feature, decision, discovery, refactor, change
- **Date** - When the work occurred
- **Title** - Concise description
- **Concepts** - Tagged themes (e.g., security, performance, architecture)
- **Files Read** - Files examined during work
- **Files Modified** - Files changed during work
- **Narrative** - Story of what happened and why
- **Facts** - Key factual points (decisions made, patterns used, metrics)
## Troubleshooting
### No Results Found
1. **Broaden your search:**
```
# Too specific
search(query="JWT authentication implementation with RS256")
# Better
search(query="authentication")
```
2. **Check database has data:**
```bash
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
curl "http://localhost:37777/api/search?query=test"
```
3. **Try without filters:**
```
# Remove type/date filters to see if data exists
search(query="your-search-term")
```
### IDs Not Found in get_observations
**Error:** "Observation IDs not found: [123, 456]"
**Causes:**
- IDs from different project (use `project` parameter)
- IDs were deleted
- Typo in ID numbers
**Solution:**
```
# Verify IDs exist
search(query="<related-search>")
# Use correct project filter
get_observations(ids=[123, 456], project="correct-project-name")
```
### Token Limit Errors
**Error:** Response exceeds token limits
**Solution:** Use the 3-layer workflow to reduce upfront costs:
```
# Instead of fetching 50 full observations:
# get_observations(ids=[1,2,3,...,50]) # 25,000-50,000 tokens!
# Do this:
search(query="<your-query>", limit=50) # ~2,500-5,000 tokens
# Review index, identify 5 relevant observations
get_observations(ids=[<5-most-relevant>]) # ~2,500-5,000 tokens
# Total: 5,000-10,000 tokens (50-80% savings)
```
### Search Performance
If searches seem slow:
1. Be more specific in queries (helps FTS5 index)
2. Use date range filters to narrow scope
3. Specify project filter when possible
4. Use smaller limit values
## Best Practices
1. **Index First, Details Later** - Always start with search to survey options
2. **Filter Before Fetching** - Use search parameters to narrow results
3. **Batch ID Fetches** - Group multiple IDs in one get_observations call
4. **Use Timeline for Context** - When narrative matters, timeline shows the story
5. **Specific Queries** - More specific = better relevance
6. **Small Limits Initially** - Start with 3-5 results, expand if needed
7. **Review Before Deep Dive** - Check index before fetching full details
## Technical Details
**Architecture:** MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search.
**MCP Server:** Located at `~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`
**Worker Service:** Express API on port 37777, managed by Bun
**Database:** SQLite FTS5 full-text search on `~/.claude-mem/claude-mem.db`
**Vector Search:** Chroma embeddings for semantic search (underlying implementation)
## Next Steps
- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
- [Architecture Overview](/architecture/overview) - System components
- [Database Schema](/architecture/database) - Understanding the data