Refactor search documentation to implement a 3-layer workflow for memory retrieval; update tool names and usage examples for clarity and efficiency. Enhance troubleshooting section with new error handling and token management strategies.

Alex Newman
2025-12-29 00:26:06 -05:00
parent f1aa4c3943
commit 00d0bc51e0
6 changed files with 1024 additions and 732 deletions
---
## MCP Architecture Simplification (December 2025)
### The Problem: Complex MCP Implementation
**Before:**
```
9+ MCP tools registered at session start:
- search_observations
- find_by_type
- find_by_file
- find_by_concept
- get_recent_context
- get_observation
- get_session
- get_prompt
- help
Problems:
- Overlapping operations (search_observations vs find_by_type)
- Complex parameter schemas (~2,500 tokens in tool definitions)
- No built-in workflow guidance
- High cognitive load for Claude (which tool to use?)
- Code size: ~2,718 lines in mcp-server.ts
```
**The Insight:** Progressive disclosure should be built into tool design itself, not something Claude has to remember.
### The Solution: 3-Layer Workflow
**After:**
```
4 MCP tools following 3-layer workflow:
1. __IMPORTANT - Workflow documentation (always visible)
"3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs
2. timeline(anchor=ID) → Get context
3. get_observations([IDs]) → Fetch details
NEVER fetch full details without filtering first."
2. search - Layer 1: Get index with IDs (~50-100 tokens/result)
3. timeline - Layer 2: Get chronological context
4. get_observations - Layer 3: Fetch full details (~500-1,000 tokens/result)
Benefits:
- Progressive disclosure enforced by tool structure
- No overlapping operations
- Simple schemas (additionalProperties: true)
- Clear workflow pattern
- Code size: ~312 lines in mcp-server.ts (88% reduction)
- ~10x token savings
```
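The workflow above can be sketched end-to-end as a client-side flow. This is a minimal illustration, assuming an in-memory store; the helper names (`search`, `timeline`, `getObservations`) stand in for the real MCP tool calls and are not the actual mcp-server.ts implementation:

```typescript
// Illustrative 3-layer retrieval flow. The store and helpers are
// hypothetical stand-ins for the MCP tools, not real server code.

interface Observation {
  id: number;
  title: string;
  timestamp: number;
  body: string; // full details, fetched only in Layer 3
}

const store: Observation[] = [
  { id: 1, title: "Fix auth bug", timestamp: 100, body: "…full notes…" },
  { id: 2, title: "Refactor search", timestamp: 200, body: "…full notes…" },
  { id: 3, title: "Search index rebuild", timestamp: 300, body: "…full notes…" },
];

// Layer 1: cheap index — IDs and titles only, no bodies.
function search(query: string): { id: number; title: string }[] {
  return store
    .filter((o) => o.title.toLowerCase().includes(query.toLowerCase()))
    .map(({ id, title }) => ({ id, title }));
}

// Layer 2: chronological context around an anchor observation.
function timeline(anchor: number, span = 1): { id: number; title: string }[] {
  const idx = store.findIndex((o) => o.id === anchor);
  return store
    .slice(Math.max(0, idx - span), idx + span + 1)
    .map(({ id, title }) => ({ id, title }));
}

// Layer 3: full details, only for IDs chosen in Layers 1–2.
function getObservations(ids: number[]): Observation[] {
  return store.filter((o) => ids.includes(o.id));
}

const index = search("search");                 // Layer 1: survey
const context = timeline(index[0].id);          // Layer 2: context
const details = getObservations([index[0].id]); // Layer 3: deep dive
```

Note that only the final call returns `body` fields: each earlier layer hands back just enough (IDs and titles) to decide whether the next, more expensive call is justified.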
### Migration: Skill-Based Search Removed
**Previously:** Used skill-based search
- mem-search skill invoked via natural language
- HTTP API called directly via curl
- Progressive disclosure through skill loading
- 17 skill documentation files
**Now:** Removed skill-based approach
- MCP-only architecture
- Native MCP protocol (better Claude integration)
- Works with both Claude Desktop and Claude Code
- Simpler to maintain (no skill files)
- All 19 mem-search skill files removed (~2,744 lines)
### Key Architectural Changes
**MCP Server Refactor:**
Before:
```typescript
// Complex parameter schemas
{
name: "search_observations",
inputSchema: {
type: "object",
properties: {
query: { type: "string", description: "..." },
type: { type: "array", items: { enum: [...] } },
format: { enum: ["index", "full"] },
limit: { type: "number", minimum: 1, maximum: 100 },
// ... many more parameters
}
}
}
```
After:
```typescript
// Simple schemas with workflow guidance
{
name: "search",
description: "Step 1: Search memory. Returns index with IDs.",
inputSchema: {
type: "object",
properties: {},
additionalProperties: true // Accept any parameters
}
}
```
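A sketch of the dispatch this permissive schema enables: because `additionalProperties: true` accepts any arguments, the server can forward them untouched to a per-tool handler. The handler names and return shapes below are hypothetical, not the actual mcp-server.ts code:

```typescript
// Hypothetical dispatch sketch — with additionalProperties: true,
// no per-tool schema validation is needed; arguments pass through.

type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => unknown;

const handlers: Record<string, ToolHandler> = {
  search: (args) => ({ step: 1, query: args.query }),
  timeline: (args) => ({ step: 2, anchor: args.anchor }),
  get_observations: (args) => ({ step: 3, ids: args.ids }),
};

function callTool(name: string, args: ToolArgs): unknown {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args); // forwarded untouched — no schema gate
}
```

The trade-off is deliberate: parameter validation moves out of the schema and into the handlers, in exchange for tool definitions small enough to keep the workflow guidance front and center.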
**Workflow Enforcement:**
Before: Claude had to remember progressive disclosure pattern
After: Tool structure makes it impossible to skip steps
- Can't get details without IDs from search
- Can't search without seeing __IMPORTANT reminder
- Timeline provides middle ground (context without full details)
### Impact
**Token Efficiency:**
```
Traditional: Fetch 20 observations upfront
→ 10,000-20,000 tokens
→ Only 2 observations relevant (90% waste)
3-Layer Workflow:
→ search (20 results): ~1,000-2,000 tokens
→ Review index, identify 3 relevant IDs
→ get_observations (3 IDs): ~1,500-3,000 tokens
→ Total: 2,500-5,000 tokens (50-75% savings)
```
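A back-of-the-envelope check of these figures, using midpoint estimates (~750 tokens per full observation, ~75 per index entry — assumed illustrative numbers, not measurements):

```typescript
// Midpoints of the per-result ranges quoted above (assumptions).
const TOKENS_PER_FULL = 750;  // midpoint of 500–1,000
const TOKENS_PER_INDEX = 75;  // midpoint of 50–100

// Traditional: fetch 20 full observations upfront.
const traditional = 20 * TOKENS_PER_FULL;

// 3-layer: index of 20 results, then full details for 3 relevant IDs.
const layered = 20 * TOKENS_PER_INDEX + 3 * TOKENS_PER_FULL;

const savings = 1 - layered / traditional; // fraction of tokens saved
```

At the midpoints this works out to 15,000 tokens versus 3,750, a 75% saving, consistent with the upper end of the range above.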
**Code Simplicity:**
- MCP server: 2,718 lines → 312 lines (88% reduction)
- Removed: 19 skill files (~2,744 lines)
- Net reduction: ~5,150 lines of code
**User Experience:**
- Same natural language interaction
- Better token efficiency
- Clearer architecture
- Works identically on Claude Desktop and Claude Code
### Design Philosophy
**Progressive Disclosure Through Structure:**
The 3-layer workflow embodies progressive disclosure at the architectural level:
1. **Layer 1 (Index)** - "What exists?" - Cheap survey of options
2. **Layer 2 (Timeline)** - "What was happening?" - Context around specific points
3. **Layer 3 (Details)** - "Tell me everything" - Full details only when justified
Each layer provides a decision point where Claude can:
- Stop if irrelevant
- Get more context if uncertain
- Dive deep if confident
This makes it structurally difficult to waste tokens.
---
## v1-v2: The Naive Approach
### The First Attempt: Dump Everything