Files
claude-mem/docs/public/architecture/search-architecture.mdx
T
Alex Newman 4321add69c refactor: rename search-server to mcp-server throughout codebase
- Updated all documentation references from search-server to mcp-server
- Removed legacy search-server.cjs file
- Updated debug log messages to use [mcp-server] prefix
- Updated build output references in docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-09 01:06:43 -05:00

449 lines
12 KiB
Plaintext

---
title: "Search Architecture"
description: "mem-search skill with HTTP API and progressive disclosure"
---
# Search Architecture
Claude-Mem uses a skill-based search architecture that provides intelligent memory retrieval through natural language queries. This replaced the MCP-based approach in v5.4.0, saving ~2,250 tokens per session start. The skill was enhanced and renamed to "mem-search" in v5.5.0 for better scope differentiation.
## Overview
**Architecture**: Skill-Based Search + HTTP API + Progressive Disclosure
**Key Components**:
1. **mem-search Skill** (`plugin/skills/mem-search/SKILL.md`) - Auto-invoked when users ask about past work
2. **HTTP API Endpoints** (10 routes) - Fast, efficient search operations on port 37777
3. **Worker Service** - Express.js server with FTS5 full-text search
4. **SQLite Database** - Persistent storage with FTS5 virtual tables
5. **Chroma Vector DB** - Semantic search with hybrid retrieval
**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with:
- Effectiveness increased from 67% to 100%
- Concrete triggers increased from 44% to 85%
- 5+ unique identifiers for better scope differentiation
- Comprehensive documentation (17 files, 12 operation guides)
## How It Works
### 1. User Query (Natural Language)
```
User: "What bugs did we fix last session?"
```
### 2. Skill Invocation
Claude recognizes the intent and invokes the mem-search skill:
- Skill frontmatter (~250 tokens) loaded at session start
- Full skill instructions loaded on-demand when skill is invoked
- Progressive disclosure pattern minimizes context overhead
- "mem-search" naming provides clear scope differentiation from native memory
### 3. HTTP API Call
The skill uses `curl` to call the HTTP API:
```bash
curl "http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5"
```
### 4. FTS5 Search
Worker service queries SQLite FTS5 virtual tables:
```sql
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = 'bugfix'
ORDER BY rank
LIMIT 5
```
### 5. Results Formatted
Skill formats results and returns to Claude:
```
## Recent Bugfixes
1. [bugfix] Fixed authentication token expiry
Date: 2025-11-08 14:23:45
Files: src/auth/jwt.ts
2. [bugfix] Resolved database connection leak
Date: 2025-11-08 13:15:22
Files: src/services/database.ts
```
### 6. User Sees Answer
Claude presents the formatted results naturally in conversation.
## Architecture Change (v5.4.0)
### Before: MCP-Based Search
**Approach**: 9 MCP tools registered at session start
**Token Cost**: ~2,500 tokens in tool definitions per session
- Each tool's schema, parameters, descriptions loaded
- All 9 tools available whether needed or not
- No progressive disclosure
**Example MCP Tool**:
```json
{
"name": "search_observations",
"description": "Full-text search across observations...",
"inputSchema": {
"type": "object",
"properties": {
"query": { "type": "string", "description": "..." },
"type": { "type": "array", "items": { "enum": [...] } },
"format": { "enum": ["index", "full"] },
// ... many more parameters
}
}
}
```
### After: Skill-Based Search
**Approach**: 1 mem-search skill with progressive disclosure
**Token Cost**: ~250 tokens in skill frontmatter per session
- Only skill description loaded at session start
- Full instructions loaded on-demand when skill is invoked
- HTTP API endpoints instead of MCP protocol
**Example Skill Frontmatter**:
```markdown
# Claude-Mem mem-search Skill
Access claude-mem's persistent memory through a comprehensive HTTP API.
Search for past work, understand context, and learn from previous decisions.
## When to Use This Skill
Invoke this skill when users ask about:
- Past work: "What did we do last session?"
- Bug fixes: "Did we fix this before?"
- Features: "How did we implement authentication?"
...
```
**Token Savings**: ~2,250 tokens per session start (90% reduction)
## HTTP API Endpoints
The worker service exposes 10 search endpoints:
### Full-Text Search
```
GET /api/search/observations
GET /api/search/sessions
GET /api/search/prompts
```
**Parameters**:
- `query` - FTS5 search query (required)
- `type` - Filter by type (bugfix, feature, refactor, etc.)
- `project` - Filter by project name
- `limit` - Maximum results (default: 20)
- `offset` - Pagination offset
- `format` - Response format (index or full)
**Example**:
```bash
curl "http://localhost:37777/api/search/observations?query=authentication&type=decision&limit=5"
```
### Filtered Search
```
GET /api/search/by-type
GET /api/search/by-concept
GET /api/search/by-file
```
**Parameters**:
- `type` / `concept` / `filePath` - Filter criteria (required)
- `project` - Filter by project
- `limit` - Maximum results
- `format` - Response format
**Example**:
```bash
curl "http://localhost:37777/api/search/by-file?filePath=worker-service.ts&limit=10"
```
### Context Retrieval
```
GET /api/context/recent
GET /api/context/timeline
GET /api/timeline/by-query
```
**Parameters**:
- `project` - Filter by project
- `limit` - Number of sessions/records
- `anchor` - Timeline anchor point (ID or timestamp)
- `depth_before` - Records before anchor
- `depth_after` - Records after anchor
**Example**:
```bash
curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=5"
```
### Documentation
```
GET /api/search/help
```
Returns API documentation in JSON format.
## Progressive Disclosure Pattern
The mem-search skill uses progressive disclosure to minimize token usage:
### Layer 1: Skill Frontmatter (Session Start)
**What's Loaded**: Skill description and when to use it (~250 tokens)
**Purpose**: Claude can recognize when to invoke the skill
**Example**:
```markdown
# Claude-Mem mem-search Skill
Access claude-mem's persistent memory through a comprehensive HTTP API.
## When to Use This Skill
Invoke this skill when users ask about:
- Past work: "What did we do last session?"
- Bug fixes: "Did we fix this before?"
...
```
### Layer 2: Full Skill Instructions (On-Demand)
**What's Loaded**: Complete operation documentation (~2,500 tokens)
**Purpose**: Detailed instructions for each search operation
**When Loaded**: Only when Claude invokes the skill
**Example Structure**:
```
/skills/search/
├── SKILL.md (main frontmatter)
├── operations/
│ ├── observations.md (detailed instructions)
│ ├── sessions.md
│ ├── prompts.md
│ ├── by-type.md
│ ├── by-concept.md
│ ├── by-file.md
│ ├── recent-context.md
│ ├── timeline.md
│ ├── timeline-by-query.md
│ ├── help.md
│ ├── formatting.md
│ └── common-workflows.md
```
### Layer 3: API Response
**What's Returned**: Search results in requested format
**Format Options**:
- `index` - Titles, dates, IDs only (~50-100 tokens per result)
- `full` - Complete details (~500-1000 tokens per result)
**Progressive Usage**: Start with `index`, drill down with `full` as needed
## Implementation Details
### mem-search Skill Structure
```
plugin/skills/mem-search/
├── SKILL.md # Main frontmatter (~250 tokens)
├── operations/
│ ├── observations.md # Search observations
│ ├── sessions.md # Search sessions
│ ├── prompts.md # Search prompts
│ ├── by-type.md # Filter by type
│ ├── by-concept.md # Filter by concept
│ ├── by-file.md # Filter by file
│ ├── recent-context.md # Get recent context
│ ├── timeline.md # Timeline around point
│ ├── timeline-by-query.md # Search + timeline
│ ├── help.md # API documentation
│ ├── formatting.md # Result formatting guide
│ └── common-workflows.md # Usage patterns
```
### Worker Service Integration
**File**: `src/services/worker-service.ts`
**Search Routes**:
```typescript
// Full-text search
app.get('/api/search/observations', handleSearchObservations);
app.get('/api/search/sessions', handleSearchSessions);
app.get('/api/search/prompts', handleSearchPrompts);
// Filtered search
app.get('/api/search/by-type', handleSearchByType);
app.get('/api/search/by-concept', handleSearchByConcept);
app.get('/api/search/by-file', handleSearchByFile);
// Context retrieval
app.get('/api/context/recent', handleRecentContext);
app.get('/api/context/timeline', handleTimeline);
app.get('/api/timeline/by-query', handleTimelineByQuery);
// Documentation
app.get('/api/search/help', handleHelp);
```
**Database Access**:
- Uses `SessionSearch` service for FTS5 queries
- Uses `SessionStore` for structured queries
- Hybrid search with ChromaDB for semantic similarity
### Security
**FTS5 Injection Prevention** (v4.2.3):
```typescript
function escapeFTS5Query(query: string): string {
return query.replace(/"/g, '""');
}
```
All user-provided search queries are properly escaped to prevent SQL injection.
**Comprehensive Testing**: 332 injection attack tests covering:
- Special characters
- SQL keywords
- Quote escaping
- Boolean operators
## Benefits
### 1. Token Efficiency
**Before (MCP)**:
- Session start: ~2,500 tokens for tool definitions
- Every session pays this cost
- No progressive disclosure
**After (Skill)**:
- Session start: ~250 tokens for skill frontmatter
- Full instructions: ~2,500 tokens (only when invoked)
- Net savings: ~2,250 tokens per session (~90% reduction)
### 2. Natural Language Interface
**Before**: Users needed to learn MCP tool syntax
```
search_observations with query="authentication" and type="decision"
```
**After**: Users ask naturally
```
"What decisions did we make about authentication?"
```
Claude translates to appropriate API call.
### 3. Flexibility
**HTTP API Benefits**:
- Can be called from skills, MCP tools, or other clients
- Easy to test with curl
- Standard REST conventions
- JSON responses
**Progressive Disclosure**:
- Loads only what's needed
- Can add more operations without increasing base cost
- Documentation co-located with operations
### 4. Performance
**Fast Queries**: FTS5 full-text search under 10ms for typical queries
**Caching**: HTTP layer allows response caching
**Pagination**: Efficient result pagination with offset/limit
## Migration Notes
### For Users
**No Action Required**: The migration from MCP to skill-based search is transparent.
**Same Questions Work**: Natural language queries work exactly the same way.
**Invisible Change**: Users won't notice any difference except better performance.
### For Developers
**Renamed**: MCP server (formerly `search-server.ts`, now `src/servers/mcp-server.ts`)
- Source file kept for reference
- No longer built or registered
- MCP configuration removed from `plugin/.mcp.json`
**New Implementation**: Skill-based search
- Skill files: `plugin/skills/mem-search/`
- HTTP endpoints: `src/services/worker-service.ts` (lines 200-400)
- Build script: `npm run build` includes skill files
- Sync script: `npm run sync-marketplace` copies to plugin directory
## Troubleshooting
### Worker Service Not Running
If searches fail, check worker service:
```bash
pm2 list # Check status
npm run worker:restart # Restart worker
npm run worker:logs # View logs
```
### HTTP Endpoints Not Responding
Test endpoints directly:
```bash
# Health check
curl http://localhost:37777/health
# Search test
curl "http://localhost:37777/api/search/observations?query=test&limit=1"
```
### Skill Not Invoking
If Claude doesn't invoke the mem-search skill automatically:
1. Check skill files exist: `ls ~/.claude/plugins/marketplaces/thedotmack/plugin/skills/mem-search/`
2. Restart Claude Code session to reload skill definitions
3. Try more explicit phrasing: "Search past sessions for bug fixes" or "What did we do in yesterday's session?"
4. Ensure your question is about previous sessions (not current conversation context)
## Next Steps
- [Search Tools Usage](/usage/search-tools) - User guide with examples
- [Worker Service Architecture](/architecture/worker-service) - HTTP API details
- [Database Schema](/architecture/database) - FTS5 tables and indexes