---
title: "Search Architecture"
description: "Skill-based search with HTTP API and progressive disclosure"
---

# Search Architecture

Claude-Mem uses a skill-based search architecture that provides intelligent memory retrieval through natural language queries. This replaced the MCP-based approach in v5.4.0, saving ~2,250 tokens per session start.

## Overview

**Architecture**: Skill-Based Search + HTTP API + Progressive Disclosure

**Key Components**:
1. **Search Skill** (`plugin/skills/search/SKILL.md`) - Auto-invoked when users ask about past work
2. **HTTP API Endpoints** (10 routes) - Fast, efficient search operations on port 37777
3. **Worker Service** - Express.js server with FTS5 full-text search
4. **SQLite Database** - Persistent storage with FTS5 virtual tables
5. **Chroma Vector DB** - Semantic search with hybrid retrieval

## How It Works

### 1. User Query (Natural Language)

```
User: "What bugs did we fix last session?"
```

### 2. Skill Invocation

Claude recognizes the intent and invokes the search skill:
- Skill frontmatter (~250 tokens) loaded at session start
- Full skill instructions loaded on-demand when skill is invoked
- Progressive disclosure pattern minimizes context overhead

### 3. HTTP API Call

The skill uses `curl` to call the HTTP API:

```bash
curl "http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5"
```

### 4. FTS5 Search

Worker service queries SQLite FTS5 virtual tables:

```sql
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = 'bugfix'
ORDER BY rank
LIMIT 5
```

### 5. Results Formatted

Skill formats results and returns to Claude:

```
## Recent Bugfixes

1. [bugfix] Fixed authentication token expiry
   Date: 2025-11-08 14:23:45
   Files: src/auth/jwt.ts

2. [bugfix] Resolved database connection leak
   Date: 2025-11-08 13:15:22
   Files: src/services/database.ts
```

### 6. User Sees Answer

Claude presents the formatted results naturally in conversation.
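
The request issued in step 3 can also be assembled programmatically. The sketch below is illustrative only: the host, port, path, and the `query`, `type`, and `limit` parameters are taken from the `curl` example above, while the `SearchParams` type and `buildSearchUrl` function are hypothetical names introduced here and are not part of the claude-mem codebase.

```typescript
// Hypothetical helper mirroring the curl call in step 3.
// Only the URL shape and parameter names come from this page.
interface SearchParams {
  query: string;   // FTS5 search text
  type?: string;   // e.g. "bugfix"
  limit?: number;  // maximum number of results
}

const BASE_URL = "http://localhost:37777";

function buildSearchUrl(
  resource: "observations" | "sessions" | "prompts",
  params: SearchParams,
): string {
  // URLSearchParams takes care of percent-encoding the values.
  const qs = new URLSearchParams({ query: params.query });
  if (params.type !== undefined) qs.set("type", params.type);
  if (params.limit !== undefined) qs.set("limit", String(params.limit));
  return `${BASE_URL}/api/search/${resource}?${qs.toString()}`;
}

const url = buildSearchUrl("observations", {
  query: "bugs",
  type: "bugfix",
  limit: 5,
});
console.log(url);
// http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5
```

Building the query string with `URLSearchParams` rather than string concatenation keeps multi-word queries valid, since spaces and reserved characters are encoded automatically.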
## Architecture Change (v5.4.0)

### Before: MCP-Based Search

**Approach**: 9 MCP tools registered at session start

**Token Cost**: ~2,500 tokens in tool definitions per session
- Each tool's schema, parameters, descriptions loaded
- All 9 tools available whether needed or not
- No progressive disclosure

**Example MCP Tool**:

```json
{
  "name": "search_observations",
  "description": "Full-text search across observations...",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "..." },
      "type": { "type": "array", "items": { "enum": [...] } },
      "format": { "enum": ["index", "full"] },
      // ... many more parameters
    }
  }
}
```

### After: Skill-Based Search

**Approach**: 1 search skill with progressive disclosure

**Token Cost**: ~250 tokens in skill frontmatter per session
- Only skill description loaded at session start
- Full instructions loaded on-demand when skill is invoked
- HTTP API endpoints instead of MCP protocol

**Example Skill Frontmatter**:

```markdown
# Claude-Mem Search Skill

Access claude-mem's persistent memory through a comprehensive HTTP API. Search for past work, understand context, and learn from previous decisions.

## When to Use This Skill

Invoke this skill when users ask about:
- Past work: "What did we do last session?"
- Bug fixes: "Did we fix this before?"
- Features: "How did we implement authentication?"
...
```

**Token Savings**: ~2,250 tokens per session start (90% reduction)

## HTTP API Endpoints

The worker service exposes 10 search endpoints:

### Full-Text Search

```
GET /api/search/observations
GET /api/search/sessions
GET /api/search/prompts
```

**Parameters**:
- `query` - FTS5 search query (required)
- `type` - Filter by type (bugfix, feature, refactor, etc.)
- `project` - Filter by project name
- `limit` - Maximum results (default: 20)
- `offset` - Pagination offset
- `format` - Response format (index or full)

**Example**:

```bash
curl "http://localhost:37777/api/search/observations?query=authentication&type=decision&limit=5"
```

### Filtered Search

```
GET /api/search/by-type
GET /api/search/by-concept
GET /api/search/by-file
```

**Parameters**:
- `type` / `concept` / `filePath` - Filter criteria (required)
- `project` - Filter by project
- `limit` - Maximum results
- `format` - Response format

**Example**:

```bash
curl "http://localhost:37777/api/search/by-file?filePath=worker-service.ts&limit=10"
```

### Context Retrieval

```
GET /api/context/recent
GET /api/context/timeline
GET /api/timeline/by-query
```

**Parameters**:
- `project` - Filter by project
- `limit` - Number of sessions/records
- `anchor` - Timeline anchor point (ID or timestamp)
- `depth_before` - Records before anchor
- `depth_after` - Records after anchor

**Example**:

```bash
curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=5"
```

### Documentation

```
GET /api/search/help
```

Returns API documentation in JSON format.

## Progressive Disclosure Pattern

The search skill uses progressive disclosure to minimize token usage:

### Layer 1: Skill Frontmatter (Session Start)

**What's Loaded**: Skill description and when to use it (~250 tokens)

**Purpose**: Claude can recognize when to invoke the skill

**Example**:

```markdown
# Claude-Mem Search Skill

Access claude-mem's persistent memory through a comprehensive HTTP API.

## When to Use This Skill

Invoke this skill when users ask about:
- Past work: "What did we do last session?"
- Bug fixes: "Did we fix this before?"
...
```

### Layer 2: Full Skill Instructions (On-Demand)

**What's Loaded**: Complete operation documentation (~2,500 tokens)

**Purpose**: Detailed instructions for each search operation

**When Loaded**: Only when Claude invokes the skill

**Example Structure**:

```
/skills/search/
├── SKILL.md (main frontmatter)
├── operations/
│   ├── observations.md (detailed instructions)
│   ├── sessions.md
│   ├── prompts.md
│   ├── by-type.md
│   ├── by-concept.md
│   ├── by-file.md
│   ├── recent-context.md
│   ├── timeline.md
│   ├── timeline-by-query.md
│   ├── help.md
│   ├── formatting.md
│   └── common-workflows.md
```

### Layer 3: API Response

**What's Returned**: Search results in requested format

**Format Options**:
- `index` - Titles, dates, IDs only (~50-100 tokens per result)
- `full` - Complete details (~500-1000 tokens per result)

**Progressive Usage**: Start with `index`, drill down with `full` as needed

## Implementation Details

### Search Skill Structure

```
plugin/skills/search/
├── SKILL.md                    # Main frontmatter (~250 tokens)
├── operations/
│   ├── observations.md         # Search observations
│   ├── sessions.md             # Search sessions
│   ├── prompts.md              # Search prompts
│   ├── by-type.md              # Filter by type
│   ├── by-concept.md           # Filter by concept
│   ├── by-file.md              # Filter by file
│   ├── recent-context.md       # Get recent context
│   ├── timeline.md             # Timeline around point
│   ├── timeline-by-query.md    # Search + timeline
│   ├── help.md                 # API documentation
│   ├── formatting.md           # Result formatting guide
│   └── common-workflows.md     # Usage patterns
```

### Worker Service Integration

**File**: `src/services/worker-service.ts`

**Search Routes**:

```typescript
// Full-text search
app.get('/api/search/observations', handleSearchObservations);
app.get('/api/search/sessions', handleSearchSessions);
app.get('/api/search/prompts', handleSearchPrompts);

// Filtered search
app.get('/api/search/by-type', handleSearchByType);
app.get('/api/search/by-concept', handleSearchByConcept);
app.get('/api/search/by-file', handleSearchByFile);

// Context retrieval
app.get('/api/context/recent', handleRecentContext);
app.get('/api/context/timeline', handleTimeline);
app.get('/api/timeline/by-query', handleTimelineByQuery);

// Documentation
app.get('/api/search/help', handleHelp);
```

**Database Access**:
- Uses `SessionSearch` service for FTS5 queries
- Uses `SessionStore` for structured queries
- Hybrid search with ChromaDB for semantic similarity

### Security

**FTS5 Injection Prevention** (v4.2.3):

```typescript
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}
```

All user-provided search queries pass through this escaping before they reach FTS5, which prevents injection through the search string.

**Comprehensive Testing**: 332 injection attack tests covering:
- Special characters
- SQL keywords
- Quote escaping
- Boolean operators

## Benefits

### 1. Token Efficiency

**Before (MCP)**:
- Session start: ~2,500 tokens for tool definitions
- Every session pays this cost
- No progressive disclosure

**After (Skill)**:
- Session start: ~250 tokens for skill frontmatter
- Full instructions: ~2,500 tokens (only when invoked)
- Net savings: ~2,250 tokens at session start (~90% reduction); the ~2,500-token instructions are paid only in sessions that actually invoke search

### 2. Natural Language Interface

**Before**: Users needed to learn MCP tool syntax

```
search_observations with query="authentication" and type="decision"
```

**After**: Users ask naturally

```
"What decisions did we make about authentication?"
```

Claude translates the question into the appropriate API call.

### 3. Flexibility

**HTTP API Benefits**:
- Can be called from skills, MCP tools, or other clients
- Easy to test with curl
- Standard REST conventions
- JSON responses

**Progressive Disclosure**:
- Loads only what's needed
- Can add more operations without increasing base cost
- Documentation co-located with operations

### 4. Performance

**Fast Queries**: FTS5 full-text search completes in under 10 ms for typical queries

**Caching**: HTTP layer allows response caching

**Pagination**: Efficient result pagination with offset/limit

## Migration Notes

### For Users

**No Action Required**: The migration from MCP to skill-based search is transparent.

**Same Questions Work**: Natural language queries work exactly the same way.

**Invisible Change**: Users won't notice any difference except better performance.

### For Developers

**Deprecated**: MCP search server (`src/servers/search-server.ts`)
- Source file kept for reference
- No longer built or registered
- MCP configuration removed from `plugin/.mcp.json`

**New Implementation**: Skill-based search
- Skill files: `plugin/skills/search/`
- HTTP endpoints: `src/services/worker-service.ts` (lines 200-400)
- Build script: `npm run build` includes skill files
- Sync script: `npm run sync-marketplace` copies to plugin directory

## Troubleshooting

### Worker Service Not Running

If searches fail, check the worker service:

```bash
pm2 list                 # Check status
npm run worker:restart   # Restart worker
npm run worker:logs      # View logs
```

### HTTP Endpoints Not Responding

Test endpoints directly:

```bash
# Health check
curl http://localhost:37777/health

# Search test
curl "http://localhost:37777/api/search/observations?query=test&limit=1"
```

### Skill Not Invoking

If Claude doesn't invoke the skill:

1. Check skill files exist: `ls ~/.claude/plugins/marketplaces/thedotmack/plugin/skills/search/`
2. Restart Claude Code session
3. Try explicit skill invocation: `/skill search`

## Next Steps

- [Search Tools Usage](/usage/search-tools) - User guide with examples
- [Worker Service Architecture](/architecture/worker-service) - HTTP API details
- [Database Schema](/architecture/database) - FTS5 tables and indexes
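
As a companion to the token figures under Benefits, the arithmetic can be written out as a small estimate. This is a rough sketch under stated assumptions: the three token counts are the approximate figures quoted on this page, while the `expectedSkillTokens` function and the notion of an "invocation rate" are hypothetical and introduced here for illustration. It ignores the tokens consumed by the search results themselves.

```typescript
// Approximate figures from this page; all names are illustrative.
const MCP_TOKENS_PER_SESSION = 2500;   // tool definitions, paid every session
const SKILL_FRONTMATTER_TOKENS = 250;  // paid every session
const SKILL_INSTRUCTION_TOKENS = 2500; // paid only when the skill is invoked

// Average skill-based overhead per session, given the fraction of sessions
// (between 0 and 1) in which the search skill is actually invoked.
function expectedSkillTokens(invocationRate: number): number {
  if (invocationRate < 0 || invocationRate > 1) {
    throw new RangeError("invocationRate must be between 0 and 1");
  }
  return SKILL_FRONTMATTER_TOKENS + invocationRate * SKILL_INSTRUCTION_TOKENS;
}

for (const rate of [0, 0.25, 1]) {
  const skill = expectedSkillTokens(rate);
  const saved = MCP_TOKENS_PER_SESSION - skill;
  console.log(`rate=${rate} skill=${skill} saved=${saved}`);
}
```

Under these figures the full ~2,250-token saving applies to sessions that never search, and the two approaches cost the same once about 90% of sessions invoke the skill (250 + 0.9 × 2,500 = 2,500).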