diff --git a/README.md b/README.md
index 82d5791d..b464fb66 100644
--- a/README.md
+++ b/README.md
@@ -17,7 +17,7 @@
     <img src="https://img.shields.io/badge/License-AGPL%203.0-blue.svg" alt="License">
   </a>
   <a href="package.json">
-    <img src="https://img.shields.io/badge/version-5.1.2-green.svg" alt="Version">
+    <img src="https://img.shields.io/badge/version-5.4.0-green.svg" alt="Version">
   </a>
   <a href="package.json">
     <img src="https://img.shields.io/badge/node-%3E%3D18.0.0-brightgreen.svg" alt="Node">
@@ -69,7 +69,7 @@ Restart Claude Code. Context from previous sessions will automatically appear in
 
 - 🧠 **Persistent Memory** - Context survives across sessions
 - 📊 **Progressive Disclosure** - Layered memory retrieval with token cost visibility
-- 🔍 **9 Search Tools** - Query your project history via MCP
+- 🔍 **Skill-Based Search** - Query your project history with natural language (~2,250 fewer tokens per session start than the MCP tools)
 - 🖥️ **Web Viewer UI** - Real-time memory stream at http://localhost:37777
 - 🤖 **Automatic Operation** - No manual intervention required
 - 🔗 **Citations** - Reference past decisions with `claude-mem://` URIs
@@ -91,7 +91,7 @@ npx mintlify dev
 
 - **[Installation Guide](docs/installation.mdx)** - Quick start & advanced installation
 - **[Usage Guide](docs/usage/getting-started.mdx)** - How Claude-Mem works automatically
-- **[MCP Search Tools](docs/usage/search-tools.mdx)** - Query your project history
+- **[Search Tools](docs/usage/search-tools.mdx)** - Query your project history with natural language
 
 ### Best Practices
 
@@ -144,73 +144,83 @@ npx mintlify dev
 **Core Components:**
 
 1. **7 Lifecycle Hook Scripts** - smart-install, context-hook, user-message-hook, new-hook, save-hook, summary-hook, cleanup-hook
-2. **Worker Service** - HTTP API on port 37777 with web viewer UI, managed by PM2
+2. **Worker Service** - HTTP API on port 37777 with web viewer UI and 10 search endpoints, managed by PM2
 3. **SQLite Database** - Stores sessions, observations, summaries with FTS5 full-text search
-4. **9 MCP Search Tools** - Query historical context with citations and timeline analysis
+4. **Search Skill** - Natural language queries with progressive disclosure (~2,250 token savings vs MCP)
 5. **Chroma Vector Database** - Hybrid semantic + keyword search for intelligent context retrieval
 
 See [Architecture Overview](docs/architecture/overview.mdx) for details.
 
 ---
 
-## MCP Search Tools
+## Skill-Based Search
 
-Claude-Mem provides 9 specialized search tools:
+Claude-Mem provides intelligent search through a skill that auto-invokes when you ask about past work:
 
-1. **search_observations** - Full-text search across observations
-2. **search_sessions** - Full-text search across session summaries
-3. **search_user_prompts** - Search raw user requests
-4. **find_by_concept** - Find by concept tags (discovery, problem-solution, pattern, etc.)
-5. **find_by_file** - Find observations referencing specific files
-6. **find_by_type** - Find by type (decision, bugfix, feature, refactor, discovery, change)
-7. **get_recent_context** - Get recent session context for a project
-8. **get_context_timeline** - Get unified timeline of context around a specific point in time
-9. **get_timeline_by_query** - Search for observations and get timeline context around best match
+**How It Works:**
+- Just ask naturally: *"What did we do last session?"* or *"Did we fix this bug before?"*
+- Claude automatically invokes the search skill to find relevant context
+- ~2,250 token savings per session start vs MCP approach
 
-**Example Queries:**
+**Available Search Operations:**
+
+1. **Search Observations** - Full-text search across observations
+2. **Search Sessions** - Full-text search across session summaries
+3. **Search Prompts** - Search raw user requests
+4. **By Concept** - Find by concept tags (discovery, problem-solution, pattern, etc.)
+5. **By File** - Find observations referencing specific files
+6. **By Type** - Find by type (decision, bugfix, feature, refactor, discovery, change)
+7. **Recent Context** - Get recent session context for a project
+8. **Timeline** - Get unified timeline of context around a specific point in time
+9. **Timeline by Query** - Search for observations and get timeline context around best match
+10. **API Help** - Get search API documentation
+
+**Example Natural Language Queries:**
 
 ```
-search_observations with query="authentication" and type="decision"
-find_by_file with filePath="worker-service.ts"
-search_user_prompts with query="add dark mode"
-get_recent_context with limit=5
-get_context_timeline with anchor="S890" depth_before=10 depth_after=10
-get_timeline_by_query with query="viewer UI implementation" mode="auto"
+"What bugs did we fix last session?"
+"How did we implement authentication?"
+"What changes were made to worker-service.ts?"
+"Show me recent work on this project"
+"What was happening when we added the viewer UI?"
 ```
 
-See [MCP Search Tools Guide](docs/usage/search-tools.mdx) for detailed examples.
+See [Search Tools Guide](docs/usage/search-tools.mdx) for detailed examples.
 
 ---
 
-## What's New in v5.1.2
+## What's New in v5.4.0
+
+**🔍 Skill-Based Search Architecture (v5.4.0):**
+
+- **Token Savings**: ~2,250 tokens per session start
+- **Progressive Disclosure**: Skill frontmatter (~250 tokens) vs MCP tool definitions (~2,500 tokens)
+- **Natural Language**: Just ask about past work - Claude auto-invokes the search skill
+- **10 HTTP API Endpoints**: Search operations served by the worker service on port 37777
+- **No User Action Required**: Migration is transparent
 
 **🎨 Theme Toggle (v5.1.2):**
 
 - Light/dark mode support in viewer UI
 - System preference detection
 - Persistent theme settings across sessions
-- Smooth transitions between themes
 
 **🖥️ Web-Based Viewer UI (v5.1.0):**
 
 - Real-time memory stream visualization at http://localhost:37777
 - Server-Sent Events (SSE) for instant updates
-- Infinite scroll pagination with automatic deduplication
-- Project filtering to focus on specific codebases
-- Settings persistence (sidebar state, selected project)
-- Auto-reconnection with exponential backoff
+- Infinite scroll pagination with project filtering
 
 **⚡ Smart Install Caching (v5.0.3):**
 
-- Eliminated redundant npm installs on every session (2-5s → 10ms)
-- Caches version in `.install-version` file
-- Only runs npm install when needed (first time, version change, missing deps)
+- Eliminated redundant npm installs (2-5s → 10ms)
+- Caches version state and runs npm install only when needed
 
 **🔍 Hybrid Search Architecture (v5.0.0):**
 
 - Chroma vector database for semantic search
 - Combined with FTS5 keyword search
-- Intelligent context retrieval with 90-day recency filtering
+- 90-day recency filtering
 
 See [CHANGELOG.md](CHANGELOG.md) for complete version history.
 
diff --git a/docs/architecture/mcp-search.mdx b/docs/architecture/mcp-search.mdx
deleted file mode 100644
index e6aa9984..00000000
--- a/docs/architecture/mcp-search.mdx
+++ /dev/null
@@ -1,447 +0,0 @@
----
-title: "MCP Search Server"
-description: "9 search tools with examples and usage patterns"
----
-
-# MCP Search Server
-
-Claude-Mem includes a Model Context Protocol (MCP) server that exposes 9 specialized search tools for querying stored observations and sessions.
-
-## Overview
-
-- **Location**: `src/servers/search-server.ts`
-- **Built Output**: `plugin/scripts/search-server.mjs`
-- **Configuration**: `plugin/.mcp.json`
-- **Transport**: stdio
-- **Tools**: 9 specialized search functions
-- **Citations**: All results use `claude-mem://` URI scheme
-
-## Configuration
-
-The MCP server is automatically registered via `plugin/.mcp.json`:
-
-```json
-{
-  "mcpServers": {
-    "claude-mem-search": {
-      "type": "stdio",
-      "command": "${CLAUDE_PLUGIN_ROOT}/scripts/search-server.mjs"
-    }
-  }
-}
-```
-
-This registers the `claude-mem-search` server with Claude Code, making the 9 search tools available in all sessions. The server is automatically started when Claude Code launches and communicates via stdio transport.
-
-## Search Tools
-
-### 1. search_observations
-
-Full-text search across observation titles, narratives, facts, and concepts.
-
-**Parameters**:
-- `query` (required): Search query for FTS5 full-text search
-- `type`: Filter by observation type(s) (decision, bugfix, feature, refactor, discovery, change)
-- `concepts`: Filter by concept tags
-- `files`: Filter by file paths (partial match)
-- `project`: Filter by project name
-- `dateRange`: Filter by date range (`{start, end}`)
-- `orderBy`: Sort order (relevance, date_desc, date_asc)
-- `limit`: Maximum results (default: 20, max: 100)
-- `offset`: Number of results to skip
-- `format`: Output format ("index" for titles/dates only, "full" for complete details)
-
-**Example**:
-```
-search_observations with query="build system" and type="decision"
-```
-
-### 2. search_sessions
-
-Full-text search across session summaries, requests, and learnings.
-
-**Parameters**:
-- `query` (required): Search query for FTS5 full-text search
-- `project`: Filter by project name
-- `dateRange`: Filter by date range
-- `orderBy`: Sort order (relevance, date_desc, date_asc)
-- `limit`: Maximum results (default: 20, max: 100)
-- `offset`: Number of results to skip
-- `format`: Output format ("index" or "full")
-
-**Example**:
-```
-search_sessions with query="hooks implementation"
-```
-
-### 3. search_user_prompts
-
-Search raw user prompts with full-text search. Use this to find what the user actually said/requested across all sessions.
-
-**Parameters**:
-- `query` (required): Search query for FTS5 full-text search
-- `project`: Filter by project name
-- `dateRange`: Filter by date range
-- `orderBy`: Sort order (relevance, date_desc, date_asc)
-- `limit`: Maximum results (default: 20, max: 100)
-- `offset`: Number of results to skip
-- `format`: Output format ("index" for truncated prompts/dates, "full" for complete prompt text)
-
-**Example**:
-```
-search_user_prompts with query="authentication feature"
-```
-
-**Benefits**:
-- Full context reconstruction from user intent → Claude actions → outcomes
-- Pattern detection for repeated requests
-- Improved debugging by tracing from original user words to final implementation
-
-### 4. find_by_concept
-
-Find observations tagged with specific concepts.
-
-**Parameters**:
-- `concept` (required): Concept tag to search for
-- `project`: Filter by project name
-- `dateRange`: Filter by date range
-- `orderBy`: Sort order (relevance, date_desc, date_asc)
-- `limit`: Maximum results (default: 20, max: 100)
-- `offset`: Number of results to skip
-- `format`: Output format ("index" or "full")
-
-**Example**:
-```
-find_by_concept with concept="architecture"
-```
-
-### 5. find_by_file
-
-Find observations and sessions that reference specific file paths.
-
-**Parameters**:
-- `filePath` (required): File path to search for (supports partial matching)
-- `project`: Filter by project name
-- `dateRange`: Filter by date range
-- `orderBy`: Sort order (relevance, date_desc, date_asc)
-- `limit`: Maximum results (default: 20, max: 100)
-- `offset`: Number of results to skip
-- `format`: Output format ("index" or "full")
-
-**Example**:
-```
-find_by_file with filePath="worker-service.ts"
-```
-
-### 6. find_by_type
-
-Find observations by type (decision, bugfix, feature, refactor, discovery, change).
-
-**Parameters**:
-- `type` (required): Observation type(s) to filter by (single type or array)
-- `project`: Filter by project name
-- `dateRange`: Filter by date range
-- `orderBy`: Sort order (relevance, date_desc, date_asc)
-- `limit`: Maximum results (default: 20, max: 100)
-- `offset`: Number of results to skip
-- `format`: Output format ("index" or "full")
-
-**Example**:
-```
-find_by_type with type=["decision", "feature"]
-```
-
-### 7. get_recent_context
-
-Get recent session context including summaries and observations for a project.
-
-**Parameters**:
-- `project`: Project name (defaults to current working directory basename)
-- `limit`: Number of recent sessions to retrieve (default: 3, max: 10)
-
-**Example**:
-```
-get_recent_context with limit=5
-```
-
-### 8. get_context_timeline
-
-Get a unified timeline of context (observations, sessions, and prompts) around a specific point in time. All record types are interleaved chronologically.
-
-**Parameters**:
-- `anchor` (required): Anchor point - observation ID, session ID (e.g., "S123"), or ISO timestamp
-- `depth_before` (default: 10): Number of records to retrieve before anchor (max: 50)
-- `depth_after` (default: 10): Number of records to retrieve after anchor (max: 50)
-- `project`: Filter by project name
-
-**Return Format**:
-Returns `depth_before` records + anchor + `depth_after` records, all interleaved chronologically. Total records: `depth_before + 1 + depth_after`.
-
-**Use Case**: Understanding "what was happening when X occurred"
-
-**Example**:
-```
-# Timeline around observation #123
-get_context_timeline with anchor=123 and depth_before=5 and depth_after=5
-
-# Timeline around a session
-get_context_timeline with anchor="S456" and depth_before=10 and depth_after=10
-
-# Timeline around a timestamp
-get_context_timeline with anchor="2025-11-06T10:30:00Z" and depth_before=15 and depth_after=5
-```
-
-**Response Structure**:
-```json
-{
-  "timeline": [
-    {
-      "type": "observation",
-      "id": 120,
-      "title": "Context before",
-      "created_at": "2025-11-06T10:25:00Z"
-    },
-    {
-      "type": "user-prompt",
-      "id": 45,
-      "prompt": "User request",
-      "created_at": "2025-11-06T10:28:00Z"
-    },
-    {
-      "type": "observation",
-      "id": 123,
-      "title": "Anchor observation",
-      "created_at": "2025-11-06T10:30:00Z",
-      "isAnchor": true
-    },
-    {
-      "type": "session",
-      "id": "S456",
-      "request": "Session summary",
-      "created_at": "2025-11-06T10:32:00Z"
-    }
-  ],
-  "anchor": {
-    "type": "observation",
-    "id": 123
-  }
-}
-```
-
-### 9. get_timeline_by_query
-
-Search for observations using natural language and get timeline context around the best match. Combines search + timeline into a single operation.
-
-**Parameters**:
-- `query` (required): Natural language search query to find relevant observations
-- `mode` (default: "auto"): Operation mode
-  - `"auto"`: Automatically use top search result as timeline anchor
-  - `"interactive"`: Return top N search results for manual anchor selection
-- `depth_before` (default: 10): Number of timeline records before anchor (max: 50)
-- `depth_after` (default: 10): Number of timeline records after anchor (max: 50)
-- `limit` (default: 5): For interactive mode - number of top search results to display (max: 20)
-- `project`: Filter by project name
-
-**Use Case**: Faster context discovery - "show me what happened around when we fixed the authentication bug"
-
-**Example - Auto Mode**:
-```
-# Automatically find and show timeline for "authentication bug"
-get_timeline_by_query with query="authentication bug" and mode="auto" and depth_before=10 and depth_after=10
-```
-
-**Example - Interactive Mode**:
-```
-# Show top 5 matches, let user choose anchor
-get_timeline_by_query with query="authentication bug" and mode="interactive" and limit=5
-```
-
-**Auto Mode Response**:
-```json
-{
-  "search_result": {
-    "id": 123,
-    "title": "Fix authentication bug",
-    "relevance": 0.95
-  },
-  "timeline": [
-    /* timeline records before and after observation 123 */
-  ]
-}
-```
-
-**Interactive Mode Response**:
-```json
-{
-  "top_results": [
-    {
-      "id": 123,
-      "title": "Fix authentication bug",
-      "relevance": 0.95,
-      "created_at": "2025-11-06T10:30:00Z"
-    },
-    {
-      "id": 98,
-      "title": "Authentication refactor",
-      "relevance": 0.82,
-      "created_at": "2025-11-05T14:20:00Z"
-    }
-  ],
-  "message": "Select an observation ID to view its timeline context"
-}
-```
-
-## Output Formats
-
-All search tools support two output formats:
-
-### Index Format (Default)
-
-Returns titles, dates, and source URIs only. Uses ~10x fewer tokens than full format.
-
-**Always use index format first** to get an overview and identify relevant results.
-
-**Example Output**:
-```
-1. [decision] Implement graceful session cleanup
-   Date: 2025-10-21 14:23:45
-   Source: claude-mem://observation/123
-
-2. [feature] Add FTS5 full-text search
-   Date: 2025-10-21 13:15:22
-   Source: claude-mem://observation/124
-```
-
-### Full Format
-
-Returns complete observation/summary details including narrative, facts, concepts, files, etc.
-
-**Only use after reviewing index results** to dive deep into specific items of interest.
-
-## Search Strategy
-
-**Recommended Workflow**:
-
-1. **Initial search**: Use default (index) format to see titles, dates, and sources
-2. **Review results**: Identify which items are most relevant to your needs
-3. **Deep dive**: Only then use `format: "full"` on specific items of interest
-4. **Narrow down**: Use filters (type, dateRange, concepts, files) to refine results
-
-**Token Efficiency**:
-- Index format: ~50-100 tokens per result
-- Full format: ~500-1000 tokens per result
-- Start with 3-5 results to avoid MCP token limits
-
-## Citations
-
-All search results use the `claude-mem://` URI scheme for citations:
-
-- `claude-mem://observation/{id}` - References specific observations
-- `claude-mem://session/{id}` - References specific sessions
-- `claude-mem://user-prompt/{id}` - References specific user prompts
-
-These citations allow Claude to reference specific historical context in responses.
-
-## FTS5 Query Syntax
-
-The `query` parameter supports SQLite FTS5 full-text search syntax:
-
-- **Simple**: `"error handling"`
-- **AND**: `"error" AND "handling"`
-- **OR**: `"bug" OR "fix"`
-- **NOT**: `"bug" NOT "feature"`
-- **Phrase**: `"'exact phrase'"`
-- **Column**: `title:"authentication"`
-
-## Security
-
-As of v4.2.3, all FTS5 queries are properly escaped to prevent SQL injection attacks:
-- Double quotes are escaped: `query.replace(/"/g, '""')`
-- Comprehensive test suite with 332 injection attack tests
-- Affects: `search_observations`, `search_sessions`, `search_user_prompts`
-
-## Example Queries
-
-```
-# Find all decisions about build system
-search_observations with query="build system" and type="decision"
-
-# Show everything related to worker-service.ts
-find_by_file with filePath="worker-service.ts"
-
-# Search what we learned about hooks
-search_sessions with query="hooks"
-
-# Show observations tagged with 'architecture'
-find_by_concept with concept="architecture"
-
-# Find what user asked about authentication
-search_user_prompts with query="authentication"
-
-# Get recent context for debugging
-get_recent_context with limit=5
-
-# Timeline around a specific observation
-get_context_timeline with anchor=123 and depth_before=10 and depth_after=10
-
-# Quick timeline search for authentication work
-get_timeline_by_query with query="authentication bug" and mode="auto"
-```
-
-## Implementation
-
-The MCP search server is implemented using:
-- `@modelcontextprotocol/sdk` (v1.20.1)
-- `SessionSearch` service for FTS5 queries
-- `SessionStore` for database access
-- `zod-to-json-schema` for parameter validation
-
-**Source Code**: `src/servers/search-server.ts`
-
-## Troubleshooting
-
-### Tool Not Available
-
-If search tools are not available in Claude Code sessions:
-
-1. Check MCP configuration:
-   ```bash
-   cat plugin/.mcp.json
-   ```
-
-2. Verify search server is built:
-   ```bash
-   ls -l plugin/scripts/search-server.mjs
-   ```
-
-3. Rebuild if needed:
-   ```bash
-   npm run build
-   ```
-
-### Search Returns No Results
-
-1. Check database has data:
-   ```bash
-   sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
-   ```
-
-2. Verify FTS5 tables exist:
-   ```bash
-   sqlite3 ~/.claude-mem/claude-mem.db "SELECT name FROM sqlite_master WHERE type='table' AND name LIKE '%_fts';"
-   ```
-
-3. Test query syntax:
-   ```bash
-   # Simple query should work
-   search_observations with query="test"
-   ```
-
-### Token Limit Errors
-
-If you hit MCP token limits:
-
-1. Use `format: "index"` instead of `format: "full"`
-2. Reduce `limit` parameter (try 3-5 instead of 20)
-3. Use more specific filters to narrow results
-4. Use `offset` to paginate through results
diff --git a/docs/architecture/overview.mdx b/docs/architecture/overview.mdx
index 3b53589c..be6329a2 100644
--- a/docs/architecture/overview.mdx
+++ b/docs/architecture/overview.mdx
@@ -10,9 +10,9 @@ description: "System components and data flow in Claude-Mem"
 Claude-Mem operates as a Claude Code plugin with five core components:
 
 1. **Plugin Hooks** - Capture lifecycle events (7 hook files)
-2. **Worker Service** - Process observations via Claude Agent SDK
-3. **Database Layer** - Store sessions and observations (SQLite + FTS5)
-4. **MCP Search Server** - Query historical context (9 search tools)
+2. **Worker Service** - Process observations via Claude Agent SDK + HTTP API (10 search endpoints)
+3. **Database Layer** - Store sessions and observations (SQLite + FTS5 + ChromaDB)
+4. **Search Skill** - Skill-based search with progressive disclosure (v5.4.0+)
 5. **Viewer UI** - Web-based real-time memory stream visualization
 
 ## Technology Stack
@@ -44,16 +44,19 @@ Hook (stdin) → Database → Worker Service → SDK Processor → Database →
 4. **Output**: Processed summaries written back to database
 5. **Retrieval**: Next session's context hook reads summaries from database
 
-### Search Pipeline
+### Search Pipeline (v5.4.0+)
 ```
-Claude Request → MCP Server → SessionSearch Service → FTS5 Database → Search Results → Claude
+User Query → Skill Invoked → HTTP API → SessionSearch Service → FTS5 Database → Search Results → Claude
 ```
 
-1. **Query**: Claude uses MCP search tools (e.g., `search_observations`)
-2. **Search**: MCP server calls SessionSearch service with query parameters
-3. **FTS5**: Full-text search executes against FTS5 virtual tables
-4. **Format**: Results formatted as `search_result` blocks with citations
-5. **Return**: Claude receives citable search results for analysis
+1. **User Query**: User asks naturally: "What bugs did we fix?"
+2. **Skill Invoked**: Claude recognizes intent and invokes search skill
+3. **HTTP API**: Skill uses curl to call HTTP endpoint (e.g., `/api/search/observations`)
+4. **SessionSearch**: Worker service queries FTS5 virtual tables
+5. **Format**: Results formatted and returned to skill
+6. **Return**: Claude presents formatted results to user
+
+**Token Savings**: Progressive disclosure saves ~2,250 tokens per session start compared with the MCP approach
 
 ## Session Lifecycle
 
@@ -110,9 +113,6 @@ claude-mem/
 │   │   ├── summary-hook.ts     # Stop
 │   │   └── cleanup-hook.ts     # SessionEnd
 │   │
-│   ├── servers/                # MCP servers
-│   │   └── search-server.ts    # MCP search tools server (9 tools)
-│   │
 │   ├── sdk/                    # Claude Agent SDK integration
 │   │   ├── prompts.ts          # XML prompt builders
 │   │   ├── parser.ts           # XML response parser
@@ -146,7 +146,6 @@ claude-mem/
 ├── plugin/                     # Plugin distribution
 │   ├── .claude-plugin/
 │   │   └── plugin.json
-│   ├── .mcp.json               # MCP server configuration
 │   ├── hooks/
 │   │   └── hooks.json
 │   ├── scripts/                # Built executables
@@ -157,8 +156,14 @@ claude-mem/
 │   │   ├── save-hook.js
 │   │   ├── summary-hook.js
 │   │   ├── cleanup-hook.js
-│   │   ├── worker-service.cjs  # Background worker
-│   │   └── search-server.mjs   # MCP search server
+│   │   └── worker-service.cjs  # Background worker + HTTP API
+│   │
+│   ├── skills/                 # Agent skills (v5.4.0+)
+│   │   ├── search/             # Search skill with progressive disclosure
+│   │   │   ├── SKILL.md        # Skill frontmatter (~250 tokens)
+│   │   │   └── operations/     # Detailed operation docs
+│   │   ├── troubleshoot/       # Troubleshooting skill
+│   │   └── version-bump/       # Version management skill
 │   │
 │   └── ui/                     # Built viewer UI
 │       └── viewer.html         # Self-contained bundle
@@ -183,7 +188,8 @@ See [Plugin Hooks](/architecture/hooks) for detailed hook documentation.
 
 ### 2. Worker Service
 Express.js HTTP server on port 37777 (configurable) with:
-- 8 HTTP/SSE endpoints for viewer UI
+- 10 search HTTP API endpoints (v5.4.0+)
+- 8 viewer UI HTTP/SSE endpoints
 - Async observation processing via Claude Agent SDK
 - Real-time updates via Server-Sent Events
 - Auto-managed by PM2 process manager
@@ -199,13 +205,19 @@ SQLite3 with better-sqlite3 driver featuring:
 
 See [Database Architecture](/architecture/database) for schema and FTS5 search.
 
-### 4. MCP Search Server (9 Tools)
-Provides 9 specialized search tools:
-- search_observations, search_sessions, search_user_prompts
-- find_by_concept, find_by_file, find_by_type
-- get_recent_context, get_context_timeline, get_timeline_by_query
+### 4. Search Skill (v5.4.0+)
+Skill-based search with progressive disclosure providing 10 search operations:
+- Search observations, sessions, prompts (full-text FTS5)
+- Filter by type, concept, file
+- Get recent context, timeline, timeline by query
+- API help documentation
 
-See [MCP Search Server](/architecture/mcp-search) for search tools and examples.
+**Token Savings**: ~2,250 tokens per session start vs MCP approach:
+- Skill frontmatter: ~250 tokens (loaded at session start)
+- Full instructions: ~2,500 tokens (loaded on-demand when invoked)
+- HTTP API endpoints instead of MCP tools
+
+See [Search Architecture](/architecture/search-architecture) for technical details and examples.
 
 ### 5. Viewer UI
 React + TypeScript web interface at http://localhost:37777 featuring:
diff --git a/docs/architecture/search-architecture.mdx b/docs/architecture/search-architecture.mdx
new file mode 100644
index 00000000..9907f424
--- /dev/null
+++ b/docs/architecture/search-architecture.mdx
@@ -0,0 +1,440 @@
+---
+title: "Search Architecture"
+description: "Skill-based search with HTTP API and progressive disclosure"
+---
+
+# Search Architecture
+
+Claude-Mem uses a skill-based search architecture that provides intelligent memory retrieval through natural language queries. This replaced the MCP-based approach in v5.4.0, saving ~2,250 tokens per session start.
+
+## Overview
+
+**Architecture**: Skill-Based Search + HTTP API + Progressive Disclosure
+
+**Key Components**:
+1. **Search Skill** (`plugin/skills/search/SKILL.md`) - Auto-invoked when users ask about past work
+2. **HTTP API Endpoints** (10 routes) - Fast, efficient search operations on port 37777
+3. **Worker Service** - Express.js server with FTS5 full-text search
+4. **SQLite Database** - Persistent storage with FTS5 virtual tables
+5. **Chroma Vector DB** - Semantic search with hybrid retrieval
+
+## How It Works
+
+### 1. User Query (Natural Language)
+
+```
+User: "What bugs did we fix last session?"
+```
+
+### 2. Skill Invocation
+
+Claude recognizes the intent and invokes the search skill:
+- Skill frontmatter (~250 tokens) loaded at session start
+- Full skill instructions loaded on-demand when skill is invoked
+- Progressive disclosure pattern minimizes context overhead
+
+### 3. HTTP API Call
+
+The skill uses `curl` to call the HTTP API:
+
+```bash
+curl "http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5"
+```
+
+### 4. FTS5 Search
+
+The worker service queries the SQLite FTS5 virtual tables:
+
+```sql
+SELECT * FROM observations_fts
+WHERE observations_fts MATCH ?
+AND type = 'bugfix'
+ORDER BY rank
+LIMIT 5
+```
+
+### 5. Results Formatted
+
+The skill formats the results and returns them to Claude:
+
+```
+## Recent Bugfixes
+
+1. [bugfix] Fixed authentication token expiry
+   Date: 2025-11-08 14:23:45
+   Files: src/auth/jwt.ts
+
+2. [bugfix] Resolved database connection leak
+   Date: 2025-11-08 13:15:22
+   Files: src/services/database.ts
+```
+
+### 6. User Sees Answer
+
+Claude presents the formatted results naturally in conversation.
+
+## Architecture Change (v5.4.0)
+
+### Before: MCP-Based Search
+
+**Approach**: 9 MCP tools registered at session start
+
+**Token Cost**: ~2,500 tokens in tool definitions per session
+- Each tool's schema, parameters, descriptions loaded
+- All 9 tools available whether needed or not
+- No progressive disclosure
+
+**Example MCP Tool**:
+```json
+{
+  "name": "search_observations",
+  "description": "Full-text search across observations...",
+  "inputSchema": {
+    "type": "object",
+    "properties": {
+      "query": { "type": "string", "description": "..." },
+      "type": { "type": "array", "items": { "enum": [...] } },
+      "format": { "enum": ["index", "full"] },
+      // ... many more parameters
+    }
+  }
+}
+```
+
+### After: Skill-Based Search
+
+**Approach**: 1 search skill with progressive disclosure
+
+**Token Cost**: ~250 tokens in skill frontmatter per session
+- Only skill description loaded at session start
+- Full instructions loaded on-demand when skill is invoked
+- HTTP API endpoints instead of MCP protocol
+
+**Example Skill Frontmatter**:
+```markdown
+# Claude-Mem Search Skill
+
+Access claude-mem's persistent memory through a comprehensive HTTP API.
+Search for past work, understand context, and learn from previous decisions.
+
+## When to Use This Skill
+
+Invoke this skill when users ask about:
+- Past work: "What did we do last session?"
+- Bug fixes: "Did we fix this before?"
+- Features: "How did we implement authentication?"
+...
+```
+
+**Token Savings**: ~2,250 tokens per session start (90% reduction)
+
+## HTTP API Endpoints
+
+The worker service exposes 10 search endpoints:
+
+### Full-Text Search
+
+```
+GET /api/search/observations
+GET /api/search/sessions
+GET /api/search/prompts
+```
+
+**Parameters**:
+- `query` - FTS5 search query (required)
+- `type` - Filter by type (bugfix, feature, refactor, etc.)
+- `project` - Filter by project name
+- `limit` - Maximum results (default: 20)
+- `offset` - Pagination offset
+- `format` - Response format (index or full)
+
+**Example**:
+```bash
+curl "http://localhost:37777/api/search/observations?query=authentication&type=decision&limit=5"
+```
+
+### Filtered Search
+
+```
+GET /api/search/by-type
+GET /api/search/by-concept
+GET /api/search/by-file
+```
+
+**Parameters**:
+- `type` / `concept` / `filePath` - Filter criteria (required)
+- `project` - Filter by project
+- `limit` - Maximum results
+- `format` - Response format
+
+**Example**:
+```bash
+curl "http://localhost:37777/api/search/by-file?filePath=worker-service.ts&limit=10"
+```
+
+### Context Retrieval
+
+```
+GET /api/context/recent
+GET /api/context/timeline
+GET /api/timeline/by-query
+```
+
+**Parameters**:
+- `project` - Filter by project
+- `limit` - Number of sessions/records
+- `anchor` - Timeline anchor point (ID or timestamp)
+- `depth_before` - Records before anchor
+- `depth_after` - Records after anchor
+
+**Example**:
+```bash
+curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=5"
+```
+
+### Documentation
+
+```
+GET /api/search/help
+```
+
+Returns API documentation in JSON format.
+
+## Progressive Disclosure Pattern
+
+The search skill uses progressive disclosure to minimize token usage:
+
+### Layer 1: Skill Frontmatter (Session Start)
+
+**What's Loaded**: Skill description and when to use it (~250 tokens)
+
+**Purpose**: Claude can recognize when to invoke the skill
+
+**Example**:
+```markdown
+# Claude-Mem Search Skill
+
+Access claude-mem's persistent memory through a comprehensive HTTP API.
+
+## When to Use This Skill
+Invoke this skill when users ask about:
+- Past work: "What did we do last session?"
+- Bug fixes: "Did we fix this before?"
+...
+```
+
+### Layer 2: Full Skill Instructions (On-Demand)
+
+**What's Loaded**: Complete operation documentation (~2,500 tokens)
+
+**Purpose**: Detailed instructions for each search operation
+
+**When Loaded**: Only when Claude invokes the skill
+
+**Example Structure**:
+```
+/skills/search/
+├── SKILL.md (main frontmatter)
+└── operations/
+    ├── observations.md (detailed instructions)
+    ├── sessions.md
+    ├── prompts.md
+    ├── by-type.md
+    ├── by-concept.md
+    ├── by-file.md
+    ├── recent-context.md
+    ├── timeline.md
+    ├── timeline-by-query.md
+    ├── help.md
+    ├── formatting.md
+    └── common-workflows.md
+```
+
+### Layer 3: API Response
+
+**What's Returned**: Search results in requested format
+
+**Format Options**:
+- `index` - Titles, dates, IDs only (~50-100 tokens per result)
+- `full` - Complete details (~500-1000 tokens per result)
+
+**Progressive Usage**: Start with `index`, drill down with `full` as needed
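+
+To see why starting with `index` pays off, the per-result figures above can be turned into a rough estimate. This is a back-of-the-envelope sketch: `estimateResponseTokens` is a hypothetical function, and the ranges are the approximate values quoted in this section, not measurements.
+
+```typescript
+type ResultFormat = "index" | "full";
+
+interface TokenRange {
+  min: number;
+  max: number;
+}
+
+// Approximate per-result costs quoted in the Format Options list.
+const TOKENS_PER_RESULT: Record<ResultFormat, TokenRange> = {
+  index: { min: 50, max: 100 },
+  full: { min: 500, max: 1000 },
+};
+
+function estimateResponseTokens(format: ResultFormat, resultCount: number): TokenRange {
+  const perResult = TOKENS_PER_RESULT[format];
+  return {
+    min: perResult.min * resultCount,
+    max: perResult.max * resultCount,
+  };
+}
+
+// Progressive usage: list 20 results as an index, then expand only 2 of them.
+const overview = estimateResponseTokens("index", 20);
+const details = estimateResponseTokens("full", 2);
+const progressive: TokenRange = {
+  min: overview.min + details.min,
+  max: overview.max + details.max,
+};
+
+// Fetching all 20 results in full format instead.
+const everythingFull = estimateResponseTokens("full", 20);
+```
+
+Under these assumed figures, the progressive path costs roughly 2,000-4,000 tokens, compared with roughly 10,000-20,000 tokens for 20 results in `full` format.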
+
+## Implementation Details
+
+### Search Skill Structure
+
+```
+plugin/skills/search/
+├── SKILL.md                           # Main frontmatter (~250 tokens)
+└── operations/
+    ├── observations.md                # Search observations
+    ├── sessions.md                    # Search sessions
+    ├── prompts.md                     # Search prompts
+    ├── by-type.md                     # Filter by type
+    ├── by-concept.md                  # Filter by concept
+    ├── by-file.md                     # Filter by file
+    ├── recent-context.md              # Get recent context
+    ├── timeline.md                    # Timeline around point
+    ├── timeline-by-query.md           # Search + timeline
+    ├── help.md                        # API documentation
+    ├── formatting.md                  # Result formatting guide
+    └── common-workflows.md            # Usage patterns
+```
+
+### Worker Service Integration
+
+**File**: `src/services/worker-service.ts`
+
+**Search Routes**:
+```typescript
+// Full-text search
+app.get('/api/search/observations', handleSearchObservations);
+app.get('/api/search/sessions', handleSearchSessions);
+app.get('/api/search/prompts', handleSearchPrompts);
+
+// Filtered search
+app.get('/api/search/by-type', handleSearchByType);
+app.get('/api/search/by-concept', handleSearchByConcept);
+app.get('/api/search/by-file', handleSearchByFile);
+
+// Context retrieval
+app.get('/api/context/recent', handleRecentContext);
+app.get('/api/context/timeline', handleTimeline);
+app.get('/api/timeline/by-query', handleTimelineByQuery);
+
+// Documentation
+app.get('/api/search/help', handleHelp);
+```
+
+**Database Access**:
+- Uses `SessionSearch` service for FTS5 queries
+- Uses `SessionStore` for structured queries
+- Hybrid search with ChromaDB for semantic similarity
+
+### Security
+
+**FTS5 Injection Prevention** (v4.2.3):
+```typescript
+function escapeFTS5Query(query: string): string {
+  return query.replace(/"/g, '""');
+}
+```
+
+All user-provided search queries are escaped before they reach the FTS5 `MATCH` expression, which prevents injection of FTS5 query syntax.
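+
+FTS5 treats a doubled quote as a literal quote only inside a double-quoted string, so the escaped value also needs to be wrapped in double quotes. The sketch below shows one way to do that. `toQuotedFTS5Phrase` is a hypothetical name, and this is an assumption about how the escaped value could be used, not the project's confirmed implementation.
+
+```typescript
+// Same escape as above: double every embedded double quote.
+function escapeFTS5Query(query: string): string {
+  return query.replace(/"/g, '""');
+}
+
+// Hypothetical wrapper: turns raw user input into one quoted FTS5 string,
+// so operators such as OR or NOT in the input are not parsed as query syntax.
+function toQuotedFTS5Phrase(userInput: string): string {
+  return '"' + escapeFTS5Query(userInput) + '"';
+}
+```
+
+The trade-off is that input quoted this way is matched as a single phrase, so a caller that wants Boolean operators has to build them outside the quoted part.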
+
+**Comprehensive Testing**: 332 injection attack tests covering:
+- Special characters
+- SQL keywords
+- Quote escaping
+- Boolean operators
+
+## Benefits
+
+### 1. Token Efficiency
+
+**Before (MCP)**:
+- Session start: ~2,500 tokens for tool definitions
+- Every session pays this cost
+- No progressive disclosure
+
+**After (Skill)**:
+- Session start: ~250 tokens for skill frontmatter
+- Full instructions: ~2,500 tokens (only when invoked)
+- Net savings: ~2,250 tokens per session start (~90% reduction); a session that invokes the skill loads ~2,750 tokens in total
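+
+The arithmetic behind these figures can be checked with a few lines. The constants are the approximate token counts quoted above, and the function names are hypothetical.
+
+```typescript
+// Approximate figures from the Before/After comparison.
+const MCP_TOOL_DEFINITION_TOKENS = 2500;
+const SKILL_FRONTMATTER_TOKENS = 250;
+const SKILL_FULL_INSTRUCTION_TOKENS = 2500;
+
+// Tokens the skill adds to a session, depending on whether search was used.
+function skillSessionTokens(skillInvoked: boolean): number {
+  return SKILL_FRONTMATTER_TOKENS + (skillInvoked ? SKILL_FULL_INSTRUCTION_TOKENS : 0);
+}
+
+// Positive means the skill is cheaper than the MCP tool definitions.
+function savingsVsMcp(skillInvoked: boolean): number {
+  return MCP_TOOL_DEFINITION_TOKENS - skillSessionTokens(skillInvoked);
+}
+
+const savedWithoutSearch = savingsVsMcp(false); // 2500 - 250
+const savedWithSearch = savingsVsMcp(true);     // 2500 - (250 + 2500)
+```
+
+With these numbers, a session that never searches saves about 2,250 tokens, while a session that does invoke the skill costs about 250 tokens more than the MCP approach, so the benefit depends on most sessions not needing search.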
+
+### 2. Natural Language Interface
+
+**Before**: Users needed to learn MCP tool syntax
+```
+search_observations with query="authentication" and type="decision"
+```
+
+**After**: Users ask naturally
+```
+"What decisions did we make about authentication?"
+```
+
+Claude translates the question into the appropriate API call.
+
+### 3. Flexibility
+
+**HTTP API Benefits**:
+- Can be called from skills, MCP tools, or other clients
+- Easy to test with curl
+- Standard REST conventions
+- JSON responses
+
+**Progressive Disclosure**:
+- Loads only what's needed
+- Can add more operations without increasing base cost
+- Documentation co-located with operations
+
+### 4. Performance
+
+**Fast Queries**: FTS5 full-text search <10ms for typical queries
+
+**Caching**: HTTP layer allows response caching
+
+**Pagination**: Efficient result pagination with offset/limit
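+
+Pagination maps a page number to the `limit` and `offset` query parameters. A minimal sketch, assuming zero-based page numbers; `pageWindow` is a hypothetical helper.
+
+```typescript
+interface PageWindow {
+  limit: number;
+  offset: number;
+}
+
+// Converts a zero-based page index into limit/offset values.
+function pageWindow(page: number, pageSize: number): PageWindow {
+  if (page < 0 || Math.floor(page) !== page) {
+    throw new RangeError("page must be a non-negative integer");
+  }
+  if (pageSize < 1 || Math.floor(pageSize) !== pageSize) {
+    throw new RangeError("pageSize must be a positive integer");
+  }
+  return { limit: pageSize, offset: page * pageSize };
+}
+
+const firstPage = pageWindow(0, 5);  // limit=5, offset=0
+const secondPage = pageWindow(1, 5); // limit=5, offset=5
+```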
+
+## Migration Notes
+
+### For Users
+
+**No Action Required**: The migration from MCP to skill-based search is transparent.
+
+**Same Questions Work**: Natural language queries work exactly the same way.
+
+**Invisible Change**: Users won't notice any difference except better performance.
+
+### For Developers
+
+**Deprecated**: MCP search server (`src/servers/search-server.ts`)
+- Source file kept for reference
+- No longer built or registered
+- MCP configuration removed from `plugin/.mcp.json`
+
+**New Implementation**: Skill-based search
+- Skill files: `plugin/skills/search/`
+- HTTP endpoints: `src/services/worker-service.ts` (lines 200-400)
+- Build script: `npm run build` includes skill files
+- Sync script: `npm run sync-marketplace` copies to plugin directory
+
+## Troubleshooting
+
+### Worker Service Not Running
+
+If searches fail, check the worker service:
+
+```bash
+pm2 list                    # Check status
+npm run worker:restart      # Restart worker
+npm run worker:logs         # View logs
+```
+
+### HTTP Endpoints Not Responding
+
+Test endpoints directly:
+
+```bash
+# Health check
+curl http://localhost:37777/health
+
+# Search test
+curl "http://localhost:37777/api/search/observations?query=test&limit=1"
+```
+
+### Skill Not Invoking
+
+If Claude doesn't invoke the skill:
+
+1. Check skill files exist: `ls ~/.claude/plugins/marketplaces/thedotmack/plugin/skills/search/`
+2. Restart Claude Code session
+3. Try explicit skill invocation: `/skill search`
+
+## Next Steps
+
+- [Search Tools Usage](/usage/search-tools) - User guide with examples
+- [Worker Service Architecture](/architecture/worker-service) - HTTP API details
+- [Database Schema](/architecture/database) - FTS5 tables and indexes
diff --git a/docs/configuration.mdx b/docs/configuration.mdx
index eeff6152..dd66fbfd 100644
--- a/docs/configuration.mdx
+++ b/docs/configuration.mdx
@@ -137,22 +137,19 @@ Hooks are configured in `plugin/hooks/hooks.json`:
 }
 ```
 
-### MCP Server Configuration
+### Search Configuration (v5.4.0+)
 
-The MCP search server is configured in `plugin/.mcp.json`:
+**Migration Note**: As of v5.4.0, Claude-Mem uses skill-based search instead of MCP tools.
 
-```json
-{
-  "mcpServers": {
-    "claude-mem-search": {
-      "type": "stdio",
-      "command": "${CLAUDE_PLUGIN_ROOT}/scripts/search-server.mjs"
-    }
-  }
-}
-```
+- **Previous (v5.3.x and earlier)**: MCP search server with 9 tools (~2,500 tokens per session start)
+- **Current (v5.4.0+)**: Search skill with HTTP API (~250 tokens per session start)
 
-This registers the `claude-mem-search` server with Claude Code, making the 9 search tools available in all sessions.
+**No configuration required** - the search skill is automatically available in Claude Code sessions.
+
+Search operations are now provided via:
+- **Skill**: `plugin/skills/search/SKILL.md` (auto-invoked when users ask about past work)
+- **HTTP API**: 10 endpoints on worker service port 37777
+- **Progressive Disclosure**: Full instructions loaded on-demand only when needed
 
 ## PM2 Configuration
 
diff --git a/docs/docs.json b/docs/docs.json
index b6af3176..464f5b0d 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -66,7 +66,7 @@
           "architecture/hooks",
           "architecture/worker-service",
           "architecture/database",
-          "architecture/mcp-search"
+          "architecture/search-architecture"
         ]
       }
     ]
diff --git a/docs/introduction.mdx b/docs/introduction.mdx
index c603b3d3..e53fdea1 100644
--- a/docs/introduction.mdx
+++ b/docs/introduction.mdx
@@ -23,7 +23,7 @@ Restart Claude Code. Context from previous sessions will automatically appear in
 ## Key Features
 
 - 🧠 **Persistent Memory** - Context survives across sessions
-- 🔍 **9 Search Tools** - Query your project history via MCP
+- 🔍 **Skill-Based Search** - Query your project history with natural language (~2,250 token savings)
 - 🌐 **Web Viewer UI** - Real-time memory stream visualization at http://localhost:37777
 - 🎨 **Theme Toggle** - Light, dark, and system preference themes
 - 🤖 **Automatic Operation** - No manual intervention required
diff --git a/docs/usage/getting-started.mdx b/docs/usage/getting-started.mdx
index a91ab7d3..9d7845a9 100644
--- a/docs/usage/getting-started.mdx
+++ b/docs/usage/getting-started.mdx
@@ -160,11 +160,13 @@ Context injection uses progressive disclosure for efficient token usage:
 - Shows full summary details **only if** generated after last observation
 - Token cost: ~50-200 tokens for index view
 
-### Layer 2: On-Demand Details (MCP Search)
-- Fetch full observation narratives when needed
+### Layer 2: On-Demand Details (Skill-Based Search)
+- Ask naturally: "What bugs did we fix?" or "How did we implement X?"
+- Claude auto-invokes the search skill to fetch full details
 - Search by concept, file, type, or keyword
 - Timeline context around specific observations
 - Token cost: ~100-500 tokens per observation fetched
+- Skill uses HTTP API (v5.4.0+) for efficient retrieval
 
 ### Layer 3: Perfect Recall (Code Access)
 - Read source files directly when needed
@@ -191,8 +193,23 @@ When you use `/clear`, the session doesn't end - it continues with a new prompt
 
 The `/clear` command clears the conversation context visible to Claude AND re-injects fresh context from recent sessions, while the underlying session continues tracking observations.
 
+## Searching Your History (v5.4.0+)
+
+Claude-Mem now uses skill-based search for querying your project history. Simply ask naturally:
+
+```
+"What bugs did we fix last session?"
+"How did we implement authentication?"
+"What changes were made to worker-service.ts?"
+"Show me recent work on this project"
+```
+
+Claude automatically recognizes your intent and invokes the search skill, which uses HTTP API endpoints to query your memory efficiently.
+
+**Token Savings**: ~2,250 tokens per session start vs previous MCP approach
+
 ## Next Steps
 
-- [MCP Search Tools](/usage/search-tools) - Learn how to search your project history
+- [Skill-Based Search](/usage/search-tools) - Learn how to search your project history
 - [Architecture Overview](/architecture/overview) - Understand how it works
 - [Troubleshooting](/troubleshooting) - Common issues and solutions
diff --git a/docs/usage/search-tools.mdx b/docs/usage/search-tools.mdx
index 20e32c9d..1294bc6d 100644
--- a/docs/usage/search-tools.mdx
+++ b/docs/usage/search-tools.mdx
@@ -1,320 +1,221 @@
 ---
-title: "MCP Search Tools"
-description: "Query your project history with 9 specialized search tools"
+title: "Skill-Based Search"
+description: "Query your project history with natural language"
 ---
 
-# MCP Search Tools Usage
+# Skill-Based Search Usage
 
-Once claude-mem is installed as a plugin, 9 search tools become available in your Claude Code sessions for querying project history.
+Once claude-mem is installed as a plugin, you can search your project history using natural language. Claude automatically invokes the search skill when you ask about past work.
+
+## How It Works
+
+**v5.4.0 Migration**: Claude-Mem now uses a skill-based search architecture instead of MCP tools, saving ~2,250 tokens per session start through progressive disclosure.
+
+**Simple Usage:**
+- Just ask naturally: *"What did we do last session?"*
+- Claude recognizes the intent and invokes the search skill
+- The skill uses HTTP API endpoints to query your memory
+- Results are formatted and presented to you
+
+**Benefits:**
+- **Token Efficient**: ~250 tokens (skill frontmatter) vs ~2,500 tokens (MCP tool definitions)
+- **Natural Language**: No need to learn specific tool syntax
+- **Progressive Disclosure**: Only loads detailed instructions when needed
+- **Auto-Invoked**: Claude knows when to search based on your questions
 
 ## Quick Reference
 
-| Tool                    | Purpose                                      |
+| Operation               | Purpose                                      |
 |-------------------------|----------------------------------------------|
-| search_observations     | Full-text search across observations         |
-| search_sessions         | Full-text search across session summaries    |
-| search_user_prompts     | Full-text search across raw user prompts     |
-| find_by_concept         | Find observations tagged with concepts       |
-| find_by_file            | Find observations referencing files          |
-| find_by_type            | Find observations by type                    |
-| get_recent_context      | Get recent session context                   |
-| get_context_timeline    | Get unified timeline around a specific point |
-| get_timeline_by_query   | Search and get timeline context in one step  |
+| Search Observations     | Full-text search across observations         |
+| Search Sessions         | Full-text search across session summaries    |
+| Search Prompts          | Full-text search across raw user prompts     |
+| By Concept              | Find observations tagged with concepts       |
+| By File                 | Find observations referencing files          |
+| By Type                 | Find observations by type                    |
+| Recent Context          | Get recent session context                   |
+| Timeline                | Get unified timeline around a specific point |
+| Timeline by Query       | Search and get timeline context in one step  |
+| API Help                | Get search API documentation                 |
 
 ## Example Queries
 
-### search_observations
+### Natural Language Queries
 
-Find all decisions about the build system:
+**Search Observations:**
 ```
-Use search_observations to find all decisions about the build system
+"What bugs did we fix related to authentication?"
+"Show me all decisions about the build system"
+"Find refactoring work on the database"
 ```
 
-Find bugs related to authentication:
+**Search Sessions:**
 ```
-search_observations with query="authentication" and type="bugfix"
+"What did we learn about hooks?"
+"What was accomplished in the API implementation?"
+"Show me recent work on this project"
 ```
 
-Search for refactoring work:
+**Search Prompts:**
 ```
-search_observations with query="refactor database" and type="refactor"
+"When did I ask about authentication features?"
+"Find all my requests about dark mode"
 ```
 
-### search_sessions
+**Note**: Claude automatically translates your natural language queries into the appropriate search operations.
+
+### Search by File
 
-Find what we learned about hooks:
 ```
-Use search_sessions to find what we learned about hooks
+"Show me everything related to worker-service.ts"
+"What changes were made to migrations.ts?"
+"Find all work on the database file"
 ```
 
-Search for completed work on the API:
+### Search by Concept
+
 ```
-search_sessions with query="API implementation"
+"Show observations tagged with architecture"
+"Find all security-related observations"
+"What patterns have we used?"
 ```
 
-### search_user_prompts
+### Search by Type
 
-Find when user asked about authentication:
 ```
-search_user_prompts with query="authentication feature"
+"Find all feature implementations"
+"Show me all decisions and discoveries"
+"What bugs have we fixed?"
 ```
 
-Trace user requests for a specific feature:
+### Recent Context
+
 ```
-search_user_prompts with query="dark mode"
+"Show me what we've been working on"
+"Get context from the last 5 sessions"
+"What happened recently on this project?"
 ```
 
-**Benefits**:
-- See exactly what the user asked for (vs what was implemented)
-- Detect patterns in repeated requests
-- Debug miscommunications between user intent and implementation
+### Timeline Queries
 
-### find_by_file
-
-Show everything related to worker-service.ts:
+**Get timeline around a specific point:**
 ```
-Use find_by_file to show me everything related to worker-service.ts
+"What was happening when we implemented authentication?"
+"Show me the context around that bug fix"
+"What led to the decision to refactor the database?"
 ```
 
-Find all work on the database migration file:
+**Timeline by query:**
 ```
-find_by_file with filePath="migrations.ts"
+"Find when we added the viewer UI and show what happened around that time"
+"Search for authentication work and show the timeline"
 ```
 
-### find_by_concept
-
-Show observations tagged with 'architecture':
-```
-Use find_by_concept to show observations tagged with 'architecture'
-```
-
-Find all 'security' related observations:
-```
-find_by_concept with concept="security"
-```
-
-### find_by_type
-
-Find all feature implementations:
-```
-find_by_type with type="feature"
-```
-
-Find all decisions and discoveries:
-```
-find_by_type with type=["decision", "discovery"]
-```
-
-### get_recent_context
-
-Get the last 5 sessions for context:
-```
-get_recent_context with limit=5
-```
-
-Get recent context for debugging:
-```
-Use get_recent_context to show me what we've been working on
-```
-
-### get_context_timeline
-
-Get a unified timeline of context around a specific point in time. This tool interleaves observations, sessions, and user prompts chronologically to show what was happening before and after a specific moment.
-
-**Anchor by observation ID:**
-```
-get_context_timeline with anchor=12345 and depth_before=10 and depth_after=10
-```
-
-**Anchor by session ID:**
-```
-get_context_timeline with anchor="S123" and depth_before=5 and depth_after=5
-```
-
-**Anchor by ISO timestamp:**
-```
-get_context_timeline with anchor="2025-10-21T14:30:00Z" and depth_before=15 and depth_after=15
-```
-
-**Use cases:**
-- Understand what was happening when a specific observation occurred
-- See the full context around a bug fix or decision
-- Trace the events leading up to and following a specific change
-- View chronological sequence of related work
-
 **Benefits:**
-- All record types (observations, sessions, prompts) in one chronological view
-- Configurable depth before/after anchor point
-- Flexible anchoring by ID or timestamp
 - See the complete narrative arc around key events
-
-### get_timeline_by_query
-
-Search for observations using natural language and get timeline context around the best match. This combines search + timeline into a single operation for faster context discovery.
-
-**Auto mode (default):**
-```
-get_timeline_by_query with query="authentication implementation"
-```
-Automatically uses the top search result as timeline anchor and returns surrounding context.
-
-**Interactive mode:**
-```
-get_timeline_by_query with query="authentication" and mode="interactive" and limit=5
-```
-Shows top 5 search results for you to manually choose which to use as timeline anchor.
-
-**Customize timeline depth:**
-```
-get_timeline_by_query with query="bug fix" and depth_before=20 and depth_after=10
-```
-
-**Use cases:**
-- Quick context discovery: "What was happening when we implemented X?"
-- Investigate issues: Find a bug fix and see what led to it
-- Decision archaeology: Search for a decision and understand the context
-- Feature timeline: See the complete story of a feature implementation
-
-**Benefits:**
-- Single-step operation (no need to search, then timeline separately)
-- Auto mode provides instant context
-- Interactive mode gives you control over anchor selection
-- Natural language search makes it easy to find relevant moments
+- All record types (observations, sessions, prompts) in chronological view
+- Understand what was happening before and after important changes
 
 ## Search Strategy
 
-### 1. Start with Index Format
+The search skill uses a progressive disclosure pattern to efficiently retrieve information:
 
-**Always use index format first** to get an overview:
+### 1. Ask Naturally
 
+Start with a natural language question:
 ```
-search_observations with query="authentication" and format="index"
+"What bugs did we fix related to authentication?"
 ```
 
-**Why?**
-- Index format uses ~10x fewer tokens than full format
-- See titles, dates, and sources to identify relevant results
-- Avoid hitting MCP token limits
+### 2. Claude Invokes Search Skill
 
-### 2. Review Results
+Claude recognizes your intent from the skill frontmatter (~250 tokens, already loaded at session start) and loads the full skill instructions (~2,500 tokens) on demand.
 
-Look at the index results to identify items of interest:
+### 3. Skill Uses HTTP API
 
+The skill calls the appropriate HTTP endpoint (e.g., `/api/search/observations`) with the query.
+
+### 4. Results Formatted
+
+Results are formatted and presented to you, usually starting with an index/summary format.
+
+### 5. Deep Dive if Needed
+
+If you need more details, ask follow-up questions:
 ```
-1. [decision] Implement JWT authentication
-   Date: 2025-10-21 14:23:45
-   Source: claude-mem://observation/123
-
-2. [feature] Add user authentication endpoints
-   Date: 2025-10-21 13:15:22
-   Source: claude-mem://observation/124
-
-3. [bugfix] Fix authentication token expiry
-   Date: 2025-10-20 16:45:30
-   Source: claude-mem://observation/125
+"Tell me more about observation #123"
+"Show me the full details of that decision"
 ```
 
-### 3. Deep Dive with Full Format
-
-Only use full format for specific items:
-
-```
-search_observations with query="JWT authentication" and format="full" and limit=3
-```
-
-### 4. Use Filters to Narrow Results
-
-Combine filters for precise searches:
-
-```
-search_observations with query="authentication" and type="decision" and dateRange={start: "2025-10-20", end: "2025-10-21"}
-```
+**Benefits of This Approach:**
+- **Token Efficient**: Only loads what you need, when you need it
+- **Natural**: No syntax to learn
+- **Progressive**: Start with overview, drill down as needed
+- **Automatic**: Claude handles the search invocation
 
 ## Advanced Filtering
 
+You can refine searches using natural language filters:
+
 ### Date Ranges
 
-Search within specific time periods:
-
-```json
-{
-  "dateRange": {
-    "start": "2025-10-01",
-    "end": "2025-10-31"
-  }
-}
 ```
-
-Or use epoch timestamps:
-
-```json
-{
-  "dateRange": {
-    "start": 1729449600,
-    "end": 1732128000
-  }
-}
+"What bugs did we fix in October?"
+"Show me work from last week"
+"Find decisions made between October 1-31"
 ```
 
 ### Multiple Types
 
-Search across multiple observation types:
-
 ```
-find_by_type with type=["decision", "feature", "refactor"]
+"Show me all decisions and features"
+"Find bugfixes and refactorings"
 ```
 
-### Multiple Concepts
-
-Search observations with specific concepts:
+### Concepts
 
 ```
-search_observations with query="database" and concepts=["architecture", "performance"]
+"Find database work related to architecture and performance"
+"Show security observations"
 ```
 
-### File Filtering
-
-Search observations that touched specific files:
+### File-Specific
 
 ```
-search_observations with query="refactor" and files="worker-service.ts"
+"Show refactoring work that touched worker-service.ts"
+"Find changes to auth files"
 ```
 
 ### Project Filtering
 
-Search within specific projects:
-
 ```
-search_observations with query="authentication" and project="my-app"
+"Show authentication work on my-app project"
+"What have we done on this codebase?"
 ```
 
-## FTS5 Query Syntax
+**Note**: Claude translates your natural language into the appropriate API filters automatically.
 
-The `query` parameter supports SQLite FTS5 full-text search syntax:
+## Under the Hood: HTTP API
 
-### Simple Queries
-```
-"authentication"           # Single word
-"error handling"           # Multiple words (OR)
-```
+The search skill uses HTTP endpoints on the worker service (port 37777):
 
-### Boolean Operators
-```
-"error" AND "handling"     # Both terms required
-"bug" OR "fix"             # Either term
-"bug" NOT "feature"        # First term, not second
-```
+- `GET /api/search/observations` - Full-text search observations
+- `GET /api/search/sessions` - Full-text search session summaries
+- `GET /api/search/prompts` - Full-text search user prompts
+- `GET /api/search/by-concept` - Find observations by concept tag
+- `GET /api/search/by-file` - Find work related to specific files
+- `GET /api/search/by-type` - Find observations by type
+- `GET /api/context/recent` - Get recent session context
+- `GET /api/context/timeline` - Get timeline around specific point
+- `GET /api/timeline/by-query` - Search + timeline in one call
+- `GET /api/search/help` - API documentation
 
-### Phrase Searches
-```
-"'exact phrase'"           # Exact phrase match
-```
-
-### Column Searches
-```
-title:"authentication"     # Search specific column
-narrative:"bug fix"        # Search narrative field
-```
+These endpoints use FTS5 full-text search with support for:
+- Boolean operators (AND, OR, NOT)
+- Phrase searches
+- Column-specific searches
+- Date range filtering
+- Project filtering
 
 ## Result Metadata
 
@@ -441,52 +342,61 @@ search_sessions with query="[YOUR PROJECT NAME]" and orderBy="date_desc"
    sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
    ```
 
-2. Try broader query:
+2. Try broader natural language query:
    ```
-   search_observations with query="authentication"  # Good
+   "Show me anything about authentication"  # Broader
    vs
-   search_observations with query="'exact JWT authentication implementation'"  # Too specific
+   "Find exact JWT authentication implementation"  # Too specific
    ```
 
-3. Remove filters:
+3. Ask without filters first:
    ```
-   # Start broad
-   search_observations with query="auth"
-
-   # Then add filters
-   search_observations with query="auth" and type="decision"
+   "What do we have about auth?"
+   # Then narrow down
+   "Show me auth-related decisions"
    ```
 
-### Token Limit Errors
+### Worker Service Not Running
 
-1. Use index format:
-   ```
-   search_observations with query="..." and format="index"
-   ```
+If search isn't working, check the worker service:
 
-2. Reduce limit:
-   ```
-   search_observations with query="..." and limit=3
-   ```
+```bash
+pm2 list                    # Check worker status
+npm run worker:restart      # Restart if needed
+npm run worker:logs         # View logs
+```
 
-3. Use pagination:
-   ```
-   # First page
-   search_observations with query="..." and limit=5 and offset=0
+Or use the troubleshooting skill:
+```
+/skill troubleshoot
+```
 
-   # Second page
-   search_observations with query="..." and limit=5 and offset=5
-   ```
+### Performance Issues
 
-### Search Too Slow
+If searches seem slow:
+1. Be more specific in your queries
+2. Ask for recent work (naturally filters by date)
+3. Specify the project you're interested in
+4. Ask for fewer results initially
 
-1. Use more specific queries
-2. Add date range filters
-3. Add type/concept filters
-4. Reduce result limit
+## Technical Details
+
+**Architecture Change (v5.4.0)**:
+- **Before**: 9 MCP tools (~2,500 tokens in tool definitions per session start)
+- **After**: 1 search skill (~250 tokens in frontmatter, full instructions loaded on-demand)
+- **Savings**: ~2,250 tokens per session start
+- **Migration**: Transparent - users don't need to change how they ask questions
+
+**How the Skill Works:**
+1. User asks a question about past work
+2. Claude recognizes the intent matches the search skill description
+3. Claude loads the full instructions from `plugin/skills/search/` (`SKILL.md` plus the relevant `operations/*.md` file)
+4. Skill uses `curl` to call HTTP API endpoints
+5. Results formatted and returned to Claude
+6. Claude presents results to user
 
 ## Next Steps
 
-- [MCP Search Architecture](/architecture/mcp-search) - Technical details
+- [Architecture Overview](/architecture/overview) - System components
 - [Database Schema](/architecture/database) - Understanding the data
 - [Getting Started](/usage/getting-started) - Automatic operation