diff --git a/README.md b/README.md index b0db0c52..c69256bd 100644 --- a/README.md +++ b/README.md @@ -172,35 +172,40 @@ See [Architecture Overview](https://docs.claude-mem.ai/architecture/overview) fo --- -## mem-search Skill +## MCP Search Tools -Claude-Mem provides intelligent search through the mem-search skill that auto-invokes when you ask about past work: +Claude-Mem provides intelligent memory search through **4 MCP tools** following a token-efficient **3-layer workflow pattern**: + +**The 3-Layer Workflow:** + +1. **`search`** - Get compact index with IDs (~50-100 tokens/result) +2. **`timeline`** - Get chronological context around interesting results +3. **`get_observations`** - Fetch full details ONLY for filtered IDs (~500-1,000 tokens/result) **How It Works:** -- Just ask naturally: *"What did we do last session?"* or *"Did we fix this bug before?"* -- Claude automatically invokes the mem-search skill to find relevant context +- Claude uses MCP tools to search your memory +- Start with `search` to get an index of results +- Use `timeline` to see what was happening around specific observations +- Use `get_observations` to fetch full details for relevant IDs +- **~10x token savings** by filtering before fetching details -**Available Search Operations:** +**Available MCP Tools:** -1. **Search Observations** - Full-text search across observations -2. **Search Sessions** - Full-text search across session summaries -3. **Search Prompts** - Search raw user requests -4. **By Concept** - Find by concept tags (discovery, problem-solution, pattern, etc.) -5. **By File** - Find observations referencing specific files -6. **By Type** - Find by type (decision, bugfix, feature, refactor, discovery, change) -7. **Recent Context** - Get recent session context for a project -8. **Timeline** - Get unified timeline of context around a specific point in time -9. **Timeline by Query** - Search for observations and get timeline context around best match -10. 
**API Help** - Get search API documentation
+1. **`search`** - Search memory index with full-text queries, filters by type/date/project
+2. **`timeline`** - Get chronological context around a specific observation or query
+3. **`get_observations`** - Fetch full observation details by IDs (always batch multiple IDs)
+4. **`__IMPORTANT`** - Workflow documentation (always visible to Claude)

-**Example Natural Language Queries:**
+**Example Usage:**

-```
-"What bugs did we fix last session?"
-"How did we implement authentication?"
-"What changes were made to worker-service.ts?"
-"Show me recent work on this project"
-"What was happening when we added the viewer UI?"
+```typescript
+// Step 1: Search for index
+search({ query: "authentication bug", type: "bugfix", limit: 10 })
+
+// Step 2: Review index, identify relevant IDs (e.g., #123, #456)
+
+// Step 3: Fetch full details
+get_observations({ ids: [123, 456] })
```

See [Search Tools Guide](https://docs.claude-mem.ai/usage/search-tools) for detailed examples.
diff --git a/docs/public/architecture-evolution.mdx b/docs/public/architecture-evolution.mdx
index 67685d12..9a53ddd4 100644
--- a/docs/public/architecture-evolution.mdx
+++ b/docs/public/architecture-evolution.mdx
@@ -248,6 +248,164 @@ search_observations({

---

+## MCP Architecture Simplification (December 2025)
+
+### The Problem: Complex MCP Implementation
+
+**Before:**
+```
+9+ MCP tools registered at session start:
+- search_observations
+- find_by_type
+- find_by_file
+- find_by_concept
+- get_recent_context
+- get_observation
+- get_session
+- get_prompt
+- help
+
+Problems:
+- Overlapping operations (search_observations vs find_by_type)
+- Complex parameter schemas (~2,500 tokens in tool definitions)
+- No built-in workflow guidance
+- High cognitive load for Claude (which tool to use?)
+- Code size: ~2,718 lines in mcp-server.ts
+```
+
+**The Insight:** Progressive disclosure should be built into tool design itself, not something Claude has to remember.
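+The arithmetic behind that ~2,500-token figure can be sketched. This is a rough illustration only: a hypothetical schema object and a crude ~4 characters/token heuristic, not the actual claude-mem tool definitions or a real tokenizer.

```typescript
// Hypothetical sketch: why registering many verbose tool schemas costs
// tokens at session start, whether or not the tools are ever called.
// estimateTokens uses a crude ~4 chars/token heuristic, not a tokenizer.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object;
}

const estimateTokens = (def: ToolDefinition): number =>
  Math.ceil(JSON.stringify(def).length / 4);

// Shaped like the old search_observations definition (abbreviated).
const searchObservations: ToolDefinition = {
  name: "search_observations",
  description: "Full-text search across observations with filters",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "FTS5 query" },
      type: { type: "array", items: { type: "string" } },
      format: { enum: ["index", "full"] },
      limit: { type: "number", minimum: 1, maximum: 100 },
    },
  },
};

// One tool is cheap; nine of them, paid on every session, adds up.
const perTool = estimateTokens(searchObservations);
const sessionStartCost = perTool * 9;
console.log({ perTool, sessionStartCost });
```

+With real schemas (longer descriptions, more parameters, enum lists) the per-tool cost is far higher, which is where the session-start figure above comes from.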
+ +### The Solution: 3-Layer Workflow + +**After:** +``` +4 MCP tools following 3-layer workflow: + +1. __IMPORTANT - Workflow documentation (always visible) + "3-LAYER WORKFLOW (ALWAYS FOLLOW): + 1. search(query) → Get index with IDs + 2. timeline(anchor=ID) → Get context + 3. get_observations([IDs]) → Fetch details + NEVER fetch full details without filtering first." + +2. search - Layer 1: Get index with IDs (~50-100 tokens/result) +3. timeline - Layer 2: Get chronological context +4. get_observations - Layer 3: Fetch full details (~500-1,000 tokens/result) + +Benefits: +- Progressive disclosure enforced by tool structure +- No overlapping operations +- Simple schemas (additionalProperties: true) +- Clear workflow pattern +- Code size: ~312 lines in mcp-server.ts (88% reduction) +- ~10x token savings +``` + +### Migration: Skill-Based Search Removed + +**Previously:** Used skill-based search +- mem-search skill invoked via natural language +- HTTP API called directly via curl +- Progressive disclosure through skill loading +- 17 skill documentation files + +**Now:** Removed skill-based approach +- MCP-only architecture +- Native MCP protocol (better Claude integration) +- Works with both Claude Desktop and Claude Code +- Simpler to maintain (no skill files) +- All 19 mem-search skill files removed (~2,744 lines) + +### Key Architectural Changes + +**MCP Server Refactor:** + +Before: +```typescript +// Complex parameter schemas +{ + name: "search_observations", + inputSchema: { + type: "object", + properties: { + query: { type: "string", description: "..." }, + type: { type: "array", items: { enum: [...] } }, + format: { enum: ["index", "full"] }, + limit: { type: "number", minimum: 1, maximum: 100 }, + // ... many more parameters + } + } +} +``` + +After: +```typescript +// Simple schemas with workflow guidance +{ + name: "search", + description: "Step 1: Search memory. 
Returns index with IDs.", + inputSchema: { + type: "object", + properties: {}, + additionalProperties: true // Accept any parameters + } +} +``` + +**Workflow Enforcement:** + +Before: Claude had to remember progressive disclosure pattern + +After: Tool structure makes it impossible to skip steps +- Can't get details without IDs from search +- Can't search without seeing __IMPORTANT reminder +- Timeline provides middle ground (context without full details) + +### Impact + +**Token Efficiency:** +``` +Traditional: Fetch 20 observations upfront +→ 10,000-20,000 tokens +→ Only 2 observations relevant (90% waste) + +3-Layer Workflow: +→ search (20 results): ~1,000-2,000 tokens +→ Review index, identify 3 relevant IDs +→ get_observations (3 IDs): ~1,500-3,000 tokens +→ Total: 2,500-5,000 tokens (50-75% savings) +``` + +**Code Simplicity:** +- MCP server: 2,718 lines → 312 lines (88% reduction) +- Removed: 19 skill files (~2,744 lines) +- Net reduction: ~5,150 lines of code removed + +**User Experience:** +- Same natural language interaction +- Better token efficiency +- Clearer architecture +- Works identically on Claude Desktop and Claude Code + +### Design Philosophy + +**Progressive Disclosure Through Structure:** + +The 3-layer workflow embodies progressive disclosure at the architectural level: + +1. **Layer 1 (Index)** - "What exists?" - Cheap survey of options +2. **Layer 2 (Timeline)** - "What was happening?" - Context around specific points +3. **Layer 3 (Details)** - "Tell me everything" - Full details only when justified + +Each layer provides a decision point where Claude can: +- Stop if irrelevant +- Get more context if uncertain +- Dive deep if confident + +This makes it structurally difficult to waste tokens. 
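+The decision points above can be made concrete with a small simulation. This is a minimal sketch with hypothetical in-memory data and illustrative per-result token costs (real figures vary; the point is the shape of the filtering effect, not the exact numbers).

```typescript
// Minimal simulation of the 3-layer workflow's token accounting.
// Data and costs are hypothetical, chosen only to show the pattern:
// cheap index first, expensive details only for filtered IDs.
type IndexEntry = { id: number; title: string; type: string };

const index: IndexEntry[] = [
  { id: 123, title: "Fixed auth token expiry", type: "bugfix" },
  { id: 456, title: "Refactored login form", type: "refactor" },
  { id: 789, title: "Resolved session cookie bug", type: "bugfix" },
];

const INDEX_TOKENS_PER_RESULT = 75;   // ~50-100 in practice
const DETAIL_TOKENS_PER_RESULT = 750; // ~500-1,000 in practice

// Layer 1: survey everything at index cost
const layer1Cost = index.length * INDEX_TOKENS_PER_RESULT;

// Layer 3: fetch full details only for the filtered IDs
const relevant = index.filter((e) => e.type === "bugfix").map((e) => e.id);
const layer3Cost = relevant.length * DETAIL_TOKENS_PER_RESULT;

// Naive approach: full details for every result up front
const naiveCost = index.length * DETAIL_TOKENS_PER_RESULT;

console.log({ relevant, workflowCost: layer1Cost + layer3Cost, naiveCost });
```

+The naive path pays detail cost for every result; the workflow pays index cost for everything but detail cost only for the IDs that survive filtering, and the gap widens as result sets grow.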
+ +--- + ## v1-v2: The Naive Approach ### The First Attempt: Dump Everything diff --git a/docs/public/architecture/search-architecture.mdx b/docs/public/architecture/search-architecture.mdx index b3314e99..76a14692 100644 --- a/docs/public/architecture/search-architecture.mdx +++ b/docs/public/architecture/search-architecture.mdx @@ -1,448 +1,497 @@ --- title: "Search Architecture" -description: "mem-search skill with HTTP API and progressive disclosure" +description: "MCP tools with 3-layer workflow for token-efficient memory retrieval" --- # Search Architecture -Claude-Mem uses a skill-based search architecture that provides intelligent memory retrieval through natural language queries. This replaced the MCP-based approach in v5.4.0 with a more efficient implementation. The skill was enhanced and renamed to "mem-search" in v5.5.0 for better scope differentiation. +Claude-mem uses an **MCP-based search architecture** that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern. ## Overview -**Architecture**: Skill-Based Search + HTTP API + Progressive Disclosure +**Architecture**: MCP Tools → MCP Protocol → HTTP API → Worker Service **Key Components**: -1. **mem-search Skill** (`plugin/skills/mem-search/SKILL.md`) - Auto-invoked when users ask about past work -2. **HTTP API Endpoints** (10 routes) - Fast, efficient search operations on port 37777 -3. **Worker Service** - Express.js server with FTS5 full-text search -4. **SQLite Database** - Persistent storage with FTS5 virtual tables -5. **Chroma Vector DB** - Semantic search with hybrid retrieval +1. **MCP Tools** (4 tools) - `search`, `timeline`, `get_observations`, `__IMPORTANT` +2. **MCP Server** (`plugin/scripts/mcp-server.cjs`) - Thin wrapper over HTTP API +3. **HTTP API Endpoints** - Fast search operations on Worker Service (port 37777) +4. **Worker Service** - Express.js server with FTS5 full-text search +5. 
**SQLite Database** - Persistent storage with FTS5 virtual tables
+6. **Chroma Vector DB** - Semantic search with hybrid retrieval

-**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with:
-- Effectiveness increased from 67% to 100%
-- Concrete triggers increased from 44% to 85%
-- 5+ unique identifiers for better scope differentiation
-- Comprehensive documentation (17 files, 12 operation guides)
+**Token Efficiency**: ~10x savings through 3-layer workflow pattern

## How It Works

-### 1. User Query (Natural Language)
+### 1. User Query
+
+Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:

```
-User: "What bugs did we fix last session?"
+Step 1: search(query="authentication bug", type="bugfix", limit=10)
+Step 2: timeline(anchor=123, depth_before=3, depth_after=3)
+Step 3: get_observations(ids=[123, 456, 789])
```

-### 2. Skill Invocation
+### 2. MCP Protocol

-Claude recognizes the intent and invokes the mem-search skill:
-- Skill frontmatter (~250 tokens) loaded at session start
-- Full skill instructions loaded on-demand when skill is invoked
-- Progressive disclosure pattern minimizes context overhead
-- "mem-search" naming provides clear scope differentiation from native memory
+MCP server receives tool call via JSON-RPC over stdio:
+
+```json
+{
+  "method": "tools/call",
+  "params": {
+    "name": "search",
+    "arguments": {
+      "query": "authentication bug",
+      "type": "bugfix",
+      "limit": 10
+    }
+  }
+}
+```

### 3. HTTP API Call

-The skill uses `curl` to call the HTTP API:
+MCP server translates to HTTP request:

-```bash
-curl "http://localhost:37777/api/search/observations?query=bugs&type=bugfix&limit=5"
+```typescript
+const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
+const response = await fetch(url);
```

-### 4. FTS5 Search
+### 4. 
Worker Processing -Worker service queries SQLite FTS5 virtual tables: +Worker service executes FTS5 query: ```sql SELECT * FROM observations_fts WHERE observations_fts MATCH ? AND type = 'bugfix' ORDER BY rank -LIMIT 5 +LIMIT 10 ``` -### 5. Results Formatted +### 5. Results Returned -Skill formats results and returns to Claude: +Worker returns structured data → MCP server → Claude: -``` -## Recent Bugfixes - -1. [bugfix] Fixed authentication token expiry - Date: 2025-11-08 14:23:45 - Files: src/auth/jwt.ts - -2. [bugfix] Resolved database connection leak - Date: 2025-11-08 13:15:22 - Files: src/services/database.ts -``` - -### 6. User Sees Answer - -Claude presents the formatted results naturally in conversation. - -## Architecture Change (v5.4.0) - -### Before: MCP-Based Search - -**Approach**: 9 MCP tools registered at session start - -**Token Cost**: ~2,500 tokens in tool definitions per session -- Each tool's schema, parameters, descriptions loaded -- All 9 tools available whether needed or not -- No progressive disclosure - -**Example MCP Tool**: ```json { - "name": "search_observations", - "description": "Full-text search across observations...", - "inputSchema": { - "type": "object", - "properties": { - "query": { "type": "string", "description": "..." }, - "type": { "type": "array", "items": { "enum": [...] } }, - "format": { "enum": ["index", "full"] }, - // ... many more parameters + "content": [{ + "type": "text", + "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |" + }] +} +``` + +### 6. Claude Processes Results + +Claude reviews the index, decides which observations are relevant, and can: +- Use `timeline` to get context +- Use `get_observations` to fetch full details for selected IDs + +## The 4 MCP Tools + +### `__IMPORTANT` - Workflow Documentation + +Always visible to Claude. Explains the 3-layer workflow pattern. + +**Description:** +``` +3-LAYER WORKFLOW (ALWAYS FOLLOW): +1. 
search(query) → Get index with IDs (~50-100 tokens/result) +2. timeline(anchor=ID) → Get context around interesting results +3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs +NEVER fetch full details without filtering first. 10x token savings. +``` + +**Purpose:** Ensures Claude follows token-efficient pattern + +### `search` - Search Memory Index + +**Tool Definition:** +```typescript +{ + name: 'search', + description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy', + inputSchema: { + type: 'object', + properties: {}, + additionalProperties: true // Accepts any parameters + } +} +``` + +**HTTP Endpoint:** `GET /api/search` + +**Parameters:** +- `query` - Full-text search query +- `limit` - Maximum results (default: 20) +- `type` - Filter by observation type +- `project` - Filter by project name +- `dateStart`, `dateEnd` - Date range filters +- `offset` - Pagination offset +- `orderBy` - Sort order + +**Returns:** Compact index with IDs, titles, dates, types (~50-100 tokens per result) + +### `timeline` - Get Chronological Context + +**Tool Definition:** +```typescript +{ + name: 'timeline', + description: 'Step 2: Get context around results. 
Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project', + inputSchema: { + type: 'object', + properties: {}, + additionalProperties: true + } +} +``` + +**HTTP Endpoint:** `GET /api/timeline` + +**Parameters:** +- `anchor` - Observation ID to center timeline around (optional if query provided) +- `query` - Search query to find anchor automatically (optional if anchor provided) +- `depth_before` - Number of observations before anchor (default: 3) +- `depth_after` - Number of observations after anchor (default: 3) +- `project` - Filter by project name + +**Returns:** Chronological view showing what happened before/during/after + +### `get_observations` - Fetch Full Details + +**Tool Definition:** +```typescript +{ + name: 'get_observations', + description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project', + inputSchema: { + type: 'object', + properties: { + ids: { + type: 'array', + items: { type: 'number' }, + description: 'Array of observation IDs to fetch (required)' + } + }, + required: ['ids'], + additionalProperties: true + } +} +``` + +**HTTP Endpoint:** `POST /api/observations/batch` + +**Body:** +```json +{ + "ids": [123, 456, 789], + "orderBy": "date_desc", + "project": "my-app" +} +``` + +**Returns:** Complete observation details (~500-1,000 tokens per observation) + +## MCP Server Implementation + +**Location:** `/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs` + +**Role:** Thin wrapper that translates MCP protocol to HTTP API calls + +**Key Characteristics:** +- ~312 lines of code (reduced from ~2,718 lines in old implementation) +- No business logic - just protocol translation +- Single source of truth: Worker HTTP API +- Simple schemas with `additionalProperties: true` + +**Handler Example:** +```typescript +{ + name: 'search', + handler: async (args: any) => { + const endpoint = 
'/api/search';
+    const searchParams = new URLSearchParams();
+
+    for (const [key, value] of Object.entries(args)) {
+      searchParams.append(key, String(value));
+    }
+
+    const url = `http://localhost:37777${endpoint}?${searchParams}`;
+    const response = await fetch(url);
+    return await response.json();
+  }
+}
+```
+
+## Worker HTTP API
+
+**Location:** `src/services/worker-service.ts`
+
+**Port:** 37777
+
+**Search Endpoints:**
+```
+GET /api/search # Main search (used by MCP search tool)
+GET /api/timeline # Timeline context (used by MCP timeline tool)
+POST /api/observations/batch # Fetch by IDs (used by MCP get_observations tool)
+GET /api/health # Health check
+```
+
+**Database Access:**
+- Uses `SessionSearch` service for FTS5 queries
+- Uses `SessionStore` for structured queries
+- Hybrid search with ChromaDB for semantic similarity
+
+**FTS5 Full-Text Search:**
+```sql
+-- search tool → HTTP GET → FTS5 query
+SELECT * FROM observations_fts
+WHERE observations_fts MATCH ?
+AND type = ?
+AND date >= ? AND date <= ?
+ORDER BY rank
+LIMIT ? OFFSET ?
+```
+
+## The 3-Layer Workflow Pattern
+
+### Design Philosophy
+
+The 3-layer workflow embodies **progressive disclosure** - a core principle of claude-mem's architecture.
+
+**Layer 1: Index (Search)**
+- **What:** Compact table with IDs, titles, dates, types
+- **Cost:** ~50-100 tokens per result
+- **Purpose:** Survey what exists before committing tokens
+- **Decision Point:** "Which observations are relevant?"
+
+**Layer 2: Context (Timeline)**
+- **What:** Chronological view of observations around a point
+- **Cost:** Variable based on depth
+- **Purpose:** Understand narrative arc, see what led to/from a point
+- **Decision Point:** "Do I need full details?"
+
+**Layer 3: Details (Get Observations)**
+- **What:** Complete observation data (narrative, facts, files, concepts)
+- **Cost:** ~500-1,000 tokens per observation
+- **Purpose:** Deep dive on validated, relevant observations
+- **Decision Point:** "Apply knowledge to current task"
+
+### Token Efficiency
+
+**Traditional RAG Approach:**
+```
+Fetch 20 observations upfront: 10,000-20,000 tokens
+Relevance: ~10% (only 2 observations actually useful)
+Waste: 18,000 tokens on irrelevant context
+```
+
+**3-Layer Workflow:**
+```
+Step 1: search (20 results) ~1,000-2,000 tokens
+Step 2: Review index, filter to 3 relevant IDs
+Step 3: get_observations (3 IDs) ~1,500-3,000 tokens
+Total: 2,500-5,000 tokens (50-75% savings)
+```
+
+**10x Savings:** By filtering at index level before fetching full details
+
+## Architecture Evolution
+
+### Before: Complex MCP Implementation
+
+**Approach:** 9 MCP tools with detailed parameter schemas
+
+**Token Cost:** ~2,500 tokens in tool definitions per session
+- `search_observations` - Full-text search
+- `find_by_type` - Filter by type
+- `find_by_file` - Filter by file
+- `find_by_concept` - Filter by concept
+- `get_recent_context` - Recent sessions
+- `get_observation` - Fetch single observation
+- `get_session` - Fetch session
+- `get_prompt` - Fetch prompt
+- `help` - API documentation
+
+**Problems:**
+- Overlapping operations (search_observations vs find_by_type)
+- Complex parameter schemas
+- No built-in workflow guidance
+- High token cost at session start
+
+**Code Size:** ~2,718 lines in mcp-server.ts
+
+### After: Streamlined MCP Implementation
+
+**Approach:** 4 MCP tools following 3-layer workflow
+
+**Token Cost:** Minimal - 4 simplified tool definitions (the server itself is ~312 lines of code)
+
+**Tools:**
+1. `__IMPORTANT` - Workflow guidance (always visible)
+2. `search` - Step 1 (index)
+3. `timeline` - Step 2 (context)
+4. 
`get_observations` - Step 3 (details) + +**Benefits:** +- Progressive disclosure built into tool design +- No overlapping operations +- Simple schemas (`additionalProperties: true`) +- Clear workflow pattern +- ~10x token savings + +**Code Size:** ~312 lines in mcp-server.ts (88% reduction) + +### Key Insight + +**Before:** Progressive disclosure was something Claude had to remember + +**After:** Progressive disclosure is enforced by tool design itself + +The 3-layer workflow pattern makes it structurally difficult to waste tokens: +- Can't fetch details without first getting IDs from search +- Can't search without seeing workflow reminder (`__IMPORTANT`) +- Timeline provides middle ground between index and full details + +## Configuration + +### Claude Desktop + +Add to `claude_desktop_config.json`: + +```json +{ + "mcpServers": { + "mcp-search": { + "command": "node", + "args": [ + "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs" + ] } } } ``` -### After: Skill-Based Search +### Claude Code -**Approach**: 1 mem-search skill with progressive disclosure +MCP server is automatically configured via plugin installation. No manual setup required. -**Token Cost**: ~250 tokens in skill frontmatter per session -- Only skill description loaded at session start -- Full instructions loaded on-demand when skill is invoked -- HTTP API endpoints instead of MCP protocol +**Both clients use the same MCP tools** - the architecture works identically for Claude Desktop and Claude Code. -**Example Skill Frontmatter**: -```markdown -# Claude-Mem mem-search Skill +## Security -Access claude-mem's persistent memory through a comprehensive HTTP API. -Search for past work, understand context, and learn from previous decisions. +### FTS5 Injection Prevention -## When to Use This Skill +All search queries are escaped before FTS5 processing: -Invoke this skill when users ask about: -- Past work: "What did we do last session?" 
-- Bug fixes: "Did we fix this before?" -- Features: "How did we implement authentication?" -... -``` - -**Token Efficiency**: Minimal frontmatter at session start with progressive disclosure - -## HTTP API Endpoints - -The worker service exposes 10 search endpoints: - -### Full-Text Search - -``` -GET /api/search/observations -GET /api/search/sessions -GET /api/search/prompts -``` - -**Parameters**: -- `query` - FTS5 search query (required) -- `type` - Filter by type (bugfix, feature, refactor, etc.) -- `project` - Filter by project name -- `limit` - Maximum results (default: 20) -- `offset` - Pagination offset -- `format` - Response format (index or full) - -**Example**: -```bash -curl "http://localhost:37777/api/search/observations?query=authentication&type=decision&limit=5" -``` - -### Filtered Search - -``` -GET /api/search/by-type -GET /api/search/by-concept -GET /api/search/by-file -``` - -**Parameters**: -- `type` / `concept` / `filePath` - Filter criteria (required) -- `project` - Filter by project -- `limit` - Maximum results -- `format` - Response format - -**Example**: -```bash -curl "http://localhost:37777/api/search/by-file?filePath=worker-service.ts&limit=10" -``` - -### Context Retrieval - -``` -GET /api/context/recent -GET /api/context/timeline -GET /api/timeline/by-query -``` - -**Parameters**: -- `project` - Filter by project -- `limit` - Number of sessions/records -- `anchor` - Timeline anchor point (ID or timestamp) -- `depth_before` - Records before anchor -- `depth_after` - Records after anchor - -**Example**: -```bash -curl "http://localhost:37777/api/context/recent?project=claude-mem&limit=5" -``` - -### Documentation - -``` -GET /api/search/help -``` - -Returns API documentation in JSON format. 
- -## Progressive Disclosure Pattern - -The mem-search skill uses progressive disclosure to minimize token usage: - -### Layer 1: Skill Frontmatter (Session Start) - -**What's Loaded**: Skill description and when to use it (~250 tokens) - -**Purpose**: Claude can recognize when to invoke the skill - -**Example**: -```markdown -# Claude-Mem mem-search Skill - -Access claude-mem's persistent memory through a comprehensive HTTP API. - -## When to Use This Skill -Invoke this skill when users ask about: -- Past work: "What did we do last session?" -- Bug fixes: "Did we fix this before?" -... -``` - -### Layer 2: Full Skill Instructions (On-Demand) - -**What's Loaded**: Complete operation documentation (~2,500 tokens) - -**Purpose**: Detailed instructions for each search operation - -**When Loaded**: Only when Claude invokes the skill - -**Example Structure**: -``` -/skills/search/ -├── SKILL.md (main frontmatter) -├── operations/ -│ ├── observations.md (detailed instructions) -│ ├── sessions.md -│ ├── prompts.md -│ ├── by-type.md -│ ├── by-concept.md -│ ├── by-file.md -│ ├── recent-context.md -│ ├── timeline.md -│ ├── timeline-by-query.md -│ ├── help.md -│ ├── formatting.md -│ └── common-workflows.md -``` - -### Layer 3: API Response - -**What's Returned**: Search results in requested format - -**Format Options**: -- `index` - Titles, dates, IDs only (~50-100 tokens per result) -- `full` - Complete details (~500-1000 tokens per result) - -**Progressive Usage**: Start with `index`, drill down with `full` as needed - -## Implementation Details - -### mem-search Skill Structure - -``` -plugin/skills/mem-search/ -├── SKILL.md # Main frontmatter (~250 tokens) -├── operations/ -│ ├── observations.md # Search observations -│ ├── sessions.md # Search sessions -│ ├── prompts.md # Search prompts -│ ├── by-type.md # Filter by type -│ ├── by-concept.md # Filter by concept -│ ├── by-file.md # Filter by file -│ ├── recent-context.md # Get recent context -│ ├── timeline.md # Timeline 
around point -│ ├── timeline-by-query.md # Search + timeline -│ ├── help.md # API documentation -│ ├── formatting.md # Result formatting guide -│ └── common-workflows.md # Usage patterns -``` - -### Worker Service Integration - -**File**: `src/services/worker-service.ts` - -**Search Routes**: -```typescript -// Full-text search -app.get('/api/search/observations', handleSearchObservations); -app.get('/api/search/sessions', handleSearchSessions); -app.get('/api/search/prompts', handleSearchPrompts); - -// Filtered search -app.get('/api/search/by-type', handleSearchByType); -app.get('/api/search/by-concept', handleSearchByConcept); -app.get('/api/search/by-file', handleSearchByFile); - -// Context retrieval -app.get('/api/context/recent', handleRecentContext); -app.get('/api/context/timeline', handleTimeline); -app.get('/api/timeline/by-query', handleTimelineByQuery); - -// Documentation -app.get('/api/search/help', handleHelp); -``` - -**Database Access**: -- Uses `SessionSearch` service for FTS5 queries -- Uses `SessionStore` for structured queries -- Hybrid search with ChromaDB for semantic similarity - -### Security - -**FTS5 Injection Prevention** (v4.2.3): ```typescript function escapeFTS5Query(query: string): string { return query.replace(/"/g, '""'); } ``` -All user-provided search queries are properly escaped to prevent SQL injection. +**Testing:** 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators. -**Comprehensive Testing**: 332 injection attack tests covering: -- Special characters -- SQL keywords -- Quote escaping -- Boolean operators +### MCP Protocol Security -## Benefits +- Stdio transport (no network exposure) +- Local-only HTTP API (localhost:37777) +- No authentication needed (local development only) -### 1. 
Token Efficiency +## Performance -**Before (MCP)**: -- Session start: All tool definitions loaded upfront -- Every session pays this cost -- No progressive disclosure +**FTS5 Full-Text Search:** <10ms for typical queries -**After (Skill)**: -- Session start: Minimal token cost for skill frontmatter -- Full instructions loaded only when invoked (progressive disclosure) -- More efficient than loading all tool definitions upfront +**MCP Overhead:** Minimal - simple protocol translation -### 2. Natural Language Interface +**Caching:** HTTP layer allows response caching (future enhancement) -**Before**: Users needed to learn MCP tool syntax -``` -search_observations with query="authentication" and type="decision" -``` +**Pagination:** Efficient with offset/limit -**After**: Users ask naturally -``` -"What decisions did we make about authentication?" -``` +**Batching:** `get_observations` accepts multiple IDs in single call -Claude translates to appropriate API call. +## Benefits Over Alternative Approaches -### 3. Flexibility +### vs. Traditional RAG -**HTTP API Benefits**: -- Can be called from skills, MCP tools, or other clients -- Easy to test with curl -- Standard REST conventions -- JSON responses +**Traditional RAG:** +- Fetches everything upfront +- High token cost +- Low relevance ratio -**Progressive Disclosure**: -- Loads only what's needed -- Can add more operations without increasing base cost -- Documentation co-located with operations +**3-Layer MCP:** +- Fetches only what's needed +- ~10x token savings +- 100% relevance (Claude chooses what to fetch) -### 4. Performance +### vs. 
Previous MCP Implementation (v5.x) -**Fast Queries**: FTS5 full-text search under 10ms for typical queries +**Previous (9 tools):** +- Complex schemas +- Overlapping operations +- No workflow guidance +- ~2,500 tokens in definitions -**Caching**: HTTP layer allows response caching +**Current (4 tools):** +- Simple schemas +- Clear workflow +- Built-in guidance +- ~312 lines of code -**Pagination**: Efficient result pagination with offset/limit +### vs. Skill-Based Approach (Previously) -## Migration Notes +**Skill approach:** +- Required separate skill files +- HTTP API called directly via curl +- Progressive disclosure through skill loading -### For Users +**MCP approach:** +- Native MCP protocol (better Claude integration) +- Cleaner architecture (protocol translation layer) +- Works with both Claude Desktop and Claude Code +- Simpler to maintain (no skill files) -**No Action Required**: The migration from MCP to skill-based search is transparent. - -**Same Questions Work**: Natural language queries work exactly the same way. - -**Invisible Change**: Users won't notice any difference except better performance. - -### For Developers - -**Renamed**: MCP server (formerly `search-server.ts`, now `src/servers/mcp-server.ts`) -- Source file kept for reference -- No longer built or registered -- MCP configuration removed from `plugin/.mcp.json` - -**New Implementation**: Skill-based search -- Skill files: `plugin/skills/mem-search/` -- HTTP endpoints: `src/services/worker-service.ts` (lines 200-400) -- Build script: `npm run build` includes skill files -- Sync script: `npm run sync-marketplace` copies to plugin directory +**Migration:** Skill-based search was removed in favor of streamlined MCP architecture. ## Troubleshooting +### MCP Server Not Connected + +**Symptoms:** Tools not appearing in Claude + +**Solution:** +1. Check MCP server path in configuration +2. Verify worker service is running: `curl http://localhost:37777/api/health` +3. 
Restart Claude Desktop/Code + ### Worker Service Not Running -If searches fail, check worker service: +**Symptoms:** MCP tools fail with connection errors +**Solution:** ```bash npm run worker:status # Check status npm run worker:restart # Restart worker npm run worker:logs # View logs ``` -### HTTP Endpoints Not Responding +### Empty Search Results -Test endpoints directly: +**Symptoms:** search() returns no results -```bash -# Health check -curl http://localhost:37777/health - -# Search test -curl "http://localhost:37777/api/search/observations?query=test&limit=1" -``` - -### Skill Not Invoking - -If Claude doesn't invoke the mem-search skill automatically: - -1. Check skill files exist: `ls ~/.claude/plugins/marketplaces/thedotmack/plugin/skills/mem-search/` -2. Restart Claude Code session to reload skill definitions -3. Try more explicit phrasing: "Search past sessions for bug fixes" or "What did we do in yesterday's session?" -4. Ensure your question is about previous sessions (not current conversation context) +**Troubleshooting:** +1. Test API directly: `curl "http://localhost:37777/api/search?query=test"` +2. Check database: `ls ~/.claude-mem/claude-mem.db` +3. 
Verify observations exist: `curl "http://localhost:37777/api/health"` ## Next Steps -- [Search Tools Usage](/usage/search-tools) - User guide with examples +- [Memory Search Usage](/usage/search-tools) - User guide with examples +- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow - [Worker Service Architecture](/architecture/worker-service) - HTTP API details - [Database Schema](/architecture/database) - FTS5 tables and indexes diff --git a/docs/public/progressive-disclosure.mdx b/docs/public/progressive-disclosure.mdx index 47188a25..f3f4c703 100644 --- a/docs/public/progressive-disclosure.mdx +++ b/docs/public/progressive-disclosure.mdx @@ -260,14 +260,12 @@ The index is useless without retrieval mechanisms: *Use claude-mem MCP search to access records with the given ID* ``` -**Available tools:** -- `search_observations` - Full-text search -- `find_by_concept` - Concept-based retrieval -- `find_by_file` - File-based retrieval -- `find_by_type` - Type-based retrieval -- `get_recent_context` - Recent session summaries +**Available MCP tools:** +- `search` - Search memory index (Layer 1: Get IDs) +- `timeline` - Get chronological context (Layer 2: See narrative arc) +- `get_observations` - Fetch full details (Layer 3: Deep dive) -Each tool supports `format: "index"` (default) and `format: "full"`. +The 3-layer workflow ensures progressive disclosure: index → context → details. --- @@ -318,16 +316,18 @@ Is my task related to npm? 
→ YES --- -## The Two-Tier Search Strategy +## The Three-Layer Workflow -Claude-Mem implements progressive disclosure in search results too: +Claude-Mem implements progressive disclosure through a 3-layer workflow pattern: -### Tier 1: Index Format (Default) +### Layer 1: Search (Index) + +Start by searching to get a compact index with IDs: ```typescript -search_observations({ +search({ query: "hook timeout", - format: "index" // Default + limit: 10 }) ``` @@ -335,23 +335,40 @@ search_observations({ ``` Found 3 observations matching "hook timeout": -| ID | Date | Type | Title | Tokens | -|----|------|------|-------|--------| -| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 | -| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 | -| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 | +| ID | Date | Type | Title | +|----|------|------|-------| +| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | +| #2891 | Oct 25 | how-it-works | Hook timeout configuration | +| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ``` -**Cost:** ~100 tokens for 3 results -**Value:** Agent can scan and decide which to fetch +**Cost:** ~50-100 tokens per result +**Value:** Agent can scan and decide which observations are relevant -### Tier 2: Full Format (On-Demand) +### Layer 2: Timeline (Context) + +Get chronological context around interesting observations: ```typescript -search_observations({ - query: "hook timeout", - format: "full", - limit: 1 // Fetch just the most relevant +timeline({ + anchor: 2543, // Observation ID from search + depth_before: 3, + depth_after: 3 +}) +``` + +**Returns:** Chronological view showing what happened before/during/after observation #2543 + +**Cost:** Variable based on depth +**Value:** Understand narrative arc and context + +### Layer 3: Get Observations (Details) + +Fetch full details only for relevant observations: + +```typescript +get_observations({ + ids: [2543, 2102] // Selected from search 
results }) ``` @@ -463,29 +480,30 @@ Here are 10 observations. *Use MCP search tools to fetch full observation details on-demand* ``` -### ❌ Defaulting to Full Format +### ❌ Skipping the Index Layer **Bad:** ```typescript -search_observations({ - query: "hooks", - format: "full" // Fetches everything +// Fetching full details immediately +get_observations({ + ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] // Guessing which are relevant }) ``` **Good:** ```typescript -search_observations({ +// Follow the 3-layer workflow +// Layer 1: Search for index +search({ query: "hooks", - format: "index", // Scan first limit: 20 }) -// Then, if needed: -search_observations({ - query: "hooks", - format: "full", - limit: 1 // Just the most relevant +// Layer 2: Review index, identify 2-3 relevant IDs + +// Layer 3: Fetch only relevant observations +get_observations({ + ids: [2543, 2891] // Just the most relevant }) ``` @@ -595,10 +613,9 @@ SessionStart({ source: "compact" }): ```typescript // Use embeddings to pre-sort index by relevance -search_observations({ +search({ query: "authentication bug", - format: "index", - sort: "relevance" // Based on semantic similarity + orderBy: "relevance" // Based on semantic similarity (future enhancement) }) ``` diff --git a/docs/public/troubleshooting.mdx b/docs/public/troubleshooting.mdx index 51e7c7d7..826a7793 100644 --- a/docs/public/troubleshooting.mdx +++ b/docs/public/troubleshooting.mdx @@ -742,17 +742,17 @@ sqlite3 ~/.claude-mem/claude-mem.db " 3. Test simple query: ```bash - # In Claude Code - search_observations with query="test" + # Test MCP search tool + search(query="test", limit=5) ``` 4. 
Check query syntax: ```bash - # Bad: Special characters - search_observations with query="[test]" + # Bad: Special characters may cause issues + search(query="[test]") # Good: Simple words - search_observations with query="test" + search(query="test") ``` ### Token Limit Errors @@ -761,28 +761,40 @@ sqlite3 ~/.claude-mem/claude-mem.db " **Solutions**: -1. Use index format: +1. Follow 3-layer workflow (don't skip to get_observations): ```bash - search_observations with query="..." and format="index" + # Start with search to get index + search(query="...", limit=10) + + # Review IDs, then fetch only relevant ones + get_observations(ids=[<2-3 relevant IDs>]) ``` -2. Reduce limit: +2. Reduce limit in search: ```bash - search_observations with query="..." and limit=3 + search(query="...", limit=3) ``` 3. Use filters to narrow results: ```bash - search_observations with query="..." and type="decision" and limit=5 + search(query="...", type="decision", limit=5) ``` 4. Paginate results: ```bash # First page - search_observations with query="..." and limit=5 and offset=0 + search(query="...", limit=5, offset=0) # Second page - search_observations with query="..." and limit=5 and offset=5 + search(query="...", limit=5, offset=5) + ``` + +5. 
Batch IDs in get_observations: + ```bash + # Always batch multiple IDs in one call + get_observations(ids=[123, 456, 789]) + + # Don't make separate calls per ID ``` ## Performance Issues diff --git a/docs/public/usage/search-tools.mdx b/docs/public/usage/search-tools.mdx index 32343138..6ddf7b6d 100644 --- a/docs/public/usage/search-tools.mdx +++ b/docs/public/usage/search-tools.mdx @@ -1,403 +1,454 @@ --- -title: "mem-search Skill" -description: "Query your project history with natural language" +title: "Memory Search" +description: "Search your project history with MCP tools" --- -# mem-search Skill Usage +# Memory Search with MCP Tools -Once claude-mem is installed as a plugin, you can search your project history using natural language. Claude automatically invokes the mem-search skill when you ask about past work. +Claude-mem provides persistent memory across sessions through **4 MCP tools** that follow a token-efficient **3-layer workflow pattern**. -## How It Works +## Overview -**v5.5.0 Enhancement**: The search skill was renamed to "mem-search" for better scope differentiation, with effectiveness increased from 67% to 100% and enhanced concrete triggers (85% vs 44%). +Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach: -**v5.4.0 Architecture**: Claude-Mem uses a skill-based search architecture instead of MCP tools, saving ~2,250 tokens per session start through progressive disclosure. +1. **Search** → Get a compact index with IDs (~50-100 tokens/result) +2. **Timeline** → Get context around interesting results +3. **Get Observations** → Fetch full details ONLY for filtered IDs -**Simple Usage:** -- Just ask naturally: *"What did we do last session?"* -- Claude recognizes the intent and invokes the mem-search skill -- The skill uses HTTP API endpoints to query your memory -- Results are formatted and presented to you +This achieves **~10x token savings** compared to traditional RAG approaches. 
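The arithmetic behind that savings claim can be sanity-checked directly. A minimal sketch, using midpoints of the per-result token estimates quoted in this guide (~50-100 tokens per index row, ~500-1,000 per full observation) — these are illustrative assumptions, not measured values, and the actual ratio depends on how many index hits turn out to be relevant:

```python
# Rough token accounting: 3-layer workflow vs. fetching everything upfront.
# Per-result figures are midpoints of the estimates in this guide
# (illustrative assumptions, not measurements).

INDEX_TOKENS = 75    # one row in the search() index (~50-100)
DETAIL_TOKENS = 750  # one full observation from get_observations() (~500-1,000)

def naive_cost(total_results: int) -> int:
    """Fetch full details for every search hit upfront."""
    return total_results * DETAIL_TOKENS

def layered_cost(total_results: int, relevant: int, timeline_tokens: int = 500) -> int:
    """Layer 1 index + Layer 2 timeline context + Layer 3 details for filtered IDs."""
    return total_results * INDEX_TOKENS + timeline_tokens + relevant * DETAIL_TOKENS

naive = naive_cost(20)         # 20 full observations upfront
layered = layered_cost(20, 3)  # index of 20, one timeline, 3 full fetches
print(naive, layered, round(naive / layered, 1))  # → 15000 4250 3.5
```

The savings grow as the result set widens while the relevant fraction shrinks: with a fixed number of relevant hits, the ratio approaches the index-to-detail cost ratio (~10x), which is where the headline figure comes from.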
-**Benefits:** -- **Token Efficient**: ~250 tokens (skill frontmatter) vs ~2,500 tokens (MCP tool definitions) -- **Natural Language**: No need to learn specific tool syntax -- **Progressive Disclosure**: Only loads detailed instructions when needed -- **Auto-Invoked**: Claude knows when to search based on your questions -- **Scope Differentiation**: "mem-search" clearly distinguishes from native conversation memory +## The 3-Layer Workflow -## Quick Reference +### Layer 1: Search (Index) -| Operation | Purpose | -|-------------------------|----------------------------------------------| -| Search Observations | Full-text search across observations | -| Search Sessions | Full-text search across session summaries | -| Search Prompts | Full-text search across raw user prompts | -| By Concept | Find observations tagged with concepts | -| By File | Find observations referencing files | -| By Type | Find observations by type | -| Recent Context | Get recent session context | -| Timeline | Get unified timeline around a specific point | -| Timeline by Query | Search and get timeline context in one step | -| API Help | Get search API documentation | - -## Example Queries - -### Natural Language Queries - -**Search Observations:** -``` -"What bugs did we fix related to authentication?" -"Show me all decisions about the build system" -"Find refactoring work on the database" -``` - -**Search Sessions:** -``` -"What did we learn about hooks?" -"What was accomplished in the API implementation?" -"Show me recent work on this project" -``` - -**Search Prompts:** -``` -"When did I ask about authentication features?" -"Find all my requests about dark mode" -``` - -**Note**: Claude automatically translates your natural language queries into the appropriate search operations. - -### Search by File +Start by searching to get a lightweight index of results: ``` -"Show me everything related to worker-service.ts" -"What changes were made to migrations.ts?" 
-"Find all work on the database file" +search(query="authentication bug", type="bugfix", limit=10) ``` -### Search by Concept +**Returns:** Compact table with IDs, titles, dates, types +**Cost:** ~50-100 tokens per result +**Purpose:** Survey what exists before fetching details + +### Layer 2: Timeline (Context) + +Get chronological context around specific observations: ``` -"Show observations tagged with architecture" -"Find all security-related observations" -"What patterns have we used?" +timeline(anchor=, depth_before=3, depth_after=3) ``` -### Search by Type +Or search and get timeline in one step: ``` -"Find all feature implementations" -"Show me all decisions and discoveries" -"What bugs have we fixed?" +timeline(query="authentication", depth_before=2, depth_after=2) ``` -### Recent Context +**Returns:** Chronological view showing what was happening before/after +**Cost:** Variable, depends on depth +**Purpose:** Understand narrative arc and context + +### Layer 3: Get Observations (Details) + +Fetch full details only for relevant observations: ``` -"Show me what we've been working on" -"Get context from the last 5 sessions" -"What happened recently on this project?" 
+get_observations(ids=[123, 456, 789]) ``` -### Timeline Queries +**Returns:** Complete observation details (narrative, facts, files, concepts) +**Cost:** ~500-1000 tokens per observation +**Purpose:** Deep dive on specific, validated items -**Get timeline around a specific point:** +### Why This Works + +**Traditional Approach:** +- Fetch everything upfront: 20,000 tokens +- Relevance: ~10% (2,000 tokens actually useful) +- Waste: 18,000 tokens on irrelevant context + +**3-Layer Approach:** +- Search index: 1,000 tokens (10 results) +- Timeline context: 500 tokens (around 2 key results) +- Fetch details: 1,500 tokens (3 observations) +- **Total: 3,000 tokens, 100% relevant** + +## Available Tools + +### `__IMPORTANT` - Workflow Documentation + +Always visible reminder of the 3-layer workflow pattern. Helps Claude understand how to use the search tools efficiently. + +**Usage:** Automatically shown, no need to invoke + +### `search` - Search Memory Index + +Search your memory and get a compact index with IDs. + +**Parameters:** +- `query` - Full-text search query (supports AND, OR, NOT, phrase searches) +- `limit` - Maximum results (default: 20) +- `offset` - Skip first N results for pagination +- `type` - Filter by observation type (bugfix, feature, decision, discovery, refactor, change) +- `obs_type` - Filter by record type (observation, session, prompt) +- `project` - Filter by project name +- `dateStart` - Filter by start date (YYYY-MM-DD) +- `dateEnd` - Filter by end date (YYYY-MM-DD) +- `orderBy` - Sort order (date_desc, date_asc, relevance) + +**Returns:** Compact index table with IDs, titles, dates, types + +**Example:** ``` -"What was happening when we implemented authentication?" -"Show me the context around that bug fix" -"What led to the decision to refactor the database?" 
+search(query="database migration", type="bugfix", limit=5, orderBy="date_desc") ``` -**Timeline by query:** +### `timeline` - Get Chronological Context + +Get a chronological view of observations around a specific point or query. + +**Parameters:** +- `anchor` - Observation ID to center timeline around (optional if query provided) +- `query` - Search query to find anchor automatically (optional if anchor provided) +- `depth_before` - Number of observations before anchor (default: 3) +- `depth_after` - Number of observations after anchor (default: 3) +- `project` - Filter by project name + +**Returns:** Chronological list showing what happened before/during/after + +**Example:** ``` -"Find when we added the viewer UI and show what happened around that time" -"Search for authentication work and show the timeline" +timeline(anchor=12345, depth_before=5, depth_after=5) ``` -**Benefits:** -- See the complete narrative arc around key events -- All record types (observations, sessions, prompts) in chronological view -- Understand what was happening before and after important changes - -## Search Strategy - -The mem-search skill uses a progressive disclosure pattern to efficiently retrieve information: - -### 1. Ask Naturally - -Start with a natural language question: +Or search-based: ``` -"What bugs did we fix related to authentication?" +timeline(query="implemented JWT auth", depth_before=3, depth_after=3) ``` -### 2. Claude Invokes mem-search Skill +### `get_observations` - Fetch Full Details -Claude recognizes your intent and loads the mem-search skill (~250 tokens for skill frontmatter). +Fetch complete observation details by IDs. **Always batch multiple IDs in a single call for efficiency.** -### 3. 
Skill Uses HTTP API +**Parameters:** +- `ids` - Array of observation IDs (required) +- `orderBy` - Sort order (date_desc, date_asc) +- `limit` - Maximum observations to return +- `project` - Filter by project name -The skill calls the appropriate HTTP endpoint (e.g., `/api/search/observations`) with the query. +**Returns:** Complete observation details including narrative, facts, files, concepts -### 4. Results Formatted - -Results are formatted and presented to you, usually starting with an index/summary format. - -### 5. Deep Dive if Needed - -If you need more details, ask follow-up questions: +**Example:** ``` -"Tell me more about observation #123" -"Show me the full details of that decision" +get_observations(ids=[123, 456, 789, 1011]) ``` -**Benefits of This Approach:** -- **Token Efficient**: Only loads what you need, when you need it -- **Natural**: No syntax to learn -- **Progressive**: Start with overview, drill down as needed -- **Automatic**: Claude handles the search invocation +**Important:** Always batch IDs instead of making separate calls per observation. 
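The batching rule above can be sketched as a small client-side helper: collect candidate IDs while scanning the search index, then issue one call. This is a hypothetical sketch, not part of claude-mem — `call_tool` stands in for whatever MCP client invocation your environment provides, and the return shape is assumed:

```python
# Hypothetical client-side helper illustrating the "always batch" rule.
# `call_tool` is a stand-in for your MCP client's tool-invocation function;
# it is NOT part of claude-mem itself, and the result shape is assumed.

def fetch_observations(call_tool, ids):
    """Deduplicate IDs (preserving first-seen order) and fetch them in ONE call."""
    unique_ids = list(dict.fromkeys(ids))  # drop repeats, keep order
    if not unique_ids:
        return []
    # One round trip instead of len(unique_ids) separate get_observations calls.
    return call_tool("get_observations", {"ids": unique_ids})

# Stub client for demonstration: records each call and echoes the request.
calls = []
def stub_call(name, args):
    calls.append((name, args))
    return [{"id": i} for i in args["ids"]]

result = fetch_observations(stub_call, [123, 456, 123, 789])
print(len(calls), calls[0][1]["ids"])  # → 1 [123, 456, 789]
```

Per-ID calls pay the tool-call overhead (and a model round trip) once per observation; a single batched call pays it once in total.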
+
+## Common Use Cases
+
+### Debugging Issues
+
+**Scenario:** Find what went wrong with database connections
+
+```
+Step 1: search(query="error database connection", type="bugfix", limit=10)
+  → Review index, identify observations #245, #312, #489
+
+Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
+  → See what was happening around the fix
+
+Step 3: get_observations(ids=[312, 489])
+  → Get full details on relevant fixes
+```
+
+### Understanding Decisions
+
+**Scenario:** Review architectural choices about authentication
+
+```
+Step 1: search(query="authentication", type="decision", limit=5)
+  → Find decision observations
+
+Step 2: get_observations(ids=[<relevant-ids>])
+  → Get full decision rationale, trade-offs, facts
+```
+
+### Code Archaeology
+
+**Scenario:** Find when a specific file was modified
+
+```
+Step 1: search(query="worker-service.ts", limit=20)
+  → Get all observations mentioning that file
+
+Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
+  → See what led to and followed from the refactor
+
+Step 3: get_observations(ids=[<relevant-ids>])
+  → Get implementation details
+```
+
+### Feature History
+
+**Scenario:** Track how a feature evolved
+
+```
+Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
+  → Chronological view of feature work
+
+Step 2: timeline(anchor=<earliest-feature-id>, depth_after=10)
+  → See the full development timeline
+
+Step 3: get_observations(ids=[<relevant-ids>])
+  → Deep dive on critical implementation points
+```
+
+### Learning from Past Work
+
+**Scenario:** Review refactoring patterns
+
+```
+Step 1: search(type="refactor", limit=10, orderBy="date_desc")
+  → Recent refactoring work
+
+Step 2: get_observations(ids=[<relevant-ids>])
+  → Study the patterns and approaches used
+```
+
+### Context Recovery
+
+**Scenario:** Restore context after time away from a project
+
+```
+Step 1: search(project="<project-name>", limit=10, orderBy="date_desc")
+  → See recent work
+
+Step 2: timeline(anchor=<most-recent-id>, depth_before=10)
+  → Understand what
led to current state
+
+Step 3: get_observations(ids=[<relevant-ids>])
+  → Refresh memory on key decisions
+```
+
+## Search Query Syntax
+
+The `query` parameter supports SQLite FTS5 full-text search syntax:
+
+### Boolean Operators
+
+```
+query="authentication AND JWT"     # Both terms must appear
+query="OAuth OR JWT"               # Either term can appear
+query="security NOT deprecated"    # Exclude deprecated items
+```
+
+### Phrase Searches
+
+```
+query='"database migration"'       # Exact phrase match
+```
+
+### Column-Specific Searches
+
+```
+query="title:authentication"       # Search in title only
+query="content:database"           # Search in content only
+query="concepts:security"          # Search in concepts only
+```
+
+### Combining Operators
+
+```
+query='"user auth" AND (JWT OR session) NOT deprecated'
+```
+
+## Token Management
+
+### Token Efficiency Best Practices
+
+1. **Always start with search** - Get index first (~50-100 tokens/result)
+2. **Use small limits** - Start with 3-5 results, increase if needed
+3. **Filter before fetching** - Use type, date, project filters
+4. **Batch get_observations** - Always group multiple IDs in one call
+5. 
**Use timeline strategically** - Get context only when narrative matters + +### Token Cost Estimates + +| Operation | Tokens per Result | +|-----------|-------------------| +| search (index) | 50-100 | +| timeline (per observation) | 100-200 | +| get_observations (full details) | 500-1,000 | + +**Example Comparison:** + +**Inefficient:** +``` +# Fetching 20 full observations upfront: 10,000-20,000 tokens +get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]) +``` + +**Efficient:** +``` +# Search index: ~1,000 tokens +search(query="bug fix", limit=20) + +# Review IDs, identify 3 relevant observations + +# Fetch only relevant: ~1,500-3,000 tokens +get_observations(ids=[5, 12, 18]) + +# Total: 2,500-4,000 tokens (vs 10,000-20,000) +``` ## Advanced Filtering -You can refine searches using natural language filters: - ### Date Ranges ``` -"What bugs did we fix in October?" -"Show me work from last week" -"Find decisions made between October 1-31" +search( + query="performance optimization", + dateStart="2025-10-01", + dateEnd="2025-10-31" +) ``` ### Multiple Types -``` -"Show me all decisions and features" -"Find bugfixes and refactorings" -``` - -### Concepts +For observations of multiple types, make multiple searches or use broader query: ``` -"Find database work related to architecture and performance" -"Show security observations" +search(query="database", type="bugfix", limit=10) +search(query="database", type="feature", limit=10) ``` -### File-Specific +### Project-Specific ``` -"Show refactoring work that touched worker-service.ts" -"Find changes to auth files" +search(query="API", project="my-app", limit=15) ``` -### Project Filtering +### Pagination ``` -"Show authentication work on my-app project" -"What have we done on this codebase?" 
+# First page +search(query="refactor", limit=10, offset=0) + +# Second page +search(query="refactor", limit=10, offset=10) + +# Third page +search(query="refactor", limit=10, offset=20) ``` -**Note**: Claude translates your natural language into the appropriate API filters automatically. - -## Under the Hood: HTTP API - -The mem-search skill uses HTTP endpoints on the worker service (port 37777): - -- `GET /api/search/observations` - Full-text search observations -- `GET /api/search/sessions` - Full-text search session summaries -- `GET /api/search/prompts` - Full-text search user prompts -- `GET /api/search/by-concept` - Find observations by concept tag -- `GET /api/search/by-file` - Find work related to specific files -- `GET /api/search/by-type` - Find observations by type -- `GET /api/context/recent` - Get recent session context -- `GET /api/context/timeline` - Get timeline around specific point -- `GET /api/timeline/by-query` - Search + timeline in one call -- `GET /api/search/help` - API documentation - -These endpoints use FTS5 full-text search with support for: -- Boolean operators (AND, OR, NOT) -- Phrase searches -- Column-specific searches -- Date range filtering -- Project filtering - ## Result Metadata -All results include rich metadata: +All observations include rich metadata: -``` -## JWT authentication decision - -**Type**: decision -**Date**: 2025-10-21 14:23:45 -**Concepts**: authentication, security, architecture -**Files Read**: src/auth/middleware.ts, src/utils/jwt.ts -**Files Modified**: src/auth/jwt-strategy.ts - -**Narrative**: -Decided to implement JWT-based authentication instead of session-based -authentication for better scalability and stateless design... 
- -**Facts**: -• JWT tokens expire after 1 hour -• Refresh tokens stored in httpOnly cookies -• Token signing uses RS256 algorithm -• Public keys rotated every 30 days -``` - -## Citations - -All search results include observation IDs that can be accessed via the HTTP API: - -- `http://localhost:37777/api/observation/{id}` - Get specific observation by ID -- View all observations in the web viewer at `http://localhost:37777` - -These citations enable referencing specific historical context in your work. - -## Token Management - -### Token Efficiency Tips - -1. **Start with index format**: ~50-100 tokens per result -2. **Use small limits**: Start with 3-5 results -3. **Apply filters**: Narrow results before searching -4. **Paginate**: Use offset to browse results in batches - -### Token Estimates - -| Format | Tokens per Result | -|--------|-------------------| -| Index | 50-100 | -| Full | 500-1000 | - -**Example**: -- 20 results in index format: ~1,000-2,000 tokens -- 20 results in full format: ~10,000-20,000 tokens - -## Common Use Cases - -### 1. Debugging Issues - -Find what went wrong: -``` -search_observations with query="error database connection" and type="bugfix" -``` - -### 2. Understanding Decisions - -Review architectural choices: -``` -find_by_type with type="decision" and format="index" -``` - -Then deep dive on specific decisions: -``` -search_observations with query="[DECISION TITLE]" and format="full" -``` - -### 3. Code Archaeology - -Find when a file was modified: -``` -find_by_file with filePath="worker-service.ts" -``` - -### 4. Feature History - -Track feature development: -``` -search_sessions with query="authentication feature" -search_user_prompts with query="add authentication" -``` - -### 5. Learning from Past Work - -Review refactoring patterns: -``` -find_by_type with type="refactor" and limit=10 -``` - -### 6. 
Context Recovery - -Restore context after time away: -``` -get_recent_context with limit=5 -search_sessions with query="[YOUR PROJECT NAME]" and orderBy="date_desc" -``` - -## Best Practices - -1. **Index first, full later**: Always start with index format -2. **Small limits**: Start with 3-5 results to avoid token limits -3. **Use filters**: Narrow results before searching -4. **Specific queries**: More specific = better results -5. **Review citations**: Use citations to reference past decisions -6. **Date filtering**: Use date ranges for time-based searches -7. **Type filtering**: Use types to categorize searches -8. **Concept tags**: Use concepts for thematic searches +- **ID** - Unique observation identifier +- **Type** - bugfix, feature, decision, discovery, refactor, change +- **Date** - When the work occurred +- **Title** - Concise description +- **Concepts** - Tagged themes (e.g., security, performance, architecture) +- **Files Read** - Files examined during work +- **Files Modified** - Files changed during work +- **Narrative** - Story of what happened and why +- **Facts** - Key factual points (decisions made, patterns used, metrics) ## Troubleshooting ### No Results Found -1. Check database has data: +1. **Broaden your search:** + ``` + # Too specific + search(query="JWT authentication implementation with RS256") + + # Better + search(query="authentication") + ``` + +2. **Check database has data:** ```bash - sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;" + curl "http://localhost:37777/api/search?query=test" ``` -2. Try broader natural language query: +3. **Try without filters:** ``` - "Show me anything about authentication" # Broader - vs - "Find exact JWT authentication implementation" # Too specific + # Remove type/date filters to see if data exists + search(query="your-search-term") ``` -3. Ask without filters first: - ``` - "What do we have about auth?" 
-   # Then narrow down
-   "Show me auth-related decisions"
-   ```

+### IDs Not Found in get_observations

-### Worker Service Not Running
+**Error:** "Observation IDs not found: [123, 456]"

-If search isn't working, check the worker service:
+**Causes:**
+- IDs from different project (use `project` parameter)
+- IDs were deleted
+- Typo in ID numbers

-```bash
-npm run worker:status   # Check worker status
-npm run worker:restart  # Restart if needed
-npm run worker:logs     # View logs
-```
+**Solution:**
+```
+# Verify IDs exist
+search(query="<keywords-from-title>")
+
+# Use correct project filter
+get_observations(ids=[123, 456], project="correct-project-name")
+```

-Or describe the issue to Claude and the troubleshoot skill will automatically activate to provide diagnosis.
+### Token Limit Errors

-### Performance Issues
+**Error:** Response exceeds token limits
+
+**Solution:** Use the 3-layer workflow to reduce upfront costs:
+
+```
+# Instead of fetching 50 full observations:
+# get_observations(ids=[1,2,3,...,50])  # 25,000-50,000 tokens!
+
+# Do this:
+search(query="<your-query>", limit=50)  # ~2,500-5,000 tokens
+# Review index, identify 5 relevant observations
+get_observations(ids=[<5-most-relevant>])  # ~2,500-5,000 tokens
+# Total: 5,000-10,000 tokens (50-80% savings)
+```
+
+### Search Performance

 If searches seem slow:

-1. Be more specific in your queries
-2. Ask for recent work (naturally filters by date)
-3. Specify the project you're interested in
-4. Ask for fewer results initially
+1. Be more specific in queries (helps FTS5 index)
+2. Use date range filters to narrow scope
+3. Specify project filter when possible
+4. Use smaller limit values
+
+## Best Practices
+
+1. **Index First, Details Later** - Always start with search to survey options
+2. **Filter Before Fetching** - Use search parameters to narrow results
+3. **Batch ID Fetches** - Group multiple IDs in one get_observations call
+4. **Use Timeline for Context** - When narrative matters, timeline shows the story
+5. 
**Specific Queries** - More specific = better relevance +6. **Small Limits Initially** - Start with 3-5 results, expand if needed +7. **Review Before Deep Dive** - Check index before fetching full details ## Technical Details -**Architecture Change (v5.4.0)**: -- **Before**: 9 MCP tools (~2,500 tokens in tool definitions per session start) -- **After**: 1 mem-search skill (~250 tokens in frontmatter, full instructions loaded on-demand) -- **Savings**: ~2,250 tokens per session start -- **Migration**: Transparent - users don't need to change how they ask questions +**Architecture:** MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search. -**v5.5.0 Enhancement**: Renamed from "search" to "mem-search" with improved effectiveness (67% → 100%) and enhanced triggers (44% → 85%). +**MCP Server:** Located at `~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs` -**How the Skill Works:** -1. User asks a question about past work -2. Claude recognizes the intent matches the mem-search skill description -3. Skill loads full instructions from `plugin/skills/mem-search/SKILL.md` -4. Skill uses `curl` to call HTTP API endpoints -5. Results formatted and returned to Claude -6. 
Claude presents results to user +**Worker Service:** Express API on port 37777, managed by Bun + +**Database:** SQLite FTS5 full-text search on `~/.claude-mem/claude-mem.db` + +**Vector Search:** Chroma embeddings for semantic search (underlying implementation) ## Next Steps +- [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow - [Architecture Overview](/architecture/overview) - System components -- [Database Schema](/architecture/database) - Understanding the data -- [Getting Started](/usage/getting-started) - Automatic operation +- [Database Schema](/architecture/database) - Understanding the data structure +- [Claude Desktop Setup](/usage/claude-desktop) - Installation and configuration