Replace search skill with mem-search (#91)

* feat: add mem-search skill with progressive disclosure architecture

Add comprehensive mem-search skill for accessing claude-mem's persistent
cross-session memory database. Implements progressive disclosure workflow
and token-efficient search patterns.

Features:
- 12 search operations (observations, sessions, prompts, by-type, by-concept, by-file, timelines, etc.)
- Progressive disclosure principles to minimize token usage
- Anti-patterns documentation to guide LLM behavior
- HTTP API integration for all search functionality
- Common workflows with composition examples

Structure:
- SKILL.md: Entry point with temporal trigger patterns
- principles/: Progressive disclosure + anti-patterns
- operations/: 12 search operation files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add CHANGELOG entry for mem-search skill

Document mem-search skill addition in Unreleased section with:
- 100% effectiveness compliance metrics
- Comparison to previous search skill implementation
- Progressive disclosure architecture details
- Reference to audit report documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add mem-search skill audit report

Add comprehensive audit report validating mem-search skill against
Anthropic's official skill-creator documentation.

Report includes:
- Effectiveness metrics comparison (search vs mem-search)
- Critical issues analysis for production readiness
- Compliance validation across 6 key dimensions
- Reference implementation guidance

Result: mem-search achieves 100% compliance vs search's 67%

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add comprehensive search architecture analysis document

- Document current state of dual search architectures (HTTP API and MCP)
- Analyze HTTP endpoints and MCP search server architectures
- Identify DRY violations across search implementations
- Evaluate whether curl is the optimal approach for search
- Provide architectural recommendations for immediate and long-term improvements
- Outline an action plan for cleanup, feature parity, and DRY refactoring

* refactor: Remove deprecated search skill documentation and operations

* refactor: Reorganize documentation into public and context directories

Changes:
- Created docs/public/ for Mintlify documentation (.mdx files)
- Created docs/context/ for internal planning and implementation docs
- Moved all .mdx files and assets to docs/public/
- Moved all internal .md files to docs/context/
- Added CLAUDE.md to both directories explaining their purpose
- Updated docs.json paths to work with new structure

Benefits:
- Clear separation between user-facing and internal documentation
- Easier to maintain Mintlify docs in dedicated directory
- Internal context files organized separately

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Enhance session management and continuity in hooks

- Updated new-hook.ts to clarify session_id threading and idempotent session creation.
- Modified prompts.ts to require claudeSessionId for continuation prompts, ensuring session context is maintained.
- Improved SessionStore.ts documentation on createSDKSession to emphasize idempotent behavior and session connection.
- Refined SDKAgent.ts to detail continuation prompt logic and its reliance on session.claudeSessionId for unified session handling.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Alex Newman <thedotmack@gmail.com>
This commit is contained in:
basher83, 2025-11-11 16:15:07 -05:00 (committed by GitHub)
commit 97d565e3cd (parent eafdd6a7be)
92 changed files with 5038 additions and 1812 deletions
@@ -0,0 +1,120 @@
# Progressive Disclosure Pattern (MANDATORY)

**Core Principle**: Find the smallest set of high-signal tokens first (index format), then drill down to full details only for relevant items.

## The 4-Step Workflow

### Step 1: Start with Index Format

**Action:**
- Use `format=index` (default in most operations)
- Set `limit=3-5` (not 20)
- Review titles and dates ONLY

**Token Cost:** ~50-100 tokens per result

**Why:** Minimal token investment for maximum signal. Get an overview before committing to full details.

**Example:**
```bash
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5"
```
**Response:**
```json
{
"query": "authentication",
"count": 5,
"format": "index",
"results": [
{
"id": 1234,
"type": "feature",
"title": "Implemented JWT authentication",
"subtitle": "Added token-based auth with refresh tokens",
"created_at_epoch": 1699564800000,
"project": "api-server"
}
]
}
```
### Step 2: Identify Relevant Items

**Cognitive Task:**
- Scan index results for relevance
- Note which items need full details
- Discard irrelevant items

**Why:** Human-in-the-loop filtering before expensive operations. Don't load full details for items you'll ignore.
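As a sketch of this filtering step (assuming `jq` is available, and using the index response shape shown in Step 1), the IDs and titles can be scanned without loading anything else:

```shell
# Hypothetical index response saved from Step 1; field names match the
# Step 1 example, but the exact payload here is illustrative only.
cat > index.json <<'EOF'
{
  "query": "authentication",
  "count": 5,
  "format": "index",
  "results": [
    {"id": 1234, "type": "feature", "title": "Implemented JWT authentication", "created_at_epoch": 1699564800000},
    {"id": 1198, "type": "bugfix", "title": "Fixed session cookie expiry", "created_at_epoch": 1699478400000}
  ]
}
EOF

# One compact line per result: enough signal to decide what deserves format=full
jq -r '.results[] | "\(.id)\t\(.type)\t\(.title)"' index.json
```

Only the handful of IDs judged relevant here move on to Step 3; everything else is discarded at index cost.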
### Step 3: Request Full Details (Selectively)

**Action:**
- Use `format=full` ONLY for specific items of interest
- Target by ID or use a refined search query

**Token Cost:** ~500-1000 tokens per result

**Principle:** Load only what you need.

**Example:**
```bash
# After reviewing the index, fetch full details for only the relevant item
# (here the third index result, selected via limit=1 and offset=2)
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=1&offset=2"
```
**Why:** Targeted token expenditure with high ROI. 10x cost difference means selectivity matters.
### Step 4: Refine with Filters (If Needed)

**Techniques:**
- Use `type`, `dateRange`, `concepts`, `files` filters
- Narrow scope BEFORE requesting more results
- Use `offset` for pagination instead of large limits

**Why:** Reduce the result set first, then expand selectively. Don't load 20 results when filters could narrow it to 3.
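A minimal sketch of composing such a filtered query. The filter parameter names come from the list above; the value syntax (notably the `dateRange` format) is an assumption for illustration, not a documented contract:

```shell
BASE="http://localhost:37777/api/search/observations"

# Assumed filter values; dateRange syntax is hypothetical
QUERY="authentication"
TYPE="bugfix"
DATE_RANGE="2024-01-01..2024-06-30"

URL="${BASE}?query=${QUERY}&type=${TYPE}&dateRange=${DATE_RANGE}&format=index&limit=3"
echo "$URL"

# With a running claude-mem server this would then be fetched with:
#   curl -s "$URL"
```

Note the filters are applied while still in `format=index`: narrow first, and only then spend full-format tokens on what survives.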
## Token Budget Awareness

**Costs:**
- Index result: ~50-100 tokens
- Full result: ~500-1000 tokens
- 10x cost difference

**Starting Points:**
- Start with `limit=3-5` (not 20)
- Reduce the limit if hitting token errors
**Savings Example:**
- Naive: 10 items × 750 tokens (avg full) = 7,500 tokens
- Progressive: (5 items × 75 tokens index) + (2 items × 750 tokens full) = 1,875 tokens
- **Savings: 5,625 tokens (75% reduction)**
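The arithmetic above can be checked directly; the per-result costs are the mid-range estimates from this section:

```shell
# Mid-range cost estimates from the table above
INDEX_COST=75
FULL_COST=750

NAIVE=$((10 * FULL_COST))                       # load all 10 items in full
PROGRESSIVE=$((5 * INDEX_COST + 2 * FULL_COST)) # 5 index rows, then 2 full fetches
SAVED=$((NAIVE - PROGRESSIVE))
PCT=$((100 * SAVED / NAIVE))

echo "naive=${NAIVE} progressive=${PROGRESSIVE} saved=${SAVED} (${PCT}%)"
# → naive=7500 progressive=1875 saved=5625 (75%)
```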
## What Problems This Solves

1. **Token exhaustion**: Without this pattern, LLMs load everything in full format (9,000+ tokens for 10 items)
2. **Poor signal-to-noise**: Loading full details for irrelevant items wastes tokens
3. **MCP limits**: Large payloads hit protocol limits (system failures)
4. **Inefficiency**: Loading 20 full results when only 2 are relevant
## How It Scales

**With 10 records:**
- Index (500 tokens) → Full (2,000 tokens for 2 relevant) = 2,500 tokens
- Without the pattern: Full (10,000 tokens for all 10) = 4x more expensive

**With 1,000 records:**
- Index (500 tokens for top 5) → Full (1,000 tokens for 1 relevant) = 1,500 tokens
- Without the pattern: would hit MCP limits before seeing relevant data
## Context Engineering Alignment

This pattern implements core context engineering principles:
- **Just-in-time context**: Load data dynamically at runtime
- **Progressive disclosure**: Lightweight identifiers (index) → full details as needed
- **Token efficiency**: Minimal high-signal tokens first, expand selectively
- **Attention budget**: Treat context as a finite resource with diminishing returns

Always start with the smallest set of high-signal tokens that maximizes the likelihood of the desired outcome.