Replace search skill with mem-search (#91)

* feat: add mem-search skill with progressive disclosure architecture

Add comprehensive mem-search skill for accessing claude-mem's persistent
cross-session memory database. Implements progressive disclosure workflow
and token-efficient search patterns.

Features:
- 12 search operations (observations, sessions, prompts, by-type, by-concept, by-file, timelines, etc.)
- Progressive disclosure principles to minimize token usage
- Anti-patterns documentation to guide LLM behavior
- HTTP API integration for all search functionality
- Common workflows with composition examples

Structure:
- SKILL.md: Entry point with temporal trigger patterns
- principles/: Progressive disclosure + anti-patterns
- operations/: 12 search operation files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add CHANGELOG entry for mem-search skill

Document mem-search skill addition in Unreleased section with:
- 100% effectiveness compliance metrics
- Comparison to previous search skill implementation
- Progressive disclosure architecture details
- Reference to audit report documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add mem-search skill audit report

Add comprehensive audit report validating mem-search skill against
Anthropic's official skill-creator documentation.

Report includes:
- Effectiveness metrics comparison (search vs mem-search)
- Critical issues analysis for production readiness
- Compliance validation across 6 key dimensions
- Reference implementation guidance

Result: mem-search achieves 100% compliance vs search's 67%

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add comprehensive search architecture analysis document

- Document current state of dual search architectures (HTTP API and MCP)
- Analyze HTTP endpoints and MCP search server architectures
- Identify DRY violations across search implementations
- Evaluate whether curl is the optimal approach for search
- Provide architectural recommendations for immediate and long-term improvements
- Outline an action plan for cleanup, feature parity, and DRY refactoring

* refactor: Remove deprecated search skill documentation and operations

* refactor: Reorganize documentation into public and context directories

Changes:
- Created docs/public/ for Mintlify documentation (.mdx files)
- Created docs/context/ for internal planning and implementation docs
- Moved all .mdx files and assets to docs/public/
- Moved all internal .md files to docs/context/
- Added CLAUDE.md to both directories explaining their purpose
- Updated docs.json paths to work with new structure

Benefits:
- Clear separation between user-facing and internal documentation
- Easier to maintain Mintlify docs in dedicated directory
- Internal context files organized separately

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Enhance session management and continuity in hooks

- Updated new-hook.ts to clarify session_id threading and idempotent session creation.
- Modified prompts.ts to require claudeSessionId for continuation prompts, ensuring session context is maintained.
- Improved SessionStore.ts documentation on createSDKSession to emphasize idempotent behavior and session connection.
- Refined SDKAgent.ts to detail continuation prompt logic and its reliance on session.claudeSessionId for unified session handling.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Alex Newman <thedotmack@gmail.com>
This commit is contained in:
basher83, 2025-11-11 16:15:07 -05:00 (committed by GitHub)
commit 97d565e3cd (parent eafdd6a7be)
92 changed files with 5038 additions and 1812 deletions
@@ -0,0 +1,120 @@
# Progressive Disclosure Pattern (MANDATORY)

**Core Principle**: Find the smallest set of high-signal tokens first (index format), then drill down to full details only for relevant items.

## The 4-Step Workflow

### Step 1: Start with Index Format

**Action:**
- Use `format=index` (default in most operations)
- Set `limit=3-5` (not 20)
- Review titles and dates ONLY

**Token Cost:** ~50-100 tokens per result

**Why:** Minimal token investment for maximum signal. Get an overview before committing to full details.

**Example:**
```bash
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5"
```
**Response:**
```json
{
"query": "authentication",
"count": 5,
"format": "index",
"results": [
{
"id": 1234,
"type": "feature",
"title": "Implemented JWT authentication",
"subtitle": "Added token-based auth with refresh tokens",
"created_at_epoch": 1699564800000,
"project": "api-server"
}
]
}
```
### Step 2: Identify Relevant Items

**Cognitive Task:**
- Scan index results for relevance
- Note which items need full details
- Discard irrelevant items

**Why:** Human-in-the-loop filtering before expensive operations. Don't load full details for items you'll ignore.
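As a sketch of this filtering step (assuming `jq` is available, and using the index response shape shown in Step 1), the IDs and titles can be scanned without loading anything else:

```shell
# Hypothetical index response saved from Step 1; field names match the
# Step 1 example, but the exact payload here is illustrative only.
cat > index.json <<'EOF'
{
  "query": "authentication",
  "count": 5,
  "format": "index",
  "results": [
    {"id": 1234, "type": "feature", "title": "Implemented JWT authentication", "created_at_epoch": 1699564800000},
    {"id": 1198, "type": "bugfix", "title": "Fixed session cookie expiry", "created_at_epoch": 1699478400000}
  ]
}
EOF

# One compact line per result: enough signal to decide what deserves format=full
jq -r '.results[] | "\(.id)\t\(.type)\t\(.title)"' index.json
```

Only the handful of IDs judged relevant here move on to Step 3; everything else is discarded at index cost.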
### Step 3: Request Full Details (Selectively)

**Action:**
- Use `format=full` ONLY for specific items of interest
- Target by ID or use a refined search query

**Token Cost:** ~500-1000 tokens per result

**Principle:** Load only what you need.

**Example:**
```bash
# After reviewing the index, fetch full details for only the relevant item
# (here the third index result, selected via limit=1 and offset=2)
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=1&offset=2"
```
**Why:** Targeted token expenditure with high ROI. 10x cost difference means selectivity matters.
### Step 4: Refine with Filters (If Needed)

**Techniques:**
- Use `type`, `dateRange`, `concepts`, `files` filters
- Narrow scope BEFORE requesting more results
- Use `offset` for pagination instead of large limits

**Why:** Reduce the result set first, then expand selectively. Don't load 20 results when filters could narrow it to 3.
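A minimal sketch of composing such a filtered query. The filter parameter names come from the list above; the value syntax (notably the `dateRange` format) is an assumption for illustration, not a documented contract:

```shell
BASE="http://localhost:37777/api/search/observations"

# Assumed filter values; dateRange syntax is hypothetical
QUERY="authentication"
TYPE="bugfix"
DATE_RANGE="2024-01-01..2024-06-30"

URL="${BASE}?query=${QUERY}&type=${TYPE}&dateRange=${DATE_RANGE}&format=index&limit=3"
echo "$URL"

# With a running claude-mem server this would then be fetched with:
#   curl -s "$URL"
```

Note the filters are applied while still in `format=index`: narrow first, and only then spend full-format tokens on what survives.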
## Token Budget Awareness

**Costs:**
- Index result: ~50-100 tokens
- Full result: ~500-1000 tokens
- 10x cost difference

**Starting Points:**
- Start with `limit=3-5` (not 20)
- Reduce the limit if hitting token errors
**Savings Example:**
- Naive: 10 items × 750 tokens (avg full) = 7,500 tokens
- Progressive: (5 items × 75 tokens index) + (2 items × 750 tokens full) = 1,875 tokens
- **Savings: 5,625 tokens (75% reduction)**
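The arithmetic above can be checked directly; the per-result costs are the mid-range estimates from this section:

```shell
# Mid-range cost estimates from the table above
INDEX_COST=75
FULL_COST=750

NAIVE=$((10 * FULL_COST))                       # load all 10 items in full
PROGRESSIVE=$((5 * INDEX_COST + 2 * FULL_COST)) # 5 index rows, then 2 full fetches
SAVED=$((NAIVE - PROGRESSIVE))
PCT=$((100 * SAVED / NAIVE))

echo "naive=${NAIVE} progressive=${PROGRESSIVE} saved=${SAVED} (${PCT}%)"
# → naive=7500 progressive=1875 saved=5625 (75%)
```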
## What Problems This Solves

1. **Token exhaustion**: Without this pattern, LLMs load everything in full format (9,000+ tokens for 10 items)
2. **Poor signal-to-noise**: Loading full details for irrelevant items wastes tokens
3. **MCP limits**: Large payloads hit protocol limits (system failures)
4. **Inefficiency**: Loading 20 full results when only 2 are relevant
## How It Scales

**With 10 records:**
- Index (500 tokens) → Full (2,000 tokens for 2 relevant) = 2,500 tokens
- Without the pattern: Full (10,000 tokens for all 10) = 4x more expensive

**With 1,000 records:**
- Index (500 tokens for top 5) → Full (1,000 tokens for 1 relevant) = 1,500 tokens
- Without the pattern: would hit MCP limits before seeing relevant data
## Context Engineering Alignment

This pattern implements core context engineering principles:
- **Just-in-time context**: Load data dynamically at runtime
- **Progressive disclosure**: Lightweight identifiers (index) → full details as needed
- **Token efficiency**: Minimal high-signal tokens first, expand selectively
- **Attention budget**: Treat context as a finite resource with diminishing returns

Always start with the smallest set of high-signal tokens that maximizes the likelihood of the desired outcome.