claude-mem/plugin/skills/mem-search/principles/progressive-disclosure.md
basher83 97d565e3cd Replace search skill with mem-search (#91), 2025-11-11

Progressive Disclosure Pattern (MANDATORY)

Core Principle: Find the smallest set of high-signal tokens first (index format), then drill down to full details only for relevant items.

The 4-Step Workflow

Step 1: Start with Index Format

Action:

  • Use format=index (default in most operations)
  • Set limit=3-5 (not 20)
  • Review titles and dates ONLY

Token Cost: ~50-100 tokens per result

Why: Minimal token investment for maximum signal. Get overview before committing to full details.

Example:

curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5"

Response:

{
  "query": "authentication",
  "count": 5,
  "format": "index",
  "results": [
    {
      "id": 1234,
      "type": "feature",
      "title": "Implemented JWT authentication",
      "subtitle": "Added token-based auth with refresh tokens",
      "created_at_epoch": 1699564800000,
      "project": "api-server"
    }
  ]
}

Step 2: Identify Relevant Items

Cognitive Task:

  • Scan index results for relevance
  • Note which items need full details
  • Discard irrelevant items

Why: Cheap relevance filtering before expensive retrieval. Don't load full details for items you'll ignore.
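This scan can be kept cheap by projecting the index response down to just ids and titles before deciding what to fetch in full. A minimal sketch, assuming `jq` is installed; the sample response is abbreviated from Step 1, and the second entry is hypothetical:

```shell
# Abbreviated index response from Step 1 (second entry is hypothetical)
response='{"results":[{"id":1234,"title":"Implemented JWT authentication"},{"id":1301,"title":"Fixed OAuth redirect loop"}]}'

# Reduce to id/title pairs for a quick relevance scan
echo "$response" | jq -r '.results[] | "\(.id)\t\(.title)"'
```

In practice, pipe the Step 1 curl output straight into the same `jq` filter.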

Step 3: Request Full Details (Selectively)

Action:

  • Use format=full ONLY for specific items of interest
  • Target by ID or use refined search query

Token Cost: ~500-1000 tokens per result

Principle: Load only what you need

Example:

# After reviewing the index, fetch full details for the single relevant item;
# limit=1 with offset targets its position in the index result list
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=1&offset=2"

Why: Targeted token expenditure with high ROI. 10x cost difference means selectivity matters.

Step 4: Refine with Filters (If Needed)

Techniques:

  • Use type, dateRange, concepts, files filters
  • Narrow scope BEFORE requesting more results
  • Use offset for pagination instead of large limits

Why: Reduce result set first, then expand selectively. Don't load 20 results when filters could narrow to 3.
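The refinement above amounts to composing a narrower query before any expensive call. A sketch of that composition follows; the `type` and `dateRange` values are hypothetical, and the exact value formats the API accepts are an assumption (the parameter names come from the Techniques list):

```shell
base="http://localhost:37777/api/search/observations"

# Narrow scope with filters first, still at index format
filtered="${base}?query=authentication&type=bugfix&dateRange=7d&format=index&limit=3"

# Paginate with offset rather than raising limit
next_page="${base}?query=authentication&type=bugfix&format=index&limit=3&offset=3"

printf '%s\n' "$filtered" "$next_page"
# Then fetch each page with: curl -s "$filtered"
```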

Token Budget Awareness

Costs:

  • Index result: ~50-100 tokens
  • Full result: ~500-1000 tokens
  • 10x cost difference

Starting Points:

  • Start with limit=3-5 (not 20)
  • Reduce limit if hitting token errors

Savings Example:

  • Naive: 10 items × 750 tokens (avg full) = 7,500 tokens
  • Progressive: (5 items × 75 tokens index) + (2 items × 750 tokens full) = 1,875 tokens
  • Savings: 5,625 tokens (75% reduction)
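The savings arithmetic above can be checked directly in the shell, using the same assumed average costs:

```shell
# Progressive: scan 5 index entries, then pull 2 full results
index_cost=$((5 * 75))     # ~75 tokens per index result
full_cost=$((2 * 750))     # ~750 tokens per full result
progressive=$((index_cost + full_cost))

# Naive: pull all 10 results in full format up front
naive=$((10 * 750))

echo "progressive=${progressive} naive=${naive} saved=$((naive - progressive))"
# prints: progressive=1875 naive=7500 saved=5625
```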

What Problems This Solves

  1. Token exhaustion: Without this, LLMs load everything in full format (7,500+ tokens for 10 items at the average full-result cost)
  2. Poor signal-to-noise: Loading full details for irrelevant items wastes tokens
  3. MCP limits: Oversized payloads exceed protocol limits, causing hard failures rather than degraded results
  4. Inefficiency: Loading 20 full results when only 2 are relevant

How It Scales

With 10 records:

  • Index (500 tokens) → Full (2,000 tokens for 2 relevant) = 2,500 tokens
  • Without pattern: Full (10,000 tokens for all 10) = 4x more expensive

With 1,000 records:

  • Index (500 tokens for top 5) → Full (1,000 tokens for 1 relevant) = 1,500 tokens
  • Without pattern: Would hit MCP limits before seeing relevant data

Context Engineering Alignment

This pattern implements core context engineering principles:

  • Just-in-time context: Load data dynamically at runtime
  • Progressive disclosure: Lightweight identifiers (index) → full details as needed
  • Token efficiency: Minimal high-signal tokens first, expand selectively
  • Attention budget: Treat context as finite resource with diminishing returns

Always start with the smallest set of high-signal tokens that maximizes the likelihood of the desired outcome.