# Skill Audit Report

**Date:** 2025-11-10
**Validation:** Anthropic's official skill-creator documentation
**Skills Audited:** mem-search, search

## Executive Summary

The mem-search skill achieves 100% compliance across all dimensions. The search skill meets technical requirements but fails effectiveness metrics critical for auto-invocation.

**mem-search:** Production-ready. No changes required.

**search:** Requires three critical fixes before Claude reliably discovers and invokes this skill.

## mem-search Skill Results

**Status:** ✅ PASS
**Compliance:** 100% technical, 100% effectiveness
**Files:** 17 (202-line SKILL.md + 13 operations + 2 principles)

### Strengths

The skill demonstrates exemplary effectiveness engineering:

1. **Trigger Design (85% concrete)**
   - Five unique identifiers: claude-mem, PM2-managed database, cross-session memory, session summaries, observations
   - Nine scope differentiation keywords
   - Explicit boundary: "NOT in the current conversation context"
   - Minimal overlap with Claude's native capabilities

2. **Capability Visibility (100%)**
   - All nine operations include inline "Use when" examples
   - Decision guide reduces complexity from nine operations to five common cases
   - No navigation friction

3. **Structure**
   - 202 lines (60% under limit)
   - Perfect progressive disclosure with token cost documentation
   - Clean file organization: operations/ and principles/ directories
   - No content duplication

### Issues

**One false positive:** Line 152 contains backslashes in regex notation `(bugfix\|feature\|decision)`. This documents parameter syntax, not Windows paths. No action required.

## search Skill Results

**Status:** ⚠️ NEEDS IMPROVEMENT
**Compliance:** 100% technical, 67% effectiveness
**Files:** 13 (96-line SKILL.md + 12 operations)

### Critical Effectiveness Issues

Three failures prevent reliable auto-invocation:

#### Issue 1: Insufficient Scope Differentiation

**Problem:** Description contains only two differentiation keywords (threshold: ≥3). Claude cannot distinguish this skill from native conversation memory.

**Current description:**
```text
Search claude-mem persistent memory for past sessions, observations, bugs
fixed, features implemented, decisions made, code changes, and previous work.
Use when answering questions about history, finding past decisions, or
researching previous implementations.
```

**Domain overlap analysis:**
- Claude answers natively: "What bugs did we fix?" (current conversation)
- Claude needs skill: "What bugs did we fix last week?" (external database)

**Fix required:**

```text
Search claude-mem's external database of past sessions, observations, and
work from previous conversations. Accesses persistent memory stored outside
current session context - NOT information from today's conversation. Use when
users ask about: (1) previous sessions ("what did we do last week?"),
(2) historical work ("bugs we fixed months ago"), (3) cross-session patterns
("how have we approached this before?"), (4) work already stored in claude-mem
("what's in the database about X?"). Searches FTS5 full-text index across
typed observations (bugfix/feature/refactor/decision/discovery). For current
session memory, use native conversation context instead.
```

This adds eight differentiation keywords: "external database", "past sessions", "previous conversations", "outside current session", "NOT information from today's", "last week", "months ago", "already stored in claude-mem".

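The keyword count for the proposed description can be reproduced with a short script. This is a minimal sketch, not the audit's actual tooling: the function names are hypothetical, the keyword list is limited to the eight phrases named in this report, and matching is a plain case-insensitive substring test after collapsing whitespace. Because the audit's full keyword list is not given here, counts for other descriptions may differ from the report's figures.

```python
import re

# The eight phrases this report lists as scope-differentiation keywords.
DIFFERENTIATION_KEYWORDS = [
    "external database",
    "past sessions",
    "previous conversations",
    "outside current session",
    "NOT information from today's",
    "last week",
    "months ago",
    "already stored in claude-mem",
]

# Threshold stated in the report: at least three keywords.
THRESHOLD = 3


def normalize(text: str) -> str:
    """Collapse runs of whitespace (including line wraps) and lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_keywords(description: str, keywords=DIFFERENTIATION_KEYWORDS) -> list:
    """Return the keywords that occur in the description (hypothetical helper)."""
    haystack = normalize(description)
    return [kw for kw in keywords if normalize(kw) in haystack]


PROPOSED = """
Search claude-mem's external database of past sessions, observations, and
work from previous conversations. Accesses persistent memory stored outside
current session context - NOT information from today's conversation. Use when
users ask about: (1) previous sessions ("what did we do last week?"),
(2) historical work ("bugs we fixed months ago"), (3) cross-session patterns
("how have we approached this before?"), (4) work already stored in claude-mem
("what's in the database about X?"). Searches FTS5 full-text index across
typed observations (bugfix/feature/refactor/decision/discovery). For current
session memory, use native conversation context instead.
"""

found = find_keywords(PROPOSED)
passes = len(found) >= THRESHOLD
print(f"{len(found)} keywords found, passes threshold: {passes}")
```

Whitespace is normalized before matching because the description is hard-wrapped, and "outside current session" spans a line break in the text above.
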
#### Issue 2: Weak Trigger Specificity

**Problem:** Only 44% concrete triggers (threshold: >50%). Only one unique identifier (threshold: ≥2).

**Abstract triggers (low specificity):**
- "history" (could mean git history, browser history)
- "past work" (could mean files, commits, documents)
- "decisions" (could mean any decision tracking)
- "previous work" (could mean current session earlier)
- "implementations" (could mean code in current conversation)

**Concrete triggers (high specificity):**
- "claude-mem" (unique system name)
- "persistent memory" (system-specific)
- "sessions" (cross-session concept)
- "observations" (system-specific)

**Concrete ratio:** 4/9 = 44% (fails 50% threshold)

**Fix required:** Add system-specific terminology: "HTTP API", "port 37777", "FTS5 full-text index", "typed observations". The combined description in the Issue 1 fix already includes the last two of these.

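The concrete-trigger ratio above is simple arithmetic over the two lists. A minimal sketch follows, assuming the audit classified each trigger by hand; the function name is hypothetical, and the lists are copied from this section.

```python
# Trigger classifications copied from this section of the report.
ABSTRACT_TRIGGERS = [
    "history",
    "past work",
    "decisions",
    "previous work",
    "implementations",
]
CONCRETE_TRIGGERS = [
    "claude-mem",
    "persistent memory",
    "sessions",
    "observations",
]

# Threshold stated in the report: strictly more than 50% concrete.
CONCRETE_THRESHOLD = 0.5


def concrete_ratio(concrete: list, abstract: list) -> float:
    """Fraction of all triggers that are concrete (hypothetical helper)."""
    total = len(concrete) + len(abstract)
    return len(concrete) / total if total else 0.0


ratio = concrete_ratio(CONCRETE_TRIGGERS, ABSTRACT_TRIGGERS)
passes = ratio > CONCRETE_THRESHOLD
print(f"Concrete ratio: {ratio:.0%}, passes threshold: {passes}")
```

With four concrete triggers out of nine the ratio stays below the threshold. Reclassifying or adding a single concrete trigger (5/10) would still fail, since the threshold is strict; a sixth concrete trigger (6/11, about 55%) would pass.
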
#### Issue 3: Wasted Content in Body

**Problem:** Lines 10-22 of the SKILL.md body contain a "When to Use This Skill" section. The body loads only after the skill has triggered, so this section costs ~200 tokens and cannot influence whether the skill is invoked.

**Reference:** [Anthropic's skill-creator documentation](https://github.com/anthropics/anthropic-quickstarts/tree/main/skill-creator) states: "The body is only loaded after triggering, so 'When to Use This Skill' sections in the body are not helpful to Claude."

**Fix required:** Delete lines 10-22 entirely. Move triggering examples to description field (already included in Issue 1 fix).

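A check like this one can be automated. The sketch below is an illustration under stated assumptions, not the audit's method: it assumes SKILL.md opens with YAML frontmatter delimited by `---` lines, the function name and sample file are hypothetical, and it does not skip headings that appear inside fenced code blocks.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def find_when_to_use_section(skill_md: str):
    """Locate a 'When to Use' section in the body of a SKILL.md file.

    Returns (start, end) as 1-based inclusive line numbers, or None.
    The section runs until the next heading of the same or higher level.
    """
    lines = skill_md.splitlines()

    # Skip the frontmatter block, if present.
    body_start = 0
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                body_start = i + 1
                break

    start = None
    level = None
    for i in range(body_start, len(lines)):
        match = HEADING.match(lines[i])
        if not match:
            continue
        if start is None:
            if "when to use" in match.group(2).lower():
                start, level = i, len(match.group(1))
        elif len(match.group(1)) <= level:
            return (start + 1, i)

    if start is not None:
        return (start + 1, len(lines))
    return None


# Hypothetical SKILL.md content, for illustration only.
SAMPLE = """---
name: search
description: Search claude-mem persistent memory.
---
# Search

## When to Use This Skill

- Questions about history

## Operations
"""

print(find_when_to_use_section(SAMPLE))
```

The returned range identifies the lines to delete; the triggering examples they contain belong in the frontmatter `description` field instead.
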
### Strengths

The skill demonstrates strong structure:

- Excellent progressive disclosure (96-line navigation hub)
- Strong decision guide (reduces 10 operations to common cases)
- 100% capability visibility (all operations show purpose inline)
- No forbidden files or content duplication
- Clean operations/ directory structure

### Warning

**Minor:** Description uses imperative "Use when" instead of third person. Change to "Useful for" or "Invoked when" for consistency with skill-creator best practices.

## Comparison

| Metric | mem-search | search | Impact |
|--------|-----------|---------|--------|
| Concrete triggers | 85% | 44% | search harder to discover |
| Unique identifiers | 5+ | 1 | search less distinct |
| Scope differentiation | 9 keywords | 2 keywords | **search conflicts with native memory** |
| Body optimization | Clean | Wasted section | search wastes tokens |
| Overall effectiveness | 100% | 67% | search needs fixes |

## Critical Recommendations

The search skill requires three changes before production use:

1. **Rewrite description** to add scope differentiation and concrete triggers (see Issue 1 fix)
2. **Delete lines 10-22** from SKILL.md body
3. **Convert to third person** by changing "Use when" to "Useful for"

**Why this matters:** Without scope differentiation, Claude assumes "What bugs did we fix?" refers to the current conversation, not the external claude-mem database. This causes systematic under-invocation.

## Reference Implementation

The mem-search skill serves as a reference implementation for:

- Trigger design with explicit scope boundaries
- Progressive disclosure with token efficiency documentation
- Inline capability visibility eliminating navigation friction
- Decision guides reducing cognitive load

Study mem-search when creating skills that overlap with Claude's native capabilities.