Files

T

basher83 97d565e3cd Replace search skill with mem-search (#91 )

* feat: add mem-search skill with progressive disclosure architecture

Add comprehensive mem-search skill for accessing claude-mem's persistent
cross-session memory database. Implements progressive disclosure workflow
and token-efficient search patterns.

Features:
- 12 search operations (observations, sessions, prompts, by-type, by-concept, by-file, timelines, etc.)
- Progressive disclosure principles to minimize token usage
- Anti-patterns documentation to guide LLM behavior
- HTTP API integration for all search functionality
- Common workflows with composition examples

Structure:
- SKILL.md: Entry point with temporal trigger patterns
- principles/: Progressive disclosure + anti-patterns
- operations/: 12 search operation files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add CHANGELOG entry for mem-search skill

Document mem-search skill addition in Unreleased section with:
- 100% effectiveness compliance metrics
- Comparison to previous search skill implementation
- Progressive disclosure architecture details
- Reference to audit report documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: add mem-search skill audit report

Add comprehensive audit report validating mem-search skill against
Anthropic's official skill-creator documentation.

Report includes:
- Effectiveness metrics comparison (search vs mem-search)
- Critical issues analysis for production readiness
- Compliance validation across 6 key dimensions
- Reference implementation guidance

Result: mem-search achieves 100% compliance vs search's 67%

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add comprehensive search architecture analysis document

- Document current state of dual search architectures (HTTP API and MCP)
- Analyze HTTP endpoints and MCP search server architectures
- Identify DRY violations across search implementations
- Evaluate the use of curl as the optimal approach for search
- Provide architectural recommendations for immediate and long-term improvements
- Outline action plan for cleanup, feature parity, DRY refactoring

* refactor: Remove deprecated search skill documentation and operations

* refactor: Reorganize documentation into public and context directories

Changes:
- Created docs/public/ for Mintlify documentation (.mdx files)
- Created docs/context/ for internal planning and implementation docs
- Moved all .mdx files and assets to docs/public/
- Moved all internal .md files to docs/context/
- Added CLAUDE.md to both directories explaining their purpose
- Updated docs.json paths to work with new structure

Benefits:
- Clear separation between user-facing and internal documentation
- Easier to maintain Mintlify docs in dedicated directory
- Internal context files organized separately

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Enhance session management and continuity in hooks

- Updated new-hook.ts to clarify session_id threading and idempotent session creation.
- Modified prompts.ts to require claudeSessionId for continuation prompts, ensuring session context is maintained.
- Improved SessionStore.ts documentation on createSDKSession to emphasize idempotent behavior and session connection.
- Refined SDKAgent.ts to detail continuation prompt logic and its reliance on session.claudeSessionId for unified session handling.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Alex Newman <thedotmack@gmail.com>

2025-11-11 16:15:07 -05:00

6.9 KiB

Raw Permalink Blame History

Context Engineering for AI Agents: Best Practices Cheat Sheet

Core Principle

Find the smallest possible set of high-signal tokens that maximize the likelihood of your desired outcome.

Context Engineering vs Prompt Engineering

Prompt Engineering: Writing and organizing LLM instructions for optimal outcomes (one-time task)

Context Engineering: Curating and maintaining the optimal set of tokens during inference across multiple turns (iterative process)

Context engineering manages:

System instructions
Tools
Model Context Protocol (MCP)
External data
Message history
Runtime data retrieval

The Problem: Context Rot

Key Insight: LLMs have an "attention budget" that gets depleted as context grows

Every token attends to every other token (n² relationships)
As context length increases, model accuracy decreases
Models have less training experience with longer sequences
Context must be treated as a finite resource with diminishing marginal returns

System Prompts: Find the "Right Altitude"

The Goldilocks Zone

Too Prescriptive ❌

Hardcoded if-else logic
Brittle and fragile
High maintenance complexity

Too Vague ❌

High-level guidance without concrete signals
Falsely assumes shared context
Lacks actionable direction

Just Right ✅

Specific enough to guide behavior effectively
Flexible enough to provide strong heuristics
Minimal set of information that fully outlines expected behavior

Best Practices

Use simple, direct language
Organize into distinct sections (<background_information>, <instructions>, ## Tool guidance, etc.)
Use XML tags or Markdown headers for structure
Start with minimal prompt, add based on failure modes
Note: Minimal ≠ short (provide sufficient information upfront)

Tools: Minimal and Clear

Design Principles

Self-contained: Each tool has a single, clear purpose
Robust to error: Handle edge cases gracefully
Extremely clear: Intended use is unambiguous
Token-efficient: Returns relevant information without bloat
Descriptive parameters: Unambiguous input names (e.g., user_id not user)

Critical Rule

If a human engineer can't definitively say which tool to use in a given situation, an AI agent can't be expected to do better.

Common Failure Modes to Avoid

Bloated tool sets covering too much functionality
Tools with overlapping purposes
Ambiguous decision points about which tool to use

Examples: Diverse, Not Exhaustive

Do ✅

Curate a set of diverse, canonical examples
Show expected behavior effectively
Think "pictures worth a thousand words"

Don't ❌

Stuff in a laundry list of edge cases
Try to articulate every possible rule
Overwhelm with exhaustive scenarios

Context Retrieval Strategies

Just-In-Time Context (Recommended for Agents)

Approach: Maintain lightweight identifiers (file paths, queries, links) and dynamically load data at runtime

Benefits:

Avoids context pollution
Enables progressive disclosure
Mirrors human cognition (we don't memorize everything)
Leverages metadata (file names, folder structure, timestamps)
Agents discover context incrementally

Trade-offs:

Slower than pre-computed retrieval
Requires proper tool guidance to avoid dead-ends

Pre-Inference Retrieval (Traditional RAG)

Approach: Use embedding-based retrieval to surface context before inference

When to Use: Static content that won't change during interaction

Hybrid Strategy (Best of Both)

Approach: Retrieve some data upfront, enable autonomous exploration as needed

Example: Claude Code loads CLAUDE.md files upfront, uses glob/grep for just-in-time retrieval

Rule of Thumb: "Do the simplest thing that works"

Long-Horizon Tasks: Three Techniques

1. Compaction

What: Summarize conversation nearing context limit, reinitiate with summary

Implementation:

Pass message history to model for compression
Preserve critical details (architectural decisions, bugs, implementation)
Discard redundant outputs
Continue with compressed context + recently accessed files

Tuning Process:

First: Maximize recall (capture all relevant information)
Then: Improve precision (eliminate superfluous content)

Low-Hanging Fruit: Clear old tool calls and results

Best For: Tasks requiring extensive back-and-forth

2. Structured Note-Taking (Agentic Memory)

What: Agent writes notes persisted outside context window, retrieved later

Examples:

To-do lists
NOTES.md files
Game state tracking (Pokémon example: tracking 1,234 steps of training)
Project progress logs

Benefits:

Persistent memory with minimal overhead
Maintains critical context across tool calls
Enables multi-hour coherent strategies

Best For: Iterative development with clear milestones

3. Sub-Agent Architectures

What: Specialized sub-agents handle focused tasks with clean context windows

How It Works:

Main agent coordinates high-level plan
Sub-agents perform deep technical work
Sub-agents explore extensively (tens of thousands of tokens)
Return condensed summaries (1,000-2,000 tokens)

Benefits:

Clear separation of concerns
Parallel exploration
Detailed context remains isolated

Best For: Complex research and analysis tasks

Quick Decision Framework

Scenario	Recommended Approach
Static content	Pre-inference retrieval or hybrid
Dynamic exploration needed	Just-in-time context
Extended back-and-forth	Compaction
Iterative development	Structured note-taking
Complex research	Sub-agent architectures
Rapid model improvement	"Do the simplest thing that works"

Key Takeaways

Context is finite: Treat it as a precious resource with an attention budget
Think holistically: Consider the entire state available to the LLM
Stay minimal: More context isn't always better
Be iterative: Context curation happens each time you pass to the model
Design for autonomy: As models improve, let them act intelligently
Start simple: Test with minimal setup, add based on failure modes

Anti-Patterns to Avoid

❌ Cramming everything into prompts
❌ Creating brittle if-else logic
❌ Building bloated tool sets
❌ Stuffing exhaustive edge cases as examples
❌ Assuming larger context windows solve everything
❌ Ignoring context pollution over long interactions

Remember

"Even as models continue to improve, the challenge of maintaining coherence across extended interactions will remain central to building more effective agents."

Context engineering will evolve, but the core principle stays the same: optimize signal-to-noise ratio in your token budget.

Based on Anthropic's "Effective context engineering for AI agents" (September 2025)

6.9 KiB Raw Permalink Blame History

Context Engineering for AI Agents: Best Practices Cheat Sheet

Core Principle

Context Engineering vs Prompt Engineering

The Problem: Context Rot

System Prompts: Find the "Right Altitude"

The Goldilocks Zone

Best Practices

Tools: Minimal and Clear

Design Principles

Critical Rule

Common Failure Modes to Avoid

Examples: Diverse, Not Exhaustive

Context Retrieval Strategies

Just-In-Time Context (Recommended for Agents)

Pre-Inference Retrieval (Traditional RAG)

Hybrid Strategy (Best of Both)

Long-Horizon Tasks: Three Techniques

1. Compaction

2. Structured Note-Taking (Agentic Memory)

3. Sub-Agent Architectures

Quick Decision Framework

Key Takeaways

Anti-Patterns to Avoid

Remember

6.9 KiB

Raw Permalink Blame History