diff --git a/docs/public/architecture/search-architecture.mdx b/docs/public/architecture/search-architecture.mdx
index a61f97de..b3314e99 100644
--- a/docs/public/architecture/search-architecture.mdx
+++ b/docs/public/architecture/search-architecture.mdx
@@ -133,7 +133,7 @@ Invoke this skill when users ask about:
 ...
 ```
 
-**Token Savings**: ~2,250 tokens per session start (90% reduction)
+**Token Efficiency**: Minimal frontmatter at session start with progressive disclosure
 
 ## HTTP API Endpoints
 
@@ -341,14 +341,14 @@ All user-provided search queries are properly escaped to prevent SQL injection.
 ### 1. Token Efficiency
 
 **Before (MCP)**:
-- Session start: ~2,500 tokens for tool definitions
+- Session start: All tool definitions loaded upfront
 - Every session pays this cost
 - No progressive disclosure
 
 **After (Skill)**:
-- Session start: ~250 tokens for skill frontmatter
-- Full instructions: ~2,500 tokens (only when invoked)
-- Net savings: ~2,250 tokens per session (~90% reduction)
+- Session start: Minimal token cost for skill frontmatter
+- Full instructions loaded only when invoked (progressive disclosure)
+- More efficient than loading all tool definitions upfront
 
 ### 2. Natural Language Interface
 
diff --git a/docs/public/docs.json b/docs/public/docs.json
index d5df1646..aaff8c88 100644
--- a/docs/public/docs.json
+++ b/docs/public/docs.json
@@ -40,7 +40,8 @@
           "usage/claude-desktop",
           "usage/private-tags",
           "usage/export-import",
-          "beta-features"
+          "beta-features",
+          "endless-mode"
         ]
       },
       {
diff --git a/docs/public/endless-mode.mdx b/docs/public/endless-mode.mdx
new file mode 100644
index 00000000..e0ccf6ed
--- /dev/null
+++ b/docs/public/endless-mode.mdx
@@ -0,0 +1,111 @@
+---
+title: "Endless Mode (Beta)"
+description: "Experimental biomimetic memory architecture for extended sessions"
+---
+
+# Current State of Endless Mode
+
+## Core Concept
+
+Endless Mode is a **biomimetic memory architecture** that solves Claude's context window exhaustion problem. Instead of keeping full tool outputs in the context window (O(N²) complexity), it:
+
+- Captures compressed observations after each tool use
+- Replaces transcripts with low token summaries
+- Achieves O(N) linear complexity
+- Maintains two-tier memory: working memory (compressed) + archive memory (full transcript on disk, maintained by default claude code functionality)
+
+## Implementation Status
+
+**Status**: FUNCTIONAL BUT EXPERIMENTAL
+
+**Current Branch**: `beta/endless-mode` (ahead of main)
+
+**Recent Activity**:
+- Merged main branch changes
+- Resolved merge conflicts in save-hook, SessionStore, SessionRoutes
+- Updated documentation to remove misleading token reduction claims
+- Added important caveats about beta status
+
+## Key Architecture Components
+
+1. **Pre-Tool-Use Hook** - Tracks tool execution start, sends tool_use_id to worker
+2. **Save Hook (PostToolUse)** - **CRITICAL**: Blocks until observation is generated (110s timeout), injects compressed observation back into context
+3. **SessionManager.waitForNextObservation()** - Event-driven wait mechanism (no polling)
+4. **SDKAgent** - Generates observations via Agent SDK, emits completion events
+5. **Database** - Added `tool_use_id` column for observation correlation
+
+## Configuration
+
+```json
+{
+  "CLAUDE_MEM_ENDLESS_MODE": "false", // Default: disabled
+  "CLAUDE_MEM_ENDLESS_WAIT_TIMEOUT_MS": "90000" // 90 second timeout
+}
+```
+
+**Enable via**: Manual checkout of beta branch (see instructions below)
+
+## Flow
+
+```
+Tool Executes → Pre-Hook (track ID) → Tool Completes →
+Save-Hook (BLOCKS) → Worker processes → SDK generates observation →
+Event fired → Hook receives observation → Injects markdown →
+Clears input → Context reduced
+```
+
+## Known Limitations
+
+From the documentation:
+- ⚠️ **Slower than standard mode** - Blocking adds latency
+- ⚠️ **Still in development** - May have bugs
+- ⚠️ **Not battle-tested** - New architecture
+- ⚠️ **Theoretical projections** - Efficiency gains not yet validated in production
+
+## What's Working
+
+- ✅ Synchronous observation injection
+- ✅ Event-driven wait mechanism
+- ✅ Token reduction via input clearing
+- ✅ Database schema with tool_use_id
+- ✅ Web UI for version switching
+- ✅ Graceful timeout fallbacks
+
+## What's Not Ready
+
+- ❌ Production validation of token savings
+- ❌ Comprehensive test coverage
+- ❌ Stable channel release
+- ❌ Performance benchmarks
+- ❌ Long-running session data
+
+## How to Try Endless Mode
+
+Endless Mode is currently only available on the beta branch. To try it:
+
+```bash
+# Navigate to your claude-mem installation
+cd ~/.claude/plugins/marketplaces/thedotmack/
+
+# Checkout the beta branch
+git checkout beta/endless-mode
+
+# Install dependencies
+npm install
+
+# Restart the worker
+npm run worker:restart
+```
+
+**To return to stable:**
+
+```bash
+cd ~/.claude/plugins/marketplaces/thedotmack/
+git checkout main
+npm install
+npm run worker:restart
+```
+
+## Summary
+
+The implementation is architecturally complete and functional, but remains experimental pending production validation of the theoretical efficiency gains.
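
The event-driven wait listed under Key Architecture Components (`SessionManager.waitForNextObservation()`) is described in `endless-mode.mdx` but its code is not part of this diff. The TypeScript below is only a sketch of the general idea, not the project's implementation: `ObservationBus`, `Observation`, and `waitForObservation` are hypothetical names introduced for illustration. It registers a listener, resolves when an observation with the matching tool-use ID arrives, and resolves to `null` on timeout rather than polling.

```typescript
// Hypothetical observation shape; the real schema is not shown in the diff.
interface Observation {
  toolUseId: string;
  markdown: string;
}

type Listener = (obs: Observation) => void;

// Minimal in-process event bus (an assumption standing in for whatever
// event mechanism the worker actually uses).
class ObservationBus {
  private listeners = new Set<Listener>();

  on(listener: Listener): void {
    this.listeners.add(listener);
  }

  off(listener: Listener): void {
    this.listeners.delete(listener);
  }

  emit(obs: Observation): void {
    this.listeners.forEach((listener) => listener(obs));
  }

  get listenerCount(): number {
    return this.listeners.size;
  }
}

// Waits for the observation correlated with `toolUseId`.
// Resolves to null if nothing arrives within `timeoutMs` (timeout fallback).
function waitForObservation(
  bus: ObservationBus,
  toolUseId: string,
  timeoutMs: number,
): Promise<Observation | null> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      bus.off(onObservation);
      resolve(null);
    }, timeoutMs);

    function onObservation(obs: Observation): void {
      if (obs.toolUseId !== toolUseId) return; // event for a different tool call
      clearTimeout(timer);
      bus.off(onObservation);
      resolve(obs);
    }

    bus.on(onObservation);
  });
}
```

Under these assumptions, a blocking hook could `await waitForObservation(bus, id, timeoutMs)` and treat a `null` result as the timeout path; the listener is removed in both outcomes so nothing is left registered after the wait ends.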
diff --git a/plugin/skills/mem-search.zip b/plugin/skills/mem-search.zip
index 999ff782..bf681164 100644
Binary files a/plugin/skills/mem-search.zip and b/plugin/skills/mem-search.zip differ
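
The Configuration section added in `endless-mode.mdx` stores both settings as strings (`"false"`, `"90000"`). How the project converts them is not shown in this diff, so the following is a hedged sketch only: `parseEndlessModeSettings` and `EndlessModeSettings` are hypothetical names, and the rules that only the exact string `"true"` enables the mode and that invalid timeouts fall back to 90000 ms are assumptions.

```typescript
// Hypothetical parsed form of the two string settings shown in the docs.
interface EndlessModeSettings {
  enabled: boolean;
  waitTimeoutMs: number;
}

// Default taken from the value shown in the Configuration section.
const DEFAULT_WAIT_TIMEOUT_MS = 90000;

function parseEndlessModeSettings(
  raw: Record<string, string | undefined>,
): EndlessModeSettings {
  // Assumption: only the exact string "true" turns the mode on.
  const enabled = raw["CLAUDE_MEM_ENDLESS_MODE"] === "true";

  // Assumption: missing, non-numeric, or non-positive values use the default.
  const parsed = Number(raw["CLAUDE_MEM_ENDLESS_WAIT_TIMEOUT_MS"]);
  const waitTimeoutMs =
    Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_WAIT_TIMEOUT_MS;

  return { enabled, waitTimeoutMs };
}
```

Parsing once into typed values keeps string comparisons such as `=== "true"` out of the rest of the code and gives a single place to define what happens when a setting is missing or malformed.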