diff --git a/docs/public/architecture/search-architecture.mdx b/docs/public/architecture/search-architecture.mdx
index a61f97de..b3314e99 100644
--- a/docs/public/architecture/search-architecture.mdx
+++ b/docs/public/architecture/search-architecture.mdx
@@ -133,7 +133,7 @@ Invoke this skill when users ask about:
 ...
 ```
 
-**Token Savings**: ~2,250 tokens per session start (90% reduction)
+**Token Efficiency**: Minimal frontmatter at session start with progressive disclosure
 
 ## HTTP API Endpoints
 
@@ -341,14 +341,14 @@ All user-provided search queries are properly escaped to prevent SQL injection.
 ### 1. Token Efficiency
 
 **Before (MCP)**:
-- Session start: ~2,500 tokens for tool definitions
+- Session start: All tool definitions loaded upfront
 - Every session pays this cost
 - No progressive disclosure
 
 **After (Skill)**:
-- Session start: ~250 tokens for skill frontmatter
-- Full instructions: ~2,500 tokens (only when invoked)
-- Net savings: ~2,250 tokens per session (~90% reduction)
+- Session start: Minimal token cost for skill frontmatter
+- Full instructions loaded only when invoked (progressive disclosure)
+- More efficient than loading all tool definitions upfront
 
 ### 2. Natural Language Interface
 
diff --git a/docs/public/docs.json b/docs/public/docs.json
index d5df1646..aaff8c88 100644
--- a/docs/public/docs.json
+++ b/docs/public/docs.json
@@ -40,7 +40,8 @@
           "usage/claude-desktop",
           "usage/private-tags",
           "usage/export-import",
-          "beta-features"
+          "beta-features",
+          "endless-mode"
         ]
       },
       {
diff --git a/docs/public/endless-mode.mdx b/docs/public/endless-mode.mdx
new file mode 100644
index 00000000..e0ccf6ed
--- /dev/null
+++ b/docs/public/endless-mode.mdx
@@ -0,0 +1,111 @@
+---
+title: "Endless Mode (Beta)"
+description: "Experimental biomimetic memory architecture for extended sessions"
+---
+
+# Current State of Endless Mode
+
+## Core Concept
+
+Endless Mode is a **biomimetic memory architecture** designed to address Claude's context window exhaustion problem. Instead of keeping full tool outputs in the context window (O(N²) complexity), it:
+
+- Captures compressed observations after each tool use
+- Replaces transcripts with low-token summaries
+- Targets O(N) linear complexity
+- Maintains two-tier memory: working memory (compressed) + archive memory (full transcript on disk, maintained by Claude Code's default functionality)
+
+## Implementation Status
+
+**Status**: FUNCTIONAL BUT EXPERIMENTAL
+
+**Current Branch**: `beta/endless-mode` (ahead of main)
+
+**Recent Activity**:
+- Merged main branch changes
+- Resolved merge conflicts in save-hook, SessionStore, SessionRoutes
+- Updated documentation to remove misleading token reduction claims
+- Added important caveats about beta status
+
+## Key Architecture Components
+
+1. **Pre-Tool-Use Hook** - Tracks tool execution start, sends `tool_use_id` to worker
+2. **Save Hook (PostToolUse)** - **CRITICAL**: Blocks until observation is generated (110s timeout), injects compressed observation back into context
+3. **SessionManager.waitForNextObservation()** - Event-driven wait mechanism (no polling)
+4. **SDKAgent** - Generates observations via Agent SDK, emits completion events
+5. **Database** - Added `tool_use_id` column for observation correlation
+
+## Configuration
+
+```json
+{
+  "CLAUDE_MEM_ENDLESS_MODE": "false",
+  "CLAUDE_MEM_ENDLESS_WAIT_TIMEOUT_MS": "90000"
+}
+```
+
+**Defaults**: Endless Mode is disabled and the wait timeout is 90 seconds (90000 ms). **Enable via**: Manual checkout of the beta branch (see instructions below)
+
+## Flow
+
+```
+Tool Executes → Pre-Hook (track ID) → Tool Completes →
+Save-Hook (BLOCKS) → Worker processes → SDK generates observation →
+Event fired → Hook receives observation → Injects markdown →
+Clears input → Context reduced
+```
+
+## Known Limitations
+
+From the documentation:
+- ⚠️ **Slower than standard mode** - Blocking adds latency
+- ⚠️ **Still in development** - May have bugs
+- ⚠️ **Not battle-tested** - New architecture
+- ⚠️ **Theoretical projections** - Efficiency gains not yet validated in production
+
+## What's Working
+
+- ✅ Synchronous observation injection
+- ✅ Event-driven wait mechanism
+- ✅ Token reduction via input clearing
+- ✅ Database schema with `tool_use_id`
+- ✅ Web UI for version switching
+- ✅ Graceful timeout fallbacks
+
+## What's Not Ready
+
+- ❌ Production validation of token savings
+- ❌ Comprehensive test coverage
+- ❌ Stable channel release
+- ❌ Performance benchmarks
+- ❌ Long-running session data
+
+## How to Try Endless Mode
+
+Endless Mode is currently only available on the beta branch. To try it:
+
+```bash
+# Navigate to your claude-mem installation
+cd ~/.claude/plugins/marketplaces/thedotmack/
+
+# Checkout the beta branch
+git checkout beta/endless-mode
+
+# Install dependencies
+npm install
+
+# Restart the worker
+npm run worker:restart
+```
+
+**To return to stable:**
+
+```bash
+cd ~/.claude/plugins/marketplaces/thedotmack/
+git checkout main
+npm install
+npm run worker:restart
+```
+
+## Summary
+
+The implementation is architecturally complete and functional, but remains experimental pending production validation of the theoretical efficiency gains.
diff --git a/plugin/skills/mem-search.zip b/plugin/skills/mem-search.zip
index 999ff782..bf681164 100644
Binary files a/plugin/skills/mem-search.zip and b/plugin/skills/mem-search.zip differ