Compare commits

8 Commits

Author SHA1 Message Date
Alex Newman 0524fa83cd chore: bump version to 10.6.2
Publish to npm / publish (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 14:14:09 -07:00
Alex Newman 4d7bec4d05 fix: stop spinner from spinning forever (#1440)
* fix: stop spinner from spinning forever due to orphaned DB messages

The activity spinner never stopped because isAnySessionProcessing() queried
ALL pending/processing messages in the database, including orphaned messages
from dead sessions that no generator would ever process.

Root cause: isAnySessionProcessing() used hasAnyPendingWork() which is a
global DB scan. Changed it to use getTotalQueueDepth() which only checks
sessions in the active in-memory Map.
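
The difference between the two checks can be illustrated with a minimal TypeScript sketch. Only the names isAnySessionProcessing, hasAnyPendingWork, and getTotalQueueDepth come from this commit message; the data shapes, signatures, and sample values are assumptions made for illustration and may not match the project's actual code:

```typescript
// Hypothetical shapes for illustration only; the real claude-mem types may differ.
interface PendingMessage {
  sessionId: string;
  status: "pending" | "processing" | "done";
}

interface ActiveSession {
  queue: PendingMessage[];
}

// Global scan: also counts orphaned rows that belong to dead sessions.
function hasAnyPendingWork(db: PendingMessage[]): boolean {
  return db.some((m) => m.status === "pending" || m.status === "processing");
}

// Scoped check: only sessions held in the in-memory Map contribute.
function getTotalQueueDepth(sessions: Map<string, ActiveSession>): number {
  let depth = 0;
  sessions.forEach((session) => {
    depth += session.queue.length;
  });
  return depth;
}

function isAnySessionProcessing(sessions: Map<string, ActiveSession>): boolean {
  return getTotalQueueDepth(sessions) > 0;
}

// One orphaned message in the database, and no active sessions in memory.
const db: PendingMessage[] = [{ sessionId: "dead-session", status: "pending" }];
const activeSessions = new Map<string, ActiveSession>();

const globalScanBusy = hasAnyPendingWork(db);
const scopedBusy = isAnySessionProcessing(activeSessions);
```

In this sketch the global scan still reports work because of the orphaned row, while the scoped check reports none, which is the behavior that lets the spinner stop.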

Additional fixes:
- Add terminateSession() to enforce restart-or-terminate invariant
- Fix 3 zombie paths in .finally() handler that left sessions alive
- Clean up idle sessions from memory on successful completion
- Remove redundant bare isProcessing:true broadcast
- Replace inline require() with proper accessor
- Add 8 regression tests for session termination invariant

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review findings — idle-timeout race, double broadcast, query amplification

- Move pendingCount check before idle-timeout termination to prevent
  abandoning fresh messages that arrive between idle abort and .finally()
- Move broadcastProcessingStatus() inside restart branch only — the else
  branch already broadcasts via removeSessionImmediate callback
- Compute queueDepth once in broadcastProcessingStatus() and derive
  isProcessing from it, eliminating redundant double iteration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 14:13:10 -07:00
Alex Newman 9f529a30f5 feat: strip <system_instruction> tags before DB storage (#1398)
* feat: strip <system_instruction> tags before database storage

Extends the existing tag-stripping mechanism (used for <private> and
<claude-mem-context>) to also filter Conductor-injected system instructions,
preventing them from being persisted in the observation database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: also strip <system-instruction> (hyphen variant) before DB storage

Conductor uses both <system_instruction> and <system-instruction> tag
formats. This adds the hyphen variant to the same stripping mechanism.
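
A minimal sketch of this kind of tag stripping, assuming a regex-based approach. The helper name stripTags and the constant STRIPPED_TAGS are hypothetical; only the tag names come from the commit messages, and the actual mechanism in claude-mem may work differently:

```typescript
// Hypothetical helper; only the tag names are taken from the commit messages.
const STRIPPED_TAGS = [
  "private",
  "claude-mem-context",
  "system_instruction",
  "system-instruction",
];

function stripTags(text: string): string {
  let result = text;
  for (const tag of STRIPPED_TAGS) {
    // [\s\S]*? matches across newlines, non-greedily, up to the nearest closing tag.
    const pattern = new RegExp(`<${tag}>[\\s\\S]*?</${tag}>`, "g");
    result = result.replace(pattern, "");
  }
  return result;
}

const stored = stripTags(
  "A<system_instruction>x</system_instruction>B<system-instruction>y\nz</system-instruction>C"
);
```

Listing both spellings explicitly keeps the match exact instead of relying on a looser pattern that might strip unrelated tags.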

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 12:08:25 -07:00
Alex Newman b34aff1aa2 docs: update CHANGELOG.md for v10.6.1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:37:01 -07:00
Alex Newman d54e574251 chore: bump version to 10.6.1
Publish to npm / publish (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:36:23 -07:00
Alex Newman c7abb01dfc feat(timeline-report): detect git worktree and use parent project as data source
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:31:49 -07:00
Alex Newman 7e07210635 feat: add timeline-report skill with token economics, compress context output 53%
## Summary
- New timeline-report skill for generating narrative project history reports
- Compressed markdown context output ~53% (tables → flat compact lines, verbose labels → terse format)
- Added `full=true` param to /api/context/inject for fetching all observations
- Split TimelineRenderer into separate markdown/color rendering paths
- Removed arbitrary file write vulnerability (dump_to_file param)
- Fixed timestamp ditto marker leaking across session summary boundaries

## Review
- Rebased on main (v10.6.0) to preserve OpenClaw system prompt injection
- Reviewed by /review (gstack) + /octo:review (Codex, Gemini, Claude fleet)
- Security fix (dump_to_file removal) confirmed by all 3 reviewers
- Timestamp bug caught by Codex, fixed

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-03-18 13:57:20 -07:00
Alex Newman 648c84804c docs: update CHANGELOG.md for v10.6.0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:17:08 -07:00
20 changed files with 813 additions and 375 deletions
+1 -1
@@ -10,7 +10,7 @@
"plugins": [
{
"name": "claude-mem",
-"version": "10.6.0",
+"version": "10.6.2",
"source": "./plugin",
"description": "Persistent memory system for Claude Code - context compression across sessions"
}
+34 -50
@@ -2,6 +2,40 @@
All notable changes to claude-mem.
## [v10.6.1] - 2026-03-18
### New Features
- **Timeline Report Skill** — New `/timeline-report` skill generates narrative "Journey Into [Project]" reports from claude-mem's development history with token-aware economics
- **Git Worktree Detection** — Timeline report automatically detects git worktrees and uses parent project as data source
- **Compressed Context Output** — Markdown context injection compressed ~53% (tables → compact flat lines), reducing token overhead in session starts
- **Full Observation Fetch** — Added `full=true` parameter to `/api/context/inject` for fetching all observations
### Improvements
- Split `TimelineRenderer` into separate markdown/color rendering paths
- Fixed timestamp ditto marker leaking across session summary boundaries
### Security
- Removed arbitrary file write vulnerability (`dump_to_file` parameter)
## [v10.6.0] - 2026-03-18
## OpenClaw: System prompt context injection
The OpenClaw plugin no longer writes to `MEMORY.md`. Instead, it injects the observation timeline into each agent's system prompt via the `before_prompt_build` hook using `appendSystemContext`. This keeps `MEMORY.md` under the agent's control for curated long-term memory. Context is cached for 60 seconds per project.
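A 60-second per-project cache of this kind could look roughly like the sketch below. The class name, method names, and injectable clock are hypothetical and only illustrate the idea; they are not the plugin's actual implementation:

```typescript
// Hypothetical per-project TTL cache; names and structure are assumptions.
const CACHE_TTL_MS = 60_000;

interface CacheEntry {
  value: string;
  expiresAt: number;
}

class ProjectContextCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  get(project: string): string | undefined {
    const entry = this.entries.get(project);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry so the caller rebuilds the context.
      this.entries.delete(project);
      return undefined;
    }
    return entry.value;
  }

  set(project: string, value: string): void {
    this.entries.set(project, { value, expiresAt: this.now() + CACHE_TTL_MS });
  }
}
```

Keying the cache by project name means one agent's refresh does not invalidate the cached context of another project.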
## New `syncMemoryFileExclude` config
Exclude specific agent IDs from automatic context injection (e.g., `["snarf", "debugger"]`). Observations are still recorded for excluded agents — only the context injection is skipped.
## Fix: UI settings now preserve falsy values
The viewer settings hook used `||` instead of `??`, which silently replaced backend values like `'0'`, `'false'`, and `''` with UI defaults. Fixed with nullish coalescing. Frontend defaults now aligned with backend `SettingsDefaultsManager`.
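The behavioral difference between the two operators can be sketched as follows. The function names and the SettingValue type are hypothetical and exist only for illustration:

```typescript
// Hypothetical setting type and helpers, for illustration only.
type SettingValue = string | number | boolean;

// `||` falls back whenever the left side is falsy: '', 0, false, null, undefined.
function resolveWithOr(backend: SettingValue | undefined, uiDefault: SettingValue): SettingValue {
  return backend || uiDefault;
}

// `??` falls back only when the left side is null or undefined.
function resolveWithNullish(backend: SettingValue | undefined, uiDefault: SettingValue): SettingValue {
  return backend ?? uiDefault;
}

const emptyWithOr = resolveWithOr("", "ui-default");
const emptyWithNullish = resolveWithNullish("", "ui-default");
const zeroWithOr = resolveWithOr(0, 10);
const zeroWithNullish = resolveWithNullish(0, 10);
```

Here `emptyWithOr` and `zeroWithOr` take the UI default, while the `??` versions keep the backend's `''` and `0`. Note that in JavaScript the non-empty strings `'0'` and `'false'` are themselves truthy, so `||` replaces such values only once they have been converted to a number or boolean.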
## Documentation
- Updated `openclaw-integration.mdx` and `openclaw/SKILL.md` to reflect system prompt injection behavior
- Fixed "prompt injection" → "context injection" terminology to avoid confusion with the OWASP security term
## [v10.5.6] - 2026-03-16
## Patch: Process Supervisor Hardening & Logging Cleanup
@@ -1084,53 +1118,3 @@ Fixed an issue where the worker service startup wasn't producing proper JSON sta
- Removed obsolete error handling baseline file
## [v9.0.2] - 2026-01-10
## Bug Fixes
- **Windows Terminal Tab Accumulation (#625, #628)**: Fixed terminal tab accumulation on Windows by implementing graceful exit strategy. All expected failure scenarios (port conflicts, version mismatches, health check timeouts) now exit with code 0 instead of code 1.
- **Windows 11 Compatibility (#625)**: Replaced deprecated WMIC commands with PowerShell `Get-Process` and `Get-CimInstance` for process enumeration. WMIC is being removed from Windows 11.
## Maintenance
- **Removed Obsolete CLAUDE.md Files**: Cleaned up auto-generated CLAUDE.md files from `~/.claude/plans/` and `~/.claude/plugins/marketplaces/` directories.
---
**Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v9.0.1...v9.0.2
## [v9.0.1] - 2026-01-08
## Bug Fixes
### Claude Code 2.1.1 Compatibility
- Fixed hook architecture for compatibility with Claude Code 2.1.0/2.1.1
- Context is now injected silently via SessionStart hook
- Removed deprecated `user-message-hook` (no longer used in CC 2.1.0+)
### Path Validation for CLAUDE.md Distribution
- Added `isValidPathForClaudeMd()` to reject malformed paths:
- Tilde paths (`~`) that Node.js doesn't expand
- URLs (`http://`, `https://`)
- Paths with spaces (likely command text or PR references)
- Paths with `#` (GitHub issue/PR references)
- Relative paths that escape project boundary
- Cleaned up 12 invalid CLAUDE.md files created by bug artifacts
- Updated `.gitignore` to prevent future accidents
### Log-Level Audit
- Promoted 38+ WARN messages to ERROR level for improved debugging:
- Parser: observation type errors, data contamination
- SDK/Agents: empty init responses (Gemini, OpenRouter)
- Worker/Queue: session recovery, auto-recovery failures
- Chroma: sync failures, search failures
- SQLite: search failures
- Session/Generator: failures, missing context
- Infrastructure: shutdown, process management failures
## Internal Changes
- Removed hardcoded fake token counts from context injection
- Standardized Claude Code 2.1.0 note wording across documentation
**Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v9.0.0...v9.0.1
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "10.6.0",
+"version": "10.6.2",
"description": "Memory compression system for Claude Code - persist context across sessions",
"keywords": [
"claude",
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "10.6.0",
+"version": "10.6.2",
"description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
"author": {
"name": "Alex Newman"
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem-plugin",
-"version": "10.6.0",
+"version": "10.6.2",
"private": true,
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",
File diff suppressed because one or more lines are too long
+1 -1
@@ -114,7 +114,7 @@ Set the \`cycles\` parameter to \`"ref"\` to resolve cyclical schemas with defs.
${c}`}var xP=new Set([".js",".jsx",".ts",".tsx",".mjs",".cjs",".py",".pyw",".go",".rs",".rb",".java",".cs",".cpp",".c",".h",".hpp",".swift",".kt",".php",".vue",".svelte"]),kP=new Set(["node_modules",".git","dist","build",".next","__pycache__",".venv","venv","env",".env","target","vendor",".cache",".turbo","coverage",".nyc_output",".claude",".smart-file-read"]),SP=512*1024;async function*u$(t,e,r=20){if(r<=0)return;let n;try{n=await(0,zn.readdir)(t,{withFileTypes:!0})}catch{return}for(let o of n){if(o.name.startsWith(".")&&o.name!=="."||kP.has(o.name))continue;let i=(0,hi.join)(t,o.name);if(o.isDirectory())yield*u$(i,e,r-1);else if(o.isFile()){let a=o.name.slice(o.name.lastIndexOf("."));xP.has(a)&&(yield i)}}}async function wP(t){try{let e=await(0,zn.stat)(t);if(e.size>SP||e.size===0)return null;let r=await(0,zn.readFile)(t,"utf-8");return r.slice(0,1e3).includes("\0")?null:r}catch{return null}}async function l$(t,e,r={}){let n=r.maxResults||20,o=e.toLowerCase(),i=o.split(/[\s_\-./]+/).filter(h=>h.length>0),a=[];for await(let h of u$(t,t)){if(r.filePattern&&!(0,hi.relative)(t,h).toLowerCase().includes(r.filePattern.toLowerCase()))continue;let _=await wP(h);_&&a.push({absolutePath:h,relativePath:(0,hi.relative)(t,h),content:_})}let s=a$(a),c=[],u=[],l=0;for(let[h,_]of s){l+=zP(_);let E=js(h.toLowerCase(),i)>0,I=[],A=(j,Le)=>{for(let de of j){let Wt=0,Qe="",Kt=js(de.name.toLowerCase(),i);Kt>0&&(Wt+=Kt*3,Qe="name match"),de.signature.toLowerCase().includes(o)&&(Wt+=2,Qe=Qe?`${Qe} + signature`:"signature match"),de.jsdoc&&de.jsdoc.toLowerCase().includes(o)&&(Wt+=1,Qe=Qe?`${Qe} + jsdoc`:"jsdoc match"),Wt>0&&(E=!0,I.push({filePath:h,symbolName:Le?`${Le}.${de.name}`:de.name,kind:de.kind,signature:de.signature,jsdoc:de.jsdoc,lineStart:de.lineStart,lineEnd:de.lineEnd,matchReason:Qe})),de.children&&A(de.children,de.name)}};A(_.symbols),E&&(c.push(_),u.push(...I))}u.sort((h,_)=>{let b=js(h.symbolName.toLowerCase(),i);return js(_.symbolName.toLowerCase(),i)-b});let 
d=u.slice(0,n),p=new Set(d.map(h=>h.filePath)),f=c.filter(h=>p.has(h.filePath)).slice(0,n),g=f.reduce((h,_)=>h+_.foldedTokenEstimate,0);return{foldedFiles:f,matchingSymbols:d,totalFilesScanned:a.length,totalSymbolsFound:l,tokenEstimate:g}}function js(t,e){let r=0;for(let n of e)if(t===n)r+=10;else if(t.includes(n))r+=5;else{let o=0,i=0;for(let a of n){let s=t.indexOf(a,o);s!==-1&&(i++,o=s+1)}i===n.length&&(r+=1)}return r}function zP(t){let e=t.symbols.length;for(let r of t.symbols)r.children&&(e+=r.children.length);return e}function d$(t,e){let r=[];if(r.push(`\u{1F50D} Smart Search: "${e}"`),r.push(` Scanned ${t.totalFilesScanned} files, found ${t.totalSymbolsFound} symbols`),r.push(` ${t.matchingSymbols.length} matches across ${t.foldedFiles.length} files (~${t.tokenEstimate} tokens for folded view)`),r.push(""),t.matchingSymbols.length===0)return r.push(" No matching symbols found."),r.join(`
`);r.push("\u2500\u2500 Matching Symbols \u2500\u2500"),r.push("");for(let n of t.matchingSymbols){if(r.push(` ${n.kind} ${n.symbolName} (${n.filePath}:${n.lineStart+1})`),r.push(` ${n.signature}`),n.jsdoc){let o=n.jsdoc.split(`
`).find(i=>i.replace(/^[\s*/]+/,"").trim().length>0);o&&r.push(` \u{1F4AC} ${o.replace(/^[\s*/]+/,"").trim()}`)}r.push("")}r.push("\u2500\u2500 Folded File Views \u2500\u2500"),r.push("");for(let n of t.foldedFiles)r.push(wn(n)),r.push("");return r.push("\u2500\u2500 Actions \u2500\u2500"),r.push(" To see full implementation: use smart_unfold with file path and symbol name"),r.join(`
`)}var jf=require("node:fs/promises"),Ds=require("node:path"),IP="10.6.0";console.log=(...t)=>{ve.error("CONSOLE","Intercepted console output (MCP protocol protection)",void 0,{args:t})};var p$={search:"/api/search",timeline:"/api/timeline"};async function f$(t,e){ve.debug("SYSTEM","\u2192 Worker API",void 0,{endpoint:t,params:e});try{let r=new URLSearchParams;for(let[a,s]of Object.entries(e))s!=null&&r.append(a,String(s));let n=`${t}?${r}`,o=await Ps(n);if(!o.ok){let a=await o.text();throw new Error(`Worker API error (${o.status}): ${a}`)}let i=await o.json();return ve.debug("SYSTEM","\u2190 Worker API success",void 0,{endpoint:t}),i}catch(r){return ve.error("SYSTEM","\u2190 Worker API error",{endpoint:t},r),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function EP(t,e){ve.debug("HTTP","Worker API request (POST)",void 0,{endpoint:t});try{let r=await Ps(t,{method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify(e)});if(!r.ok){let o=await r.text();throw new Error(`Worker API error (${r.status}): ${o}`)}let n=await r.json();return ve.debug("HTTP","Worker API success (POST)",void 0,{endpoint:t}),{content:[{type:"text",text:JSON.stringify(n,null,2)}]}}catch(r){return ve.error("HTTP","Worker API error (POST)",{endpoint:t},r),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function TP(){try{return(await Ps("/api/health")).ok}catch(t){return ve.debug("SYSTEM","Worker health check failed",{},t),!1}}var m$=[{name:"__IMPORTANT",description:`3-LAYER WORKFLOW (ALWAYS FOLLOW):
`)}var jf=require("node:fs/promises"),Ds=require("node:path"),IP="10.6.2";console.log=(...t)=>{ve.error("CONSOLE","Intercepted console output (MCP protocol protection)",void 0,{args:t})};var p$={search:"/api/search",timeline:"/api/timeline"};async function f$(t,e){ve.debug("SYSTEM","\u2192 Worker API",void 0,{endpoint:t,params:e});try{let r=new URLSearchParams;for(let[a,s]of Object.entries(e))s!=null&&r.append(a,String(s));let n=`${t}?${r}`,o=await Ps(n);if(!o.ok){let a=await o.text();throw new Error(`Worker API error (${o.status}): ${a}`)}let i=await o.json();return ve.debug("SYSTEM","\u2190 Worker API success",void 0,{endpoint:t}),i}catch(r){return ve.error("SYSTEM","\u2190 Worker API error",{endpoint:t},r),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function EP(t,e){ve.debug("HTTP","Worker API request (POST)",void 0,{endpoint:t});try{let r=await Ps(t,{method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify(e)});if(!r.ok){let o=await r.text();throw new Error(`Worker API error (${r.status}): ${o}`)}let n=await r.json();return ve.debug("HTTP","Worker API success (POST)",void 0,{endpoint:t}),{content:[{type:"text",text:JSON.stringify(n,null,2)}]}}catch(r){return ve.error("HTTP","Worker API error (POST)",{endpoint:t},r),{content:[{type:"text",text:`Error calling Worker API: ${r instanceof Error?r.message:String(r)}`}],isError:!0}}}async function TP(){try{return(await Ps("/api/health")).ok}catch(t){return ve.debug("SYSTEM","Worker health check failed",{},t),!1}}var m$=[{name:"__IMPORTANT",description:`3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) \u2192 Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) \u2192 Get context around interesting results
3. get_observations([IDs]) \u2192 Fetch full details ONLY for filtered IDs
File diff suppressed because one or more lines are too long
+203
@@ -0,0 +1,203 @@
---
name: timeline-report
description: Generate a "Journey Into [Project]" narrative report analyzing a project's entire development history from claude-mem's timeline. Use when asked for a timeline report, project history analysis, development journey, or full project report.
---
# Timeline Report
Generate a comprehensive narrative analysis of a project's entire development history using claude-mem's persistent memory timeline.
## When to Use
Use when users ask for:
- "Write a timeline report"
- "Journey into [project]"
- "Analyze my project history"
- "Full project report"
- "Summarize the entire development history"
- "What's the story of this project?"
## Prerequisites
The claude-mem worker must be running on localhost:37777. The project must have claude-mem observations recorded.
## Workflow
### Step 1: Determine the Project Name
Ask the user which project to analyze if not obvious from context. The project name is typically the directory name of the project (e.g., "tokyo", "my-app"). If the user says "this project", use the current working directory's basename.
**Worktree Detection:** Before using the directory basename, check if the current directory is a git worktree. In a worktree, the data source is the **parent project**, not the worktree directory itself. Run:
```bash
git_dir=$(git rev-parse --git-dir 2>/dev/null)
git_common_dir=$(git rev-parse --git-common-dir 2>/dev/null)
if [ "$git_dir" != "$git_common_dir" ]; then
  # We're in a worktree; resolve the parent project name
  parent_project=$(basename "$(dirname "$git_common_dir")")
  echo "Worktree detected. Parent project: $parent_project"
else
  parent_project=$(basename "$PWD")
fi
echo "$parent_project"
```
If a worktree is detected, use `$parent_project` (the basename of the parent repo) as the project name for all API calls. Inform the user: "Detected git worktree. Using parent project '[name]' as the data source."
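The same decision can be expressed as a pure function, shown here as a hedged TypeScript sketch. The function name `resolveProjectName` and its return shape are hypothetical; `gitDir` and `gitCommonDir` stand for the outputs of the two `git rev-parse` commands above, and the example assumes POSIX-style absolute paths inside a worktree:

```typescript
import * as path from "node:path";

// Hypothetical helper mirroring the shell logic above.
// gitDir:       output of `git rev-parse --git-dir`
// gitCommonDir: output of `git rev-parse --git-common-dir`
// cwd:          the current working directory
function resolveProjectName(
  gitDir: string,
  gitCommonDir: string,
  cwd: string
): { project: string; isWorktree: boolean } {
  if (gitDir !== gitCommonDir) {
    // Worktree: the parent project is the directory that contains the common .git dir.
    const parentRepo = path.posix.dirname(gitCommonDir);
    return { project: path.posix.basename(parentRepo), isWorktree: true };
  }
  // Regular checkout: use the working directory's basename.
  return { project: path.posix.basename(cwd), isWorktree: false };
}

const fromWorktree = resolveProjectName(
  "/home/u/tokyo/.git/worktrees/feature-x",
  "/home/u/tokyo/.git",
  "/home/u/tokyo-feature-x"
);
const fromMainCheckout = resolveProjectName(".git", ".git", "/home/u/my-app");
```

Like the shell version, this compares the two values as plain strings, so it inherits the same assumption that git reports them in a comparable form.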
### Step 2: Fetch the Full Timeline
Use Bash to fetch the complete timeline from the claude-mem worker API:
```bash
curl -s "http://localhost:37777/api/context/inject?project=PROJECT_NAME&full=true"
```
This returns the entire compressed timeline -- every observation, session boundary, and summary across the project's full history. The response is pre-formatted markdown optimized for LLM consumption.
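When the project name may contain spaces or other special characters, building the URL programmatically avoids quoting mistakes. The following is a minimal sketch; the function name `buildTimelineUrl` is hypothetical, while the host, path, and query parameters are taken from the curl command above:

```typescript
// Hypothetical helper; host, path, and parameter names come from the curl command above.
function buildTimelineUrl(project: string, baseUrl: string = "http://localhost:37777"): string {
  const url = new URL("/api/context/inject", baseUrl);
  // searchParams handles percent-encoding of the project name.
  url.searchParams.set("project", project);
  url.searchParams.set("full", "true");
  return url.toString();
}

const simpleUrl = buildTimelineUrl("tokyo");
const encodedUrl = buildTimelineUrl("my app");
```

The resulting string can be passed to `curl -s` or to `fetch`; no network request is made by the helper itself.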
**Token estimates:** The full timeline size depends on the project's history:
- Small project (< 1,000 observations): ~20-50K tokens
- Medium project (1,000-10,000 observations): ~50-300K tokens
- Large project (10,000-35,000 observations): ~300-750K tokens
If the response is empty or returns an error, the worker may not be running or the project name may be wrong. Try `curl -s "http://localhost:37777/api/search?query=*&limit=1"` to verify the worker is healthy.
### Step 3: Estimate Token Count
Before proceeding, estimate the token count of the fetched timeline (roughly 1 token per 4 characters). Report this to the user:
```
Timeline fetched: ~X observations, estimated ~Yk tokens.
This analysis will consume approximately Yk input tokens + ~5-10k output tokens.
Proceed? (y/n)
```
Wait for user confirmation before continuing if the timeline exceeds 100K tokens.
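The estimate and the confirmation rule can be sketched in a few lines of TypeScript. The function and constant names are hypothetical; the 4-characters-per-token ratio and the 100K threshold come from this step:

```typescript
// Hypothetical helpers; the ratio and threshold are the ones stated in this step.
const CHARS_PER_TOKEN = 4;
const CONFIRMATION_THRESHOLD_TOKENS = 100_000;

// Rough estimate: about 1 token per 4 characters, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Confirmation is required only when the estimate exceeds the threshold.
function needsConfirmation(text: string): boolean {
  return estimateTokens(text) > CONFIRMATION_THRESHOLD_TOKENS;
}
```

This is a character-count heuristic, not a real tokenizer, so the reported figure should be presented to the user as approximate.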
### Step 4: Analyze with a Subagent
Deploy an Agent (using the Task tool) with the full timeline and the following analysis prompt. Pass the ENTIRE timeline as context to the agent. The agent should also be instructed to query the SQLite database at `~/.claude-mem/claude-mem.db` for the Token Economics section.
**Agent prompt:**
```
You are a technical historian analyzing a software project's complete development timeline from claude-mem's persistent memory system. The timeline below contains every observation, session boundary, and summary recorded across the project's entire history.
You also have access to the claude-mem SQLite database at ~/.claude-mem/claude-mem.db. Use it to run queries for the Token Economics & Memory ROI section. The database has an "observations" table with columns: id, memory_session_id, project, text, type, title, subtitle, facts, narrative, concepts, files_read, files_modified, prompt_number, discovery_tokens, created_at, created_at_epoch, source_tool, source_input_summary.
Write a comprehensive narrative report titled "Journey Into [PROJECT_NAME]" that covers:
## Required Sections
1. **Project Genesis** -- When and how the project started. What were the first commits, the initial vision, the founding technical decisions? What problem was being solved?
2. **Architectural Evolution** -- How did the architecture change over time? What were the major pivots? Why did they happen? Trace the evolution from initial design through each significant restructuring.
3. **Key Breakthroughs** -- Identify the "aha" moments: when a difficult problem was finally solved, when a new approach unlocked progress, when a prototype first worked. These are the observations where the tone shifts from investigation to resolution.
4. **Work Patterns** -- Analyze the rhythm of development. Identify debugging cycles (clusters of bug fixes), feature sprints (rapid observation sequences), refactoring phases (architectural changes without new features), and exploration phases (many discoveries without changes).
5. **Technical Debt** -- Track where shortcuts were taken and when they were paid back. Identify patterns of accumulation (rapid feature work) and resolution (dedicated refactoring sessions).
6. **Challenges and Debugging Sagas** -- The hardest problems encountered. Multi-session debugging efforts, architectural dead-ends that required backtracking, platform-specific issues that took days to resolve.
7. **Memory and Continuity** -- How did persistent memory (claude-mem itself, if applicable) affect the development process? Were there moments where recalled context from prior sessions saved significant time or prevented repeated mistakes?
8. **Token Economics & Memory ROI** -- Quantitative analysis of how memory recall saved work:
- Query the database directly for these metrics using `sqlite3 ~/.claude-mem/claude-mem.db`
- Count total discovery_tokens across all observations (the original cost of all work)
- Count sessions that had context injection available (sessions after the first)
- Calculate the compression ratio: average discovery_tokens vs average read tokens per observation (read tokens are not a stored column; estimate them as text length / 4)
- Identify the highest-value observations (highest discovery_tokens -- these are the most expensive decisions, bugs, and discoveries that memory prevents re-doing)
- Identify explicit recall events (observations where source_tool contains "search", "smart_search", "get_observations", "timeline", or where narrative mentions "recalled", "from memory", "previous session")
- Estimate passive recall savings: each session with context injection receives ~50 observations. Use a 30% relevance factor (conservative estimate that 30% of injected context prevents re-work). Savings = sessions_with_context × avg_discovery_value_of_50_obs_window × 0.30
- Estimate explicit recall savings: ~10K tokens per explicit recall query
- Calculate net ROI: total_savings / total_read_tokens_invested
- Present as a table with monthly breakdown
- Highlight the top 5 most expensive observations by discovery_tokens -- these represent the highest-value memories in the system (architecture decisions, hard bugs, implementation plans that cost 100K+ tokens to produce originally)
Use these SQL queries as a starting point:
~~~sql
-- Total discovery tokens
SELECT SUM(discovery_tokens) FROM observations WHERE project = 'PROJECT_NAME';
-- Total sessions (subtract 1 to get sessions with context available, since the first session has none)
SELECT COUNT(DISTINCT memory_session_id) FROM observations WHERE project = 'PROJECT_NAME';
-- Average tokens per observation
SELECT AVG(discovery_tokens) as avg_discovery, AVG(LENGTH(title || COALESCE(subtitle,'') || COALESCE(narrative,'') || COALESCE(facts,'')) / 4) as avg_read FROM observations WHERE project = 'PROJECT_NAME' AND discovery_tokens > 0;
-- Top 5 most expensive observations (highest-value memories)
SELECT id, title, discovery_tokens FROM observations WHERE project = 'PROJECT_NAME' ORDER BY discovery_tokens DESC LIMIT 5;
-- Monthly breakdown
SELECT strftime('%Y-%m', created_at) as month, COUNT(*) as obs, SUM(discovery_tokens) as total_discovery, COUNT(DISTINCT memory_session_id) as sessions FROM observations WHERE project = 'PROJECT_NAME' GROUP BY month ORDER BY month;
-- Explicit recall events
SELECT COUNT(*) FROM observations WHERE project = 'PROJECT_NAME' AND (source_tool LIKE '%search%' OR source_tool LIKE '%timeline%' OR source_tool LIKE '%get_observations%' OR narrative LIKE '%recalled%' OR narrative LIKE '%from memory%' OR narrative LIKE '%previous session%');
~~~
9. **Timeline Statistics** -- Quantitative summary:
- Date range (first observation to last)
- Total observations and sessions
- Breakdown by observation type (features, bug fixes, discoveries, decisions, changes)
- Most active days/weeks
- Longest debugging sessions
10. **Lessons and Meta-Observations** -- What patterns emerge from the full history? What would a new developer learn about this codebase from reading the timeline? What recurring themes or principles guided development?
## Writing Style
- Write as a technical narrative, not a list of bullet points
- Use specific observation IDs and timestamps when referencing events (e.g., "On Dec 14 (#26766), the root cause was finally identified...")
- Connect events across time -- show how early decisions created later consequences
- Be honest about struggles and dead ends, not just successes
- Target 3,000-6,000 words depending on project size
- Use markdown formatting with headers, emphasis, and code references where appropriate
## Important
- Analyze the ENTIRE timeline chronologically -- do not skip early history
- Look for narrative arcs: problem -> investigation -> solution
- Identify turning points where the project's direction fundamentally changed
- Note any observations about the development process itself (tooling, workflow, collaboration patterns)
Here is the complete project timeline:
[TIMELINE CONTENT GOES HERE]
```
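The savings arithmetic described in the Token Economics section of the agent prompt above can be sketched as follows. The interface and function names are hypothetical, and treating total savings as the sum of passive and explicit savings is an assumption; the 30% relevance factor and the ~10K tokens per explicit recall come from the prompt:

```typescript
// Hypothetical input shape; field names are illustrative.
interface RoiInputs {
  sessionsWithContext: number;            // sessions after the first
  avgDiscoveryTokensPerWindow: number;    // avg discovery value of a ~50-observation window
  explicitRecallEvents: number;           // count of explicit recall queries
  totalReadTokens: number;                // tokens invested in reading injected context
}

const RELEVANCE_FACTOR = 0.3;             // from the prompt: 30% of injected context prevents re-work
const TOKENS_PER_EXPLICIT_RECALL = 10_000; // from the prompt: ~10K tokens per explicit recall

function estimateRoi(inputs: RoiInputs) {
  const passiveSavings =
    inputs.sessionsWithContext * inputs.avgDiscoveryTokensPerWindow * RELEVANCE_FACTOR;
  const explicitSavings = inputs.explicitRecallEvents * TOKENS_PER_EXPLICIT_RECALL;
  // Assumption: total savings is the sum of the two estimates.
  const totalSavings = passiveSavings + explicitSavings;
  // Guard against division by zero when nothing has been read yet.
  const netRoi = inputs.totalReadTokens > 0 ? totalSavings / inputs.totalReadTokens : 0;
  return { passiveSavings, explicitSavings, totalSavings, netRoi };
}

// Made-up sample numbers, purely to show the arithmetic.
const sampleRoi = estimateRoi({
  sessionsWithContext: 100,
  avgDiscoveryTokensPerWindow: 200_000,
  explicitRecallEvents: 20,
  totalReadTokens: 1_000_000,
});
```

With these sample inputs the passive estimate dominates, which shows how sensitive the final ROI figure is to the chosen relevance factor.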
### Step 5: Save the Report
Save the agent's output as a markdown file. Default location:
```
./journey-into-PROJECT_NAME.md
```
Or if the user specified a different output path, use that instead.
### Step 6: Report Completion
Tell the user:
- Where the report was saved
- The approximate token cost (input timeline + output report)
- The date range covered
- Number of observations analyzed
## Error Handling
- **Empty timeline:** "No observations found for project 'X'. Check the project name with: `curl -s 'http://localhost:37777/api/search?query=*&limit=1'`"
- **Worker not running:** "The claude-mem worker is not responding on port 37777. Start it with your usual method or check `ps aux | grep worker-service`."
- **Timeline too large:** For projects with 50,000+ observations, the timeline may exceed context limits. The current endpoint (`curl -s "http://localhost:37777/api/context/inject?project=X&full=true"`) always returns all observations, so for extremely large projects, suggest that the user analyze the timeline in time-windowed segments.
## Example
User: "Write a journey report for the tokyo project"
1. Fetch: `curl -s "http://localhost:37777/api/context/inject?project=tokyo&full=true"`
2. Estimate: "Timeline fetched: ~34,722 observations, estimated ~718K tokens. Proceed?"
3. User confirms
4. Deploy analysis agent with full timeline
5. Save to `./journey-into-tokyo.md`
6. Report: "Report saved. Analyzed 34,722 observations spanning Oct 2025 - Mar 2026 (~718K input tokens, ~8K output tokens)."
+9 -1
@@ -134,6 +134,12 @@ export async function generateContext(
// Use provided projects array (for worktree support) or fall back to single project
const projects = input?.projects || [project];
// Full mode: fetch all observations but keep normal rendering (level 1 summaries)
if (input?.full) {
config.totalObservationCount = 999999;
config.sessionCount = 999999;
}
// Initialize database
const db = initializeDatabase();
if (!db) {
@@ -155,7 +161,7 @@ export async function generateContext(
}
// Build and return context
return buildContextOutput(
const output = buildContextOutput(
project,
observations,
summaries,
@@ -164,6 +170,8 @@ export async function generateContext(
input?.session_id,
useColors
);
return output;
} finally {
db.close();
}
@@ -1,7 +1,8 @@
/**
* MarkdownFormatter - Formats context output as markdown (non-colored mode)
* MarkdownFormatter - Formats context output as compact markdown for LLM injection
*
* Handles all markdown formatting for context injection.
* Optimized for token efficiency: flat lines instead of tables, no repeated headers.
* The colored terminal formatter (ColorFormatter.ts) handles human-readable display separately.
*/
import type {
@@ -34,7 +35,7 @@ function formatHeaderDateTime(): string {
*/
export function renderMarkdownHeader(project: string): string[] {
return [
`# [${project}] recent context, ${formatHeaderDateTime()}`,
`# $CMEM ${project} ${formatHeaderDateTime()}`,
''
];
}
@@ -44,39 +45,28 @@ export function renderMarkdownHeader(project: string): string[] {
*/
export function renderMarkdownLegend(): string[] {
const mode = ModeManager.getInstance().getActiveMode();
const typeLegendItems = mode.observation_types.map(t => `${t.emoji} ${t.id}`).join(' | ');
const typeLegendItems = mode.observation_types.map(t => `${t.emoji}${t.id}`).join(' ');
return [
`**Legend:** session-request | ${typeLegendItems}`,
`Legend: 🎯session ${typeLegendItems}`,
`Format: ID TIME TYPE TITLE`,
`Fetch details: get_observations([IDs]) | Search: mem-search skill`,
''
];
}
/**
* Render markdown column key
* Render markdown column key - no longer needed in compact format
*/
export function renderMarkdownColumnKey(): string[] {
return [
`**Column Key**:`,
`- **Read**: Tokens to read this observation (cost to learn it now)`,
`- **Work**: Tokens spent on work that produced this record ( research, building, deciding)`,
''
];
return [];
}
/**
* Render markdown context index instructions
* Render markdown context index instructions - folded into legend
*/
export function renderMarkdownContextIndex(): string[] {
return [
`**Context Index:** This semantic index (titles, types, files, tokens) is usually sufficient to understand past work.`,
'',
`When you need implementation details, rationale, or debugging context:`,
`- Fetch by ID: get_observations([IDs]) for observations visible in this index`,
`- Search history: Use the mem-search skill for past decisions, bugs, and deeper research`,
`- Trust this index over re-reading code for past decisions and learnings`,
''
];
return [];
}
/**
@@ -88,21 +78,20 @@ export function renderMarkdownContextEconomics(
): string[] {
const output: string[] = [];
output.push(`**Context Economics**:`);
output.push(`- Loading: ${economics.totalObservations} observations (${economics.totalReadTokens.toLocaleString()} tokens to read)`);
output.push(`- Work investment: ${economics.totalDiscoveryTokens.toLocaleString()} tokens spent on research, building, and decisions`);
const parts: string[] = [
`${economics.totalObservations} obs (${economics.totalReadTokens.toLocaleString()}t read)`,
`${economics.totalDiscoveryTokens.toLocaleString()}t work`
];
if (economics.totalDiscoveryTokens > 0 && (config.showSavingsAmount || config.showSavingsPercent)) {
let savingsLine = '- Your savings: ';
if (config.showSavingsAmount && config.showSavingsPercent) {
savingsLine += `${economics.savings.toLocaleString()} tokens (${economics.savingsPercent}% reduction from reuse)`;
if (config.showSavingsPercent) {
parts.push(`${economics.savingsPercent}% savings`);
} else if (config.showSavingsAmount) {
savingsLine += `${economics.savings.toLocaleString()} tokens`;
} else {
savingsLine += `${economics.savingsPercent}% reduction from reuse`;
parts.push(`${economics.savings.toLocaleString()}t saved`);
}
output.push(savingsLine);
}
output.push(`Stats: ${parts.join(' | ')}`);
output.push('');
return output;
@@ -114,37 +103,37 @@ export function renderMarkdownContextEconomics(
export function renderMarkdownDayHeader(day: string): string[] {
return [
`### ${day}`,
''
];
}
/**
* Render markdown file header with table header
* Render markdown file header - no longer renders table headers in compact format
*/
export function renderMarkdownFileHeader(file: string): string[] {
return [
`**${file}**`,
`| ID | Time | T | Title | Read | Work |`,
`|----|------|---|-------|------|------|`
];
export function renderMarkdownFileHeader(_file: string): string[] {
// File grouping eliminated in compact format - file context is in observation titles
return [];
}
/**
* Render markdown table row for observation
* Format compact time: "9:23 AM" → "9:23a", "12:05 PM" → "12:05p"
*/
function compactTime(time: string): string {
return time.toLowerCase().replace(' am', 'a').replace(' pm', 'p');
}
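The compact time conversion above can be tried in isolation. This is a minimal sketch that copies the function body from the diff; the sample inputs are assumptions about the "h:mm AM/PM" strings the caller passes in, not values taken from the source.

```typescript
// Same body as compactTime in the diff. The inputs below are assumed
// examples of the "h:mm AM/PM" format, chosen for illustration.
function compactTime(time: string): string {
  return time.toLowerCase().replace(' am', 'a').replace(' pm', 'p');
}

const sampleTimes: string[] = ['9:23 AM', '12:05 PM'];
for (const sample of sampleTimes) {
  console.log(`${sample} -> ${compactTime(sample)}`);
}
```

Lowercasing first means a single pair of replacements covers both "AM"/"PM" and "am"/"pm" inputs.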
/**
* Render compact flat line for observation (replaces table row)
*/
export function renderMarkdownTableRow(
obs: Observation,
timeDisplay: string,
config: ContextConfig
_config: ContextConfig
): string {
const title = obs.title || 'Untitled';
const icon = ModeManager.getInstance().getTypeIcon(obs.type);
const { readTokens, discoveryDisplay } = formatObservationTokenDisplay(obs, config);
const time = timeDisplay ? compactTime(timeDisplay) : '"';
const readCol = config.showReadTokens ? `~${readTokens}` : '';
const workCol = config.showWorkTokens ? discoveryDisplay : '';
return `| #${obs.id} | ${timeDisplay || '"'} | ${icon} | ${title} | ${readCol} | ${workCol} |`;
return `${obs.id} ${time} ${icon} ${title}`;
}
/**
@@ -159,24 +148,23 @@ export function renderMarkdownFullObservation(
const output: string[] = [];
const title = obs.title || 'Untitled';
const icon = ModeManager.getInstance().getTypeIcon(obs.type);
const time = timeDisplay ? compactTime(timeDisplay) : '"';
const { readTokens, discoveryDisplay } = formatObservationTokenDisplay(obs, config);
output.push(`**#${obs.id}** ${timeDisplay || '"'} ${icon} **${title}**`);
output.push(`**${obs.id}** ${time} ${icon} **${title}**`);
if (detailField) {
output.push('');
output.push(detailField);
output.push('');
}
const tokenParts: string[] = [];
if (config.showReadTokens) {
tokenParts.push(`Read: ~${readTokens}`);
tokenParts.push(`~${readTokens}t`);
}
if (config.showWorkTokens) {
tokenParts.push(`Work: ${discoveryDisplay}`);
tokenParts.push(discoveryDisplay);
}
if (tokenParts.length > 0) {
output.push(tokenParts.join(', '));
output.push(tokenParts.join(' '));
}
output.push('');
@@ -190,10 +178,8 @@ export function renderMarkdownSummaryItem(
summary: { id: number; request: string | null },
formattedTime: string
): string[] {
const summaryTitle = `${summary.request || 'Session started'} (${formattedTime})`;
return [
`**#S${summary.id}** ${summaryTitle}`,
''
`S${summary.id} ${summary.request || 'Session started'} (${formattedTime})`,
];
}
@@ -229,7 +215,7 @@ export function renderMarkdownFooter(totalDiscoveryTokens: number, totalReadToke
const workTokensK = Math.round(totalDiscoveryTokens / 1000);
return [
'',
`Access ${workTokensK}k tokens of past research & decisions for just ${totalReadTokens.toLocaleString()}t. Use the claude-mem skill to access memories by ID.`
`Access ${workTokensK}k tokens of past work via get_observations([IDs]) or mem-search skill.`
];
}
@@ -237,5 +223,5 @@ export function renderMarkdownFooter(totalDiscoveryTokens: number, totalReadToke
* Render markdown empty state
*/
export function renderMarkdownEmptyState(project: string): string {
return `# [${project}] recent context, ${formatHeaderDateTime()}\n\nNo previous sessions found for this project yet.`;
return `# $CMEM ${project} ${formatHeaderDateTime()}\n\nNo previous sessions found.`;
}
+101 -86
@@ -1,7 +1,8 @@
/**
* TimelineRenderer - Renders the chronological timeline of observations and summaries
*
* Handles day grouping, file grouping within days, and table rendering.
* Handles day grouping and rendering. In markdown (LLM) mode, uses flat compact lines.
* In color (terminal) mode, uses file grouping with visual formatting.
*/
import type {
@@ -49,6 +50,103 @@ function getDetailField(obs: Observation, config: ContextConfig): string | null
return obs.facts ? parseJsonArray(obs.facts).join('\n') : null;
}
/**
* Render a single day's timeline items (markdown/LLM mode - flat compact lines)
*/
function renderDayTimelineMarkdown(
day: string,
dayItems: TimelineItem[],
fullObservationIds: Set<number>,
config: ContextConfig,
): string[] {
const output: string[] = [];
output.push(...Markdown.renderMarkdownDayHeader(day));
let lastTime = '';
for (const item of dayItems) {
if (item.type === 'summary') {
lastTime = '';
const summary = item.data as SummaryTimelineItem;
const formattedTime = formatDateTime(summary.displayTime);
output.push(...Markdown.renderMarkdownSummaryItem(summary, formattedTime));
} else {
const obs = item.data as Observation;
const time = formatTime(obs.created_at);
const showTime = time !== lastTime;
const timeDisplay = showTime ? time : '';
lastTime = time;
const shouldShowFull = fullObservationIds.has(obs.id);
if (shouldShowFull) {
const detailField = getDetailField(obs, config);
output.push(...Markdown.renderMarkdownFullObservation(obs, timeDisplay, detailField, config));
} else {
output.push(Markdown.renderMarkdownTableRow(obs, timeDisplay, config));
}
}
}
return output;
}
/**
* Render a single day's timeline items (color/terminal mode - file grouped with tables)
*/
function renderDayTimelineColor(
day: string,
dayItems: TimelineItem[],
fullObservationIds: Set<number>,
config: ContextConfig,
cwd: string,
): string[] {
const output: string[] = [];
output.push(...Color.renderColorDayHeader(day));
let currentFile: string | null = null;
let lastTime = '';
for (const item of dayItems) {
if (item.type === 'summary') {
currentFile = null;
lastTime = '';
const summary = item.data as SummaryTimelineItem;
const formattedTime = formatDateTime(summary.displayTime);
output.push(...Color.renderColorSummaryItem(summary, formattedTime));
} else {
const obs = item.data as Observation;
const file = extractFirstFile(obs.files_modified, cwd, obs.files_read);
const time = formatTime(obs.created_at);
const showTime = time !== lastTime;
lastTime = time;
const shouldShowFull = fullObservationIds.has(obs.id);
// Check if we need a new file section
if (file !== currentFile) {
output.push(...Color.renderColorFileHeader(file));
currentFile = file;
}
if (shouldShowFull) {
const detailField = getDetailField(obs, config);
output.push(...Color.renderColorFullObservation(obs, time, showTime, detailField, config));
} else {
output.push(Color.renderColorTableRow(obs, time, showTime, config));
}
}
}
output.push('');
return output;
}
/**
* Render a single day's timeline items
*/
@@ -60,93 +158,10 @@ export function renderDayTimeline(
cwd: string,
useColors: boolean
): string[] {
const output: string[] = [];
// Day header
if (useColors) {
output.push(...Color.renderColorDayHeader(day));
} else {
output.push(...Markdown.renderMarkdownDayHeader(day));
return renderDayTimelineColor(day, dayItems, fullObservationIds, config, cwd);
}
let currentFile: string | null = null;
let lastTime = '';
let tableOpen = false;
for (const item of dayItems) {
if (item.type === 'summary') {
// Close any open table before summary
if (tableOpen) {
output.push('');
tableOpen = false;
currentFile = null;
lastTime = '';
}
const summary = item.data as SummaryTimelineItem;
const formattedTime = formatDateTime(summary.displayTime);
if (useColors) {
output.push(...Color.renderColorSummaryItem(summary, formattedTime));
} else {
output.push(...Markdown.renderMarkdownSummaryItem(summary, formattedTime));
}
} else {
const obs = item.data as Observation;
const file = extractFirstFile(obs.files_modified, cwd, obs.files_read);
const time = formatTime(obs.created_at);
const showTime = time !== lastTime;
const timeDisplay = showTime ? time : '';
lastTime = time;
const shouldShowFull = fullObservationIds.has(obs.id);
// Check if we need a new file section
if (file !== currentFile) {
if (tableOpen) {
output.push('');
}
if (useColors) {
output.push(...Color.renderColorFileHeader(file));
} else {
output.push(...Markdown.renderMarkdownFileHeader(file));
}
currentFile = file;
tableOpen = true;
}
if (shouldShowFull) {
const detailField = getDetailField(obs, config);
if (useColors) {
output.push(...Color.renderColorFullObservation(obs, time, showTime, detailField, config));
} else {
// Close table for full observation in markdown mode
if (tableOpen && !useColors) {
output.push('');
tableOpen = false;
}
output.push(...Markdown.renderMarkdownFullObservation(obs, timeDisplay, detailField, config));
currentFile = null; // Reset to trigger new table header if needed
}
} else {
if (useColors) {
output.push(Color.renderColorTableRow(obs, time, showTime, config));
} else {
output.push(Markdown.renderMarkdownTableRow(obs, timeDisplay, config));
}
}
}
}
// Close any remaining open table
if (tableOpen) {
output.push('');
}
return output;
return renderDayTimelineMarkdown(day, dayItems, fullObservationIds, config);
}
/**
+2
@@ -13,6 +13,8 @@ export interface ContextInput {
source?: "startup" | "resume" | "clear" | "compact";
/** Array of projects to query (for worktree support: [parent, worktree]) */
projects?: string[];
/** When true, return ALL observations with no limit */
full?: boolean;
[key: string]: any;
}
+44 -23
@@ -653,30 +653,26 @@ export class WorkerService {
// Do NOT restart after unrecoverable errors - prevents infinite loops
if (hadUnrecoverableError) {
logger.warn('SYSTEM', 'Skipping restart due to unrecoverable error', {
sessionId: session.sessionDbId
});
this.broadcastProcessingStatus();
this.terminateSession(session.sessionDbId, 'unrecoverable_error');
return;
}
// Store for pending-count check below
const { PendingMessageStore } = require('./sqlite/PendingMessageStore.js');
const pendingStore = new PendingMessageStore(this.dbManager.getSessionStore().db, 3);
// Idle timeout means no new work arrived for 3 minutes - don't restart
// No need to reset stale processing messages here — claimNextMessage() self-heals
if (session.idleTimedOut) {
logger.info('SYSTEM', 'Generator exited due to idle timeout, not restarting', {
sessionId: session.sessionDbId
});
session.idleTimedOut = false; // Reset flag
this.broadcastProcessingStatus();
return;
}
const pendingStore = this.sessionManager.getPendingMessageStore();
// Check if there's pending work that needs processing with a fresh AbortController
const pendingCount = pendingStore.getPendingCount(session.sessionDbId);
// Idle timeout means no new work arrived for 3 minutes - don't restart
// But check pendingCount first: a message may have arrived between idle
// abort and .finally(), and we must not abandon it
if (session.idleTimedOut) {
session.idleTimedOut = false; // Reset flag
if (pendingCount === 0) {
this.terminateSession(session.sessionDbId, 'idle_timeout');
return;
}
// Fall through to pending-work restart below
}
const MAX_PENDING_RESTARTS = 3;
if (pendingCount > 0) {
@@ -690,7 +686,7 @@ export class WorkerService {
consecutiveRestarts: session.consecutiveRestarts
});
session.consecutiveRestarts = 0;
this.broadcastProcessingStatus();
this.terminateSession(session.sessionDbId, 'max_restarts_exceeded');
return;
}
@@ -703,12 +699,13 @@ export class WorkerService {
session.abortController = new AbortController();
// Restart processor
this.startSessionProcessor(session, 'pending-work-restart');
this.broadcastProcessingStatus();
} else {
// Successful completion with no pending work — reset counter
// Successful completion with no pending work — clean up session
// removeSessionImmediate fires onSessionDeletedCallback → broadcastProcessingStatus()
session.consecutiveRestarts = 0;
this.sessionManager.removeSessionImmediate(session.sessionDbId);
}
this.broadcastProcessingStatus();
});
}
@@ -784,6 +781,30 @@ export class WorkerService {
this.sessionEventBroadcaster.broadcastSessionCompleted(sessionDbId);
}
/**
* Terminate a session that will not restart.
* Enforces the restart-or-terminate invariant: every generator exit
* must either call startSessionProcessor() or terminateSession().
* No zombie sessions allowed.
*
* GENERATOR EXIT INVARIANT:
* .finally() → restart? → startSessionProcessor()
*              no?      → terminateSession()
*/
private terminateSession(sessionDbId: number, reason: string): void {
const pendingStore = this.sessionManager.getPendingMessageStore();
const abandoned = pendingStore.markAllSessionMessagesAbandoned(sessionDbId);
logger.info('SYSTEM', 'Session terminated', {
sessionId: sessionDbId,
reason,
abandonedMessages: abandoned
});
// removeSessionImmediate fires onSessionDeletedCallback → broadcastProcessingStatus()
this.sessionManager.removeSessionImmediate(sessionDbId);
}
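The restart-or-terminate invariant described above can be sketched as a pure decision function. This is an illustration under stated assumptions, not the WorkerService implementation: `ExitState`, `ExitDecision`, and `decideOnGeneratorExit` are hypothetical names, the `>=` comparison against `MAX_PENDING_RESTARTS` is assumed because the hunk containing that check is elided, and `'completed'` is an illustrative label for the branch where the diff calls `removeSessionImmediate` directly.

```typescript
// Sketch only: these types and the function are hypothetical names,
// not part of WorkerService.
type ExitDecision =
  | { action: 'restart'; source: string }
  | { action: 'terminate'; reason: string };

interface ExitState {
  hadUnrecoverableError: boolean;
  idleTimedOut: boolean;
  pendingCount: number;
  consecutiveRestarts: number;
}

// Mirrors the constant in the diff. The >= comparison below is an
// assumption, because the hunk that performs the check is elided.
const MAX_PENDING_RESTARTS = 3;

function decideOnGeneratorExit(state: ExitState): ExitDecision {
  if (state.hadUnrecoverableError) {
    return { action: 'terminate', reason: 'unrecoverable_error' };
  }
  // Pending work is checked before the idle timeout is honoured, so a
  // message that arrived between the idle abort and .finally() survives.
  if (state.idleTimedOut && state.pendingCount === 0) {
    return { action: 'terminate', reason: 'idle_timeout' };
  }
  if (state.pendingCount > 0) {
    if (state.consecutiveRestarts >= MAX_PENDING_RESTARTS) {
      return { action: 'terminate', reason: 'max_restarts_exceeded' };
    }
    return { action: 'restart', source: 'pending-work-restart' };
  }
  // 'completed' is an illustrative label, not a reason string from the source.
  return { action: 'terminate', reason: 'completed' };
}

const raced = decideOnGeneratorExit({
  hadUnrecoverableError: false,
  idleTimedOut: true,
  pendingCount: 1,
  consecutiveRestarts: 0,
});
console.log(raced.action);
```

Every path returns either a restart or a terminate decision, which is the property the invariant asks for: no exit leaves a session alive without a processor.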
/**
* Process pending session queues
*/
@@ -907,8 +928,8 @@ export class WorkerService {
* Broadcast processing status change to SSE clients
*/
broadcastProcessingStatus(): void {
const isProcessing = this.sessionManager.isAnySessionProcessing();
const queueDepth = this.sessionManager.getTotalActiveWork();
const isProcessing = queueDepth > 0;
const activeSessions = this.sessionManager.getActiveSessionCount();
logger.info('WORKER', 'Broadcasting processing status', {
+8 -7
View File
@@ -350,7 +350,7 @@ export class SessionManager {
this.sessions.delete(sessionDbId);
this.sessionQueues.delete(sessionDbId);
logger.info('SESSION', 'Session removed (orphaned after SDK termination)', {
logger.info('SESSION', 'Session removed from active sessions', {
sessionId: sessionDbId,
project: session.project
});
@@ -402,10 +402,11 @@ export class SessionManager {
}
/**
* Check if any session has pending messages (for spinner tracking)
* Check if any active session has pending messages (for spinner tracking).
* Scoped to in-memory sessions only.
*/
hasPendingMessages(): boolean {
return this.getPendingStore().hasAnyPendingWork();
return this.getTotalQueueDepth() > 0;
}
/**
@@ -437,12 +438,12 @@ export class SessionManager {
}
/**
* Check if any session is actively processing (has pending messages OR active generator)
* Used for activity indicator to prevent spinner from stopping while SDK is processing
* Check if any active session has pending work.
* Scoped to in-memory sessions only; orphaned DB messages from dead
* sessions must not keep the spinner spinning forever.
*/
isAnySessionProcessing(): boolean {
// hasAnyPendingWork checks for 'pending' OR 'processing'
return this.getPendingStore().hasAnyPendingWork();
return this.getTotalQueueDepth() > 0;
}
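The scoping change above can be illustrated with a small in-memory model. This is a sketch under assumptions: `ActiveQueueModel`, `setDepth`, and `removeSession` are hypothetical names, and only `getTotalQueueDepth` and `isAnySessionProcessing` echo method names from the diff. The point is that the spinner state is derived only from sessions held in memory, so rows left in the database by dead sessions are never consulted.

```typescript
// Sketch only: ActiveQueueModel, setDepth and removeSession are
// hypothetical; the real SessionManager tracks sessions differently.
class ActiveQueueModel {
  // sessionDbId -> number of queued messages for a live, in-memory session
  private depths = new Map<number, number>();

  setDepth(sessionDbId: number, depth: number): void {
    this.depths.set(sessionDbId, depth);
  }

  removeSession(sessionDbId: number): void {
    this.depths.delete(sessionDbId);
  }

  // Sums work across in-memory sessions only; no database scan.
  getTotalQueueDepth(): number {
    let total = 0;
    this.depths.forEach((depth) => {
      total += depth;
    });
    return total;
  }

  isAnySessionProcessing(): boolean {
    return this.getTotalQueueDepth() > 0;
  }
}

const model = new ActiveQueueModel();
model.setDepth(1, 2);
console.log(model.isAnySessionProcessing());
model.removeSession(1);
console.log(model.isAnySessionProcessing());
```

Once a session is removed from the map, its leftover messages can no longer influence the result, whatever remains in storage.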
/**
@@ -33,12 +33,6 @@ export class SessionEventBroadcaster {
prompt
});
// Start activity indicator (work is about to begin)
this.sseBroadcaster.broadcast({
type: 'processing_status',
isProcessing: true
});
// Update processing status based on queue depth
this.workerService.broadcastProcessingStatus();
}
@@ -208,6 +208,7 @@ export class SearchRoutes extends BaseRouteHandler {
// Support both legacy `project` and new `projects` parameter
const projectsParam = (req.query.projects as string) || (req.query.project as string);
const useColors = req.query.colors === 'true';
const full = req.query.full === 'true';
if (!projectsParam) {
this.badRequest(res, 'Project(s) parameter is required');
@@ -234,7 +235,8 @@ export class SearchRoutes extends BaseRouteHandler {
{
session_id: 'context-inject-' + Date.now(),
cwd: cwd,
projects: projects
projects: projects,
full
},
useColors
);
+8 -2
@@ -1,11 +1,13 @@
/**
* Tag Stripping Utilities
*
* Implements the dual-tag system for meta-observation control:
* Implements the tag system for meta-observation control:
* 1. <claude-mem-context> - System-level tag for auto-injected observations
* (prevents recursive storage when context injection is active)
* 2. <private> - User-level tag for manual privacy control
* (allows users to mark content they don't want persisted)
* 3. <system_instruction> / <system-instruction> - Conductor-injected system instructions
* (should not be persisted to memory)
*
* EDGE PROCESSING PATTERN: Filter at hook layer before sending to worker/storage.
* This keeps the worker service simple and follows one-way data stream.
@@ -27,7 +29,9 @@ const MAX_TAG_COUNT = 100;
function countTags(content: string): number {
const privateCount = (content.match(/<private>/g) || []).length;
const contextCount = (content.match(/<claude-mem-context>/g) || []).length;
return privateCount + contextCount;
const systemInstructionCount = (content.match(/<system_instruction>/g) || []).length;
const systemInstructionHyphenCount = (content.match(/<system-instruction>/g) || []).length;
return privateCount + contextCount + systemInstructionCount + systemInstructionHyphenCount;
}
/**
@@ -49,6 +53,8 @@ function stripTagsInternal(content: string): string {
return content
.replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
.replace(/<private>[\s\S]*?<\/private>/g, '')
.replace(/<system_instruction>[\s\S]*?<\/system_instruction>/g, '')
.replace(/<system-instruction>[\s\S]*?<\/system-instruction>/g, '')
.trim();
}
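The stripping chain above can be exercised on its own. This is a minimal sketch: `stripTagsSketch` is a hypothetical name that reuses the four regular expressions from the diff, without the tag-count guard or any other logic the real module may apply.

```typescript
// Sketch only: stripTagsSketch is a hypothetical stand-in that reuses the
// regular expressions shown in the diff and nothing else.
function stripTagsSketch(content: string): string {
  return content
    .replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
    .replace(/<private>[\s\S]*?<\/private>/g, '')
    .replace(/<system_instruction>[\s\S]*?<\/system_instruction>/g, '')
    .replace(/<system-instruction>[\s\S]*?<\/system-instruction>/g, '')
    .trim();
}

const prompt = 'keep <system_instruction>drop\nthis</system_instruction> keep too';
// JSON.stringify makes the remaining whitespace visible in the output.
console.log(JSON.stringify(stripTagsSketch(prompt)));
```

The lazy `[\s\S]*?` matches across newlines and stops at the first closing tag, so two separate tagged spans in one prompt are removed individually. Only the tagged span is deleted, so the whitespace on either side of it stays in place.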
+69 -1
@@ -1,7 +1,7 @@
/**
* Tag Stripping Utility Tests
*
* Tests the dual-tag privacy system for <private> and <claude-mem-context> tags.
* Tests the tag privacy system for <private>, <claude-mem-context>, and <system_instruction> tags.
* These tags enable users and the system to exclude content from memory storage.
*
* Sources:
@@ -257,6 +257,74 @@ finish`;
});
});
describe('system_instruction tag stripping', () => {
describe('basic system_instruction removal', () => {
it('should strip single <system_instruction> tag from prompt', () => {
const input = 'user content <system_instruction>injected instructions</system_instruction> more content';
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('user content  more content');
});
it('should strip <system_instruction> mixed with <private> tags', () => {
const input = '<system_instruction>instructions</system_instruction> public <private>secret</private> end';
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('public  end');
});
it('should return empty string for entirely <system_instruction> content', () => {
const input = '<system_instruction>entire prompt is system instructions</system_instruction>';
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('');
});
it('should strip <system_instruction> tags from JSON content', () => {
const jsonContent = JSON.stringify({
data: '<system_instruction>injected</system_instruction> real data'
});
const result = stripMemoryTagsFromJson(jsonContent);
const parsed = JSON.parse(result);
expect(parsed.data).toBe(' real data');
});
it('should strip multiline content within <system_instruction> tags', () => {
const input = `before
<system_instruction>
line one
line two
line three
</system_instruction>
after`;
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('before\n\nafter');
});
});
});
describe('system-instruction (hyphen variant) tag stripping', () => {
it('should strip single <system-instruction> tag from prompt', () => {
const input = 'user content <system-instruction>injected instructions</system-instruction> more content';
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('user content  more content');
});
it('should strip both underscore and hyphen variants in same prompt', () => {
const input = '<system_instruction>underscore</system_instruction> middle <system-instruction>hyphen</system-instruction> end';
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('middle  end');
});
it('should strip multiline <system-instruction> content', () => {
const input = `before
<system-instruction>
line one
line two
</system-instruction>
after`;
const result = stripMemoryTagsFromPrompt(input);
expect(result).toBe('before\n\nafter');
});
});
describe('privacy enforcement integration', () => {
it('should allow empty result to trigger privacy skip', () => {
// Simulates what SessionRoutes does with private-only prompts
+148
@@ -326,4 +326,152 @@ describe('Zombie Agent Prevention', () => {
session.generatorPromise = null;
expect(session.generatorPromise).toBeNull();
});
describe('Session Termination Invariant', () => {
// Tests the restart-or-terminate invariant:
// When a generator exits without restarting, its messages must be
// marked abandoned and the session removed from the active Map.
test('should mark messages abandoned when session is terminated', () => {
const sessionId = createDbSession('content-terminate-1');
enqueueTestMessage(sessionId, 'content-terminate-1');
enqueueTestMessage(sessionId, 'content-terminate-1');
// Verify messages exist
expect(pendingStore.getPendingCount(sessionId)).toBe(2);
expect(pendingStore.hasAnyPendingWork()).toBe(true);
// Terminate: mark abandoned (same as terminateSession does)
const abandoned = pendingStore.markAllSessionMessagesAbandoned(sessionId);
expect(abandoned).toBe(2);
// Spinner should stop: no pending work remains
expect(pendingStore.hasAnyPendingWork()).toBe(false);
expect(pendingStore.getPendingCount(sessionId)).toBe(0);
});
test('should handle terminate with zero pending messages', () => {
const sessionId = createDbSession('content-terminate-empty');
// No messages enqueued
expect(pendingStore.getPendingCount(sessionId)).toBe(0);
// Terminate with nothing to abandon
const abandoned = pendingStore.markAllSessionMessagesAbandoned(sessionId);
expect(abandoned).toBe(0);
// Still no pending work
expect(pendingStore.hasAnyPendingWork()).toBe(false);
});
test('should be idempotent — double terminate marks zero on second call', () => {
const sessionId = createDbSession('content-terminate-idempotent');
enqueueTestMessage(sessionId, 'content-terminate-idempotent');
// First terminate
const first = pendingStore.markAllSessionMessagesAbandoned(sessionId);
expect(first).toBe(1);
// Second terminate — already failed, nothing to mark
const second = pendingStore.markAllSessionMessagesAbandoned(sessionId);
expect(second).toBe(0);
expect(pendingStore.hasAnyPendingWork()).toBe(false);
});
test('should remove session from Map via removeSessionImmediate', () => {
const sessionId = createDbSession('content-terminate-map');
const session = createMockSession(sessionId, {
contentSessionId: 'content-terminate-map',
});
// Simulate the in-memory sessions Map
const sessions = new Map<number, ActiveSession>();
sessions.set(sessionId, session);
expect(sessions.has(sessionId)).toBe(true);
// Simulate removeSessionImmediate behavior
sessions.delete(sessionId);
expect(sessions.has(sessionId)).toBe(false);
});
test('should return hasAnyPendingWork false after all sessions terminated', () => {
// Create multiple sessions with messages
const sid1 = createDbSession('content-multi-term-1');
const sid2 = createDbSession('content-multi-term-2');
const sid3 = createDbSession('content-multi-term-3');
enqueueTestMessage(sid1, 'content-multi-term-1');
enqueueTestMessage(sid1, 'content-multi-term-1');
enqueueTestMessage(sid2, 'content-multi-term-2');
enqueueTestMessage(sid3, 'content-multi-term-3');
expect(pendingStore.hasAnyPendingWork()).toBe(true);
// Terminate all sessions
pendingStore.markAllSessionMessagesAbandoned(sid1);
pendingStore.markAllSessionMessagesAbandoned(sid2);
pendingStore.markAllSessionMessagesAbandoned(sid3);
// Spinner must stop
expect(pendingStore.hasAnyPendingWork()).toBe(false);
});
test('should not affect other sessions when terminating one', () => {
const sid1 = createDbSession('content-isolate-1');
const sid2 = createDbSession('content-isolate-2');
enqueueTestMessage(sid1, 'content-isolate-1');
enqueueTestMessage(sid2, 'content-isolate-2');
// Terminate only session 1
pendingStore.markAllSessionMessagesAbandoned(sid1);
// Session 2 still has work
expect(pendingStore.getPendingCount(sid1)).toBe(0);
expect(pendingStore.getPendingCount(sid2)).toBe(1);
expect(pendingStore.hasAnyPendingWork()).toBe(true);
});
test('should mark both pending and processing messages as abandoned', () => {
const sessionId = createDbSession('content-mixed-status');
// Enqueue two messages
const msgId1 = enqueueTestMessage(sessionId, 'content-mixed-status');
enqueueTestMessage(sessionId, 'content-mixed-status');
// Claim first message (transitions to 'processing')
const claimed = pendingStore.claimNextMessage(sessionId);
expect(claimed).not.toBeNull();
expect(claimed!.id).toBe(msgId1);
// Now we have 1 processing + 1 pending
expect(pendingStore.getPendingCount(sessionId)).toBe(2);
// Terminate should mark BOTH as failed
const abandoned = pendingStore.markAllSessionMessagesAbandoned(sessionId);
expect(abandoned).toBe(2);
expect(pendingStore.hasAnyPendingWork()).toBe(false);
});
test('should enforce invariant: no pending work after terminate regardless of initial state', () => {
const sessionId = createDbSession('content-invariant');
// Create a mixed initial state: three messages enqueued, one of them claimed below
enqueueTestMessage(sessionId, 'content-invariant');
enqueueTestMessage(sessionId, 'content-invariant');
enqueueTestMessage(sessionId, 'content-invariant');
// Claim one (processing)
pendingStore.claimNextMessage(sessionId);
// Verify complex state
expect(pendingStore.getPendingCount(sessionId)).toBe(3);
// THE INVARIANT: after terminate, hasAnyPendingWork MUST be false
pendingStore.markAllSessionMessagesAbandoned(sessionId);
expect(pendingStore.hasAnyPendingWork()).toBe(false);
expect(pendingStore.getPendingCount(sessionId)).toBe(0);
});
});
});