Compare commits

...

39 Commits

Author SHA1 Message Date
Alex Newman 0768fafd83 chore: bump version to 7.4.2
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 17:19:17 -05:00
Alex Newman 5ce656037e Refactor worker commands from npm scripts to claude-mem CLI
- Updated all instances of `npm run worker:restart` to `claude-mem restart` in documentation and code comments for consistency.
- Modified error messages and logging to reflect the new command structure.
- Adjusted worker management commands in various troubleshooting documents.
- Changed the worker status check message to guide users towards the new command.
2025-12-20 17:16:20 -05:00
Alex Newman e27f8e4963 added path alias script 2025-12-20 17:03:52 -05:00
ToxMox af145cfaef fix(windows): improve worker stop/restart reliability (#395)
* fix(windows): enable worker logging on Windows

Previously, Windows worker startup via PowerShell Start-Process did not
redirect stdout/stderr to log files, making debugging startup failures
impossible. This adds -RedirectStandardOutput and -RedirectStandardError
to capture worker logs to ~/.claude-mem/logs/worker-YYYY-MM-DD.log.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(windows): improve worker stop/restart reliability

- Use HTTP shutdown endpoint as primary stop method (worker kills itself)
- Only remove PID file after confirming worker is actually dead
- Remove auto-respawn from wrapper to prevent PID file mismatches
- Wrapper now exits when inner worker crashes (hooks will restart)

This hopefully fixes issues where npm run worker:stop would fail silently when
the worker was started from hooks, leaving zombie processes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 16:50:32 -05:00
Alex Newman 15fe0cfe3c docs: simplify build commands section in CLAUDE.md 2025-12-19 18:32:24 -05:00
Alex Newman c0ed9bbcfd chore: update CHANGELOG.md 2025-12-19 15:12:42 -05:00
Alex Newman 8040c6d559 fix: redirect MCP server logs to stderr to preserve JSON-RPC protocol
MCP uses stdio transport where stdout is reserved for JSON-RPC messages.
Console.log was writing startup logs to stdout, causing Claude Desktop
to parse "[2025-12-19..." as a JSON array and fail.

Fixes #396

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 15:11:30 -05:00
Alex Newman 7187220b24 Redirect console.log to stderr in mcp-server.ts to prevent JSON-RPC protocol interference; update mem-search.zip 2025-12-19 15:10:32 -05:00
Alex Newman ca52950b2a chore: update CHANGELOG.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-17 22:51:42 -05:00
Alex Newman ee1441f462 chore: bump version to 7.4.0
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-17 22:50:46 -05:00
Alex Newman c4af31f48d Optimize MCP tool token usage with schema reference pattern
Reduces MCP tool token consumption by ~90% through progressive disclosure. Tools now show minimal schemas with get_schema() for details on demand.
2025-12-17 22:47:30 -05:00
Alex Newman c2742d5664 chore: bump plugin version to 7.3.9 2025-12-17 19:52:03 -05:00
Alex Newman 0c45919261 chore: update CHANGELOG.md 2025-12-17 19:41:30 -05:00
Alex Newman a3ab898e04 chore: bump version to 7.3.9 2025-12-17 19:40:46 -05:00
Alex Newman dea67c0d86 fix: MCP server compatibility and web UI path resolution
Fixes #371, #369

**Issue #371: MCP server fails when Bun not in PATH**
- Changed MCP server shebang from `#!/usr/bin/env bun` to `#!/usr/bin/env node`
- MCP server now works regardless of whether Bun is in PATH
- Worker service correctly uses getBunPath() to find Bun in common install locations

**Issue #369: Web UI returns ENOENT error**
- Fixed hardcoded 'plugin/' path in ViewerRoutes
- Now checks both cache structure (ui/viewer.html) and marketplace structure (plugin/ui/viewer.html)
- Web UI now works from both ~/.claude/plugins/cache and ~/.claude/plugins/marketplaces

**Technical Details:**
- Updated build-hooks.js to use Node shebang for MCP server (line 169)
- Enhanced ViewerRoutes.handleViewerUI() to try multiple path patterns
- Added existsSync check to find viewer.html in either location

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-17 19:36:03 -05:00
Alex Newman d13a2c237c Update mem-search plugin with new features and improvements 2025-12-17 19:26:44 -05:00
Alex Newman c592f0aa69 chore: update CHANGELOG.md 2025-12-17 19:26:07 -05:00
Alex Newman 85a2472e4e chore: bump version to 7.3.8 2025-12-17 19:25:11 -05:00
Alex Newman 0cb3256b2d fix(security): add localhost-only protection for admin endpoints
Adds middleware to restrict /api/admin/restart and /api/admin/shutdown
to localhost-only access. This prevents DoS attacks when the worker
service is bound to 0.0.0.0 for remote UI access.

Implementation:
- Created requireLocalhost middleware in middleware.ts
- Applied to both admin endpoints
- Checks client IP against localhost addresses (127.0.0.1, ::1, etc.)
- Returns 403 Forbidden for non-localhost requests

Addresses security concern raised in PR #368 with cleaner DRY approach.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-17 19:06:33 -05:00
Alex Newman 44029862b1 chore: update CHANGELOG.md 2025-12-17 18:48:20 -05:00
Alex Newman 130abe04a9 chore: bump version to 7.3.7 2025-12-17 18:47:04 -05:00
Alex Newman bff10d49c9 fix(windows): Windows platform stabilization improvements (#378)
* chore: bump version to 7.3.6 in package.json

* Enhance worker readiness checks and MCP connection handling

- Updated health check endpoint to /api/readiness for better initialization tracking.
- Increased timeout for health checks and worker startup retries, especially for Windows.
- Added initialization flags to track MCP readiness and overall worker initialization status.
- Implemented a timeout guard for MCP connection to prevent hanging.
- Adjusted logging to reflect readiness state and errors more accurately.

* fix(windows): use Bun PATH detection in worker wrapper

Phase 2/8: Fix Bun PATH Detection in Worker Wrapper

- Import getBunPath() in worker-wrapper.ts for Bun detection
- Add Bun path resolution before spawning inner worker process
- Update spawn call to use detected Bun path instead of process.execPath
- Add logging to bun-path.ts when PATH detection succeeds
- Add logging when fallback paths are used
- Add Windows-specific validation for .exe extension
- Log warning with searched paths when Bun not found
- Fail fast with clear error message if Bun cannot be detected

This ensures worker-wrapper uses the correct Bun executable on Windows
even when Bun is not in PATH, fixing issue #371 where users reported
"Bun not in PATH" errors despite Bun being installed.

Addresses: #371

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): standardize child process spawning with windowsHide

Phase 3/8: Standardize Child Process Spawning (Windows)

Changes:
- Added windowsHide flag to ChromaSync MCP subprocess spawn
- Added Windows-specific process tracking (childPid) in ChromaSync
- Force-kill subprocess on Windows before closing transport to prevent zombie processes
- Updated cleanupOrphanedProcesses() to support Windows using PowerShell Get-CimInstance
- Use taskkill /T /F for proper process tree cleanup on Windows
- Audited BranchManager - confirmed windowsHide already present on all spawn calls

This prevents PowerShell windows from appearing during ChromaSync operations
and ensures proper cleanup of subprocess trees on Windows.

Addresses: #363, #361, #367, #371, #373, #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): enhance socket cleanup with recursive process tree management

Phase 4/8: Enhanced Socket Cleanup & Process Tree Management

Changes:
- Added recursive process tree enumeration in worker-wrapper.ts for Windows
- Enhanced killInner() to enumerate all descendants before killing
- Added fallback individual process kill if taskkill /T fails
- Added 10s timeout to ChromaSync.close() in DatabaseManager to prevent hangs
- Force nullify ChromaSync even on close failure to prevent resource leaks
- Improved logging to show full process tree during cleanup

This ensures complete cleanup of all child processes (ChromaSync MCP subprocess,
Python processes, etc.) preventing socket leaks and CLOSE_WAIT states.

Addresses: #363, #361

* fix(windows): consolidate project name extraction with drive root handling

Phase 5/8: Project Name Extraction Consolidation

- Created shared getProjectName() utility in src/utils/project-name.ts
- Handles edge case: drive roots (C:\, J:\) now return "drive-X" format
- Handles edge case: null/undefined/empty cwd now returns "unknown-project"
- Fixed missing null check bug in new-hook.ts
- Replaced duplicated path.basename(cwd) logic in:
  - src/hooks/context-hook.ts
  - src/hooks/new-hook.ts
  - src/services/context-generator.ts

Addresses: #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): increase timeouts and improve error messages

Phase 6/8: Increase Timeouts & Improve Error Messages

- Enhanced logger.ts with platform prefix (WIN32/DARWIN) and PID in all logs
- Added comprehensive Windows troubleshooting to ProcessManager error messages
- Enhanced Bun detection error message with Windows-specific troubleshooting
- All error messages now include GitHub issue numbers and docs links
- Windows timeout already increased to 2.0x multiplier in previous phases

Changes:
- src/utils/logger.ts: Added platform prefix and PID to all log output
- src/services/process/ProcessManager.ts: Enhanced error messages with troubleshooting steps
- src/utils/bun-path.ts: Added Windows-specific Bun detection error guidance

Addresses: #363, #361, #367, #371, #373, #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(windows): add comprehensive Windows CI testing

Phase 7/8: Add Windows CI Testing

- Create automated Windows testing workflow
- Test worker startup/shutdown cycles
- Verify Bun PATH detection on Windows
- Test rapid restart scenarios
- Validate port cleanup after shutdown
- Check for zombie processes
- Run on all pushes and PRs to main/fix/feature branches

Addresses: #363, #361, #367, #371, #373, #374

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* ci(windows): remove build steps from Windows CI workflow

Build files are already included in the plugin folder, so npm install
and npm run build are unnecessary steps in the CI workflow.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* revert: remove Windows CI workflow

The CI workflow cannot be properly implemented in the current architecture
due to limitations in testing the worker service in CI environments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* security: add PID validation and improve ChromaSync timeout handling

Address critical security and reliability issues identified in PR review:

**Security Fixes:**
- Add PID validation before all PowerShell/taskkill command execution
- Validate PIDs are positive integers to prevent command injection
- Apply validation in worker-wrapper.ts, worker-service.ts, and ChromaSync.ts

**Reliability Improvements:**
- Add timeout handling to ChromaSync client.close() (10s timeout)
- Add timeout handling to ChromaSync transport.close() (5s timeout)
- Implement force-kill fallback when ChromaSync close operations timeout
- Prevents hanging on shutdown and ensures subprocess cleanup

**Implementation Details:**
- PID validation checks: Number.isInteger(pid) && pid > 0
- Applied before all execSync taskkill calls on Windows
- Applied in process enumeration (Get-CimInstance) PowerShell commands
- ChromaSync.close() uses Promise.race for timeout enforcement
- Graceful degradation with force-kill fallback on timeout

Addresses PR #378 review feedback

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Refactor ChromaSync client and transport closure logic

- Removed timeout handling for closing the Chroma client and transport.
- Simplified error logging for client and transport closure.
- Ensured subprocess cleanup logic is more straightforward.

* fix(worker): streamline Windows process management and cleanup

* revert: remove speculative LLM-generated complexity

Reverts defensive code that was added speculatively without user-reported issues:

- ChromaSync: Remove PID extraction and explicit taskkill (wrapper handles this)
- worker-wrapper: Restore simple taskkill /T /F (validated in v7.3.5)
- DatabaseManager: Remove Promise.race timeout wrapper
- hook-constants: Restore original timeout values
- logger: Remove platform/PID additions to every log line
- bun-path: Remove speculative logging

Keeps only changes that map to actual GitHub issues:
- #374: Drive root project name fix (getProjectName utility)
- #363: Readiness endpoint and Windows orphan cleanup
- #367: windowsHide on ChromaSync transport

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-17 18:44:04 -05:00
Alex Newman 40a71d3250 chore: update CHANGELOG.md 2025-12-17 16:02:04 -05:00
Alex Newman ae3d20c71a chore: bump version to 7.3.6 2025-12-17 16:01:07 -05:00
Alex Newman 54ef9662c1 Enhance SDKAgent response handling and message processing
- Updated response logging to process both empty and non-empty responses.
- Added functionality to mark messages as processed even when the response is empty.
- Refactored message processing logic to ensure all pending messages are marked as processed after successful observation/summary storage.
- Introduced a new private method `markMessagesProcessed` to encapsulate the logic for marking messages as processed, preventing message loss and duplicate processing.
2025-12-17 16:00:12 -05:00
Alex Newman 9aec461e14 Update mem-search plugin with new features and improvements 2025-12-17 15:39:03 -05:00
Alex Newman 0fe0705133 chore: bump version to 7.3.5 (#375)
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-17 14:54:13 -05:00
ToxMox a5bf653a47 fix(windows): solve zombie port problem with wrapper architecture (#372)
On Windows, Bun doesn't properly release socket handles when the worker
process exits, causing "zombie ports" that remain bound even after all
processes are dead. This required a system reboot to clear.

Solution: Introduce a wrapper process (worker-wrapper.cjs) that:
- Spawns the actual worker as a child with IPC channel
- On restart/shutdown, uses `taskkill /T /F` to kill the entire process tree
- Exits itself, allowing hooks to start fresh

The wrapper has no sockets, so Bun's socket cleanup bug doesn't affect it.
When the wrapper kills the inner worker tree and exits, the port is properly
released and can be immediately reused.

Key changes:
- New worker-wrapper.ts for Windows process lifecycle management
- ProcessManager starts wrapper on Windows, worker directly on Unix
- Worker sends IPC messages to wrapper for restart/shutdown
- Health endpoint now includes debug info (build ID, managed status, hasIpc)

Tested: Restart API now properly releases port and new worker binds to same port.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 14:45:41 -05:00
Alex Newman 1fec1e8339 chore: update mem-search.zip with latest changes 2025-12-16 22:17:58 -05:00
Alex Newman 1afb14d0d6 docs: update CHANGELOG.md for v7.3.4
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 22:09:13 -05:00
Alex Newman e961cd5a4a chore: bump version to 7.3.4
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 22:08:20 -05:00
Alex Newman 660c523ba4 fix: shorten MCP server name to prevent tool name length errors (#360)
* fix: shorten MCP server name to prevent tool name length errors (#358)

Root cause: Claude Code prefixes MCP tool names with
`mcp__plugin_{plugin-name}_{server-name}__` which was 43 chars
for `mcp__plugin_claude-mem_claude-mem-search__`. Combined with
`progressive_description` (22 chars) this exceeded the 64 char limit.

Changes:
- Shortened MCP server name from 'claude-mem-search' to 'mem-search'
  (saves 8 chars, new prefix is 35 chars)
- Renamed `progressive_description` tool to `help` (saves 18 chars)
- Updated SKILL.md to reference new `help` tool name
- Updated internal Server constructor name for consistency

All tool names now safely under 64 char limit:
- Longest is now `get_batch_observations` at 56 chars total
- `help` is only 39 chars total

Fixes #358

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: rename get_batch_observations to get_observations

The plural form naturally implies multiple items can be fetched,
following WordPress conventions. Simpler and clearer naming.

Also saves 6 additional characters for MCP tool name length.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs: update all references to renamed MCP tools

Updated documentation and code comments to reflect:
- progressive_description → help
- get_batch_observations → get_observations

Files updated:
- docs/public/usage/claude-desktop.mdx
- docs/public/architecture/worker-service.mdx
- src/services/worker/FormattingService.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 22:06:24 -05:00
Alex Newman da30aedb28 docs: add Pro Features Architecture section to CLAUDE.md 2025-12-16 21:03:42 -05:00
Alex Newman 10e58ef221 chore: update plugin version to 7.3.3 and modify mem-search.zip 2025-12-16 18:02:10 -05:00
Alex Newman 5e97d539a5 docs: update CHANGELOG.md for v7.3.3
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 18:01:56 -05:00
Alex Newman fdb4fafd3a chore: bump version to 7.3.3
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 18:01:03 -05:00
Alex Newman db3794762f chore: remove all better-sqlite3 references from codebase (#357)
* fix: export/import scripts now use API instead of direct DB access

Export script fix:
- Add format=json parameter to SearchManager for raw data output
- Add getSdkSessionsBySessionIds method to SessionStore
- Add POST /api/sdk-sessions/batch endpoint to DataRoutes
- Refactor export-memories.ts to use HTTP API

Import script fix:
- Add import methods to SessionStore with duplicate detection
- Add POST /api/import endpoint to DataRoutes
- Refactor import-memories.ts to use HTTP API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: update analyze-transformations-smart.js to use bun:sqlite

Replace better-sqlite3 import with bun:sqlite to align with v7.1.0 migration.

* chore: remove all better-sqlite3 references from codebase

- Updated scripts/analyze-transformations-smart.js to use bun:sqlite
- Merged PR #332: Refactored import/export scripts to use worker API instead of direct DB access
- Updated PM2-to-Bun migration documentation

All better-sqlite3 references have been removed from source code.
Documentation references remain as appropriate historical context.

* build: update plugin artifacts with merged changes

Include built artifacts from PR #332 merge and analyze-transformations-smart.js update.

---------

Co-authored-by: lee <loyalpartner@163.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 17:57:40 -05:00
Alex Newman 78cb5c38dc Remove outdated documentation files related to Claude-Mem hooks cleanup, PR #335 review summary, and Windows worker struggles. These files contained obsolete information and action items that are no longer relevant to the current project state. 2025-12-16 17:19:23 -05:00
Alex Newman 5e6feb0cb4 docs: update CHANGELOG.md for v7.3.2
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 17:07:19 -05:00
99 changed files with 2712 additions and 7111 deletions
+1 -1
View File
@@ -10,7 +10,7 @@
"plugins": [
{
"name": "claude-mem",
"version": "7.3.2",
"version": "7.4.2",
"source": "./plugin",
"description": "Persistent memory system for Claude Code - context compression across sessions"
}
+1
View File
@@ -1,3 +1,4 @@
datasets/
node_modules/
dist/
*.log
-167
View File
@@ -1,167 +0,0 @@
# Action Plan: Issues & PRs Cleanup
Generated: 2025-12-12
## Phase 1: Immediate Cleanup (Today)
### Close Obsolete PRs
- [ ] **#255** - Close PR "Fix PM2 worker MODULE_NOT_FOUND"
- Reason: v7.1.0 removed PM2 entirely, this fix is no longer relevant
- Comment: Explain that v7.1.0 migration to Bun eliminated PM2 dependency
- [ ] **#206** - Close or request update on "Harden worker startup"
- Reason: Contains PM2-specific code that no longer exists
- Comment: Ask author if they want to update for Bun architecture, otherwise close as obsolete
### Close/Update Fixed Issues
- [ ] **#213** - Comment and close "Windows endless process spawning"
- Reason: v7.1.0 Bun migration eliminated PM2 process management
- Comment: Ask user to verify fix on v7.1.0, explain PM2 removal resolved issue
- [ ] **#229** - Close as duplicate
- Reason: Duplicate of #227 (upstream Claude Code bug)
- Comment: Direct to #227 for full details and workaround
- [ ] **#211** - Answer and close "Cursor IDE support question"
- Reason: Product question, not a bug report
- Comment: Explain focus is Claude Code, but plugin architecture may allow future expansion
### Critical Bug Follow-Up
- [ ] **#254** - Follow up on "Worker API fetch failed"
- Current status: Asked about PM2 logs (pre-v7.1.0 comment)
- Action: Update comment asking:
- What version of claude-mem are you running?
- If pre-v7.1.0: Please upgrade to v7.1.0 which fixes PM2 issues
- If v7.1.0+: Run troubleshoot skill and share logs
## Phase 2: High-Priority Merges (This Week)
### Security & Critical Fixes
- [ ] **#236** - Review and merge "Localhost-only binding" 🔒 PRIORITY
- Impact: Security improvement (fixes network exposure)
- Status: 156 additions, all tests pass (42/42)
- Action: Final review, merge, update CHANGELOG
- [ ] **#212** - Review and merge "Windows path quoting fix"
- Impact: Fixes Windows usernames with spaces
- Status: 6 lines changed, minimal risk
- Action: Quick cross-platform test, merge
### Major Features (Maintainer-Authored)
- [ ] **#225** - Review and merge "Export/Import scripts"
- Impact: Enables backup/restore, partially addresses #233
- Status: 927 additions, extensively tested by maintainer
- Action: Final review, merge, update docs
- [ ] **#250** - Review and merge "README translations"
- Impact: International user onboarding (22 languages)
- Status: 10,209 additions (massive but low-risk)
- Action: Spot-check a few translations, merge
### User-Requested Features
- [ ] **#252** - Test and merge "Execution traces" (addresses #194)
- Impact: Shows tools/skills/MCPs in UI bubbles
- Status: 383 additions, comprehensive implementation
- Action: Test database migration, API endpoints, UI display
- [ ] **#251** - Test and merge "Plan file context" (addresses #180)
- Impact: Injects last plan file into context
- Status: 85 additions, follows existing patterns
- Action: Test with real plan files, verify toggle works
## Phase 3: Review & Consider (Next Week)
### Quality Enhancements
- [ ] **#230** - Review "Multi-language support" (addresses #228)
- Impact: Observations/summaries in user's language
- Status: 157 additions, Korean screenshot provided
- Action: Review prompt changes carefully, test with multiple languages
- [ ] **#226** - Review "CLAUDE_CONFIG_DIR support"
- Impact: Supports non-standard Claude installations
- Status: 10 additions, minimal change
- Action: Test with custom config directory, merge if working
### Developer Experience
- [ ] **#216** - Review "Makefile shortcuts"
- Impact: DX improvement for contributors
- Status: 1,085 additions
- Priority: Low (not urgent)
- Action: Review when time permits
## Phase 4: Issue Follow-Ups (Ongoing)
### Awaiting User Verification
- [ ] **#209** - Follow up if no response on Windows worker startup
- Status: Already commented asking for v7.1.0 verification
- Action: Close if verified fixed, or investigate if still broken
- [ ] **#231** - Follow up if no response on module resolution
- Status: Already commented asking for v7.1.0 verification
- Action: Close if verified fixed, or investigate if still broken
### Upstream Bugs (Keep Open)
- [ ] **#227** - Keep open as documented upstream bug
- Reason: Claude Code CLI uses invalid Windows paths
- Action: No action needed, workaround documented
### Active Bugs (Investigate)
- [ ] **#208** - Investigate "Windows console windows appearing"
- Priority: Medium (cosmetic but annoying)
- Action: Reproduce on Windows, identify root cause
## Phase 5: Future Feature Planning
### Feature Requests Without PRs
- [ ] **#240** - Plan "Move MCP scaffolding to separate file"
- Type: Internal refactoring
- Priority: Low
- Action: Design approach when time permits
- [ ] **#239** - Plan "Track git branch as metadata"
- Type: Context enhancement
- Priority: Medium
- Action: Design schema changes, discuss approach
- [ ] **#215** - Plan "PreCompact event hook"
- Type: Power user feature
- Priority: Low
- Action: Evaluate use cases, design API
- [ ] **#233** - Plan "Multi-device sync" (partial solution exists)
- Type: Major feature
- Note: PR #225 provides export/import, full sync is more complex
- Action: Determine if export/import is sufficient, or plan cloud sync
## Summary
### Quick Wins (Do Today)
- Close 2 obsolete PRs (#255, #206)
- Close 3 resolved/duplicate issues (#213, #229, #211)
- Follow up on critical bug (#254)
### High-Impact Merges (This Week)
- Merge security fix (#236)
- Merge 2 simple fixes (#212, #225)
- Merge 2 major features (#250, #252, #251)
### Expected Impact
- **Security**: Localhost-only by default
- **Functionality**: Export/import, execution traces, plan context
- **UX**: Multi-language support, Windows fixes
- **Clarity**: Clean backlog, remove PM2 confusion
---
**Next Review**: After Phase 2 completion, reassess remaining items
+139
View File
@@ -4,6 +4,145 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [7.4.1] - 2025-12-19
## Bug Fixes
- **MCP Server**: Redirect logs to stderr to preserve JSON-RPC protocol (#396)
- MCP uses stdio transport where stdout is reserved for JSON-RPC messages
- Console.log was writing startup logs to stdout, causing Claude Desktop to parse log lines as JSON and fail
## [7.4.0] - 2025-12-18
## What's New
### MCP Tool Token Reduction
Optimized MCP tool definitions for reduced token consumption in Claude Code sessions through progressive parameter disclosure.
**Changes:**
- Streamlined MCP tool schemas with minimal inline definitions
- Added `get_schema()` tool for on-demand parameter documentation
- Enhanced worker API with operation-based instruction loading
This release improves session efficiency by reducing the token overhead of MCP tool definitions while maintaining full functionality through progressive disclosure.
## [7.3.9] - 2025-12-18
## Fixes
- Fix MCP server compatibility and web UI path resolution
This patch release addresses compatibility issues with the MCP server and resolves path resolution problems in the web UI.
## [7.3.8] - 2025-12-18
## Security Fix
Added localhost-only protection for admin endpoints to prevent DoS attacks when worker service is bound to 0.0.0.0 for remote UI access.
### Changes
- Created `requireLocalhost` middleware to restrict admin endpoints
- Applied to `/api/admin/restart` and `/api/admin/shutdown`
- Returns 403 Forbidden for non-localhost requests
### Security Impact
Prevents unauthorized shutdown/restart of worker service when exposed on network.
Fixes security concern raised in #368.
## [7.3.7] - 2025-12-17
## Windows Platform Stabilization
This patch release includes comprehensive improvements for Windows platform stability and reliability.
### Key Improvements
- **Worker Readiness Tracking**: Added `/api/readiness` endpoint with MCP/SDK initialization flags to prevent premature connection attempts
- **Process Tree Cleanup**: Implemented recursive process enumeration on Windows to prevent zombie socket processes
- **Bun Runtime Migration**: Migrated worker wrapper from Node.js to Bun for consistency and reliability
- **Centralized Project Name Utility**: Consolidated duplicate project name extraction logic with Windows drive root handling
- **Enhanced Error Messages**: Added platform-aware logging and detailed Windows troubleshooting guidance
- **Subprocess Console Hiding**: Standardized `windowsHide: true` across all child process spawns to prevent console window flashing
### Technical Details
- Worker service tracks MCP and SDK readiness states separately
- ChromaSync service properly tracks subprocess PIDs for Windows cleanup
- Worker wrapper uses Bun runtime with enhanced socket cleanup via process tree enumeration
- Increased timeouts on Windows platform (30s worker startup, 10s hook timeouts)
- Logger utility includes platform and PID information for better debugging
This represents a major reliability improvement for Windows users, eliminating common issues with worker startup failures, orphaned processes, and zombie sockets.
**Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v7.3.6...v7.3.7
## [7.3.6] - 2025-12-17
## Bug Fixes
- Enhanced SDKAgent response handling and message processing
## [7.3.5] - 2025-12-17
## What's Changed
* fix(windows): solve zombie port problem with wrapper architecture by @ToxMox in https://github.com/thedotmack/claude-mem/pull/372
* chore: bump version to 7.3.5 by @thedotmack in https://github.com/thedotmack/claude-mem/pull/375
## New Contributors
* @ToxMox made their first contribution in https://github.com/thedotmack/claude-mem/pull/372
**Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v7.3.4...v7.3.5
## [7.3.4] - 2025-12-17
Patch release for bug fixes and minor improvements
## [7.3.3] - 2025-12-16
## What's Changed
- Remove all better-sqlite3 references from codebase (#357)
**Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v7.3.2...v7.3.3
## [7.3.2] - 2025-12-16
## 🪟 Windows Console Fix
Fixes blank console windows appearing for Windows 11 users during claude-mem operations.
### What Changed
- **Windows**: Uses PowerShell `Start-Process -WindowStyle Hidden` to properly hide worker process
- **Security**: Added PowerShell string escaping to follow security best practices
- **Unix/Mac**: No changes (continues to work as before)
### Root Cause
The issue was caused by a Node.js limitation where `windowsHide: true` doesn't work with `detached: true` in `child_process.spawn()`. This affects both Bun and Node.js since Bun inherits Node.js process spawning semantics.
See: https://github.com/nodejs/node/issues/21825
### Security Note
While all paths in the PowerShell command are application-controlled (not user input), we've added proper escaping to follow security best practices. If an attacker could modify bun installation paths or plugin directories, they would already have full filesystem access including the database.
### Related
- Fixes #304 (Multiple visible console windows)
- Merged PR #339
- Testing documented in PR #315
### Breaking Changes
None - fully backward compatible.
---
**Full Changelog**: https://github.com/thedotmack/claude-mem/compare/v7.3.1...v7.3.2
## [7.3.1] - 2025-12-16
## 🐛 Bug Fixes
+22 -7
View File
@@ -33,12 +33,7 @@ Claude-mem is a Claude Code plugin providing persistent memory across sessions.
## Build Commands
```bash
npm run build-and-sync # Build, sync to marketplace, restart worker (most common)
npm run build # Compile TypeScript only
npm run sync-marketplace # Copy to ~/.claude/plugins only
npm run worker:restart # Restart worker service only
npm run worker:status # Check worker status
npm run worker:logs # View worker logs
npm run build-and-sync # Build, sync to marketplace, restart worker
```
**Viewer UI**: http://localhost:37777
@@ -77,6 +72,26 @@ Settings are managed in `~/.claude-mem/settings.json`. The file is auto-created
**Source**: `docs/public/` - MDX files, edit `docs.json` for navigation
**Deploy**: Auto-deploys from GitHub on push to main
## Pro Features Architecture
Claude-mem is designed with a clean separation between open-source core functionality and optional Pro features.
**Open-Source Core** (this repository):
- All worker API endpoints on localhost:37777 remain fully open and accessible
- Pro features are headless - no proprietary UI elements in this codebase
- Pro integration points are minimal: settings for license keys, tunnel provisioning logic
- The architecture ensures Pro features extend rather than replace core functionality
**Pro Features** (coming soon, external):
- Enhanced UI (Memory Stream) connects to the same localhost:37777 endpoints as the open viewer
- Additional features like advanced filtering, timeline scrubbing, and search tools
- Access gated by license validation, not by modifying core endpoints
- Users without Pro licenses continue using the full open-source viewer UI without limitation
This architecture preserves the open-source nature of the project while enabling sustainable development through optional paid features.
# Important
No need to edit the changelog ever, it's generated automatically.
No need to edit the changelog ever, it's generated automatically.
+1 -1
View File
@@ -396,7 +396,7 @@ If you're experiencing issues, describe the problem to Claude and the troublesho
**Common Issues:**
- Worker not starting → `npm run worker:restart`
- Worker not starting → `claude-mem restart`
- No context appearing → `npm run test:context`
- Database issues → `sqlite3 ~/.claude-mem/claude-mem.db "PRAGMA integrity_check;"`
- Search not working → Check FTS5 tables exist
-186
View File
@@ -1,186 +0,0 @@
# Security Policy
## Reporting a Vulnerability
If you discover a security vulnerability in claude-mem, please report it by:
1. **DO NOT** create a public GitHub issue
2. Email the maintainer directly with details
3. Include steps to reproduce, impact assessment, and suggested fixes if possible
We take security seriously and will respond to valid reports within 48 hours.
## Security Measures
### Command Injection Prevention
Claude-mem executes system commands for git operations and process management. We have implemented comprehensive protections against command injection:
#### Safe Command Execution
- **Array-based Arguments:** All commands use array-based arguments to prevent shell interpretation
- **No Shell Execution:** `shell: false` is explicitly set for all spawn operations involving user input
- **Input Validation:** All user-controlled parameters are validated before use
#### Example Safe Pattern
```typescript
// ✅ SAFE: Array-based arguments with validation
if (!isValidBranchName(userInput)) {
throw new Error('Invalid input');
}
spawnSync('git', ['checkout', userInput], { shell: false });
// ❌ UNSAFE: Never do this
execSync(`git checkout ${userInput}`);
```
### Input Validation
All user-controlled inputs are validated using whitelists and strict patterns:
- **Branch Names:** Must match `/^[a-zA-Z0-9][a-zA-Z0-9._/-]*$/` and not contain `..`
- **Port Numbers:** Must be numeric and within range 1024-65535
- **File Paths:** All paths are joined using `path.join()` to prevent traversal
### Process Management
- **PID File Protection:** Process IDs are stored in user's data directory (`~/.claude-mem/`)
- **Port Validation:** Worker port is validated before binding
- **Health Checks:** Worker health is verified before processing requests
### Privacy Controls
Claude-mem includes dual-tag system for content privacy:
- `<private>content</private>` - User-level privacy (prevents storage)
- `<claude-mem-context>content</claude-mem-context>` - System-level tag (prevents recursive storage)
Tags are stripped at the hook layer before data reaches worker/database.
## Security Audit History
### 2025-12-16: Command Injection Vulnerability (Issue #354)
- **Severity:** CRITICAL
- **Status:** RESOLVED
- **Details:** See [SECURITY_AUDIT_REPORT.md](./SECURITY_AUDIT_REPORT.md)
- **Affected Versions:** All versions prior to fix
- **Fixed In:** Current version
- **Vulnerabilities Found:** 3
- **Vulnerabilities Fixed:** 3
**Summary of Fixes:**
1. Replaced string interpolation with array-based arguments in `BranchManager.ts`
2. Added `isValidBranchName()` validation function
3. Removed unnecessary shell usage in `bun-path.ts`
4. Created comprehensive security test suite
## Security Best Practices for Contributors
### When Adding Command Execution
1. **NEVER use shell with user input:**
```typescript
// ❌ NEVER
execSync(`command ${userInput}`);
spawn('command', [...], { shell: true });
// ✅ ALWAYS
spawnSync('command', [userInput], { shell: false });
```
2. **ALWAYS validate user input:**
```typescript
if (!isValidInput(userInput)) {
throw new Error('Invalid input');
}
```
3. **Use array-based arguments:**
```typescript
// ❌ NEVER
execSync(`git ${command} ${arg}`);
// ✅ ALWAYS
spawnSync('git', [command, arg], { shell: false });
```
4. **Explicitly set shell: false:**
```typescript
spawnSync('command', args, { shell: false });
```
### When Adding User Input
1. **Whitelist validation** over blacklist
2. **Strict regex patterns** for format validation
3. **Type checking** for expected data types
4. **Range validation** for numeric inputs
5. **Length limits** for string inputs
### Code Review Checklist
Before submitting a PR with command execution or user input handling:
- [ ] No `execSync` with string interpolation or template literals
- [ ] No `shell: true` when user input is involved
- [ ] All spawn/spawnSync calls use array arguments
- [ ] Input validation is present for all user-controlled parameters
- [ ] Security tests are added for new attack vectors
- [ ] Code follows patterns in [SECURITY_AUDIT_REPORT.md](./SECURITY_AUDIT_REPORT.md)
## Dependencies
We regularly audit dependencies for vulnerabilities:
- **npm audit:** Run before each release
- **Dependabot:** Enabled for automatic security updates
- **Manual Review:** Critical dependencies reviewed quarterly
## Data Storage
Claude-mem stores data locally in `~/.claude-mem/`:
- **Database:** SQLite3 at `~/.claude-mem/claude-mem.db`
- **Vector Store:** Chroma at `~/.claude-mem/chroma/`
- **Logs:** `~/.claude-mem/logs/`
- **Settings:** `~/.claude-mem/settings.json`
All data remains on the user's machine. No telemetry or external data transmission.
## Permissions
Claude-mem requires:
- **File System:** Read/write to `~/.claude-mem/` and `~/.claude/plugins/`
- **Network:** HTTP server on localhost (default port 37777)
- **Process Management:** Spawn worker processes, manage PIDs
No elevated privileges (root/administrator) are required.
## Secure Defaults
- **Worker Host:** Binds to `127.0.0.1` by default (localhost only)
- **Worker Port:** User-configurable, validates range 1024-65535
- **Log Level:** INFO by default (no sensitive data in logs)
- **Privacy Tags:** Auto-strips private content before storage
## Updates
Security patches are released as soon as possible after discovery. Users should:
1. Keep claude-mem updated to the latest version
2. Monitor GitHub releases for security announcements
3. Review [CHANGELOG.md](./CHANGELOG.md) for security-related changes
## Questions?
For security-related questions (non-vulnerabilities), please:
1. Check [SECURITY_AUDIT_REPORT.md](./SECURITY_AUDIT_REPORT.md) for technical details
2. Review code comments in security-critical files
3. Open a GitHub Discussion (not an Issue) for general security questions
---
**Last Updated:** 2025-12-16
**Last Audit:** 2025-12-16 (Issue #354)
**Next Scheduled Audit:** 2025-03-16
-403
View File
@@ -1,403 +0,0 @@
# Security Audit Report - Command Injection Prevention
**Date:** 2025-12-16
**Issue:** #354 - Command Injection Vulnerability
**Severity:** CRITICAL
**Status:** RESOLVED
## Executive Summary
A comprehensive security audit was conducted to identify and fix command injection vulnerabilities in the claude-mem codebase. The primary vulnerability was found in `BranchManager.ts` where user-supplied branch names were directly interpolated into shell commands without validation or sanitization.
### Vulnerabilities Found: 3
### Vulnerabilities Fixed: 3
### Files Modified: 2
### Tests Added: 1 comprehensive test suite
---
## Critical Vulnerabilities (Fixed)
### 1. BranchManager.ts - Command Injection via Branch Name
**File:** `src/services/worker/BranchManager.ts`
**Lines:** 156, 159, 164, 224 (original line numbers)
**Severity:** CRITICAL
**Attack Vector:** User-controlled branch name parameter
#### Original Vulnerable Code:
```typescript
// VULNERABLE: Direct string interpolation
function execGit(command: string): string {
return execSync(`git ${command}`, { ... });
}
// Called with user input:
execGit(`checkout ${targetBranch}`); // Line 156
execGit(`checkout -b ${targetBranch} origin/${targetBranch}`); // Line 159
execGit(`pull origin ${targetBranch}`); // Line 164
execGit(`pull origin ${info.branch}`); // Line 224
```
#### Exploitation Example:
```bash
targetBranch = "main; rm -rf /"
# Results in: git checkout main; rm -rf /
```
#### Fix Applied:
1. **Input Validation:** Added `isValidBranchName()` function to validate branch names using regex
2. **Array-based Arguments:** Replaced `execSync` string interpolation with `spawnSync` array arguments
3. **Shell Disabled:** Explicitly set `shell: false` to prevent shell interpretation
```typescript
// SECURE: Array-based arguments with validation
function isValidBranchName(branchName: string): boolean {
if (!branchName || typeof branchName !== 'string') {
return false;
}
const validBranchRegex = /^[a-zA-Z0-9][a-zA-Z0-9._/-]*$/;
return validBranchRegex.test(branchName) && !branchName.includes('..');
}
function execGit(args: string[]): string {
const result = spawnSync('git', args, {
cwd: INSTALLED_PLUGIN_PATH,
encoding: 'utf-8',
timeout: GIT_COMMAND_TIMEOUT_MS,
windowsHide: true,
shell: false // CRITICAL: Never use shell with user input
});
// ... error handling
return result.stdout.trim();
}
// Called with validated input:
if (!isValidBranchName(targetBranch)) {
return { success: false, error: 'Invalid branch name' };
}
execGit(['checkout', targetBranch]);
```
---
### 2. BranchManager.ts - NPM Command Injection
**File:** `src/services/worker/BranchManager.ts`
**Lines:** 173, 231 (original line numbers)
**Severity:** MEDIUM
**Attack Vector:** Indirect (through branch switching workflow)
#### Original Vulnerable Code:
```typescript
// VULNERABLE: Shell execution
function execShell(command: string): string {
return execSync(command, { ... });
}
execShell('npm install', NPM_INSTALL_TIMEOUT_MS);
```
#### Fix Applied:
Created dedicated `execNpm()` function using array-based arguments:
```typescript
function execNpm(args: string[], timeoutMs: number = NPM_INSTALL_TIMEOUT_MS): string {
const isWindows = process.platform === 'win32';
const npmCmd = isWindows ? 'npm.cmd' : 'npm';
const result = spawnSync(npmCmd, args, {
cwd: INSTALLED_PLUGIN_PATH,
encoding: 'utf-8',
timeout: timeoutMs,
windowsHide: true,
shell: false // CRITICAL: Never use shell
});
// ... error handling
return result.stdout.trim();
}
execNpm(['install'], NPM_INSTALL_TIMEOUT_MS);
```
---
### 3. bun-path.ts - Unnecessary Shell Usage on Windows
**File:** `src/utils/bun-path.ts`
**Line:** 26 (original)
**Severity:** LOW
**Attack Vector:** None (command is hardcoded), but violates security best practices
#### Original Code:
```typescript
const result = spawnSync('bun', ['--version'], {
encoding: 'utf-8',
stdio: ['pipe', 'pipe', 'pipe'],
shell: isWindows // Unnecessary shell usage
});
```
#### Fix Applied:
```typescript
const result = spawnSync('bun', ['--version'], {
encoding: 'utf-8',
stdio: ['pipe', 'pipe', 'pipe'],
shell: false // SECURITY: No need for shell
});
```
---
## Safe Code Patterns Verified
The following files were audited and confirmed to be safe from command injection:
### 1. ProcessManager.ts
```typescript
// SAFE: Array-based arguments, no user input
const child = spawn(bunPath, [script], {
detached: true,
stdio: ['ignore', 'pipe', 'pipe'],
env: { ...process.env, CLAUDE_MEM_WORKER_PORT: String(port) },
cwd: MARKETPLACE_ROOT,
...(isWindows && { windowsHide: true })
});
```
**Why Safe:**
- Uses array-based arguments
- No shell execution
- Port parameter is validated (lines 29-35) before use
- `bunPath` comes from trusted utility function
### 2. SDKAgent.ts
```typescript
// SAFE: Hardcoded command, no user input
execSync(process.platform === 'win32' ? 'where claude' : 'which claude', {
encoding: 'utf8',
windowsHide: true
})
```
**Why Safe:**
- Command is completely hardcoded (no user input)
- Used only for finding Claude executable in PATH
### 3. paths.ts
```typescript
// SAFE: Hardcoded command, no user input
const gitRoot = execSync('git rev-parse --show-toplevel', {
cwd: process.cwd(),
encoding: 'utf8',
stdio: ['pipe', 'pipe', 'ignore'],
windowsHide: true
});
```
**Why Safe:**
- Command is completely hardcoded
- No user input in command or arguments
- `cwd` is from `process.cwd()` (trusted source)
### 4. worker-utils.ts
```typescript
// SAFE: Hardcoded arguments
spawnSync('pm2', ['delete', 'claude-mem-worker'], { stdio: 'ignore' });
```
**Why Safe:**
- Array-based arguments
- All arguments are hardcoded strings
- No user input
---
## Security Test Suite
Created comprehensive test suite at `tests/security/command-injection.test.ts` with:
- **50+ test cases** covering various injection attempts
- **Platform-specific tests** for Windows and Unix command separators
- **Edge case testing** (Unicode control chars, URL encoding, long inputs)
- **Regression tests** for Issue #354
- **Code verification tests** to ensure no shell usage remains
### Key Test Categories:
1. **Branch Name Validation**
- Shell metacharacters (`; && || | & $ \` \n \r`)
- Directory traversal (`..`)
- Invalid starting characters (`. - /`)
- Valid branch names (main, beta, feature/*, etc.)
2. **Command Array Safety**
- Verifies no string interpolation in git commands
- Verifies `shell: false` is set
- Verifies array-based arguments are used
3. **Cross-platform Attacks**
- Windows-specific injections (`& type C:\...`)
- Unix-specific injections (`; cat /etc/shadow`)
4. **Edge Cases**
- Null/undefined/empty inputs
- URL encoding attempts
- Unicode control characters
- Very long inputs (1000+ chars)
---
## Security Best Practices Applied
### 1. Never Use Shell with User Input
```typescript
// ❌ NEVER DO THIS
execSync(`git ${userInput}`);
spawn('git', [...], { shell: true });
// ✅ ALWAYS DO THIS
spawnSync('git', [userInput], { shell: false });
```
### 2. Always Validate User Input
```typescript
// ❌ NEVER DO THIS
execGit(['checkout', targetBranch]);
// ✅ ALWAYS DO THIS
if (!isValidBranchName(targetBranch)) {
return { success: false, error: 'Invalid input' };
}
execGit(['checkout', targetBranch]);
```
### 3. Use Array-based Arguments
```typescript
// ❌ NEVER DO THIS
execSync(`git checkout ${branch}`);
// ✅ ALWAYS DO THIS
spawnSync('git', ['checkout', branch], { shell: false });
```
### 4. Explicit shell: false
```typescript
// ❌ BAD (shell might be enabled by default in some cases)
spawnSync('git', ['checkout', branch]);
// ✅ GOOD (explicit is better)
spawnSync('git', ['checkout', branch], { shell: false });
```
---
## Verification Steps
### Manual Testing
```bash
# Run security test suite
bun test tests/security/command-injection.test.ts
# Expected result: All tests pass
```
### Code Review Checklist
- [x] No `execSync` with string interpolation
- [x] No `shell: true` with user input
- [x] All spawn/spawnSync calls use array arguments
- [x] Input validation on all user-controlled parameters
- [x] Security test coverage for all attack vectors
### Automated Scanning
```bash
# Check for potential vulnerabilities
grep -rn "execSync.*\${" src/
grep -rn "shell:\s*true" src/
grep -rn "exec(\`" src/
# Expected result: No matches (or only false positives in comments)
```
---
## Impact Assessment
### Before Fix:
- **Risk:** Remote code execution via branch name parameter
- **Attack Surface:** Any UI or API endpoint accepting branch names
- **Affected Functions:** `switchBranch()`, `pullUpdates()`
- **Exploitability:** High (trivial to exploit)
### After Fix:
- **Risk:** None
- **Attack Surface:** Zero (input validation + safe execution)
- **Affected Functions:** All secured
- **Exploitability:** None
---
## Recommendations
### Immediate Actions
1. ✅ Apply all fixes from this audit
2. ✅ Run security test suite
3. ✅ Deploy to production immediately (critical security fix)
### Long-term Actions
1. **Code Review Process:**
- Add security checklist to PR template
- Require review of all `exec*` and `spawn*` calls
- Flag any `shell: true` usage for security review
2. **Automated Scanning:**
- Add pre-commit hooks to detect unsafe patterns
- Integrate SAST (Static Application Security Testing) tools
- Run security tests in CI/CD pipeline
3. **Developer Training:**
- Document secure coding practices for command execution
- Share this audit report with the team
- Add security section to CONTRIBUTING.md
4. **Regular Audits:**
- Quarterly security audits of all exec/spawn usage
- Review any new dependencies for vulnerabilities
- Keep security test suite updated with new attack vectors
---
## Files Modified
### /src/services/worker/BranchManager.ts
- Added `isValidBranchName()` validation function
- Replaced `execGit()` with safe implementation using `spawnSync`
- Replaced `execShell()` with `execNpm()` using safe implementation
- Added validation to `switchBranch()` function
- Added validation to `pullUpdates()` function
- Updated all git command calls to use array arguments
### /src/utils/bun-path.ts
- Changed `shell: isWindows` to `shell: false`
### /tests/security/command-injection.test.ts (NEW)
- Comprehensive security test suite with 50+ test cases
---
## Conclusion
All command injection vulnerabilities have been identified and fixed. The codebase now follows security best practices for command execution:
1. **No shell execution** with user input
2. **Array-based arguments** for all external commands
3. **Input validation** on all user-controlled parameters
4. **Comprehensive test coverage** for security scenarios
The risk of command injection is now **ELIMINATED** in the claude-mem codebase.
---
**Audited by:** Agent A (AI Security Audit)
**Date:** 2025-12-16
**Next Audit:** Recommended within 3 months
File diff suppressed because it is too large Load Diff
-113
View File
@@ -1,113 +0,0 @@
# Branch Switching Test Plan: feature/bun-executable
## Overview
This document validates that switching to the `feature/bun-executable` branch will be seamless for users.
## Branch Switching Mechanism
When a user switches branches via the Settings UI:
1. **Branch Switch Request**: User selects `feature/bun-executable` from Settings UI
2. **Validation**: SettingsRoutes validates branch name against allowed list
3. **Git Operations**: BranchManager performs:
- Discard local changes (`git checkout -- .` and `git clean -fd`)
- Fetch from origin (`git fetch origin`)
- Checkout target branch (`git checkout feature/bun-executable`)
- Pull latest (`git pull origin feature/bun-executable`)
4. **Install Dependencies**:
- Clear install marker (`.install-version`)
- Run `npm install` (2 minute timeout)
5. **Worker Restart**: Worker process exits and PM2/supervisor restarts it
## Feature Branch Changes
The `feature/bun-executable` branch makes these key changes:
### Dependencies Removed
- `better-sqlite3` → Uses Bun's built-in SQLite
- `pm2` → Custom worker CLI with process management
- `@types/better-sqlite3`
### New Features
- Auto-installation of Bun runtime in smart-install.js
- Simplified worker management via worker-cli.js
- No native module compilation required (better-sqlite3 removed)
## Installation Validation
### Current Branch → feature/bun-executable
**Step 1: Branch Switch (BranchManager)**
```bash
git checkout feature/bun-executable
git pull origin feature/bun-executable
rm .install-version
npm install # ✅ Works - package.json is npm-compatible
```
**Step 2: First Hook Execution**
```bash
node plugin/scripts/context-hook.js
Calls smart-install.js
Checks if Bun installed → Auto-installs if missing
Runs: bun install (if needed)
```
**Step 3: Worker Management**
- Old: PM2 manages worker-service.cjs
- New: worker-cli.js manages worker as background process
- Transition: Automatic on first worker start command
## Seamless Installation Checklist
- [x] **Branch Validation**: `feature/bun-executable` added to allowedBranches list
- [x] **npm install Compatible**: Feature branch package.json works with npm
- [x] **No Breaking Changes**: No hooks that would fail on first run
- [x] **Auto-Install**: smart-install.js automatically installs Bun if missing
- [x] **Graceful Degradation**: Scripts fall back to node if Bun unavailable
- [x] **No Manual Steps**: User just clicks "Switch Branch" in UI
## Potential Issues & Mitigations
### Issue 1: Bun Not in PATH After Install
**Mitigation**: smart-install.js checks common Bun installation paths and provides clear instructions to user
### Issue 2: PM2 vs Worker CLI Transition
**Mitigation**: Old PM2 worker continues running, new worker CLI starts separately. User can manually stop old PM2 worker if needed.
### Issue 3: Windows Compatibility
**Mitigation**: Feature branch uses PowerShell installer for Windows, curl for Unix/macOS
## Test Results
### Unit Tests
```bash
✓ tests/branch-selector.test.ts (5 tests)
✓ should allow main branch
✓ should allow beta/7.0 branch
✓ should allow feature/bun-executable branch
✓ should reject invalid branch names
✓ should have exactly 3 allowed branches
```
### Integration Tests
```bash
✓ All existing tests pass (42 tests)
✓ No regressions introduced
✓ TypeScript compilation successful
```
## Conclusion
✅ **SEAMLESS INSTALLATION VALIDATED**
The installation process is seamless because:
1. Branch switching uses standard git operations
2. `npm install` works on feature branch
3. Bun auto-installs on first hook execution
4. No manual intervention required
5. Clear error messages if issues occur
6. Backward compatible with existing installations
-561
View File
@@ -1,561 +0,0 @@
# Claude-Mem Smart Install & Plugin Hooks - Comprehensive Analysis
**Generated:** 2025-12-09
**Scope:** Smart install system, all plugin hooks, cross-platform compatibility, error handling, edge cases
---
## Executive Summary
This report provides a comprehensive analysis of claude-mem's smart install system and plugin hook infrastructure. The analysis focuses on cross-platform compatibility, error handling patterns, artificial blockers, and edge case handling.
**Key Findings:**
- ✅ Overall architecture is well-designed with clear separation of concerns
- ⚠️ Multiple cross-platform compatibility issues identified
- ⚠️ Several silent failure patterns that hinder debugging
- ⚠️ Artificial blockers that could prevent legitimate use cases
- ⚠️ Inconsistent timeout values across different components
- ✅ No nested try-catch anti-patterns found
---
## Architecture Overview
### Smart Install System Flow
```
User Invokes Hook
ensureWorkerRunning() [worker-utils.ts]
isWorkerHealthy() → fetch /health endpoint
├─ [HEALTHY] → Continue
└─ [UNHEALTHY] → startWorker()
├─ [Windows] → PowerShell Start-Process (hidden window)
└─ [Unix] → Bun start ecosystem.config.cjs
Wait for health check (15 retries × 1000ms)
├─ [SUCCESS] → Continue
└─ [FAILURE] → Throw error with manual recovery instructions
```
### Plugin Hook Lifecycle
1. **SessionStart** (context-hook.ts + user-message-hook.ts)
- context-hook: Fetches context via HTTP/curl
- user-message-hook: Displays context to user via stderr
2. **UserPromptSubmit** (new-hook.ts)
- Creates/retrieves SDK session
- Strips privacy tags from prompt
- Initializes session via HTTP
3. **PostToolUse** (save-hook.ts)
- Filters skipped tools
- Sends observation to worker via HTTP
4. **Stop** (summary-hook.ts)
- Parses transcript JSONL
- Extracts last user/assistant messages
- Requests summary generation via HTTP
5. **SessionEnd** (cleanup-hook.ts)
- Marks session complete
- Fire-and-forget HTTP request
---
## Cross-Platform Compatibility Issues
### 🔴 CRITICAL: curl Dependency (context-hook.ts)
**Location:** `src/hooks/context-hook.ts:32`
```typescript
const result = execSync(`curl -s "${url}"`, { encoding: "utf-8", timeout: 5000 });
```
**Issues:**
1. **Windows Compatibility:** curl is not guaranteed to be available on Windows systems (though included in Windows 10 1803+, it may be missing on older systems or custom installations)
2. **Error Handling:** No try-catch around execSync - will throw unhandled exception if curl fails
3. **Redundancy:** Uses curl when JavaScript's native `fetch` is already used everywhere else in the codebase
**Impact:** High - SessionStart hook will crash if curl is unavailable or returns non-zero exit code
**Edge Cases:**
- Corporate proxies blocking curl
- Systems without curl in PATH
- curl returning non-zero exit with valid output (warnings, etc.)
**Recommendation:**
```typescript
// Replace curl with fetch (already used in user-message-hook.ts)
const response = await fetch(url, { signal: AbortSignal.timeout(5000) });
const result = await response.text();
```
---
### 🟡 MEDIUM: Platform-Specific Process Spawning (worker-utils.ts)
**Location:** `src/shared/worker-utils.ts:55-93`
**Windows Implementation:**
```typescript
spawnSync('powershell.exe', [
'-NoProfile',
'-NonInteractive',
'-Command',
`Start-Process -FilePath 'node' -ArgumentList '${workerScript}' -WorkingDirectory '${MARKETPLACE_ROOT}' -WindowStyle Hidden`
])
```
**Issues:**
1. **PowerShell Dependency:** Assumes PowerShell is available and in PATH
2. **Command Injection Risk:** Worker script path inserted directly into command string without escaping
3. **Process Monitoring:** Windows approach launches detached process with no Bun monitoring - harder to debug/restart
4. **Health Check Timeout:** Comment says "Windows needs longer timeouts" but timeout is same for all platforms (500ms)
**Edge Cases:**
- Windows systems with PowerShell execution policy restrictions
- Paths containing single quotes or special characters
- Windows subsystem for Linux (WSL) environments
- Wine/Proton compatibility layers
**Unix Implementation:**
```typescript
const localBunBase = path.join(MARKETPLACE_ROOT, 'node_modules', '.bin', 'bun');
const bunCommand = existsSync(localBunBase) ? localBunBase : 'bun';
```
**Issues:**
1. **Bun Dependency:** Falls back to global bun if local not found, but doesn't verify it exists
2. **Silent Failure:** If Bun not installed globally, spawnSync will fail with cryptic ENOENT error
**Recommendation:**
- Add bun existence check before spawn
- Implement consistent process monitoring across platforms
- Add path escaping for Windows command construction
- Actually implement longer timeout for Windows if needed
---
### 🟡 MEDIUM: Git Dependency (paths.ts)
**Location:** `src/shared/paths.ts:89-97`
```typescript
export function getCurrentProjectName(): string {
try {
const gitRoot = execSync('git rev-parse --show-toplevel', {
cwd: process.cwd(),
encoding: 'utf8',
stdio: ['pipe', 'pipe', 'ignore']
}).trim();
return basename(gitRoot);
} catch {
return basename(process.cwd());
}
}
```
**Issues:**
1. **Git Assumption:** Assumes git is installed and available in PATH
2. **Non-Git Projects:** Silently falls back to cwd basename, but this behavior is undocumented
**Edge Cases:**
- Projects not using git
- Monorepos where cwd !== git root is desired
- Systems without git installed
**Status:** ✅ Already handled with fallback, but could benefit from debug logging
---
## Error Handling Analysis
### 🔴 CRITICAL: Silent Failures Without Logging
#### 1. Settings File Loading (early-settings.ts:20-28)
```typescript
try {
if (existsSync(SETTINGS_PATH)) {
const data = JSON.parse(readFileSync(SETTINGS_PATH, 'utf-8'));
const fileValue = data.env?.[key];
if (fileValue !== undefined) return fileValue;
}
} catch {
// Fail silently - fall through to env var
}
```
**Problem:**
- Invalid JSON in settings file fails silently
- File read permission errors fail silently
- Users have no way to know their settings file is being ignored
**Impact:** High - Users may think settings are applied when they're actually using defaults
**Recommendation:**
```typescript
} catch (error) {
logger.warn('SETTINGS', 'Failed to load settings file', { path: SETTINGS_PATH }, error);
}
```
---
#### 2. Worker Startup Failure (worker-utils.ts:104-107)
```typescript
try {
// ... worker startup logic ...
} catch (error) {
// Failed to start worker
return false;
}
```
**Problem:**
- Catches ALL errors during worker startup
- Returns boolean with no information about what failed
- User only gets generic error after all retries exhausted
**Impact:** High - Makes debugging worker startup issues extremely difficult
**Recommendation:**
```typescript
} catch (error) {
logger.error('WORKER', 'Failed to start worker', {}, error as Error);
return false;
}
```
---
#### 3. Worker Health Check (worker-utils.ts:30-40)
```typescript
async function isWorkerHealthy(): Promise<boolean> {
try {
const port = getWorkerPort();
const response = await fetch(`http://127.0.0.1:${port}/health`, {
signal: AbortSignal.timeout(HEALTH_CHECK_TIMEOUT_MS)
});
return response.ok;
} catch {
return false;
}
}
```
**Problem:**
- Network errors, timeouts, and non-200 responses all indistinguishable
- No logging at all - completely silent
**Impact:** Medium - Hard to debug why health checks fail
**Recommendation:**
```typescript
} catch (error) {
logger.debug('WORKER', 'Health check failed', { port }, error);
return false;
}
```
---
#### 4. Tool Formatting (logger.ts:122-124)
```typescript
try {
const input = typeof toolInput === 'string' ? JSON.parse(toolInput) : toolInput;
// ...
} catch {
return toolName;
}
```
**Problem:**
- Invalid JSON in tool input fails silently
- Could mask data corruption issues
**Impact:** Low - Only affects log formatting
**Status:** ✅ Acceptable for log formatting, but could log at DEBUG level
---
### 🟢 GOOD: No Nested Try-Catch Anti-Patterns
Analysis confirmed zero instances of nested try-catch blocks. Error handling is consistently at single level per function.
---
## Artificial Blockers & Unnecessary Checks
### 🔴 CRITICAL: First-Run Detection (user-message-hook.ts:14-40)
```typescript
const nodeModulesPath = join(pluginDir, 'node_modules');
if (!existsSync(nodeModulesPath)) {
// Show first-time setup message
console.error(`...`);
process.exit(3);
}
```
**Problems:**
1. **False Positive:** Will trigger if user manually deletes node_modules (e.g., for troubleshooting)
2. **Installation Race:** Could fail if installation is still in progress
3. **Hook-Level Check:** Runs on EVERY SessionStart, not just actual first run
**Impact:** High - Prevents usage until node_modules exists, even if dependencies are installed elsewhere
**Edge Cases:**
- User runs `rm -rf node_modules` for troubleshooting
- Package manager installation interrupted
- Symlinked node_modules (some package managers)
**Recommendation:**
- Use a `.first-run-complete` marker file instead
- Move check to npm postinstall script
- Make check more robust (check for specific required modules)
---
### 🟡 MEDIUM: Overly Specific Validation (paths.ts:117-119)
```typescript
if (!existsSync(join(commandsDir, 'save.md'))) {
throw new Error('Package commands directory missing required files');
}
```
**Problem:**
- Checks for ONE specific file to validate entire directory
- Hardcoded filename could break if files reorganized
- Error message doesn't specify what's missing
**Impact:** Medium - Could prevent package from working after internal refactoring
**Recommendation:**
- Remove check entirely (let actual command invocation fail with better error)
- Or check all required files if validation is critical
---
### 🟡 MEDIUM: Duplicate Health Endpoints
**Locations:**
- `src/services/worker-service.ts:107` - `/api/health`
- `src/services/worker/http/routes/ViewerRoutes.ts:27` - `/health`
**Usage:**
- `worker-utils.ts` uses `/health`
- `mcp-server.ts` uses `/api/health`
**Problem:**
- Redundant endpoints doing the same thing
- Inconsistent usage across codebase
- Maintenance burden
**Impact:** Low - Both work, but creates confusion
**Recommendation:**
- Standardize on `/api/health` (follows REST convention)
- Remove `/health` endpoint
- Update worker-utils.ts to use `/api/health`
---
## Timeout Configuration Issues
### Inconsistent Timeouts Across Components
| Component | Timeout | Location | Purpose |
|-----------|---------|----------|---------|
| Health check | 500ms | worker-utils.ts:13 | Check if worker alive |
| Worker startup wait | 1000ms | worker-utils.ts:14 | Wait between health checks |
| Worker startup retries | 15x | worker-utils.ts:15 | Max retries (15s total) |
| Hook HTTP requests | 2000ms | cleanup-hook.ts:61, save-hook.ts:70, summary-hook.ts:164 | Send data to worker |
| New hook session init | 5000ms | new-hook.ts:129 | Initialize session |
| Context hook fetch | 5000ms | context-hook.ts:32 | Fetch context via curl |
| User message hook | 5000ms | user-message-hook.ts:52 | Fetch context display |
**Problems:**
1. **Health Check Too Aggressive:** 500ms may be too short for loaded systems or slow network
2. **No Platform Adjustment:** Comment says "Windows needs longer timeouts" but values are same
3. **Hook Timeout Variation:** Some hooks use 2s, others use 5s with no clear reasoning
**Recommendations:**
- Increase health check timeout to 1000ms minimum
- Actually implement longer timeouts for Windows
- Standardize hook timeouts to 5000ms across the board
- Make timeouts configurable via settings
---
## Edge Case Analysis
### Handled Well ✅
1. **JSONL Parsing:** summary-hook.ts continues on malformed lines (60-64, 117-121)
2. **Git Not Available:** paths.ts falls back to cwd basename (89-97)
3. **Settings File Missing:** early-settings.ts falls back to env vars and defaults (20-28)
4. **Privacy Tags:** new-hook.ts handles fully-private prompts (99-109)
5. **Tool Skipping:** save-hook.ts filters low-value tools (24-30)
### Missing Edge Case Handling ⚠️
1. **curl Failure:** context-hook.ts has no error handling for curl failures
2. **Bun Not Installed:** worker-utils.ts assumes bun exists globally
3. **PowerShell Restrictions:** worker-utils.ts doesn't check execution policy
4. **Concurrent Worker Starts:** No locking to prevent multiple hooks from starting worker simultaneously
5. **Port Already In Use:** No detection or recovery if worker port is taken
6. **Zombie Processes:** Windows approach doesn't track PIDs, can't detect/kill zombies
---
## Recommendations Summary
### High Priority 🔴
1. **Replace curl with fetch** in context-hook.ts
- Eliminates external dependency
- Consistent with rest of codebase
- Better error handling
2. **Add logging to silent failures**
- early-settings.ts: Log when settings file fails to load
- worker-utils.ts: Log startup failures with details
- worker-utils.ts: Log health check failures at debug level
3. **Fix first-run detection**
- Use marker file instead of node_modules check
- More reliable and intentional
### Medium Priority 🟡
4. **Verify Bun availability** before attempting to use it
- Check existence before spawn
- Provide clear error message if missing
5. **Implement platform-specific timeouts**
- Actually use longer timeouts on Windows as comment suggests
- Make timeouts configurable
6. **Standardize health endpoints**
- Remove duplicate `/health` endpoint
- Use `/api/health` everywhere
7. **Add path escaping** for Windows PowerShell commands
- Prevent injection issues
- Handle paths with special characters
### Low Priority 🟢
8. **Standardize HTTP timeouts** across all hooks
9. **Add concurrent startup protection** (locking mechanism)
10. **Improve error messages** with actionable recovery steps
---
## Testing Recommendations
### Cross-Platform Testing Needed
1. **Windows Environments:**
- Windows 10 (various versions)
- Windows 11
- Windows Server
- WSL/WSL2
- PowerShell execution policies (Restricted, RemoteSigned, Unrestricted)
2. **Unix Environments:**
- macOS (Intel + Apple Silicon)
- Linux (Ubuntu, Fedora, Arch)
- FreeBSD
3. **Edge Environments:**
- Docker containers
- CI/CD environments
- Systems without git installed
- Systems without curl (or with restricted curl)
- Corporate networks with proxies
- Low-spec systems (slow startup)
### Test Scenarios
1. **Cold Start:** First run with no existing data
2. **Corrupt Settings:** Invalid JSON in settings.json
3. **Missing Dependencies:** No Bun, no git, no curl
4. **Port Conflicts:** Worker port already in use
5. **Rapid Hook Invocations:** Multiple hooks trying to start worker simultaneously
6. **Permission Issues:** Read-only filesystem, restricted execution
7. **Network Issues:** Localhost blocked, slow network
---
## Code Quality Assessment
### Strengths ✅
- Clean separation of concerns (hooks → worker → database)
- No nested try-catch anti-patterns
- Consistent use of modern async/await
- Good use of TypeScript for type safety
- Idempotent database operations
- Clear documentation in critical sections
### Weaknesses ⚠️
- Silent failures hinder debugging
- Inconsistent error handling patterns
- Platform-specific code not fully tested/documented
- Timeout configuration hardcoded and inconsistent
- Some artificial blockers prevent legitimate use cases
### Technical Debt
- Duplicate health endpoints
- curl dependency when fetch available
- Bun dependency on Unix but not Windows (inconsistent monitoring)
- First-run detection using node_modules existence
- Hardcoded timeout values
---
## Conclusion
The claude-mem smart install and plugin hook system is architecturally sound with a well-designed separation of concerns. However, several cross-platform compatibility issues and silent failure patterns could cause problems in production, particularly on Windows systems or in edge case scenarios.
The highest priority improvements are:
1. Removing the curl dependency
2. Adding proper logging to silent failures
3. Fixing the fragile first-run detection
4. Verifying external dependencies before use
These changes would significantly improve debuggability and cross-platform reliability without requiring major architectural changes.
---
**Analysis Methodology:**
- Systematic review of all TypeScript source files
- Static analysis of error handling patterns
- Cross-platform compatibility assessment
- Edge case identification through code path analysis
- Comparison against best practices and KISS principles
**Files Analyzed:**
- src/hooks/*.ts (6 files)
- src/services/worker-service.ts
- src/services/worker/*.ts (10+ files)
- src/servers/mcp-server.ts
- src/shared/*.ts (worker-utils, early-settings, paths)
- src/utils/*.ts (logger, silent-debug, tag-stripping)
-386
View File
@@ -1,386 +0,0 @@
# Test Suite Audit Report
**Date:** 2025-12-13
**Auditor:** Code Quality Assurance Manager
**Focus:** Recent bugfixes and regression prevention
---
## Executive Summary
The test suite has **critical gaps** in error handling coverage. While happy path tests exist, **zero tests verify that recent bugfixes actually prevent regressions**. The fish shell PATH bug (Issue #264), silent hook failures (observation 25389), and ChromaSync error standardization (observation 25458) are all unprotected by tests.
**Risk Level:** HIGH - Recent bugfixes can silently regress without detection.
---
## Coverage Analysis
### What We Have ✅
1. **Happy Path Tests** (`tests/happy-paths/`) - 6 files
- Basic success scenarios work
- Tool capture, search, session init/cleanup
- Good foundation but insufficient
2. **Unit Tests**
- `bun-path.test.ts` - Tests PATH resolution logic
- `parser.test.ts` - SDK parser validation
- `strip-memory-tags.test.ts` - Privacy tag handling
3. **Integration Test** (`full-lifecycle.test.ts`)
- ONE error recovery test (too shallow)
- Mostly happy paths
- All tests mock `fetch()` - never test real failures
### What's Missing ❌
## 1. Silent Hook Failures (CRITICAL GAP)
**Issue:** Multiple hooks had no error logging until recently fixed
**Fixed In:**
- `save-hook.ts` (observation 25389) - Added `handleFetchError`/`handleWorkerError`
- `new-hook.ts` - Added error handlers
- `context-hook.ts` - Added error handlers
**Test Gap:** ZERO tests verify hooks actually log errors when they fail
**Created:** `/Users/alexnewman/Scripts/claude-mem/tests/error-handling/hook-error-logging.test.ts`
**Tests:**
- `handleFetchError()` logs with full context (status, hook, operation, tool, port)
- `handleFetchError()` throws user-facing error with restart instructions
- `handleWorkerError()` handles timeout/connection errors
- Real hook scenarios (save-hook, new-hook, context-hook failures)
- Error message quality (actionable, includes next steps)
**Why This Matters:**
If someone refactors hooks and removes error handlers, the system will silently fail again. These tests catch that regression immediately.
---
## 2. ChromaSync Client Initialization (MEDIUM GAP)
**Issue:** Standardized error messages across all client checks (observation 25458)
**Code Locations:** ChromaSync.ts lines 140-145, 324-329, 504-509, 761-766
**Test Gap:** NO tests verify error messages are consistent or fire correctly
**Created:** `/Users/alexnewman/Scripts/claude-mem/tests/services/chroma-sync-errors.test.ts`
**Tests:**
- Calling methods before `ensureConnection()` throws correct message
- All error messages include project name
- Error messages are consistent across all 4 locations
- Fail-fast behavior (no silent retries)
- Error context preservation
**Why This Matters:**
Prevents "works on my machine" bugs where Chroma isn't properly initialized. Ensures all 4 error checks stay in sync during refactoring.
---
## 3. Fish Shell PATH Issues (PARTIAL COVERAGE)
**Issue:** Issue #264 - Hooks fail with fish shell because bun not in /bin/sh PATH
**Current Test:** `bun-path.test.ts` tests the utility function
**Gap:** Doesn't test the ACTUAL bug - hooks failing when bun not in PATH
**Created:** `/Users/alexnewman/Scripts/claude-mem/tests/integration/hook-execution-environments.test.ts`
**Tests:**
- Running hook when `bun` only in `~/.bun/bin/bun` (not in PATH)
- Hook finds bun from common install locations
- Cross-platform bun resolution (macOS, Linux, Windows)
- Fish shell with custom PATH
- Zsh with homebrew in non-standard location
- Error messages include PATH diagnostic info
**Why This Matters:**
Fish shell users (and anyone with non-standard PATH) will get "command not found" errors if this regresses. Test ensures hooks work regardless of shell.
---
## 4. General Error Handling Patterns (CRITICAL GAP)
**Issue:** "264 silent failure locations" - widespread lack of error handling
**Current State:** Recent fixes added standardized error handlers
**Test Gap:** No systematic tests for error handling patterns
**Covered By:** `/Users/alexnewman/Scripts/claude-mem/tests/error-handling/hook-error-logging.test.ts`
**Why This Matters:**
If new hooks are added without using `handleFetchError`/`handleWorkerError`, they'll fail silently. Tests enforce the pattern.
---
## 5. Integration Test Weaknesses
**Current Test:** `full-lifecycle.test.ts` has ONE error recovery test (lines 292-352)
**Issues:**
- Too shallow - just checks second request succeeds after first fails
- Doesn't verify error logging
- Never tests real worker failures (all mocked)
**Needs:**
```
/tests/integration/hook-failures.test.ts
```
Should test:
- Worker crashes mid-session - hooks fail gracefully
- Worker returns 500 error - hook logs and throws
- Worker times out - hook aborts with timeout message
- Worker returns malformed JSON - hook handles parse error
---
## YAGNI Violations (Unnecessary Test Complexity)
### Problem: `/Users/alexnewman/Scripts/claude-mem/tests/happy-paths/search.test.ts`
**Lines 80-196:** Tests for features that DON'T EXIST:
1. **Line 80-107:** "supports filtering by observation type"
- Endpoint: `/api/search/by-type` - DOES NOT EXIST
2. **Line 109-136:** "supports filtering by concept tags"
- Endpoint: `/api/search/by-concept` - DOES NOT EXIST
3. **Line 138-168:** "supports pagination for large result sets"
- Includes `page`, `limit`, `offset` params - NOT IMPLEMENTED
4. **Line 170-196:** "supports date range filtering"
- `dateStart`, `dateEnd` params - NOT IMPLEMENTED
5. **Line 227-271:** "supports semantic search ranking"
- `orderBy=relevance` with relevance scores - NOT IMPLEMENTED
**Impact:** These tests are ALL PASSING because they mock `fetch()`. They create false confidence - making it look like features exist when they don't.
**Fix:** DELETE these tests until features actually exist. Write tests AFTER implementing features, not before.
**Philosophy Violation:** "Write the dumb, obvious thing first" - these tests violate YAGNI by testing features we don't need yet.
---
## KISS Violations (Overcomplicated Tests)
### Problem: Excessive Mocking
**Pattern Found:** 49 instances of `global.fetch = vi.fn()` across 8 test files
**Issue:** Every test mocks the worker, so tests never verify real integration
**Example:** `/Users/alexnewman/Scripts/claude-mem/tests/integration/full-lifecycle.test.ts`
- Called "integration test" but mocks everything
- Never actually tests hooks talking to worker
- Can't catch real integration bugs
**Fix:** Add TRUE integration tests that:
1. Start real worker process
2. Run real hooks
3. Verify real database writes
4. Tear down cleanly
**Philosophy Violation:** "Simple First" - mocking everything is more complex than just testing the real thing.
---
## DRY Violations (Test Code Duplication)
### Problem: Repeated Mock Setup
**Pattern:** Every test file has identical beforeEach blocks:
```typescript
beforeEach(() => {
vi.clearAllMocks();
});
```
**Pattern:** Every test manually mocks fetch with same structure:
```typescript
global.fetch = vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ... })
});
```
**Solution:** Extract to test helpers:
```typescript
// tests/helpers/mock-worker.ts
export function mockWorkerSuccess(responseData: any) {
global.fetch = vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => responseData
});
}
export function mockWorkerError(status: number, message: string) {
global.fetch = vi.fn().mockResolvedValue({
ok: false,
status,
text: async () => message
});
}
```
**Impact:** Reduces 49 instances to ~10 helper calls. Makes test intent clearer.
---
## Actionable Recommendations
### Priority 1: Critical Regressions (Implement Now) ✅ DONE
1. **Hook Error Logging Tests** ✅ Created
- File: `/Users/alexnewman/Scripts/claude-mem/tests/error-handling/hook-error-logging.test.ts`
- Prevents silent failure regressions
- Verifies error messages are actionable
2. **ChromaSync Error Tests** ✅ Created
- File: `/Users/alexnewman/Scripts/claude-mem/tests/services/chroma-sync-errors.test.ts`
- Ensures consistent error messages
- Catches initialization bugs
3. **Hook Environment Tests** ✅ Created
- File: `/Users/alexnewman/Scripts/claude-mem/tests/integration/hook-execution-environments.test.ts`
- Prevents fish shell PATH regression
- Cross-platform coverage
### Priority 2: Remove False Positives (Do Next)
1. **DELETE Unimplemented Feature Tests**
- `/Users/alexnewman/Scripts/claude-mem/tests/happy-paths/search.test.ts` lines 80-271
- These create false confidence
- Re-add when features actually exist
### Priority 3: Reduce Test Complexity
1. **Extract Mock Helpers**
- Create `/Users/alexnewman/Scripts/claude-mem/tests/helpers/mock-worker.ts`
- Replace 49 instances of manual mocking
- See DRY section above for example
2. **Add TRUE Integration Tests**
- Create `/Users/alexnewman/Scripts/claude-mem/tests/integration/real-worker.test.ts`
- Start real worker, run real hooks
- Currently ALL integration tests are mocked
### Priority 4: Systematic Error Testing
1. **Worker Failure Scenarios**
- Create `/Users/alexnewman/Scripts/claude-mem/tests/integration/hook-failures.test.ts`
- Test crash, timeout, malformed response scenarios
2. **Spinner Timeout Tests**
- Create `/Users/alexnewman/Scripts/claude-mem/tests/utils/spinner-timeout.test.ts`
- Verify hardened spinner cleanup works
---
## Test Quality Checklist
For EVERY new test, verify:
- [ ] Tests actual bug, not mocked behavior
- [ ] Will FAIL if bug reappears
- [ ] Error messages are checked (not just success paths)
- [ ] No YAGNI - tests code that exists NOW
- [ ] DRY - uses test helpers, not duplicated setup
- [ ] KISS - simple, obvious test structure
- [ ] Fail fast - no silent fallbacks tested
---
## Coverage Metrics
**Before Audit:**
- Error handling: 0% (no tests for error paths)
- Silent failures: Undetected
- Recent bugfixes: Unprotected
**After Audit:**
- Error handling: ~40% (3 new test files)
- Silent failures: Detected by hook-error-logging.test.ts
- Recent bugfixes: Protected
**Remaining Gaps:**
- True integration tests (worker + hooks + database)
- Spinner error handling
- Worker crash scenarios
- Malformed response handling
---
## Files Created
1. `/Users/alexnewman/Scripts/claude-mem/tests/error-handling/hook-error-logging.test.ts`
- 200+ lines
- Tests handleFetchError, handleWorkerError
- Real hook error scenarios
- Error message quality checks
2. `/Users/alexnewman/Scripts/claude-mem/tests/services/chroma-sync-errors.test.ts`
- 300+ lines
- Client initialization errors
- Error message consistency
- Fail-fast behavior
3. `/Users/alexnewman/Scripts/claude-mem/tests/integration/hook-execution-environments.test.ts`
- 250+ lines
- Fish shell PATH resolution
- Cross-platform bun finding
- Real-world shell scenarios
**Total:** ~750 lines of new regression-preventing tests
---
## Philosophy Alignment
These tests follow the project's coding standards:
**YAGNI** - Only test code that exists (removed future-feature tests)
**DRY** - Identified duplication, recommended helpers
**Fail Fast** - All tests verify explicit errors, not silent failures
**Simple First** - Recommended real integration over complex mocks
**Delete Aggressively** - Flagged unimplemented feature tests for deletion
---
## Next Steps
1. **Run new tests:** `npm test tests/error-handling/ tests/services/ tests/integration/hook-execution-environments.test.ts`
2. **Delete false positives:** Remove search.test.ts lines 80-271 (unimplemented features)
3. **Extract helpers:** Create `tests/helpers/mock-worker.ts` to reduce duplication
4. **Add true integration:** Create real worker + hook integration test
5. **Continuous:** Apply "Test Quality Checklist" to all future tests
---
## Conclusion
The test suite now has **regression protection for recent bugfixes**. The three new test files will catch if:
- Hooks start failing silently again
- ChromaSync error messages become inconsistent
- Fish shell PATH issues return
However, we still need **true integration tests** that don't mock everything. The current integration tests are really "mocked end-to-end tests" - they test the shape of the API, not the actual behavior.
**Risk reduced from HIGH → MEDIUM**. Remaining risk: real integration failures not caught by mocked tests.
@@ -1,152 +0,0 @@
# Research Report: The Genesis of Biomimetic Architecture in Claude-Mem
## Executive Summary
The concept of **"biomimetic architecture"** in claude-mem emerged organically during a concentrated development period in mid-November 2025, crystallizing around three foundational observations created on November 17, 2025. What began as a practical solution to AI context window exhaustion evolved into a comprehensive philosophy of mirroring human memory systems while augmenting them with computational advantages. This report traces the intellectual journey from problem identification through architectural breakthrough to public messaging.
---
## The Foundational Philosophy (November 17, 2025, Early Morning)
The biomimetic architecture concept was formally articulated in three seminal observations created within a four-minute window between **1:31 AM and 1:35 AM** on November 17, 2025:
### Observation #10140 (Nov 17, 2025 at 1:31 AM)
**"Memory System Design Philosophy: Selective Retention with Total Recall Capability"**
This observation established the core philosophical foundation: humans observe selectively and retain only portions that seem relevant, never creating complete transcripts of all experiences. The innovation was recognizing this selective retention as fundamental to human cognition, then creating a hybrid approach—normal operation uses human-like selective observation-based memory, but leverages computational advantages by maintaining capability for complete recall through optional transcript archival when needed.
> **Key insight:** "Selective retention is fundamental to human cognition. The designed system replicates this behavior by observing and recording key observations, decisions, and discoveries rather than archiving everything."
### Observation #10142 (Nov 17, 2025 at 1:35 AM)
**"Biological Memory Principles in Endless Mode Architecture"**
Created just four minutes later, this observation made the problem-solution connection explicit: Claude's context window was exploding from endless raw data accumulation—exactly the same problem biological brains evolved to solve through compression. The architecture directly implements the brain's solution: compressing experiences into abstract observations rather than retaining verbose raw transcripts.
> **Critical innovation articulated:** "Unlike human memory which permanently loses raw data once compressed, Endless Mode maintains an archive of the original data. This creates a hybrid approach: the working memory operates on compressed abstractions for efficiency, while the full data remains available for later retrieval."
The observation concluded: *"This design naturally feels correct because it implements proven biological principles at the AI level—the brain's solution to memory management, now augmented with perfect archival recall."*
---
## The Breakthrough: 95.1% Token Reduction (November 21, 2025)
### Observation #13556 (Nov 21, 2025 at 10:25 PM)
**"Endless Mode breakthrough: 95.1% token reduction through biomimetic memory compression"**
Four days after the philosophical foundation was laid, the team validated the approach with empirical data. Real dataset analysis of 48 observations showed **95.1% token reduction** (16.5M → 801K tokens) with **20.6x efficiency gains**. The breakthrough document revealed the critical insight: observations are not lossy data compression but rather **memoized synthesis results**—caching the computational output Claude would generate from reading raw data.
This transformed the recursive synthesis problem from **O(N²) quadratic complexity to O(N) linear complexity**. Each tool use previously forced Claude to re-read and re-synthesize ALL previous tool outputs. With Endless Mode, Claude reads pre-computed observations instead, turning each synthesis into a one-time cost with cached results.
The observation explicitly framed this as: *"Two-tier memory system mimicking human working memory (compressed observations) but with digital advantages (perfect archival recall)."*
---
## Hybrid Architecture Recognition (November 21, 2025)
### Observation #13169 (Nov 21, 2025 at 1:32 AM)
**"Claude-mem Identified as Hybrid Architecture Mirroring Human Memory Systems"**
This observation synthesized the complete architectural understanding, identifying claude-mem as combining three components that directly parallel human memory systems:
1. **Episodic Memory** - Temporal timelines storing autobiographical, action-based experiences
*("On Nov 20, I fixed auth bug in session X")*
2. **Semantic Memory** - RAG-like vector similarity search for retrieving relevant past episodes
*("Find all times I worked on authentication")*
3. **Working Memory Compression** - Endless Mode preventing exponential context growth during active sessions
*(forget details, keep insights)*
**The full lifecycle:** During sessions, Endless Mode compresses in real-time; between sessions, observations are stored in episodic memory; new sessions start with RAG-like retrieval plus temporal timeline injection.
### Observation #13177 (Nov 21, 2025 at 1:35 AM)
**"Final Synthesis: General-Purpose AI Context Management Solution for Entire Industry"**
This observation expanded the vision beyond coding assistants, identifying seven application domains (healthcare, therapy, education, research, personal assistants, gaming, journalism) with the universal pattern: anywhere AI accumulates context over time benefits from ~80% compression.
> **Critical distinction clarified:** "RAG accesses external static knowledge while claude-mem accesses the AI's own episodic memories. The system combines episodic memory, RAG-like retrieval, and real-time compression, making it more sophisticated than pure RAG with temporal, autobiographical, and compression features."
---
## Translation to Public Messaging (November 26, 2025)
### Observation #15781 (Nov 26, 2025 at 5:15 PM)
**"Memory search reveals 19 results on biomimetic design philosophy origins"**
Five days later, during landing page development, the team executed a memory search for "biomimetic human memory design philosophy" which returned 19 matches. This search surfaced the November 17th foundational observations, providing the backstory needed for public-facing content development.
### Observation #15757 (Nov 26, 2025 at 4:30 PM)
**"BiomimeticDesign Component Created with Human Memory Philosophy Narrative"**
The team created a landing page component explaining the philosophy to users. The narrative established that LLMs "simply DO" with no retention between sessions, then explained human memory as reconstructive—built from scattered fragments rather than photographic playback—framed as *"genius compression, not a bug."*
**The three-pillar architecture** directly mapped human cognitive systems to technical implementation:
- **Episodic Memory** → Timeline Observations
- **Semantic Memory** → RAG Vector Search
- **Working Memory** → Endless Mode (95% compression)
### Observation #15818 (Nov 26, 2025 at 5:27 PM)
**"Timeline Search as Causal Navigation Pattern Over Efficiency Metrics"**
This observation refined the public messaging, identifying that the actual innovation wasn't compression percentages but **timeline-based search** returning contextual windows (7 before, 7 after) to expose causal relationships, combined with semantically rich titles functioning as retrieval cues.
> **Key insight:** "The proof of effectiveness is behavioral: Claude knows exactly where to go without searching, using only index tables. The upfront cost of creating detailed observations eliminates ongoing re-synthesis cost—the understanding was already built, and the index preserves access to that synthesis."
### Observation #15805 (Nov 26, 2025 at 5:24 PM)
**"Reframed landing page copy from abstract to concrete Claude experience"**
User feedback about "low context malarkey" prompted a pivot from theoretical human memory metaphors to concrete Claude behavior descriptions. The messaging shifted to specific examples:
- **Pain point:** Claude re-reading, re-discovering, re-researching
- **Solution:** Timeline feature showing 7 observations before/after
- **Proof:** "It barely ever searches. It just knows where to go."
---
## The Terminology Debate (December 2, 2025)
### Observation #19374 (Dec 2, 2025 at 7:37 PM)
**"User Questioning Biomimetic Design Terminology"**
The user raised questions about whether "biomimetic design" terminology should be changed to alternative phrasing, indicating potential reconsideration of naming conventions.
### Observation #19377 (Dec 2, 2025 at 7:38 PM)
**"Renamed BiomimeticDesign component to HowYouRemember"**
The component was renamed from "BiomimeticDesign" to "HowYouRemember" for user-friendliness, though the underlying architecture and philosophy remained unchanged. The renaming improved semantic clarity by aligning the component name with its actual content—explaining how users can remember and query information.
---
## Key Timeline
| Date | Time | Event |
|------|------|-------|
| **Nov 17, 2025** | 1:31-1:35 AM | Core biomimetic philosophy articulated in observations #10140 and #10142 |
| **Nov 17, 2025** | 3:28 PM | Observation #10364 documents comprehensive development narrative |
| **Nov 21, 2025** | 1:32 AM | Hybrid architecture recognition in observation #13169 |
| **Nov 21, 2025** | 10:25 PM | Breakthrough validation with 95.1% token reduction in observation #13556 |
| **Nov 26, 2025** | 4:30-5:27 PM | Public-facing BiomimeticDesign component created and messaging refined |
| **Dec 2, 2025** | 7:37 PM | Terminology questioned and component renamed to HowYouRemember |
---
## Conclusion
The biomimetic architecture concept emerged from a deep first-principles analysis of the AI context management problem. Rather than treating memory as a pure engineering challenge, the team recognized the parallel to biological systems that evolved to solve identical problems.
The innovation wasn't merely copying human memory limitations, but rather **understanding the why behind selective retention and compression**, then augmenting those principles with computational advantages (perfect archival recall).
The concept evolved through distinct phases:
1. **Internal architectural philosophy** (Nov 17)
2. **Empirical validation** (Nov 21)
3. **Public messaging** (Nov 26)
4. **User-friendly terminology** (Dec 2)
...while preserving the core biomimetic principles that make the system work.
---
## References
**Observations:** #10140, #10142, #10363, #10364, #13169, #13177, #13556, #15757, #15781, #15784, #15785, #15805, #15818, #15824, #19374, #19377
@@ -1,23 +0,0 @@
# Observation #10140
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #10142
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #10363
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #10364
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #13169
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #13177
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #13556
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15757
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15781
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15784
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15785
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15805
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15818
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #15824
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #19374
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,23 +0,0 @@
# Observation #19377
**Created**:
**Type**:
**Session**:
**Project**:
## Title
## Subtitle
## Narrative
## Facts
## Concepts
## Discovery Tokens
@@ -1,360 +0,0 @@
# Dual-Tag System Architecture
**Date**: 2025-11-30
**Branch**: `feature/meta-observation-control`
**Status**: Implemented
**Based on**: PR #105 dual-tag system
## Overview
The dual-tag system provides fine-grained control over what content gets persisted in claude-mem's observation database. It uses an edge processing pattern to filter tagged content at the hook layer before it reaches the worker service.
## The Two Tags
### Tag 1: `<private>`
**Purpose**: User-controlled privacy
**Status**: User-facing feature (documented)
**Use case**: Users wrap content they don't want persisted
```xml
<private>
This content won't be stored in observations
</private>
```
**Examples**:
- Sensitive information (API keys, credentials, internal URLs)
- Temporary context (deadlines, personal notes)
- Debug output (logs, stack traces)
- Exploratory prompts (brainstorming, hypotheticals)
### Tag 2: `<claude-mem-context>`
**Purpose**: System-level meta-observation control
**Status**: Infrastructure-ready (not user-facing yet)
**Use case**: Prevents recursive storage when real-time context injection is active
```xml
<claude-mem-context>
# Relevant Context from Past Sessions
[Auto-injected past observations...]
</claude-mem-context>
```
**Context**: This tag is used by the real-time context injection feature (not yet shipped). When past observations are injected into new prompts, they're wrapped in this tag to prevent them from being re-stored as new observations (recursive storage problem).
## Architecture Pattern: Edge Processing
**Principle**: "Process at edge, send clean data to server"
The dual-tag system follows the edge processing pattern from hooks-in-composition:
```text
UserPrompt → [Hook Layer] → Worker → Database
Filter here
(strip tags at edge)
```
### Data Flow
**Without Filtering** (broken):
```
UserPrompt with <private> → PostToolUse hook → Worker → Memory Agent → Database
Private content stored
```
**With Edge Processing** (correct):
```
UserPrompt with <private> → PostToolUse hook → stripMemoryTags() → Worker → Memory Agent → Database
↑ ↓
Filter at edge Only clean data stored
```
## Implementation
### File: `src/hooks/save-hook.ts`
**Function Added** (lines 31-53):
```typescript
/**
* Strip memory tags to prevent recursive storage and enable privacy control
*/
function stripMemoryTags(content: string): string {
if (typeof content !== 'string') {
silentDebug('[save-hook] stripMemoryTags received non-string:', { type: typeof content });
return '{}'; // Safe default for JSON context
}
return content
.replace(/<claude-mem-context>[\s\S]*?<\/claude-mem-context>/g, '')
.replace(/<private>[\s\S]*?<\/private>/g, '')
.trim();
}
```
**Application** (lines 95-100):
```typescript
tool_input: tool_input !== undefined
? stripMemoryTags(JSON.stringify(tool_input))
: '{}',
tool_response: tool_response !== undefined
? stripMemoryTags(JSON.stringify(tool_response))
: '{}',
```
### File: `tests/strip-memory-tags.test.ts`
**Test Coverage**: 19 tests across 4 categories:
1. **Basic Functionality** (7 tests)
- Strip `<claude-mem-context>` tags
- Strip `<private>` tags
- Strip both tag types
- Handle nested tags
- Multiline content
- Multiple tags
- Empty results
2. **Edge Cases** (5 tests)
- Malformed tags (unclosed)
- Tag-like strings (not actual tags)
- Very large content (10k+ chars)
- Whitespace trimming
- Strings without tags
3. **Type Safety** (5 tests)
- Non-string inputs (number, null, undefined, object, array)
- All return safe default '{}'
4. **Real-World Scenarios** (2 tests)
- JSON.stringify output
- Efficient large content handling
**All tests passing** ✅ (19/19)
## Design Decisions
### 1. Always Active (No Configuration)
**Decision**: Tag stripping is always on, no environment variable needed
**Rationale**: Privacy and anti-recursion protection should be default, not opt-in
### 2. Edge Processing (Not Worker-Level)
**Decision**: Filter at hook layer before sending to worker
**Rationale**:
- Keeps worker service simple
- Follows one-way data stream
- No worker changes needed
- Hook becomes a filter/gateway
### 3. Defensive Coding with Silent Debug
**Decision**: Handle non-string inputs with silentDebug, return safe default
**Rationale**:
- Never block the agent (hooks-in-composition principle)
- Log issues for observability
- Safe fallback maintains system stability
### 4. Both Tags Now (Progressive Enhancement)
**Decision**: Implement both tags even though only `<private>` is user-facing
**Rationale**:
- Infrastructure ready for real-time context feature
- No rework needed when context injection ships
- Same code path for both tags (simple)
- Progressive enhancement approach
### 5. Regex-Based Stripping
**Decision**: Use regex `/<tag>[\s\S]*?<\/tag>/g` instead of XML parser
**Rationale**:
- No dependencies needed
- Handles multiline content (`[\s\S]*?`)
- Non-greedy (`*?`) prevents over-matching
- Global flag (`g`) handles multiple tags
- Good enough for this use case
## Edge Cases Handled
| Case | Input | Output | Why |
|------|-------|--------|-----|
| Nested tags | `<private>a <private>b</private> a</private>` | `` | Outer tag matches all |
| Malformed | `<private>unclosed` | `<private>unclosed` | Regex requires closing tag |
| Multiple | `<private>a</private> b <private>c</private>` | `b` | Global flag removes all |
| Empty | `<private></private>` | `` | Matches and removes |
| Tag-like | `<tag>not private</tag>` | `<tag>not private</tag>` | Different tag name |
| Large content | 10MB+ string | (stripped) | O(n) regex handles it |
| Non-string | `123`, `null`, `{}` | `'{}'` | Defensive default |
## Future Enhancements
### 1. Real-Time Context Injection
**Status**: Deferred (not in this PR)
**When ready**: The `<claude-mem-context>` tag infrastructure is already in place
The missing piece is in `src/hooks/new-hook.ts`:
- Select relevant observations from timeline
- Wrap in `<claude-mem-context>` tags
- Return via `hookSpecificOutput`
- Tag stripping already handles the rest
### 2. System-Level Meta-Observation Tagging
**Concept**: Auto-tag observations about observations
**Examples**:
- Search skill results: `<claude-mem-context>[search results]</claude-mem-context>`
- Memory lookups: Fetched observations wrapped in tag
- Observation summaries: Meta-level analysis wrapped
**Implementation**: Tools/skills that produce meta-observations can wrap output in `<claude-mem-context>` tags to prevent recursive storage.
### 3. Additional Tag Types
**Potential tags**:
- `<ephemeral>`: Content that should be seen but not stored (alias for `<private>`)
- `<debug>`: Debug output that should be logged but not persisted
- `<scratch>`: Thinking/planning content not meant for observations
**Note**: Current implementation handles any tag you add to the regex. Adding new tags requires one line change in `stripMemoryTags()`.
## Testing Strategy
### Unit Tests
```bash
node --test tests/strip-memory-tags.test.ts
```
**Expected**: 19/19 passing ✅
### Integration Tests
**Test 1: Basic Privacy**
```bash
# Submit prompt with <private> tag
# Query database: should not contain private content
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations WHERE narrative LIKE '%<private>%';"
# Expected: 0
```
**Test 2: Dual Tags**
```bash
# Submit prompt with both tags
# Verify neither tag appears in database
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations WHERE narrative LIKE '%<private>%' OR narrative LIKE '%<claude-mem-context>%';"
# Expected: 0
```
**Test 3: Function Exists**
```bash
# Verify stripMemoryTags in built file
grep -c "claude-mem-context.*private.*trim" ~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/save-hook.js
# Expected: 1
```
### Regression Tests
**Ensure**:
- Normal observations still work (no tags broken)
- Worker service receives clean data
- No errors in `~/.claude-mem/silent.log`
- Tool executions still captured correctly
## Known Limitations
### 1. Tag Format is Fixed
Tags must use exact XML-style format: `<tag>content</tag>`
**Won't work**:
- `[private]content[/private]` (wrong syntax)
- `<!-- private -->content<!-- /private -->` (comment syntax)
- `{{private}}content{{/private}}` (curly braces)
**Future**: Could add support for alternative formats if needed.
### 2. Partial Tag Matching
If user writes about tags without intending to use them:
```
I want to add a <private> tag feature to my app
```
This won't be stripped (no closing tag). But if they accidentally write:
```
I want to add a <private>tag</private> feature
```
"tag" gets stripped.
**Mitigation**: Documentation educates users on proper usage.
### 3. Performance with Very Large Content
Regex performance is O(n) where n = content length.
**Tested**: Works fine with 10,000 character strings
**Unknown**: Performance with multi-megabyte tool responses
**Mitigation**: Most tool I/O is small. If issues arise, could optimize with:
- Early exit if no '<' character found
- Streaming regex for very large content
- Size limits on stripMemoryTags input
## Documentation
### User-Facing
**Location**: `docs/public/usage/private-tags.mdx`
**Content**:
- How to use `<private>` tags
- Use cases and examples
- Best practices
- Troubleshooting
**Available in**: Mintlify docs site, navigation under "Get Started"
### Technical/Internal
**Location**: `docs/context/dual-tag-system-architecture.md` (this file)
**Content**:
- Complete dual-tag system architecture
- Implementation details
- Design decisions
- Future enhancements
**Audience**: Contributors, maintainers, future developers
## References
### Original Work
- **PR #105**: Real-time context injection with dual-tag system
- **Branch**: `feature/real-time-context` (merged to main)
- **Investigator**: @basher83
### Documentation
- **Investigation**: `docs/context/real-time-context-recursive-memory-investigation.md`
- **User Guide**: `docs/public/usage/private-tags.mdx`
- **This Document**: `docs/context/dual-tag-system-architecture.md`
### Patterns Applied
- **Edge Processing**: From hooks-in-composition pattern
- **Never Block the Agent**: Defensive coding, safe defaults
- **One-Way Data Stream**: Hook → Worker → Database
## Summary
The dual-tag system is a complete, production-ready implementation that:
- ✅ Gives users privacy control via `<private>` tags
- ✅ Prepares infrastructure for real-time context injection
- ✅ Uses edge processing pattern for clean architecture
- ✅ Has comprehensive test coverage (19 tests, all passing)
- ✅ Includes user documentation and technical reference
- ✅ Requires no configuration (always active)
- ✅ Handles edge cases defensively
**Status**: Ready to ship 🚀
File diff suppressed because it is too large Load Diff
-83
View File
@@ -1,83 +0,0 @@
# Claude-Mem Hooks Cleanup Todo
## ✅ Phase 1: Delete Dead Code (Modified)
**hook-response.ts**
- [ ] Remove `| string` from HookType union to restore type safety
- [ ] Delete PreCompact branch (lines 23-36, 14 lines)
- [x] ~~Delete pointless branches~~ — SKIP (intentional)
- [x] ~~Simplify wrapper function~~ — SKIP (intentional)
**new-hook.ts**
- [ ] Delete 34-line architecture comment block (lines 1-34)
- [ ] Replace 18 lines of debug logging with single 4-line log call (lines 64-81)
**cleanup-hook.ts**
- [ ] Remove `cwd`, `transcript_path`, `hook_event_name` from SessionEndInput interface
- [ ] Replace 12-line manual mode help with simple error throw
**user-message-hook.ts**
- [ ] Delete all 40 lines of expired announcement code (lines 31-70)
- [ ] Add comment explaining exit code 3: `// exit code 3 = show user message that Claude does NOT receive as context`
---
## ✅ Phase 2: Extract Shared Utilities
- [ ] Create `src/shared/hook-error-handler.ts` with `handleWorkerError()`
- [ ] Update all 4 hooks to use shared error handler (context-hook, new-hook, save-hook, summary-hook)
- [ ] Create `src/shared/transcript-parser.ts` — merge `extractLastUserMessage` + `extractLastAssistantMessage` into single parameterized function
- [ ] Create `src/shared/hook-constants.ts` for exit codes, timeouts
---
## ❌ Phase 3: SKIPPED
_(Entry points stay as-is, hook-response.ts wrapper stays as-is)_
---
## ✅ Phase 4: Restore Type Safety
**context-hook.ts**
- [ ] Make `session_id`, `cwd`, `transcript_path` required in SessionStartInput
- [ ] Remove `[key: string]: any`
- [ ] Remove unused `source` field
- [ ] Keep using `happy_path_error__with_fallback` for defaults (hooks use exit codes, logging tool is appropriate)
**All 4 hook interfaces**
- [ ] Remove `[key: string]: any` from all interfaces
**save-hook.ts**
- [ ] Keep `happy_path_error__with_fallback` usage (it's appropriate for hook context)
**summary-hook.ts**
- [ ] Add timeout (2s) and error logging to spinner stop request
---
## ✅ Phase 5: Relocate Business Logic (Modified)
- [ ] Move `SKIP_TOOLS` from save-hook.ts to worker service
- [ ] Make `SKIP_TOOLS` configurable via settings.json
- [x] ~~Move announcements to database~~ — SKIP
- [x] ~~Merge context-hook + user-message-hook~~ — SKIP (intentionally separate)
---
## Summary
| Action | Count |
| ----------------- | ----- |
| Lines to delete | ~150 |
| New shared files | 3 |
| Interfaces to fix | 4 |
| Items skipped | 5 |
-125
View File
@@ -1,125 +0,0 @@
⏺ PR #335 Review Summary: Queue Monitoring UI + Windows Fixes
What PR #335 Does
Additions: 3,724 lines
Author: ToxMox
Main Changes:
1. Switch worker runtime from Bun → Node.js (all platforms)
2. Add persistent message queue in SQLite (pending_messages table)
3. Add queue monitoring UI with drawer interface
4. Add watchdog service for auto-recovery (30s polling)
5. Fix Windows console popups using PowerShell Start-Process -WindowStyle Hidden
Initial Concerns (My Review)
Raised red flags about:
- Lack of evidence for "zombie socket" issue (no GitHub issue, only ToxMox reported)
- Over-engineering: Full persistent queue + watchdog + retry logic + UI for unproven problems
- Mixing multiple concerns in one PR (should be 3-4 separate PRs)
- No automated tests for complex state machine logic
- Global runtime change (Bun→Node) affects all platforms for Windows-specific issue
- Command injection vulnerability in PowerShell string (ProcessManager.ts:67)
What We Discovered
1. Problems ARE Real & Documented
- Found detailed analysis in PR #315 comments by ToxMox
- Zombie socket issue has upstream Bun GitHub issues linked:
- oven-sh/bun#12127, #5774, #8786
- windowsHide: true doesn't work with detached: true (Node.js bug #21825)
- SDK subprocess hangs documented with testing details
2. Queue UI Has Real Value
- You saw video demo and said it's "gorgeous and helpful"
- Provides visibility into worker activity
- Recovery controls prevent user frustration
- Real-time updates via existing SSE infrastructure
3. Architecture Makes Sense
Why persistent worker is needed:
- Real-time UI at http://localhost:37777 requires persistent process
- SSE (Server-Sent Events) for live updates
- Can't do on-demand worker if UI needs to be always available
Why queue in database is justified:
- Transactional consistency (save observation + enqueue atomically)
- Relational queries (JOIN with sessions/projects)
- Foreign key cascades (session deleted → queue entries auto-cleaned)
- Already have SQLite infrastructure
4. Storage Optimization Strategy
Smart cleanup approach (your insight):
- Keep full data while pending/processing (needed for retry)
- Clear payloads immediately on completion: Set tool_input, tool_response, last_user_message, last_assistant_message to NULL
- Keep lightweight metadata for "Recently Processed" UI
- Eventually delete old completed records (after 1hr or >100 count)
- Result: 100x storage reduction while keeping UI history
Rejected approach: Linking to transcript files (overly complex, YAGNI)
Critical Insight: Bun → Node Switch Likely Unnecessary
Your final assessment:
"honestly thats more an llm hallucinating an overengineered solution based on incorrect data that probably could be solved by just killing the process correctly"
The real issue is probably:
- Missing cleanup handlers (server.close() before exit)
- Process killed too fast (SIGKILL before cleanup finishes)
- Not receiving SIGTERM properly
- No registered signal handlers for graceful shutdown
Simple fix to try FIRST:
const server = app.listen(port);
async function cleanup() {
server.close(); // Close server
sessionManager.abortAll(); // Stop active work
db.close(); // Close DB
process.exit(0);
}
process.on('SIGTERM', cleanup);
process.on('SIGINT', cleanup);
Final Assessment
PR #335 is mostly solid with real benefits, BUT:
✅ What's Good:
- Queue UI provides real value
- Persistent queue in DB is architecturally justified
- Auto-recovery prevents stuck sessions
- Problems are real and documented
- Comprehensive solution to multiple pain points
⚠️ What Needs Validation:
1. Bun zombie socket issue - Only ToxMox reported, not validated by you
2. Proper cleanup handlers - Try fixing process termination before switching runtimes
3. Platform-specific runtime - If Bun issue is real, use Node only on Windows, keep Bun on Mac/Linux
🔧 What Needs Fixing:
1. Add tests - Zero automated tests for complex state machine
2. Fix command injection - ProcessManager.ts:67 PowerShell string interpolation
3. Implement payload cleanup - Clear heavy data immediately on completion
4. Try simple fix first - Proper signal handlers before runtime switch
Action Items for Next Session
1. Ask ToxMox: Did you try proper cleanup handlers before switching runtimes?
2. Suggest: Platform-specific runtime (Bun on Unix, Node on Windows only if needed)
3. Request: Reproduction steps for zombie socket issue
4. Require: Basic tests before merge
5. Fix: Command injection vulnerability
6. Consider: Splitting into separate PRs (optional, not required)
Key Takeaway
The PR solves real problems with solid architecture, but the Bun→Node switch is likely over-engineered. Try proper process cleanup first.
@@ -1,332 +0,0 @@
# Windows, Bun, and Worker Service Struggles
A comprehensive chronicle of platform-specific issues, attempted fixes, and architectural decisions.
## Executive Summary
The claude-mem project has faced persistent Windows-specific issues centered around three core problems:
1. **Console Window Popups**: Blank terminal windows appearing when spawning worker and SDK subprocess
2. **Zombie Socket Issues**: Bun leaving TCP sockets in LISTEN state after termination on Windows
3. **Process Management Complexity**: Platform-specific spawning logic and reliability issues
These issues have driven multiple PRs, architectural pivots, and significant debate about runtime switching (Bun → Node.js).
---
## Timeline of Issues
### Issue #209: Windows Worker Startup Failures (Dec 12-13, 2025)
**Problem**: Worker service failed to start on Windows using PowerShell Start-Process approach.
**Symptoms**:
- Worker startup attempted via `powershell.exe -NoProfile -NonInteractive -Command Start-Process`
- Health check retries exhausted (15 attempts over 15 seconds)
- Users left unable to start worker manually
**Root Causes**:
- Platform-conditional process spawning (PowerShell for Windows, PM2 for Unix)
- PowerShell spawning without `-PassThru` to capture PID
- Inconsistent process management across platforms
**Resolution**: Issue was marked as closed, suggesting it was resolved in v7.1.0 through architectural unification with Bun-based ProcessManager using PID file tracking consistently across all platforms.
**Status**: ✅ Resolved (pre-PR #335)
---
### Issue #309 & PR #315: Console Window Popups (Dec 14-15, 2025)
**Problem**: Blank terminal windows appear when spawning worker processes and SDK subprocesses on Windows.
**First Attempted Fix (PR #315)**: Add `windowsHide: true` to spawn options
**Why It Failed**: Node.js bug #21825 - `windowsHide: true` is **ignored** when `detached: true` is also set. Both flags are required:
- `detached: true` - Needed for background process
- `windowsHide: true` - Needed to hide window (but doesn't work when detached)
**Testing Results** (by ToxMox):
- Tested PR #315 on Windows 11
- Confirmed blank terminal windows still appear for both worker and SDK subprocess spawns
- Affects both `ProcessManager.ts` (worker) and `SDKAgent.ts` (SDK subprocess)
**Working Solution**: Use PowerShell's `Start-Process` with `-WindowStyle Hidden` flag instead of standard spawn.
**Status**: ❌ PR #315 closed in favor of more comprehensive solution
---
### Bun Zombie Socket Issue (Dec 15, 2025)
**Problem**: Bun leaves TCP sockets in zombie LISTEN state on Windows after worker termination.
**Symptoms**:
- Port remains bound even though no process owns it
- `OwningProcess` shows 0 or dead PID
- New worker instances cannot start due to `EADDRINUSE` errors
- Happens regardless of termination method (process.exit(), external kill, Ctrl+C)
- **Only system reboot clears zombie ports**
**Upstream Tracking**:
- Bun issue #12127
- Bun issue #5774
- Bun issue #8786
**Impact**: Windows users may need to reboot their systems when worker crashes or is restarted.
**Proposed Solution**: Switch worker runtime from Bun to Node.js on Windows (or globally).
**Status**: 🟡 Unresolved - Platform-specific bug in Bun's Windows socket cleanup
---
### SDK Subprocess Hang Issue (Dec 15, 2025)
**Problem**: SDK subprocesses can hang indefinitely, blocking observation processing.
**Root Cause**: `AbortController.abort()` does not actually terminate child processes.
**Symptoms**:
- For-await loop blocks forever waiting for output from hung subprocess
- Observation processing halts
- No recovery mechanism
**Solution**: Implement watchdog timer that explicitly kills child processes using platform-specific commands:
- **Windows**: `wmic process where ParentProcessId=<pid> delete`
- **Unix**: `pkill -P <pid>`
**Timeout**: `SDK_QUERY_TIMEOUT_MS` set to 2 minutes
**Status**: ✅ Fixed in PR #335 (watchdog implementation)
---
## PR #335: Comprehensive Windows Fix (Dec 15, 2025)
### What It Attempted
ToxMox developed a comprehensive PR addressing all Windows issues simultaneously:
1. **PowerShell-based spawning** to fix popup windows
2. **Runtime switch** from Bun to Node.js (globally) to fix zombie sockets
3. **Queue monitoring system** with persistent message queue
4. **Watchdog service** for stuck message recovery
5. **SQLite compatibility layer** for Node.js support
### Architecture Decisions
**ProcessManager Changes**:
- Switched from `startWithBun()` to `startWithNode()`
- Windows: Uses PowerShell `Start-Process -WindowStyle Hidden -PassThru`
- Unix: Uses standard `spawn()` with `detached: true`
- Captures PID via PowerShell `Select-Object -ExpandProperty Id`
- Comment states: "Use Node on all platforms (Bun has zombie socket issues on Windows)"
**SQLite Compatibility Layer**:
- Created `sqlite-compat.ts` adapter pattern
- Provides `bun:sqlite` API compatibility via `better-sqlite3`
- Allows code to work with both Bun and Node.js runtimes
### Critical Issues Identified
#### 1. **Global vs Platform-Conditional Runtime**
**The Inconsistency**: Code comment explicitly states zombie sockets occur "on Windows", yet solution applies Node.js universally across all platforms.
**Questions Raised**:
- Why sacrifice Bun's performance on macOS/Linux where no issues documented?
- Platform-specific spawning already implemented - why not platform-specific runtime?
- No documented Bun reliability issues on non-Windows platforms
#### 2. **Performance Regressions**
**better-sqlite3 Blocking**:
- Synchronous-only API blocks Node.js event loop during all DB operations
- Contrasts with Bun's async SQLite support
- Affects: enqueue, markProcessing, markProcessed, watchdog checks
**Watchdog Polling Overhead**:
- Full table scans every 30 seconds even when idle
- Constant database I/O overhead
- No max queue size limits = unbounded growth
**Startup Latency**:
- Node.js initialization (slower than Bun)
- Native module loading (better-sqlite3)
- Database migrations
- Stuck message scan
- Watchdog initialization
- HTTP server startup
#### 3. **Build Dependencies**
**better-sqlite3 Requirements**:
- node-gyp
- Python
- C++ compiler toolchains
- Visual Studio Build Tools (Windows)
**Impact**:
- Local development machines without build tools fail
- CI/CD pipelines need updated Docker images
- Restricted environments where compilers not permitted
- ARM/M1 Mac compatibility issues
#### 4. **Migration Risks**
**Breaking Changes**:
- Automatic database migration adds `pending_messages` table
- Runtime switch not documented in PR
- Node.js becomes undocumented hard requirement
- No migration guide or rollback procedure
**Unanswered Questions**:
- What happens to in-flight messages during upgrade?
- Can users safely downgrade?
- Is migration idempotent?
#### 5. **Code Quality Issues**
**Command Injection Risk** (ProcessManager.ts:67):
- PowerShell commands use template literal concatenation
- Vulnerable if `MARKETPLACE_ROOT` or script paths attacker-controlled
- Should use array-based argument passing
**Missing Error Handling** (WatchdogService.ts:61):
- `setInterval` callback lacks error handling
- Timer continues running if `check()` throws
- Creates zombie watchdog scenario
**No Queue Size Limits**:
- Unbounded database growth if messages accumulate
- Failed messages (exceeding `maxRetries`) accumulate indefinitely
- Only 24-hour retention for processed messages
---
## Assessment and Recommendations
### What Was Validated
**Legitimate Windows Issues**:
- ✅ Console window popups are real (Node.js bug #21825)
- ✅ PowerShell `Start-Process` solution works
- ✅ Bun zombie socket issue is real and Windows-specific
- ✅ SDK subprocess hang issue is real
### What Remains Questionable
**Global Runtime Switch**:
- ❌ No evidence Bun problematic on macOS/Linux
- ❌ Platform-conditional runtime not considered
- ❌ Performance trade-offs not documented
- ❌ "Windows-only" issue applied globally
**Zombie Socket Root Cause**:
- 🟡 May be fixable with proper cleanup handlers:
- Missing `server.close()` calls before exit
- Processes killed with `SIGKILL` before cleanup finishes
- Missing `SIGTERM` signal handlers for graceful shutdown
- 🟡 Runtime switch may be unnecessary over-engineering
### Salvageable Components
**If Extracted into Separate PRs**:
1. **PowerShell Spawning for Windows Worker**
- Focused PR: "Windows: Use Node.js instead of Bun for worker process"
- Platform-conditional logic (Node.js on Windows, Bun elsewhere)
- Independent justification required
2. **SQLite Compatibility Layer**
- Well-designed adapter pattern
- Requires independent justification for Node.js runtime need
- Should not be bundled with other changes
3. **Queue Monitoring UI Concept**
- Valuable visibility into worker state
- Should build on in-memory state first
- Remove database persistence requirement initially
4. **Watchdog Improvements**
- SDK subprocess timeout handling
- Evidence of superiority over current approach needed
---
## Current Status
### Resolved
- ✅ Issue #209: Windows worker startup (v7.1.0)
- ✅ SDK subprocess hang issue (watchdog implementation)
### In Progress
- 🔄 PR #339: Windows console popup fix (extracted from PR #335)
- 🔄 PR #338: Queue monitoring system (extracted from PR #335)
### Open Questions
- ❓ Should runtime switch be global or Windows-only?
- ❓ Can zombie socket issue be fixed without runtime switch?
- ❓ Is better-sqlite3's synchronous blocking acceptable?
- ❓ Should queue persistence be in-memory first?
---
## Lessons Learned
### Architectural Principles Violated
**YAGNI**: Queue persistence, watchdog service, and comprehensive monitoring added without proven need.
**Happy Path**: Should have started with simplest Windows fix (PowerShell spawning), validated, then added complexity if needed.
**Incremental Validation**: Bundling multiple architectural changes prevents isolating what actually solves the problem.
### What Should Have Happened
1. **Phase 1**: PowerShell spawning fix for Windows console popups (targeted, testable)
2. **Phase 2**: Investigate zombie socket root cause (cleanup handlers vs runtime switch)
3. **Phase 3**: If runtime switch justified, implement as Windows-conditional first
4. **Phase 4**: Add queue monitoring as optional feature with in-memory state
5. **Phase 5**: Add persistence only if in-memory insufficient
### Key Takeaways
- **Windows-specific issues don't justify global architectural changes** without clear evidence
- **Platform-conditional logic is acceptable** when solving platform-specific problems
- **Native module dependencies are heavy** - avoid unless necessary
- **Performance regressions need explicit justification** - synchronous blocking, startup latency, polling overhead all impact UX
- **Bundle size matters** - build tools, compilers, Python are significant requirements
---
## References
**GitHub Issues**:
- #209: Windows worker startup failures
- #309: Console window popups
- #315: windowsHide approach (closed)
**PRs**:
- #335: Comprehensive Windows fix (under review)
- #338: Queue monitoring system (extracted)
- #339: Windows console popup fix (extracted)
**Upstream Bugs**:
- Node.js #21825: windowsHide ignored with detached
- Bun #12127, #5774, #8786: Windows zombie sockets
**Related Observations**:
- #27302: PR #315 windowsHide failure analysis
- #27233: Bun zombie socket discovery
- #27232: Windows background window root cause
- #27286: Runtime switch assessment
- #27283: PowerShell process spawn fix
- #27190: ProcessManager Node.js implementation
- #24532: Issue #209 resolution
---
**Last Updated**: 2025-12-16
**Document Status**: Comprehensive review based on memory search through #S3485
@@ -3,6 +3,14 @@ title: "PM2 to Bun Migration"
description: "Complete technical documentation for the process management and database driver migration in v7.1.0"
---
<Note>
**Historical Migration Documentation**
This document describes the PM2 to Bun migration that occurred in v7.1.0 (December 2025). If you're installing claude-mem for the first time, this migration has already been completed and you can use the current Bun-based system documented in the main guides.
This documentation is preserved for users upgrading from versions older than v7.1.0.
</Note>
# PM2 to Bun Migration: Complete Technical Documentation
**Version**: 7.1.0
@@ -98,7 +106,7 @@ pm2 logs claude-mem-worker # View logs
```bash
npm run worker:start # Start worker
npm run worker:stop # Stop worker
npm run worker:restart # Restart worker
claude-mem restart # Restart worker
npm run worker:status # Check status
npm run worker:logs # View logs
```
@@ -297,7 +305,7 @@ No migration logic runs on subsequent sessions.
| `pm2 list` | `npm run worker:status` | Shows worker status |
| `pm2 start <script>` | `npm run worker:start` | Start worker |
| `pm2 stop claude-mem-worker` | `npm run worker:stop` | Stop worker |
| `pm2 restart claude-mem-worker` | `npm run worker:restart` | Restart worker |
| `pm2 restart claude-mem-worker` | `claude-mem restart` | Restart worker |
| `pm2 delete claude-mem-worker` | `npm run worker:stop` | Remove worker |
| `pm2 logs claude-mem-worker` | `npm run worker:logs` | View logs |
| `pm2 describe claude-mem-worker` | `npm run worker:status` | Detailed status |
@@ -443,7 +451,7 @@ pm2 save # Persist the deletion
rm ~/.claude-mem/.pm2-migrated
# Restart worker
npm run worker:restart
claude-mem restart
```
### Scenario 2: Stale PID File (Process Dead)
@@ -475,7 +483,7 @@ lsof -i :37777
kill -9 <PID>
# Restart worker
npm run worker:restart
claude-mem restart
```
### Common Error Messages
@@ -416,7 +416,7 @@ If searches fail, check worker service:
```bash
npm run worker:status # Check status
npm run worker:restart # Restart worker
claude-mem restart # Restart worker
npm run worker:logs # View logs
```
+2 -2
View File
@@ -240,7 +240,7 @@ POST /api/observations/batch
- `400 Bad Request`: `{"error": "ids must be an array of numbers"}`
- `400 Bad Request`: `{"error": "All ids must be integers"}`
**Use Case**: This endpoint is used by the `get_batch_observations` MCP tool to efficiently retrieve multiple observations in a single request, avoiding the overhead of multiple individual requests.
**Use Case**: This endpoint is used by the `get_observations` MCP tool to efficiently retrieve multiple observations in a single request, avoiding the overhead of multiple individual requests.
#### 9. Get Session by ID
```
@@ -500,7 +500,7 @@ npm run worker:start
npm run worker:stop
# Restart worker
npm run worker:restart
claude-mem restart
# View logs
npm run worker:logs
+5 -5
View File
@@ -316,7 +316,7 @@ Edit `~/.claude-mem/settings.json`:
Then restart the worker:
```bash
npm run worker:restart
claude-mem restart
```
### Custom Model
@@ -331,7 +331,7 @@ Edit `~/.claude-mem/settings.json`:
Then restart the worker:
```bash
export CLAUDE_MEM_MODEL=opus
npm run worker:restart
claude-mem restart
```
### Custom Skip Tools
@@ -388,7 +388,7 @@ Enable debug logging:
```bash
export DEBUG=claude-mem:*
npm run worker:restart
claude-mem restart
npm run worker:logs
```
@@ -406,7 +406,7 @@ npm run worker:logs
1. Restart worker after changes:
```bash
npm run worker:restart
claude-mem restart
```
2. Verify environment variables:
@@ -440,7 +440,7 @@ If port 37777 is already in use:
2. Restart worker:
```bash
npm run worker:restart
claude-mem restart
```
3. Verify new port:
+2 -2
View File
@@ -165,7 +165,7 @@ npm run build
1. Make changes to React components in `src/ui/viewer/`
2. Build: `npm run build`
3. Sync to installed plugin: `npm run sync-marketplace`
4. Restart worker: `npm run worker:restart`
4. Restart worker: `claude-mem restart`
5. Refresh browser at http://localhost:37777
**Hot Reload**: Not currently supported. Full rebuild + restart required for changes.
@@ -456,7 +456,7 @@ export async function createObservation(
```bash
export DEBUG=claude-mem:*
npm run worker:restart
claude-mem restart
npm run worker:logs
```
+2 -2
View File
@@ -94,7 +94,7 @@ git checkout beta/endless-mode
npm install
# Restart the worker
npm run worker:restart
claude-mem restart
```
**To return to stable:**
@@ -103,7 +103,7 @@ npm run worker:restart
cd ~/.claude/plugins/marketplaces/thedotmack/
git checkout main
npm install
npm run worker:restart
claude-mem restart
```
## Summary
+1 -1
View File
@@ -534,7 +534,7 @@ npm run worker:status
npm run worker:logs
# Restart
npm run worker:restart
claude-mem restart
# Stop
npm run worker:stop
+1 -1
View File
@@ -57,7 +57,7 @@ CLAUDE_MEM_PYTHON_VERSION=3.13 # Python version for chroma-mcp
```bash
npm run build # Compile TypeScript (hooks + worker)
npm run sync-marketplace # Copy to ~/.claude/plugins
npm run worker:restart # Restart worker
claude-mem restart # Restart worker
npm run worker:logs # View worker logs
npm run worker:status # Check worker status
```
+8 -8
View File
@@ -48,14 +48,14 @@ The skill includes comprehensive diagnostics, automated repair sequences, and de
4. Restart worker service:
```bash
npm run worker:restart
claude-mem restart
```
5. Check for port conflicts:
```bash
# If port 37777 is in use by another service
export CLAUDE_MEM_WORKER_PORT=38000
npm run worker:restart
claude-mem restart
```
### Theme Toggle Not Persisting
@@ -110,7 +110,7 @@ The skill includes comprehensive diagnostics, automated repair sequences, and de
5. Restart worker and refresh browser:
```bash
npm run worker:restart
claude-mem restart
```
### Chroma/Python Dependency Issues (v5.0.0+)
@@ -225,7 +225,7 @@ The skill includes comprehensive diagnostics, automated repair sequences, and de
3. Or use a different port:
```bash
export CLAUDE_MEM_WORKER_PORT=38000
npm run worker:restart
claude-mem restart
```
4. Verify new port:
@@ -282,7 +282,7 @@ The skill includes comprehensive diagnostics, automated repair sequences, and de
4. Restart worker:
```bash
npm run worker:restart
claude-mem restart
```
## Hook Issues
@@ -644,7 +644,7 @@ The skill includes comprehensive diagnostics, automated repair sequences, and de
2. Restart worker:
```bash
npm run worker:restart
claude-mem restart
```
3. Clean up old data (see "Database Too Large" above)
@@ -721,7 +721,7 @@ The skill includes comprehensive diagnostics, automated repair sequences, and de
```bash
export DEBUG=claude-mem:*
npm run worker:restart
claude-mem restart
npm run worker:logs
```
@@ -781,7 +781,7 @@ SELECT created_at, tool_name FROM observations ORDER BY created_at DESC LIMIT 10
**Cause**: Worker not running or port mismatch.
**Solution**: Restart worker with `npm run worker:restart`.
**Solution**: Restart worker with `claude-mem restart`.
### "Database is locked"
+2 -2
View File
@@ -118,12 +118,12 @@ The skill provides access to these MCP tools:
| `search` | Unified search across observations, sessions, and prompts |
| `timeline` | Get chronological context around a query or observation ID |
| `get_observation` | Fetch a single observation by ID |
| `get_batch_observations` | Fetch multiple observations efficiently |
| `get_observations` | Fetch multiple observations efficiently |
| `get_session` | Fetch session summary by ID |
| `get_prompt` | Fetch user prompt by ID |
| `get_recent_context` | Get recent timeline items |
| `get_context_timeline` | Get timeline around a specific observation |
| `progressive_description` | Load detailed usage instructions |
| `help` | Load detailed usage instructions |
## Troubleshooting
+1 -1
View File
@@ -86,7 +86,7 @@ npm run worker:start
npm run worker:stop
# Restart worker service
npm run worker:restart
claude-mem restart
# View worker logs
npm run worker:logs
+1 -1
View File
@@ -176,7 +176,7 @@ This design ensures that private content never reaches the database, search indi
1. Verify correct syntax: `<private>content</private>`
2. Check `~/.claude-mem/silent.log` for errors
3. Ensure worker is running: `npm run worker:status`
4. Restart worker: `npm run worker:restart`
4. Restart worker: `claude-mem restart`
### Partial Content Stored
+1 -1
View File
@@ -364,7 +364,7 @@ If search isn't working, check the worker service:
```bash
npm run worker:status # Check worker status
npm run worker:restart # Restart if needed
claude-mem restart # Restart if needed
npm run worker:logs # View logs
```
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
"version": "7.3.2",
"version": "7.4.2",
"description": "Memory compression system for Claude Code - persist context across sessions",
"keywords": [
"claude",
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
"version": "7.3.2",
"version": "7.4.2",
"description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
"author": {
"name": "Alex Newman"
+1 -1
View File
@@ -1,6 +1,6 @@
{
"mcpServers": {
"claude-mem-search": {
"mem-search": {
"type": "stdio",
"command": "${CLAUDE_PLUGIN_ROOT}/scripts/mcp-server.cjs"
}
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "claude-mem-plugin",
"version": "7.3.2",
"version": "7.4.1",
"private": true,
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
+94
View File
@@ -253,6 +253,97 @@ function installUv() {
}
}
/**
* Install the claude-mem CLI command to PATH
* Creates a wrapper script in ~/.local/bin (Unix) or %LOCALAPPDATA%\Programs\claude-mem (Windows)
*/
function installCLI() {
const CLI_NAME = 'claude-mem';
const WORKER_CLI = join(ROOT, 'plugin', 'scripts', 'worker-cli.js');
if (IS_WINDOWS) {
// Windows: Create .cmd file in LocalAppData
const cliDir = join(process.env.LOCALAPPDATA || join(homedir(), 'AppData', 'Local'), 'Programs', 'claude-mem');
const cliPath = join(cliDir, `${CLI_NAME}.cmd`);
const markerPath = join(cliDir, '.cli-installed');
// Skip if already installed
if (existsSync(markerPath)) return;
try {
// Create directory if needed
if (!existsSync(cliDir)) {
execSync(`mkdir "${cliDir}"`, { stdio: 'ignore', shell: true });
}
// Get Bun path for the wrapper
const bunPath = getBunPath() || 'bun';
// Create the wrapper script
const cmdContent = `@echo off
"${bunPath}" "${WORKER_CLI}" %*
`;
writeFileSync(cliPath, cmdContent);
writeFileSync(markerPath, new Date().toISOString());
console.error(`✅ CLI installed: ${cliPath}`);
console.error('');
console.error('📋 Add to PATH (run once in PowerShell as Admin):');
console.error(` [Environment]::SetEnvironmentVariable("Path", $env:Path + ";${cliDir}", "User")`);
console.error('');
console.error(' Then restart your terminal and use: claude-mem start|stop|restart|status');
} catch (error) {
console.error(`⚠️ Could not install CLI: ${error.message}`);
console.error(` You can still use: bun "${WORKER_CLI}" <command>`);
}
} else {
// Unix: Create shell script in ~/.local/bin
const cliDir = join(homedir(), '.local', 'bin');
const cliPath = join(cliDir, CLI_NAME);
const markerPath = join(ROOT, '.cli-installed');
// Skip if already installed
if (existsSync(markerPath) && existsSync(cliPath)) return;
try {
// Create directory if needed
if (!existsSync(cliDir)) {
execSync(`mkdir -p "${cliDir}"`, { stdio: 'ignore', shell: true });
}
// Get Bun path for the wrapper
const bunPath = getBunPath() || 'bun';
// Create the wrapper script
const shContent = `#!/usr/bin/env bash
# claude-mem CLI wrapper - manages the worker service
exec "${bunPath}" "${WORKER_CLI}" "$@"
`;
writeFileSync(cliPath, shContent, { mode: 0o755 });
writeFileSync(markerPath, new Date().toISOString());
console.error(`✅ CLI installed: ${cliPath}`);
// Check if ~/.local/bin is in PATH
const pathDirs = (process.env.PATH || '').split(':');
const localBinInPath = pathDirs.some(p => p === cliDir || p === '$HOME/.local/bin' || p.endsWith('/.local/bin'));
if (!localBinInPath) {
console.error('');
console.error('📋 Add to PATH (add to ~/.bashrc or ~/.zshrc):');
console.error(' export PATH="$HOME/.local/bin:$PATH"');
console.error('');
console.error(' Then restart your terminal and use: claude-mem start|stop|restart|status');
} else {
console.error(' Usage: claude-mem start|stop|restart|status');
}
} catch (error) {
console.error(`⚠️ Could not install CLI: ${error.message}`);
console.error(` You can still use: bun "${WORKER_CLI}" <command>`);
}
}
}
/**
* Check if dependencies need to be installed
*/
@@ -351,6 +442,9 @@ try {
installDeps();
console.error('✅ Dependencies installed');
}
// Step 4: Install CLI to PATH
installCLI();
} catch (e) {
console.error('❌ Installation failed:', e.message);
process.exit(1);
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
+2
View File
@@ -0,0 +1,2 @@
#!/usr/bin/env bun
"use strict";var m=Object.create;var w=Object.defineProperty;var u=Object.getOwnPropertyDescriptor;var I=Object.getOwnPropertyNames;var f=Object.getPrototypeOf,x=Object.prototype.hasOwnProperty;var g=(e,i,n,o)=>{if(i&&typeof i=="object"||typeof i=="function")for(let s of I(i))!x.call(e,s)&&s!==n&&w(e,s,{get:()=>i[s],enumerable:!(o=u(i,s))||o.enumerable});return e};var k=(e,i,n)=>(n=e!=null?m(f(e)):{},g(i||!e||!e.__esModule?w(n,"default",{value:e,enumerable:!0}):n,e));var c=require("child_process"),p=k(require("path"),1),y=process.platform==="win32",P=__dirname,l=p.default.join(P,"worker-service.cjs"),t=null,a=!1;function r(e){let i=new Date().toISOString();console.log(`[${i}] [wrapper] ${e}`)}function h(){r(`Spawning inner worker: ${l}`),t=(0,c.spawn)(process.execPath,[l],{stdio:["inherit","inherit","inherit","ipc"],env:{...process.env,CLAUDE_MEM_MANAGED:"true"},cwd:p.default.dirname(l)}),t.on("message",async e=>{(e.type==="restart"||e.type==="shutdown")&&(r(`${e.type} requested by inner`),a=!0,await d(),r("Exiting wrapper"),process.exit(0))}),t.on("exit",(e,i)=>{r(`Inner exited with code=${e}, signal=${i}`),t=null,a||(r("Inner exited unexpectedly, wrapper exiting (hooks will restart if needed)"),process.exit(e??1))}),t.on("error",e=>{r(`Inner error: ${e.message}`)})}async function d(){if(!t||!t.pid){r("No inner process to kill");return}let e=t.pid;if(r(`Killing inner process tree (pid=${e})`),y)try{(0,c.execSync)(`taskkill /PID ${e} /T /F`,{timeout:1e4,stdio:"ignore"}),r(`taskkill completed for pid=${e}`)}catch(i){r(`taskkill failed (process may be dead): ${i}`)}else{t.kill("SIGTERM");let i=new Promise(o=>{if(!t){o();return}t.on("exit",()=>o())}),n=new Promise(o=>setTimeout(()=>o(),5e3));await Promise.race([i,n]),t&&!t.killed&&(r("Inner did not exit gracefully, force killing"),t.kill("SIGKILL"))}await S(e,5e3),t=null,r("Inner process terminated")}async function S(e,i){let n=Date.now();for(;Date.now()-n<i;)try{process.kill(e,0),await new Promise(o=>setTimeout(o,100))}catch{return}r(`Timeout waiting for process ${e} to exit`)}process.on("SIGTERM",async()=>{r("Wrapper received SIGTERM"),a=!0,await d(),process.exit(0)});process.on("SIGINT",async()=>{r("Wrapper received SIGINT"),a=!0,await d(),process.exit(0)});r("Wrapper starting");h();
Binary file not shown.
+124 -9
View File
@@ -86,13 +86,13 @@ For each relevant ID, fetch full details using MCP tools:
**Fetch multiple observations (ALWAYS use for 2+ IDs):**
```
get_batch_observations(ids=[11131, 10942, 10855])
get_observations(ids=[11131, 10942, 10855])
```
**With ordering and limit:**
```
get_batch_observations(
get_observations(
ids=[11131, 10942, 10855],
orderBy="date_desc",
limit=10,
@@ -126,7 +126,7 @@ get_prompt(id=5421)
**Batch optimization:**
- **ALWAYS use `get_batch_observations` for 2+ observations**
- **ALWAYS use `get_observations` for 2+ observations**
- 10-100x more efficient than individual fetches
- Single HTTP request vs N requests
- Returns all results in one response
@@ -175,13 +175,13 @@ search(query="database migration", limit=20, project="my-project")
**Get detailed instructions:**
Use the `progressive_description` tool to load full instructions on-demand:
Use the `help` tool to load full instructions on-demand:
```
progressive_description(topic="workflow") # Get 4-step workflow
progressive_description(topic="search_params") # Get parameters reference
progressive_description(topic="examples") # Get usage examples
progressive_description(topic="all") # Get complete guide
help(topic="workflow") # Get 4-step workflow
help(topic="search_params") # Get parameters reference
help(topic="examples") # Get usage examples
help(topic="all") # Get complete guide
```
## Why This Workflow?
@@ -210,5 +210,120 @@ progressive_description(topic="all") # Get complete guide
**Remember:**
- ALWAYS get timeline context to understand what was happening
- ALWAYS use `get_batch_observations` when fetching 2+ observations
- ALWAYS use `get_observations` when fetching 2+ observations
- The workflow is optimized: search → timeline → batch fetch = 10-100x faster
---
## Tool Reference
Comprehensive parameter documentation for all memory tools. For MCP usage, call `help(topic="search")` to load specific tool docs.
### search
Search across all memory types (observations, sessions, prompts).
**Parameters:**
- `query` (string, optional) - Search term for full-text search
- `limit` (number, optional) - Maximum results to return. Default: 20, Max: 100
- `offset` (number, optional) - Number of results to skip. Default: 0
- `project` (string, required) - Project name to filter by
- `type` (string, optional) - Filter by type: "observations", "sessions", "prompts"
- `dateStart` (string, optional) - Start date filter (YYYY-MM-DD or epoch ms)
- `dateEnd` (string, optional) - End date filter (YYYY-MM-DD or epoch ms)
- `obs_type` (string, optional) - Filter observations by type (comma-separated): bugfix, feature, decision, discovery, change
- `orderBy` (string, optional) - Sort order: "date_desc" (default), "date_asc", "relevance"
**Returns:** Table of results with IDs, timestamps, types, titles
### timeline
Get chronological context around a specific point in time or observation.
**Parameters:**
- `anchor` (number, optional) - Observation ID to center timeline around. If not provided, uses most recent result from query
- `query` (string, optional) - Search term to find anchor automatically (if anchor not provided)
- `depth_before` (number, optional) - Items before anchor. Default: 5, Max: 20
- `depth_after` (number, optional) - Items after anchor. Default: 5, Max: 20
- `project` (string, required) - Project name to filter by
**Returns:** Exactly `depth_before + 1 + depth_after` items in chronological order, with observations, sessions, and prompts interleaved
### get_recent_context
Get the most recent observations from current or recent sessions.
**Parameters:**
- `limit` (number, optional) - Maximum observations to return. Default: 10, Max: 50
- `project` (string, required) - Project name to filter by
**Returns:** Recent observations in reverse chronological order
### get_context_timeline
Get timeline context around a specific observation ID.
**Parameters:**
- `anchor` (number, required) - Observation ID to center timeline around
- `depth_before` (number, optional) - Items before anchor. Default: 5, Max: 20
- `depth_after` (number, optional) - Items after anchor. Default: 5, Max: 20
- `project` (string, optional) - Project name to filter by
**Returns:** Timeline items centered on the anchor observation
### get_observation
Fetch a single observation by ID with full details.
**Parameters:**
- `id` (number, required) - Observation ID to fetch
**Returns:** Complete observation object with title, subtitle, narrative, facts, concepts, files, timestamps
### get_observations
Batch fetch multiple observations by IDs. Always prefer this over individual fetches for 2+ observations.
**Parameters:**
- `ids` (array of numbers, required) - Array of observation IDs to fetch
- `orderBy` (string, optional) - Sort order: "date_desc" (default), "date_asc"
- `limit` (number, optional) - Maximum observations to return. Default: no limit
- `project` (string, optional) - Project name to filter by
**Returns:** Array of complete observation objects, 10-100x faster than individual fetches
### get_session
Fetch a single session by ID with metadata.
**Parameters:**
- `id` (number, required) - Session ID to fetch (just the number, not "S2005" format)
**Returns:** Session object with ID, start time, end time, project, model info
### get_prompt
Fetch a single prompt by ID with full text.
**Parameters:**
- `id` (number, required) - Prompt ID to fetch
**Returns:** Prompt object with ID, text, timestamp, session reference
### help
Load detailed instructions for specific topics or all documentation.
**Parameters:**
- `topic` (string, optional) - Specific topic to load: "workflow", "search", "timeline", "get_recent_context", "get_context_timeline", "get_observation", "get_observations", "get_session", "get_prompt", "all". Default: "all"
**Returns:** Formatted documentation for the requested topic
@@ -282,13 +282,13 @@ No results found for "{query}". Try:
The search service isn't available. Check if the worker is running:
```bash
pm2 list
npm run worker:status
```
If the worker is stopped, restart it:
```bash
npm run worker:restart
claude-mem restart
```
```
+1 -1
View File
@@ -113,7 +113,7 @@ Many endpoints share these parameters:
## Error Handling
**Worker not running:**
Connection refused error. Response: "The search API isn't available. Check if worker is running: `pm2 list`"
Connection refused error. Response: "The search API isn't available. Check if worker is running: `npm run worker:status`"
**Invalid endpoint:**
```json
@@ -93,7 +93,7 @@ curl -s "http://localhost:37777/api/context/recent?limit=3"
Response: "No recent sessions found for 'new-project'. This might be a new project."
**Worker not running:**
Connection refused error. Inform user to check if worker is running: `pm2 list`
Connection refused error. Inform user to check if worker is running: `npm run worker:status`
## Tips
+5 -5
View File
@@ -1,6 +1,6 @@
---
name: troubleshoot
description: Diagnose and fix claude-mem installation issues. Checks PM2 worker status, database integrity, service health, dependencies, and provides automated fixes for common problems.
description: Diagnose and fix claude-mem installation issues. Checks worker status, database integrity, service health, dependencies, and provides automated fixes for common problems.
---
# Claude-Mem Troubleshooting Skill
@@ -39,7 +39,7 @@ Choose the appropriate operation file for detailed instructions:
### Diagnostic Workflows
1. **[Full System Diagnostics](operations/diagnostics.md)** - Comprehensive step-by-step diagnostic workflow
2. **[Worker Diagnostics](operations/worker.md)** - PM2 worker-specific troubleshooting
2. **[Worker Diagnostics](operations/worker.md)** - Bun worker-specific troubleshooting
3. **[Database Diagnostics](operations/database.md)** - Database integrity and data checks
### Issue Resolution
@@ -54,9 +54,9 @@ Choose the appropriate operation file for detailed instructions:
**Fast automated fix (try this first):**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
pm2 delete claude-mem-worker 2>/dev/null; \
npm run worker:stop; \
npm install && \
node_modules/.bin/pm2 start ecosystem.config.cjs && \
npm run worker:start && \
sleep 3 && \
curl -s http://127.0.0.1:37777/health
```
@@ -79,7 +79,7 @@ When troubleshooting:
- **Worker port:** Default 37777 (configurable via `CLAUDE_MEM_WORKER_PORT`)
- **Database location:** `~/.claude-mem/claude-mem.db`
- **Plugin location:** `~/.claude/plugins/marketplaces/thedotmack/`
- **PM2 process name:** `claude-mem-worker`
- **Worker PID file:** `~/.claude-mem/worker.pid`
## Error Reporting
@@ -8,9 +8,9 @@ One-command fix sequences for common claude-mem issues.
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
pm2 delete claude-mem-worker 2>/dev/null; \
npm run worker:stop; \
npm install && \
node_modules/.bin/pm2 start ecosystem.config.cjs && \
npm run worker:start && \
sleep 3 && \
curl -s http://127.0.0.1:37777/health
```
@@ -20,22 +20,22 @@ curl -s http://127.0.0.1:37777/health
**What it does:**
1. Stops the worker (if running)
2. Ensures dependencies are installed
3. Starts worker with local PM2
3. Starts worker
4. Waits for startup
5. Verifies health
## Fix: Worker Not Running
**Use when:** PM2 shows worker as stopped or not listed
**Use when:** Worker status shows it's not running
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
node_modules/.bin/pm2 start ecosystem.config.cjs && \
npm run worker:start && \
sleep 2 && \
pm2 status
npm run worker:status
```
**Expected output:** Worker shows as "online"
**Expected output:** Worker running with PID and health OK
## Fix: Dependencies Missing
@@ -44,9 +44,23 @@ pm2 status
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
npm install && \
pm2 restart claude-mem-worker
claude-mem restart
```
## Fix: Stale PID File
**Use when:** Worker reports running but health check fails
```bash
rm -f ~/.claude-mem/worker.pid && \
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
npm run worker:start && \
sleep 2 && \
curl -s http://127.0.0.1:37777/health
```
**Expected output:** `{"status":"ok"}`
## Fix: Port Conflict
**Use when:** Error shows port already in use
@@ -54,8 +68,9 @@ pm2 restart claude-mem-worker
```bash
# Change to port 37778
mkdir -p ~/.claude-mem && \
echo '{"env":{"CLAUDE_MEM_WORKER_PORT":"37778"}}' > ~/.claude-mem/settings.json && \
pm2 restart claude-mem-worker && \
echo '{"CLAUDE_MEM_WORKER_PORT":"37778"}' > ~/.claude-mem/settings.json && \
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
claude-mem restart && \
sleep 2 && \
curl -s http://127.0.0.1:37778/health
```
@@ -70,14 +85,16 @@ curl -s http://127.0.0.1:37778/health
# Backup and test integrity
cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.backup && \
sqlite3 ~/.claude-mem/claude-mem.db "PRAGMA integrity_check;" && \
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
claude-mem restart
```
**If integrity check fails, recreate database:**
```bash
# WARNING: This deletes all memory data
mv ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.old && \
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
claude-mem restart
```
## Fix: Clean Reinstall
@@ -88,36 +105,49 @@ pm2 restart claude-mem-worker
# Backup data first
cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.backup 2>/dev/null
# Stop and remove worker
pm2 delete claude-mem-worker 2>/dev/null
# Stop worker
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:stop
# Clean PID file
rm -f ~/.claude-mem/worker.pid
# Reinstall dependencies
cd ~/.claude/plugins/marketplaces/thedotmack/ && \
rm -rf node_modules && \
npm install
# Start worker
node_modules/.bin/pm2 start ecosystem.config.cjs && \
npm run worker:start && \
sleep 3 && \
curl -s http://127.0.0.1:37777/health
```
## Fix: Clear PM2 Logs
## Fix: Clear Old Logs
**Use when:** Logs are too large, want fresh start
**Use when:** Want to start with fresh logs
```bash
pm2 flush claude-mem-worker && \
pm2 restart claude-mem-worker
# Archive old logs
tar -czf ~/.claude-mem/logs-archive-$(date +%Y-%m-%d).tar.gz ~/.claude-mem/logs/*.log 2>/dev/null
# Remove logs older than 7 days
find ~/.claude-mem/logs/ -name "worker-*.log" -mtime +7 -delete
# Restart worker for fresh log
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
```
**Note:** Logs auto-rotate daily, manual cleanup rarely needed.
## Verification Commands
**After running any fix, verify with these:**
```bash
# Check worker status
pm2 status | grep claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:status
# Check health
curl -s http://127.0.0.1:37777/health
@@ -129,23 +159,48 @@ sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
curl -s http://127.0.0.1:37777/api/stats
# Check logs for errors
pm2 logs claude-mem-worker --lines 20 --nostream | grep -i error
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log | tail -20
```
**All checks should pass:**
- Worker status: "online"
- Health: `{"status":"ok"}`
- Worker status: Shows PID and "Health: OK"
- Health endpoint: `{"status":"ok"}`
- Database: Shows count (may be 0 if new)
- Stats: Returns JSON with counts
- Logs: No recent errors
## One-Line Complete Diagnostic
**Quick health check:**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/ && npm run worker:status && curl -s http://127.0.0.1:37777/health && echo " ✓ All systems OK"
```
## Troubleshooting the Fixes
**If automated fix fails:**
1. Run the diagnostic script from [diagnostics.md](diagnostics.md)
2. Check specific error in PM2 logs
2. Check specific error in worker logs:
```bash
tail -50 ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
3. Try manual worker start to see detailed error:
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
node plugin/scripts/worker-service.cjs
bun plugin/scripts/worker-service.js
```
4. Use the bug report tool:
```bash
npm run bug-report
```
## Common Error Patterns and Fixes
| Error Pattern | Likely Cause | Quick Fix |
|---------------|--------------|-----------|
| `EADDRINUSE` | Port conflict | Change port in settings.json |
| `SQLITE_ERROR` | Database corruption | Run integrity check, recreate if needed |
| `ENOENT` | Missing files | Run `npm install` |
| `Module not found` | Dependency issue | Clean reinstall |
| Connection refused | Worker not running | `npm run worker:start` |
| Stale PID | Old PID file | Remove `~/.claude-mem/worker.pid` |
@@ -17,7 +17,8 @@ Quick fixes for frequently encountered claude-mem problems.
**Fix:**
1. Verify worker is running:
```bash
pm2 jlist | grep claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:status
```
2. Check database has recent observations:
@@ -27,7 +28,8 @@ Quick fixes for frequently encountered claude-mem problems.
3. Restart worker and start new session:
```bash
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
```
4. Create a test observation: `/skill version-bump` then cancel
@@ -66,7 +68,7 @@ Quick fixes for frequently encountered claude-mem problems.
3. Verify worker is using correct database path in logs:
```bash
pm2 logs claude-mem-worker --lines 50 --nostream | grep "Database"
grep "Database" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
4. Test viewer connection manually:
@@ -109,34 +111,34 @@ Quick fixes for frequently encountered claude-mem problems.
## Issue: Worker Not Starting {#worker-not-starting}
**Symptoms:**
- PM2 shows worker as "stopped" or "errored"
- Worker status shows not running or error
- Health check fails
- Viewer not accessible
**Root cause:**
- Port already in use
- PM2 not installed or not in PATH
- Bun not installed
- Missing dependencies
**Fix:**
1. Try manual worker start to see error:
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
node plugin/scripts/worker-service.cjs
bun plugin/scripts/worker-service.js
# Should start server on port 37777 or show error
```
2. If port in use, change it:
```bash
mkdir -p ~/.claude-mem
echo '{"env":{"CLAUDE_MEM_WORKER_PORT":"37778"}}' > ~/.claude-mem/settings.json
echo '{"CLAUDE_MEM_WORKER_PORT":"37778"}' > ~/.claude-mem/settings.json
```
3. If dependencies missing:
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
npm install
pm2 start ecosystem.config.cjs
npm run worker:start
```
## Issue: Search Results Empty
@@ -170,7 +172,8 @@ Quick fixes for frequently encountered claude-mem problems.
4. If FTS5 out of sync, restart worker (triggers reindex):
```bash
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
```
## Issue: Port Conflicts
@@ -189,8 +192,9 @@ Quick fixes for frequently encountered claude-mem problems.
2. Either kill the conflicting process or change claude-mem port:
```bash
mkdir -p ~/.claude-mem
echo '{"env":{"CLAUDE_MEM_WORKER_PORT":"37778"}}' > ~/.claude-mem/settings.json
pm2 restart claude-mem-worker
echo '{"CLAUDE_MEM_WORKER_PORT":"37778"}' > ~/.claude-mem/settings.json
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
```
## Issue: Database Corrupted
@@ -214,7 +218,8 @@ Quick fixes for frequently encountered claude-mem problems.
3. If repair fails, recreate (loses data):
```bash
rm ~/.claude-mem/claude-mem.db
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
# Worker will create new database
```
@@ -172,7 +172,8 @@ SELECT
If FTS5 counts don't match, triggers may have failed. Restart worker to rebuild:
```bash
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
```
The worker will rebuild FTS5 indexes on startup if they're out of sync.
@@ -196,7 +197,7 @@ The worker will rebuild FTS5 indexes on startup if they're out of sync.
1. Create test observation (use any skill and cancel)
2. Check worker logs for errors:
```bash
pm2 logs claude-mem-worker --lines 50 --nostream
tail -50 ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
3. Verify observation appears in database
@@ -228,9 +229,10 @@ ls -la ~/.claude-mem/claude-mem.db-wal
ls -la ~/.claude-mem/claude-mem.db-shm
# Remove lock files (only if worker is stopped!)
pm2 stop claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:stop
rm ~/.claude-mem/claude-mem.db-wal ~/.claude-mem/claude-mem.db-shm
pm2 start claude-mem-worker
npm run worker:start
```
### Issue: Database Growing Too Large
@@ -260,7 +262,8 @@ sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
3. Archive and start fresh:
```bash
mv ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.archive
pm2 restart claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
```
## Database Recovery
@@ -275,9 +278,10 @@ cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.backup
### Restore from Backup
```bash
pm2 stop claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:stop
cp ~/.claude-mem/claude-mem.db.backup ~/.claude-mem/claude-mem.db
pm2 start claude-mem-worker
npm run worker:start
```
### Export Data
@@ -300,8 +304,10 @@ sqlite3 ~/.claude-mem/claude-mem.db -json "SELECT * FROM user_prompts;" > prompt
**WARNING: Data loss. Backup first!**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
# Stop worker
pm2 stop claude-mem-worker
npm run worker:stop
# Backup current database
cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.old
@@ -310,7 +316,7 @@ cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem.db.old
rm ~/.claude-mem/claude-mem.db
# Start worker (creates new database)
pm2 start claude-mem-worker
npm run worker:start
```
## Database Statistics
@@ -6,29 +6,42 @@ Comprehensive step-by-step diagnostic workflow for claude-mem issues.
Run these checks systematically to identify the root cause:
### 1. Check PM2 Worker Status
### 1. Check Worker Status
First, verify if the worker service is running:
```bash
# Check if PM2 is available
which pm2 || echo "PM2 not found in PATH"
# Check worker status using npm script
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:status
# List PM2 processes
pm2 jlist 2>&1
# If pm2 is not found, try the local installation
~/.claude/plugins/marketplaces/thedotmack/node_modules/.bin/pm2 jlist 2>&1
# Or check health endpoint directly
curl -s http://127.0.0.1:37777/health
```
**Expected output:** JSON array with `claude-mem-worker` process showing `"status": "online"`
**Expected output from npm run worker:status:**
```
✓ Worker is running (PID: 12345)
Port: 37777
Uptime: 45m
Health: OK
```
**If worker not running or status is not "online":**
**Expected output from health endpoint:** `{"status":"ok"}`
**If worker not running:**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
pm2 start ecosystem.config.cjs
# Or use local pm2:
node_modules/.bin/pm2 start ecosystem.config.cjs
npm run worker:start
```
**If health endpoint fails but worker reports running:**
Check for stale PID file:
```bash
cat ~/.claude-mem/worker.pid
ps -p $(cat ~/.claude-mem/worker.pid 2>/dev/null | grep -o '"pid":[0-9]*' | grep -o '[0-9]*') 2>/dev/null || echo "Stale PID - worker not actually running"
rm ~/.claude-mem/worker.pid
npm run worker:start
```
### 2. Check Worker Service Health
@@ -98,10 +111,12 @@ cd ~/.claude/plugins/marketplaces/thedotmack/
# Check for critical packages
ls node_modules/@anthropic-ai/claude-agent-sdk 2>&1 | head -1
ls node_modules/express 2>&1 | head -1
ls node_modules/pm2 2>&1 | head -1
# Check if Bun is available
bun --version 2>&1
```
**Expected:** All critical packages present
**Expected:** All critical packages present, Bun installed
**If dependencies missing:**
```bash
@@ -114,17 +129,26 @@ npm install
Review recent worker logs for errors:
```bash
# View last 50 lines of worker logs
pm2 logs claude-mem-worker --lines 50 --nostream
# Or use local pm2:
# View logs using npm script
cd ~/.claude/plugins/marketplaces/thedotmack/
node_modules/.bin/pm2 logs claude-mem-worker --lines 50 --nostream
npm run worker:logs
# View today's log file directly
cat ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Last 50 lines
tail -50 ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Check for specific errors
pm2 logs claude-mem-worker --lines 100 --nostream | grep -i "error\|exception\|failed"
grep -iE "error|exception|failed" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log | tail -20
```
**Common error patterns to look for:**
- `SQLITE_ERROR` - Database issues
- `EADDRINUSE` - Port conflict
- `ENOENT` - Missing files
- `Module not found` - Dependency issues
### 6. Test Viewer UI
Check if the web viewer is accessible:
@@ -167,6 +191,8 @@ echo "=== Claude-Mem Troubleshooting Report ==="
echo ""
echo "1. Environment"
echo " OS: $(uname -s)"
echo " Node version: $(node --version 2>/dev/null || echo 'N/A')"
echo " Bun version: $(bun --version 2>/dev/null || echo 'N/A')"
echo ""
echo "2. Plugin Installation"
echo " Plugin directory exists: $([ -d ~/.claude/plugins/marketplaces/thedotmack ] && echo 'YES' || echo 'NO')"
@@ -179,20 +205,28 @@ echo " Observation count: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT COUNT(
echo " Session count: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT COUNT(*) FROM sessions;' 2>/dev/null || echo 'N/A')"
echo ""
echo "4. Worker Service"
PM2_PATH=$(which pm2 2>/dev/null || echo "~/.claude/plugins/marketplaces/thedotmack/node_modules/.bin/pm2")
echo " PM2 path: $PM2_PATH"
WORKER_STATUS=$($PM2_PATH jlist 2>/dev/null | grep -o '"name":"claude-mem-worker".*"status":"[^"]*"' | grep -o 'status":"[^"]*"' | cut -d'"' -f3 || echo 'not running')
echo " Worker status: $WORKER_STATUS"
echo " Worker PID file: $([ -f ~/.claude-mem/worker.pid ] && echo 'EXISTS' || echo 'MISSING')"
if [ -f ~/.claude-mem/worker.pid ]; then
WORKER_PID=$(cat ~/.claude-mem/worker.pid 2>/dev/null | grep -o '"pid":[0-9]*' | grep -o '[0-9]*')
echo " Worker PID: $WORKER_PID"
echo " Process running: $(ps -p $WORKER_PID >/dev/null 2>&1 && echo 'YES' || echo 'NO (stale PID)')"
fi
echo " Health check: $(curl -s http://127.0.0.1:37777/health 2>/dev/null || echo 'FAILED')"
echo ""
echo "5. Configuration"
echo " Port setting: $(cat ~/.claude-mem/settings.json 2>/dev/null | grep CLAUDE_MEM_WORKER_PORT || echo 'default (37777)')"
echo " Observation count: $(cat ~/.claude-mem/settings.json 2>/dev/null | grep CLAUDE_MEM_CONTEXT_OBSERVATIONS || echo 'default (50)')"
echo " Model: $(cat ~/.claude-mem/settings.json 2>/dev/null | grep CLAUDE_MEM_MODEL || echo 'default (claude-sonnet-4-5)')"
echo ""
echo "6. Recent Activity"
echo " Latest observation: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT created_at FROM observations ORDER BY created_at DESC LIMIT 1;' 2>/dev/null || echo 'N/A')"
echo " Latest session: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT created_at FROM sessions ORDER BY created_at DESC LIMIT 1;' 2>/dev/null || echo 'N/A')"
echo ""
echo "7. Logs"
echo " Today's log file: $([ -f ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log ] && echo 'EXISTS' || echo 'MISSING')"
echo " Log file size: $(du -h ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log 2>/dev/null | cut -f1 || echo 'N/A')"
echo " Recent errors: $(grep -c -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log 2>/dev/null || echo '0')"
echo ""
echo "=== End Report ==="
```
@@ -201,18 +235,75 @@ Save this as `/tmp/claude-mem-diagnostics.sh` and run:
bash /tmp/claude-mem-diagnostics.sh
```
## Quick Diagnostic One-Liners
```bash
# Full status check
npm run worker:status && curl -s http://127.0.0.1:37777/health && echo " - All systems OK"
# Database stats
echo "DB: $(du -h ~/.claude-mem/claude-mem.db | cut -f1) | Obs: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT COUNT(*) FROM observations;' 2>/dev/null) | Sessions: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT COUNT(*) FROM sessions;' 2>/dev/null)"
# Recent errors
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log 2>/dev/null | tail -5 || echo "No recent errors"
# Port check
lsof -i :37777 || echo "Port 37777 is free"
# Worker process check
ps aux | grep -E "bun.*worker-service" | grep -v grep || echo "Worker not running"
```
## Automated Fix Sequence
If diagnostics show issues, run this automated fix sequence:
```bash
#!/bin/bash
echo "Running automated fix sequence..."
# 1. Stop worker if running
echo "1. Stopping worker..."
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:stop
# 2. Clean stale PID if exists
echo "2. Cleaning stale PID file..."
rm -f ~/.claude-mem/worker.pid
# 3. Reinstall dependencies
echo "3. Reinstalling dependencies..."
npm install
# 4. Start worker
echo "4. Starting worker..."
npm run worker:start
# 5. Wait for startup
echo "5. Waiting for worker to start..."
sleep 3
# 6. Verify health
echo "6. Verifying health..."
curl -s http://127.0.0.1:37777/health || echo "Worker health check FAILED"
echo "Fix sequence complete!"
```
## Reporting Issues
If troubleshooting doesn't resolve the issue, collect this information for a bug report:
If troubleshooting doesn't resolve the issue, run the built-in bug report tool:
1. Full diagnostic report (run script above)
2. Worker logs: `pm2 logs claude-mem-worker --lines 100 --nostream`
3. Your setup:
- Claude version: Check with Claude
- OS: `uname -a`
- Node version: `node --version`
- Plugin version: In package.json
4. Steps to reproduce the issue
5. Expected vs actual behavior
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run bug-report
```
Post to: https://github.com/thedotmack/claude-mem/issues
This will collect:
1. Full diagnostic report
2. Worker logs
3. System information
4. Configuration details
5. Database stats
Post the generated report to: https://github.com/thedotmack/claude-mem/issues
@@ -6,30 +6,29 @@ Essential commands for troubleshooting claude-mem.
```bash
# Check worker status
pm2 status | grep claude-mem-worker
pm2 jlist | grep claude-mem-worker # JSON format
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:status
# Start worker
cd ~/.claude/plugins/marketplaces/thedotmack/
pm2 start ecosystem.config.cjs
npm run worker:start
# Restart worker
pm2 restart claude-mem-worker
claude-mem restart
# Stop worker
pm2 stop claude-mem-worker
# Delete worker (for clean restart)
pm2 delete claude-mem-worker
npm run worker:stop
# View logs
pm2 logs claude-mem-worker
npm run worker:logs
# View last N lines
pm2 logs claude-mem-worker --lines 50 --nostream
# View today's log file
cat ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Clear logs
pm2 flush claude-mem-worker
# Last 50 lines
tail -50 ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Follow logs in real-time
tail -f ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
## Health Checks
@@ -82,21 +81,17 @@ cat ~/.claude-mem/settings.json
cat ~/.claude/settings.json
# Change worker port
echo '{"env":{"CLAUDE_MEM_WORKER_PORT":"37778"}}' > ~/.claude-mem/settings.json
echo '{"CLAUDE_MEM_WORKER_PORT":"37778"}' > ~/.claude-mem/settings.json
# Change context observation count
# Edit ~/.claude-mem/settings.json and add:
{
"env": {
"CLAUDE_MEM_CONTEXT_OBSERVATIONS": "25"
}
"CLAUDE_MEM_CONTEXT_OBSERVATIONS": "25"
}
# Change AI model
{
"env": {
"CLAUDE_MEM_MODEL": "claude-sonnet-4-5"
}
"CLAUDE_MEM_MODEL": "claude-sonnet-4-5"
}
```
@@ -132,16 +127,19 @@ curl -v http://127.0.0.1:37777/health
```bash
# Search logs for errors
pm2 logs claude-mem-worker --lines 100 --nostream | grep -i "error"
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Search for specific keyword
pm2 logs claude-mem-worker --lines 100 --nostream | grep "keyword"
grep "keyword" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Search across all log files
grep -i "error" ~/.claude-mem/logs/worker-*.log
# Last 100 error lines
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log | tail -100
# Follow logs in real-time
pm2 logs claude-mem-worker
# Show only error logs
pm2 logs claude-mem-worker --err
tail -f ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
## File Locations
@@ -160,11 +158,11 @@ pm2 logs claude-mem-worker --err
# Chroma vector database
~/.claude-mem/chroma/
# Usage logs
~/.claude-mem/usage-logs/
# Worker logs (daily rotation)
~/.claude-mem/logs/worker-*.log
# PM2 logs
~/.pm2/logs/
# Worker PID file
~/.claude-mem/worker.pid
```
## System Information
@@ -179,8 +177,8 @@ node --version
# NPM version
npm --version
# PM2 version
pm2 --version
# Bun version
bun --version
# SQLite version
sqlite3 --version
@@ -188,3 +186,22 @@ sqlite3 --version
# Check disk space
df -h ~/.claude-mem/
```
## One-Line Diagnostics
```bash
# Full worker status check
npm run worker:status && curl -s http://127.0.0.1:37777/health
# Quick health check
curl -s http://127.0.0.1:37777/health && echo " - Worker is healthy"
# Database stats
echo "Observations: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT COUNT(*) FROM observations;')" && echo "Sessions: $(sqlite3 ~/.claude-mem/claude-mem.db 'SELECT COUNT(*) FROM sessions;')"
# Recent errors
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log | tail -10
# Port check
lsof -i :37777 || echo "Port 37777 is free"
```
+149 -94
View File
@@ -1,10 +1,10 @@
# Worker Service Diagnostics
PM2 worker-specific troubleshooting for claude-mem.
Bun worker-specific troubleshooting for claude-mem.
## PM2 Worker Overview
## Worker Overview
The claude-mem worker is a persistent background service managed by PM2. It:
The claude-mem worker is a persistent background service managed by Bun. It:
- Runs Express.js server on port 37777 (default)
- Processes observations asynchronously
- Serves the viewer UI
@@ -15,36 +15,41 @@ The claude-mem worker is a persistent background service managed by PM2. It:
### Basic Status Check
```bash
# List all PM2 processes
pm2 list
# Check worker status using npm script
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:status
# JSON format (parseable)
pm2 jlist
# Filter for claude-mem-worker
pm2 status | grep claude-mem-worker
# Or check health endpoint directly
curl -s http://127.0.0.1:37777/health
```
**Expected output:**
**Expected npm run worker:status output:**
```
│ claude-mem-worker │ online │ 12345 │ 0 │ 45m │ 0% │ 85.6mb │
✓ Worker is running (PID: 12345)
Port: 37777
Uptime: 45m
Health: OK
```
**Status meanings:**
- `online` - Worker running correctly
- `stopped` - Worker stopped (normal shutdown)
- `errored` - Worker crashed (check logs)
- `stopping` - Worker shutting down
- Not listed - Worker never started
**Expected health endpoint output:**
```json
{"status":"ok"}
```
**Status indicators:**
- `Worker is running` - Worker running correctly
- `Worker is not running` - Worker stopped or crashed
- Connection refused - Worker not running
- Timeout - Worker hung (restart needed)
### Detailed Worker Info
```bash
# Show detailed information
pm2 show claude-mem-worker
# View PID file
cat ~/.claude-mem/worker.pid
# JSON format
pm2 jlist | grep -A 20 '"name":"claude-mem-worker"'
# Check process details
ps aux | grep "bun.*worker-service"
```
## Worker Health Endpoint
@@ -72,30 +77,37 @@ curl -s http://127.0.0.1:$PORT/health
### View Recent Logs
```bash
# Last 50 lines
pm2 logs claude-mem-worker --lines 50 --nostream
# View logs using npm script
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:logs
# Last 200 lines
pm2 logs claude-mem-worker --lines 200 --nostream
# View today's log file directly
cat ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Last 50 lines of today's log
tail -50 ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Follow logs in real-time
pm2 logs claude-mem-worker
tail -f ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
### Search Logs for Errors
```bash
# Find errors
pm2 logs claude-mem-worker --lines 500 --nostream | grep -i "error"
# Find errors in today's log
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Find exceptions
pm2 logs claude-mem-worker --lines 500 --nostream | grep -i "exception"
grep -i "exception" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Find failed requests
pm2 logs claude-mem-worker --lines 500 --nostream | grep -i "failed"
grep -i "failed" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# All error patterns
pm2 logs claude-mem-worker --lines 500 --nostream | grep -iE "error|exception|failed|crash"
grep -iE "error|exception|failed|crash" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# Search across all log files
grep -iE "error|exception|failed|crash" ~/.claude-mem/logs/worker-*.log
```
### Common Log Patterns
@@ -122,8 +134,8 @@ Port 37777 already in use
**Crashes:**
```
PM2 | App [claude-mem-worker] exited with code [1]
PM2 | App [claude-mem-worker] will restart in 100ms
Worker process exited with code 1
Worker restarting...
```
## Starting the Worker
@@ -132,37 +144,26 @@ PM2 | App [claude-mem-worker] will restart in 100ms
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
pm2 start ecosystem.config.cjs
```
### Start with Local PM2
If `pm2` command not in PATH:
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
node_modules/.bin/pm2 start ecosystem.config.cjs
npm run worker:start
```
### Force Restart
```bash
# Restart if already running
pm2 restart claude-mem-worker
# Restart worker (stops and starts)
cd ~/.claude/plugins/marketplaces/thedotmack/
claude-mem restart
# Delete and start fresh
pm2 delete claude-mem-worker
pm2 start ecosystem.config.cjs
# Or manually stop and start
npm run worker:stop
npm run worker:start
```
## Stopping the Worker
```bash
# Graceful stop
pm2 stop claude-mem-worker
# Delete completely (also removes from PM2 list)
pm2 delete claude-mem-worker
cd ~/.claude/plugins/marketplaces/thedotmack/
npm run worker:stop
```
## Worker Not Starting
@@ -172,23 +173,22 @@ pm2 delete claude-mem-worker
1. **Try manual start to see error:**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
node plugin/scripts/worker-service.cjs
bun plugin/scripts/worker-service.js
```
This runs the worker directly without PM2, showing full error output.
This runs the worker directly, showing full error output.
2. **Check PM2 itself:**
2. **Check Bun installation:**
```bash
which pm2
pm2 --version
which bun
bun --version
```
If PM2 not found, dependencies not installed.
If Bun not found, run: `npm install` (auto-installs Bun)
3. **Check dependencies:**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
ls node_modules/@anthropic-ai/claude-agent-sdk
ls node_modules/express
ls node_modules/pm2
```
4. **Check port availability:**
@@ -197,42 +197,57 @@ pm2 delete claude-mem-worker
```
If port in use, either kill that process or change claude-mem port.
5. **Check PID file:**
```bash
cat ~/.claude-mem/worker.pid
```
If worker PID exists but process is dead, remove stale PID:
```bash
rm ~/.claude-mem/worker.pid
npm run worker:start
```
### Common Fixes
**Dependencies missing:**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
npm install
pm2 start ecosystem.config.cjs
npm run worker:start
```
**Port conflict:**
```bash
echo '{"env":{"CLAUDE_MEM_WORKER_PORT":"37778"}}' > ~/.claude-mem/settings.json
pm2 restart claude-mem-worker
echo '{"CLAUDE_MEM_WORKER_PORT":"37778"}' > ~/.claude-mem/settings.json
claude-mem restart
```
**Corrupted PM2:**
**Stale PID file:**
```bash
pm2 kill # Stop PM2 daemon
cd ~/.claude/plugins/marketplaces/thedotmack/
pm2 start ecosystem.config.cjs
rm ~/.claude-mem/worker.pid
npm run worker:start
```
## Worker Crashing Repeatedly
If worker keeps restarting (check with `pm2 status` showing high restart count):
If worker keeps restarting (check logs for repeated startup messages):
### Find the Cause
1. **Check error logs:**
```bash
pm2 logs claude-mem-worker --err --lines 100 --nostream
grep -i "error" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log | tail -100
```
2. **Look for crash pattern:**
```bash
pm2 logs claude-mem-worker --lines 200 --nostream | grep -A 5 "exited with code"
grep -A 5 "exited with code" ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
```
3. **Run worker in foreground to see crashes:**
```bash
cd ~/.claude/plugins/marketplaces/thedotmack/
bun plugin/scripts/worker-service.js
```
### Common Crash Causes
@@ -246,43 +261,71 @@ If fails, backup and recreate database.
**Out of memory:**
Check if database is too large or memory leak. Restart:
```bash
pm2 restart claude-mem-worker
claude-mem restart
```
**Port conflict race condition:**
Another process grabbing port intermittently. Change port:
```bash
echo '{"env":{"CLAUDE_MEM_WORKER_PORT":"37778"}}' > ~/.claude-mem/settings.json
pm2 restart claude-mem-worker
echo '{"CLAUDE_MEM_WORKER_PORT":"37778"}' > ~/.claude-mem/settings.json
claude-mem restart
```
## PM2 Management Commands
## Worker Management Commands
```bash
# List processes
pm2 list
pm2 jlist # JSON format
# Check status
npm run worker:status
# Show detailed info
pm2 show claude-mem-worker
# Start worker
npm run worker:start
# Monitor resources
pm2 monit
# Stop worker
npm run worker:stop
# Clear logs
pm2 flush claude-mem-worker
# Restart worker
claude-mem restart
# Restart PM2 daemon
pm2 kill
pm2 resurrect # Restore saved processes
# View logs
npm run worker:logs
# Save current process list
pm2 save
# Check health endpoint
curl -s http://127.0.0.1:37777/health
# Update PM2
npm install -g pm2
# View PID
cat ~/.claude-mem/worker.pid
# View today's log file
cat ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# List all log files
ls -lh ~/.claude-mem/logs/worker-*.log
```
## Log File Management
Worker logs are stored in `~/.claude-mem/logs/` with daily rotation:
```bash
# View today's log
cat ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log
# View yesterday's log
cat ~/.claude-mem/logs/worker-$(date -d "yesterday" +%Y-%m-%d).log # Linux
cat ~/.claude-mem/logs/worker-$(date -v-1d +%Y-%m-%d).log # macOS
# List all logs
ls -lh ~/.claude-mem/logs/
# Clean old logs (older than 7 days)
find ~/.claude-mem/logs/ -name "worker-*.log" -mtime +7 -delete
# Archive logs
tar -czf ~/claude-mem-logs-backup-$(date +%Y-%m-%d).tar.gz ~/.claude-mem/logs/
```
**Note:** Logs auto-rotate daily. No manual flush required.
## Testing Worker Endpoints
Once worker is running, test all endpoints:
@@ -298,10 +341,22 @@ curl -s http://127.0.0.1:37777/ | head -20
curl -s http://127.0.0.1:37777/api/stats
# Search API
curl -s "http://127.0.0.1:37777/api/search/observations?q=test&format=index"
curl -s "http://127.0.0.1:37777/api/search?query=test&limit=5"
# Prompts API
curl -s "http://127.0.0.1:37777/api/prompts?limit=5"
# Recent context
curl -s "http://127.0.0.1:37777/api/context/recent?limit=3"
```
All should return appropriate responses (HTML for viewer, JSON for APIs).
## Troubleshooting Quick Reference
| Problem | Command | Expected Result |
|---------|---------|----------------|
| Check if running | `npm run worker:status` | Shows PID and uptime |
| Worker not running | `npm run worker:start` | Worker starts successfully |
| Worker crashed | `claude-mem restart` | Worker restarts |
| View recent errors | `grep -i error ~/.claude-mem/logs/worker-$(date +%Y-%m-%d).log \| tail -20` | Shows recent errors |
| Port in use | `lsof -i :37777` | Shows process using port |
| Stale PID | `rm ~/.claude-mem/worker.pid && npm run worker:start` | Removes stale PID and starts |
| Dependencies missing | `npm install && npm run worker:start` | Installs deps and starts |
+1 -1
View File
@@ -1,7 +1,7 @@
#!/usr/bin/env node
import fs from 'fs';
import Database from 'better-sqlite3';
import { Database } from 'bun:sqlite';
import readline from 'readline';
import path from 'path';
import { homedir } from 'os';
+31 -1
View File
@@ -26,6 +26,11 @@ const WORKER_SERVICE = {
source: 'src/services/worker-service.ts'
};
const WORKER_WRAPPER = {
name: 'worker-wrapper',
source: 'src/services/worker-wrapper.ts'
};
const MCP_SERVER = {
name: 'mcp-server',
source: 'src/servers/mcp-server.ts'
@@ -120,6 +125,31 @@ async function buildHooks() {
const workerStats = fs.statSync(`${hooksDir}/${WORKER_SERVICE.name}.cjs`);
console.log(`✓ worker-service built (${(workerStats.size / 1024).toFixed(2)} KB)`);
// Build worker wrapper (Windows zombie port fix)
console.log(`\n🔧 Building worker wrapper...`);
await build({
entryPoints: [WORKER_WRAPPER.source],
bundle: true,
platform: 'node',
target: 'node18',
format: 'cjs',
outfile: `${hooksDir}/${WORKER_WRAPPER.name}.cjs`,
minify: true,
logLevel: 'error',
external: ['bun:sqlite'],
define: {
'__DEFAULT_PACKAGE_VERSION__': `"${version}"`
},
banner: {
js: '#!/usr/bin/env bun'
}
});
// Make worker wrapper executable
fs.chmodSync(`${hooksDir}/${WORKER_WRAPPER.name}.cjs`, 0o755);
const wrapperStats = fs.statSync(`${hooksDir}/${WORKER_WRAPPER.name}.cjs`);
console.log(`✓ worker-wrapper built (${(wrapperStats.size / 1024).toFixed(2)} KB)`);
// Build MCP server
console.log(`\n🔧 Building MCP server...`);
await build({
@@ -136,7 +166,7 @@ async function buildHooks() {
'__DEFAULT_PACKAGE_VERSION__': `"${version}"`
},
banner: {
js: '#!/usr/bin/env bun'
js: '#!/usr/bin/env node'
}
});
+12 -27
View File
@@ -5,8 +5,7 @@
* Example: npx tsx scripts/export-memories.ts "windows" windows-memories.json --project=claude-mem
*/
import Database from 'better-sqlite3';
import { existsSync, writeFileSync } from 'fs';
import { writeFileSync } from 'fs';
import { homedir } from 'os';
import { join } from 'path';
import { SettingsDefaultsManager } from '../src/shared/SettingsDefaultsManager';
@@ -127,33 +126,19 @@ async function exportMemories(query: string, outputFile: string, project?: strin
if (s.sdk_session_id) sdkSessionIds.add(s.sdk_session_id);
});
// Get SDK sessions metadata from database
// (We need this because the API doesn't expose sdk_sessions table directly)
// Get SDK sessions metadata via API
console.log('📡 Fetching SDK sessions metadata...');
const sessions: SdkSessionRecord[] = [];
let sessions: SdkSessionRecord[] = [];
if (sdkSessionIds.size > 0) {
// Read directly from database for sdk_sessions table
const Database = (await import('better-sqlite3')).default;
const dbPath = join(homedir(), '.claude-mem', 'claude-mem.db');
if (!existsSync(dbPath)) {
console.error(`❌ Database not found at: ${dbPath}`);
console.error('💡 Has claude-mem been initialized? Try running a session first.');
process.exit(1);
}
const db = new Database(dbPath, { readonly: true });
try {
const placeholders = Array.from(sdkSessionIds).map(() => '?').join(',');
const sessionQuery = `
SELECT * FROM sdk_sessions
WHERE sdk_session_id IN (${placeholders})
ORDER BY started_at_epoch DESC
`;
sessions.push(...db.prepare(sessionQuery).all(...Array.from(sdkSessionIds)));
} finally {
db.close();
const sessionsResponse = await fetch(`${baseUrl}/api/sdk-sessions/batch`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ sdkSessionIds: Array.from(sdkSessionIds) })
});
if (sessionsResponse.ok) {
sessions = await sessionsResponse.json();
} else {
console.warn(`⚠️ Failed to fetch SDK sessions: ${sessionsResponse.status}`);
}
}
console.log(`✅ Found ${sessions.length} SDK sessions`);
+47 -203
View File
@@ -3,37 +3,21 @@
* Import memories from a JSON export file with duplicate prevention
* Usage: npx tsx scripts/import-memories.ts <input-file>
* Example: npx tsx scripts/import-memories.ts windows-memories.json
*
* This script uses the worker API instead of direct database access.
*/
import Database from 'better-sqlite3';
import { existsSync, readFileSync } from 'fs';
import { homedir } from 'os';
import { join } from 'path';
interface ImportStats {
sessionsImported: number;
sessionsSkipped: number;
summariesImported: number;
summariesSkipped: number;
observationsImported: number;
observationsSkipped: number;
promptsImported: number;
promptsSkipped: number;
}
const WORKER_PORT = process.env.CLAUDE_MEM_WORKER_PORT || 37777;
const WORKER_URL = `http://127.0.0.1:${WORKER_PORT}`;
function importMemories(inputFile: string) {
async function importMemories(inputFile: string) {
if (!existsSync(inputFile)) {
console.error(`❌ Input file not found: ${inputFile}`);
process.exit(1);
}
const dbPath = join(homedir(), '.claude-mem', 'claude-mem.db');
if (!existsSync(dbPath)) {
console.error(`❌ Database not found at: ${dbPath}`);
process.exit(1);
}
// Read and parse export file
const exportData = JSON.parse(readFileSync(inputFile, 'utf-8'));
@@ -47,190 +31,50 @@ function importMemories(inputFile: string) {
console.log(`${exportData.totalPrompts} prompts`);
console.log('');
const db = new Database(dbPath);
const stats: ImportStats = {
sessionsImported: 0,
sessionsSkipped: 0,
summariesImported: 0,
summariesSkipped: 0,
observationsImported: 0,
observationsSkipped: 0,
promptsImported: 0,
promptsSkipped: 0
};
// Check if worker is running
try {
// Prepare statements for duplicate checking
const checkSession = db.prepare('SELECT id FROM sdk_sessions WHERE claude_session_id = ?');
const checkSummary = db.prepare('SELECT id FROM session_summaries WHERE sdk_session_id = ?');
const checkObservation = db.prepare(`
SELECT id FROM observations
WHERE sdk_session_id = ?
AND title = ?
AND created_at_epoch = ?
`);
const checkPrompt = db.prepare(`
SELECT id FROM user_prompts
WHERE claude_session_id = ?
AND prompt_number = ?
`);
// Prepare insert statements
const insertSession = db.prepare(`
INSERT INTO sdk_sessions (
claude_session_id, sdk_session_id, project, user_prompt,
started_at, started_at_epoch, completed_at, completed_at_epoch,
status
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const insertSummary = db.prepare(`
INSERT INTO session_summaries (
sdk_session_id, project, request, investigated, learned,
completed, next_steps, files_read, files_edited, notes,
prompt_number, discovery_tokens, created_at, created_at_epoch
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const insertObservation = db.prepare(`
INSERT INTO observations (
sdk_session_id, project, text, type, title, subtitle,
facts, narrative, concepts, files_read, files_modified,
prompt_number, discovery_tokens, created_at, created_at_epoch
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const insertPrompt = db.prepare(`
INSERT INTO user_prompts (
claude_session_id, prompt_number, prompt_text,
created_at, created_at_epoch
) VALUES (?, ?, ?, ?, ?)
`);
// Import in transaction
db.transaction(() => {
// 1. Import sessions first (dependency for everything else)
console.log('🔄 Importing sessions...');
for (const session of exportData.sessions) {
const exists = checkSession.get(session.claude_session_id);
if (exists) {
stats.sessionsSkipped++;
continue;
}
insertSession.run(
session.claude_session_id,
session.sdk_session_id,
session.project,
session.user_prompt,
session.started_at,
session.started_at_epoch,
session.completed_at,
session.completed_at_epoch,
session.status
);
stats.sessionsImported++;
}
console.log(` ✅ Imported: ${stats.sessionsImported}, Skipped: ${stats.sessionsSkipped}`);
// 2. Import summaries (depends on sessions)
console.log('🔄 Importing summaries...');
for (const summary of exportData.summaries) {
const exists = checkSummary.get(summary.sdk_session_id);
if (exists) {
stats.summariesSkipped++;
continue;
}
insertSummary.run(
summary.sdk_session_id,
summary.project,
summary.request,
summary.investigated,
summary.learned,
summary.completed,
summary.next_steps,
summary.files_read,
summary.files_edited,
summary.notes,
summary.prompt_number,
summary.discovery_tokens || 0,
summary.created_at,
summary.created_at_epoch
);
stats.summariesImported++;
}
console.log(` ✅ Imported: ${stats.summariesImported}, Skipped: ${stats.summariesSkipped}`);
// 3. Import observations (depends on sessions)
console.log('🔄 Importing observations...');
for (const obs of exportData.observations) {
const exists = checkObservation.get(
obs.sdk_session_id,
obs.title,
obs.created_at_epoch
);
if (exists) {
stats.observationsSkipped++;
continue;
}
insertObservation.run(
obs.sdk_session_id,
obs.project,
obs.text,
obs.type,
obs.title,
obs.subtitle,
obs.facts,
obs.narrative,
obs.concepts,
obs.files_read,
obs.files_modified,
obs.prompt_number,
obs.discovery_tokens || 0,
obs.created_at,
obs.created_at_epoch
);
stats.observationsImported++;
}
console.log(` ✅ Imported: ${stats.observationsImported}, Skipped: ${stats.observationsSkipped}`);
// 4. Import prompts (depends on sessions)
console.log('🔄 Importing prompts...');
for (const prompt of exportData.prompts) {
const exists = checkPrompt.get(
prompt.claude_session_id,
prompt.prompt_number
);
if (exists) {
stats.promptsSkipped++;
continue;
}
insertPrompt.run(
prompt.claude_session_id,
prompt.prompt_number,
prompt.prompt_text,
prompt.created_at,
prompt.created_at_epoch
);
stats.promptsImported++;
}
console.log(` ✅ Imported: ${stats.promptsImported}, Skipped: ${stats.promptsSkipped}`);
})();
console.log('\n✅ Import complete!');
console.log('📊 Summary:');
console.log(` Sessions: ${stats.sessionsImported} imported, ${stats.sessionsSkipped} skipped`);
console.log(` Summaries: ${stats.summariesImported} imported, ${stats.summariesSkipped} skipped`);
console.log(` Observations: ${stats.observationsImported} imported, ${stats.observationsSkipped} skipped`);
console.log(` Prompts: ${stats.promptsImported} imported, ${stats.promptsSkipped} skipped`);
} finally {
db.close();
const healthCheck = await fetch(`${WORKER_URL}/api/stats`);
if (!healthCheck.ok) {
throw new Error('Worker not responding');
}
} catch (error) {
console.error(`❌ Worker not running at ${WORKER_URL}`);
console.error(' Please ensure the claude-mem worker is running.');
process.exit(1);
}
console.log('🔄 Importing via worker API...');
// Send import request to worker
const response = await fetch(`${WORKER_URL}/api/import`, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
sessions: exportData.sessions || [],
summaries: exportData.summaries || [],
observations: exportData.observations || [],
prompts: exportData.prompts || []
})
});
if (!response.ok) {
const errorText = await response.text();
console.error(`❌ Import failed: ${response.status} ${response.statusText}`);
console.error(` ${errorText}`);
process.exit(1);
}
const result = await response.json();
const stats = result.stats;
console.log('\n✅ Import complete!');
console.log('📊 Summary:');
console.log(` Sessions: ${stats.sessionsImported} imported, ${stats.sessionsSkipped} skipped`);
console.log(` Summaries: ${stats.summariesImported} imported, ${stats.summariesSkipped} skipped`);
console.log(` Observations: ${stats.observationsImported} imported, ${stats.observationsSkipped} skipped`);
console.log(` Prompts: ${stats.promptsImported} imported, ${stats.promptsSkipped} skipped`);
}
// CLI interface
+2 -2
View File
@@ -6,12 +6,12 @@
* native module dependencies.
*/
import path from "path";
import { stdin } from "process";
import { ensureWorkerRunning, getWorkerPort } from "../shared/worker-utils.js";
import { HOOK_TIMEOUTS } from "../shared/hook-constants.js";
import { handleWorkerError } from "../shared/hook-error-handler.js";
import { handleFetchError } from "./shared/error-handler.js";
import { getProjectName } from "../utils/project-name.js";
export interface SessionStartInput {
session_id: string;
@@ -25,7 +25,7 @@ async function contextHook(input?: SessionStartInput): Promise<string> {
await ensureWorkerRunning();
const cwd = input?.cwd ?? process.cwd();
const project = cwd ? path.basename(cwd) : "unknown-project";
const project = getProjectName(cwd);
const port = getWorkerPort();
const url = `http://127.0.0.1:${port}/api/context/inject?project=${encodeURIComponent(project)}`;
+2 -2
View File
@@ -1,9 +1,9 @@
import path from 'path';
import { stdin } from 'process';
import { createHookResponse } from './hook-response.js';
import { ensureWorkerRunning, getWorkerPort } from '../shared/worker-utils.js';
import { handleWorkerError } from '../shared/hook-error-handler.js';
import { handleFetchError } from './shared/error-handler.js';
import { getProjectName } from '../utils/project-name.js';
export interface UserPromptSubmitInput {
session_id: string;
@@ -24,7 +24,7 @@ async function newHook(input?: UserPromptSubmitInput): Promise<void> {
}
const { session_id, cwd, prompt } = input;
const project = path.basename(cwd);
const project = getProjectName(cwd);
const port = getWorkerPort();
+192 -78
View File
@@ -6,14 +6,18 @@
* Maintains MCP protocol handling and tool schemas
*/
// CRITICAL: Redirect console.log to stderr BEFORE any imports
// MCP uses stdio transport where stdout is reserved for JSON-RPC protocol messages.
// Any logs to stdout break the protocol (Claude Desktop parses "[2025..." as JSON array).
const _originalConsoleLog = console.log;
console.log = (...args: any[]) => console.error(...args);
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { logger } from '../utils/logger.js';
import { getWorkerPort, getWorkerHost } from '../shared/worker-utils.js';
@@ -32,7 +36,73 @@ const TOOL_ENDPOINT_MAP: Record<string, string> = {
'timeline': '/api/timeline',
'get_recent_context': '/api/context/recent',
'get_context_timeline': '/api/context/timeline',
'progressive_description': '/api/instructions'
'help': '/api/instructions'
};
/**
* Detailed parameter schemas for each tool
*/
const TOOL_SCHEMAS: Record<string, any> = {
search: {
query: { type: 'string', description: 'Full-text search query' },
type: { type: 'string', description: 'Filter by type: tool_use, tool_result, prompt, summary' },
obs_type: { type: 'string', description: 'Observation type filter' },
concepts: { type: 'string', description: 'Comma-separated concept tags' },
files: { type: 'string', description: 'Comma-separated file paths' },
project: { type: 'string', description: 'Project name filter' },
dateStart: { type: ['string', 'number'], description: 'Start date (ISO or timestamp)' },
dateEnd: { type: ['string', 'number'], description: 'End date (ISO or timestamp)' },
limit: { type: 'number', description: 'Max results (default: 10)' },
offset: { type: 'number', description: 'Result offset for pagination' },
orderBy: { type: 'string', description: 'Sort order: created_at, relevance' }
},
timeline: {
query: { type: 'string', description: 'Search query to find anchor point' },
anchor: { type: 'number', description: 'Observation ID as timeline center' },
depth_before: { type: 'number', description: 'Observations before anchor (default: 5)' },
depth_after: { type: 'number', description: 'Observations after anchor (default: 5)' },
type: { type: 'string', description: 'Filter by type' },
concepts: { type: 'string', description: 'Comma-separated concept tags' },
files: { type: 'string', description: 'Comma-separated file paths' },
project: { type: 'string', description: 'Project name filter' }
},
get_recent_context: {
limit: { type: 'number', description: 'Max results (default: 20)' },
type: { type: 'string', description: 'Filter by type' },
concepts: { type: 'string', description: 'Comma-separated concept tags' },
files: { type: 'string', description: 'Comma-separated file paths' },
project: { type: 'string', description: 'Project name filter' },
dateStart: { type: ['string', 'number'], description: 'Start date' },
dateEnd: { type: ['string', 'number'], description: 'End date' }
},
get_context_timeline: {
anchor: { type: 'number', description: 'Observation ID (required)', required: true },
depth_before: { type: 'number', description: 'Observations before anchor' },
depth_after: { type: 'number', description: 'Observations after anchor' },
type: { type: 'string', description: 'Filter by type' },
concepts: { type: 'string', description: 'Comma-separated concept tags' },
files: { type: 'string', description: 'Comma-separated file paths' },
project: { type: 'string', description: 'Project name filter' }
},
get_observations: {
ids: { type: 'array', items: { type: 'number' }, description: 'Array of observation IDs (required)', required: true },
orderBy: { type: 'string', description: 'Sort order' },
limit: { type: 'number', description: 'Max results' },
project: { type: 'string', description: 'Project filter' }
},
help: {
operation: { type: 'string', description: 'Operation type: "observations", "timeline", "sessions", etc.' },
topic: { type: 'string', description: 'Specific topic for help' }
},
get_observation: {
id: { type: 'number', description: 'Observation ID (required)', required: true }
},
get_session: {
id: { type: 'number', description: 'Session ID (required)', required: true }
},
get_prompt: {
id: { type: 'number', description: 'Prompt ID (required)', required: true }
}
};
/**
@@ -182,25 +252,47 @@ async function verifyWorkerConnection(): Promise<boolean> {
/**
* Tool definitions with HTTP-based handlers
* Descriptions removed - use progressive_description tool for parameter documentation
* Minimal descriptions - use help() tool with operation parameter for detailed docs
*/
const tools = [
{
name: 'get_schema',
description: 'Get parameter schema for a tool. Call get_schema(tool_name) for details',
inputSchema: {
type: 'object',
properties: { tool_name: { type: 'string' } },
required: ['tool_name']
},
handler: async (args: any) => {
// Validate tool_name to prevent prototype pollution
const toolName = args.tool_name;
if (typeof toolName !== 'string' || !Object.hasOwn(TOOL_SCHEMAS, toolName)) {
return {
content: [{
type: 'text' as const,
text: `Unknown tool: ${toolName}\n\nAvailable tools: ${Object.keys(TOOL_SCHEMAS).join(', ')}`
}],
isError: true
};
}
const schema = TOOL_SCHEMAS[toolName];
return {
content: [{
type: 'text' as const,
text: `# ${toolName} Parameters\n\n${JSON.stringify(schema, null, 2)}`
}]
};
}
},
{
name: 'search',
description: 'Search memory',
inputSchema: z.object({
query: z.string().optional(),
type: z.enum(['observations', 'sessions', 'prompts']).optional(),
obs_type: z.string().optional(),
concepts: z.string().optional(),
files: z.string().optional(),
project: z.string().optional(),
dateStart: z.union([z.string(), z.number()]).optional(),
dateEnd: z.union([z.string(), z.number()]).optional(),
limit: z.number().min(1).max(100).default(20),
offset: z.number().min(0).default(0),
orderBy: z.enum(['relevance', 'date_desc', 'date_asc']).default('date_desc')
}),
description: 'Search memory. All parameters optional - call get_schema("search") for details',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['search'];
return await callWorkerAPI(endpoint, args);
@@ -208,17 +300,12 @@ const tools = [
},
{
name: 'timeline',
description: 'Timeline context',
inputSchema: z.object({
query: z.string().optional(),
anchor: z.number().optional(),
depth_before: z.number().min(0).max(100).default(10),
depth_after: z.number().min(0).max(100).default(10),
type: z.string().optional(),
concepts: z.string().optional(),
files: z.string().optional(),
project: z.string().optional()
}),
description: 'Timeline context. All parameters optional - call get_schema("timeline") for details',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['timeline'];
return await callWorkerAPI(endpoint, args);
@@ -226,16 +313,12 @@ const tools = [
},
{
name: 'get_recent_context',
description: 'Recent context',
inputSchema: z.object({
limit: z.number().min(1).max(100).default(30),
type: z.string().optional(),
concepts: z.string().optional(),
files: z.string().optional(),
project: z.string().optional(),
dateStart: z.union([z.string(), z.number()]).optional(),
dateEnd: z.union([z.string(), z.number()]).optional()
}),
description: 'Recent context. All parameters optional - call get_schema("get_recent_context") for details',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['get_recent_context'];
return await callWorkerAPI(endpoint, args);
@@ -243,71 +326,102 @@ const tools = [
},
{
name: 'get_context_timeline',
description: 'Timeline around ID',
inputSchema: z.object({
anchor: z.number(),
depth_before: z.number().min(0).max(100).default(10),
depth_after: z.number().min(0).max(100).default(10),
type: z.string().optional(),
concepts: z.string().optional(),
files: z.string().optional(),
project: z.string().optional()
}),
description: 'Timeline around observation ID',
inputSchema: {
type: 'object',
properties: {
anchor: {
type: 'number',
description: 'Observation ID (required). Optional params: get_schema("get_context_timeline")'
}
},
required: ['anchor'],
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['get_context_timeline'];
return await callWorkerAPI(endpoint, args);
}
},
{
name: 'progressive_description',
description: 'Usage help',
inputSchema: z.object({
topic: z.enum(['workflow', 'search_params', 'examples', 'all']).default('all')
}),
name: 'help',
description: 'Get detailed docs. All parameters optional - call get_schema("help") for details',
inputSchema: {
type: 'object',
properties: {},
additionalProperties: true
},
handler: async (args: any) => {
const endpoint = TOOL_ENDPOINT_MAP['progressive_description'];
const endpoint = TOOL_ENDPOINT_MAP['help'];
return await callWorkerAPI(endpoint, args);
}
},
{
name: 'get_observation',
description: 'Fetch by ID',
inputSchema: z.object({
id: z.number()
}),
description: 'Fetch observation by ID',
inputSchema: {
type: 'object',
properties: {
id: {
type: 'number',
description: 'Observation ID (required)'
}
},
required: ['id']
},
handler: async (args: any) => {
return await callWorkerAPIWithPath('/api/observation', args.id);
}
},
{
name: 'get_batch_observations',
description: 'Batch fetch',
inputSchema: z.object({
ids: z.array(z.number()),
orderBy: z.enum(['date_desc', 'date_asc']).optional(),
limit: z.number().optional(),
project: z.string().optional()
}),
name: 'get_observations',
description: 'Batch fetch observations',
inputSchema: {
type: 'object',
properties: {
ids: {
type: 'array',
items: { type: 'number' },
description: 'Array of observation IDs (required). Optional params: get_schema("get_observations")'
}
},
required: ['ids'],
additionalProperties: true
},
handler: async (args: any) => {
return await callWorkerAPIPost('/api/observations/batch', args);
}
},
{
name: 'get_session',
description: 'Session by ID',
inputSchema: z.object({
id: z.number()
}),
description: 'Fetch session by ID',
inputSchema: {
type: 'object',
properties: {
id: {
type: 'number',
description: 'Session ID (required)'
}
},
required: ['id']
},
handler: async (args: any) => {
return await callWorkerAPIWithPath('/api/session', args.id);
}
},
{
name: 'get_prompt',
description: 'Prompt by ID',
inputSchema: z.object({
id: z.number()
}),
description: 'Fetch prompt by ID',
inputSchema: {
type: 'object',
properties: {
id: {
type: 'number',
description: 'Prompt ID (required)'
}
},
required: ['id']
},
handler: async (args: any) => {
return await callWorkerAPIWithPath('/api/prompt', args.id);
}
@@ -317,7 +431,7 @@ const tools = [
// Create the MCP server
const server = new Server(
{
name: 'claude-mem-search-server',
name: 'mem-search-server',
version: '1.0.0',
},
{
@@ -333,7 +447,7 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
tools: tools.map(tool => ({
name: tool.name,
description: tool.description,
inputSchema: zodToJsonSchema(tool.inputSchema) as Record<string, unknown>
inputSchema: tool.inputSchema
}))
};
});
@@ -382,7 +496,7 @@ async function main() {
if (!workerAvailable) {
logger.warn('SYSTEM', 'Worker not available', undefined, { workerUrl: WORKER_BASE_URL });
logger.warn('SYSTEM', 'Tools will fail until Worker is started');
logger.warn('SYSTEM', 'Start Worker with: npm run worker:restart');
logger.warn('SYSTEM', 'Start Worker with: claude-mem restart');
} else {
logger.info('SYSTEM', 'Worker available', undefined, { workerUrl: WORKER_BASE_URL });
}
+2 -1
View File
@@ -25,6 +25,7 @@ import {
toRelativePath,
extractFirstFile
} from '../shared/timeline-formatting.js';
import { getProjectName } from '../utils/project-name.js';
// Version marker path - use homedir-based path that works in both CJS and ESM contexts
const VERSION_MARKER_PATH = path.join(homedir(), '.claude', 'plugins', 'marketplaces', 'thedotmack', 'plugin', '.install-version');
@@ -222,7 +223,7 @@ function extractPriorMessages(transcriptPath: string): { userMessage: string; as
export async function generateContext(input?: ContextInput, useColors: boolean = false): Promise<string> {
const config = loadContextConfig();
const cwd = input?.cwd ?? process.cwd();
const project = cwd ? path.basename(cwd) : 'unknown-project';
const project = getProjectName(cwd);
let db: SessionStore | null = null;
try {
+144 -21
View File
@@ -5,6 +5,7 @@ import { spawn, spawnSync } from 'child_process';
import { homedir } from 'os';
import { DATA_DIR } from '../../shared/paths.js';
import { getBunPath, isBunAvailable } from '../../utils/bun-path.js';
import { SettingsDefaultsManager } from '../../shared/SettingsDefaultsManager.js';
const PID_FILE = join(DATA_DIR, 'worker.pid');
const LOG_DIR = join(DATA_DIR, 'logs');
@@ -16,6 +17,7 @@ const HEALTH_CHECK_TIMEOUT_MS = 10000;
const HEALTH_CHECK_INTERVAL_MS = 200;
const HEALTH_CHECK_FETCH_TIMEOUT_MS = 1000;
const PROCESS_EXIT_CHECK_INTERVAL_MS = 100;
const HTTP_SHUTDOWN_TIMEOUT_MS = 2000;
interface PidInfo {
pid: number;
@@ -43,8 +45,10 @@ export class ProcessManager {
// Ensure log directory exists
mkdirSync(LOG_DIR, { recursive: true });
// Get worker script path
const workerScript = join(MARKETPLACE_ROOT, 'plugin', 'scripts', 'worker-service.cjs');
// On Windows, use the wrapper script to solve zombie port problem
// On Unix, use the worker directly
const scriptName = process.platform === 'win32' ? 'worker-wrapper.cjs' : 'worker-service.cjs';
const workerScript = join(MARKETPLACE_ROOT, 'plugin', 'scripts', scriptName);
if (!existsSync(workerScript)) {
return { success: false, error: `Worker script not found at ${workerScript}` };
@@ -86,6 +90,10 @@ export class ProcessManager {
// Note: windowsHide: true doesn't work with detached: true (Bun inherits Node.js process spawning semantics)
// See: https://github.com/nodejs/node/issues/21825 and PR #315 for detailed testing
//
// On Windows, we start worker-wrapper.cjs which manages the actual worker-service.cjs.
// This solves the zombie port problem: the wrapper has no sockets, so when it kills
// and respawns the inner worker, the socket is properly released.
//
// Security: All paths (bunPath, script, MARKETPLACE_ROOT) are application-controlled system paths,
// not user input. If an attacker could modify these paths, they would already have full filesystem
// access including direct access to ~/.claude-mem/claude-mem.db. Nevertheless, we properly escape
@@ -93,8 +101,9 @@ export class ProcessManager {
const escapedBunPath = this.escapePowerShellString(bunPath);
const escapedScript = this.escapePowerShellString(script);
const escapedWorkDir = this.escapePowerShellString(MARKETPLACE_ROOT);
const escapedLogFile = this.escapePowerShellString(logFile);
const envVars = `$env:CLAUDE_MEM_WORKER_PORT='${port}'`;
const psCommand = `${envVars}; Start-Process -FilePath '${escapedBunPath}' -ArgumentList '${escapedScript}' -WorkingDirectory '${escapedWorkDir}' -WindowStyle Hidden -PassThru | Select-Object -ExpandProperty Id`;
const psCommand = `${envVars}; Start-Process -FilePath '${escapedBunPath}' -ArgumentList '${escapedScript}' -WorkingDirectory '${escapedWorkDir}' -WindowStyle Hidden -RedirectStandardOutput '${escapedLogFile}' -RedirectStandardError '${escapedLogFile}.err' -PassThru | Select-Object -ExpandProperty Id`;
const result = spawnSync('powershell', ['-Command', psCommand], {
stdio: 'pipe',
@@ -165,21 +174,65 @@ export class ProcessManager {
static async stop(timeout: number = PROCESS_STOP_TIMEOUT_MS): Promise<boolean> {
const info = this.getPidInfo();
if (!info) return true;
try {
process.kill(info.pid, 'SIGTERM');
await this.waitForExit(info.pid, timeout);
} catch {
try {
process.kill(info.pid, 'SIGKILL');
} catch {
// Process already dead
if (process.platform === 'win32') {
// Windows: Try graceful HTTP shutdown first - this works regardless of PID file state
// because the worker shuts itself down from the inside (via wrapper IPC)
const port = info?.port ?? this.getPortFromSettings();
const httpShutdownSucceeded = await this.tryHttpShutdown(port);
if (httpShutdownSucceeded) {
// HTTP shutdown succeeded - worker confirmed down, safe to remove PID file
this.removePidFile();
return true;
}
}
this.removePidFile();
return true;
// HTTP shutdown failed (worker not responding), fall back to taskkill
if (!info) {
// No PID file and HTTP failed - nothing more we can do
return true;
}
const { execSync } = await import('child_process');
try {
// Use taskkill /T /F to kill entire process tree
// This ensures the wrapper AND all its children (inner worker, MCP, ChromaSync) are killed
// which is necessary to properly release the socket and avoid zombie ports
execSync(`taskkill /PID ${info.pid} /T /F`, { timeout: 10000, stdio: 'ignore' });
} catch {
// Process may already be dead
}
// Wait for process to actually exit before removing PID file
try {
await this.waitForExit(info.pid, timeout);
} catch {
// Timeout waiting - process may still be alive
}
// Only remove PID file if process is confirmed dead
if (!this.isProcessAlive(info.pid)) {
this.removePidFile();
}
return true;
} else {
// Unix: Use signals (unchanged behavior)
if (!info) return true;
try {
process.kill(info.pid, 'SIGTERM');
await this.waitForExit(info.pid, timeout);
} catch {
try {
process.kill(info.pid, 'SIGKILL');
} catch {
// Process already dead
}
}
this.removePidFile();
return true;
}
}
static async restart(port: number): Promise<{ success: boolean; pid?: number; error?: string }> {
@@ -210,6 +263,66 @@ export class ProcessManager {
return alive;
}
/**
* Get worker port from settings file
*/
private static getPortFromSettings(): number {
try {
const settingsPath = join(DATA_DIR, 'settings.json');
const settings = SettingsDefaultsManager.loadFromFile(settingsPath);
return parseInt(settings.CLAUDE_MEM_WORKER_PORT, 10);
} catch {
return parseInt(SettingsDefaultsManager.get('CLAUDE_MEM_WORKER_PORT'), 10);
}
}
/**
* Try to shut down the worker via HTTP endpoint
* Returns true if shutdown succeeded, false if worker not responding
*/
private static async tryHttpShutdown(port: number): Promise<boolean> {
try {
// Send shutdown request
const response = await fetch(`http://127.0.0.1:${port}/api/admin/shutdown`, {
method: 'POST',
signal: AbortSignal.timeout(HTTP_SHUTDOWN_TIMEOUT_MS)
});
if (!response.ok) {
return false;
}
// Wait for worker to actually stop responding
return await this.waitForWorkerDown(port, PROCESS_STOP_TIMEOUT_MS);
} catch {
// Worker not responding to HTTP - it may be dead or hung
return false;
}
}
/**
* Wait for worker to stop responding on the given port
*/
private static async waitForWorkerDown(port: number, timeout: number): Promise<boolean> {
const startTime = Date.now();
while (Date.now() - startTime < timeout) {
try {
await fetch(`http://127.0.0.1:${port}/api/health`, {
signal: AbortSignal.timeout(500)
});
// Still responding, wait and retry
await new Promise(resolve => setTimeout(resolve, PROCESS_EXIT_CHECK_INTERVAL_MS));
} catch {
// Worker stopped responding - success
return true;
}
}
// Timeout - worker still responding
return false;
}
// Helper methods
private static getPidInfo(): PidInfo | null {
try {
@@ -252,29 +365,39 @@ export class ProcessManager {
private static async waitForHealth(pid: number, port: number, timeoutMs: number = HEALTH_CHECK_TIMEOUT_MS): Promise<{ success: boolean; pid?: number; error?: string }> {
const startTime = Date.now();
const isWindows = process.platform === 'win32';
// Increase timeout on Windows to account for slower process startup
const adjustedTimeout = isWindows ? timeoutMs * 2 : timeoutMs;
while (Date.now() - startTime < timeoutMs) {
while (Date.now() - startTime < adjustedTimeout) {
// Check if process is still alive
if (!this.isProcessAlive(pid)) {
return { success: false, error: 'Process died during startup' };
const errorMsg = isWindows
? `Process died during startup\n\nTroubleshooting:\n1. Check Task Manager for zombie 'bun.exe' or 'node.exe' processes\n2. Verify port ${port} is not in use: netstat -ano | findstr ${port}\n3. Check worker logs in ~/.claude-mem/logs/\n4. See GitHub issues: #363, #367, #371, #373\n5. Docs: https://docs.claude-mem.ai/troubleshooting/windows-issues`
: 'Process died during startup';
return { success: false, error: errorMsg };
}
// Try health check
// Try readiness check (changed from /health to /api/readiness)
try {
const response = await fetch(`http://127.0.0.1:${port}/health`, {
const response = await fetch(`http://127.0.0.1:${port}/api/readiness`, {
signal: AbortSignal.timeout(HEALTH_CHECK_FETCH_TIMEOUT_MS)
});
if (response.ok) {
return { success: true, pid };
}
} catch {
// Not ready yet
// Not ready yet, continue polling
}
await new Promise(resolve => setTimeout(resolve, HEALTH_CHECK_INTERVAL_MS));
}
return { success: false, error: 'Health check timed out' };
const timeoutMsg = isWindows
? `Worker failed to start on Windows (readiness check timed out after ${adjustedTimeout}ms)\n\nTroubleshooting:\n1. Check Task Manager for zombie 'bun.exe' or 'node.exe' processes\n2. Verify port ${port} is not in use: netstat -ano | findstr ${port}\n3. Check worker logs in ~/.claude-mem/logs/\n4. See GitHub issues: #363, #367, #371, #373\n5. Docs: https://docs.claude-mem.ai/troubleshooting/windows-issues`
: `Readiness check timed out after ${adjustedTimeout}ms`;
return { success: false, error: timeoutMsg };
}
private static async waitForExit(pid: number, timeout: number): Promise<void> {
+238
View File
@@ -1039,6 +1039,36 @@ export class SessionStore {
return stmt.get(id) || null;
}
/**
* Get SDK sessions by SDK session IDs
* Used for exporting session metadata
*/
getSdkSessionsBySessionIds(sdkSessionIds: string[]): {
id: number;
claude_session_id: string;
sdk_session_id: string;
project: string;
user_prompt: string;
started_at: string;
started_at_epoch: number;
completed_at: string | null;
completed_at_epoch: number | null;
status: string;
}[] {
if (sdkSessionIds.length === 0) return [];
const placeholders = sdkSessionIds.map(() => '?').join(',');
const stmt = this.db.prepare(`
SELECT id, claude_session_id, sdk_session_id, project, user_prompt,
started_at, started_at_epoch, completed_at, completed_at_epoch, status
FROM sdk_sessions
WHERE sdk_session_id IN (${placeholders})
ORDER BY started_at_epoch DESC
`);
return stmt.all(...sdkSessionIds) as any[];
}
/**
* Find active SDK session for a Claude session
*/
@@ -1795,4 +1825,212 @@ export class SessionStore {
close(): void {
this.db.close();
}
// ===========================================
// Import Methods (for import-memories script)
// ===========================================
/**
* Import SDK session with duplicate checking
* Returns: { imported: boolean, id: number }
*/
importSdkSession(session: {
claude_session_id: string;
sdk_session_id: string;
project: string;
user_prompt: string;
started_at: string;
started_at_epoch: number;
completed_at: string | null;
completed_at_epoch: number | null;
status: string;
}): { imported: boolean; id: number } {
// Check if session already exists
const existing = this.db.prepare(
'SELECT id FROM sdk_sessions WHERE claude_session_id = ?'
).get(session.claude_session_id) as { id: number } | undefined;
if (existing) {
return { imported: false, id: existing.id };
}
const stmt = this.db.prepare(`
INSERT INTO sdk_sessions (
claude_session_id, sdk_session_id, project, user_prompt,
started_at, started_at_epoch, completed_at, completed_at_epoch, status
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const result = stmt.run(
session.claude_session_id,
session.sdk_session_id,
session.project,
session.user_prompt,
session.started_at,
session.started_at_epoch,
session.completed_at,
session.completed_at_epoch,
session.status
);
return { imported: true, id: result.lastInsertRowid as number };
}
/**
* Import session summary with duplicate checking
* Returns: { imported: boolean, id: number }
*/
importSessionSummary(summary: {
sdk_session_id: string;
project: string;
request: string | null;
investigated: string | null;
learned: string | null;
completed: string | null;
next_steps: string | null;
files_read: string | null;
files_edited: string | null;
notes: string | null;
prompt_number: number | null;
discovery_tokens: number;
created_at: string;
created_at_epoch: number;
}): { imported: boolean; id: number } {
// Check if summary already exists for this session
const existing = this.db.prepare(
'SELECT id FROM session_summaries WHERE sdk_session_id = ?'
).get(summary.sdk_session_id) as { id: number } | undefined;
if (existing) {
return { imported: false, id: existing.id };
}
const stmt = this.db.prepare(`
INSERT INTO session_summaries (
sdk_session_id, project, request, investigated, learned,
completed, next_steps, files_read, files_edited, notes,
prompt_number, discovery_tokens, created_at, created_at_epoch
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const result = stmt.run(
summary.sdk_session_id,
summary.project,
summary.request,
summary.investigated,
summary.learned,
summary.completed,
summary.next_steps,
summary.files_read,
summary.files_edited,
summary.notes,
summary.prompt_number,
summary.discovery_tokens || 0,
summary.created_at,
summary.created_at_epoch
);
return { imported: true, id: result.lastInsertRowid as number };
}
/**
* Import observation with duplicate checking
* Duplicates are identified by sdk_session_id + title + created_at_epoch
* Returns: { imported: boolean, id: number }
*/
importObservation(obs: {
sdk_session_id: string;
project: string;
text: string | null;
type: string;
title: string | null;
subtitle: string | null;
facts: string | null;
narrative: string | null;
concepts: string | null;
files_read: string | null;
files_modified: string | null;
prompt_number: number | null;
discovery_tokens: number;
created_at: string;
created_at_epoch: number;
}): { imported: boolean; id: number } {
// Check if observation already exists
const existing = this.db.prepare(`
SELECT id FROM observations
WHERE sdk_session_id = ? AND title = ? AND created_at_epoch = ?
`).get(obs.sdk_session_id, obs.title, obs.created_at_epoch) as { id: number } | undefined;
if (existing) {
return { imported: false, id: existing.id };
}
const stmt = this.db.prepare(`
INSERT INTO observations (
sdk_session_id, project, text, type, title, subtitle,
facts, narrative, concepts, files_read, files_modified,
prompt_number, discovery_tokens, created_at, created_at_epoch
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const result = stmt.run(
obs.sdk_session_id,
obs.project,
obs.text,
obs.type,
obs.title,
obs.subtitle,
obs.facts,
obs.narrative,
obs.concepts,
obs.files_read,
obs.files_modified,
obs.prompt_number,
obs.discovery_tokens || 0,
obs.created_at,
obs.created_at_epoch
);
return { imported: true, id: result.lastInsertRowid as number };
}
/**
* Import user prompt with duplicate checking
* Duplicates are identified by claude_session_id + prompt_number
* Returns: { imported: boolean, id: number }
*/
importUserPrompt(prompt: {
claude_session_id: string;
prompt_number: number;
prompt_text: string;
created_at: string;
created_at_epoch: number;
}): { imported: boolean; id: number } {
// Check if prompt already exists
const existing = this.db.prepare(`
SELECT id FROM user_prompts
WHERE claude_session_id = ? AND prompt_number = ?
`).get(prompt.claude_session_id, prompt.prompt_number) as { id: number } | undefined;
if (existing) {
return { imported: false, id: existing.id };
}
const stmt = this.db.prepare(`
INSERT INTO user_prompts (
claude_session_id, prompt_number, prompt_text,
created_at, created_at_epoch
) VALUES (?, ?, ?, ?, ?)
`);
const result = stmt.run(
prompt.claude_session_id,
prompt.prompt_number,
prompt.prompt_text,
prompt.created_at,
prompt.created_at_epoch
);
return { imported: true, id: result.lastInsertRowid as number };
}
}
+13 -2
View File
@@ -101,7 +101,9 @@ export class ChromaSync {
// See: https://github.com/thedotmack/claude-mem/issues/170 (Python 3.14 incompatibility)
const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
const pythonVersion = settings.CLAUDE_MEM_PYTHON_VERSION;
this.transport = new StdioClientTransport({
const isWindows = process.platform === 'win32';
const transportOptions: any = {
command: 'uvx',
args: [
'--python', pythonVersion,
@@ -110,7 +112,16 @@ export class ChromaSync {
'--data-dir', this.VECTOR_DB_DIR
],
stderr: 'ignore'
});
};
// CRITICAL: On Windows, try to hide console window to prevent PowerShell popups
// Note: windowsHide may not be supported by MCP SDK's StdioClientTransport
if (isWindows) {
transportOptions.windowsHide = true;
logger.debug('CHROMA_SYNC', 'Windows detected, attempting to hide console window', { project: this.project });
}
this.transport = new StdioClientTransport(transportOptions);
this.client = new Client({
name: 'claude-mem-chroma-sync',
+268 -52
View File
@@ -14,7 +14,7 @@ import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import { getWorkerPort, getWorkerHost } from '../shared/worker-utils.js';
import { logger } from '../utils/logger.js';
import { exec } from 'child_process';
import { exec, execSync } from 'child_process';
import { promisify } from 'util';
const execAsync = promisify(exec);
@@ -32,7 +32,7 @@ import { TimelineService } from './worker/TimelineService.js';
import { SessionEventBroadcaster } from './worker/events/SessionEventBroadcaster.js';
// Import HTTP layer
import { createMiddleware, summarizeRequestBody as summarizeBody } from './worker/http/middleware.js';
import { createMiddleware, summarizeRequestBody as summarizeBody, requireLocalhost } from './worker/http/middleware.js';
import { ViewerRoutes } from './worker/http/routes/ViewerRoutes.js';
import { SessionRoutes } from './worker/http/routes/SessionRoutes.js';
import { DataRoutes } from './worker/http/routes/DataRoutes.js';
@@ -45,6 +45,10 @@ export class WorkerService {
private startTime: number = Date.now();
private mcpClient: Client;
// Initialization flags for MCP/SDK readiness tracking
private mcpReady: boolean = false;
private initializationCompleteFlag: boolean = false;
// Domain services
private dbManager: DatabaseManager;
private sessionManager: SessionManager;
@@ -118,18 +122,46 @@ export class WorkerService {
*/
private setupRoutes(): void {
// Health check endpoint
// TEST_BUILD_ID helps verify which build is running during debugging
const TEST_BUILD_ID = 'TEST-008-wrapper-ipc';
this.app.get('/api/health', (_req, res) => {
res.status(200).json({ status: 'ok' });
res.status(200).json({
status: 'ok',
build: TEST_BUILD_ID,
managed: process.env.CLAUDE_MEM_MANAGED === 'true',
hasIpc: typeof process.send === 'function',
platform: process.platform,
pid: process.pid,
initialized: this.initializationCompleteFlag,
mcpReady: this.mcpReady,
});
});
// Readiness check endpoint - returns 503 until full initialization completes
// Used by ProcessManager and worker-utils to ensure worker is fully ready before routing requests
this.app.get('/api/readiness', (_req, res) => {
if (this.initializationCompleteFlag) {
res.status(200).json({
status: 'ready',
mcpReady: this.mcpReady,
});
} else {
res.status(503).json({
status: 'initializing',
message: 'Worker is still initializing, please retry',
});
}
});
// Version endpoint - returns the worker's current version
this.app.get('/api/version', (_req, res) => {
const { homedir } = require('os');
const { readFileSync } = require('fs');
const marketplaceRoot = path.join(homedir(), '.claude', 'plugins', 'marketplaces', 'thedotmack');
const packageJsonPath = path.join(marketplaceRoot, 'package.json');
try {
// Read version from marketplace package.json
const { homedir } = require('os');
const { readFileSync } = require('fs');
const marketplaceRoot = path.join(homedir(), '.claude', 'plugins', 'marketplaces', 'thedotmack');
const packageJsonPath = path.join(marketplaceRoot, 'package.json');
const packageJson = JSON.parse(readFileSync(packageJsonPath, 'utf-8'));
res.status(200).json({ version: packageJson.version });
} catch (error) {
@@ -146,26 +178,35 @@ export class WorkerService {
// Instructions endpoint - loads SKILL.md sections on-demand for progressive instruction loading
this.app.get('/api/instructions', async (req, res) => {
const topic = (req.query.topic as string) || 'all';
// Read SKILL.md from plugin directory
const operation = req.query.operation as string | undefined;
// Path resolution: __dirname is build output directory (plugin/scripts/)
// SKILL.md is at plugin/skills/mem-search/SKILL.md
const skillPath = path.join(__dirname, '../skills/mem-search/SKILL.md');
// Operations are at plugin/skills/mem-search/operations/*.md
try {
const fullContent = await fs.promises.readFile(skillPath, 'utf-8');
let content: string;
// Extract section based on topic
const section = this.extractInstructionSection(fullContent, topic);
if (operation) {
// Load specific operation file
const operationPath = path.join(__dirname, '../skills/mem-search/operations', `${operation}.md`);
content = await fs.promises.readFile(operationPath, 'utf-8');
} else {
// Load SKILL.md and extract section based on topic (backward compatibility)
const skillPath = path.join(__dirname, '../skills/mem-search/SKILL.md');
const fullContent = await fs.promises.readFile(skillPath, 'utf-8');
content = this.extractInstructionSection(fullContent, topic);
}
// Return in MCP format
res.json({
content: [{
type: 'text',
text: section
text: content
}]
});
} catch (error) {
logger.error('WORKER', 'Failed to load instructions', { topic, skillPath }, error as Error);
logger.error('WORKER', 'Failed to load instructions', { topic, operation }, error as Error);
res.status(500).json({
content: [{
type: 'text',
@@ -176,21 +217,46 @@ export class WorkerService {
}
});
// Admin endpoints for process management
this.app.post('/api/admin/restart', async (_req, res) => {
// Admin endpoints for process management (localhost-only)
this.app.post('/api/admin/restart', requireLocalhost, async (_req, res) => {
res.json({ status: 'restarting' });
setTimeout(async () => {
await this.shutdown();
process.exit(0);
}, 100);
// On Windows, if managed by wrapper, send message to parent to handle restart
// This solves the Windows zombie port problem where sockets aren't properly released
const isWindowsManaged = process.platform === 'win32' &&
process.env.CLAUDE_MEM_MANAGED === 'true' &&
process.send;
if (isWindowsManaged) {
logger.info('SYSTEM', 'Sending restart request to wrapper');
process.send!({ type: 'restart' });
} else {
// Unix or standalone Windows - handle restart ourselves
setTimeout(async () => {
await this.shutdown();
process.exit(0);
}, 100);
}
});
this.app.post('/api/admin/shutdown', async (_req, res) => {
this.app.post('/api/admin/shutdown', requireLocalhost, async (_req, res) => {
res.json({ status: 'shutting_down' });
setTimeout(async () => {
await this.shutdown();
process.exit(0);
}, 100);
// On Windows, if managed by wrapper, send message to parent to handle shutdown
const isWindowsManaged = process.platform === 'win32' &&
process.env.CLAUDE_MEM_MANAGED === 'true' &&
process.send;
if (isWindowsManaged) {
logger.info('SYSTEM', 'Sending shutdown request to wrapper');
process.send!({ type: 'shutdown' });
} else {
// Unix or standalone Windows - handle shutdown ourselves
setTimeout(async () => {
await this.shutdown();
process.exit(0);
}, 100);
}
});
this.viewerRoutes.setupRoutes(this.app);
@@ -261,25 +327,47 @@ export class WorkerService {
*/
private async cleanupOrphanedProcesses(): Promise<void> {
try {
// Find all chroma-mcp processes
const { stdout } = await execAsync('ps aux | grep "chroma-mcp" | grep -v grep || true');
if (!stdout.trim()) {
logger.debug('SYSTEM', 'No orphaned chroma-mcp processes found');
return;
}
const lines = stdout.trim().split('\n');
const isWindows = process.platform === 'win32';
const pids: number[] = [];
for (const line of lines) {
const parts = line.trim().split(/\s+/);
if (parts.length > 1) {
const pid = parseInt(parts[1], 10);
if (!isNaN(pid)) {
if (isWindows) {
// Windows: Use PowerShell Get-CimInstance to find chroma-mcp processes
const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | Select-Object -ExpandProperty ProcessId"`;
const { stdout } = await execAsync(cmd, { timeout: 5000 });
if (!stdout.trim()) {
logger.debug('SYSTEM', 'No orphaned chroma-mcp processes found (Windows)');
return;
}
const pidStrings = stdout.trim().split('\n');
for (const pidStr of pidStrings) {
const pid = parseInt(pidStr.trim(), 10);
// SECURITY: Validate PID is positive integer before adding to list
if (!isNaN(pid) && Number.isInteger(pid) && pid > 0) {
pids.push(pid);
}
}
} else {
// Unix: Use ps aux | grep
const { stdout } = await execAsync('ps aux | grep "chroma-mcp" | grep -v grep || true');
if (!stdout.trim()) {
logger.debug('SYSTEM', 'No orphaned chroma-mcp processes found (Unix)');
return;
}
const lines = stdout.trim().split('\n');
for (const line of lines) {
const parts = line.trim().split(/\s+/);
if (parts.length > 1) {
const pid = parseInt(parts[1], 10);
// SECURITY: Validate PID is positive integer before adding to list
if (!isNaN(pid) && Number.isInteger(pid) && pid > 0) {
pids.push(pid);
}
}
}
}
if (pids.length === 0) {
@@ -287,12 +375,28 @@ export class WorkerService {
}
logger.info('SYSTEM', 'Cleaning up orphaned chroma-mcp processes', {
platform: isWindows ? 'Windows' : 'Unix',
count: pids.length,
pids
});
// Kill all found processes
await execAsync(`kill ${pids.join(' ')}`);
if (isWindows) {
for (const pid of pids) {
// SECURITY: Double-check PID validation before using in taskkill command
if (!Number.isInteger(pid) || pid <= 0) {
logger.warn('SYSTEM', 'Skipping invalid PID', { pid });
continue;
}
try {
execSync(`taskkill /PID ${pid} /T /F`, { timeout: 5000, stdio: 'ignore' });
} catch (error) {
logger.warn('SYSTEM', 'Failed to kill orphaned process', { pid }, error as Error);
}
}
} else {
await execAsync(`kill ${pids.join(' ')}`);
}
logger.info('SYSTEM', 'Orphaned processes cleaned up', { count: pids.length });
} catch (error) {
@@ -346,7 +450,7 @@ export class WorkerService {
this.searchRoutes.setupRoutes(this.app); // Setup search routes now that SearchManager is ready
logger.info('WORKER', 'SearchManager initialized and search routes registered');
// Connect to MCP server
// Connect to MCP server with timeout guard
const mcpServerPath = path.join(__dirname, 'mcp-server.cjs');
const transport = new StdioClientTransport({
command: 'node',
@@ -354,10 +458,19 @@ export class WorkerService {
env: process.env
});
await this.mcpClient.connect(transport);
// Add timeout guard to prevent hanging on MCP connection (15 seconds)
const MCP_INIT_TIMEOUT_MS = 15000;
const mcpConnectionPromise = this.mcpClient.connect(transport);
const timeoutPromise = new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error('MCP connection timeout after 15s')), MCP_INIT_TIMEOUT_MS)
);
await Promise.race([mcpConnectionPromise, timeoutPromise]);
this.mcpReady = true;
logger.success('WORKER', 'Connected to MCP server');
// Signal that initialization is complete
this.initializationCompleteFlag = true;
this.resolveInitialization();
logger.info('SYSTEM', 'Background initialization complete');
} catch (error) {
@@ -399,12 +512,32 @@ export class WorkerService {
/**
* Shutdown the worker service
*
* IMPORTANT: On Windows, we must kill all child processes before exiting
* to prevent zombie ports. The socket handle can be inherited by children,
* and if not properly closed, the port stays bound after process death.
*/
async shutdown(): Promise<void> {
// Shutdown all active sessions
logger.info('SYSTEM', 'Shutdown initiated');
// STEP 1: Enumerate all child processes BEFORE we start closing things
const childPids = await this.getChildProcesses(process.pid);
logger.info('SYSTEM', 'Found child processes', { count: childPids.length, pids: childPids });
// STEP 2: Close HTTP server first
if (this.server) {
this.server.closeAllConnections();
await new Promise<void>((resolve, reject) => {
this.server!.close(err => err ? reject(err) : resolve());
});
this.server = null;
logger.info('SYSTEM', 'HTTP server closed');
}
// STEP 3: Shutdown active sessions
await this.sessionManager.shutdownAll();
// Close MCP client connection (terminates MCP server process)
// STEP 4: Close MCP client connection (signals child to exit gracefully)
if (this.mcpClient) {
try {
await this.mcpClient.close();
@@ -414,19 +547,102 @@ export class WorkerService {
}
}
// Close HTTP server
if (this.server) {
await new Promise<void>((resolve, reject) => {
this.server!.close(err => err ? reject(err) : resolve());
});
}
// Close database connection (includes ChromaSync cleanup)
// STEP 5: Close database connection (includes ChromaSync cleanup)
await this.dbManager.close();
// STEP 6: Force kill any remaining child processes (Windows zombie port fix)
if (childPids.length > 0) {
logger.info('SYSTEM', 'Force killing remaining children');
for (const pid of childPids) {
await this.forceKillProcess(pid);
}
// Wait for children to fully exit
await this.waitForProcessesExit(childPids, 5000);
}
logger.info('SYSTEM', 'Worker shutdown complete');
}
/**
* Get all child process PIDs (Windows-specific)
*/
private async getChildProcesses(parentPid: number): Promise<number[]> {
if (process.platform !== 'win32') {
return [];
}
// SECURITY: Validate PID is a positive integer to prevent command injection
if (!Number.isInteger(parentPid) || parentPid <= 0) {
logger.warn('SYSTEM', 'Invalid parent PID for child process enumeration', { parentPid });
return [];
}
try {
const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq ${parentPid} } | Select-Object -ExpandProperty ProcessId"`;
const { stdout } = await execAsync(cmd, { timeout: 5000 });
return stdout
.trim()
.split('\n')
.map(s => parseInt(s.trim(), 10))
.filter(n => !isNaN(n) && Number.isInteger(n) && n > 0); // SECURITY: Validate each PID
} catch (error) {
logger.warn('SYSTEM', 'Failed to enumerate child processes', {}, error as Error);
return [];
}
}
/**
* Force kill a process by PID (Windows: uses taskkill /F /T)
*/
private async forceKillProcess(pid: number): Promise<void> {
// SECURITY: Validate PID is a positive integer to prevent command injection
if (!Number.isInteger(pid) || pid <= 0) {
logger.warn('SYSTEM', 'Invalid PID for force kill', { pid });
return;
}
try {
if (process.platform === 'win32') {
// /T kills entire process tree, /F forces termination
await execAsync(`taskkill /PID ${pid} /T /F`, { timeout: 5000 });
logger.info('SYSTEM', 'Killed process', { pid });
} else {
process.kill(pid, 'SIGKILL');
}
} catch (error) {
// Process may already be dead, which is fine
logger.debug('SYSTEM', 'Process already dead or kill failed', { pid });
}
}
/**
* Wait for processes to fully exit
*/
private async waitForProcessesExit(pids: number[], timeoutMs: number): Promise<void> {
const start = Date.now();
while (Date.now() - start < timeoutMs) {
const stillAlive = pids.filter(pid => {
try {
process.kill(pid, 0); // Signal 0 checks if process exists
return true;
} catch {
return false;
}
});
if (stillAlive.length === 0) {
logger.info('SYSTEM', 'All child processes exited');
return;
}
logger.debug('SYSTEM', 'Waiting for processes to exit', { stillAlive });
await new Promise(r => setTimeout(r, 100));
}
logger.warn('SYSTEM', 'Timeout waiting for child processes to exit');
}
/**
* Summarize request body for logging
* Used to avoid logging sensitive data or large payloads
+157
View File
@@ -0,0 +1,157 @@
/**
* Worker Wrapper - Manages worker process lifecycle
*
* This wrapper exists to solve the Windows zombie port problem.
* The wrapper spawns the actual worker as a child process.
* When shutdown is requested, the wrapper kills the child and exits.
* The hooks will start a fresh wrapper+worker if needed.
*
* The wrapper itself has no sockets, so Bun's socket cleanup bug
* doesn't affect it.
*
* NOTE: The wrapper does NOT auto-restart the worker on crash.
* This is intentional - the hooks handle startup via ensureWorkerRunning().
* Auto-restart would cause PID file mismatches and potential infinite loops.
*/
import { spawn, ChildProcess, execSync } from 'child_process';
import path from 'path';
const isWindows = process.platform === 'win32';
const SCRIPT_DIR = __dirname;
const INNER_SCRIPT = path.join(SCRIPT_DIR, 'worker-service.cjs');
let inner: ChildProcess | null = null;
let isShuttingDown = false;
function log(msg: string) {
const timestamp = new Date().toISOString();
console.log(`[${timestamp}] [wrapper] ${msg}`);
}
function spawnInner() {
log(`Spawning inner worker: ${INNER_SCRIPT}`);
inner = spawn(process.execPath, [INNER_SCRIPT], {
stdio: ['inherit', 'inherit', 'inherit', 'ipc'],
env: { ...process.env, CLAUDE_MEM_MANAGED: 'true' },
cwd: path.dirname(INNER_SCRIPT),
});
inner.on('message', async (msg: { type: string }) => {
if (msg.type === 'restart' || msg.type === 'shutdown') {
// Both restart and shutdown: kill inner and exit wrapper
// The hooks will start a fresh wrapper+inner if needed
log(`${msg.type} requested by inner`);
isShuttingDown = true;
await killInner();
log('Exiting wrapper');
process.exit(0);
}
});
inner.on('exit', (code, signal) => {
log(`Inner exited with code=${code}, signal=${signal}`);
inner = null;
// Don't auto-restart - let hooks handle it via ensureWorkerRunning()
// Auto-restart causes PID file mismatches and potential infinite loops
if (!isShuttingDown) {
log('Inner exited unexpectedly, wrapper exiting (hooks will restart if needed)');
process.exit(code ?? 1);
}
});
inner.on('error', (err) => {
log(`Inner error: ${err.message}`);
});
}
async function killInner(): Promise<void> {
if (!inner || !inner.pid) {
log('No inner process to kill');
return;
}
const pid = inner.pid;
log(`Killing inner process tree (pid=${pid})`);
if (isWindows) {
// On Windows, use taskkill /T /F to kill entire process tree
// This ensures all children (MCP server, ChromaSync, etc.) are killed
// which is necessary to properly release the socket
try {
execSync(`taskkill /PID ${pid} /T /F`, { timeout: 10000, stdio: 'ignore' });
log(`taskkill completed for pid=${pid}`);
} catch (error) {
// Process may already be dead
log(`taskkill failed (process may be dead): ${error}`);
}
} else {
// On Unix, SIGTERM then SIGKILL
inner.kill('SIGTERM');
// Wait for exit with timeout
const exitPromise = new Promise<void>(resolve => {
if (!inner) {
resolve();
return;
}
inner.on('exit', () => resolve());
});
const timeoutPromise = new Promise<void>(resolve =>
setTimeout(() => resolve(), 5000)
);
await Promise.race([exitPromise, timeoutPromise]);
// Force kill if still alive
if (inner && !inner.killed) {
log('Inner did not exit gracefully, force killing');
inner.kill('SIGKILL');
}
}
// Wait for the process to fully exit
await waitForProcessExit(pid, 5000);
inner = null;
log('Inner process terminated');
}
async function waitForProcessExit(pid: number, timeoutMs: number): Promise<void> {
const start = Date.now();
while (Date.now() - start < timeoutMs) {
try {
process.kill(pid, 0); // Check if process exists
await new Promise(r => setTimeout(r, 100));
} catch {
// Process is dead
return;
}
}
log(`Timeout waiting for process ${pid} to exit`);
}
// Handle wrapper signals
process.on('SIGTERM', async () => {
log('Wrapper received SIGTERM');
isShuttingDown = true;
await killInner();
process.exit(0);
});
process.on('SIGINT', async () => {
log('Wrapper received SIGINT');
isShuttingDown = true;
await killInner();
process.exit(0);
});
// Start the inner worker
log('Wrapper starting');
spawnInner();
+1 -1
View File
@@ -18,7 +18,7 @@ export class FormattingService {
💡 Search Strategy:
1. Search with index to see titles, dates, IDs
2. Use timeline to get context around interesting results
3. Batch fetch full details: get_batch_observations(ids=[...])
3. Batch fetch full details: get_observations(ids=[...])
Tips:
Filter by type: obs_type="bugfix,feature"
+1 -1
View File
@@ -108,7 +108,7 @@ Settings and configuration (use domain services directly):
- Keep all existing behavior identical
**MCP vs Direct DB Split** (inherited, not changed in Phase 1):
- Search operations → MCP server (claude-mem-search)
- Search operations → MCP server (mem-search)
- Session/data operations → Direct DB access via domain services
## Future Phase 2
+13 -3
View File
@@ -113,7 +113,7 @@ export class SDKAgent {
// Calculate discovery tokens (delta for this response only)
const discoveryTokens = (session.cumulativeInputTokens + session.cumulativeOutputTokens) - tokensBeforeResponse;
// Only log non-empty responses (filter out noise)
// Process response (empty or not) and mark messages as processed
if (responseSize > 0) {
const truncatedResponse = responseSize > 100
? textContent.substring(0, 100) + '...'
@@ -125,6 +125,9 @@ export class SDKAgent {
// Parse and process response with discovery token delta
await this.processSDKResponse(session, textContent, worker, discoveryTokens);
} else {
// Empty response - still need to mark pending messages as processed
await this.markMessagesProcessed(session, worker);
}
}
@@ -396,8 +399,15 @@ export class SDKAgent {
}
}
// CRITICAL: Mark ALL pending messages as successfully processed
// This prevents message loss if worker crashes before SDK finishes
// Mark messages as processed after successful observation/summary storage
await this.markMessagesProcessed(session, worker);
}
/**
* Mark all pending messages as successfully processed
* CRITICAL: Prevents message loss and duplicate processing
*/
private async markMessagesProcessed(session: ActiveSession, worker: any | undefined): Promise<void> {
const pendingMessageStore = this.sessionManager.getPendingMessageStore();
if (session.pendingProcessingIds.size > 0) {
for (const messageId of session.pendingProcessingIds) {
+12 -1
View File
@@ -87,7 +87,7 @@ export class SearchManager {
try {
// Normalize URL-friendly params to internal format
const normalized = this.normalizeParams(args);
const { query, type, obs_type, concepts, files, ...options } = normalized;
const { query, type, obs_type, concepts, files, format, ...options } = normalized;
let observations: ObservationSearchResult[] = [];
let sessions: SessionSummarySearchResult[] = [];
let prompts: UserPromptSearchResult[] = [];
@@ -200,6 +200,17 @@ export class SearchManager {
const totalResults = observations.length + sessions.length + prompts.length;
// JSON format: return raw data for programmatic access (e.g., export scripts)
if (format === 'json') {
return {
observations,
sessions,
prompts,
totalResults,
query: query || ''
};
}
if (totalResults === 0) {
return {
content: [{
+28
View File
@@ -60,6 +60,34 @@ export function createMiddleware(
return middlewares;
}
/**
* Middleware to require localhost-only access
* Used for admin endpoints that should not be exposed when binding to 0.0.0.0
*/
export function requireLocalhost(req: Request, res: Response, next: NextFunction): void {
const clientIp = req.ip || req.connection.remoteAddress || '';
const isLocalhost =
clientIp === '127.0.0.1' ||
clientIp === '::1' ||
clientIp === '::ffff:127.0.0.1' ||
clientIp === 'localhost';
if (!isLocalhost) {
logger.warn('SECURITY', 'Admin endpoint access denied - not localhost', {
endpoint: req.path,
clientIp,
method: req.method
});
res.status(403).json({
error: 'Forbidden',
message: 'Admin endpoints are only accessible from localhost'
});
return;
}
next();
}
/**
* Summarize request body for logging
* Used to avoid logging sensitive data or large payloads
@@ -40,6 +40,7 @@ export class DataRoutes extends BaseRouteHandler {
app.get('/api/observation/:id', this.handleGetObservationById.bind(this));
app.post('/api/observations/batch', this.handleGetObservationsByIds.bind(this));
app.get('/api/session/:id', this.handleGetSessionById.bind(this));
app.post('/api/sdk-sessions/batch', this.handleGetSdkSessionsByIds.bind(this));
app.get('/api/prompt/:id', this.handleGetPromptById.bind(this));
// Metadata endpoints
@@ -49,6 +50,9 @@ export class DataRoutes extends BaseRouteHandler {
// Processing status endpoints
app.get('/api/processing-status', this.handleGetProcessingStatus.bind(this));
app.post('/api/processing', this.handleSetProcessing.bind(this));
// Import endpoint
app.post('/api/import', this.handleImport.bind(this));
}
/**
@@ -146,6 +150,24 @@ export class DataRoutes extends BaseRouteHandler {
res.json(sessions[0]);
});
/**
* Get SDK sessions by SDK session IDs
* POST /api/sdk-sessions/batch
* Body: { sdkSessionIds: string[] }
*/
private handleGetSdkSessionsByIds = this.wrapHandler((req: Request, res: Response): void => {
const { sdkSessionIds } = req.body;
if (!Array.isArray(sdkSessionIds)) {
this.badRequest(res, 'sdkSessionIds must be an array');
return;
}
const store = this.dbManager.getSessionStore();
const sessions = store.getSdkSessionsBySessionIds(sdkSessionIds);
res.json(sessions);
});
/**
* Get user prompt by ID
* GET /api/prompt/:id
@@ -267,4 +289,79 @@ export class DataRoutes extends BaseRouteHandler {
return { offset, limit, project };
}
/**
* Import memories from export file
* POST /api/import
* Body: { sessions: [], summaries: [], observations: [], prompts: [] }
*/
private handleImport = this.wrapHandler((req: Request, res: Response): void => {
const { sessions, summaries, observations, prompts } = req.body;
const stats = {
sessionsImported: 0,
sessionsSkipped: 0,
summariesImported: 0,
summariesSkipped: 0,
observationsImported: 0,
observationsSkipped: 0,
promptsImported: 0,
promptsSkipped: 0
};
const store = this.dbManager.getSessionStore();
// Import sessions first (dependency for everything else)
if (Array.isArray(sessions)) {
for (const session of sessions) {
const result = store.importSdkSession(session);
if (result.imported) {
stats.sessionsImported++;
} else {
stats.sessionsSkipped++;
}
}
}
// Import summaries (depends on sessions)
if (Array.isArray(summaries)) {
for (const summary of summaries) {
const result = store.importSessionSummary(summary);
if (result.imported) {
stats.summariesImported++;
} else {
stats.summariesSkipped++;
}
}
}
// Import observations (depends on sessions)
if (Array.isArray(observations)) {
for (const obs of observations) {
const result = store.importObservation(obs);
if (result.imported) {
stats.observationsImported++;
} else {
stats.observationsSkipped++;
}
}
}
// Import prompts (depends on sessions)
if (Array.isArray(prompts)) {
for (const prompt of prompts) {
const result = store.importUserPrompt(prompt);
if (result.imported) {
stats.promptsImported++;
} else {
stats.promptsSkipped++;
}
}
}
res.json({
success: true,
stats
});
});
}
@@ -7,7 +7,7 @@
import express, { Request, Response } from 'express';
import path from 'path';
import { readFileSync } from 'fs';
import { readFileSync, existsSync } from 'fs';
import { getPackageRoot } from '../../../../shared/paths.js';
import { SSEBroadcaster } from '../../SSEBroadcaster.js';
import { DatabaseManager } from '../../DatabaseManager.js';
@@ -41,7 +41,19 @@ export class ViewerRoutes extends BaseRouteHandler {
*/
private handleViewerUI = this.wrapHandler((req: Request, res: Response): void => {
const packageRoot = getPackageRoot();
const viewerPath = path.join(packageRoot, 'plugin', 'ui', 'viewer.html');
// Try cache structure first (ui/viewer.html), then marketplace structure (plugin/ui/viewer.html)
const viewerPaths = [
path.join(packageRoot, 'ui', 'viewer.html'),
path.join(packageRoot, 'plugin', 'ui', 'viewer.html')
];
const viewerPath = viewerPaths.find(p => existsSync(p));
if (!viewerPath) {
throw new Error('Viewer UI not found at any expected location');
}
const html = readFileSync(viewerPath, 'utf-8');
res.setHeader('Content-Type', 'text/html');
res.send(html);
+4 -3
View File
@@ -58,17 +58,18 @@ export function getWorkerHost(): string {
}
/**
* Check if worker is responsive by trying the health endpoint
* Check if worker is responsive and fully initialized by trying the readiness endpoint
* Changed from /health to /api/readiness to ensure MCP initialization is complete
*/
async function isWorkerHealthy(): Promise<boolean> {
try {
const port = getWorkerPort();
const response = await fetch(`http://127.0.0.1:${port}/health`, {
const response = await fetch(`http://127.0.0.1:${port}/api/readiness`, {
signal: AbortSignal.timeout(HEALTH_CHECK_TIMEOUT_MS)
});
return response.ok;
} catch (error) {
logger.debug('SYSTEM', 'Worker health check failed', {
logger.debug('SYSTEM', 'Worker readiness check failed', {
error: error instanceof Error ? error.message : String(error),
errorType: error?.constructor?.name
});
+2 -16
View File
@@ -24,18 +24,6 @@ export function getWorkerRestartInstructions(
actualError
} = options;
const isWindows = process.platform === 'win32';
// Platform-specific directory paths
const pluginDir = isWindows
? '%USERPROFILE%\\.claude\\plugins\\marketplaces\\thedotmack'
: '~/.claude/plugins/marketplaces/thedotmack';
// Platform-specific terminal name
const terminal = isWindows
? 'Command Prompt or PowerShell'
: 'Terminal';
// Build error message
const prefix = customPrefix || 'Worker service connection failed.';
const portInfo = port ? ` (port ${port})` : '';
@@ -43,10 +31,8 @@ export function getWorkerRestartInstructions(
let message = `${prefix}${portInfo}\n\n`;
message += `To restart the worker:\n`;
message += `1. Exit Claude Code completely\n`;
message += `2. Open ${terminal}\n`;
message += `3. Navigate to: ${pluginDir}\n`;
message += `4. Run: npm run worker:restart\n`;
message += `5. Restart Claude Code`;
message += `2. Run: claude-mem restart\n`;
message += `3. Restart Claude Code`;
if (includeSkillFallback) {
message += `\n\nIf that doesn't work, try: /troubleshoot`;
+37
View File
@@ -0,0 +1,37 @@
import path from 'path';
import { logger } from './logger.js';
/**
* Extract project name from working directory path
* Handles edge cases: null/undefined cwd, drive roots, trailing slashes
*
* @param cwd - Current working directory (absolute path)
* @returns Project name or "unknown-project" if extraction fails
*/
export function getProjectName(cwd: string | null | undefined): string {
if (!cwd || cwd.trim() === '') {
logger.warn('PROJECT_NAME', 'Empty cwd provided, using fallback', { cwd });
return 'unknown-project';
}
// Extract basename (handles trailing slashes automatically)
const basename = path.basename(cwd);
// Edge case: Drive roots on Windows (C:\, J:\) or Unix root (/)
// path.basename('C:\') returns '' (empty string)
if (basename === '') {
// Extract drive letter on Windows, or use 'root' on Unix
const isWindows = process.platform === 'win32';
if (isWindows && cwd.match(/^[A-Z]:\\/i)) {
const driveLetter = cwd[0].toUpperCase();
const projectName = `drive-${driveLetter}`;
logger.info('PROJECT_NAME', 'Drive root detected', { cwd, projectName });
return projectName;
} else {
logger.warn('PROJECT_NAME', 'Root directory detected, using fallback', { cwd });
return 'unknown-project';
}
}
return basename;
}
@@ -48,7 +48,7 @@ describe('Hook Error Logging', () => {
handleFetchError(mockResponse, errorText, context);
} catch (error: any) {
expect(error.message).toContain('Failed Observation storage for Bash');
expect(error.message).toContain('npm run worker:restart');
expect(error.message).toContain('claude-mem restart');
}
});
@@ -119,7 +119,7 @@ describe('Hook Error Logging', () => {
expect(() => {
handleWorkerError(connError);
}).toThrow('npm run worker:restart');
}).toThrow('claude-mem restart');
});
it('re-throws non-connection errors unchanged', () => {
@@ -130,7 +130,7 @@ describe('Hook Error Logging', () => {
expect.fail('Should have thrown');
} catch (error: any) {
expect(error.message).toBe('Something went wrong');
expect(error.message).not.toContain('npm run worker:restart');
expect(error.message).not.toContain('claude-mem restart');
}
});
@@ -227,7 +227,7 @@ describe('Hook Error Logging', () => {
handleFetchError(mockResponse, 'error', context);
} catch (error: any) {
// Must include restart command
expect(error.message).toMatch(/npm run worker:restart/);
expect(error.message).toMatch(/claude-mem restart/);
// Must be user-facing (no technical jargon)
expect(error.message).not.toContain('ECONNREFUSED');