Compare commits

7 Commits

Author SHA1 Message Date
Alex Newman 98d87d7573 chore: bump version to 10.0.4
Reverts v10.0.3 chroma-mcp spawn storm fix (broken release).
Restores codebase to v10.0.2 state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 21:36:34 -05:00
Alex Newman 0dda593c45 docs: update CHANGELOG.md for v10.0.3 2026-02-11 15:45:25 -05:00
Alex Newman 1bfb473c19 chore: bump version to 10.0.3 2026-02-11 15:44:45 -05:00
Alex Newman 3f01baebfe Merge remote-tracking branch 'origin/main' into fix/chroma-mcp-spawn-storm
# Conflicts:
#	src/services/worker-service.ts
#	tests/infrastructure/process-manager.test.ts
2026-02-11 15:43:08 -05:00
Alex Newman 46b61857ab docs: update CHANGELOG.md for v10.0.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 15:26:31 -05:00
Rod Boev 2c5c99c0c7 fix: use etime-based sorting instead of PID ordering for process guards
Addresses Greptile review feedback:
- ChromaSync: replace PID-based sort with ps etime column + parseElapsedTime()
  for reliable age ordering (PIDs wrap and don't guarantee ordering)
- ProcessManager: filter out entries with unparseable etime (-1) before
  sorting to prevent sort corruption in cleanupExcessChromaProcesses()
2026-02-11 07:19:28 -05:00
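
As an illustration of the etime-based ordering described in the commit above, here is a minimal sketch. It assumes the standard `ps` `etime` format `[[dd-]hh:]mm:ss`. The function name `parseElapsedTime` and the `-1` sentinel come from the commit message, but the body, the `ProcessEntry` type, and the `oldestFirst` helper are hypothetical and are not taken from the repository.

```typescript
// Hypothetical sketch, not the project's actual implementation.
// Parses the `etime` column of `ps` ([[dd-]hh:]mm:ss) into seconds.
// Returns -1 when the value cannot be parsed.
function parseElapsedTime(etime: string): number {
  const match = /^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$/.exec(etime.trim());
  if (!match) return -1;
  const [, days = "0", hours = "0", minutes, seconds] = match;
  return (
    Number(days) * 86400 +
    Number(hours) * 3600 +
    Number(minutes) * 60 +
    Number(seconds)
  );
}

// Hypothetical shape for one row of `ps` output.
interface ProcessEntry {
  pid: number;
  etime: string;
}

// Drops entries whose etime is unparseable (-1), then sorts oldest first,
// so an invalid age can never influence the ordering.
function oldestFirst(entries: ProcessEntry[]): ProcessEntry[] {
  return entries
    .map((entry) => ({ entry, age: parseElapsedTime(entry.etime) }))
    .filter((item) => item.age >= 0)
    .sort((a, b) => b.age - a.age)
    .map((item) => item.entry);
}
```

Filtering before sorting matters because a `-1` age would otherwise be compared as if it were a real duration and could place a long-running process among the youngest.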
Rod Boev a3f9e7f638 fix: prevent chroma-mcp spawn storm with 5-layer defense (641 processes → max 2)
During SIGHUP testing with 6+ active sessions, ChromaSync.ensureConnection()
had no mutex — concurrent fire-and-forget syncObservation() calls each spawned
a chroma-mcp subprocess via StdioClientTransport, creating 641 orphans in ~5min.
Error-driven reconnection formed a positive feedback loop amplifying the storm.

Defense layers:
- Layer 0: Connection mutex via promise memoization (prevents concurrent spawns)
- Layer 1: Pre-spawn process count guard using execFileSync('ps') (kills excess)
- Layer 2: Hardened close() with try-finally + Unix pkill in GracefulShutdown
- Layer 3: Count-based orphan reaper in ProcessManager (not age-based)
- Layer 4: Circuit breaker stops retries after 3 consecutive failures for 60s

Closes #1063, closes #695
Relates to #1010, #707
2026-02-11 07:19:28 -05:00
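
Layer 0 in the commit above (connection mutex via promise memoization) can be sketched roughly as follows. This is a simplified illustration under assumptions: `ConnectionGate`, `Connection`, and the injected `connect` callback are hypothetical names, and the real `ChromaSync.ensureConnection()` may be structured differently.

```typescript
// Hypothetical stand-in for the real connection object.
interface Connection {
  id: number;
}

// Sketch of promise memoization: concurrent callers share one in-flight
// connect attempt instead of each starting their own.
class ConnectionGate {
  private connection: Connection | null = null;
  private pending: Promise<Connection> | null = null;

  // `connect` is an assumed callback that performs the expensive spawn.
  constructor(private readonly connect: () => Promise<Connection>) {}

  ensureConnection(): Promise<Connection> {
    if (this.connection) return Promise.resolve(this.connection);
    // A connect attempt is already in flight: reuse its promise.
    if (this.pending) return this.pending;

    this.pending = this.connect()
      .then((conn) => {
        this.connection = conn;
        return conn;
      })
      .finally(() => {
        // Clear the memoized promise so a failed attempt can be retried.
        this.pending = null;
      });
    return this.pending;
  }
}

// Usage: three concurrent callers trigger a single connect call.
let spawnCount = 0;
const gate = new ConnectionGate(async () => {
  spawnCount += 1;
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return { id: spawnCount };
});

async function demo(): Promise<number> {
  await Promise.all([
    gate.ensureConnection(),
    gate.ensureConnection(),
    gate.ensureConnection(),
  ]);
  return spawnCount;
}
```

The key point is that the pending promise is stored synchronously, before any `await`, so a second caller arriving in the same tick already sees it and cannot race through a check-then-act gap.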
7 changed files with 62 additions and 45 deletions
+1 -1
@@ -10,7 +10,7 @@
"plugins": [
{
"name": "claude-mem",
-"version": "10.0.2",
+"version": "10.0.4",
"source": "./plugin",
"description": "Persistent memory system for Claude Code - context compression across sessions"
}
+54 -37
@@ -2,6 +2,60 @@
All notable changes to claude-mem.
## [v10.0.3] - 2026-02-11
## Fix: Prevent chroma-mcp spawn storm (PR #1065)
Fixes a critical bug where killing the worker daemon during active sessions caused **641 chroma-mcp Python processes** to spawn in ~5 minutes, consuming 75%+ CPU and ~64GB virtual memory.
### Root Cause
`ChromaSync.ensureConnection()` had no connection mutex. Concurrent fire-and-forget `syncObservation()` calls from multiple sessions raced through the check-then-act guard, each spawning a chroma-mcp subprocess via `StdioClientTransport`. Error-driven reconnection created a positive feedback loop.
### 5-Layer Defense
| Layer | Mechanism | Purpose |
|-------|-----------|---------|
| **0** | Connection mutex via promise memoization | Coalesces concurrent callers onto a single spawn attempt |
| **1** | Pre-spawn process count guard (`execFileSync('ps')`) | Kills excess chroma-mcp processes before spawning new ones |
| **2** | Hardened `close()` with try-finally + Unix `pkill -P` fallback | Guarantees state reset even on error, kills orphaned children |
| **3** | Count-based orphan reaper in `ProcessManager` | Kills by count (not age), catches spawn storms where all processes are young |
| **4** | Circuit breaker (3 failures → 60s cooldown) | Stops error-driven reconnection positive feedback loop |
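
The circuit breaker in Layer 4 can be illustrated with a minimal sketch. The thresholds (3 consecutive failures, 60s cooldown) come from the table above; the `CircuitBreaker` class, its method names, and the injectable clock are assumptions made for illustration and are not the project's code.

```typescript
// Hypothetical sketch of a consecutive-failure circuit breaker.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly maxFailures: number = 3,
    private readonly cooldownMs: number = 60_000,
    // Injectable clock (an assumption) so the cooldown can be tested.
    private readonly now: () => number = () => Date.now(),
  ) {}

  // True when a new attempt is allowed; false while the breaker is open.
  canAttempt(): boolean {
    if (this.openedAt === null) return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the breaker and start counting afresh.
      this.openedAt = null;
      this.failures = 0;
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) {
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `canAttempt()` before reconnecting and skip the attempt while it returns `false`, which breaks the loop where each failed reconnect immediately triggers another.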
### Additional Fix
- Process guards now use `etime`-based sorting instead of PID ordering for reliable age determination (PIDs wrap and don't guarantee ordering)
### Testing
- 16 new tests for mutex, circuit breaker, close() hardening, and count guard
- All tests pass (947 pass, 3 skip)
Closes #1063, closes #695. Relates to #1010, #707.
**Contributors:** @rodboev
## [v10.0.2] - 2026-02-11
## Bug Fixes
- **Prevent daemon silent death from SIGHUP + unhandled errors** — Worker process could silently die when receiving SIGHUP signals or encountering unhandled errors, leaving hooks without a backend. Now properly handles these signals and prevents silent crashes.
- **Hook resilience and worker lifecycle improvements** — Comprehensive fixes for hook command error classification, addressing issues #957, #923, #984, #987, and #1042. Hooks now correctly distinguish between worker unavailability errors and other failures.
- **Clarify TypeError order dependency in error classifier** — Fixed error classification logic to properly handle TypeError ordering edge cases.
## New Features
- **Project-scoped statusline counter utility** — Added `statusline-counts.js` for tracking observation counts per project in the Claude Code status line.
## Internal
- Added test coverage for hook command error classification and process manager
- Worker service and MCP server lifecycle improvements
- Process manager enhancements for better cross-platform stability
### Contributors
- @rodboev — Hook resilience and worker lifecycle fixes (PR #1056)
## [v10.0.1] - 2026-02-11
## What's Changed
@@ -1469,40 +1523,3 @@ Refactored context loading logic to differentiate between code and non-code mode
Fix critical worker crashes on startup (v8.0.2 regression)
## [v8.0.2] - 2025-12-23
New "chill" remix of code mode for users who want fewer, more selective observations.
## Features
- **code--chill mode**: A behavioral variant that produces fewer observations
- Only records things "painful to rediscover" - shipped features, architectural decisions, non-obvious gotchas
- Skips routine work, straightforward implementations, and obvious changes
- Philosophy: "When in doubt, skip it"
## Documentation
- Updated modes.mdx with all 28 language modes (was 10)
- Added Code Mode Variants section documenting chill mode
## Usage
Set in ~/.claude-mem/settings.json:
```json
{
"CLAUDE_MEM_MODE": "code--chill"
}
```
## [v8.0.1] - 2025-12-23
## 🎨 UI Improvements
- **Header Redesign**: Moved documentation and X (Twitter) links from settings modal to main header for better accessibility
- **Removed Product Hunt Badge**: Cleaned up header layout by removing the Product Hunt badge
- **Icon Reorganization**: Reordered header icons for improved UX flow (Docs → X → Discord → GitHub)
---
🤖 Generated with [Claude Code](https://claude.com/claude-code)
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "10.0.2",
+"version": "10.0.4",
"description": "Memory compression system for Claude Code - persist context across sessions",
"keywords": [
"claude",
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem",
-"version": "10.0.2",
+"version": "10.0.4",
"description": "Persistent memory system for Claude Code - seamlessly preserve context across sessions",
"author": {
"name": "Alex Newman"
+1 -1
@@ -1,6 +1,6 @@
{
"name": "claude-mem-plugin",
-"version": "10.0.2",
+"version": "10.0.4",
"private": true,
"description": "Runtime dependencies for claude-mem bundled hooks",
"type": "module",
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long