Commit Graph

8 Commits

Author SHA1 Message Date
Alex Newman 80a8c90a1a feat: add embedded Process Supervisor for unified process lifecycle (#1370)
* feat: add embedded Process Supervisor for unified process lifecycle management

Consolidates scattered process management (ProcessManager, GracefulShutdown,
HealthMonitor, ProcessRegistry) into a unified src/supervisor/ module.

New: ProcessRegistry with JSON persistence, env sanitizer (strips CLAUDECODE_*
vars), graceful shutdown cascade (SIGTERM → 5s wait → SIGKILL with tree-kill
on Windows), PID file liveness validation, and singleton Supervisor API.

Fixes #1352 (worker inherits CLAUDECODE env causing nested sessions)
Fixes #1356 (zombie TCP socket after Windows reboot)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add session-scoped process reaping to supervisor

Adds reapSession(sessionId) to ProcessRegistry for killing session-tagged
processes on session end. SessionManager.deleteSession() now triggers reaping.
Tightens orphan reaper interval from 60s to 30s.

Fixes #1351 (MCP server processes leak on session end)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Unix domain socket support for worker communication

Introduces socket-manager.ts for UDS-based worker communication, eliminating
port 37777 collisions between concurrent sessions. Worker listens on
~/.claude-mem/sockets/worker.sock by default with TCP fallback.

All hook handlers, MCP server, health checks, and admin commands updated to
use socket-aware workerHttpRequest(). Backwards compatible — settings can
force TCP mode via CLAUDE_MEM_WORKER_TRANSPORT=tcp.

Fixes #1346 (port 37777 collision across concurrent sessions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove in-process worker fallback from hook command

Removes the fallback path where hook scripts started WorkerService in-process,
making the worker a grandchild of Claude Code (killed by sandbox). Hooks now
always delegate to ensureWorkerStarted() which spawns a fully detached daemon.

Fixes #1249 (grandchild process killed by sandbox)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add health checker and /api/admin/doctor endpoint

Adds 30-second periodic health sweep that prunes dead processes from the
supervisor registry and cleans stale socket files. Adds /api/admin/doctor
endpoint exposing supervisor state, process liveness, and environment health.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add comprehensive supervisor test suite

64 tests covering all supervisor modules: process registry (18 tests),
env sanitizer (8), shutdown cascade (10), socket manager (15), health
checker (5), and supervisor API (6). Includes persistence, isolation,
edge cases, and cross-module integration scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert Unix domain socket transport, restore TCP on port 37777

The socket-manager introduced UDS as default transport, but this broke
the HTTP server's TCP accessibility (viewer UI, curl, external monitoring).
Since there's only ever one worker process handling all sessions, the
port collision rationale for UDS doesn't apply. Reverts to TCP-only,
removing ~900 lines of unnecessary complexity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove dead code found in pre-landing review

Remove unused `acceptingSpawns` field from Supervisor class (written but
never read — assertCanSpawn uses stopPromise instead) and unused
`buildWorkerUrl` import from context handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* updated gitignore

* fix: address PR review feedback - downgrade HTTP logging, clean up gitignore, harden supervisor

- Downgrade request/response HTTP logging from info to debug to reduce noise
- Remove unused getWorkerPort imports, use buildWorkerUrl helper
- Export ENV_PREFIXES/ENV_EXACT_MATCHES from env-sanitizer, reuse in Server.ts
- Fix isPidAlive(0) returning true (should be false)
- Add shutdownInitiated flag to prevent signal handler race condition
- Make validateWorkerPidFile testable with pidFilePath option
- Remove unused dataDir from ShutdownCascadeOptions
- Upgrade reapSession log from debug to warn
- Rename zombiePidFiles to deadProcessPids (returns actual PIDs)
- Clean up gitignore: remove duplicate datasets/, stale ~*/ and http*/ patterns
- Fix tests to use temp directories instead of relying on real PID file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:49:23 -07:00
Alex Newman e788fd3676 fix: prevent duplicate worker daemons and zombie processes (#1178)
* fix: prevent duplicate worker daemons and zombie processes

Three root causes of chroma-mcp timeouts:

1. HTTP shutdown (POST /api/admin/shutdown) closed resources but never
   called process.exit(). Zombie workers stayed alive, background tasks
   reconnected to chroma-mcp, spawning duplicate subprocesses that all
   contended for the same persistent data directory.

2. No guard against concurrent daemon startup. When hooks fired
   simultaneously, multiple daemons started before either wrote a PID
   file. The loser got EADDRINUSE but stayed alive because signal
   handlers registered in the constructor prevented exit.

3. Corrupt 147GB HNSW index file caused all chroma queries to timeout
   (MCP error -32001). Data fix: deleted corrupt collection, backfill
   rebuilds from SQLite.

Code fixes:
- Add PID-based guard in daemon startup: exit if PID file process alive
- Add port-based guard in daemon startup: exit if port already bound
  (runs before WorkerService constructor registers keepalive handlers)
- Add process.exit(0) after HTTP shutdown/restart completes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: aggressive startup cleanup and one-time chroma wipe for upgrade

Kill orphaned worker-service.cjs and chroma-mcp processes immediately
at startup (no age gate) while keeping 30-min threshold for mcp-server.
Wipe corrupt chroma data once on upgrade from pre-v10.3 versions —
backfill rebuilds from SQLite automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: wrap shutdown handlers in try/finally to guarantee process.exit

If onShutdown() or onRestart() threw, process.exit(0) was never reached,
leaving the daemon alive as a zombie. Also removed redundant require('fs')
calls in process-manager tests where ESM imports already existed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 20:10:28 -05:00
Kamran Khalid 02f7c3c9d0 fix(security): validate and restrict /api/instructions operation and topic params (CWE-22, CWE-1321) (#986) 2026-02-16 00:29:08 -05:00
Alex Newman 05e904e613 feat: enhance /api/health with version, uptime, workerPath, and AI status
Replace hardcoded TEST-008 build ID with real package version. Add worker
filesystem path, uptime counter, and AI provider status (including last
interaction success/failure tracking) to the health endpoint response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 21:16:22 -05:00
Alex Newman 4df9f61347 refactor: implement in-process worker architecture for hooks (#722)
* fix: stop generating empty CLAUDE.md files

- Return empty string instead of "No recent activity" when no observations exist
- Skip writing CLAUDE.md files when formatted content is empty
- Remove redundant "auto-generated by claude-mem" HTML comment
- Clean up 98 existing empty CLAUDE.md files across the codebase
- Update tests to expect empty string for empty input

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build assets

* refactor: implement in-process worker architecture for hooks

Replaces spawn-based worker startup with in-process architecture:
- Hook processes now become the worker when port 37777 is free
- Eliminates Windows spawn issues (NO SPAWN rule)
- SessionStart chains: smart-install && stop && context

Key changes:
- worker-service.ts: hook case starts WorkerService in-process
- hook-command.ts: skipExit option prevents process.exit() when hosting worker
- hooks.json: single chained command replaces separate start/hook commands
- worker-utils.ts: ensureWorkerRunning() returns boolean, doesn't block
- handlers: graceful fallback when worker unavailable

All 761 tests pass. Manual verification confirms hook stays alive as worker.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* context

* a

* MAESTRO: Mark PR #722 test verification task complete

All 797 tests passed (3 skipped, 0 failed) after merge conflict resolution.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* MAESTRO: Mark PR #722 build verification task complete

* MAESTRO: Mark PR #722 code review task complete

Code review verified:
- worker-service.ts hook case starts WorkerService in-process
- hook-command.ts has skipExit option
- hooks.json uses single chained command
- worker-utils.ts ensureWorkerRunning() returns boolean

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* MAESTRO: Mark PR #722 conflict resolution push task complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 19:49:15 -05:00
Alex Newman 2659ec3231 fix: Claude Code 2.1.1 compatibility + log-level audit + path validation fixes (#614)
* Refactor CLAUDE.md and related files for December 2025 updates

- Updated CLAUDE.md in src/services/worker with new entries for December 2025, including changes to Search.ts, GeminiAgent.ts, SDKAgent.ts, and SessionManager.ts.
- Revised CLAUDE.md in src/shared to reflect updates and new entries for December 2025, including paths.ts and worker-utils.ts.
- Modified hook-constants.ts to clarify exit codes and their behaviors.
- Added comprehensive hooks reference documentation for Claude Code, detailing usage, events, and examples.
- Created initial CLAUDE.md files in various directories to track recent activity.

* fix: Merge user-message-hook output into context-hook hookSpecificOutput

- Add footer message to additionalContext in context-hook.ts
- Remove user-message-hook from SessionStart hooks array
- Fixes issue where stderr+exit(1) approach was silently discarded

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update logs and documentation for recent plugin and worker service changes

- Added detailed logs for worker service activities from Dec 10, 2025 to Jan 7, 2026, including initialization patterns, cleanup confirmations, and diagnostic logging.
- Updated plugin documentation with recent activities, including plugin synchronization and configuration changes from Dec 3, 2025 to Jan 7, 2026.
- Enhanced the context hook and worker service logs to reflect improvements and fixes in the plugin architecture.
- Documented the migration and verification processes for the Claude memory system and its integration with the marketplace.

* Refactor hooks architecture and remove deprecated user-message-hook

- Updated hook configurations in CLAUDE.md and hooks.json to reflect changes in session start behavior.
- Removed user-message-hook functionality as it is no longer utilized in Claude Code 2.1.0; context is now injected silently.
- Enhanced context-hook to handle session context injection without user-visible messages.
- Cleaned up documentation across multiple files to align with the new hook structure and removed references to obsolete hooks.
- Adjusted timing and command execution for hooks to improve performance and reliability.

* fix: Address PR #610 review issues

- Replace USER_MESSAGE_ONLY test with BLOCKING_ERROR test in hook-constants.test.ts
- Standardize Claude Code 2.1.0 note wording across all three documentation files
- Exclude deprecated user-message-hook.ts from logger-usage-standards test

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Remove hardcoded fake token counts from context injection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address PR #610 review issues by fixing test files, standardizing documentation notes, and verifying code quality improvements.

* fix: Add path validation to CLAUDE.md distribution to prevent invalid directory creation

- Add isValidPathForClaudeMd() function to reject invalid paths:
  - Tilde paths (~) that Node.js doesn't expand
  - URLs (http://, https://)
  - Paths with spaces (likely command text or PR references)
  - Paths with # (GitHub issue/PR references)
  - Relative paths that escape project boundary

- Integrate validation in updateFolderClaudeMdFiles loop
- Add 6 unit tests for path validation
- Update .gitignore to prevent accidental commit of malformed directories
- Clean up existing invalid directories (~/, PR #610..., git diff..., https:)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Implement path validation in CLAUDE.md generation to prevent invalid directory creation

- Added `isValidPathForClaudeMd()` function to validate file paths in `src/utils/claude-md-utils.ts`.
- Integrated path validation in `updateFolderClaudeMdFiles` to skip invalid paths.
- Added 6 new unit tests in `tests/utils/claude-md-utils.test.ts` to cover various rejection cases.
- Updated `.gitignore` to prevent tracking of invalid directories.
- Cleaned up existing invalid directories in the repository.

* feat: Promote critical WARN logs to ERROR level across codebase

Comprehensive log-level audit promoting 38+ WARN messages to ERROR for
improved debugging and incident response:

- Parser: observation type errors, data contamination
- SDK/Agents: empty init responses (Gemini, OpenRouter)
- Worker/Queue: session recovery, auto-recovery failures
- Chroma: sync failures, search failures (now treated as critical)
- SQLite: search failures (primary data store)
- Session/Generator: failures, missing context
- Infrastructure: shutdown, process management failures
- File Operations: CLAUDE.md updates, config reads
- Branch Management: recovery checkout failures

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Address PR #614 review issues

- Remove incorrectly tracked tilde-prefixed files from git
- Fix absolute path validation to check projectRoot boundaries
- Add test coverage for absolute path validation edge cases

Closes review issues:
- Issue 1: ~/ prefixed files removed from tracking
- Issue 3: Absolute paths now validated against projectRoot
- Issue 4: Added 3 new test cases for absolute path scenarios

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build assets and context

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 23:34:20 -05:00
Alex Newman e1ab73decc feat: Live Context System with Distributed CLAUDE.md Generation (#556)
* docs: add folder index generator plan

RFC for auto-generating folder-level CLAUDE.md files with observation
timelines. Includes IDE symlink support and root CLAUDE.md integration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: implement folder index generator (Phase 1)

Add automatic CLAUDE.md generation for folders containing observed files.
This enables IDE context providers to access relevant memory observations.

Core modules:
- FolderDiscovery: Extract folders from observation file paths
- FolderTimelineCompiler: Compile chronological timeline per folder
- ClaudeMdGenerator: Write CLAUDE.md with tag-based content replacement
- FolderIndexOrchestrator: Coordinate regeneration on observation save

Integration:
- Event-driven regeneration after observation save in ResponseProcessor
- HTTP endpoints for folder discovery, timeline, and manual generation
- Settings for enabling/configuring folder index behavior

The <claude-mem-context> tag wrapping ensures:
- Manual CLAUDE.md content is preserved
- Auto-generated content won't be recursively observed
- Clean separation between user and system content

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add updateFolderClaudeMd function to CursorHooksInstaller

Adds function to update CLAUDE.md files for folders touched by observations.
Uses existing /api/search/by-file endpoint, preserves content outside
<claude-mem-context> tags, and writes atomically via temp file + rename.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: hook updateFolderClaudeMd into ResponseProcessor

Calls updateFolderClaudeMd after observation save to update folder-level
CLAUDE.md files. Uses fire-and-forget pattern with error logging.
Extracts file paths from saved observations and workspace path from registry.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add timeline formatting for folder CLAUDE.md files

Implements formatTimelineForClaudeMd function that transforms API response
into compact markdown table format. Converts emojis to text labels,
handles ditto marks for timestamps, and groups under "Recent" header.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: remove old folder-index implementation

Deletes redundant folder-index services that were replaced by the simpler
updateFolderClaudeMd approach in CursorHooksInstaller.ts.

Removed:
- src/services/folder-index/ directory (5 files)
- FolderIndexRoutes.ts
- folder-index settings from SettingsDefaultsManager
- folder-index route registration from worker-service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add worktree-aware project filtering for unified timelines

Detect git worktrees and show both parent repo and worktree observations
in the session start timeline. When running in a worktree, the context
now includes observations from both projects, interleaved chronologically.

- Add detectWorktree() utility to identify worktree directories
- Add getProjectContext() to return parent + worktree projects
- Update context hook to pass multi-project queries
- Add queryObservationsMulti() and querySummariesMulti() for IN clauses
- Maintain backward compatibility with single-project queries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* fix: restructure logging to prove session correctness and reduce noise

Add critical logging at each stage of the session lifecycle to prove the session ID chain (contentSessionId → sessionDbId → memorySessionId) stays aligned. New logs include CREATED, ENQUEUED, CLAIMED, MEMORY_ID_CAPTURED, STORING, and STORED. Move intermediate migration and backfill progress logs to DEBUG level to reduce noise, keeping only essential initialization and completion logs at INFO level.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* refactor: extract folder CLAUDE.md utils to shared location

Moves folder CLAUDE.md utilities from CursorHooksInstaller to a new
shared utils file. Removes Cursor registry dependency - file paths
from observations are already absolute, no workspace lookup needed.

New file: src/utils/claude-md-utils.ts
- replaceTaggedContent() - preserves user content outside tags
- writeClaudeMdToFolder() - atomic writes with tag preservation
- formatTimelineForClaudeMd() - API response to compact markdown
- updateFolderClaudeMdFiles() - orchestrates folder updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: trigger folder CLAUDE.md updates when observations are saved

The folder CLAUDE.md update was previously only triggered in
syncAndBroadcastSummary, but summaries run with observationCount=0
(observations are saved separately). Moved the update logic to
syncAndBroadcastObservations where file paths are available.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* all the claudes

* test: add unit tests for claude-md-utils pure functions

Add 11 tests covering replaceTaggedContent and formatTimelineForClaudeMd:
- replaceTaggedContent: empty content, tag replacement, appending, partial tags
- formatTimelineForClaudeMd: empty input, parsing, ditto marks, session IDs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: add integration tests for file operation functions

Add 9 tests for writeClaudeMdToFolder and updateFolderClaudeMdFiles:
- writeClaudeMdToFolder: folder creation, content preservation, nested dirs, atomic writes
- updateFolderClaudeMdFiles: empty skip, fetch/write, deduplication, error handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: add unit tests for timeline-formatting utilities

Add 14 tests for extractFirstFile and groupByDate functions:
- extractFirstFile: relative paths, fallback to files_read, null handling, invalid JSON
- groupByDate: empty arrays, date grouping, chronological sorting, item preservation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: rebuild plugin scripts with merged features

* docs: add project-specific CLAUDE.md with architecture and development notes

* fix: exclude project root from auto-generated CLAUDE.md updates

Skip folders containing .git directory when auto-updating subfolder
CLAUDE.md files. This ensures:

1. Root CLAUDE.md remains user-managed and untouched by the system
2. SessionStart context injection stays pristine throughout the session
3. Subfolder CLAUDE.md files continue to receive live context updates
4. Cleaner separation between user-authored root docs and auto-generated folder indexes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: prevent crash from resuming stale SDK sessions on worker restart

When the worker restarts, it was incorrectly passing the `resume` parameter
to INIT prompts (lastPromptNumber=1) when a memorySessionId existed from a
previous SDK session. This caused "Claude Code process exited with code 1"
crashes because the SDK tried to resume into a session that no longer exists.

Root cause: The resume condition only checked `hasRealMemorySessionId` but
did not verify that this was a CONTINUATION prompt (lastPromptNumber > 1).

Fix: Add `session.lastPromptNumber > 1` check to the resume condition:
- Before: `...(hasRealMemorySessionId && { resume: session.memorySessionId })`
- After: `...(hasRealMemorySessionId && session.lastPromptNumber > 1 && { resume: ... })`

Also added:
- Enhanced debug logging that warns when skipping resume for INIT prompts
- Unit tests in tests/sdk-agent-resume.test.ts (9 test cases)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: properly handle Chroma MCP connection errors

Previously, ensureCollection() caught ALL errors from chroma_get_collection_info
and assumed they meant "collection doesn't exist", triggering unnecessary
collection creation attempts. Connection errors like "Not connected" or
"MCP error -32000: Connection closed" would cascade into failed creation attempts.

Similarly, queryChroma() would silently return empty results when the MCP call
failed, masking the underlying connection problem.

Changes:
- ensureCollection(): Detect connection errors and re-throw immediately instead
  of attempting collection creation
- queryChroma(): Wrap MCP call in try-catch and throw connection errors instead
  of returning empty results
- Both methods reset connection state (connected=false, client=null) on
  connection errors so subsequent operations can attempt to reconnect

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* pushed

* fix: scope regenerate-claude-md.ts to current working directory

Critical bug fix: The script was querying ALL observations from the database
across ALL projects ever recorded (1396+ folders), then attempting to write
CLAUDE.md files everywhere including other projects, non-existent paths, and
ignored directories.

Changes:
- Use git ls-files to discover folders (respects .gitignore automatically)
- Filter database query to current project only (by folder name)
- Use relative paths for database queries (matches storage format)
- Add --clean flag to remove auto-generated CLAUDE.md files
- Add fallback directory walker for non-git repos

Now correctly scopes to 26 folders with observations instead of 1396+.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs and adjustments

* fix: cleanup mode strips tags instead of deleting files blindly

The cleanup mode was incorrectly deleting entire files that contained
<claude-mem-context> tags. The correct behavior (per original design):

1. Strip the <claude-mem-context>...</claude-mem-context> section
2. If empty after stripping → delete the file
3. If has remaining content → save the stripped version

Now properly preserves user content in CLAUDE.md files while removing
only the auto-generated sections.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* deleted some files

* chore: regenerate folder CLAUDE.md files with fixed script

Regenerated 23 folder CLAUDE.md files using the corrected script that:
- Scopes to current working directory only
- Uses git ls-files to respect .gitignore
- Filters by project name

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update CLAUDE.md files for January 5, 2026

- Regenerated and staged 23 CLAUDE.md files with a mix of new and modified content.
- Fixed cleanup mode to properly strip tags instead of deleting files blindly.
- Cleaned up empty CLAUDE.md files from various directories, including ~/.claude and ~/Scripts.
- Conducted dry-run cleanup that identified a significant reduction in auto-generated CLAUDE.md files.
- Removed the isAutoGeneratedClaudeMd function due to incorrect file deletion behavior.

* feat: use settings for observation limit in batch regeneration script

Replace hard-coded limit of 10 with configurable CLAUDE_MEM_CONTEXT_OBSERVATIONS
setting (default: 50). This allows users to control how many observations appear
in folder CLAUDE.md files.

Changes:
- Import SettingsDefaultsManager and load settings at script startup
- Use OBSERVATION_LIMIT constant derived from settings at both call sites
- Remove stale default parameter from findObservationsByFolder function

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: use settings for observation limit in event-driven updates

Replace hard-coded limit of 10 in updateFolderClaudeMdFiles with
configurable CLAUDE_MEM_CONTEXT_OBSERVATIONS setting (default: 50).

Changes:
- Import SettingsDefaultsManager and os module
- Load settings at function start (once, not in loop)
- Use limit from settings in API call

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: Implement configurable observation limits and enhance search functionality

- Added configurable observation limits to batch regeneration scripts.
- Enhanced SearchManager to handle folder queries and normalize parameters.
- Introduced methods to check for direct child files in observations and sessions.
- Updated SearchOptions interface to include isFolder flag for filtering.
- Improved code quality with comprehensive reviews and anti-pattern checks.
- Cleaned up auto-generated CLAUDE.md files across various directories.
- Documented recent changes and improvements in CLAUDE.md files.

* build asset

* Project Context from Claude-Mem auto-added (can be auto removed at any time)

* CLAUDE.md updates

* fix: resolve CLAUDE.md files to correct directory in worktree setups

When using git worktrees, CLAUDE.md files were being written relative to
the worker's process.cwd() instead of the actual project directory. This
fix threads the project's cwd from message processing through to the file
writing utilities, ensuring CLAUDE.md files are created in the correct
project directory regardless of where the worker was started.

Changes:
- Add projectRoot parameter to updateFolderClaudeMdFiles for path resolution
- Thread projectRoot through ResponseProcessor call chain
- Track lastCwd from messages in SDKAgent, GeminiAgent, OpenRouterAgent
- Add tests for relative/absolute path handling with projectRoot

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* more project context updates

* context updates

* fix: preserve actual dates in folder CLAUDE.md generation

Previously, formatTimelineForClaudeMd used today's date for all
observations because the API only returned time (e.g., "4:30 PM")
without date information. This caused all historical observations
to appear as if they happened today.

Changes:
- SearchManager.findByFile now groups results by date with headers
  (e.g., "### Jan 4, 2026") matching formatSearchResults behavior
- formatTimelineForClaudeMd now parses these date headers and uses
  the correct date when constructing epochs for date grouping

The timeline dates are critical for claude-mem context - LLMs need
accurate temporal context to understand when work happened.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build: update worker assets with date parsing fix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* claude-mem context: Fixed critical date parsing bug in PR #556

* fix: address PR #556 review items

- Use getWorkerHost() instead of hard-coded 127.0.0.1 in claude-md-utils
- Add error message and stack details to FOLDER_INDEX logging
- Add 5 new tests for worktree/projectRoot path resolution

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Refactor CLAUDE documentation across multiple components and tests

- Updated CLAUDE.md files in src/ui/viewer, src/ui/viewer/constants, src/ui/viewer/hooks, tests/server, tests/worker/agents, and plans to reflect recent changes and improvements.
- Removed outdated entries and consolidated recent activities for clarity.
- Enhanced documentation for hooks, settings, and pagination implementations.
- Streamlined test suite documentation for server and worker agents, indicating recent test audits and cleanup efforts.
- Adjusted plans to remove obsolete entries and focus on current implementation strategies.

* docs: comprehensive v9.0 documentation audit and updates

- Add usage/folder-context to docs.json navigation (was documented but hidden!)
- Update introduction.mdx with v9.0 release notes (Live Context, Worktree Support, Windows Fixes)
- Add CLAUDE_MEM_WORKER_HOST setting to configuration.mdx
- Add Folder Context Files section with link to detailed docs
- Document worktree support in folder-context.mdx
- Update terminology from "mem-search skill" to "MCP tools" throughout active docs
- Update Search Pipeline in architecture/overview.mdx
- Update usage/getting-started.mdx with MCP tools terminology
- Update usage/claude-desktop.mdx title and terminology
- Update hooks-architecture.mdx reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add recent activity log for worker CLI with detailed entries

* chore: update CLAUDE.md context files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: add brainstorming report for CLAUDE.md distribution architecture

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 22:41:42 -05:00
Alex Newman 2fc4153bef refactor: decompose monolithic services into modular architecture (#534)
* docs: add monolith refactor report with system breakdown

Comprehensive analysis of codebase identifying:
- 14 files over 500 lines requiring refactoring
- 3 critical monoliths (SessionStore, SearchManager, worker-service)
- 80% code duplication across agent files
- 5-phase refactoring roadmap with domain-based architecture

* fix: prevent memory_session_id from equaling content_session_id

The bug: memory_session_id was initialized to contentSessionId as a
"placeholder for FK purposes". This caused the SDK resume logic to
inject memory agent messages into the USER's Claude Code transcript,
corrupting their conversation history.

Root cause:
- SessionStore.createSDKSession initialized memory_session_id = contentSessionId
- SDKAgent checked memorySessionId !== contentSessionId but this check
  only worked if the session was fetched fresh from DB

The fix:
- SessionStore: Initialize memory_session_id as NULL, not contentSessionId
- SDKAgent: Simple truthy check !!session.memorySessionId (NULL = fresh start)
- Database migration: Ran UPDATE to set memory_session_id = NULL for 1807
  existing sessions that had the bug

Also adds [ALIGNMENT] logging across the session lifecycle to help debug
session continuity issues:
- Hook entry: contentSessionId + promptNumber
- DB lookup: contentSessionId → memorySessionId mapping proof
- Resume decision: shows which memorySessionId will be used for resume
- Capture: logs when memorySessionId is captured from first SDK response

UI: Added "Alignment" quick filter button in LogsModal to show only
alignment logs for debugging session continuity.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: improve error handling in worker-service.ts

- Fix GENERIC_CATCH anti-patterns by logging full error objects instead of just messages
- Add [ANTI-PATTERN IGNORED] markers for legitimate cases (cleanup, hot paths)
- Simplify error handling comments to be more concise
- Improve httpShutdown() error discrimination for ECONNREFUSED
- Reduce LARGE_TRY_BLOCK issues in initialization code

Part of anti-pattern cleanup plan (132 total issues)

* refactor: improve error logging in SearchManager.ts

- Pass full error objects to logger instead of just error.message
- Fixes PARTIAL_ERROR_LOGGING anti-patterns (10 instances)
- Better debugging visibility when Chroma queries fail

Part of anti-pattern cleanup (133 remaining)

* refactor: improve error logging across SessionStore and mcp-server

- SessionStore.ts: Fix error logging in column rename utility
- mcp-server.ts: Log full error objects instead of just error.message
- Improve error handling in Worker API calls and tool execution

Part of anti-pattern cleanup (133 remaining)

* Refactor hooks to streamline error handling and loading states

- Simplified error handling in useContextPreview by removing try-catch and directly checking response status.
- Refactored usePagination to eliminate try-catch, improving readability and maintaining error handling through response checks.
- Cleaned up useSSE by removing unnecessary try-catch around JSON parsing, ensuring clarity in message handling.
- Enhanced useSettings by streamlining the saving process, removing try-catch, and directly checking the result for success.

* refactor: add error handling back to SearchManager Chroma calls

- Wrap queryChroma calls in try-catch to prevent generator crashes
- Log Chroma errors as warnings and fall back gracefully
- Fixes generator failures when Chroma has issues
- Part of anti-pattern cleanup recovery

* feat: Add generator failure investigation report and observation duplication regression report

- Created a comprehensive investigation report detailing the root cause of generator failures during anti-pattern cleanup, including the impact, investigation process, and implemented fixes.
- Documented the critical regression causing observation duplication due to race conditions in the SDK agent, outlining symptoms, root cause analysis, and proposed fixes.

* fix: address PR #528 review comments - atomic cleanup and detector improvements

This commit addresses critical review feedback from PR #528:

## 1. Atomic Message Cleanup (Fix Race Condition)

**Problem**: SessionRoutes.ts generator error handler had race condition
- Queried messages then marked failed in loop
- If crash during loop → partial marking → inconsistent state

**Solution**:
- Added `markSessionMessagesFailed()` to PendingMessageStore.ts
- Single atomic UPDATE statement replaces loop
- Follows existing pattern from `resetProcessingToPending()`

**Files**:
- src/services/sqlite/PendingMessageStore.ts (new method)
- src/services/worker/http/routes/SessionRoutes.ts (use new method)

## 2. Anti-Pattern Detector Improvements

**Problem**: Detector didn't recognize logger.failure() method
- Lines 212 & 335 already included "failure"
- Lines 112-113 (PARTIAL_ERROR_LOGGING detection) did not

**Solution**: Updated regex patterns to include "failure" for consistency

**Files**:
- scripts/anti-pattern-test/detect-error-handling-antipatterns.ts

## 3. Documentation

**PR Comment**: Added clarification on memory_session_id fix location
- Points to SessionStore.ts:1155
- Explains why NULL initialization prevents message injection bug

## Review Response

Addresses "Must Address Before Merge" items from review:
 Clarified memory_session_id bug fix location (via PR comment)
 Made generator error handler message cleanup atomic
 Deferred comprehensive test suite to follow-up PR (keeps PR focused)

## Testing

- Build passes with no errors
- Anti-pattern detector runs successfully
- Atomic cleanup follows proven pattern from existing methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: FOREIGN KEY constraint and missing failed_at_epoch column

Two critical bugs fixed:

1. Missing failed_at_epoch column in pending_messages table
   - Added migration 20 to create the column
   - Fixes error when trying to mark messages as failed

2. FOREIGN KEY constraint failed when storing observations
   - All three agents (SDK, Gemini, OpenRouter) were passing
     session.contentSessionId instead of session.memorySessionId
   - storeObservationsAndMarkComplete expects memorySessionId
   - Added null check and clear error message

However, observations still not saving - see investigation report.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Refactor hook input parsing to improve error handling

- Added a nested try-catch block in new-hook.ts, save-hook.ts, and summary-hook.ts to handle JSON parsing errors more gracefully.
- Replaced direct error throwing with logging of the error details using logger.error.
- Ensured that the process exits cleanly after handling input in all three hooks.

* docs: update monolith report post session-logging merge

- SessionStore grew to 2,011 lines (49 methods) - highest priority
- SearchManager reduced to 1,778 lines (improved)
- Agent files reduced by ~45 lines combined
- Added trend indicators and post-merge observations
- Core refactoring proposal remains valid

* refactor(sqlite): decompose SessionStore into modular architecture

Extract the 2011-line SessionStore.ts monolith into focused, single-responsibility
modules following grep-optimized progressive disclosure pattern:

New module structure:
- sessions/ - Session creation and retrieval (create.ts, get.ts, types.ts)
- observations/ - Observation storage and queries (store.ts, get.ts, recent.ts, files.ts, types.ts)
- summaries/ - Summary storage and queries (store.ts, get.ts, recent.ts, types.ts)
- prompts/ - User prompt management (store.ts, get.ts, types.ts)
- timeline/ - Cross-entity timeline queries (queries.ts)
- import/ - Bulk import operations (bulk.ts)
- migrations/ - Database migrations (runner.ts)

New coordinator files:
- Database.ts - ClaudeMemDatabase class with re-exports
- transactions.ts - Atomic cross-entity transactions
- Named re-export facades (Sessions.ts, Observations.ts, etc.)

Key design decisions:
- All functions take `db: Database` as first parameter (functional style)
- Named re-exports instead of index.ts for grep-friendliness
- SessionStore retained as backward-compatible wrapper
- Target file size: 50-150 lines (60% compliance)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(agents): extract shared logic into modular architecture

Consolidate duplicate code across SDKAgent, GeminiAgent, and OpenRouterAgent
into focused utility modules. Total reduction: 500 lines (29%).

New modules in src/services/worker/agents/:
- ResponseProcessor.ts: Atomic DB transactions, Chroma sync, SSE broadcast
- ObservationBroadcaster.ts: SSE event formatting and dispatch
- SessionCleanupHelper.ts: Session state cleanup and stuck message reset
- FallbackErrorHandler.ts: Provider error detection for fallback logic
- types.ts: Shared interfaces (WorkerRef, SSE payloads, StorageResult)

Bug fix: SDKAgent was incorrectly using obs.files instead of obs.files_read
and hardcoding files_modified to empty array.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(search): extract search strategies into modular architecture

Decompose SearchManager into focused strategy pattern with:
- SearchOrchestrator: Coordinates strategy selection and fallback
- ChromaSearchStrategy: Vector semantic search via ChromaDB
- SQLiteSearchStrategy: Filter-only queries for date/project/type
- HybridSearchStrategy: Metadata filtering + semantic ranking
- ResultFormatter: Markdown table formatting for results
- TimelineBuilder: Chronological timeline construction
- Filter modules: DateFilter, ProjectFilter, TypeFilter

SearchManager now delegates to new infrastructure while maintaining
full backward compatibility with existing public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(context): decompose context-generator into modular architecture

Extract 660-line monolith into focused components:
- ContextBuilder: Main orchestrator (~160 lines)
- ContextConfigLoader: Configuration loading
- TokenCalculator: Token budget calculations
- ObservationCompiler: Data retrieval and query building
- MarkdownFormatter/ColorFormatter: Output formatting
- Section renderers: Header, Timeline, Summary, Footer

Maintains full backward compatibility - context-generator.ts now
delegates to new ContextBuilder while preserving public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(worker): decompose worker-service into modular infrastructure

Split 2000+ line monolith into focused modules:

Infrastructure:
- ProcessManager: PID files, signal handlers, child process cleanup
- HealthMonitor: Port checks, health polling, version matching
- GracefulShutdown: Coordinated cleanup on exit

Server:
- Server: Express app setup, core routes, route registration
- Middleware: Re-exports from existing middleware
- ErrorHandler: Centralized error handling with AppError class

Integrations:
- CursorHooksInstaller: Full Cursor IDE integration (registry, hooks, MCP)

WorkerService now acts as thin coordinator wiring all components together.
Maintains full backward compatibility with existing public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Refactor session queue processing and database interactions

- Implement claim-and-delete pattern in SessionQueueProcessor to simplify message handling and eliminate duplicate processing.
- Update PendingMessageStore to support atomic claim-and-delete operations, removing the need for intermediate processing states.
- Introduce storeObservations method in SessionStore for simplified observation and summary storage without message tracking.
- Remove deprecated methods and clean up session state management in worker agents.
- Adjust response processing to accommodate new storage patterns, ensuring atomic transactions for observations and summaries.
- Remove unnecessary reset logic for stuck messages due to the new queue handling approach.

* Add duplicate observation cleanup script

Script to clean up duplicate observations created by the batching bug
where observations were stored once per message ID instead of once per
observation. Includes safety checks to always keep at least one copy.

Usage:
  bun scripts/cleanup-duplicates.ts           # Dry run
  bun scripts/cleanup-duplicates.ts --execute # Delete duplicates
  bun scripts/cleanup-duplicates.ts --aggressive # Ignore time window

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-03 21:22:27 -05:00