Commit Graph

1152 Commits

Author SHA1 Message Date
Alex Newman eb7e37dcf1 MAESTRO: Merge PR #937 - gracefully process orphaned pending messages after session termination 2026-02-06 01:47:55 -05:00
jayvenn21 f24bba21e9 fix(worker): gracefully process orphaned pending messages after session termination 2026-02-06 01:47:43 -05:00
Alex Newman 1a1297c12a MAESTRO: Merge PR #940 - backfill project field on SDK session creation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 01:45:35 -05:00
Michael Lipscombe af308ea5c8 fix: Backfill project field on SDK session creation to prevent race condition
PostToolUse hook can create the session before UserPromptSubmit's
session-init sets the project, leaving it empty. Add an UPDATE after
INSERT OR IGNORE to backfill the project when a later hook provides it.

Closes #939

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 01:44:42 -05:00
Alex Newman 6382d6f9c7 MAESTRO: Merge PR #693 - prevent infinite restart loop that causes runaway API costs
Add restart limit (max 3 consecutive restarts) with exponential backoff
to prevent infinite generator restart loops. Also add defensive
memorySessionId checks in GeminiAgent and OpenRouterAgent before
expensive LLM calls to fail fast when session ID hasn't been captured.

Based on PR #693 by @ajbmachon (applied to current main).

Co-Authored-By: Andre Machon <ajbmachon2@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 01:43:12 -05:00
Alex Newman 1ae6c5708c MAESTRO: Merge PR #934 - terminate session on prompt-too-long instead of retrying indefinitely
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 01:40:20 -05:00
jayvenn21 93bad99d79 fix: terminate session on prompt-too-long instead of retrying indefinitely 2026-02-06 01:39:55 -05:00
Alex Newman 9c68a3ba3d MAESTRO: Merge PR #913 - respect CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED setting
Folder CLAUDE.md generation is now gated behind the CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED
setting (default: false). The setting is exposed via the settings API and handles both
string and boolean JSON values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 01:38:01 -05:00
Alex Newman 117710d449 Merge PR #913: respect CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED setting 2026-02-06 01:36:58 -05:00
Michel Tomas b07130acc6 fix: handle both boolean and string types for settings
JSON.parse preserves native types, so boolean true/false stay as
booleans rather than strings. The previous check only handled string
'true', meaning users who set `"ENABLED": true` (boolean) wouldn't
have the feature work.

Now both `getBool()` helper and ResponseProcessor check handle:
- String 'true' → enabled
- Boolean true → enabled
- Any other value → disabled

Tested: Confirmed feature stays disabled with string "false" and
no new CLAUDE.md files are created when accessing directories.
2026-02-06 01:36:45 -05:00
Michel Tomas bb96092d74 fix: add FOLDER_CLAUDEMD_ENABLED to settingKeys for API/UI access 2026-02-06 01:36:45 -05:00
Michel Tomas 9907df1db8 fix: respect CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED setting
The CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED setting was documented but never
actually checked in code. The folder CLAUDE.md generation ran
unconditionally whenever files were touched.

Changes:
- Add CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED to SettingsDefaults interface
- Add default value 'false' to DEFAULTS object
- Check setting in ResponseProcessor before calling updateFolderClaudeMdFiles

Fixes the issue identified in issue-600-documentation-audit-features-not-implemented.md:
"The setting CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED is never read."
2026-02-06 01:36:45 -05:00
Alex Newman ab3d4ca865 MAESTRO: Prevent CLAUDE.md generation in unsafe directories (PR #929 concept)
Add exclusion list for directories where CLAUDE.md generation breaks
toolchains: res/ (Android aapt2), .git/, build/, node_modules/,
__pycache__/. Closes issue #912. Credit to @jayvenn21.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 01:35:31 -05:00
Alex Newman e424d85a93 MAESTRO: Close PR #834 - isInsideGitRepo would disable folder CLAUDE.md feature for all git projects
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 01:32:50 -05:00
Alex Newman e5a133b3da MAESTRO: Prevent nested duplicate directory creation in CLAUDE.md paths (PR #836 concept)
Add hasConsecutiveDuplicateSegments() check to isValidPathForClaudeMd() to reject paths
like frontend/frontend/ or src/src/ that occur when cwd already includes the directory name.
3 new tests added (46 total for claude-md-utils). Fixes #814. Credit to @Glucksberg.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 00:03:03 -05:00
Alex Newman 15e9473533 MAESTRO: Fix CLAUDE.md race condition from PR #974 - skip folders with active CLAUDE.md edits
Prevents "file modified since read" errors when Claude Code is actively editing
a CLAUDE.md file by detecting CLAUDE.md paths in observation file lists and
skipping those folders during updates. Closes #859. Credit: @cheapsteak.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 23:57:32 -05:00
Alex Newman 20333e6214 MAESTRO: Merge PR #932 - prevent duplicate generator spawns in handleSessionInit
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 23:49:23 -05:00
jayvenn21 5d1ee20076 fix: prevent duplicate generator spawns in handleSessionInit
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-05 23:48:56 -05:00
Alex Newman 576d407c00 MAESTRO: Fix image-only prompts skipping session init (PR #928 concept)
Image-only prompts have empty text content, causing session-init handler
to skip session creation entirely. Use '[media prompt]' placeholder so
sessions still get created and tracked in memory.

Closes PR #928 (applied directly due to merge conflicts with outdated base).
Credit: @iammike for identifying the issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 23:47:12 -05:00
Alex Newman 3eb0154d58 MAESTRO: Close PR #829 as redundant - empty prompt fix already merged via PR #828
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 22:39:20 -05:00
Alex Newman 80663fff97 MAESTRO: Merge PR #828 - server-side DB readiness wait for session-init endpoint
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 22:37:47 -05:00
Rajiv Sinclair 27aa98c269 fix: wait for database initialization before processing session-init requests
Fixes a race condition where the UserPromptSubmit hook could call
/api/sessions/init before the database is fully initialized, resulting
in a 500 error with "Database not initialized".

Root cause:
- Worker starts HTTP server immediately for fast startup
- Health endpoint (/api/health) returns OK when server is listening
- session-init hook waits for health check, then calls /api/sessions/init
- Database initialization happens in background, creating a race window

Solution:
- Add early handler for /api/sessions/init that waits for initialization
- Uses same pattern as existing /api/context/inject handler
- Returns 503 with retry message if initialization times out

The fix ensures hooks receive proper responses even during the brief
startup window when the server is listening but DB isn't ready yet.

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-02-05 22:23:52 -05:00
Rajiv Sinclair 9789a1969c fix: gracefully handle empty prompts in session-init hook
When a user submits an empty or whitespace-only prompt to Claude Code,
the UserPromptSubmit hook would throw "sessionInitHandler requires prompt"
and block the operation.

This change returns early with `{ continue: true }` instead of throwing,
allowing the session to continue gracefully.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 22:23:52 -05:00
Alex Newman d333c7dc08 MAESTRO: Expand startup orphan cleanup to target mcp-server and worker-service processes
The startup cleanupOrphanedProcesses() only targeted chroma-mcp, leaving
orphaned mcp-server.cjs and worker-service.cjs processes undetected after
daemon crashes. Expanded to target all claude-mem process types with
30-minute age filtering and current PID exclusion. Closes PR #687 (which
had a spawnDaemon regression removing Windows WMIC support).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 22:06:46 -05:00
Alex Newman edfc1a403f MAESTRO: Close PR #713 - subprocess accumulation fix already covered by ProcessRegistry and idle timeout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 20:19:43 -05:00
Alex Newman ad37629f0a MAESTRO: Close PR #738 - parallel subprocess tracking system duplicates ProcessRegistry
PR #738's QueryWrapper/QueryWrapperManager introduces a separate process tracking
system alongside the existing ProcessRegistry. Its orphan cleanup duplicates
killSystemOrphans() and killIdleDaemonChildren() already in main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 20:13:26 -05:00
Alex Newman 7be19b20cd MAESTRO: Close PR #847 - observer session isolation already shipped via PRs #845, #733
All changes from the comprehensive PR were merged incrementally:
- PR #845 (v9.0.12): cwd isolation replacing CLAUDE_CONFIG_DIR
- PR #733: EnvManager.ts with buildIsolatedEnv() for credential isolation
- CLAUDE_MEM_CLAUDE_AUTH_METHOD setting already in SettingsDefaultsManager

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 20:09:52 -05:00
Alex Newman 5b2eb9f599 MAESTRO: Merge PR #879 - daemon children cleanup fills orphan reaper gap
PR #879 adds killIdleDaemonChildren() to ProcessRegistry, targeting idle
Claude SDK processes that are children of the worker-service daemon but
weren't caught by the existing registry-based or ppid=1 cleanup paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 20:04:59 -05:00
Alex Newman ede75b1c5b Merge PR #879: fix: add daemon children cleanup to orphan reaper 2026-02-05 20:03:15 -05:00
david718 cdb0e823aa fix: add daemon children cleanup to orphan reaper
Add killIdleDaemonChildren() to clean up SDK-spawned claude processes
that don't terminate after completing their work.

Problem:
- Worker-service daemon spawns Claude SDK processes
- These processes remain alive after work completes
- They accumulate over time, consuming significant memory
- Existing killSystemOrphans() only handles PPID=1 orphans

Solution:
- Add killIdleDaemonChildren() that finds claude processes where:
  - Parent PID = daemon's PID (children of worker-service)
  - CPU = 0% (idle, not actively working)
  - Running > 2 minutes (completed their work)
- Call it from reapOrphanedProcesses() (runs every 5 minutes)

Testing:
- Verified locally: 15+ zombie processes cleaned up
- Memory saved: ~2GB
- Normal processes (MCP server, Chroma) unaffected

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-05 19:50:46 -05:00
Alex Newman a35190b7bf MAESTRO: Close PR #867 - process bomb and zombie observer fixes already in main
Both features proposed by PR #867 are already addressed:
1. Zombie observer idle timeout: fully implemented in SessionQueueProcessor.ts
2. Startup batching: 100ms stagger + idle timeout makes it unnecessary

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 19:30:25 -05:00
Alex Newman 4b180d8d66 MAESTRO: Close PR #930 - deferred worker init superseded by fail-open architecture
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 19:25:38 -05:00
Alex Newman 61615ddc0f MAESTRO: Document PR #931 closure and spawn guard implementation in triage doc
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 19:00:23 -05:00
Alex Newman 0ecb387f58 MAESTRO: Implement Windows spawn guard to prevent repeated worker popups (closes #921)
Adds file-based cooldown lock in ensureWorkerStarted() so failed worker
spawn attempts on Windows don't produce repeated bun.exe terminal popups.
Based on the approach from PR #931, integrated into the refactored code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:59:53 -05:00
Alex Newman 3840a1734d MAESTRO: Close PR #935 - SDK patching approach rejected in favor of fail-open architecture
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:31:40 -05:00
Alex Newman ca990e8d53 MAESTRO: Merge PR #972 - fix Windows path handling for usernames with spaces
Removed shell: IS_WINDOWS from bun-runner.js spawn() call to prevent
cmd.exe from splitting paths at spaces. Added windowsHide: true to
prevent visible console windows on Windows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:29:57 -05:00
Tafari Higgs c9010c5c58 Fix Windows path handling for usernames with spaces
On Windows, when the username contains spaces (e.g., "Tafari Higgs"),
bun-runner.js fails with: Syntax Error at C:\Users\Tafari:1:2

This happens because `shell: IS_WINDOWS` (true on Windows) causes
spawn() to route through cmd.exe, which misinterprets the path at
the first space in the username directory.

Removing `shell: true` on Windows fixes the issue since spawn()
can invoke the Bun binary directly without needing a shell wrapper.
Added `windowsHide: true` to prevent a visible console window from
appearing, matching the previous hidden behavior under cmd.exe.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:28:58 -05:00
Alex Newman 2ea716a017 MAESTRO: Close PR #530 - retry logic superseded by fail-open hook strategy
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:27:55 -05:00
Alex Newman 4e4ad32f97 MAESTRO: Close PR #922 - async SessionStart hooks would break context injection
SessionStart hooks must run synchronously because the context hook's stdout
is captured and injected as Claude's system-level memory context. The Windows
blocking issue was already resolved by the fail-open approach in PRs #973,
#959, and #964.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:27:06 -05:00
Alex Newman faa1360f33 MAESTRO: Merge PR #964 - add fetch timeouts to Stop hook and health checks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:25:49 -05:00
Rod Boev 64dc50f3fb Restructure fetchWithTimeout to clear timer on all paths
Replace Promise.race with new Promise wrapper so timeoutId is assigned
as const before fetch starts. Timer is now cleared on both fetch success
and fetch rejection, not just success.
2026-02-05 18:24:20 -05:00
Rod Boev 1ac35b5d56 Clear setTimeout when fetch wins the race
Prevents the timer callback from firing unnecessarily after a
successful fetch response.
2026-02-05 18:24:20 -05:00
Rod Boev 1dd456a6ca Add fetch timeouts to Stop hook and health checks
Replace bare fetch() calls with fetchWithTimeout() using Promise.race
instead of AbortSignal.timeout() which causes libuv assertion crashes
in Bun on Windows.

Applies the already-defined HEALTH_CHECK_TIMEOUT_MS (30s/45s) to
isWorkerHealthy() and getWorkerVersion(), and HOOK_TIMEOUTS.DEFAULT
to the summarize POST. Previously these fetches had no timeout at all,
causing the Stop hook to hang for the full hook timeout (120s) when
the worker was unreachable.

Fixes #963
2026-02-05 18:24:19 -05:00
Alex Newman be01694383 MAESTRO: Merge PR #959 - fail open on /api/context/inject during initialization
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:23:24 -05:00
Rod Boev b8b88d998c fix: fail open on /api/context/inject during initialization
The /api/context/inject endpoint previously blocked for up to 5 minutes
(300s timeout) waiting for initializationComplete, which includes MCP
connection setup. On Windows, the MCP connection can hang indefinitely,
causing the context hook to never return and blocking Claude Code startup.

This change makes the endpoint fail open: if the worker hasn't finished
initializing, return empty context immediately instead of blocking. The
hook completes fast, and context becomes available on subsequent prompts
once initialization finishes in the background.

Closes #958
2026-02-05 18:22:08 -05:00
Alex Newman bf4d9421e0 MAESTRO: Merge PR #973 - hooks fail gracefully instead of blocking prompts
session-init.ts: replaced throws on worker 500/SDK agent failure with
logger.failure() + graceful exit 0. user-message.ts: replaced throw with
graceful return, console.error → process.stderr.write, USER_MESSAGE_ONLY → SUCCESS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:19:41 -05:00
Tafari Higgs 67f9d631bd Fix hooks to fail gracefully instead of blocking user prompts
Three issues fixed:

1. session-init handler: Worker 500 errors caused `throw`, which
   hook-command.ts caught and exited with code 2 (blocking error).
   This blocked the user's prompt entirely with:
   "Hook error: Error: Session initialization failed: 500"
   Fix: Log the failure and return SUCCESS so the prompt proceeds.

2. user-message handler: Referenced nonexistent constant
   HOOK_EXIT_CODES.USER_MESSAGE_ONLY (undefined). Also used
   console.error() for informational output, which Claude Code
   may flag as an error. Fix: Use process.stderr.write() and
   return HOOK_EXIT_CODES.SUCCESS explicitly.

3. Both handlers threw on HTTP errors from the worker, causing
   exit code 2 (blocking). Memory plugin failures should never
   prevent users from using Claude Code. All worker HTTP failures
   now log and return gracefully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:18:36 -05:00
Alex Newman 311d62cc02 MAESTRO: Merge PR #960 removing user-message hook from SessionStart to fix startup error
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 18:16:15 -05:00
Rod Boev 46a75c4d98 Restore USER_MESSAGE_ONLY exit code constant
The user-message handler references HOOK_EXIT_CODES.USER_MESSAGE_ONLY
but the constant was missing from hook-constants.ts, causing it to
return exitCode: undefined. The handler is still registered for Cursor's
afterFileEdit flow.
2026-02-05 18:15:54 -05:00
Rod Boev fececc4e51 fix: remove user-message hook from SessionStart to prevent startup error
The user-message hook exits with code 3 (USER_MESSAGE_ONLY), which Claude
Code interprets as a hook failure, displaying "SessionStart:startup hook
error" on every session start. The context hook already handles injecting
context into the conversation, making the user-message hook redundant.

Removing it eliminates the spurious error message and saves ~60 seconds
of potential timeout on systems where the worker is slow to respond.

Closes #905
2026-02-05 18:15:54 -05:00