Commit Graph

66 Commits

Author SHA1 Message Date
Ben Younes 983be42998 fix: resolve Gemini CLI 0.37.0 session capture failures (#1664) (#1692)
Three root causes prevented Gemini sessions from persisting prompts,
observations, and summaries:

1. BeforeAgent was mapped to user-message (display-only) instead of
   session-init (which initialises the session and starts the SDK agent).

2. The transcript parser expected Claude Code JSONL (type: "assistant")
   but Gemini CLI 0.37.0 writes a JSON document with a messages array
   where assistant entries carry type: "gemini". extractLastMessage now
   detects the format and routes to the correct parser, preserving
   full backward compatibility with Claude Code JSONL transcripts.

3. The summarize handler omitted platformSource from the
   /api/sessions/summarize request body, causing sessions to be recorded
   without the gemini-cli source tag.

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-15 00:58:20 -07:00
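Note on 983be42998: the format detection described in root cause 2 can be sketched roughly as below. This is a minimal illustration, not the project's `extractLastMessage`. The function names and the plain-text `content` field are assumptions; only the `type: "assistant"` and `type: "gemini"` discriminators and the top-level `messages` array come from the commit message.

```typescript
// Hypothetical entry shape: only the `type` discriminators and the
// top-level `messages` array come from the commit message above.
interface TranscriptEntry {
  type?: string;
  content?: string; // assumed to be plain text in both formats
}

// Walk backwards and return the newest non-empty entry of the given type.
function lastOfType(entries: TranscriptEntry[], type: string): string | null {
  for (let i = entries.length - 1; i >= 0; i--) {
    const entry = entries[i];
    if (entry && entry.type === type && typeof entry.content === "string" && entry.content.length > 0) {
      return entry.content;
    }
  }
  return null;
}

// A Gemini-style transcript is one JSON document with a `messages` array.
function parseGeminiDocument(raw: string): TranscriptEntry[] | null {
  try {
    const doc = JSON.parse(raw) as { messages?: unknown } | null;
    if (doc !== null && typeof doc === "object" && Array.isArray(doc.messages)) {
      return doc.messages as TranscriptEntry[];
    }
  } catch {
    // Not a single JSON document; the caller falls back to JSONL.
  }
  return null;
}

// A Claude Code-style transcript is JSONL: one JSON object per line.
function parseJsonl(raw: string): TranscriptEntry[] {
  const entries: TranscriptEntry[] = [];
  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue;
    try {
      entries.push(JSON.parse(trimmed) as TranscriptEntry);
    } catch {
      // Skip malformed lines instead of failing the whole transcript.
    }
  }
  return entries;
}

// Detect the format first, then route to the matching parser.
function extractLastAssistantText(raw: string): string | null {
  const geminiMessages = parseGeminiDocument(raw);
  if (geminiMessages !== null) return lastOfType(geminiMessages, "gemini");
  return lastOfType(parseJsonl(raw), "assistant");
}
```

Trying the single-document parse first keeps the JSONL path untouched, which is one way to get the backward compatibility the commit mentions.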
Ethan 16a0737dfc fix: use parent project name for worktree observation writes (#1820)
* fix: use parent project name for worktree observation writes (#1819)

Observations and sessions from git worktrees were stored under
basename(cwd) instead of the parent repo name because write paths
called getProjectName() (not worktree-aware) instead of
getProjectContext() (worktree-aware). This is the same bug as
#1081, #1317, and #1500 — it regressed because the two functions
coexist and new code reached for the simpler one.

Fix: getProjectContext() now returns parentProjectName as primary
when in a worktree, and all four write-path call sites now use
getProjectContext().primary instead of getProjectName().

Includes regression test that creates a real worktree directory
structure and asserts primary === parentProjectName.

* fix: address review nitpicks — allProjects fallback, JSDoc, write-path test

- ContextBuilder: default projects to context.allProjects for legacy
  worktree-labeled record compatibility
- ProjectContext: clarify JSDoc that primary is canonical (parent repo
  in worktrees)
- Tests: add write-path regression test mirroring session-init/SessionRoutes
  pattern; refactor worktree fixture into beforeAll/afterAll

* refactor(project-name): rename local to cwdProjectName and dedupe allProjects

Addresses final CodeRabbit nitpick: disambiguates the local variable
from the returned `primary` field, and dedupes allProjects via Set
in case parent and cwd resolve to the same name.

---------

Co-authored-by: Ethan Hurst <ethan.hurst@outlook.com.au>
2026-04-15 00:58:14 -07:00
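Note on 16a0737dfc: a rough sketch of the `primary` / `allProjects` behaviour described above. The helper names (`buildProjectContext`, `lastSegment`) and the `parentRepoPath` parameter are hypothetical. How the real `getProjectContext()` detects a worktree is not shown in the commit message, so the parent repo path is simply passed in here.

```typescript
interface ProjectContext {
  primary: string;                  // canonical name used for writes
  parentProjectName: string | null; // set only inside a worktree
  isWorktree: boolean;
  allProjects: string[];            // names to match when reading
}

// Last path segment, ignoring trailing slashes (POSIX-style paths only).
function lastSegment(path: string): string {
  const parts = path.split("/").filter((part) => part.length > 0);
  return parts.pop() ?? "";
}

// `parentRepoPath` is null outside a worktree. Detection is out of scope here.
function buildProjectContext(cwd: string, parentRepoPath: string | null): ProjectContext {
  const cwdProjectName = lastSegment(cwd);
  if (parentRepoPath === null) {
    return {
      primary: cwdProjectName,
      parentProjectName: null,
      isWorktree: false,
      allProjects: [cwdProjectName],
    };
  }
  const parentProjectName = lastSegment(parentRepoPath);
  return {
    primary: parentProjectName, // writes go under the parent repo name
    parentProjectName,
    isWorktree: true,
    // Dedupe in case the parent and cwd resolve to the same name.
    allProjects: Array.from(new Set([parentProjectName, cwdProjectName])),
  };
}
```

Keeping the cwd-derived name in `allProjects` is what lets reads still find legacy records that were stored under the worktree label.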
Tran Quang 2a2008bac2 fix(file-context): preserve targeted reads + invalidate on mtime (#1719) (#1729)
* fix(file-context): preserve targeted reads + invalidate on mtime (#1719)

The PreToolUse:Read hook unconditionally rewrote tool input to
{file_path, limit:1}, which interacted with two failure modes:

1. Subagent edits a file → parent's next Read still gets truncated
   because the observation snapshot predates the change.
2. Claude requests a different section with offset/limit → the hook
   strips them, so the Claude Code harness's read-dedup cache returns
   "File unchanged" against the prior 1-line read. The file becomes
   unreadable for the rest of the conversation, even though the hook's
   own recovery hint says "Read again with offset/limit for the
   section you need."

Two complementary fixes:

- **mtime invalidation**: stat the file (we already stat for the size
  gate) and compare mtimeMs to the newest observation's created_at_epoch.
  If the file is newer, pass the read through unchanged so fresh content
  reaches Claude.

- **Targeted-read pass-through**: when toolInput already specifies
  offset and/or limit, preserve them in updatedInput instead of
  collapsing to {limit:1}. The harness's dedup cache then sees a
  distinct input and lets the read proceed.

The unconstrained-read path (no offset, no limit) is unchanged: still
truncated to 1 line plus the observation timeline, so token economics
are preserved for the common case.

Tests cover all three branches: existing truncation, targeted-read
pass-through (offset+limit, limit-only), and mtime-driven bypass.

Fixes #1719

* refactor(file-context): address review findings on #1719 fix

- Add offset-only test case for full targeted-read branch coverage
- Use >= for mtime comparison to handle same-millisecond edge case
- Add Number.isFinite() + bounds guards on offset/limit pass-through
- Trim over-verbose comments to concise single-line summaries
- Remove redundant `as number` casts after typeof narrowing
- Add comment explaining fileMtimeMs=0 sentinel invariant
2026-04-15 00:57:57 -07:00
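Note on 2a2008bac2: the three branches described above (mtime bypass, targeted-read pass-through, default truncation) can be sketched as a pure decision function. This is an illustration under assumptions: `decideRead`, the `ReadDecision` shape, and the lower bounds for `offset` and `limit` are hypothetical, and the observation epoch is assumed to be in milliseconds. The real handler stats the file itself; here the mtime is passed in so the logic stays self-contained.

```typescript
interface ReadInput {
  file_path: string;
  offset?: unknown;
  limit?: unknown;
}

interface RewrittenRead {
  file_path: string;
  offset?: number;
  limit?: number;
}

interface ReadDecision {
  passThrough: boolean;          // true: leave the tool input untouched
  updatedInput?: RewrittenRead;  // set when the hook rewrites the read
}

// Finite number at or above `min` (the bounds are an assumption).
function isValidCount(value: unknown, min: number): value is number {
  return typeof value === "number" && Number.isFinite(value) && value >= min;
}

function decideRead(input: ReadInput, fileMtimeMs: number, newestObservationEpochMs: number): ReadDecision {
  // File changed at or after the newest observation: fresh content must reach the model.
  if (fileMtimeMs >= newestObservationEpochMs) return { passThrough: true };

  const offset = isValidCount(input.offset, 0) ? input.offset : undefined;
  const limit = isValidCount(input.limit, 1) ? input.limit : undefined;

  // Targeted read: keep the caller's window so the input stays distinct.
  if (offset !== undefined || limit !== undefined) {
    const updatedInput: RewrittenRead = { file_path: input.file_path };
    if (offset !== undefined) updatedInput.offset = offset;
    if (limit !== undefined) updatedInput.limit = limit;
    return { passThrough: false, updatedInput };
  }

  // Unconstrained read: truncate to one line; the timeline carries the context.
  return { passThrough: false, updatedInput: { file_path: input.file_path, limit: 1 } };
}
```

For example, `decideRead({ file_path: "a.ts", offset: 50, limit: 20 }, 100, 200)` keeps the requested window, while the same call without `offset` and `limit` collapses to `limit: 1`.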
Ousama Ben Younes edc8535ac1 fix: skip queueLength===0 completion branch when session returns 404 2026-04-11 08:16:35 +00:00
Ousama Ben Younes 2f19eab9c2 fix: expose summaryStored in session status to detect silent summary loss (#1633)
Stop hook polled queueLength===0 as a proxy for summary success, but the queue
empties regardless of whether the LLM produced valid <summary> tags. Added
lastSummaryStored tracking on ActiveSession, surfaced via the /api/sessions/status
endpoint, and emit a logger.warn in the Stop hook when summaryStored===false.

Generated by Claude Code
Vibe coded by ousamabenyounes

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-10 15:06:18 +00:00
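Note on edc8535ac1 and 2f19eab9c2: one way to picture the Stop hook's decision after each status poll is the classifier below. It is a sketch only. The `StatusResponse` shape, the outcome labels, and the function name are assumptions; the commit messages only establish that a 404 must not be treated as completion and that `summaryStored === false` should produce a warning.

```typescript
// Hypothetical view of one poll of the session status endpoint.
interface StatusResponse {
  httpStatus: number;
  queueLength?: number;
  summaryStored?: boolean | null;
}

type PollOutcome =
  | "session-gone"              // 404: nothing left to wait for
  | "keep-polling"              // queue still has work (or status is unknown)
  | "complete"                  // queue drained and no sign of a lost summary
  | "complete-summary-missing"; // queue drained but no summary was stored

function classifyStatus(res: StatusResponse): PollOutcome {
  // Check 404 first so a missing queueLength is never mistaken for "drained".
  if (res.httpStatus === 404) return "session-gone";
  if (res.queueLength !== 0) return "keep-polling";
  return res.summaryStored === false ? "complete-summary-missing" : "complete";
}
```

A caller would log a warning on `"complete-summary-missing"`, since an empty queue alone does not prove the summary was written.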
Alex Newman a0e895b53b fix: enhance title sanitization per PR #1641 review (round 4)
Collapse multiple whitespace, trim, and increase max length to 160 chars
for observation titles in file-context deny reason.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:18:22 -07:00
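Note on a0e895b53b: the sanitization steps named above (collapse whitespace, trim, cap at 160 characters) amount to something like the following. The function and constant names are hypothetical, and whether the project appends an ellipsis on truncation is not stated, so this sketch just cuts the string.

```typescript
const MAX_TITLE_LENGTH = 160; // value taken from the commit message

// Collapse every whitespace run (including newlines and tabs) to one space,
// trim the ends, then cap the length.
function sanitizeTitle(title: string, maxLength: number = MAX_TITLE_LENGTH): string {
  const collapsed = title.replace(/\s+/g, " ").trim();
  return collapsed.length > maxLength ? collapsed.slice(0, maxLength) : collapsed;
}
```

Because `\s` also matches newlines, this covers the newline stripping from the previous review round as well.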
Alex Newman 753a993647 fix: address PR #1641 review comments (round 3)
- Fix migration version conflict: addSessionPlatformSourceColumn now uses v25
- Sanitize observation titles in file-context deny reason (strip newlines, limit length)
- Guard json_each() with LIKE '[%' check for legacy bare-path rows
- Guard /stream SSE endpoint with 503 before DB initialization
- Scope bun-runner signal exit handling to start subcommand only
- Normalize platformSource at route boundary in DataRoutes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:16:41 -07:00
Alex Newman d0676aa049 feat: file-read gate allows Edit, add legacy-peer-deps for grammar install
- Change file-read gate from deny to allow with limit:1, injecting the
  observation timeline as additionalContext. Edit now works on gated files
  since the file registers as "read" with near-zero token cost.
- Add updatedInput to HookResult type for PreToolUse hooks.
- Add .npmrc with legacy-peer-deps=true for tree-sitter peer dep conflicts.
- Add --legacy-peer-deps to npm fallback paths in smart-install.js so end
  users without bun can install the 24 grammar packages.
- Rebuild plugin artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:06:07 -07:00
Alex Newman c21e49d9fa fix: address PR review comments and add file read gate docs
Fix indentation bugs flagged in PR review (SettingsDefaultsManager,
MigrationRunner), add current date/time to file read gate timeline
so the model can judge observation recency, and add documentation
for the file read gate feature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:09:46 -07:00
Alex Newman b8999c1181 Merge branch 'thedotmack/file-read-timeline-inject' into integration/validation-batch 2026-04-07 11:18:58 -07:00
Alex Newman d8947473b8 fix: escape filePath in recovery hints to prevent malformed output
Filenames containing quotes, backslashes, or newlines could produce
malformed smart_outline/smart_unfold examples in the deny message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:06:32 -07:00
Alex Newman e3475180cd fix: address PR review — day sort, path canonicalization, dead code cleanup
- Sort within-day observations chronologically (was specificity-ordered)
- Canonicalize relative paths to POSIX format before DB lookup
- Skip projects param when allProjects is empty (prevents cross-project leaks)
- Remove dead stderrMessage field and hook-command block (unused after permissionDecision switch)
- Type permissionDecision as 'allow' | 'deny' union instead of string
- Remove redundant non-null assertions in getObservationsByFilePath
- Add edit guidance to deny message (use sed via Bash with smart tools)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 01:59:30 -07:00
Alex Newman ef1b427a2a fix: update timeline deny message to route to smart tools
The deny reason is the routing surface — show all cheaper exits:
semantic priming from the timeline, get_observations for details,
and smart_outline/smart_unfold for current code structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:25:55 -07:00
Alex Newman 455aeaf654 fix: remove per-session gate, use permissionDecision deny for every read
The per-session FileReadGate was never requested and broke the cost
savings loop — subsequent reads in the same session silently bypassed
the timeline, hiding newly created observations.

Now the timeline fires on every read that has observations, using the
hook contract's permissionDecision: "deny" with the timeline as the
reason (exit 0 + JSON) instead of exit code 2 + stderr.

- Delete FileReadGate.ts entirely
- Remove /api/file-context/gate endpoint from DataRoutes
- Switch handler from exit code 2 to permissionDecision: "deny"
- Restore permissionDecision fields to HookResult
- Eliminate one HTTP round-trip per read (no gate check needed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:05:40 -07:00
Alex Newman 31910fb265 fix: address PR review feedback — path safety, SQL injection, gate scoping
- Resolve relative filePath against input.cwd before statSync; early-return on ENOENT
- Replace LIKE '%path%' with exact json_each equality to prevent false matches
- Sanitize and parameterize LIMIT to prevent NaN SQL errors
- Fix day-sorting to use earliest epoch in group, not first (specificity-sorted) item
- Use exact path equality in deduplicateObservations instead of substring includes
- Scope FileReadGate by session+cwd to prevent worktree collisions
- Refresh lastAccess TTL on active sessions; throttle prune to every 50 calls
- Type params as (string | number)[] instead of any[]
- Remove unused permissionDecision fields from HookResult

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:29:59 -07:00
Alex Newman 6250a194dd Merge branch 'pr-1472' into integration/validation-batch
# Conflicts:
#	plugin/scripts/context-generator.cjs
#	plugin/scripts/mcp-server.cjs
#	plugin/scripts/worker-service.cjs
#	plugin/ui/viewer-bundle.js
#	src/cli/handlers/context.ts
#	src/services/sqlite/SessionStore.ts
#	src/services/sqlite/migrations/runner.ts
#	src/services/worker-service.ts
#	src/shared/SettingsDefaultsManager.ts
2026-04-06 14:23:18 -07:00
Alex Newman a60f79c44d feat: file-size threshold and observation dedup for timeline gate
- Skip gate for files under 1,500 bytes — timeline (~370 tokens) costs
  more than just reading small files directly
- Deduplicate observations by memory_session_id (one per session)
- Rank by specificity: files_modified > files_read, fewer tagged files > many
- Fetch 40 candidates, dedup/score down to 15 for display
- Reduce default by-file query limit from 30 to 15

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:29:28 -07:00
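Note on a60f79c44d: the dedup and ranking rules listed above can be sketched as follows. The names and the `Observation` shape are assumptions (file lists are treated as already-parsed arrays), and which observation survives per session is not stated in the commit, so this sketch keeps the highest-ranked one.

```typescript
interface Observation {
  id: number;
  memory_session_id: string;
  files_modified: string[]; // assumed to be already-parsed arrays
  files_read: string[];
}

// 2 = file was modified, 1 = file was only read, 0 = not tagged at all.
function relation(obs: Observation, filePath: string): number {
  if (obs.files_modified.includes(filePath)) return 2;
  if (obs.files_read.includes(filePath)) return 1;
  return 0;
}

function taggedCount(obs: Observation): number {
  return obs.files_modified.length + obs.files_read.length;
}

// Rank by specificity, keep one observation per session, cap for display.
function selectForTimeline(candidates: Observation[], filePath: string, displayLimit: number = 15): Observation[] {
  const ranked = candidates.slice().sort(
    (a, b) =>
      relation(b, filePath) - relation(a, filePath) || // modified beats read
      taggedCount(a) - taggedCount(b)                  // fewer tagged files wins
  );
  const seenSessions = new Set<string>();
  const selected: Observation[] = [];
  for (const obs of ranked) {
    if (selected.length >= displayLimit) break;
    if (seenSessions.has(obs.memory_session_id)) continue;
    seenSessions.add(obs.memory_session_id);
    selected.push(obs);
  }
  return selected;
}
```

Fetching more candidates than are displayed (40 versus 15 in the commit) leaves room for the per-session dedup to discard rows without starving the timeline.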
Alex Newman 2b8fbcf50e Merge main into thedotmack/file-read-timeline-inject
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 03:00:06 -07:00
Alessandro Costa 876cc4d837 feat: semantic context injection via Chroma on UserPromptSubmit (#1568)
* feat: semantic context injection via Chroma on every UserPromptSubmit

On each prompt, queries ChromaDB for the top-N most relevant past
observations and injects them as additionalContext. Replaces the
recency-based "last N observations" approach with relevance-based
semantic search.

Changes:
- session-init.ts: After session init, query /api/context/semantic
  with user's prompt text. If results found, return as
  hookSpecificOutput with hookEventName 'UserPromptSubmit'.
- SearchRoutes.ts: New GET /api/context/semantic endpoint that queries
  SearchManager with format='json' and formats results as markdown.
- SettingsDefaultsManager.ts: New settings CLAUDE_MEM_SEMANTIC_INJECT
  (default: true) and CLAUDE_MEM_SEMANTIC_INJECT_LIMIT (default: 5).

Key behaviors:
- Fires on every UserPromptSubmit (not just SessionStart)
- Minimum prompt length: 20 chars (skips "ok", "yes", etc.)
- Skips media-only prompts
- Graceful degradation: if worker/Chroma unavailable, no injection
- Survives /clear: re-injects on next prompt (not session-bound)
- Uses workerHttpRequest (v10.6.3 API, not raw fetch)

Production data (23 days, 3,400+ observations):
- Before: 8 most recent observations (often irrelevant to current topic)
- After: 5 most relevant observations (semantic match)
- Token cost: ~1800 → ~800-1200 per injection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address CodeRabbit review on PR #1568

- session-init: don't skip semantic injection when contextInjected=true
  (only skip agent re-init, semantic lookup must run every prompt)
- session-init: normalize SEMANTIC_INJECT toggle via String().toLowerCase()
- semantic endpoint: change from GET to POST to avoid URL-length limits
  and prompt exposure in access logs. Handler accepts both body and query
  for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alessandro Costa <alessandro@claudio.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:16:46 -07:00
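Note on 876cc4d837: the gating conditions for semantic injection (toggle plus minimum prompt length) might look like the check below. The function name is hypothetical, and two details are assumptions: the prompt is trimmed before measuring, and any toggle value other than "false" counts as enabled, matching the default of true. The media-only check is omitted because the commit does not say how it is detected.

```typescript
const MIN_PROMPT_LENGTH = 20; // from the commit message

// `toggle` mirrors a settings value that may arrive as a string, a boolean,
// or be missing entirely.
function shouldInjectSemanticContext(prompt: string, toggle: unknown): boolean {
  // Normalize so "TRUE", "true", and true all behave the same.
  const normalized = String(toggle ?? "true").toLowerCase();
  if (normalized === "false") return false;
  // Short acknowledgements such as "ok" or "yes" are not worth a query.
  return prompt.trim().length >= MIN_PROMPT_LENGTH;
}
```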
Alessandro Costa 64cce2bf10 fix: resolve 3 upstream bugs (summarize, ChromaSync, HealthMonitor) (#1566)
* fix: resolve 3 upstream bugs in summarize, ChromaSync, and HealthMonitor

1. summarize.ts: Skip summary when transcript has no assistant message.
   Prevents error loop where empty transcripts cause repeated failed
   summarize attempts (~30 errors/day observed in production).

2. ChromaSync.ts: Fall back to chroma_update_documents when add fails
   with "IDs already exist". Handles partial writes after MCP timeout
   without waiting for next backfill cycle.

3. HealthMonitor.ts: Replace HTTP-based isPortInUse with atomic socket
   bind on Unix. Eliminates TOCTOU race when two sessions start
   simultaneously (HTTP check is non-atomic — both see "port free"
   before either completes listen()). Updated tests accordingly.

All three bugs are pre-existing in v10.5.5. Confirmed via log analysis
of 543K lines over 17 days of production usage across two servers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add CONTRIB_NOTES.md to gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address CodeRabbit review on PR #1566

- HealthMonitor: add APPROVED OVERRIDE annotation for Win32 HTTP fallback
- ChromaSync: replace chroma_update_documents with delete+add for proper
  upsert (update only modifies existing IDs, silently ignores missing ones)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alessandro Costa <alessandro@claudio.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:15:08 -07:00
Alex Newman a2ac116aac fix: move summary wait + session-complete into Stop hook to prevent lost summaries
SessionEnd has a 1.5s hardcoded cap from Claude Code (CLAUDE_CODE_SESSIONEND_HOOKS_TIMEOUT_MS),
making it unsuitable for waiting on async work. Previously, the Stop hook would fire-and-forget
the summarize request, then SessionEnd would immediately call deleteSession — aborting the SDK
agent mid-summary.

Now the Stop hook (120s timeout, no cap) owns the full lifecycle:
1. Queue summarize request
2. Poll new GET /api/sessions/status endpoint until queue drains
3. Call /api/sessions/complete after summary finishes

SessionEnd is now a true fire-and-forget fallback (process.exit(0) immediately).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:05:53 -07:00
Alex Newman 8265fc7aa1 Merge remote-tracking branch 'origin/thedotmack/npx-gemini-cli' into thedotmack/npx-gemini-cli
Resolve merge conflicts in adapter index, gemini-cli adapter, and rebuilt CJS artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:47:49 -07:00
Alex Newman 76a880a3d6 feat: update install CLI, ESM compat, and Gemini CLI docs
Fixes CursorHooksInstaller ESM compatibility, updates install command
with improved path resolution, and refreshes built plugin artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 12:38:45 -07:00
Alex Newman 67645041fa Merge main into thedotmack/file-read-timeline-inject
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:11:41 -07:00
Alex Newman 80d1deedbe fix: address PR review feedback from CodeRabbit
- Add sessionId to summarize.ts warning log for easier triage
- Add APPROVED OVERRIDE annotation to Windows spawn catch block

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:34:42 -07:00
Alex Newman 07ab7000a8 fix: patch 7 critical bugs affecting all non-dev-machine users and Windows
1. Fix esbuild inlining build-machine __dirname as string literal — use
   CJS-compatible runtime banner with require("node:url").fileURLToPath
   across worker-service, mcp-server, and context-generator builds.

2. Fix isMainModule check missing .cjs extension and Windows backslash
   path normalization.

3. Wrap extractLastMessage in try-catch to prevent infinite Stop hook
   feedback loop on malformed transcripts (exit 0 instead of exit 2).

4. Replace heavy SessionEnd hook (Node→Bun→1.7MB CJS→HTTP) with
   lightweight inline node -e one-liner (~200ms vs >1s).

5. Add 7 Gemini/OpenRouter error patterns to unrecoverablePatterns
   circuit breaker to prevent 77K+ retry loops on expired API keys.

6. Preserve CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_CODE_GIT_BASH_PATH in
   sanitizeEnv instead of stripping them with the CLAUDE_CODE_ prefix.

7. Use PowerShell -EncodedCommand for spawnDaemon to fix path quoting
   when Windows usernames contain spaces.

Closes #1515, #1495, #1475, #1465, #1500, #1513, #1512, #1450, #1460,
#1486, #1449, #1481, #1451, #1480, #1453, #1445

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:20:29 -07:00
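Note on 07ab7000a8, item 6: a prefix-based environment sanitizer with an allowlist could be written as below. This is a sketch, not the project's `sanitizeEnv`. The constant names are hypothetical; the prefix and the two preserved variable names are taken from the commit message.

```typescript
const STRIPPED_PREFIX = "CLAUDE_CODE_";

// Variables that share the stripped prefix but must survive.
const PRESERVED_KEYS = new Set<string>([
  "CLAUDE_CODE_OAUTH_TOKEN",
  "CLAUDE_CODE_GIT_BASH_PATH",
]);

function sanitizeEnv(env: Record<string, string | undefined>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    if (value === undefined) continue; // drop unset entries
    if (key.startsWith(STRIPPED_PREFIX) && !PRESERVED_KEYS.has(key)) continue;
    result[key] = value;
  }
  return result;
}
```

Checking the allowlist after the prefix test keeps the default behaviour (strip everything with the prefix) and makes each exception explicit.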
Conductor 5621b67ccd Saving uncommitted changes before archiving 2026-03-26 19:35:27 -07:00
Alex Newman a656af2bff feat: improve Gemini CLI timeline display by stripping ANSI colors and providing markdown fallback 2026-03-25 23:51:56 -07:00
huakson 2b60dd2932 feat: isolate Claude and Codex session sources
Persist platform_source across session creation, transcript ingestion, API query paths, and viewer state so Claude and Codex data can coexist without bleeding into each other.

- add platform-source normalization helpers and persist platform_source in sdk_sessions via migration 24 with backfill and indexing
- thread platformSource through CLI hooks, transcript processing, context generation, pagination, search routes, SSE payloads, and session management
- expose source-aware project catalogs, viewer tabs, context preview selectors, and source badges for observations, prompts, and summaries
- start the transcript watcher from the worker for transcript-based clients and preserve platform source during Codex ingestion
- auto-start the worker from the MCP server for MCP-only clients and tighten stdio-driven cleanup during shutdown
- keep createSDKSession backward compatible with existing custom-title callers while allowing explicit platform source forwarding
2026-03-24 08:46:18 -03:00
Alex Newman f2cc33b494 feat: add Gemini CLI, OpenCode, and Windsurf IDE integrations
Gemini CLI: platform adapter mapping 6 of 11 hooks, settings.json
deep-merge installer, GEMINI.md context injection.

OpenCode: plugin with tool.execute.after interceptor, bus events for
session lifecycle, claude_mem_search custom tool, AGENTS.md context.

Windsurf: platform adapter for tool_info envelope format, hooks.json
installer for 5 post-action hooks, .windsurf/rules context injection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 23:02:18 -07:00
Alex Newman c80763390b feat: file-read decision gate — block reads when observation history exists
Add a PreToolUse gate that blocks file reads on first attempt when rich
observation history exists, presenting the timeline as feedback. Claude
then decides: use get_observations() (skip read, save tokens) or re-read
(allowed on second attempt).

- FileReadGate: in-memory session-scoped gate with 4h TTL
- POST /api/file-context/gate endpoint in worker
- stderrMessage plumbing in hook-command for exit code 2
- file-context handler uses gate to block/allow reads

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 12:11:02 -07:00
Alex Newman e07b13f7de fix: proper project isolation and relative path matching for file-context hook
- Use getProjectContext(cwd).allProjects for project scoping (same as SessionStart)
- Convert absolute file_path to relative using cwd (observations store relative paths)
- API accepts comma-separated projects param with IN() SQL filter
- Remove basename matching — use full relative path to avoid cross-file collisions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:38:53 -07:00
Alex Newman 1d48f63b99 fix: remove project filter from file-context hook — cwd != stored project name
The handler was passing input.cwd (full absolute path) as the project
filter, but observations store short project names ('san-diego', not
'/Users/.../san-diego'). This caused zero results for every query.
Removing the filter entirely is better: cross-project observations
about the same file are useful for duplicate prevention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:24:34 -07:00
Alex Newman fb9d917f8a feat: inject file observation timeline on PreToolUse Read hook
When Claude reads a file, the PreToolUse hook queries for existing
observations about that file and injects the timeline into context
via additionalContext + permissionDecision: allow. This prevents
duplicate observations and saves tokens through active rediscovery.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:18:54 -07:00
Alex Newman 80a8c90a1a feat: add embedded Process Supervisor for unified process lifecycle (#1370)
* feat: add embedded Process Supervisor for unified process lifecycle management

Consolidates scattered process management (ProcessManager, GracefulShutdown,
HealthMonitor, ProcessRegistry) into a unified src/supervisor/ module.

New: ProcessRegistry with JSON persistence, env sanitizer (strips CLAUDECODE_*
vars), graceful shutdown cascade (SIGTERM → 5s wait → SIGKILL with tree-kill
on Windows), PID file liveness validation, and singleton Supervisor API.

Fixes #1352 (worker inherits CLAUDECODE env causing nested sessions)
Fixes #1356 (zombie TCP socket after Windows reboot)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add session-scoped process reaping to supervisor

Adds reapSession(sessionId) to ProcessRegistry for killing session-tagged
processes on session end. SessionManager.deleteSession() now triggers reaping.
Tightens orphan reaper interval from 60s to 30s.

Fixes #1351 (MCP server processes leak on session end)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Unix domain socket support for worker communication

Introduces socket-manager.ts for UDS-based worker communication, eliminating
port 37777 collisions between concurrent sessions. Worker listens on
~/.claude-mem/sockets/worker.sock by default with TCP fallback.

All hook handlers, MCP server, health checks, and admin commands updated to
use socket-aware workerHttpRequest(). Backwards compatible — settings can
force TCP mode via CLAUDE_MEM_WORKER_TRANSPORT=tcp.

Fixes #1346 (port 37777 collision across concurrent sessions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove in-process worker fallback from hook command

Removes the fallback path where hook scripts started WorkerService in-process,
making the worker a grandchild of Claude Code (killed by sandbox). Hooks now
always delegate to ensureWorkerStarted() which spawns a fully detached daemon.

Fixes #1249 (grandchild process killed by sandbox)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add health checker and /api/admin/doctor endpoint

Adds 30-second periodic health sweep that prunes dead processes from the
supervisor registry and cleans stale socket files. Adds /api/admin/doctor
endpoint exposing supervisor state, process liveness, and environment health.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add comprehensive supervisor test suite

64 tests covering all supervisor modules: process registry (18 tests),
env sanitizer (8), shutdown cascade (10), socket manager (15), health
checker (5), and supervisor API (6). Includes persistence, isolation,
edge cases, and cross-module integration scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert Unix domain socket transport, restore TCP on port 37777

The socket-manager introduced UDS as default transport, but this broke
the HTTP server's TCP accessibility (viewer UI, curl, external monitoring).
Since there's only ever one worker process handling all sessions, the
port collision rationale for UDS doesn't apply. Reverts to TCP-only,
removing ~900 lines of unnecessary complexity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove dead code found in pre-landing review

Remove unused `acceptingSpawns` field from Supervisor class (written but
never read — assertCanSpawn uses stopPromise instead) and unused
`buildWorkerUrl` import from context handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* updated gitignore

* fix: address PR review feedback - downgrade HTTP logging, clean up gitignore, harden supervisor

- Downgrade request/response HTTP logging from info to debug to reduce noise
- Remove unused getWorkerPort imports, use buildWorkerUrl helper
- Export ENV_PREFIXES/ENV_EXACT_MATCHES from env-sanitizer, reuse in Server.ts
- Fix isPidAlive(0) returning true (should be false)
- Add shutdownInitiated flag to prevent signal handler race condition
- Make validateWorkerPidFile testable with pidFilePath option
- Remove unused dataDir from ShutdownCascadeOptions
- Upgrade reapSession log from debug to warn
- Rename zombiePidFiles to deadProcessPids (returns actual PIDs)
- Clean up gitignore: remove duplicate datasets/, stale ~*/ and http*/ patterns
- Fix tests to use temp directories instead of relying on real PID file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:49:23 -07:00
AlexWorland 10e980cd69 fix: remove unrecognized fields from Claude Code Stop hook output (#1291)
* fix: remove unrecognized fields from Claude Code Stop hook output

Claude Code validates Stop hook JSON output against its hook contract
schema which only accepts {decision?, reason?, systemMessage?}. The
formatOutput() function was returning {continue, suppressOutput} which
are not part of the Claude Code hook API, causing "JSON validation
failed" errors on every session stop.

Return an empty object {} for the default case (no hookSpecificOutput),
preserving only systemMessage when present. This is valid for all hook
event types and eliminates the schema validation error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add unhappy-path tests for formatOutput per PR review

Add edge case coverage for malformed input (undefined/null), falsy
systemMessage values, non-contract field stripping, and contract key
allowlist. Also add defensive null guard to formatOutput matching
normalizeInput pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Alex Worland <alexworland@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:59:45 -07:00
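Note on 10e980cd69: the output shaping described above can be sketched as an allowlist-style formatter. `formatStopOutput` and the `HookResult` shape are hypothetical stand-ins for the project's `formatOutput()`. Passing `hookSpecificOutput` through unchanged is an assumption, since the commit only describes the default case.

```typescript
// Internal result shape (assumed); `continue` and `suppressOutput` are the
// fields the commit says must not leak into the emitted JSON.
interface HookResult {
  continue?: boolean;
  suppressOutput?: boolean;
  systemMessage?: string;
  hookSpecificOutput?: Record<string, unknown>;
}

function formatStopOutput(result: HookResult | null | undefined): Record<string, unknown> {
  const output: Record<string, unknown> = {};
  if (result == null) return output; // defensive guard for malformed input
  if (result.hookSpecificOutput) {
    output.hookSpecificOutput = result.hookSpecificOutput;
  }
  // Only a non-empty string is emitted; falsy values are dropped.
  if (typeof result.systemMessage === "string" && result.systemMessage.length > 0) {
    output.systemMessage = result.systemMessage;
  }
  return output;
}
```

Building the output from an empty object, rather than deleting fields from the input, means any field added to `HookResult` later stays out of the emitted JSON unless it is copied explicitly.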
alan e0fec4bad7 feat: add terminal output control for SessionStart context (#1143)
* feat: add terminal output control for SessionStart context

Add CLAUDE_MEM_CONTEXT_SHOW_TERMINAL_OUTPUT setting to control whether
context is displayed in the terminal at SessionStart.

When set to "false", the terminal remains clean at startup while
context is still injected into Claude's system prompt. This allows
users who find the context output verbose to disable it without
losing the automatic context injection.

Defaults to "true" for backward compatibility.

Changes:
- Add CLAUDE_MEM_CONTEXT_SHOW_TERMINAL_OUTPUT to SettingsDefaultsManager
- Check setting in context handler before setting systemMessage
- Update settings file format to include new option

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: use USER_SETTINGS_PATH and skip color fetch when disabled

Address PR feedback from automated review:

1. Use shared USER_SETTINGS_PATH constant instead of hardcoded path
   - Respects custom CLAUDE_MEM_DATA_DIR override
   - Consistent with other handlers (session-init, observation)

2. Skip color fetch when terminal output disabled
   - Check setting before making HTTP requests
   - Saves network round-trip on every session start

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Alan Dong <adong@Alans-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 21:05:05 -05:00
Alex Newman c6f932988a Fix 30+ root-cause bugs across 10 triage phases (#1214)
* MAESTRO: fix ChromaDB core issues — Python pinning, Windows paths, disable toggle, metadata sanitization, transport errors

- Add --python version pinning to uvx args in both local and remote mode (fixes #1196, #1206, #1208)
- Convert backslash paths to forward slashes for --data-dir on Windows (fixes #1199)
- Add CLAUDE_MEM_CHROMA_ENABLED setting for SQLite-only fallback mode (fixes #707)
- Sanitize metadata in addDocuments() to filter null/undefined/empty values (fixes #1183, #1188)
- Wrap callTool() in try/catch for transport errors with auto-reconnect (fixes #1162)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix data integrity — content-hash deduplication, project name collision, empty project guard, stuck isProcessing

- Add SHA-256 content-hash deduplication to observations INSERT (store.ts, transactions.ts, SessionStore.ts)
- Add content_hash column via migration 22 with backfill and index
- Fix project name collision: getCurrentProjectName() now returns parent/basename
- Guard against empty project string with cwd-derived fallback
- Fix stuck isProcessing: hasAnyPendingWork() resets processing messages older than 5 minutes
- Add 12 new tests covering all four fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
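Note on the stuck isProcessing fix above: resetting messages that have been in the processing state for more than 5 minutes could look like this. The `QueuedMessage` shape and the function name are hypothetical; only the 5-minute threshold comes from the commit message, and the real logic lives inside `hasAnyPendingWork()`.

```typescript
const STUCK_AFTER_MS = 5 * 60 * 1000; // 5 minutes, per the commit message

interface QueuedMessage {
  id: number;
  status: "pending" | "processing";
  startedAtMs: number | null; // when processing began, if it has
}

// Return a new list where stale "processing" entries are pending again.
function resetStuckMessages(messages: QueuedMessage[], nowMs: number): QueuedMessage[] {
  return messages.map((message): QueuedMessage => {
    const isStuck =
      message.status === "processing" &&
      message.startedAtMs !== null &&
      nowMs - message.startedAtMs > STUCK_AFTER_MS;
    return isStuck ? { ...message, status: "pending", startedAtMs: null } : message;
  });
}
```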

* MAESTRO: fix hook lifecycle — stderr suppression, output isolation, conversation pollution prevention

- Suppress process.stderr.write in hookCommand() to prevent Claude Code showing diagnostic
  output as error UI (#1181). Restores stderr in finally block for worker-continues case.
- Convert console.error() to logger.warn()/error() in hook-command.ts and handlers/index.ts
  so all diagnostics route to log file instead of stderr.
- Verified all 7 handlers return suppressOutput: true (prevents conversation pollution #598, #784).
- Verified session-complete is a recognized event type (fixes #984).
- Verified unknown event types return no-op handler with exit 0 (graceful degradation).
- Added 10 new tests in tests/hook-lifecycle.test.ts covering event dispatch, adapter defaults,
  stderr suppression, and standard response constants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix worker lifecycle — restart loop coordination, stale transport retry, ENOENT shutdown race

- Add PID file mtime guard to prevent concurrent restart storms (#1145):
  isPidFileRecent() + touchPidFile() coordinate across sessions
- Add transparent retry in ChromaMcpManager.callTool() on transport
  error — reconnects and retries once instead of failing (#1131)
- Wrap getInstalledPluginVersion() with ENOENT/EBUSY handling (#1042)
- Verified ChromaMcpManager.stop() already called on all shutdown paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Windows platform support — uvx.cmd spawn, PowerShell $_ elimination, windowsHide, FTS5 fallback

- Route uvx spawn through cmd.exe /c on Windows since MCP SDK lacks shell:true (#1190, #1192, #1199)
- Replace all PowerShell Where-Object {$_} pipelines with WQL -Filter server-side filtering (#1024, #1062)
- Add windowsHide: true to all exec/spawn calls missing it to prevent console popups (#1048)
- Add FTS5 runtime probe with graceful fallback when unavailable on Windows (#791)
- Guard FTS5 table creation in migrations, SessionSearch, and SessionStore with try/catch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix skills/ distribution — build-time verification and regression tests (#1187)

Add post-build verification in build-hooks.js that fails if critical
distribution files (skills, hooks, plugin manifest) are missing. Add
10 regression tests covering skill file presence, YAML frontmatter,
hooks.json integrity, and package.json files field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix MigrationRunner schema initialization (#979) — version conflict between parallel migration systems

Root cause: old DatabaseManager migrations 1-7 shared schema_versions table with
MigrationRunner's 4-22, causing version number collisions (5=drop tables vs add column,
6=FTS5 vs prompt tracking, 7=discovery_tokens vs remove UNIQUE). initializeSchema()
was gated behind maxApplied===0, so core tables were never created when old versions
were present.

Fixes:
- initializeSchema() always creates core tables via CREATE TABLE IF NOT EXISTS
- Migrations 5-7 check actual DB state (columns/constraints) not just version tracking
- Crash-safe temp table rebuilds (DROP IF EXISTS _new before CREATE)
- Added missing migration 21 (ON UPDATE CASCADE) to MigrationRunner
- Added ON UPDATE CASCADE to FK definitions in initializeSchema()
- All changes applied to both runner.ts and SessionStore.ts

Tests: 13 new tests in migration-runner.test.ts covering fresh DB, idempotency,
version conflicts, crash recovery, FK constraints, and data integrity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix 21 test failures — stale mocks, outdated assertions, missing OpenClaw guards

Server tests (12): Added missing workerPath and getAiStatus to ServerOptions
mocks after interface expansion. ChromaSync tests (3): Updated to verify
transport cleanup in ChromaMcpManager after architecture refactor. OpenClaw (2):
Added memory_ tool skipping and response truncation to prevent recursive loops
and oversized payloads. MarkdownFormatter (2): Updated assertions to match
current output. SettingsDefaultsManager (1): Used correct default key for
getBool test. Logger standards (1): Excluded CLI transcript command from
background service check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Codex CLI compatibility (#744) — session_id fallbacks, unknown platform tolerance, undefined guard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Cursor IDE integration (#838, #1049) — adapter field fallbacks, tolerant session-init validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix /api/logs OOM (#1203) — tail-read replaces full-file readFileSync

Replace readFileSync (loads entire file into memory) with readLastLines()
that reads only from the end of the file in expanding chunks (64KB → 10MB cap).
Prevents OOM on large log files while preserving the same API response shape.
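
A rough sketch of the expanding-chunk tail read, under stated assumptions: the file is abstracted as a hypothetical `TailSource` so the logic stays self-contained, offsets are treated as character offsets rather than bytes, and the signature shown here is illustrative rather than the actual `readLastLines()` in the codebase. Only the 64KB starting size and 10MB cap come from the message above.

```typescript
// Hypothetical stand-in for a file handle: total size plus a ranged read.
interface TailSource {
  size: number;
  // Returns the text stored in [start, end).
  read(start: number, end: number): string;
}

function readLastLines(
  source: TailSource,
  lineCount: number,
  initialChunk: number = 64 * 1024,
  maxChunk: number = 10 * 1024 * 1024
): string[] {
  if (lineCount <= 0) return [];
  let chunk = initialChunk;
  while (true) {
    const start = Math.max(0, source.size - chunk);
    const lines = source.read(start, source.size).split("\n");
    // A trailing newline produces an empty final element; drop it.
    if (lines.length > 0 && lines[lines.length - 1] === "") lines.pop();
    const reachedStart = start === 0;
    // If the window starts mid-file, its first line may be partial; discard it.
    const complete = reachedStart ? lines : lines.slice(1);
    if (complete.length >= lineCount || reachedStart || chunk >= maxChunk) {
      return complete.slice(-lineCount);
    }
    // Not enough complete lines yet: double the window, up to the cap.
    chunk = Math.min(chunk * 2, maxChunk);
  }
}
```

Because the window only grows when too few complete lines were found, a request for the last few lines of a very large log reads a single small chunk, and the cap bounds memory use even when the file contains extremely long lines.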

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Settings CORS error (#1029) — explicit methods and allowedHeaders in CORS config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: add session custom_title for agent attribution (#1213) — migration 23, endpoint + store support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: prevent CLAUDE.md/AGENTS.md writes inside .git/ directories (#1165)

Add .git path guard to all 4 write sites to prevent ref corruption when
paths resolve inside .git internals.
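
One way such a guard could look, as a sketch only: `isInsideGitDir` is a hypothetical name, and the actual check at the four write sites may be implemented differently.

```typescript
// Returns true when any path segment is exactly ".git".
// Splits on both separators so Windows-style paths are covered as well.
function isInsideGitDir(filePath: string): boolean {
  return filePath.split(/[\\/]+/).some((segment) => segment === ".git");
}
```

Matching whole segments rather than substrings avoids false positives on names such as `.github` or `.gitignore`.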

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix plugin disabled state not respected (#781) — early exit check in all hook entry points

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix UserPromptSubmit context re-injection on every turn (#1079) — contextInjected session flag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix stale AbortController queue stall (#1099) — lastGeneratorActivity tracking + 30s timeout

Three-layer fix:
1. Added lastGeneratorActivity timestamp to ActiveSession, updated by
   processAgentResponse (all agents), getMessageIterator (queue yields),
   and startGeneratorWithProvider (generator launch)
2. Added stale generator detection in ensureGeneratorRunning — if no
   activity for >30s, aborts stale controller, resets state, restarts
3. Added AbortSignal.timeout(30000) in deleteSession to prevent
   indefinite hang when awaiting a stuck generator promise
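
The stale-generator check in layer 2 might be sketched as follows. `ActiveSessionSketch`, `isGeneratorStale`, and `resetIfStale` are hypothetical names, the controller is reduced to a minimal `abort()` interface, and restarting the generator is left to the caller; only the `lastGeneratorActivity` field and the 30s threshold come from the message above.

```typescript
// Threshold from the commit message: no activity for more than 30s is stale.
const STALE_GENERATOR_MS = 30000;

// Hypothetical, reduced view of a session; the real ActiveSession has more fields.
interface ActiveSessionSketch {
  lastGeneratorActivity: number; // epoch milliseconds of the last observed activity
  controller: { abort(): void } | null;
}

function isGeneratorStale(session: ActiveSessionSketch, now: number): boolean {
  return now - session.lastGeneratorActivity > STALE_GENERATOR_MS;
}

// Aborts and clears a stale controller. Returns true when a reset happened,
// signalling that the caller should start a fresh generator.
function resetIfStale(session: ActiveSessionSketch, now: number): boolean {
  if (!isGeneratorStale(session, now)) return false;
  if (session.controller !== null) session.controller.abort();
  session.controller = null;
  session.lastGeneratorActivity = now;
  return true;
}
```

Passing `now` explicitly instead of calling `Date.now()` inside keeps the check deterministic and easy to test.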

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 19:34:35 -05:00
Albert Hui 42adfe29c8 fix: gracefully handle missing input fields in hook handlers (#1098)
The summarize (Stop) and observation (PostToolUse) handlers throw
blocking errors (exit code 2) when optional input fields like
transcriptPath, toolName, or cwd are missing. This causes visible
hook errors on every session stop and after some tool uses.

Replace throws with graceful returns matching the existing pattern
used for worker-unavailable checks.

Fixes #1097

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:25:55 -05:00
Alex Newman 676a3d175e fix: make context and colored timeline fetches truly parallel
Address PR #1125 review feedback - both fetches now start simultaneously
via Promise.all instead of sequential-then-parallel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:11:25 -05:00
Alex Newman 34358ab33d feat: add systemMessage support for SessionStart hook and tune defaults
Add systemMessage field to HookResult so SessionStart can display a
colored timeline directly to the user in the CLI. The handler now
parallel-fetches both markdown (for Claude context) and ANSI-colored
(for user display) timelines, appending a viewer URL link.

Also update default settings to hide verbose token columns (read/work
tokens, savings amount) and disable full observation expansion, keeping
the cleaner index-only view by default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:05:13 -05:00
Alex Newman cb0933a908 fix: resolve merge conflict in isWorkerUnavailableError
Missing return statement and closing brace in the programming errors
check caused a build failure after merging main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 23:47:46 -05:00
Rod Boev 22683f6910 fix: clarify TypeError order dependency in error classifier
Address Greptile review: add comment noting that TypeError('fetch failed')
is already handled by transport patterns before the instanceof check.
2026-02-10 17:50:47 -05:00
Rod Boev 7ffa1b06ee Clarify order dependency
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-10 17:36:13 -05:00
Rod Boev 418e38ee46 fix: hook resilience and worker lifecycle improvements (#957, #923, #984, #987, #1042)
Reduce timeouts to eliminate 10-30s startup delay when worker is dead
(common on WSL2 after hibernate). Add stale PID detection, graceful
error handling across all handlers, and error classification that
distinguishes worker unavailability from handler bugs.

- HEALTH_CHECK 30s→3s, new POST_SPAWN_WAIT (5s), PORT_IN_USE_WAIT (3s)
- isProcessAlive() with EPERM handling, cleanStalePidFile()
- getPluginVersion() try-catch for shutdown race (#1042)
- isWorkerUnavailableError: transport+5xx+429→exit 0, 4xx→exit 2
- No-op handler for unknown event types (#984)
- Wrap all handler fetch calls in try-catch for graceful degradation
- CLAUDE_MEM_HEALTH_TIMEOUT_MS env var override with validation
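
The exit-code mapping for isWorkerUnavailableError can be sketched like this. The transport patterns listed are assumptions for illustration (the message does not enumerate them), `exitCodeForFailure` is a hypothetical helper, and treating every unclassified failure as exit 2 is likewise an assumption.

```typescript
// Assumed transport-level failure patterns; the real list is not given above.
const TRANSPORT_PATTERNS: RegExp[] = [/ECONNREFUSED/i, /ECONNRESET/i, /fetch failed/i];

// status is the HTTP status when a response was received, otherwise null.
function isWorkerUnavailableError(status: number | null, message: string): boolean {
  // Transport patterns are checked first, so a "fetch failed" TypeError is
  // classified here before any later checks run.
  if (TRANSPORT_PATTERNS.some((pattern) => pattern.test(message))) return true;
  // 5xx and 429 mean the worker is down or overloaded, not that the hook is wrong.
  if (status !== null && (status >= 500 || status === 429)) return true;
  return false;
}

// Exit 0 degrades gracefully; exit 2 surfaces a blocking error.
function exitCodeForFailure(status: number | null, message: string): 0 | 2 {
  return isWorkerUnavailableError(status, message) ? 0 : 2;
}
```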
2026-02-10 15:34:35 -05:00
Alex Newman 8dfcb5e612 chore: bump version to 9.1.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:05:38 -05:00
Alex Newman ff503d08a7 MAESTRO: Merge PR #657 - Add generate/clean CLI commands for CLAUDE.md management
Cherry-picked source changes from PR #657 (224 commits behind main).
Adds `claude-mem generate` and `claude-mem clean` CLI commands:
- New src/cli/claude-md-commands.ts with generateClaudeMd() and cleanClaudeMd()
- Worker service generate/clean case handlers with --dry-run support
- CLAUDE_MD logger component type
- Uses shared isDirectChild from path-utils.ts (DRY improvement over PR original)

Skipped from PR: 91 CLAUDE.md file deletions (stale), build artifacts,
.claude/plans/ dev artifact, smart-install.js shell alias auto-injection
(aggressive profile modification without consent).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 05:52:54 -05:00
Alex Newman bf439043cf MAESTRO: Merge PRs #920 and #699 - Add project exclusion and folder CLAUDE.md exclusion settings
Cherry-picked both PRs to main (both had merge conflicts with current main).

PR #920 (@Spunky84): CLAUDE_MEM_EXCLUDED_PROJECTS setting with glob patterns
to exclude entire projects from memory tracking (privacy/confidentiality).
Early-exit in session-init and observation handlers. 11 unit tests.

PR #699 (@leepokai): CLAUDE_MEM_FOLDER_MD_EXCLUDE setting with JSON array
of paths to exclude from CLAUDE.md file generation (fixes SwiftUI/Xcode
build conflicts and drizzle kit migration failures). Closes #620.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 05:23:01 -05:00
Alex Newman 5dffb1ebb0 MAESTRO: fix(hooks): add session-complete handler to enable orphan reaper cleanup
Cherry-picked from PR #844 by @thusdigital. Sessions stayed in the active
sessions map forever after summarize, causing the orphan reaper to treat
all processes as still active. Adds session-complete as a Stop phase 2
hook that calls POST /api/sessions/complete to remove sessions from the
active map, allowing the reaper to correctly identify and clean up
orphaned worker processes. Fixes #842.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 03:23:13 -05:00
Claude 830d7a2b23 fix: detect complete JSON instead of waiting for EOF
Root cause: Claude Code doesn't close stdin after writing hook input,
so stdin.on('end') never fires.

Previous approach: Timeout-based workaround (wait 5s then parse).

New approach: JSON is self-delimiting. We attempt to parse after each
data chunk. Once we have valid JSON, we resolve immediately without
waiting for EOF. This is the proper fix - hooks now exit in <500ms
instead of waiting for any timeout.

Changes:
- Add tryParseJson() to detect complete JSON
- Parse after each stdin chunk, resolve immediately on success
- Add 50ms parse delay for multi-chunk delivery edge case
- Safety timeout (30s) only for truly malformed input
- Removes dependency on stdin.on('end') which never fires
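
The parse-after-each-chunk idea can be sketched as below. Only the `tryParseJson` name comes from the list above; its signature here and the `createJsonAccumulator` helper are assumptions, and the 50ms parse delay and 30s safety timeout are omitted for brevity.

```typescript
// Returns the parsed value when `buffer` holds complete JSON, otherwise undefined.
function tryParseJson(buffer: string): unknown {
  const trimmed = buffer.trim();
  if (trimmed === "") return undefined;
  try {
    return JSON.parse(trimmed);
  } catch (_err) {
    // Incomplete or malformed so far; wait for more data.
    return undefined;
  }
}

// Hypothetical helper: returns a function to call with each stdin chunk.
// onComplete fires once, as soon as the accumulated text parses as JSON.
function createJsonAccumulator(onComplete: (value: unknown) => void): (chunk: string) => void {
  let buffer = "";
  let done = false;
  return (chunk: string): void => {
    if (done) return;
    buffer += chunk;
    const parsed = tryParseJson(buffer);
    if (parsed !== undefined) {
      done = true;
      onComplete(parsed);
    }
  };
}
```

In a hook this would be wired to the stdin data event. The approach relies on the input being a JSON object: a truncated object never parses, whereas a bare number split across chunks could parse too early.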

Testing:
- Normal operation: 448ms (was 5000ms+ with timeout approach)
- Stdin stays open: Process exits immediately after JSON complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 03:06:03 -05:00