e8082bb99221649abcf78a437935f81def1e6d91
700 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e8082bb992 |
fix: add missing migration 28 mirror in SessionStore (#2139) (#2140)
* fix: mirror migration 28 in SessionStore so pending_messages.tool_use_id and worker_pid columns are created (#2139) SessionStore's inline migration list jumped from v27 to v29, skipping rebuildPendingMessagesForSelfHealingClaim. The worker uses SessionStore directly via worker/DatabaseManager.ts and bypasses the canonical MigrationRunner, so fresh installs ended up at "max v29" with neither column present — every queue claim and observation insert failed. Adds addPendingMessagesToolUseIdAndWorkerPidColumns following the existing mirror precedent (addObservationSubagentColumns / addObservationsUniqueContentHashIndex). Uses ALTER TABLE + column-existence guards so already-broken DBs at v29 self-heal on next worker boot. Verified on fresh DB and on a synthetic v29-without-v28 broken DB: both columns and indexes (idx_pending_messages_worker_pid, ux_pending_session_tool) appear after one boot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: wrap v28 mirror dedup+index creation in transaction Addresses Greptile P2 review on PR #2140: matches the existing pattern in addObservationsUniqueContentHashIndex (v29 mirror at SessionStore.ts:1127) and runner.ts rebuildPendingMessagesForSelfHealingClaim. A crash between the dedup DELETE and the schema_versions INSERT no longer leaves the DB in a half-applied state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8e0e3ca109 |
fix: stop draining queue on /clear (remove SessionEnd shim) (#2136)
* fix: stop draining queue on /clear (and on every other SessionEnd) The SessionEnd hook was wired to session-complete on Claude Code, Gemini CLI, the transcripts processor, the OpenCode plugin, and OpenClaw. All of those paths called POST /api/sessions/complete, which marked the session completed and abandoned every still-pending observation in the queue. So typing /clear (or logging out, or quitting) wiped in-flight work that the worker was perfectly happy to keep processing on its own. Removed the entire shim: - Deleted SessionEnd hook block in plugin/hooks/hooks.json - Deleted src/cli/handlers/session-complete.ts and its registry entry - Deleted POST /api/sessions/complete route + Zod schema in SessionRoutes - Removed call from transcripts processor handleSessionEnd - Removed call from opencode-plugin session.deleted handler - Removed Gemini SessionEnd → session-complete mapping - Removed openclaw scheduleSessionComplete + completionDelayMs + timer state - Updated tests + comments accordingly Explicit user-initiated deletion (DELETE /api/sessions/:id and POST /api/sessions/:sessionDbId/complete from the viewer UI) still works via SessionCompletionHandler.completeByDbId — that's the only path that should drain the queue. The worker self-completes via its SDK-agent generator's finally-block, so no external completion call is needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: clarify opencode-plugin session.deleted is in-memory cleanup only Greptile P2: file-level header still implied session.deleted called the worker. Now it only cleans up the local contentSessionIdsByOpenCodeSessionId map; worker self-completes via the SDK-agent generator finally-block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
703c64c756 |
v12.4.3: one-time pollution cleanup migration + v12.4.1/v12.4.2 fixes (#2133)
* fix: 5 trivial bugs from v12.4.1 issue triage - #2092: emit CJS-safe banner (no import.meta.url) in worker-service.cjs - #2100: PreToolUse Read hook timeout 2000s → 60s - #2131: add "shell": "bash" to every hook for Windows compat - #2132: Antigravity dir typo .agent → .agents - #2088: clear inherited MCP servers in worker SDK query() calls Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: stop context overflow loop + block task-notification leak - SDKAgent: clear memorySessionId on "prompt is too long" so crash-recovery starts a fresh SDK session instead of resuming the same poisoned context forever (was producing 68+ failed pending_messages on a single stuck session in the wild) - tag-stripping: new isInternalProtocolPayload() predicate; session-init hook + SessionRoutes both skip storage when entire prompt is one of Claude Code's autonomous protocol blocks (currently <task-notification>; conservative deny-list — does NOT touch <command-name>/<command-message> which wrap real user slash-commands) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version to 12.4.2 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: update CHANGELOG.md for v12.4.2 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cleanup): one-time v12.4.3 migration purges observer-sessions and stuck pending_messages Adds CleanupV12_4_3 module that runs once per data dir on worker startup (after migrations apply, before Chroma backfill). Drops accumulated pollution that v12.4.0 (observer-sessions filter) and v12.4.2 (context-overflow guard + task-notification leak block) prevent from recurring: - DELETE FROM sdk_sessions WHERE project='observer-sessions' (cascades to user_prompts, observations, session_summaries via existing FK ON DELETE CASCADE) - DELETE FROM pending_messages stuck in 'failed'/'processing' for any session with >=10 such rows (poisoned chains from the pre-v12.4.2 retry loop; threshold spares legitimate transient failures) - Wipes ~/.claude-mem/chroma and chroma-sync-state.json so backfillAllProjects rebuilds the vector store from cleaned SQLite Pre-flight checks free disk (1.2x DB size + 100MB) via fs.statfsSync; backs up via VACUUM INTO with copyFileSync fallback; PRAGMA foreign_keys=ON on the cleanup connection (off by default in bun:sqlite). Marker file ~/.claude-mem/.cleanup-v12.4.3-applied records backup path and counts. Opt-out via CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1. Verified locally: 311MB DB backed up to 277MB in 943ms; 11 observer sessions + 3 cascade rows + 141 stuck pending_messages purged; chroma rebuilt via backfill. Total cleanup time 1.1s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address PR #2133 code review - SessionRoutes: check isInternalProtocolPayload before stripping tags so internal protocol prompts skip the strip work entirely. - tag-stripping: bound isInternalProtocolPayload input length to 256KB to prevent ReDoS-class scans on malformed unclosed tags. - SDKAgent: extract resetSessionForFreshStart helper; both context-overflow paths now share one nullification routine. - worker-service: drop the per-startup "Checking for one-time v12.4.3 cleanup" info log — runs every boot even after marker exists; the function already logs at debug/warn when relevant. - tests: add isInternalProtocolPayload edge cases (whitespace, attributes, partial tags, unrelated tags, oversize input). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address Greptile P2 comments on PR #2133 CleanupV12_4_3.ts: derive backup directory and restore-hint path from effectiveDataDir instead of the module-level BACKUPS_DIR/DB_PATH constants. The dataDirectory override is meant for test isolation; the prior version still wrote backups to the production directory. SessionRoutes.ts: move isInternalProtocolPayload guard to the top of handleSessionInitByClaudeId, before createSDKSession. The previous position blocked the user_prompts insert but still created an empty sdk_sessions row, asymmetric with the hook-layer guard in session-init.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cleanup): retry on disk-skip; survive chroma wipe failure CodeRabbit Major + Claude review: - Disk pre-flight skip no longer writes the marker. A user temporarily low on disk would otherwise have the cleanup permanently disabled even after freeing space. Retry on next startup instead. - Wrap wipeChromaArtifacts in try/catch and write the marker even on failure (with chromaWipeError captured). Without this, an rmSync permission failure on chroma/ left writeMarker unreached, so every subsequent boot re-ran the SQL purge AND created a fresh backup, consuming disk indefinitely. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cleanup): close backup handle before copyFileSync fallback Claude review: - backupDb is now closed before falling into the copyFileSync fallback. On Windows an open SQLite handle holds a file lock that can prevent the fallback copy from reading the source. The previous version only closed after both branches completed. - Add empty-body <task-notification></task-notification> case to the isInternalProtocolPayload tests for completeness. Cascade-row count queries already match the actual FK columns (content_session_id for user_prompts, memory_session_id for observations / session_summaries) — no fix needed there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cleanup): accurate session count + add migration tests Claude review v3: session-init.ts: filter on rawPrompt before the [media prompt] substitution. Functionally equivalent but explicit — the check no longer depends on the substitution leaving real protocol payloads untouched. CleanupV12_4_3.ts: counts.observerSessions now comes from a pre-DELETE COUNT(*), not from result.changes. bun:sqlite inflates result.changes with FTS-trigger and cascade row counts (the user_prompts_fts triggers inflate a 3-session purge to 19 changes). The previous code logged a misleading total and wrote it to the marker. tests/infrastructure/cleanup-v12_4_3.test.ts: happy-path coverage of the migration against a real on-disk SQLite under a tmpdir. Verifies observer-session purge with cascades, stuck pending_messages purge, chroma artifact wipe, marker payload shape, idempotency on re-run, and CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 opt-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(protocol-filter): close two-block false positive; address review CodeRabbit + Claude review v5: tag-stripping.ts: PROTOCOL_ONLY_REGEX rewritten with a negative-lookahead body so a prompt like "<task-notification>x</task-notification> hi <task-notification>y</task-notification>" no longer matches as a single outer block — the prior greedy [\s\S]* spanned the middle user text and would have silently dropped a real prompt. Confirmed via probe. tag-stripping.test.ts: drop the 50ms wall-clock assertion (CI flake); add the two-block-with-text case as a regression test. SessionRoutes.ts: filter on req.body.prompt directly, before the [media prompt] substitution and 256KB truncation. Mirrors the session-init.ts hook-layer ordering and ensures a protocol payload that happens to be near the byte limit isn't truncated before the filter runs. cleanup-v12_4_3.test.ts: add stuckCount=9 below-threshold case verifying pending_messages with <10 stuck rows are preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cleanup): include WAL/SHM in backup fallback; safer rollback CodeRabbit Major + Claude review v6: CleanupV12_4_3.ts: when VACUUM INTO fails and copyFileSync runs, also copy any -wal/-shm sidecars. The DB is configured WAL mode, so recent committed pages can live in those files; copying only the .db would miss them. VACUUM INTO already captures everything in one file, so the happy path is unaffected. CleanupV12_4_3.ts: wrap ROLLBACK in try/catch so a no-op rollback (SQLite already rolled back on a constraint failure) cannot shadow the original purge error. SDKAgent.ts: align both context-overflow log levels to error. Both branches are fatal-recovery paths; the previous warn/error split was inconsistent and made the throw branch easy to miss in logs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: pre-count stuck pending_messages; document adjacent-block fall-through Claude review v7: CleanupV12_4_3.ts: runStuckPendingPurge now uses a SELECT COUNT(*) before the DELETE, matching the pattern in runObserverSessionsPurge. result.changes is reliable today (no FTS on pending_messages) but the explicit count protects against future schema additions, and keeps the two purges symmetric. tag-stripping.test.ts: add test documenting that adjacent protocol blocks (no user text between) deliberately fall through to storage. The deny-list is per-block; concatenations are out of scope. Skipped per project rules / Node API constraints: - frsize fallback in disk check: Node/Bun StatFs doesn't expose frsize - VACUUM-INTO comment: comment-only suggestion - Overflow string constant extraction: low value Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5769f00827 |
perf(chroma): cache backfill watermarks in JSON to skip per-restart Chroma scans
Worker restarts triggered a full Chroma metadata scan for every project on every boot to figure out which sqlite ids were already embedded. With 253 projects and ~92k embeddings, this pegged chroma-mcp at 100-422% CPU on every spawn. Replace the scan with ~/.claude-mem/chroma-sync-state.json — per-project highest synced sqlite_id watermarks for observations/summaries/prompts. Backfill switches from "id NOT IN (huge list)" to "id > watermark"; live syncs bump the watermark on success; one-time bootstrap derives initial watermarks from a single Chroma scan if the state file is missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
94d592f212 |
perf: streamline worker startup and consolidate database connections (#2122)
* docs: pathfinder refactor corpus + Node 20 preflight
Adds the PATHFINDER-2026-04-22 principle-driven refactor plan (11 docs,
cross-checked PASS) plus the exploratory PATHFINDER-2026-04-21 corpus
that motivated it. Bumps engines.node to >=20.0.0 per the ingestion-path
plan preflight (recursive fs.watch). Adds the pathfinder skill.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: land PATHFINDER Plan 01 — data integrity
Schema, UNIQUE constraints, self-healing claim, Chroma upsert fallback.
- Phase 1: fresh schema.sql regenerated at post-refactor shape.
- Phase 2: migrations 23+24 — rebuild pending_messages without
started_processing_at_epoch; UNIQUE(session_id, tool_use_id);
UNIQUE(memory_session_id, content_hash) on observations; dedup
duplicate rows before adding indexes.
- Phase 3: claimNextMessage rewritten to self-healing query using
worker_pid NOT IN live_worker_pids; STALE_PROCESSING_THRESHOLD_MS
and the 60-s stale-reset block deleted.
- Phase 4: DEDUP_WINDOW_MS and findDuplicateObservation deleted;
observations.insert now uses ON CONFLICT DO NOTHING.
- Phase 5: failed-message purge block deleted from worker-service
2-min interval; clearFailedOlderThan method deleted.
- Phase 6: repairMalformedSchema and its Python subprocess repair
path deleted from Database.ts; SQLite errors now propagate.
- Phase 7: Chroma delete-then-add fallback gated behind
CHROMA_SYNC_FALLBACK_ON_CONFLICT env flag as bridge until
Chroma MCP ships native upsert.
- Phase 8: migration 19 no-op block absorbed into fresh schema.sql.
Verification greps all return 0 matches. bun test tests/sqlite/
passes 63/63. bun run build succeeds.
Plan: PATHFINDER-2026-04-22/01-data-integrity.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: land PATHFINDER Plan 02 — process lifecycle
OS process groups replace hand-rolled reapers. Worker runs until
killed; orphans are prevented by detached spawn + kill(-pgid).
- Phase 1: src/services/worker/ProcessRegistry.ts DELETED. The
canonical registry at src/supervisor/process-registry.ts is the
sole survivor; SDK spawn site consolidated into it via new
createSdkSpawnFactory/spawnSdkProcess/getSdkProcessForSession/
ensureSdkProcessExit/waitForSlot helpers.
- Phase 2: SDK children spawn with detached:true + stdio:
['ignore','pipe','pipe']; pgid recorded on ManagedProcessInfo.
- Phase 3: shutdown.ts signalProcess teardown uses
process.kill(-pgid, signal) on Unix when pgid is recorded;
Windows path unchanged (tree-kill/taskkill).
- Phase 4: all reaper intervals deleted — startOrphanReaper call,
staleSessionReaperInterval setInterval (including the co-located
WAL checkpoint — SQLite's built-in wal_autocheckpoint handles
WAL growth without an app-level timer), killIdleDaemonChildren,
killSystemOrphans, reapOrphanedProcesses, reapStaleSessions, and
detectStaleGenerator. MAX_GENERATOR_IDLE_MS and MAX_SESSION_IDLE_MS
constants deleted.
- Phase 5: abandonedTimer — already 0 matches; primary-path cleanup
via generatorPromise.finally() already lives in worker-service
startSessionProcessor and SessionRoutes ensureGeneratorRunning.
- Phase 6: evictIdlestSession and its evict callback deleted from
SessionManager. Pool admission gates backpressure upstream.
- Phase 7: SDK-failure fallback — SessionManager has zero matches
for fallbackAgent/Gemini/OpenRouter. Failures surface to hooks
via exit code 2 through SessionRoutes error mapping.
- Phase 8: ensureWorkerRunning in worker-utils.ts rewritten to
lazy-spawn — consults isWorkerPortAlive (which gates
captureProcessStartToken for PID-reuse safety via commit
|
||
|
|
f2d361b918 |
feat: security observation types + Telegram notifier (#2084)
* feat: security observation types + Telegram notifier Adds two severity-axis security observation types (security_alert, security_note) to the code mode and a fire-and-forget Telegram notifier that posts when a saved observation matches configured type or concept triggers. Default trigger fires on security_alert only; notifier is disabled until BOT_TOKEN and CHAT_ID are set. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(telegram): honor CLAUDE_MEM_TELEGRAM_ENABLED master toggle Adds an explicit on/off flag (default 'true') so users can disable the notifier without clearing credentials. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * perf(stop-hook): make summarize handler fire-and-forget Stop hook previously blocked the Claude Code session for up to 110 seconds while polling the worker for summary completion. The handler now returns as soon as the enqueue POST is acked. - summarize.ts: drop the 500ms polling loop and /api/sessions/complete call; tighten SUMMARIZE_TIMEOUT_MS from 300s to 5s since the worker acks the enqueue synchronously. - SessionCompletionHandler: extract idempotent finalizeSession() for DB mark + orphaned-pending-queue drain + broadcast. completeByDbId now delegates so the /api/sessions/complete HTTP route is backward compatible. - SessionRoutes: wire finalizeSession into the SDK-agent generator's finally block, gated on lastSummaryStored + empty pending queue so only Stop events produce finalize (not every idle tick). - WorkerService: own the single SessionCompletionHandler instance and inject it into SessionRoutes to avoid duplicate construction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(pr2084): address reviewer findings CodeRabbit: - SessionStore.getSessionById now returns status; without it, the finalizeSession idempotency guard always evaluated false and re-fired drain/broadcast on every call. - worker-service.ts: three call sites that remove the in-memory session after finalizeSession now do so only on success. On failure the session is left in place so the 60s orphan reaper can retry; removing it would orphan an 'active' DB row indefinitely under the fire-and- forget Stop hook. - runFallbackForTerminatedSession no longer emits a second session_completed event; finalizeSession already broadcasts one. The explicit broadcast now runs only on the finalize-failure fallback. Greptile: - TelegramNotifier reads via loadFromFile(USER_SETTINGS_PATH) so values in ~/.claude-mem/settings.json actually take effect; SettingsDefaultsManager.get() alone skipped the file and silently ignored user-configured credentials. - Emoji is derived from obs.type (security_alert → 🚨, security_note → 🔐, fallback 🔔) instead of hardcoded 🚨 for every observation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(hooks): worker-port mismatch on Windows and settings.json overrides (#2086) Hooks computed the health-check port as \$((37700 + id -u % 100)), ignoring ~/.claude-mem/settings.json. Two failure modes resulted: 1. Users upgrading from pre-per-uid builds kept CLAUDE_MEM_WORKER_PORT set to '37777' in settings.json. The worker bound 37777 (settings wins), but hooks queried 37701 (uid 501 on macOS), so every SessionStart/UserPromptSubmit health check failed. 2. Windows Git Bash/PowerShell returns a real Windows UID for 'id -u' (e.g. 209), producing port 37709 while the Node worker fell back to 37777 (process.getuid?.() ?? 77). Every prompt hit the 60s hook timeout. hooks.json now resolves the port in this order, matching how the worker itself resolves it: 1. sed CLAUDE_MEM_WORKER_PORT from ~/.claude-mem/settings.json 2. If absent, and uname is MINGW/CYGWIN/MSYS → 37777 3. Otherwise 37700 + (id -u || 77) % 100 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(pr2084): sync DatabaseManager.getSessionById return type CodeRabbit round 2: the DatabaseManager.getSessionById return type was missing platform_source, custom_title, and status fields that SessionStore.getSessionById actually returns. Structural typing hid the mismatch at compile time, but it prevents callers going through DatabaseManager from seeing the status field that the idempotency guard in SessionCompletionHandler relies on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(pr2084): hooks honor env vars and host; looser port regex (#2086 followup) CodeRabbit round 3: match the worker's env > file > defaults precedence and resolve host the same way as port. - Env: CLAUDE_MEM_WORKER_PORT and CLAUDE_MEM_WORKER_HOST win first. - File: sed now accepts both quoted ('"37777"') and unquoted (37777) JSON values for the port; a separate sed reads CLAUDE_MEM_WORKER_HOST. - Defaults: port per-uid formula (Windows: 37777), host 127.0.0.1. - Health-check URL uses the resolved $HOST instead of hardcoded localhost. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
99060bac1a |
fix: detect PID reuse in worker start-guard (container restarts) (#2082)
* fix: detect PID reuse in worker start-guard to survive container restarts The 'Worker already running' guard checked PID liveness with kill(0), which false-positives when a persistent PID file outlives the PID namespace (docker stop / docker start, pm2 graceful reloads). The new worker comes up with the same low PID (e.g. 11) as the old one, kill(0) says 'alive', and the worker refuses to start against its own prior incarnation. Capture a process-start token alongside the PID and verify identity, not just liveness: - Linux: /proc/<pid>/stat field 22 (starttime, jiffies since boot) - macOS/POSIX: `ps -p <pid> -o lstart=` - Windows: unchanged (returns null, falls back to liveness) PID files written by older versions are token-less, so verifyPidFileOwnership falls back to the current liveness-only behavior for backwards compatibility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: apply review feedback to PID identity helpers - Collapse ProcessManager re-export down to a single import/export statement. - Make verifyPidFileOwnership a type predicate (info is PidInfo) so callers don't need non-null assertions on the narrowed value. - Drop the `!` assertions at the worker-service GUARD 1 call site now that the predicate narrows. - Tighten the captureProcessStartToken platform doc comment to enumerate process.platform values explicitly. No behavior change — esbuild output is byte-identical (type-only edits). Addresses items 1-3 of the claude-review comment on PR #2082. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: pin LC_ALL=C for `ps lstart=` in captureProcessStartToken Without a locale pin, `ps -o lstart=` emits month/weekday names in the system locale. A bind-mounted PID file written under one locale and read under another would hash to different tokens and the live worker would incorrectly appear stale — reintroducing the very bug this helper exists to prevent. Flagged by Greptile on PR #2082. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: address second-round review on PID identity helpers - verifyPidFileOwnership: log a DEBUG diagnostic when the PID is alive but the start-token mismatches. Without it, callers can't distinguish the "process dead" path from the "PID reused" path in production logs — the exact case this helper exists to catch. - writePidFile: drop the redundant `?? undefined` coercion. `null` and `undefined` are both falsy for the subsequent ternary, so the coercion was purely cosmetic noise that suggested an important distinction. - Add a unit test for the win32 fallback path in captureProcessStartToken (mocks process.platform) — previously uncovered in CI. Addresses items 1, 2, and 5 of the second claude-review on PR #2082. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
03748acd6a |
refactor: remove bearer auth and platform_source context filter (#2081)
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048) - Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916) - Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048) - Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956) - Add periodic clearFailed() to purge stale pending_messages (#1957) - Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874) - ResponseProcessor: mark messages as failed (with retry) instead of confirming when the LLM returns non-XML garbage (auth errors, rate limits) (#1874) - Health endpoint: include activeSessions count for queue liveness monitoring (#1867) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: cache isFts5Available() at construction time Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text query. Result is now cached in _fts5Available at construction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve worker stability bugs — pool deadlock, MCP loopback, restart guard (#1868, #1876, #2053) - Replace flat consecutiveRestarts counter with time-windowed RestartGuard: only counts restarts within 60s window (cap=10), decays after 5min of success. Prevents stranding pending messages on long-running sessions. (#2053) - Add idle session eviction to pool slot allocation: when all slots are full, evict the idlest session (no pending work, oldest activity) to free a slot for new requests, preventing 60s timeout deadlock. (#1868) - Fix MCP loopback self-check: use process.execPath instead of bare 'node' which fails on non-interactive PATH. Fix crash misclassification by removing false "Generator exited unexpectedly" error log on normal completion. (#1876) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve hooks reliability bugs — summarize exit code, session-init health wait (#1896, #1901, #1903, #1907) - Wrap summarize hook's workerHttpRequest in try/catch to prevent exit code 2 (blocking error) on network failures or malformed responses. Session exit no longer blocks on worker errors. (#1901) - Add health-check wait loop to UserPromptSubmit session-init command in hooks.json. On Linux/WSL where hook ordering fires UserPromptSubmit before SessionStart, session-init now waits up to 10s for worker health before proceeding. Also wrap session-init HTTP call in try/catch. (#1907) - Close #1896 as already-fixed: mtime comparison at file-context.ts:255-267 bypasses truncation when file is newer than latest observation. - Close #1903 as no-repro: hooks.json correctly declares all hook events. Issue was Claude Code 12.0.1/macOS platform event-dispatch bug. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: security hardening — bearer auth, path validation, rate limits, per-user port (#1932, #1933, #1934, #1935, #1936) - Add bearer token auth to all API endpoints: auto-generated 32-byte token stored at ~/.claude-mem/worker-auth-token (mode 0600). All hook, MCP, viewer, and OpenCode requests include Authorization header. Health/readiness endpoints exempt for polling. (#1932, #1933) - Add path traversal protection: watch.context.path validated against project root and ~/.claude-mem/ before write. Rejects ../../../etc style attacks. (#1934) - Reduce JSON body limit from 50MB to 5MB. Add in-memory rate limiter (300 req/min/IP) to prevent abuse. (#1935) - Derive default worker port from UID (37700 + uid%100) to prevent cross-user data leakage on multi-user macOS. Windows falls back to 37777. Shell hooks use same formula via id -u. (#1936) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve search project filtering and import Chroma sync (#1911, #1912, #1914, #1918) - Fix per-type search endpoints to pass project filter to Chroma queries and SQLite hydration. searchObservations/Sessions/UserPrompts now use $or clause matching project + merged_into_project. (#1912) - Fix timeline/search methods to pass project to Chroma anchor queries. Prevents cross-project result leakage when project param omitted. (#1911) - Sync imported observations to ChromaDB after FTS rebuild. Import endpoint now calls chromaSync.syncObservation() for each imported row, making them visible to MCP search(). (#1914) - Fix session-init cwd fallback to match context.ts (process.cwd()). Prevents project key mismatch that caused "no previous sessions" on fresh sessions. (#1918) - Fix sync-marketplace restart to include auth token and per-user port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve all CodeRabbit and Greptile review comments on PR #2080 - Fix run.sh comment mismatch (no-op flag vs empty array) - Gate session-init on health check success (prevent running when worker unreachable) - Fix date_desc ordering ignored in FTS session search - Age-scope failed message purge (1h retention) instead of clearing all - Anchor RestartGuard decay to real successes (null init, not Date.now()) - Add recordSuccess() calls in ResponseProcessor and completion path - Prevent caller headers from overriding bearer auth token - Add lazy cleanup for rate limiter map to prevent unbounded growth - Bound post-import Chroma sync with concurrency limit of 8 - Add doc_type:'observation' filter to Chroma queries feeding observation hydration - Add FTS fallback to all specialized search handlers (observations, sessions, prompts, timeline) - Add response.ok check and error handling in viewer saveSettings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve CodeRabbit round-2 review comments - Use failure timestamp (COALESCE) instead of created_at_epoch for stale purge - Downgrade _fts5Available flag when FTS table creation fails - Escape FTS5 MATCH input by quoting user queries as literal phrases - Escape LIKE metacharacters (%, _, \) in prompt text search - Add response.ok check in initial settings load (matches save flow) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve CodeRabbit round-3 review comments - Include failed_at_epoch in COALESCE for age-scoped purge - Re-throw FTS5 errors so callers can distinguish failure from no-results - Wrap all FTS fallback calls in SearchManager with try/catch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: remove bearer auth and platform_source from context inject Bearer token auth (#1932/#1933) added friction for all localhost API clients with no benefit — the worker already binds localhost-only (CORS restriction + host binding). Removed auth-token module, requireAuth middleware, and Authorization headers from all internal callers. platform_source filtering from the /api/context/inject path was never used by any caller and silently filtered out observations. The underlying platform_source column stays; only the query-time filter and its plumbing through ContextBuilder, ObservationCompiler, SearchRoutes, context.ts, and transcripts/processor.ts are removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: resolve CodeRabbit + Greptile + claude-review comments on PR #2081 - middleware.ts: drop 'Authorization' from CORS allowedHeaders (Greptile) - middleware.ts: rate limiter falls back to req.socket.remoteAddress; add Retry-After on 429 (claude-review) - SearchRoutes.ts: drop leftover platformSource read+pass in handleContextPreview (Greptile) - .docker-blowout-data/: stop tracking the empty SQLite placeholder and gitignore the dir (claude-review) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: tighten rate limiter — correct boundary + drop dead cleanup branch - `entry.count >= RATE_LIMIT_MAX_REQUESTS` so the 300th request is the first rejected (was 301). - Removed the `requestCounts.size > 100` lazy-cleanup block — on a localhost-only server the map tops out at 1–2 entries, so the branch was dead code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: rate limiter correctly allows exactly 300 req/min; doc localhost scope - Check `entry.count >= max` BEFORE incrementing so the cap matches the comment: 300 requests pass, the 301st gets 429. - Added a comment noting the limiter is effectively a global cap on a localhost-only worker (all callers share the 127.0.0.1/::1 bucket). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: normalise IPv4-mapped IPv6 in rate limiter client IP Strip the `::ffff:` prefix so a localhost caller routed as `::ffff:127.0.0.1` shares a bucket with `127.0.0.1`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: size-guarded prune of rate limiter map for non-localhost deploys Prune expired entries only when the map exceeds 1000 keys and we're already doing a window reset, so the cost is zero on the localhost hot path (1–2 keys) and the map can't grow unbounded if the worker is ever bound on a non-loopback interface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
8fd3685d6e |
chore: bump version to 12.3.6
Removes the 300 req/min rate limiter from the worker's HTTP middleware. The worker is localhost-only (enforced via CORS), so rate limiting was pointless security theater — but it broke the viewer, which polls logs and stats frequently enough to trip the limit within seconds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8d166b47c1 |
Revert "revert: roll back v12.3.3 (Issue Blowout 2026)"
This reverts commit
|
||
|
|
bfc7de377a |
revert: roll back v12.3.3 (Issue Blowout 2026)
SessionStart context injection regressed in v12.3.3 — no memory context is being delivered to new sessions. Rolling back to the v12.3.2 tree state while the regression is investigated. Reverts #2080. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ba1ef6c42c |
fix: Issue Blowout 2026 — 25 bugs across worker, hooks, security, and search (#2080)
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048) - Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916) - Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048) - Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956) - Add periodic clearFailed() to purge stale pending_messages (#1957) - Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874) - ResponseProcessor: mark messages as failed (with retry) instead of confirming when the LLM returns non-XML garbage (auth errors, rate limits) (#1874) - Health endpoint: include activeSessions count for queue liveness monitoring (#1867) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: cache isFts5Available() at construction time Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text query. Result is now cached in _fts5Available at construction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve worker stability bugs — pool deadlock, MCP loopback, restart guard (#1868, #1876, #2053) - Replace flat consecutiveRestarts counter with time-windowed RestartGuard: only counts restarts within 60s window (cap=10), decays after 5min of success. Prevents stranding pending messages on long-running sessions. (#2053) - Add idle session eviction to pool slot allocation: when all slots are full, evict the idlest session (no pending work, oldest activity) to free a slot for new requests, preventing 60s timeout deadlock. (#1868) - Fix MCP loopback self-check: use process.execPath instead of bare 'node' which fails on non-interactive PATH. Fix crash misclassification by removing false "Generator exited unexpectedly" error log on normal completion. (#1876) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve hooks reliability bugs — summarize exit code, session-init health wait (#1896, #1901, #1903, #1907) - Wrap summarize hook's workerHttpRequest in try/catch to prevent exit code 2 (blocking error) on network failures or malformed responses. Session exit no longer blocks on worker errors. (#1901) - Add health-check wait loop to UserPromptSubmit session-init command in hooks.json. On Linux/WSL where hook ordering fires UserPromptSubmit before SessionStart, session-init now waits up to 10s for worker health before proceeding. Also wrap session-init HTTP call in try/catch. (#1907) - Close #1896 as already-fixed: mtime comparison at file-context.ts:255-267 bypasses truncation when file is newer than latest observation. - Close #1903 as no-repro: hooks.json correctly declares all hook events. Issue was Claude Code 12.0.1/macOS platform event-dispatch bug. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: security hardening — bearer auth, path validation, rate limits, per-user port (#1932, #1933, #1934, #1935, #1936) - Add bearer token auth to all API endpoints: auto-generated 32-byte token stored at ~/.claude-mem/worker-auth-token (mode 0600). All hook, MCP, viewer, and OpenCode requests include Authorization header. Health/readiness endpoints exempt for polling. (#1932, #1933) - Add path traversal protection: watch.context.path validated against project root and ~/.claude-mem/ before write. Rejects ../../../etc style attacks. (#1934) - Reduce JSON body limit from 50MB to 5MB. Add in-memory rate limiter (300 req/min/IP) to prevent abuse. (#1935) - Derive default worker port from UID (37700 + uid%100) to prevent cross-user data leakage on multi-user macOS. Windows falls back to 37777. Shell hooks use same formula via id -u. (#1936) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve search project filtering and import Chroma sync (#1911, #1912, #1914, #1918) - Fix per-type search endpoints to pass project filter to Chroma queries and SQLite hydration. searchObservations/Sessions/UserPrompts now use $or clause matching project + merged_into_project. (#1912) - Fix timeline/search methods to pass project to Chroma anchor queries. Prevents cross-project result leakage when project param omitted. (#1911) - Sync imported observations to ChromaDB after FTS rebuild. Import endpoint now calls chromaSync.syncObservation() for each imported row, making them visible to MCP search(). (#1914) - Fix session-init cwd fallback to match context.ts (process.cwd()). Prevents project key mismatch that caused "no previous sessions" on fresh sessions. (#1918) - Fix sync-marketplace restart to include auth token and per-user port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve all CodeRabbit and Greptile review comments on PR #2080 - Fix run.sh comment mismatch (no-op flag vs empty array) - Gate session-init on health check success (prevent running when worker unreachable) - Fix date_desc ordering ignored in FTS session search - Age-scope failed message purge (1h retention) instead of clearing all - Anchor RestartGuard decay to real successes (null init, not Date.now()) - Add recordSuccess() calls in ResponseProcessor and completion path - Prevent caller headers from overriding bearer auth token - Add lazy cleanup for rate limiter map to prevent unbounded growth - Bound post-import Chroma sync with concurrency limit of 8 - Add doc_type:'observation' filter to Chroma queries feeding observation hydration - Add FTS fallback to all specialized search handlers (observations, sessions, prompts, timeline) - Add response.ok check and error handling in viewer saveSettings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve CodeRabbit round-2 review comments - Use failure timestamp (COALESCE) instead of created_at_epoch for stale purge - Downgrade _fts5Available flag when FTS table creation fails - Escape FTS5 MATCH input by quoting user queries as literal phrases - Escape LIKE metacharacters (%, _, \) in prompt text search - Add response.ok check in initial settings load (matches save flow) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve CodeRabbit round-3 review comments - Include failed_at_epoch in COALESCE for age-scoped purge - Re-throw FTS5 errors so callers can distinguish failure from no-results - Wrap all FTS fallback calls in SearchManager with try/catch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
be99a5d690 |
fix: resolve search, database, and docker bugs (#2079)
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048) - Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916) - Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048) - Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956) - Add periodic clearFailed() to purge stale pending_messages (#1957) - Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874) - ResponseProcessor: mark messages as failed (with retry) instead of confirming when the LLM returns non-XML garbage (auth errors, rate limits) (#1874) - Health endpoint: include activeSessions count for queue liveness monitoring (#1867) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: cache isFts5Available() at construction time Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text query. Result is now cached in _fts5Available at construction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
b8360cdee1 |
fix: restore assistant replies to conversationHistory in OpenRouterAgent
The extracted helper methods (handleInitResponse, processObservationMessage, processSummaryMessage) lost the conversationHistory.push calls for assistant replies, breaking multi-turn context for queryOpenRouterMultiTurn. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
d2eb89d27f |
fix: resolve all CodeRabbit review comments
Critical fixes:
- GeminiCliHooksInstaller: wrap install/uninstall prep in try/catch to maintain
numeric return contract
- McpIntegrations: move mkdirSync inside try block for Goose installer
Major fixes:
- import-xml-observations: guard invalid dates before toISOString()
- runtime.ts: guard response.json() parsing
- WorktreeAdoption: delay adoptedSqliteIds mutation until SQL succeeds
- CursorHooksInstaller: move mkdirSync inside try block
- McpIntegrations: throw on failed claude-mem block replacement
- OpenClawInstaller: propagate parse failure instead of returning {}
- OpenClawInstaller: move mkdirSync inside try block
- WindsurfHooksInstaller: validate hooks.json shape with optional chaining
- timeline/queries: pass normalized Error directly to logger
- ChromaSync: use composite dedup key (entityType:id) to prevent cross-type collisions
- EnvManager: wrap preflight directory/file ops in try/catch with ENV logging
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
30a0ab4ddb |
fix: resolve Greptile review comments on error handling
- Remove spurious console.error in logger JSON.parse catch (expected control flow) - Remove debug logging from hot PID cleanup loop (approved override) - Replace unsafe `error as Error` casts with instanceof checks in ChromaSync, GeminiAgent, OpenRouterAgent - Wrap non-Error FTS failures with new Error(String()) instead of dropping details Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
a0dd516cd5 |
fix: resolve all 301 error handling anti-patterns across codebase
Systematic cleanup of every error handling anti-pattern detected by the automated scanner. 289 issues fixed via code changes, 12 approved with specific technical justifications. Changes across 90 files: - GENERIC_CATCH (141): Added instanceof Error type discrimination - LARGE_TRY_BLOCK (82): Extracted helper methods to narrow try scope to ≤10 lines - NO_LOGGING_IN_CATCH (65): Added logger/console calls for error visibility - CATCH_AND_CONTINUE_CRITICAL_PATH (10): Added throw/return or approved overrides - ERROR_STRING_MATCHING (2): Approved with rationale (no typed error classes) - ERROR_MESSAGE_GUESSING (1): Replaced chained .includes() with documented pattern array - PROMISE_CATCH_NO_LOGGING (1): Added logging to .catch() handler Also fixes a detector bug where nested try/catch inside a catch block corrupted brace-depth tracking, causing false positives. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
2337997c48 |
fix(parser): stop warning on normal observation responses (#2074)
parseSummary runs on every agent response, not just summary turns. When the turn is a normal observation, the LLM correctly emits <observation> and no <summary> — but the fallthrough branch from #1345 treated this as prompt misbehavior and logged "prompt conditioning may need strengthening" every time. That assumption stopped holding after #1633 refactored the caller to always invoke parseSummary with a coerceFromObservation flag. Gate the whole observation-on-summary path on coerceFromObservation. On a real summary turn, coercion still runs and logs the legitimate "coercion failed" warning when the response has no usable content. On an observation turn, parseSummary returns null silently, which is the correct behavior. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
789efe4234 |
feat: disable subagent summaries, label subagent observations (#2073)
* feat: disable subagent summaries and label subagent observations Detect Claude Code subagent hook context via `agent_id`/`agent_type` on stdin, short-circuit the Stop-hook summary path when present, and thread the subagent identity end-to-end onto observation rows (new `agent_type` and `agent_id` columns, migration 010 at version 27). Main-session rows remain NULL; content-hash dedup is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address PR #2073 review feedback - Narrow summarize subagent guard to agentId only so --agent-started main sessions still own their summary (agentType alone is main-session). - Remove now-dead agentId/agentType spreads from the summarize POST body. - Always overwrite pendingAgentId/pendingAgentType in SDK/Gemini/OpenRouter agents (clears stale subagent identity on main-session messages after a subagent message in the same batch). - Add idx_observations_agent_id index in migration 010 + the mirror migration in SessionStore + the runner. - Replace console.log in migration010 with logger.debug. - Update summarize test: agentType alone no longer short-circuits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address CodeRabbit + claude-review iteration 4 feedback - SessionRoutes.handleSummarizeByClaudeId: narrow worker-side guard to agentId only (matches hook-side). agentType alone = --agent main session, which still owns its summary. - ResponseProcessor: wrap storeObservations in try/finally so pendingAgentId/Type clear even if storage throws. Prevents stale subagent identity from leaking into the next batch on error. - SessionStore.importObservation + bulk.importObservation: persist agent_type/agent_id so backup/import round-trips preserve subagent attribution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * polish: claude-review iteration 5 cleanup - Use ?? not || for nullable subagent fields in PendingMessageStore (prevents treating empty string as null). - Simplify observation.ts body spread — include fields unconditionally; JSON.stringify drops undefined anyway. - Narrow any[] to Array<{ name: string }> in migration010 column checks. - Add trailing newline to migrations.ts. - Document in observations/store.ts why the dedup hash intentionally excludes agent fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * polish: claude-review iteration 7 feedback - claude-code adapter: add 128-char safety cap on agent_id/agent_type so a malformed Claude Code payload cannot balloon DB rows. Empty strings now also treated as absent. - migration010: state-aware debug log lists only columns actually added; idempotent re-runs log "already present; ensured indexes". - Add 3 adapter tests covering the length cap boundary and empty-string rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * perf: skip subagent summary before worker bootstrap Move the agentId short-circuit above ensureWorkerRunning() so a Stop hook fired inside a subagent does not trigger worker startup just to return early. Addresses CodeRabbit nit on summarize.ts:36-47. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8ec91e7ffa |
fix: break infinite summary-retry loop (#1633) (#2072)
* Initial plan * fix: break infinite summary-retry loop (#1633) Three-part fix: 1. Parser coercion: When LLM returns <observation> tags instead of <summary>, coerce observation content into summary fields (root cause fix) 2. Stronger summary prompt: Add clearer tag requirements with warnings 3. Circuit breaker: Track consecutive summary failures per session, skip further attempts after 3 failures to prevent unbounded prompt growth Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/e345e8ec-bc97-4eaa-94bd-6e951fda8f77 Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com> * refactor: extract shared constants for summary mode marker and failure threshold Addresses code review feedback: SUMMARY_MODE_MARKER and MAX_CONSECUTIVE_SUMMARY_FAILURES are now defined once in sdk/prompts.ts and imported by ResponseProcessor and SessionManager. Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/e345e8ec-bc97-4eaa-94bd-6e951fda8f77 Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com> * fix: guard summary failure counter on summaryExpected (Greptile P1) The circuit breaker counter previously incremented on any response containing <observation> or <summary> tags — which matches virtually every normal observation response. After 3 observations the breaker would open and permanently block summarization, reproducing the data-loss scenario #1633 was meant to prevent. Gate the increment block on summaryExpected (already computed for parseSummary coercion) so the counter only tracks actual summary attempts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: cover circuit-breaker + apply review polish - Use findLast / at(-1) for last-user-message lookup instead of filter + index (O(1) common case). - Drop redundant `|| 0` fallback — field is required and initialized. - Add comment noting counter is ephemeral by design. - Add ResponseProcessor tests covering: * counter NOT incrementing on normal observation responses (regression guard for the Greptile P1) * counter incrementing when a summary was expected but missing * counter resetting to 0 on successful summary storage Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: iterate all observation blocks; don't count skip_summary as failure Addresses CodeRabbit review on #2072: - coerceObservationToSummary now iterates all <observation> blocks with a global regex and returns the first block that has title, narrative, or facts. Previously, an empty leading observation would short-circuit and discard populated follow-ups. - Circuit-breaker counter now treats explicit <skip_summary/> as neutral — neither a failure nor a success — so a run that happens to end on a skip doesn't punish the session or mask a prior bad streak. Real failures (no summary, no skip) still increment. - Tests added for both cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: reference SUMMARY_MODE_MARKER constant instead of hardcoded string Addresses CodeRabbit nitpick: tests should pull the marker from the canonical source so they don't silently drift when the constant is renamed or edited. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: also coerce observations when <summary> has empty sub-tags When the LLM wraps an empty <summary></summary> around real observation content, the #1360 empty-subtag guard rejects the summary and returns null — which would lose the observation content and resurrect the #1633 retry loop. Fall back to coerceObservationToSummary in that branch too, mirroring the unmatched-<summary> path. Adds a test covering the empty-summary-wraps-observation case and a guard test for empty summary with no observation content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com> Co-authored-by: Alex Newman <thedotmack@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b8d63d949f |
fix(worktree): self-heal Chroma metadata on re-run
Addresses unresolved CodeRabbit finding on WorktreeAdoption.ts:296. Previously, Chroma patch failures stranded rows permanently: adoptedSqliteIds was built only from rows where merged_into_project IS NULL, so once SQL committed, reruns couldn't rediscover them for retry. The Chroma id set is now built from ALL observations whose project matches a merged worktree — including rows already stamped to this parent. Combined with the idempotent updateMergedIntoProject, transient Chroma failures self-heal on the next adoption pass. SQL writes remain idempotent (UPDATE still guards on merged_into_project IS NULL), so adoptedObservations / adoptedSummaries continue to count only newly-adopted rows. chromaUpdates now counts total Chroma writes per pass (may exceed adoptedObservations when retrying). |
||
|
|
7a66cb310f |
fix(worktree): address PR review — schema guard, startup adoption, query parity
Addresses six CodeRabbit/Greptile findings on PR #2052: - Schema guard in adoptMergedWorktrees probes for merged_into_project columns before preparing statements; returns early when absent so first boot after upgrade (pre-migration) doesn't silently fail. - Startup adoption now iterates distinct cwds from pending_messages and dedupes via resolveMainRepoPath — the worker daemon runs with cwd=plugin scripts dir, so process.cwd() fallback was a no-op. - ObservationCompiler single-project queries (queryObservations / querySummaries) OR merged_into_project into WHERE so injected context surfaces adopted worktree rows, matching the Multi variants. - SessionStore constructor now calls ensureMergedIntoProjectColumns so bundled artifacts (context-generator.cjs) that embed SessionStore get the merged_into_project column on DBs that only went through the bundled migration chain. - OBSERVER_SESSIONS_PROJECT constant is now derived from basename(OBSERVER_SESSIONS_DIR) and used across PaginationHelper, SessionStore, and timeline queries instead of hardcoded strings. - Corrected misleading Chroma retry docstring in WorktreeAdoption to match actual behavior (no auto-retry once SQL commits). |
||
|
|
d1601123fd |
feat(ui): hide observer-sessions project from UI lists
Observer sessions (internal SDK-driven worker queries) run under a synthetic project name 'observer-sessions' to keep them out of claude --resume. They were still surfacing in the viewer project picker and unfiltered observation/summary/prompt feeds. Filter them out at every UI-facing query: - SessionStore.getAllProjects and getProjectCatalog - timeline/queries.ts getAllProjects - PaginationHelper observations/summaries/prompts when no project is selected When a caller explicitly requests project='observer-sessions', results are still returned (not a hard ban, just hidden by default). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f6fda8fff4 |
fix(worktree): address CodeRabbit PR review feedback
- Document --branch override in npx-cli help text - Guard ContextBuilder against empty projects[] override; fall back to cwd-derived primary - Ensure merged_into_project indexes are created even if ALTER ran in a prior partial migration - Reject adopt --branch/--cwd flags with missing or flag-like values - Use defined --color-border-primary token for merged badge border Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d24f3a7019 |
fix(worktree): address PR review — test assertion, dry-run sentinel, git timeouts
- Update allProjects test expectation to match [parent, composite] (matches JSDoc + callers in ContextBuilder/context handlers). - Replace string-matched __DRY_RUN_ROLLBACK__ sentinel with dedicated DryRunRollback class to avoid swallowing unrelated errors. - Add 5000ms timeout to spawnSync git calls in WorktreeAdoption and ProcessManager so worker startup can't hang on a stuck git process. - Drop unreachable break after process.exit(0) in adopt case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
bce4ce32ec |
feat(ui): show merged-into-parent badge on adopted observations
ObservationCard renders a secondary "merged → <parent>" chip when merged_into_project is set, next to the existing project label. Both are meaningful: project is origin provenance, merged_into_project is the current home. Extends PaginationHelper's observations and summaries queries with OR merged_into_project = ? so the single-project viewer fetch pulls in adopted rows — the plan's Phase 3 covered multi-project context injection; this is the single-project UI read path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5664fabce4 |
feat(cli): npx claude-mem adopt [--dry-run] [--branch X]
Adds a manual escape hatch for the worktree adoption engine. Covers squash-merges where git branch --merged HEAD returns nothing, and lets users re-run adoption on demand. Wired through worker-service.cjs (same pattern as generate/clean) so the command runs under Bun with bun:sqlite, keeping npx-cli/ pure Node. --cwd flag passes the user's working directory through the spawn so the engine resolves the correct parent repo. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
0b90495391 |
feat(worktree): auto-adopt merged worktrees on worker startup
Invokes adoptMergedWorktrees() right after runOneTimeCwdRemap() and before dbManager.initialize(), wrapped in try/catch so adoption failures never block startup. Idempotent, so running every startup is cheap — the SQL UPDATE only touches rows where merged_into_project IS NULL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3e770df332 |
feat(worktree): query plumbing surfaces merged rows under parent project
ObservationCompiler.queryObservationsMulti and querySummariesMulti WHERE clause extended with OR merged_into_project IN (...), so a parent-project read pulls in rows originally written under any child worktree's composite name once merged. SearchManager wraps the Chroma project filter in \$or so semantic search behaves identically. ChromaSync baseMetadata now carries merged_into_project on new embeddings; existing rows are patched retroactively by the adoption engine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a7c3c4af2d |
feat(worktree): adoption engine for merged worktrees
Detects merged worktrees via git (worktree list --porcelain + branch --merged HEAD), then stamps merged_into_project on SQLite observations/summaries and propagates the same metadata to Chroma in lockstep. `project` stays immutable; adoption is a virtual pointer. Idempotent via IS NULL guard on UPDATE and by idempotent Chroma metadata writes. SQL is source of truth — Chroma failures are logged but don't roll back SQL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3d1dfcc26a |
feat(migration): add merged_into_project column for worktree adoption
Nullable pointer on observations and session_summaries that lets a worktree's rows surface under the parent project's observation list without data movement. Self-idempotent via PRAGMA table_info guard; does not bump schema_versions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
dc198d5677 |
feat(worktree): include parent repo observations in worktree read scope
Worktrees are branches off main; the parent holds the architecture, decisions, and long-tail history the worktree inherits. Scoping reads to the worktree alone meant every new worktree started cold on any question that required prior context. Expand `allProjects` in a worktree to `[parent, composite]` so reads pull both. Writes still go through `.primary` (the composite), so sibling worktrees don't leak into each other. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
193e7e0719 |
feat(worktree): auto-apply cwd-based project remap on worker startup
Ports scripts/cwd-remap.ts into ProcessManager.runOneTimeCwdRemap() and invokes it in initializeBackground() alongside the existing chroma migration. Uses pending_messages.cwd as the source of truth to rewrite pre-worktree bare project names into the parent/worktree composite format so search and context are consistent. - Backs up the DB to .bak-cwd-remap-<ts> before any writes. - Idempotent: marker file .cwd-remap-applied-v1 short-circuits reruns. - No-ops on fresh installs (no DB, or no pending_messages table). - On failure, logs and skips the marker so the next restart retries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9d695f53ed |
chore: remove auto-generated per-directory CLAUDE.md files
Leftover artifacts from an abandoned context-injection feature. The project-level CLAUDE.md stays; the directory-level ones were generated timeline scaffolding that never panned out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a65ab055ca |
fix(worktree): audit observation fetch/display for composite project names
Three sites didn't account for parent/worktree composite naming: - PaginationHelper.stripProjectPath: marker used full composite, breaking path sanitization for worktrees checked out outside a parent/leaf layout. Now extracts the leaf segment. - observations/store.ts: fallback imported getCurrentProjectName from shared/paths.ts (a duplicate impl without worktree detection). Switched to getProjectContext().primary so writes key into the same project as reads. - SearchManager.getRecentContext: fallback used basename(cwd) and lost the parent prefix, making the MCP tool find nothing in worktrees. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3869b083d0 |
fix(context): derive project from explicit projects array, not cwd
When a caller (e.g. worker context-inject route) passes a `projects` array without a matching cwd, the cwd-derived `context.primary` drifted from the projects being queried — producing an empty-state header for one project while querying another. Use the last entry of `projects` so header and query target stay in sync. |
||
|
|
040729beef |
fix(project-name): use parent/worktree composite so observations don't cross worktrees
Revert of #1820 behavior. Each worktree now gets its own bucket: - In a worktree, primary = `parent/worktree` (e.g. `claude-mem/dar-es-salaam`) - In a main repo, primary = basename (unchanged) - allProjects is always `[primary]` — strict isolation at query time Includes a one-off maintenance script (scripts/worktree-remap.ts) that retroactively reattributes past sessions to their worktree using path signals in observations and user prompts. Two-rule inference keeps the remap high-confidence: 1. The worktree basename in the path matches the session's current plain project name (pre-#1820 era; trusted). 2. Or all worktree path signals converge on a single (parent, worktree) across the session. Ambiguous sessions are skipped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
c76a439491 |
fix: drop orphan flag when filtering empty-string spawn args (#2049)
Observations were 100% failing on Claude Code 2.1.109+ because the Agent SDK emits ["--setting-sources", ""] when settingSources defaults to []. The existing Bun-workaround filter stripped the empty string but left the orphan --setting-sources flag, which then consumed --permission-mode as its value, crashing the subprocess with: Error processing --setting-sources: Invalid setting source: --permission-mode. Make the filter pair-aware: when an empty arg follows a --flag, drop both so the SDK default (no setting sources) is preserved by omission. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
aa7cdb6d9f |
fix: revert unauthorized $CMEM branding in context header
A prior Claude instance snuck in a `$CMEM` token branding header
during a context compression refactor (
|
||
|
|
d0fc68c630 |
revert: remove overengineered summary salvage logic (#1718) (#1850)
The synthetic summary salvage feature created fake summaries from observation data when the AI returned <observation> instead of <summary> tags. This was overengineered — missing a summary is preferable to fabricating one from observation fields that don't map cleanly to summary semantics. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
05232ff091 |
fix: reap stuck generators in reapStaleSessions (fixes #1652) (#1698)
* fix: reap stuck generators in reapStaleSessions (fixes #1652) Sessions whose SDK subprocess hung would stay in the active sessions map forever because `reapStaleSessions()` unconditionally skipped any session with a non-null `generatorPromise`. The generator was blocked on `for await (const msg of queryResult)` inside SDKAgent and could never unblock itself — the idle-timeout only fires when the generator is in `waitForMessage()`, and the orphan reaper skips processes whose session is still in the map. Add `MAX_GENERATOR_IDLE_MS` (5 min). When `reapStaleSessions()` sees a session whose `generatorPromise` is set but `lastGeneratorActivity` has not advanced in over 5 minutes, it now: 1. SIGKILLs the tracked subprocess to unblock the stuck `for await` 2. Calls `session.abortController.abort()` so the generator loop exits 3. Calls `deleteSession()` which waits up to 30 s for the generator to finish, then cleans up supervisor-tracked children Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: freeze time in stale-generator test and import constants from production source - Export MAX_GENERATOR_IDLE_MS, MAX_SESSION_IDLE_MS, StaleGeneratorCandidate, StaleGeneratorProcess, and detectStaleGenerator from SessionManager.ts so tests no longer duplicate production constants or detection logic. - Use setSystemTime() from bun:test to freeze Date.now() in the "exactly at threshold" test, eliminating the flaky double-Date.now() race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
f97c50bfb9 |
fix: session lifecycle guards to prevent runaway API spend (#1590) (#1693)
* fix: add session lifecycle guards to prevent runaway API spend (#1590) Three root causes allowed 30+ subprocess accumulation over 36 hours: 1. SIGTERM-killed processes (code 143) triggered crash recovery and immediately respawned — now detected and treated as intentional termination (aborts controller so wasAborted=true in .finally). 2. No wall-clock limit: sessions ran for 13+ hours continuously spending tokens — now refuses new generators after 4 hours and drains the pending queue to prevent further spawning. 3. Duplicate --resume processes for the same session UUID — now killed and unregistered before a new spawn is registered. Generated by Claude Code Vibe coded by ousamabenyounes Co-Authored-By: Claude <noreply@anthropic.com> * fix: use normalized errorMsg in logger.error payload and annotate SIGTERM override Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: use persisted createdAt for wall-clock guard and bind abortController locally to prevent stale abort Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: re-trigger CodeRabbit review after rate limit reset * fix: defer process unregistration until exit and align boundary test with strict > (#1693) - ProcessRegistry: don't unregister PID immediately after SIGTERM — let the existing 'exit' handler clean up when the process actually exits, preventing tracking loss for still-live processes. - Test: align wall-clock boundary test with production's strict `>` operator (exactly 4h is NOT terminated, only >4h is). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com> |
||
|
|
983be42998 |
fix: resolve Gemini CLI 0.37.0 session capture failures (#1664) (#1692)
Three root causes prevented Gemini sessions from persisting prompts, observations, and summaries: 1. BeforeAgent was mapped to user-message (display-only) instead of session-init (which initialises the session and starts the SDK agent). 2. The transcript parser expected Claude Code JSONL (type: "assistant") but Gemini CLI 0.37.0 writes a JSON document with a messages array where assistant entries carry type: "gemini". extractLastMessage now detects the format and routes to the correct parser, preserving full backward compatibility with Claude Code JSONL transcripts. 3. The summarize handler omitted platformSource from the /api/sessions/summarize request body, causing sessions to be recorded without the gemini-cli source tag. Co-authored-by: Claude <noreply@anthropic.com> |
||
|
|
16a0737dfc |
fix: use parent project name for worktree observation writes (#1820)
* fix: use parent project name for worktree observation writes (#1819) Observations and sessions from git worktrees were stored under basename(cwd) instead of the parent repo name because write paths called getProjectName() (not worktree-aware) instead of getProjectContext() (worktree-aware). This is the same bug as #1081, #1317, and #1500 — it regressed because the two functions coexist and new code reached for the simpler one. Fix: getProjectContext() now returns parentProjectName as primary when in a worktree, and all four write-path call sites now use getProjectContext().primary instead of getProjectName(). Includes regression test that creates a real worktree directory structure and asserts primary === parentProjectName. * fix: address review nitpicks — allProjects fallback, JSDoc, write-path test - ContextBuilder: default projects to context.allProjects for legacy worktree-labeled record compatibility - ProjectContext: clarify JSDoc that primary is canonical (parent repo in worktrees) - Tests: add write-path regression test mirroring session-init/SessionRoutes pattern; refactor worktree fixture into beforeAll/afterAll * refactor(project-name): rename local to cwdProjectName and dedupe allProjects Addresses final CodeRabbit nitpick: disambiguates the local variable from the returned `primary` field, and dedupes allProjects via Set in case parent and cwd resolve to the same name. --------- Co-authored-by: Ethan Hurst <ethan.hurst@outlook.com.au> |
||
|
|
3d92684e04 |
fix: filter empty string args before Bun spawn() to prevent CLI parsing errors (#1781)
Bun's child_process.spawn() silently drops empty string arguments from
argv, unlike Node which preserves them. When the Agent SDK defaults
settingSources to [] (empty array), [].join(",") produces "" which gets
pushed as ["--setting-sources", ""]. Bun drops the "", causing
--permission-mode to be consumed as the value for --setting-sources:
Error processing --setting-sources: Invalid setting source: --permission-mode
This caused 100% observation failure (exit code 1 on every SDK subprocess
spawn), resulting in 0 observations stored across all sessions.
The fix filters empty string args before passing to spawn(), making the
behavior consistent between Node and Bun runtimes.
Fixes #1779
Related: #1660
Co-authored-by: bswnth48 <69203760+bswnth48@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
471e1f62f9 |
Fix npx search and default Codex context to workspace-local AGENTS (#1780)
* Fix npx search query parameter mismatch * Use workspace-local Codex AGENTS context by default --------- Co-authored-by: bnb <bnb> |
||
|
|
eeb6841033 |
fix: coerce corpus route filters (#1776)
* fix: coerce corpus route filters * test: cover unsupported corpus type filters |
||
|
|
2a2008bac2 |
fix(file-context): preserve targeted reads + invalidate on mtime (#1719) (#1729)
* fix(file-context): preserve targeted reads + invalidate on mtime (#1719) The PreToolUse:Read hook unconditionally rewrote tool input to {file_path, limit:1}, which interacted with two failure modes: 1. Subagent edits a file → parent's next Read still gets truncated because the observation snapshot predates the change. 2. Claude requests a different section with offset/limit → the hook strips them, so the Claude Code harness's read-dedup cache returns "File unchanged" against the prior 1-line read. The file becomes unreadable for the rest of the conversation, even though the hook's own recovery hint says "Read again with offset/limit for the section you need." Two complementary fixes: - **mtime invalidation**: stat the file (we already stat for the size gate) and compare mtimeMs to the newest observation's created_at_epoch. If the file is newer, pass the read through unchanged so fresh content reaches Claude. - **Targeted-read pass-through**: when toolInput already specifies offset and/or limit, preserve them in updatedInput instead of collapsing to {limit:1}. The harness's dedup cache then sees a distinct input and lets the read proceed. The unconstrained-read path (no offset, no limit) is unchanged: still truncated to 1 line plus the observation timeline, so token economics are preserved for the common case. Tests cover all three branches: existing truncation, targeted-read pass-through (offset+limit, limit-only), and mtime-driven bypass. Fixes #1719 * refactor(file-context): address review findings on #1719 fix - Add offset-only test case for full targeted-read branch coverage - Use >= for mtime comparison to handle same-millisecond edge case - Add Number.isFinite() + bounds guards on offset/limit pass-through - Trim over-verbose comments to concise single-line summaries - Remove redundant `as number` casts after typeof narrowing - Add comment explaining fileMtimeMs=0 sentinel invariant |
||
|
|
59ce0fc553 |
fix: exclude primary key index from unique constraint check in migration 7 (#1771)
* fix: exclude primary key index from unique constraint check in migration 7 PRAGMA index_list returns all indexes including those backing PRIMARY KEY columns (origin='pk'), which always have unique=1. The check was therefore always true, causing migration 7 to run the full DROP/CREATE/RENAME table sequence on every worker startup instead of short-circuiting once the UNIQUE constraint had already been removed. Fix: filter to non-PK indexes by requiring idx.origin !== 'pk'. The origin field is already present on the existing IndexInfo interface. Fixes #1749 * fix: apply pk-origin guard to all three migration code paths CodeRabbit correctly identified that the origin !== 'pk' fix was only applied to MigrationRunner.ts but not to the two other active code paths that run the same removeSessionSummariesUniqueConstraint logic: - SessionStore.ts:220 — used by DatabaseManager and worker-service - plugin/scripts/context-generator.cjs — bundled artifact (minified) All three paths now consistently exclude primary-key indexes when detecting unique constraints on session_summaries. |
||
|
|
31ee1024c5 |
fix: restrict ~/.claude-mem/.env permissions to owner-only (0600) (#1770)
* fix: restrict .env file permissions to owner-only (0600) API keys stored in ~/.claude-mem/.env were created without explicit permissions, defaulting to umask-dependent mode. On systems with a permissive umask (e.g. 0022), the file would be world-readable. - Set directory permissions to 0700 on creation - Set file permissions to 0600 via writeFileSync mode option - Call chmodSync after write to fix permissions on pre-existing files Signed-off-by: Jochen Meyer * fix: also restrict pre-existing directory permissions to 0700 The initial fix only set directory mode on creation. Pre-existing ~/.claude-mem/ directories from earlier installs remained world-readable. Add chmodSync for the directory alongside the existing file chmod, and document the Windows limitation (ACLs, not POSIX permissions). --------- Signed-off-by: Jochen Meyer |