Major version bump following PR #2351 merge — server-beta runtime,
Postgres observation storage, BullMQ queue engine, and Apache 2.0
relicense are now on main.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add server beta runtime foundation
* Address server beta review findings
* Resolve server beta review comments
* Tighten server beta review follow-ups
* Harden server beta auth and search
* Avoid unnecessary FTS rebuilds
* Block scoped keys from creating projects
* Release BullMQ claims best effort on close
* Address server beta review blockers
* Reset BullMQ claims best effort
* Add Postgres observation storage foundation
* feat(server-beta): add independent runtime service
Introduce src/server/runtime/ as a self-contained server-beta runtime
that owns its lifecycle, Postgres bootstrap, and HTTP boundary without
depending on WorkerService.
ServerBetaService wraps the existing Server class, exposes
/healthz and /v1/info with runtime="server-beta", and persists state
to dedicated paths (.server-beta.pid|.port|.runtime.json). The four
boundary managers (queue, generation worker, provider registry, event
broadcaster) are intentionally disabled in this phase and report their
status through /v1/info; later phases activate them.
Adds plans/2026-05-07-finish-bullmq-branch-ship-plan.md to track the
remaining work for this branch.
Phase 2 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(server-beta): route CLI lifecycle and bundle separate runtime
scripts/build-hooks.js now produces plugin/scripts/server-beta-service.cjs
as a separate Node CJS bundle, alongside the existing worker-service
bundle. The server-beta runtime is now installable independently.
src/npx-cli/commands/server.ts routes start|stop|restart|status to the
server-beta lifecycle instead of the legacy worker. The worker keeps its
own start|stop|restart|status under the worker namespace; the two
runtimes can be operated independently.
src/services/worker-service.ts adds a server-* command parser branch
that delegates to the sibling server-beta-service.cjs bundle so
direct worker-service invocations still route to the right runtime.
tests/npx-cli-server-namespace.test.ts updated to expect server-beta
lifecycle routing.
Includes rebuilt plugin/scripts/*.cjs bundles produced by
build-and-sync.
Phase 2 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(server-beta): add BullMQ job queue primitives
Introduce src/server/jobs/ as the queue-side primitives that Phase 3 of
the server-beta runtime needs to operate.
types.ts defines a discriminated union over the four job kinds (event,
event-batch, summary, reindex) and maps each to a per-kind BullMQ queue
name and deterministic-ID prefix.
job-id.ts builds deterministic, colon-free BullMQ jobIds from
(kind, team, project, source). The colon ban exists because BullMQ uses
':' as a Redis key separator internally; embedding ':' in jobIds
breaks scan and state lookups.
ServerJobQueue.ts is a thin wrapper over BullMQ Queue + Worker that
enforces autorun:false, default concurrency 1, and an attached error
listener — all per BullMQ docs requirements. Test seams accept queue
and worker factories so unit tests do not need Redis.
outbox.ts publishes through the Postgres ObservationGenerationJob
repository as canonical history. enqueueOutbox writes the row first,
then publishes to BullMQ; if BullMQ throws, the row is transitioned to
failed and a failed event is appended. reconcileOnStartup re-enqueues
queued + processing rows after a restart, replacing terminal BullMQ
jobs that may still be holding the deterministic ID slot. markCompleted
and markFailed wrap transitionStatus and append the matching event row.
Includes 20 unit tests covering deterministic ID stability, colon-free
output, queue lifecycle, error-listener attachment, double-start
refusal, idempotent enqueue, BullMQ failure rollback, startup
reconciliation, max-attempts skipping, and completion / failure /
retry transitions.
Phase 3 commit 1 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(server-beta): activate queue boundary in runtime service
Wire ActiveServerBetaQueueManager into the server-beta runtime graph.
The active manager owns one ServerJobQueue per generation kind (event,
event-batch, summary, reindex) and surfaces lane metadata through
boundary health.
Selection is opt-in and fail-fast: if CLAUDE_MEM_QUEUE_ENGINE is set to
bullmq the active manager is constructed (and any Redis/config error
throws — no silent fallback to SQLite, per Phase 3 anti-pattern guard).
For any other engine the disabled boundary remains so worker-era and
test setups stay compatible.
Widens ServerBetaBoundaryHealth.status to a discriminated union
('disabled' | 'active' | 'errored') with optional details. The disabled
adapter still emits status='disabled', which keeps the existing
server-beta-service test green.
ServerBetaService receives the manager through a new optional
queueManager field on CreateServerBetaServiceOptions so test graphs
and Phase 4 wiring can inject custom managers.
Adds tests/server/runtime/active-queue-manager.test.ts covering bullmq
guard, active health shape, per-kind queue access, close behavior, and
post-close errored health.
Phase 3 commit 2 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(server-beta): cap /v1/events/batch at 500 events
Prevents unbounded array DoS surface flagged in PR review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PATCH release for the /clear queue-drain fix (PR #2136):
removes the SessionEnd → session-complete shim across all five
integration surfaces so pending observations are no longer abandoned
when users type /clear, logout, or exit.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: 5 trivial bugs from v12.4.1 issue triage
- #2092: emit CJS-safe banner (no import.meta.url) in worker-service.cjs
- #2100: PreToolUse Read hook timeout 2000s → 60s
- #2131: add "shell": "bash" to every hook for Windows compat
- #2132: Antigravity dir typo .agent → .agents
- #2088: clear inherited MCP servers in worker SDK query() calls
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: stop context overflow loop + block task-notification leak
- SDKAgent: clear memorySessionId on "prompt is too long" so crash-recovery
starts a fresh SDK session instead of resuming the same poisoned context
forever (was producing 68+ failed pending_messages on a single stuck
session in the wild)
- tag-stripping: new isInternalProtocolPayload() predicate; session-init
hook + SessionRoutes both skip storage when entire prompt is one of
Claude Code's autonomous protocol blocks (currently <task-notification>;
conservative deny-list — does NOT touch <command-name>/<command-message>
which wrap real user slash-commands)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: bump version to 12.4.2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: update CHANGELOG.md for v12.4.2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cleanup): one-time v12.4.3 migration purges observer-sessions and stuck pending_messages
Adds CleanupV12_4_3 module that runs once per data dir on worker startup
(after migrations apply, before Chroma backfill). Drops accumulated pollution
that v12.4.0 (observer-sessions filter) and v12.4.2 (context-overflow guard +
task-notification leak block) prevent from recurring:
- DELETE FROM sdk_sessions WHERE project='observer-sessions' (cascades to
user_prompts, observations, session_summaries via existing FK ON DELETE CASCADE)
- DELETE FROM pending_messages stuck in 'failed'/'processing' for any session
with >=10 such rows (poisoned chains from the pre-v12.4.2 retry loop;
threshold spares legitimate transient failures)
- Wipes ~/.claude-mem/chroma and chroma-sync-state.json so backfillAllProjects
rebuilds the vector store from cleaned SQLite
Pre-flight checks free disk (1.2x DB size + 100MB) via fs.statfsSync; backs up
via VACUUM INTO with copyFileSync fallback; PRAGMA foreign_keys=ON on the
cleanup connection (off by default in bun:sqlite). Marker file
~/.claude-mem/.cleanup-v12.4.3-applied records backup path and counts. Opt-out
via CLAUDE_MEM_SKIP_CLEANUP_V12_4_3=1.
Verified locally: 311MB DB backed up to 277MB in 943ms; 11 observer sessions
+ 3 cascade rows + 141 stuck pending_messages purged; chroma rebuilt via
backfill. Total cleanup time 1.1s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address PR #2133 code review
- SessionRoutes: check isInternalProtocolPayload before stripping tags
so internal protocol prompts skip the strip work entirely.
- tag-stripping: bound isInternalProtocolPayload input length to
256KB to prevent ReDoS-class scans on malformed unclosed tags.
- SDKAgent: extract resetSessionForFreshStart helper; both
context-overflow paths now share one nullification routine.
- worker-service: drop the per-startup "Checking for one-time
v12.4.3 cleanup" info log — runs every boot even after marker
exists; the function already logs at debug/warn when relevant.
- tests: add isInternalProtocolPayload edge cases (whitespace,
attributes, partial tags, unrelated tags, oversize input).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address Greptile P2 comments on PR #2133
CleanupV12_4_3.ts: derive backup directory and restore-hint path from
effectiveDataDir instead of the module-level BACKUPS_DIR/DB_PATH
constants. The dataDirectory override is meant for test isolation;
the prior version still wrote backups to the production directory.
SessionRoutes.ts: move isInternalProtocolPayload guard to the top of
handleSessionInitByClaudeId, before createSDKSession. The previous
position blocked the user_prompts insert but still created an empty
sdk_sessions row, asymmetric with the hook-layer guard in
session-init.ts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cleanup): retry on disk-skip; survive chroma wipe failure
CodeRabbit Major + Claude review:
- Disk pre-flight skip no longer writes the marker. A user temporarily
low on disk would otherwise have the cleanup permanently disabled
even after freeing space. Retry on next startup instead.
- Wrap wipeChromaArtifacts in try/catch and write the marker even on
failure (with chromaWipeError captured). Without this, an rmSync
permission failure on chroma/ left writeMarker unreached, so every
subsequent boot re-ran the SQL purge AND created a fresh backup,
consuming disk indefinitely.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cleanup): close backup handle before copyFileSync fallback
Claude review:
- backupDb is now closed before falling into the copyFileSync fallback.
On Windows an open SQLite handle holds a file lock that can prevent
the fallback copy from reading the source. The previous version only
closed after both branches completed.
- Add empty-body <task-notification></task-notification> case to the
isInternalProtocolPayload tests for completeness.
Cascade-row count queries already match the actual FK columns
(content_session_id for user_prompts, memory_session_id for
observations / session_summaries) — no fix needed there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cleanup): accurate session count + add migration tests
Claude review v3:
session-init.ts: filter on rawPrompt before the [media prompt]
substitution. Functionally equivalent but explicit — the check no
longer depends on the substitution leaving real protocol payloads
untouched.
CleanupV12_4_3.ts: counts.observerSessions now comes from a pre-DELETE
COUNT(*), not from result.changes. bun:sqlite inflates result.changes
with FTS-trigger and cascade row counts (the user_prompts_fts triggers
inflate a 3-session purge to 19 changes). The previous code logged a
misleading total and wrote it to the marker.
tests/infrastructure/cleanup-v12_4_3.test.ts: happy-path coverage of
the migration against a real on-disk SQLite under a tmpdir. Verifies
observer-session purge with cascades, stuck pending_messages purge,
chroma artifact wipe, marker payload shape, idempotency on re-run, and
CLAUDE_MEM_SKIP_CLEANUP_V12_4_3 opt-out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(protocol-filter): close two-block false positive; address review
CodeRabbit + Claude review v5:
tag-stripping.ts: PROTOCOL_ONLY_REGEX rewritten with a negative-lookahead
body so a prompt like "<task-notification>x</task-notification> hi
<task-notification>y</task-notification>" no longer matches as a single
outer block — the prior greedy [\s\S]* spanned the middle user text and
would have silently dropped a real prompt. Confirmed via probe.
tag-stripping.test.ts: drop the 50ms wall-clock assertion (CI flake);
add the two-block-with-text case as a regression test.
SessionRoutes.ts: filter on req.body.prompt directly, before the
[media prompt] substitution and 256KB truncation. Mirrors the
session-init.ts hook-layer ordering and ensures a protocol payload
that happens to be near the byte limit isn't truncated before the
filter runs.
cleanup-v12_4_3.test.ts: add stuckCount=9 below-threshold case
verifying pending_messages with <10 stuck rows are preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cleanup): include WAL/SHM in backup fallback; safer rollback
CodeRabbit Major + Claude review v6:
CleanupV12_4_3.ts: when VACUUM INTO fails and copyFileSync runs, also
copy any -wal/-shm sidecars. The DB is configured WAL mode, so recent
committed pages can live in those files; copying only the .db would
miss them. VACUUM INTO already captures everything in one file, so
the happy path is unaffected.
CleanupV12_4_3.ts: wrap ROLLBACK in try/catch so a no-op rollback
(SQLite already rolled back on a constraint failure) cannot shadow
the original purge error.
SDKAgent.ts: align both context-overflow log levels to error. Both
branches are fatal-recovery paths; the previous warn/error split was
inconsistent and made the throw branch easy to miss in logs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: pre-count stuck pending_messages; document adjacent-block fall-through
Claude review v7:
CleanupV12_4_3.ts: runStuckPendingPurge now uses a SELECT COUNT(*)
before the DELETE, matching the pattern in runObserverSessionsPurge.
result.changes is reliable today (no FTS on pending_messages) but the
explicit count protects against future schema additions, and keeps
the two purges symmetric.
tag-stripping.test.ts: add test documenting that adjacent protocol
blocks (no user text between) deliberately fall through to storage.
The deny-list is per-block; concatenations are out of scope.
Skipped per project rules / Node API constraints:
- frsize fallback in disk check: Node/Bun StatFs doesn't expose frsize
- VACUUM-INTO comment: comment-only suggestion
- Overflow string constant extraction: low value
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the 300 req/min rate limiter from the worker's HTTP middleware.
The worker is localhost-only (enforced via CORS), so rate limiting was
pointless security theater — but it broke the viewer, which polls logs
and stats frequently enough to trip the limit within seconds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SessionStart context injection regressed in v12.3.3 — no memory
context is being delivered to new sessions. Rolling back to the
v12.3.2 tree state while the regression is investigated.
Reverts #2080.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The synthetic summary salvage feature created fake summaries from observation
data when the AI returned <observation> instead of <summary> tags. This was
overengineered — missing a summary is preferable to fabricating one from
observation fields that don't map cleanly to summary semantics.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>