server-beta: Phases 4–13 — event pipeline, generation, MCP, compat, Docker, team audit, observability (#2383)

* feat(server-beta): Phase 4 — Postgres event-to-generation-job pipeline

Adds POST /v1/events, /v1/events/batch, GET /v1/jobs/:id, GET /v1/events/:id,
and POST /v1/memories on the server-beta runtime, backed by Postgres.

- Event row + outbox generation-job row insert in one withPostgresTransaction.
- BullMQ enqueue happens after commit; enqueue failure leaves the row queued
  for Phase 3 startup reconciliation.
- ?generate=false skips the outbox; ?wait=true returns queue status only,
  never observation IDs (provider generation is Phase 5).
- Batch pre-validates all event projectIds against api-key scope before any
  write; mixed-project batches reject 403 with zero side effects.
- /v1/memories is a direct insert alias — no generator, no outbox.
- Cross-tenant /v1/jobs/:id returns 404 to avoid leaking row existence.
- New PostgresAuthMiddleware reads api_keys by SHA-256 hash; populates
  req.authContext.teamId/projectId; legacy ServerV1Routes (SQLite, used by
  worker runtime) is left untouched.
- Tests: unit suite hardened with stubbed pool.query so route registration
  is safe; integration tests skip cleanly without CLAUDE_MEM_TEST_POSTGRES_URL.
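
A minimal sketch of the batch pre-validation described above, for
illustration only. The type and function names are hypothetical, not the
shipped route code; the point is that every event is checked before any
write happens.

```typescript
// Hypothetical shapes; the real request and auth-context types may differ.
interface BatchEvent {
  projectId: string;
}

interface KeyScope {
  projectId: string;
}

type BatchCheck =
  | { ok: true }
  | { ok: false; status: 403; rejectedProjectIds: string[] };

// Validate the whole batch up front so a rejected batch has zero side effects.
function checkBatchScope(events: BatchEvent[], scope: KeyScope): BatchCheck {
  const rejected = new Set<string>();
  for (const event of events) {
    if (event.projectId !== scope.projectId) {
      rejected.add(event.projectId);
    }
  }
  if (rejected.size === 0) {
    return { ok: true };
  }
  return { ok: false, status: 403, rejectedProjectIds: Array.from(rejected) };
}
```

The caller would open the transaction only when the result is `ok`.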

Verification: 87 pass / 1 skip / 0 fail. No new typecheck errors. Required
greps for WorkerService and MemoryItemsRepository in src/server/routes/v1
and src/server/runtime return no hits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 5 — provider observation generator

Adds independent provider generation under src/server/generation/ with no
worker coupling. Server beta can now generate observations end-to-end:
event -> outbox -> BullMQ -> provider -> parser -> persisted observation.

- ProviderObservationGenerator orchestrates: lock outbox (queued -> processing),
  reload agent_event from Postgres (BullMQ payload is advisory only), call
  provider, hand raw text to processGeneratedResponse, route errors via
  markGenerationFailed with retryable flag from ServerClassifiedProviderError.
- processGeneratedResponse parses with parseAgentXml, persists via
  PostgresObservationRepository with deterministic
  generation_key = generation:v1:{job_id}:{index}:{fingerprint},
  links via PostgresObservationSourcesRepository, advances outbox status,
  appends observation_generation_job_events, audits — all in one
  withPostgresTransaction. Idempotent on retry via UNIQUE constraints.
- Three provider adapters under src/server/generation/providers/:
  Claude, Gemini, OpenRouter. Self-contained — no imports from
  src/services/worker/*. Worker providers unchanged.
- Shared error classification + prompt builder under providers/shared/.
  Prompt builder strips <private> at the edge; fully-private batches
  emit <skip_summary /> without billing the provider.
- ActiveServerBetaGenerationWorkerManager wires BullMQ Worker via
  ServerJobQueue.start(...) with concurrency 1 + autorun:false +
  worker.on('error') per BullMQ docs.
- New GET /v1/events/:id/observations on ServerV1PostgresRoutes returns
  observations linked via observation_sources, team/project scoped.
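
A sketch of how the deterministic generation_key above could be built.
The key layout comes from this commit; how the fingerprint is computed is
an assumption (here, a truncated SHA-256 of the parsed content), and the
function name is hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumption: the fingerprint is derived from the parsed observation content.
function buildGenerationKey(jobId: string, index: number, content: string): string {
  const fingerprint = createHash("sha256")
    .update(content, "utf8")
    .digest("hex")
    .slice(0, 16);
  return `generation:v1:${jobId}:${index}:${fingerprint}`;
}
```

Because the key is a pure function of its inputs, a retried job that
parses the same output produces the same key, and a UNIQUE constraint on
that column can collapse the duplicate insert.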

Verification: 104 pass / 4 skip / 0 fail. No typecheck regressions.
Anti-pattern greps clean for services/worker imports under src/server,
WorkerRef/ActiveSession/SessionStore in src/server/generation.

Deferred: ModeManager loading uses a stable fallback observation type
list; summary and reindex queue lanes are not yet wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 6 — independent server session semantics

server_sessions is now the canonical Server beta session model. Sessions
are independent of legacy worker ActiveSession state.

- PostgresServerSessionRepository extended: findByExternalIdForScope,
  endSession (idempotent via COALESCE(ended_at, now())),
  markGenerationStarted/Completed/Failed, listUnprocessedEvents (filters
  out agent_events that already have a completed agent_event job).
- ServerSessionRuntimeRepository wraps the repo; every method requires
  explicit team_id + project_id and validates scope via assertProjectOwnership.
- SessionGenerationPolicy supports per-event (default), debounce
  (BullMQ delayed-job replace via getJob+remove+add), and end-of-session.
  Configured via CLAUDE_MEM_SERVER_SESSION_POLICY and
  CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS env vars; per-team override hooks
  are exposed on ServerV1PostgresRoutesOptions for future settings layer.
- POST /v1/sessions/start (find-or-create on
  (project_id, external_session_id)), GET /v1/sessions/:id (scoped 404),
  POST /v1/sessions/:id/end
  (transactional: end + create summary outbox via UNIQUE collapse +
  enqueue post-commit). Re-ending is fully idempotent.
- processSessionSummaryResponse persists summary as kind='summary'
  observation with the same idempotency model
  (generation_key + observation_sources UNIQUE).
- ProviderObservationGenerator dispatches on source_type:
  agent_event -> processGeneratedResponse, session_summary ->
  processSessionSummaryResponse; loadEvents handles session-summary
  by loading unprocessed events.
- ActiveServerBetaGenerationWorkerManager wires summary BullMQ lane
  alongside event lane (concurrency=1, autorun=false, error listener
  attached per BullMQ docs).

Verification: 110 pass / 6 skip / 0 fail. Net typecheck error count
unchanged at 24 (pre-existing, none in Phase 6 files). Anti-pattern
greps clean for ActiveSession/SessionStore in src/server/runtime,
no worker imports anywhere in src/server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 7 — hook routing without worker dependency

Hooks can now talk directly to server-beta when CLAUDE_MEM_RUNTIME=server-beta
is selected, with a clean worker fallback when server-beta is unhealthy.

- src/services/hooks/server-beta-client.ts — typed HTTP client for
  /v1/sessions/start, /v1/events, /v1/sessions/:id/end. Throws
  ServerBetaClientError with kind classification (missing_api_key,
  transport, timeout, http_error, invalid_response) and isFallbackEligible
  helper. Zero imports from services/worker/.
- src/services/hooks/runtime-selector.ts — reads CLAUDE_MEM_RUNTIME from
  settings, returns worker or server-beta context, logs
  [server-beta-fallback] reason=<code> on every config-time fallback.
- src/services/hooks/server-beta-bootstrap.ts — Postgres-backed API key
  bootstrap. Find-or-creates local-hook-team + local-hook-project,
  generates cmem_<random> key (SHA-256 hashed), inserts into api_keys
  with scopes events:write/sessions:write/observations:read/jobs:read.
  Settings file written with chmod 0600. rotateServerBetaApiKey() wired
  to a new `claude-mem server keys rotate` command.
- src/cli/handlers/{observation,session-init,summarize}.ts — every hook
  handler tries server-beta first when configured, falls through to the
  existing worker path on transport/5xx/429/missing-key. One WARN line
  per fallback. Hook JSON output shape unchanged.
- src/shared/SettingsDefaultsManager.ts — three new keys with defaults:
  CLAUDE_MEM_SERVER_BETA_URL, CLAUDE_MEM_SERVER_BETA_API_KEY,
  CLAUDE_MEM_SERVER_BETA_PROJECT_ID.
- src/npx-cli/commands/install.ts — when installer selects server-beta
  runtime and CLAUDE_MEM_SERVER_DATABASE_URL is set, bootstraps a local
  API key automatically. Warns and continues if the DB URL is missing.
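
A sketch of the key bootstrap idea above: generate a cmem_-prefixed key
and store only its SHA-256 hash. The function names, the 32-byte length,
and the hex encoding are assumptions for illustration, not the shipped
bootstrap code.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hash used for lookups; the plaintext key itself is never stored server-side.
function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext, "utf8").digest("hex");
}

// Assumption: 32 random bytes, hex-encoded, behind the cmem_ prefix.
function generateApiKey(): { plaintext: string; hash: string } {
  const plaintext = `cmem_${randomBytes(32).toString("hex")}`;
  return { plaintext, hash: hashApiKey(plaintext) };
}
```

The plaintext would go to the settings file and the hash to api_keys, so
an incoming key can be verified by hashing it and matching the row.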

plugin/scripts/*.cjs bundles rebuilt via npm run build to pick up the
new hook handler code path. No plaintext keys in the bundle (verified).

Verification: 16 hook unit tests pass; 275 server/storage/services tests
pass with 7 pre-existing failures (verified independent of this change
via git stash --include-untracked). Build clean. No new typecheck
errors in Phase 7 files.

Anti-pattern guards verified:
- /api/sessions/observations only reached via explicit fallback path
- server-beta runtime never starts the worker process
- API keys live only in ~/.claude-mem/settings.json (chmod 0600), never
  in the bundle (grep confirmed)
- Worker fallback preserved, observable via single WARN line per call

Deferred: semantic context injection (UserPromptSubmit hook) stays
worker-only; server-beta does not yet expose /v1/context/semantic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 8 — MCP backed by server-beta core

MCP tools now route through server-beta in server-beta mode while keeping
worker-mode search/timeline/get_observations tools fully working.

- src/servers/mcp-server.ts — five new observation_* tools registered:
  observation_add, observation_record_event, observation_search,
  observation_context, observation_generation_status. Three memory_*
  compatibility aliases delegate to the canonical handlers. Worker
  auto-start is gated when selectRuntime() === 'server-beta' so MCP
  in server-beta mode never spawns the worker.
- src/services/hooks/server-beta-client.ts — addObservation,
  searchObservations, contextObservations, getJobStatus added so MCP
  shares one transport with hooks (Phase 7).
- src/server/routes/v1/ServerV1PostgresRoutes.ts — POST /v1/search and
  POST /v1/context REST cores backed by PostgresObservationRepository
  full-text search (GIN tsvector from Phase 1).
- Existing memory_search/timeline/get_observations tools call
  callWorkerAPI unchanged in worker mode; worker tests unaffected.

Verification: 39 pass / 4 skip / 0 fail on targeted suite. Pre-existing
7 baseline failures verified independent (git stash). No new typecheck
errors. WorkerService grep clean across src/servers/mcp-server.ts and
src/server/.

Anti-pattern guards verified:
- No duplicate generation logic in MCP — observation_record_event hits
  /v1/events which owns event+outbox+enqueue inside one tx
- WorkerService not imported anywhere under MCP server-beta path
- No hardcoded worker URLs — all transport via Phase 7 ServerBetaClient
- memory_* aliases retained, single handler per pair

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 9 — compatibility adapters without coupling

Legacy /api/sessions/observations and /api/sessions/summarize endpoints
keep working on server-beta runtime by translating to AgentEvent and
session-end calls — no worker code, no route duplication.

- src/server/services/IngestEventsService.ts — shared event-ingest path
  used by both /v1/events and the compat adapter. Owns transactional
  event row + outbox row + lifecycle log + post-commit BullMQ enqueue,
  honors Phase 6 SessionGenerationPolicy.
- src/server/services/EndSessionService.ts — shared session-end path
  used by both /v1/sessions/:id/end and the compat adapter. Idempotent
  ended_at + summary outbox + deterministic summary job id.
- src/server/compat/SessionsObservationsAdapter.ts — translates legacy
  POST /api/sessions/observations payload (Claude Code transcript shape)
  -> AgentEvent (source_adapter='claude-code-compat',
  event_type='tool_use') -> IngestEventsService.ingestOne. Resolves
  contentSessionId to server_sessions via find-or-create.
- src/server/compat/SessionsSummarizeAdapter.ts — translates legacy
  POST /api/sessions/summarize -> EndSessionService.end. Preserves the
  legacy agentId -> {status:'skipped', reason:'subagent_context'}
  behavior so existing clients see the same response shape.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — refactored to
  delegate to the new shared services (-203 LoC net) so /v1 and
  /api compat both call the SAME canonical code path.
- src/server/runtime/ServerBetaService.ts — registers both compat
  adapters alongside ServerV1PostgresRoutes, sharing service instances.
- docs/server-beta-parity-map.md — full enumeration of legacy /api/*
  routes labeled native, adapter, or unsupported (with reasons).
  Viewer read-path adapters explicitly listed as unsupported pending
  a future viewer-rewrite phase.

Verification: 7 compat tests pass, 6 v1-routes tests still pass
(refactor preserved behavior), 4 session-routes tests pass. The 16
pre-existing baseline failures were verified independent via git stash.
Zero new typecheck errors.

Anti-pattern guards verified:
- No services/worker/http/routes or WorkerService imports under
  src/server/compat or src/server/runtime
- Compat adapters are thin translators with names ending in *Adapter
  and a top-of-file comment noting they are legacy compatibility
- /v1/* remains the canonical Server beta API; compat adapters
  call shared services rather than acting as a parallel API

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 10 — Docker stack and deployable runtime

Server beta now ships as a Docker stack with no worker process anywhere
and a separate horizontal generation worker for scaling.

- src/server/runtime/create-server-beta-service.ts — validateServerBetaEnv()
  fails fast on missing CLAUDE_MEM_SERVER_DATABASE_URL, requires
  CLAUDE_MEM_QUEUE_ENGINE=bullmq in Docker, rejects
  CLAUDE_MEM_AUTH_MODE=local-dev and CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS
  inside containers (detected via /.dockerenv or CLAUDE_MEM_DOCKER=1).
  Adds CLAUDE_MEM_GENERATION_DISABLED so the HTTP service can run
  generator-free.
- src/server/runtime/ServerBetaService.ts — runServerBetaGenerationWorker
  for the dedicated consumer process; runServerBetaApiKeyCli is a new
  Postgres-backed `server api-key` command (the legacy worker CLI wrote
  to SQLite and was invisible to the Postgres runtime); getQueueHealth
  shim feeds /api/health a consistent ObservationQueueHealth shape.
- src/npx-cli/commands/{runtime,server}.ts — `claude-mem server worker
  start` subcommand that boots only the BullMQ consumer.
- docker/claude-mem/{Dockerfile,entrypoint.sh} — entrypoint forces
  CLAUDE_MEM_DOCKER=1 + CLAUDE_MEM_RUNTIME=server-beta and exposes
  three modes: server (HTTP only, generation disabled), worker (BullMQ
  consumer), shell. Worker bundle is no longer the default CMD.
- docker-compose.yml — full stack: postgres + valkey + claude-mem-server
  (HTTP-only) + claude-mem-worker (generation consumer). Wires
  service-to-service env vars.
- scripts/e2e-server-beta-docker.sh + docker/e2e/server-beta-e2e.mjs —
  E2E now hits /v1/sessions/start, /v1/events?wait=true, /v1/jobs/:id;
  asserts no worker-service.cjs process anywhere in the stack;
  one-shot docker compose run --rm verifies local-dev auth is
  rejected with the expected stderr; restart-and-verify confirms
  Postgres durability and BullMQ retry idempotency.
- docs/server.md — full Phase 10 doc: stack diagram, env table,
  worker mode, auth-in-Docker policy.
- docs/api.md — event generation semantics (wait=true, generationJob).

Verification: full Docker E2E PASSED on live daemon
(phase1 + phase2 + restart-and-verify + revoked-key + no-worker-
process + local-dev-rejected). Unit tests 292 pass / 9 skip / 7 fail
(7 fails pre-existing baseline). Zero new typecheck errors.

Anti-pattern guards verified:
- entrypoint never execs worker-service.cjs; E2E greps prove no
  worker process anywhere in the stack
- validateServerBetaEnv refuses local-dev auth in Docker with explicit
  remediation message; ALLOW_LOCAL_DEV_BYPASS rejected the same way
- Docker requires CLAUDE_MEM_QUEUE_ENGINE=bullmq; in-process queue
  rejected at startup
- claude-mem worker / worker-service / WorkerService greps clean
  in docker/
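
A sketch of the fail-fast rules listed for validateServerBetaEnv. The env
var names come from this commit; the function name, the error strings,
returning a list instead of throwing, and treating any non-empty
CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS as "set" are all assumptions.

```typescript
type Env = Record<string, string | undefined>;

// inDocker is passed in so the check stays pure; the real code is described
// as detecting Docker via /.dockerenv or CLAUDE_MEM_DOCKER=1.
function collectServerBetaEnvErrors(env: Env, inDocker: boolean): string[] {
  const errors: string[] = [];
  if (!env.CLAUDE_MEM_SERVER_DATABASE_URL) {
    errors.push("CLAUDE_MEM_SERVER_DATABASE_URL is required");
  }
  if (inDocker) {
    if (env.CLAUDE_MEM_QUEUE_ENGINE !== "bullmq") {
      errors.push("CLAUDE_MEM_QUEUE_ENGINE must be bullmq in Docker");
    }
    if (env.CLAUDE_MEM_AUTH_MODE === "local-dev") {
      errors.push("CLAUDE_MEM_AUTH_MODE=local-dev is not allowed in Docker");
    }
    if (env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS) {
      errors.push("CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS is not allowed in Docker");
    }
  }
  return errors;
}
```

A startup path could throw when the returned list is non-empty, which
keeps the remediation messages in one place.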

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 11 — team-aware generation with audit chain

Generation jobs now carry team_id/project_id/api_key_id/actor_id/
source_adapter from enqueue through execution; the outbox is reloaded
from Postgres before any side effect so BullMQ payload can never act
as auth authority.

- src/server/jobs/types.ts — ServerGenerationJobPayloadSchema (Zod
  discriminated union) requires team_id, project_id, generation_job_id,
  source_adapter, api_key_id, actor_id (nullable), source_type, source_id,
  plus event_id / server_session_id per kind. assertServerGenerationJobPayload
  is called at enqueue (outbox.ts) and again at execution boundary.
- src/server/services/{IngestEventsService,EndSessionService}.ts +
  SessionGenerationPolicy.ts — thread identity context (apiKeyId, actorId,
  sourceAdapter) into both event and summary BullMQ payloads.
- src/server/generation/ProviderObservationGenerator.ts —
  loadCanonicalOutbox loads the outbox row WITHOUT scope filter, then
  compares candidate.team_id/project_id to payload.team_id/project_id;
  mismatch -> ServerGenerationScopeViolationError (non-retryable),
  failed status, generation_job.scope_violation audit. isApiKeyRevoked
  checks api_keys (revoked_at, expires_at, row missing) before any
  provider call; revoked -> generation_job.revoked_key audit + non-
  retryable failure. generation_job.processing audit emitted on lock.
- src/server/generation/processGeneratedResponse.ts — generated
  observations carry team_id/project_id/server_session_id from the
  reloaded source row (not job payload). observation_sources.metadata
  records source_adapter, actor_id, api_key_id for traceability.
  observation.created audit per observation; generation_job.completed
  audit per terminal transition. All audit rows reference the same
  generation_job_id in details.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — GET /v1/teams/:id/jobs
  and GET /v1/projects/:id/jobs with SQL-layer scoping (WHERE team_id=$1
  [AND project_id=$2] [AND status=$3]); cross-tenant returns 404 to
  avoid leaking row existence. Pagination via status/limit/offset.
  audit_log rows for event.received, event.batch_received, observation.read.
- src/server/compat/{SessionsObservationsAdapter,SessionsSummarizeAdapter}.ts —
  propagate apiKeyId and sourceAdapter='claude-code-compat'.
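
A sketch of the reload-then-compare check described for
loadCanonicalOutbox. The error class and function names here are
hypothetical stand-ins; only the rule (the Postgres row wins, a mismatch
is non-retryable) is taken from this commit.

```typescript
interface Scoped {
  team_id: string;
  project_id: string;
}

// Hypothetical stand-in for ServerGenerationScopeViolationError.
class ScopeViolationError extends Error {
  readonly retryable = false;
  constructor(message: string) {
    super(message);
    this.name = "ScopeViolationError";
  }
}

// The outbox row reloaded from Postgres is the authority; the queue payload
// is only a claim that must match it before any side effect runs.
function assertPayloadMatchesOutbox(outboxRow: Scoped, payload: Scoped): void {
  if (
    outboxRow.team_id !== payload.team_id ||
    outboxRow.project_id !== payload.project_id
  ) {
    throw new ScopeViolationError("queue payload scope does not match outbox row");
  }
}
```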

Verification: 162 pass / 10 skip / 0 fail. Pre-existing failures in
tests/services/queue and tests/services/worker confirmed independent
via git stash. Zero new typecheck errors in server-beta files.
Required greps:
  rg "team_id.*req\.body|project_id.*req\.body" src/server -> 0 matches
Audit chain integration test passes — generation_job.processing,
observation.created, and generation_job.completed audit rows all
share the same generation_job_id reference.

Anti-pattern guards verified:
- BullMQ payload never acts as auth authority — Postgres outbox
  reload with mismatch check happens before every side effect
- team_id / project_id never derived from request body for scope
  decisions; always req.authContext.teamId / projectId
- Application-layer team/project filtering forbidden — listJobsForScope
  pushes scope into the SQL WHERE clause
- Project-scoped key on cross-project /v1/teams/:id/jobs returns 404
- Revoked api keys cause non-retryable failure with audit before
  any provider call

Deferred: a redundant generation_job.queued audit_log row (already
covered by observation_generation_job_events lifecycle log per Phase 1
schema split). Compat adapters set actor_id=null but propagate
api_key_id, which is the canonical reference downstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 12 — observability and operations

Operators can now inspect, retry, and cancel generation jobs from the
CLI; queue lane metrics flow into /api/health and /v1/info; every
request gets a stable request_id that flows through HTTP -> audit ->
outbox -> generator -> completion log.

- src/server/middleware/request-id.ts — honors safe inbound X-Request-Id,
  mints uuid v4 otherwise. Set on req.requestId and echoed via response
  header so external traces can correlate.
- src/server/jobs/ServerJobQueue.ts — QueueEvents wired with completed,
  failed, progress, stalled, error listeners; lifecycle counters
  exposed via observe() API. Logs emitted as
  [generation] job=<id> source_type=<...> duration=<ms> attempts=<N>
  reason=<message>. Stalled and error counters survive worker restart.
- src/server/jobs/types.ts — ServerGenerationJob payload schema
  extended with optional request_id; flows through from HTTP into
  every BullMQ job.
- src/server/queue/ObservationQueueEngine.ts — health snapshot now
  carries per-lane (event, summary) counts via
  ObservationQueueHealthLaneSnapshot.
- src/server/runtime/{ActiveServerBetaQueueManager,
  ActiveServerBetaGenerationWorkerManager,ServerBetaService}.ts —
  per-lane getJobCounts feed /api/health and /v1/info; stalled events
  audit through audit_log with action generation_job.stalled.
- src/server/routes/v1/ServerV1PostgresRoutes.ts —
  GET /v1/jobs (status/source_type/since/limit/offset, scope from
  api-key, payload stripped unless ?include=payload AND admin scope),
  POST /v1/jobs/:id/retry (idempotent; queued -> no-op; audit
  generation_job.retried_by_operator), POST /v1/jobs/:id/cancel
  (terminal -> no-op; audit generation_job.cancelled_by_operator;
  generator reload-before-side-effects already prevents double work).
- src/server/services/IngestEventsService.ts +
  SessionGenerationPolicy.ts + ProviderObservationGenerator.ts —
  request_id propagated end to end. Generator extracts request_id
  from BullMQ payload and includes it in lock/processing/completion
  logs and audit details.
- src/npx-cli/commands/server-jobs.ts +
  src/npx-cli/commands/server.ts — `claude-mem server jobs
  status|failed|retry|cancel`. status compares Postgres outbox counts
  to BullMQ queue counts and surfaces divergence. failed prints
  attempts + last_error message. --team and --project filters.
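
A sketch of the request-id rule above (honor a safe inbound X-Request-Id,
otherwise mint a uuid v4). What counts as "safe" is not spelled out in
this commit, so the pattern below is an assumption, as is the function
name.

```typescript
import { randomUUID } from "node:crypto";

// Assumption: "safe" means a short token of URL-safe characters.
const SAFE_REQUEST_ID = /^[A-Za-z0-9._-]{1,128}$/;

function resolveRequestId(inbound: string | string[] | undefined): string {
  // Node exposes repeated headers as arrays; only the first value is considered.
  const candidate = Array.isArray(inbound) ? inbound[0] : inbound;
  if (candidate !== undefined && SAFE_REQUEST_ID.test(candidate)) {
    return candidate;
  }
  return randomUUID();
}
```

Rejecting unsafe values rather than sanitizing them keeps control
characters and oversized strings out of logs and audit details.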

Verification: 350 pass / 12 skip / 7 fail (pre-existing baseline,
verified independent via git stash). 18 new tests added (request-id
middleware, server-jobs CLI seams, jobs list/retry/cancel routes
Postgres-gated). Zero new typecheck errors.

Anti-pattern guards verified:
- agent_events.payload only emitted in /v1/jobs response inside the
  admin-gated branch (?include=payload + admin scope) — returns 403
  otherwise
- jobs retry on a queued row is a no-op (no double BullMQ enqueue,
  no double UPDATE)
- Every operator action writes to audit_log with the
  *_by_operator action and request_id correlation in details
- Stalled events audit through generation_job.stalled

Sample correlated trace (one request_id end to end):
  HTTP middleware: req.requestId = 'req-abc'
  audit event.received: details.requestId = 'req-abc'
  BullMQ payload: { request_id: 'req-abc', generation_job_id: 'gj_x' }
  generator lock log: [generation] job locked { jobId, requestId }
  audit generation_job.processing: details.requestId = 'req-abc'
  completion log: [generation] job=evt_... duration=1230ms

Deferred: live /api/health round-trip integration test (needs
Redis); stalled event live integration test (needs Redis); storing
request_id on the observations row itself (spec did not require).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(server-beta): add Phase 13 release readiness report

Captures the final verification gate: tests (1749 pass, 45 fail, all
pre-existing baseline, zero regressions), required greps clean,
Docker E2E green end-to-end, all 7 exit criteria met, build clean,
typecheck unchanged from main. Documents deferred items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(server-beta): rebuild server-beta-service bundle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address Greptile review on PR #2383

- ProviderObservationGenerator.lockOutbox: skip duplicate worker run when
  another lock is active instead of returning the row, which previously let
  two BullMQ workers issue the (paid, rate-limited) external provider call
  before the persistence-layer terminal-status guard collapsed the duplicate.
  Reconciliation still recovers from a stale lock on startup or next retry.
- docker-compose.yml: require POSTGRES_USER/PASSWORD/DB env vars (no
  defaults). Stack refuses to start without explicit secrets. Added a header
  warning that the file must not be deployed unmodified.
- e2e-server-beta-docker.sh: export ephemeral test creds for the new
  required env vars so the Docker E2E driver still runs unattended.
- ServerBetaService api-key list: bound query with LIMIT/OFFSET (default 100,
  max 500) and add optional --team filter to prevent unintentional
  cross-tenant key metadata disclosure on shared admin hosts.
- SessionGenerationPolicy: fix dead `??` fallback for NaN parseInt result;
  use `||` so DEFAULT_DEBOUNCE_MS actually applies.
- ServerV1PostgresRoutes: `?wait=true` now actually waits — polls the outbox
  row until terminal status (timeout 30s, 100ms interval) on both
  /v1/events and /v1/events/batch. Returns `waitTimedOut: true` if the cap
  is hit so callers can re-poll the status endpoints.
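
A sketch of why the `??` fallback in the SessionGenerationPolicy item was
dead code. The default value and function name below are hypothetical;
the behavior of NaN with `??` and `||` is standard JavaScript.

```typescript
// Hypothetical default; the real DEFAULT_DEBOUNCE_MS value is not shown here.
const DEFAULT_DEBOUNCE_MS = 1000;

function parseDebounceMs(raw: string | undefined): number {
  const parsed = parseInt(raw ?? "", 10);
  // `parsed ?? DEFAULT_DEBOUNCE_MS` would return NaN for bad input, because
  // NaN is neither null nor undefined. `||` treats NaN as falsy instead.
  return parsed || DEFAULT_DEBOUNCE_MS;
}
```

One trade-off of `||` is that an explicit 0 also falls back to the
default, which is usually acceptable for a debounce interval.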

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address CodeRabbit + Greptile second review on PR #2383

P1 fixes
- Operator retry endpoint was re-publishing the Postgres outbox metadata
  column as the BullMQ payload; the worker's
  assertServerGenerationJobPayload always rejected it, leaving the row
  stuck in queued until startup reconciliation. Persist the BullMQ payload
  on the outbox row at create-time inside IngestEventsService and
  EndSessionService, then re-enqueue that canonical payload on retry.

Major fixes
- prompt-builder: escape server_session_id when interpolating into the
  XML prompt; previously a session id containing `<`, `&`, or quotes
  could inject XML into the provider input.
- ServerJobQueue: route both worker.on('stalled') and the QueueEvents
  'stalled' subscriber through a single notifyStalled helper that
  dedupes by jobId for 30s, so counters.stalled increments once per
  stall. QueueEvents 'error' now routes through notifyQueueError so
  it increments counters.errored and runs onError listeners — keeping
  observability symmetric across both sources.
- ServerV1PostgresRoutes: convert PostgresObservationRepository from
  three dynamic imports to a single static import for consistency.
- mcp-server / ServerBetaClient: actually forward the
  observation_record_event tool's `generate` flag through to the
  /v1/events endpoint as `?generate=false` instead of voiding it.
- server-sessions.markGenerationFailed: guard jsonb_set against a null
  error payload so the failure path can't null out metadata before the
  generation_status='failed' write commits.
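
A sketch of the 30s per-jobId dedupe described for notifyStalled. The
factory shape and the injectable clock are assumptions made so the
example can be tested; the shipped helper may be structured differently.

```typescript
const STALLED_DEDUPE_MS = 30_000;

// Returns a notifier that fires onStalled at most once per job per window,
// no matter how many sources report the same stall.
function createStalledNotifier(
  onStalled: (jobId: string) => void,
  now: () => number = Date.now,
): (jobId: string) => boolean {
  const lastSeen = new Map<string, number>();
  return function notifyStalled(jobId: string): boolean {
    const timestamp = now();
    const previous = lastSeen.get(jobId);
    if (previous !== undefined && timestamp - previous < STALLED_DEDUPE_MS) {
      return false; // duplicate report inside the window
    }
    lastSeen.set(jobId, timestamp);
    onStalled(jobId);
    return true;
  };
}
```

Both the worker listener and the QueueEvents subscriber would call the
same notifier. The map in this sketch is never pruned, so a long-lived
process would want to evict old entries.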

Minor fixes
- server-sessions.endSession: keep updated_at stable on repeated calls
  so the documented idempotency contract holds.
- SettingsDefaultsManager + ServerBetaService.getServerBetaPort: derive
  the server-beta default port from UID (37877 + uid%100), matching the
  worker port pattern, so two users on the same host don't collide.
  Docker stacks always pass CLAUDE_MEM_SERVER_PORT explicitly so the
  containerized deployment is unaffected.
- server-session-runtime test: close the pg.Pool in afterAll.
- server-beta-release-readiness.md: escape pipes inside table inline
  code, add `text` language tag to the fenced log block.
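
A sketch of the UID-derived default port from the Minor fixes above. The
formula (37877 + uid%100) is from this commit; the function and constant
names are hypothetical.

```typescript
const SERVER_BETA_BASE_PORT = 37877;

// Spread default ports across a 100-port range keyed by the user's UID.
function defaultServerBetaPort(uid: number): number {
  return SERVER_BETA_BASE_PORT + (uid % 100);
}
```

A caller on POSIX could pass `process.getuid()`. Two UIDs that are
congruent modulo 100 still map to the same port, so the explicit
CLAUDE_MEM_SERVER_PORT override remains the reliable option.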

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address Greptile + CodeRabbit third review on PR #2383

P1 fixes
- SessionsObservationsAdapter.resolveServerSession: catch unique-violation
  (23505) on concurrent compat inserts and re-fetch instead of returning
  500. Two compat callers carrying the same contentSessionId can both
  observe `existing===null` and race on the (project_id,
  external_session_id) unique constraint; the second now resolves to the
  raced row instead of dropping the event.
- /v1/events/batch: pass `sourceAdapter: null` to ingestBatch so each
  event's BullMQ payload (and persisted outbox payload column) reflects
  its own event.sourceAdapter via buildEventBullmqPayload's fallback,
  rather than stamping the whole batch with the first event's adapter.

Minor
- server-session-runtime test afterEach: wrap DROP SCHEMA in try/finally
  so client.release() always runs even if the drop throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): drop `pool as never` cast — pg.Pool already matches PostgresPool

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): retry of completed job now 409s instead of duplicating

retryGenerationJob previously fell through to the reset+re-enqueue path
when called on a job in `completed` status. The observations index
dedupes on (generation_job_id, parsed_observation_index, content) but
LLM output is non-deterministic, so a second provider run almost always
produced a different content string and bypassed the index, persisting a
parallel set of observation rows attributed to the same generation job.

Match cancelGenerationJob's 409 guard for completed jobs. failed and
cancelled remain valid retry targets.
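
A sketch of the retry rules as a pure decision function. The queued,
completed, failed and cancelled cases follow this PR; treating
`processing` as a no-op is an assumption, and all names are hypothetical.

```typescript
type JobStatus = "queued" | "processing" | "completed" | "failed" | "cancelled";

type RetryDecision =
  | { action: "noop" }
  | { action: "reject"; httpStatus: 409 }
  | { action: "requeue" };

function decideRetry(status: JobStatus): RetryDecision {
  switch (status) {
    case "queued":
      return { action: "noop" }; // already waiting; no second enqueue
    case "processing":
      return { action: "noop" }; // assumption: an in-flight job is left alone
    case "completed":
      return { action: "reject", httpStatus: 409 }; // a rerun could duplicate observations
    case "failed":
    case "cancelled":
      return { action: "requeue" };
  }
}
```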

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(server-beta): rebuild bundles after rebase onto main

Regenerates the three plugin bundles so they reflect the rebased source
state. Mechanical rebuild output only — no source changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): wrap resolveServerSession in try/catch for structured error response

Greptile P1 on PR #2383: resolveServerSession was called before the try/catch
in both compat adapters, so Postgres errors during session lookup (timeout,
pool exhaustion, etc.) escaped to Express's default error handler and returned
HTML/text 500s. Legacy clients calling response.json() would get a parse
failure instead of the documented { stored: false, reason: 'internal_error' }
(or { status: 'error', reason: 'internal_error' } for the summarize adapter)
shape.

Move the resolveServerSession call inside the existing try block in both
adapters so any failure flows through the structured catch handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): catch 23505 unique violation in POST /v1/sessions/start

Greptile P1 on PR #2383: concurrent requests with the same externalSessionId
can both pass the findByExternalIdForScope check, both call repo.create,
and the loser hits the (project_id, external_session_id) unique constraint.
The handler treated that as an unknown error and returned a 500.

Apply the same pattern resolveServerSession already uses: catch error.code
'23505' when externalSessionId is set, refetch the row inserted by the
winning request, and return 200 with that session.
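
A sketch of the catch-23505-and-refetch pattern described above. The
repository interface and function name are hypothetical; 23505 is the
PostgreSQL SQLSTATE for unique_violation, which node-postgres surfaces as
`error.code`.

```typescript
interface Session {
  id: string;
  externalSessionId: string;
}

// Hypothetical repository shape for illustration.
interface SessionRepo {
  find(projectId: string, externalSessionId: string): Promise<Session | null>;
  create(projectId: string, externalSessionId: string): Promise<Session>;
}

const PG_UNIQUE_VIOLATION = "23505";

async function findOrCreateSession(
  repo: SessionRepo,
  projectId: string,
  externalSessionId: string,
): Promise<Session> {
  const existing = await repo.find(projectId, externalSessionId);
  if (existing) return existing;
  try {
    return await repo.create(projectId, externalSessionId);
  } catch (err) {
    const code =
      typeof err === "object" && err !== null
        ? (err as { code?: unknown }).code
        : undefined;
    if (code !== PG_UNIQUE_VIOLATION) throw err;
    // A concurrent request won the insert; return its row instead of failing.
    const winner = await repo.find(projectId, externalSessionId);
    if (!winner) throw err;
    return winner;
  }
}
```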

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Alex Newman
2026-05-11 00:26:11 -07:00
committed by GitHub
parent a10d1b342f
commit e7bbb2a9aa
72 changed files with 13901 additions and 982 deletions
+9 -1
@@ -55,5 +55,13 @@ RUN chmod +x /usr/local/bin/claude-mem-entrypoint
USER node
WORKDIR /home/node
# Phase 10 — server-beta runtime is the only foregrounded process. The legacy
# worker binaries remain available for tooling but are NEVER spawned by the
# entrypoint. Mode selection happens via CLAUDE_MEM_CONTAINER_MODE
# (server | worker | shell). `docker run ... bash` works because shell mode
# falls through to "$@".
ENV CLAUDE_MEM_CONTAINER_MODE=server
ENV CLAUDE_MEM_RUNTIME=server-beta
ENTRYPOINT ["/usr/local/bin/claude-mem-entrypoint"]
CMD ["bash"]
CMD []
+44 -1
@@ -1,5 +1,11 @@
#!/usr/bin/env bash
# Phase 10 — server-beta container entrypoint. The container ALWAYS runs the
# server-beta runtime; the legacy worker is never started here. Generation can
# be split into a separate `claude-mem server worker start` process by setting
# CLAUDE_MEM_GENERATION_DISABLED=true on this service and running the worker
# command in a sibling container.
set -euo pipefail
mkdir -p "$HOME/.claude" "$HOME/.claude-mem"
@@ -15,4 +21,41 @@ fi
export PATH="/usr/local/bun/bin:/usr/local/share/npm-global/bin:$PATH"
exec "$@"
# Mark this process tree as running inside Docker so server-beta env
# validation can refuse local-dev auth and require the full Postgres+Valkey
# configuration. /.dockerenv is also detected automatically; this is belt-
# and-suspenders for runtimes that don't expose it.
export CLAUDE_MEM_DOCKER=1
export CLAUDE_MEM_RUNTIME="${CLAUDE_MEM_RUNTIME:-server-beta}"
SERVER_BETA_SCRIPT="/opt/claude-mem/scripts/server-beta-service.cjs"
# Mode selection:
# CLAUDE_MEM_CONTAINER_MODE=server (default) — HTTP server-beta, no worker
# CLAUDE_MEM_CONTAINER_MODE=worker — BullMQ generation worker only
# CLAUDE_MEM_CONTAINER_MODE=shell — fall through to "$@" for tooling
MODE="${CLAUDE_MEM_CONTAINER_MODE:-server}"
case "$MODE" in
server)
echo "[claude-mem] starting server-beta runtime (HTTP, no legacy worker)" >&2
exec bun "$SERVER_BETA_SCRIPT" --daemon
;;
worker)
echo "[claude-mem] starting server-beta generation worker (no HTTP)" >&2
# Force generation enabled in the worker process even if the env var was
# set on the shared compose file; the worker IS the generation process.
unset CLAUDE_MEM_GENERATION_DISABLED
exec bun "$SERVER_BETA_SCRIPT" worker start
;;
shell|tooling)
if [[ $# -eq 0 ]]; then
exec bash
fi
exec "$@"
;;
*)
echo "ERROR: unknown CLAUDE_MEM_CONTAINER_MODE=$MODE (expected: server, worker, shell)" >&2
exit 1
;;
esac
+121 -126
@@ -1,6 +1,18 @@
// Phase 10 — Docker E2E driver for server-beta. Verifies the
// runtime-relevant slice of the API actually shipped in the Postgres routes:
//
// - GET /healthz — server is alive
// - GET /api/readiness — Postgres bootstrap completed
// - GET /api/health — BullMQ queue engine is bullmq + redis ok
// - POST /v1/sessions/start, /v1/sessions/:id/end
// - POST /v1/events?wait=true (returns generationJob descriptor)
// - GET /v1/events/:id — read-back via team scope
// - GET /v1/jobs/:id — generation job status
// - 401/403 paths for missing/invalid/revoked keys
import net from 'node:net';
const baseUrl = process.env.E2E_BASE_URL ?? 'http://claude-mem-server:37777';
const baseUrl = process.env.E2E_BASE_URL ?? 'http://claude-mem-server:37877';
const redisHost = process.env.E2E_REDIS_HOST ?? 'valkey';
const redisPort = Number.parseInt(process.env.E2E_REDIS_PORT ?? '6379', 10);
const phase = process.env.E2E_PHASE ?? 'phase1';
@@ -8,7 +20,7 @@ const apiKey = requiredEnv('E2E_API_KEY');
const readOnlyKey = process.env.E2E_READ_ONLY_API_KEY ?? '';
const revokedKey = process.env.E2E_REVOKED_API_KEY ?? '';
const runId = process.env.E2E_RUN_ID ?? `e2e-${Date.now()}`;
const projectRoot = `/tmp/claude-mem-server-beta-${runId}`;
const projectIdFromEnv = process.env.E2E_PROJECT_ID ?? '';
function requiredEnv(key) {
const value = process.env[key];
@@ -112,63 +124,68 @@ async function assertQueueHealth() {
assert(response.ok, `/api/health expected OK, got ${response.status}`);
assert(body.queue?.engine === 'bullmq', `expected BullMQ queue engine, got ${JSON.stringify(body.queue)}`);
assert(body.queue?.redis?.status === 'ok', `expected Redis health ok, got ${JSON.stringify(body.queue?.redis)}`);
assert(body.queue?.redis?.mode === 'docker', `expected docker Redis mode, got ${JSON.stringify(body.queue?.redis)}`);
}
async function assertInfoEndpoint() {
const { response, body } = await requestJson('/v1/info');
assert(response.ok, `/v1/info expected OK, got ${response.status}`);
assert(body.runtime === 'server-beta', `expected runtime=server-beta, got ${body.runtime}`);
assert(body.postgres?.initialized === true, `expected postgres.initialized=true, got ${JSON.stringify(body.postgres)}`);
assert(body.boundaries?.queueManager?.status === 'active', `expected queue manager active, got ${JSON.stringify(body.boundaries?.queueManager)}`);
}
async function phase1() {
console.log(`[e2e] phase1 starting (${runId})`);
await waitForReadiness();
await assertQueueHealth();
await assertInfoEndpoint();
await assertRedisPing();
await expectStatus('/v1/projects', 401, {
// Auth — missing key returns 401, invalid key returns 403. Auth runs
// before body validation, so the body content is irrelevant here.
await expectStatus('/v1/sessions/start', 401, {
method: 'POST',
json: { name: 'unauthenticated' },
json: { projectId: projectIdFromEnv, contentSessionId: 'unauth' },
});
await expectStatus('/v1/projects', 403, {
await expectStatus('/v1/sessions/start', 403, {
method: 'POST',
apiKey: 'cmem_invalid_key',
json: { name: 'invalid' },
apiKey: 'cmem_invalid_key_for_e2e',
json: { projectId: projectIdFromEnv, contentSessionId: 'invalid' },
});
// Read-only key cannot write.
if (readOnlyKey) {
await expectStatus('/v1/projects', 403, {
await expectStatus('/v1/sessions/start', 403, {
method: 'POST',
apiKey: readOnlyKey,
json: { name: 'read-only denied' },
json: { projectId: projectIdFromEnv, contentSessionId: `readonly-${runId}` },
});
const readOnlyProjects = await request('/v1/projects', { apiKey: readOnlyKey });
assert(readOnlyProjects.ok, `read-only key should read projects, got ${readOnlyProjects.status}`);
}
const createdProject = await requestJson('/v1/projects', {
// Open a session. projectId is required in the body and must match the
// project the api-key is scoped to (passed in via E2E_PROJECT_ID).
assert(projectIdFromEnv, 'E2E_PROJECT_ID is required for phase1');
const sessionRes = await requestJson('/v1/sessions/start', {
apiKey,
json: {
name: `Server Beta E2E ${runId}`,
rootPath: projectRoot,
metadata: { runId },
},
});
assert(createdProject.response.status === 201, `project create failed: ${JSON.stringify(createdProject.body)}`);
const project = createdProject.body.project;
assert(project?.id, 'project response missing id');
const createdSession = await requestJson('/v1/sessions/start', {
apiKey,
json: {
projectId: project.id,
projectId: projectIdFromEnv,
contentSessionId: `content-${runId}`,
memorySessionId: `memory-${runId}`,
platformSource: 'docker-e2e',
title: 'Docker E2E session',
},
});
assert(createdSession.response.status === 201, `session create failed: ${JSON.stringify(createdSession.body)}`);
const session = createdSession.body.session;
assert(sessionRes.response.status === 201, `session create failed: ${sessionRes.response.status} ${JSON.stringify(sessionRes.body)}`);
const session = sessionRes.body.session;
assert(session?.id, `session response missing id: ${JSON.stringify(sessionRes.body)}`);
const projectId = session.projectId;
assert(projectId, `session missing projectId: ${JSON.stringify(session)}`);
const createdEvent = await requestJson('/v1/events', {
// POST /v1/events?wait=true — returns a generationJob descriptor on
// success. This is the Phase 10 contract: HTTP path returns the queued
// job, and the worker process generates the observation later.
const createdEvent = await requestJson('/v1/events?wait=true', {
apiKey,
json: {
projectId: project.id,
projectId,
serverSessionId: session.id,
sourceType: 'api',
eventType: 'observation.created',
@@ -178,124 +195,102 @@ async function phase1() {
occurredAtEpoch: Date.now(),
},
});
assert(createdEvent.response.status === 201, `event create failed: ${JSON.stringify(createdEvent.body)}`);
assert(
createdEvent.response.status === 201,
`event create failed: ${createdEvent.response.status} ${JSON.stringify(createdEvent.body)}`,
);
const event = createdEvent.body.event;
assert(event?.id, `event response missing id: ${JSON.stringify(createdEvent.body)}`);
// wait=true MUST include a generationJob descriptor (queued or generated).
// Its absence indicates the queue path was bypassed.
assert(
createdEvent.body.generationJob !== undefined && createdEvent.body.generationJob !== null,
`wait=true response missing generationJob: ${JSON.stringify(createdEvent.body)}`,
);
const batchEvents = await requestJson('/v1/events/batch', {
apiKey,
json: [
{
projectId: project.id,
sourceType: 'api',
eventType: 'observation.created',
payload: { index: 1, runId },
occurredAtEpoch: Date.now(),
},
{
projectId: project.id,
sourceType: 'api',
eventType: 'observation.created',
payload: { index: 2, runId },
occurredAtEpoch: Date.now(),
},
],
});
assert(batchEvents.response.status === 201, `event batch failed: ${JSON.stringify(batchEvents.body)}`);
assert(batchEvents.body.events.length === 2, 'event batch did not return two events');
// Read-back through the team-scoped GET /v1/events/:id route.
const fetched = await requestJson(`/v1/events/${event.id}`, { apiKey });
assert(fetched.response.ok, `event fetch failed: ${fetched.response.status} ${JSON.stringify(fetched.body)}`);
const fetchedEvent = await requestJson(`/v1/events/${event.id}`, { apiKey });
assert(fetchedEvent.response.ok, `event fetch failed: ${JSON.stringify(fetchedEvent.body)}`);
// Poll the generation job — it MUST exist in Postgres regardless of
// whether a provider is configured. Without a provider, status stays at
// `queued`; with one, it eventually becomes `generated`. Either way the
// job row is observable via GET /v1/jobs/:id.
const jobId = createdEvent.body.generationJob.id;
if (jobId) {
const jobRes = await requestJson(`/v1/jobs/${jobId}`, { apiKey });
assert(jobRes.response.ok, `job fetch failed: ${jobRes.response.status} ${JSON.stringify(jobRes.body)}`);
}
const createdMemory = await requestJson('/v1/memories', {
apiKey,
json: {
projectId: project.id,
serverSessionId: session.id,
kind: 'manual',
type: 'decision',
title: `Docker E2E memory ${runId}`,
narrative: `Server beta Docker E2E memory survives restart for ${runId}.`,
facts: ['BullMQ health is backed by Valkey', `run:${runId}`],
concepts: ['server-beta', 'docker-e2e'],
metadata: { runId },
},
});
assert(createdMemory.response.status === 201, `memory create failed: ${JSON.stringify(createdMemory.body)}`);
const memory = createdMemory.body.memory;
const patchedMemory = await requestJson(`/v1/memories/${memory.id}`, {
method: 'PATCH',
apiKey,
json: {
projectId: project.id,
kind: 'manual',
type: 'decision',
narrative: `Patched Docker E2E memory survives restart for ${runId}.`,
facts: ['patched', `run:${runId}`],
},
});
assert(patchedMemory.response.ok, `memory patch failed: ${JSON.stringify(patchedMemory.body)}`);
assert(patchedMemory.body.memory.narrative.includes('Patched'), 'patched memory narrative was not returned');
const fetchedMemory = await requestJson(`/v1/memories/${memory.id}`, { apiKey });
assert(fetchedMemory.response.ok, `memory fetch failed: ${JSON.stringify(fetchedMemory.body)}`);
const search = await requestJson('/v1/search', {
apiKey,
json: { projectId: project.id, query: runId, limit: 10 },
});
assert(search.response.ok, `search failed: ${JSON.stringify(search.body)}`);
assert(search.body.memories.some(item => item.id === memory.id), 'search did not return created memory');
const context = await requestJson('/v1/context', {
apiKey,
json: { projectId: project.id, query: 'patched', limit: 5 },
});
assert(context.response.ok, `context failed: ${JSON.stringify(context.body)}`);
assert(context.body.context.includes(runId), 'context did not include created memory text');
const endedSession = await requestJson(`/v1/sessions/${session.id}/end`, {
// Close the session.
const ended = await requestJson(`/v1/sessions/${session.id}/end`, {
method: 'POST',
apiKey,
json: {},
});
assert(endedSession.response.ok, `session end failed: ${JSON.stringify(endedSession.body)}`);
assert(endedSession.body.session.status === 'completed', 'session did not complete');
assert(ended.response.ok, `session end failed: ${ended.response.status} ${JSON.stringify(ended.body)}`);
const audit = await requestJson(`/v1/audit?projectId=${encodeURIComponent(project.id)}`, { apiKey });
assert(audit.response.ok, `audit failed: ${JSON.stringify(audit.body)}`);
assert(audit.body.audit.some(row => row.action === 'memory.write'), 'audit log missing memory.write');
console.log(`[e2e] phase1 passed project=${project.id} memory=${memory.id}`);
console.log(`[e2e] phase1 passed session=${session.id} event=${event.id} job=${jobId ?? 'none'}`);
}
async function phase2() {
console.log(`[e2e] phase2 after restart starting (${runId})`);
await waitForReadiness();
await assertQueueHealth();
await assertInfoEndpoint();
await assertRedisPing();
// Revoked key MUST fail on every authenticated route. The restart between
// phase1 and phase2 specifically asserts the revocation lives in Postgres,
// not an in-memory cache.
if (revokedKey) {
await expectStatus('/v1/projects', 403, { apiKey: revokedKey });
await expectStatus('/v1/sessions/start', 403, {
method: 'POST',
apiKey: revokedKey,
json: { projectId: projectIdFromEnv, contentSessionId: `revoked-${runId}` },
});
}
const projects = await requestJson('/v1/projects', { apiKey });
assert(projects.response.ok, `project list failed after restart: ${JSON.stringify(projects.body)}`);
const project = projects.body.projects.find(item => item.rootPath === projectRoot);
assert(project?.id, `persisted project not found for ${projectRoot}`);
const search = await requestJson('/v1/search', {
// Full key still works after restart — durable session creation + event
// ingest path through Postgres.
assert(projectIdFromEnv, 'E2E_PROJECT_ID is required for phase2');
const sessionRes = await requestJson('/v1/sessions/start', {
apiKey,
json: { projectId: project.id, query: runId, limit: 10 },
json: {
projectId: projectIdFromEnv,
contentSessionId: `content-after-restart-${runId}`,
platformSource: 'docker-e2e',
},
});
assert(search.response.ok, `search failed after restart: ${JSON.stringify(search.body)}`);
assert(search.body.memories.some(item => String(item.narrative ?? '').includes(runId)), 'persisted memory not found after restart');
assert(
sessionRes.response.status === 201,
`session create after restart failed: ${sessionRes.response.status} ${JSON.stringify(sessionRes.body)}`,
);
const session = sessionRes.body.session;
const projectId = session.projectId;
const audit = await requestJson(`/v1/audit?projectId=${encodeURIComponent(project.id)}`, { apiKey });
assert(audit.response.ok, `audit failed after restart: ${JSON.stringify(audit.body)}`);
assert(audit.body.audit.length > 0, 'audit log did not persist after restart');
const createdEvent = await requestJson('/v1/events?wait=true', {
apiKey,
json: {
projectId,
serverSessionId: session.id,
sourceType: 'api',
eventType: 'observation.created',
contentSessionId: `content-after-restart-${runId}`,
payload: { tool_name: 'Edit', runId, after: 'restart' },
occurredAtEpoch: Date.now(),
},
});
assert(
createdEvent.response.status === 201,
`event after restart failed: ${createdEvent.response.status} ${JSON.stringify(createdEvent.body)}`,
);
assert(
createdEvent.body.generationJob !== undefined && createdEvent.body.generationJob !== null,
`wait=true after restart missing generationJob: ${JSON.stringify(createdEvent.body)}`,
);
console.log(`[e2e] phase2 passed project=${project.id}`);
console.log(`[e2e] phase2 passed session=${session.id} event=${createdEvent.body.event.id}`);
}
if (phase === 'phase1') {