server-beta: Phases 4–13 — event pipeline, generation, MCP, compat, Docker, team audit, observability (#2383)

* feat(server-beta): Phase 4 — Postgres event-to-generation-job pipeline

Adds POST /v1/events, /v1/events/batch, GET /v1/jobs/:id, GET /v1/events/:id, and POST /v1/memories on the server-beta runtime, backed by Postgres.

- Event row + outbox generation-job row insert in one withPostgresTransaction.
- BullMQ enqueue happens after commit; enqueue failure leaves the row queued for Phase 3 startup reconciliation.
- ?generate=false skips the outbox; ?wait=true returns queue status only, never observation IDs (provider generation is Phase 5).
- Batch pre-validates all event projectIds against api-key scope before any write; mixed-project batches reject 403 with zero side effects.
- /v1/memories is a direct insert alias — no generator, no outbox.
- Cross-tenant /v1/jobs/:id returns 404 to avoid leaking row existence.
- New PostgresAuthMiddleware reads api_keys by SHA-256 hash; populates req.authContext.teamId/projectId; legacy ServerV1Routes (SQLite, used by worker runtime) is left untouched.
- Tests: unit suite hardened with stubbed pool.query so route registration is safe; integration tests skip cleanly without CLAUDE_MEM_TEST_POSTGRES_URL.

Verification: 87 pass / 1 skip / 0 fail. No new typecheck errors. Required greps for WorkerService and MemoryItemsRepository in src/server/routes/v1 and src/server/runtime return no hits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 5 — provider observation generator

Adds independent provider generation under src/server/generation/ with no worker coupling. Server beta can now generate observations end-to-end: event -> outbox -> BullMQ -> provider -> parser -> persisted observation.

- ProviderObservationGenerator orchestrates: lock outbox (queued -> processing), reload agent_event from Postgres (BullMQ payload is advisory only), call provider, hand raw text to processGeneratedResponse, route errors via markGenerationFailed with retryable flag from ServerClassifiedProviderError.
- processGeneratedResponse parses with parseAgentXml, persists via PostgresObservationRepository with deterministic generation_key = generation:v1:{job_id}:{index}:{fingerprint}, links via PostgresObservationSourcesRepository, advances outbox status, appends observation_generation_job_events, audits — all in one withPostgresTransaction. Idempotent on retry via UNIQUE constraints.
- Three provider adapters under src/server/generation/providers/: Claude, Gemini, OpenRouter. Self-contained — no imports from src/services/worker/*. Worker providers unchanged.
- Shared error classification + prompt builder under providers/shared/. Prompt builder strips <private> at the edge; fully-private batches emit <skip_summary /> without billing the provider.
- ActiveServerBetaGenerationWorkerManager wires BullMQ Worker via ServerJobQueue.start(...) with concurrency 1 + autorun:false + worker.on('error') per BullMQ docs.
- New GET /v1/events/:id/observations on ServerV1PostgresRoutes returns observations linked via observation_sources, team/project scoped.

Verification: 104 pass / 4 skip / 0 fail. No typecheck regressions. Anti-pattern greps clean for services/worker imports under src/server, WorkerRef/ActiveSession/SessionStore in src/server/generation.

Deferred: ModeManager loading uses a stable fallback observation type list; summary and reindex queue lanes are not yet wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 6 — independent server session semantics

server_sessions is now the canonical Server beta session model. Sessions are independent of legacy worker ActiveSession state.

- PostgresServerSessionRepository extended: findByExternalIdForScope, endSession (idempotent via COALESCE(ended_at, now())), markGenerationStarted/Completed/Failed, listUnprocessedEvents (filters agent_events with completed agent_event jobs).
- ServerSessionRuntimeRepository wraps the repo; every method requires explicit team_id + project_id and validates scope via assertProjectOwnership.
- SessionGenerationPolicy supports per-event (default), debounce (BullMQ delayed-job replace via getJob+remove+add), and end-of-session. Configured via CLAUDE_MEM_SERVER_SESSION_POLICY and CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS env vars; per-team override hooks are exposed on ServerV1PostgresRoutesOptions for future settings layer.
- POST /v1/sessions/start (find-or-create on (project_id, external_session_id)), GET /v1/sessions/:id (scoped 404), POST /v1/sessions/:id/end (transactional: end + create summary outbox via UNIQUE collapse + enqueue post-commit). Re-ending is fully idempotent.
- processSessionSummaryResponse persists summary as kind='summary' observation with the same idempotency model (generation_key + observation_sources UNIQUE).
- ProviderObservationGenerator dispatches on source_type: agent_event -> processGeneratedResponse, session_summary -> processSessionSummaryResponse; loadEvents handles session-summary by loading unprocessed events.
- ActiveServerBetaGenerationWorkerManager wires summary BullMQ lane alongside event lane (concurrency=1, autorun=false, error listener attached per BullMQ docs).

Verification: 110 pass / 6 skip / 0 fail. Net typecheck error count unchanged at 24 (pre-existing, none in Phase 6 files). Anti-pattern greps clean for ActiveSession/SessionStore in src/server/runtime, no worker imports anywhere in src/server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 7 — hook routing without worker dependency

Hooks can now talk directly to server-beta when CLAUDE_MEM_RUNTIME=server-beta is selected, with a clean worker fallback when server-beta is unhealthy.

- src/services/hooks/server-beta-client.ts — typed HTTP client for /v1/sessions/start, /v1/events, /v1/sessions/:id/end.
  Throws ServerBetaClientError with kind classification (missing_api_key, transport, timeout, http_error, invalid_response) and isFallbackEligible helper. Zero imports from services/worker/.
- src/services/hooks/runtime-selector.ts — reads CLAUDE_MEM_RUNTIME from settings, returns worker or server-beta context, logs [server-beta-fallback] reason=<code> on every config-time fallback.
- src/services/hooks/server-beta-bootstrap.ts — Postgres-backed API key bootstrap. Find-or-creates local-hook-team + local-hook-project, generates cmem_<random> key (SHA-256 hashed), inserts into api_keys with scopes events:write/sessions:write/observations:read/jobs:read. Settings file written with chmod 0600. rotateServerBetaApiKey() wired to a new `claude-mem server keys rotate` command.
- src/cli/handlers/{observation,session-init,summarize}.ts — every hook handler tries server-beta first when configured, falls through to the existing worker path on transport/5xx/429/missing-key. One WARN line per fallback. Hook JSON output shape unchanged.
- src/shared/SettingsDefaultsManager.ts — three new keys with defaults: CLAUDE_MEM_SERVER_BETA_URL, CLAUDE_MEM_SERVER_BETA_API_KEY, CLAUDE_MEM_SERVER_BETA_PROJECT_ID.
- src/npx-cli/commands/install.ts — when installer selects server-beta runtime and CLAUDE_MEM_SERVER_DATABASE_URL is set, bootstraps a local API key automatically. Warns and continues if the DB URL is missing.

plugin/scripts/*.cjs bundles rebuilt via npm run build to pick up the new hook handler code path. No plaintext keys in the bundle (verified).

Verification: 16 hook unit tests pass; 275 server/storage/services tests pass with 7 pre-existing failures (verified independent of this change via git stash --include-untracked). Build clean. No new typecheck errors in Phase 7 files.
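
Illustration only, not code from this change: a minimal sketch of how the Phase 7 fallback decision could look, given the error kinds and the transport/5xx/429/missing-key rule listed above. The class layout, the `status` field, and the choice to treat `timeout` like a transport failure are assumptions; the real ServerBetaClientError and isFallbackEligible may differ.

```typescript
// Hypothetical shapes: the commit names the error kinds but not the class layout.
type ServerBetaErrorKind =
  | 'missing_api_key'
  | 'transport'
  | 'timeout'
  | 'http_error'
  | 'invalid_response';

class ServerBetaClientError extends Error {
  readonly kind: ServerBetaErrorKind;
  // Assumed field: only meaningful when kind === 'http_error'.
  readonly status?: number;

  constructor(kind: ServerBetaErrorKind, message: string, status?: number) {
    super(message);
    this.name = 'ServerBetaClientError';
    this.kind = kind;
    this.status = status;
  }
}

// Decide whether a hook handler should fall through to the worker path.
// Eligible per the commit text: transport errors, a missing key, HTTP 429,
// and HTTP 5xx. Counting 'timeout' as eligible is an assumption.
function isFallbackEligible(err: unknown): boolean {
  if (!(err instanceof ServerBetaClientError)) return false;
  switch (err.kind) {
    case 'missing_api_key':
    case 'transport':
    case 'timeout':
      return true;
    case 'http_error':
      return (
        err.status === 429 ||
        (err.status !== undefined && err.status >= 500 && err.status < 600)
      );
    default:
      // invalid_response and anything unknown: surface the error instead.
      return false;
  }
}
```

Under this sketch a 4xx other than 429 is never retried against the worker, on the reasoning that a rejected request would most likely be rejected there too.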

Anti-pattern guards verified:
- /api/sessions/observations only reached via explicit fallback path
- server-beta runtime never starts the worker process
- API keys live only in ~/.claude-mem/settings.json (chmod 0600), never in the bundle (grep confirmed)
- Worker fallback preserved, observable via single WARN line per call

Deferred: semantic context injection (UserPromptSubmit hook) stays worker-only; server-beta does not yet expose /v1/context/semantic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 8 — MCP backed by server-beta core

MCP tools now route through server-beta in server-beta mode while keeping worker-mode search/timeline/get_observations tools fully working.

- src/servers/mcp-server.ts — five new observation_* tools registered: observation_add, observation_record_event, observation_search, observation_context, observation_generation_status. Three memory_* compatibility aliases delegate to the canonical handlers. Worker auto-start is gated when selectRuntime() === 'server-beta' so MCP in server-beta mode never spawns the worker.
- src/services/hooks/server-beta-client.ts — addObservation, searchObservations, contextObservations, getJobStatus added so MCP shares one transport with hooks (Phase 7).
- src/server/routes/v1/ServerV1PostgresRoutes.ts — POST /v1/search and POST /v1/context REST cores backed by PostgresObservationRepository full-text search (GIN tsvector from Phase 1).
- Existing memory_search/timeline/get_observations tools call callWorkerAPI unchanged in worker mode; worker tests unaffected.

Verification: 39 pass / 4 skip / 0 fail on targeted suite. Pre-existing 7 baseline failures verified independent (git stash). No new typecheck errors. WorkerService grep clean across src/servers/mcp-server.ts and src/server/.

Anti-pattern guards verified:
- No duplicate generation logic in MCP — observation_record_event hits /v1/events which owns event+outbox+enqueue inside one tx
- WorkerService not imported anywhere under MCP server-beta path
- No hardcoded worker URLs — all transport via Phase 7 ServerBetaClient
- memory_* aliases retained, single handler per pair

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 9 — compatibility adapters without coupling

Legacy /api/sessions/observations and /api/sessions/summarize endpoints keep working on server-beta runtime by translating to AgentEvent and session-end calls — no worker code, no route duplication.

- src/server/services/IngestEventsService.ts — shared event-ingest path used by both /v1/events and the compat adapter. Owns transactional event row + outbox row + lifecycle log + post-commit BullMQ enqueue, honors Phase 6 SessionGenerationPolicy.
- src/server/services/EndSessionService.ts — shared session-end path used by both /v1/sessions/:id/end and the compat adapter. Idempotent ended_at + summary outbox + deterministic summary job id.
- src/server/compat/SessionsObservationsAdapter.ts — translates legacy POST /api/sessions/observations payload (Claude Code transcript shape) -> AgentEvent (source_adapter='claude-code-compat', event_type='tool_use') -> IngestEventsService.ingestOne. Resolves contentSessionId to server_sessions via find-or-create.
- src/server/compat/SessionsSummarizeAdapter.ts — translates legacy POST /api/sessions/summarize -> EndSessionService.end. Preserves the legacy agentId -> {status:'skipped', reason:'subagent_context'} behavior so existing clients see the same response shape.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — refactored to delegate to the new shared services (-203 LoC net) so /v1 and /api compat both call the SAME canonical code path.
- src/server/runtime/ServerBetaService.ts — registers both compat adapters alongside ServerV1PostgresRoutes, sharing service instances.
- docs/server-beta-parity-map.md — full enumeration of legacy /api/* routes labeled native, adapter, or unsupported (with reasons). Viewer read-path adapters explicitly listed as unsupported pending a future viewer-rewrite phase.

Verification: 7 compat tests pass, 6 v1-routes tests still pass (refactor preserved behavior), 4 session-routes tests pass. Pre-existing 16 baseline failures verified independent via git stash. Zero new typecheck errors.

Anti-pattern guards verified:
- No services/worker/http/routes or WorkerService imports under src/server/compat or src/server/runtime
- Compat adapters are thin translators with names ending in *Adapter and a top-of-file comment noting they are legacy compatibility
- /v1/* remains the canonical Server beta API; compat adapters call shared services rather than acting as a parallel API

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 10 — Docker stack and deployable runtime

Server beta now ships as a Docker stack with no worker process anywhere and a separate horizontal generation worker for scaling.

- src/server/runtime/create-server-beta-service.ts — validateServerBetaEnv() fails fast on missing CLAUDE_MEM_SERVER_DATABASE_URL, requires CLAUDE_MEM_QUEUE_ENGINE=bullmq in Docker, rejects CLAUDE_MEM_AUTH_MODE=local-dev and CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS inside containers (detected via /.dockerenv or CLAUDE_MEM_DOCKER=1). Adds CLAUDE_MEM_GENERATION_DISABLED so the HTTP service can run generator-free.
- src/server/runtime/ServerBetaService.ts — runServerBetaGenerationWorker for the dedicated consumer process; runServerBetaApiKeyCli is a new Postgres-backed `server api-key` command (the legacy worker CLI wrote to SQLite and was invisible to the Postgres runtime); getQueueHealth shim feeds /api/health a consistent ObservationQueueHealth shape.
- src/npx-cli/commands/{runtime,server}.ts — `claude-mem server worker start` subcommand that boots only the BullMQ consumer.
- docker/claude-mem/{Dockerfile,entrypoint.sh} — entrypoint forces CLAUDE_MEM_DOCKER=1 + CLAUDE_MEM_RUNTIME=server-beta and exposes three modes: server (HTTP only, generation disabled), worker (BullMQ consumer), shell. Worker bundle is no longer the default CMD.
- docker-compose.yml — full stack: postgres + valkey + claude-mem-server (HTTP-only) + claude-mem-worker (generation consumer). Wires service-to-service env vars.
- scripts/e2e-server-beta-docker.sh + docker/e2e/server-beta-e2e.mjs — E2E now hits /v1/sessions/start, /v1/events?wait=true, /v1/jobs/:id; asserts no worker-service.cjs process anywhere in the stack; one-shot docker compose run --rm verifies local-dev auth is rejected with the expected stderr; restart-and-verify confirms Postgres durability and BullMQ retry idempotency.
- docs/server.md — full Phase 10 doc: stack diagram, env table, worker mode, auth-in-Docker policy.
- docs/api.md — event generation semantics (wait=true, generationJob).

Verification: full Docker E2E PASSED on live daemon (phase1 + phase2 + restart-and-verify + revoked-key + no-worker-process + local-dev-rejected). Unit tests 292 pass / 9 skip / 7 fail (7 fails pre-existing baseline). Zero new typecheck errors.

Anti-pattern guards verified:
- entrypoint never execs worker-service.cjs; E2E greps prove no worker process anywhere in the stack
- validateServerBetaEnv refuses local-dev auth in Docker with explicit remediation message; ALLOW_LOCAL_DEV_BYPASS rejected the same way
- Docker requires CLAUDE_MEM_QUEUE_ENGINE=bullmq; in-process queue rejected at startup
- claude-mem worker / worker-service / WorkerService greps clean in docker/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 11 — team-aware generation with audit chain

Generation jobs now carry team_id/project_id/api_key_id/actor_id/source_adapter from enqueue through execution; the outbox is reloaded from Postgres before any side effect so BullMQ payload can never act as auth authority.

- src/server/jobs/types.ts — ServerGenerationJobPayloadSchema (Zod discriminated union) requires team_id, project_id, generation_job_id, source_adapter, api_key_id, actor_id (nullable), source_type, source_id, plus event_id / server_session_id per kind. assertServerGenerationJobPayload is called at enqueue (outbox.ts) and again at execution boundary.
- src/server/services/{IngestEventsService,EndSessionService}.ts + SessionGenerationPolicy.ts — thread identity context (apiKeyId, actorId, sourceAdapter) into both event and summary BullMQ payloads.
- src/server/generation/ProviderObservationGenerator.ts — loadCanonicalOutbox loads the outbox row WITHOUT scope filter, then compares candidate.team_id/project_id to payload.team_id/project_id; mismatch -> ServerGenerationScopeViolationError (non-retryable), failed status, generation_job.scope_violation audit. isApiKeyRevoked checks api_keys (revoked_at, expires_at, row missing) before any provider call; revoked -> generation_job.revoked_key audit + non-retryable failure. generation_job.processing audit emitted on lock.
- src/server/generation/processGeneratedResponse.ts — generated observations carry team_id/project_id/server_session_id from the reloaded source row (not job payload). observation_sources.metadata records source_adapter, actor_id, api_key_id for traceability. observation.created audit per observation; generation_job.completed audit per terminal transition. All audit rows reference the same generation_job_id in details.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — GET /v1/teams/:id/jobs and GET /v1/projects/:id/jobs with SQL-layer scoping (WHERE team_id=$1 [AND project_id=$2] [AND status=$3]); cross-tenant returns 404 to avoid leaking row existence. Pagination via status/limit/offset. audit_log rows for event.received, event.batch_received, observation.read.
- src/server/compat/{SessionsObservationsAdapter,SessionsSummarizeAdapter}.ts — propagate apiKeyId and sourceAdapter='claude-code-compat'.

Verification: 162 pass / 10 skip / 0 fail. Pre-existing failures in tests/services/queue and tests/services/worker confirmed independent via git stash. Zero new typecheck errors in server-beta files.

Required greps:
  rg "team_id.*req\.body|project_id.*req\.body" src/server -> 0 matches

Audit chain integration test passes — generation_job.processing, observation.created, and generation_job.completed audit rows all share the same generation_job_id reference.

Anti-pattern guards verified:
- BullMQ payload never acts as auth authority — Postgres outbox reload with mismatch check happens before every side effect
- team_id / project_id never derived from request body for scope decisions; always req.authContext.teamId / projectId
- Application-layer team/project filtering forbidden — listJobsForScope pushes scope into the SQL WHERE clause
- Project-scoped key on cross-project /v1/teams/:id/jobs returns 404
- Revoked api keys cause non-retryable failure with audit before any provider call

Deferred: a redundant generation_job.queued audit_log row (already covered by observation_generation_job_events lifecycle log per Phase 1 schema split). Compat adapters set actor_id=null but propagate api_key_id which is the canonical reference downstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 12 — observability and operations

Operators can now inspect, retry, and cancel generation jobs from the CLI; queue lane metrics flow into /api/health and /v1/info; every request gets a stable request_id that flows through HTTP -> audit -> outbox -> generator -> completion log.

- src/server/middleware/request-id.ts — honors safe inbound X-Request-Id, mints uuid v4 otherwise. Set on req.requestId and echoed via response header so external traces can correlate.
- src/server/jobs/ServerJobQueue.ts — QueueEvents wired with completed, failed, progress, stalled, error listeners; lifecycle counters exposed via observe() API. Logs emitted as [generation] job=<id> source_type=<...> duration=<ms> attempts=<N> reason=<message>. Stalled and error counters survive worker restart.
- src/server/jobs/types.ts — ServerGenerationJob payload schema extended with optional request_id; flows through from HTTP into every BullMQ job.
- src/server/queue/ObservationQueueEngine.ts — health snapshot now carries per-lane (event, summary) counts via ObservationQueueHealthLaneSnapshot.
- src/server/runtime/{ActiveServerBetaQueueManager,ActiveServerBetaGenerationWorkerManager,ServerBetaService}.ts — per-lane getJobCounts feed /api/health and /v1/info; stalled events audit through audit_log with action generation_job.stalled.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — GET /v1/jobs (status/source_type/since/limit/offset, scope from api-key, payload stripped unless ?include=payload AND admin scope), POST /v1/jobs/:id/retry (idempotent; queued -> no-op; audit generation_job.retried_by_operator), POST /v1/jobs/:id/cancel (terminal -> no-op; audit generation_job.cancelled_by_operator; generator reload-before-side-effects already prevents double work).
- src/server/services/IngestEventsService.ts + SessionGenerationPolicy.ts + ProviderObservationGenerator.ts — request_id propagated end to end. Generator extracts request_id from BullMQ payload and includes it in lock/processing/completion logs and audit details.
- src/npx-cli/commands/server-jobs.ts + src/npx-cli/commands/server.ts — `claude-mem server jobs status|failed|retry|cancel`. status compares Postgres outbox counts to BullMQ queue counts and surfaces divergence. failed prints attempts + last_error message. --team and --project filters.

Verification: 350 pass / 12 skip / 7 fail (pre-existing baseline, verified independent via git stash). 18 new tests added (request-id middleware, server-jobs CLI seams, jobs list/retry/cancel routes Postgres-gated). Zero new typecheck errors.
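
Illustration only, not code from this change: a minimal sketch of the request-id rule described for Phase 12 (honor a safe inbound X-Request-Id, mint a UUID v4 otherwise). The function name `resolveRequestId` and the definition of "safe" (a short token of URL-safe characters) are assumptions; the commit message does not state the actual criteria.

```typescript
import { randomUUID } from 'node:crypto';

// Assumed definition of "safe": 1 to 128 characters from a conservative
// URL-safe alphabet, so the value can be logged and echoed in a header
// without escaping.
const SAFE_REQUEST_ID = /^[A-Za-z0-9._-]{1,128}$/;

// Hypothetical helper: returns the inbound id when it passes the safety
// check, otherwise a freshly minted UUID v4.
function resolveRequestId(inbound: string | string[] | undefined): string {
  // Node may present repeated headers as an array; take the first value.
  const candidate = Array.isArray(inbound) ? inbound[0] : inbound;
  if (candidate !== undefined && SAFE_REQUEST_ID.test(candidate)) {
    return candidate;
  }
  return randomUUID();
}
```

In an Express middleware the result would be assigned to `req.requestId` and echoed in the response header, as the bullet for request-id.ts describes.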

Anti-pattern guards verified:
- agent_events.payload only emitted in /v1/jobs response inside the admin-gated branch (?include=payload + admin scope) — returns 403 otherwise
- jobs retry on a queued row is a no-op (no double BullMQ enqueue, no double UPDATE)
- Every operator action writes to audit_log with the *_by_operator action and request_id correlation in details
- Stalled events audit through generation_job.stalled

Sample correlated trace (one request_id end to end):
  HTTP middleware: req.requestId = 'req-abc'
  audit event.received: details.requestId = 'req-abc'
  BullMQ payload: { request_id: 'req-abc', generation_job_id: 'gj_x' }
  generator lock log: [generation] job locked { jobId, requestId }
  audit generation_job.processing: details.requestId = 'req-abc'
  completion log: [generation] job=evt_... duration=1230ms

Deferred: live /api/health round-trip integration test (needs Redis); stalled event live integration test (needs Redis); storing request_id on the observations row itself (spec did not require).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(server-beta): add Phase 13 release readiness report

Captures the final verification gate: tests (1749 pass, 45 fail all pre-existing baseline, zero regressions), required greps clean, Docker E2E green end-to-end, all 7 exit criteria met, build clean, typecheck unchanged from main. Documents deferred items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(server-beta): rebuild server-beta-service bundle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address Greptile review on PR #2383

- ProviderObservationGenerator.lockOutbox: skip duplicate worker run when another lock is active instead of returning the row, which previously let two BullMQ workers issue the (paid, rate-limited) external provider call before the persistence-layer terminal-status guard collapsed the duplicate.
  Reconciliation still recovers from a stale lock on startup or next retry.
- docker-compose.yml: require POSTGRES_USER/PASSWORD/DB env vars (no defaults). Stack refuses to start without explicit secrets. Added a header warning that the file must not be deployed unmodified.
- e2e-server-beta-docker.sh: export ephemeral test creds for the new required env vars so the Docker E2E driver still runs unattended.
- ServerBetaService api-key list: bound query with LIMIT/OFFSET (default 100, max 500) and add optional --team filter to prevent unintentional cross-tenant key metadata disclosure on shared admin hosts.
- SessionGenerationPolicy: fix dead `??` fallback for NaN parseInt result; use `||` so DEFAULT_DEBOUNCE_MS actually applies.
- ServerV1PostgresRoutes: `?wait=true` now actually waits — polls the outbox row until terminal status (timeout 30s, 100ms interval) on both /v1/events and /v1/events/batch. Returns `waitTimedOut: true` if the cap is hit so callers can re-poll the status endpoints.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address CodeRabbit + Greptile second review on PR #2383

P1 fixes

- Operator retry endpoint was re-publishing the Postgres outbox metadata column as the BullMQ payload; the worker's assertServerGenerationJobPayload always rejected it, leaving the row stuck in queued until startup reconciliation. Persist the BullMQ payload on the outbox row at create-time inside IngestEventsService and EndSessionService, then re-enqueue that canonical payload on retry.

Major fixes

- prompt-builder: escape server_session_id when interpolating into the XML prompt; previously a session id containing `<`, `&`, or quotes could inject XML into the provider input.
- ServerJobQueue: route both worker.on('stalled') and the QueueEvents 'stalled' subscriber through a single notifyStalled helper that dedupes by jobId for 30s, so counters.stalled increments once per stall.
  QueueEvents 'error' now routes through notifyQueueError so it increments counters.errored and runs onError listeners — keeping observability symmetric across both sources.
- ServerV1PostgresRoutes: convert PostgresObservationRepository from three dynamic imports to a single static import for consistency.
- mcp-server / ServerBetaClient: actually forward the observation_record_event tool's `generate` flag through to the /v1/events endpoint as `?generate=false` instead of voiding it.
- server-sessions.markGenerationFailed: guard jsonb_set against a null error payload so the failure path can't null out metadata before the generation_status='failed' write commits.

Minor fixes

- server-sessions.endSession: keep updated_at stable on repeated calls so the documented idempotency contract holds.
- SettingsDefaultsManager + ServerBetaService.getServerBetaPort: derive the server-beta default port from UID (37877 + uid%100), matching the worker port pattern, so two users on the same host don't collide. Docker stacks always pass CLAUDE_MEM_SERVER_PORT explicitly so the containerized deployment is unaffected.
- server-session-runtime test: close the pg.Pool in afterAll.
- server-beta-release-readiness.md: escape pipes inside table inline code, add `text` language tag to the fenced log block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address Greptile + CodeRabbit third review on PR #2383

P1 fixes

- SessionsObservationsAdapter.resolveServerSession: catch unique-violation (23505) on concurrent compat inserts and re-fetch instead of returning 500. Two compat callers carrying the same contentSessionId can both observe `existing===null` and race on the (project_id, external_session_id) unique constraint; the second now resolves to the raced row instead of dropping the event.
- /v1/events/batch: pass `sourceAdapter: null` to ingestBatch so each event's BullMQ payload (and persisted outbox payload column) reflects its own event.sourceAdapter via buildEventBullmqPayload's fallback, rather than stamping the whole batch with the first event's adapter.

Minor

- server-session-runtime test afterEach: wrap DROP SCHEMA in try/finally so client.release() always runs even if the drop throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): drop `pool as never` cast — pg.Pool already matches PostgresPool

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): retry of completed job now 409s instead of duplicating

retryGenerationJob previously fell through to the reset+re-enqueue path when called on a job in `completed` status. The observations index dedupes on (generation_job_id, parsed_observation_index, content) but LLM output is non-deterministic, so a second provider run almost always produced a different content string and bypassed the index, persisting a parallel set of observation rows attributed to the same generation job. Match cancelGenerationJob's 409 guard for completed jobs. failed and cancelled remain valid retry targets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(server-beta): rebuild bundles after rebase onto main

Regenerates the three plugin bundles so they reflect the rebased source state. Mechanical rebuild output only — no source changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): wrap resolveServerSession in try/catch for structured error response

Greptile P1 on PR #2383: resolveServerSession was called before the try/catch in both compat adapters, so Postgres errors during session lookup (timeout, pool exhaustion, etc.) escaped to Express's default error handler and returned HTML/text 500s.
Legacy clients calling response.json() would get a parse failure instead of the documented { stored: false, reason: 'internal_error' } (or { status: 'error', reason: 'internal_error' } for the summarize adapter) shape. Move the resolveServerSession call inside the existing try block in both adapters so any failure flows through the structured catch handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): catch 23505 unique violation in POST /v1/sessions/start

Greptile P1 on PR #2383: concurrent requests with the same externalSessionId can both pass the findByExternalIdForScope check, both call repo.create, and the loser hits the (project_id, external_session_id) unique constraint. The handler treated that as an unknown error and returned a 500. Apply the same pattern resolveServerSession already uses: catch error.code '23505' when externalSessionId is set, refetch the row inserted by the winning request, and return 200 with that session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
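
Illustration only, not code from this change: a minimal sketch of the catch-23505-and-refetch pattern that the last fix and the earlier resolveServerSession fix describe. The `Session` and `SessionRepo` shapes and the name `findOrCreateSession` are hypothetical; the real repository methods (for example findByExternalIdForScope) take scope arguments not shown here. The only external facts relied on are that PostgreSQL reports unique_violation as SQLSTATE 23505 and that node-postgres exposes it as `error.code`.

```typescript
// Hypothetical minimal types; the real session and repository shapes are
// not shown on this commit page.
interface Session {
  id: string;
  projectId: string;
  externalSessionId: string;
}

interface SessionRepo {
  findByExternalId(projectId: string, externalSessionId: string): Promise<Session | null>;
  create(projectId: string, externalSessionId: string): Promise<Session>;
}

// PostgreSQL SQLSTATE for unique_violation.
const PG_UNIQUE_VIOLATION = '23505';

function isUniqueViolation(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { code?: unknown }).code === PG_UNIQUE_VIOLATION
  );
}

// Find-or-create that tolerates a concurrent insert of the same
// (projectId, externalSessionId) pair by another request.
async function findOrCreateSession(
  repo: SessionRepo,
  projectId: string,
  externalSessionId: string,
): Promise<Session> {
  const existing = await repo.findByExternalId(projectId, externalSessionId);
  if (existing) return existing;
  try {
    return await repo.create(projectId, externalSessionId);
  } catch (err) {
    // Anything other than a unique violation is a genuine failure.
    if (!isUniqueViolation(err)) throw err;
    // Another request won the race: return the row it inserted.
    const raced = await repo.findByExternalId(projectId, externalSessionId);
    if (!raced) throw err; // constraint fired but no row is visible
    return raced;
  }
}
```

The sketch leans on the unique constraint as the arbiter rather than taking a lock, so both racing callers end up with the same session row and neither sees a 500.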
2026-05-11 00:26:11 -07:00
parent a10d1b342f
commit e7bbb2a9aa
72 changed files with 13901 additions and 982 deletions
@@ -0,0 +1,212 @@
+// SPDX-License-Identifier: Apache-2.0
+
+// Legacy compatibility — new clients should use POST /v1/events directly.
+//
+// Legacy worker payloads to `/api/sessions/observations` are translated into
+// the Server beta event/job model and delegated to IngestEventsService. The
+// adapter never touches worker code, never queues observations directly, and
+// never uses `src/services/worker/*` types.
+//
+// Translation rules:
+//   - `contentSessionId` (Claude Code session UUID) becomes the
+//     `external_session_id` of a Server beta `server_sessions` row, scoped to
+//     the API key's team and project. The session is create-or-found.
+//   - The tool-use shape (tool_name, tool_input, tool_response, tool_use_id)
+//     is mapped to an `agent_event` with sourceAdapter='claude-code-compat',
+//     eventType='tool_use', payload preserves the legacy fields verbatim.
+//   - The API key MUST be project-scoped. Calls made with a key that has no
+//     project scope return 400; compat traffic never bypasses project scope.
+
+import type { Application, Request, Response } from 'express';
+import { z } from 'zod';
+import type { RouteHandler } from '../../services/server/Server.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import { PostgresServerSessionsRepository } from '../../storage/postgres/server-sessions.js';
+import { logger } from '../../utils/logger.js';
+import { requirePostgresServerAuth } from '../middleware/postgres-auth.js';
+import { IngestEventsService } from '../services/IngestEventsService.js';
+import type { CreatePostgresAgentEventInput } from '../../storage/postgres/agent-events.js';
+
+const COMPAT_SOURCE_ADAPTER = 'claude-code-compat';
+const COMPAT_EVENT_TYPE = 'tool_use';
+
+const observationsSchema = z.object({
+  contentSessionId: z.string().min(1),
+  tool_name: z.string().min(1),
+  tool_input: z.unknown().optional(),
+  tool_response: z.unknown().optional(),
+  cwd: z.string().optional(),
+  agentId: z.string().optional(),
+  agentType: z.string().optional(),
+  platformSource: z.string().optional(),
+  tool_use_id: z.string().optional(),
+  toolUseId: z.string().optional(),
+}).passthrough();
+
+export interface SessionsObservationsAdapterOptions {
+  pool: PostgresPool;
+  ingestEvents: IngestEventsService;
+  authMode?: string;
+  allowLocalDevBypass?: boolean;
+}
+
+export class SessionsObservationsAdapter implements RouteHandler {
+  constructor(private readonly options: SessionsObservationsAdapterOptions) {}
+
+  setupRoutes(app: Application): void {
+    const writeAuth = requirePostgresServerAuth(this.options.pool, {
+      authMode: this.options.authMode,
+      allowLocalDevBypass: this.options.allowLocalDevBypass,
+      requiredScopes: ['memories:write'],
+    });
+
+    app.post('/api/sessions/observations', writeAuth, this.asyncHandler(async (req, res) => {
+      const parsed = observationsSchema.safeParse(req.body);
+      if (!parsed.success) {
+        res.status(400).json({ error: 'ValidationError', issues: parsed.error.issues });
+        return;
+      }
+      const teamId = req.authContext?.teamId ?? null;
+      const projectId = req.authContext?.projectId ?? null;
+      if (!teamId) {
+        res.status(403).json({ error: 'Forbidden', message: 'API key is not bound to a team' });
+        return;
+      }
+      if (!projectId) {
+        // Compat mode requires a project-scoped key — the legacy payload does
+        // not carry a Server beta projectId, so without scope we cannot place
+        // the row in a tenant-scoped table.
+        res.status(400).json({
+          error: 'BadRequest',
+          message: 'Legacy /api/sessions/observations requires a project-scoped API key',
+        });
+        return;
+      }
+
+      try {
+        const session = await resolveServerSession({
+          pool: this.options.pool,
+          teamId,
+          projectId,
+          contentSessionId: parsed.data.contentSessionId,
+          platformSource: typeof parsed.data.platformSource === 'string' ? parsed.data.platformSource : null,
+          agentId: typeof parsed.data.agentId === 'string' ? parsed.data.agentId : null,
+          agentType: typeof parsed.data.agentType === 'string' ? parsed.data.agentType : null,
+        });
+
+        const toolUseId = typeof parsed.data.tool_use_id === 'string'
+          ? parsed.data.tool_use_id
+          : (typeof parsed.data.toolUseId === 'string' ? parsed.data.toolUseId : null);
+
+        const input: CreatePostgresAgentEventInput = {
+          projectId,
+          teamId,
+          serverSessionId: session.id,
+          sourceAdapter: COMPAT_SOURCE_ADAPTER,
+          sourceEventId: toolUseId,
+          eventType: COMPAT_EVENT_TYPE,
+          payload: {
+            contentSessionId: parsed.data.contentSessionId,
+            tool_name: parsed.data.tool_name,
+            tool_input: parsed.data.tool_input ?? null,
+            tool_response: parsed.data.tool_response ?? null,
+            cwd: parsed.data.cwd ?? null,
+            platformSource: parsed.data.platformSource ?? null,
+            agentId: parsed.data.agentId ?? null,
+            agentType: parsed.data.agentType ?? null,
+            toolUseId,
+          },
+          metadata: { compat: 'sessions/observations' },
+          occurredAt: new Date(),
+        };
+
+        const result = await this.options.ingestEvents.ingestOne(input, {
+          source: 'http_post_api_sessions_observations',
+          apiKeyId: req.authContext?.apiKeyId ?? null,
+          actorId: null,
+          sourceAdapter: COMPAT_SOURCE_ADAPTER,
+        });
+        // Legacy response shape — older clients only check `status`.
+        res.json({
+          status: 'queued',
+          observationCount: 1,
+          sessionId: session.id,
+          serverSessionId: session.id,
+          eventId: result.event.id,
+          generationJobId: result.outbox?.id ?? null,
+          transport: result.enqueueState,
+        });
+      } catch (error) {
+        logger.error('SYSTEM', 'compat observations adapter failed', {
+          error: error instanceof Error ? error.message : String(error),
+          contentSessionId: parsed.data.contentSessionId,
+        });
+        res.status(500).json({ stored: false, reason: 'internal_error' });
+      }
+    }));
+  }
+
+  private asyncHandler(fn: (req: Request, res: Response) => Promise<void> | void) {
+    return (req: Request, res: Response, next: (err?: unknown) => void): void => {
+      Promise.resolve(fn(req, res)).catch(next);
+    };
+  }
+}
+
+/**
+ * Look up an existing server_session by (project, team, externalSessionId)
+ * or create one if missing. Idempotent: re-issuing for the same content
+ * session returns the existing row.
+ *
+ * Concurrent compat callers can race here — both observe `existing===null`
+ * and both call `repo.create`, where the second will hit one of two unique
+ * constraints (`(project_id, idempotency_key)` covered by ON CONFLICT, or
+ * `(project_id, external_session_id)` which is NOT covered). Catch the
+ * unique-violation and re-fetch so the caller never sees a 500.
+ */
+export async function resolveServerSession(input: {
+  pool: PostgresPool;
+  teamId: string;
+  projectId: string;
+  contentSessionId: string;
+  platformSource: string | null;
+  agentId: string | null;
+  agentType: string | null;
+}): Promise<{ id: string; projectId: string; teamId: string }> {
+  const repo = new PostgresServerSessionsRepository(input.pool);
+  const existing = await repo.findByExternalIdForScope({
+    externalSessionId: input.contentSessionId,
+    projectId: input.projectId,
+    teamId: input.teamId,
+  });
+  if (existing) {
+    return { id: existing.id, projectId: existing.projectId, teamId: existing.teamId };
+  }
+  try {
+    const created = await repo.create({
+      projectId: input.projectId,
+      teamId: input.teamId,
+      externalSessionId: input.contentSessionId,
+      contentSessionId: input.contentSessionId,
+      agentId: input.agentId,
+      agentType: input.agentType,
+      platformSource: input.platformSource,
+    });
+    return { id: created.id, projectId: created.projectId, teamId: created.teamId };
+  } catch (error) {
+    // Postgres unique_violation. A concurrent compat call inserted the row
+    // for this (project, external_session_id) before we could; re-fetch
+    // and return that row instead of bubbling a 500 to the legacy client.
+    if ((error as { code?: string } | null)?.code === '23505') {
+      const racedRow = await repo.findByExternalIdForScope({
+        externalSessionId: input.contentSessionId,
+        projectId: input.projectId,
+        teamId: input.teamId,
+      });
+      if (racedRow) {
+        return { id: racedRow.id, projectId: racedRow.projectId, teamId: racedRow.teamId };
+      }
+    }
+    throw error;
+  }
+}
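The find-then-create race that resolveServerSession guards against can be reproduced without Postgres. The following is a minimal sketch, not the PR's code: FakeSessionsRepo and findOrCreateSession are hypothetical stand-ins, and the fake repository simulates the unique constraint by throwing an error carrying code '23505', as the real driver is assumed to do.

```typescript
interface SessionRow {
  id: string;
  projectId: string;
  externalSessionId: string;
}

// Hypothetical in-memory stand-in for the sessions repository. It mimics a
// (project_id, external_session_id) unique constraint.
class FakeSessionsRepo {
  private readonly rows = new Map<string, SessionRow>();
  private nextId = 1;

  async find(projectId: string, externalSessionId: string): Promise<SessionRow | null> {
    await Promise.resolve(); // yield so concurrent callers interleave
    return this.rows.get(`${projectId}:${externalSessionId}`) ?? null;
  }

  async create(projectId: string, externalSessionId: string): Promise<SessionRow> {
    await Promise.resolve();
    const key = `${projectId}:${externalSessionId}`;
    if (this.rows.has(key)) {
      // Simulated Postgres unique_violation.
      throw Object.assign(new Error('duplicate key value'), { code: '23505' });
    }
    const row: SessionRow = { id: `session-${this.nextId++}`, projectId, externalSessionId };
    this.rows.set(key, row);
    return row;
  }
}

// Same shape as the adapter's logic: find, create, and on 23505 re-fetch the
// row written by the request that won the race.
async function findOrCreateSession(
  repo: FakeSessionsRepo,
  projectId: string,
  externalSessionId: string,
): Promise<SessionRow> {
  const existing = await repo.find(projectId, externalSessionId);
  if (existing) return existing;
  try {
    return await repo.create(projectId, externalSessionId);
  } catch (error) {
    if ((error as { code?: string } | null)?.code === '23505') {
      const raced = await repo.find(projectId, externalSessionId);
      if (raced) return raced;
    }
    throw error;
  }
}
```

Two concurrent calls for the same external id resolve to the same row, so neither caller sees the constraint error. Any other failure is still rethrown.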
@@ -0,0 +1,127 @@
+// SPDX-License-Identifier: Apache-2.0
+
+// Legacy compatibility — new clients should use POST /v1/sessions/:id/end directly.
+//
+// Translates the legacy `/api/sessions/summarize` request into a call to
+// EndSessionService. The legacy shape carries `contentSessionId` and an
+// optional `last_assistant_message`; we resolve the server_session by
+// (team, project, external_session_id=contentSessionId), then end it.
+//
+// Re-summarizing the same session collapses to the same outbox row because
+// the (team_id, project_id, source_type='session_summary', source_id)
+// UNIQUE constraint stays in force — exactly the same idempotency guarantee
+// as `/v1/sessions/:id/end`.
+
+import type { Application, Request, Response } from 'express';
+import { z } from 'zod';
+import type { RouteHandler } from '../../services/server/Server.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import { PostgresServerSessionsRepository } from '../../storage/postgres/server-sessions.js';
+import { logger } from '../../utils/logger.js';
+import { requirePostgresServerAuth } from '../middleware/postgres-auth.js';
+import { EndSessionService } from '../services/EndSessionService.js';
+import { resolveServerSession } from './SessionsObservationsAdapter.js';
+
+const summarizeSchema = z.object({
+  contentSessionId: z.string().min(1),
+  last_assistant_message: z.string().optional(),
+  agentId: z.string().optional(),
+  platformSource: z.string().optional(),
+}).passthrough();
+
+export interface SessionsSummarizeAdapterOptions {
+  pool: PostgresPool;
+  endSession: EndSessionService;
+  authMode?: string;
+  allowLocalDevBypass?: boolean;
+}
+
+export class SessionsSummarizeAdapter implements RouteHandler {
+  constructor(private readonly options: SessionsSummarizeAdapterOptions) {}
+
+  setupRoutes(app: Application): void {
+    const writeAuth = requirePostgresServerAuth(this.options.pool, {
+      authMode: this.options.authMode,
+      allowLocalDevBypass: this.options.allowLocalDevBypass,
+      requiredScopes: ['memories:write'],
+    });
+
+    app.post('/api/sessions/summarize', writeAuth, this.asyncHandler(async (req, res) => {
+      const parsed = summarizeSchema.safeParse(req.body);
+      if (!parsed.success) {
+        res.status(400).json({ error: 'ValidationError', issues: parsed.error.issues });
+        return;
+      }
+      const teamId = req.authContext?.teamId ?? null;
+      const projectId = req.authContext?.projectId ?? null;
+      if (!teamId) {
+        res.status(403).json({ error: 'Forbidden', message: 'API key is not bound to a team' });
+        return;
+      }
+      if (!projectId) {
+        res.status(400).json({
+          error: 'BadRequest',
+          message: 'Legacy /api/sessions/summarize requires a project-scoped API key',
+        });
+        return;
+      }
+
+      // Subagent contexts in legacy code emit summarize calls but the worker
+      // skipped them. We preserve the legacy semantics so existing clients
+      // see the same response shape.
+      if (parsed.data.agentId) {
+        res.json({ status: 'skipped', reason: 'subagent_context' });
+        return;
+      }
+
+      try {
+        const session = await resolveServerSession({
+          pool: this.options.pool,
+          teamId,
+          projectId,
+          contentSessionId: parsed.data.contentSessionId,
+          platformSource: typeof parsed.data.platformSource === 'string' ? parsed.data.platformSource : null,
+          agentId: null,
+          agentType: null,
+        });
+
+        const result = await this.options.endSession.end({
+          sessionId: session.id,
+          projectId,
+          teamId,
+          source: 'http_post_api_sessions_summarize',
+          apiKeyId: req.authContext?.apiKeyId ?? null,
+          actorId: null,
+          sourceAdapter: 'claude-code-compat',
+        });
+        if (!result.session) {
+          res.status(404).json({ status: 'not_found', reason: 'session_not_found' });
+          return;
+        }
+        res.json({
+          status: 'queued',
+          sessionId: session.id,
+          serverSessionId: session.id,
+          generationJobId: result.outbox?.id ?? null,
+          transport: result.enqueueState,
+        });
+      } catch (error) {
+        logger.error('SYSTEM', 'compat summarize adapter failed', {
+          error: error instanceof Error ? error.message : String(error),
+          contentSessionId: parsed.data.contentSessionId,
+        });
+        res.status(500).json({ status: 'error', reason: 'internal_error' });
+      }
+    }));
+  }
+
+  private asyncHandler(fn: (req: Request, res: Response) => Promise<void> | void) {
+    return (req: Request, res: Response, next: (err?: unknown) => void): void => {
+      Promise.resolve(fn(req, res)).catch(next);
+    };
+  }
+}
+
+// Reference the imported PostgresServerSessionsRepository binding so the
+// import is not dropped as unused when the main bundle is tree-shaken.
+void PostgresServerSessionsRepository;
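A legacy client of /api/sessions/summarize can receive several body shapes from the adapter above: queued, skipped, not_found, error, and the 400/403 bodies that carry `error` instead of `status`. The sketch below shows one way a client might classify them. classifySummarizeBody is a hypothetical helper, not part of this PR, and `transport` is typed as `unknown` because the type of `enqueueState` is not visible in this diff.

```typescript
// Response bodies the compat summarize route in this diff can send on the
// `status`-bearing paths.
type SummarizeCompatBody =
  | { status: 'queued'; sessionId: string; serverSessionId: string; generationJobId: string | null; transport: unknown }
  | { status: 'skipped'; reason: 'subagent_context' }
  | { status: 'not_found'; reason: 'session_not_found' }
  | { status: 'error'; reason: 'internal_error' };

type SummarizeOutcome = SummarizeCompatBody['status'] | 'rejected' | 'unrecognized';

// Hypothetical client-side helper: maps a parsed JSON body to an outcome.
function classifySummarizeBody(body: unknown): SummarizeOutcome {
  if (typeof body !== 'object' || body === null) return 'unrecognized';
  const record = body as Record<string, unknown>;
  // Validation (400) and auth (403) failures carry `error`, not `status`.
  if (typeof record.error === 'string') return 'rejected';
  const status = record.status;
  if (status === 'queued') return 'queued';
  if (status === 'skipped') return 'skipped';
  if (status === 'not_found') return 'not_found';
  if (status === 'error') return 'error';
  return 'unrecognized';
}
```

Classifying on `status` first matches the comment in the observations adapter that older clients only check that field.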
@@ -0,0 +1,538 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import type { Job } from 'bullmq';
+import { logger } from '../../utils/logger.js';
+import { PostgresAgentEventsRepository } from '../../storage/postgres/agent-events.js';
+import { PostgresObservationGenerationJobRepository } from '../../storage/postgres/generation-jobs.js';
+import { PostgresProjectsRepository } from '../../storage/postgres/projects.js';
+import { PostgresAuthRepository } from '../../storage/postgres/auth.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import type { PostgresObservationGenerationJob } from '../../storage/postgres/generation-jobs.js';
+import {
+  assertServerGenerationJobPayload,
+  ServerGenerationJobPayloadValidationError,
+  type ServerGenerationJobPayload,
+} from '../jobs/types.js';
+import { ServerClassifiedProviderError } from './providers/shared/error-classification.js';
+import type { ServerGenerationProvider } from './providers/shared/types.js';
+import {
+  markGenerationFailed,
+  processGeneratedResponse,
+  processSessionSummaryResponse,
+  type ProcessGeneratedResponseOutcome,
+} from './processGeneratedResponse.js';
+import { PostgresServerSessionsRepository } from '../../storage/postgres/server-sessions.js';
+
+// Phase 11 — sentinel exception class so the worker can distinguish
+// scope-violation/revoked-key failures from generic processor errors and
+// audit them under the right action. Callers record these failures with
+// retryable: false, so a job with a tampered payload is never retried.
+export class ServerGenerationScopeViolationError extends Error {
+  readonly reason: 'scope_mismatch' | 'revoked_key';
+  constructor(reason: 'scope_mismatch' | 'revoked_key', message: string) {
+    super(message);
+    this.reason = reason;
+  }
+}
+
+// ProviderObservationGenerator is the BullMQ Worker processor for server-beta
+// observation generation. It does the following on every job invocation:
+//
+//   1. Validate the BullMQ payload, reload the canonical Postgres outbox row
+//      by id, and refuse (without retry) on scope mismatch or a revoked key.
+//   2. Lock the outbox by transitioning queued -> processing.
+//   3. Reload the source agent_events rows and call the provider with a
+//      fully-reloaded ServerGenerationContext. BullMQ payload data is
+//      advisory only.
+//   4. Hand the raw response to processGeneratedResponse, which persists +
+//      links + advances outbox in one Postgres transaction.
+//   5. On provider/parse error, route through markGenerationFailed which
+//      decides retry vs final failure based on attempt count + error class.
+//
+// Anti-pattern guards verified at the boundary:
+//   - no imports from src/services/worker/*
+//   - no use of WorkerRef / ActiveSession / SessionStore
+//   - no assumption of Claude Code transcript shape
+
+export interface ProviderObservationGeneratorOptions {
+  pool: PostgresPool;
+  provider: ServerGenerationProvider;
+  workerId?: string;
+}
+
+export class ProviderObservationGenerator {
+  constructor(private readonly options: ProviderObservationGeneratorOptions) {}
+
+  /**
+   * Worker entrypoint. Returns a small JSON summary on success so BullMQ's
+   * completed-state telemetry has something to inspect, but Postgres remains
+   * the canonical authority.
+   */
+  async process(
+    job: Job<ServerGenerationJobPayload>,
+  ): Promise<{ jobId: string; status: 'completed'; observationCount: number }> {
+    const correlationId = `bullmq:${job.id ?? '?'}`;
+    // Phase 12 — pivot id captured up front so every log line in this
+    // dispatch carries the same identifier whether or not we manage to
+    // load the canonical row. requestId comes from payload (HTTP middleware).
+    const payloadRequestId = (job.data as { request_id?: string | null } | undefined)?.request_id ?? null;
+
+    // Phase 11 — validate the BullMQ payload against the discriminated-union
+    // schema BEFORE doing anything else. A malformed payload (missing
+    // team_id, project_id, generation_job_id, etc.) means the enqueue path
+    // bypassed the boundary contract; we refuse to run it. Throwing surfaces
+    // it on BullMQ's failed list with a clear message.
+    let payload: ServerGenerationJobPayload;
+    try {
+      payload = assertServerGenerationJobPayload(job.data);
+    } catch (error) {
+      if (error instanceof ServerGenerationJobPayloadValidationError) {
+        logger.error('SYSTEM', 'rejecting malformed job payload at execution', {
+          correlationId,
+          issues: error.issues,
+        });
+      }
+      throw error;
+    }
+
+    if (payload.kind !== 'event' && payload.kind !== 'event-batch' && payload.kind !== 'summary') {
+      logger.warn('SYSTEM', 'unsupported job kind for ProviderObservationGenerator', {
+        correlationId,
+        kind: payload.kind,
+      });
+      throw new Error(`unsupported job kind: ${payload.kind}`);
+    }
+
+    // Phase 11 — anti-bypass guard. We MUST NOT trust BullMQ payload data
+    // for tenant scope. Reload the canonical outbox row keyed by id only
+    // (no scope filter), then compare its team_id/project_id to the
+    // payload's. A mismatch indicates payload tampering or a programmer
+    // bug; either way we audit and refuse.
+    const candidate = await this.loadCanonicalOutbox(payload.generation_job_id);
+    if (!candidate) {
+      logger.info('SYSTEM', 'job row not found by id; nothing to do', {
+        correlationId,
+        generationJobId: payload.generation_job_id,
+      });
+      return { jobId: payload.generation_job_id, status: 'completed', observationCount: 0 };
+    }
+    if (candidate.teamId !== payload.team_id || candidate.projectId !== payload.project_id) {
+      const violation = new ServerGenerationScopeViolationError(
+        'scope_mismatch',
+        `BullMQ payload team/project does not match outbox row (jobId=${payload.generation_job_id})`,
+      );
+      await this.auditScopeViolation(payload, candidate, violation, correlationId);
+      // Tag the row as failed so subsequent retries do not pick it up.
+      await markGenerationFailed({
+        pool: this.options.pool,
+        job: candidate,
+        reason: violation.message,
+        classification: 'scope_mismatch',
+        retryable: false,
+        ...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
+      });
+      throw violation;
+    }
+
+    // Phase 11 — revocation check. If the api_key that initiated this job
+    // was revoked between enqueue and execution, do not generate. Audit
+    // and fail without retry.
+    if (payload.api_key_id) {
+      const revoked = await this.isApiKeyRevoked(payload.api_key_id);
+      if (revoked) {
+        const violation = new ServerGenerationScopeViolationError(
+          'revoked_key',
+          `api key ${payload.api_key_id} is revoked; refusing to generate for outbox ${candidate.id}`,
+        );
+        await this.auditRevokedKey(payload, candidate, violation, correlationId);
+        await markGenerationFailed({
+          pool: this.options.pool,
+          job: candidate,
+          reason: violation.message,
+          classification: 'revoked_key',
+          retryable: false,
+          ...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
+        });
+        throw violation;
+      }
+    }
+
+    const fresh = await this.lockOutbox(payload.generation_job_id, payload.team_id, payload.project_id);
+    if (!fresh) {
+      logger.info('SYSTEM', 'job no longer exists or is in terminal status; nothing to do', {
+        correlationId,
+        generationJobId: payload.generation_job_id,
+      });
+      return { jobId: payload.generation_job_id, status: 'completed', observationCount: 0 };
+    }
+
+    // Phase 11 — emit "processing started" audit so we have a row even if
+    // the provider crashes before completion.
+    // Phase 12 — log+audit carry the same job_id / request_id so support
+    // can pivot from BullMQ id -> outbox id -> originating HTTP request.
+    logger.info('SYSTEM', `[generation] job locked for processing`, {
+      correlationId,
+      jobId: fresh.id,
+      bullmqJobId: job.id ?? null,
+      requestId: payloadRequestId,
+      sourceType: fresh.sourceType,
+      attempt: fresh.attempts,
+    });
+    await this.auditEvent({
+      teamId: fresh.teamId,
+      projectId: fresh.projectId,
+      apiKeyId: payload.api_key_id,
+      actorId: payload.actor_id,
+      action: 'generation_job.processing',
+      resourceId: fresh.id,
+      details: {
+        sourceType: fresh.sourceType,
+        sourceId: fresh.sourceId,
+        sourceAdapter: payload.source_adapter,
+        attempt: fresh.attempts,
+        correlationId,
+        requestId: payloadRequestId,
+      },
+    });
+
+    try {
+      const events = await this.loadEvents(fresh, payload);
+      const project = await this.loadProject(fresh);
+
+      const result = await this.options.provider.generate({
+        job: fresh,
+        events,
+        project: {
+          projectId: fresh.projectId,
+          teamId: fresh.teamId,
+          serverSessionId: fresh.serverSessionId,
+          projectName: project?.name ?? null,
+        },
+      });
+
+      const persistInput = {
+        pool: this.options.pool,
+        job: fresh,
+        rawText: result.rawText,
+        modelId: result.modelId,
+        providerLabel: result.providerLabel,
+        // Phase 11 — flow identity context from BullMQ payload into the
+        // persistence layer so observations and audit rows carry the same
+        // generation_job_id reference back through to the original API key.
+        apiKeyId: payload.api_key_id,
+        actorId: payload.actor_id,
+        sourceAdapter: payload.source_adapter,
+        ...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
+      };
+      const outcome: ProcessGeneratedResponseOutcome = fresh.sourceType === 'session_summary'
+        ? await processSessionSummaryResponse(persistInput)
+        : await processGeneratedResponse(persistInput);
+
+      if (outcome.kind === 'parse_error') {
+        await markGenerationFailed({
+          pool: this.options.pool,
+          job: fresh,
+          reason: outcome.reason,
+          classification: 'parse_error',
+          retryable: false,
+          ...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
+        });
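+        // NOTE: this throw is caught by the catch block below, which calls
+        // markGenerationFailed a second time with classification 'unknown'.
+        throw new Error(`generation parse error: ${outcome.reason}`);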
+      }
+
+      logger.info('SYSTEM', 'generation completed', {
+        correlationId,
+        jobId: outcome.jobId,
+        bullmqJobId: job.id ?? null,
+        requestId: payloadRequestId,
+        observationCount: outcome.observations.length,
+        privateContentDetected: outcome.privateContentDetected,
+      });
+
+      return {
+        jobId: outcome.jobId,
+        status: 'completed',
+        observationCount: outcome.observations.length,
+      };
+    } catch (error) {
+      const classified = error instanceof ServerClassifiedProviderError ? error : null;
+      const retryable = classified
+        ? classified.kind === 'transient' || classified.kind === 'rate_limit'
+        : false;
+      await markGenerationFailed({
+        pool: this.options.pool,
+        job: fresh,
+        reason: error instanceof Error ? error.message : String(error),
+        classification: classified?.kind ?? 'unknown',
+        retryable,
+        ...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
+      });
+      throw error;
+    }
+  }
+
+  // Phase 11 — load the outbox row by id WITHOUT a scope filter so we can
+  // compare its team_id/project_id to the BullMQ payload as a tampering
+  // detector. Authoritative scope decisions still come from this row, NEVER
+  // from the BullMQ payload.
+  private async loadCanonicalOutbox(jobId: string): Promise<PostgresObservationGenerationJob | null> {
+    const result = await this.options.pool.query<{
+      id: string;
+      project_id: string;
+      team_id: string;
+      agent_event_id: string | null;
+      source_type: 'agent_event' | 'session_summary' | 'observation_reindex';
+      source_id: string;
+      server_session_id: string | null;
+      job_type: string;
+      status: 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';
+      idempotency_key: string;
+      bullmq_job_id: string | null;
+      attempts: number;
+      max_attempts: number;
+      next_attempt_at: Date | null;
+      locked_at: Date | null;
+      locked_by: string | null;
+      completed_at: Date | null;
+      failed_at: Date | null;
+      cancelled_at: Date | null;
+      last_error: unknown;
+      payload: unknown;
+      created_at: Date;
+      updated_at: Date;
+    }>(
+      'SELECT * FROM observation_generation_jobs WHERE id = $1',
+      [jobId],
+    );
+    const row = result.rows[0];
+    if (!row) return null;
+    return {
+      id: row.id,
+      projectId: row.project_id,
+      teamId: row.team_id,
+      agentEventId: row.agent_event_id,
+      sourceType: row.source_type,
+      sourceId: row.source_id,
+      serverSessionId: row.server_session_id,
+      jobType: row.job_type,
+      status: row.status,
+      idempotencyKey: row.idempotency_key,
+      bullmqJobId: row.bullmq_job_id,
+      attempts: row.attempts,
+      maxAttempts: row.max_attempts,
+      nextAttemptAtEpoch: row.next_attempt_at?.getTime() ?? null,
+      lockedAtEpoch: row.locked_at?.getTime() ?? null,
+      lockedBy: row.locked_by,
+      completedAtEpoch: row.completed_at?.getTime() ?? null,
+      failedAtEpoch: row.failed_at?.getTime() ?? null,
+      cancelledAtEpoch: row.cancelled_at?.getTime() ?? null,
+      lastError: row.last_error && typeof row.last_error === 'object'
+        ? (row.last_error as Record<string, unknown>)
+        : null,
+      payload: row.payload && typeof row.payload === 'object' && !Array.isArray(row.payload)
+        ? (row.payload as Record<string, unknown>)
+        : {},
+      createdAtEpoch: row.created_at.getTime(),
+      updatedAtEpoch: row.updated_at.getTime(),
+    };
+  }
+
+  private async isApiKeyRevoked(apiKeyId: string): Promise<boolean> {
+    const result = await this.options.pool.query<{ revoked_at: Date | null; expires_at: Date | null }>(
+      'SELECT revoked_at, expires_at FROM api_keys WHERE id = $1',
+      [apiKeyId],
+    );
+    const row = result.rows[0];
+    if (!row) {
+      // The key was deleted entirely. Treat as revoked.
+      return true;
+    }
+    if (row.revoked_at) return true;
+    if (row.expires_at && row.expires_at.getTime() <= Date.now()) return true;
+    return false;
+  }
+
+  private async auditScopeViolation(
+    payload: ServerGenerationJobPayload,
+    canonical: PostgresObservationGenerationJob,
+    error: ServerGenerationScopeViolationError,
+    correlationId: string,
+  ): Promise<void> {
+    logger.error('SYSTEM', 'BullMQ payload scope mismatch — refusing to generate', {
+      correlationId,
+      generationJobId: payload.generation_job_id,
+      payloadTeamId: payload.team_id,
+      payloadProjectId: payload.project_id,
+      canonicalTeamId: canonical.teamId,
+      canonicalProjectId: canonical.projectId,
+    });
+    await this.auditEvent({
+      teamId: canonical.teamId,
+      projectId: canonical.projectId,
+      apiKeyId: payload.api_key_id,
+      actorId: payload.actor_id,
+      action: 'generation_job.scope_violation',
+      resourceId: canonical.id,
+      details: {
+        reason: 'scope_mismatch',
+        message: error.message,
+        payloadTeamId: payload.team_id,
+        payloadProjectId: payload.project_id,
+        canonicalTeamId: canonical.teamId,
+        canonicalProjectId: canonical.projectId,
+        sourceAdapter: payload.source_adapter,
+        correlationId,
+      },
+    });
+  }
+
+  private async auditRevokedKey(
+    payload: ServerGenerationJobPayload,
+    canonical: PostgresObservationGenerationJob,
+    error: ServerGenerationScopeViolationError,
+    correlationId: string,
+  ): Promise<void> {
+    logger.warn('SYSTEM', 'api key revoked between enqueue and execute — refusing to generate', {
+      correlationId,
+      generationJobId: payload.generation_job_id,
+      apiKeyId: payload.api_key_id,
+    });
+    await this.auditEvent({
+      teamId: canonical.teamId,
+      projectId: canonical.projectId,
+      apiKeyId: payload.api_key_id,
+      actorId: payload.actor_id,
+      action: 'generation_job.revoked_key',
+      resourceId: canonical.id,
+      details: {
+        reason: 'revoked_key',
+        message: error.message,
+        sourceAdapter: payload.source_adapter,
+        correlationId,
+      },
+    });
+  }
+
+  private async auditEvent(input: {
+    teamId: string | null;
+    projectId: string | null;
+    apiKeyId: string | null;
+    actorId: string | null;
+    action: string;
+    resourceId: string | null;
+    details?: Record<string, unknown>;
+  }): Promise<void> {
+    try {
+      const repo = new PostgresAuthRepository(this.options.pool);
+      await repo.createAuditLog({
+        teamId: input.teamId,
+        projectId: input.projectId,
+        actorId: input.actorId,
+        apiKeyId: input.apiKeyId,
+        action: input.action,
+        resourceType: 'observation_generation_job',
+        resourceId: input.resourceId,
+        details: input.details ?? {},
+      });
+    } catch (auditError) {
+      logger.warn('SYSTEM', 'audit_log insert failed in ProviderObservationGenerator', {
+        action: input.action,
+        error: auditError instanceof Error ? auditError.message : String(auditError),
+      });
+    }
+  }
+
+  private async lockOutbox(
+    jobId: string,
+    teamId: string,
+    projectId: string,
+  ): Promise<PostgresObservationGenerationJob | null> {
+    const repo = new PostgresObservationGenerationJobRepository(this.options.pool);
+    const current = await repo.getByIdForScope({ id: jobId, projectId, teamId });
+    if (!current) {
+      return null;
+    }
+    if (current.status === 'completed' || current.status === 'cancelled' || current.status === 'failed') {
+      return null;
+    }
+    if (current.status === 'processing') {
+      // Another worker holds the lock — most commonly this fires when BullMQ
+      // redelivers a stalled job to a second worker while the first is still
+      // mid-`provider.generate()`. Returning the row here would cause both
+      // workers to issue the (paid, rate-limited) external provider call,
+      // and the persistence-level terminal-status guard only collapses the
+      // duplicate after the call has already happened. Skip instead. If the
+      // first worker truly died, `reconcileOnStartup` (and the next BullMQ
+      // retry) will resurrect the row.
+      logger.info('SYSTEM', 'generation job already in processing; skipping duplicate worker run', {
+        jobId: current.id,
+        lockedBy: current.lockedBy,
+        lockedAtEpoch: current.lockedAtEpoch,
+        attempts: current.attempts,
+      });
+      return null;
+    }
+    const transitioned = await repo.transitionStatus({
+      id: current.id,
+      projectId: current.projectId,
+      teamId: current.teamId,
+      status: 'processing',
+      lockedBy: this.options.workerId ?? 'server-beta-worker',
+    });
+    return transitioned;
+  }
+
+  private async loadEvents(
+    job: PostgresObservationGenerationJob,
+    payload: ServerGenerationJobPayload,
+  ): Promise<NonNullable<Awaited<ReturnType<PostgresAgentEventsRepository['getByIdForScope']>>>[]> {
+    const repo = new PostgresAgentEventsRepository(this.options.pool);
+
+    type Event = NonNullable<Awaited<ReturnType<PostgresAgentEventsRepository['getByIdForScope']>>>;
+
+    if (job.sourceType === 'session_summary') {
+      // Summary jobs feed the provider every event tied to the server_session
+      // that hasn't already been collapsed into a completed event-generation
+      // job. The session repo enforces tenant scope inside its WHERE clause.
+      if (!job.serverSessionId) return [];
+      const sessions = new PostgresServerSessionsRepository(this.options.pool);
+      const events = await sessions.listUnprocessedEvents({
+        serverSessionId: job.serverSessionId,
+        projectId: job.projectId,
+        teamId: job.teamId,
+      });
+      return events;
+    }
+
+    if (job.sourceType !== 'agent_event') {
+      return [];
+    }
+
+    if (payload.kind === 'event') {
+      const event = await repo.getByIdForScope({
+        id: payload.agent_event_id,
+        projectId: job.projectId,
+        teamId: job.teamId,
+      });
+      return event ? [event] : [];
+    }
+
+    if (payload.kind === 'event-batch') {
+      const out: Event[] = [];
+      for (const id of payload.agent_event_ids) {
+        const event = await repo.getByIdForScope({
+          id,
+          projectId: job.projectId,
+          teamId: job.teamId,
+        });
+        if (event) out.push(event);
+      }
+      return out;
+    }
+
+    return [];
+  }
+
+  private async loadProject(job: PostgresObservationGenerationJob) {
+    const repo = new PostgresProjectsRepository(this.options.pool);
+    return await repo.getByIdForTeam(job.projectId, job.teamId);
+  }
+}
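
The status checks in `lockOutbox` above amount to a small decision table. Below is a minimal sketch of that table as a pure function, assuming the five statuses that appear in this diff are the full set. `JobStatus`, `LockDecision`, and `decideLock` are hypothetical names used only for illustration; they are not part of the PR.

```typescript
// Hypothetical status union: only the values visible in this diff.
type JobStatus = 'queued' | 'processing' | 'completed' | 'cancelled' | 'failed';

type LockDecision = 'acquire' | 'skip';

// Mirrors the guard order in lockOutbox: a missing row, a terminal row, or a
// row another worker is already processing is skipped; anything else may be
// transitioned to 'processing'.
function decideLock(status: JobStatus | null): LockDecision {
  if (status === null) return 'skip';
  if (status === 'completed' || status === 'cancelled' || status === 'failed') return 'skip';
  if (status === 'processing') return 'skip';
  return 'acquire';
}
```

Under these assumptions only `queued` yields `acquire`. The real method reads the row and then transitions it in two separate queries, so whether two workers can both pass the read depends on how `transitionStatus` guards its UPDATE, which this diff does not show.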
@@ -0,0 +1,539 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { parseAgentXml, type ParsedObservation, type ParsedSummary } from '../../sdk/parser.js';
+import { logger } from '../../utils/logger.js';
+import {
+  PostgresObservationRepository,
+  PostgresObservationSourcesRepository,
+  buildObservationGenerationKey,
+  type PostgresObservation,
+} from '../../storage/postgres/observations.js';
+import {
+  PostgresObservationGenerationJobEventsRepository,
+  PostgresObservationGenerationJobRepository,
+  type PostgresObservationGenerationJob,
+} from '../../storage/postgres/generation-jobs.js';
+import { PostgresAuthRepository } from '../../storage/postgres/auth.js';
+import {
+  withPostgresTransaction,
+  type PostgresPool,
+} from '../../storage/postgres/pool.js';
+import { stripTags } from '../../utils/tag-stripping.js';
+
+// processGeneratedResponse owns the full "we got XML from a provider →
+// persist + link + advance outbox" pipeline. Every side-effect runs inside
+// a single Postgres transaction so retries are idempotent:
+//
+//   - observations.generation_key (UNIQUE per team/project) collapses retry
+//     duplicates to a single row.
+//   - observation_sources (UNIQUE on observation_id, source_type, source_id)
+//     collapses duplicate source links.
+//   - observation_generation_jobs.transitionStatus is the lifecycle gate.
+//
+// The function NEVER touches worker SessionStore tables, NEVER assumes a
+// Claude Code transcript shape, and ALWAYS reloads the job before mutating.
+// BullMQ payload data is advisory; the outbox row is canonical.
+
+export type ProcessGeneratedResponseOutcome =
+  | {
+      kind: 'completed';
+      jobId: string;
+      observations: PostgresObservation[];
+      privateContentDetected: boolean;
+    }
+  | { kind: 'parse_error'; jobId: string; reason: string };
+
+export interface ProcessGeneratedResponseInput {
+  pool: PostgresPool;
+  job: PostgresObservationGenerationJob;
+  rawText: string;
+  modelId?: string;
+  providerLabel: string;
+  workerId?: string;
+  // Phase 11 — identity context propagated from the BullMQ payload (and
+  // ultimately the API-key that ingested the source row). Persisted on
+  // observation_sources.metadata for traceability and re-emitted in the
+  // observation.created audit row.
+  apiKeyId?: string | null;
+  actorId?: string | null;
+  sourceAdapter?: string | null;
+}
+
+export async function processGeneratedResponse(
+  input: ProcessGeneratedResponseInput,
+): Promise<ProcessGeneratedResponseOutcome> {
+  const { job, rawText } = input;
+
+  const parsed = parseAgentXml(rawText, job.id);
+  if (!parsed.valid) {
+    return { kind: 'parse_error', jobId: job.id, reason: 'parser rejected response' };
+  }
+
+  // Skip-summary or zero-observation responses still count as success: the
+  // provider decided there is nothing worth recording (e.g. a privacy-stripped
+  // batch). The job completes with no rows and privateContentDetected is set.
+  const observationsToWrite = parsed.observations ?? [];
+  const skipped = parsed.summary?.skipped === true;
+  const privateContentDetected = skipped || observationsToWrite.length === 0;
+
+  return await withPostgresTransaction(input.pool, async (client) => {
+    const obsRepo = new PostgresObservationRepository(client);
+    const sourcesRepo = new PostgresObservationSourcesRepository(client);
+    const jobsRepo = new PostgresObservationGenerationJobRepository(client);
+    const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
+    const auditRepo = new PostgresAuthRepository(client);
+
+    // Reload the job inside the transaction. If it already reached a terminal
+    // status (completed, cancelled, or failed), skip persistence and return none.
+    const fresh = await jobsRepo.getByIdForScope({
+      id: job.id,
+      projectId: job.projectId,
+      teamId: job.teamId,
+    });
+    if (!fresh) {
+      throw new Error(`generation job ${job.id} not found in scope`);
+    }
+    if (fresh.status === 'completed' || fresh.status === 'cancelled' || fresh.status === 'failed') {
+      logger.info('SYSTEM', 'generation job already in terminal status; skipping persistence', {
+        jobId: fresh.id,
+        status: fresh.status,
+      });
+      return {
+        kind: 'completed' as const,
+        jobId: fresh.id,
+        observations: [],
+        privateContentDetected,
+      };
+    }
+
+    const persisted: PostgresObservation[] = [];
+    for (let index = 0; index < observationsToWrite.length; index++) {
+      const parsedObservation = observationsToWrite[index]!;
+      const content = renderObservationContent(parsedObservation);
+      if (!content || content.trim().length === 0) {
+        continue;
+      }
+
+      // Defense-in-depth: even if the parser slipped a private-tagged
+      // string through, scrub before persisting.
+      const scrubbed = stripTags(content);
+      if (!scrubbed.stripped || scrubbed.stripped.trim().length === 0) {
+        continue;
+      }
+
+      const generationKey = buildObservationGenerationKey({
+        generationJobId: fresh.id,
+        parsedObservationIndex: index,
+        content: scrubbed.stripped,
+      });
+
+      const observation = await obsRepo.create({
+        projectId: fresh.projectId,
+        teamId: fresh.teamId,
+        serverSessionId: fresh.serverSessionId,
+        kind: parsedObservation.type ?? 'observation',
+        content: scrubbed.stripped,
+        generationKey,
+        metadata: {
+          title: parsedObservation.title,
+          subtitle: parsedObservation.subtitle,
+          facts: parsedObservation.facts,
+          narrative: parsedObservation.narrative,
+          concepts: parsedObservation.concepts,
+          files_read: parsedObservation.files_read,
+          files_modified: parsedObservation.files_modified,
+          provider: input.providerLabel,
+          model: input.modelId ?? null,
+        },
+        createdByJobId: fresh.id,
+      });
+      persisted.push(observation);
+
+      await sourcesRepo.addSource({
+        observationId: observation.id,
+        projectId: fresh.projectId,
+        teamId: fresh.teamId,
+        sourceType: fresh.sourceType,
+        sourceId: fresh.sourceId,
+        agentEventId: fresh.agentEventId ?? null,
+        generationJobId: fresh.id,
+        metadata: {
+          provider: input.providerLabel,
+          parsedObservationIndex: index,
+          // Phase 11 — denormalize identity context for traceability so an
+          // operator can answer "which api key produced this observation?"
+          // without joining back through generation_job → outbox → key.
+          source_adapter: input.sourceAdapter ?? null,
+          actor_id: input.actorId ?? null,
+          api_key_id: input.apiKeyId ?? null,
+        },
+      });
+
+      // Phase 11: audit each generated observation, using the SAME
+      // generation_job_id reference so the audit chain (event_received →
+      // generation_job.queued → generation_job.processing →
+      // observation.created → observation.read) can be reconstructed.
+      try {
+        await auditRepo.createAuditLog({
+          teamId: fresh.teamId,
+          projectId: fresh.projectId,
+          actorId: input.actorId ?? null,
+          apiKeyId: input.apiKeyId ?? null,
+          action: 'observation.created',
+          resourceType: 'observation',
+          resourceId: observation.id,
+          details: {
+            generationJobId: fresh.id,
+            sourceType: fresh.sourceType,
+            sourceId: fresh.sourceId,
+            provider: input.providerLabel,
+            model: input.modelId ?? null,
+            sourceAdapter: input.sourceAdapter ?? null,
+            parsedObservationIndex: index,
+          },
+        });
+      } catch (auditError) {
+        logger.warn('SYSTEM', 'audit_log observation.created insert failed', {
+          observationId: observation.id,
+          error: auditError instanceof Error ? auditError.message : String(auditError),
+        });
+      }
+    }
+
+    // Advance outbox status. Phase 1 transitionStatus enforces legal
+    // transitions and tenant scope inside its WHERE clause.
+    await jobsRepo.transitionStatus({
+      id: fresh.id,
+      projectId: fresh.projectId,
+      teamId: fresh.teamId,
+      status: 'completed',
+    });
+    await eventsLogRepo.append({
+      generationJobId: fresh.id,
+      projectId: fresh.projectId,
+      teamId: fresh.teamId,
+      eventType: 'completed',
+      statusAfter: 'completed',
+      attempt: fresh.attempts,
+      details: {
+        provider: input.providerLabel,
+        model: input.modelId ?? null,
+        observationCount: persisted.length,
+        privateContentDetected,
+        workerId: input.workerId ?? null,
+      },
+    });
+
+    // Audit log: the catch below only logs. This insert still runs inside the
+    // transaction, and in Postgres a failed statement aborts the transaction
+    // unless a savepoint is used, so an insert error can still fail the commit.
+    try {
+      await auditRepo.createAuditLog({
+        teamId: fresh.teamId,
+        projectId: fresh.projectId,
+        actorId: input.actorId ?? null,
+        apiKeyId: input.apiKeyId ?? null,
+        action: 'generation_job.completed',
+        resourceType: 'observation_generation_job',
+        resourceId: fresh.id,
+        details: {
+          generationJobId: fresh.id,
+          provider: input.providerLabel,
+          model: input.modelId ?? null,
+          observationCount: persisted.length,
+          observationIds: persisted.map(o => o.id),
+          sourceAdapter: input.sourceAdapter ?? null,
+        },
+      });
+    } catch (auditError) {
+      // The audit log table may not have a metadata column on older
+      // schemas; swallow rather than failing generation.
+      logger.warn('SYSTEM', 'audit log insert failed during generation', {
+        jobId: fresh.id,
+        error: auditError instanceof Error ? auditError.message : String(auditError),
+      });
+    }
+
+    return {
+      kind: 'completed' as const,
+      jobId: fresh.id,
+      observations: persisted,
+      privateContentDetected,
+    };
+  });
+}
+
+export interface MarkGenerationFailedInput {
+  pool: PostgresPool;
+  job: PostgresObservationGenerationJob;
+  reason: string;
+  classification?: string;
+  retryable: boolean;
+  workerId?: string;
+}
+
+/**
+ * Record a failed generation attempt. Used when the provider returned an
+ * error or invalid XML. Retryable failures with attempts remaining move the
+ * job back to `queued` (with a backoff `nextAttemptAt`) so reconciliation can
+ * re-enqueue; all other failures move it to the terminal `failed` state.
+ */
+export async function markGenerationFailed(input: MarkGenerationFailedInput): Promise<void> {
+  await withPostgresTransaction(input.pool, async (client) => {
+    const jobsRepo = new PostgresObservationGenerationJobRepository(client);
+    const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
+
+    const fresh = await jobsRepo.getByIdForScope({
+      id: input.job.id,
+      projectId: input.job.projectId,
+      teamId: input.job.teamId,
+    });
+    if (!fresh || fresh.status === 'completed' || fresh.status === 'cancelled') {
+      return;
+    }
+
+    const canRetry = input.retryable && fresh.attempts < fresh.maxAttempts;
+    const target = canRetry ? 'queued' : 'failed';
+
+    await jobsRepo.transitionStatus({
+      id: fresh.id,
+      projectId: fresh.projectId,
+      teamId: fresh.teamId,
+      status: target,
+      lastError: { reason: input.reason, classification: input.classification ?? null },
+      ...(canRetry ? { nextAttemptAt: new Date(Date.now() + retryDelayMs(fresh.attempts)) } : {}),
+    });
+
+    await eventsLogRepo.append({
+      generationJobId: fresh.id,
+      projectId: fresh.projectId,
+      teamId: fresh.teamId,
+      eventType: canRetry ? 'retry_scheduled' : 'failed',
+      statusAfter: target,
+      attempt: fresh.attempts,
+      details: {
+        reason: input.reason,
+        classification: input.classification ?? null,
+        workerId: input.workerId ?? null,
+      },
+    });
+  });
+}
+
+/**
+ * Persist a parsed session summary as an observations row with kind='summary'.
+ *
+ * Wraps the same outbox transition / source-link / audit pipeline as
+ * processGeneratedResponse but emits a single 'summary'-kind observation
+ * derived from the summary fields. Idempotency is enforced through the same
+ * `observations.generation_key` UNIQUE index — re-running the summary job
+ * after a restart will collapse to one row.
+ */
+export async function processSessionSummaryResponse(
+  input: ProcessGeneratedResponseInput,
+): Promise<ProcessGeneratedResponseOutcome> {
+  const { job, rawText } = input;
+
+  if (job.sourceType !== 'session_summary') {
+    return { kind: 'parse_error', jobId: job.id, reason: 'session summary processor invoked on non-summary job' };
+  }
+
+  const parsed = parseAgentXml(rawText, job.id);
+  if (!parsed.valid) {
+    return { kind: 'parse_error', jobId: job.id, reason: 'parser rejected summary response' };
+  }
+
+  const summary = parsed.summary ?? null;
+  const skipped = summary?.skipped === true;
+  const summaryContent = summary ? renderSummaryContent(summary) : '';
+  const privateContentDetected = skipped || summaryContent.trim().length === 0;
+
+  return await withPostgresTransaction(input.pool, async (client) => {
+    const obsRepo = new PostgresObservationRepository(client);
+    const sourcesRepo = new PostgresObservationSourcesRepository(client);
+    const jobsRepo = new PostgresObservationGenerationJobRepository(client);
+    const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
+    const auditRepo = new PostgresAuthRepository(client);
+
+    const fresh = await jobsRepo.getByIdForScope({
+      id: job.id,
+      projectId: job.projectId,
+      teamId: job.teamId,
+    });
+    if (!fresh) {
+      throw new Error(`session summary generation job ${job.id} not found in scope`);
+    }
+    if (fresh.status === 'completed' || fresh.status === 'cancelled' || fresh.status === 'failed') {
+      logger.info('SYSTEM', 'session summary job already in terminal status; skipping persistence', {
+        jobId: fresh.id,
+        status: fresh.status,
+      });
+      return {
+        kind: 'completed' as const,
+        jobId: fresh.id,
+        observations: [],
+        privateContentDetected,
+      };
+    }
+
+    const persisted: PostgresObservation[] = [];
+    if (!privateContentDetected) {
+      const scrubbed = stripTags(summaryContent);
+      const scrubbedContent = scrubbed.stripped ?? '';
+      if (scrubbedContent.trim().length > 0) {
+        const generationKey = buildObservationGenerationKey({
+          generationJobId: fresh.id,
+          parsedObservationIndex: 0,
+          content: scrubbedContent,
+        });
+        const observation = await obsRepo.create({
+          projectId: fresh.projectId,
+          teamId: fresh.teamId,
+          serverSessionId: fresh.serverSessionId,
+          kind: 'summary',
+          content: scrubbedContent,
+          generationKey,
+          metadata: {
+            request: summary?.request ?? null,
+            investigated: summary?.investigated ?? null,
+            learned: summary?.learned ?? null,
+            completed: summary?.completed ?? null,
+            next_steps: summary?.next_steps ?? null,
+            notes: summary?.notes ?? null,
+            provider: input.providerLabel,
+            model: input.modelId ?? null,
+          },
+          createdByJobId: fresh.id,
+        });
+        persisted.push(observation);
+
+        await sourcesRepo.addSource({
+          observationId: observation.id,
+          projectId: fresh.projectId,
+          teamId: fresh.teamId,
+          sourceType: 'session_summary',
+          sourceId: fresh.sourceId,
+          generationJobId: fresh.id,
+          metadata: {
+            provider: input.providerLabel,
+            parsedObservationIndex: 0,
+            source_adapter: input.sourceAdapter ?? null,
+            actor_id: input.actorId ?? null,
+            api_key_id: input.apiKeyId ?? null,
+          },
+        });
+
+        // Phase 11 — observation.created audit for the summary observation.
+        try {
+          await auditRepo.createAuditLog({
+            teamId: fresh.teamId,
+            projectId: fresh.projectId,
+            actorId: input.actorId ?? null,
+            apiKeyId: input.apiKeyId ?? null,
+            action: 'observation.created',
+            resourceType: 'observation',
+            resourceId: observation.id,
+            details: {
+              generationJobId: fresh.id,
+              sourceType: 'session_summary',
+              sourceId: fresh.sourceId,
+              provider: input.providerLabel,
+              model: input.modelId ?? null,
+              sourceAdapter: input.sourceAdapter ?? null,
+              kind: 'summary',
+            },
+          });
+        } catch (auditError) {
+          logger.warn('SYSTEM', 'audit_log observation.created (summary) insert failed', {
+            observationId: observation.id,
+            error: auditError instanceof Error ? auditError.message : String(auditError),
+          });
+        }
+      }
+    }
+
+    await jobsRepo.transitionStatus({
+      id: fresh.id,
+      projectId: fresh.projectId,
+      teamId: fresh.teamId,
+      status: 'completed',
+    });
+    await eventsLogRepo.append({
+      generationJobId: fresh.id,
+      projectId: fresh.projectId,
+      teamId: fresh.teamId,
+      eventType: 'completed',
+      statusAfter: 'completed',
+      attempt: fresh.attempts,
+      details: {
+        provider: input.providerLabel,
+        model: input.modelId ?? null,
+        observationCount: persisted.length,
+        privateContentDetected,
+        workerId: input.workerId ?? null,
+        sourceType: 'session_summary',
+      },
+    });
+
+    try {
+      await auditRepo.createAuditLog({
+        teamId: fresh.teamId,
+        projectId: fresh.projectId,
+        actorId: input.actorId ?? null,
+        apiKeyId: input.apiKeyId ?? null,
+        action: 'generation_job.completed',
+        resourceType: 'observation_generation_job',
+        resourceId: fresh.id,
+        details: {
+          generationJobId: fresh.id,
+          provider: input.providerLabel,
+          model: input.modelId ?? null,
+          observationCount: persisted.length,
+          observationIds: persisted.map(o => o.id),
+          sourceAdapter: input.sourceAdapter ?? null,
+          sourceType: 'session_summary',
+        },
+      });
+    } catch (auditError) {
+      logger.warn('SYSTEM', 'audit log insert failed during summary generation', {
+        jobId: fresh.id,
+        error: auditError instanceof Error ? auditError.message : String(auditError),
+      });
+    }
+
+    return {
+      kind: 'completed' as const,
+      jobId: fresh.id,
+      observations: persisted,
+      privateContentDetected,
+    };
+  });
+}
+
+function renderSummaryContent(summary: ParsedSummary): string {
+  const parts: string[] = [];
+  if (summary.request) parts.push(`Request: ${summary.request}`);
+  if (summary.investigated) parts.push(`Investigated: ${summary.investigated}`);
+  if (summary.learned) parts.push(`Learned: ${summary.learned}`);
+  if (summary.completed) parts.push(`Completed: ${summary.completed}`);
+  if (summary.next_steps) parts.push(`Next steps: ${summary.next_steps}`);
+  if (summary.notes) parts.push(`Notes: ${summary.notes}`);
+  return parts.join('\n\n').trim();
+}
+
+function renderObservationContent(observation: ParsedObservation): string {
+  const parts: string[] = [];
+  if (observation.title) parts.push(observation.title);
+  if (observation.subtitle) parts.push(observation.subtitle);
+  if (observation.narrative) parts.push(observation.narrative);
+  if (observation.facts && observation.facts.length > 0) {
+    parts.push(observation.facts.map(f => `- ${f}`).join('\n'));
+  }
+  return parts.join('\n\n').trim();
+}
+
+function retryDelayMs(attempts: number): number {
+  // Exponential backoff (base 5): 5s, 25s, 125s for attempts 0-2, then capped at 10 minutes.
+  const base = 5000 * Math.pow(5, Math.max(0, attempts));
+  return Math.min(base, 10 * 60 * 1000);
+}
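
The schedule that `retryDelayMs` produces can be checked with a standalone sketch. The function body below is copied from the diff; `backoffSchedule` is a hypothetical helper added only to list the delays for consecutive attempt counts.

```typescript
// Standalone copy of the backoff formula from retryDelayMs, for illustration.
function retryDelayMs(attempts: number): number {
  const base = 5000 * Math.pow(5, Math.max(0, attempts));
  return Math.min(base, 10 * 60 * 1000);
}

// Hypothetical helper: delays (in ms) for attempt counts 0 .. n-1.
function backoffSchedule(n: number): number[] {
  return Array.from({ length: n }, (_, attempt) => retryDelayMs(attempt));
}

// Attempts 0-2 give 5s, 25s, 125s; attempt 3 would be 625s and is capped.
console.log(backoffSchedule(5));
```

Because `markGenerationFailed` passes `fresh.attempts` directly, the first delay a job actually sees depends on whether `attempts` is incremented before or after the failure is recorded, which this diff does not show.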
@@ -0,0 +1,247 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { logger } from '../../../utils/logger.js';
+import {
+  ServerClassifiedProviderError,
+  parseRetryAfterMs,
+} from './shared/error-classification.js';
+import { buildServerGenerationPrompt } from './shared/prompt-builder.js';
+import type {
+  ServerGenerationContext,
+  ServerGenerationProvider,
+  ServerGenerationResult,
+} from './shared/types.js';
+
+const ANTHROPIC_API_URL = 'https://api.anthropic.com/v1/messages';
+const ANTHROPIC_VERSION = '2023-06-01';
+const DEFAULT_MODEL = 'claude-3-5-sonnet-latest';
+
+export interface ClaudeObservationProviderOptions {
+  apiKey: string;
+  model?: string;
+  maxOutputTokens?: number;
+  fetchImpl?: typeof fetch;
+}
+
+interface AnthropicMessagesResponse {
+  content?: Array<{ type?: string; text?: string }>;
+  usage?: { input_tokens?: number; output_tokens?: number };
+  error?: { type?: string; message?: string };
+}
+
+export class ClaudeObservationProvider implements ServerGenerationProvider {
+  readonly providerLabel = 'claude' as const;
+  private readonly apiKey: string;
+  private readonly model: string;
+  private readonly maxOutputTokens: number;
+  private readonly fetchImpl: typeof fetch;
+
+  constructor(options: ClaudeObservationProviderOptions) {
+    if (!options.apiKey) {
+      throw new ServerClassifiedProviderError('Anthropic API key not configured', {
+        kind: 'auth_invalid',
+        cause: new Error('apiKey is required'),
+      });
+    }
+    this.apiKey = options.apiKey;
+    this.model = options.model ?? DEFAULT_MODEL;
+    this.maxOutputTokens = options.maxOutputTokens ?? 4096;
+    this.fetchImpl = options.fetchImpl ?? fetch;
+  }
+
+  async generate(
+    context: ServerGenerationContext,
+    signal?: AbortSignal,
+  ): Promise<ServerGenerationResult> {
+    const { prompt, skippedAll } = buildServerGenerationPrompt(context);
+    if (skippedAll) {
+      // All events were scrubbed by privacy stripping. Don't bill the
+      // provider; return a synthetic skip response that the parser accepts.
+      return {
+        rawText: '<skip_summary reason="all_events_private" />',
+        providerLabel: this.providerLabel,
+        modelId: this.model,
+      };
+    }
+
+    let response: Response;
+    try {
+      response = await this.fetchImpl(ANTHROPIC_API_URL, {
+        method: 'POST',
+        headers: {
+          'Content-Type': 'application/json',
+          'x-api-key': this.apiKey,
+          'anthropic-version': ANTHROPIC_VERSION,
+        },
+        body: JSON.stringify({
+          model: this.model,
+          max_tokens: this.maxOutputTokens,
+          temperature: 0.3,
+          messages: [{ role: 'user', content: prompt }],
+        }),
+        signal,
+      });
+    } catch (networkError) {
+      throw classifyClaudeServerError({
+        cause: networkError,
+      });
+    }
+
+    if (!response.ok) {
+      const bodyText = await safeReadBody(response);
+      throw classifyClaudeServerError({
+        status: response.status,
+        bodyText,
+        headers: response.headers,
+        cause: new Error(`Anthropic API error: ${response.status} - ${bodyText}`),
+      });
+    }
+
+    let data: AnthropicMessagesResponse;
+    try {
+      data = (await response.json()) as AnthropicMessagesResponse;
+    } catch (parseError) {
+      throw new ServerClassifiedProviderError('Anthropic returned invalid JSON', {
+        kind: 'parse_error',
+        cause: parseError,
+      });
+    }
+
+    if (data.error) {
+      throw classifyClaudeServerError({
+        status: response.status,
+        bodyText: `${data.error.type ?? ''} ${data.error.message ?? ''}`,
+        headers: response.headers,
+        cause: new Error(`Anthropic API error: ${data.error.type} - ${data.error.message}`),
+      });
+    }
+
+    const blocks = Array.isArray(data.content) ? data.content : [];
+    const rawText = blocks
+      .filter(block => block?.type === 'text' && typeof block.text === 'string')
+      .map(block => block.text!)
+      .join('\n')
+      .trim();
+
+    if (!rawText) {
+      logger.warn('SDK', 'Anthropic returned no text content', {
+        provider: 'claude',
+        model: this.model,
+      });
+    }
+
+    const usage = data.usage ?? {};
+    const tokensUsed =
+      typeof usage.input_tokens === 'number' || typeof usage.output_tokens === 'number'
+        ? (usage.input_tokens ?? 0) + (usage.output_tokens ?? 0)
+        : undefined;
+
+    return {
+      rawText,
+      ...(tokensUsed !== undefined ? { tokensUsed } : {}),
+      providerLabel: this.providerLabel,
+      modelId: this.model,
+    };
+  }
+}
+
+interface ClassifyInput {
+  status?: number;
+  bodyText?: string;
+  headers?: Headers | { get(name: string): string | null };
+  cause: unknown;
+}
+
+/**
+ * Anthropic-specific HTTP error classification. Mirrors worker
+ * `classifyClaudeError`, but extracted for server-beta and rebound to
+ * Anthropic Messages REST semantics rather than SDK error classes.
+ */
+export function classifyClaudeServerError(input: ClassifyInput): ServerClassifiedProviderError {
+  const status = input.status;
+  const body = input.bodyText ?? '';
+  const lower = body.toLowerCase();
+  const retryAfterMs = input.headers ? parseRetryAfterMs(input.headers.get('retry-after')) : undefined;
+
+  if (lower.includes('overloaded')) {
+    return new ServerClassifiedProviderError(
+      `Anthropic overloaded${status !== undefined ? ` (status ${status})` : ''}`,
+      { kind: 'transient', cause: input.cause },
+    );
+  }
+
+  if (status === 401 || status === 403 || lower.includes('invalid api key')) {
+    return new ServerClassifiedProviderError(
+      `Anthropic auth invalid${status !== undefined ? ` (status ${status})` : ''}`,
+      { kind: 'auth_invalid', cause: input.cause },
+    );
+  }
+
+  if (status === 429) {
+    return new ServerClassifiedProviderError('Anthropic rate limit (429)', {
+      kind: 'rate_limit',
+      cause: input.cause,
+      ...(retryAfterMs !== undefined ? { retryAfterMs } : {}),
+    });
+  }
+
+  if (lower.includes('quota exceeded')) {
+    return new ServerClassifiedProviderError('Anthropic quota exhausted', {
+      kind: 'quota_exhausted',
+      cause: input.cause,
+    });
+  }
+
+  if (
+    lower.includes('prompt is too long') ||
+    lower.includes('context window') ||
+    lower.includes('max_tokens')
+  ) {
+    return new ServerClassifiedProviderError('Anthropic context overflow', {
+      kind: 'unrecoverable',
+      cause: input.cause,
+    });
+  }
+
+  if (status === 529) {
+    return new ServerClassifiedProviderError('Anthropic overloaded (529)', {
+      kind: 'transient',
+      cause: input.cause,
+    });
+  }
+
+  if (status !== undefined && status >= 500 && status < 600) {
+    return new ServerClassifiedProviderError(`Anthropic upstream error (status ${status})`, {
+      kind: 'transient',
+      cause: input.cause,
+    });
+  }
+
+  if (status === 400) {
+    return new ServerClassifiedProviderError('Anthropic bad request (400)', {
+      kind: 'unrecoverable',
+      cause: input.cause,
+    });
+  }
+
+  if (status === undefined) {
+    const message = input.cause instanceof Error ? input.cause.message : String(input.cause);
+    return new ServerClassifiedProviderError(`Anthropic network error: ${message}`, {
+      kind: 'transient',
+      cause: input.cause,
+    });
+  }
+
+  return new ServerClassifiedProviderError(
+    `Anthropic API error: ${status}${body ? ` - ${body.substring(0, 200)}` : ''}`,
+    { kind: 'unrecoverable', cause: input.cause },
+  );
+}
+
+async function safeReadBody(response: Response): Promise<string> {
+  try {
+    return await response.text();
+  } catch {
+    return '';
+  }
+}
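
`classifyClaudeServerError` relies on `parseRetryAfterMs` from `shared/error-classification.js`, whose body is not part of this excerpt. The sketch below is an assumption about what such a helper could look like, not the PR's implementation: it handles the two `Retry-After` forms defined by HTTP (delay in seconds, or an HTTP-date). The name `parseRetryAfterSketch` and the `nowMs` parameter are hypothetical.

```typescript
// Hypothetical stand-in for parseRetryAfterMs; the real helper is not shown.
// Returns a delay in milliseconds, or undefined when the header is absent
// or cannot be interpreted.
function parseRetryAfterSketch(
  value: string | null,
  nowMs: number = Date.now(),
): number | undefined {
  if (value === null) return undefined;
  const trimmed = value.trim();
  if (trimmed === '') return undefined;

  // Form 1: non-negative integer number of seconds.
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;

  // Form 2: HTTP-date. Dates in the past clamp to a zero delay.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}
```

Returning `undefined` rather than `0` for a missing header matches how the classifier above spreads `retryAfterMs` into the error only when it is defined.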
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { logger } from '../../../utils/logger.js';
+import {
+  ServerClassifiedProviderError,
+  classifyHttpProviderError,
+  parseRetryAfterMs,
+} from './shared/error-classification.js';
+import { buildServerGenerationPrompt } from './shared/prompt-builder.js';
+import type {
+  ServerGenerationContext,
+  ServerGenerationProvider,
+  ServerGenerationResult,
+} from './shared/types.js';
+
+const GEMINI_API_URL = 'https://generativelanguage.googleapis.com/v1/models';
+const DEFAULT_MODEL = 'gemini-2.5-flash';
+
+export interface GeminiObservationProviderOptions {
+  apiKey: string;
+  model?: string;
+  maxOutputTokens?: number;
+  fetchImpl?: typeof fetch;
+}
+
+interface GeminiResponse {
+  candidates?: Array<{
+    content?: { parts?: Array<{ text?: string }> };
+  }>;
+  usageMetadata?: { totalTokenCount?: number };
+  error?: { code?: number; status?: string; message?: string };
+}
+
+export class GeminiObservationProvider implements ServerGenerationProvider {
+  readonly providerLabel = 'gemini' as const;
+  private readonly apiKey: string;
+  private readonly model: string;
+  private readonly maxOutputTokens: number;
+  private readonly fetchImpl: typeof fetch;
+
+  constructor(options: GeminiObservationProviderOptions) {
+    if (!options.apiKey) {
+      throw new ServerClassifiedProviderError('Gemini API key not configured', {
+        kind: 'auth_invalid',
+        cause: new Error('apiKey is required'),
+      });
+    }
+    this.apiKey = options.apiKey;
+    this.model = options.model ?? DEFAULT_MODEL;
+    this.maxOutputTokens = options.maxOutputTokens ?? 4096;
+    this.fetchImpl = options.fetchImpl ?? fetch;
+  }
+
+  async generate(
+    context: ServerGenerationContext,
+    signal?: AbortSignal,
+  ): Promise<ServerGenerationResult> {
+    const { prompt, skippedAll } = buildServerGenerationPrompt(context);
+    if (skippedAll) {
+      return {
+        rawText: '<skip_summary reason="all_events_private" />',
+        providerLabel: this.providerLabel,
+        modelId: this.model,
+      };
+    }
+
+    const url = `${GEMINI_API_URL}/${encodeURIComponent(this.model)}:generateContent?key=${encodeURIComponent(this.apiKey)}`;
+
+    let response: Response;
+    try {
+      response = await this.fetchImpl(url, {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({
+          contents: [{ role: 'user', parts: [{ text: prompt }] }],
+          generationConfig: {
+            temperature: 0.3,
+            maxOutputTokens: this.maxOutputTokens,
+          },
+        }),
+        signal,
+      });
+    } catch (networkError) {
+      throw classifyHttpProviderError({
+        cause: networkError,
+        providerLabel: 'Gemini',
+      });
+    }
+
+    if (!response.ok) {
+      const bodyText = await safeReadBody(response);
+      throw classifyHttpProviderError({
+        status: response.status,
+        bodyText,
+        headers: response.headers,
+        cause: new Error(`Gemini API error: ${response.status} - ${bodyText}`),
+        providerLabel: 'Gemini',
+      });
+    }
+
+    let data: GeminiResponse;
+    try {
+      data = (await response.json()) as GeminiResponse;
+    } catch (parseError) {
+      throw new ServerClassifiedProviderError('Gemini returned invalid JSON', {
+        kind: 'parse_error',
+        cause: parseError,
+      });
+    }
+
+    if (data.error) {
+      throw classifyHttpProviderError({
+        status: response.status,
+        bodyText: `${data.error.status ?? ''} ${data.error.message ?? ''}`,
+        headers: response.headers,
+        cause: new Error(`Gemini API error: ${data.error.status} - ${data.error.message}`),
+        providerLabel: 'Gemini',
+      });
+    }
+
+    const rawText = data.candidates?.[0]?.content?.parts?.[0]?.text?.trim() ?? '';
+    if (!rawText) {
+      logger.warn('SDK', 'Gemini returned empty content', { provider: 'gemini', model: this.model });
+    }
+
+    const tokensUsed = typeof data.usageMetadata?.totalTokenCount === 'number'
+      ? data.usageMetadata.totalTokenCount
+      : undefined;
+
+    return {
+      rawText,
+      ...(tokensUsed !== undefined ? { tokensUsed } : {}),
+      providerLabel: this.providerLabel,
+      modelId: this.model,
+    };
+  }
+}
+
+// Re-export for tests/auditing parity with worker classifier surface.
+export { parseRetryAfterMs };
+
+async function safeReadBody(response: Response): Promise<string> {
+  try {
+    return await response.text();
+  } catch {
+    return '';
+  }
+}
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { logger } from '../../../utils/logger.js';
+import {
+  ServerClassifiedProviderError,
+  classifyHttpProviderError,
+} from './shared/error-classification.js';
+import { buildServerGenerationPrompt } from './shared/prompt-builder.js';
+import type {
+  ServerGenerationContext,
+  ServerGenerationProvider,
+  ServerGenerationResult,
+} from './shared/types.js';
+
+const OPENROUTER_API_URL = 'https://openrouter.ai/api/v1/chat/completions';
+const DEFAULT_MODEL = 'anthropic/claude-3.5-sonnet';
+
+export interface OpenRouterObservationProviderOptions {
+  apiKey: string;
+  model?: string;
+  maxOutputTokens?: number;
+  siteUrl?: string;
+  appName?: string;
+  fetchImpl?: typeof fetch;
+}
+
+interface OpenRouterResponse {
+  choices?: Array<{ message?: { content?: string } }>;
+  usage?: { total_tokens?: number };
+  error?: { code?: string | number; message?: string };
+}
+
+export class OpenRouterObservationProvider implements ServerGenerationProvider {
+  readonly providerLabel = 'openrouter' as const;
+  private readonly apiKey: string;
+  private readonly model: string;
+  private readonly maxOutputTokens: number;
+  private readonly siteUrl: string;
+  private readonly appName: string;
+  private readonly fetchImpl: typeof fetch;
+
+  constructor(options: OpenRouterObservationProviderOptions) {
+    if (!options.apiKey) {
+      throw new ServerClassifiedProviderError('OpenRouter API key not configured', {
+        kind: 'auth_invalid',
+        cause: new Error('apiKey is required'),
+      });
+    }
+    this.apiKey = options.apiKey;
+    this.model = options.model ?? DEFAULT_MODEL;
+    this.maxOutputTokens = options.maxOutputTokens ?? 4096;
+    this.siteUrl = options.siteUrl ?? 'https://github.com/thedotmack/claude-mem';
+    this.appName = options.appName ?? 'claude-mem';
+    this.fetchImpl = options.fetchImpl ?? fetch;
+  }
+
+  async generate(
+    context: ServerGenerationContext,
+    signal?: AbortSignal,
+  ): Promise<ServerGenerationResult> {
+    const { prompt, skippedAll } = buildServerGenerationPrompt(context);
+    if (skippedAll) {
+      return {
+        rawText: '<skip_summary reason="all_events_private" />',
+        providerLabel: this.providerLabel,
+        modelId: this.model,
+      };
+    }
+
+    let response: Response;
+    try {
+      response = await this.fetchImpl(OPENROUTER_API_URL, {
+        method: 'POST',
+        headers: {
+          Authorization: `Bearer ${this.apiKey}`,
+          'HTTP-Referer': this.siteUrl,
+          'X-Title': this.appName,
+          'Content-Type': 'application/json',
+        },
+        body: JSON.stringify({
+          model: this.model,
+          messages: [{ role: 'user', content: prompt }],
+          temperature: 0.3,
+          max_tokens: this.maxOutputTokens,
+        }),
+        signal,
+      });
+    } catch (networkError) {
+      throw classifyHttpProviderError({
+        cause: networkError,
+        providerLabel: 'OpenRouter',
+      });
+    }
+
+    if (!response.ok) {
+      const bodyText = await safeReadBody(response);
+      throw classifyHttpProviderError({
+        status: response.status,
+        bodyText,
+        headers: response.headers,
+        cause: new Error(`OpenRouter API error: ${response.status} - ${bodyText}`),
+        providerLabel: 'OpenRouter',
+      });
+    }
+
+    let data: OpenRouterResponse;
+    try {
+      data = (await response.json()) as OpenRouterResponse;
+    } catch (parseError) {
+      throw new ServerClassifiedProviderError('OpenRouter returned invalid JSON', {
+        kind: 'parse_error',
+        cause: parseError,
+      });
+    }
+
+    if (data.error) {
+      throw classifyHttpProviderError({
+        status: response.status,
+        bodyText: `${data.error.code ?? ''} ${data.error.message ?? ''}`,
+        headers: response.headers,
+        cause: new Error(`OpenRouter API error: ${data.error.code} - ${data.error.message}`),
+        providerLabel: 'OpenRouter',
+      });
+    }
+
+    const rawText = data.choices?.[0]?.message?.content?.trim() ?? '';
+    if (!rawText) {
+      logger.warn('SDK', 'OpenRouter returned empty content', {
+        provider: 'openrouter',
+        model: this.model,
+      });
+    }
+
+    const tokensUsed = typeof data.usage?.total_tokens === 'number' ? data.usage.total_tokens : undefined;
+
+    return {
+      rawText,
+      ...(tokensUsed !== undefined ? { tokensUsed } : {}),
+      providerLabel: this.providerLabel,
+      modelId: this.model,
+    };
+  }
+}
+
+async function safeReadBody(response: Response): Promise<string> {
+  try {
+    return await response.text();
+  } catch {
+    return '';
+  }
+}
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: Apache-2.0
+
+// Server-beta-local copy of the worker provider error classification model.
+// Phase 5 anti-pattern guard: src/server/* must not import from
+// src/services/worker/*, so we duplicate the small, stable error model here.
+// Worker code keeps src/services/worker/provider-errors.ts unchanged.
+
+export type ServerProviderErrorClass =
+  | 'transient'
+  | 'unrecoverable'
+  | 'rate_limit'
+  | 'quota_exhausted'
+  | 'auth_invalid'
+  | 'parse_error'
+  | (string & {});
+
+export class ServerClassifiedProviderError extends Error {
+  readonly kind: ServerProviderErrorClass;
+  readonly retryAfterMs?: number;
+  readonly cause: unknown;
+
+  constructor(
+    message: string,
+    opts: {
+      kind: ServerProviderErrorClass;
+      cause: unknown;
+      retryAfterMs?: number;
+    },
+  ) {
+    super(message);
+    this.name = 'ServerClassifiedProviderError';
+    this.kind = opts.kind;
+    this.cause = opts.cause;
+    if (opts.retryAfterMs !== undefined) {
+      this.retryAfterMs = opts.retryAfterMs;
+    }
+  }
+}
+
+export function isServerClassified(err: unknown): err is ServerClassifiedProviderError {
+  return err instanceof ServerClassifiedProviderError;
+}
+
+/**
+ * Parse Retry-After header (seconds or HTTP-date). Returns ms or undefined.
+ * Behavior intentionally mirrors the worker providers' helper so server
+ * retries match worker retry policy.
+ */
+export function parseRetryAfterMs(value: string | null): number | undefined {
+  if (!value) return undefined;
+  const seconds = Number(value);
+  if (!Number.isNaN(seconds) && seconds >= 0) {
+    return Math.floor(seconds * 1000);
+  }
+  const dateMs = Date.parse(value);
+  if (!Number.isNaN(dateMs)) {
+    const delta = dateMs - Date.now();
+    return delta > 0 ? delta : 0;
+  }
+  return undefined;
+}
+
+interface ClassifyHttpInput {
+  status?: number;
+  bodyText?: string;
+  headers?: Headers | { get(name: string): string | null };
+  cause: unknown;
+  providerLabel: string;
+}
+
+/**
+ * Generic HTTP-error → ServerClassifiedProviderError mapping shared by the
+ * Gemini and OpenRouter server adapters. Quota markers in the response body
+ * are checked first, so a quota error takes precedence over its HTTP status.
+ * This module does not yet define per-provider classifier wrappers.
+ */
+export function classifyHttpProviderError(input: ClassifyHttpInput): ServerClassifiedProviderError {
+  const { status, providerLabel } = input;
+  const body = input.bodyText ?? '';
+  const lower = body.toLowerCase();
+  const retryAfterMs = input.headers ? parseRetryAfterMs(input.headers.get('retry-after')) : undefined;
+
+  if (
+    lower.includes('quota exceeded') ||
+    lower.includes('insufficient credits') ||
+    lower.includes('insufficient_quota') ||
+    lower.includes('resource_exhausted')
+  ) {
+    return new ServerClassifiedProviderError(
+      `${providerLabel} quota exhausted${status !== undefined ? ` (status ${status})` : ''}`,
+      { kind: 'quota_exhausted', cause: input.cause },
+    );
+  }
+
+  if (status === 429) {
+    return new ServerClassifiedProviderError(`${providerLabel} rate limit (429)`, {
+      kind: 'rate_limit',
+      cause: input.cause,
+      ...(retryAfterMs !== undefined ? { retryAfterMs } : {}),
+    });
+  }
+
+  if (status === 401 || status === 403) {
+    return new ServerClassifiedProviderError(`${providerLabel} auth error (status ${status})`, {
+      kind: 'auth_invalid',
+      cause: input.cause,
+    });
+  }
+
+  if (status === 400 || status === 404) {
+    return new ServerClassifiedProviderError(`${providerLabel} bad request (status ${status})`, {
+      kind: 'unrecoverable',
+      cause: input.cause,
+    });
+  }
+
+  if (status !== undefined && status >= 500 && status < 600) {
+    return new ServerClassifiedProviderError(`${providerLabel} upstream error (status ${status})`, {
+      kind: 'transient',
+      cause: input.cause,
+    });
+  }
+
+  if (status === undefined) {
+    const message = input.cause instanceof Error ? input.cause.message : String(input.cause);
+    return new ServerClassifiedProviderError(`${providerLabel} network error: ${message}`, {
+      kind: 'transient',
+      cause: input.cause,
+    });
+  }
+
+  return new ServerClassifiedProviderError(
+    `${providerLabel} API error: ${status}${body ? ` - ${body.substring(0, 200)}` : ''}`,
+    { kind: 'unrecoverable', cause: input.cause },
+  );
+}
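Review note: the classifier above only labels a failure; the retry policy lives with the caller. The sketch below shows one way a caller might turn a `kind` plus an optional `retryAfterMs` into a retry decision. `decideRetry`, `RETRYABLE_KINDS`, and the backoff constants are hypothetical names chosen for illustration, and the choice of which kinds are retryable is an assumption, not something this diff defines.

```typescript
// Hypothetical helper, not part of this PR.
type ErrorKind =
  | 'transient'
  | 'unrecoverable'
  | 'rate_limit'
  | 'quota_exhausted'
  | 'auth_invalid'
  | 'parse_error';

interface RetryDecision {
  retryable: boolean;
  delayMs?: number;
}

// Assumption: only transient failures and rate limits are worth retrying.
const RETRYABLE_KINDS: ReadonlySet<ErrorKind> = new Set<ErrorKind>(['transient', 'rate_limit']);

function decideRetry(
  kind: ErrorKind,
  attempt: number, // 1-based attempt number that just failed
  retryAfterMs?: number,
  baseDelayMs = 1_000,
  maxDelayMs = 60_000,
): RetryDecision {
  if (!RETRYABLE_KINDS.has(kind)) {
    return { retryable: false };
  }
  // Exponential backoff: base * 2^(attempt - 1), capped at maxDelayMs.
  const backoff = Math.min(maxDelayMs, baseDelayMs * 2 ** Math.max(0, attempt - 1));
  // Honor the server's Retry-After hint when it is longer than our own backoff.
  const delayMs = retryAfterMs !== undefined ? Math.max(retryAfterMs, backoff) : backoff;
  return { retryable: true, delayMs };
}
```

Taking the larger of the server hint and the local backoff keeps the caller from retrying sooner than the provider asked for, while still backing off when no hint is present.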
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { ModeManager } from '../../../../services/domain/ModeManager.js';
+import type { ModeConfig, ObservationType } from '../../../../services/domain/types.js';
+import { stripTags } from '../../../../utils/tag-stripping.js';
+import type { PostgresAgentEvent } from '../../../../storage/postgres/agent-events.js';
+import type { ServerGenerationContext } from './types.js';
+
+// Fallback list mirrors the default observation types used by claude-mem
+// modes. The server-beta prompt does not strictly need a loaded mode file —
+// the parser accepts any of these as the <type> value — so when no mode is
+// loaded (tests, fresh installs) we synthesize a minimal type list rather
+// than throwing.
+const FALLBACK_OBSERVATION_TYPES: ReadonlyArray<Pick<ObservationType, 'id'>> = [
+  { id: 'discovery' },
+  { id: 'progress' },
+  { id: 'blocker' },
+  { id: 'decision' },
+];
+
+// Build a single-shot generation prompt from a list of AgentEvent records
+// plus project/session metadata. Output: a user prompt asking the provider
+// to return one or more <observation> XML blocks (or an empty response if
+// the batch should be skipped). This is intentionally a single-turn request
+// — server-beta does NOT use the worker's multi-turn SDK conversation
+// model. parseAgentXml(...) accepts the response unchanged.
+//
+// Privacy: every event payload field passes through `stripTags` (which
+// removes <private>, <claude-mem-context>, <system-reminder>, etc.) before
+// being included in the prompt. Privacy enforcement here is belt-and-suspenders
+// — `processGeneratedResponse` also discards observations that are entirely
+// derived from privately-tagged inputs.
+
+export interface BuildServerPromptResult {
+  readonly prompt: string;
+  readonly hadPrivateContent: boolean;
+  readonly skippedAll: boolean;
+}
+
+const MAX_PAYLOAD_CHARS = 16 * 1024;
+
+export function buildServerGenerationPrompt(
+  context: ServerGenerationContext,
+  options: { mode?: ModeConfig } = {},
+): BuildServerPromptResult {
+  const mode = options.mode ?? loadActiveModeOrFallback();
+
+  let hadPrivateContent = false;
+  let allEventsScrubbedToEmpty = true;
+  const eventBlocks: string[] = [];
+
+  for (const event of context.events) {
+    const block = buildEventBlock(event);
+    if (block.hadPrivate) {
+      hadPrivateContent = true;
+    }
+    if (block.body.length > 0) {
+      allEventsScrubbedToEmpty = false;
+      eventBlocks.push(block.body);
+    }
+  }
+
+  const skippedAll = context.events.length > 0 && allEventsScrubbedToEmpty;
+
+  const sessionTag = context.project.serverSessionId
+    ? `\n  <server_session_id>${escapeXml(context.project.serverSessionId)}</server_session_id>`
+    : '';
+  const projectTag = context.project.projectName
+    ? `\n  <project_name>${escapeXml(context.project.projectName)}</project_name>`
+    : '';
+
+  const observationOutputSchema = buildObservationOutputSchema(mode);
+
+  const prompt = [
+    '<server_beta_observation_request>',
+    `  <project_id>${escapeXml(context.project.projectId)}</project_id>`,
+    `  <team_id>${escapeXml(context.project.teamId)}</team_id>` + sessionTag + projectTag,
+    `  <generation_job_id>${escapeXml(context.job.id)}</generation_job_id>`,
+    '  <agent_events>',
+    eventBlocks.length > 0 ? eventBlocks.join('\n') : '    <!-- empty after privacy stripping -->',
+    '  </agent_events>',
+    '</server_beta_observation_request>',
+    '',
+    'You are observing an agent at work. Return one or more',
+    '<observation>...</observation> XML blocks summarizing durable, useful',
+    'discoveries from the events above. If the events contain nothing worth',
+    'recording (e.g., everything was scrubbed by privacy filters or the',
+    'activity was trivial), return a single self-closing <skip_summary />',
+    'tag and nothing else. Do not include any prose outside the XML.',
+    '',
+    'Schema for each <observation> block:',
+    observationOutputSchema,
+  ].join('\n');
+
+  return { prompt, hadPrivateContent, skippedAll };
+}
+
+interface EventBlockResult {
+  body: string;
+  hadPrivate: boolean;
+}
+
+function buildEventBlock(event: PostgresAgentEvent): EventBlockResult {
+  const rawPayload =
+    typeof event.payload === 'string' ? event.payload : JSON.stringify(event.payload ?? {}, null, 2);
+
+  const stripResult = stripTags(rawPayload);
+  const hadPrivate = (stripResult.counts.private ?? 0) > 0;
+  const truncatedPayload = stripResult.stripped.length > MAX_PAYLOAD_CHARS
+    ? stripResult.stripped.slice(0, MAX_PAYLOAD_CHARS) + '\n[...truncated]'
+    : stripResult.stripped;
+
+  if (truncatedPayload.trim().length === 0) {
+    return { body: '', hadPrivate };
+  }
+
+  return {
+    body: [
+      '    <agent_event>',
+      `      <id>${escapeXml(event.id)}</id>`,
+      `      <event_type>${escapeXml(event.eventType)}</event_type>`,
+      `      <source_adapter>${escapeXml(event.sourceAdapter)}</source_adapter>`,
+      `      <occurred_at>${new Date(event.occurredAtEpoch).toISOString()}</occurred_at>`,
+      '      <payload>',
+      escapeXml(truncatedPayload),
+      '      </payload>',
+      '    </agent_event>',
+    ].join('\n'),
+    hadPrivate,
+  };
+}
+
+function loadActiveModeOrFallback(): ModeConfig | { observation_types: ReadonlyArray<Pick<ObservationType, 'id'>> } {
+  try {
+    return ModeManager.getInstance().getActiveMode();
+  } catch {
+    return { observation_types: FALLBACK_OBSERVATION_TYPES } as unknown as ModeConfig;
+  }
+}
+
+function buildObservationOutputSchema(mode: ModeConfig | { observation_types: ReadonlyArray<Pick<ObservationType, 'id'>> }): string {
+  const types = mode.observation_types.map(t => t.id).join(' | ');
+  return [
+    '<observation>',
+    `  <type>[ ${types} ]</type>`,
+    '  <title>...</title>',
+    '  <subtitle>...</subtitle>',
+    '  <facts><fact>...</fact></facts>',
+    '  <narrative>...</narrative>',
+    '  <concepts><concept>...</concept></concepts>',
+    '  <files_read><file>...</file></files_read>',
+    '  <files_modified><file>...</file></files_modified>',
+    '</observation>',
+  ].join('\n');
+}
+
+function escapeXml(text: string): string {
+  return text
+    .replace(/&/g, '&amp;')
+    .replace(/</g, '&lt;')
+    .replace(/>/g, '&gt;')
+    .replace(/"/g, '&quot;')
+    .replace(/'/g, '&apos;');
+}
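Review note: the prompt builder strips privacy tags before it escapes XML, and it derives `skippedAll` from whether any event body survives stripping. The order matters, because escaping first would turn `<private>` into `&lt;private&gt;` and the stripper would no longer match it. The sketch below illustrates that flow under simplifying assumptions: `stripPrivate` is a hypothetical stand-in that handles only `<private>` blocks (the real `stripTags` covers more tags and returns a different shape), and `renderPayloads` is an illustrative name, not a function in this PR.

```typescript
// Hypothetical stand-in for stripTags: removes <private>...</private> blocks only.
function stripPrivate(text: string): { stripped: string; privateCount: number } {
  let privateCount = 0;
  const stripped = text.replace(/<private>[\s\S]*?<\/private>/g, () => {
    privateCount += 1;
    return '';
  });
  return { stripped, privateCount };
}

// Same escaping order as the diff: ampersand first so entities are not double-escaped.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

interface RenderResult {
  bodies: string[];
  hadPrivate: boolean;
  skippedAll: boolean;
}

// Strip first, then escape; events that are empty after stripping are dropped.
function renderPayloads(payloads: string[]): RenderResult {
  const bodies: string[] = [];
  let hadPrivate = false;
  for (const payload of payloads) {
    const { stripped, privateCount } = stripPrivate(payload);
    if (privateCount > 0) hadPrivate = true;
    if (stripped.trim().length > 0) bodies.push(escapeXml(stripped));
  }
  // An empty input batch is not "skipped"; only a non-empty batch that
  // was scrubbed to nothing counts.
  return { bodies, hadPrivate, skippedAll: payloads.length > 0 && bodies.length === 0 };
}
```

A batch made up entirely of private content therefore short-circuits before any provider call, which is what lets the adapters return the `skip_summary` marker without spending tokens.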
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import type { PostgresAgentEvent } from '../../../../storage/postgres/agent-events.js';
+import type { PostgresObservationGenerationJob } from '../../../../storage/postgres/generation-jobs.js';
+
+// ServerGenerationContext is the input handed to a server provider adapter.
+// It is reloaded from Postgres on every retry; BullMQ payload is advisory.
+// Anti-pattern guard: this MUST NOT carry worker session state.
+export interface ServerGenerationContext {
+  readonly job: PostgresObservationGenerationJob;
+  readonly events: readonly PostgresAgentEvent[];
+  readonly project: {
+    readonly projectId: string;
+    readonly teamId: string;
+    readonly serverSessionId: string | null;
+    readonly projectName?: string | null;
+  };
+}
+
+// ServerGenerationResult is the raw provider response (XML accepted by
+// parseAgentXml). Empty string means provider returned nothing — handled
+// upstream as a "skip with no observation" outcome by processGeneratedResponse.
+export interface ServerGenerationResult {
+  readonly rawText: string;
+  readonly tokensUsed?: number;
+  readonly providerLabel: string;
+  readonly modelId?: string;
+}
+
+export interface ServerGenerationProvider {
+  readonly providerLabel: 'claude' | 'gemini' | 'openrouter';
+  generate(context: ServerGenerationContext, signal?: AbortSignal): Promise<ServerGenerationResult>;
+}
@@ -2,10 +2,12 @@

 import {
  Queue,
+  QueueEvents,
  Worker,
  type Job,
  type JobsOptions,
  type Processor,
+  type QueueEventsOptions,
  type QueueOptions,
  type WorkerOptions
 } from 'bullmq';
@@ -33,6 +35,22 @@ export interface ServerJobCounts {
  completed: number;
 }

+// Phase 12 — runtime stalled counter. BullMQ doesn't expose a stalled counter
+// from getJobCounts (the underlying list is rotated on consumption). We keep
+// a per-process counter that tracks how many distinct stalled events we've
+// observed since startup. /api/health and /v1/info surface this.
+export interface ServerJobLifecycleCounters {
+  stalled: number;
+  errored: number;
+}
+
+export interface ServerJobObservedListener {
+  onCompleted?: (jobId: string, durationMs: number, returnvalue: unknown) => void;
+  onFailed?: (jobId: string | undefined, attemptsMade: number, reason: string) => void;
+  onStalled?: (jobId: string) => void;
+  onError?: (error: unknown) => void;
+}
+
 export interface ServerJobQueueOptions<TPayload> {
  name: string;
  config: RedisQueueConfig;
@@ -63,7 +81,18 @@ export class ServerJobQueue<TPayload extends object = object> {
  private readonly workerFactory?: ServerJobQueueOptions<TPayload>['workerFactory'];
  private queue: ReturnType<NonNullable<ServerJobQueueOptions<TPayload>['queueFactory']>> | Queue<TPayload> | null = null;
  private worker: ReturnType<NonNullable<ServerJobQueueOptions<TPayload>['workerFactory']>> | Worker<TPayload> | null = null;
+  private queueEvents: QueueEvents | null = null;
  private started = false;
+  private readonly counters: ServerJobLifecycleCounters = { stalled: 0, errored: 0 };
+  private readonly listeners: ServerJobObservedListener[] = [];
+  private readonly jobStartTimes = new Map<string, number>();
+  // worker.on('stalled') and the QueueEvents 'stalled' subscriber both fire
+  // for the same job — BullMQ's docs explicitly recommend listening on both
+  // for production reliability. To avoid double-counting and double-callback
+  // we record each stalled jobId here for a short TTL and treat the second
+  // signal as an idempotent no-op.
+  private readonly recentlyStalled = new Map<string, NodeJS.Timeout>();
+  private static readonly STALLED_DEDUPE_WINDOW_MS = 30_000;

  constructor(options: ServerJobQueueOptions<TPayload>) {
    this.name = options.name;
@@ -154,6 +183,53 @@ export class ServerJobQueue<TPayload extends object = object> {
  // BullMQ docs require `worker.on('error', ...)` to avoid unhandled rejections
  // when a job throws. We construct the Worker with autorun: false so the
  // caller controls startup explicitly via run().
+  //
+  // Phase 12: start() also wires `active`, `completed`, `failed`, `progress`,
+  // and `stalled` on the worker plus the QueueEvents `stalled` listener, since
+  // BullMQ's docs note rare stalls don't always reach worker.on('stalled').
+  //
+  // notifyStalled is the deduped stalled handler: it counts a stall once even
+  // if both worker.on('stalled') and QueueEvents 'stalled' report it.
+  private notifyStalled(jobId: string, source: 'worker' | 'queue-events'): void {
+    if (this.recentlyStalled.has(jobId)) {
+      logger.debug?.('QUEUE', `[generation] job=${jobId} stalled (suppressed duplicate from ${source})`, {
+        queue: this.name,
+        jobId,
+        source,
+      });
+      return;
+    }
+    const timer = setTimeout(() => {
+      this.recentlyStalled.delete(jobId);
+    }, ServerJobQueue.STALLED_DEDUPE_WINDOW_MS);
+    if (typeof (timer as { unref?: () => void }).unref === 'function') {
+      (timer as { unref: () => void }).unref();
+    }
+    this.recentlyStalled.set(jobId, timer);
+    this.counters.stalled += 1;
+    logger.warn('QUEUE', `[generation] job=${jobId} stalled${source === 'queue-events' ? ' (queue-events)' : ''}`, {
+      queue: this.name,
+      jobId,
+      source,
+    });
+    for (const l of this.listeners) {
+      try { l.onStalled?.(jobId); } catch { /* listener errors must not propagate */ }
+    }
+  }
+
+  // Single source of truth for queue-side error accounting. worker errors and
+  // QueueEvents errors both increment counters.errored and notify listeners,
+  // so per-process metrics aren't asymmetric across the two sources.
+  private notifyQueueError(error: unknown, source: 'worker' | 'queue-events'): void {
+    this.counters.errored += 1;
+    logger.warn('QUEUE', `${this.name} ${source} error`, {
+      error: error instanceof Error ? error.message : String(error),
+    });
+    for (const l of this.listeners) {
+      try { l.onError?.(error); } catch { /* listener errors must not propagate */ }
+    }
+  }
+
  start(processor: Processor<TPayload>): void {
    if (this.started) {
      throw new Error(`ServerJobQueue ${this.name} is already started`);
@@ -168,22 +244,115 @@ export class ServerJobQueue<TPayload extends object = object> {
    const worker = this.workerFactory
      ? this.workerFactory(this.name, processor, workerOptions)
      : new Worker<TPayload>(this.name, processor, workerOptions);
-    worker.on('error', (error: unknown) => {
-      logger.warn('QUEUE', `${this.name} worker error`, {
-        error: error instanceof Error ? error.message : String(error)
+    worker.on('error', (error: unknown) => this.notifyQueueError(error, 'worker'));
+    // BullMQ Worker exposes `active`, `completed`, `failed`, `progress`, and
+    // `stalled` events. We attach to all five because the runtime relies on
+    // them for observability (Phase 12).
+    if (typeof (worker as { on?: unknown }).on === 'function') {
+      const w = worker as Worker<TPayload>;
+      w.on('active', (job: Job<TPayload>) => {
+        if (job.id) this.jobStartTimes.set(job.id, Date.now());
      });
-    });
+      w.on('completed', (job: Job<TPayload>, returnvalue: unknown) => {
+        const startedAt = job.id ? this.jobStartTimes.get(job.id) : undefined;
+        const durationMs = startedAt ? Date.now() - startedAt : 0;
+        if (job.id) this.jobStartTimes.delete(job.id);
+        const sourceType = (job.data as { source_type?: string } | undefined)?.source_type ?? '?';
+        logger.info('QUEUE', `[generation] job=${job.id ?? '?'} source_type=${sourceType} duration=${durationMs}ms`, {
+          queue: this.name,
+          jobId: job.id ?? null,
+          sourceType,
+          durationMs,
+        });
+        for (const l of this.listeners) {
+          try { l.onCompleted?.(job.id ?? '?', durationMs, returnvalue); } catch { /* swallow listener errors only */ }
+        }
+      });
+      w.on('failed', (job: Job<TPayload> | undefined, error: Error) => {
+        if (job?.id) this.jobStartTimes.delete(job.id);
+        const sourceType = (job?.data as { source_type?: string } | undefined)?.source_type ?? '?';
+        const attemptsMade = job?.attemptsMade ?? 0;
+        logger.warn('QUEUE', `[generation] job=${job?.id ?? '?'} source_type=${sourceType} attempts=${attemptsMade} reason=${error.message}`, {
+          queue: this.name,
+          jobId: job?.id ?? null,
+          sourceType,
+          attemptsMade,
+          reason: error.message,
+        });
+        for (const l of this.listeners) {
+          try { l.onFailed?.(job?.id, attemptsMade, error.message); } catch { /* swallow */ }
+        }
+      });
+      w.on('progress', (job: Job<TPayload>, progress: unknown) => {
+        logger.debug?.('QUEUE', `[generation] job=${job.id ?? '?'} progress`, {
+          queue: this.name,
+          jobId: job.id ?? null,
+          progress,
+        });
+      });
+      w.on('stalled', (jobId: string) => this.notifyStalled(jobId, 'worker'));
+    }
    worker.run();
    this.worker = worker;
+
+    // QueueEvents subscribes to Redis pub/sub for cross-process events
+    // (BullMQ "Stalled Jobs" docs recommend this for production reliability).
+    // Skip in test/factory mode since the test factory does not provide a
+    // real Redis connection.
+    if (!this.workerFactory) {
+      try {
+        const events = new QueueEvents(this.name, {
+          connection: this.config.connection,
+          prefix: this.config.prefix,
+        } as QueueEventsOptions);
+        events.on('stalled', ({ jobId }: { jobId: string }) => this.notifyStalled(jobId, 'queue-events'));
+        // QueueEvents emits its own 'error' too — surface through the same
+        // counter+listener path as worker errors so observability stays symmetric.
+        events.on('error', (error: Error) => this.notifyQueueError(error, 'queue-events'));
+        this.queueEvents = events;
+      } catch (error) {
+        logger.warn('QUEUE', `${this.name} failed to start QueueEvents listener`, {
+          error: error instanceof Error ? error.message : String(error),
+        });
+      }
+    }
+
    this.started = true;
  }

+  /**
+   * Phase 12 — register an observer for completed/failed/stalled/error
+   * events. Used by the runtime to surface lifecycle hooks (audit, metrics)
+   * without subclassing. Listeners that throw are isolated.
+   */
+  observe(listener: ServerJobObservedListener): void {
+    this.listeners.push(listener);
+  }
+
+  /**
+   * Phase 12 — runtime counters for stalled/errored events. waiting/active/
+   * completed/failed/delayed live in `getCounts()` (BullMQ getJobCounts).
+   * Stalled is a per-process counter because BullMQ rotates the underlying
+   * list and there's no reliable count from getJobCounts.
+   */
+  getLifecycleCounters(): ServerJobLifecycleCounters {
+    return { ...this.counters };
+  }
+
  isStarted(): boolean {
    return this.started;
  }

  async close(): Promise<void> {
    const errors: Error[] = [];
+    if (this.queueEvents) {
+      try {
+        await this.queueEvents.close();
+      } catch (error) {
+        errors.push(error instanceof Error ? error : new Error(String(error)));
+      }
+      this.queueEvents = null;
+    }
    if (this.worker) {
      try {
        await this.worker.close();
@@ -201,6 +370,10 @@ export class ServerJobQueue<TPayload extends object = object> {
      }
      this.queue = null;
    }
+    for (const timer of this.recentlyStalled.values()) {
+      clearTimeout(timer);
+    }
+    this.recentlyStalled.clear();
    if (errors.length > 0) {
      throw errors[0];
    }
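Review note: the stalled-event dedupe in this file keeps one `setTimeout` per job id and clears them in `close()`. A timestamp-based variant is sketched below for comparison; it is a different technique from the timer map in the diff, not a restatement of it. `StalledDeduper`, `shouldCount`, and the injectable `now` clock are hypothetical names introduced for illustration.

```typescript
// Hypothetical alternative to the timer-based dedupe: store an expiry
// timestamp per job id instead of a timer handle.
class StalledDeduper {
  private readonly seenUntil = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs: number, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true the first time a job id is seen within the window,
  // false for duplicates that arrive before the window expires.
  shouldCount(jobId: string): boolean {
    const t = this.now();
    // Prune expired entries so the map does not grow without bound.
    for (const [id, expiry] of this.seenUntil) {
      if (t >= expiry) this.seenUntil.delete(id);
    }
    if (this.seenUntil.has(jobId)) return false;
    this.seenUntil.set(jobId, t + this.windowMs);
    return true;
  }

  size(): number {
    return this.seenUntil.size;
  }
}
```

This variant needs no cleanup on shutdown and is easy to test with a fake clock, at the cost of a linear prune on every call. The timer-based approach in the diff avoids that scan but has to unref and clear its timers.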
@@ -9,11 +9,12 @@ import type { JsonObject } from '../../storage/postgres/utils.js';
 import { logger } from '../../utils/logger.js';
 import { buildServerJobId } from './job-id.js';
 import type { ServerJobQueue } from './ServerJobQueue.js';
-import type {
-  GenerateObservationsForEventJob,
-  GenerateSessionSummaryJob,
-  ReindexObservationJob,
-  ServerGenerationJobKind
+import {
+  assertServerGenerationJobPayload,
+  type GenerateObservationsForEventJob,
+  type GenerateSessionSummaryJob,
+  type ReindexObservationJob,
+  type ServerGenerationJobKind,
 } from './types.js';

 // Postgres outbox is canonical history; BullMQ is the execution transport.
@@ -86,6 +87,10 @@ export async function enqueueOutbox(
  });

  try {
+    // Phase 11 — defense in depth. Validate the payload shape at the queue
+    // boundary so a malformed enqueue is rejected synchronously and never
+    // produces a job whose audit trail is missing fields.
+    assertServerGenerationJobPayload(payload);
    await queue.add(bullmqJobId, payload);
    await eventsRepo.append({
      generationJobId: row.id,
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: Apache-2.0

+import { z } from 'zod';
 import type {
  ObservationGenerationJobSourceType,
  ObservationGenerationJobStatus
@@ -9,6 +10,12 @@ export type ServerGenerationJobKind = 'event' | 'event-batch' | 'summary' | 'rei

 export type ServerGenerationJobStatus = ObservationGenerationJobStatus;

+// Phase 11 — every BullMQ job carries the full team-aware tracing surface so
+// the worker can audit and scope-check on every retry. team_id and project_id
+// are advisory: the worker MUST reload the canonical outbox row from Postgres
+// and compare these fields before any side effect. Treating these as auth
+// authority would be a bypass — the comparison is a tampering detector, not
+// the auth gate.
 export interface ServerGenerationJob {
  kind: ServerGenerationJobKind;
  team_id: string;
@@ -16,6 +23,18 @@ export interface ServerGenerationJob {
  source_type: ObservationGenerationJobSourceType;
  source_id: string;
  generation_job_id: string;
+  // Identity of the API key that initiated this job at the HTTP boundary.
+  // Reused at execution time to detect revocation between enqueue and run.
+  api_key_id: string | null;
+  // The actor associated with the api key at enqueue time. Audit-only;
+  // never trust this for authz decisions.
+  actor_id: string | null;
+  // Legacy adapter or surface that produced the source row, for routing
+  // and audit (e.g. 'api', 'hooks', 'mcp', 'compat:sessions-observations').
+  source_adapter: string;
+  // Phase 12: request correlation id. Optional in the type because jobs that
+  // predate Phase 12 may omit it; when present it is a string or null.
+  request_id?: string | null;
 }

 export interface GenerateObservationsForEventJob extends ServerGenerationJob {
@@ -57,3 +76,80 @@ export const SERVER_JOB_KIND_PREFIX: Record<ServerGenerationJobKind, string> = {
  summary: 'sum',
  reindex: 'rdx'
 };
+
+// Phase 11 — Zod schema validates payloads at the queue boundary so a
+// malformed enqueue is rejected synchronously rather than silently producing
+// a job the worker can't audit. Required fields here mirror the
+// ServerGenerationJob interface; a missing team_id, project_id, or
+// generation_job_id should always be a programmer error caught at enqueue.
+
+const baseFieldsSchema = z.object({
+  team_id: z.string().min(1, 'team_id is required'),
+  project_id: z.string().min(1, 'project_id is required'),
+  source_type: z.enum(['agent_event', 'session_summary', 'observation_reindex']),
+  source_id: z.string().min(1, 'source_id is required'),
+  generation_job_id: z.string().min(1, 'generation_job_id is required'),
+  // api_key_id and actor_id are nullable to accommodate local-dev/system
+  // enqueues, but the *field* must be present in the payload so audit
+  // records always render the same shape.
+  api_key_id: z.string().min(1).nullable(),
+  actor_id: z.string().min(1).nullable(),
+  source_adapter: z.string().min(1, 'source_adapter is required'),
+  // Phase 12 — request_id is optional in the schema (older jobs predating
+  // this phase have nullable/missing values) but always passes through to
+  // logs and audit when present.
+  request_id: z.string().min(1).nullable().optional(),
+});
+
+export const GenerateObservationsForEventJobSchema = baseFieldsSchema.extend({
+  kind: z.literal('event'),
+  agent_event_id: z.string().min(1),
+});
+
+export const GenerateObservationsForEventBatchJobSchema = baseFieldsSchema.extend({
+  kind: z.literal('event-batch'),
+  agent_event_ids: z.array(z.string().min(1)).min(1),
+});
+
+export const GenerateSessionSummaryJobSchema = baseFieldsSchema.extend({
+  kind: z.literal('summary'),
+  server_session_id: z.string().min(1),
+});
+
+export const ReindexObservationJobSchema = baseFieldsSchema.extend({
+  kind: z.literal('reindex'),
+  observation_id: z.string().min(1),
+});
+
+export const ServerGenerationJobPayloadSchema = z.discriminatedUnion('kind', [
+  GenerateObservationsForEventJobSchema,
+  GenerateObservationsForEventBatchJobSchema,
+  GenerateSessionSummaryJobSchema,
+  ReindexObservationJobSchema,
+]);
+
+export class ServerGenerationJobPayloadValidationError extends Error {
+  readonly issues: z.ZodIssue[];
+
+  constructor(issues: z.ZodIssue[]) {
+    super(`invalid server generation job payload: ${issues.map(i => i.message).join('; ')}`);
+    this.issues = issues;
+  }
+}
+
+/**
+ * Validate a candidate BullMQ payload against the discriminated union and
+ * return a typed payload, or throw `ServerGenerationJobPayloadValidationError`.
+ * Use this at every enqueue site so a malformed payload never enters the
+ * transport — the worker MUST also re-validate from Postgres but defense in
+ * depth is cheap.
+ */
+export function assertServerGenerationJobPayload(
+  candidate: unknown,
+): ServerGenerationJobPayload {
+  const result = ServerGenerationJobPayloadSchema.safeParse(candidate);
+  if (!result.success) {
+    throw new ServerGenerationJobPayloadValidationError(result.error.issues);
+  }
+  return result.data as ServerGenerationJobPayload;
+}
@@ -0,0 +1,199 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { createHash } from 'crypto';
+import type { NextFunction, Request, RequestHandler, Response } from 'express';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import type { PostgresApiKey } from '../../storage/postgres/auth.js';
+import type { AuthContext } from './auth.js';
+
+// Postgres-backed auth middleware for the server-beta runtime.
+//
+// Mirrors src/server/middleware/auth.ts but reads API keys from the Postgres
+// `api_keys` table instead of bun:sqlite. Phase 4 routes use this so the
+// runtime depends only on the Postgres pool and Postgres-backed repositories.
+//
+// teamId / projectId on req.authContext come straight from the Postgres
+// api_keys row. Routes use those to scope every read and write.
+
+export interface PostgresRequireAuthOptions {
+  requiredScopes?: string[];
+  authMode?: string;
+  allowLocalDevBypass?: boolean;
+  // Local-dev fallback team for unauthenticated loopback requests. This is
+  // only used when authMode === 'local-dev' AND allowLocalDevBypass is true
+  // AND the request is on loopback. It must NEVER be used to scope a real
+  // production request.
+  localDevTeamId?: string | null;
+}
+
+export function requirePostgresServerAuth(
+  pool: PostgresPool,
+  options: PostgresRequireAuthOptions = {},
+): RequestHandler {
+  return async (req: Request, res: Response, next: NextFunction) => {
+    try {
+      const authMode = options.authMode ?? process.env.CLAUDE_MEM_AUTH_MODE ?? 'api-key';
+      const authorization = req.header('authorization') ?? '';
+      const rawKey = parseBearerToken(authorization);
+
+      const allowLocalDevBypass = options.allowLocalDevBypass
+        ?? process.env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS === '1';
+      if (
+        !rawKey
+        && authMode === 'local-dev'
+        && allowLocalDevBypass
+        && isLocalhost(req)
+        && hasLoopbackHostHeader(req)
+        && !hasForwardedClientHeaders(req)
+      ) {
+        const ctx: AuthContext = {
+          userId: null,
+          organizationId: null,
+          teamId: options.localDevTeamId ?? null,
+          projectId: null,
+          scopes: ['local-dev'],
+          apiKeyId: null,
+          mode: 'local-dev',
+        };
+        req.authContext = ctx;
+        next();
+        return;
+      }
+
+      if (!rawKey) {
+        res.status(401).json({ error: 'Unauthorized', message: 'Missing bearer API key' });
+        return;
+      }
+
+      const verified = await verifyPostgresApiKey(pool, rawKey, options.requiredScopes ?? []);
+      if (!verified) {
+        res.status(403).json({ error: 'Forbidden', message: 'Invalid API key or insufficient scope' });
+        return;
+      }
+
+      const ctx: AuthContext = {
+        userId: null,
+        organizationId: null,
+        teamId: verified.teamId,
+        projectId: verified.projectId,
+        scopes: verified.scopes,
+        apiKeyId: verified.apiKeyId,
+        mode: 'api-key',
+      };
+      req.authContext = ctx;
+      next();
+    } catch (error) {
+      next(error);
+    }
+  };
+}
+
+interface VerifiedPostgresApiKey {
+  apiKeyId: string;
+  teamId: string | null;
+  projectId: string | null;
+  scopes: string[];
+}
+
+export async function verifyPostgresApiKey(
+  pool: PostgresPool,
+  rawKey: string,
+  requiredScopes: string[],
+): Promise<VerifiedPostgresApiKey | null> {
+  const keyHash = createHash('sha256').update(rawKey).digest('hex');
+  const result = await pool.query(
+    `
+      SELECT id, team_id, project_id, scopes, revoked_at, expires_at
+      FROM api_keys
+      WHERE key_hash = $1
+    `,
+    [keyHash],
+  );
+  const row = result.rows[0] as Pick<
+    PostgresApiKey,
+    'id' | 'teamId' | 'projectId'
+  > & {
+    id: string;
+    team_id: string | null;
+    project_id: string | null;
+    scopes: unknown;
+    revoked_at: Date | null;
+    expires_at: Date | null;
+  } | undefined;
+  if (!row) {
+    return null;
+  }
+  if (row.revoked_at) {
+    return null;
+  }
+  if (row.expires_at && row.expires_at.getTime() <= Date.now()) {
+    return null;
+  }
+  const scopes = normalizeScopes(row.scopes);
+  if (!hasRequiredScopes(scopes, requiredScopes)) {
+    return null;
+  }
+  return {
+    apiKeyId: row.id,
+    teamId: row.team_id,
+    projectId: row.project_id,
+    scopes,
+  };
+}
+
+function normalizeScopes(value: unknown): string[] {
+  if (!Array.isArray(value)) {
+    return [];
+  }
+  return value.filter((item): item is string => typeof item === 'string');
+}
+
+function hasRequiredScopes(grantedScopes: string[], requiredScopes: string[]): boolean {
+  if (requiredScopes.length === 0 || grantedScopes.includes('*')) {
+    return true;
+  }
+  return requiredScopes.every(scope => grantedScopes.includes(scope));
+}
+
+function parseBearerToken(header: string): string | null {
+  const match = /^Bearer\s+(.+)$/i.exec(header.trim());
+  return match?.[1]?.trim() || null;
+}
+
+function isLocalhost(req: Request): boolean {
+  const clientIp = req.ip || req.socket.remoteAddress || '';
+  return clientIp === '127.0.0.1'
+    || clientIp === '::1'
+    || clientIp === '::ffff:127.0.0.1'
+    || clientIp === 'localhost';
+}
+
+function hasLoopbackHostHeader(req: Request): boolean {
+  const host = parseHostWithoutPort(req.header('host') ?? '');
+  return host === '127.0.0.1'
+    || host === 'localhost'
+    || host === '::1';
+}
+
+function parseHostWithoutPort(rawHost: string): string {
+  const host = rawHost.trim().toLowerCase();
+  if (host.startsWith('[')) {
+    const closeBracketIndex = host.indexOf(']');
+    return closeBracketIndex === -1 ? host : host.slice(1, closeBracketIndex);
+  }
+
+  const lastColonIndex = host.lastIndexOf(':');
+  if (lastColonIndex > -1 && /^\d+$/.test(host.slice(lastColonIndex + 1))) {
+    return host.slice(0, lastColonIndex);
+  }
+  return host;
+}
+
+function hasForwardedClientHeaders(req: Request): boolean {
+  return Boolean(
+    req.header('forwarded')
+      || req.header('x-forwarded-for')
+      || req.header('x-forwarded-host')
+      || req.header('x-real-ip'),
+  );
+}
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import { randomUUID } from 'crypto';
+import type { NextFunction, Request, RequestHandler, Response } from 'express';
+
+// Phase 12 — request_id middleware. Mints a UUID per inbound request and
+// attaches it to req.requestId so route handlers, ingest services, and
+// generation jobs can correlate logs back to the original HTTP call. Honors
+// an inbound `X-Request-Id` header so an upstream load balancer / gateway
+// can supply the id; any other value is replaced with a fresh UUID. Accepted:
+// 1-64 chars of [A-Za-z0-9_-] starting with an alphanumeric (covers UUIDs).
+//
+// Anti-pattern guard: never trust the inbound id for auth — this is purely
+// an audit/log correlator. Auth still flows through requirePostgresServerAuth.
+
+const REQUEST_ID_HEADER = 'x-request-id';
+const REQUEST_ID_MAX_LENGTH = 64;
+const REQUEST_ID_SAFE_PATTERN = /^[A-Za-z0-9][A-Za-z0-9\-_]{0,63}$/;
+
+declare module 'express-serve-static-core' {
+  interface Request {
+    requestId?: string;
+  }
+}
+
+export function requestIdMiddleware(): RequestHandler {
+  return (req: Request, res: Response, next: NextFunction) => {
+    const inbound = req.header(REQUEST_ID_HEADER);
+    const accepted = inbound && isAcceptableRequestId(inbound) ? inbound : randomUUID();
+    req.requestId = accepted;
+    res.setHeader('X-Request-Id', accepted);
+    next();
+  };
+}
+
+export function isAcceptableRequestId(value: string): boolean {
+  if (typeof value !== 'string') return false;
+  if (value.length === 0 || value.length > REQUEST_ID_MAX_LENGTH) return false;
+  return REQUEST_ID_SAFE_PATTERN.test(value);
+}
@@ -17,6 +17,23 @@ export interface ObservationQueueEngine {
  close(): Promise<void>;
 }

+// Phase 12 — `lanes` exposes per-queue counts (waiting/active/completed/
+// failed/delayed/stalled) so deploy probes can monitor saturation per lane.
+// `unavailable: true` means the sample failed; the health endpoint MUST NOT
+// 503 just because counts are stale.
+export interface ObservationQueueHealthLaneSnapshot {
+  kind: string;
+  name: string;
+  waiting: number;
+  active: number;
+  completed: number;
+  failed: number;
+  delayed: number;
+  stalled: number;
+  unavailable: boolean;
+  unavailableReason?: string;
+}
+
 export interface ObservationQueueHealth {
  engine: 'bullmq';
  redis: {
@@ -27,6 +44,7 @@ export interface ObservationQueueHealth {
    prefix: string;
    error?: string;
  };
+  lanes?: ObservationQueueHealthLaneSnapshot[];
 }

 export interface ObservationQueueInspection {
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import type { Job } from 'bullmq';
+import { logger } from '../../utils/logger.js';
+import { PostgresAuthRepository } from '../../storage/postgres/auth.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import { ProviderObservationGenerator } from '../generation/ProviderObservationGenerator.js';
+import type { ServerGenerationProvider } from '../generation/providers/shared/types.js';
+import type { ServerGenerationJobPayload } from '../jobs/types.js';
+import type { ActiveServerBetaQueueManager } from './ActiveServerBetaQueueManager.js';
+import type {
+  ServerBetaBoundaryHealth,
+  ServerBetaGenerationWorkerManager,
+} from './types.js';
+
+// ActiveServerBetaGenerationWorkerManager wires BullMQ Workers (one per lane:
+// 'event', 'summary') to a ProviderObservationGenerator. Concurrency defaults to
+// 1 per the plan (line 80–86) so retries observe a single inflight provider
+// call per server. autorun:false / explicit run() is enforced by
+// ServerJobQueue.start.
+//
+// This class is wired in only when both a queue manager AND a configured
+// provider are present. create-server-beta-service keeps the disabled
+// adapter otherwise so server beta can boot without provider credentials.
+
+export interface ActiveServerBetaGenerationWorkerManagerOptions {
+  pool: PostgresPool;
+  queueManager: ActiveServerBetaQueueManager;
+  provider: ServerGenerationProvider;
+  workerId?: string;
+  // Test seam: replace the generator with a stub.
+  generatorFactory?: (
+    pool: PostgresPool,
+    provider: ServerGenerationProvider,
+    workerId: string,
+  ) => ProviderObservationGenerator;
+}
+
+export class ActiveServerBetaGenerationWorkerManager implements ServerBetaGenerationWorkerManager {
+  readonly kind = 'generation-worker-manager' as const;
+  private started = false;
+  private closed = false;
+  private readonly generator: ProviderObservationGenerator;
+  private readonly workerId: string;
+
+  constructor(private readonly options: ActiveServerBetaGenerationWorkerManagerOptions) {
+    this.workerId = options.workerId ?? `server-beta-${process.pid}`;
+    this.generator = options.generatorFactory
+      ? options.generatorFactory(options.pool, options.provider, this.workerId)
+      : new ProviderObservationGenerator({
+          pool: options.pool,
+          provider: options.provider,
+          workerId: this.workerId,
+        });
+  }
+
+  /**
+   * Attach BullMQ Workers to the event and summary queues. Per BullMQ docs we use
+   *   new Worker(queueName, processor, { concurrency, autorun })
+   * via ServerJobQueue.start(...). Errors are surfaced through the queue
+   * wrapper's worker.on('error', ...) listener.
+   */
+  start(): void {
+    if (this.started) return;
+    const dispatcher = async (job: Job<ServerGenerationJobPayload>) => {
+      try {
+        return await this.generator.process(job);
+      } catch (error) {
+        logger.warn('SYSTEM', 'observation generator failed', {
+          jobId: job.id,
+          kind: job.data.kind,
+          error: error instanceof Error ? error.message : String(error),
+        });
+        throw error;
+      }
+    };
+    this.options.queueManager.start('event', dispatcher);
+    // Phase 6: wire the summary lane alongside the event lane. Concurrency
+    // defaults to 1 per ServerJobQueue config (per the plan), and the same
+    // ProviderObservationGenerator dispatches on job.data.source_type via the
+    // outbox row reload inside lockOutbox+process.
+    this.options.queueManager.start('summary', dispatcher);
+
+    // Phase 12 — audit stalled events directly. Phase 11's audit chain now
+    // covers the operator and provider lifecycle; stalled jobs come from
+    // the BullMQ runtime, not the HTTP boundary, so we wire them in here.
+    // Best-effort: a missing/unscoped audit MUST NOT crash the worker.
+    for (const lane of ['event', 'summary'] as const) {
+      try {
+        const queue = this.options.queueManager.getQueue(lane);
+        queue.observe({
+          onStalled: (jobId) => {
+            void this.auditStalledJob(jobId, lane);
+          },
+        });
+      } catch (error) {
+        logger.warn('SYSTEM', `failed to wire stalled observer for ${lane} lane`, {
+          error: error instanceof Error ? error.message : String(error),
+        });
+      }
+    }
+
+    this.started = true;
+  }
+
+  // Phase 12 — write a `generation_job.stalled` audit row. We look up the
+  // outbox row by BullMQ jobId (== bullmq_job_id column) so team/project
+  // scope is correct on the audit row even when the original API key
+  // metadata is unavailable (BullMQ retries can outlive a session).
+  private async auditStalledJob(bullmqJobId: string, lane: 'event' | 'summary'): Promise<void> {
+    try {
+      const result = await this.options.pool.query<{
+        id: string;
+        team_id: string | null;
+        project_id: string | null;
+      }>(
+        'SELECT id, team_id, project_id FROM observation_generation_jobs WHERE bullmq_job_id = $1 LIMIT 1',
+        [bullmqJobId],
+      );
+      const row = result.rows[0];
+      if (!row) return;
+      const repo = new PostgresAuthRepository(this.options.pool);
+      await repo.createAuditLog({
+        teamId: row.team_id,
+        projectId: row.project_id,
+        actorId: null,
+        apiKeyId: null,
+        action: 'generation_job.stalled',
+        resourceType: 'observation_generation_job',
+        resourceId: row.id,
+        details: { lane, bullmqJobId },
+      });
+    } catch (error) {
+      logger.warn('SYSTEM', 'failed to audit stalled generation_job', {
+        bullmqJobId,
+        error: error instanceof Error ? error.message : String(error),
+      });
+    }
+  }
+
+  getHealth(): ServerBetaBoundaryHealth {
+    if (this.closed) {
+      return { status: 'errored', reason: 'generation-worker-manager closed' };
+    }
+    return {
+      status: this.started ? 'active' : 'disabled',
+      reason: this.started
+        ? 'BullMQ Worker attached to event queue with ProviderObservationGenerator'
+        : 'wired but not started',
+      details: {
+        provider: this.options.provider.providerLabel,
+        workerId: this.workerId,
+      },
+    };
+  }
+
+  async close(): Promise<void> {
+    if (this.closed) return;
+    this.closed = true;
+    // The underlying Worker is owned by ServerJobQueue.close() (driven by
+    // the queue manager). We do not double-close here; the queue manager's
+    // close cascade handles it.
+  }
+}
@@ -11,6 +11,7 @@ import type { RedisQueueConfig } from '../queue/redis-config.js';
 import { logger } from '../../utils/logger.js';
 import type {
  ServerBetaBoundaryHealth,
+  ServerBetaQueueLaneMetric,
  ServerBetaQueueManager,
 } from './types.js';

@@ -75,6 +76,49 @@ export class ActiveServerBetaQueueManager implements ServerBetaQueueManager {
    };
  }

+  /**
+   * Phase 12 — per-lane counts. Returns BullMQ getJobCounts plus the
+   * per-process stalled counter. If Redis is unreachable, the lane is
+   * reported with an `unavailable` flag rather than throwing so /api/health
+   * remains responsive even in partial-failure modes.
+   */
+  async getLaneMetrics(): Promise<ServerBetaQueueLaneMetric[]> {
+    const out: ServerBetaQueueLaneMetric[] = [];
+    for (const kind of QUEUE_KINDS) {
+      const queue = this.queues.get(kind);
+      if (!queue) continue;
+      const lifecycle = queue.getLifecycleCounters();
+      try {
+        const counts = await queue.getCounts();
+        out.push({
+          kind,
+          name: SERVER_JOB_QUEUE_NAMES[kind],
+          waiting: counts.waiting,
+          active: counts.active,
+          completed: counts.completed,
+          failed: counts.failed,
+          delayed: counts.delayed,
+          stalled: lifecycle.stalled,
+          unavailable: false,
+        });
+      } catch (error) {
+        out.push({
+          kind,
+          name: SERVER_JOB_QUEUE_NAMES[kind],
+          waiting: 0,
+          active: 0,
+          completed: 0,
+          failed: 0,
+          delayed: 0,
+          stalled: lifecycle.stalled,
+          unavailable: true,
+          unavailableReason: error instanceof Error ? error.message : String(error),
+        });
+      }
+    }
+    return out;
+  }
+
  async close(): Promise<void> {
    if (this.closed) {
      return;
@@ -14,7 +14,11 @@ import {
  verifyPidFileOwnership,
  type PidInfo,
 } from '../../supervisor/process-registry.js';
-import type { ServerBetaServiceGraph } from './types.js';
+import { ServerV1PostgresRoutes } from '../routes/v1/ServerV1PostgresRoutes.js';
+import { SessionsObservationsAdapter } from '../compat/SessionsObservationsAdapter.js';
+import { SessionsSummarizeAdapter } from '../compat/SessionsSummarizeAdapter.js';
+import { ActiveServerBetaQueueManager } from './ActiveServerBetaQueueManager.js';
+import type { ServerBetaServiceGraph, ServerBetaQueueLaneMetric } from './types.js';

 const SERVER_BETA_RUNTIME = 'server-beta';
 const DEFAULT_SERVER_BETA_HOST = '127.0.0.1';
@@ -50,7 +54,12 @@ class ServerBetaRuntimeInfoRoutes implements RouteHandler {
      res.json({ status: 'ok', runtime: SERVER_BETA_RUNTIME });
    });

-    app.get('/v1/info', (_req, res) => {
+    // Phase 12 — `/v1/info` includes per-lane queue metrics so deploy probes
+    // can read waiting/active/completed/failed/delayed/stalled without
+    // hitting `/api/health`. Sampling is best-effort: a Redis blip surfaces
+    // the lane with `unavailable: true` rather than crashing the route.
+    app.get('/v1/info', async (_req, res) => {
+      const queueLanes = await collectQueueLaneMetrics(this.graph);
      res.json({
        name: 'claude-mem-server',
        runtime: SERVER_BETA_RUNTIME,
@@ -65,11 +74,28 @@ class ServerBetaRuntimeInfoRoutes implements RouteHandler {
          providerRegistry: this.graph.providerRegistry.getHealth(),
          eventBroadcaster: this.graph.eventBroadcaster.getHealth(),
        },
+        queueLanes,
      });
    });
  }
 }

+async function collectQueueLaneMetrics(
+  graph: ServerBetaServiceGraph,
+): Promise<ServerBetaQueueLaneMetric[]> {
+  const manager = graph.queueManager;
+  if (!(manager instanceof ActiveServerBetaQueueManager)) {
+    return [];
+  }
+  try {
+    return await manager.getLaneMetrics();
+  } catch {
+    // /api/health and /v1/info MUST never throw on a queue blip — surface
+    // empty lanes so the rest of the payload still renders.
+    return [];
+  }
+}
+
 export class ServerBetaService {
  private readonly graph: ServerBetaServiceGraph;
  private readonly host: string;
@@ -106,8 +132,73 @@ export class ServerBetaService {
        authMethod: this.graph.authMode,
        lastInteraction: null,
      }),
+      // Phase 10 — surface BullMQ/Valkey health on /api/health so deploy
+      // probes (and the Docker E2E) can confirm the queue engine without
+      // peeking at /v1/info. The queue manager's getHealth() returns its
+      // boundary descriptor; we shape it into the worker-compatible
+      // ObservationQueueHealth schema the Server class expects.
+      // Phase 12 — also include per-lane counts (waiting/active/completed/
+      // failed/delayed/stalled) so deploy probes can monitor saturation.
+      getQueueHealth: async () => {
+        const health = this.graph.queueManager.getHealth();
+        const details = (health.details ?? {}) as Record<string, unknown>;
+        if (health.status !== 'active' || details.engine !== 'bullmq') {
+          return null;
+        }
+        const lanes = await collectQueueLaneMetrics(this.graph);
+        return {
+          engine: 'bullmq' as const,
+          redis: {
+            status: 'ok' as const,
+            mode: String(details.mode ?? 'unknown'),
+            host: String(details.host ?? '127.0.0.1'),
+            port: typeof details.port === 'number' ? details.port : 6379,
+            prefix: String(details.prefix ?? 'claude_mem'),
+          },
+          lanes: lanes.map(lane => ({
+            kind: lane.kind,
+            name: lane.name,
+            waiting: lane.waiting,
+            active: lane.active,
+            completed: lane.completed,
+            failed: lane.failed,
+            delayed: lane.delayed,
+            stalled: lane.stalled,
+            unavailable: lane.unavailable,
+            ...(lane.unavailableReason ? { unavailableReason: lane.unavailableReason } : {}),
+          })),
+        };
+      },
    });
    server.registerRoutes(new ServerBetaRuntimeInfoRoutes(this.graph));
+    const v1Routes = new ServerV1PostgresRoutes({
+      pool: this.graph.postgres.pool,
+      queueManager: this.graph.queueManager,
+      authMode: this.graph.authMode === 'disabled' ? 'api-key' : this.graph.authMode,
+      runtime: SERVER_BETA_RUNTIME,
+      // Session policy is read inside the routes (default 'per-event' from
+      // resolveSessionGenerationPolicy(), env-overridable via
+      // CLAUDE_MEM_SERVER_SESSION_POLICY). We do not duplicate it here.
+    });
+    server.registerRoutes(v1Routes);
+
+    // Phase 9 — legacy compatibility adapters. These translate the old
+    // `/api/sessions/observations` and `/api/sessions/summarize` worker
+    // routes to the canonical Server beta event/job model. They reuse the
+    // SAME service instances as the /v1/* routes; never duplicate ingest or
+    // session-end logic. New clients should hit /v1/* directly.
+    const compatAuthMode = this.graph.authMode === 'disabled' ? 'api-key' : this.graph.authMode;
+    server.registerRoutes(new SessionsObservationsAdapter({
+      pool: this.graph.postgres.pool,
+      ingestEvents: v1Routes.getIngestEventsService(),
+      authMode: compatAuthMode,
+    }));
+    server.registerRoutes(new SessionsSummarizeAdapter({
+      pool: this.graph.postgres.pool,
+      endSession: v1Routes.getEndSessionService(),
+      authMode: compatAuthMode,
+    }));
+
    server.finalizeRoutes();

    await server.listen(this.requestedPort, this.host);
@@ -184,6 +275,28 @@ export async function runServerBetaCli(argv: string[] = process.argv.slice(2)):
  const port = getServerBetaPort();
  const host = process.env.CLAUDE_MEM_SERVER_HOST ?? DEFAULT_SERVER_BETA_HOST;

+  // Phase 10: `claude-mem server worker [start|run|--daemon]` runs the BullMQ
+  // generation worker as a foreground process: no HTTP server, no route
+  // registration. In Compose this becomes a separately scaled service.
+  if (command === 'worker') {
+    const sub = (argv[1] ?? '--daemon').toLowerCase();
+    if (sub === 'start' || sub === '--daemon' || sub === 'run') {
+      await runServerBetaGenerationWorker();
+      return;
+    }
+    console.error('Usage: server-beta-service worker start');
+    process.exit(1);
+  }
+
+  // `server api-key create|list|revoke` mirrors the worker-service tooling
+  // but writes to the Postgres `api_keys` table the server-beta runtime
+  // actually reads from. The legacy worker-service CLI talks to SQLite and
+  // would be invisible to this stack.
+  if (command === 'server' && argv[1]?.toLowerCase() === 'api-key') {
+    await runServerBetaApiKeyCli(argv.slice(2));
+    return;
+  }
+
  switch (command) {
    case 'start': {
      const existing = readServerBetaPidFile();
@@ -258,9 +371,212 @@ export async function runServerBetaCli(argv: string[] = process.argv.slice(2)):
  }
 }

+// Phase 10 — Postgres-backed `server api-key create|list|revoke` CLI. The
+// legacy `worker-service.cjs server api-key` command talks to SQLite and
+// is invisible to the server-beta runtime, which reads keys from
+// Postgres. Use this entrypoint inside Docker / Compose.
+export async function runServerBetaApiKeyCli(argv: string[]): Promise<void> {
+  const sub = argv[0]?.toLowerCase();
+  const options = parseFlagArgs(argv.slice(1));
+
+  if (!process.env.CLAUDE_MEM_SERVER_DATABASE_URL) {
+    console.error('CLAUDE_MEM_SERVER_DATABASE_URL is required for `server api-key` commands.');
+    process.exit(1);
+  }
+
+  const { getSharedPostgresPool } = await import('../../storage/postgres/index.js');
+  const { PostgresAuthRepository } = await import('../../storage/postgres/auth.js');
+  const { createHash, randomBytes } = await import('crypto');
+  const pool = getSharedPostgresPool({ requireDatabaseUrl: true });
+  const repo = new PostgresAuthRepository(pool);
+
+  try {
+    if (sub === 'create') {
+      const scopes = (options.scope ?? options.scopes ?? 'memories:read')
+        .split(',')
+        .map((scope: string) => scope.trim())
+        .filter(Boolean);
+      // Resolve team/project. If the caller passed --team/--project, honor
+      // them. Otherwise, run the server-beta bootstrap to get-or-create the
+      // local team+project, then create a NEW key against those IDs with
+      // the caller's requested scopes (the bootstrap key uses hook scopes,
+      // which is the wrong default for an arbitrary CLI-issued key).
+      let teamId = options.team ?? null;
+      let projectId = options.project ?? null;
+      if (!teamId || !projectId) {
+        const { bootstrapServerBetaApiKey } = await import('../../services/hooks/server-beta-bootstrap.js');
+        const result = await bootstrapServerBetaApiKey({ pool, closePool: false });
+        teamId = result.teamId;
+        projectId = result.projectId;
+      }
+      const rawKey = `cmem_${randomBytes(24).toString('hex')}`;
+      const keyHash = createHash('sha256').update(rawKey).digest('hex');
+      const created = await repo.createApiKey({
+        keyHash,
+        teamId,
+        projectId,
+        scopes,
+        actorId: 'system:server-beta-cli',
+      });
+      console.log(JSON.stringify({
+        id: created.id,
+        key: rawKey,
+        name: options.name ?? 'server-api-key',
+        teamId,
+        projectId,
+        scopes,
+      }, null, 2));
+      return;
+    }
+
+    if (sub === 'list') {
+      // Bound the result set to prevent unintentional cross-tenant key
+      // metadata disclosure when an admin runs `api-key list` on a shared
+      // host. Default page is 100; --team filters to a single tenant.
+      const teamFilter = options.team ?? null;
+      const limitArg = Number.parseInt(options.limit ?? '100', 10);
+      const offsetArg = Number.parseInt(options.offset ?? '0', 10);
+      const limit = Number.isFinite(limitArg) && limitArg > 0 && limitArg <= 500
+        ? limitArg
+        : 100;
+      const offset = Number.isFinite(offsetArg) && offsetArg >= 0 ? offsetArg : 0;
+      const where = teamFilter ? 'WHERE team_id = $1' : '';
+      const params: unknown[] = teamFilter ? [teamFilter, limit, offset] : [limit, offset];
+      const limitIdx = teamFilter ? 2 : 1;
+      const offsetIdx = teamFilter ? 3 : 2;
+      const result = await pool.query<{
+        id: string;
+        team_id: string | null;
+        project_id: string | null;
+        scopes: unknown;
+        revoked_at: Date | null;
+        expires_at: Date | null;
+        last_used_at: Date | null;
+        created_at: Date;
+      }>(
+        `SELECT id, team_id, project_id, scopes, revoked_at, expires_at, last_used_at, created_at
+         FROM api_keys
+         ${where}
+         ORDER BY created_at DESC
+         LIMIT $${limitIdx} OFFSET $${offsetIdx}`,
+        params,
+      );
+      console.log(JSON.stringify({
+        teamId: teamFilter,
+        limit,
+        offset,
+        count: result.rows.length,
+        keys: result.rows.map(row => ({
+          id: row.id,
+          teamId: row.team_id,
+          projectId: row.project_id,
+          scopes: row.scopes,
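+          // Report expiry too, so a lapsed key is not listed as 'active'.
+          status: row.revoked_at
+            ? 'revoked'
+            : row.expires_at && row.expires_at.getTime() <= Date.now() ? 'expired' : 'active',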
+          lastUsedAt: row.last_used_at?.toISOString() ?? null,
+          expiresAt: row.expires_at?.toISOString() ?? null,
+          createdAt: row.created_at.toISOString(),
+        })),
+      }, null, 2));
+      return;
+    }
+
+    if (sub === 'revoke') {
+      const id = argv[1];
+      if (!id) {
+        console.error('Usage: server-beta-service server api-key revoke <id>');
+        process.exit(1);
+      }
+      const result = await pool.query(
+        `UPDATE api_keys SET revoked_at = now()
+         WHERE id = $1 AND revoked_at IS NULL
+         RETURNING id`,
+        [id],
+      );
+      if (result.rowCount === 0) {
+        console.error(`API key not found or already revoked: ${id}`);
+        process.exit(1);
+      }
+      console.log(JSON.stringify({ id, status: 'revoked' }, null, 2));
+      return;
+    }
+
+    console.error(`Unknown server api-key subcommand: ${sub ?? '(none)'}`);
+    console.error('Usage: server-beta-service server api-key create|list|revoke');
+    process.exit(1);
+  } finally {
+    // Pool is shared; do not close here. The process will exit and the
+    // pool tears down via the shared module's process exit hook.
+  }
+}
+
+function parseFlagArgs(argv: string[]): Record<string, string> {
+  const out: Record<string, string> = {};
+  for (let i = 0; i < argv.length; i++) {
+    const arg = argv[i];
+    if (!arg) continue;
+    if (arg.startsWith('--')) {
+      const equalsIdx = arg.indexOf('=');
+      if (equalsIdx > -1) {
+        out[arg.slice(2, equalsIdx)] = arg.slice(equalsIdx + 1);
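+      } else {
+        // A bare `--flag` followed by another flag (or by nothing) takes no
+        // value; do not swallow the next flag as this flag's value.
+        const next = argv[i + 1];
+        if (next !== undefined && !next.startsWith('--')) {
+          out[arg.slice(2)] = next;
+          i += 1;
+        } else {
+          out[arg.slice(2)] = '';
+        }
+      }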
+    }
+  }
+  return out;
+}
+
+// Phase 10 — generation-worker-only entrypoint. Starts BullMQ workers against
+// the same Postgres + Valkey/Redis the HTTP server-beta service uses, but
+// never opens an HTTP listener. In Compose this is a separate, horizontally
+// scalable service. The HTTP server-beta service should run with
+// CLAUDE_MEM_GENERATION_DISABLED=true so generation only happens in this
+// process.
+export async function runServerBetaGenerationWorker(): Promise<void> {
+  const { validateServerBetaEnv, createServerBetaService } = await import('./create-server-beta-service.js');
+  validateServerBetaEnv();
+  // Build the service WITHOUT starting HTTP. We reuse createServerBetaService
+  // for pool + bootstrap + queue + generation worker wiring, but never call
+  // service.start(). Generation is enabled here even if env says
+  // CLAUDE_MEM_GENERATION_DISABLED, because this IS the generation worker.
+  delete process.env.CLAUDE_MEM_GENERATION_DISABLED;
+  const service = await createServerBetaService();
+  const state = service.getRuntimeState();
+  logger.info('SYSTEM', 'Server beta generation worker started (no HTTP)', {
+    pid: process.pid,
+    queue: state.boundaries.queueManager,
+    generation: state.boundaries.generationWorkerManager,
+  });
+  console.log(JSON.stringify({ status: 'worker-running', runtime: SERVER_BETA_RUNTIME, pid: process.pid }));
+
+  let stopping = false;
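+  const shutdown = async () => {
+    if (stopping) return;
+    stopping = true;
+    let exitCode = 0;
+    try {
+      await service.stop();
+    } catch (error) {
+      // Surface a failed stop to the orchestrator instead of exiting 0.
+      exitCode = 1;
+      console.error('Server beta generation worker failed to stop cleanly', error);
+    } finally {
+      process.exit(exitCode);
+    }
+  };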
+  process.once('SIGTERM', shutdown);
+  process.once('SIGINT', shutdown);
+
+  // Block forever — Workers run in background via BullMQ. Without this the
+  // process would exit and BullMQ jobs would never be consumed.
+  await new Promise<void>(() => {});
+}
+
 function getServerBetaPort(): number {
  const parsed = Number.parseInt(process.env.CLAUDE_MEM_SERVER_PORT ?? '', 10);
-  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_SERVER_BETA_PORT;
+  if (Number.isInteger(parsed) && parsed > 0) {
+    return parsed;
+  }
+  // UID-derived default for multi-account isolation: two users on the same
+  // host usually get distinct ports without explicit configuration (UIDs that
+  // are congruent modulo 100 still collide). Containerized deployments always
+  // pass CLAUDE_MEM_SERVER_PORT so this branch is local-only.
+  return DEFAULT_SERVER_BETA_PORT + ((process.getuid?.() ?? 77) % 100);
 }

 function spawnServerBetaDaemon(port: number): number | undefined {
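
Reviewer sketch, not part of the patch: the fallback order in getServerBetaPort can be illustrated in isolation. This assumes a base port of 37000 and a helper named `derivePort`, both hypothetical; the real DEFAULT_SERVER_BETA_PORT is defined outside this hunk. It shows that an explicit CLAUDE_MEM_SERVER_PORT always wins, and that two UIDs congruent modulo 100 map to the same derived port.

```typescript
// Hypothetical base port for illustration only; the real
// DEFAULT_SERVER_BETA_PORT is defined elsewhere in the module.
const ASSUMED_BASE_PORT = 37000;

// Mirrors the fallback order in getServerBetaPort: explicit env value first,
// then base + (uid % 100), with 77 standing in when no UID is available.
function derivePort(envValue: string | undefined, uid: number | undefined): number {
  const parsed = Number.parseInt(envValue ?? '', 10);
  if (Number.isInteger(parsed) && parsed > 0) {
    return parsed;
  }
  return ASSUMED_BASE_PORT + ((uid ?? 77) % 100);
}
```

Under these assumptions, `derivePort(undefined, 501)` and `derivePort(undefined, 1001)` both resolve to the base port plus 1, so hosts with many accounts may still want an explicit port.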
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import {
+  PostgresServerSessionsRepository,
+  type PostgresServerSession,
+} from '../../storage/postgres/server-sessions.js';
+import type { PostgresAgentEvent } from '../../storage/postgres/agent-events.js';
+import type { JsonObject } from '../../storage/postgres/utils.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import type { PostgresQueryable } from '../../storage/postgres/utils.js';
+
+// ServerSessionRuntimeRepository is the runtime helper layer used by Server
+// beta routes and generation policies. It is intentionally thin: every method
+// requires explicit `team_id` + `project_id` and validates scope through the
+// underlying PostgresServerSessionsRepository (which calls
+// assertProjectOwnership before any write). It does NOT cache state — every
+// call hits Postgres so the runtime never trusts in-memory ActiveSession-style
+// objects, per the Phase 6 anti-pattern guard.
+
+export interface ServerSessionScope {
+  teamId: string;
+  projectId: string;
+}
+
+export interface GetActiveSessionInput extends ServerSessionScope {
+  externalSessionId: string;
+  contentSessionId?: string | null;
+  agentId?: string | null;
+  agentType?: string | null;
+  platformSource?: string | null;
+  metadata?: JsonObject;
+}
+
+export interface ServerSessionRuntimeRepositoryOptions {
+  client: PostgresQueryable;
+}
+
+export class ServerSessionRuntimeRepository {
+  private readonly repo: PostgresServerSessionsRepository;
+
+  constructor(private readonly options: ServerSessionRuntimeRepositoryOptions) {
+    this.repo = new PostgresServerSessionsRepository(options.client);
+  }
+
+  /**
+   * Find or create the canonical Server beta session row for an external
+   * session id. Idempotent on (project_id, external_session_id).
+   *
+   * Anti-pattern guard: this MUST NOT consult worker `ActiveSession` or any
+   * legacy SessionStore. server_sessions is the canonical model.
+   */
+  async getActiveSession(input: GetActiveSessionInput): Promise<PostgresServerSession> {
+    const existing = await this.repo.findByExternalIdForScope({
+      externalSessionId: input.externalSessionId,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    });
+    if (existing) {
+      return existing;
+    }
+    return this.repo.create({
+      projectId: input.projectId,
+      teamId: input.teamId,
+      externalSessionId: input.externalSessionId,
+      contentSessionId: input.contentSessionId ?? null,
+      agentId: input.agentId ?? null,
+      agentType: input.agentType ?? null,
+      platformSource: input.platformSource ?? null,
+      metadata: input.metadata ?? {},
+    });
+  }
+
+  async getById(input: { id: string } & ServerSessionScope): Promise<PostgresServerSession | null> {
+    return this.repo.getByIdForScope({
+      id: input.id,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    });
+  }
+
+  async findByExternalId(input: {
+    externalSessionId: string;
+  } & ServerSessionScope): Promise<PostgresServerSession | null> {
+    return this.repo.findByExternalIdForScope({
+      externalSessionId: input.externalSessionId,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    });
+  }
+
+  async listUnprocessedEvents(
+    input: { serverSessionId: string; limit?: number } & ServerSessionScope,
+  ): Promise<PostgresAgentEvent[]> {
+    const params: {
+      serverSessionId: string;
+      projectId: string;
+      teamId: string;
+      limit?: number;
+    } = {
+      serverSessionId: input.serverSessionId,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    };
+    if (input.limit !== undefined) {
+      params.limit = input.limit;
+    }
+    return this.repo.listUnprocessedEvents(params);
+  }
+
+  /**
+   * End the session if not already ended. Idempotent — re-ending a session
+   * returns the unchanged row and never creates a duplicate summary job
+   * because the (team_id, project_id, source_type='session_summary',
+   * source_id) UNIQUE constraint on observation_generation_jobs collapses
+   * duplicate enqueue attempts.
+   */
+  async endSession(
+    input: { id: string } & ServerSessionScope,
+  ): Promise<PostgresServerSession | null> {
+    return this.repo.endSession({
+      id: input.id,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    });
+  }
+
+  async markGenerationStarted(
+    input: { id: string } & ServerSessionScope,
+  ): Promise<PostgresServerSession | null> {
+    return this.repo.markGenerationStarted({
+      id: input.id,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    });
+  }
+
+  async markGenerationCompleted(
+    input: { id: string } & ServerSessionScope,
+  ): Promise<PostgresServerSession | null> {
+    return this.repo.markGenerationCompleted({
+      id: input.id,
+      projectId: input.projectId,
+      teamId: input.teamId,
+    });
+  }
+
+  async markGenerationFailed(
+    input: { id: string; error?: string | null } & ServerSessionScope,
+  ): Promise<PostgresServerSession | null> {
+    return this.repo.markGenerationFailed({
+      id: input.id,
+      projectId: input.projectId,
+      teamId: input.teamId,
+      error: input.error ?? null,
+    });
+  }
+}
+
+export function createServerSessionRuntimeRepository(
+  pool: PostgresPool,
+): ServerSessionRuntimeRepository {
+  return new ServerSessionRuntimeRepository({ client: pool });
+}
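
Reviewer sketch, not part of the patch: the find-or-create contract of getActiveSession, reduced to an in-memory model. `InMemorySessions` and `SessionRow` are hypothetical names, and the Map key stands in for whatever uniqueness the real server_sessions table enforces (not visible in this diff). The point is only that repeated calls with the same scope and external session id return the same row.

```typescript
interface SessionRow {
  id: string;
  teamId: string;
  projectId: string;
  externalSessionId: string;
}

interface SessionScope {
  teamId: string;
  projectId: string;
}

// In-memory stand-in for the server_sessions table. Every lookup is keyed by
// the full scope, so a caller can never see a row from another team/project.
class InMemorySessions {
  private readonly rows = new Map<string, SessionRow>();
  private nextId = 1;

  findOrCreate(scope: SessionScope, externalSessionId: string): SessionRow {
    const key = `${scope.teamId}:${scope.projectId}:${externalSessionId}`;
    const existing = this.rows.get(key);
    if (existing) {
      return existing; // idempotent: repeat calls return the same row
    }
    const created: SessionRow = {
      id: `session-${this.nextId++}`,
      teamId: scope.teamId,
      projectId: scope.projectId,
      externalSessionId,
    };
    this.rows.set(key, created);
    return created;
  }

  get size(): number {
    return this.rows.size;
  }
}
```

A single-threaded Map cannot race with itself. Against Postgres, two concurrent find-then-create calls can both miss, so the real repository would need a unique constraint or upsert to stay idempotent under concurrency.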
@@ -0,0 +1,206 @@
+// SPDX-License-Identifier: Apache-2.0
+
+import type { JobsOptions } from 'bullmq';
+import type {
+  GenerateObservationsForEventJob,
+  GenerateSessionSummaryJob,
+} from '../jobs/types.js';
+import { buildServerJobId } from '../jobs/job-id.js';
+import type { PostgresAgentEvent } from '../../storage/postgres/agent-events.js';
+import type { PostgresObservationGenerationJob } from '../../storage/postgres/generation-jobs.js';
+
+// SessionGenerationPolicy decides WHEN to enqueue work for the BullMQ event
+// and summary lanes. It is configurable via:
+//   - CLAUDE_MEM_SERVER_SESSION_POLICY env var (per-process default)
+//   - per-call override (per-team settings can plug in here later)
+//
+// Three policies are supported:
+//   - 'per-event'      (default): enqueue immediately on every event POST.
+//                       Matches Phase 4/5 behavior.
+//   - 'debounce':       enqueue with `delay`; when a new event arrives within
+//                       the window, replace the delayed job. BullMQ ignores
+//                       add() for a jobId that already exists, so
+//                       scheduleDebouncedEventJob removes the delayed entry
+//                       before re-adding it (removeOnComplete/Fail keep
+//                       things tidy). Outbox row is canonical so durability
+//                       is safe.
+//   - 'end-of-session': only enqueue summary jobs at /v1/sessions/:id/end.
+//                       Per-event posts skip BullMQ entirely; the outbox row
+//                       remains in `queued` state and startup reconciliation
+//                       will publish it later (or it can be cancelled).
+//
+// Anti-pattern guard: the policy MUST NOT use ActiveSession-style cached
+// state. Inputs are always reloaded by the caller from Postgres before this
+// fires.
+
+export type ServerSessionGenerationPolicy = 'per-event' | 'debounce' | 'end-of-session';
+
+const DEFAULT_DEBOUNCE_MS = 5000;
+
+export interface SessionGenerationPolicyOptions {
+  policy?: ServerSessionGenerationPolicy;
+  debounceWindowMs?: number;
+}
+
+export function resolveSessionGenerationPolicy(
+  options: SessionGenerationPolicyOptions = {},
+): { policy: ServerSessionGenerationPolicy; debounceWindowMs: number } {
+  const envPolicy = (process.env.CLAUDE_MEM_SERVER_SESSION_POLICY ?? '').trim().toLowerCase();
+  const policy: ServerSessionGenerationPolicy = options.policy
+    ?? (envPolicy === 'debounce' || envPolicy === 'end-of-session' || envPolicy === 'per-event'
+      ? envPolicy
+      : 'per-event');
+  const debounceWindowMs = options.debounceWindowMs
+    ?? (Number.parseInt(process.env.CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS ?? '', 10)
+      || DEFAULT_DEBOUNCE_MS);
+  return {
+    policy,
+    debounceWindowMs: Number.isFinite(debounceWindowMs) && debounceWindowMs > 0
+      ? debounceWindowMs
+      : DEFAULT_DEBOUNCE_MS,
+  };
+}
+
+export interface EnqueueEventDecisionInput {
+  event: PostgresAgentEvent;
+  outbox: PostgresObservationGenerationJob;
+  // Phase 11 — identity context captured at HTTP ingest time so the BullMQ
+  // payload carries every audit field. apiKeyId may be null for local-dev
+  // enqueues and `actorId` follows the api key's `actor_id` column.
+  apiKeyId?: string | null;
+  actorId?: string | null;
+  sourceAdapter?: string | null;
+  // Phase 12 — request correlation id minted at the HTTP boundary.
+  requestId?: string | null;
+}
+
+export interface EnqueueEventDecision {
+  shouldEnqueue: boolean;
+  jobId: string;
+  payload: GenerateObservationsForEventJob;
+  jobsOptions?: JobsOptions;
+  reason: 'per-event' | 'debounce' | 'end-of-session-skip';
+}
+
+export function buildEnqueueEventDecision(
+  input: EnqueueEventDecisionInput,
+  options: SessionGenerationPolicyOptions = {},
+): EnqueueEventDecision {
+  const resolved = resolveSessionGenerationPolicy(options);
+  const jobId = input.outbox.bullmqJobId ?? buildServerJobId({
+    kind: 'event',
+    team_id: input.event.teamId,
+    project_id: input.event.projectId,
+    source_type: 'agent_event',
+    source_id: input.event.id,
+  });
+  const payload: GenerateObservationsForEventJob = {
+    kind: 'event',
+    team_id: input.outbox.teamId,
+    project_id: input.outbox.projectId,
+    source_type: 'agent_event',
+    source_id: input.event.id,
+    generation_job_id: input.outbox.id,
+    agent_event_id: input.event.id,
+    api_key_id: input.apiKeyId ?? null,
+    actor_id: input.actorId ?? null,
+    source_adapter: input.sourceAdapter ?? input.event.sourceAdapter ?? 'api',
+    request_id: input.requestId ?? null,
+  };
+
+  if (resolved.policy === 'end-of-session') {
+    return { shouldEnqueue: false, jobId, payload, reason: 'end-of-session-skip' };
+  }
+
+  if (resolved.policy === 'debounce') {
+    return {
+      shouldEnqueue: true,
+      jobId,
+      payload,
+      jobsOptions: { delay: resolved.debounceWindowMs },
+      reason: 'debounce',
+    };
+  }
+
+  return { shouldEnqueue: true, jobId, payload, reason: 'per-event' };
+}
+
+// Minimal queue surface used by scheduleDebouncedEventJob. Declared as an
+// interface (instead of `Pick<ServerJobQueue<...>, ...>`) so the parameter
+// accepts ServerJobQueue<ServerGenerationJobPayload> at the call site without
+// triggering invariant TPayload type errors. The ServerJobQueue.add signature
+// is structurally compatible — it requires `payload: TPayload`, and we only
+// hand in narrowed payloads.
+export interface DebounceableEventQueue {
+  add(jobId: string, payload: GenerateObservationsForEventJob, options?: JobsOptions): Promise<void>;
+  remove(jobId: string): Promise<void>;
+  getJob(jobId: string): Promise<unknown>;
+}
+
+/**
+ * Apply a debounce decision to a BullMQ queue. If a delayed job already exists
+ * for this deterministic id, BullMQ's `add(jobId, ...)` will be a no-op, so we
+ * proactively remove it first so the new event's delay window starts fresh.
+ *
+ * This implements the "if a new event arrives within window, replace the
+ * delayed job" requirement.
+ */
+export async function scheduleDebouncedEventJob(
+  queue: DebounceableEventQueue,
+  decision: EnqueueEventDecision,
+): Promise<void> {
+  if (!decision.shouldEnqueue) return;
+  if (decision.reason === 'debounce') {
+    try {
+      const existing = await queue.getJob(decision.jobId);
+      if (existing) {
+        await queue.remove(decision.jobId);
+      }
+    } catch {
+      // best-effort; if remove fails because the job already moved to active
+      // we just let `add` no-op or fail through to the caller's error handler
+    }
+  }
+  await queue.add(decision.jobId, decision.payload, decision.jobsOptions);
+}
+
+export interface BuildSummaryJobInput {
+  serverSessionId: string;
+  teamId: string;
+  projectId: string;
+  generationJobId: string;
+  // Phase 11 — same identity context the event-payload builder receives.
+  apiKeyId?: string | null;
+  actorId?: string | null;
+  sourceAdapter?: string | null;
+  // Phase 12 — request correlation id flows into the summary lane too.
+  requestId?: string | null;
+}
+
+export function buildSummaryJobId(input: {
+  serverSessionId: string;
+  teamId: string;
+  projectId: string;
+}): string {
+  return buildServerJobId({
+    kind: 'summary',
+    team_id: input.teamId,
+    project_id: input.projectId,
+    source_type: 'session_summary',
+    source_id: input.serverSessionId,
+  });
+}
+
+export function buildSummaryJobPayload(input: BuildSummaryJobInput): GenerateSessionSummaryJob {
+  return {
+    kind: 'summary',
+    team_id: input.teamId,
+    project_id: input.projectId,
+    source_type: 'session_summary',
+    source_id: input.serverSessionId,
+    generation_job_id: input.generationJobId,
+    server_session_id: input.serverSessionId,
+    api_key_id: input.apiKeyId ?? null,
+    actor_id: input.actorId ?? null,
+    source_adapter: input.sourceAdapter ?? 'api',
+    request_id: input.requestId ?? null,
+  };
+}
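
Reviewer sketch, not part of the patch: why scheduleDebouncedEventJob removes before it adds. `FakeDelayedQueue` and `scheduleDebounced` are hypothetical, synchronous stand-ins (the real DebounceableEventQueue surface is Promise-based), and the explicit `now` parameter replaces the wall clock so the behavior is deterministic. The fake copies the one BullMQ property that matters here: add() with an existing job id is ignored.

```typescript
interface DelayedJob {
  payload: string;
  runAt: number; // timestamp (ms) at which the delayed job becomes runnable
}

// Synchronous in-memory stand-in for a delayed queue. Like BullMQ, add()
// ignores a job id that is already present.
class FakeDelayedQueue {
  private readonly jobs = new Map<string, DelayedJob>();

  add(jobId: string, payload: string, delayMs: number, now: number): void {
    if (this.jobs.has(jobId)) return; // duplicate id: ignored
    this.jobs.set(jobId, { payload, runAt: now + delayMs });
  }

  remove(jobId: string): void {
    this.jobs.delete(jobId);
  }

  getJob(jobId: string): DelayedJob | undefined {
    return this.jobs.get(jobId);
  }
}

// Remove-then-add so the delay window restarts from `now`.
function scheduleDebounced(
  queue: FakeDelayedQueue,
  jobId: string,
  payload: string,
  delayMs: number,
  now: number,
): void {
  if (queue.getJob(jobId)) {
    queue.remove(jobId);
  }
  queue.add(jobId, payload, delayMs, now);
}
```

In this model, a plain second add() leaves the original deadline untouched, while scheduleDebounced pushes it out to `now + delayMs`.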
@@ -1,10 +1,17 @@
 // SPDX-License-Identifier: Apache-2.0

+import { existsSync } from 'fs';
+import { logger } from '../../utils/logger.js';
 import { createPostgresStorageRepositories, getSharedPostgresPool, SERVER_BETA_POSTGRES_SCHEMA_VERSION } from '../../storage/postgres/index.js';
 import { bootstrapServerBetaPostgresSchema } from '../../storage/postgres/schema.js';
 import type { PostgresPool } from '../../storage/postgres/pool.js';
 import { getRedisQueueConfig } from '../queue/redis-config.js';
 import { ActiveServerBetaQueueManager } from './ActiveServerBetaQueueManager.js';
+import { ActiveServerBetaGenerationWorkerManager } from './ActiveServerBetaGenerationWorkerManager.js';
+import { ClaudeObservationProvider } from '../generation/providers/ClaudeObservationProvider.js';
+import { GeminiObservationProvider } from '../generation/providers/GeminiObservationProvider.js';
+import { OpenRouterObservationProvider } from '../generation/providers/OpenRouterObservationProvider.js';
+import type { ServerGenerationProvider } from '../generation/providers/shared/types.js';
 import { ServerBetaService } from './ServerBetaService.js';
 import {
  DisabledServerBetaEventBroadcaster,
@@ -13,6 +20,7 @@ import {
  DisabledServerBetaQueueManager,
  type ServerBetaAuthMode,
  type ServerBetaBootstrapStatus,
+  type ServerBetaGenerationWorkerManager,
  type ServerBetaQueueManager,
  type ServerBetaServiceGraph,
 } from './types.js';
@@ -22,13 +30,147 @@ export interface CreateServerBetaServiceOptions {
  authMode?: ServerBetaAuthMode;
  bootstrapSchema?: boolean;
  queueManager?: ServerBetaQueueManager;
+  // Phase 5 seam: tests can inject a fake provider without env config.
+  generationProvider?: ServerGenerationProvider;
+  generationWorkerManager?: ServerBetaGenerationWorkerManager;
+  // Phase 10: when true, skip building the generation worker. Used when the
+  // service is just an HTTP front-end and a separate `server worker` process
+  // consumes the BullMQ queues.
+  generationDisabled?: boolean;
+  // Phase 10: skip env validation (tests). Production code paths always run
+  // validation so misconfiguration fails fast at startup.
+  skipEnvValidation?: boolean;
+}
+
+// Phase 10 — env validation. Server beta in Docker requires explicit, complete
+// configuration. Missing pieces fail fast at startup rather than silently
+// degrading. Required env when running in Docker:
+//   - CLAUDE_MEM_SERVER_DATABASE_URL  (Postgres)
+//   - CLAUDE_MEM_QUEUE_ENGINE=bullmq  (no in-memory queue in Docker)
+//   - CLAUDE_MEM_REDIS_URL            (BullMQ requires Redis/Valkey)
+//   - CLAUDE_MEM_AUTH_MODE != local-dev (auth must be real in Docker)
+// `local-dev` bypass is only valid on a developer's loopback; in Docker the
+// container is reachable via service-to-service networking and exposed ports,
+// so the loopback assumption is invalid.
+export interface ServerBetaEnvValidationOptions {
+  env?: NodeJS.ProcessEnv;
+  isDocker?: boolean;
+}
+
+export interface ServerBetaEnvValidationResult {
+  isDocker: boolean;
+  runtime: string;
+  authMode: string;
+  queueEngine: string;
+  hasDatabaseUrl: boolean;
+  hasRedisUrl: boolean;
+}
+
+export function detectDockerEnvironment(env: NodeJS.ProcessEnv = process.env): boolean {
+  if (env.CLAUDE_MEM_DOCKER === '1' || env.CLAUDE_MEM_DOCKER === 'true') return true;
+  // /.dockerenv is the canonical Docker marker; existsSync is cheap.
+  try {
+    if (existsSync('/.dockerenv')) return true;
+  } catch {
+    // ignore
+  }
+  return false;
+}
+
+export function validateServerBetaEnv(
+  options: ServerBetaEnvValidationOptions = {},
+): ServerBetaEnvValidationResult {
+  const env = options.env ?? process.env;
+  const isDocker = options.isDocker ?? detectDockerEnvironment(env);
+  const errors: string[] = [];
+
+  const runtime = (env.CLAUDE_MEM_RUNTIME ?? '').trim();
+  if (!runtime) {
+    // Warn but allow — defaulted to 'worker' upstream; we log a warning so
+    // operators know server-beta is the active runtime here.
+    if (isDocker) {
+      logger.warn('SYSTEM', 'CLAUDE_MEM_RUNTIME unset; server-beta container assumes runtime=server-beta');
+    }
+  } else if (runtime !== 'server-beta' && isDocker) {
+    errors.push(
+      `CLAUDE_MEM_RUNTIME=${runtime} is invalid in Docker; the server-beta image only runs CLAUDE_MEM_RUNTIME=server-beta.`,
+    );
+  }
+
+  const authMode = (env.CLAUDE_MEM_AUTH_MODE ?? 'api-key').trim();
+  if (isDocker) {
+    if (authMode === 'local-dev') {
+      errors.push(
+        'CLAUDE_MEM_AUTH_MODE=local-dev is not allowed in Docker. Set CLAUDE_MEM_AUTH_MODE=api-key and create a key with `claude-mem server api-key create`.',
+      );
+    }
+    if (
+      env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS === '1'
+      || env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS === 'true'
+    ) {
+      errors.push(
+        'CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS is not allowed in Docker. Loopback bypass cannot be enforced inside a container; remove the variable.',
+      );
+    }
+  }
+
+  const queueEngine = (env.CLAUDE_MEM_QUEUE_ENGINE ?? '').trim().toLowerCase();
+  if (isDocker) {
+    if (!queueEngine) {
+      errors.push('CLAUDE_MEM_QUEUE_ENGINE is required in Docker; set it to "bullmq".');
+    } else if (queueEngine !== 'bullmq') {
+      errors.push(
+        `CLAUDE_MEM_QUEUE_ENGINE=${queueEngine} is not allowed in Docker. Only "bullmq" is supported (no in-process queues across container boundaries).`,
+      );
+    }
+  }
+
+  const hasDatabaseUrl = Boolean((env.CLAUDE_MEM_SERVER_DATABASE_URL ?? '').trim());
+  if (!hasDatabaseUrl) {
+    errors.push('CLAUDE_MEM_SERVER_DATABASE_URL is required to start server-beta (Postgres connection string).');
+  }
+
+  const hasRedisUrl = Boolean((env.CLAUDE_MEM_REDIS_URL ?? '').trim());
+  if (queueEngine === 'bullmq' && !hasRedisUrl) {
+    errors.push('CLAUDE_MEM_REDIS_URL is required when CLAUDE_MEM_QUEUE_ENGINE=bullmq.');
+  }
+
+  if (errors.length > 0) {
+    const message = [
+      'server-beta startup configuration is invalid:',
+      ...errors.map(line => `  - ${line}`),
+    ].join('\n');
+    throw new Error(message);
+  }
+
+  return {
+    isDocker,
+    runtime: runtime || 'server-beta',
+    authMode,
+    queueEngine: queueEngine || 'disabled',
+    hasDatabaseUrl,
+    hasRedisUrl,
+  };
 }

 export async function createServerBetaService(
  options: CreateServerBetaServiceOptions = {},
 ): Promise<ServerBetaService> {
+  if (!options.skipEnvValidation) {
+    validateServerBetaEnv();
+  }
  const pool = options.pool ?? getSharedPostgresPool({ requireDatabaseUrl: true });
  const bootstrap = await initializePostgres(pool, options.bootstrapSchema ?? true);
+  const queueManager = options.queueManager ?? buildQueueManager();
+  const generationDisabled = options.generationDisabled
+    ?? (process.env.CLAUDE_MEM_GENERATION_DISABLED === '1'
+      || process.env.CLAUDE_MEM_GENERATION_DISABLED === 'true');
+  const generationWorkerManager = options.generationWorkerManager
+    ?? (generationDisabled
+      ? new DisabledServerBetaGenerationWorkerManager(
+          'CLAUDE_MEM_GENERATION_DISABLED is set; this server runs HTTP only. A separate `claude-mem server worker start` process consumes the BullMQ queues.',
+        )
+      : buildGenerationWorkerManager(pool, queueManager, options.generationProvider));
  const graph: ServerBetaServiceGraph = {
    runtime: 'server-beta',
    postgres: {
@@ -36,16 +178,74 @@ export async function createServerBetaService(
      bootstrap,
    },
    authMode: options.authMode ?? parseAuthMode(process.env.CLAUDE_MEM_AUTH_MODE),
-    queueManager: options.queueManager ?? buildQueueManager(),
-    generationWorkerManager: new DisabledServerBetaGenerationWorkerManager('Phase 2 boundary only; generation workers are not wired.'),
-    providerRegistry: new DisabledServerBetaProviderRegistry('Phase 2 boundary only; provider-backed generation is not wired.'),
+    queueManager,
+    generationWorkerManager,
+    providerRegistry: new DisabledServerBetaProviderRegistry('Phase 5 keeps the provider registry boundary as inert; per-call providers are owned by the generation worker manager.'),
    eventBroadcaster: new DisabledServerBetaEventBroadcaster('Phase 2 boundary only; SSE/event broadcasting is not wired.'),
    storage: createPostgresStorageRepositories(pool),
  };

+  if (generationWorkerManager instanceof ActiveServerBetaGenerationWorkerManager) {
+    generationWorkerManager.start();
+  }
+
  return new ServerBetaService({ graph });
 }

+function buildGenerationWorkerManager(
+  pool: PostgresPool,
+  queueManager: ServerBetaQueueManager,
+  injectedProvider?: ServerGenerationProvider,
+): ServerBetaGenerationWorkerManager {
+  if (!(queueManager instanceof ActiveServerBetaQueueManager)) {
+    return new DisabledServerBetaGenerationWorkerManager(
+      'queue manager is disabled; set CLAUDE_MEM_QUEUE_ENGINE=bullmq to enable provider generation.',
+    );
+  }
+  const provider = injectedProvider ?? buildServerGenerationProviderFromEnv();
+  if (!provider) {
+    return new DisabledServerBetaGenerationWorkerManager(
+      'no server generation provider configured; set CLAUDE_MEM_SERVER_PROVIDER and the matching API key to enable.',
+    );
+  }
+  return new ActiveServerBetaGenerationWorkerManager({
+    pool,
+    queueManager,
+    provider,
+  });
+}
+
+function buildServerGenerationProviderFromEnv(): ServerGenerationProvider | null {
+  const provider = (process.env.CLAUDE_MEM_SERVER_PROVIDER ?? '').trim().toLowerCase();
+  if (!provider) return null;
+  try {
+    if (provider === 'claude' || provider === 'anthropic') {
+      // `||` (not `??`) so an empty ANTHROPIC_API_KEY falls through to the prefixed variable.
+      const apiKey = process.env.ANTHROPIC_API_KEY || process.env.CLAUDE_MEM_ANTHROPIC_API_KEY || '';
+      if (!apiKey) return null;
+      const opts: { apiKey: string; model?: string } = { apiKey };
+      if (process.env.CLAUDE_MEM_SERVER_MODEL) opts.model = process.env.CLAUDE_MEM_SERVER_MODEL;
+      return new ClaudeObservationProvider(opts);
+    }
+    if (provider === 'gemini') {
+      // `||` (not `??`) so an empty GEMINI_API_KEY falls through to the prefixed variable.
+      const apiKey = process.env.GEMINI_API_KEY || process.env.CLAUDE_MEM_GEMINI_API_KEY || '';
+      if (!apiKey) return null;
+      const opts: { apiKey: string; model?: string } = { apiKey };
+      if (process.env.CLAUDE_MEM_SERVER_MODEL) opts.model = process.env.CLAUDE_MEM_SERVER_MODEL;
+      return new GeminiObservationProvider(opts);
+    }
+    if (provider === 'openrouter') {
+      // `||` (not `??`) so an empty OPENROUTER_API_KEY falls through to the prefixed variable.
+      const apiKey = process.env.OPENROUTER_API_KEY || process.env.CLAUDE_MEM_OPENROUTER_API_KEY || '';
+      if (!apiKey) return null;
+      const opts: { apiKey: string; model?: string } = { apiKey };
+      if (process.env.CLAUDE_MEM_SERVER_MODEL) opts.model = process.env.CLAUDE_MEM_SERVER_MODEL;
+      return new OpenRouterObservationProvider(opts);
+    }
+  } catch {
+    return null;
+  }
+  return null;
+}
+
 // Queue manager selection is fail-fast on misconfiguration. If the user
 // explicitly opts into BullMQ via CLAUDE_MEM_QUEUE_ENGINE=bullmq we build
 // the active manager; any error there throws so the runtime does not
@@ -20,6 +20,24 @@ export interface ServerBetaBoundaryHealth {
  details?: Record<string, unknown>;
 }

+// Phase 12 — per-lane queue metric snapshot. Returned by
+// ActiveServerBetaQueueManager.getLaneMetrics so /api/health and /v1/info
+// can publish current waiting/active/completed/failed/delayed/stalled counts
+// for each generation lane. `unavailable` is set when Redis was unreachable
+// at sample time so /api/health still responds rather than 500'ing.
+export interface ServerBetaQueueLaneMetric {
+  kind: string;
+  name: string;
+  waiting: number;
+  active: number;
+  completed: number;
+  failed: number;
+  delayed: number;
+  stalled: number;
+  unavailable: boolean;
+  unavailableReason?: string;
+}
+
 export interface ServerBetaQueueManager {
  readonly kind: 'queue-manager';
  getHealth(): ServerBetaBoundaryHealth;
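
Reviewer sketch, not part of the patch: one way the `unavailable` flag on ServerBetaQueueLaneMetric could be populated so that a health endpoint keeps responding when Redis is down. `sampleLane`, `LaneCounts`, and the zero-filled fallback are assumptions for illustration. The actual getLaneMetrics implementation is not shown in this diff, and the real counts would come from an async queue call rather than a synchronous callback.

```typescript
interface LaneCounts {
  waiting: number;
  active: number;
  completed: number;
  failed: number;
  delayed: number;
  stalled: number;
}

interface LaneMetric extends LaneCounts {
  kind: string;
  name: string;
  unavailable: boolean;
  unavailableReason?: string;
}

// Assumption: counts are reported as zero when the lane cannot be sampled.
const ZERO_COUNTS: LaneCounts = {
  waiting: 0,
  active: 0,
  completed: 0,
  failed: 0,
  delayed: 0,
  stalled: 0,
};

// Sample one lane; a failing reader yields an `unavailable` snapshot instead
// of propagating the error to the HTTP handler.
function sampleLane(kind: string, name: string, readCounts: () => LaneCounts): LaneMetric {
  try {
    return { kind, name, ...readCounts(), unavailable: false };
  } catch (error) {
    return {
      kind,
      name,
      ...ZERO_COUNTS,
      unavailable: true,
      unavailableReason: error instanceof Error ? error.message : String(error),
    };
  }
}
```

Consumers should check `unavailable` before trusting the counts, since a zero in an unavailable snapshot means "unknown" rather than "empty".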
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: Apache-2.0
+
+// Shared session-end + summary-job path used by both `/v1/sessions/:id/end`
+// (canonical) and `src/server/compat/SessionsSummarizeAdapter.ts` (legacy
+// translator). Both call sites must produce identical Postgres state and
+// queue effects: ended_at idempotency, exactly one outbox row per session
+// summary, deterministic BullMQ job id.
+//
+// This module MUST NOT import from src/services/worker/* — Phase 9 keeps
+// the compat shim coupled to Server beta core only.
+
+import {
+  PostgresObservationGenerationJobEventsRepository,
+  PostgresObservationGenerationJobRepository,
+  type PostgresObservationGenerationJob,
+} from '../../storage/postgres/generation-jobs.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import { withPostgresTransaction } from '../../storage/postgres/pool.js';
+import {
+  PostgresServerSessionsRepository,
+  type PostgresServerSession,
+} from '../../storage/postgres/server-sessions.js';
+import { logger } from '../../utils/logger.js';
+import { buildSummaryJobId, buildSummaryJobPayload } from '../runtime/SessionGenerationPolicy.js';
+import type { GenerateSessionSummaryJob } from '../jobs/types.js';
+import type { EnqueueOutcome, EventQueueLike } from './IngestEventsService.js';
+import { newId } from '../../storage/postgres/utils.js';
+
+const SUMMARY_JOB_TYPE = 'observation_generate_session_summary';
+
+export interface EndSessionServiceOptions {
+  pool: PostgresPool;
+  resolveSummaryQueue: () => EventQueueLike | null;
+}
+
+export interface EndSessionResult {
+  session: PostgresServerSession | null;
+  outbox: PostgresObservationGenerationJob | null;
+  enqueueState: EnqueueOutcome;
+}
+
+export interface EndSessionInput {
+  sessionId: string;
+  projectId: string;
+  teamId: string;
+  source?: string;
+  // Phase 11 — identity context propagated into the BullMQ summary payload.
+  apiKeyId?: string | null;
+  actorId?: string | null;
+  sourceAdapter?: string | null;
+}
+
+export class EndSessionService {
+  constructor(private readonly options: EndSessionServiceOptions) {}
+
+  async end(input: EndSessionInput): Promise<EndSessionResult> {
+    const source = input.source ?? 'http_post_v1_sessions_end';
+
+    const txResult = await withPostgresTransaction(this.options.pool, async (client) => {
+      const sessionsRepo = new PostgresServerSessionsRepository(client);
+      const ended = await sessionsRepo.endSession({
+        id: input.sessionId,
+        projectId: input.projectId,
+        teamId: input.teamId,
+      });
+      if (!ended) {
+        return {
+          session: null as PostgresServerSession | null,
+          outbox: null as PostgresObservationGenerationJob | null,
+        };
+      }
+      const jobsRepo = new PostgresObservationGenerationJobRepository(client);
+      const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
+      // Persist the BullMQ payload at create-time so reconciliation and
+      // operator retry can re-enqueue from it and still pass the worker's
+      // assertServerGenerationJobPayload validation.
+      const outboxId = newId();
+      const summaryPayload = buildSummaryJobPayload({
+        serverSessionId: ended.id,
+        teamId: ended.teamId,
+        projectId: ended.projectId,
+        generationJobId: outboxId,
+        apiKeyId: input.apiKeyId ?? null,
+        actorId: input.actorId ?? null,
+        sourceAdapter: input.sourceAdapter ?? null,
+      });
+      const outbox = await jobsRepo.create({
+        id: outboxId,
+        projectId: ended.projectId,
+        teamId: ended.teamId,
+        sourceType: 'session_summary',
+        sourceId: ended.id,
+        serverSessionId: ended.id,
+        jobType: SUMMARY_JOB_TYPE,
+        bullmqJobId: buildSummaryJobId({
+          serverSessionId: ended.id,
+          teamId: ended.teamId,
+          projectId: ended.projectId,
+        }),
+        payload: summaryPayload as unknown as Record<string, unknown>,
+      });
+      await eventsLogRepo.append({
+        generationJobId: outbox.id,
+        projectId: outbox.projectId,
+        teamId: outbox.teamId,
+        eventType: 'queued',
+        statusAfter: outbox.status,
+        attempt: outbox.attempts,
+        details: { source },
+      });
+      return { session: ended, outbox };
+    });
+
+    if (!txResult.session || !txResult.outbox) {
+      return { session: txResult.session, outbox: null, enqueueState: 'skipped' };
+    }
+    const enqueueState = await this.publishSummaryJob(txResult.session.id, txResult.outbox, input);
+    return { session: txResult.session, outbox: txResult.outbox, enqueueState };
+  }
+
+  private async publishSummaryJob(
+    serverSessionId: string,
+    outbox: PostgresObservationGenerationJob,
+    input: EndSessionInput,
+  ): Promise<'enqueued' | 'queued_only'> {
+    const queue = this.options.resolveSummaryQueue();
+    if (!queue) {
+      return 'queued_only'; // no queue resolved; the outbox row is the durable record
+    }
+    const jobId = outbox.bullmqJobId ?? buildSummaryJobId({
+      serverSessionId,
+      teamId: outbox.teamId,
+      projectId: outbox.projectId,
+    });
+    const payload: GenerateSessionSummaryJob = buildSummaryJobPayload({
+      serverSessionId,
+      teamId: outbox.teamId,
+      projectId: outbox.projectId,
+      generationJobId: outbox.id,
+      apiKeyId: input.apiKeyId ?? null,
+      actorId: input.actorId ?? null,
+      sourceAdapter: input.sourceAdapter ?? null,
+    });
+    try {
+      await queue.add(jobId, payload);
+      return 'enqueued';
+    } catch (error) {
+      logger.warn('SYSTEM', 'failed to publish summary generation job to BullMQ', {
+        outboxId: outbox.id,
+        error: error instanceof Error ? error.message : String(error),
+      });
+      return 'queued_only';
+    }
+  }
+}
@@ -0,0 +1,273 @@
+// SPDX-License-Identifier: Apache-2.0
+
+// Shared event-ingest path used by both `/v1/events` (canonical) and
+// `src/server/compat/SessionsObservationsAdapter.ts` (legacy translator).
+// Centralizes the transactional write (event row + outbox row + lifecycle
+// log) and the post-commit BullMQ enqueue so both call sites apply the
+// exact same SessionGenerationPolicy and outbox-then-publish guarantees.
+//
+// This module MUST NOT import from src/services/worker/* — the whole point
+// of Phase 9 is to give the compat adapters a translation surface that
+// reaches Server beta core directly, with no worker-layer detours.
+
+import type { CreatePostgresAgentEventInput, PostgresAgentEvent } from '../../storage/postgres/agent-events.js';
+import { PostgresAgentEventsRepository } from '../../storage/postgres/agent-events.js';
+import {
+  PostgresObservationGenerationJobEventsRepository,
+  PostgresObservationGenerationJobRepository,
+  type PostgresObservationGenerationJob,
+} from '../../storage/postgres/generation-jobs.js';
+import type { PostgresPool } from '../../storage/postgres/pool.js';
+import { withPostgresTransaction } from '../../storage/postgres/pool.js';
+import { logger } from '../../utils/logger.js';
+import { buildServerJobId } from '../jobs/job-id.js';
+import type { GenerateObservationsForEventJob } from '../jobs/types.js';
+import {
+  buildEnqueueEventDecision,
+  scheduleDebouncedEventJob,
+  type ServerSessionGenerationPolicy,
+} from '../runtime/SessionGenerationPolicy.js';
+import { newId } from '../../storage/postgres/utils.js';
+
+function buildEventBullmqPayload(input: {
+  outboxId: string;
+  event: PostgresAgentEvent;
+  apiKeyId: string | null;
+  actorId: string | null;
+  sourceAdapter: string | null;
+  requestId: string | null;
+}): GenerateObservationsForEventJob {
+  return {
+    kind: 'event',
+    team_id: input.event.teamId,
+    project_id: input.event.projectId,
+    source_type: 'agent_event',
+    source_id: input.event.id,
+    generation_job_id: input.outboxId,
+    agent_event_id: input.event.id,
+    api_key_id: input.apiKeyId,
+    actor_id: input.actorId,
+    source_adapter: input.sourceAdapter ?? input.event.sourceAdapter ?? 'api',
+    request_id: input.requestId,
+  };
+}
+
+const EVENT_JOB_TYPE = 'observation_generate_for_event';
+
+export type EnqueueOutcome = 'enqueued' | 'queued_only' | 'skipped';
+
+export interface IngestEventsServiceOptions {
+  pool: PostgresPool;
+  // Lazy queue resolver so the service does not depend on the queue manager
+  // type and tests can swap in a fake. When this returns null, the outbox
+  // row stays `queued` and Phase 3 startup reconciliation will publish it.
+  resolveEventQueue: () => EventQueueLike | null;
+  sessionPolicy?: ServerSessionGenerationPolicy;
+  sessionDebounceWindowMs?: number;
+}
+
+export interface EventQueueLike {
+  add(jobId: string, payload: unknown, options?: unknown): Promise<unknown>;
+}
+
+export interface IngestEventResult {
+  event: PostgresAgentEvent;
+  outbox: PostgresObservationGenerationJob | null;
+  enqueueState: EnqueueOutcome;
+}
+
+export interface IngestEventOptions {
+  generate?: boolean;
+  source?: string;
+  // Phase 11 — identity context that flows from the HTTP auth boundary into
+  // the BullMQ payload and audit log. None of these are auth gates: the
+  // worker reloads and re-validates from Postgres before any side effect.
+  apiKeyId?: string | null;
+  actorId?: string | null;
+  sourceAdapter?: string | null;
+  // Phase 12 — opaque correlation id minted at the HTTP middleware so
+  // generator logs and audit rows can pivot back to the originating request.
+  requestId?: string | null;
+}
+
+export class IngestEventsService {
+  constructor(private readonly options: IngestEventsServiceOptions) {}
+
+  async ingestOne(
+    input: CreatePostgresAgentEventInput,
+    opts: IngestEventOptions = {},
+  ): Promise<IngestEventResult> {
+    const generate = opts.generate ?? true;
+    const source = opts.source ?? 'http_post_v1_events';
+
+    const txResult = await withPostgresTransaction(this.options.pool, async (client) => {
+      const eventsRepo = new PostgresAgentEventsRepository(client);
+      const inserted = await eventsRepo.create(input);
+
+      if (!generate) {
+        return { event: inserted, outbox: null as PostgresObservationGenerationJob | null };
+      }
+
+      const jobsRepo = new PostgresObservationGenerationJobRepository(client);
+      const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
+      // Pre-generate the outbox id so we can build the BullMQ payload (which
+      // references generation_job_id) and persist it on the row. Reconciliation
+      // and operator retry re-enqueue from this persisted payload, so it must
+      // pass assertServerGenerationJobPayload at the worker.
+      const outboxId = newId();
+      const bullmqPayload = buildEventBullmqPayload({
+        outboxId,
+        event: inserted,
+        apiKeyId: opts.apiKeyId ?? null,
+        actorId: opts.actorId ?? null,
+        sourceAdapter: opts.sourceAdapter ?? null,
+        requestId: opts.requestId ?? null,
+      });
+      const outbox = await jobsRepo.create({
+        id: outboxId,
+        projectId: inserted.projectId,
+        teamId: inserted.teamId,
+        sourceType: 'agent_event',
+        sourceId: inserted.id,
+        agentEventId: inserted.id,
+        serverSessionId: inserted.serverSessionId,
+        jobType: EVENT_JOB_TYPE,
+        bullmqJobId: buildServerJobId({
+          kind: 'event',
+          team_id: inserted.teamId,
+          project_id: inserted.projectId,
+          source_type: 'agent_event',
+          source_id: inserted.id,
+        }),
+        payload: bullmqPayload as unknown as Record<string, unknown>,
+      });
+      await eventsLogRepo.append({
+        generationJobId: outbox.id,
+        projectId: outbox.projectId,
+        teamId: outbox.teamId,
+        eventType: 'queued',
+        statusAfter: outbox.status,
+        attempt: outbox.attempts,
+        details: { source },
+      });
+      return { event: inserted, outbox };
+    });
+
+    let enqueueState: EnqueueOutcome = 'skipped';
+    if (txResult.outbox) {
+      enqueueState = await this.publishEventJob(txResult.event, txResult.outbox, opts);
+    }
+    return { event: txResult.event, outbox: txResult.outbox, enqueueState };
+  }
+
+  async ingestBatch(
+    inputs: CreatePostgresAgentEventInput[],
+    opts: IngestEventOptions = {},
+  ): Promise<IngestEventResult[]> {
+    const generate = opts.generate ?? true;
+    const source = opts.source ?? 'http_post_v1_events_batch';
+
+    const txResults = await withPostgresTransaction(this.options.pool, async (client) => {
+      const eventsRepo = new PostgresAgentEventsRepository(client);
+      const jobsRepo = new PostgresObservationGenerationJobRepository(client);
+      const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
+      const acc: { event: PostgresAgentEvent; outbox: PostgresObservationGenerationJob | null }[] = [];
+      for (const input of inputs) {
+        const event = await eventsRepo.create(input);
+        if (!generate) {
+          acc.push({ event, outbox: null });
+          continue;
+        }
+        const outboxId = newId();
+        const bullmqPayload = buildEventBullmqPayload({
+          outboxId,
+          event,
+          apiKeyId: opts.apiKeyId ?? null,
+          actorId: opts.actorId ?? null,
+          sourceAdapter: opts.sourceAdapter ?? null,
+          requestId: opts.requestId ?? null,
+        });
+        const outbox = await jobsRepo.create({
+          id: outboxId,
+          projectId: event.projectId,
+          teamId: event.teamId,
+          sourceType: 'agent_event',
+          sourceId: event.id,
+          agentEventId: event.id,
+          serverSessionId: event.serverSessionId,
+          jobType: EVENT_JOB_TYPE,
+          bullmqJobId: buildServerJobId({
+            kind: 'event',
+            team_id: event.teamId,
+            project_id: event.projectId,
+            source_type: 'agent_event',
+            source_id: event.id,
+          }),
+          payload: bullmqPayload as unknown as Record<string, unknown>,
+        });
+        await eventsLogRepo.append({
+          generationJobId: outbox.id,
+          projectId: outbox.projectId,
+          teamId: outbox.teamId,
+          eventType: 'queued',
+          statusAfter: outbox.status,
+          attempt: outbox.attempts,
+          details: { source },
+        });
+        acc.push({ event, outbox });
+      }
+      return acc;
+    });
+
+    return Promise.all(txResults.map(async ({ event, outbox }) => {
+      const enqueueState: EnqueueOutcome = outbox
+        ? await this.publishEventJob(event, outbox, opts)
+        : 'skipped';
+      return { event, outbox, enqueueState };
+    }));
+  }
+
+  private async publishEventJob(
+    event: PostgresAgentEvent,
+    outbox: PostgresObservationGenerationJob,
+    opts: IngestEventOptions = {},
+  ): Promise<'enqueued' | 'queued_only'> {
+    const queue = this.options.resolveEventQueue();
+    if (!queue) {
+      return 'queued_only'; // row stays `queued`; startup reconciliation publishes it
+    }
+    const policyOptions: { policy?: ServerSessionGenerationPolicy; debounceWindowMs?: number } = {};
+    if (this.options.sessionPolicy !== undefined) {
+      policyOptions.policy = this.options.sessionPolicy;
+    }
+    if (this.options.sessionDebounceWindowMs !== undefined) {
+      policyOptions.debounceWindowMs = this.options.sessionDebounceWindowMs;
+    }
+    const decision = buildEnqueueEventDecision(
+      {
+        event,
+        outbox,
+        apiKeyId: opts.apiKeyId ?? null,
+        actorId: opts.actorId ?? null,
+        sourceAdapter: opts.sourceAdapter ?? event.sourceAdapter ?? null,
+        // Phase 12 — flow request_id into the BullMQ payload so the worker
+        // can emit it in [generation] logs and the audit row.
+        requestId: opts.requestId ?? null,
+      },
+      policyOptions,
+    );
+    if (!decision.shouldEnqueue) {
+      return 'queued_only'; // session policy declined to publish; outbox row is kept
+    }
+    try {
+      await scheduleDebouncedEventJob(queue as never, decision);
+      return 'enqueued';
+    } catch (error) {
+      logger.warn('SYSTEM', 'failed to publish event generation job to BullMQ', {
+        outboxId: outbox.id,
+        error: error instanceof Error ? error.message : String(error),
+      });
+      return 'queued_only';
+    }
+  }
+}