server-beta: Phases 4–13 — event pipeline, generation, MCP, compat, Docker, team audit, observability (#2383)

* feat(server-beta): Phase 4 — Postgres event-to-generation-job pipeline

Adds POST /v1/events, /v1/events/batch, GET /v1/jobs/:id, GET /v1/events/:id,
and POST /v1/memories on the server-beta runtime, backed by Postgres.

- Event row + outbox generation-job row insert in one withPostgresTransaction.
- BullMQ enqueue happens after commit; enqueue failure leaves the row queued
  for Phase 3 startup reconciliation.
- ?generate=false skips the outbox; ?wait=true returns queue status only,
  never observation IDs (provider generation is Phase 5).
- Batch pre-validates all event projectIds against api-key scope before any
  write; mixed-project batches reject 403 with zero side effects.
- /v1/memories is a direct insert alias — no generator, no outbox.
- Cross-tenant /v1/jobs/:id returns 404 to avoid leaking row existence.
- New PostgresAuthMiddleware reads api_keys by SHA-256 hash; populates
  req.authContext.teamId/projectId; legacy ServerV1Routes (SQLite, used by
  worker runtime) is left untouched.
- Tests: unit suite hardened with stubbed pool.query so route registration
  is safe; integration tests skip cleanly without CLAUDE_MEM_TEST_POSTGRES_URL.
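
The batch pre-validation rule above can be sketched as a pure check that runs before any write. This is a minimal illustration, not the route's actual code: `KeyScope`, `IncomingEvent`, and `checkBatchScope` are hypothetical names, and treating a `null` projectId as a team-wide key is an assumption.

```typescript
// Hypothetical shapes for illustration; the real route types are not shown here.
interface KeyScope {
  teamId: string;
  projectId: string | null; // null = team-wide key (assumption)
}

interface IncomingEvent {
  projectId: string;
  eventType: string;
}

type BatchScopeCheck =
  | { ok: true }
  | { ok: false; status: 403; offendingIndex: number };

// Check every event before any write so a rejected batch leaves zero side effects.
function checkBatchScope(scope: KeyScope, events: IncomingEvent[]): BatchScopeCheck {
  const offendingIndex = events.findIndex(
    (event) => scope.projectId !== null && event.projectId !== scope.projectId,
  );
  return offendingIndex === -1
    ? { ok: true }
    : { ok: false, status: 403, offendingIndex };
}
```

Because the check finishes before the transaction opens, a mixed-project batch can be rejected without touching the event or outbox tables.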

Verification: 87 pass / 1 skip / 0 fail. No new typecheck errors. Required
greps for WorkerService and MemoryItemsRepository in src/server/routes/v1
and src/server/runtime return no hits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 5 — provider observation generator

Adds independent provider generation under src/server/generation/ with no
worker coupling. Server beta can now generate observations end-to-end:
event -> outbox -> BullMQ -> provider -> parser -> persisted observation.

- ProviderObservationGenerator orchestrates: lock outbox (queued -> processing),
  reload agent_event from Postgres (BullMQ payload is advisory only), call
  provider, hand raw text to processGeneratedResponse, route errors via
  markGenerationFailed with retryable flag from ServerClassifiedProviderError.
- processGeneratedResponse parses with parseAgentXml, persists via
  PostgresObservationRepository with deterministic
  generation_key = generation:v1:{job_id}:{index}:{fingerprint},
  links via PostgresObservationSourcesRepository, advances outbox status,
  appends observation_generation_job_events, audits — all in one
  withPostgresTransaction. Idempotent on retry via UNIQUE constraints.
- Three provider adapters under src/server/generation/providers/:
  Claude, Gemini, OpenRouter. Self-contained — no imports from
  src/services/worker/*. Worker providers unchanged.
- Shared error classification + prompt builder under providers/shared/.
  Prompt builder strips <private> at the edge; fully-private batches
  emit <skip_summary /> without billing the provider.
- ActiveServerBetaGenerationWorkerManager wires BullMQ Worker via
  ServerJobQueue.start(...) with concurrency 1 + autorun:false +
  worker.on('error') per BullMQ docs.
- New GET /v1/events/:id/observations on ServerV1PostgresRoutes returns
  observations linked via observation_sources, team/project scoped.
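
A minimal sketch of the deterministic generation_key format described above. The key layout comes from this message; how the fingerprint is derived is an assumption (here, a truncated SHA-256 of the observation content), and `buildGenerationKey` is a hypothetical name.

```typescript
import { createHash } from 'node:crypto';

// Assumption: the fingerprint is a truncated SHA-256 of the observation content.
// The real derivation is not described in this commit message.
function buildGenerationKey(jobId: string, index: number, content: string): string {
  const fingerprint = createHash('sha256').update(content).digest('hex').slice(0, 16);
  return `generation:v1:${jobId}:${index}:${fingerprint}`;
}
```

Replaying the same job with the same parsed output yields the same key, so a UNIQUE constraint on the key column can collapse the retry into a no-op.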

Verification: 104 pass / 4 skip / 0 fail. No typecheck regressions.
Anti-pattern greps clean for services/worker imports under src/server,
WorkerRef/ActiveSession/SessionStore in src/server/generation.

Deferred: ModeManager loading uses a stable fallback observation type
list; summary and reindex queue lanes are not yet wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 6 — independent server session semantics

server_sessions is now the canonical Server beta session model. Sessions
are independent of legacy worker ActiveSession state.

- PostgresServerSessionRepository extended: findByExternalIdForScope,
  endSession (idempotent via COALESCE(ended_at, now())),
  markGenerationStarted/Completed/Failed, listUnprocessedEvents (filters out
  agent_events that already have a completed agent_event job).
- ServerSessionRuntimeRepository wraps the repo; every method requires
  explicit team_id + project_id and validates scope via assertProjectOwnership.
- SessionGenerationPolicy supports per-event (default), debounce
  (BullMQ delayed-job replace via getJob+remove+add), and end-of-session.
  Configured via CLAUDE_MEM_SERVER_SESSION_POLICY and
  CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS env vars; per-team override hooks
  are exposed on ServerV1PostgresRoutesOptions for future settings layer.
- POST /v1/sessions/start (find-or-create on (project_id, external_session_id)),
  GET /v1/sessions/:id (scoped 404), POST /v1/sessions/:id/end
  (transactional: end + create summary outbox via UNIQUE collapse +
  enqueue post-commit). Re-ending is fully idempotent.
- processSessionSummaryResponse persists summary as kind='summary'
  observation with the same idempotency model
  (generation_key + observation_sources UNIQUE).
- ProviderObservationGenerator dispatches on source_type:
  agent_event -> processGeneratedResponse, session_summary ->
  processSessionSummaryResponse; loadEvents handles session-summary
  by loading unprocessed events.
- ActiveServerBetaGenerationWorkerManager wires summary BullMQ lane
  alongside event lane (concurrency=1, autorun=false, error listener
  attached per BullMQ docs).
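
The idempotent endSession behavior (COALESCE(ended_at, now())) can be modeled in memory as below. This is a sketch only: `SessionRowSketch` and `endSessionSketch` are hypothetical names, and the real repository runs the equivalent logic in SQL.

```typescript
// Hypothetical in-memory stand-in for a server_sessions row.
interface SessionRowSketch {
  id: string;
  endedAt: Date | null;
}

// Mirrors COALESCE(ended_at, now()): the first end wins, later calls keep it.
function endSessionSketch(row: SessionRowSketch, now: Date): SessionRowSketch {
  return { ...row, endedAt: row.endedAt ?? now };
}
```

Ending an already-ended session returns the original timestamp, which is what makes re-ending safe for retried hooks.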

Verification: 110 pass / 6 skip / 0 fail. Net typecheck error count
unchanged at 24 (pre-existing, none in Phase 6 files). Anti-pattern
greps clean for ActiveSession/SessionStore in src/server/runtime,
no worker imports anywhere in src/server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 7 — hook routing without worker dependency

Hooks can now talk directly to server-beta when CLAUDE_MEM_RUNTIME=server-beta
is selected, with a clean worker fallback when server-beta is unhealthy.

- src/services/hooks/server-beta-client.ts — typed HTTP client for
  /v1/sessions/start, /v1/events, /v1/sessions/:id/end. Throws
  ServerBetaClientError with kind classification (missing_api_key,
  transport, timeout, http_error, invalid_response) and isFallbackEligible
  helper. Zero imports from services/worker/.
- src/services/hooks/runtime-selector.ts — reads CLAUDE_MEM_RUNTIME from
  settings, returns worker or server-beta context, logs
  [server-beta-fallback] reason=<code> on every config-time fallback.
- src/services/hooks/server-beta-bootstrap.ts — Postgres-backed API key
  bootstrap. Find-or-creates local-hook-team + local-hook-project,
  generates cmem_<random> key (SHA-256 hashed), inserts into api_keys
  with scopes events:write/sessions:write/observations:read/jobs:read.
  Settings file written with chmod 0600. rotateServerBetaApiKey() wired
  to a new `claude-mem server keys rotate` command.
- src/cli/handlers/{observation,session-init,summarize}.ts — every hook
  handler tries server-beta first when configured, falls through to the
  existing worker path on transport/5xx/429/missing-key. One WARN line
  per fallback. Hook JSON output shape unchanged.
- src/shared/SettingsDefaultsManager.ts — three new keys with defaults:
  CLAUDE_MEM_SERVER_BETA_URL, CLAUDE_MEM_SERVER_BETA_API_KEY,
  CLAUDE_MEM_SERVER_BETA_PROJECT_ID.
- src/npx-cli/commands/install.ts — when installer selects server-beta
  runtime and CLAUDE_MEM_SERVER_DATABASE_URL is set, bootstraps a local
  API key automatically. Warns and continues if the DB URL is missing.
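
A sketch of the fallback decision described above, assuming a plain failure object rather than the real ServerBetaClientError class (whose constructor is not shown here). The kind names and the transport/5xx/429/missing-key rule come from this message; treating `timeout` as eligible and `invalid_response` as ineligible are assumptions.

```typescript
type FailureKind =
  | 'missing_api_key'
  | 'transport'
  | 'timeout'
  | 'http_error'
  | 'invalid_response';

// Hypothetical failure shape for illustration only.
interface ClientFailure {
  kind: FailureKind;
  status?: number; // HTTP status, present for http_error
}

function isFallbackEligibleSketch(failure: ClientFailure): boolean {
  switch (failure.kind) {
    case 'missing_api_key':
    case 'transport':
      return true;
    case 'timeout':
      return true; // assumption: handled like a transport failure
    case 'http_error':
      // Only 429 and 5xx fall through to the worker; other 4xx are caller errors.
      return failure.status === 429 || (failure.status !== undefined && failure.status >= 500);
    case 'invalid_response':
      return false; // assumption: not listed among the fallback triggers
  }
}
```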

plugin/scripts/*.cjs bundles rebuilt via npm run build to pick up the
new hook handler code path. No plaintext keys in the bundle (verified).

Verification: 16 hook unit tests pass; 275 server/storage/services tests
pass with 7 pre-existing failures (verified independent of this change
via git stash --include-untracked). Build clean. No new typecheck
errors in Phase 7 files.

Anti-pattern guards verified:
- /api/sessions/observations only reached via explicit fallback path
- server-beta runtime never starts the worker process
- API keys live only in ~/.claude-mem/settings.json (chmod 0600), never
  in the bundle (grep confirmed)
- Worker fallback preserved, observable via single WARN line per call

Deferred: semantic context injection (UserPromptSubmit hook) stays
worker-only; server-beta does not yet expose /v1/context/semantic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 8 — MCP backed by server-beta core

MCP tools now route through server-beta in server-beta mode while keeping
worker-mode search/timeline/get_observations tools fully working.

- src/servers/mcp-server.ts — five new observation_* tools registered:
  observation_add, observation_record_event, observation_search,
  observation_context, observation_generation_status. Three memory_*
  compatibility aliases delegate to the canonical handlers. Worker
  auto-start is gated when selectRuntime() === 'server-beta' so MCP
  in server-beta mode never spawns the worker.
- src/services/hooks/server-beta-client.ts — addObservation,
  searchObservations, contextObservations, getJobStatus added so MCP
  shares one transport with hooks (Phase 7).
- src/server/routes/v1/ServerV1PostgresRoutes.ts — POST /v1/search and
  POST /v1/context REST cores backed by PostgresObservationRepository
  full-text search (GIN tsvector from Phase 1).
- Existing memory_search/timeline/get_observations tools call
  callWorkerAPI unchanged in worker mode; worker tests unaffected.

Verification: 39 pass / 4 skip / 0 fail on targeted suite. Pre-existing
7 baseline failures verified independent (git stash). No new typecheck
errors. WorkerService grep clean across src/servers/mcp-server.ts and
src/server/.

Anti-pattern guards verified:
- No duplicate generation logic in MCP — observation_record_event hits
  /v1/events which owns event+outbox+enqueue inside one tx
- WorkerService not imported anywhere under MCP server-beta path
- No hardcoded worker URLs — all transport via Phase 7 ServerBetaClient
- memory_* aliases retained, single handler per pair

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 9 — compatibility adapters without coupling

Legacy /api/sessions/observations and /api/sessions/summarize endpoints
keep working on server-beta runtime by translating to AgentEvent and
session-end calls — no worker code, no route duplication.

- src/server/services/IngestEventsService.ts — shared event-ingest path
  used by both /v1/events and the compat adapter. Owns transactional
  event row + outbox row + lifecycle log + post-commit BullMQ enqueue,
  honors Phase 6 SessionGenerationPolicy.
- src/server/services/EndSessionService.ts — shared session-end path
  used by both /v1/sessions/:id/end and the compat adapter. Idempotent
  ended_at + summary outbox + deterministic summary job id.
- src/server/compat/SessionsObservationsAdapter.ts — translates legacy
  POST /api/sessions/observations payload (Claude Code transcript shape)
  -> AgentEvent (source_adapter='claude-code-compat',
  event_type='tool_use') -> IngestEventsService.ingestOne. Resolves
  contentSessionId to server_sessions via find-or-create.
- src/server/compat/SessionsSummarizeAdapter.ts — translates legacy
  POST /api/sessions/summarize -> EndSessionService.end. Preserves the
  legacy agentId -> {status:'skipped', reason:'subagent_context'}
  behavior so existing clients see the same response shape.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — refactored to
  delegate to the new shared services (-203 LoC net) so /v1 and
  /api compat both call the SAME canonical code path.
- src/server/runtime/ServerBetaService.ts — registers both compat
  adapters alongside ServerV1PostgresRoutes, sharing service instances.
- docs/server-beta-parity-map.md — full enumeration of legacy /api/*
  routes labeled native, adapter, or unsupported (with reasons).
  Viewer read-path adapters explicitly listed as unsupported pending
  a future viewer-rewrite phase.
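
The observations adapter's translation step can be sketched as a pure mapping. The sourceAdapter and eventType values come from this message; `AgentEventSketch`, `LegacyObservation`, and `toAgentEvent` are hypothetical names, and the real CreatePostgresAgentEventInput shape may differ.

```typescript
// Hypothetical shapes for illustration only.
interface LegacyObservation {
  contentSessionId: string;
  tool_name: string;
  tool_input?: unknown;
  tool_response?: unknown;
  tool_use_id?: string;
}

interface AgentEventSketch {
  sourceAdapter: 'claude-code-compat';
  eventType: 'tool_use';
  externalSessionId: string;
  payload: Record<string, unknown>;
}

// Thin translation: no generation logic lives in the adapter.
function toAgentEvent(legacy: LegacyObservation): AgentEventSketch {
  return {
    sourceAdapter: 'claude-code-compat',
    eventType: 'tool_use',
    externalSessionId: legacy.contentSessionId,
    // Legacy tool fields are carried over unchanged.
    payload: {
      tool_name: legacy.tool_name,
      tool_input: legacy.tool_input,
      tool_response: legacy.tool_response,
      tool_use_id: legacy.tool_use_id,
    },
  };
}
```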

Verification: 7 compat tests pass, 6 v1-routes tests still pass
(refactor preserved behavior), 4 session-routes tests pass. Pre-
existing 16 baseline failures verified independent via git stash.
Zero new typecheck errors.

Anti-pattern guards verified:
- No services/worker/http/routes or WorkerService imports under
  src/server/compat or src/server/runtime
- Compat adapters are thin translators with names ending in *Adapter
  and a top-of-file comment noting they are legacy compatibility
- /v1/* remains the canonical Server beta API; compat adapters
  call shared services rather than acting as a parallel API

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 10 — Docker stack and deployable runtime

Server beta now ships as a Docker stack with no worker process anywhere
and a separate horizontal generation worker for scaling.

- src/server/runtime/create-server-beta-service.ts — validateServerBetaEnv()
  fails fast on missing CLAUDE_MEM_SERVER_DATABASE_URL, requires
  CLAUDE_MEM_QUEUE_ENGINE=bullmq in Docker, rejects
  CLAUDE_MEM_AUTH_MODE=local-dev and CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS
  inside containers (detected via /.dockerenv or CLAUDE_MEM_DOCKER=1).
  Adds CLAUDE_MEM_GENERATION_DISABLED so the HTTP service can run
  generator-free.
- src/server/runtime/ServerBetaService.ts — runServerBetaGenerationWorker
  for the dedicated consumer process; runServerBetaApiKeyCli is a new
  Postgres-backed `server api-key` command (the legacy worker CLI wrote
  to SQLite and was invisible to the Postgres runtime); getQueueHealth
  shim feeds /api/health a consistent ObservationQueueHealth shape.
- src/npx-cli/commands/{runtime,server}.ts — `claude-mem server worker
  start` subcommand that boots only the BullMQ consumer.
- docker/claude-mem/{Dockerfile,entrypoint.sh} — entrypoint forces
  CLAUDE_MEM_DOCKER=1 + CLAUDE_MEM_RUNTIME=server-beta and exposes
  three modes: server (HTTP only, generation disabled), worker (BullMQ
  consumer), shell. Worker bundle is no longer the default CMD.
- docker-compose.yml — full stack: postgres + valkey + claude-mem-server
  (HTTP-only) + claude-mem-worker (generation consumer). Wires
  service-to-service env vars.
- scripts/e2e-server-beta-docker.sh + docker/e2e/server-beta-e2e.mjs —
  E2E now hits /v1/sessions/start, /v1/events?wait=true, /v1/jobs/:id;
  asserts no worker-service.cjs process anywhere in the stack;
  one-shot docker compose run --rm verifies local-dev auth is
  rejected with the expected stderr; restart-and-verify confirms
  Postgres durability and BullMQ retry idempotency.
- docs/server.md — full Phase 10 doc: stack diagram, env table,
  worker mode, auth-in-Docker policy.
- docs/api.md — event generation semantics (wait=true, generationJob).
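
The startup checks listed for validateServerBetaEnv() can be sketched as follows. This is an illustration under assumptions: the real function presumably throws rather than returning a list, the accepted truthy values for the bypass flag are guessed, and `validateEnvSketch` is a hypothetical name.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: '1' and 'true' are the values that enable a flag.
function isTruthyFlag(value: string | undefined): boolean {
  return value === '1' || value === 'true';
}

// Returns every violated rule so the caller can fail fast with a full report.
function validateEnvSketch(env: Env, dockerenvFileExists: boolean): string[] {
  const errors: string[] = [];
  const inDocker = dockerenvFileExists || env['CLAUDE_MEM_DOCKER'] === '1';

  if (!env['CLAUDE_MEM_SERVER_DATABASE_URL']) {
    errors.push('CLAUDE_MEM_SERVER_DATABASE_URL is required');
  }
  if (inDocker) {
    if (env['CLAUDE_MEM_QUEUE_ENGINE'] !== 'bullmq') {
      errors.push('CLAUDE_MEM_QUEUE_ENGINE must be bullmq in Docker');
    }
    if (env['CLAUDE_MEM_AUTH_MODE'] === 'local-dev') {
      errors.push('CLAUDE_MEM_AUTH_MODE=local-dev is not allowed in Docker');
    }
    if (isTruthyFlag(env['CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS'])) {
      errors.push('CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS is not allowed in Docker');
    }
  }
  return errors;
}
```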

Verification: full Docker E2E PASSED on live daemon
(phase1 + phase2 + restart-and-verify + revoked-key + no-worker-
process + local-dev-rejected). Unit tests 292 pass / 9 skip / 7 fail
(7 fails pre-existing baseline). Zero new typecheck errors.

Anti-pattern guards verified:
- entrypoint never execs worker-service.cjs; E2E greps prove no
  worker process anywhere in the stack
- validateServerBetaEnv refuses local-dev auth in Docker with explicit
  remediation message; ALLOW_LOCAL_DEV_BYPASS rejected the same way
- Docker requires CLAUDE_MEM_QUEUE_ENGINE=bullmq; in-process queue
  rejected at startup
- claude-mem worker / worker-service / WorkerService greps clean
  in docker/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 11 — team-aware generation with audit chain

Generation jobs now carry team_id/project_id/api_key_id/actor_id/
source_adapter from enqueue through execution; the outbox is reloaded
from Postgres before any side effect so BullMQ payload can never act
as auth authority.

- src/server/jobs/types.ts — ServerGenerationJobPayloadSchema (Zod
  discriminated union) requires team_id, project_id, generation_job_id,
  source_adapter, api_key_id, actor_id (nullable), source_type, source_id,
  plus event_id / server_session_id per kind. assertServerGenerationJobPayload
  is called at enqueue (outbox.ts) and again at execution boundary.
- src/server/services/{IngestEventsService,EndSessionService}.ts +
  SessionGenerationPolicy.ts — thread identity context (apiKeyId, actorId,
  sourceAdapter) into both event and summary BullMQ payloads.
- src/server/generation/ProviderObservationGenerator.ts —
  loadCanonicalOutbox loads the outbox row WITHOUT scope filter, then
  compares candidate.team_id/project_id to payload.team_id/project_id;
  mismatch -> ServerGenerationScopeViolationError (non-retryable),
  failed status, generation_job.scope_violation audit. isApiKeyRevoked
  checks api_keys (revoked_at, expires_at, row missing) before any
  provider call; revoked -> generation_job.revoked_key audit + non-
  retryable failure. generation_job.processing audit emitted on lock.
- src/server/generation/processGeneratedResponse.ts — generated
  observations carry team_id/project_id/server_session_id from the
  reloaded source row (not job payload). observation_sources.metadata
  records source_adapter, actor_id, api_key_id for traceability.
  observation.created audit per observation; generation_job.completed
  audit per terminal transition. All audit rows reference the same
  generation_job_id in details.
- src/server/routes/v1/ServerV1PostgresRoutes.ts — GET /v1/teams/:id/jobs
  and GET /v1/projects/:id/jobs with SQL-layer scoping (WHERE team_id=$1
  [AND project_id=$2] [AND status=$3]); cross-tenant returns 404 to
  avoid leaking row existence. Pagination via status/limit/offset.
  audit_log rows for event.received, event.batch_received, observation.read.
- src/server/compat/{SessionsObservationsAdapter,SessionsSummarizeAdapter}.ts —
  propagate apiKeyId and sourceAdapter='claude-code-compat'.
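
The reload-and-compare rule from loadCanonicalOutbox can be sketched as a pure check. The audit action name and the non-retryable outcome come from this message; `ScopeRef` and `checkPayloadScope` are hypothetical names, and the real code raises ServerGenerationScopeViolationError instead of returning a result.

```typescript
// Hypothetical minimal scope shape shared by the outbox row and the BullMQ payload.
interface ScopeRef {
  team_id: string;
  project_id: string;
}

type ScopeCheck =
  | { ok: true }
  | { ok: false; retryable: false; auditAction: 'generation_job.scope_violation' };

// The Postgres row is the authority; the queue payload is only compared against it.
function checkPayloadScope(outboxRow: ScopeRef, payload: ScopeRef): ScopeCheck {
  const matches =
    outboxRow.team_id === payload.team_id && outboxRow.project_id === payload.project_id;
  return matches
    ? { ok: true }
    : { ok: false, retryable: false, auditAction: 'generation_job.scope_violation' };
}
```

Loading the row without a scope filter is what lets the mismatch be detected and audited; filtering by the payload's scope would only produce an unexplained "not found".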

Verification: 162 pass / 10 skip / 0 fail. Pre-existing failures in
tests/services/queue and tests/services/worker confirmed independent
via git stash. Zero new typecheck errors in server-beta files.
Required greps:
  rg "team_id.*req\.body|project_id.*req\.body" src/server -> 0 matches
Audit chain integration test passes — generation_job.processing,
observation.created, and generation_job.completed audit rows all
share the same generation_job_id reference.

Anti-pattern guards verified:
- BullMQ payload never acts as auth authority — Postgres outbox
  reload with mismatch check happens before every side effect
- team_id / project_id never derived from request body for scope
  decisions; always req.authContext.teamId / projectId
- Application-layer team/project filtering forbidden — listJobsForScope
  pushes scope into the SQL WHERE clause
- Project-scoped key on cross-project /v1/teams/:id/jobs returns 404
- Revoked api keys cause non-retryable failure with audit before
  any provider call

Deferred: a redundant generation_job.queued audit_log row (already
covered by observation_generation_job_events lifecycle log per Phase 1
schema split). Compat adapters set actor_id=null but propagate
api_key_id which is the canonical reference downstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): Phase 12 — observability and operations

Operators can now inspect, retry, and cancel generation jobs from the
CLI; queue lane metrics flow into /api/health and /v1/info; every
request gets a stable request_id that flows through HTTP -> audit ->
outbox -> generator -> completion log.

- src/server/middleware/request-id.ts — honors safe inbound X-Request-Id,
  mints uuid v4 otherwise. Set on req.requestId and echoed via response
  header so external traces can correlate.
- src/server/jobs/ServerJobQueue.ts — QueueEvents wired with completed,
  failed, progress, stalled, error listeners; lifecycle counters
  exposed via observe() API. Logs emitted as
  [generation] job=<id> source_type=<...> duration=<ms> attempts=<N>
  reason=<message>. Stalled and error counters survive worker restart.
- src/server/jobs/types.ts — ServerGenerationJob payload schema
  extended with optional request_id; flows through from HTTP into
  every BullMQ job.
- src/server/queue/ObservationQueueEngine.ts — health snapshot now
  carries per-lane (event, summary) counts via
  ObservationQueueHealthLaneSnapshot.
- src/server/runtime/{ActiveServerBetaQueueManager,
  ActiveServerBetaGenerationWorkerManager,ServerBetaService}.ts —
  per-lane getJobCounts feed /api/health and /v1/info; stalled events
  audit through audit_log with action generation_job.stalled.
- src/server/routes/v1/ServerV1PostgresRoutes.ts —
  GET /v1/jobs (status/source_type/since/limit/offset, scope from
  api-key, payload stripped unless ?include=payload AND admin scope),
  POST /v1/jobs/:id/retry (idempotent; queued -> no-op; audit
  generation_job.retried_by_operator), POST /v1/jobs/:id/cancel
  (terminal -> no-op; audit generation_job.cancelled_by_operator;
  generator reload-before-side-effects already prevents double work).
- src/server/services/IngestEventsService.ts +
  SessionGenerationPolicy.ts + ProviderObservationGenerator.ts —
  request_id propagated end to end. Generator extracts request_id
  from BullMQ payload and includes it in lock/processing/completion
  logs and audit details.
- src/npx-cli/commands/server-jobs.ts +
  src/npx-cli/commands/server.ts — `claude-mem server jobs
  status|failed|retry|cancel`. status compares Postgres outbox counts
  to BullMQ queue counts and surfaces divergence. failed prints
  attempts + last_error message. --team and --project filters.
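
A minimal sketch of the request-id rule (honor a safe inbound X-Request-Id, otherwise mint a UUID v4). What counts as "safe" is not spelled out in this message, so the pattern below is an assumption, and `resolveRequestId` is a hypothetical name.

```typescript
import { randomUUID } from 'node:crypto';

// Assumption: "safe" means a short token of URL-friendly characters.
const SAFE_REQUEST_ID = /^[A-Za-z0-9._-]{1,128}$/;

function resolveRequestId(inbound: string | undefined): string {
  // randomUUID() returns an RFC 4122 version 4 UUID.
  return inbound !== undefined && SAFE_REQUEST_ID.test(inbound) ? inbound : randomUUID();
}
```

Rejecting unsafe inbound values keeps attacker-controlled strings out of log lines and audit details while still letting well-behaved callers correlate their own traces.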

Verification: 350 pass / 12 skip / 7 fail (pre-existing baseline,
verified independent via git stash). 18 new tests added (request-id
middleware, server-jobs CLI seams, jobs list/retry/cancel routes
Postgres-gated). Zero new typecheck errors.

Anti-pattern guards verified:
- agent_events.payload only emitted in /v1/jobs response inside the
  admin-gated branch (?include=payload + admin scope) — returns 403
  otherwise
- jobs retry on a queued row is a no-op (no double BullMQ enqueue,
  no double UPDATE)
- Every operator action writes to audit_log with the
  *_by_operator action and request_id correlation in details
- Stalled events audit through generation_job.stalled

Sample correlated trace (one request_id end to end):
  HTTP middleware: req.requestId = 'req-abc'
  audit event.received: details.requestId = 'req-abc'
  BullMQ payload: { request_id: 'req-abc', generation_job_id: 'gj_x' }
  generator lock log: [generation] job locked { jobId, requestId }
  audit generation_job.processing: details.requestId = 'req-abc'
  completion log: [generation] job=evt_... duration=1230ms

Deferred: live /api/health round-trip integration test (needs
Redis); stalled event live integration test (needs Redis); storing
request_id on the observations row itself (spec did not require).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(server-beta): add Phase 13 release readiness report

Captures the final verification gate: tests (1749 pass, 45 fail, all
pre-existing baseline failures; zero regressions), required greps clean,
Docker E2E green end-to-end, all 7 exit criteria met, build clean,
typecheck unchanged from main. Documents deferred items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(server-beta): rebuild server-beta-service bundle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address Greptile review on PR #2383

- ProviderObservationGenerator.lockOutbox: skip duplicate worker run when
  another lock is active instead of returning the row, which previously let
  two BullMQ workers issue the (paid, rate-limited) external provider call
  before the persistence-layer terminal-status guard collapsed the duplicate.
  Reconciliation still recovers from a stale lock on startup or next retry.
- docker-compose.yml: require POSTGRES_USER/PASSWORD/DB env vars (no
  defaults). Stack refuses to start without explicit secrets. Added a header
  warning that the file must not be deployed unmodified.
- e2e-server-beta-docker.sh: export ephemeral test creds for the new
  required env vars so the Docker E2E driver still runs unattended.
- ServerBetaService api-key list: bound query with LIMIT/OFFSET (default 100,
  max 500) and add optional --team filter to prevent unintentional
  cross-tenant key metadata disclosure on shared admin hosts.
- SessionGenerationPolicy: fix dead `??` fallback for NaN parseInt result;
  use `||` so DEFAULT_DEBOUNCE_MS actually applies.
- ServerV1PostgresRoutes: `?wait=true` now actually waits — polls the outbox
  row until terminal status (timeout 30s, 100ms interval) on both
  /v1/events and /v1/events/batch. Returns `waitTimedOut: true` if the cap
  is hit so callers can re-poll the status endpoints.
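
The `??` versus `||` fix is easy to reproduce in isolation: parseInt returns NaN for unparsable input, NaN is not nullish, so `??` never reaches the default. The sketch below assumes a DEFAULT_DEBOUNCE_MS of 5000 (the real value is not stated here), and the function names are hypothetical.

```typescript
const DEFAULT_DEBOUNCE_MS = 5000; // hypothetical value for illustration

// Buggy: NaN is not null/undefined, so the fallback is dead code.
function debounceMsBuggy(raw: string | undefined): number {
  const parsed: number = Number.parseInt(raw ?? '', 10);
  return parsed ?? DEFAULT_DEBOUNCE_MS;
}

// Fixed: NaN is falsy, so `||` applies the default.
function debounceMsFixed(raw: string | undefined): number {
  const parsed: number = Number.parseInt(raw ?? '', 10);
  return parsed || DEFAULT_DEBOUNCE_MS;
}
```

One consequence of `||` is that an explicit `0` also falls back to the default, which is usually acceptable for a debounce window but worth knowing.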

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address CodeRabbit + Greptile second review on PR #2383

P1 fixes
- Operator retry endpoint was re-publishing the Postgres outbox metadata
  column as the BullMQ payload; the worker's
  assertServerGenerationJobPayload always rejected it, leaving the row
  stuck in queued until startup reconciliation. Persist the BullMQ payload
  on the outbox row at create-time inside IngestEventsService and
  EndSessionService, then re-enqueue that canonical payload on retry.

Major fixes
- prompt-builder: escape server_session_id when interpolating into the
  XML prompt; previously a session id containing `<`, `&`, or quotes
  could inject XML into the provider input.
- ServerJobQueue: route both worker.on('stalled') and the QueueEvents
  'stalled' subscriber through a single notifyStalled helper that
  dedupes by jobId for 30s, so counters.stalled increments once per
  stall. QueueEvents 'error' now routes through notifyQueueError so
  it increments counters.errored and runs onError listeners — keeping
  observability symmetric across both sources.
- ServerV1PostgresRoutes: convert PostgresObservationRepository from
  three dynamic imports to a single static import for consistency.
- mcp-server / ServerBetaClient: actually forward the
  observation_record_event tool's `generate` flag through to the
  /v1/events endpoint as `?generate=false` instead of voiding it.
- server-sessions.markGenerationFailed: guard jsonb_set against a null
  error payload so the failure path can't null out metadata before the
  generation_status='failed' write commits.
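
The prompt-builder fix amounts to escaping the five XML-reserved characters before interpolation. A minimal sketch follows; `escapeXml` and `sessionTag` are hypothetical names, and the `<session id="...">` element is illustrative, not the prompt's real markup.

```typescript
const XML_ENTITIES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&apos;',
};

// Single pass over the input, so already-produced entities are never re-escaped.
function escapeXml(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch] ?? ch);
}

// Hypothetical element name; only the escaping step reflects the fix described above.
function sessionTag(serverSessionId: string): string {
  return `<session id="${escapeXml(serverSessionId)}" />`;
}
```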

Minor fixes
- server-sessions.endSession: keep updated_at stable on repeated calls
  so the documented idempotency contract holds.
- SettingsDefaultsManager + ServerBetaService.getServerBetaPort: derive
  the server-beta default port from UID (37877 + uid%100), matching the
  worker port pattern, so two users on the same host don't collide.
  Docker stacks always pass CLAUDE_MEM_SERVER_PORT explicitly so the
  containerized deployment is unaffected.
- server-session-runtime test: close the pg.Pool in afterAll.
- server-beta-release-readiness.md: escape pipes inside table inline
  code, add `text` language tag to the fenced log block.
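
The UID-derived default port is plain arithmetic. A sketch, taking the UID as a parameter because how the real code obtains it (and what it does on platforms without UIDs) is not described here; `defaultServerBetaPort` is a hypothetical name.

```typescript
const SERVER_BETA_BASE_PORT = 37877;

// Spreads users across 100 ports: 37877 through 37976.
function defaultServerBetaPort(uid: number): number {
  return SERVER_BETA_BASE_PORT + (uid % 100);
}
```

Two users collide only when their UIDs are congruent modulo 100, for example 501 and 1001.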

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): address Greptile + CodeRabbit third review on PR #2383

P1 fixes
- SessionsObservationsAdapter.resolveServerSession: catch unique-violation
  (23505) on concurrent compat inserts and re-fetch instead of returning
  500. Two compat callers carrying the same contentSessionId can both
  observe `existing===null` and race on the (project_id,
  external_session_id) unique constraint; the second now resolves to the
  raced row instead of dropping the event.
- /v1/events/batch: pass `sourceAdapter: null` to ingestBatch so each
  event's BullMQ payload (and persisted outbox payload column) reflects
  its own event.sourceAdapter via buildEventBullmqPayload's fallback,
  rather than stamping the whole batch with the first event's adapter.

Minor
- server-session-runtime test afterEach: wrap DROP SCHEMA in try/finally
  so client.release() always runs even if the drop throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): drop `pool as never` cast — pg.Pool already matches PostgresPool

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): retry of completed job now 409s instead of duplicating

retryGenerationJob previously fell through to the reset+re-enqueue path
when called on a job in `completed` status. The observations index
dedupes on (generation_job_id, parsed_observation_index, content) but
LLM output is non-deterministic, so a second provider run almost always
produced a different content string and bypassed the index, persisting a
parallel set of observation rows attributed to the same generation job.

Match cancelGenerationJob's 409 guard for completed jobs. failed and
cancelled remain valid retry targets.
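
The retry rules can be summarized as a small decision function. This is a sketch built from the statuses named in this message; `processing` is left out because its retry handling is not stated, and `decideRetry` is a hypothetical name.

```typescript
// 'processing' is intentionally omitted: its retry behavior is not described here.
type RetryableJobStatus = 'queued' | 'completed' | 'failed' | 'cancelled';

type RetryDecision =
  | { action: 'noop' }
  | { action: 'reject'; httpStatus: 409 }
  | { action: 'reenqueue' };

function decideRetry(status: RetryableJobStatus): RetryDecision {
  switch (status) {
    case 'queued':
      return { action: 'noop' }; // already waiting; no second enqueue
    case 'completed':
      return { action: 'reject', httpStatus: 409 }; // a rerun could persist duplicates
    case 'failed':
    case 'cancelled':
      return { action: 'reenqueue' };
  }
}
```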

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(server-beta): rebuild bundles after rebase onto main

Regenerates the three plugin bundles so they reflect the rebased source
state. Mechanical rebuild output only — no source changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): wrap resolveServerSession in try/catch for structured error response

Greptile P1 on PR #2383: resolveServerSession was called before the try/catch
in both compat adapters, so Postgres errors during session lookup (timeout,
pool exhaustion, etc.) escaped to Express's default error handler and returned
HTML/text 500s. Legacy clients calling response.json() would get a parse
failure instead of the documented { stored: false, reason: 'internal_error' }
(or { status: 'error', reason: 'internal_error' } for the summarize adapter)
shape.

Move the resolveServerSession call inside the existing try block in both
adapters so any failure flows through the structured catch handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): catch 23505 unique violation in POST /v1/sessions/start

Greptile P1 on PR #2383: concurrent requests with the same externalSessionId
can both pass the findByExternalIdForScope check, both call repo.create,
and the loser hits the (project_id, external_session_id) unique constraint.
The handler treated that as an unknown error and returned a 500.

Apply the same pattern resolveServerSession already uses: catch error.code
'23505' when externalSessionId is set, refetch the row inserted by the
winning request, and return 200 with that session.
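
The find, create, catch 23505, refetch pattern can be sketched against a small repository interface. `SessionRepoSketch` and `findOrCreateSession` are hypothetical names; only the SQLSTATE code 23505 (unique_violation) and the overall flow come from this message.

```typescript
interface SessionSketch {
  id: string;
  projectId: string;
  externalSessionId: string;
}

// Hypothetical repository surface for illustration.
interface SessionRepoSketch {
  find(projectId: string, externalSessionId: string): Promise<SessionSketch | null>;
  create(projectId: string, externalSessionId: string): Promise<SessionSketch>;
}

const UNIQUE_VIOLATION = '23505'; // Postgres SQLSTATE for unique_violation

function isUniqueViolation(err: unknown): boolean {
  return typeof err === 'object' && err !== null
    && (err as { code?: unknown }).code === UNIQUE_VIOLATION;
}

async function findOrCreateSession(
  repo: SessionRepoSketch,
  projectId: string,
  externalSessionId: string,
): Promise<SessionSketch> {
  const existing = await repo.find(projectId, externalSessionId);
  if (existing) return existing;
  try {
    return await repo.create(projectId, externalSessionId);
  } catch (err) {
    // Lost the race: another request inserted the row between find and create.
    if (!isUniqueViolation(err)) throw err;
    const winner = await repo.find(projectId, externalSessionId);
    if (!winner) throw err; // nothing to recover; surface the original error
    return winner;
  }
}
```

The refetch runs outside the failed insert, so the losing request returns the winner's row instead of a 500.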

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Alex Newman
2026-05-11 00:26:11 -07:00
committed by GitHub
parent a10d1b342f
commit e7bbb2a9aa
72 changed files with 13901 additions and 982 deletions
@@ -0,0 +1,212 @@
// SPDX-License-Identifier: Apache-2.0
// Legacy compatibility — new clients should use POST /v1/events directly.
//
// Legacy worker payloads to `/api/sessions/observations` are translated into
// the Server beta event/job model and delegated to IngestEventsService. The
// adapter never touches worker code, never queues observations directly, and
// never uses `src/services/worker/*` types.
//
// Translation rules:
//   - `contentSessionId` (Claude Code session UUID) becomes the
//     `external_session_id` of a Server beta `server_sessions` row, scoped to
//     the API key's team and project. The session is found or created.
//   - The tool-use shape (tool_name, tool_input, tool_response, tool_use_id)
//     is mapped to an `agent_event` with sourceAdapter='claude-code-compat',
//     eventType='tool_use'; the payload preserves the legacy fields verbatim.
//   - The API key MUST be project-scoped. Cross-project compat calls return
//     400; we never let compat traffic bypass project scope.
import type { Application, Request, Response } from 'express';
import { z } from 'zod';
import type { RouteHandler } from '../../services/server/Server.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import { PostgresServerSessionsRepository } from '../../storage/postgres/server-sessions.js';
import { logger } from '../../utils/logger.js';
import { requirePostgresServerAuth } from '../middleware/postgres-auth.js';
import { IngestEventsService } from '../services/IngestEventsService.js';
import type { CreatePostgresAgentEventInput } from '../../storage/postgres/agent-events.js';
const COMPAT_SOURCE_ADAPTER = 'claude-code-compat';
const COMPAT_EVENT_TYPE = 'tool_use';
const observationsSchema = z.object({
contentSessionId: z.string().min(1),
tool_name: z.string().min(1),
tool_input: z.unknown().optional(),
tool_response: z.unknown().optional(),
cwd: z.string().optional(),
agentId: z.string().optional(),
agentType: z.string().optional(),
platformSource: z.string().optional(),
tool_use_id: z.string().optional(),
toolUseId: z.string().optional(),
}).passthrough();
export interface SessionsObservationsAdapterOptions {
pool: PostgresPool;
ingestEvents: IngestEventsService;
authMode?: string;
allowLocalDevBypass?: boolean;
}
export class SessionsObservationsAdapter implements RouteHandler {
constructor(private readonly options: SessionsObservationsAdapterOptions) {}
setupRoutes(app: Application): void {
const writeAuth = requirePostgresServerAuth(this.options.pool, {
authMode: this.options.authMode,
allowLocalDevBypass: this.options.allowLocalDevBypass,
requiredScopes: ['memories:write'],
});
app.post('/api/sessions/observations', writeAuth, this.asyncHandler(async (req, res) => {
const parsed = observationsSchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({ error: 'ValidationError', issues: parsed.error.issues });
return;
}
const teamId = req.authContext?.teamId ?? null;
const projectId = req.authContext?.projectId ?? null;
if (!teamId) {
res.status(403).json({ error: 'Forbidden', message: 'API key is not bound to a team' });
return;
}
if (!projectId) {
// Compat mode requires a project-scoped key — the legacy payload does
// not carry a Server beta projectId, so without scope we cannot place
// the row in a tenant-scoped table.
res.status(400).json({
error: 'BadRequest',
message: 'Legacy /api/sessions/observations requires a project-scoped API key',
});
return;
}
try {
const session = await resolveServerSession({
pool: this.options.pool,
teamId,
projectId,
contentSessionId: parsed.data.contentSessionId,
platformSource: typeof parsed.data.platformSource === 'string' ? parsed.data.platformSource : null,
agentId: typeof parsed.data.agentId === 'string' ? parsed.data.agentId : null,
agentType: typeof parsed.data.agentType === 'string' ? parsed.data.agentType : null,
});
const toolUseId = typeof parsed.data.tool_use_id === 'string'
? parsed.data.tool_use_id
: (typeof parsed.data.toolUseId === 'string' ? parsed.data.toolUseId : null);
const input: CreatePostgresAgentEventInput = {
projectId,
teamId,
serverSessionId: session.id,
sourceAdapter: COMPAT_SOURCE_ADAPTER,
sourceEventId: toolUseId,
eventType: COMPAT_EVENT_TYPE,
payload: {
contentSessionId: parsed.data.contentSessionId,
tool_name: parsed.data.tool_name,
tool_input: parsed.data.tool_input ?? null,
tool_response: parsed.data.tool_response ?? null,
cwd: parsed.data.cwd ?? null,
platformSource: parsed.data.platformSource ?? null,
agentId: parsed.data.agentId ?? null,
agentType: parsed.data.agentType ?? null,
toolUseId,
},
metadata: { compat: 'sessions/observations' },
occurredAt: new Date(),
};
const result = await this.options.ingestEvents.ingestOne(input, {
source: 'http_post_api_sessions_observations',
apiKeyId: req.authContext?.apiKeyId ?? null,
actorId: null,
sourceAdapter: COMPAT_SOURCE_ADAPTER,
});
// Legacy response shape — older clients only check `status`.
res.json({
status: 'queued',
observationCount: 1,
sessionId: session.id,
serverSessionId: session.id,
eventId: result.event.id,
generationJobId: result.outbox?.id ?? null,
transport: result.enqueueState,
});
} catch (error) {
logger.error('SYSTEM', 'compat observations adapter failed', {
error: error instanceof Error ? error.message : String(error),
contentSessionId: parsed.data.contentSessionId,
});
res.status(500).json({ stored: false, reason: 'internal_error' });
}
}));
}
private asyncHandler(fn: (req: Request, res: Response) => Promise<void> | void) {
return (req: Request, res: Response, next: (err?: unknown) => void): void => {
Promise.resolve(fn(req, res)).catch(next);
};
}
}
/**
* Look up an existing server_session by (project, team, externalSessionId)
* or create one if missing. Idempotent: re-issuing for the same content
* session returns the existing row.
*
* Concurrent compat callers can race here — both observe `existing===null`
* and both call `repo.create`, where the second will hit one of two unique
* constraints (`(project_id, idempotency_key)` covered by ON CONFLICT, or
* `(project_id, external_session_id)` which is NOT covered). Catch the
* unique-violation and re-fetch so the caller never sees a 500.
*/
export async function resolveServerSession(input: {
pool: PostgresPool;
teamId: string;
projectId: string;
contentSessionId: string;
platformSource: string | null;
agentId: string | null;
agentType: string | null;
}): Promise<{ id: string; projectId: string; teamId: string }> {
const repo = new PostgresServerSessionsRepository(input.pool);
const existing = await repo.findByExternalIdForScope({
externalSessionId: input.contentSessionId,
projectId: input.projectId,
teamId: input.teamId,
});
if (existing) {
return { id: existing.id, projectId: existing.projectId, teamId: existing.teamId };
}
try {
const created = await repo.create({
projectId: input.projectId,
teamId: input.teamId,
externalSessionId: input.contentSessionId,
contentSessionId: input.contentSessionId,
agentId: input.agentId,
agentType: input.agentType,
platformSource: input.platformSource,
});
return { id: created.id, projectId: created.projectId, teamId: created.teamId };
} catch (error) {
// Postgres unique_violation. A concurrent compat call inserted the row
// for this (project, external_session_id) before we could; re-fetch
// and return that row instead of bubbling a 500 to the legacy client.
if ((error as { code?: string } | null)?.code === '23505') {
const racedRow = await repo.findByExternalIdForScope({
externalSessionId: input.contentSessionId,
projectId: input.projectId,
teamId: input.teamId,
});
if (racedRow) {
return { id: racedRow.id, projectId: racedRow.projectId, teamId: racedRow.teamId };
}
}
throw error;
}
}
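The translation rules in the adapter's header comment can be sketched as a pure function. This is a simplified illustration, assuming hypothetical `LegacyObservation` and `CompatEventInput` types and a hypothetical `toCompatEvent` helper; the real adapter builds a `CreatePostgresAgentEventInput` with more fields (team, project, metadata, timestamps).

```typescript
// Hypothetical, trimmed-down types for illustration only.
interface LegacyObservation {
  contentSessionId: string;
  tool_name: string;
  tool_input?: unknown;
  tool_response?: unknown;
  tool_use_id?: string;
  toolUseId?: string;
}

interface CompatEventInput {
  serverSessionId: string;
  sourceAdapter: 'claude-code-compat';
  sourceEventId: string | null;
  eventType: 'tool_use';
  payload: Record<string, unknown>;
}

function toCompatEvent(
  legacy: LegacyObservation,
  serverSessionId: string,
): CompatEventInput {
  // snake_case takes precedence over camelCase, as in the adapter above.
  const toolUseId = legacy.tool_use_id ?? legacy.toolUseId ?? null;
  return {
    serverSessionId,
    sourceAdapter: 'claude-code-compat',
    sourceEventId: toolUseId,
    eventType: 'tool_use',
    // Legacy fields are carried over verbatim; absent ones become null.
    payload: {
      contentSessionId: legacy.contentSessionId,
      tool_name: legacy.tool_name,
      tool_input: legacy.tool_input ?? null,
      tool_response: legacy.tool_response ?? null,
      toolUseId,
    },
  };
}

const event = toCompatEvent(
  { contentSessionId: 'abc', tool_name: 'Read', toolUseId: 'tu_1' },
  'session-1',
);
console.log(event.sourceEventId, event.payload.tool_input);
```

Keeping the mapping free of I/O makes the field precedence easy to check without a database.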
@@ -0,0 +1,127 @@
// SPDX-License-Identifier: Apache-2.0
// Legacy compatibility — new clients should use POST /v1/sessions/:id/end directly.
//
// Translates the legacy `/api/sessions/summarize` request into a call to
// EndSessionService. The legacy shape carries `contentSessionId` and an
// optional `last_assistant_message`; we resolve the server_session by
// (team, project, external_session_id=contentSessionId), then end it.
//
// Re-summarizing the same session collapses to the same outbox row because
// the (team_id, project_id, source_type='session_summary', source_id)
// UNIQUE constraint stays in force — exactly the same idempotency guarantee
// as `/v1/sessions/:id/end`.
import type { Application, Request, Response } from 'express';
import { z } from 'zod';
import type { RouteHandler } from '../../services/server/Server.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import { PostgresServerSessionsRepository } from '../../storage/postgres/server-sessions.js';
import { logger } from '../../utils/logger.js';
import { requirePostgresServerAuth } from '../middleware/postgres-auth.js';
import { EndSessionService } from '../services/EndSessionService.js';
import { resolveServerSession } from './SessionsObservationsAdapter.js';
const summarizeSchema = z.object({
contentSessionId: z.string().min(1),
last_assistant_message: z.string().optional(),
agentId: z.string().optional(),
platformSource: z.string().optional(),
}).passthrough();
export interface SessionsSummarizeAdapterOptions {
pool: PostgresPool;
endSession: EndSessionService;
authMode?: string;
allowLocalDevBypass?: boolean;
}
export class SessionsSummarizeAdapter implements RouteHandler {
constructor(private readonly options: SessionsSummarizeAdapterOptions) {}
setupRoutes(app: Application): void {
const writeAuth = requirePostgresServerAuth(this.options.pool, {
authMode: this.options.authMode,
allowLocalDevBypass: this.options.allowLocalDevBypass,
requiredScopes: ['memories:write'],
});
app.post('/api/sessions/summarize', writeAuth, this.asyncHandler(async (req, res) => {
const parsed = summarizeSchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({ error: 'ValidationError', issues: parsed.error.issues });
return;
}
const teamId = req.authContext?.teamId ?? null;
const projectId = req.authContext?.projectId ?? null;
if (!teamId) {
res.status(403).json({ error: 'Forbidden', message: 'API key is not bound to a team' });
return;
}
if (!projectId) {
res.status(400).json({
error: 'BadRequest',
message: 'Legacy /api/sessions/summarize requires a project-scoped API key',
});
return;
}
// Subagent contexts in legacy code emit summarize calls but the worker
// skipped them. We preserve the legacy semantics so existing clients
// see the same response shape.
if (parsed.data.agentId) {
res.json({ status: 'skipped', reason: 'subagent_context' });
return;
}
try {
const session = await resolveServerSession({
pool: this.options.pool,
teamId,
projectId,
contentSessionId: parsed.data.contentSessionId,
platformSource: typeof parsed.data.platformSource === 'string' ? parsed.data.platformSource : null,
agentId: null,
agentType: null,
});
const result = await this.options.endSession.end({
sessionId: session.id,
projectId,
teamId,
source: 'http_post_api_sessions_summarize',
apiKeyId: req.authContext?.apiKeyId ?? null,
actorId: null,
sourceAdapter: 'claude-code-compat',
});
if (!result.session) {
res.status(404).json({ status: 'not_found', reason: 'session_not_found' });
return;
}
res.json({
status: 'queued',
sessionId: session.id,
serverSessionId: session.id,
generationJobId: result.outbox?.id ?? null,
transport: result.enqueueState,
});
} catch (error) {
logger.error('SYSTEM', 'compat summarize adapter failed', {
error: error instanceof Error ? error.message : String(error),
contentSessionId: parsed.data.contentSessionId,
});
res.status(500).json({ status: 'error', reason: 'internal_error' });
}
}));
}
private asyncHandler(fn: (req: Request, res: Response) => Promise<void> | void) {
return (req: Request, res: Response, next: (err?: unknown) => void): void => {
Promise.resolve(fn(req, res)).catch(next);
};
}
}
// Reference the imported PostgresServerSessionsRepository so the import is
// retained even when tree-shaking is aggressive in the main bundle.
void PostgresServerSessionsRepository;
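The idempotency guarantee described in the summarize adapter's header comment can be modeled in memory. This sketch assumes a hypothetical `InMemoryOutbox` class keyed by the same `(team_id, project_id, source_type, source_id)` tuple; it shows one way a unique key collapses repeated requests to a single row and does not reproduce how `EndSessionService` or the Postgres table actually implement it.

```typescript
// Hypothetical in-memory model of the outbox UNIQUE constraint; the real
// constraint lives in Postgres.
interface OutboxRow {
  id: string;
  teamId: string;
  projectId: string;
  sourceType: 'session_summary';
  sourceId: string;
}

class InMemoryOutbox {
  private readonly rows = new Map<string, OutboxRow>();
  private nextId = 1;

  // Returns the existing row when the unique key is already present,
  // otherwise inserts a new one.
  enqueueSummary(teamId: string, projectId: string, sessionId: string): OutboxRow {
    const key = JSON.stringify([teamId, projectId, 'session_summary', sessionId]);
    const existing = this.rows.get(key);
    if (existing) return existing;
    const row: OutboxRow = {
      id: `job-${this.nextId++}`,
      teamId,
      projectId,
      sourceType: 'session_summary',
      sourceId: sessionId,
    };
    this.rows.set(key, row);
    return row;
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Summarizing the same session twice yields the same job id, while a different session in the same project gets its own row.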
@@ -0,0 +1,538 @@
// SPDX-License-Identifier: Apache-2.0
import type { Job } from 'bullmq';
import { logger } from '../../utils/logger.js';
import { PostgresAgentEventsRepository } from '../../storage/postgres/agent-events.js';
import { PostgresObservationGenerationJobRepository } from '../../storage/postgres/generation-jobs.js';
import { PostgresProjectsRepository } from '../../storage/postgres/projects.js';
import { PostgresAuthRepository } from '../../storage/postgres/auth.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import type { PostgresObservationGenerationJob } from '../../storage/postgres/generation-jobs.js';
import {
assertServerGenerationJobPayload,
ServerGenerationJobPayloadValidationError,
type ServerGenerationJobPayload,
} from '../jobs/types.js';
import { ServerClassifiedProviderError } from './providers/shared/error-classification.js';
import type { ServerGenerationProvider } from './providers/shared/types.js';
import {
markGenerationFailed,
processGeneratedResponse,
processSessionSummaryResponse,
type ProcessGeneratedResponseOutcome,
} from './processGeneratedResponse.js';
import { PostgresServerSessionsRepository } from '../../storage/postgres/server-sessions.js';
// Phase 11 — sentinel exception class so the worker can distinguish
// scope-violation/revoked-key failures from generic processor errors and
// audit them under the right action. Marked non-retryable: a job whose
// payload was tampered with must never be retried back into the queue.
export class ServerGenerationScopeViolationError extends Error {
readonly reason: 'scope_mismatch' | 'revoked_key';
constructor(reason: 'scope_mismatch' | 'revoked_key', message: string) {
super(message);
this.reason = reason;
}
}
// ProviderObservationGenerator is the BullMQ Worker processor for server-beta
// observation generation. It does the following on every job invocation:
//
// 1. Reload the Postgres outbox row and the source agent_events row.
// 2. Lock the outbox by transitioning queued -> processing.
// 3. Call the provider with a fully-reloaded ServerGenerationContext.
// BullMQ payload data is advisory only.
// 4. Hand the raw response to processGeneratedResponse, which persists +
// links + advances outbox in one Postgres transaction.
// 5. On provider/parse error, route through markGenerationFailed which
// decides retry vs final failure based on attempt count + error class.
//
// Anti-pattern guards verified at the boundary:
// - no imports from src/services/worker/*
// - no use of WorkerRef / ActiveSession / SessionStore
// - no assumption of Claude Code transcript shape
export interface ProviderObservationGeneratorOptions {
pool: PostgresPool;
provider: ServerGenerationProvider;
workerId?: string;
}
export class ProviderObservationGenerator {
constructor(private readonly options: ProviderObservationGeneratorOptions) {}
/**
* Worker entrypoint. Returns a small JSON summary on success so BullMQ's
* completed-state telemetry has something to inspect, but Postgres remains
* the canonical authority.
*/
async process(
job: Job<ServerGenerationJobPayload>,
): Promise<{ jobId: string; status: 'completed'; observationCount: number }> {
const correlationId = `bullmq:${job.id ?? '?'}`;
// Phase 12 — pivot id captured up front so every log line in this
// dispatch carries the same identifier whether or not we manage to
// load the canonical row. requestId comes from payload (HTTP middleware).
const payloadRequestId = (job.data as { request_id?: string | null } | undefined)?.request_id ?? null;
// Phase 11 — validate the BullMQ payload against the discriminated-union
// schema BEFORE doing anything else. A malformed payload (missing
// team_id, project_id, generation_job_id, etc.) means the enqueue path
// bypassed the boundary contract; we refuse to run it. Throwing surfaces
// it on BullMQ's failed list with a clear message.
let payload: ServerGenerationJobPayload;
try {
payload = assertServerGenerationJobPayload(job.data);
} catch (error) {
if (error instanceof ServerGenerationJobPayloadValidationError) {
logger.error('SYSTEM', 'rejecting malformed job payload at execution', {
correlationId,
issues: error.issues,
});
}
throw error;
}
if (payload.kind !== 'event' && payload.kind !== 'event-batch' && payload.kind !== 'summary') {
logger.warn('SYSTEM', 'unsupported job kind for ProviderObservationGenerator', {
correlationId,
kind: payload.kind,
});
throw new Error(`unsupported job kind: ${payload.kind}`);
}
// Phase 11 — anti-bypass guard. We MUST NOT trust BullMQ payload data
// for tenant scope. Reload the canonical outbox row keyed by id only
// (no scope filter), then compare its team_id/project_id to the
// payload's. A mismatch indicates payload tampering or a programmer
// bug; either way we audit and refuse.
const candidate = await this.loadCanonicalOutbox(payload.generation_job_id);
if (!candidate) {
logger.info('SYSTEM', 'job row not found by id; nothing to do', {
correlationId,
generationJobId: payload.generation_job_id,
});
return { jobId: payload.generation_job_id, status: 'completed', observationCount: 0 };
}
if (candidate.teamId !== payload.team_id || candidate.projectId !== payload.project_id) {
const violation = new ServerGenerationScopeViolationError(
'scope_mismatch',
`BullMQ payload team/project does not match outbox row (jobId=${payload.generation_job_id})`,
);
await this.auditScopeViolation(payload, candidate, violation, correlationId);
// Tag the row as failed so subsequent retries do not pick it up.
await markGenerationFailed({
pool: this.options.pool,
job: candidate,
reason: violation.message,
classification: 'scope_mismatch',
retryable: false,
...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
});
throw violation;
}
// Phase 11 — revocation check. If the api_key that initiated this job
// was revoked between enqueue and execution, do not generate. Audit
// and fail without retry.
if (payload.api_key_id) {
const revoked = await this.isApiKeyRevoked(payload.api_key_id);
if (revoked) {
const violation = new ServerGenerationScopeViolationError(
'revoked_key',
`api key ${payload.api_key_id} is revoked; refusing to generate for outbox ${candidate.id}`,
);
await this.auditRevokedKey(payload, candidate, violation, correlationId);
await markGenerationFailed({
pool: this.options.pool,
job: candidate,
reason: violation.message,
classification: 'revoked_key',
retryable: false,
...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
});
throw violation;
}
}
const fresh = await this.lockOutbox(payload.generation_job_id, payload.team_id, payload.project_id);
if (!fresh) {
logger.info('SYSTEM', 'job no longer exists or is in terminal status; nothing to do', {
correlationId,
generationJobId: payload.generation_job_id,
});
return { jobId: payload.generation_job_id, status: 'completed', observationCount: 0 };
}
// Phase 11 — emit "processing started" audit so we have a row even if
// the provider crashes before completion.
// Phase 12 — log+audit carry the same job_id / request_id so support
// can pivot from BullMQ id -> outbox id -> originating HTTP request.
logger.info('SYSTEM', '[generation] job locked for processing', {
correlationId,
jobId: fresh.id,
bullmqJobId: job.id ?? null,
requestId: payloadRequestId,
sourceType: fresh.sourceType,
attempt: fresh.attempts,
});
await this.auditEvent({
teamId: fresh.teamId,
projectId: fresh.projectId,
apiKeyId: payload.api_key_id,
actorId: payload.actor_id,
action: 'generation_job.processing',
resourceId: fresh.id,
details: {
sourceType: fresh.sourceType,
sourceId: fresh.sourceId,
sourceAdapter: payload.source_adapter,
attempt: fresh.attempts,
correlationId,
requestId: payloadRequestId,
},
});
try {
const events = await this.loadEvents(fresh, payload);
const project = await this.loadProject(fresh);
const result = await this.options.provider.generate({
job: fresh,
events,
project: {
projectId: fresh.projectId,
teamId: fresh.teamId,
serverSessionId: fresh.serverSessionId,
projectName: project?.name ?? null,
},
});
const persistInput = {
pool: this.options.pool,
job: fresh,
rawText: result.rawText,
modelId: result.modelId,
providerLabel: result.providerLabel,
// Phase 11 — flow identity context from BullMQ payload into the
// persistence layer so observations and audit rows carry the same
// generation_job_id reference back through to the original API key.
apiKeyId: payload.api_key_id,
actorId: payload.actor_id,
sourceAdapter: payload.source_adapter,
...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
};
const outcome: ProcessGeneratedResponseOutcome = fresh.sourceType === 'session_summary'
? await processSessionSummaryResponse(persistInput)
: await processGeneratedResponse(persistInput);
if (outcome.kind === 'parse_error') {
await markGenerationFailed({
pool: this.options.pool,
job: fresh,
reason: outcome.reason,
classification: 'parse_error',
retryable: false,
...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
});
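// NOTE: this throw is caught by the catch block below, which calls
// markGenerationFailed a second time with classification 'unknown'.
throw new Error(`generation parse error: ${outcome.reason}`);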
}
logger.info('SYSTEM', 'generation completed', {
correlationId,
jobId: outcome.jobId,
bullmqJobId: job.id ?? null,
requestId: payloadRequestId,
observationCount: outcome.observations.length,
privateContentDetected: outcome.privateContentDetected,
});
return {
jobId: outcome.jobId,
status: 'completed',
observationCount: outcome.observations.length,
};
} catch (error) {
const classified = error instanceof ServerClassifiedProviderError ? error : null;
const retryable = classified
? classified.kind === 'transient' || classified.kind === 'rate_limit'
: false;
await markGenerationFailed({
pool: this.options.pool,
job: fresh,
reason: error instanceof Error ? error.message : String(error),
classification: classified?.kind ?? 'unknown',
retryable,
...(this.options.workerId !== undefined ? { workerId: this.options.workerId } : {}),
});
throw error;
}
}
// Phase 11 — load the outbox row by id WITHOUT a scope filter so we can
// compare its team_id/project_id to the BullMQ payload as a tampering
// detector. Authoritative scope decisions still come from this row, NEVER
// from the BullMQ payload.
private async loadCanonicalOutbox(jobId: string): Promise<PostgresObservationGenerationJob | null> {
const result = await this.options.pool.query<{
id: string;
project_id: string;
team_id: string;
agent_event_id: string | null;
source_type: 'agent_event' | 'session_summary' | 'observation_reindex';
source_id: string;
server_session_id: string | null;
job_type: string;
status: 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';
idempotency_key: string;
bullmq_job_id: string | null;
attempts: number;
max_attempts: number;
next_attempt_at: Date | null;
locked_at: Date | null;
locked_by: string | null;
completed_at: Date | null;
failed_at: Date | null;
cancelled_at: Date | null;
last_error: unknown;
payload: unknown;
created_at: Date;
updated_at: Date;
}>(
'SELECT * FROM observation_generation_jobs WHERE id = $1',
[jobId],
);
const row = result.rows[0];
if (!row) return null;
return {
id: row.id,
projectId: row.project_id,
teamId: row.team_id,
agentEventId: row.agent_event_id,
sourceType: row.source_type,
sourceId: row.source_id,
serverSessionId: row.server_session_id,
jobType: row.job_type,
status: row.status,
idempotencyKey: row.idempotency_key,
bullmqJobId: row.bullmq_job_id,
attempts: row.attempts,
maxAttempts: row.max_attempts,
nextAttemptAtEpoch: row.next_attempt_at?.getTime() ?? null,
lockedAtEpoch: row.locked_at?.getTime() ?? null,
lockedBy: row.locked_by,
completedAtEpoch: row.completed_at?.getTime() ?? null,
failedAtEpoch: row.failed_at?.getTime() ?? null,
cancelledAtEpoch: row.cancelled_at?.getTime() ?? null,
lastError: row.last_error && typeof row.last_error === 'object'
? (row.last_error as Record<string, unknown>)
: null,
payload: row.payload && typeof row.payload === 'object' && !Array.isArray(row.payload)
? (row.payload as Record<string, unknown>)
: {},
createdAtEpoch: row.created_at.getTime(),
updatedAtEpoch: row.updated_at.getTime(),
};
}
private async isApiKeyRevoked(apiKeyId: string): Promise<boolean> {
const result = await this.options.pool.query<{ revoked_at: Date | null; expires_at: Date | null }>(
'SELECT revoked_at, expires_at FROM api_keys WHERE id = $1',
[apiKeyId],
);
const row = result.rows[0];
if (!row) {
// The key was deleted entirely. Treat as revoked.
return true;
}
if (row.revoked_at) return true;
if (row.expires_at && row.expires_at.getTime() <= Date.now()) return true;
return false;
}
private async auditScopeViolation(
payload: ServerGenerationJobPayload,
canonical: PostgresObservationGenerationJob,
error: ServerGenerationScopeViolationError,
correlationId: string,
): Promise<void> {
logger.error('SYSTEM', 'BullMQ payload scope mismatch — refusing to generate', {
correlationId,
generationJobId: payload.generation_job_id,
payloadTeamId: payload.team_id,
payloadProjectId: payload.project_id,
canonicalTeamId: canonical.teamId,
canonicalProjectId: canonical.projectId,
});
await this.auditEvent({
teamId: canonical.teamId,
projectId: canonical.projectId,
apiKeyId: payload.api_key_id,
actorId: payload.actor_id,
action: 'generation_job.scope_violation',
resourceId: canonical.id,
details: {
reason: 'scope_mismatch',
message: error.message,
payloadTeamId: payload.team_id,
payloadProjectId: payload.project_id,
canonicalTeamId: canonical.teamId,
canonicalProjectId: canonical.projectId,
sourceAdapter: payload.source_adapter,
correlationId,
},
});
}
private async auditRevokedKey(
payload: ServerGenerationJobPayload,
canonical: PostgresObservationGenerationJob,
error: ServerGenerationScopeViolationError,
correlationId: string,
): Promise<void> {
logger.warn('SYSTEM', 'api key revoked between enqueue and execute — refusing to generate', {
correlationId,
generationJobId: payload.generation_job_id,
apiKeyId: payload.api_key_id,
});
await this.auditEvent({
teamId: canonical.teamId,
projectId: canonical.projectId,
apiKeyId: payload.api_key_id,
actorId: payload.actor_id,
action: 'generation_job.revoked_key',
resourceId: canonical.id,
details: {
reason: 'revoked_key',
message: error.message,
sourceAdapter: payload.source_adapter,
correlationId,
},
});
}
private async auditEvent(input: {
teamId: string | null;
projectId: string | null;
apiKeyId: string | null;
actorId: string | null;
action: string;
resourceId: string | null;
details?: Record<string, unknown>;
}): Promise<void> {
try {
const repo = new PostgresAuthRepository(this.options.pool);
await repo.createAuditLog({
teamId: input.teamId,
projectId: input.projectId,
actorId: input.actorId,
apiKeyId: input.apiKeyId,
action: input.action,
resourceType: 'observation_generation_job',
resourceId: input.resourceId,
details: input.details ?? {},
});
} catch (auditError) {
logger.warn('SYSTEM', 'audit_log insert failed in ProviderObservationGenerator', {
action: input.action,
error: auditError instanceof Error ? auditError.message : String(auditError),
});
}
}
private async lockOutbox(
jobId: string,
teamId: string,
projectId: string,
): Promise<PostgresObservationGenerationJob | null> {
const repo = new PostgresObservationGenerationJobRepository(this.options.pool);
const current = await repo.getByIdForScope({ id: jobId, projectId, teamId });
if (!current) {
return null;
}
if (current.status === 'completed' || current.status === 'cancelled' || current.status === 'failed') {
return null;
}
if (current.status === 'processing') {
// Another worker holds the lock — most commonly this fires when BullMQ
// redelivers a stalled job to a second worker while the first is still
// mid-`provider.generate()`. Returning the row here would cause both
// workers to issue the (paid, rate-limited) external provider call,
// and the persistence-level terminal-status guard only collapses the
// duplicate after the call has already happened. Skip instead. If the
// first worker truly died, `reconcileOnStartup` (and the next BullMQ
// retry) will resurrect the row.
logger.info('SYSTEM', 'generation job already in processing; skipping duplicate worker run', {
jobId: current.id,
lockedBy: current.lockedBy,
lockedAtEpoch: current.lockedAtEpoch,
attempts: current.attempts,
});
return null;
}
const transitioned = await repo.transitionStatus({
id: current.id,
projectId: current.projectId,
teamId: current.teamId,
status: 'processing',
lockedBy: this.options.workerId ?? 'server-beta-worker',
});
return transitioned;
}
private async loadEvents(
job: PostgresObservationGenerationJob,
payload: ServerGenerationJobPayload,
): Promise<NonNullable<Awaited<ReturnType<PostgresAgentEventsRepository['getByIdForScope']>>>[]> {
const repo = new PostgresAgentEventsRepository(this.options.pool);
type Event = NonNullable<Awaited<ReturnType<PostgresAgentEventsRepository['getByIdForScope']>>>;
if (job.sourceType === 'session_summary') {
// Summary jobs feed the provider every event tied to the server_session
// that hasn't already been collapsed into a completed event-generation
// job. The session repo enforces tenant scope inside its WHERE clause.
if (!job.serverSessionId) return [];
const sessions = new PostgresServerSessionsRepository(this.options.pool);
const events = await sessions.listUnprocessedEvents({
serverSessionId: job.serverSessionId,
projectId: job.projectId,
teamId: job.teamId,
});
return events;
}
if (job.sourceType !== 'agent_event') {
return [];
}
if (payload.kind === 'event') {
const event = await repo.getByIdForScope({
id: payload.agent_event_id,
projectId: job.projectId,
teamId: job.teamId,
});
return event ? [event] : [];
}
if (payload.kind === 'event-batch') {
const out: Event[] = [];
for (const id of payload.agent_event_ids) {
const event = await repo.getByIdForScope({
id,
projectId: job.projectId,
teamId: job.teamId,
});
if (event) out.push(event);
}
return out;
}
return [];
}
private async loadProject(job: PostgresObservationGenerationJob) {
const repo = new PostgresProjectsRepository(this.options.pool);
return await repo.getByIdForTeam(job.projectId, job.teamId);
}
}
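The pre-flight decisions that `process()` and `lockOutbox()` make before calling the provider can be condensed into two pure functions. This is a sketch under assumptions: `preflight`, `isRetryable`, and the trimmed `CanonicalRow` / `QueuePayload` types are hypothetical, and the API-key revocation check that sits between the scope comparison and the lock is omitted.

```typescript
type JobStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';

// Hypothetical, trimmed-down shapes for illustration only.
interface CanonicalRow {
  id: string;
  teamId: string;
  projectId: string;
  status: JobStatus;
}

interface QueuePayload {
  generation_job_id: string;
  team_id: string;
  project_id: string;
}

type Preflight =
  | 'skip_missing'
  | 'scope_mismatch'
  | 'skip_terminal'
  | 'skip_locked'
  | 'proceed';

function preflight(payload: QueuePayload, row: CanonicalRow | null): Preflight {
  // The row is loaded by id only; a missing row means there is nothing to do.
  if (!row) return 'skip_missing';
  // The queue payload is advisory: its scope must match the canonical row.
  if (row.teamId !== payload.team_id || row.projectId !== payload.project_id) {
    return 'scope_mismatch';
  }
  if (row.status === 'completed' || row.status === 'cancelled' || row.status === 'failed') {
    return 'skip_terminal';
  }
  // Another worker holds the lock; skip to avoid a duplicate provider call.
  if (row.status === 'processing') return 'skip_locked';
  return 'proceed';
}

// Only classified transient and rate-limit errors are retried; anything
// unclassified is treated as final.
function isRetryable(kind: string | null): boolean {
  return kind === 'transient' || kind === 'rate_limit';
}
```

Only a `queued` row whose scope matches the payload reaches the provider call; every other outcome returns early or fails without retry.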
@@ -0,0 +1,539 @@
// SPDX-License-Identifier: Apache-2.0
import { parseAgentXml, type ParsedObservation, type ParsedSummary } from '../../sdk/parser.js';
import { logger } from '../../utils/logger.js';
import {
PostgresObservationRepository,
PostgresObservationSourcesRepository,
buildObservationGenerationKey,
type PostgresObservation,
} from '../../storage/postgres/observations.js';
import {
PostgresObservationGenerationJobEventsRepository,
PostgresObservationGenerationJobRepository,
type PostgresObservationGenerationJob,
} from '../../storage/postgres/generation-jobs.js';
import { PostgresAuthRepository } from '../../storage/postgres/auth.js';
import {
withPostgresTransaction,
type PostgresPool,
} from '../../storage/postgres/pool.js';
import { stripTags } from '../../utils/tag-stripping.js';
// processGeneratedResponse owns the full "we got XML from a provider →
// persist + link + advance outbox" pipeline. Every side-effect runs inside
// a single Postgres transaction so retries are idempotent:
//
// - observations.generation_key (UNIQUE per team/project) collapses retry
// duplicates to a single row.
// - observation_sources (UNIQUE on observation_id, source_type, source_id)
// collapses duplicate source links.
// - observation_generation_jobs.transitionStatus is the lifecycle gate.
//
// The function NEVER touches worker SessionStore tables, NEVER assumes a
// Claude Code transcript shape, and ALWAYS reloads the job before mutating.
// BullMQ payload data is advisory; the outbox row is canonical.
export type ProcessGeneratedResponseOutcome =
| {
kind: 'completed';
jobId: string;
observations: PostgresObservation[];
privateContentDetected: boolean;
}
| { kind: 'parse_error'; jobId: string; reason: string };
export interface ProcessGeneratedResponseInput {
pool: PostgresPool;
job: PostgresObservationGenerationJob;
rawText: string;
modelId?: string;
providerLabel: string;
workerId?: string;
// Phase 11 — identity context propagated from the BullMQ payload (and
// ultimately the API-key that ingested the source row). Persisted on
// observation_sources.metadata for traceability and re-emitted in the
// observation.created audit row.
apiKeyId?: string | null;
actorId?: string | null;
sourceAdapter?: string | null;
}
export async function processGeneratedResponse(
input: ProcessGeneratedResponseInput,
): Promise<ProcessGeneratedResponseOutcome> {
const { job, rawText } = input;
const parsed = parseAgentXml(rawText, job.id);
if (!parsed.valid) {
return { kind: 'parse_error', jobId: job.id, reason: 'parser rejected response' };
}
// Skip-summary or zero-observation responses are still a success — the
// provider explicitly decided there's nothing worth recording (e.g.
// privacy-stripped batch). Mark the job completed with no observations.
const observationsToWrite = parsed.observations ?? [];
const skipped = parsed.summary?.skipped === true;
const privateContentDetected = skipped || observationsToWrite.length === 0;
return await withPostgresTransaction(input.pool, async (client) => {
const obsRepo = new PostgresObservationRepository(client);
const sourcesRepo = new PostgresObservationSourcesRepository(client);
const jobsRepo = new PostgresObservationGenerationJobRepository(client);
const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
const auditRepo = new PostgresAuthRepository(client);
// Reload the job inside the transaction. If another worker already moved
// it to a terminal status, skip persistence and return an empty observation list.
const fresh = await jobsRepo.getByIdForScope({
id: job.id,
projectId: job.projectId,
teamId: job.teamId,
});
if (!fresh) {
throw new Error(`generation job ${job.id} not found in scope`);
}
if (fresh.status === 'completed' || fresh.status === 'cancelled' || fresh.status === 'failed') {
logger.info('SYSTEM', 'generation job already in terminal status; skipping persistence', {
jobId: fresh.id,
status: fresh.status,
});
return {
kind: 'completed' as const,
jobId: fresh.id,
observations: [],
privateContentDetected,
};
}
const persisted: PostgresObservation[] = [];
for (let index = 0; index < observationsToWrite.length; index++) {
const parsedObservation = observationsToWrite[index]!;
const content = renderObservationContent(parsedObservation);
if (!content || content.trim().length === 0) {
continue;
}
// Defense-in-depth: even if the parser slipped a private-tagged
// string through, scrub before persisting.
const scrubbed = stripTags(content);
if (!scrubbed.stripped || scrubbed.stripped.trim().length === 0) {
continue;
}
const generationKey = buildObservationGenerationKey({
generationJobId: fresh.id,
parsedObservationIndex: index,
content: scrubbed.stripped,
});
const observation = await obsRepo.create({
projectId: fresh.projectId,
teamId: fresh.teamId,
serverSessionId: fresh.serverSessionId,
kind: parsedObservation.type ?? 'observation',
content: scrubbed.stripped,
generationKey,
metadata: {
title: parsedObservation.title,
subtitle: parsedObservation.subtitle,
facts: parsedObservation.facts,
narrative: parsedObservation.narrative,
concepts: parsedObservation.concepts,
files_read: parsedObservation.files_read,
files_modified: parsedObservation.files_modified,
provider: input.providerLabel,
model: input.modelId ?? null,
},
createdByJobId: fresh.id,
});
persisted.push(observation);
await sourcesRepo.addSource({
observationId: observation.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
sourceType: fresh.sourceType,
sourceId: fresh.sourceId,
agentEventId: fresh.agentEventId ?? null,
generationJobId: fresh.id,
metadata: {
provider: input.providerLabel,
parsedObservationIndex: index,
// Phase 11 — denormalize identity context for traceability so an
// operator can answer "which api key produced this observation?"
// without joining back through generation_job → outbox → key.
source_adapter: input.sourceAdapter ?? null,
actor_id: input.actorId ?? null,
api_key_id: input.apiKeyId ?? null,
},
});
// Phase 11: audit each generated observation. Every row carries the same
// generation_job_id so the audit chain (event_received →
// generation_job.queued → generation_job.processing →
// observation.created → observation.read) can be reconstructed.
try {
await auditRepo.createAuditLog({
teamId: fresh.teamId,
projectId: fresh.projectId,
actorId: input.actorId ?? null,
apiKeyId: input.apiKeyId ?? null,
action: 'observation.created',
resourceType: 'observation',
resourceId: observation.id,
details: {
generationJobId: fresh.id,
sourceType: fresh.sourceType,
sourceId: fresh.sourceId,
provider: input.providerLabel,
model: input.modelId ?? null,
sourceAdapter: input.sourceAdapter ?? null,
parsedObservationIndex: index,
},
});
} catch (auditError) {
logger.warn('SYSTEM', 'audit_log observation.created insert failed', {
observationId: observation.id,
error: auditError instanceof Error ? auditError.message : String(auditError),
});
}
}
// Advance outbox status. Phase 1 transitionStatus enforces legal
// transitions and tenant scope inside its WHERE clause.
await jobsRepo.transitionStatus({
id: fresh.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
status: 'completed',
});
await eventsLogRepo.append({
generationJobId: fresh.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
eventType: 'completed',
statusAfter: 'completed',
attempt: fresh.attempts,
details: {
provider: input.providerLabel,
model: input.modelId ?? null,
observationCount: persisted.length,
privateContentDetected,
workerId: input.workerId ?? null,
},
});
// Audit log. The catch below only stops the JavaScript exception from
// escaping. The insert still runs inside the surrounding transaction, so a
// failed statement rolls everything back. We accept that to keep the
// pipeline observable end-to-end.
try {
await auditRepo.createAuditLog({
teamId: fresh.teamId,
projectId: fresh.projectId,
actorId: input.actorId ?? null,
apiKeyId: input.apiKeyId ?? null,
action: 'generation_job.completed',
resourceType: 'observation_generation_job',
resourceId: fresh.id,
details: {
generationJobId: fresh.id,
provider: input.providerLabel,
model: input.modelId ?? null,
observationCount: persisted.length,
observationIds: persisted.map(o => o.id),
sourceAdapter: input.sourceAdapter ?? null,
},
});
} catch (auditError) {
// The audit log table may not have a metadata column on older
// schemas; swallow rather than failing generation.
logger.warn('SYSTEM', 'audit log insert failed during generation', {
jobId: fresh.id,
error: auditError instanceof Error ? auditError.message : String(auditError),
});
}
return {
kind: 'completed' as const,
jobId: fresh.id,
observations: persisted,
privateContentDetected,
};
});
}
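// Illustrative sketch, not part of this change. buildObservationGenerationKey is
// defined outside this hunk, so the hashing below is an assumption. It only shows
// why a key derived deterministically from (job id, observation index, content)
// makes a retried job collide with the UNIQUE index instead of adding a row.
// `sketchGenerationKey` and `fnv1a32` are hypothetical names; a real
// implementation would more likely use a cryptographic hash.

```typescript
interface SketchKeyInput {
  generationJobId: string;
  parsedObservationIndex: number;
  content: string;
}

// 32-bit FNV-1a over UTF-16 code units; chosen only to keep the sketch import-free.
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical stand-in for buildObservationGenerationKey: identical inputs
// always yield the identical key, so a retry maps onto the same UNIQUE value.
function sketchGenerationKey(input: SketchKeyInput): string {
  return `${input.generationJobId}:${input.parsedObservationIndex}:${fnv1a32(input.content)}`;
}

const firstAttemptKey = sketchGenerationKey({
  generationJobId: 'job-1',
  parsedObservationIndex: 0,
  content: 'Refactored the auth middleware',
});
const retryKey = sketchGenerationKey({
  generationJobId: 'job-1',
  parsedObservationIndex: 0,
  content: 'Refactored the auth middleware',
});
const secondObservationKey = sketchGenerationKey({
  generationJobId: 'job-1',
  parsedObservationIndex: 1,
  content: 'Refactored the auth middleware',
});

console.log(firstAttemptKey === retryKey); // true: the retry collapses onto the same key
console.log(firstAttemptKey === secondObservationKey); // false: a different index gets its own key
```

// Including the index keeps two observations with identical text in one
// response distinct, while the job id scopes the key to a single outbox row.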
export interface MarkGenerationFailedInput {
pool: PostgresPool;
job: PostgresObservationGenerationJob;
reason: string;
classification?: string;
retryable: boolean;
workerId?: string;
}
/**
* Record a generation failure on the outbox row. Used when the provider
* returned an error or invalid XML. Retryable failures with attempts
* remaining move the job back to `queued` (with a backoff `nextAttemptAt`)
* so reconciliation can re-enqueue; all other failures move it to `failed`.
* Jobs that are missing, completed, or cancelled are left untouched.
*/
export async function markGenerationFailed(input: MarkGenerationFailedInput): Promise<void> {
await withPostgresTransaction(input.pool, async (client) => {
const jobsRepo = new PostgresObservationGenerationJobRepository(client);
const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
const fresh = await jobsRepo.getByIdForScope({
id: input.job.id,
projectId: input.job.projectId,
teamId: input.job.teamId,
});
if (!fresh || fresh.status === 'completed' || fresh.status === 'cancelled') {
return;
}
const canRetry = input.retryable && fresh.attempts < fresh.maxAttempts;
const target = canRetry ? 'queued' : 'failed';
await jobsRepo.transitionStatus({
id: fresh.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
status: target,
lastError: { reason: input.reason, classification: input.classification ?? null },
...(canRetry ? { nextAttemptAt: new Date(Date.now() + retryDelayMs(fresh.attempts)) } : {}),
});
await eventsLogRepo.append({
generationJobId: fresh.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
eventType: canRetry ? 'retry_scheduled' : 'failed',
statusAfter: target,
attempt: fresh.attempts,
details: {
reason: input.reason,
classification: input.classification ?? null,
workerId: input.workerId ?? null,
},
});
});
}
/**
* Persist a parsed session summary as an observations row with kind='summary'.
*
* Wraps the same outbox transition / source-link / audit pipeline as
* processGeneratedResponse but emits a single 'summary'-kind observation
* derived from the summary fields. Idempotency is enforced through the same
* `observations.generation_key` UNIQUE index — re-running the summary job
* after a restart will collapse to one row.
*/
export async function processSessionSummaryResponse(
input: ProcessGeneratedResponseInput,
): Promise<ProcessGeneratedResponseOutcome> {
const { job, rawText } = input;
if (job.sourceType !== 'session_summary') {
return { kind: 'parse_error', jobId: job.id, reason: 'session summary processor invoked on non-summary job' };
}
const parsed = parseAgentXml(rawText, job.id);
if (!parsed.valid) {
return { kind: 'parse_error', jobId: job.id, reason: 'parser rejected summary response' };
}
const summary = parsed.summary ?? null;
const skipped = summary?.skipped === true;
const summaryContent = summary ? renderSummaryContent(summary) : '';
const privateContentDetected = skipped || summaryContent.trim().length === 0;
return await withPostgresTransaction(input.pool, async (client) => {
const obsRepo = new PostgresObservationRepository(client);
const sourcesRepo = new PostgresObservationSourcesRepository(client);
const jobsRepo = new PostgresObservationGenerationJobRepository(client);
const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
const auditRepo = new PostgresAuthRepository(client);
const fresh = await jobsRepo.getByIdForScope({
id: job.id,
projectId: job.projectId,
teamId: job.teamId,
});
if (!fresh) {
throw new Error(`session summary generation job ${job.id} not found in scope`);
}
if (fresh.status === 'completed' || fresh.status === 'cancelled' || fresh.status === 'failed') {
logger.info('SYSTEM', 'session summary job already in terminal status; skipping persistence', {
jobId: fresh.id,
status: fresh.status,
});
return {
kind: 'completed' as const,
jobId: fresh.id,
observations: [],
privateContentDetected,
};
}
const persisted: PostgresObservation[] = [];
if (!privateContentDetected) {
const scrubbed = stripTags(summaryContent);
const scrubbedContent = scrubbed.stripped ?? '';
if (scrubbedContent.trim().length > 0) {
const generationKey = buildObservationGenerationKey({
generationJobId: fresh.id,
parsedObservationIndex: 0,
content: scrubbedContent,
});
const observation = await obsRepo.create({
projectId: fresh.projectId,
teamId: fresh.teamId,
serverSessionId: fresh.serverSessionId,
kind: 'summary',
content: scrubbedContent,
generationKey,
metadata: {
request: summary?.request ?? null,
investigated: summary?.investigated ?? null,
learned: summary?.learned ?? null,
completed: summary?.completed ?? null,
next_steps: summary?.next_steps ?? null,
notes: summary?.notes ?? null,
provider: input.providerLabel,
model: input.modelId ?? null,
},
createdByJobId: fresh.id,
});
persisted.push(observation);
await sourcesRepo.addSource({
observationId: observation.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
sourceType: 'session_summary',
sourceId: fresh.sourceId,
generationJobId: fresh.id,
metadata: {
provider: input.providerLabel,
parsedObservationIndex: 0,
source_adapter: input.sourceAdapter ?? null,
actor_id: input.actorId ?? null,
api_key_id: input.apiKeyId ?? null,
},
});
// Phase 11 — observation.created audit for the summary observation.
try {
await auditRepo.createAuditLog({
teamId: fresh.teamId,
projectId: fresh.projectId,
actorId: input.actorId ?? null,
apiKeyId: input.apiKeyId ?? null,
action: 'observation.created',
resourceType: 'observation',
resourceId: observation.id,
details: {
generationJobId: fresh.id,
sourceType: 'session_summary',
sourceId: fresh.sourceId,
provider: input.providerLabel,
model: input.modelId ?? null,
sourceAdapter: input.sourceAdapter ?? null,
kind: 'summary',
},
});
} catch (auditError) {
logger.warn('SYSTEM', 'audit_log observation.created (summary) insert failed', {
observationId: observation.id,
error: auditError instanceof Error ? auditError.message : String(auditError),
});
}
}
}
await jobsRepo.transitionStatus({
id: fresh.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
status: 'completed',
});
await eventsLogRepo.append({
generationJobId: fresh.id,
projectId: fresh.projectId,
teamId: fresh.teamId,
eventType: 'completed',
statusAfter: 'completed',
attempt: fresh.attempts,
details: {
provider: input.providerLabel,
model: input.modelId ?? null,
observationCount: persisted.length,
privateContentDetected,
workerId: input.workerId ?? null,
sourceType: 'session_summary',
},
});
try {
await auditRepo.createAuditLog({
teamId: fresh.teamId,
projectId: fresh.projectId,
actorId: input.actorId ?? null,
apiKeyId: input.apiKeyId ?? null,
action: 'generation_job.completed',
resourceType: 'observation_generation_job',
resourceId: fresh.id,
details: {
generationJobId: fresh.id,
provider: input.providerLabel,
model: input.modelId ?? null,
observationCount: persisted.length,
observationIds: persisted.map(o => o.id),
sourceAdapter: input.sourceAdapter ?? null,
sourceType: 'session_summary',
},
});
} catch (auditError) {
logger.warn('SYSTEM', 'audit log insert failed during summary generation', {
jobId: fresh.id,
error: auditError instanceof Error ? auditError.message : String(auditError),
});
}
return {
kind: 'completed' as const,
jobId: fresh.id,
observations: persisted,
privateContentDetected,
};
});
}
function renderSummaryContent(summary: ParsedSummary): string {
const parts: string[] = [];
if (summary.request) parts.push(`Request: ${summary.request}`);
if (summary.investigated) parts.push(`Investigated: ${summary.investigated}`);
if (summary.learned) parts.push(`Learned: ${summary.learned}`);
if (summary.completed) parts.push(`Completed: ${summary.completed}`);
if (summary.next_steps) parts.push(`Next steps: ${summary.next_steps}`);
if (summary.notes) parts.push(`Notes: ${summary.notes}`);
return parts.join('\n\n').trim();
}
function renderObservationContent(observation: ParsedObservation): string {
const parts: string[] = [];
if (observation.title) parts.push(observation.title);
if (observation.subtitle) parts.push(observation.subtitle);
if (observation.narrative) parts.push(observation.narrative);
if (observation.facts && observation.facts.length > 0) {
parts.push(observation.facts.map(f => `- ${f}`).join('\n'));
}
return parts.join('\n\n').trim();
}
function retryDelayMs(attempts: number): number {
// Exponential backoff: 5s, 25s, 125s for attempts 0, 1, 2; every later
// attempt is capped at 10 minutes.
const base = 5000 * Math.pow(5, Math.max(0, attempts));
return Math.min(base, 10 * 60 * 1000);
}
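// Illustrative sketch, not part of this change: the schedule produced by the
// backoff formula above, using a local copy (`sketchRetryDelayMs`, a
// hypothetical name) so the snippet runs on its own.

```typescript
// Local copy of the formula in retryDelayMs: 5s * 5^attempts, capped at 10 minutes.
function sketchRetryDelayMs(attempts: number): number {
  const base = 5000 * Math.pow(5, Math.max(0, attempts));
  return Math.min(base, 10 * 60 * 1000);
}

const schedule = [0, 1, 2, 3, 4].map((attempts) => sketchRetryDelayMs(attempts));
console.log(schedule); // [5000, 25000, 125000, 600000, 600000]
```

// The uncapped value for attempts = 3 would be 625000 ms, so the 10-minute cap
// already applies from the fourth delay onward.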
@@ -0,0 +1,247 @@
// SPDX-License-Identifier: Apache-2.0
import { logger } from '../../../utils/logger.js';
import {
ServerClassifiedProviderError,
parseRetryAfterMs,
} from './shared/error-classification.js';
import { buildServerGenerationPrompt } from './shared/prompt-builder.js';
import type {
ServerGenerationContext,
ServerGenerationProvider,
ServerGenerationResult,
} from './shared/types.js';
const ANTHROPIC_API_URL = 'https://api.anthropic.com/v1/messages';
const ANTHROPIC_VERSION = '2023-06-01';
const DEFAULT_MODEL = 'claude-3-5-sonnet-latest';
export interface ClaudeObservationProviderOptions {
apiKey: string;
model?: string;
maxOutputTokens?: number;
fetchImpl?: typeof fetch;
}
interface AnthropicMessagesResponse {
content?: Array<{ type?: string; text?: string }>;
usage?: { input_tokens?: number; output_tokens?: number };
error?: { type?: string; message?: string };
}
export class ClaudeObservationProvider implements ServerGenerationProvider {
readonly providerLabel = 'claude' as const;
private readonly apiKey: string;
private readonly model: string;
private readonly maxOutputTokens: number;
private readonly fetchImpl: typeof fetch;
constructor(options: ClaudeObservationProviderOptions) {
if (!options.apiKey) {
throw new ServerClassifiedProviderError('Anthropic API key not configured', {
kind: 'auth_invalid',
cause: new Error('apiKey is required'),
});
}
this.apiKey = options.apiKey;
this.model = options.model ?? DEFAULT_MODEL;
this.maxOutputTokens = options.maxOutputTokens ?? 4096;
this.fetchImpl = options.fetchImpl ?? fetch;
}
async generate(
context: ServerGenerationContext,
signal?: AbortSignal,
): Promise<ServerGenerationResult> {
const { prompt, skippedAll } = buildServerGenerationPrompt(context);
if (skippedAll) {
// All events were scrubbed by privacy stripping. Don't bill the
// provider; return a synthetic skip response that the parser accepts.
return {
rawText: '<skip_summary reason="all_events_private" />',
providerLabel: this.providerLabel,
modelId: this.model,
};
}
let response: Response;
try {
response = await this.fetchImpl(ANTHROPIC_API_URL, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-api-key': this.apiKey,
'anthropic-version': ANTHROPIC_VERSION,
},
body: JSON.stringify({
model: this.model,
max_tokens: this.maxOutputTokens,
temperature: 0.3,
messages: [{ role: 'user', content: prompt }],
}),
signal,
});
} catch (networkError) {
throw classifyClaudeServerError({
cause: networkError,
});
}
if (!response.ok) {
const bodyText = await safeReadBody(response);
throw classifyClaudeServerError({
status: response.status,
bodyText,
headers: response.headers,
cause: new Error(`Anthropic API error: ${response.status} - ${bodyText}`),
});
}
let data: AnthropicMessagesResponse;
try {
data = (await response.json()) as AnthropicMessagesResponse;
} catch (parseError) {
throw new ServerClassifiedProviderError('Anthropic returned invalid JSON', {
kind: 'parse_error',
cause: parseError,
});
}
if (data.error) {
throw classifyClaudeServerError({
status: response.status,
bodyText: `${data.error.type ?? ''} ${data.error.message ?? ''}`,
headers: response.headers,
cause: new Error(`Anthropic API error: ${data.error.type} - ${data.error.message}`),
});
}
const blocks = Array.isArray(data.content) ? data.content : [];
const rawText = blocks
.filter(block => block?.type === 'text' && typeof block.text === 'string')
.map(block => block.text!)
.join('\n')
.trim();
if (!rawText) {
logger.warn('SDK', 'Anthropic returned no text content', {
provider: 'claude',
model: this.model,
});
}
const usage = data.usage ?? {};
const tokensUsed =
typeof usage.input_tokens === 'number' || typeof usage.output_tokens === 'number'
? (usage.input_tokens ?? 0) + (usage.output_tokens ?? 0)
: undefined;
return {
rawText,
...(tokensUsed !== undefined ? { tokensUsed } : {}),
providerLabel: this.providerLabel,
modelId: this.model,
};
}
}
interface ClassifyInput {
status?: number;
bodyText?: string;
headers?: Headers | { get(name: string): string | null };
cause: unknown;
}
/**
* Anthropic-specific HTTP error classification. Mirrors worker
* `classifyClaudeError`, but extracted for server-beta and rebound to
* Anthropic Messages REST semantics rather than SDK error classes.
*/
export function classifyClaudeServerError(input: ClassifyInput): ServerClassifiedProviderError {
const status = input.status;
const body = input.bodyText ?? '';
const lower = body.toLowerCase();
const retryAfterMs = input.headers ? parseRetryAfterMs(input.headers.get('retry-after')) : undefined;
if (lower.includes('overloaded')) {
return new ServerClassifiedProviderError(
`Anthropic overloaded${status !== undefined ? ` (status ${status})` : ''}`,
{ kind: 'transient', cause: input.cause },
);
}
if (status === 401 || status === 403 || lower.includes('invalid api key')) {
return new ServerClassifiedProviderError(
`Anthropic auth invalid${status !== undefined ? ` (status ${status})` : ''}`,
{ kind: 'auth_invalid', cause: input.cause },
);
}
if (status === 429) {
return new ServerClassifiedProviderError('Anthropic rate limit (429)', {
kind: 'rate_limit',
cause: input.cause,
...(retryAfterMs !== undefined ? { retryAfterMs } : {}),
});
}
if (lower.includes('quota exceeded')) {
return new ServerClassifiedProviderError('Anthropic quota exhausted', {
kind: 'quota_exhausted',
cause: input.cause,
});
}
if (
lower.includes('prompt is too long') ||
lower.includes('context window') ||
lower.includes('max_tokens')
) {
return new ServerClassifiedProviderError('Anthropic context overflow', {
kind: 'unrecoverable',
cause: input.cause,
});
}
if (status === 529) {
return new ServerClassifiedProviderError('Anthropic overloaded (529)', {
kind: 'transient',
cause: input.cause,
});
}
if (status !== undefined && status >= 500 && status < 600) {
return new ServerClassifiedProviderError(`Anthropic upstream error (status ${status})`, {
kind: 'transient',
cause: input.cause,
});
}
if (status === 400) {
return new ServerClassifiedProviderError('Anthropic bad request (400)', {
kind: 'unrecoverable',
cause: input.cause,
});
}
if (status === undefined) {
const message = input.cause instanceof Error ? input.cause.message : String(input.cause);
return new ServerClassifiedProviderError(`Anthropic network error: ${message}`, {
kind: 'transient',
cause: input.cause,
});
}
return new ServerClassifiedProviderError(
`Anthropic API error: ${status}${body ? ` - ${body.substring(0, 200)}` : ''}`,
{ kind: 'unrecoverable', cause: input.cause },
);
}
async function safeReadBody(response: Response): Promise<string> {
try {
return await response.text();
} catch {
return '';
}
}
@@ -0,0 +1,148 @@
// SPDX-License-Identifier: Apache-2.0
import { logger } from '../../../utils/logger.js';
import {
ServerClassifiedProviderError,
classifyHttpProviderError,
parseRetryAfterMs,
} from './shared/error-classification.js';
import { buildServerGenerationPrompt } from './shared/prompt-builder.js';
import type {
ServerGenerationContext,
ServerGenerationProvider,
ServerGenerationResult,
} from './shared/types.js';
const GEMINI_API_URL = 'https://generativelanguage.googleapis.com/v1/models';
const DEFAULT_MODEL = 'gemini-2.5-flash';
export interface GeminiObservationProviderOptions {
apiKey: string;
model?: string;
maxOutputTokens?: number;
fetchImpl?: typeof fetch;
}
interface GeminiResponse {
candidates?: Array<{
content?: { parts?: Array<{ text?: string }> };
}>;
usageMetadata?: { totalTokenCount?: number };
error?: { code?: number; status?: string; message?: string };
}
export class GeminiObservationProvider implements ServerGenerationProvider {
readonly providerLabel = 'gemini' as const;
private readonly apiKey: string;
private readonly model: string;
private readonly maxOutputTokens: number;
private readonly fetchImpl: typeof fetch;
constructor(options: GeminiObservationProviderOptions) {
if (!options.apiKey) {
throw new ServerClassifiedProviderError('Gemini API key not configured', {
kind: 'auth_invalid',
cause: new Error('apiKey is required'),
});
}
this.apiKey = options.apiKey;
this.model = options.model ?? DEFAULT_MODEL;
this.maxOutputTokens = options.maxOutputTokens ?? 4096;
this.fetchImpl = options.fetchImpl ?? fetch;
}
async generate(
context: ServerGenerationContext,
signal?: AbortSignal,
): Promise<ServerGenerationResult> {
const { prompt, skippedAll } = buildServerGenerationPrompt(context);
if (skippedAll) {
return {
rawText: '<skip_summary reason="all_events_private" />',
providerLabel: this.providerLabel,
modelId: this.model,
};
}
const url = `${GEMINI_API_URL}/${encodeURIComponent(this.model)}:generateContent?key=${encodeURIComponent(this.apiKey)}`;
let response: Response;
try {
response = await this.fetchImpl(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
contents: [{ role: 'user', parts: [{ text: prompt }] }],
generationConfig: {
temperature: 0.3,
maxOutputTokens: this.maxOutputTokens,
},
}),
signal,
});
} catch (networkError) {
throw classifyHttpProviderError({
cause: networkError,
providerLabel: 'Gemini',
});
}
if (!response.ok) {
const bodyText = await safeReadBody(response);
throw classifyHttpProviderError({
status: response.status,
bodyText,
headers: response.headers,
cause: new Error(`Gemini API error: ${response.status} - ${bodyText}`),
providerLabel: 'Gemini',
});
}
let data: GeminiResponse;
try {
data = (await response.json()) as GeminiResponse;
} catch (parseError) {
throw new ServerClassifiedProviderError('Gemini returned invalid JSON', {
kind: 'parse_error',
cause: parseError,
});
}
if (data.error) {
throw classifyHttpProviderError({
status: response.status,
bodyText: `${data.error.status ?? ''} ${data.error.message ?? ''}`,
headers: response.headers,
cause: new Error(`Gemini API error: ${data.error.status} - ${data.error.message}`),
providerLabel: 'Gemini',
});
}
const rawText = data.candidates?.[0]?.content?.parts?.[0]?.text?.trim() ?? '';
if (!rawText) {
logger.warn('SDK', 'Gemini returned empty content', { provider: 'gemini', model: this.model });
}
const tokensUsed = typeof data.usageMetadata?.totalTokenCount === 'number'
? data.usageMetadata.totalTokenCount
: undefined;
return {
rawText,
...(tokensUsed !== undefined ? { tokensUsed } : {}),
providerLabel: this.providerLabel,
modelId: this.model,
};
}
}
// Re-export for tests/auditing parity with worker classifier surface.
export { parseRetryAfterMs };
async function safeReadBody(response: Response): Promise<string> {
try {
return await response.text();
} catch {
return '';
}
}
@@ -0,0 +1,151 @@
// SPDX-License-Identifier: Apache-2.0
import { logger } from '../../../utils/logger.js';
import {
ServerClassifiedProviderError,
classifyHttpProviderError,
} from './shared/error-classification.js';
import { buildServerGenerationPrompt } from './shared/prompt-builder.js';
import type {
ServerGenerationContext,
ServerGenerationProvider,
ServerGenerationResult,
} from './shared/types.js';
const OPENROUTER_API_URL = 'https://openrouter.ai/api/v1/chat/completions';
const DEFAULT_MODEL = 'anthropic/claude-3.5-sonnet';
export interface OpenRouterObservationProviderOptions {
apiKey: string;
model?: string;
maxOutputTokens?: number;
siteUrl?: string;
appName?: string;
fetchImpl?: typeof fetch;
}
interface OpenRouterResponse {
choices?: Array<{ message?: { content?: string } }>;
usage?: { total_tokens?: number };
error?: { code?: string | number; message?: string };
}
export class OpenRouterObservationProvider implements ServerGenerationProvider {
readonly providerLabel = 'openrouter' as const;
private readonly apiKey: string;
private readonly model: string;
private readonly maxOutputTokens: number;
private readonly siteUrl: string;
private readonly appName: string;
private readonly fetchImpl: typeof fetch;
constructor(options: OpenRouterObservationProviderOptions) {
if (!options.apiKey) {
throw new ServerClassifiedProviderError('OpenRouter API key not configured', {
kind: 'auth_invalid',
cause: new Error('apiKey is required'),
});
}
this.apiKey = options.apiKey;
this.model = options.model ?? DEFAULT_MODEL;
this.maxOutputTokens = options.maxOutputTokens ?? 4096;
this.siteUrl = options.siteUrl ?? 'https://github.com/thedotmack/claude-mem';
this.appName = options.appName ?? 'claude-mem';
this.fetchImpl = options.fetchImpl ?? fetch;
}
async generate(
context: ServerGenerationContext,
signal?: AbortSignal,
): Promise<ServerGenerationResult> {
const { prompt, skippedAll } = buildServerGenerationPrompt(context);
if (skippedAll) {
return {
rawText: '<skip_summary reason="all_events_private" />',
providerLabel: this.providerLabel,
modelId: this.model,
};
}
let response: Response;
try {
response = await this.fetchImpl(OPENROUTER_API_URL, {
method: 'POST',
headers: {
Authorization: `Bearer ${this.apiKey}`,
'HTTP-Referer': this.siteUrl,
'X-Title': this.appName,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: this.model,
messages: [{ role: 'user', content: prompt }],
temperature: 0.3,
max_tokens: this.maxOutputTokens,
}),
signal,
});
} catch (networkError) {
throw classifyHttpProviderError({
cause: networkError,
providerLabel: 'OpenRouter',
});
}
if (!response.ok) {
const bodyText = await safeReadBody(response);
throw classifyHttpProviderError({
status: response.status,
bodyText,
headers: response.headers,
cause: new Error(`OpenRouter API error: ${response.status} - ${bodyText}`),
providerLabel: 'OpenRouter',
});
}
let data: OpenRouterResponse;
try {
data = (await response.json()) as OpenRouterResponse;
} catch (parseError) {
throw new ServerClassifiedProviderError('OpenRouter returned invalid JSON', {
kind: 'parse_error',
cause: parseError,
});
}
if (data.error) {
throw classifyHttpProviderError({
status: response.status,
bodyText: `${data.error.code ?? ''} ${data.error.message ?? ''}`,
headers: response.headers,
cause: new Error(`OpenRouter API error: ${data.error.code} - ${data.error.message}`),
providerLabel: 'OpenRouter',
});
}
const rawText = data.choices?.[0]?.message?.content?.trim() ?? '';
if (!rawText) {
logger.warn('SDK', 'OpenRouter returned empty content', {
provider: 'openrouter',
model: this.model,
});
}
const tokensUsed = typeof data.usage?.total_tokens === 'number' ? data.usage.total_tokens : undefined;
return {
rawText,
...(tokensUsed !== undefined ? { tokensUsed } : {}),
providerLabel: this.providerLabel,
modelId: this.model,
};
}
}
async function safeReadBody(response: Response): Promise<string> {
try {
return await response.text();
} catch {
return '';
}
}
@@ -0,0 +1,136 @@
// SPDX-License-Identifier: Apache-2.0
// Server-beta-local copy of the worker provider error classification model.
// Phase 5 anti-pattern guard: src/server/* must not import from
// src/services/worker/*, so we duplicate the small, stable error model here.
// Worker code keeps src/services/worker/provider-errors.ts unchanged.
export type ServerProviderErrorClass =
| 'transient'
| 'unrecoverable'
| 'rate_limit'
| 'quota_exhausted'
| 'auth_invalid'
| 'parse_error'
| (string & {});
export class ServerClassifiedProviderError extends Error {
readonly kind: ServerProviderErrorClass;
readonly retryAfterMs?: number;
readonly cause: unknown;
constructor(
message: string,
opts: {
kind: ServerProviderErrorClass;
cause: unknown;
retryAfterMs?: number;
},
) {
super(message);
this.name = 'ServerClassifiedProviderError';
this.kind = opts.kind;
this.cause = opts.cause;
if (opts.retryAfterMs !== undefined) {
this.retryAfterMs = opts.retryAfterMs;
}
}
}
export function isServerClassified(err: unknown): err is ServerClassifiedProviderError {
return err instanceof ServerClassifiedProviderError;
}
/**
* Parse Retry-After header (seconds or HTTP-date). Returns ms or undefined.
* Behavior intentionally mirrors the worker providers' helper so server
* retries match worker retry policy.
*/
export function parseRetryAfterMs(value: string | null): number | undefined {
if (!value) return undefined;
const seconds = Number(value);
if (!Number.isNaN(seconds) && seconds >= 0) {
return Math.floor(seconds * 1000);
}
const dateMs = Date.parse(value);
if (!Number.isNaN(dateMs)) {
const delta = dateMs - Date.now();
return delta > 0 ? delta : 0;
}
return undefined;
}
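// Illustrative sketch, not part of this change: the two Retry-After forms
// (delta-seconds and HTTP-date) exercised against a local copy of the helper
// above. `sketchParseRetryAfterMs` is a hypothetical name used so the snippet
// runs standalone.

```typescript
// Local copy of parseRetryAfterMs.
function sketchParseRetryAfterMs(value: string | null): number | undefined {
  if (!value) return undefined;
  const seconds = Number(value);
  if (!Number.isNaN(seconds) && seconds >= 0) {
    return Math.floor(seconds * 1000);
  }
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) {
    const delta = dateMs - Date.now();
    return delta > 0 ? delta : 0;
  }
  return undefined;
}

const fromSeconds = sketchParseRetryAfterMs('120'); // 120000
const fromMissing = sketchParseRetryAfterMs(null); // undefined
const fromGarbage = sketchParseRetryAfterMs('soon'); // undefined
// An HTTP-date in the past clamps to 0 rather than going negative.
const fromPastDate = sketchParseRetryAfterMs('Wed, 21 Oct 2015 07:28:00 GMT'); // 0

console.log(fromSeconds, fromMissing, fromGarbage, fromPastDate);
```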
interface ClassifyHttpInput {
status?: number;
bodyText?: string;
headers?: Headers | { get(name: string): string | null };
cause: unknown;
providerLabel: string;
}
/**
* Generic HTTP-error → ServerClassifiedProviderError mapping shared by the
* Gemini and OpenRouter server adapters. Quota markers in the response body
* (e.g. `resource_exhausted`, `insufficient_quota`) are checked first, before
* the status code. Anthropic-specific handling such as overload detection
* lives in the Claude adapter's own classifier, `classifyClaudeServerError`.
*/
export function classifyHttpProviderError(input: ClassifyHttpInput): ServerClassifiedProviderError {
const { status, providerLabel } = input;
const body = input.bodyText ?? '';
const lower = body.toLowerCase();
const retryAfterMs = input.headers ? parseRetryAfterMs(input.headers.get('retry-after')) : undefined;
if (
lower.includes('quota exceeded') ||
lower.includes('insufficient credits') ||
lower.includes('insufficient_quota') ||
lower.includes('resource_exhausted')
) {
return new ServerClassifiedProviderError(
`${providerLabel} quota exhausted${status !== undefined ? ` (status ${status})` : ''}`,
{ kind: 'quota_exhausted', cause: input.cause },
);
}
if (status === 429) {
return new ServerClassifiedProviderError(`${providerLabel} rate limit (429)`, {
kind: 'rate_limit',
cause: input.cause,
...(retryAfterMs !== undefined ? { retryAfterMs } : {}),
});
}
if (status === 401 || status === 403) {
return new ServerClassifiedProviderError(`${providerLabel} auth error (status ${status})`, {
kind: 'auth_invalid',
cause: input.cause,
});
}
if (status === 400 || status === 404) {
return new ServerClassifiedProviderError(`${providerLabel} bad request (status ${status})`, {
kind: 'unrecoverable',
cause: input.cause,
});
}
if (status !== undefined && status >= 500 && status < 600) {
return new ServerClassifiedProviderError(`${providerLabel} upstream error (status ${status})`, {
kind: 'transient',
cause: input.cause,
});
}
if (status === undefined) {
const message = input.cause instanceof Error ? input.cause.message : String(input.cause);
return new ServerClassifiedProviderError(`${providerLabel} network error: ${message}`, {
kind: 'transient',
cause: input.cause,
});
}
return new ServerClassifiedProviderError(
`${providerLabel} API error: ${status}${body ? ` - ${body.substring(0, 200)}` : ''}`,
{ kind: 'unrecoverable', cause: input.cause },
);
}
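Review note: the `kind` values above only pay off once a caller maps them to a retry decision. The sketch below is a rough illustration of one such mapping, not code from this PR. `decideRetry`, the base delay, and the cap are hypothetical, and treating `quota_exhausted` as non-retryable is an assumption.

```typescript
// Hypothetical sketch: none of these names exist in the PR.
type ErrorKind =
  | 'quota_exhausted'
  | 'rate_limit'
  | 'auth_invalid'
  | 'unrecoverable'
  | 'transient';

interface ClassifiedLike {
  kind: ErrorKind;
  retryAfterMs?: number;
}

type RetryDecision = { retry: false } | { retry: true; delayMs: number };

const BASE_DELAY_MS = 1_000; // assumed backoff base
const MAX_DELAY_MS = 60_000; // assumed backoff cap

function decideRetry(error: ClassifiedLike, attempt: number): RetryDecision {
  // Assumption: only rate limits and transient failures are worth retrying.
  if (error.kind !== 'rate_limit' && error.kind !== 'transient') {
    return { retry: false };
  }
  // Exponential backoff, capped.
  const backoff = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  // Honor the provider's Retry-After hint when it is longer than our backoff.
  const delayMs = Math.max(backoff, error.retryAfterMs ?? 0);
  return { retry: true, delayMs };
}
```

Keeping the decision separate from classification means a provider-specific wrapper only has to set `kind` and `retryAfterMs`.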
@@ -0,0 +1,164 @@
// SPDX-License-Identifier: Apache-2.0
import { ModeManager } from '../../../../services/domain/ModeManager.js';
import type { ModeConfig, ObservationType } from '../../../../services/domain/types.js';
import { stripTags } from '../../../../utils/tag-stripping.js';
import type { PostgresAgentEvent } from '../../../../storage/postgres/agent-events.js';
import type { ServerGenerationContext } from './types.js';
// Fallback list mirrors the default observation types used by claude-mem
// modes. The server-beta prompt does not strictly need a loaded mode file —
// the parser accepts any of these as the <type> value — so when no mode is
// loaded (tests, fresh installs) we synthesize a minimal type list rather
// than throwing.
const FALLBACK_OBSERVATION_TYPES: ReadonlyArray<Pick<ObservationType, 'id'>> = [
{ id: 'discovery' },
{ id: 'progress' },
{ id: 'blocker' },
{ id: 'decision' },
];
// Build a single-shot generation prompt from a list of AgentEvent records
// plus project/session metadata. Output: a user prompt asking the provider
// to return one or more <observation> XML blocks (or an empty response if
// the batch should be skipped). This is intentionally a single-turn request
// — server-beta does NOT use the worker's multi-turn SDK conversation
// model. parseAgentXml(...) accepts the response unchanged.
//
// Privacy: every event payload field passes through `stripTags` (which
// removes <private>, <claude-mem-context>, <system-reminder>, etc.) before
// being included in the prompt. Privacy enforcement here is belt-and-suspenders
// — `processGeneratedResponse` also discards observations that are entirely
// derived from privately-tagged inputs.
export interface BuildServerPromptResult {
readonly prompt: string;
readonly hadPrivateContent: boolean;
readonly skippedAll: boolean;
}
const MAX_PAYLOAD_CHARS = 16 * 1024;
export function buildServerGenerationPrompt(
context: ServerGenerationContext,
options: { mode?: ModeConfig } = {},
): BuildServerPromptResult {
const mode = options.mode ?? loadActiveModeOrFallback();
let hadPrivateContent = false;
let allEventsScrubbedToEmpty = true;
const eventBlocks: string[] = [];
for (const event of context.events) {
const block = buildEventBlock(event);
if (block.hadPrivate) {
hadPrivateContent = true;
}
if (block.body.length > 0) {
allEventsScrubbedToEmpty = false;
eventBlocks.push(block.body);
}
}
const skippedAll = context.events.length > 0 && allEventsScrubbedToEmpty;
const sessionTag = context.project.serverSessionId
? `\n <server_session_id>${escapeXml(context.project.serverSessionId)}</server_session_id>`
: '';
const projectTag = context.project.projectName
? `\n <project_name>${escapeXml(context.project.projectName)}</project_name>`
: '';
const observationOutputSchema = buildObservationOutputSchema(mode);
const prompt = [
'<server_beta_observation_request>',
` <project_id>${escapeXml(context.project.projectId)}</project_id>`,
` <team_id>${escapeXml(context.project.teamId)}</team_id>` + sessionTag + projectTag,
` <generation_job_id>${escapeXml(context.job.id)}</generation_job_id>`,
' <agent_events>',
eventBlocks.length > 0 ? eventBlocks.join('\n') : ' <!-- empty after privacy stripping -->',
' </agent_events>',
'</server_beta_observation_request>',
'',
'You are observing an agent at work. Return one or more',
'<observation>...</observation> XML blocks summarizing durable, useful',
'discoveries from the events above. If the events contain nothing worth',
'recording (e.g., everything was scrubbed by privacy filters or the',
'activity was trivial), return a single self-closing <skip_summary />',
'tag and nothing else. Do not include any prose outside the XML.',
'',
'Schema for each <observation> block:',
observationOutputSchema,
].join('\n');
return { prompt, hadPrivateContent, skippedAll };
}
interface EventBlockResult {
body: string;
hadPrivate: boolean;
}
function buildEventBlock(event: PostgresAgentEvent): EventBlockResult {
const rawPayload =
typeof event.payload === 'string' ? event.payload : JSON.stringify(event.payload ?? {}, null, 2);
const stripResult = stripTags(rawPayload);
const hadPrivate = (stripResult.counts.private ?? 0) > 0;
const truncatedPayload = stripResult.stripped.length > MAX_PAYLOAD_CHARS
? stripResult.stripped.slice(0, MAX_PAYLOAD_CHARS) + '\n[...truncated]'
: stripResult.stripped;
if (truncatedPayload.trim().length === 0) {
return { body: '', hadPrivate };
}
return {
body: [
' <agent_event>',
` <id>${escapeXml(event.id)}</id>`,
` <event_type>${escapeXml(event.eventType)}</event_type>`,
` <source_adapter>${escapeXml(event.sourceAdapter)}</source_adapter>`,
` <occurred_at>${new Date(event.occurredAtEpoch).toISOString()}</occurred_at>`,
' <payload>',
escapeXml(truncatedPayload),
' </payload>',
' </agent_event>',
].join('\n'),
hadPrivate,
};
}
function loadActiveModeOrFallback(): ModeConfig | { observation_types: ReadonlyArray<Pick<ObservationType, 'id'>> } {
try {
return ModeManager.getInstance().getActiveMode();
} catch {
return { observation_types: FALLBACK_OBSERVATION_TYPES };
}
}
function buildObservationOutputSchema(mode: ModeConfig | { observation_types: ReadonlyArray<Pick<ObservationType, 'id'>> }): string {
const types = mode.observation_types.map(t => t.id).join(' | ');
return [
'<observation>',
` <type>[ ${types} ]</type>`,
' <title>...</title>',
' <subtitle>...</subtitle>',
' <facts><fact>...</fact></facts>',
' <narrative>...</narrative>',
' <concepts><concept>...</concept></concepts>',
' <files_read><file>...</file></files_read>',
' <files_modified><file>...</file></files_modified>',
'</observation>',
].join('\n');
}
function escapeXml(text: string): string {
return text
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&apos;');
}
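Review note: `buildEventBlock` applies three steps in a fixed order: strip privacy tags, truncate, then XML-escape. The sketch below illustrates that order under stated assumptions. `stripPrivate` is a hypothetical stand-in that only handles `<private>` (the real `stripTags` is imported from `utils/tag-stripping.js` and covers more tags), and `preparePayload` is an illustrative name.

```typescript
// Hypothetical stand-in for stripTags: removes only <private>...</private> spans.
function stripPrivate(text: string): { stripped: string; privateCount: number } {
  let privateCount = 0;
  const stripped = text.replace(/<private>[\s\S]*?<\/private>/g, () => {
    privateCount += 1;
    return '';
  });
  return { stripped, privateCount };
}

function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Strip first so private text never reaches the prompt, truncate the stripped
// text, and escape last so the cut cannot split an entity such as &amp;.
function preparePayload(raw: string, maxChars: number): { body: string; hadPrivate: boolean } {
  const { stripped, privateCount } = stripPrivate(raw);
  const truncated = stripped.length > maxChars
    ? stripped.slice(0, maxChars) + '\n[...truncated]'
    : stripped;
  // An all-whitespace result is reported as empty so the event can be skipped.
  const body = truncated.trim().length === 0 ? '' : escapeXml(truncated);
  return { body, hadPrivate: privateCount > 0 };
}
```

Escaping before truncation would also work, but the limit would then count entity characters rather than payload characters.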
@@ -0,0 +1,33 @@
// SPDX-License-Identifier: Apache-2.0
import type { PostgresAgentEvent } from '../../../../storage/postgres/agent-events.js';
import type { PostgresObservationGenerationJob } from '../../../../storage/postgres/generation-jobs.js';
// ServerGenerationContext is the input handed to a server provider adapter.
// It is reloaded from Postgres on every retry; BullMQ payload is advisory.
// Anti-pattern guard: this MUST NOT carry worker session state.
export interface ServerGenerationContext {
readonly job: PostgresObservationGenerationJob;
readonly events: readonly PostgresAgentEvent[];
readonly project: {
readonly projectId: string;
readonly teamId: string;
readonly serverSessionId: string | null;
readonly projectName?: string | null;
};
}
// ServerGenerationResult is the raw provider response (XML accepted by
// parseAgentXml). Empty string means provider returned nothing — handled
// upstream as a "skip with no observation" outcome by processGeneratedResponse.
export interface ServerGenerationResult {
readonly rawText: string;
readonly tokensUsed?: number;
readonly providerLabel: string;
readonly modelId?: string;
}
export interface ServerGenerationProvider {
readonly providerLabel: 'claude' | 'gemini' | 'openrouter';
generate(context: ServerGenerationContext, signal?: AbortSignal): Promise<ServerGenerationResult>;
}
+177 -4
@@ -2,10 +2,12 @@
import {
Queue,
QueueEvents,
Worker,
type Job,
type JobsOptions,
type Processor,
type QueueEventsOptions,
type QueueOptions,
type WorkerOptions
} from 'bullmq';
@@ -33,6 +35,22 @@ export interface ServerJobCounts {
completed: number;
}
// Phase 12 — runtime stalled counter. BullMQ doesn't expose a stalled counter
// from getJobCounts (the underlying list is rotated on consumption). We keep
// a per-process counter that tracks how many distinct stalled events we've
// observed since startup. /api/health and /v1/info surface this.
export interface ServerJobLifecycleCounters {
stalled: number;
errored: number;
}
export interface ServerJobObservedListener {
onCompleted?: (jobId: string, durationMs: number, returnvalue: unknown) => void;
onFailed?: (jobId: string | undefined, attemptsMade: number, reason: string) => void;
onStalled?: (jobId: string) => void;
onError?: (error: unknown) => void;
}
export interface ServerJobQueueOptions<TPayload> {
name: string;
config: RedisQueueConfig;
@@ -63,7 +81,18 @@ export class ServerJobQueue<TPayload extends object = object> {
private readonly workerFactory?: ServerJobQueueOptions<TPayload>['workerFactory'];
private queue: ReturnType<NonNullable<ServerJobQueueOptions<TPayload>['queueFactory']>> | Queue<TPayload> | null = null;
private worker: ReturnType<NonNullable<ServerJobQueueOptions<TPayload>['workerFactory']>> | Worker<TPayload> | null = null;
private queueEvents: QueueEvents | null = null;
private started = false;
private readonly counters: ServerJobLifecycleCounters = { stalled: 0, errored: 0 };
private readonly listeners: ServerJobObservedListener[] = [];
private readonly jobStartTimes = new Map<string, number>();
// worker.on('stalled') and the QueueEvents 'stalled' subscriber both fire
// for the same job — BullMQ's docs explicitly recommend listening on both
// for production reliability. To avoid double-counting and double-callback
// we record each stalled jobId here for a short TTL and treat the second
// signal as an idempotent no-op.
private readonly recentlyStalled = new Map<string, NodeJS.Timeout>();
private static readonly STALLED_DEDUPE_WINDOW_MS = 30_000;
constructor(options: ServerJobQueueOptions<TPayload>) {
this.name = options.name;
@@ -154,6 +183,53 @@ export class ServerJobQueue<TPayload extends object = object> {
// BullMQ docs require `worker.on('error', ...)` to avoid unhandled rejections
// when a job throws. We construct the Worker with autorun: false so the
// caller controls startup explicitly via run().
//
// Phase 12 — wire `completed`, `failed`, `progress`, `error`, and the
// QueueEvents `stalled` listener. Stalled events go through QueueEvents
// because BullMQ's docs note rare stalls don't always reach the local
// worker.on('stalled') listener; QueueEvents publishes from Redis.
// Deduped stalled handler. Counts the stall once even though BullMQ may
// surface it via both worker.on('stalled') and QueueEvents 'stalled'.
private notifyStalled(jobId: string, source: 'worker' | 'queue-events'): void {
if (this.recentlyStalled.has(jobId)) {
logger.debug?.('QUEUE', `[generation] job=${jobId} stalled (suppressed duplicate from ${source})`, {
queue: this.name,
jobId,
source,
});
return;
}
const timer = setTimeout(() => {
this.recentlyStalled.delete(jobId);
}, ServerJobQueue.STALLED_DEDUPE_WINDOW_MS);
if (typeof (timer as { unref?: () => void }).unref === 'function') {
(timer as { unref: () => void }).unref();
}
this.recentlyStalled.set(jobId, timer);
this.counters.stalled += 1;
logger.warn('QUEUE', `[generation] job=${jobId} stalled${source === 'queue-events' ? ' (queue-events)' : ''}`, {
queue: this.name,
jobId,
source,
});
for (const l of this.listeners) {
try { l.onStalled?.(jobId); } catch { /* listener errors must not propagate */ }
}
}
// Single source of truth for queue-side error accounting. worker errors and
// QueueEvents errors both increment counters.errored and notify listeners,
// so per-process metrics aren't asymmetric across the two sources.
private notifyQueueError(error: unknown, source: 'worker' | 'queue-events'): void {
this.counters.errored += 1;
logger.warn('QUEUE', `${this.name} ${source} error`, {
error: error instanceof Error ? error.message : String(error),
});
for (const l of this.listeners) {
try { l.onError?.(error); } catch { /* listener errors must not propagate */ }
}
}
start(processor: Processor<TPayload>): void {
if (this.started) {
throw new Error(`ServerJobQueue ${this.name} is already started`);
@@ -168,22 +244,115 @@ export class ServerJobQueue<TPayload extends object = object> {
const worker = this.workerFactory
? this.workerFactory(this.name, processor, workerOptions)
: new Worker<TPayload>(this.name, processor, workerOptions);
worker.on('error', (error: unknown) => this.notifyQueueError(error, 'worker'));
// BullMQ Worker exposes `active`, `completed`, `failed`, `progress`, and
// `stalled` events. We attach to all five because the runtime relies on
// them for observability (Phase 12).
if (typeof (worker as { on?: unknown }).on === 'function') {
const w = worker as Worker<TPayload>;
w.on('active', (job: Job<TPayload>) => {
if (job.id) this.jobStartTimes.set(job.id, Date.now());
});
w.on('completed', (job: Job<TPayload>, returnvalue: unknown) => {
const startedAt = job.id ? this.jobStartTimes.get(job.id) : undefined;
const durationMs = startedAt !== undefined ? Date.now() - startedAt : 0;
if (job.id) this.jobStartTimes.delete(job.id);
const sourceType = (job.data as { source_type?: string } | undefined)?.source_type ?? '?';
logger.info('QUEUE', `[generation] job=${job.id ?? '?'} source_type=${sourceType} duration=${durationMs}ms`, {
queue: this.name,
jobId: job.id ?? null,
sourceType,
durationMs,
});
for (const l of this.listeners) {
try { l.onCompleted?.(job.id ?? '?', durationMs, returnvalue); } catch { /* swallow listener errors only */ }
}
});
w.on('failed', (job: Job<TPayload> | undefined, error: Error) => {
if (job?.id) this.jobStartTimes.delete(job.id);
const sourceType = (job?.data as { source_type?: string } | undefined)?.source_type ?? '?';
const attemptsMade = job?.attemptsMade ?? 0;
logger.warn('QUEUE', `[generation] job=${job?.id ?? '?'} source_type=${sourceType} attempts=${attemptsMade} reason=${error.message}`, {
queue: this.name,
jobId: job?.id ?? null,
sourceType,
attemptsMade,
reason: error.message,
});
for (const l of this.listeners) {
try { l.onFailed?.(job?.id, attemptsMade, error.message); } catch { /* swallow */ }
}
});
w.on('progress', (job: Job<TPayload>, progress: unknown) => {
logger.debug?.('QUEUE', `[generation] job=${job.id ?? '?'} progress`, {
queue: this.name,
jobId: job.id ?? null,
progress,
});
});
w.on('stalled', (jobId: string) => this.notifyStalled(jobId, 'worker'));
}
worker.run();
this.worker = worker;
// QueueEvents subscribes to Redis pub/sub for cross-process events
// (BullMQ "Stalled Jobs" docs recommend this for production reliability).
// Skip in test/factory mode since the test factory does not provide a
// real Redis connection.
if (!this.workerFactory) {
try {
const events = new QueueEvents(this.name, {
connection: this.config.connection,
prefix: this.config.prefix,
} as QueueEventsOptions);
events.on('stalled', ({ jobId }: { jobId: string }) => this.notifyStalled(jobId, 'queue-events'));
// QueueEvents emits its own 'error' too — surface through the same
// counter+listener path as worker errors so observability stays symmetric.
events.on('error', (error: Error) => this.notifyQueueError(error, 'queue-events'));
this.queueEvents = events;
} catch (error) {
logger.warn('QUEUE', `${this.name} failed to start QueueEvents listener`, {
error: error instanceof Error ? error.message : String(error),
});
}
}
this.started = true;
}
/**
* Phase 12 — register an observer for completed/failed/stalled/error
* events. Used by the runtime to surface lifecycle hooks (audit, metrics)
* without subclassing. Listeners that throw are isolated.
*/
observe(listener: ServerJobObservedListener): void {
this.listeners.push(listener);
}
/**
* Phase 12 — runtime counters for stalled/errored events. waiting/active/
* completed/failed/delayed live in `getCounts()` (BullMQ getJobCounts).
* Stalled is a per-process counter because BullMQ rotates the underlying
* list and there's no reliable count from getJobCounts.
*/
getLifecycleCounters(): ServerJobLifecycleCounters {
return { ...this.counters };
}
isStarted(): boolean {
return this.started;
}
async close(): Promise<void> {
const errors: Error[] = [];
if (this.queueEvents) {
try {
await this.queueEvents.close();
} catch (error) {
errors.push(error instanceof Error ? error : new Error(String(error)));
}
this.queueEvents = null;
}
if (this.worker) {
try {
await this.worker.close();
@@ -201,6 +370,10 @@ export class ServerJobQueue<TPayload extends object = object> {
}
this.queue = null;
}
for (const timer of this.recentlyStalled.values()) {
clearTimeout(timer);
}
this.recentlyStalled.clear();
if (errors.length > 0) {
throw errors[0];
}
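Review note: the stalled-event dedupe can be reasoned about without timers. The sketch below swaps the `setTimeout`-backed map used in `ServerJobQueue` for timestamp comparison with an injected clock value, purely so the behavior is deterministic to test. `StalledDeduper` and `shouldCount` are hypothetical names, not part of this PR.

```typescript
// Hypothetical sketch of the dedupe window, using timestamps instead of timers.
class StalledDeduper {
  private readonly lastCounted = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Returns true when the stall should be counted, false for a duplicate
  // signal arriving within the window that started at the counted signal.
  shouldCount(jobId: string, nowMs: number): boolean {
    const previous = this.lastCounted.get(jobId);
    if (previous !== undefined && nowMs - previous < this.windowMs) {
      return false;
    }
    this.lastCounted.set(jobId, nowMs);
    return true;
  }
}

const deduper = new StalledDeduper(30_000);
// The same stall reported by worker.on('stalled') and by QueueEvents.
const signals: Array<[string, number]> = [
  ['job-1', 0],
  ['job-1', 50],
  ['job-2', 60],
  ['job-1', 40_000],
];
let stalled = 0;
for (const [jobId, at] of signals) {
  if (deduper.shouldCount(jobId, at)) stalled += 1;
}
console.log(stalled); // prints 3
```

As in the timer version, a suppressed duplicate does not extend the window, so a job that stalls again after the window elapses is counted again.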
+10 -5
@@ -9,11 +9,12 @@ import type { JsonObject } from '../../storage/postgres/utils.js';
import { logger } from '../../utils/logger.js';
import { buildServerJobId } from './job-id.js';
import type { ServerJobQueue } from './ServerJobQueue.js';
import {
assertServerGenerationJobPayload,
type GenerateObservationsForEventJob,
type GenerateSessionSummaryJob,
type ReindexObservationJob,
type ServerGenerationJobKind,
} from './types.js';
// Postgres outbox is canonical history; BullMQ is the execution transport.
@@ -86,6 +87,10 @@ export async function enqueueOutbox(
});
try {
// Phase 11 — defense in depth. Validate the payload shape at the queue
// boundary so a malformed enqueue is rejected synchronously and never
// produces a job whose audit trail is missing fields.
assertServerGenerationJobPayload(payload);
await queue.add(bullmqJobId, payload);
await eventsRepo.append({
generationJobId: row.id,
+96
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: Apache-2.0
import { z } from 'zod';
import type {
ObservationGenerationJobSourceType,
ObservationGenerationJobStatus
@@ -9,6 +10,12 @@ export type ServerGenerationJobKind = 'event' | 'event-batch' | 'summary' | 'rei
export type ServerGenerationJobStatus = ObservationGenerationJobStatus;
// Phase 11 — every BullMQ job carries the full team-aware tracing surface so
// the worker can audit and scope-check on every retry. team_id and project_id
// are advisory: the worker MUST reload the canonical outbox row from Postgres
// and compare these fields before any side effect. Treating these as auth
// authority would be a bypass — the comparison is a tampering detector, not
// the auth gate.
export interface ServerGenerationJob {
kind: ServerGenerationJobKind;
team_id: string;
@@ -16,6 +23,18 @@ export interface ServerGenerationJob {
source_type: ObservationGenerationJobSourceType;
source_id: string;
generation_job_id: string;
// Identity of the API key that initiated this job at the HTTP boundary.
// Reused at execution time to detect revocation between enqueue and run.
api_key_id: string | null;
// The actor associated with the api key at enqueue time. Audit-only;
// never trust this for authz decisions.
actor_id: string | null;
// Legacy adapter or surface that produced the source row, for routing
// and audit (e.g. 'api', 'hooks', 'mcp', 'compat:sessions-observations').
source_adapter: string;
// Phase 12 — request correlation id, optional but always serialized as a
// nullable field so downstream consumers can rely on shape stability.
request_id?: string | null;
}
export interface GenerateObservationsForEventJob extends ServerGenerationJob {
@@ -57,3 +76,80 @@ export const SERVER_JOB_KIND_PREFIX: Record<ServerGenerationJobKind, string> = {
summary: 'sum',
reindex: 'rdx'
};
// Phase 11 — Zod schema validates payloads at the queue boundary so a
// malformed enqueue is rejected synchronously rather than silently producing
// a job the worker can't audit. Required fields here mirror the
// ServerGenerationJob interface; a missing team_id, project_id, or
// generation_job_id should always be a programmer error caught at enqueue.
const baseFieldsSchema = z.object({
team_id: z.string().min(1, 'team_id is required'),
project_id: z.string().min(1, 'project_id is required'),
source_type: z.enum(['agent_event', 'session_summary', 'observation_reindex']),
source_id: z.string().min(1, 'source_id is required'),
generation_job_id: z.string().min(1, 'generation_job_id is required'),
// api_key_id and actor_id are nullable to accommodate local-dev/system
// enqueues, but the *field* must be present in the payload so audit
// records always render the same shape.
api_key_id: z.string().min(1).nullable(),
actor_id: z.string().min(1).nullable(),
source_adapter: z.string().min(1, 'source_adapter is required'),
// Phase 12 — request_id is optional in the schema (older jobs predating
// this phase have nullable/missing values) but always passes through to
// logs and audit when present.
request_id: z.string().min(1).nullable().optional(),
});
export const GenerateObservationsForEventJobSchema = baseFieldsSchema.extend({
kind: z.literal('event'),
agent_event_id: z.string().min(1),
});
export const GenerateObservationsForEventBatchJobSchema = baseFieldsSchema.extend({
kind: z.literal('event-batch'),
agent_event_ids: z.array(z.string().min(1)).min(1),
});
export const GenerateSessionSummaryJobSchema = baseFieldsSchema.extend({
kind: z.literal('summary'),
server_session_id: z.string().min(1),
});
export const ReindexObservationJobSchema = baseFieldsSchema.extend({
kind: z.literal('reindex'),
observation_id: z.string().min(1),
});
export const ServerGenerationJobPayloadSchema = z.discriminatedUnion('kind', [
GenerateObservationsForEventJobSchema,
GenerateObservationsForEventBatchJobSchema,
GenerateSessionSummaryJobSchema,
ReindexObservationJobSchema,
]);
export class ServerGenerationJobPayloadValidationError extends Error {
readonly issues: z.ZodIssue[];
constructor(issues: z.ZodIssue[]) {
super(`invalid server generation job payload: ${issues.map(i => i.message).join('; ')}`);
this.issues = issues;
}
}
/**
* Validate a candidate BullMQ payload against the discriminated union and
* return a typed payload, or throw `ServerGenerationJobPayloadValidationError`.
* Use this at every enqueue site so a malformed payload never enters the
* transport — the worker MUST also re-validate from Postgres but defense in
* depth is cheap.
*/
export function assertServerGenerationJobPayload(
candidate: unknown,
): ServerGenerationJobPayload {
const result = ServerGenerationJobPayloadSchema.safeParse(candidate);
if (!result.success) {
throw new ServerGenerationJobPayloadValidationError(result.error.issues);
}
return result.data as ServerGenerationJobPayload;
}
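Review note: the comment on `ServerGenerationJob` treats `team_id` and `project_id` in the BullMQ payload as advisory, with the worker reloading the outbox row and comparing. The worker-side check is not part of this hunk, so the sketch below is only an illustration of that comparison. `PayloadLike`, `OutboxRowLike` (including its camelCase field names), and `findPayloadMismatches` are all hypothetical.

```typescript
// Hypothetical shapes: the real payload and row types carry more fields.
interface PayloadLike {
  team_id: string;
  project_id: string;
  generation_job_id: string;
}

interface OutboxRowLike {
  id: string;
  teamId: string;
  projectId: string;
}

// Returns the names of payload fields that disagree with the canonical row.
// An empty array means the payload is consistent with Postgres. It is a
// tampering detector only and grants no authority by itself.
function findPayloadMismatches(payload: PayloadLike, row: OutboxRowLike): string[] {
  const mismatches: string[] = [];
  if (payload.generation_job_id !== row.id) mismatches.push('generation_job_id');
  if (payload.team_id !== row.teamId) mismatches.push('team_id');
  if (payload.project_id !== row.projectId) mismatches.push('project_id');
  return mismatches;
}
```

A worker would presumably fail the job without side effects when the list is non-empty, and take every scoping decision from the reloaded row.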
+199
@@ -0,0 +1,199 @@
// SPDX-License-Identifier: Apache-2.0
import { createHash } from 'crypto';
import type { NextFunction, Request, RequestHandler, Response } from 'express';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import type { PostgresApiKey } from '../../storage/postgres/auth.js';
import type { AuthContext } from './auth.js';
// Postgres-backed auth middleware for the server-beta runtime.
//
// Mirrors src/server/middleware/auth.ts but reads API keys from the Postgres
// `api_keys` table instead of bun:sqlite. Phase 4 routes use this so the
// runtime depends only on the Postgres pool and Postgres-backed repositories.
//
// teamId / projectId on req.authContext come straight from the Postgres
// api_keys row. Routes use those to scope every read and write.
export interface PostgresRequireAuthOptions {
requiredScopes?: string[];
authMode?: string;
allowLocalDevBypass?: boolean;
// Local-dev fallback team for unauthenticated loopback requests. This is
// only used when authMode === 'local-dev' AND allowLocalDevBypass is true
// AND the request is on loopback. It must NEVER be used to scope a real
// production request.
localDevTeamId?: string | null;
}
export function requirePostgresServerAuth(
pool: PostgresPool,
options: PostgresRequireAuthOptions = {},
): RequestHandler {
return async (req: Request, res: Response, next: NextFunction) => {
try {
const authMode = options.authMode ?? process.env.CLAUDE_MEM_AUTH_MODE ?? 'api-key';
const authorization = req.header('authorization') ?? '';
const rawKey = parseBearerToken(authorization);
const allowLocalDevBypass = options.allowLocalDevBypass
?? process.env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS === '1';
if (
!rawKey
&& authMode === 'local-dev'
&& allowLocalDevBypass
&& isLocalhost(req)
&& hasLoopbackHostHeader(req)
&& !hasForwardedClientHeaders(req)
) {
const ctx: AuthContext = {
userId: null,
organizationId: null,
teamId: options.localDevTeamId ?? null,
projectId: null,
scopes: ['local-dev'],
apiKeyId: null,
mode: 'local-dev',
};
req.authContext = ctx;
next();
return;
}
if (!rawKey) {
res.status(401).json({ error: 'Unauthorized', message: 'Missing bearer API key' });
return;
}
const verified = await verifyPostgresApiKey(pool, rawKey, options.requiredScopes ?? []);
if (!verified) {
res.status(403).json({ error: 'Forbidden', message: 'Invalid API key or insufficient scope' });
return;
}
const ctx: AuthContext = {
userId: null,
organizationId: null,
teamId: verified.teamId,
projectId: verified.projectId,
scopes: verified.scopes,
apiKeyId: verified.apiKeyId,
mode: 'api-key',
};
req.authContext = ctx;
next();
} catch (error) {
next(error);
}
};
}
interface VerifiedPostgresApiKey {
apiKeyId: string;
teamId: string | null;
projectId: string | null;
scopes: string[];
}
export async function verifyPostgresApiKey(
pool: PostgresPool,
rawKey: string,
requiredScopes: string[],
): Promise<VerifiedPostgresApiKey | null> {
const keyHash = createHash('sha256').update(rawKey).digest('hex');
const result = await pool.query(
`
SELECT id, team_id, project_id, scopes, revoked_at, expires_at
FROM api_keys
WHERE key_hash = $1
`,
[keyHash],
);
const row = result.rows[0] as Pick<
PostgresApiKey,
'id' | 'teamId' | 'projectId'
> & {
id: string;
team_id: string | null;
project_id: string | null;
scopes: unknown;
revoked_at: Date | null;
expires_at: Date | null;
} | undefined;
if (!row) {
return null;
}
if (row.revoked_at) {
return null;
}
if (row.expires_at && row.expires_at.getTime() <= Date.now()) {
return null;
}
const scopes = normalizeScopes(row.scopes);
if (!hasRequiredScopes(scopes, requiredScopes)) {
return null;
}
return {
apiKeyId: row.id,
teamId: row.team_id,
projectId: row.project_id,
scopes,
};
}
function normalizeScopes(value: unknown): string[] {
if (!Array.isArray(value)) {
return [];
}
return value.filter((item): item is string => typeof item === 'string');
}
function hasRequiredScopes(grantedScopes: string[], requiredScopes: string[]): boolean {
if (requiredScopes.length === 0 || grantedScopes.includes('*')) {
return true;
}
return requiredScopes.every(scope => grantedScopes.includes(scope));
}
function parseBearerToken(header: string): string | null {
const match = /^Bearer\s+(.+)$/i.exec(header.trim());
return match?.[1]?.trim() || null;
}
function isLocalhost(req: Request): boolean {
const clientIp = req.ip || req.socket.remoteAddress || '';
return clientIp === '127.0.0.1'
|| clientIp === '::1'
|| clientIp === '::ffff:127.0.0.1'
|| clientIp === 'localhost';
}
function hasLoopbackHostHeader(req: Request): boolean {
const host = parseHostWithoutPort(req.header('host') ?? '');
return host === '127.0.0.1'
|| host === 'localhost'
|| host === '::1';
}
function parseHostWithoutPort(rawHost: string): string {
const host = rawHost.trim().toLowerCase();
if (host.startsWith('[')) {
const closeBracketIndex = host.indexOf(']');
return closeBracketIndex === -1 ? host : host.slice(1, closeBracketIndex);
}
const lastColonIndex = host.lastIndexOf(':');
if (lastColonIndex > -1 && /^\d+$/.test(host.slice(lastColonIndex + 1))) {
return host.slice(0, lastColonIndex);
}
return host;
}
function hasForwardedClientHeaders(req: Request): boolean {
return Boolean(
req.header('forwarded')
|| req.header('x-forwarded-for')
|| req.header('x-forwarded-host')
|| req.header('x-real-ip'),
);
}
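Review note: `verifyPostgresApiKey` looks keys up by the hex SHA-256 digest of the raw bearer token, so whatever mints a key has to store the same digest in `key_hash`. The sketch below shows that counterpart under stated assumptions. `mintApiKey`, the `cmem_` prefix, and the 32-byte length are hypothetical and do not come from this PR.

```typescript
import { createHash, randomBytes } from 'node:crypto';

// Hex SHA-256 digest, the same computation verifyPostgresApiKey uses for lookup.
function hashApiKey(rawKey: string): string {
  return createHash('sha256').update(rawKey).digest('hex');
}

// Hypothetical minting helper: prefix and key length are assumptions.
// Only keyHash would be persisted; rawKey is shown to the caller once.
function mintApiKey(): { rawKey: string; keyHash: string } {
  const rawKey = `cmem_${randomBytes(32).toString('hex')}`;
  return { rawKey, keyHash: hashApiKey(rawKey) };
}

const minted = mintApiKey();
console.log(minted.keyHash.length); // 64 hex characters
```

Because only the digest is stored, a leaked `api_keys` table does not directly expose usable bearer tokens.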
+40
@@ -0,0 +1,40 @@
// SPDX-License-Identifier: Apache-2.0
import { randomUUID } from 'crypto';
import type { NextFunction, Request, RequestHandler, Response } from 'express';
// Phase 12 — request_id middleware. Mints a UUID per inbound request and
// attaches it to req.requestId so route handlers, ingest services, and
// generation jobs can correlate logs back to the original HTTP call. Honors
// an inbound `X-Request-Id` header so an upstream load balancer / gateway
// can supply the id, but ignores non-conformant values and mints a fresh UUID
// instead, to keep audit rows clean (accepted: 1 to 64 chars of [A-Za-z0-9_-],
// starting with an alphanumeric character; this covers UUID v4).
//
// Anti-pattern guard: never trust the inbound id for auth — this is purely
// an audit/log correlator. Auth still flows through requirePostgresServerAuth.
const REQUEST_ID_HEADER = 'x-request-id';
const REQUEST_ID_MAX_LENGTH = 64;
const REQUEST_ID_SAFE_PATTERN = /^[A-Za-z0-9][A-Za-z0-9\-_]{0,63}$/;
declare module 'express-serve-static-core' {
interface Request {
requestId?: string;
}
}
export function requestIdMiddleware(): RequestHandler {
return (req: Request, res: Response, next: NextFunction) => {
const inbound = req.header(REQUEST_ID_HEADER);
const accepted = inbound && isAcceptableRequestId(inbound) ? inbound : randomUUID();
req.requestId = accepted;
res.setHeader('X-Request-Id', accepted);
next();
};
}
export function isAcceptableRequestId(value: string): boolean {
if (typeof value !== 'string') return false;
if (value.length === 0 || value.length > REQUEST_ID_MAX_LENGTH) return false;
return REQUEST_ID_SAFE_PATTERN.test(value);
}
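Review note: a few sample inputs make the acceptance rule concrete. The sketch below copies the pattern from the middleware into a standalone check. `isAcceptable` and the sample ids are illustrative only.

```typescript
// Same pattern as REQUEST_ID_SAFE_PATTERN above, repeated so the sketch runs alone.
const SAFE_PATTERN = /^[A-Za-z0-9][A-Za-z0-9\-_]{0,63}$/;

function isAcceptable(value: string): boolean {
  return value.length > 0 && value.length <= 64 && SAFE_PATTERN.test(value);
}

const samples: string[] = [
  '3f2b8c1e-9d4a-4f6b-8a2e-1c5d7e9f0a1b', // UUID v4 shape: kept
  'gateway_req-42',                       // short opaque id: kept
  '-leading-dash',                        // must start alphanumeric: replaced
  'has space',                            // character outside the set: replaced
  'a'.repeat(65),                         // longer than 64 chars: replaced
];

for (const sample of samples) {
  console.log(`${sample.slice(0, 40)} -> ${isAcceptable(sample) ? 'kept' : 'replaced with randomUUID()'}`);
}
```

The request is never rejected over a bad id; the middleware falls back to a server-minted UUID and echoes it in the `X-Request-Id` response header.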
@@ -17,6 +17,23 @@ export interface ObservationQueueEngine {
close(): Promise<void>;
}
// Phase 12 — `lanes` exposes per-queue counts (waiting/active/completed/
// failed/delayed/stalled) so deploy probes can monitor saturation per lane.
// `unavailable: true` means the sample failed; the health endpoint MUST NOT
// 503 just because counts are stale.
export interface ObservationQueueHealthLaneSnapshot {
kind: string;
name: string;
waiting: number;
active: number;
completed: number;
failed: number;
delayed: number;
stalled: number;
unavailable: boolean;
unavailableReason?: string;
}
export interface ObservationQueueHealth {
engine: 'bullmq';
redis: {
@@ -27,6 +44,7 @@ export interface ObservationQueueHealth {
prefix: string;
error?: string;
};
lanes?: ObservationQueueHealthLaneSnapshot[];
}
export interface ObservationQueueInspection {
File diff suppressed because it is too large
@@ -0,0 +1,164 @@
// SPDX-License-Identifier: Apache-2.0
import type { Job } from 'bullmq';
import { logger } from '../../utils/logger.js';
import { PostgresAuthRepository } from '../../storage/postgres/auth.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import { ProviderObservationGenerator } from '../generation/ProviderObservationGenerator.js';
import type { ServerGenerationProvider } from '../generation/providers/shared/types.js';
import type { ServerGenerationJobPayload } from '../jobs/types.js';
import type { ActiveServerBetaQueueManager } from './ActiveServerBetaQueueManager.js';
import type {
ServerBetaBoundaryHealth,
ServerBetaGenerationWorkerManager,
} from './types.js';
// ActiveServerBetaGenerationWorkerManager wires a BullMQ Worker (per the
// 'event' queue) to a ProviderObservationGenerator. Concurrency defaults to
// 1 per the plan (line 8086) so retries observe a single inflight provider
// call per server. autorun:false / explicit run() is enforced by
// ServerJobQueue.start.
//
// This class is wired in only when both a queue manager AND a configured
// provider are present. create-server-beta-service keeps the disabled
// adapter otherwise so server beta can boot without provider credentials.
export interface ActiveServerBetaGenerationWorkerManagerOptions {
pool: PostgresPool;
queueManager: ActiveServerBetaQueueManager;
provider: ServerGenerationProvider;
workerId?: string;
// Test seam: replace the generator with a stub.
generatorFactory?: (
pool: PostgresPool,
provider: ServerGenerationProvider,
workerId: string,
) => ProviderObservationGenerator;
}
export class ActiveServerBetaGenerationWorkerManager implements ServerBetaGenerationWorkerManager {
readonly kind = 'generation-worker-manager' as const;
private started = false;
private closed = false;
private readonly generator: ProviderObservationGenerator;
private readonly workerId: string;
constructor(private readonly options: ActiveServerBetaGenerationWorkerManagerOptions) {
this.workerId = options.workerId ?? `server-beta-${process.pid}`;
this.generator = options.generatorFactory
? options.generatorFactory(options.pool, options.provider, this.workerId)
: new ProviderObservationGenerator({
pool: options.pool,
provider: options.provider,
workerId: this.workerId,
});
}
/**
* Attach BullMQ Workers to the 'event' and 'summary' queues. Per BullMQ docs
* we use new Worker(queueName, processor, { concurrency, autorun })
* via ServerJobQueue.start(...). Errors are surfaced through the queue
* wrapper's worker.on('error', ...) listener. Calling start() more than once
* is a no-op.
*/
start(): void {
if (this.started) return;
const dispatcher = async (job: Job<ServerGenerationJobPayload>) => {
try {
return await this.generator.process(job);
} catch (error) {
logger.warn('SYSTEM', 'observation generator failed', {
jobId: job.id,
kind: job.data.kind,
error: error instanceof Error ? error.message : String(error),
});
throw error;
}
};
this.options.queueManager.start('event', dispatcher);
// Phase 6: wire the summary lane alongside the event lane. Concurrency
// defaults to 1 per ServerJobQueue config (per the plan), and the same
// ProviderObservationGenerator dispatches on job.data.source_type via the
// outbox row reload inside lockOutbox+process.
this.options.queueManager.start('summary', dispatcher);
// Phase 12: audit stalled events directly. Phase 11's audit chain now
// covers the operator and provider lifecycle; stalled jobs come from the
// BullMQ runtime, not the HTTP boundary, so we wire them in here. This is
// best-effort: a missing or unscoped audit row MUST NOT crash the worker.
for (const lane of ['event', 'summary'] as const) {
try {
const queue = this.options.queueManager.getQueue(lane);
queue.observe({
onStalled: (jobId) => {
void this.auditStalledJob(jobId, lane);
},
});
} catch (error) {
logger.warn('SYSTEM', `failed to wire stalled observer for ${lane} lane`, {
error: error instanceof Error ? error.message : String(error),
});
}
}
this.started = true;
}
// Phase 12 — write a `generation_job.stalled` audit row. We look up the
// outbox row by BullMQ jobId (== bullmq_job_id column) so team/project
// scope is correct on the audit row even when the original API key
// metadata is unavailable (BullMQ retries can outlive a session).
private async auditStalledJob(bullmqJobId: string, lane: 'event' | 'summary'): Promise<void> {
try {
const result = await this.options.pool.query<{
id: string;
team_id: string | null;
project_id: string | null;
}>(
'SELECT id, team_id, project_id FROM observation_generation_jobs WHERE bullmq_job_id = $1 LIMIT 1',
[bullmqJobId],
);
const row = result.rows[0];
if (!row) return;
const repo = new PostgresAuthRepository(this.options.pool);
await repo.createAuditLog({
teamId: row.team_id,
projectId: row.project_id,
actorId: null,
apiKeyId: null,
action: 'generation_job.stalled',
resourceType: 'observation_generation_job',
resourceId: row.id,
details: { lane, bullmqJobId },
});
} catch (error) {
logger.warn('SYSTEM', 'failed to audit stalled generation_job', {
bullmqJobId,
error: error instanceof Error ? error.message : String(error),
});
}
}
getHealth(): ServerBetaBoundaryHealth {
if (this.closed) {
return { status: 'errored', reason: 'generation-worker-manager closed' };
}
return {
status: this.started ? 'active' : 'disabled',
reason: this.started
? 'BullMQ Worker attached to event queue with ProviderObservationGenerator'
: 'wired but not started',
details: {
provider: this.options.provider.providerLabel,
workerId: this.workerId,
},
};
}
async close(): Promise<void> {
if (this.closed) return;
this.closed = true;
// The underlying Worker is owned by ServerJobQueue.close() (driven by
// the queue manager). We do not double-close here; the queue manager's
// close cascade handles it.
}
}
@@ -11,6 +11,7 @@ import type { RedisQueueConfig } from '../queue/redis-config.js';
import { logger } from '../../utils/logger.js';
import type {
ServerBetaBoundaryHealth,
ServerBetaQueueLaneMetric,
ServerBetaQueueManager,
} from './types.js';
@@ -75,6 +76,49 @@ export class ActiveServerBetaQueueManager implements ServerBetaQueueManager {
};
}
/**
* Phase 12 — per-lane counts. Returns BullMQ getJobCounts plus the
* per-process stalled counter. If Redis is unreachable, the lane is
* reported with an `unavailable` flag rather than throwing so /api/health
* remains responsive even in partial-failure modes.
*/
async getLaneMetrics(): Promise<ServerBetaQueueLaneMetric[]> {
const out: ServerBetaQueueLaneMetric[] = [];
for (const kind of QUEUE_KINDS) {
const queue = this.queues.get(kind);
if (!queue) continue;
const lifecycle = queue.getLifecycleCounters();
try {
const counts = await queue.getCounts();
out.push({
kind,
name: SERVER_JOB_QUEUE_NAMES[kind],
waiting: counts.waiting,
active: counts.active,
completed: counts.completed,
failed: counts.failed,
delayed: counts.delayed,
stalled: lifecycle.stalled,
unavailable: false,
});
} catch (error) {
out.push({
kind,
name: SERVER_JOB_QUEUE_NAMES[kind],
waiting: 0,
active: 0,
completed: 0,
failed: 0,
delayed: 0,
stalled: lifecycle.stalled,
unavailable: true,
unavailableReason: error instanceof Error ? error.message : String(error),
});
}
}
return out;
}
async close(): Promise<void> {
if (this.closed) {
return;
@@ -14,7 +14,11 @@ import {
verifyPidFileOwnership,
type PidInfo,
} from '../../supervisor/process-registry.js';
import { ServerV1PostgresRoutes } from '../routes/v1/ServerV1PostgresRoutes.js';
import { SessionsObservationsAdapter } from '../compat/SessionsObservationsAdapter.js';
import { SessionsSummarizeAdapter } from '../compat/SessionsSummarizeAdapter.js';
import { ActiveServerBetaQueueManager } from './ActiveServerBetaQueueManager.js';
import type { ServerBetaServiceGraph, ServerBetaQueueLaneMetric } from './types.js';
const SERVER_BETA_RUNTIME = 'server-beta';
const DEFAULT_SERVER_BETA_HOST = '127.0.0.1';
@@ -50,7 +54,12 @@ class ServerBetaRuntimeInfoRoutes implements RouteHandler {
res.json({ status: 'ok', runtime: SERVER_BETA_RUNTIME });
});
// Phase 12: `/v1/info` includes per-lane queue metrics so deploy probes
// can read waiting/active/completed/failed/delayed/stalled without
// hitting `/api/health`. Sampling is best-effort: a Redis blip surfaces
// the lane with `unavailable: true` rather than crashing the route.
app.get('/v1/info', async (_req, res) => {
const queueLanes = await collectQueueLaneMetrics(this.graph);
res.json({
name: 'claude-mem-server',
runtime: SERVER_BETA_RUNTIME,
@@ -65,11 +74,28 @@ class ServerBetaRuntimeInfoRoutes implements RouteHandler {
providerRegistry: this.graph.providerRegistry.getHealth(),
eventBroadcaster: this.graph.eventBroadcaster.getHealth(),
},
queueLanes,
});
});
}
}
async function collectQueueLaneMetrics(
graph: ServerBetaServiceGraph,
): Promise<ServerBetaQueueLaneMetric[]> {
const manager = graph.queueManager;
if (!(manager instanceof ActiveServerBetaQueueManager)) {
return [];
}
try {
return await manager.getLaneMetrics();
} catch {
// /api/health and /v1/info MUST never throw on a queue blip — surface
// empty lanes so the rest of the payload still renders.
return [];
}
}
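// Illustrative sketch, not part of this change: how a deploy probe might read
// the per-lane metrics returned above. `judgeLane`, `LaneVerdict`, the sample
// queue names, and the `maxBacklog` threshold are all hypothetical; only the
// field names are taken from the lane snapshot shape in this diff.

```typescript
// Local copy of the lane shape so the sketch is self-contained.
interface LaneSnapshot {
  kind: string;
  name: string;
  waiting: number;
  active: number;
  completed: number;
  failed: number;
  delayed: number;
  stalled: number;
  unavailable: boolean;
  unavailableReason?: string;
}

type LaneVerdict = 'ok' | 'saturated' | 'unknown';

// `maxBacklog` is an illustrative threshold, not a value defined by the server.
function judgeLane(lane: LaneSnapshot, maxBacklog: number): LaneVerdict {
  // An unavailable lane reports zeroed queue counts, so they must not be read as "idle".
  if (lane.unavailable) return 'unknown';
  const backlog = lane.waiting + lane.delayed;
  return backlog > maxBacklog || lane.stalled > 0 ? 'saturated' : 'ok';
}

const sampleLanes: LaneSnapshot[] = [
  { kind: 'event', name: 'event-queue', waiting: 3, active: 1, completed: 40, failed: 0, delayed: 0, stalled: 0, unavailable: false },
  { kind: 'summary', name: 'summary-queue', waiting: 0, active: 0, completed: 0, failed: 0, delayed: 0, stalled: 0, unavailable: true, unavailableReason: 'connect ECONNREFUSED' },
];

const verdicts = sampleLanes.map(lane => `${lane.kind}:${judgeLane(lane, 100)}`);
console.log(verdicts.join(', '));
```

// The point of the sketch is the first branch: because an unreachable lane is
// reported with zeroed counts plus `unavailable: true`, a consumer should check
// the flag before interpreting the numbers.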
export class ServerBetaService {
private readonly graph: ServerBetaServiceGraph;
private readonly host: string;
@@ -106,8 +132,73 @@ export class ServerBetaService {
authMethod: this.graph.authMode,
lastInteraction: null,
}),
// Phase 10 — surface BullMQ/Valkey health on /api/health so deploy
// probes (and the Docker E2E) can confirm the queue engine without
// peeking at /v1/info. The queue manager's getHealth() returns its
// boundary descriptor; we shape it into the worker-compatible
// ObservationQueueHealth schema the Server class expects.
// Phase 12 — also include per-lane counts (waiting/active/completed/
// failed/delayed/stalled) so deploy probes can monitor saturation.
getQueueHealth: async () => {
const health = this.graph.queueManager.getHealth();
const details = (health.details ?? {}) as Record<string, unknown>;
if (health.status !== 'active' || details.engine !== 'bullmq') {
return null;
}
const lanes = await collectQueueLaneMetrics(this.graph);
return {
engine: 'bullmq' as const,
redis: {
status: 'ok' as const,
mode: String(details.mode ?? 'unknown'),
host: String(details.host ?? '127.0.0.1'),
port: typeof details.port === 'number' ? details.port : 6379,
prefix: String(details.prefix ?? 'claude_mem'),
},
lanes: lanes.map(lane => ({
kind: lane.kind,
name: lane.name,
waiting: lane.waiting,
active: lane.active,
completed: lane.completed,
failed: lane.failed,
delayed: lane.delayed,
stalled: lane.stalled,
unavailable: lane.unavailable,
...(lane.unavailableReason ? { unavailableReason: lane.unavailableReason } : {}),
})),
};
},
});
server.registerRoutes(new ServerBetaRuntimeInfoRoutes(this.graph));
const v1Routes = new ServerV1PostgresRoutes({
pool: this.graph.postgres.pool,
queueManager: this.graph.queueManager,
authMode: this.graph.authMode === 'disabled' ? 'api-key' : this.graph.authMode,
runtime: SERVER_BETA_RUNTIME,
// Session policy is read inside the routes (default 'per-event' from
// resolveSessionGenerationPolicy(), env-overridable via
// CLAUDE_MEM_SERVER_SESSION_POLICY). We do not duplicate it here.
});
server.registerRoutes(v1Routes);
// Phase 9 — legacy compatibility adapters. These translate the old
// `/api/sessions/observations` and `/api/sessions/summarize` worker
// routes to the canonical Server beta event/job model. They share the
// SAME shared services with /v1/* routes — never duplicate ingest or
// session-end logic. New clients should hit /v1/* directly.
const compatAuthMode = this.graph.authMode === 'disabled' ? 'api-key' : this.graph.authMode;
server.registerRoutes(new SessionsObservationsAdapter({
pool: this.graph.postgres.pool,
ingestEvents: v1Routes.getIngestEventsService(),
authMode: compatAuthMode,
}));
server.registerRoutes(new SessionsSummarizeAdapter({
pool: this.graph.postgres.pool,
endSession: v1Routes.getEndSessionService(),
authMode: compatAuthMode,
}));
server.finalizeRoutes();
await server.listen(this.requestedPort, this.host);
@@ -184,6 +275,28 @@ export async function runServerBetaCli(argv: string[] = process.argv.slice(2)):
const port = getServerBetaPort();
const host = process.env.CLAUDE_MEM_SERVER_HOST ?? DEFAULT_SERVER_BETA_HOST;
// Phase 10: `claude-mem server worker [start|run|--daemon]` runs the BullMQ
// generation worker as a foreground process with no HTTP server and no route
// registration. In Compose this becomes a separately scaled service.
if (command === 'worker') {
const sub = (argv[1] ?? '--daemon').toLowerCase();
if (sub === 'start' || sub === '--daemon' || sub === 'run') {
await runServerBetaGenerationWorker();
return;
}
console.error('Usage: server-beta-service worker start');
process.exit(1);
}
// `server api-key create|list|revoke` mirrors the worker-service tooling
// but writes to the Postgres `api_keys` table the server-beta runtime
// actually reads from. The legacy worker-service CLI talks to SQLite and
// would be invisible to this stack.
if (command === 'server' && argv[1]?.toLowerCase() === 'api-key') {
await runServerBetaApiKeyCli(argv.slice(2));
return;
}
switch (command) {
case 'start': {
const existing = readServerBetaPidFile();
@@ -258,9 +371,212 @@ export async function runServerBetaCli(argv: string[] = process.argv.slice(2)):
}
}
// Phase 10 — Postgres-backed `server api-key create|list|revoke` CLI. The
// legacy `worker-service.cjs server api-key` command talks to SQLite and
// is invisible to the server-beta runtime, which reads keys from
// Postgres. Use this entrypoint inside Docker / Compose.
export async function runServerBetaApiKeyCli(argv: string[]): Promise<void> {
const sub = argv[0]?.toLowerCase();
const options = parseFlagArgs(argv.slice(1));
if (!process.env.CLAUDE_MEM_SERVER_DATABASE_URL) {
console.error('CLAUDE_MEM_SERVER_DATABASE_URL is required for `server api-key` commands.');
process.exit(1);
}
const { getSharedPostgresPool } = await import('../../storage/postgres/index.js');
const { PostgresAuthRepository } = await import('../../storage/postgres/auth.js');
const { createHash, randomBytes } = await import('crypto');
const pool = getSharedPostgresPool({ requireDatabaseUrl: true });
const repo = new PostgresAuthRepository(pool);
try {
if (sub === 'create') {
const scopes = (options.scope ?? options.scopes ?? 'memories:read')
.split(',')
.map((scope: string) => scope.trim())
.filter(Boolean);
// Resolve team/project. If the caller passed --team/--project, honor
// them. Otherwise, run the server-beta bootstrap to get-or-create the
// local team+project, then create a NEW key against those IDs with
// the caller's requested scopes (the bootstrap key uses hook scopes,
// which is the wrong default for an arbitrary CLI-issued key).
let teamId = options.team ?? null;
let projectId = options.project ?? null;
if (!teamId || !projectId) {
const { bootstrapServerBetaApiKey } = await import('../../services/hooks/server-beta-bootstrap.js');
const result = await bootstrapServerBetaApiKey({ pool, closePool: false });
teamId = result.teamId;
projectId = result.projectId;
}
const rawKey = `cmem_${randomBytes(24).toString('hex')}`;
const keyHash = createHash('sha256').update(rawKey).digest('hex');
const created = await repo.createApiKey({
keyHash,
teamId,
projectId,
scopes,
actorId: 'system:server-beta-cli',
});
console.log(JSON.stringify({
id: created.id,
key: rawKey,
name: options.name ?? 'server-api-key',
teamId,
projectId,
scopes,
}, null, 2));
return;
}
if (sub === 'list') {
// Bound the result set to prevent unintentional cross-tenant key
// metadata disclosure when an admin runs `api-key list` on a shared
// host. Default page size is 100; --limit values outside 1..500 fall
// back to 100, and --team filters to a single tenant.
const teamFilter = options.team ?? null;
const limitArg = Number.parseInt(options.limit ?? '100', 10);
const offsetArg = Number.parseInt(options.offset ?? '0', 10);
const limit = Number.isFinite(limitArg) && limitArg > 0 && limitArg <= 500
? limitArg
: 100;
const offset = Number.isFinite(offsetArg) && offsetArg >= 0 ? offsetArg : 0;
const where = teamFilter ? 'WHERE team_id = $1' : '';
const params: unknown[] = teamFilter ? [teamFilter, limit, offset] : [limit, offset];
const limitIdx = teamFilter ? 2 : 1;
const offsetIdx = teamFilter ? 3 : 2;
const result = await pool.query<{
id: string;
team_id: string | null;
project_id: string | null;
scopes: unknown;
revoked_at: Date | null;
expires_at: Date | null;
last_used_at: Date | null;
created_at: Date;
}>(
`SELECT id, team_id, project_id, scopes, revoked_at, expires_at, last_used_at, created_at
FROM api_keys
${where}
ORDER BY created_at DESC
LIMIT $${limitIdx} OFFSET $${offsetIdx}`,
params,
);
console.log(JSON.stringify({
teamId: teamFilter,
limit,
offset,
count: result.rows.length,
keys: result.rows.map(row => ({
id: row.id,
teamId: row.team_id,
projectId: row.project_id,
scopes: row.scopes,
status: row.revoked_at ? 'revoked' : 'active',
lastUsedAt: row.last_used_at?.toISOString() ?? null,
expiresAt: row.expires_at?.toISOString() ?? null,
createdAt: row.created_at.toISOString(),
})),
}, null, 2));
return;
}
if (sub === 'revoke') {
const id = argv[1];
if (!id) {
console.error('Usage: server-beta-service server api-key revoke <id>');
process.exit(1);
}
const result = await pool.query(
`UPDATE api_keys SET revoked_at = now()
WHERE id = $1 AND revoked_at IS NULL
RETURNING id`,
[id],
);
if (result.rowCount === 0) {
console.error(`API key not found or already revoked: ${id}`);
process.exit(1);
}
console.log(JSON.stringify({ id, status: 'revoked' }, null, 2));
return;
}
console.error(`Unknown server api-key subcommand: ${sub ?? '(none)'}`);
console.error('Usage: server-beta-service server api-key create|list|revoke');
process.exit(1);
} finally {
// Pool is shared; do not close here. The process will exit and the
// pool tears down via the shared module's process exit hook.
}
}
function parseFlagArgs(argv: string[]): Record<string, string> {
const out: Record<string, string> = {};
for (let i = 0; i < argv.length; i++) {
const arg = argv[i];
if (!arg) continue;
if (arg.startsWith('--')) {
const equalsIdx = arg.indexOf('=');
if (equalsIdx > -1) {
out[arg.slice(2, equalsIdx)] = arg.slice(equalsIdx + 1);
} else {
out[arg.slice(2)] = argv[i + 1] ?? '';
i += 1;
}
}
}
return out;
}
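// Illustrative sketch, not part of this change: the flag parser above accepts
// both `--flag value` and `--flag=value`. The copy below (renamed
// `parseFlagArgsSketch`) follows the same logic so its behaviour can be tried
// in isolation. `--verbose` is a hypothetical flag used only to show what
// happens when a flag is passed without a value.

```typescript
function parseFlagArgsSketch(argv: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    // Positional arguments (anything not starting with `--`) are skipped.
    if (!arg || !arg.startsWith('--')) continue;
    const equalsIdx = arg.indexOf('=');
    if (equalsIdx > -1) {
      // `--limit=25` form: split on the first `=`.
      out[arg.slice(2, equalsIdx)] = arg.slice(equalsIdx + 1);
    } else {
      // `--team t1` form: the next token is always consumed as the value.
      out[arg.slice(2)] = argv[i + 1] ?? '';
      i += 1;
    }
  }
  return out;
}

const spaced = parseFlagArgsSketch(['--scope', 'memories:read,memories:write', '--team', 't1']);
const inline = parseFlagArgsSketch(['--limit=25', '--offset=50']);
// A value-less flag swallows the following token, so `--team` is lost here.
const bare = parseFlagArgsSketch(['--verbose', '--team', 't1']);

console.log(spaced, inline, bare);
```

// Because the space-separated form always consumes the next token, every flag
// handled by this parser needs an explicit value; a boolean-style flag placed
// before another flag would hide it.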
// Phase 10 — generation-worker-only entrypoint. Starts BullMQ workers against
// the same Postgres + Valkey/Redis the HTTP server-beta service uses, but
// never opens an HTTP listener. In Compose this is a separate, horizontally
// scalable service. The HTTP server-beta service should run with
// CLAUDE_MEM_GENERATION_DISABLED=true so generation only happens in this
// process.
export async function runServerBetaGenerationWorker(): Promise<void> {
const { validateServerBetaEnv, createServerBetaService } = await import('./create-server-beta-service.js');
validateServerBetaEnv();
// Build the service WITHOUT starting HTTP. We reuse createServerBetaService
// for pool + bootstrap + queue + generation worker wiring, but never call
// service.start(). Generation is enabled here even if env says
// CLAUDE_MEM_GENERATION_DISABLED, because this IS the generation worker.
delete process.env.CLAUDE_MEM_GENERATION_DISABLED;
const service = await createServerBetaService();
const state = service.getRuntimeState();
logger.info('SYSTEM', 'Server beta generation worker started (no HTTP)', {
pid: process.pid,
queue: state.boundaries.queueManager,
generation: state.boundaries.generationWorkerManager,
});
console.log(JSON.stringify({ status: 'worker-running', runtime: SERVER_BETA_RUNTIME, pid: process.pid }));
let stopping = false;
const shutdown = async () => {
if (stopping) return;
stopping = true;
try {
await service.stop();
} finally {
process.exit(0);
}
};
process.once('SIGTERM', shutdown);
process.once('SIGINT', shutdown);
// Block forever — Workers run in background via BullMQ. Without this the
// process would exit and BullMQ jobs would never be consumed.
await new Promise<void>(() => {});
}
function getServerBetaPort(): number {
const parsed = Number.parseInt(process.env.CLAUDE_MEM_SERVER_PORT ?? '', 10);
if (Number.isInteger(parsed) && parsed > 0) {
return parsed;
}
// UID-derived default for multi-account isolation: two users on the same
// host get distinct ports without explicit configuration. Containerized
// deployments always pass CLAUDE_MEM_SERVER_PORT so this branch is local-only.
return DEFAULT_SERVER_BETA_PORT + ((process.getuid?.() ?? 77) % 100);
}
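// Illustrative sketch, not part of this change: the port resolution above as a
// pure function so the fallback arithmetic can be checked by hand.
// `resolvePortSketch` is a hypothetical name, and `basePort` stands in for
// DEFAULT_SERVER_BETA_PORT, whose value is not shown in this diff (40000 below
// is an arbitrary placeholder).

```typescript
function resolvePortSketch(
  envPort: string | undefined,
  uid: number | undefined,
  basePort: number,
): number {
  const parsed = Number.parseInt(envPort ?? '', 10);
  // An explicit, positive integer port always wins.
  if (Number.isInteger(parsed) && parsed > 0) return parsed;
  // Otherwise offset the base port by uid mod 100 (77 when no uid is available).
  return basePort + ((uid ?? 77) % 100);
}

const explicitPort = resolvePortSketch('8080', 501, 40000);
const uidPortA = resolvePortSketch(undefined, 501, 40000);
const uidPortB = resolvePortSketch(undefined, 601, 40000);
const noUidPort = resolvePortSketch('not-a-port', undefined, 40000);

console.log({ explicitPort, uidPortA, uidPortB, noUidPort });
```

// Two UIDs that differ by a multiple of 100 (501 and 601 here) map to the same
// port, so the UID offset reduces collisions between local accounts rather than
// ruling them out.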
function spawnServerBetaDaemon(port: number): number | undefined {
@@ -0,0 +1,163 @@
// SPDX-License-Identifier: Apache-2.0
import {
PostgresServerSessionsRepository,
type PostgresServerSession,
} from '../../storage/postgres/server-sessions.js';
import type { PostgresAgentEvent } from '../../storage/postgres/agent-events.js';
import type { JsonObject } from '../../storage/postgres/utils.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import type { PostgresQueryable } from '../../storage/postgres/utils.js';
// ServerSessionRuntimeRepository is the runtime helper layer used by Server
// beta routes and generation policies. It is intentionally thin: every method
// requires explicit `team_id` + `project_id` and validates scope through the
// underlying PostgresServerSessionsRepository (which calls
// assertProjectOwnership before any write). It does NOT cache state — every
// call hits Postgres so the runtime never trusts in-memory ActiveSession-style
// objects, per the Phase 6 anti-pattern guard.
export interface ServerSessionScope {
teamId: string;
projectId: string;
}
export interface GetActiveSessionInput extends ServerSessionScope {
externalSessionId: string;
contentSessionId?: string | null;
agentId?: string | null;
agentType?: string | null;
platformSource?: string | null;
metadata?: JsonObject;
}
export interface ServerSessionRuntimeRepositoryOptions {
client: PostgresQueryable;
}
export class ServerSessionRuntimeRepository {
private readonly repo: PostgresServerSessionsRepository;
constructor(private readonly options: ServerSessionRuntimeRepositoryOptions) {
this.repo = new PostgresServerSessionsRepository(options.client);
}
/**
* Find or create the canonical Server beta session row for an external
* session id. Idempotent on (project_id, external_session_id).
*
* Anti-pattern guard: this MUST NOT consult worker `ActiveSession` or any
* legacy SessionStore. server_sessions is the canonical model.
*/
async getActiveSession(input: GetActiveSessionInput): Promise<PostgresServerSession> {
const existing = await this.repo.findByExternalIdForScope({
externalSessionId: input.externalSessionId,
projectId: input.projectId,
teamId: input.teamId,
});
if (existing) {
return existing;
}
return this.repo.create({
projectId: input.projectId,
teamId: input.teamId,
externalSessionId: input.externalSessionId,
contentSessionId: input.contentSessionId ?? null,
agentId: input.agentId ?? null,
agentType: input.agentType ?? null,
platformSource: input.platformSource ?? null,
metadata: input.metadata ?? {},
});
}
async getById(input: { id: string } & ServerSessionScope): Promise<PostgresServerSession | null> {
return this.repo.getByIdForScope({
id: input.id,
projectId: input.projectId,
teamId: input.teamId,
});
}
async findByExternalId(input: {
externalSessionId: string;
} & ServerSessionScope): Promise<PostgresServerSession | null> {
return this.repo.findByExternalIdForScope({
externalSessionId: input.externalSessionId,
projectId: input.projectId,
teamId: input.teamId,
});
}
async listUnprocessedEvents(
input: { serverSessionId: string; limit?: number } & ServerSessionScope,
): Promise<PostgresAgentEvent[]> {
const params: {
serverSessionId: string;
projectId: string;
teamId: string;
limit?: number;
} = {
serverSessionId: input.serverSessionId,
projectId: input.projectId,
teamId: input.teamId,
};
if (input.limit !== undefined) {
params.limit = input.limit;
}
return this.repo.listUnprocessedEvents(params);
}
/**
* End the session if not already ended. Idempotent — re-ending a session
* returns the unchanged row and never creates a duplicate summary job
* because the (team_id, project_id, source_type='session_summary',
* source_id) UNIQUE constraint on observation_generation_jobs collapses
* duplicate enqueue attempts.
*/
async endSession(
input: { id: string } & ServerSessionScope,
): Promise<PostgresServerSession | null> {
return this.repo.endSession({
id: input.id,
projectId: input.projectId,
teamId: input.teamId,
});
}
async markGenerationStarted(
input: { id: string } & ServerSessionScope,
): Promise<PostgresServerSession | null> {
return this.repo.markGenerationStarted({
id: input.id,
projectId: input.projectId,
teamId: input.teamId,
});
}
async markGenerationCompleted(
input: { id: string } & ServerSessionScope,
): Promise<PostgresServerSession | null> {
return this.repo.markGenerationCompleted({
id: input.id,
projectId: input.projectId,
teamId: input.teamId,
});
}
async markGenerationFailed(
input: { id: string; error?: string | null } & ServerSessionScope,
): Promise<PostgresServerSession | null> {
return this.repo.markGenerationFailed({
id: input.id,
projectId: input.projectId,
teamId: input.teamId,
error: input.error ?? null,
});
}
}
export function createServerSessionRuntimeRepository(
pool: PostgresPool,
): ServerSessionRuntimeRepository {
return new ServerSessionRuntimeRepository({ client: pool });
}
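// Illustrative sketch, not part of this change: the find-or-create contract of
// getActiveSession, shown against a synchronous in-memory map instead of
// Postgres. `SessionRowSketch` and `InMemorySessionStoreSketch` are
// hypothetical, and keying on team, project, and external session id together
// is an assumption made for the sketch.

```typescript
// Hypothetical row shape; the real PostgresServerSession has more columns.
interface SessionRowSketch {
  id: string;
  teamId: string;
  projectId: string;
  externalSessionId: string;
}

class InMemorySessionStoreSketch {
  private readonly rows = new Map<string, SessionRowSketch>();
  private nextId = 1;

  getActiveSession(input: Omit<SessionRowSketch, 'id'>): SessionRowSketch {
    // JSON-encode the tuple so ids containing separators cannot collide.
    const key = JSON.stringify([input.teamId, input.projectId, input.externalSessionId]);
    const existing = this.rows.get(key);
    if (existing) return existing;
    const created: SessionRowSketch = { id: `sess-${this.nextId++}`, ...input };
    this.rows.set(key, created);
    return created;
  }
}

const sessionStore = new InMemorySessionStoreSketch();
const firstCall = sessionStore.getActiveSession({ teamId: 't1', projectId: 'p1', externalSessionId: 'ext-42' });
const secondCall = sessionStore.getActiveSession({ teamId: 't1', projectId: 'p1', externalSessionId: 'ext-42' });
const otherProject = sessionStore.getActiveSession({ teamId: 't1', projectId: 'p2', externalSessionId: 'ext-42' });

console.log(firstCall.id === secondCall.id, firstCall.id === otherProject.id);
```

// The sketch is single-threaded, so it says nothing about two concurrent
// requests racing between the lookup and the insert; how the Postgres
// repository handles that case is not visible in this diff.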
@@ -0,0 +1,206 @@
// SPDX-License-Identifier: Apache-2.0
import type { JobsOptions } from 'bullmq';
import type {
GenerateObservationsForEventJob,
GenerateSessionSummaryJob,
} from '../jobs/types.js';
import { buildServerJobId } from '../jobs/job-id.js';
import type { PostgresAgentEvent } from '../../storage/postgres/agent-events.js';
import type { PostgresObservationGenerationJob } from '../../storage/postgres/generation-jobs.js';
// SessionGenerationPolicy decides WHEN to enqueue work for the BullMQ event
// and summary lanes. It is configurable via:
// - CLAUDE_MEM_SERVER_SESSION_POLICY env var (per-process default)
// - per-call override (per-team settings can plug in here later)
//
// Three policies are supported:
// - 'per-event' (default): enqueue immediately on every event POST.
// Matches Phase 4/5 behavior.
// - 'debounce': enqueue with `delay`; when a new event arrives within
// the window, replace the delayed job. BullMQ ignores add()
// for a jobId that already exists, so the existing delayed
// job is removed first and then re-added (see
// scheduleDebouncedEventJob); removeOnComplete/Fail keep
// things tidy. Outbox row is canonical so durability is safe.
// - 'end-of-session': only enqueue summary jobs at /v1/sessions/:id/end.
// Per-event posts skip BullMQ entirely; the outbox row
// remains in `queued` state and startup reconciliation
// will publish it later (or it can be cancelled).
//
// Anti-pattern guard: the policy MUST NOT use ActiveSession-style cached
// state. Inputs are always reloaded by the caller from Postgres before this
// fires.
export type ServerSessionGenerationPolicy = 'per-event' | 'debounce' | 'end-of-session';
const DEFAULT_DEBOUNCE_MS = 5000;
export interface SessionGenerationPolicyOptions {
policy?: ServerSessionGenerationPolicy;
debounceWindowMs?: number;
}
export function resolveSessionGenerationPolicy(
options: SessionGenerationPolicyOptions = {},
): { policy: ServerSessionGenerationPolicy; debounceWindowMs: number } {
const envPolicy = (process.env.CLAUDE_MEM_SERVER_SESSION_POLICY ?? '').trim().toLowerCase();
const policy: ServerSessionGenerationPolicy = options.policy
?? (envPolicy === 'debounce' || envPolicy === 'end-of-session' || envPolicy === 'per-event'
? envPolicy
: 'per-event');
const debounceWindowMs = options.debounceWindowMs
?? (Number.parseInt(process.env.CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS ?? '', 10)
|| DEFAULT_DEBOUNCE_MS);
return {
policy,
debounceWindowMs: Number.isFinite(debounceWindowMs) && debounceWindowMs > 0
? debounceWindowMs
: DEFAULT_DEBOUNCE_MS,
};
}
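// Illustrative sketch, not part of this change: the same resolution order as
// resolveSessionGenerationPolicy (per-call override, then env, then default),
// with the environment passed in as a plain object so it can be exercised
// without touching process.env. `resolvePolicySketch` and `PolicySketch` are
// hypothetical names.

```typescript
type PolicySketch = 'per-event' | 'debounce' | 'end-of-session';
const DEFAULT_WINDOW_MS = 5000;

function resolvePolicySketch(
  env: Record<string, string | undefined>,
  override: { policy?: PolicySketch; debounceWindowMs?: number } = {},
): { policy: PolicySketch; debounceWindowMs: number } {
  const raw = (env['CLAUDE_MEM_SERVER_SESSION_POLICY'] ?? '').trim().toLowerCase();
  // Unknown or empty values fall back to 'per-event'.
  const fromEnv: PolicySketch =
    raw === 'debounce' || raw === 'end-of-session' || raw === 'per-event' ? raw : 'per-event';
  const windowMs = override.debounceWindowMs
    ?? (Number.parseInt(env['CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS'] ?? '', 10) || DEFAULT_WINDOW_MS);
  return {
    policy: override.policy ?? fromEnv,
    // Non-positive or non-finite windows are replaced by the default.
    debounceWindowMs: Number.isFinite(windowMs) && windowMs > 0 ? windowMs : DEFAULT_WINDOW_MS,
  };
}

const noEnv = resolvePolicySketch({});
const fromEnvVars = resolvePolicySketch({
  CLAUDE_MEM_SERVER_SESSION_POLICY: ' Debounce ',
  CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS: '1500',
});
const badEnv = resolvePolicySketch({
  CLAUDE_MEM_SERVER_SESSION_POLICY: 'batch',
  CLAUDE_MEM_SERVER_SESSION_DEBOUNCE_MS: '-5',
});
const overridden = resolvePolicySketch(
  { CLAUDE_MEM_SERVER_SESSION_POLICY: 'debounce' },
  { policy: 'end-of-session' },
);

console.log(noEnv, fromEnvVars, badEnv, overridden);
```

// An unrecognised policy name is silently treated as 'per-event' rather than
// rejected, so a typo in the env var leaves the default behaviour in place.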
export interface EnqueueEventDecisionInput {
event: PostgresAgentEvent;
outbox: PostgresObservationGenerationJob;
// Phase 11 — identity context captured at HTTP ingest time so the BullMQ
// payload carries every audit field. apiKeyId may be null for local-dev
// enqueues and `actorId` follows the api key's `actor_id` column.
apiKeyId?: string | null;
actorId?: string | null;
sourceAdapter?: string | null;
// Phase 12 — request correlation id minted at the HTTP boundary.
requestId?: string | null;
}
export interface EnqueueEventDecision {
shouldEnqueue: boolean;
jobId: string;
payload: GenerateObservationsForEventJob;
jobsOptions?: JobsOptions;
reason: 'per-event' | 'debounce' | 'end-of-session-skip';
}
export function buildEnqueueEventDecision(
input: EnqueueEventDecisionInput,
options: SessionGenerationPolicyOptions = {},
): EnqueueEventDecision {
const resolved = resolveSessionGenerationPolicy(options);
const jobId = input.outbox.bullmqJobId ?? buildServerJobId({
kind: 'event',
team_id: input.event.teamId,
project_id: input.event.projectId,
source_type: 'agent_event',
source_id: input.event.id,
});
const payload: GenerateObservationsForEventJob = {
kind: 'event',
team_id: input.outbox.teamId,
project_id: input.outbox.projectId,
source_type: 'agent_event',
source_id: input.event.id,
generation_job_id: input.outbox.id,
agent_event_id: input.event.id,
api_key_id: input.apiKeyId ?? null,
actor_id: input.actorId ?? null,
source_adapter: input.sourceAdapter ?? input.event.sourceAdapter ?? 'api',
request_id: input.requestId ?? null,
};
if (resolved.policy === 'end-of-session') {
return { shouldEnqueue: false, jobId, payload, reason: 'end-of-session-skip' };
}
if (resolved.policy === 'debounce') {
return {
shouldEnqueue: true,
jobId,
payload,
jobsOptions: { delay: resolved.debounceWindowMs },
reason: 'debounce',
};
}
return { shouldEnqueue: true, jobId, payload, reason: 'per-event' };
}
// Minimal queue surface used by scheduleDebouncedEventJob. Declared as an
// interface (instead of `Pick<ServerJobQueue<...>, ...>`) so the parameter
// accepts ServerJobQueue<ServerGenerationJobPayload> at the call site without
// triggering invariant TPayload type errors. The ServerJobQueue.add signature
// is structurally compatible — it requires `payload: TPayload`, and we only
// hand in narrowed payloads.
export interface DebounceableEventQueue {
add(jobId: string, payload: GenerateObservationsForEventJob, options?: JobsOptions): Promise<void>;
remove(jobId: string): Promise<void>;
getJob(jobId: string): Promise<unknown>;
}
/**
* Apply a debounce decision to a BullMQ queue. If a delayed job already exists
* for this deterministic id, BullMQ's `add(jobId, ...)` will be a no-op, so we
* proactively remove it first so the new event's delay window starts fresh.
*
* This implements the "if a new event arrives within window, replace the
* delayed job" requirement.
*/
export async function scheduleDebouncedEventJob(
queue: DebounceableEventQueue,
decision: EnqueueEventDecision,
): Promise<void> {
if (!decision.shouldEnqueue) return;
if (decision.reason === 'debounce') {
try {
const existing = await queue.getJob(decision.jobId);
if (existing) {
await queue.remove(decision.jobId);
}
} catch {
// best-effort; if remove fails because the job already moved to active
// we just let `add` no-op or fail through to the caller's error handler
}
}
await queue.add(decision.jobId, decision.payload, decision.jobsOptions);
}
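// Illustrative sketch, not part of this change: why the existing delayed job is
// removed before add(). The queue below is a synchronous, in-memory stand-in
// (the real surface is async and backed by BullMQ); `DebounceQueueSketch`,
// `scheduleDebouncedSketch`, and the explicit `now` clock are hypothetical.

```typescript
interface DelayedJobSketch {
  payload: string;
  delayMs: number;
  scheduledAt: number;
}

class DebounceQueueSketch {
  private readonly jobs = new Map<string, DelayedJobSketch>();

  // Mirrors the behaviour described above: adding an id that already exists is a no-op.
  add(jobId: string, payload: string, delayMs: number, now: number): void {
    if (this.jobs.has(jobId)) return;
    this.jobs.set(jobId, { payload, delayMs, scheduledAt: now });
  }

  remove(jobId: string): void {
    this.jobs.delete(jobId);
  }

  getJob(jobId: string): DelayedJobSketch | undefined {
    return this.jobs.get(jobId);
  }
}

// Remove any existing entry first so the delay window restarts from `now`.
function scheduleDebouncedSketch(
  queue: DebounceQueueSketch,
  jobId: string,
  payload: string,
  delayMs: number,
  now: number,
): void {
  if (queue.getJob(jobId)) queue.remove(jobId);
  queue.add(jobId, payload, delayMs, now);
}

const fireTime = (job: DelayedJobSketch | undefined): number =>
  job ? job.scheduledAt + job.delayMs : -1;

// Plain add(): the second call is ignored, so the first window still applies.
const naive = new DebounceQueueSketch();
naive.add('job-1', 'first', 5000, 0);
naive.add('job-1', 'second', 5000, 3000);

// Remove-then-add: the second call replaces the job and restarts the window.
const debounced = new DebounceQueueSketch();
scheduleDebouncedSketch(debounced, 'job-1', 'first', 5000, 0);
scheduleDebouncedSketch(debounced, 'job-1', 'second', 5000, 3000);

console.log(fireTime(naive.getJob('job-1')), fireTime(debounced.getJob('job-1')));
```

// With plain add() the job would still fire at t=5000 carrying the first
// payload; with remove-then-add it fires at t=8000 carrying the second.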
export interface BuildSummaryJobInput {
serverSessionId: string;
teamId: string;
projectId: string;
generationJobId: string;
// Phase 11 — same identity context the event-payload builder receives.
apiKeyId?: string | null;
actorId?: string | null;
sourceAdapter?: string | null;
// Phase 12 — request correlation id flows into the summary lane too.
requestId?: string | null;
}
export function buildSummaryJobId(input: {
serverSessionId: string;
teamId: string;
projectId: string;
}): string {
return buildServerJobId({
kind: 'summary',
team_id: input.teamId,
project_id: input.projectId,
source_type: 'session_summary',
source_id: input.serverSessionId,
});
}
export function buildSummaryJobPayload(input: BuildSummaryJobInput): GenerateSessionSummaryJob {
return {
kind: 'summary',
team_id: input.teamId,
project_id: input.projectId,
source_type: 'session_summary',
source_id: input.serverSessionId,
generation_job_id: input.generationJobId,
server_session_id: input.serverSessionId,
api_key_id: input.apiKeyId ?? null,
actor_id: input.actorId ?? null,
source_adapter: input.sourceAdapter ?? 'api',
request_id: input.requestId ?? null,
};
}
@@ -1,10 +1,17 @@
// SPDX-License-Identifier: Apache-2.0
import { existsSync } from 'fs';
import { logger } from '../../utils/logger.js';
import { createPostgresStorageRepositories, getSharedPostgresPool, SERVER_BETA_POSTGRES_SCHEMA_VERSION } from '../../storage/postgres/index.js';
import { bootstrapServerBetaPostgresSchema } from '../../storage/postgres/schema.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import { getRedisQueueConfig } from '../queue/redis-config.js';
import { ActiveServerBetaQueueManager } from './ActiveServerBetaQueueManager.js';
import { ActiveServerBetaGenerationWorkerManager } from './ActiveServerBetaGenerationWorkerManager.js';
import { ClaudeObservationProvider } from '../generation/providers/ClaudeObservationProvider.js';
import { GeminiObservationProvider } from '../generation/providers/GeminiObservationProvider.js';
import { OpenRouterObservationProvider } from '../generation/providers/OpenRouterObservationProvider.js';
import type { ServerGenerationProvider } from '../generation/providers/shared/types.js';
import { ServerBetaService } from './ServerBetaService.js';
import {
DisabledServerBetaEventBroadcaster,
@@ -13,6 +20,7 @@ import {
DisabledServerBetaQueueManager,
type ServerBetaAuthMode,
type ServerBetaBootstrapStatus,
type ServerBetaGenerationWorkerManager,
type ServerBetaQueueManager,
type ServerBetaServiceGraph,
} from './types.js';
@@ -22,13 +30,147 @@ export interface CreateServerBetaServiceOptions {
authMode?: ServerBetaAuthMode;
bootstrapSchema?: boolean;
queueManager?: ServerBetaQueueManager;
// Phase 5 seam: tests can inject a fake provider without env config.
generationProvider?: ServerGenerationProvider;
generationWorkerManager?: ServerBetaGenerationWorkerManager;
// Phase 10: when true, skip building the generation worker. Used when the
// service is just an HTTP front-end and a separate `server worker` process
// consumes the BullMQ queues.
generationDisabled?: boolean;
// Phase 10: skip env validation (tests). Production code paths always run
// validation so misconfiguration fails fast at startup.
skipEnvValidation?: boolean;
}
// Phase 10: env validation. Server beta requires explicit, complete
// configuration. Missing pieces fail fast at startup rather than silently
// degrading. Required in every environment:
// - CLAUDE_MEM_SERVER_DATABASE_URL (Postgres)
// - CLAUDE_MEM_REDIS_URL when CLAUDE_MEM_QUEUE_ENGINE=bullmq (BullMQ requires Redis/Valkey)
// Additionally enforced when running in Docker:
// - CLAUDE_MEM_QUEUE_ENGINE=bullmq (no in-memory queue in Docker)
// - CLAUDE_MEM_AUTH_MODE != local-dev (auth must be real in Docker)
// - CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS not enabled ('1' and 'true' are rejected)
// - CLAUDE_MEM_RUNTIME unset or equal to server-beta
// `local-dev` bypass is only valid on a developer's loopback; in Docker the
// container is reachable via service-to-service networking and exposed ports,
// so the loopback assumption is invalid.
export interface ServerBetaEnvValidationOptions {
env?: NodeJS.ProcessEnv;
isDocker?: boolean;
}
export interface ServerBetaEnvValidationResult {
isDocker: boolean;
runtime: string;
authMode: string;
queueEngine: string;
hasDatabaseUrl: boolean;
hasRedisUrl: boolean;
}
export function detectDockerEnvironment(env: NodeJS.ProcessEnv = process.env): boolean {
if (env.CLAUDE_MEM_DOCKER === '1' || env.CLAUDE_MEM_DOCKER === 'true') return true;
// /.dockerenv is the canonical Docker marker; existsSync is cheap.
try {
if (existsSync('/.dockerenv')) return true;
} catch {
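// existsSync reports failures as false, so this catch is purely defensive;
// fall through and treat the environment as not-Docker.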
}
return false;
}
export function validateServerBetaEnv(
options: ServerBetaEnvValidationOptions = {},
): ServerBetaEnvValidationResult {
const env = options.env ?? process.env;
const isDocker = options.isDocker ?? detectDockerEnvironment(env);
const errors: string[] = [];
const runtime = (env.CLAUDE_MEM_RUNTIME ?? '').trim();
if (!runtime) {
// An unset runtime is allowed (upstream defaults to 'worker'). In Docker we
// log a warning so operators know server-beta is the active runtime here.
if (isDocker) {
logger.warn('SYSTEM', 'CLAUDE_MEM_RUNTIME unset; server-beta container assumes runtime=server-beta');
}
} else if (runtime !== 'server-beta' && isDocker) {
errors.push(
`CLAUDE_MEM_RUNTIME=${runtime} is invalid in Docker; the server-beta image only runs CLAUDE_MEM_RUNTIME=server-beta.`,
);
}
const authMode = (env.CLAUDE_MEM_AUTH_MODE ?? 'api-key').trim();
if (isDocker) {
if (authMode === 'local-dev') {
errors.push(
'CLAUDE_MEM_AUTH_MODE=local-dev is not allowed in Docker. Set CLAUDE_MEM_AUTH_MODE=api-key and create a key with `claude-mem server api-key create`.',
);
}
if (
env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS === '1'
|| env.CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS === 'true'
) {
errors.push(
'CLAUDE_MEM_ALLOW_LOCAL_DEV_BYPASS is not allowed in Docker. Loopback bypass cannot be enforced inside a container; remove the variable.',
);
}
}
const queueEngine = (env.CLAUDE_MEM_QUEUE_ENGINE ?? '').trim().toLowerCase();
if (isDocker) {
if (!queueEngine) {
errors.push('CLAUDE_MEM_QUEUE_ENGINE is required in Docker; set it to "bullmq".');
} else if (queueEngine !== 'bullmq') {
errors.push(
`CLAUDE_MEM_QUEUE_ENGINE=${queueEngine} is not allowed in Docker. Only "bullmq" is supported (no in-process queues across container boundaries).`,
);
}
}
const hasDatabaseUrl = Boolean((env.CLAUDE_MEM_SERVER_DATABASE_URL ?? '').trim());
if (!hasDatabaseUrl) {
errors.push('CLAUDE_MEM_SERVER_DATABASE_URL is required to start server-beta (Postgres connection string).');
}
const hasRedisUrl = Boolean((env.CLAUDE_MEM_REDIS_URL ?? '').trim());
if (queueEngine === 'bullmq' && !hasRedisUrl) {
errors.push('CLAUDE_MEM_REDIS_URL is required when CLAUDE_MEM_QUEUE_ENGINE=bullmq.');
}
if (errors.length > 0) {
const message = [
'server-beta startup configuration is invalid:',
...errors.map(line => ` - ${line}`),
].join('\n');
throw new Error(message);
}
return {
isDocker,
runtime: runtime || 'server-beta',
authMode,
queueEngine: queueEngine || 'disabled',
hasDatabaseUrl,
hasRedisUrl,
};
}
export async function createServerBetaService(
options: CreateServerBetaServiceOptions = {},
): Promise<ServerBetaService> {
if (!options.skipEnvValidation) {
validateServerBetaEnv();
}
const pool = options.pool ?? getSharedPostgresPool({ requireDatabaseUrl: true });
const bootstrap = await initializePostgres(pool, options.bootstrapSchema ?? true);
const queueManager = options.queueManager ?? buildQueueManager();
const generationDisabled = options.generationDisabled
?? (process.env.CLAUDE_MEM_GENERATION_DISABLED === '1'
|| process.env.CLAUDE_MEM_GENERATION_DISABLED === 'true');
const generationWorkerManager = options.generationWorkerManager
?? (generationDisabled
? new DisabledServerBetaGenerationWorkerManager(
'CLAUDE_MEM_GENERATION_DISABLED is set; this server runs HTTP only. A separate `claude-mem server worker start` process consumes the BullMQ queues.',
)
: buildGenerationWorkerManager(pool, queueManager, options.generationProvider));
const graph: ServerBetaServiceGraph = {
runtime: 'server-beta',
postgres: {
@@ -36,16 +178,74 @@ export async function createServerBetaService(
bootstrap,
},
authMode: options.authMode ?? parseAuthMode(process.env.CLAUDE_MEM_AUTH_MODE),
queueManager: options.queueManager ?? buildQueueManager(),
generationWorkerManager: new DisabledServerBetaGenerationWorkerManager('Phase 2 boundary only; generation workers are not wired.'),
providerRegistry: new DisabledServerBetaProviderRegistry('Phase 2 boundary only; provider-backed generation is not wired.'),
queueManager,
generationWorkerManager,
providerRegistry: new DisabledServerBetaProviderRegistry('Phase 5 keeps the provider registry boundary as inert; per-call providers are owned by the generation worker manager.'),
eventBroadcaster: new DisabledServerBetaEventBroadcaster('Phase 2 boundary only; SSE/event broadcasting is not wired.'),
storage: createPostgresStorageRepositories(pool),
};
if (generationWorkerManager instanceof ActiveServerBetaGenerationWorkerManager) {
generationWorkerManager.start();
}
return new ServerBetaService({ graph });
}
function buildGenerationWorkerManager(
pool: PostgresPool,
queueManager: ServerBetaQueueManager,
injectedProvider?: ServerGenerationProvider,
): ServerBetaGenerationWorkerManager {
if (!(queueManager instanceof ActiveServerBetaQueueManager)) {
return new DisabledServerBetaGenerationWorkerManager(
'queue manager is disabled; set CLAUDE_MEM_QUEUE_ENGINE=bullmq to enable provider generation.',
);
}
const provider = injectedProvider ?? buildServerGenerationProviderFromEnv();
if (!provider) {
return new DisabledServerBetaGenerationWorkerManager(
'no server generation provider configured; set CLAUDE_MEM_SERVER_PROVIDER and the matching API key to enable.',
);
}
return new ActiveServerBetaGenerationWorkerManager({
pool,
queueManager,
provider,
});
}
function buildServerGenerationProviderFromEnv(): ServerGenerationProvider | null {
const provider = (process.env.CLAUDE_MEM_SERVER_PROVIDER ?? '').trim().toLowerCase();
if (!provider) return null;
try {
if (provider === 'claude' || provider === 'anthropic') {
const apiKey = process.env.ANTHROPIC_API_KEY ?? process.env.CLAUDE_MEM_ANTHROPIC_API_KEY ?? '';
if (!apiKey) return null;
const opts: { apiKey: string; model?: string } = { apiKey };
if (process.env.CLAUDE_MEM_SERVER_MODEL) opts.model = process.env.CLAUDE_MEM_SERVER_MODEL;
return new ClaudeObservationProvider(opts);
}
if (provider === 'gemini') {
const apiKey = process.env.GEMINI_API_KEY ?? process.env.CLAUDE_MEM_GEMINI_API_KEY ?? '';
if (!apiKey) return null;
const opts: { apiKey: string; model?: string } = { apiKey };
if (process.env.CLAUDE_MEM_SERVER_MODEL) opts.model = process.env.CLAUDE_MEM_SERVER_MODEL;
return new GeminiObservationProvider(opts);
}
if (provider === 'openrouter') {
const apiKey = process.env.OPENROUTER_API_KEY ?? process.env.CLAUDE_MEM_OPENROUTER_API_KEY ?? '';
if (!apiKey) return null;
const opts: { apiKey: string; model?: string } = { apiKey };
if (process.env.CLAUDE_MEM_SERVER_MODEL) opts.model = process.env.CLAUDE_MEM_SERVER_MODEL;
return new OpenRouterObservationProvider(opts);
}
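} catch (error) {
// Provider construction failed. Log it so a disabled generation worker is
// diagnosable, then fall back to "no provider configured".
logger.warn('SYSTEM', 'failed to construct server generation provider', {
provider,
error: error instanceof Error ? error.message : String(error),
});
return null;
}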
return null;
}
// Queue manager selection is fail-fast on misconfiguration. If the user
// explicitly opts into BullMQ via CLAUDE_MEM_QUEUE_ENGINE=bullmq we build
// the active manager; any error there throws so the runtime does not
@@ -20,6 +20,24 @@ export interface ServerBetaBoundaryHealth {
details?: Record<string, unknown>;
}
// Phase 12 — per-lane queue metric snapshot. Returned by
// ActiveServerBetaQueueManager.getLaneMetrics so /api/health and /v1/info
// can publish current waiting/active/completed/failed/delayed/stalled counts
// for each generation lane. `unavailable` is set when Redis was unreachable
// at sample time so /api/health still responds rather than 500'ing.
export interface ServerBetaQueueLaneMetric {
kind: string;
name: string;
waiting: number;
active: number;
completed: number;
failed: number;
delayed: number;
stalled: number;
unavailable: boolean;
unavailableReason?: string;
}
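The `unavailable` flag described in the comment above can be illustrated with a small sketch. This is not the code behind `getLaneMetrics`; `toLaneMetric`, `LaneCounts`, and the lane names are hypothetical, and the sketch assumes the caller has already caught any Redis error and passes it in as an `Error` value.

```typescript
// Mirrors the ServerBetaQueueLaneMetric shape shown in the diff.
interface LaneMetric {
  kind: string;
  name: string;
  waiting: number;
  active: number;
  completed: number;
  failed: number;
  delayed: number;
  stalled: number;
  unavailable: boolean;
  unavailableReason?: string;
}

type CountKey = 'waiting' | 'active' | 'completed' | 'failed' | 'delayed' | 'stalled';
type LaneCounts = Partial<Record<CountKey, number>>;

// Hypothetical helper: turns a counts sample, or the error raised while
// sampling, into a metric that a health endpoint can always serialize.
function toLaneMetric(kind: string, name: string, sample: LaneCounts | Error): LaneMetric {
  if (sample instanceof Error) {
    // Redis was unreachable: report zeros plus the reason instead of throwing.
    return {
      kind,
      name,
      waiting: 0,
      active: 0,
      completed: 0,
      failed: 0,
      delayed: 0,
      stalled: 0,
      unavailable: true,
      unavailableReason: sample.message,
    };
  }
  return {
    kind,
    name,
    waiting: sample.waiting ?? 0,
    active: sample.active ?? 0,
    completed: sample.completed ?? 0,
    failed: sample.failed ?? 0,
    delayed: sample.delayed ?? 0,
    stalled: sample.stalled ?? 0,
    unavailable: false,
  };
}

// Lane names here are illustrative only.
const healthy = toLaneMetric('event', 'events-lane', { waiting: 3, active: 1 });
const degraded = toLaneMetric('summary', 'summary-lane', new Error('ECONNREFUSED'));
```

Returning a fully populated object in the failure case keeps the response shape stable, so a health handler can serialize it without special-casing a missing lane.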
export interface ServerBetaQueueManager {
readonly kind: 'queue-manager';
getHealth(): ServerBetaBoundaryHealth;
@@ -0,0 +1,155 @@
// SPDX-License-Identifier: Apache-2.0
// Shared session-end + summary-job path used by both `/v1/sessions/:id/end`
// (canonical) and `src/server/compat/SessionsSummarizeAdapter.ts` (legacy
// translator). Both call sites must produce identical Postgres state and
// queue effects: ended_at idempotency, exactly one outbox row per session
// summary, deterministic BullMQ job id.
//
// This module MUST NOT import from src/services/worker/* — Phase 9 keeps
// the compat shim coupled to Server beta core only.
import {
PostgresObservationGenerationJobEventsRepository,
PostgresObservationGenerationJobRepository,
type PostgresObservationGenerationJob,
} from '../../storage/postgres/generation-jobs.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import { withPostgresTransaction } from '../../storage/postgres/pool.js';
import {
PostgresServerSessionsRepository,
type PostgresServerSession,
} from '../../storage/postgres/server-sessions.js';
import { logger } from '../../utils/logger.js';
import { buildSummaryJobId, buildSummaryJobPayload } from '../runtime/SessionGenerationPolicy.js';
import type { GenerateSessionSummaryJob } from '../jobs/types.js';
import type { EnqueueOutcome, EventQueueLike } from './IngestEventsService.js';
import { newId } from '../../storage/postgres/utils.js';
const SUMMARY_JOB_TYPE = 'observation_generate_session_summary';
export interface EndSessionServiceOptions {
pool: PostgresPool;
resolveSummaryQueue: () => EventQueueLike | null;
}
export interface EndSessionResult {
session: PostgresServerSession | null;
outbox: PostgresObservationGenerationJob | null;
enqueueState: EnqueueOutcome;
}
export interface EndSessionInput {
sessionId: string;
projectId: string;
teamId: string;
source?: string;
// Phase 11 — identity context propagated into the BullMQ summary payload.
apiKeyId?: string | null;
actorId?: string | null;
sourceAdapter?: string | null;
}
export class EndSessionService {
constructor(private readonly options: EndSessionServiceOptions) {}
async end(input: EndSessionInput): Promise<EndSessionResult> {
const source = input.source ?? 'http_post_v1_sessions_end';
const txResult = await withPostgresTransaction(this.options.pool, async (client) => {
const sessionsRepo = new PostgresServerSessionsRepository(client);
const ended = await sessionsRepo.endSession({
id: input.sessionId,
projectId: input.projectId,
teamId: input.teamId,
});
if (!ended) {
return {
session: null as PostgresServerSession | null,
outbox: null as PostgresObservationGenerationJob | null,
};
}
const jobsRepo = new PostgresObservationGenerationJobRepository(client);
const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
// Persist the BullMQ payload at create-time so reconciliation and
// operator retry can re-enqueue a payload that passes the worker's
// assertServerGenerationJobPayload validation.
const outboxId = newId();
const summaryPayload = buildSummaryJobPayload({
serverSessionId: ended.id,
teamId: ended.teamId,
projectId: ended.projectId,
generationJobId: outboxId,
apiKeyId: input.apiKeyId ?? null,
actorId: input.actorId ?? null,
sourceAdapter: input.sourceAdapter ?? null,
});
const outbox = await jobsRepo.create({
id: outboxId,
projectId: ended.projectId,
teamId: ended.teamId,
sourceType: 'session_summary',
sourceId: ended.id,
serverSessionId: ended.id,
jobType: SUMMARY_JOB_TYPE,
bullmqJobId: buildSummaryJobId({
serverSessionId: ended.id,
teamId: ended.teamId,
projectId: ended.projectId,
}),
payload: summaryPayload as unknown as Record<string, unknown>,
});
await eventsLogRepo.append({
generationJobId: outbox.id,
projectId: outbox.projectId,
teamId: outbox.teamId,
eventType: 'queued',
statusAfter: outbox.status,
attempt: outbox.attempts,
details: { source },
});
return { session: ended, outbox };
});
if (!txResult.session || !txResult.outbox) {
return { session: txResult.session, outbox: null, enqueueState: 'skipped' };
}
const enqueueState = await this.publishSummaryJob(txResult.session.id, txResult.outbox, input);
return { session: txResult.session, outbox: txResult.outbox, enqueueState };
}
private async publishSummaryJob(
serverSessionId: string,
outbox: PostgresObservationGenerationJob,
input: EndSessionInput,
): Promise<'enqueued' | 'queued_only'> {
const queue = this.options.resolveSummaryQueue();
if (!queue) {
return 'queued_only';
}
const jobId = outbox.bullmqJobId ?? buildSummaryJobId({
serverSessionId,
teamId: outbox.teamId,
projectId: outbox.projectId,
});
const payload: GenerateSessionSummaryJob = buildSummaryJobPayload({
serverSessionId,
teamId: outbox.teamId,
projectId: outbox.projectId,
generationJobId: outbox.id,
apiKeyId: input.apiKeyId ?? null,
actorId: input.actorId ?? null,
sourceAdapter: input.sourceAdapter ?? null,
});
try {
await queue.add(jobId, payload);
return 'enqueued';
} catch (error) {
logger.warn('SYSTEM', 'failed to publish summary generation job to BullMQ', {
outboxId: outbox.id,
error: error instanceof Error ? error.message : String(error),
});
return 'queued_only';
}
}
}
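The post-commit publish step above has three outcomes that are easy to exercise with a fake queue. The sketch below is a simplified stand-in, not the service itself: `publishAfterCommit`, `workingQueue`, `brokenQueue`, and `demo` are hypothetical names, and the real method also builds the payload and logs the failure.

```typescript
type EnqueueOutcome = 'enqueued' | 'queued_only' | 'skipped';

// Same minimal queue surface the diff declares as EventQueueLike.
interface EventQueueLike {
  add(jobId: string, payload: unknown, options?: unknown): Promise<unknown>;
}

// Hypothetical stand-in for publishSummaryJob: the outbox row is already
// committed, so any publish problem degrades to 'queued_only' instead of
// failing the request.
async function publishAfterCommit(
  resolveQueue: () => EventQueueLike | null,
  jobId: string,
  payload: unknown,
): Promise<EnqueueOutcome> {
  const queue = resolveQueue();
  if (!queue) return 'queued_only'; // no queue wired: reconciliation picks the row up later
  try {
    await queue.add(jobId, payload);
    return 'enqueued';
  } catch {
    return 'queued_only'; // publish failed: the row stays queued in Postgres
  }
}

// Fakes injected through the lazy resolver seam.
const seen: string[] = [];
const workingQueue: EventQueueLike = {
  add: async (jobId) => {
    seen.push(jobId);
    return undefined;
  },
};
const brokenQueue: EventQueueLike = {
  add: async () => {
    throw new Error('redis down');
  },
};

async function demo(): Promise<EnqueueOutcome[]> {
  return [
    await publishAfterCommit(() => workingQueue, 'job-1', {}),
    await publishAfterCommit(() => brokenQueue, 'job-2', {}),
    await publishAfterCommit(() => null, 'job-3', {}),
  ];
}
```

Because the resolver is a plain function, a test can swap queues per call without constructing a queue manager.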
@@ -0,0 +1,273 @@
// SPDX-License-Identifier: Apache-2.0
// Shared event-ingest path used by both `/v1/events` (canonical) and
// `src/server/compat/SessionsObservationsAdapter.ts` (legacy translator).
// Centralizes the transactional write (event row + outbox row + lifecycle
// log) and the post-commit BullMQ enqueue so both call sites apply the
// exact same SessionGenerationPolicy and outbox-then-publish guarantees.
//
// This module MUST NOT import from src/services/worker/* — the whole point
// of Phase 9 is to give the compat adapters a translation surface that
// reaches Server beta core directly, with no worker-layer detours.
import type { CreatePostgresAgentEventInput, PostgresAgentEvent } from '../../storage/postgres/agent-events.js';
import { PostgresAgentEventsRepository } from '../../storage/postgres/agent-events.js';
import {
PostgresObservationGenerationJobEventsRepository,
PostgresObservationGenerationJobRepository,
type PostgresObservationGenerationJob,
} from '../../storage/postgres/generation-jobs.js';
import type { PostgresPool } from '../../storage/postgres/pool.js';
import { withPostgresTransaction } from '../../storage/postgres/pool.js';
import { logger } from '../../utils/logger.js';
import { buildServerJobId } from '../jobs/job-id.js';
import type { GenerateObservationsForEventJob } from '../jobs/types.js';
import {
buildEnqueueEventDecision,
scheduleDebouncedEventJob,
type ServerSessionGenerationPolicy,
} from '../runtime/SessionGenerationPolicy.js';
import { newId } from '../../storage/postgres/utils.js';
function buildEventBullmqPayload(input: {
outboxId: string;
event: PostgresAgentEvent;
apiKeyId: string | null;
actorId: string | null;
sourceAdapter: string | null;
requestId: string | null;
}): GenerateObservationsForEventJob {
return {
kind: 'event',
team_id: input.event.teamId,
project_id: input.event.projectId,
source_type: 'agent_event',
source_id: input.event.id,
generation_job_id: input.outboxId,
agent_event_id: input.event.id,
api_key_id: input.apiKeyId,
actor_id: input.actorId,
source_adapter: input.sourceAdapter ?? input.event.sourceAdapter ?? 'api',
request_id: input.requestId,
};
}
const EVENT_JOB_TYPE = 'observation_generate_for_event';
export type EnqueueOutcome = 'enqueued' | 'queued_only' | 'skipped';
export interface IngestEventsServiceOptions {
pool: PostgresPool;
// Lazy queue resolver so the service does not depend on the queue manager
// type and tests can swap in a fake. When this returns null, the outbox
// row stays `queued` and Phase 3 startup reconciliation will publish it.
resolveEventQueue: () => EventQueueLike | null;
sessionPolicy?: ServerSessionGenerationPolicy;
sessionDebounceWindowMs?: number;
}
export interface EventQueueLike {
add(jobId: string, payload: unknown, options?: unknown): Promise<unknown>;
}
export interface IngestEventResult {
event: PostgresAgentEvent;
outbox: PostgresObservationGenerationJob | null;
enqueueState: EnqueueOutcome;
}
export interface IngestEventOptions {
generate?: boolean;
source?: string;
// Phase 11 — identity context that flows from the HTTP auth boundary into
// the BullMQ payload and audit log. None of these are auth gates: the
// worker reloads and re-validates from Postgres before any side effect.
apiKeyId?: string | null;
actorId?: string | null;
sourceAdapter?: string | null;
// Phase 12 — opaque correlation id minted at the HTTP middleware so
// generator logs and audit rows can pivot back to the originating request.
requestId?: string | null;
}
export class IngestEventsService {
constructor(private readonly options: IngestEventsServiceOptions) {}
async ingestOne(
input: CreatePostgresAgentEventInput,
opts: IngestEventOptions = {},
): Promise<IngestEventResult> {
const generate = opts.generate ?? true;
const source = opts.source ?? 'http_post_v1_events';
const txResult = await withPostgresTransaction(this.options.pool, async (client) => {
const eventsRepo = new PostgresAgentEventsRepository(client);
const inserted = await eventsRepo.create(input);
if (!generate) {
return { event: inserted, outbox: null as PostgresObservationGenerationJob | null };
}
const jobsRepo = new PostgresObservationGenerationJobRepository(client);
const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
// Pre-generate the outbox id so we can build the BullMQ payload (which
// references generation_job_id) and persist it on the row. Reconciliation
// and operator retry rely on this persisted payload to re-enqueue a
// payload that passes assertServerGenerationJobPayload at the worker.
const outboxId = newId();
const bullmqPayload = buildEventBullmqPayload({
outboxId,
event: inserted,
apiKeyId: opts.apiKeyId ?? null,
actorId: opts.actorId ?? null,
sourceAdapter: opts.sourceAdapter ?? null,
requestId: opts.requestId ?? null,
});
const outbox = await jobsRepo.create({
id: outboxId,
projectId: inserted.projectId,
teamId: inserted.teamId,
sourceType: 'agent_event',
sourceId: inserted.id,
agentEventId: inserted.id,
serverSessionId: inserted.serverSessionId,
jobType: EVENT_JOB_TYPE,
bullmqJobId: buildServerJobId({
kind: 'event',
team_id: inserted.teamId,
project_id: inserted.projectId,
source_type: 'agent_event',
source_id: inserted.id,
}),
payload: bullmqPayload as unknown as Record<string, unknown>,
});
await eventsLogRepo.append({
generationJobId: outbox.id,
projectId: outbox.projectId,
teamId: outbox.teamId,
eventType: 'queued',
statusAfter: outbox.status,
attempt: outbox.attempts,
details: { source },
});
return { event: inserted, outbox };
});
let enqueueState: EnqueueOutcome = 'skipped';
if (txResult.outbox) {
enqueueState = await this.publishEventJob(txResult.event, txResult.outbox, opts);
}
return { event: txResult.event, outbox: txResult.outbox, enqueueState };
}
async ingestBatch(
inputs: CreatePostgresAgentEventInput[],
opts: IngestEventOptions = {},
): Promise<IngestEventResult[]> {
const generate = opts.generate ?? true;
const source = opts.source ?? 'http_post_v1_events_batch';
const txResults = await withPostgresTransaction(this.options.pool, async (client) => {
const eventsRepo = new PostgresAgentEventsRepository(client);
const jobsRepo = new PostgresObservationGenerationJobRepository(client);
const eventsLogRepo = new PostgresObservationGenerationJobEventsRepository(client);
const acc: { event: PostgresAgentEvent; outbox: PostgresObservationGenerationJob | null }[] = [];
for (const input of inputs) {
const event = await eventsRepo.create(input);
if (!generate) {
acc.push({ event, outbox: null });
continue;
}
const outboxId = newId();
const bullmqPayload = buildEventBullmqPayload({
outboxId,
event,
apiKeyId: opts.apiKeyId ?? null,
actorId: opts.actorId ?? null,
sourceAdapter: opts.sourceAdapter ?? null,
requestId: opts.requestId ?? null,
});
const outbox = await jobsRepo.create({
id: outboxId,
projectId: event.projectId,
teamId: event.teamId,
sourceType: 'agent_event',
sourceId: event.id,
agentEventId: event.id,
serverSessionId: event.serverSessionId,
jobType: EVENT_JOB_TYPE,
bullmqJobId: buildServerJobId({
kind: 'event',
team_id: event.teamId,
project_id: event.projectId,
source_type: 'agent_event',
source_id: event.id,
}),
payload: bullmqPayload as unknown as Record<string, unknown>,
});
await eventsLogRepo.append({
generationJobId: outbox.id,
projectId: outbox.projectId,
teamId: outbox.teamId,
eventType: 'queued',
statusAfter: outbox.status,
attempt: outbox.attempts,
details: { source },
});
acc.push({ event, outbox });
}
return acc;
});
return Promise.all(txResults.map(async ({ event, outbox }) => {
const enqueueState: EnqueueOutcome = outbox
? await this.publishEventJob(event, outbox, opts)
: 'skipped';
return { event, outbox, enqueueState };
}));
}
private async publishEventJob(
event: PostgresAgentEvent,
outbox: PostgresObservationGenerationJob,
opts: IngestEventOptions = {},
): Promise<'enqueued' | 'queued_only'> {
const queue = this.options.resolveEventQueue();
if (!queue) {
return 'queued_only';
}
const policyOptions: { policy?: ServerSessionGenerationPolicy; debounceWindowMs?: number } = {};
if (this.options.sessionPolicy !== undefined) {
policyOptions.policy = this.options.sessionPolicy;
}
if (this.options.sessionDebounceWindowMs !== undefined) {
policyOptions.debounceWindowMs = this.options.sessionDebounceWindowMs;
}
const decision = buildEnqueueEventDecision(
{
event,
outbox,
apiKeyId: opts.apiKeyId ?? null,
actorId: opts.actorId ?? null,
sourceAdapter: opts.sourceAdapter ?? event.sourceAdapter ?? null,
// Phase 12 — flow request_id into the BullMQ payload so the worker
// can emit it in [generation] logs and the audit row.
requestId: opts.requestId ?? null,
},
policyOptions,
);
if (!decision.shouldEnqueue) {
return 'queued_only';
}
try {
await scheduleDebouncedEventJob(queue as never, decision);
return 'enqueued';
} catch (error) {
logger.warn('SYSTEM', 'failed to publish event generation job to BullMQ', {
outboxId: outbox.id,
error: error instanceof Error ? error.message : String(error),
});
return 'queued_only';
}
}
}
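The reason for minting the outbox id before the insert can be shown in isolation: once the payload is stored on the row, a later pass can republish from the row alone. This is an in-memory sketch under stated assumptions; `newIdSketch`, `createOutboxRow`, `collectRepublishable`, and the `event:<id>` job-id format are hypothetical and do not reflect the output of `newId` or `buildServerJobId`.

```typescript
interface OutboxPayload {
  generation_job_id: string;
  source_id: string;
}

interface OutboxRow {
  id: string;
  bullmqJobId: string;
  status: 'queued' | 'processing';
  payload: OutboxPayload;
}

// Hypothetical id generator standing in for newId().
let counter = 0;
function newIdSketch(): string {
  counter += 1;
  return `job_${counter}`;
}

// Mint the id first so the payload can reference it, then store both together.
function createOutboxRow(eventId: string): OutboxRow {
  const id = newIdSketch();
  const payload: OutboxPayload = { generation_job_id: id, source_id: eventId };
  // The job-id format below is illustrative only.
  return { id, bullmqJobId: `event:${eventId}`, status: 'queued', payload };
}

// Hypothetical reconciliation pass: everything needed to republish comes
// from the persisted row, with no access to the original request.
function collectRepublishable(rows: OutboxRow[]): Array<{ jobId: string; payload: OutboxPayload }> {
  return rows
    .filter((row) => row.status === 'queued')
    .map((row) => ({ jobId: row.bullmqJobId, payload: row.payload }));
}

const committed = createOutboxRow('evt_1');
const inFlight: OutboxRow = { ...createOutboxRow('evt_2'), status: 'processing' };
const republishable = collectRepublishable([committed, inFlight]);
```

Only the row still in `queued` is selected, and its payload already carries the matching `generation_job_id`.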