Server-beta: Postgres storage + independent runtime + BullMQ queue (Phases 1–3) (#2351)
* Add server beta runtime foundation
* Address server beta review findings
* Resolve server beta review comments
* Tighten server beta review follow-ups
* Harden server beta auth and search
* Avoid unnecessary FTS rebuilds
* Block scoped keys from creating projects
* Release BullMQ claims best effort on close
* Address server beta review blockers
* Reset BullMQ claims best effort
* Add Postgres observation storage foundation
* feat(server-beta): add independent runtime service

  Introduce src/server/runtime/ as a self-contained server-beta runtime that owns its lifecycle, Postgres bootstrap, and HTTP boundary without depending on WorkerService. ServerBetaService wraps the existing Server class, exposes /healthz and /v1/info with runtime="server-beta", and persists state to dedicated paths (.server-beta.pid|.port|.runtime.json). The four boundary managers (queue, generation worker, provider registry, event broadcaster) are intentionally disabled in this phase and report their status through /v1/info; later phases activate them.

  Adds plans/2026-05-07-finish-bullmq-branch-ship-plan.md to track the remaining work for this branch.

  Phase 2 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): route CLI lifecycle and bundle separate runtime

  scripts/build-hooks.js now produces plugin/scripts/server-beta-service.cjs as a separate Node CJS bundle, alongside the existing worker-service bundle. The server-beta runtime is now installable independently.

  src/npx-cli/commands/server.ts routes start|stop|restart|status to the server-beta lifecycle instead of the legacy worker. The worker keeps its own start|stop|restart|status under the worker namespace; the two runtimes can be operated independently.
  src/services/worker-service.ts adds a server-* command parser branch that delegates to the sibling server-beta-service.cjs bundle so direct worker-service invocations still route to the right runtime.

  tests/npx-cli-server-namespace.test.ts updated to expect server-beta lifecycle routing. Includes rebuilt plugin/scripts/*.cjs bundles produced by build-and-sync.

  Phase 2 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): add BullMQ job queue primitives

  Introduce src/server/jobs/ as the queue-side primitives that Phase 3 of the server-beta runtime needs to operate.

  types.ts defines a discriminated union over the four job kinds (event, event-batch, summary, reindex) and maps each to a per-kind BullMQ queue name and deterministic-ID prefix.

  job-id.ts builds deterministic, colon-free BullMQ jobIds from (kind, team, project, source). The colon ban exists because BullMQ uses ':' as a Redis key separator internally; embedding ':' in jobIds breaks scan and state lookups.

  ServerJobQueue.ts is a thin wrapper over BullMQ Queue + Worker that enforces autorun:false, default concurrency 1, and an attached error listener, all per BullMQ docs requirements. Test seams accept queue and worker factories so unit tests do not need Redis.

  outbox.ts publishes through the Postgres ObservationGenerationJob repository as canonical history. enqueueOutbox writes the row first, then publishes to BullMQ; if BullMQ throws, the row is transitioned to failed and a failed event is appended. reconcileOnStartup re-enqueues queued + processing rows after a restart, replacing terminal BullMQ jobs that may still be holding the deterministic ID slot. markCompleted and markFailed wrap transitionStatus and append the matching event row.
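To illustrate the deterministic, colon-free job IDs described for job-id.ts, here is a minimal sketch. It is not the code from this commit: the name `buildJobId`, the SHA-256 hashing, and the `kind_digest` layout are assumptions, chosen only to show one way an ID can stay stable for the same (kind, team, project, source) while never containing ':'.

```typescript
import { createHash } from "node:crypto";

// The four job kinds named in the commit message.
type JobKind = "event" | "event-batch" | "summary" | "reindex";

// Hypothetical input shape; the real types.ts may differ.
interface JobIdParts {
  kind: JobKind;
  team: string;
  project: string;
  source: string;
}

// Assumption: hash the identifying fields so that any ':' in the inputs
// cannot leak into the ID. A hex digest only contains [0-9a-f].
function buildJobId(parts: JobIdParts): string {
  const digest = createHash("sha256")
    .update(JSON.stringify([parts.team, parts.project, parts.source]))
    .digest("hex")
    .slice(0, 32);
  // Assumption: the job kind doubles as the deterministic-ID prefix.
  return `${parts.kind}_${digest}`;
}

const id = buildJobId({ kind: "event", team: "t:1", project: "p", source: "s:42" });
console.log(id.includes(":")); // false
```

Hashing trades readability for safety: the ID no longer shows the team or project, but no input can reintroduce the separator.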
  Includes 20 unit tests covering deterministic ID stability, colon-free output, queue lifecycle, error-listener attachment, double-start refusal, idempotent enqueue, BullMQ failure rollback, startup reconciliation, max-attempts skipping, and completion / failure / retry transitions.

  Phase 3 commit 1 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server-beta): activate queue boundary in runtime service

  Wire ActiveServerBetaQueueManager into the server-beta runtime graph. The active manager owns one ServerJobQueue per generation kind (event, event-batch, summary, reindex) and surfaces lane metadata through boundary health.

  Selection is opt-in and fail-fast: if CLAUDE_MEM_QUEUE_ENGINE is set to bullmq the active manager is constructed (and any Redis/config error throws; no silent fallback to SQLite, per Phase 3 anti-pattern guard). For any other engine the disabled boundary remains so worker-era and test setups stay compatible.

  Widens ServerBetaBoundaryHealth.status to a discriminated union ('disabled' | 'active' | 'errored') with optional details. The disabled adapter still emits status='disabled', which keeps the existing server-beta-service test green.

  ServerBetaService receives the manager through a new optional queueManager field on CreateServerBetaServiceOptions so test graphs and Phase 4 wiring can inject custom managers.

  Adds tests/server/runtime/active-queue-manager.test.ts covering bullmq guard, active health shape, per-kind queue access, close behavior, and post-close errored health.

  Phase 3 commit 2 of plans/2026-05-07-server-beta-independent-bullmq-observation-runtime.md.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(server-beta): cap /v1/events/batch at 500 events

  Prevents unbounded array DoS surface flagged in PR review.
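The 500-event cap on /v1/events/batch can be pictured with a small validator. This is a sketch under assumptions, not the handler from this commit: the `events` field name, the function name `checkEventBatch`, and the 400/413 status codes are hypothetical; only the limit of 500 comes from the commit message.

```typescript
// Limit stated in the commit message.
const MAX_BATCH_EVENTS = 500;

type BatchCheck =
  | { ok: true; events: unknown[] }
  | { ok: false; status: 400 | 413; error: string };

// Assumption: the request body carries the batch under an `events` array.
function checkEventBatch(body: unknown): BatchCheck {
  const events = (body as { events?: unknown } | null)?.events;
  if (!Array.isArray(events)) {
    return { ok: false, status: 400, error: "events must be an array" };
  }
  // Reject oversized batches before doing any per-event work.
  if (events.length > MAX_BATCH_EVENTS) {
    return {
      ok: false,
      status: 413,
      error: `batch has ${events.length} events; the limit is ${MAX_BATCH_EVENTS}`,
    };
  }
  return { ok: true, events };
}

console.log(checkEventBatch({ events: new Array(501).fill({}) }).ok); // false
```

Checking the length first keeps the cost of a rejected request independent of the batch size.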
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
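The opt-in, fail-fast queue selection described in the commit message can be sketched as follows. The names `selectQueueManager`, `QueueManager`, and `BoundaryHealth` are hypothetical stand-ins and the real ActiveServerBetaQueueManager wiring may look different; the sketch only shows the two properties the message states: the active manager is built only when CLAUDE_MEM_QUEUE_ENGINE is bullmq, and a construction error propagates instead of falling back.

```typescript
// Health status as a discriminated union with optional details,
// mirroring the three states named in the commit message.
type BoundaryHealth =
  | { status: "disabled" }
  | { status: "active"; details?: Record<string, unknown> }
  | { status: "errored"; details?: Record<string, unknown> };

// Hypothetical minimal manager interface for this sketch.
interface QueueManager {
  health(): BoundaryHealth;
}

function selectQueueManager(
  env: Record<string, string | undefined>,
  createActive: () => QueueManager,
): QueueManager {
  if (env.CLAUDE_MEM_QUEUE_ENGINE === "bullmq") {
    // Deliberately no try/catch: a Redis or config error must surface
    // to the caller rather than silently selecting another engine.
    return createActive();
  }
  // Any other engine keeps the disabled boundary.
  return { health: () => ({ status: "disabled" }) };
}

const manager = selectQueueManager({}, () => {
  throw new Error("not reached when the engine is unset");
});
console.log(manager.health().status); // "disabled"
```

Passing the factory in as a parameter follows the same idea as the test seams mentioned for ServerJobQueue.ts: tests can exercise the selection logic without Redis.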
@@ -12,6 +12,11 @@ const WORKER_SERVICE = {
   source: 'src/services/worker-service.ts'
 };

+const SERVER_BETA_SERVICE = {
+  name: 'server-beta-service',
+  source: 'src/server/runtime/ServerBetaService.ts'
+};
+
 const MCP_SERVER = {
   name: 'mcp-server',
   source: 'src/servers/mcp-server.ts'
@@ -139,6 +144,7 @@ async function buildHooks() {
     logLevel: 'error', // Suppress warnings (import.meta warning is benign)
     external: [
       'bun:sqlite',
+      'zod',
       'cohere-ai',
       'ollama',
       '@chroma-core/default-embed',
@@ -162,6 +168,38 @@ async function buildHooks() {
   const workerStats = fs.statSync(`${hooksDir}/${WORKER_SERVICE.name}.cjs`);
   console.log(`✓ worker-service built (${(workerStats.size / 1024).toFixed(2)} KB)`);

+  console.log(`\n🔧 Building server beta service...`);
+  await build({
+    entryPoints: [SERVER_BETA_SERVICE.source],
+    bundle: true,
+    platform: 'node',
+    target: 'node18',
+    format: 'cjs',
+    outfile: `${hooksDir}/${SERVER_BETA_SERVICE.name}.cjs`,
+    minify: true,
+    logLevel: 'error',
+    external: [
+      'bun:sqlite',
+      'zod',
+    ],
+    define: {
+      '__DEFAULT_PACKAGE_VERSION__': `"${version}"`
+    },
+    banner: {
+      js: [
+        '#!/usr/bin/env bun',
+        'var __filename = __filename || require("node:path").resolve(process.argv[1] || "");',
+        'var __dirname = __dirname || require("node:path").dirname(__filename);'
+      ].join('\n')
+    }
+  });
+
+  stripHardcodedDirname(`${hooksDir}/${SERVER_BETA_SERVICE.name}.cjs`);
+
+  fs.chmodSync(`${hooksDir}/${SERVER_BETA_SERVICE.name}.cjs`, 0o755);
+  const serverBetaStats = fs.statSync(`${hooksDir}/${SERVER_BETA_SERVICE.name}.cjs`);
+  console.log(`✓ server-beta-service built (${(serverBetaStats.size / 1024).toFixed(2)} KB)`);
+
   console.log(`\n🔧 Building MCP server...`);
   await build({
     entryPoints: [MCP_SERVER.source],
@@ -406,6 +444,7 @@ async function buildHooks() {
   console.log('\n✅ All build targets compiled successfully!');
   console.log(` Output: ${hooksDir}/`);
   console.log(` - Worker: worker-service.cjs`);
+  console.log(` - Server beta: server-beta-service.cjs`);
   console.log(` - MCP Server: mcp-server.cjs`);
   console.log(` - Context Generator: context-generator.cjs`);
   console.log(` Output: ${npxCliOutDir}/`);
Executable → Regular
+14
-14
@@ -39,19 +39,19 @@ Claude-Mem Queue Clearer
 Clear orphaned messages from the pending_messages SQLite table.

 Usage:
-  bun scripts/clear-failed-queue.ts [options]
+  bun scripts/clear-pending-queue.ts [options]

 Options:
   --help, -h Show this help message
-  --all Clear ALL messages (pending, processing, processed, failed)
+  --all Clear ALL messages (pending and processing)
   --force Clear without prompting for confirmation

 Examples:
-  # Clear failed messages interactively
-  bun scripts/clear-failed-queue.ts
+  # Clear processing messages interactively
+  bun scripts/clear-pending-queue.ts

   # Clear ALL messages without confirmation
-  bun scripts/clear-failed-queue.ts --all --force
+  bun scripts/clear-pending-queue.ts --all --force

 Notes:
   Operates directly on ~/.claude-mem/claude-mem.db (or \$CLAUDE_MEM_DATA_DIR).
@@ -65,7 +65,7 @@ Notes:

   console.log(clearAll
     ? '\n=== Claude-Mem Queue Clearer (ALL) ===\n'
-    : '\n=== Claude-Mem Queue Clearer (Failed) ===\n');
+    : '\n=== Claude-Mem Queue Clearer (Processing) ===\n');

   const dbPath = resolveDbPath();
   if (!existsSync(dbPath)) {
@@ -81,20 +81,20 @@ Notes:
   ).all() as StatusRow[];

   const total = counts.reduce((sum, row) => sum + row.count, 0);
-  const failed = counts.find(r => r.status === 'failed')?.count ?? 0;
+  const processing = counts.find(r => r.status === 'processing')?.count ?? 0;

   console.log('Queue Summary:');
-  for (const status of ['pending', 'processing', 'processed', 'failed'] as const) {
+  for (const status of ['pending', 'processing'] as const) {
     const row = counts.find(r => r.status === status);
     console.log(` ${status.padEnd(11)} ${row?.count ?? 0}`);
   }
   console.log('');

-  const willClear = clearAll ? total : failed;
+  const willClear = clearAll ? total : processing;
   if (willClear === 0) {
     console.log(clearAll
       ? 'No messages in queue. Nothing to clear.\n'
-      : 'No failed messages in queue. Nothing to clear.\n');
+      : 'No processing messages in queue. Nothing to clear.\n');
     db.close();
     process.exit(0);
   }
@@ -102,8 +102,8 @@ Notes:
   if (!force) {
     const answer = await prompt(
       clearAll
-        ? `Clear ${willClear} messages (all statuses)? [y/N]: `
-        : `Clear ${willClear} failed messages? [y/N]: `
+        ? `Clear ${willClear} messages (pending and processing)? [y/N]: `
+        : `Clear ${willClear} processing messages? [y/N]: `
     );
     if (answer.toLowerCase() !== 'y') {
       console.log('\nCancelled. Run with --force to skip confirmation.\n');
@@ -114,8 +114,8 @@ Notes:
   }

   const stmt = clearAll
-    ? db.prepare('DELETE FROM pending_messages')
-    : db.prepare("DELETE FROM pending_messages WHERE status = 'failed'");
+    ? db.prepare("DELETE FROM pending_messages WHERE status IN ('pending', 'processing')")
+    : db.prepare("DELETE FROM pending_messages WHERE status = 'processing'");
   const cleared = stmt.run().changes;

   const remaining = (db.prepare(
Executable
+94
@@ -0,0 +1,94 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+PROJECT_NAME="${COMPOSE_PROJECT_NAME:-claude-mem-server-beta-e2e-$(date +%s)}"
+RUN_ID="${E2E_RUN_ID:-$(date +%s)-$RANDOM}"
+COMPOSE_FILES=(-f docker-compose.yml -f docker-compose.e2e.yml)
+COMPOSE=(docker compose -p "$PROJECT_NAME" "${COMPOSE_FILES[@]}")
+SERVER_SCRIPT="/opt/claude-mem/scripts/worker-service.cjs"
+
+cd "$ROOT_DIR"
+
+cleanup() {
+  local exit_code=$?
+  if [[ $exit_code -ne 0 ]]; then
+    echo "[e2e] failure; recent server logs:" >&2
+    "${COMPOSE[@]}" logs --no-color --tail=200 claude-mem-server valkey >&2 || true
+  fi
+  "${COMPOSE[@]}" down -v --remove-orphans >/dev/null 2>&1 || true
+}
+trap cleanup EXIT
+
+wait_for_container_readiness() {
+  local deadline=$((SECONDS + 120))
+  until "${COMPOSE[@]}" exec -T claude-mem-server curl -fsS http://127.0.0.1:37777/api/readiness >/dev/null 2>&1; do
+    if (( SECONDS > deadline )); then
+      echo "[e2e] server did not become ready" >&2
+      return 1
+    fi
+    sleep 1
+  done
+}
+
+json_field() {
+  local field="$1"
+  node -e '
+    const field = process.argv[1];
+    let raw = "";
+    process.stdin.on("data", chunk => raw += chunk);
+    process.stdin.on("end", () => {
+      const value = JSON.parse(raw)[field];
+      if (value === undefined || value === null) process.exit(1);
+      process.stdout.write(String(value));
+    });
+  ' "$field"
+}
+
+create_key() {
+  local name="$1"
+  local scopes="$2"
+  "${COMPOSE[@]}" exec -T claude-mem-server \
+    bun "$SERVER_SCRIPT" server api-key create --name "$name" --scope "$scopes"
+}
+
+echo "[e2e] building plugin bundles"
+npm run build
+
+echo "[e2e] starting Docker stack project=$PROJECT_NAME run=$RUN_ID"
+"${COMPOSE[@]}" up --build -d valkey claude-mem-server
+wait_for_container_readiness
+
+echo "[e2e] creating API keys inside server container"
+FULL_KEY_JSON="$(create_key "docker-e2e-full-$RUN_ID" "memories:read,memories:write")"
+READ_ONLY_KEY_JSON="$(create_key "docker-e2e-read-$RUN_ID" "memories:read")"
+FULL_KEY="$(printf '%s' "$FULL_KEY_JSON" | json_field key)"
+READ_ONLY_KEY="$(printf '%s' "$READ_ONLY_KEY_JSON" | json_field key)"
+READ_ONLY_KEY_ID="$(printf '%s' "$READ_ONLY_KEY_JSON" | json_field id)"
+
+echo "[e2e] running phase1 functional paths in test container"
+"${COMPOSE[@]}" run --rm \
+  -e E2E_PHASE=phase1 \
+  -e E2E_RUN_ID="$RUN_ID" \
+  -e E2E_API_KEY="$FULL_KEY" \
+  -e E2E_READ_ONLY_API_KEY="$READ_ONLY_KEY" \
+  server-beta-e2e
+
+echo "[e2e] revoking read-only key inside server container"
+"${COMPOSE[@]}" exec -T claude-mem-server \
+  bun "$SERVER_SCRIPT" server api-key revoke "$READ_ONLY_KEY_ID" >/dev/null
+
+echo "[e2e] restarting server container to verify persisted state"
+"${COMPOSE[@]}" restart claude-mem-server
+wait_for_container_readiness
+
+echo "[e2e] running phase2 persistence and revoked-key checks in test container"
+"${COMPOSE[@]}" run --rm \
+  -e E2E_PHASE=phase2 \
+  -e E2E_RUN_ID="$RUN_ID" \
+  -e E2E_API_KEY="$FULL_KEY" \
+  -e E2E_REVOKED_API_KEY="$READ_ONLY_KEY" \
+  server-beta-e2e
+
+echo "[e2e] Docker server beta E2E passed for run=$RUN_ID"
@@ -15,14 +15,6 @@ interface AffectedObservation {
   title: string;
 }

-interface ProcessedMessage {
-  id: number;
-  session_db_id: number;
-  tool_name: string;
-  created_at_epoch: number;
-  completed_at_epoch: number;
-}
-
 interface SessionMapping {
   session_db_id: number;
   memory_session_id: string;
@@ -78,19 +70,7 @@ function main() {
     return;
   }

-  console.log('Step 2: Finding pending messages processed during bad window...');
-  const processedMessages = db.query<ProcessedMessage, []>(`
-    SELECT id, session_db_id, tool_name, created_at_epoch, completed_at_epoch
-    FROM pending_messages
-    WHERE status = 'processed'
-    AND completed_at_epoch >= ${BAD_WINDOW_START}
-    AND completed_at_epoch <= ${BAD_WINDOW_END}
-    ORDER BY completed_at_epoch
-  `).all();
-
-  console.log(`Found ${processedMessages.length} processed messages\n`);
-
-  console.log('Step 3: Matching observations to session start times...');
+  console.log('Step 2: Matching observations to session start times...');
   const fixes: TimestampFix[] = [];

   interface ObsWithSession {
@@ -89,17 +89,6 @@ function main() {
     console.log();
   }

-  console.log('Check 4: Verifying processed pending_messages...');
-  const processedCount = db.query<{ count: number }, []>(`
-    SELECT COUNT(*) as count
-    FROM pending_messages
-    WHERE status = 'processed'
-    AND completed_at_epoch >= ${BAD_WINDOW_START}
-    AND completed_at_epoch <= ${BAD_WINDOW_END}
-  `).get();
-
-  console.log(`${processedCount?.count || 0} pending messages were processed during bad window\n`);
-
   console.log('═══════════════════════════════════════════════════════════════════════');
   console.log('VERIFICATION SUMMARY:');
   console.log('═══════════════════════════════════════════════════════════════════════\n');
@@ -108,7 +97,6 @@ function main() {
     console.log('✅ SUCCESS: Timestamp fix appears to be working correctly!');
     console.log(` - No observations remain in bad window (Dec 24 19:45-20:31)`);
     console.log(` - ${originalWindowObs?.count} observations restored to Dec 17-20`);
-    console.log(` - Processed ${processedCount?.count} pending messages`);
     console.log('\n💡 Safe to re-enable orphan processing in worker-service.ts\n');
   } else if (badWindowObs.length > 0) {
     console.log('⚠️ WARNING: Some observations still have incorrect timestamps!');