Files
claude-mem/tests/session_id_usage_validation.test.ts
T
Alex Newman f21ea97c39 refactor: decompose monolith into modular architecture with comprehensive test suite (#538)
* fix: prevent memory_session_id from equaling content_session_id

The bug: memory_session_id was initialized to contentSessionId as a
"placeholder for FK purposes". This caused the SDK resume logic to
inject memory agent messages into the USER's Claude Code transcript,
corrupting their conversation history.

Root cause:
- SessionStore.createSDKSession initialized memory_session_id = contentSessionId
- SDKAgent checked memorySessionId !== contentSessionId but this check
  only worked if the session was fetched fresh from DB

The fix:
- SessionStore: Initialize memory_session_id as NULL, not contentSessionId
- SDKAgent: Simple truthy check !!session.memorySessionId (NULL = fresh start)
- Database migration: Ran UPDATE to set memory_session_id = NULL for 1807
  existing sessions that had the bug

Also adds [ALIGNMENT] logging across the session lifecycle to help debug
session continuity issues:
- Hook entry: contentSessionId + promptNumber
- DB lookup: contentSessionId → memorySessionId mapping proof
- Resume decision: shows which memorySessionId will be used for resume
- Capture: logs when memorySessionId is captured from first SDK response

UI: Added "Alignment" quick filter button in LogsModal to show only
alignment logs for debugging session continuity.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: improve error handling in worker-service.ts

- Fix GENERIC_CATCH anti-patterns by logging full error objects instead of just messages
- Add [ANTI-PATTERN IGNORED] markers for legitimate cases (cleanup, hot paths)
- Simplify error handling comments to be more concise
- Improve httpShutdown() error discrimination for ECONNREFUSED
- Reduce LARGE_TRY_BLOCK issues in initialization code

Part of anti-pattern cleanup plan (132 total issues)

* refactor: improve error logging in SearchManager.ts

- Pass full error objects to logger instead of just error.message
- Fixes PARTIAL_ERROR_LOGGING anti-patterns (10 instances)
- Better debugging visibility when Chroma queries fail

Part of anti-pattern cleanup (133 remaining)

* refactor: improve error logging across SessionStore and mcp-server

- SessionStore.ts: Fix error logging in column rename utility
- mcp-server.ts: Log full error objects instead of just error.message
- Improve error handling in Worker API calls and tool execution

Part of anti-pattern cleanup (133 remaining)

* Refactor hooks to streamline error handling and loading states

- Simplified error handling in useContextPreview by removing try-catch and directly checking response status.
- Refactored usePagination to eliminate try-catch, improving readability and maintaining error handling through response checks.
- Cleaned up useSSE by removing unnecessary try-catch around JSON parsing, ensuring clarity in message handling.
- Enhanced useSettings by streamlining the saving process, removing try-catch, and directly checking the result for success.

* refactor: add error handling back to SearchManager Chroma calls

- Wrap queryChroma calls in try-catch to prevent generator crashes
- Log Chroma errors as warnings and fall back gracefully
- Fixes generator failures when Chroma has issues
- Part of anti-pattern cleanup recovery

* feat: Add generator failure investigation report and observation duplication regression report

- Created a comprehensive investigation report detailing the root cause of generator failures during anti-pattern cleanup, including the impact, investigation process, and implemented fixes.
- Documented the critical regression causing observation duplication due to race conditions in the SDK agent, outlining symptoms, root cause analysis, and proposed fixes.

* fix: address PR #528 review comments - atomic cleanup and detector improvements

This commit addresses critical review feedback from PR #528:

## 1. Atomic Message Cleanup (Fix Race Condition)

**Problem**: SessionRoutes.ts generator error handler had race condition
- Queried messages then marked failed in loop
- If crash during loop → partial marking → inconsistent state

**Solution**:
- Added `markSessionMessagesFailed()` to PendingMessageStore.ts
- Single atomic UPDATE statement replaces loop
- Follows existing pattern from `resetProcessingToPending()`

**Files**:
- src/services/sqlite/PendingMessageStore.ts (new method)
- src/services/worker/http/routes/SessionRoutes.ts (use new method)

## 2. Anti-Pattern Detector Improvements

**Problem**: Detector didn't recognize logger.failure() method
- Lines 212 & 335 already included "failure"
- Lines 112-113 (PARTIAL_ERROR_LOGGING detection) did not

**Solution**: Updated regex patterns to include "failure" for consistency

**Files**:
- scripts/anti-pattern-test/detect-error-handling-antipatterns.ts

## 3. Documentation

**PR Comment**: Added clarification on memory_session_id fix location
- Points to SessionStore.ts:1155
- Explains why NULL initialization prevents message injection bug

## Review Response

Addresses "Must Address Before Merge" items from review:
 Clarified memory_session_id bug fix location (via PR comment)
 Made generator error handler message cleanup atomic
 Deferred comprehensive test suite to follow-up PR (keeps PR focused)

## Testing

- Build passes with no errors
- Anti-pattern detector runs successfully
- Atomic cleanup follows proven pattern from existing methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: FOREIGN KEY constraint and missing failed_at_epoch column

Two critical bugs fixed:

1. Missing failed_at_epoch column in pending_messages table
   - Added migration 20 to create the column
   - Fixes error when trying to mark messages as failed

2. FOREIGN KEY constraint failed when storing observations
   - All three agents (SDK, Gemini, OpenRouter) were passing
     session.contentSessionId instead of session.memorySessionId
   - storeObservationsAndMarkComplete expects memorySessionId
   - Added null check and clear error message

However, observations still not saving - see investigation report.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Refactor hook input parsing to improve error handling

- Added a nested try-catch block in new-hook.ts, save-hook.ts, and summary-hook.ts to handle JSON parsing errors more gracefully.
- Replaced direct error throwing with logging of the error details using logger.error.
- Ensured that the process exits cleanly after handling input in all three hooks.

* docs: add monolith refactor report with system breakdown

Comprehensive analysis of codebase identifying:
- 14 files over 500 lines requiring refactoring
- 3 critical monoliths (SessionStore, SearchManager, worker-service)
- 80% code duplication across agent files
- 5-phase refactoring roadmap with domain-based architecture

* docs: update monolith report post session-logging merge

- SessionStore grew to 2,011 lines (49 methods) - highest priority
- SearchManager reduced to 1,778 lines (improved)
- Agent files reduced by ~45 lines combined
- Added trend indicators and post-merge observations
- Core refactoring proposal remains valid

* refactor(sqlite): decompose SessionStore into modular architecture

Extract the 2011-line SessionStore.ts monolith into focused, single-responsibility
modules following grep-optimized progressive disclosure pattern:

New module structure:
- sessions/ - Session creation and retrieval (create.ts, get.ts, types.ts)
- observations/ - Observation storage and queries (store.ts, get.ts, recent.ts, files.ts, types.ts)
- summaries/ - Summary storage and queries (store.ts, get.ts, recent.ts, types.ts)
- prompts/ - User prompt management (store.ts, get.ts, types.ts)
- timeline/ - Cross-entity timeline queries (queries.ts)
- import/ - Bulk import operations (bulk.ts)
- migrations/ - Database migrations (runner.ts)

New coordinator files:
- Database.ts - ClaudeMemDatabase class with re-exports
- transactions.ts - Atomic cross-entity transactions
- Named re-export facades (Sessions.ts, Observations.ts, etc.)

Key design decisions:
- All functions take `db: Database` as first parameter (functional style)
- Named re-exports instead of index.ts for grep-friendliness
- SessionStore retained as backward-compatible wrapper
- Target file size: 50-150 lines (60% compliance)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(agents): extract shared logic into modular architecture

Consolidate duplicate code across SDKAgent, GeminiAgent, and OpenRouterAgent
into focused utility modules. Total reduction: 500 lines (29%).

New modules in src/services/worker/agents/:
- ResponseProcessor.ts: Atomic DB transactions, Chroma sync, SSE broadcast
- ObservationBroadcaster.ts: SSE event formatting and dispatch
- SessionCleanupHelper.ts: Session state cleanup and stuck message reset
- FallbackErrorHandler.ts: Provider error detection for fallback logic
- types.ts: Shared interfaces (WorkerRef, SSE payloads, StorageResult)

Bug fix: SDKAgent was incorrectly using obs.files instead of obs.files_read
and hardcoding files_modified to empty array.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(search): extract search strategies into modular architecture

Decompose SearchManager into focused strategy pattern with:
- SearchOrchestrator: Coordinates strategy selection and fallback
- ChromaSearchStrategy: Vector semantic search via ChromaDB
- SQLiteSearchStrategy: Filter-only queries for date/project/type
- HybridSearchStrategy: Metadata filtering + semantic ranking
- ResultFormatter: Markdown table formatting for results
- TimelineBuilder: Chronological timeline construction
- Filter modules: DateFilter, ProjectFilter, TypeFilter

SearchManager now delegates to new infrastructure while maintaining
full backward compatibility with existing public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(context): decompose context-generator into modular architecture

Extract 660-line monolith into focused components:
- ContextBuilder: Main orchestrator (~160 lines)
- ContextConfigLoader: Configuration loading
- TokenCalculator: Token budget calculations
- ObservationCompiler: Data retrieval and query building
- MarkdownFormatter/ColorFormatter: Output formatting
- Section renderers: Header, Timeline, Summary, Footer

Maintains full backward compatibility - context-generator.ts now
delegates to new ContextBuilder while preserving public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(worker): decompose worker-service into modular infrastructure

Split 2000+ line monolith into focused modules:

Infrastructure:
- ProcessManager: PID files, signal handlers, child process cleanup
- HealthMonitor: Port checks, health polling, version matching
- GracefulShutdown: Coordinated cleanup on exit

Server:
- Server: Express app setup, core routes, route registration
- Middleware: Re-exports from existing middleware
- ErrorHandler: Centralized error handling with AppError class

Integrations:
- CursorHooksInstaller: Full Cursor IDE integration (registry, hooks, MCP)

WorkerService now acts as thin coordinator wiring all components together.
Maintains full backward compatibility with existing public API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Refactor session queue processing and database interactions

- Implement claim-and-delete pattern in SessionQueueProcessor to simplify message handling and eliminate duplicate processing.
- Update PendingMessageStore to support atomic claim-and-delete operations, removing the need for intermediate processing states.
- Introduce storeObservations method in SessionStore for simplified observation and summary storage without message tracking.
- Remove deprecated methods and clean up session state management in worker agents.
- Adjust response processing to accommodate new storage patterns, ensuring atomic transactions for observations and summaries.
- Remove unnecessary reset logic for stuck messages due to the new queue handling approach.

* Add duplicate observation cleanup script

Script to clean up duplicate observations created by the batching bug
where observations were stored once per message ID instead of once per
observation. Includes safety checks to always keep at least one copy.

Usage:
  bun scripts/cleanup-duplicates.ts           # Dry run
  bun scripts/cleanup-duplicates.ts --execute # Delete duplicates
  bun scripts/cleanup-duplicates.ts --aggressive # Ignore time window

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(sqlite): add comprehensive test suite for SQLite repositories

Add 44 tests across 5 test files covering:
- Sessions: CRUD operations and schema validation
- Observations: creation, retrieval, filtering, and ordering
- Prompts: persistence and association with observations
- Summaries: generation tracking and session linkage
- Transactions: context management and rollback behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(worker): add comprehensive test suites for worker agent modules

Add test coverage for response-processor, observation-broadcaster,
session-cleanup-helper, and fallback-error-handler agents. Fix type
import issues across search module (use `import type` for type-only
imports) and update worker-service main module detection for ESM/CJS
compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(search): add comprehensive test suites for search module

Add test coverage for the refactored search architecture:
- SearchOrchestrator: query coordination and caching
- ResultFormatter: pagination, sorting, and field mapping
- SQLiteSearchStrategy: database search operations
- ChromaSearchStrategy: vector similarity search
- HybridSearchStrategy: combined search with score fusion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(context): add comprehensive test suites for context-generator modules

Add test coverage for the modular context-generator architecture:
- context-builder.test.ts: Tests for context building and result assembly
- observation-compiler.test.ts: Tests for observation compilation with privacy tags
- token-calculator.test.ts: Tests for token budget calculations
- formatters/markdown-formatter.test.ts: Tests for markdown output formatting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(infrastructure): add comprehensive test suites for worker infrastructure modules

Add test coverage for graceful-shutdown, health-monitor, and process-manager
modules extracted during the worker-service refactoring. All 32 tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(server): add comprehensive test suites for server modules

Add test coverage for Express server infrastructure:
- error-handler.test.ts: Tests error handling middleware including
  validation errors, database errors, and async error handling
- server.test.ts: Tests server initialization, middleware configuration,
  and route mounting for all API endpoints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(package): add test scripts for modular test suites

Add npm run scripts to simplify running tests:
- test: run all tests
- test:sqlite, test:agents, test:search, test:context, test:infra, test:server

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build assets

* feat(tests): add detailed failure analysis reports for session ID refactor, validation, and store tests

- Created reports for session ID refactor test failures, highlighting 8 failures due to design mismatches.
- Added session ID usage validation report detailing 10 failures caused by outdated assumptions in tests.
- Documented session store test failures, focusing on foreign key constraint violations in 2 tests.
- Compiled a comprehensive test suite report summarizing overall test results, including 28 failing tests across various categories.

* fix(tests): align session ID tests with NULL-based initialization

Update test expectations to match implementation where memory_session_id
starts as NULL (not equal to contentSessionId) per architecture decision
that memory_session_id must NEVER equal contentSessionId.

Changes:
- session_id_refactor.test.ts: expect NULL initial state, add updateMemorySessionId() calls
- session_id_usage_validation.test.ts: update placeholder detection to check !== null
- session_store.test.ts: add updateMemorySessionId() before storeObservation/storeSummary

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(tests): update GeminiAgent tests with correct field names and mocks

- Rename deprecated fields: claudeSessionId → contentSessionId,
  sdkSessionId → memorySessionId, pendingProcessingIds → pendingMessages
- Add missing required ActiveSession fields
- Add storeObservations mock (plural) for ResponseProcessor compatibility
- Fix settings mock to use correct CLAUDE_MEM_GEMINI_RATE_LIMITING_ENABLED key
- Add await to rejects.toThrow assertion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(tests): add logger imports and fix coverage test exclusions

Phase 3 of test suite fixes:
- Add logger imports to 34 high-priority source files (SQLite, worker, context)
- Exclude CLI-facing files from console.log check (worker-service.ts,
  integrations/*Installer.ts) as they use console.log intentionally for
  interactive user output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update SESSION_ID_ARCHITECTURE for NULL-based initialization

Update documentation to reflect that memory_session_id starts as NULL,
not as a placeholder equal to contentSessionId. This matches the
implementation decision that memory_session_id must NEVER equal
contentSessionId to prevent injecting memory messages into user transcripts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(deps): update esbuild and MCP SDK

- esbuild: 0.25.12 → 0.27.2 (fixes minifyIdentifiers issue)
- @modelcontextprotocol/sdk: 1.20.1 → 1.25.1

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build assets and updates

* chore: remove bun.lock and add to gitignore

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 23:58:41 -05:00

511 lines
21 KiB
TypeScript

import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
import { SessionStore } from '../src/services/sqlite/SessionStore.js';
/**
* Session ID Usage Validation Tests
*
* PURPOSE: Prevent confusion and bugs from mixing contentSessionId and memorySessionId
*
* CRITICAL ARCHITECTURE:
* - contentSessionId: User's Claude Code conversation session (immutable)
* - memorySessionId: SDK agent's session ID for resume (captured from SDK response)
*
* INVARIANTS TO ENFORCE:
* 1. memorySessionId starts as NULL (NEVER equals contentSessionId - that would inject memory into user transcript!)
* 2. Resume MUST NOT be used when memorySessionId is NULL
* 3. Resume MUST ONLY be used when hasRealMemorySessionId === true (memorySessionId is non-null)
* 4. Observations are stored with memorySessionId (after updateMemorySessionId has been called)
* 5. updateMemorySessionId() is required before storeObservation() or storeSummary() can work
*/
describe('Session ID Usage Validation', () => {
let store: SessionStore;
beforeEach(() => {
store = new SessionStore(':memory:');
});
afterEach(() => {
store.close();
});
describe('Placeholder Detection - hasRealMemorySessionId Logic', () => {
it('should identify placeholder when memorySessionId is NULL', () => {
const contentSessionId = 'user-session-123';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test prompt');
const session = store.getSessionById(sessionDbId);
// Initially, memory_session_id is NULL (placeholder state)
// CRITICAL: memory_session_id must NEVER equal contentSessionId - that would inject memory into user transcript!
expect(session?.memory_session_id).toBeNull();
// hasRealMemorySessionId would be FALSE (NULL is falsy)
const hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(false);
});
it('should identify real memory session ID after capture', () => {
const contentSessionId = 'user-session-456';
const capturedMemoryId = 'sdk-generated-abc123';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test prompt');
store.updateMemorySessionId(sessionDbId, capturedMemoryId);
const session = store.getSessionById(sessionDbId);
// After capture, memory_session_id is set (non-NULL)
expect(session?.memory_session_id).toBe(capturedMemoryId);
// hasRealMemorySessionId would be TRUE
const hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(true);
});
it('should never use contentSessionId as resume parameter when in placeholder state', () => {
const contentSessionId = 'dangerous-session-789';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
const session = store.getSessionById(sessionDbId);
const hasRealMemorySessionId = session?.memory_session_id !== null;
// CRITICAL: This check prevents resuming when memory_session_id is not captured
if (hasRealMemorySessionId) {
// Safe to use for resume
const resumeParam = session?.memory_session_id;
expect(resumeParam).not.toBe(contentSessionId);
} else {
// Must NOT pass resume parameter
// Resume should be undefined/null in SDK call
expect(hasRealMemorySessionId).toBe(false);
}
});
});
describe('Observation Storage - MemorySessionId Usage', () => {
it('should store observations with memorySessionId in memory_session_id column', () => {
const contentSessionId = 'obs-content-session-123';
const memorySessionId = 'obs-memory-session-123';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
store.updateMemorySessionId(sessionDbId, memorySessionId);
const obs = {
type: 'discovery',
title: 'Test Observation',
subtitle: null,
facts: ['Fact 1'],
narrative: 'Testing',
concepts: ['testing'],
files_read: [],
files_modified: []
};
// storeObservation takes memorySessionId (after updateMemorySessionId has been called)
const result = store.storeObservation(memorySessionId, 'test-project', obs, 1);
// Verify it's stored in the memory_session_id column with memorySessionId value
const stored = store.db.prepare(
'SELECT memory_session_id FROM observations WHERE id = ?'
).get(result.id) as { memory_session_id: string };
// memory_session_id column contains the captured SDK session ID
expect(stored.memory_session_id).toBe(memorySessionId);
});
it('should be retrievable using memorySessionId', () => {
const contentSessionId = 'retrieval-test-session';
const memorySessionId = 'retrieval-memory-session';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
store.updateMemorySessionId(sessionDbId, memorySessionId);
// Store observation with memorySessionId
const obs = {
type: 'feature',
title: 'Observation',
subtitle: null,
facts: [],
narrative: null,
concepts: [],
files_read: [],
files_modified: []
};
store.storeObservation(memorySessionId, 'test-project', obs, 1);
// Observations are retrievable by memorySessionId
const observations = store.getObservationsForSession(memorySessionId);
expect(observations.length).toBe(1);
expect(observations[0].title).toBe('Observation');
});
});
describe('Resume Safety - Prevent contentSessionId Resume Bug', () => {
it('should prevent resume with NULL memorySessionId', () => {
const contentSessionId = 'safety-test-session';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
const session = store.getSessionById(sessionDbId);
// Simulate hasRealMemorySessionId check - memory_session_id must be non-null
const hasRealMemorySessionId = session?.memory_session_id !== null;
// MUST be false in placeholder state (memory_session_id is NULL)
expect(hasRealMemorySessionId).toBe(false);
// Resume parameter should NOT be set
// In SDK call: ...(hasRealMemorySessionId && { resume: session.memorySessionId })
// This evaluates to an empty object, not a resume parameter
const resumeOptions = hasRealMemorySessionId ? { resume: session?.memory_session_id } : {};
expect(resumeOptions).toEqual({});
});
it('should allow resume only after memory session ID is captured', () => {
const contentSessionId = 'resume-ready-session';
const capturedMemoryId = 'real-sdk-session-123';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
// Before capture - no resume (memory_session_id is NULL)
let session = store.getSessionById(sessionDbId);
let hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(false);
// Capture memory session ID
store.updateMemorySessionId(sessionDbId, capturedMemoryId);
// After capture - resume allowed
session = store.getSessionById(sessionDbId);
hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(true);
// Resume parameter should be the captured ID
const resumeOptions = hasRealMemorySessionId ? { resume: session?.memory_session_id } : {};
expect(resumeOptions).toEqual({ resume: capturedMemoryId });
expect(resumeOptions.resume).not.toBe(contentSessionId);
});
});
describe('Cross-Contamination Prevention', () => {
it('should never mix observations from different content sessions', () => {
const content1 = 'user-session-A';
const content2 = 'user-session-B';
const memory1 = 'memory-session-A';
const memory2 = 'memory-session-B';
const id1 = store.createSDKSession(content1, 'project-a', 'Prompt A');
const id2 = store.createSDKSession(content2, 'project-b', 'Prompt B');
store.updateMemorySessionId(id1, memory1);
store.updateMemorySessionId(id2, memory2);
// Store observations in each session using memorySessionId
store.storeObservation(memory1, 'project-a', {
type: 'discovery',
title: 'Observation A',
subtitle: null,
facts: [],
narrative: null,
concepts: [],
files_read: [],
files_modified: []
}, 1);
store.storeObservation(memory2, 'project-b', {
type: 'discovery',
title: 'Observation B',
subtitle: null,
facts: [],
narrative: null,
concepts: [],
files_read: [],
files_modified: []
}, 1);
// Verify isolation
const obsA = store.getObservationsForSession(memory1);
const obsB = store.getObservationsForSession(memory2);
expect(obsA.length).toBe(1);
expect(obsB.length).toBe(1);
expect(obsA[0].title).toBe('Observation A');
expect(obsB[0].title).toBe('Observation B');
});
it('should never leak memory session IDs between content sessions', () => {
const content1 = 'content-session-1';
const content2 = 'content-session-2';
const memory1 = 'memory-session-1';
const memory2 = 'memory-session-2';
const id1 = store.createSDKSession(content1, 'project', 'Prompt');
const id2 = store.createSDKSession(content2, 'project', 'Prompt');
store.updateMemorySessionId(id1, memory1);
store.updateMemorySessionId(id2, memory2);
const session1 = store.getSessionById(id1);
const session2 = store.getSessionById(id2);
// Each session must have its own unique memory session ID
expect(session1?.memory_session_id).toBe(memory1);
expect(session2?.memory_session_id).toBe(memory2);
expect(session1?.memory_session_id).not.toBe(session2?.memory_session_id);
});
});
describe('Foreign Key Integrity', () => {
it('should cascade delete observations when session is deleted', () => {
const contentSessionId = 'cascade-test-session';
const memorySessionId = 'cascade-memory-session';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
store.updateMemorySessionId(sessionDbId, memorySessionId);
// Store observation
const obs = {
type: 'discovery',
title: 'Will be deleted',
subtitle: null,
facts: [],
narrative: null,
concepts: [],
files_read: [],
files_modified: []
};
store.storeObservation(memorySessionId, 'test-project', obs, 1);
// Verify observation exists
let observations = store.getObservationsForSession(memorySessionId);
expect(observations.length).toBe(1);
// Delete session (should cascade to observations)
store.db.prepare('DELETE FROM sdk_sessions WHERE id = ?').run(sessionDbId);
// Verify observations were deleted
observations = store.getObservationsForSession(memorySessionId);
expect(observations.length).toBe(0);
});
it('should maintain FK relationship between observations and sessions', () => {
const contentSessionId = 'fk-test-session';
const memorySessionId = 'fk-memory-session';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
store.updateMemorySessionId(sessionDbId, memorySessionId);
// This should succeed (FK exists)
expect(() => {
store.storeObservation(memorySessionId, 'test-project', {
type: 'discovery',
title: 'Valid FK',
subtitle: null,
facts: [],
narrative: null,
concepts: [],
files_read: [],
files_modified: []
}, 1);
}).not.toThrow();
// This should fail (FK doesn't exist)
expect(() => {
store.storeObservation('nonexistent-session-id', 'test-project', {
type: 'discovery',
title: 'Invalid FK',
subtitle: null,
facts: [],
narrative: null,
concepts: [],
files_read: [],
files_modified: []
}, 1);
}).toThrow();
});
});
describe('Session Lifecycle - Memory ID Capture Flow', () => {
it('should follow correct lifecycle: create → capture → resume', () => {
const contentSessionId = 'lifecycle-session';
// STEP 1: Hook creates session (memory_session_id = NULL)
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'First prompt');
let session = store.getSessionById(sessionDbId);
expect(session?.memory_session_id).toBeNull(); // NULL - not captured yet
// STEP 2: First SDK message arrives with real session ID
const realMemoryId = 'sdk-generated-session-xyz';
store.updateMemorySessionId(sessionDbId, realMemoryId);
session = store.getSessionById(sessionDbId);
expect(session?.memory_session_id).toBe(realMemoryId); // Real ID
// STEP 3: Subsequent prompts can now resume
const hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(true);
// Resume parameter is safe to use
const resumeParam = session?.memory_session_id;
expect(resumeParam).toBe(realMemoryId);
expect(resumeParam).not.toBe(contentSessionId);
});
it('should handle worker restart by preserving captured memory session ID', () => {
const contentSessionId = 'restart-test-session';
const capturedMemoryId = 'persisted-memory-id';
// Simulate first worker instance
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Prompt');
store.updateMemorySessionId(sessionDbId, capturedMemoryId);
// Simulate worker restart - session re-fetched from database
const session = store.getSessionById(sessionDbId);
// Memory session ID should be preserved
expect(session?.memory_session_id).toBe(capturedMemoryId);
// Resume can work immediately
const hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(true);
});
});
describe('CRITICAL: 1:1 Transcript Mapping Guarantees', () => {
it('should enforce UNIQUE constraint on memory_session_id (prevents duplicate memory transcripts)', () => {
const content1 = 'content-session-1';
const content2 = 'content-session-2';
const sharedMemoryId = 'shared-memory-id';
const id1 = store.createSDKSession(content1, 'project', 'Prompt 1');
const id2 = store.createSDKSession(content2, 'project', 'Prompt 2');
// First session captures memory ID - should succeed
store.updateMemorySessionId(id1, sharedMemoryId);
// Second session tries to use SAME memory ID - should FAIL
expect(() => {
store.updateMemorySessionId(id2, sharedMemoryId);
}).toThrow(); // UNIQUE constraint violation
// Verify first session still has the ID
const session1 = store.getSessionById(id1);
expect(session1?.memory_session_id).toBe(sharedMemoryId);
});
it('should prevent memorySessionId from being changed after real capture (single transition guarantee)', () => {
const contentSessionId = 'single-capture-test';
const firstMemoryId = 'first-sdk-session-id';
const secondMemoryId = 'different-sdk-session-id';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
// First capture - should succeed
store.updateMemorySessionId(sessionDbId, firstMemoryId);
let session = store.getSessionById(sessionDbId);
expect(session?.memory_session_id).toBe(firstMemoryId);
// Second capture with DIFFERENT ID - should FAIL (or be no-op in proper implementation)
// This test documents current behavior - ideally updateMemorySessionId should
// check if memorySessionId already differs from contentSessionId and refuse to update
store.updateMemorySessionId(sessionDbId, secondMemoryId);
session = store.getSessionById(sessionDbId);
// CRITICAL: If this allows the update, we could get multiple memory transcripts!
// This test currently shows the vulnerability - in production, SDKAgent.ts
// has the check `if (!session.memorySessionId)` which should prevent this,
// but the database layer doesn't enforce it.
//
// For now, we document that the second update DOES go through (current behavior)
expect(session?.memory_session_id).toBe(secondMemoryId);
// TODO: Add database-level protection via CHECK constraint or trigger
// to prevent changing memory_session_id once it differs from content_session_id
});
it('should use same memorySessionId for all prompts in a conversation (resume consistency)', () => {
const contentSessionId = 'multi-prompt-session';
const realMemoryId = 'consistent-memory-id';
// Prompt 1: Create session
let sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Prompt 1');
let session = store.getSessionById(sessionDbId);
// Initially NULL
expect(session?.memory_session_id).toBeNull();
// Prompt 1: Capture real memory ID
store.updateMemorySessionId(sessionDbId, realMemoryId);
// Prompt 2: Look up session by contentSessionId (simulates hook creating session again)
sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Prompt 2');
session = store.getSessionById(sessionDbId);
// Should get SAME memory ID (resume with this)
expect(session?.memory_session_id).toBe(realMemoryId);
// Prompt 3: Again, same contentSessionId
sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Prompt 3');
session = store.getSessionById(sessionDbId);
// Should STILL get same memory ID
expect(session?.memory_session_id).toBe(realMemoryId);
// All three prompts use the SAME memorySessionId → ONE memory transcript file
const hasRealMemorySessionId = session?.memory_session_id !== null;
expect(hasRealMemorySessionId).toBe(true);
});
it('should lookup session by contentSessionId and retrieve memorySessionId for resume', () => {
const contentSessionId = 'lookup-test-session';
const capturedMemoryId = 'memory-for-resume';
// First prompt: Create and capture
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'First');
store.updateMemorySessionId(sessionDbId, capturedMemoryId);
// Second prompt: Hook provides contentSessionId, needs to lookup memorySessionId
// The createSDKSession method IS the lookup (INSERT OR IGNORE + SELECT)
const lookedUpSessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Second');
// Should be same DB row
expect(lookedUpSessionDbId).toBe(sessionDbId);
// Get session to extract memorySessionId for resume
const session = store.getSessionById(lookedUpSessionDbId);
const resumeParam = session?.memory_session_id;
// This is what would be passed to SDK query({ resume: resumeParam })
expect(resumeParam).toBe(capturedMemoryId);
expect(resumeParam).not.toBe(contentSessionId);
});
});
describe('Edge Cases - Session ID Equality', () => {
it('should handle case where SDK returns session ID equal to contentSessionId', () => {
// Edge case: SDK happens to generate same ID as content session
// This shouldn't happen in practice, but we test it anyway
const contentSessionId = 'same-id-123';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
// SDK returns the same ID (unlikely but possible)
store.updateMemorySessionId(sessionDbId, contentSessionId);
const session = store.getSessionById(sessionDbId);
// Now checking for non-null instead of comparing to content_session_id
const hasRealMemorySessionId = session?.memory_session_id !== null;
// Would be TRUE since we set a value (even if same as content)
// In practice, the SDK should never return the same ID as contentSessionId
expect(hasRealMemorySessionId).toBe(true);
});
it('should handle NULL memory_session_id gracefully', () => {
const contentSessionId = 'null-test-session';
const sessionDbId = store.createSDKSession(contentSessionId, 'test-project', 'Test');
// memory_session_id is already NULL from createSDKSession
const session = store.getSessionById(sessionDbId);
const hasRealMemorySessionId = session?.memory_session_id !== null;
// Should be false (NULL means not captured yet)
expect(hasRealMemorySessionId).toBe(false);
});
});
});