claude-mem/TIMESTAMP-FIX-VALIDATION.md
Alex Newman 266c746d50 feat: Fix observation timestamps, refactor session management, and enhance worker reliability (#437)
* Refactor worker version checks and increase timeout settings

- Updated the default hook timeout from 5000ms to 120000ms for improved stability.
- Modified the worker version check to log a warning instead of restarting the worker on version mismatch.
- Removed legacy PM2 cleanup and worker start logic, simplifying the ensureWorkerRunning function.
- Enhanced polling mechanism for worker readiness with increased retries and reduced interval.

* feat: implement worker queue polling to ensure processing completion before proceeding

* refactor: change worker command from start to restart in hooks configuration

* refactor: remove session management complexity

- Simplify createSDKSession to pure INSERT OR IGNORE
- Remove auto-create logic from storeObservation/storeSummary
- Delete 11 unused session management methods
- Derive prompt_number from user_prompts count
- Keep sdk_sessions table schema unchanged for compatibility

* refactor: simplify session management by removing unused methods and auto-creation logic

* Refactor session prompt number retrieval in SessionRoutes

- Updated the method of obtaining the prompt number from the session.
- Replaced `store.getPromptCounter(sessionDbId)` with `store.getPromptNumberFromUserPrompts(claudeSessionId)` for better clarity and accuracy.
- Adjusted the logic for incrementing the prompt number to derive it from the user prompts count instead of directly incrementing a counter.

* refactor: replace getPromptCounter with getPromptNumberFromUserPrompts in SessionManager

Phase 7 of session management simplification. Updates SessionManager to derive
prompt numbers from user_prompts table count instead of using the deprecated
prompt_counter column.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: simplify SessionCompletionHandler to use direct SQL query

Phase 8: Remove call to findActiveSDKSession() and replace with direct
database query in SessionCompletionHandler.completeByClaudeId().

This removes dependency on the deleted findActiveSDKSession() method
and simplifies the code by using a straightforward SELECT query.

* refactor: remove markSessionCompleted call from SDKAgent

- Delete call to markSessionCompleted() in SDKAgent.ts
- Session status is no longer tracked or updated
- Part of phase 9: simplifying session management

* refactor: remove markSessionComplete method (Phase 10)

- Deleted markSessionComplete() method from DatabaseManager
- Removed markSessionComplete call from SessionCompletionHandler
- Session completion status no longer tracked in database
- Part of session management simplification effort

* refactor: replace deleted updateSDKSessionId calls in import script (Phase 11)

- Replace updateSDKSessionId() calls with direct SQL UPDATE statements
- Method was deleted in Phase 3 as part of session management simplification
- Import script now uses direct database access consistently

* test: add validation for SQL updates in sdk_sessions table

* refactor: enhance worker-cli to support manual and automated runs

* Remove cleanup hook and associated session completion logic

- Deleted the cleanup-hook implementation from the hooks directory.
- Removed the session completion endpoint that was used by the cleanup hook.
- Updated the SessionCompletionHandler to eliminate the completeByClaudeId method and its dependencies.
- Adjusted the SessionRoutes to reflect the removal of the session completion route.

* fix: update worker-cli command to use bun for consistency

* feat: Implement timestamp fix for observations and enhance processing logic

- Added `earliestPendingTimestamp` to `ActiveSession` to track the original timestamp of the earliest pending message.
- Updated `SDKAgent` to capture and utilize the earliest pending timestamp during response processing.
- Modified `SessionManager` to track the earliest timestamp when yielding messages.
- Created scripts for fixing corrupted timestamps, validating fixes, and investigating timestamp issues.
- Verified that all corrupted observations have been repaired and logic for future processing is sound.
- Ensured orphan processing can be safely re-enabled after validation.

* feat: Enhance SessionStore to support custom database paths and add timestamp fields for observations and summaries

* Refactor pending queue processing and add management endpoints

- Disabled automatic recovery of orphaned queues on startup; users must now use the new /api/pending-queue/process endpoint.
- Updated processOrphanedQueues method to processPendingQueues with improved session handling and return detailed results.
- Added new API endpoints for managing pending queues: GET /api/pending-queue and POST /api/pending-queue/process.
- Introduced a new script (check-pending-queue.ts) for checking and processing pending observation queues interactively or automatically.
- Enhanced logging and error handling for better monitoring of session processing.

* updated agent sdk

* feat: Add manual recovery guide and queue management endpoints to documentation

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-25 15:36:46 -05:00

Timestamp Fix Validation Report

Date: Dec 24, 2025
Status: VALIDATED - Logic is correct and working

Summary

The backlog timestamp fix has been validated. All 171 corrupted observations have been repaired, and the code logic for preventing future corruption is correct.

What Was Fixed

  • Total corrupted observations: 171
  • Date range: Oct 26 - Dec 24, 2025
  • Root cause: Observations from old sessions that were processed late were stamped with the time of processing instead of the original message timestamp
  • Fix applied: Restored all observations to their correct original timestamps
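
The root cause can be illustrated with a minimal sketch. The type and function names below (`PendingMessage`, `storeWithProcessingTime`, `storeWithOriginalTime`) are hypothetical and are not taken from the codebase, and the two dates are only the endpoints of the range quoted above, used here as sample values. The sketch shows how stamping an observation at processing time drifts away from the message's original timestamp when a backlog is drained late.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
interface PendingMessage {
  id: number;
  createdAtEpoch: number; // when the message was originally queued (ms since epoch)
}

interface Observation {
  messageId: number;
  createdAtEpoch: number;
}

// Buggy variant: always stamps the observation with the processing time.
function storeWithProcessingTime(msg: PendingMessage, now: number): Observation {
  return { messageId: msg.id, createdAtEpoch: now };
}

// Intended variant: carries the message's original timestamp through to storage.
function storeWithOriginalTime(msg: PendingMessage): Observation {
  return { messageId: msg.id, createdAtEpoch: msg.createdAtEpoch };
}

// Sample values: a message queued on Oct 26 and processed on Dec 24 (months are zero-based).
const queuedAt = Date.UTC(2025, 9, 26);
const processedAt = Date.UTC(2025, 11, 24);
const backlogMessage: PendingMessage = { id: 1, createdAtEpoch: queuedAt };

const corrupted = storeWithProcessingTime(backlogMessage, processedAt);
const correct = storeWithOriginalTime(backlogMessage);

// The difference between the two stamps is the skew the fix removes.
const dayMs = 24 * 60 * 60 * 1000;
const skewDays = (corrupted.createdAtEpoch - correct.createdAtEpoch) / dayMs;
console.log(`timestamp skew: ${skewDays} days`);
```

In this sketch the corrupted observation appears to belong to the day it was processed, which is why late-processed backlog items clustered on the wrong dates.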

Validation Results

1. Database Integrity

Corrupted observations remaining: 0
Pending messages with issues: 0

2. Code Logic

The timestamp override logic flows correctly:

  1. SessionManager.yieldNextMessage() (src/services/worker/SessionManager.ts:451-454)

    • Tracks earliestPendingTimestamp when yielding messages
    • Uses Math.min() to keep the earliest timestamp across batches
  2. SDKAgent.createMessageGenerator() (src/services/worker/SDKAgent.ts:209)

    • Calls getMessageIterator() which yields messages
    • earliestPendingTimestamp is set BEFORE messages are sent to Claude
  3. SDKAgent response handling (src/services/worker/SDKAgent.ts:119)

    • Captures originalTimestamp = session.earliestPendingTimestamp
    • This happens when Claude RESPONDS, after messages were already yielded
  4. SDKAgent.processSDKResponse() (src/services/worker/SDKAgent.ts:272, 350)

    • Passes originalTimestamp ?? undefined to storage methods
    • Both observations and summaries use this timestamp
  5. SessionStore.storeObservation/storeSummary() (src/services/sqlite/SessionStore.ts:1157, 1210)

    • Uses overrideTimestampEpoch ?? Date.now()
    • If override provided (from backlog), uses that
    • Otherwise uses current time (for new messages)
  6. SDKAgent.markMessagesProcessed() (src/services/worker/SDKAgent.ts:430)

    • Resets earliestPendingTimestamp = null after batch completes
    • Ready for next batch with fresh timestamp tracking

3. Sequence Validation

Correct sequence for backlog messages:

Time    Event                                           earliestPendingTimestamp
------  ----------------------------------------------  ------------------------
T1      yieldNextMessage() called                       → Set to msg.created_at_epoch
T2      Messages sent to Claude SDK                     → Still set
T3      Claude responds                                 → Still set
T4      Capture originalTimestamp                       → Captured (equals T1 timestamp)
T5      Create observations with originalTimestamp      → Uses T1 timestamp ✅
T6      Mark messages processed                         → Reset to null

Correct sequence for new messages:

Time    Event                                           earliestPendingTimestamp
------  ----------------------------------------------  ------------------------
T1      yieldNextMessage() called (recent message)      → Set to msg.created_at_epoch (recent)
T2      Messages sent to Claude SDK                     → Still set
T3      Claude responds                                 → Still set
T4      Capture originalTimestamp                       → Captured (equals T1 timestamp)
T5      Create observations with originalTimestamp      → Uses T1 timestamp ✅
T6      Mark messages processed                         → Reset to null

In both cases, observations get the timestamp from when the message was originally created, not when the observation was saved.

Current State

Pending Messages

  • 6 pending messages (all from Dec 24, 2025)
  • 0 stuck messages (status='processing')
  • All pending messages would be processed with correct timestamps if orphan processing were enabled

Orphan Processing

  • Currently DISABLED in src/services/worker-service.ts:479
  • Safe to re-enable - timestamp fix is working correctly
  • No risk of future timestamp corruption

Scripts Created

  1. scripts/fix-all-timestamps.ts - Comprehensive fix for ALL corrupted timestamps

    bun scripts/fix-all-timestamps.ts --dry-run  # Preview
    bun scripts/fix-all-timestamps.ts --yes      # Apply
    
  2. scripts/validate-timestamp-logic.ts - Validate the backlog timestamp logic

    bun scripts/validate-timestamp-logic.ts
    
  3. scripts/verify-timestamp-fix.ts - Verify specific time window

    bun scripts/verify-timestamp-fix.ts
    

Recommendation

Safe to re-enable orphan processing

The timestamp fix is working correctly. To re-enable:

// src/services/worker-service.ts:479
// Change from:
// this.processOrphanedQueues(pendingStore).catch((err: Error) => {

// To:
this.processOrphanedQueues(pendingStore).catch((err: Error) => {
  logger.warn('SYSTEM', 'Orphan queue processing failed', {}, err);
});

Files Changed in Fix

  • src/services/sqlite/SessionStore.ts - Added overrideTimestampEpoch parameter
  • src/services/worker-types.ts - Added earliestPendingTimestamp to ActiveSession
  • src/services/worker/SessionManager.ts - Tracks earliest timestamp when yielding
  • src/services/worker/SDKAgent.ts - Passes timestamp through to storage
  • src/services/worker-service.ts - Orphan processing (currently disabled)

Conclusion

  • All corrupted timestamps fixed (171 observations)
  • Code logic validated and working correctly
  • No remaining timestamp issues in database
  • Safe to re-enable orphan processing