diff --git a/PHASED-EXECUTION-PLAN-SIMPLE.md b/PHASED-EXECUTION-PLAN-SIMPLE.md deleted file mode 100644 index 4460c666..00000000 --- a/PHASED-EXECUTION-PLAN-SIMPLE.md +++ /dev/null @@ -1,405 +0,0 @@ -# Phased Execution Plan: Remove Session Management Logic - -**Goal**: Delete problematic session management code while keeping schema unchanged - -**Approach**: Stop using certain columns/methods, but don't change the database structure - ---- - -## PHASE 1: Simplify createSDKSession() - -**File**: `src/services/sqlite/SessionStore.ts` - -**Task**: Replace `createSDKSession()` with a pure INSERT OR IGNORE version - -**Current code** (~line 1142-1178): -- Does INSERT OR IGNORE -- Then tries to UPDATE project/user_prompt if they changed -- Complex logic - -**New code**: -```typescript -createSDKSession(claudeSessionId: string, project: string, userPrompt: string): number { - const now = new Date(); - const nowEpoch = now.getTime(); - - // Pure INSERT OR IGNORE - no updates, no complexity - this.db.prepare(` - INSERT OR IGNORE INTO sdk_sessions - (claude_session_id, sdk_session_id, project, user_prompt, started_at, started_at_epoch, status) - VALUES (?, ?, ?, ?, ?, ?, 'active') - `).run(claudeSessionId, claudeSessionId, project, userPrompt, now.toISOString(), nowEpoch); - - // Return existing or new ID - const row = this.db.prepare('SELECT id FROM sdk_sessions WHERE claude_session_id = ?') - .get(claudeSessionId) as { id: number }; - return row.id; -} -``` - -**Verification**: -```bash -grep -A20 "createSDKSession(" src/services/sqlite/SessionStore.ts | head -25 -# Should see simple INSERT OR IGNORE, no UPDATE logic -``` - -**Output**: Simplified createSDKSession method - ---- - -## PHASE 2: Add getPromptNumberFromUserPrompts() Helper - -**File**: `src/services/sqlite/SessionStore.ts` - -**Task**: Add a new helper method to derive prompt number from user_prompts table - -**Add this method**: -```typescript -/** - * Get current prompt number by counting user_prompts for this session - * Replaces the prompt_counter column which is no longer maintained - */ -getPromptNumberFromUserPrompts(claudeSessionId: string): number { - const result = this.db.prepare(` - SELECT COUNT(*) as count FROM user_prompts WHERE claude_session_id = ? - `).get(claudeSessionId) as { count: number }; - return result.count; -} -``` - -**Verification**: -```bash -grep -n "getPromptNumberFromUserPrompts" src/services/sqlite/SessionStore.ts -# Should find the new method -``` - -**Output**: New helper method added - ---- - -## PHASE 3: Delete Unused Session Management Methods - -**File**: `src/services/sqlite/SessionStore.ts` - -**Task**: Delete these 11 methods entirely: - -1. `updateSDKSessionId()` (~line 1185-1205) -2. `findActiveSDKSession()` (~line 1043-1057) -3. `findAnySDKSession()` (~line 1062-1071) -4. `reactivateSession()` (~line 1076-1084) -5. `incrementPromptCounter()` (~line 1089-1103) -6. `getPromptCounter()` (~line 1108-1114) -7. `setWorkerPort()` (~line 1210-1218) -8. `getWorkerPort()` (~line 1223-1233) -9. `markSessionCompleted()` (~line 1419-1430) -10. `markSessionFailed()` (~line 1435-1446) -11. `getSdkSessionsBySessionIds()` (~line 1017-1038) - if unused - -**Verification**: -```bash -# These should return no results after deletion -grep -n "incrementPromptCounter\|getPromptCounter\|setWorkerPort\|getWorkerPort" src/services/sqlite/SessionStore.ts -grep -n "markSessionCompleted\|markSessionFailed\|reactivateSession" src/services/sqlite/SessionStore.ts -grep -n "findActiveSDKSession\|findAnySDKSession\|updateSDKSessionId" src/services/sqlite/SessionStore.ts -``` - -**Output**: 11 methods deleted from SessionStore.ts - ---- - -## PHASE 4: Remove Auto-Create from storeObservation() - -**File**: `src/services/sqlite/SessionStore.ts` - -**Task**: Remove session auto-creation logic from storeObservation() - -**Find** (~line 1291-1312): The block that checks if session exists and creates it if not - -**Delete** this pattern: -```typescript -// DELETE THIS BLOCK: -let sessionId = this.db.prepare(`SELECT id FROM sdk_sessions WHERE sdk_session_id = ?`).get(sdkSessionId); -if (!sessionId) { - // auto-create session... -} -``` - -**Keep**: Just the INSERT INTO observations statement - -The method should assume the session already exists. If it doesn't, the INSERT will fail with a foreign key error - which is correct behavior (means the hook is broken). - -**Verification**: -```bash -grep -B5 -A15 "storeObservation(" src/services/sqlite/SessionStore.ts | grep -i "insert.*sdk_sessions" -# Should return nothing - no session creation in storeObservation -``` - -**Output**: storeObservation() no longer auto-creates sessions - ---- - -## PHASE 5: Remove Auto-Create from storeSummary() - -**File**: `src/services/sqlite/SessionStore.ts` - -**Task**: Remove session auto-creation logic from storeSummary() - -**Find** (~line 1367-1388): Similar pattern to storeObservation - -**Delete** the session existence check and auto-create block - -**Keep**: Just the INSERT INTO session_summaries statement - -**Verification**: -```bash -grep -B5 -A15 "storeSummary(" src/services/sqlite/SessionStore.ts | grep -i "insert.*sdk_sessions" -# Should return nothing - no session creation in storeSummary -``` - -**Output**: storeSummary() no longer auto-creates sessions - ---- - -## PHASE 6: Update SessionRoutes - Replace getPromptCounter - -**File**: `src/services/worker/http/routes/SessionRoutes.ts` - -**Task**: Replace calls to `getPromptCounter()` with `getPromptNumberFromUserPrompts()` - -**Find**: All calls to `store.getPromptCounter(sessionDbId)` - -**Replace with**: `store.getPromptNumberFromUserPrompts(claudeSessionId)` - -Note: The new method takes `claudeSessionId` (string), not `sessionDbId` (number) - -**Verification**: -```bash -grep -n "getPromptCounter" src/services/worker/http/routes/SessionRoutes.ts -# Should return nothing - -grep -n "getPromptNumberFromUserPrompts" src/services/worker/http/routes/SessionRoutes.ts -# Should find the new calls -``` - -**Output**: SessionRoutes uses new prompt counting method - ---- - -## PHASE 7: Update SessionManager - Replace getPromptCounter - -**File**: `src/services/worker/SessionManager.ts` - -**Task**: Replace call to `getPromptCounter()` in `initializeSession()` - -**Find** (~line 116): `this.dbManager.getSessionStore().getPromptCounter(sessionDbId)` - -**Replace with**: -```typescript -this.dbManager.getSessionStore().getPromptNumberFromUserPrompts(dbSession.claude_session_id) -``` - -**Verification**: -```bash -grep -n "getPromptCounter" src/services/worker/SessionManager.ts -# Should return nothing -``` - -**Output**: SessionManager uses new prompt counting method - ---- - -## PHASE 8: Update SessionCompletionHandler - -**File**: `src/services/worker/session/SessionCompletionHandler.ts` - -**Task**: Remove call to `findActiveSDKSession()` and simplify - -**Find**: Call to `store.findActiveSDKSession(claudeSessionId)` - -**Replace**: Use a simpler query or remove the lookup if not needed - -If the method just needs to find the session to delete it, we can query sdk_sessions directly: -```typescript -const session = store.db.prepare( - 'SELECT id FROM sdk_sessions WHERE claude_session_id = ?' -).get(claudeSessionId); -``` - -**Verification**: -```bash -grep -n "findActiveSDKSession" src/services/worker/session/SessionCompletionHandler.ts -# Should return nothing -``` - -**Output**: SessionCompletionHandler no longer uses deleted method - ---- - -## PHASE 9: Update SDKAgent - Remove markSessionCompleted - -**File**: `src/services/worker/SDKAgent.ts` - -**Task**: Remove call to `markSessionCompleted()` - -**Find** (~line 148): `this.dbManager.getSessionStore().markSessionCompleted(session.sessionDbId)` - -**Action**: Delete this line entirely. We no longer track session status. - -**Verification**: -```bash -grep -n "markSessionCompleted\|markSessionFailed" src/services/worker/SDKAgent.ts -# Should return nothing -``` - -**Output**: SDKAgent no longer marks sessions as completed - ---- - -## PHASE 10: Update DatabaseManager - Remove markSessionComplete - -**File**: `src/services/worker/DatabaseManager.ts` - -**Task**: Remove the `markSessionComplete()` method - -**Find** (~line 116-118): The `markSessionComplete` method - -**Action**: Delete the entire method - -**Verification**: -```bash -grep -n "markSessionComplete" src/services/worker/DatabaseManager.ts -# Should return nothing -``` - -**Output**: DatabaseManager no longer has session completion method - ---- - -## PHASE 11: Search for Any Remaining References - -**Task**: Find and fix any remaining references to deleted methods - -**Search**: -```bash -# Search for all deleted method names -grep -rn "incrementPromptCounter\|getPromptCounter\|setWorkerPort\|getWorkerPort" src/ --include="*.ts" -grep -rn "markSessionCompleted\|markSessionFailed\|reactivateSession" src/ --include="*.ts" -grep -rn "findActiveSDKSession\|findAnySDKSession\|updateSDKSessionId" src/ --include="*.ts" -``` - -**Action**: Fix any remaining references found - -**Output**: No references to deleted methods remain - ---- - -## PHASE 12: Build and Test - -**Task**: Verify everything compiles and works - -**Commands**: -```bash -# Build -npm run build - -# If build fails, fix TypeScript errors and rebuild -``` - -**Verification**: -- Build completes without errors -- No TypeScript errors about missing methods - -**Output**: Clean build - ---- - -## PHASE 13: Integration Test - -**Task**: Test the complete flow - -**Test steps**: -1. Start a new Claude Code session -2. Submit a prompt (triggers new-hook → createSDKSession) -3. Use some tools (triggers save-hook → storeObservation) -4. End session (triggers summary-hook → storeSummary) - -**Verify in database**: -```bash -sqlite3 ~/.claude-mem/claude-mem.db "SELECT id, claude_session_id, project, status FROM sdk_sessions ORDER BY id DESC LIMIT 3;" -sqlite3 ~/.claude-mem/claude-mem.db "SELECT id, sdk_session_id, type, title FROM observations ORDER BY id DESC LIMIT 5;" -``` - -**Expected**: -- Sessions created with status='active' (never updated, that's fine) -- Observations saved correctly -- No errors in worker logs - -**Output**: System works end-to-end - ---- - -## PHASE 14: Cleanup Plan Files - -**Task**: Delete the planning files - -```bash -rm PLAN-REMOVE-SESSION-MANAGEMENT.md -rm PHASED-EXECUTION-PLAN.md -rm PHASED-EXECUTION-PLAN-SIMPLE.md -``` - -**Commit**: -```bash -git add -A -git commit -m "refactor: remove session management complexity - -- Simplify createSDKSession to pure INSERT OR IGNORE -- Remove auto-create logic from storeObservation/storeSummary -- Delete 11 unused session management methods -- Derive prompt_number from user_prompts count -- Keep sdk_sessions table schema unchanged for compatibility" -``` - -**Output**: Clean commit with simplified session handling - ---- - -## SUCCESS CRITERIA - -After all phases complete, verify: - -- [ ] `createSDKSession()` is pure INSERT OR IGNORE (no updates) -- [ ] `storeObservation()` has no session auto-create logic -- [ ] `storeSummary()` has no session auto-create logic -- [ ] 11 session management methods deleted -- [ ] `getPromptNumberFromUserPrompts()` exists and is used -- [ ] Build completes without errors -- [ ] Normal flow works (prompt → tools → observations → summary) -- [ ] No references to deleted methods in codebase -- [ ] Database schema unchanged (no migration needed) - ---- - -## WHAT WE STOPPED USING (BUT DIDN'T DELETE FROM SCHEMA) - -These columns in `sdk_sessions` are now dead: - -| Column | Previously | Now | -|--------|-----------|-----| -| `status` | Updated to 'completed'/'failed' | Always 'active', never updated | -| `completed_at` | Set on session end | Never set | -| `completed_at_epoch` | Set on session end | Never set | -| `worker_port` | Tracked which worker | Never set | -| `prompt_counter` | Incremented per prompt | Ignored, derived from user_prompts | - -These columns still work fine but we just don't write to them anymore. Existing data is preserved. - ---- - -## PHILOSOPHY - -**Before**: Complex session lifecycle management with status tracking, port assignment, prompt counting, auto-creation fallbacks - -**After**: Simple lookup table. INSERT OR IGNORE on first prompt. That's it. - -If data is missing, it's a bug in the hook. Fail loudly, don't paper over it. diff --git a/TIMESTAMP-FIX-VALIDATION.md b/TIMESTAMP-FIX-VALIDATION.md deleted file mode 100644 index 0ed313ea..00000000 --- a/TIMESTAMP-FIX-VALIDATION.md +++ /dev/null @@ -1,145 +0,0 @@ -# Timestamp Fix Validation Report - -**Date**: Dec 24, 2025 -**Status**: ✅ VALIDATED - Logic is correct and working - -## Summary - -The backlog timestamp fix has been validated. All 171 corrupted observations have been repaired, and the code logic for preventing future corruption is correct. - -## What Was Fixed - -- **Total corrupted observations**: 171 -- **Date range**: Oct 26 - Dec 24, 2025 -- **Root cause**: Observations from old sessions being processed late got current timestamps instead of original timestamps -- **Fix applied**: Restored all observations to their correct original timestamps - -## Validation Results - -### 1. Database Integrity ✅ - -``` -Corrupted observations remaining: 0 -Pending messages with issues: 0 -``` - -### 2. Code Logic ✅ - -The timestamp override logic flows correctly: - -1. **SessionManager.yieldNextMessage()** (src/services/worker/SessionManager.ts:451-454) - - Tracks `earliestPendingTimestamp` when yielding messages - - Uses `Math.min()` to keep the earliest timestamp across batches - -2. **SDKAgent.createMessageGenerator()** (src/services/worker/SDKAgent.ts:209) - - Calls `getMessageIterator()` which yields messages - - earliestPendingTimestamp is set BEFORE messages are sent to Claude - -3. **SDKAgent response handling** (src/services/worker/SDKAgent.ts:119) - - Captures `originalTimestamp = session.earliestPendingTimestamp` - - This happens when Claude RESPONDS, after messages were already yielded - -4. **SDKAgent.processSDKResponse()** (src/services/worker/SDKAgent.ts:272, 350) - - Passes `originalTimestamp ?? undefined` to storage methods - - Both observations and summaries use this timestamp - -5. **SessionStore.storeObservation/storeSummary()** (src/services/sqlite/SessionStore.ts:1157, 1210) - - Uses `overrideTimestampEpoch ?? Date.now()` - - If override provided (from backlog), uses that - - Otherwise uses current time (for new messages) - -6. **SDKAgent.markMessagesProcessed()** (src/services/worker/SDKAgent.ts:430) - - Resets `earliestPendingTimestamp = null` after batch completes - - Ready for next batch with fresh timestamp tracking - -### 3. Sequence Validation ✅ - -**Correct sequence for backlog messages:** - -``` -Time Event earliestPendingTimestamp ------- ---------------------------------------------- ------------------------ -T1 yieldNextMessage() called → Set to msg.created_at_epoch -T2 Messages sent to Claude SDK → Still set -T3 Claude responds → Still set -T4 Capture originalTimestamp → Captured (equals T1 timestamp) -T5 Create observations with originalTimestamp → Uses T1 timestamp ✅ -T6 Mark messages processed → Reset to null -``` - -**Correct sequence for new messages:** - -``` -Time Event earliestPendingTimestamp ------- ---------------------------------------------- ------------------------ -T1 yieldNextMessage() called (recent message) → Set to msg.created_at_epoch (recent) -T2 Messages sent to Claude SDK → Still set -T3 Claude responds → Still set -T4 Capture originalTimestamp → Captured (equals T1 timestamp) -T5 Create observations with originalTimestamp → Uses T1 timestamp ✅ -T6 Mark messages processed → Reset to null -``` - -In both cases, observations get the timestamp from when the message was originally created, not when the observation was saved. - -## Current State - -### Pending Messages -- **6 pending messages** (all from Dec 24, 2025) -- **0 stuck messages** (status='processing') -- All pending messages would be processed with correct timestamps if orphan processing enabled - -### Orphan Processing -- Currently **DISABLED** in `src/services/worker-service.ts:479` -- **Safe to re-enable** - timestamp fix is working correctly -- No risk of future timestamp corruption - -## Scripts Created - -1. **`scripts/fix-all-timestamps.ts`** - Comprehensive fix for ALL corrupted timestamps - ```bash - bun scripts/fix-all-timestamps.ts --dry-run # Preview - bun scripts/fix-all-timestamps.ts --yes # Apply - ``` - -2. **`scripts/validate-timestamp-logic.ts`** - Validate the backlog timestamp logic - ```bash - bun scripts/validate-timestamp-logic.ts - ``` - -3. **`scripts/verify-timestamp-fix.ts`** - Verify specific time window - ```bash - bun scripts/verify-timestamp-fix.ts - ``` - -## Recommendation - -✅ **Safe to re-enable orphan processing** - -The timestamp fix is working correctly. To re-enable: - -```typescript -// src/services/worker-service.ts:479 -// Change from: -// this.processOrphanedQueues(pendingStore).catch((err: Error) => { - -// To: -this.processOrphanedQueues(pendingStore).catch((err: Error) => { - logger.warn('SYSTEM', 'Orphan queue processing failed', {}, err); -}); -``` - -## Files Changed in Fix - -- `src/services/sqlite/SessionStore.ts` - Added `overrideTimestampEpoch` parameter -- `src/services/worker-types.ts` - Added `earliestPendingTimestamp` to ActiveSession -- `src/services/worker/SessionManager.ts` - Tracks earliest timestamp when yielding -- `src/services/worker/SDKAgent.ts` - Passes timestamp through to storage -- `src/services/worker-service.ts` - Orphan processing (currently disabled) - -## Conclusion - -✅ All corrupted timestamps fixed (171 observations) -✅ Code logic validated and working correctly -✅ No remaining timestamp issues in database -✅ Safe to re-enable orphan processing diff --git a/URGENT-FIX-TIMESTAMPS.md b/URGENT-FIX-TIMESTAMPS.md deleted file mode 100644 index 60ff648a..00000000 --- a/URGENT-FIX-TIMESTAMPS.md +++ /dev/null @@ -1,117 +0,0 @@ -# ✅ FIXED: Corrupted Observation Timestamps - -## What Happened - -On Dec 24, 2025, orphan queue processing was enabled without a working timestamp fix. This caused 25 observations from old sessions (Dec 16-22) to be saved with Dec 24 timestamps instead of their original timestamps. - -## The Fix - COMPLETED - -**Status**: ✅ All corrupted timestamps have been repaired - -**Date Fixed**: Dec 24, 2025 at 20:55 PM PST - -**Results**: -- Fixed 25 observations with corrupted timestamps -- Observations restored to correct dates (Dec 16-22) -- Remaining 31 observations in the window are legitimate (from Dec 24 sessions) - -## The Original Damage - -Observations created between approximately 19:45-20:31 on Dec 24 had wrong timestamps. These observations came from processing old pending_messages that were stuck in "processing" status. - -## How to Identify Affected Observations - -The affected observations were created from pending_messages that had `created_at_epoch` from Dec 17-20, but the observations got Dec 24 timestamps. - -```sql --- Find observations created during the bad window (Dec 24 ~19:45-20:31) -SELECT id, sdk_session_id, - datetime(created_at_epoch/1000, 'unixepoch', 'localtime') as obs_time, - title -FROM observations -WHERE created_at_epoch >= 1735098300000 -- Dec 24 19:45 - AND created_at_epoch <= 1735101060000 -- Dec 24 20:31 -ORDER BY id; -``` - -## How to Fix - -### Option 1: Match via processed pending_messages - -The pending_messages table has `completed_at_epoch` which shows when they were processed, and `created_at_epoch` which has the CORRECT original timestamp. - -```sql --- Find pending_messages processed during the bad window -SELECT id, session_db_id, tool_name, - datetime(created_at_epoch/1000, 'unixepoch', 'localtime') as original_time, - datetime(completed_at_epoch/1000, 'unixepoch', 'localtime') as processed_time -FROM pending_messages -WHERE status = 'processed' - AND completed_at_epoch >= 1735098300000 - AND completed_at_epoch <= 1735101060000 -ORDER BY completed_at_epoch; -``` - -### Option 2: Match by session and sequence - -For each affected session, observations should be ordered by their original pending_message timestamps: - -```sql --- Get the correct timestamp for each observation by matching session --- This requires correlating sdk_session_id with session_db_id -``` - -### Fix Script Approach - -1. Get all pending_messages processed during the bad window -2. For each, get its `session_db_id` and `created_at_epoch` (original timestamp) -3. Find observations for that session created during the bad window -4. Update observations with the correct `created_at_epoch` from their source pending_message - -```sql --- Example fix for a single observation (after identifying the correct timestamp): -UPDATE observations -SET created_at_epoch = [correct_epoch], - created_at = datetime([correct_epoch]/1000, 'unixepoch') -WHERE id = [observation_id]; -``` - -## The Timestamp Fix (Already Deployed) - -The code now correctly passes `overrideTimestampEpoch` through: -- `SessionStore.storeObservation()` - accepts optional timestamp override -- `SessionStore.storeSummary()` - accepts optional timestamp override -- `ActiveSession.earliestPendingTimestamp` - tracks original message timestamp -- `SDKAgent.processSDKResponse()` - passes timestamp to storage - -## Scripts Created - -Three scripts were created to handle this issue: - -1. **`scripts/fix-corrupted-timestamps.ts`** - Identifies and repairs corrupted timestamps - ```bash - bun scripts/fix-corrupted-timestamps.ts --dry-run # Preview fixes - bun scripts/fix-corrupted-timestamps.ts --yes # Apply fixes - ``` - -2. **`scripts/verify-timestamp-fix.ts`** - Verifies the fix was successful - ```bash - bun scripts/verify-timestamp-fix.ts - ``` - -3. **`scripts/investigate-timestamps.ts`** - Investigates timestamp data - ```bash - bun scripts/investigate-timestamps.ts - ``` - -## Orphan Processing Status - -The timestamp fix has been verified and is working correctly. Orphan processing can now be safely re-enabled in `src/services/worker-service.ts` if needed. - -## Files Changed - -- `src/services/sqlite/SessionStore.ts` - added overrideTimestampEpoch param -- `src/services/worker-types.ts` - added earliestPendingTimestamp to ActiveSession -- `src/services/worker/SessionManager.ts` - tracks earliest timestamp when yielding -- `src/services/worker/SDKAgent.ts` - passes timestamp through processSDKResponse -- `src/services/worker-service.ts` - orphan processing disabled