feat: Fix observation timestamps, refactor session management, and enhance worker reliability (#437)

* Refactor worker version checks and increase timeout settings

- Updated the default hook timeout from 5000ms to 120000ms for improved stability.
- Modified the worker version check to log a warning instead of restarting the worker on version mismatch.
- Removed legacy PM2 cleanup and worker start logic, simplifying the ensureWorkerRunning function.
- Enhanced polling mechanism for worker readiness with increased retries and reduced interval.

* feat: implement worker queue polling to ensure processing completion before proceeding

* refactor: change worker command from start to restart in hooks configuration

* refactor: remove session management complexity

- Simplify createSDKSession to pure INSERT OR IGNORE
- Remove auto-create logic from storeObservation/storeSummary
- Delete 11 unused session management methods
- Derive prompt_number from user_prompts count
- Keep sdk_sessions table schema unchanged for compatibility

* refactor: simplify session management by removing unused methods and auto-creation logic

* Refactor session prompt number retrieval in SessionRoutes

- Updated the method of obtaining the prompt number from the session.
- Replaced `store.getPromptCounter(sessionDbId)` with `store.getPromptNumberFromUserPrompts(claudeSessionId)` for better clarity and accuracy.
- Adjusted the logic for incrementing the prompt number to derive it from the user prompts count instead of directly incrementing a counter.

* refactor: replace getPromptCounter with getPromptNumberFromUserPrompts in SessionManager

Phase 7 of session management simplification. Updates SessionManager to derive
prompt numbers from user_prompts table count instead of using the deprecated
prompt_counter column.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: simplify SessionCompletionHandler to use direct SQL query

Phase 8: Remove call to findActiveSDKSession() and replace with direct
database query in SessionCompletionHandler.completeByClaudeId().

This removes dependency on the deleted findActiveSDKSession() method
and simplifies the code by using a straightforward SELECT query.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: remove markSessionCompleted call from SDKAgent

- Delete call to markSessionCompleted() in SDKAgent.ts
- Session status is no longer tracked or updated
- Part of phase 9: simplifying session management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: remove markSessionComplete method (Phase 10)

- Deleted markSessionComplete() method from DatabaseManager
- Removed markSessionComplete call from SessionCompletionHandler
- Session completion status no longer tracked in database
- Part of session management simplification effort

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: replace deleted updateSDKSessionId calls in import script (Phase 11)

- Replace updateSDKSessionId() calls with direct SQL UPDATE statements
- Method was deleted in Phase 3 as part of session management simplification
- Import script now uses direct database access consistently

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* test: add validation for SQL updates in sdk_sessions table

* refactor: enhance worker-cli to support manual and automated runs

* Remove cleanup hook and associated session completion logic

- Deleted the cleanup-hook implementation from the hooks directory.
- Removed the session completion endpoint that was used by the cleanup hook.
- Updated the SessionCompletionHandler to eliminate the completeByClaudeId method and its dependencies.
- Adjusted the SessionRoutes to reflect the removal of the session completion route.

* fix: update worker-cli command to use bun for consistency

* feat: Implement timestamp fix for observations and enhance processing logic

- Added `earliestPendingTimestamp` to `ActiveSession` to track the original timestamp of the earliest pending message.
- Updated `SDKAgent` to capture and utilize the earliest pending timestamp during response processing.
- Modified `SessionManager` to track the earliest timestamp when yielding messages.
- Created scripts for fixing corrupted timestamps, validating fixes, and investigating timestamp issues.
- Verified that all corrupted observations have been repaired and logic for future processing is sound.
- Ensured orphan processing can be safely re-enabled after validation.

* feat: Enhance SessionStore to support custom database paths and add timestamp fields for observations and summaries

* Refactor pending queue processing and add management endpoints

- Disabled automatic recovery of orphaned queues on startup; users must now use the new /api/pending-queue/process endpoint.
- Updated processOrphanedQueues method to processPendingQueues with improved session handling and return detailed results.
- Added new API endpoints for managing pending queues: GET /api/pending-queue and POST /api/pending-queue/process.
- Introduced a new script (check-pending-queue.ts) for checking and processing pending observation queues interactively or automatically.
- Enhanced logging and error handling for better monitoring of session processing.

* updated agent sdk

* feat: Add manual recovery guide and queue management endpoints to documentation

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
Alex Newman
2025-12-25 15:36:46 -05:00
committed by GitHub
parent 4cf2c1bdb1
commit 266c746d50
47 changed files with 3417 additions and 1026 deletions
+131 -27
View File
@@ -371,45 +371,149 @@ npm test
## Testing
### Running Tests
### Testing Philosophy
Claude-mem relies on **real-world usage and manual testing** rather than traditional unit tests. The project philosophy prioritizes:
1. **Manual verification** - Testing features in actual Claude Code sessions
2. **Integration testing** - Running the full system end-to-end
3. **Database inspection** - Verifying data correctness via SQLite queries
4. **CLI tools** - Interactive tools for checking system state
5. **Observability** - Comprehensive logging and worker health checks
This approach was chosen because:
- Hook behavior depends heavily on Claude Code's runtime environment
- SDK interactions require real API calls and responses
- SQLite and Bun runtime provide stability guarantees
- Manual testing catches integration issues that unit tests miss
### Manual Testing Workflow
When developing new features:
1. **Build and sync**:
```bash
npm run build
npm run sync-marketplace
claude-mem restart
```
2. **Test in real session**:
- Start Claude Code
- Trigger the feature you're testing
- Verify expected behavior
3. **Check database state**:
```bash
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM your_table;"
```
4. **Monitor worker logs**:
```bash
npm run worker:logs
```
5. **Verify queue health** (for recovery features):
```bash
bun scripts/check-pending-queue.ts
```
### Testing Tools
**Health Checks**:
```bash
# All tests
npm test
# Worker status
npm run worker:status
# Specific test file
node --test tests/your-test.test.ts
# Queue inspection
curl http://localhost:37777/api/pending-queue
# With coverage (if configured)
npm test -- --coverage
# Database integrity
sqlite3 ~/.claude-mem/claude-mem.db "PRAGMA integrity_check;"
```
### Writing Tests
**Hook Testing**:
```bash
# Test context hook manually
echo '{"session_id":"test-123","cwd":"'$(pwd)'","source":"startup"}' | node plugin/scripts/context-hook.js
Create test files in `tests/`:
```typescript
import { describe, it } from 'node:test';
import assert from 'node:assert';
describe('YourFeature', () => {
it('should do something', () => {
// Test implementation
assert.strictEqual(result, expected);
});
});
# Test new hook
echo '{"session_id":"test-123","cwd":"'$(pwd)'","prompt":"test"}' | node plugin/scripts/new-hook.js
```
### Test Database
**Data Verification**:
```bash
# Check recent observations
sqlite3 ~/.claude-mem/claude-mem.db "
SELECT id, tool_name, created_at
FROM observations
ORDER BY created_at_epoch DESC
LIMIT 10;
"
Use a separate test database:
```typescript
import { SessionStore } from '../src/services/sqlite/SessionStore';
const store = new SessionStore(':memory:'); // In-memory database
# Check summaries
sqlite3 ~/.claude-mem/claude-mem.db "
SELECT id, request, completed
FROM session_summaries
ORDER BY created_at_epoch DESC
LIMIT 5;
"
```
### Recovery Feature Testing
For manual recovery features specifically:
1. **Simulate stuck messages**:
```bash
# Manually create stuck message (for testing only)
sqlite3 ~/.claude-mem/claude-mem.db "
UPDATE pending_messages
SET status = 'processing',
started_processing_at_epoch = strftime('%s', 'now', '-10 minutes') * 1000
WHERE id = 123;
"
```
2. **Test recovery**:
```bash
bun scripts/check-pending-queue.ts
```
3. **Verify results**:
```bash
curl http://localhost:37777/api/pending-queue | jq '.queue'
```
### Regression Testing
Before releasing:
1. **Test all hook triggers**:
- SessionStart: Start new Claude Code session
- UserPromptSubmit: Submit a prompt
- PostToolUse: Use a tool like Read
- Summary: Let session complete
- SessionEnd: Close Claude Code
2. **Test core features**:
- Context injection (recent sessions appear)
- Observation processing (summaries generated)
- MCP search tools (search returns results)
- Viewer UI (loads at http://localhost:37777)
- Manual recovery (stuck messages recovered)
3. **Test edge cases**:
- Worker crash recovery
- Database locks
- Port conflicts
- Large databases
4. **Cross-platform** (if applicable):
- macOS
- Linux
- Windows
## Code Style
### TypeScript Guidelines