97d565e3cd
* feat: add mem-search skill with progressive disclosure architecture Add comprehensive mem-search skill for accessing claude-mem's persistent cross-session memory database. Implements progressive disclosure workflow and token-efficient search patterns. Features: - 12 search operations (observations, sessions, prompts, by-type, by-concept, by-file, timelines, etc.) - Progressive disclosure principles to minimize token usage - Anti-patterns documentation to guide LLM behavior - HTTP API integration for all search functionality - Common workflows with composition examples Structure: - SKILL.md: Entry point with temporal trigger patterns - principles/: Progressive disclosure + anti-patterns - operations/: 12 search operation files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * docs: add CHANGELOG entry for mem-search skill Document mem-search skill addition in Unreleased section with: - 100% effectiveness compliance metrics - Comparison to previous search skill implementation - Progressive disclosure architecture details - Reference to audit report documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * docs: add mem-search skill audit report Add comprehensive audit report validating mem-search skill against Anthropic's official skill-creator documentation. Report includes: - Effectiveness metrics comparison (search vs mem-search) - Critical issues analysis for production readiness - Compliance validation across 6 key dimensions - Reference implementation guidance Result: mem-search achieves 100% compliance vs search's 67% 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * feat: Add comprehensive search architecture analysis document - Document current state of dual search architectures (HTTP API and MCP) - Analyze HTTP endpoints and MCP search server architectures - Identify DRY violations across search implementations - Evaluate the use of curl as the optimal approach for search - Provide architectural recommendations for immediate and long-term improvements - Outline action plan for cleanup, feature parity, DRY refactoring * refactor: Remove deprecated search skill documentation and operations * refactor: Reorganize documentation into public and context directories Changes: - Created docs/public/ for Mintlify documentation (.mdx files) - Created docs/context/ for internal planning and implementation docs - Moved all .mdx files and assets to docs/public/ - Moved all internal .md files to docs/context/ - Added CLAUDE.md to both directories explaining their purpose - Updated docs.json paths to work with new structure Benefits: - Clear separation between user-facing and internal documentation - Easier to maintain Mintlify docs in dedicated directory - Internal context files organized separately 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Enhance session management and continuity in hooks - Updated new-hook.ts to clarify session_id threading and idempotent session creation. - Modified prompts.ts to require claudeSessionId for continuation prompts, ensuring session context is maintained. - Improved SessionStore.ts documentation on createSDKSession to emphasize idempotent behavior and session connection. - Refined SDKAgent.ts to detail continuation prompt logic and its reliance on session.claudeSessionId for unified session handling. --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Alex Newman <thedotmack@gmail.com>
617 lines
22 KiB
Markdown
617 lines
22 KiB
Markdown
# JIT Context Filtering: Post-Mortem
|
|
|
|
**Date:** November 9, 2025
|
|
**Duration:** 3.5 hours (7:45 PM - 11:11 PM)
|
|
**Branches:** `feature/jit-context`, `failed/jit-context`
|
|
**Status:** Failed, reverted to main
|
|
**Commits:**
|
|
- `3ac0790` - feat: Implement JIT context hook for user prompt submission
|
|
- `adf7bf4` - Refactor JIT context handling in SDKAgent and WorkerService
|
|
|
|
## Executive Summary
|
|
|
|
Attempted to implement JIT (Just-In-Time) context filtering—a feature that would dynamically generate relevant context timelines on every user prompt, potentially replacing the static session-start context entirely. After multiple architectural iterations spanning 3.5 hours and adding ~2,850 lines of code, the implementation was abandoned and reverted. The revert was not due to lack of vision (the feature aligns with long-term architectural goals), but due to implementation complexity and the need for a simpler initial approach. Significant architectural knowledge was gained about hook limitations, worker patterns, and proper separation of concerns.
|
|
|
|
## What We Tried to Build
|
|
|
|
### Goal
|
|
When a user submits a prompt, dynamically generate a relevant context timeline instead of the static session-start context. Use the fast search infrastructure (SQLite FTS5 + ChromaDB) to fetch precisely relevant context on-demand.
|
|
|
|
### The Vision
|
|
**Current approach:** SessionStart hook loads 50 recent observations blindly, displays them all.
|
|
|
|
**Proposed approach:** UserPromptSubmit hook analyzes the prompt, queries the timeline search API, and loads only the relevant context window dynamically.
|
|
|
|
**Why this makes sense:**
|
|
- We already have fast search: SQLite FTS5 + Chroma semantic search
|
|
- Dynamic context timeline search is implemented and tested
|
|
- Search results come back in <200ms
|
|
- Could **replace** session-start context entirely with smarter, prompt-specific context
|
|
|
|
### User Experience
|
|
```
|
|
User types: "How did we fix the authentication bug?"
|
|
|
|
Behind the scenes:
|
|
1. Analyze prompt: "authentication bug fix"
|
|
2. Query timeline search for relevant period
|
|
3. Load 5-10 observations from that specific timeline
|
|
4. Inject as context
|
|
5. Claude answers with precisely relevant historical context
|
|
|
|
vs. Current:
|
|
Load 50 most recent observations regardless of relevance
|
|
```
|
|
|
|
### Why Checkbox Settings Became Less Important
|
|
Originally asked for checkboxes to customize session-start context display. But if JIT context could replace session-start context with intelligent, prompt-specific timelines, the display customization became a non-issue.
|
|
|
|
## Architectural Attempts
|
|
|
|
### Attempt 1: Hook-Based Filtering (7:45 PM - 9:30 PM)
|
|
|
|
**Approach:** Call Agent SDK `query()` directly in `new-hook.ts` during UserPromptSubmit event.
|
|
|
|
**Implementation:**
|
|
- Created `jit-context-hook.ts` (~432 lines)
|
|
- Added `generateJitContext()` function in hook
|
|
- Called SDK `query()` with observation list and user prompt
|
|
- Expected hook to block for ~1-2s while Haiku filters
|
|
|
|
**Failure:**
|
|
```
|
|
Error: Claude Code executable not found at
|
|
/Users/alexnewman/.claude/plugins/marketplaces/thedotmack/plugin/scripts/cli.js
|
|
```
|
|
|
|
**Root Cause:** Hooks run in sandboxed environment without access to `claudePath` (path to Claude Code executable). The Agent SDK requires this path, which is only available in the worker service.
|
|
|
|
**Architectural Violation:** This broke the established pattern where hooks handle orchestration and workers handle AI processing. The `save-hook` sets the precedent: hooks capture data, send to worker, worker runs SDK queries asynchronously.
|
|
|
|
### Attempt 2: Worker-Based with Simple Queries (9:30 PM - 10:30 PM)
|
|
|
|
**Approach:** Move JIT filtering to worker service, keep it simple with per-request SDK queries.
|
|
|
|
**Implementation:**
|
|
- Documented architecture fix plan in `docs/jit-context-architecture-fix.md`
|
|
- Moved `generateJitContext()` to worker (considered creating `src/services/worker/JitContext.ts`)
|
|
- Modified `/sessions/:id/init` endpoint to accept `jitEnabled` flag
|
|
- Worker would run one-shot SDK query per prompt
|
|
|
|
**Architecture:**
|
|
```
|
|
UserPromptSubmit → new-hook → POST /sessions/:id/init { jitEnabled: true }
|
|
↓
|
|
Worker spawns Claude Haiku
|
|
↓
|
|
Filters 50 obs → 3-5 IDs
|
|
↓
|
|
Returns { context: [...] }
|
|
↓
|
|
Hook injects context → Claude
|
|
```
|
|
|
|
**Issues Identified:**
|
|
- Each filter request spawns a new Claude subprocess (~200-500ms overhead)
|
|
- Observation list re-sent on every prompt (~5-10KB per request)
|
|
- No token caching between requests
|
|
- Performance worse than just loading all observations directly
|
|
|
|
**Decision:** Pivoted to persistent sessions to solve performance issues.
|
|
|
|
### Attempt 3: Persistent JIT Sessions (10:30 PM - 11:11 PM)
|
|
|
|
**Approach:** Create a long-lived Agent SDK session that persists throughout user session, similar to main memory session pattern.
|
|
|
|
**Implementation (291 new lines in SDKAgent.ts):**
|
|
|
|
1. **Session Lifecycle:**
|
|
- Added `jitSessionId`, `jitAbortController`, `jitGeneratorPromise` to `ActiveSession` interface
|
|
- `startJitSession()`: Creates persistent SDK session at session init
|
|
- `cleanupJitSession()`: Terminates JIT session at session end
|
|
|
|
2. **Request Queue Architecture:**
|
|
- `jitFilterQueues` Map: Per-session request queues
|
|
- `JITFilterRequest` interface: `{ userPrompt, resolve, reject }`
|
|
- EventEmitter coordination: Wake generator when new requests arrive
|
|
|
|
3. **Message Generator Pattern:**
|
|
- `createJitMessageGenerator()`: Async generator that yields filter requests
|
|
- Initial prompt: Load 50 observations, wait for "READY" response
|
|
- Loop: Wait for EventEmitter signal → yield user prompt → parse response → resolve promise
|
|
- Pattern: Persistent session stays alive between requests
|
|
|
|
4. **Filter Query Flow:**
|
|
```typescript
|
|
runFilterQuery(sessionDbId, userPrompt) {
|
|
// Queue request
|
|
queue.requests.push({ userPrompt, resolve, reject });
|
|
queue.emitter.emit('request');
|
|
|
|
// Wait for response (30s timeout)
|
|
return Promise.race([
|
|
new Promise((resolve, reject) => { /* queued */ }),
|
|
timeout(30000)
|
|
]);
|
|
}
|
|
```
|
|
|
|
5. **Response Processing:**
|
|
- `processJitFilterResponse()`: Accumulate streaming text
|
|
- Parse IDs: "1,5,23,41" or "NONE"
|
|
- Resolve queued promise with ID array
|
|
|
|
**Added Files:**
|
|
- `src/services/worker/SDKAgent.ts`: +291 lines
|
|
- `src/services/worker-types.ts`: +3 fields (jit state tracking)
|
|
- `src/services/worker/SessionManager.ts`: +26 lines (JIT cleanup)
|
|
- `src/services/worker-service.ts`: +102 lines (JIT initialization)
|
|
- `src/shared/settings.ts`: +65 lines (JIT config)
|
|
- `src/hooks/jit-context-hook.ts`: +208 lines (orchestration)
|
|
- `docs/jit-context-architecture-fix.md`: +265 lines
|
|
- `context/session-pattern-parity.md`: +298 lines
|
|
|
|
**Total Changes:** 18 files, +2,852 lines, -133 lines
|
|
|
|
**Final Status at Revert:** Implementation was complete and likely functional, but...
|
|
|
|
## Why It Failed
|
|
|
|
### 1. Architectural Complexity Explosion
|
|
|
|
**Problem:** The persistent session pattern added enormous complexity for marginal benefit.
|
|
|
|
**Evidence:**
|
|
- Parallel session management: Regular + JIT sessions running concurrently
|
|
- Complex coordination: EventEmitter + promise queues + generator pattern
|
|
- Lifecycle coupling: Session init, request handling, cleanup all intertwined
|
|
- State explosion: 3 new fields per session (`jitSessionId`, `jitAbortController`, `jitGeneratorPromise`)
|
|
|
|
**Code Smell:** When the "optimization" requires 300 lines of coordination code, it's probably not an optimization.
|
|
|
|
### 2. Premature Optimization
|
|
|
|
**YAGNI Violation:** Built elaborate token caching and persistent session architecture before proving the feature provided value.
|
|
|
|
**Reality Check:**
|
|
- **Current approach:** Load 50 observations = ~25KB context, works fine
|
|
- **JIT overhead:** Haiku query = 1-2s latency + coordination complexity
|
|
- **User benefit:** Unclear—users haven't complained about context relevance
|
|
- **Token savings:** Marginal—Claude caches long contexts efficiently anyway
|
|
|
|
**Quote from CLAUDE.md:**
|
|
> "Write the dumb, obvious thing first. Add complexity only when you actually hit the problem."
|
|
|
|
We didn't hit a problem. We invented one.
|
|
|
|
### 3. Implementation Complexity, Not Vision
|
|
|
|
**The Vision is Sound:**
|
|
- Dynamic context is better than static context
|
|
- Timeline search API exists and is fast
|
|
- Infrastructure (SQLite + Chroma) can support this
|
|
- Replacing session-start context with prompt-specific context makes sense
|
|
|
|
**The Problem:**
|
|
We jumped to the complex persistent-session approach without trying the simple per-request approach first.
|
|
|
|
**What We Should Have Done:**
|
|
```typescript
|
|
// Simple version (not tried):
|
|
app.post('/sessions/:id/init', async (req, res) => {
|
|
const { userPrompt } = req.body;
|
|
|
|
// Query timeline search API (already exists, fast)
|
|
const timeline = await timelineSearch(project, userPrompt, depth=10);
|
|
|
|
// Return observations
|
|
return res.json({ context: timeline });
|
|
});
|
|
```
|
|
|
|
**This would have:**
|
|
- Validated the feature's value quickly
|
|
- Used existing infrastructure
|
|
- Avoided all the persistence complexity
|
|
- Taken 30 minutes instead of 3.5 hours
|
|
|
|
### 4. Pattern Divergence
|
|
|
|
**Inconsistency:** JIT sessions work fundamentally differently from memory sessions.
|
|
|
|
**Memory Session Pattern:**
|
|
```typescript
|
|
// One-shot: Init → Process observations → Complete
|
|
startSession() → yield prompts → parse responses → complete
|
|
```
|
|
|
|
**JIT Session Pattern:**
|
|
```typescript
|
|
// Persistent: Init → Wait indefinitely → Process on-demand → Complete
|
|
startJitSession() → yield initial load → LOOP:
|
|
- Wait for EventEmitter signal
|
|
- Yield filter request
|
|
- Parse response
|
|
- Resolve promise
|
|
- GOTO LOOP
|
|
```
|
|
|
|
**Maintenance Burden:** Two completely different session patterns means:
|
|
- Doubled testing complexity
|
|
- Increased cognitive load for contributors
|
|
- Higher risk of subtle bugs in lifecycle management
|
|
|
|
**Session Pattern Parity Document:** The 298-line `session-pattern-parity.md` was created to document the differences—a sign that maybe they shouldn't be different.
|
|
|
|
### 5. Blocking I/O in Critical Path
|
|
|
|
**Performance Impact:** Every user prompt now blocks for 1-2s waiting for Haiku filtering.
|
|
|
|
**Current Flow:**
|
|
```
|
|
User types prompt → 10ms → Claude responds
|
|
```
|
|
|
|
**JIT Flow:**
|
|
```
|
|
User types prompt → 10ms init → 1-2s Haiku filter → Claude responds
|
|
```
|
|
|
|
**User Experience:** We added 1-2 seconds of latency to every interaction for questionable benefit.
|
|
|
|
**Alternative:** If context filtering is valuable, do it asynchronously and apply to next prompt.
|
|
|
|
### 6. Missing the Forest for the Trees
|
|
|
|
**Real Issue:** We focused on technical implementation without asking strategic questions:
|
|
|
|
- **Is context relevance actually a problem?** No evidence.
|
|
- **Do users want this?** No feedback requested.
|
|
- **Is 50 observations too many?** Not proven.
|
|
- **Does filtering improve responses?** Not tested.
|
|
|
|
**Anti-Pattern:** Solution in search of a problem.
|
|
|
|
## What We Should Have Done
|
|
|
|
### Option 1: Don't Build It
|
|
|
|
**Justification:** No validated user need. Current system works fine.
|
|
|
|
**Next Step:** Wait for user feedback indicating context relevance is an issue.
|
|
|
|
### Option 2: Simple MVP
|
|
|
|
If we really wanted to explore this:
|
|
|
|
1. **Week 1:** Add basic filtering in worker with one-shot queries
|
|
- Accept slight performance hit (~500ms overhead)
|
|
- Measure filter accuracy and user impact
|
|
- Gather feedback
|
|
|
|
2. **Week 2:** If proven valuable, optimize
|
|
- Add token caching only if needed
|
|
- Consider persistent sessions only if performance is bottleneck
|
|
|
|
3. **Week 3:** If still valuable, scale
|
|
- Polish error handling
|
|
- Add configuration options
|
|
- Document patterns
|
|
|
|
**Philosophy:** Incremental validation, not big-bang architecture.
|
|
|
|
### Option 3: Different Approach Entirely
|
|
|
|
**Alternative:** Pre-computed relevance scores
|
|
|
|
Instead of on-demand filtering:
|
|
- Score observations at creation time (save-hook)
|
|
- Store relevance embeddings in Chroma
|
|
- At session start, query Chroma with user's first prompt
|
|
- Load top 10-20 most relevant observations
|
|
- No runtime latency, better accuracy, simpler architecture
|
|
|
|
**Benefit:** Leverages existing Chroma infrastructure, avoids runtime overhead.
|
|
|
|
## Technical Lessons Learned
|
|
|
|
### 1. EventEmitter Coordination Anti-Pattern
|
|
|
|
**Code:**
|
|
```typescript
|
|
queue.emitter.on('request', () => {
|
|
// Wake up generator to process request
|
|
});
|
|
```
|
|
|
|
**Issue:** Complex async coordination using event-driven wakeup signals is hard to reason about.
|
|
|
|
**Better:** Use async queues or channels (e.g., `async-queue` package) that handle coordination internally.
|
|
|
|
### 2. Generator Pattern Complexity
|
|
|
|
**Pattern:**
|
|
```typescript
|
|
async *createJitMessageGenerator() {
|
|
yield initialPrompt;
|
|
while (!aborted) {
|
|
await waitForEvent(); // Blocks here
|
|
yield nextRequest;
|
|
}
|
|
}
|
|
```
|
|
|
|
**Tradeoff:** Generators are great for iteration, but terrible for event-driven request/response patterns.
|
|
|
|
**Better:** Use explicit session object with `sendMessage()/waitForResponse()` methods.
|
|
|
|
### 3. Dual Session Management
|
|
|
|
**Complexity:** Managing two concurrent SDK sessions per user session is inherently complex.
|
|
|
|
**Alternatives Considered:**
|
|
- Single session handling both observations and filtering (rejected: tight coupling)
|
|
- Separate service for filtering (rejected: too much infrastructure)
|
|
- Pre-computed filtering (not considered: should have been)
|
|
|
|
**Lesson:** When parallel state management feels hard, question whether you need parallel state.
|
|
|
|
### 4. Promise Queue Pattern
|
|
|
|
**Implementation:**
|
|
```typescript
|
|
interface QueuedRequest {
|
|
resolve: (result: T) => void;
|
|
reject: (error: Error) => void;
|
|
}
|
|
queue.push({ resolve, reject });
|
|
// Later...
|
|
queue[0].resolve(result);
|
|
```
|
|
|
|
**Good:** Clean async API for callers
|
|
**Bad:** Easy to leak promises if error handling isn't perfect
|
|
**Improvement:** Use libraries like `p-queue` that handle edge cases
|
|
|
|
## Process Lessons Learned
|
|
|
|
### 1. No Incremental Validation
|
|
|
|
**Mistake:** Went from "idea" to "complete architecture" without validation points.
|
|
|
|
**Better Process:**
|
|
1. Write one-pager explaining user value
|
|
2. Build simplest possible version (2 hours max)
|
|
3. Test with real usage
|
|
4. Measure impact
|
|
5. Decide: kill, iterate, or scale
|
|
|
|
**Checkpoint Questions:**
|
|
- After 1 hour: "Does this solve a real problem?"
|
|
- After 2 hours: "Is this getting too complex?"
|
|
- After 3 hours: "Should I just ship the simple version?"
|
|
|
|
### 2. Architecture Astronomy
|
|
|
|
**Definition:** Designing elaborate systems without building/testing them.
|
|
|
|
**Evidence:**
|
|
- 265-line architecture doc written before any code
|
|
- 298-line session pattern parity analysis
|
|
- Multiple complete rewrites of the same feature
|
|
|
|
**Better:** Code first, document later. Spike solutions, learn from implementation.
|
|
|
|
### 3. Sunk Cost Fallacy
|
|
|
|
**Timeline:**
|
|
- **Hour 1:** "This seems complex but achievable"
|
|
- **Hour 2:** "We're halfway done, can't stop now"
|
|
- **Hour 3:** "Just need to fix this one coordination issue"
|
|
- **Hour 4:** "It's working, but... this feels wrong"
|
|
|
|
**Correct Decision:** Revert. Took courage to throw away 4 hours of work.
|
|
|
|
**Learning:** Time invested is not a reason to continue. Quality of outcome matters more.
|
|
|
|
### 4. Missing User Feedback Loop
|
|
|
|
**No User Input:**
|
|
- Didn't ask: "Is context relevance a problem for you?"
|
|
- Didn't test: "Does filtered context improve your responses?"
|
|
- Didn't measure: "Are you hitting context limits?"
|
|
|
|
**Engineering Theater:** Building impressive-sounding features without user validation.
|
|
|
|
## What We Actually Learned (The Real Value)
|
|
|
|
Despite reverting, this was productive R&D:
|
|
|
|
### 1. Deep Understanding of Hook Architecture
|
|
|
|
**Critical Discovery:** Hooks run in sandboxed environment without `claudePath`.
|
|
- Hooks cannot call Agent SDK `query()` directly
|
|
- All AI processing must happen in worker service
|
|
- This architectural constraint is now documented
|
|
|
|
**Learned Pattern:**
|
|
```
|
|
Hook (orchestration) → Worker (AI processing)
|
|
✓ save-hook: Captures data → Worker processes with SDK
|
|
✓ new-hook: Creates session → Worker returns confirmation
|
|
✗ jit-hook: Tried SDK in hook → Failed, no claudePath
|
|
```
|
|
|
|
**Value:** Future features will avoid this mistake. We now know the boundary.
|
|
|
|
### 2. Worker Architecture Patterns
|
|
|
|
**Blocking vs. Non-Blocking:**
|
|
- SessionStart: Can be non-blocking (context loads async)
|
|
- UserPromptSubmit: Must be blocking (session must exist before processing)
|
|
- JIT Context: Must be blocking (context needed before prompt processed)
|
|
|
|
**Established Pattern:**
|
|
```typescript
|
|
// Worker endpoint for features requiring AI
|
|
app.post('/sessions/:id/operation', async (req, res) => {
|
|
const { operationData } = req.body;
|
|
const result = await sdkAgent.performOperation(operationData);
|
|
return res.json({ result });
|
|
});
|
|
```
|
|
|
|
### 3. Persistent Session Management
|
|
|
|
**Architecture Knowledge Gained:**
|
|
- How to maintain long-lived SDK sessions
|
|
- EventEmitter coordination patterns for request/response
|
|
- Promise queue management for async operations
|
|
- Proper cleanup with AbortControllers
|
|
|
|
**Pattern Documented:**
|
|
- Dual session management (regular + JIT)
|
|
- Generator-based message loops
|
|
- Request queuing with timeouts
|
|
|
|
**Value:** When we build the simpler version, we'll know these patterns.
|
|
|
|
### 4. Configuration Infrastructure
|
|
|
|
`src/shared/settings.ts` (65 lines) provides reusable configuration patterns:
|
|
```typescript
|
|
export function getConfigValue(key: string, defaultValue: string): string {
|
|
// Priority: settings.json → env var → default
|
|
}
|
|
```
|
|
|
|
**Kept After Revert:** This module is useful for other features.
|
|
|
|
### 5. Key Architectural Decisions Made
|
|
|
|
**Decisions that will guide future implementation:**
|
|
1. JIT context filtering must happen in worker (proven via failed hook attempt)
|
|
2. Context must be blocking on UserPromptSubmit (session needs context before processing)
|
|
3. Dynamic timeline search is the right approach (fast, precise, leverages existing infrastructure)
|
|
4. Simple per-request queries should be tried before persistent sessions
|
|
|
|
### 6. Documentation Quality
|
|
|
|
- `jit-context-architecture-fix.md`: Documents why hooks can't run SDK queries
|
|
- `session-pattern-parity.md`: Reference for implementing dual sessions
|
|
- Hooks reference: Comprehensive hook documentation added
|
|
|
|
**Value:** These docs help future contributors understand the system constraints.
|
|
|
|
### 7. Infrastructure Validation
|
|
|
|
**Confirmed that our search stack is ready:**
|
|
- SQLite FTS5: Fast full-text search (<50ms)
|
|
- ChromaDB: Semantic search (<200ms with 8,000+ vectors)
|
|
- Timeline search API: Already implemented and tested
|
|
- Worker service: Can handle synchronous AI operations
|
|
|
|
**The infrastructure exists. We just need a simpler integration.**
|
|
|
|
## Recommendations
|
|
|
|
### Immediate Actions
|
|
|
|
1. **Archive the work:**
|
|
- Keep `failed/jit-context` branch for reference
|
|
- Extract reusable components (settings.ts)
|
|
- Save architecture docs for future features
|
|
|
|
2. **Document the anti-patterns:**
|
|
- Add this post-mortem to CLAUDE.md references
|
|
- Update coding standards with lessons learned
|
|
|
|
3. **Reset focus:**
|
|
- Return to validated user needs
|
|
- Prioritize features with clear value propositions
|
|
|
|
### Future Feature Development
|
|
|
|
**Gating Questions (Answer before coding):**
|
|
|
|
1. **User Value:** What specific user problem does this solve?
|
|
2. **Evidence:** Have users requested this or reported the underlying issue?
|
|
3. **Measurement:** How will we know if it's successful?
|
|
4. **Simplicity:** What's the dumbest version that could work?
|
|
5. **Time Limit:** If we can't prove value in 2 hours, should we build it?
|
|
|
|
**Process:**
|
|
|
|
```
|
|
VALIDATE → BUILD SIMPLE → TEST → MEASURE → DECIDE
|
|
↑ ↓
|
|
└──────────── ITERATE OR KILL ────────────┘
|
|
```
|
|
|
|
### If Context Filtering Returns
|
|
|
|
Should we revisit this idea in the future:
|
|
|
|
**Prerequisites:**
|
|
- User feedback requesting better context relevance
|
|
- Metrics showing current context is too broad
|
|
- Evidence that filtering improves response quality
|
|
|
|
**Simple Approach:**
|
|
```typescript
|
|
// In worker-service.ts /sessions/:id/init
|
|
if (jitEnabled) {
|
|
const observations = await db.getRecentObservations(project, 50);
|
|
const filtered = await simpleFilter(observations, userPrompt); // One-shot query
|
|
return { context: filtered };
|
|
}
|
|
```
|
|
|
|
**Acceptance Criteria:**
|
|
- <100 lines of code
|
|
- <500ms latency impact
|
|
- No new session types
|
|
- Degrades gracefully on errors
|
|
|
|
**If that works:** Then consider optimization.
|
|
|
|
## Conclusion
|
|
|
|
JIT context filtering failed not because the vision was wrong, but because we jumped to the complex implementation without validating the simple one first. The feature aligns with long-term goals (dynamic, prompt-specific context using our fast search infrastructure), but the persistent-session architecture was premature optimization.
|
|
|
|
**The right call:** Revert the complex implementation. Build the simple version when ready.
|
|
|
|
**Key Takeaway:** The vision is sound. The execution was overcomplicated. We now have:
|
|
- Deep knowledge of hook/worker architecture constraints
|
|
- Documented patterns for persistent SDK sessions
|
|
- Validated fast search infrastructure
|
|
- Clear understanding of what to build next time (simple timeline search API integration)
|
|
|
|
**This was R&D, not failure.** We learned what doesn't work (SDK in hooks), what does work (worker-based AI processing), and how to approach it next time (simple API calls before persistent sessions).
|
|
|
|
**Next Implementation:**
|
|
When we revisit this (and we should), start with:
|
|
1. Worker endpoint that accepts prompt
|
|
2. Queries existing timeline search API
|
|
3. Returns context
|
|
4. Hook injects context
|
|
5. Validate it improves responses
|
|
6. Then optimize if needed
|
|
|
|
**Final Thought:** Sometimes you have to build the wrong thing to understand the right thing. That's R&D.
|
|
|
|
---
|
|
|
|
**Branch Status:**
|
|
- `feature/jit-context`: Abandoned
|
|
- `failed/jit-context`: Archived for reference
|
|
- `main`: Stable at v5.4.0
|
|
|
|
**Files to Keep:**
|
|
- `src/shared/settings.ts`: Reusable config utilities
|
|
|
|
**Files Discarded:**
|
|
- Everything else (+2,850 lines)
|
|
|
|
**Emotional State:** Relieved. Dodged a maintenance nightmare.
|