refactor: Clean up search architecture, remove experimental contextualize endpoint (#133)
* Refactor code structure for improved readability and maintainability * Add test results for search API and related functionalities - Created test result files for various search-related functionalities, including: - test-11-search-server-changes.json - test-12-context-hook-changes.json - test-13-worker-service-changes.json - test-14-patterns.json - test-15-gotchas.json - test-16-discoveries.json - test-17-all-bugfixes.json - test-18-all-features.json - test-19-all-decisions.json - test-20-session-search.json - test-21-prompt-search.json - test-22-decisions-endpoint.json - test-23-changes-endpoint.json - test-24-how-it-works-endpoint.json - test-25-contextualize-endpoint.json - test-26-timeline-around-observation.json - test-27-multi-param-combo.json - test-28-file-type-combo.json - Each test result file captures specific search failures or outcomes, including issues with undefined properties and successful execution of search queries. - Enhanced documentation of search architecture and testing strategies, ensuring compliance with established guidelines and improving overall search functionality. * feat: Enhance unified search API with catch-all parameters and backward compatibility - Implemented a unified search API at /api/search that accepts catch-all parameters for filtering by type, observation type, concepts, and files. - Maintained backward compatibility by keeping granular endpoints functional while routing through the same infrastructure. - Completed comprehensive testing of search capabilities with real-world query scenarios. fix: Address missing debug output in search API query tests - Flushed PM2 logs and executed search queries to verify functionality. - Diagnosed absence of "Raw Chroma" debug messages in worker logs, indicating potential issues with logging or query processing. refactor: Improve build and deployment pipeline for claude-mem plugin - Successfully built and synced all hooks and services to the marketplace directory. - Ensured all dependencies are installed and up-to-date in the deployment location. feat: Implement hybrid search filters with 90-day recency window - Enhanced search server to apply a 90-day recency filter to Chroma results before categorizing by document type. fix: Correct parameter handling in searchUserPrompts method - Added support for filter-only queries and improved dual-path logic for clarity. refactor: Rename FTS5 method to clarify fallback status - Renamed escapeFTS5 to escapeFTS5_fallback_when_chroma_unavailable to indicate its temporary usage. feat: Introduce contextualize tool for comprehensive project overview - Added a new tool to fetch recent observations, sessions, and user prompts, providing a quick project overview. feat: Add semantic shortcut tools for common search patterns - Implemented 'decisions', 'changes', and 'how_it_works' tools for convenient access to frequently searched observation categories. feat: Unified timeline tool supports anchor and query modes - Combined get_context_timeline and get_timeline_by_query into a single interface for timeline exploration. feat: Unified search tool added to MCP server - New tool queries all memory types simultaneously, providing combined chronological results for improved search efficiency. * Refactor search functionality to clarify FTS5 fallback usage - Updated `worker-service.cjs` to replace FTS5 fallback function with a more descriptive name and improved error handling. - Enhanced documentation in `SKILL.md` to specify the unified API endpoint and clarify the behavior of the search engine, including the conditions under which FTS5 is used. - Modified `search-server.ts` to provide clearer logging and descriptions regarding the fallback to FTS5 when UVX/Python is unavailable. - Renamed and updated the `SessionSearch.ts` methods to reflect the conditions for using FTS5, emphasizing the lack of semantic understanding in fallback scenarios. * feat: Add ID-based fetch endpoints and simplify mem-search skill **Problem:** - Search returns IDs but no way to fetch by ID - Skill documentation was bloated with too many options - Claude wasn't using IDs because we didn't tell it how **Solution:** 1. Added three new HTTP endpoints: - GET /api/observation/:id - GET /api/session/:id - GET /api/prompt/:id 2. Completely rewrote SKILL.md: - Stripped complexity down to essentials - Clear 3-step prescriptive workflow: Search → Review IDs → Fetch by ID - Emphasized ID usage: "The IDs are there for a reason - USE THEM" - Removed confusing multi-endpoint documentation - Kept only unified search with filters **Impact:** - Token efficiency: Claude can now fetch full details only for relevant IDs - Clarity: One clear workflow instead of 10+ options to choose from - Usability: IDs are no longer wasted context - they're actionable 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Move internal docs to private directory Moved POSTMORTEM and planning docs to ./private to exclude from PR reviews. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: Remove experimental contextualize endpoint - Removed contextualize MCP tool from search-server (saves ~4KB) - Disabled FTS5 fallback paths in SessionSearch (now vector-first) - Cleaned up CLAUDE.md documentation - Removed contextualize-rewrite-plan.md doc Rationale: - Contextualize is better suited as a skill (LLM-powered) than an endpoint - Search API already provides vector search with configurable limits - Created issue #132 to track future contextualize skill implementation Changes: - src/servers/search-server.ts: Removed contextualize tool definition - src/services/sqlite/SessionSearch.ts: Disabled FTS5 fallback, added deprecation warnings - CLAUDE.md: Cleaned up outdated skill documentation - docs/: Removed contextualize plan document 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: Complete FTS5 cleanup - remove all deprecated search code This completes the FTS5 cleanup work by removing all commented-out FTS5 search code while preserving database tables for backward compatibility. Changes: - Removed 200+ lines of commented FTS5 search code from SessionSearch.ts - Removed deprecated degraded_search_query__when_uvx_unavailable method - Updated all method documentation to clarify vector-first architecture - Updated class documentation to reflect filter-only query support - Updated CLAUDE.md to remove FTS5 search references - Clarified that FTS5 tables exist for backward compatibility only - Updated "Why SQLite FTS5" section to "Why Vector-First Search" Database impact: NONE - FTS5 tables remain intact for existing installations Search architecture: - ChromaDB: All text-based vector search queries - SQLite: Filter-only queries (date ranges, metadata, no query text) - FTS5 tables: Maintained but unused (backward compatibility) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor: Remove all FTS5 fallback execution code from search-server Completes the FTS5 cleanup by removing all fallback execution paths that attempted to use FTS5 when ChromaDB was unavailable. Changes: - Removed all FTS5 fallback code execution paths - When ChromaDB fails or is unavailable, return empty results with helpful error messages - Updated all deprecated tool descriptions (search_observations, search_sessions, search_user_prompts) - Changed error messages to indicate FTS5 fallback has been removed - Added installation instructions for UVX/Python when vector search is unavailable - Updated comments from "hybrid search" to "vector-first search" - Removed ~100 lines of dead FTS5 fallback code Database impact: NONE - FTS5 tables remain intact (backward compatibility) Search behavior when ChromaDB unavailable: - Text queries: Return empty results with error explaining ChromaDB is required - Filter-only queries (no text): Continue to work via direct SQLite 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Address PR 133 review feedback Critical fixes: - Remove contextualize endpoint from worker-service (route + handler) - Fix build script logging to show correct .cjs extension (was .mjs) Documentation improvements: - Add comprehensive FTS5 retention rationale documentation - Include v7.0.0 removal TODO for future cleanup Testing: - Build succeeds with correct output logging - Worker restarts successfully (30th restart) - Contextualize endpoint properly removed (404 response) - Search endpoint verified working This addresses all critical review feedback from PR 133. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
This commit is contained in:
@@ -13,7 +13,8 @@ import {
|
||||
|
||||
/**
|
||||
* Search interface for session-based memory
|
||||
* Provides FTS5 full-text search and structured queries for sessions, observations, and summaries
|
||||
* Provides filter-only structured queries for sessions, observations, and user prompts
|
||||
* Vector search is handled by ChromaDB - this class only supports filtering without query text
|
||||
*/
|
||||
export class SessionSearch {
|
||||
private db: Database.Database;
|
||||
@@ -31,7 +32,18 @@ export class SessionSearch {
|
||||
}
|
||||
|
||||
/**
|
||||
* Ensure FTS5 tables exist (inline migration)
|
||||
* Ensure FTS5 tables exist (backward compatibility only - no longer used for search)
|
||||
*
|
||||
* FTS5 tables are maintained for backward compatibility but not used for search.
|
||||
* Vector search (Chroma) is now the primary search mechanism.
|
||||
*
|
||||
* Retention Rationale:
|
||||
* - Prevents breaking existing installations with FTS5 tables
|
||||
* - Allows graceful migration path for users
|
||||
* - Tables maintained but search paths removed
|
||||
* - Triggers still fire to keep tables synchronized
|
||||
*
|
||||
* TODO: Remove FTS5 infrastructure in future major version (v7.0.0)
|
||||
*/
|
||||
private ensureFTSTables(): void {
|
||||
try {
|
||||
@@ -134,22 +146,6 @@ export class SessionSearch {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Escape FTS5 special characters in user input
|
||||
*
|
||||
* FTS5 uses double quotes for phrase searches and treats certain characters
|
||||
* as operators (*, AND, OR, NOT, parentheses, etc.). To prevent injection,
|
||||
* we wrap user input in double quotes and escape internal quotes by doubling them.
|
||||
* This converts any user input into a safe phrase search.
|
||||
*
|
||||
* @param text - User input to escape for FTS5 MATCH queries
|
||||
* @returns Safely escaped FTS5 query string
|
||||
*/
|
||||
private escapeFTS5(text: string): string {
|
||||
// Escape internal double quotes by doubling them (FTS5 standard)
|
||||
// Then wrap the entire string in double quotes for phrase search
|
||||
return `"${text.replace(/"/g, '""')}"`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Build WHERE clause for structured filters
|
||||
@@ -243,118 +239,78 @@ export class SessionSearch {
|
||||
}
|
||||
|
||||
/**
|
||||
* Search observations using FTS5 full-text search
|
||||
* Search observations using filter-only direct SQLite query.
|
||||
* Vector search is handled by ChromaDB - this only supports filtering without query text.
|
||||
*/
|
||||
searchObservations(query: string, options: SearchOptions = {}): ObservationSearchResult[] {
|
||||
searchObservations(query: string | undefined, options: SearchOptions = {}): ObservationSearchResult[] {
|
||||
const params: any[] = [];
|
||||
const { limit = 50, offset = 0, orderBy = 'relevance', ...filters } = options;
|
||||
|
||||
// Build FTS5 match query
|
||||
const ftsQuery = this.escapeFTS5(query);
|
||||
params.push(ftsQuery);
|
||||
// FILTER-ONLY PATH: When no query text, query table directly
|
||||
// This enables date filtering which Chroma cannot do (requires direct SQLite access)
|
||||
if (!query) {
|
||||
const filterClause = this.buildFilterClause(filters, params, 'o');
|
||||
if (!filterClause) {
|
||||
throw new Error('Either query or filters required for search');
|
||||
}
|
||||
|
||||
// Build filter conditions
|
||||
const filterClause = this.buildFilterClause(filters, params, 'o');
|
||||
const whereClause = filterClause ? `AND ${filterClause}` : '';
|
||||
const orderClause = this.buildOrderClause(orderBy, false);
|
||||
|
||||
// Build ORDER BY
|
||||
const orderClause = this.buildOrderClause(orderBy, true);
|
||||
const sql = `
|
||||
SELECT o.*, o.discovery_tokens
|
||||
FROM observations o
|
||||
WHERE ${filterClause}
|
||||
${orderClause}
|
||||
LIMIT ? OFFSET ?
|
||||
`;
|
||||
|
||||
// Main query with FTS5
|
||||
const sql = `
|
||||
SELECT
|
||||
o.*,
|
||||
o.discovery_tokens,
|
||||
observations_fts.rank as rank
|
||||
FROM observations o
|
||||
JOIN observations_fts ON o.id = observations_fts.rowid
|
||||
WHERE observations_fts MATCH ?
|
||||
${whereClause}
|
||||
${orderClause}
|
||||
LIMIT ? OFFSET ?
|
||||
`;
|
||||
|
||||
params.push(limit, offset);
|
||||
|
||||
const results = this.db.prepare(sql).all(...params) as ObservationSearchResult[];
|
||||
|
||||
// Normalize rank to score (0-1, higher is better)
|
||||
if (results.length > 0) {
|
||||
const minRank = Math.min(...results.map(r => r.rank || 0));
|
||||
const maxRank = Math.max(...results.map(r => r.rank || 0));
|
||||
const range = maxRank - minRank || 1;
|
||||
|
||||
results.forEach(r => {
|
||||
if (r.rank !== undefined) {
|
||||
// Invert rank (lower rank = better match) and normalize to 0-1
|
||||
r.score = 1 - ((r.rank - minRank) / range);
|
||||
}
|
||||
});
|
||||
params.push(limit, offset);
|
||||
return this.db.prepare(sql).all(...params) as ObservationSearchResult[];
|
||||
}
|
||||
|
||||
return results;
|
||||
// Vector search with query text should be handled by ChromaDB
|
||||
// This method only supports filter-only queries (query=undefined)
|
||||
console.warn('[SessionSearch] Text search not supported - use ChromaDB for vector search');
|
||||
return [];
|
||||
}
|
||||
|
||||
/**
|
||||
* Search session summaries using FTS5 full-text search
|
||||
* Search session summaries using filter-only direct SQLite query.
|
||||
* Vector search is handled by ChromaDB - this only supports filtering without query text.
|
||||
*/
|
||||
searchSessions(query: string, options: SearchOptions = {}): SessionSummarySearchResult[] {
|
||||
searchSessions(query: string | undefined, options: SearchOptions = {}): SessionSummarySearchResult[] {
|
||||
const params: any[] = [];
|
||||
const { limit = 50, offset = 0, orderBy = 'relevance', ...filters } = options;
|
||||
|
||||
// Build FTS5 match query
|
||||
const ftsQuery = this.escapeFTS5(query);
|
||||
params.push(ftsQuery);
|
||||
// FILTER-ONLY PATH: When no query text, query session_summaries table directly
|
||||
if (!query) {
|
||||
const filterOptions = { ...filters };
|
||||
delete filterOptions.type;
|
||||
const filterClause = this.buildFilterClause(filterOptions, params, 's');
|
||||
if (!filterClause) {
|
||||
throw new Error('Either query or filters required for search');
|
||||
}
|
||||
|
||||
// Build filter conditions (without type filter - not applicable to summaries)
|
||||
const filterOptions = { ...filters };
|
||||
delete filterOptions.type;
|
||||
const filterClause = this.buildFilterClause(filterOptions, params, 's');
|
||||
const whereClause = filterClause ? `AND ${filterClause}` : '';
|
||||
const orderClause = orderBy === 'date_asc'
|
||||
? 'ORDER BY s.created_at_epoch ASC'
|
||||
: 'ORDER BY s.created_at_epoch DESC';
|
||||
|
||||
// Note: session_summaries don't have files_read/files_modified in the same way
|
||||
// We'll need to adjust the filter clause
|
||||
const adjustedWhereClause = whereClause.replace(/files_read/g, 'files_read').replace(/files_modified/g, 'files_edited');
|
||||
const sql = `
|
||||
SELECT s.*, s.discovery_tokens
|
||||
FROM session_summaries s
|
||||
WHERE ${filterClause}
|
||||
${orderClause}
|
||||
LIMIT ? OFFSET ?
|
||||
`;
|
||||
|
||||
// Build ORDER BY
|
||||
const orderClause = orderBy === 'relevance'
|
||||
? 'ORDER BY session_summaries_fts.rank ASC'
|
||||
: orderBy === 'date_asc'
|
||||
? 'ORDER BY s.created_at_epoch ASC'
|
||||
: 'ORDER BY s.created_at_epoch DESC';
|
||||
|
||||
// Main query with FTS5
|
||||
const sql = `
|
||||
SELECT
|
||||
s.*,
|
||||
s.discovery_tokens,
|
||||
session_summaries_fts.rank as rank
|
||||
FROM session_summaries s
|
||||
JOIN session_summaries_fts ON s.id = session_summaries_fts.rowid
|
||||
WHERE session_summaries_fts MATCH ?
|
||||
${adjustedWhereClause}
|
||||
${orderClause}
|
||||
LIMIT ? OFFSET ?
|
||||
`;
|
||||
|
||||
params.push(limit, offset);
|
||||
|
||||
const results = this.db.prepare(sql).all(...params) as SessionSummarySearchResult[];
|
||||
|
||||
// Normalize rank to score
|
||||
if (results.length > 0) {
|
||||
const minRank = Math.min(...results.map(r => r.rank || 0));
|
||||
const maxRank = Math.max(...results.map(r => r.rank || 0));
|
||||
const range = maxRank - minRank || 1;
|
||||
|
||||
results.forEach(r => {
|
||||
if (r.rank !== undefined) {
|
||||
r.score = 1 - ((r.rank - minRank) / range);
|
||||
}
|
||||
});
|
||||
params.push(limit, offset);
|
||||
return this.db.prepare(sql).all(...params) as SessionSummarySearchResult[];
|
||||
}
|
||||
|
||||
return results;
|
||||
// Vector search with query text should be handled by ChromaDB
|
||||
// This method only supports filter-only queries (query=undefined)
|
||||
console.warn('[SessionSearch] Text search not supported - use ChromaDB for vector search');
|
||||
return [];
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -485,16 +441,13 @@ export class SessionSearch {
|
||||
}
|
||||
|
||||
/**
|
||||
* Search user prompts with full-text search
|
||||
* Search user prompts using filter-only direct SQLite query.
|
||||
* Vector search is handled by ChromaDB - this only supports filtering without query text.
|
||||
*/
|
||||
searchUserPrompts(query: string, options: SearchOptions = {}): UserPromptSearchResult[] {
|
||||
searchUserPrompts(query: string | undefined, options: SearchOptions = {}): UserPromptSearchResult[] {
|
||||
const params: any[] = [];
|
||||
const { limit = 20, offset = 0, orderBy = 'relevance', ...filters } = options;
|
||||
|
||||
// Build FTS5 match query
|
||||
const ftsQuery = this.escapeFTS5(query);
|
||||
params.push(ftsQuery);
|
||||
|
||||
// Build filter conditions (join with sdk_sessions for project filtering)
|
||||
const baseConditions: string[] = [];
|
||||
if (filters.project) {
|
||||
@@ -516,47 +469,34 @@ export class SessionSearch {
|
||||
}
|
||||
}
|
||||
|
||||
const whereClause = baseConditions.length > 0 ? `AND ${baseConditions.join(' AND ')}` : '';
|
||||
// FILTER-ONLY PATH: When no query text, query user_prompts table directly
|
||||
if (!query) {
|
||||
if (baseConditions.length === 0) {
|
||||
throw new Error('Either query or filters required for search');
|
||||
}
|
||||
|
||||
// Build ORDER BY
|
||||
const orderClause = orderBy === 'relevance'
|
||||
? 'ORDER BY user_prompts_fts.rank ASC'
|
||||
: orderBy === 'date_asc'
|
||||
? 'ORDER BY up.created_at_epoch ASC'
|
||||
: 'ORDER BY up.created_at_epoch DESC';
|
||||
const whereClause = `WHERE ${baseConditions.join(' AND ')}`;
|
||||
const orderClause = orderBy === 'date_asc'
|
||||
? 'ORDER BY up.created_at_epoch ASC'
|
||||
: 'ORDER BY up.created_at_epoch DESC';
|
||||
|
||||
// Main query with FTS5 (join sdk_sessions for project filtering)
|
||||
const sql = `
|
||||
SELECT
|
||||
up.*,
|
||||
user_prompts_fts.rank as rank
|
||||
FROM user_prompts up
|
||||
JOIN user_prompts_fts ON up.id = user_prompts_fts.rowid
|
||||
JOIN sdk_sessions s ON up.claude_session_id = s.claude_session_id
|
||||
WHERE user_prompts_fts MATCH ?
|
||||
${whereClause}
|
||||
${orderClause}
|
||||
LIMIT ? OFFSET ?
|
||||
`;
|
||||
const sql = `
|
||||
SELECT up.*
|
||||
FROM user_prompts up
|
||||
JOIN sdk_sessions s ON up.claude_session_id = s.claude_session_id
|
||||
${whereClause}
|
||||
${orderClause}
|
||||
LIMIT ? OFFSET ?
|
||||
`;
|
||||
|
||||
params.push(limit, offset);
|
||||
|
||||
const results = this.db.prepare(sql).all(...params) as UserPromptSearchResult[];
|
||||
|
||||
// Normalize rank to score
|
||||
if (results.length > 0) {
|
||||
const minRank = Math.min(...results.map(r => r.rank || 0));
|
||||
const maxRank = Math.max(...results.map(r => r.rank || 0));
|
||||
const range = maxRank - minRank || 1;
|
||||
|
||||
results.forEach(r => {
|
||||
if (r.rank !== undefined) {
|
||||
r.score = 1 - ((r.rank - minRank) / range);
|
||||
}
|
||||
});
|
||||
params.push(limit, offset);
|
||||
return this.db.prepare(sql).all(...params) as UserPromptSearchResult[];
|
||||
}
|
||||
|
||||
return results;
|
||||
// Vector search with query text should be handled by ChromaDB
|
||||
// This method only supports filter-only queries (query=undefined)
|
||||
console.warn('[SessionSearch] Text search not supported - use ChromaDB for vector search');
|
||||
return [];
|
||||
}
|
||||
|
||||
/**
|
||||
|
||||
Reference in New Issue
Block a user