Files
claude-mem/docs/reports/issue-598-conversation-history-pollution.md
T
Alex Newman 2659ec3231 fix: Claude Code 2.1.1 compatibility + log-level audit + path validation fixes (#614)
* Refactor CLAUDE.md and related files for December 2025 updates

- Updated CLAUDE.md in src/services/worker with new entries for December 2025, including changes to Search.ts, GeminiAgent.ts, SDKAgent.ts, and SessionManager.ts.
- Revised CLAUDE.md in src/shared to reflect updates and new entries for December 2025, including paths.ts and worker-utils.ts.
- Modified hook-constants.ts to clarify exit codes and their behaviors.
- Added comprehensive hooks reference documentation for Claude Code, detailing usage, events, and examples.
- Created initial CLAUDE.md files in various directories to track recent activity.

* fix: Merge user-message-hook output into context-hook hookSpecificOutput

- Add footer message to additionalContext in context-hook.ts
- Remove user-message-hook from SessionStart hooks array
- Fixes issue where stderr+exit(1) approach was silently discarded

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update logs and documentation for recent plugin and worker service changes

- Added detailed logs for worker service activities from Dec 10, 2025 to Jan 7, 2026, including initialization patterns, cleanup confirmations, and diagnostic logging.
- Updated plugin documentation with recent activities, including plugin synchronization and configuration changes from Dec 3, 2025 to Jan 7, 2026.
- Enhanced the context hook and worker service logs to reflect improvements and fixes in the plugin architecture.
- Documented the migration and verification processes for the Claude memory system and its integration with the marketplace.

* Refactor hooks architecture and remove deprecated user-message-hook

- Updated hook configurations in CLAUDE.md and hooks.json to reflect changes in session start behavior.
- Removed user-message-hook functionality as it is no longer utilized in Claude Code 2.1.0; context is now injected silently.
- Enhanced context-hook to handle session context injection without user-visible messages.
- Cleaned up documentation across multiple files to align with the new hook structure and removed references to obsolete hooks.
- Adjusted timing and command execution for hooks to improve performance and reliability.

* fix: Address PR #610 review issues

- Replace USER_MESSAGE_ONLY test with BLOCKING_ERROR test in hook-constants.test.ts
- Standardize Claude Code 2.1.0 note wording across all three documentation files
- Exclude deprecated user-message-hook.ts from logger-usage-standards test

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Remove hardcoded fake token counts from context injection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address PR #610 review issues by fixing test files, standardizing documentation notes, and verifying code quality improvements.

* fix: Add path validation to CLAUDE.md distribution to prevent invalid directory creation

- Add isValidPathForClaudeMd() function to reject invalid paths:
  - Tilde paths (~) that Node.js doesn't expand
  - URLs (http://, https://)
  - Paths with spaces (likely command text or PR references)
  - Paths with # (GitHub issue/PR references)
  - Relative paths that escape project boundary

- Integrate validation in updateFolderClaudeMdFiles loop
- Add 6 unit tests for path validation
- Update .gitignore to prevent accidental commit of malformed directories
- Clean up existing invalid directories (~/, PR #610..., git diff..., https:)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Implement path validation in CLAUDE.md generation to prevent invalid directory creation

- Added `isValidPathForClaudeMd()` function to validate file paths in `src/utils/claude-md-utils.ts`.
- Integrated path validation in `updateFolderClaudeMdFiles` to skip invalid paths.
- Added 6 new unit tests in `tests/utils/claude-md-utils.test.ts` to cover various rejection cases.
- Updated `.gitignore` to prevent tracking of invalid directories.
- Cleaned up existing invalid directories in the repository.

* feat: Promote critical WARN logs to ERROR level across codebase

Comprehensive log-level audit promoting 38+ WARN messages to ERROR for
improved debugging and incident response:

- Parser: observation type errors, data contamination
- SDK/Agents: empty init responses (Gemini, OpenRouter)
- Worker/Queue: session recovery, auto-recovery failures
- Chroma: sync failures, search failures (now treated as critical)
- SQLite: search failures (primary data store)
- Session/Generator: failures, missing context
- Infrastructure: shutdown, process management failures
- File Operations: CLAUDE.md updates, config reads
- Branch Management: recovery checkout failures

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: Address PR #614 review issues

- Remove incorrectly tracked tilde-prefixed files from git
- Fix absolute path validation to check projectRoot boundaries
- Add test coverage for absolute path validation edge cases

Closes review issues:
- Issue 1: ~/ prefixed files removed from tracking
- Issue 3: Absolute paths now validated against projectRoot
- Issue 4: Added 3 new test cases for absolute path scenarios

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* build assets and context

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 23:34:20 -05:00

13 KiB

Issue #598: Conversation History Pollution - Technical Analysis Report

Issue: Too many messages, polluting my conversation history Author: abhijit8ganguly-afk Created: 2026-01-07 Labels: bug Report Date: 2026-01-07


1. Executive Summary

Users are experiencing conversation history pollution when using /resume in Claude Code. Plugin-generated messages starting with "Hello memory agent" appear in the user's conversation history, making it difficult to resume sessions. This issue stems from a fundamental architectural concern: the Claude Agent SDK's resume mechanism appears to inject messages into the user's transcript when the resume parameter is passed with the user's contentSessionId instead of the plugin's separate memorySessionId.

Key Finding: The plugin maintains two separate session IDs for isolation:

  • contentSessionId: The user's Claude Code session (what appears in /resume)
  • memorySessionId: The plugin's internal SDK session (should NEVER appear in user's history)

When these become conflated, plugin messages pollute the user's conversation history.


2. Problem Analysis

2.1 User-Reported Symptoms

When using /resume, users see multiple messages starting with "Hello memory agent" appearing in their conversation history. These messages are internal to the claude-mem plugin and should be invisible to users.

2.2 Source of "Hello memory agent" Messages

The message originates from the mode configuration file at /Users/alexnewman/conductor/workspaces/claude-mem/budapest/plugin/modes/code.json:

{
  "prompts": {
    "continuation_greeting": "Hello memory agent, you are continuing to observe the primary Claude session.",
    "continuation_instruction": "IMPORTANT: Continue generating observations from tool use messages using the XML structure below."
  }
}

This greeting is injected via buildContinuationPrompt() in /Users/alexnewman/conductor/workspaces/claude-mem/budapest/src/sdk/prompts.ts:

export function buildContinuationPrompt(userPrompt: string, promptNumber: number, contentSessionId: string, mode: ModeConfig): string {
  return `${mode.prompts.continuation_greeting}
...

2.3 When These Messages Appear

The continuation prompt is used when session.lastPromptNumber > 1 (any prompt after the initial session start). This is controlled in SDKAgent.ts:

const initPrompt = isInitPrompt
  ? buildInitPrompt(session.project, session.contentSessionId, session.userPrompt, mode)
  : buildContinuationPrompt(session.userPrompt, session.lastPromptNumber, session.contentSessionId, mode);

3. Technical Details

3.1 Session ID Architecture

The plugin uses a dual session ID system to maintain isolation between user conversations and plugin operations:

Session ID Purpose Storage Should Appear in /resume
contentSessionId User's Claude Code session From hook context Yes - this IS the user's session
memorySessionId Plugin's internal SDK session Captured from SDK responses NEVER

Critical Code Comments from /Users/alexnewman/conductor/workspaces/claude-mem/budapest/src/services/sqlite/SessionStore.ts:

// NOTE: memory_session_id starts as NULL. It is captured by SDKAgent from the first SDK
// response and stored via updateMemorySessionId(). CRITICAL: memory_session_id must NEVER
// equal contentSessionId - that would inject memory messages into the user's transcript!

3.2 SDK Query Flow

The SDKAgent.startSession() method at /Users/alexnewman/conductor/workspaces/claude-mem/budapest/src/services/worker/SDKAgent.ts controls how the plugin interacts with Claude:

const queryResult = query({
  prompt: messageGenerator,
  options: {
    model: modelId,
    // Only resume if BOTH: (1) we have a memorySessionId AND (2) this isn't the first prompt
    // On worker restart, memorySessionId may exist from a previous SDK session but we
    // need to start fresh since the SDK context was lost
    ...(hasRealMemorySessionId && session.lastPromptNumber > 1 && { resume: session.memorySessionId }),
    disallowedTools,
    abortController: session.abortController,
    pathToClaudeCodeExecutable: claudePath
  }
});

Key Point: The resume parameter specifies which session to continue. If this accidentally uses contentSessionId, messages appear in the user's history.

3.3 Message Generator Architecture

The createMessageGenerator() method yields synthetic user messages to the SDK:

yield {
  type: 'user',
  message: {
    role: 'user',
    content: initPrompt  // Contains "Hello memory agent" for continuation prompts
  },
  session_id: session.contentSessionId,  // References user's session for context
  parent_tool_use_id: null,
  isSynthetic: true  // Marked as synthetic - should not appear in real history
};

3.4 Transcript Storage Locations

Claude Code stores conversation transcripts at:

~/.claude/projects/{dashed-cwd}/{session_id}.jsonl

If the wrong session ID is used, plugin messages get written to the user's transcript file.


4. Impact Assessment

4.1 User Experience Impact

Severity Description
High Users cannot effectively use /resume due to pollution
Medium Confusion about what messages are "real" vs plugin-generated
Low Increased scrolling/navigation effort

4.2 Functional Impact

  • Session Resume Degraded: The core /resume functionality becomes less useful
  • Context Window Pollution: Plugin messages consume valuable context window tokens
  • Trust Erosion: Users may question if the plugin is behaving correctly

4.3 Affected Users

All users who:

  1. Have claude-mem plugin installed
  2. Use /resume to continue sessions
  3. Have multi-turn conversations where continuation prompts are generated

5. Root Cause Analysis

5.1 Primary Root Cause

The issue likely occurs when the session ID passed to the SDK's resume parameter conflates with the user's session. This could happen in several scenarios:

Scenario A: Stale Session ID Resume (Previously Identified)

When the worker restarts with a stale memorySessionId from a previous session, it may attempt to resume into a non-existent session. The fix at SDKAgent.ts:109 prevents this:

...(hasRealMemorySessionId && session.lastPromptNumber > 1 && { resume: session.memorySessionId }),

However, if this logic was not working correctly, or if there was a race condition, the wrong ID could be used.

Scenario B: Session ID Capture Timing

The memorySessionId is captured from the first SDK response:

if (!session.memorySessionId && message.session_id) {
  session.memorySessionId = message.session_id;
  // Persist to database for cross-restart recovery
  this.dbManager.getSessionStore().updateMemorySessionId(
    session.sessionDbId,
    message.session_id
  );
}

If this capture fails or is delayed, subsequent messages might use the wrong session context.

Scenario C: Message Yielding with User Session Context

The message generator yields messages with session_id: session.contentSessionId:

yield {
  type: 'user',
  message: { role: 'user', content: initPrompt },
  session_id: session.contentSessionId,  // <-- This is the user's session ID!
  ...
};

This field may be used by the SDK to determine where to persist messages. If so, this is a design issue where the plugin's internal messages reference the user's session.

5.2 Contributing Factors

  1. Shared Conversation History: The plugin maintains a conversationHistory array that includes plugin messages, used for provider switching (Claude/Gemini/OpenRouter). This history may leak into user-visible contexts.

  2. Continuation Prompt Content: The "Hello memory agent" greeting is explicitly designed to be internal but has no technical mechanism preventing it from appearing in user transcripts.

  3. Synthetic Message Flag: Messages are marked isSynthetic: true but this flag may not be respected by all downstream components.


6.1 Immediate Mitigations

Option A: Remove or Minimize Continuation Greeting (Low Effort)

Modify the mode configuration to use a less intrusive greeting:

{
  "continuation_greeting": "",  // Empty or very minimal
}

Pros: Quick fix, no code changes Cons: Doesn't fix the underlying session ID issue

Option B: Verify Session ID Isolation (Medium Effort)

Add runtime validation to ensure memorySessionId never equals contentSessionId:

if (session.memorySessionId === session.contentSessionId) {
  logger.error('SESSION', 'CRITICAL: memorySessionId matches contentSessionId - messages will pollute user history!', {
    contentSessionId: session.contentSessionId,
    memorySessionId: session.memorySessionId
  });
  // Reset memorySessionId to force fresh SDK session
  session.memorySessionId = null;
}

6.2 Structural Fixes

Option C: Remove session_id from Yielded Messages (High Effort)

Investigate if the session_id field in yielded messages can be omitted or changed:

yield {
  type: 'user',
  message: { role: 'user', content: initPrompt },
  // session_id: session.contentSessionId,  // REMOVE or use memorySessionId
  parent_tool_use_id: null,
  isSynthetic: true
};

Requires: Understanding of SDK internals and testing

Option D: Separate Transcript Storage (High Effort)

Ensure plugin messages are stored in a completely separate transcript path:

  • User transcript: ~/.claude/projects/{cwd}/{contentSessionId}.jsonl
  • Plugin transcript: ~/.claude-mem/transcripts/{memorySessionId}.jsonl

6.3 Long-Term Architecture

Option E: Agent SDK Isolation Mode

Request or implement an SDK feature that marks certain messages as "agent-internal" and prevents them from appearing in user-facing /resume history.


7. Priority/Severity Assessment

Dimension Rating Justification
User Impact High Directly affects core user workflow (/resume)
Frequency High Affects all users with continuation prompts
Workaround Available Partial Users can ignore messages, but UX degraded
Fix Complexity Medium-High Requires understanding SDK session mechanics

This issue should be addressed promptly as it:

  1. Degrades a core Claude Code feature (/resume)
  2. Affects all plugin users
  3. May indicate a deeper session isolation problem
  4. Could lead to users disabling the plugin

  • Previous fix for stale session resume: .claude/plans/fix-stale-session-resume-crash.md
  • Session ID architecture: SESSION_ID_ARCHITECTURE.md (referenced in plans)

Key Files for Investigation

File Relevance
src/services/worker/SDKAgent.ts SDK query loop and session handling
src/sdk/prompts.ts Prompt generation including "Hello memory agent"
plugin/modes/code.json Mode configuration with greeting text
src/services/sqlite/SessionStore.ts Session ID storage and validation
tests/sdk-agent-resume.test.ts Test file for resume logic

Test Coverage

The resume parameter logic has unit tests at /Users/alexnewman/conductor/workspaces/claude-mem/budapest/tests/sdk-agent-resume.test.ts covering:

  • INIT prompt scenarios (should NOT resume)
  • Continuation prompt scenarios (should resume with memorySessionId)
  • Edge cases (empty/undefined memorySessionId)
  • Stale session crash prevention

9. Appendix: Code References

A. Continuation Greeting in Mode Config

File: plugin/modes/code.json

"continuation_greeting": "Hello memory agent, you are continuing to observe the primary Claude session."

B. Prompt Building

File: src/sdk/prompts.ts:174-177

export function buildContinuationPrompt(...): string {
  return `${mode.prompts.continuation_greeting}
...

C. Message Yielding

File: src/services/worker/SDKAgent.ts:283-292

yield {
  type: 'user',
  message: { role: 'user', content: initPrompt },
  session_id: session.contentSessionId,
  parent_tool_use_id: null,
  isSynthetic: true
};

D. Session ID Capture

File: src/services/worker/SDKAgent.ts:120-140

if (!session.memorySessionId && message.session_id) {
  session.memorySessionId = message.session_id;
  this.dbManager.getSessionStore().updateMemorySessionId(
    session.sessionDbId,
    message.session_id
  );
  ...
}

Report generated: 2026-01-07 Analysis based on codebase at commit: 687146ce (merge main)