Commit Graph

6 Commits

Author SHA1 Message Date
Copilot 8ec91e7ffa fix: break infinite summary-retry loop (#1633) (#2072)
* Initial plan

* fix: break infinite summary-retry loop (#1633)

Three-part fix:
1. Parser coercion: When LLM returns <observation> tags instead of <summary>,
   coerce observation content into summary fields (root cause fix)
2. Stronger summary prompt: Add clearer tag requirements with warnings
3. Circuit breaker: Track consecutive summary failures per session,
   skip further attempts after 3 failures to prevent unbounded prompt growth

Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/e345e8ec-bc97-4eaa-94bd-6e951fda8f77

Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>

* refactor: extract shared constants for summary mode marker and failure threshold

Addresses code review feedback: SUMMARY_MODE_MARKER and
MAX_CONSECUTIVE_SUMMARY_FAILURES are now defined once in sdk/prompts.ts
and imported by ResponseProcessor and SessionManager.

Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/e345e8ec-bc97-4eaa-94bd-6e951fda8f77

Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>

* fix: guard summary failure counter on summaryExpected (Greptile P1)

The circuit breaker counter previously incremented on any response
containing <observation> or <summary> tags — which matches virtually
every normal observation response. After 3 observations the breaker
would open and permanently block summarization, reproducing the
data-loss scenario #1633 was meant to prevent.

Gate the increment block on summaryExpected (already computed for
parseSummary coercion) so the counter only tracks actual summary
attempts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: cover circuit-breaker + apply review polish

- Use findLast / at(-1) for last-user-message lookup instead of
  filter + index (O(1) common case).
- Drop redundant `|| 0` fallback — field is required and initialized.
- Add comment noting counter is ephemeral by design.
- Add ResponseProcessor tests covering:
  * counter NOT incrementing on normal observation responses
    (regression guard for the Greptile P1)
  * counter incrementing when a summary was expected but missing
  * counter resetting to 0 on successful summary storage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: iterate all observation blocks; don't count skip_summary as failure

Addresses CodeRabbit review on #2072:

- coerceObservationToSummary now iterates all <observation> blocks
  with a global regex and returns the first block that has title,
  narrative, or facts. Previously, an empty leading observation
  would short-circuit and discard populated follow-ups.

- Circuit-breaker counter now treats explicit <skip_summary/> as
  neutral — neither a failure nor a success — so a run that happens
  to end on a skip doesn't punish the session or mask a prior bad
  streak. Real failures (no summary, no skip) still increment.

- Tests added for both cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: reference SUMMARY_MODE_MARKER constant instead of hardcoded string

Addresses CodeRabbit nitpick: tests should pull the marker from the
canonical source so they don't silently drift when the constant is
renamed or edited.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: also coerce observations when <summary> has empty sub-tags

When the LLM wraps an empty <summary></summary> around real observation
content, the #1360 empty-subtag guard rejects the summary and returns
null — which would lose the observation content and resurrect the
#1633 retry loop. Fall back to coerceObservationToSummary in that
branch too, mirroring the unmatched-<summary> path.

Adds a test covering the empty-summary-wraps-observation case and
a guard test for empty summary with no observation content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>
Co-authored-by: Alex Newman <thedotmack@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 12:00:38 -07:00
Alex Newman 40a25e0225 Merge pull request #1676 from ousamabenyounes/fix/issue-1625
fix: filter ghost observations with no content fields (#1625)
2026-04-14 18:41:51 -07:00
Ousama Ben Younes e398212983 fix: filter ghost observations with no content fields (#1625)
When the LLM is overwhelmed by large context it can emit bare
<observation/> blocks (or ones containing only <type>). These are
stored as rows where title, narrative, facts and concepts are all
null/empty, appearing as meaningless "Untitled" entries in the context
window. Add a guard in parseObservations() that skips any observation
where every content field is null/empty before pushing it to the
result array.

Generated by Claude Code
Vibe coded by ousamabenyounes

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-10 09:40:40 +00:00
Alex Newman 36de44d661 Merge branch 'pr-1556' into integration/validation-batch 2026-04-06 14:18:28 -07:00
Alex Newman 9063c5d8a7 fix: block memory agent prose-skip responses at prompt and runtime levels
Observer prompt now explicitly requires XML observation blocks or empty
responses — prose explanations like "Skipping" are discarded. ResponseProcessor
logs a warning when non-XML content is received. Recording focus expanded to
include concrete debugging findings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 19:39:01 -07:00
Ousama Ben Younes 93a30c5c8f fix: skip parseSummary false positives with no sub-tags (#1360)
When an observation response accidentally contains a <summary> tag with
plain text (no <request>/<investigated>/etc. sub-tags), parseSummary was
creating empty SESSION SUMMARY records with all fields as empty strings.

Add an all-null guard AFTER field extraction: if none of the 5 sub-tags
matched, the <summary> match is a false positive and we return null.

This is distinct from the commented-out validation above (which rejected
summaries with SOME missing fields). We only reject when ALL are absent —
real partial summaries are still saved per the maintainer's explicit note.

Closes #1360

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-01 06:51:33 +00:00