67 Commits

Author SHA1 Message Date
canxin121 acf483c9e6 fix(responses): reject invalid SSE data JSON
Guard the openai-response streaming path against truncated/invalid SSE data payloads by validating data: JSON before forwarding; surface a 502 terminal error instead of letting clients crash with JSON parse errors.
2026-02-24 01:42:54 +08:00
canxin121 49c8ec69d0 fix(openai): emit valid responses stream error chunks
When /v1/responses streaming fails after headers are sent, we now emit a type=error chunk instead of an HTTP-style {error:{...}} payload, preventing AI SDK chunk validation errors.
2026-02-23 12:59:50 +08:00
Luis Pater 4445a165e9 test(handlers): add tests for passthrough headers behavior in WriteErrorResponse
docker-image / docker_amd64 (push) Has been cancelled
docker-image / docker_arm64 (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
docker-image / docker_manifest (push) Has been cancelled
2026-02-19 21:49:44 +08:00
Luis Pater a6bdd9a652 feat: add passthrough headers configuration
- Introduced `passthrough-headers` option in configuration to control forwarding of upstream response headers.
- Updated handlers to respect the passthrough headers setting.
- Added tests to verify behavior when passthrough is enabled or disabled.
2026-02-19 21:31:29 +08:00
Luis Pater 2789396435 fix: ensure connection-scoped headers are filtered in upstream requests
- Added `connectionScopedHeaders` utility to respect "Connection" header directives.
- Updated `FilterUpstreamHeaders` to remove connection-scoped headers dynamically.
- Refactored and tested upstream header filtering with additional validations.
- Adjusted upstream header handling during retries to replace headers safely.
2026-02-19 13:19:10 +08:00
Luis Pater 61da7bd981 Merge PR #1626 into codex/pr-1626 2026-02-19 04:49:14 +08:00
Luis Pater 55f938164b Merge pull request #1618 from alexey-yanchenko/fix/completions-usage
Fix empty usage in /v1/completions
2026-02-19 03:57:11 +08:00
Luis Pater bb86a0c0c4 feat(logging, executor): add request logging tests and WebSocket-based Codex executor
- Introduced unit tests for request logging middleware to enhance coverage.
- Added WebSocket-based Codex executor to support Responses API upgrade.
- Updated middleware logic to selectively capture request bodies for memory efficiency.
- Enhanced Codex configuration handling with new WebSocket attributes.
2026-02-19 01:57:02 +08:00
Kirill Turanskiy 1f8f198c45 feat: passthrough upstream response headers to clients
CPA previously stripped ALL response headers from upstream AI provider
APIs, preventing clients from seeing rate-limit info, request IDs,
server-timing and other useful headers.

Changes:
- Add Headers field to Response and StreamResult structs
- Add FilterUpstreamHeaders helper (hop-by-hop + security denylist)
- Add WriteUpstreamHeaders helper (respects CPA-set headers)
- ExecuteWithAuthManager/ExecuteCountWithAuthManager now return headers
- ExecuteStreamWithAuthManager returns headers from initial connection
- All 11 provider executors populate Response.Headers
- All handler call sites write filtered upstream headers before response

Filtered headers (not forwarded):
- RFC 7230 hop-by-hop: Connection, Transfer-Encoding, Keep-Alive, etc.
- Security: Set-Cookie
- CPA-managed: Content-Length, Content-Encoding
2026-02-18 00:16:22 +03:00
Alexey Yanchenko 709d999f9f Add usage to /v1/completions 2026-02-17 17:21:03 +07:00
Luis Pater 46a6782065 refactor(all): replace manual pointer assignments with new to enhance code readability and maintainability 2026-02-15 14:10:10 +08:00
Luis Pater a5a25dec57 refactor(translator, executor): remove redundant bytes.Clone calls for improved performance
docker-image / docker_amd64 (push) Has been cancelled
docker-image / docker_arm64 (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
docker-image / docker_manifest (push) Has been cancelled
- Replaced all instances of `bytes.Clone` with direct references to enhance efficiency.
- Simplified payload handling across executors and translators by eliminating unnecessary data duplication.
2026-02-06 03:26:29 +08:00
Luis Pater 09ecfbcaed refactor(executor): optimize payload cloning and streamline SDK translator usage
- Replaced unnecessary `bytes.Clone` calls for `opts.OriginalRequest` throughout executors.
- Introduced intermediate variable `originalPayloadSource` to simplify payload processing.
- Ensured better clarity and structure in request translation logic.
2026-02-06 01:44:20 +08:00
Luis Pater 25c6b479c7 refactor(util, executor): optimize payload handling and schema processing
- Replaced repetitive string operations with a centralized `escapeGJSONPathKey` function.
- Streamlined handling of JSON schema cleaning for Gemini and Antigravity requests.
- Improved payload management by transitioning from byte slices to strings for processing.
- Removed unnecessary cloning of byte slices in several places.
2026-02-05 19:00:30 +08:00
Luis Pater c82d8e250a Merge pull request #1174 from lieyan666/fix/issue-1082-change-error-status-code
docker-image / docker_amd64 (push) Has been cancelled
docker-image / docker_arm64 (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
docker-image / docker_manifest (push) Has been cancelled
fix: change HTTP status code from 400 to 502 when no provider available
2026-02-01 07:10:52 +08:00
Luis Pater f887f9985d Merge pull request #1248 from shekohex/feat/responses-compact
docker-image / docker_amd64 (push) Has been cancelled
docker-image / docker_arm64 (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
docker-image / docker_manifest (push) Has been cancelled
feat(openai): add responses/compact support
2026-01-31 03:12:55 +08:00
sususu98 295f34d7f0 fix(logging): capture streaming TTFB on first chunk and make timestamps required
- Add firstChunkTimestamp field to ResponseWriterWrapper for sync capture
- Capture TTFB in Write() and WriteString() before async channel send
- Add SetFirstChunkTimestamp() to StreamingLogWriter interface
- Make requestTimestamp/apiResponseTimestamp required in LogRequest()
- Remove timestamp capture from WriteAPIResponse() (now via setter)
- Fix Gemini handler to set API_RESPONSE_TIMESTAMP before writing response

This ensures accurate TTFB measurement for all streaming API formats
(OpenAI, Gemini, Claude) by capturing timestamp synchronously when
the first response chunk arrives, not when the stream finalizes.
2026-01-29 22:32:24 +08:00
sususu98 c41ce77eea fix(logging): add API response timestamp and fix request timestamp timing
Previously:
- REQUEST INFO timestamp was captured at log write time (not request arrival)
- API RESPONSE had NO timestamp at all

This fix:
- Captures REQUEST INFO timestamp when request first arrives
- Adds API RESPONSE timestamp when upstream response arrives

Changes:
- Add Timestamp field to RequestInfo, set at middleware initialization
- Set API_RESPONSE_TIMESTAMP in appendAPIResponse() and gemini handler
- Pass timestamps through logging chain to writeNonStreamingLog()
- Add timestamp output to API RESPONSE section

This enables accurate measurement of backend response latency in error logs.
2026-01-29 22:22:18 +08:00
Luis Pater 4eb1e6093f feat(handlers): add test to verify no retries after partial stream response
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
Introduce `TestExecuteStreamWithAuthManager_DoesNotRetryAfterFirstByte` to validate that stream executions do not retry after receiving partial responses. Implement `payloadThenErrorStreamExecutor` for test coverage of this behavior.
2026-01-29 17:30:48 +08:00
Luis Pater e93e05ae25 refactor: consolidate channel send logic with context-safe handlers
Optimize channel operations by introducing reusable context-aware send functions (`send` and `sendErr`) across `wsrelay`, `handlers`, and `cliproxy`. Ensure graceful handling of canceled contexts during stream operations.
2026-01-28 10:58:35 +08:00
Shady Khalifa 53920b0399 fix(openai): drop stream for responses/compact 2026-01-27 18:27:34 +02:00
Shady Khalifa 95096bc3fc feat(openai): add responses/compact support 2026-01-26 16:36:01 +02:00
Luis Pater 2e6a2b655c Merge pull request #1132 from XYenon/fix/gemini-models-displayname-override
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
fix(gemini): preserve displayName and description in models list
2026-01-25 03:40:04 +08:00
Darley 46c6fb1e7a fix(api): enhance ClaudeModels response to align with api.anthropic.com 2026-01-24 04:41:08 +03:30
lieyan666 6da7ed53f2 fix: change HTTP status code from 400 to 502 when no provider available
Fixes #1082

When all Antigravity accounts are unavailable, the error response now returns
HTTP 502 (Bad Gateway) instead of HTTP 400 (Bad Request). This ensures that
NewAPI and other clients will retry the request on a different channel,
improving overall reliability.
2026-01-23 23:45:14 +08:00
hkfires ecc850bfb7 feat(executor): apply payload rules using requested model 2026-01-23 16:38:41 +08:00
XYenon 8c7c446f33 fix(gemini): preserve displayName and description in models list
Previously GeminiModels handler unconditionally overwrote displayName
and description with the model name, losing the original values defined
in model definitions (e.g., 'Gemini 3 Pro Preview').

Now only set these fields as fallback when they are missing or empty.
2026-01-22 15:19:27 +08:00
Luis Pater 384578a88c feat(cliproxy, gemini): improve ID matching logic and enrich normalized model output
- Enhanced ID matching in `cliproxy` by adding additional conditions to better handle ID equality cases.
- Updated `gemini` handlers to include `displayName` and `description` in normalized models for enriched metadata.
2026-01-17 04:44:09 +08:00
Luis Pater 526dd866ba refactor(gemini): replace static model handling with dynamic model registry lookup 2026-01-16 10:39:16 +08:00
hkfires 0b06d637e7 refactor: improve thinking logic 2026-01-15 13:06:39 +08:00
Luis Pater 43652d044c refactor(config): replace nonstream-keepalive with nonstream-keepalive-interval
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Updated `SDKConfig` to use `nonstream-keepalive-interval` (seconds) instead of the boolean `nonstream-keepalive`.
- Refactored handlers and logic to incorporate the new interval-based configuration.
- Updated config diff, tests, and example YAML to reflect the changes.
2026-01-13 03:14:38 +08:00
Luis Pater b1b379ea18 feat(api): add non-streaming keep-alive support for idle timeout prevention
- Introduced `StartNonStreamingKeepAlive` to emit periodic blank lines during non-streaming responses.
- Added `nonstream-keepalive` configuration option in `SDKConfig`.
- Updated handlers to utilize `StartNonStreamingKeepAlive` and ensure proper cleanup.
- Extended config diff and tests to include `nonstream-keepalive` changes.
2026-01-13 02:36:07 +08:00
hkfires 21ac161b21 fix(test): implement missing HttpRequest method in stream bootstrap mock 2026-01-12 16:33:43 +08:00
hkfires a95428f204 fix(handlers): preserve upstream response logs before duplicate detection 2025-12-28 22:35:36 +08:00
hkfires 3ca5fb1046 fix(handlers): match raw error text before JSON body for duplicate detection 2025-12-28 19:35:36 +08:00
hkfires a091d12f4e fix(logging): improve request/response capture 2025-12-28 19:04:31 +08:00
hkfires 09455f9e85 fix(config): make streaming keepalive and retries ints 2025-12-27 20:56:47 +08:00
Thai Nguyen Hung 54f71aa273 fix(test): remove extra argument from ExecuteStreamWithAuthManager call 2025-12-25 21:55:35 +07:00
hkfires e76ba0ede9 feat(logging): implement request ID tracking and propagation 2025-12-24 08:32:17 +08:00
Luis Pater f413feec61 refactor(handlers): streamline error and data channel handling in streaming logic
Improved consistency across OpenAI, Claude, and Gemini handlers by replacing initial `select` statement with a `for` loop for better readability and error-handling robustness.
2025-12-24 04:07:24 +08:00
gwizz 5bf89dd757 fix: keep streaming defaults legacy-safe 2025-12-23 00:53:18 +11:00
gwizz 4442574e53 fix: stop streaming loop on context cancel 2025-12-23 00:37:55 +11:00
gwizz 71a6dffbb6 fix: improve streaming bootstrap and forwarding 2025-12-22 23:34:23 +11:00
moxi 830fd8eac2 Fix responses-format handling for chat completions 2025-12-22 13:54:02 +08:00
Luis Pater 670685139a fix(api): update route patterns to support wildcards for Gemini actions
Normalize action handling by accommodating wildcard patterns in route definitions for Gemini endpoints. Adjust `request.Action` parsing logic to correctly process routes with prefixed actions.
2025-12-17 01:17:02 +08:00
Luis Pater 52b6306388 feat(config): add support for model prefixes and prefix normalization
Refactor model management to include an optional `prefix` field for model credentials, enabling better namespace handling. Update affected configuration files, APIs, and handlers to support prefix normalization and routing. Remove unused OpenAI compatibility provider logic to simplify processing.
2025-12-17 01:07:26 +08:00
hkfires 3bc489254b fix(api): prevent double logging for error responses
The WriteErrorResponse function now caches the error response body in the gin context.

The deferred request logger checks for this cached response. If an error response is found, it bypasses the standard response logging. This prevents scenarios where an error is logged twice or an empty payload log overwrites the original, more detailed error log.
2025-12-15 16:36:01 +08:00
hkfires 4c07ea41c3 feat(api): return structured JSON error responses
The API error handling is updated to return a structured JSON payload
instead of a plain text message. This provides more context and allows
clients to programmatically handle different error types.

The new error response has the following structure:
{
  "error": {
    "message": "...",
    "type": "..."
  }
}

The `type` field is determined by the HTTP status code, such as
`authentication_error`, `rate_limit_error`, or `server_error`.

If the underlying error message from an upstream service is already a
valid JSON string, it will be preserved and returned directly.

BREAKING CHANGE: API error responses are now in a structured JSON
format instead of plain text. Clients expecting plain text error
messages will need to be updated to parse the new JSON body.
2025-12-15 16:19:52 +08:00
teeverc 5ab3032335 Update sdk/api/handlers/claude/code_handlers.go
thank you gemini

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-12 00:26:01 -08:00
teeverc 1215c635a0 fix: flush Claude SSE chunks immediately to match OpenAI behavior
- Write each SSE chunk directly to c.Writer and flush immediately
- Remove buffered writer and ticker-based flushing that caused delayed output
- Add 500ms timeout case for consistency with OpenAI/Gemini handlers
- Clean up unused bufio import

This fixes the 'not streaming' issue where small responses were held
in the buffer until timeout/threshold was reached.

Amp-Thread-ID: https://ampcode.com/threads/T-019b1186-164e-740c-96ab-856f64ee6bee
Co-authored-by: Amp <amp@ampcode.com>
2025-12-12 00:14:19 -08:00