CLIProxyAPI

Author	SHA1	Message	Date
VooDisss	35f158d526	refactor(pr): narrow Codex cache fix scope Remove the experimental auth-affinity routing changes from this PR so it stays focused on the validated Codex continuity fix. This keeps the prompt-cache repair while avoiding unrelated routing-policy concerns such as provider/model affinity scope, lifecycle cleanup, and hard-pin fallback semantics.	2026-03-27 19:06:34 +02:00
VooDisss	6962e09dd9	fix(auth): scope affinity by provider Keep sticky auth affinity limited to matching providers and stop persisting execution-session IDs as long-lived affinity keys so provider switching and normal streaming traffic do not create incorrect pins or stale affinity state.	2026-03-27 18:52:58 +02:00
VooDisss	4c4cbd44da	fix(auth): avoid leaking or over-persisting affinity keys Stop using one-shot idempotency keys as long-lived auth-affinity identifiers and remove raw affinity-key values from debug logs so sticky routing keeps its continuity benefits without creating avoidable memory growth or credential exposure risks.	2026-03-27 18:34:51 +02:00
VooDisss	26eca8b6ba	fix(codex): preserve continuity and safe affinity fallback Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.	2026-03-27 18:27:33 +02:00
VooDisss	62b17f40a1	refactor(codex): align continuity helpers with review feedback Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.	2026-03-27 18:11:57 +02:00
VooDisss	511b8a992e	fix(codex): restore prompt cache continuity for Codex requests Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself. The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context. This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists. The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse. To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.	2026-03-27 17:49:29 +02:00
hkfires	fee736933b	feat(openai-compat): add per-model thinking support	2026-03-24 14:21:12 +08:00
Luis Pater	0906aeca87	Merge pull request #2254 from clcc2019/main refactor: streamline usage reporting by consolidating record publishi…	2026-03-24 00:39:31 +08:00
Luis Pater	a576088d5f	Merge pull request #2222 from kaitranntt/kai/fix/758-openai-proxy-alternating-model-support fix: fall back on model support errors during auth rotation	2026-03-24 00:03:28 +08:00
Luis Pater	66ff916838	Merge pull request #2220 from xulongwu4/main fix: normalize model name in TranslateRequest fallback to prevent prefix leak	2026-03-23 23:56:15 +08:00
Luis Pater	7b0453074e	Merge pull request #2219 from beck-8/fix/context-done-race fix: avoid data race when watching request cancellation	2026-03-23 22:57:21 +08:00
dslife2025	0ed2d16596	Merge branch 'router-for-me:main' into main	2026-03-23 09:50:43 +08:00
clcc2019	c1bf298216	refactor: streamline usage reporting by consolidating record publishing logic - Introduced a new method `buildRecord` in `usageReporter` to encapsulate record creation, improving code readability and maintainability. - Added latency tracking to usage records, ensuring accurate reporting of request latencies. - Updated tests to validate the inclusion of latency in usage records and ensure proper functionality of the new reporting structure.	2026-03-20 19:44:26 +08:00
hkfires	636da4c932	refactor(auth): replace manual input handling with AsyncPrompt for callback URLs	2026-03-20 12:24:27 +08:00
hkfires	cccb77b552	fix(auth): avoid blocking oauth callback wait on prompt	2026-03-20 11:48:30 +08:00
Luis Pater	2bd646ad70	refactor: replace `sjson.Set` usage with `sjson.SetBytes` to optimize mutable JSON transformations	2026-03-19 17:58:54 +08:00
Tam Nhu Tran	ea3e0b713e	fix: harden pooled model-support fallback state	2026-03-18 13:19:20 -04:00
Tam Nhu Tran	5135c22cd6	fix: fall back on model support errors during auth rotation	2026-03-18 12:43:45 -04:00
Longwu Ou	1e27990561	address PR review: log sjson error and add unit tests - Log a warning instead of silently ignoring sjson.SetBytes errors in the TranslateRequest fallback path - Add registry_test.go with tests covering the fallback model normalization and verifying registered transforms take precedence	2026-03-18 12:43:40 -04:00
Longwu Ou	e1e9fc43c1	fix: normalize model name in TranslateRequest fallback to prevent prefix leak When no request translator is registered for a format pair (e.g. openai-response → openai-response), TranslateRequest returned the raw payload unchanged. This caused client-side model prefixes (e.g. "copilot/gpt-5-mini") to leak into upstream requests, resulting in "The requested model is not supported" errors from providers. The fallback path now updates the "model" field in the payload to match the resolved model name before returning.	2026-03-18 12:30:22 -04:00
beck-8	b2921518ac	fix: avoid data race when watching request cancellation	2026-03-19 00:15:52 +08:00
Luis Pater	dc7187ca5b	fix(websocket): pin only websocket-capable auth IDs and add corresponding test	2026-03-16 09:57:38 +08:00
Luis Pater	b5701f416b	Fixed: #2102 fix(auth): ensure unique auth index for shared API keys across providers and credential identities	2026-03-15 02:48:54 +08:00
hkfires	58fd9bf964	fix(codex): add 'go' plan_type in registerModelsForAuth	2026-03-14 22:09:14 +08:00
hkfires	f44f0702f8	feat(service): extend model registration for team and business types	2026-03-13 14:12:19 +08:00
hkfires	c3d5dbe96f	feat(model_registry): enhance model registration and refresh mechanisms	2026-03-13 10:56:39 +08:00
hkfires	dea3e74d35	feat(antigravity): refactor model handling and remove unused code	2026-03-12 09:24:45 +08:00
Luis Pater	ddaa9d2436	Fixed: #2034 feat(proxy): centralize proxy handling with `proxyutil` package and enhance test coverage - Added `proxyutil` package to simplify proxy handling across the codebase. - Refactored various components (`executor`, `cliproxy`, `auth`, etc.) to use `proxyutil` for consistent and reusable proxy logic. - Introduced support for "direct" proxy mode to explicitly bypass all proxies. - Updated tests to validate proxy behavior (e.g., `direct`, HTTP/HTTPS, and SOCKS5). - Enhanced YAML configuration documentation for proxy options.	2026-03-11 11:08:02 +08:00
hkfires	d1e3195e6f	feat(codex): register models by plan tier	2026-03-10 11:20:37 +08:00
Luis Pater	ce53d3a287	Fixed: #1997 test(auth-scheduler): add benchmarks and priority-based scheduling improvements - Added `BenchmarkManagerPickNextMixedPriority500` for mixed-priority performance assessment. - Updated `pickNextMixed` to prioritize highest ready priority tiers. - Introduced `highestReadyPriorityLocked` and `pickReadyAtPriorityLocked` for better scheduling logic. - Added unit test to validate selection of highest priority tiers in mixed provider scenarios.	2026-03-09 22:27:15 +08:00
Supra4E8C	fc2f0b6983	fix: cap websocket body log growth	2026-03-09 17:48:30 +08:00
Luis Pater	f5941a411c	test(auth): cover scheduler refresh regression paths	2026-03-09 09:27:56 +08:00
DragonFSKY	90afb9cb73	fix(auth): new OAuth accounts invisible to scheduler after dynamic registration When new OAuth auth files are added while the service is running, `applyCoreAuthAddOrUpdate` calls `coreManager.Register()` (which upserts into the scheduler) BEFORE `registerModelsForAuth()`. At upsert time, `buildScheduledAuthMeta` snapshots `supportedModelSetForAuth` from the global model registry — but models haven't been registered yet, so the set is empty. With an empty `supportedModelSet`, `supportsModel()` always returns false and the new auth is never added to any model shard. Additionally, when all existing accounts are in cooldown, the scheduler returns `modelCooldownError`, but `shouldRetrySchedulerPick` only handles `Error` types — so the `syncScheduler` safety-net rebuild never triggers and the new accounts remain invisible. Fix: 1. Add `RefreshSchedulerEntry()` to re-upsert a single auth after its models are registered, rebuilding `supportedModelSet` from the now-populated registry. 2. Call it from `applyCoreAuthAddOrUpdate` after `registerModelsForAuth`. 3. Make `shouldRetrySchedulerPick` also match `modelCooldownError` so the full scheduler rebuild triggers when all credentials are cooling down — catching any similar stale-snapshot edge cases.	2026-03-09 03:11:47 +08:00
Luis Pater	2b134fc378	test(auth-scheduler): add unit tests and scheduler implementation - Added comprehensive unit tests for `authScheduler` and related components. - Implemented `authScheduler` with support for Round Robin, Fill First, and custom selector strategies. - Improved tracking of auth states, cooldowns, and recovery logic in scheduler.	2026-03-08 05:52:55 +08:00
Luis Pater	b9153719b0	Merge pull request #1925 from shenshuoyaoyouguang/pr/openai-compat-pool-thinking fix(openai-compat): improve pool fallback and preserve adaptive thinking	2026-03-08 01:05:05 +08:00
chujian	a52da26b5d	fix(auth): stop draining stream pool goroutines after context cancellation	2026-03-07 18:30:33 +08:00
chujian	522a68a4ea	fix(openai-compat): retry empty bootstrap streams	2026-03-07 18:08:13 +08:00
chujian	a02eda54d0	fix(openai-compat): address review feedback	2026-03-07 17:39:42 +08:00
chujian	7c1299922e	fix(openai-compat): improve pool fallback and preserve adaptive thinking	2026-03-07 16:54:28 +08:00
Luis Pater	ddcf1f279d	Fixed: #1901 test(websocket): add tests for incremental input and prewarm handling logic - Added test cases for incremental input support based on upstream capabilities. - Introduced validation for prewarm handling of `response.create` messages locally. - Enhanced test coverage for websocket executor behavior, including payload forwarding checks. - Updated websocket implementation with prewarm and incremental input logic for better testability.	2026-03-07 13:11:28 +08:00
Luis Pater	5ebc58fab4	refactor(executor): remove legacy `connCreateSent` logic and standardize `response.create` usage for all websocket events - Simplified connection logic by removing `connCreateSent` and related state handling. - Updated `buildCodexWebsocketRequestBody` to always use `response.create`. - Added unit tests to validate `response.create` behavior and beta header preservation. - Dropped unsupported `response.append` and outdated `response.done` event types.	2026-03-07 09:07:23 +08:00
hkfires	48ffc4dee7	feat(config): support excluded vertex models in config	2026-03-04 18:47:42 +08:00
Luis Pater	b48485b42b	Fixed: #822 fix(auth): normalize ID casing on Windows to prevent duplicate entries due to case-insensitive paths	2026-03-04 02:31:20 +08:00
Luis Pater	79009bb3d4	Fixed: #797 test(auth): add test for preserving ModelStates during auth updates	2026-03-04 02:06:24 +08:00
hkfires	532107b4fa	test(auth): add global model registry usage to conductor override tests	2026-03-03 09:18:56 +08:00
Luis Pater	cc1d8f6629	Fixed: #1747 feat(auth): add configurable max-retry-credentials for finer control over cross-credential retries	2026-03-01 02:42:36 +08:00
Luis Pater	27c68f5bb2	fix(auth): replace MarkResult with hook OnResult for result handling	2026-02-27 20:47:46 +08:00
Luis Pater	74bf7eda8f	Merge pull request #1686 from lyd123qw2008/fix/auth-refresh-concurrency-limit fix(auth): limit auto-refresh concurrency to prevent refresh storms	2026-02-27 05:59:20 +08:00
Luis Pater	24bcfd9c03	Merge pull request #1699 from 123hi123/fix/antigravity-primary-model-fallback fix(antigravity): keep primary model list and backfill empty auths	2026-02-26 04:28:29 +08:00
Luis Pater	aa1da8a858	Merge pull request #1685 from lyd123qw2008/fix/auth-auto-refresh-interval fix(auth): respect configured auto-refresh interval	2026-02-25 01:13:47 +08:00
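
Note on commit 511b8a992e: the continuity-key resolution order described in that message can be sketched roughly as below. This is an illustrative sketch only, not the repository's code. The type `continuitySignals`, the functions `resolveContinuityKey` and `stableHash`, the source labels, and the choice of a truncated SHA-256 for the "stable hash" are all assumptions made for this example. Only the precedence order and the field names `prompt_cache_key` and `session_id` come from the commit message.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// continuitySignals is a hypothetical container for the inputs the commit
// message lists; the real code may carry them in different structures.
type continuitySignals struct {
	PromptCacheKey   string // explicit prompt_cache_key from the caller
	ExecutionSession string // execution session metadata
	IdempotencyKey   string // explicit idempotency key
	RequestAffinity  string // stable request-affinity metadata
	ClientPrincipal  string // client principal, hashed before use
	AuthID           string // selected auth ID, hashed as the last resort
}

// stableHash derives a deterministic, non-reversible key from a value.
// Truncated SHA-256 is an assumption; the commit only says "stable hash".
func stableHash(prefix, value string) string {
	sum := sha256.Sum256([]byte(prefix + ":" + value))
	return prefix + "-" + hex.EncodeToString(sum[:8])
}

// resolveContinuityKey returns the first available signal in the documented
// order, together with a label naming which signal was used.
func resolveContinuityKey(s continuitySignals) (key, source string) {
	direct := []struct{ source, value string }{
		{"prompt_cache_key", s.PromptCacheKey},
		{"execution_session", s.ExecutionSession},
		{"idempotency_key", s.IdempotencyKey},
		{"request_affinity", s.RequestAffinity},
	}
	for _, c := range direct {
		if v := strings.TrimSpace(c.value); v != "" {
			return v, c.source
		}
	}
	if v := strings.TrimSpace(s.ClientPrincipal); v != "" {
		return stableHash("principal", v), "client_principal"
	}
	if v := strings.TrimSpace(s.AuthID); v != "" {
		return stableHash("auth", v), "auth_id"
	}
	return "", "none"
}

func main() {
	key, source := resolveContinuityKey(continuitySignals{
		IdempotencyKey: "idem-1",
		AuthID:         "auth-42",
	})

	// The same key goes into both the request body and the request headers,
	// so repeated requests present one upstream cache/session identity.
	body := map[string]string{"prompt_cache_key": key}
	headers := map[string]string{"session_id": key}

	fmt.Println(source, body["prompt_cache_key"] == headers["session_id"])
}
```

Because the two hashed fallbacks are deterministic, repeated requests that carry no explicit signal still resolve to the same key, which is the property the commit message relies on for cache reuse. Later commits in this log (4c4cbd44da, 35f158d526) narrow the separate auth-affinity routing; this sketch covers only the continuity key, not auth selection.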


336 Commits