Commit Graph

69 Commits

Author SHA1 Message Date
sususu98 7c24d54ca8 feat(session-affinity): add session-sticky routing for multi-account load balancing
When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.

Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
  session-to-auth binding; automatic failover when bound auth is
  unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
  goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
  resources, called during Manager.StopAutoRefresh()

Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
   user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
   (fallback for clients with no explicit session ID); returns both a
   full hash (primaryID) and a short hash without assistant content
   (fallbackID) to inherit bindings from the first turn

Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).

Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.

Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
2026-04-16 00:18:47 +08:00
Luis Pater 5bfaf8086b feat(auth): add configurable worker pool size for auto-refresh loop
- Introduced `auth-auto-refresh-workers` config option to override default concurrency.
- Updated `authAutoRefreshLoop` to support customizable worker counts.
- Enhanced token refresh scheduling flexibility by aligning worker pool with runtime configurations.
2026-04-12 13:56:05 +08:00
Luis Pater 6c0a1efd71 refactor(auth): simplify auth directory scanning and improve JSON processing logic
- Replaced `filepath.Walk` with `os.ReadDir` for cleaner directory traversal.
- Fixed `isAuthJSON` check to use `filepath.Dir` for directory comparison.
- Updated auth hash cache generation and file synthesis to improve readability and maintainability.
2026-04-12 13:32:03 +08:00
Luis Pater a583463d60 feat(auth): implement auto-refresh loop for managing auth token schedule
- Introduced `authAutoRefreshLoop` to handle token refresh scheduling.
- Replaced semaphore-based refresh logic in `Manager` with the new loop.
- Added unit tests to verify refresh schedule logic and edge cases.
2026-04-12 02:06:40 +08:00
Luis Pater ad8e3964ff fix(auth): add retry logic for 429 status with Retry-After and improve testing 2026-04-09 07:07:19 +08:00
Luis Pater 941334da79 fix(auth): handle OAuth model alias in retry logic and refine Qwen quota handling 2026-04-09 03:44:19 +08:00
Luis Pater 6a27bceec0 Merge pull request #2576 from zilianpn/fix/disable-cooling-auth-errors
fix(auth): honor disable-cooling and enrich no-auth errors
2026-04-07 09:56:25 +08:00
zilianpn 0ea768011b fix(auth): honor disable-cooling and enrich no-auth errors 2026-04-07 01:12:13 +08:00
Luis Pater ea43361492 Merge pull request #2121 from destinoantagonista-wq/main
Reconcile registry model states on auth changes
2026-04-06 09:13:27 +08:00
MonsterQiu f611dd6e96 refactor(auth): dedupe route-aware model support checks 2026-03-31 15:42:25 +08:00
MonsterQiu 07b7c1a1e0 fix(auth): resolve oauth aliases before suspension checks 2026-03-31 14:27:14 +08:00
Luis Pater 17363edf25 fix(auth): skip downtime for request-scoped 404 errors in model state management 2026-03-30 22:22:42 +08:00
Luis Pater 73c831747b Merge pull request #2133 from DragonFSKY/fix/2061-stale-modelstates
fix(auth): prevent stale runtime state inheritance from disabled auth entries
2026-03-28 20:50:57 +08:00
Tam Nhu Tran ea3e0b713e fix: harden pooled model-support fallback state 2026-03-18 13:19:20 -04:00
Tam Nhu Tran 5135c22cd6 fix: fall back on model support errors during auth rotation 2026-03-18 12:43:45 -04:00
DragonFSKY 5c817a9b42 fix(auth): prevent stale ModelStates inheritance from disabled auth entries
When an auth file is deleted and re-created with the same path/ID, the
new auth could inherit stale ModelStates (cooldown/backoff) from the
previously disabled entry, preventing it from being routed.

Gate runtime state inheritance (ModelStates, LastRefreshedAt,
NextRefreshAfter) on both existing and incoming auth being non-disabled
in Manager.Update and Service.applyCoreAuthAddOrUpdate.

Closes #2061
2026-03-14 23:46:23 +08:00
destinoantagonista-wq f09ed25fd3 fix(auth): tighten registry model reconciliation 2026-03-14 14:40:06 +00:00
destinoantagonista-wq e166e56249 Reconcile registry model states on auth changes
Add Manager.ReconcileRegistryModelStates to clear stale per-model runtime failures for models currently registered in the global model registry. The method finds models supported for an auth, resets non-clean ModelState entries, updates aggregated availability, persists changes, and pushes a snapshot to the scheduler. Introduce modelStateIsClean helper to determine when a model state needs resetting. Call ReconcileRegistryModelStates from Service paths that register/refresh models (applyCoreAuthAddOrUpdate and refreshModelRegistrationForAuth) to keep the scheduler and global registry aligned after model re-registration.
2026-03-13 19:41:49 +00:00
DragonFSKY 90afb9cb73 fix(auth): new OAuth accounts invisible to scheduler after dynamic registration
When new OAuth auth files are added while the service is running,
`applyCoreAuthAddOrUpdate` calls `coreManager.Register()` (which upserts
into the scheduler) BEFORE `registerModelsForAuth()`. At upsert time,
`buildScheduledAuthMeta` snapshots `supportedModelSetForAuth` from the
global model registry — but models haven't been registered yet, so the
set is empty. With an empty `supportedModelSet`, `supportsModel()`
always returns false and the new auth is never added to any model shard.

Additionally, when all existing accounts are in cooldown, the scheduler
returns `modelCooldownError`, but `shouldRetrySchedulerPick` only
handles `*Error` types — so the `syncScheduler` safety-net rebuild
never triggers and the new accounts remain invisible.

Fix:
1. Add `RefreshSchedulerEntry()` to re-upsert a single auth after its
   models are registered, rebuilding `supportedModelSet` from the
   now-populated registry.
2. Call it from `applyCoreAuthAddOrUpdate` after `registerModelsForAuth`.
3. Make `shouldRetrySchedulerPick` also match `*modelCooldownError` so
   the full scheduler rebuild triggers when all credentials are cooling
   down — catching any similar stale-snapshot edge cases.
2026-03-09 03:11:47 +08:00
Luis Pater 2b134fc378 test(auth-scheduler): add unit tests and scheduler implementation
- Added comprehensive unit tests for `authScheduler` and related components.
- Implemented `authScheduler` with support for Round Robin, Fill First, and custom selector strategies.
- Improved tracking of auth states, cooldowns, and recovery logic in scheduler.
2026-03-08 05:52:55 +08:00
chujian a52da26b5d fix(auth): stop draining stream pool goroutines after context cancellation 2026-03-07 18:30:33 +08:00
chujian 522a68a4ea fix(openai-compat): retry empty bootstrap streams 2026-03-07 18:08:13 +08:00
chujian a02eda54d0 fix(openai-compat): address review feedback 2026-03-07 17:39:42 +08:00
chujian 7c1299922e fix(openai-compat): improve pool fallback and preserve adaptive thinking 2026-03-07 16:54:28 +08:00
Luis Pater 79009bb3d4 Fixed: #797
**test(auth): add test for preserving ModelStates during auth updates**
2026-03-04 02:06:24 +08:00
Luis Pater cc1d8f6629 Fixed: #1747
docker-image / docker_amd64 (push) Has been cancelled
docker-image / docker_arm64 (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
docker-image / docker_manifest (push) Has been cancelled
feat(auth): add configurable max-retry-credentials for finer control over cross-credential retries
2026-03-01 02:42:36 +08:00
Luis Pater 27c68f5bb2 fix(auth): replace MarkResult with hook OnResult for result handling
docker-image / docker_amd64 (push) Has been cancelled
docker-image / docker_arm64 (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
docker-image / docker_manifest (push) Has been cancelled
2026-02-27 20:47:46 +08:00
Luis Pater 74bf7eda8f Merge pull request #1686 from lyd123qw2008/fix/auth-refresh-concurrency-limit
fix(auth): limit auto-refresh concurrency to prevent refresh storms
2026-02-27 05:59:20 +08:00
lyd123qw2008 0aaf177640 fix(auth): limit auto-refresh concurrency to prevent refresh storms 2026-02-23 22:28:41 +08:00
lyd123qw2008 450d1227bd fix(auth): respect configured auto-refresh interval 2026-02-23 22:07:50 +08:00
Luis Pater 61da7bd981 Merge PR #1626 into codex/pr-1626 2026-02-19 04:49:14 +08:00
Luis Pater bb86a0c0c4 feat(logging, executor): add request logging tests and WebSocket-based Codex executor
- Introduced unit tests for request logging middleware to enhance coverage.
- Added WebSocket-based Codex executor to support Responses API upgrade.
- Updated middleware logic to selectively capture request bodies for memory efficiency.
- Enhanced Codex configuration handling with new WebSocket attributes.
2026-02-19 01:57:02 +08:00
Kirill Turanskiy 1f8f198c45 feat: passthrough upstream response headers to clients
CPA previously stripped ALL response headers from upstream AI provider
APIs, preventing clients from seeing rate-limit info, request IDs,
server-timing and other useful headers.

Changes:
- Add Headers field to Response and StreamResult structs
- Add FilterUpstreamHeaders helper (hop-by-hop + security denylist)
- Add WriteUpstreamHeaders helper (respects CPA-set headers)
- ExecuteWithAuthManager/ExecuteCountWithAuthManager now return headers
- ExecuteStreamWithAuthManager returns headers from initial connection
- All 11 provider executors populate Response.Headers
- All handler call sites write filtered upstream headers before response

Filtered headers (not forwarded):
- RFC 7230 hop-by-hop: Connection, Transfer-Encoding, Keep-Alive, etc.
- Security: Set-Cookie
- CPA-managed: Content-Length, Content-Encoding
2026-02-18 00:16:22 +03:00
Luis Pater 46a6782065 refactor(all): replace manual pointer assignments with new to enhance code readability and maintainability 2026-02-15 14:10:10 +08:00
Luis Pater f410dd0440 Merge pull request #1390 from sususu98/fix/400-invalid-request-no-retry
fix(auth): 400 invalid_request_error 立即返回不再重试
2026-02-07 03:14:25 +08:00
sususu98 233be6272a fix(auth): 400 invalid_request_error 立即返回不再重试
当上游返回 400 Bad Request 且错误消息包含 invalid_request_error 时,
表示请求本身格式错误,切换账户不会改变结果。

修改:
- 添加 isRequestInvalidError 判定函数
- 内层循环遇到此错误立即返回,不遍历其他账户
- 外层循环不再对此类错误进行重试
2026-02-02 17:35:51 +08:00
chujian 47cb52385e sdk/cliproxy/auth: update selector tests 2026-02-02 05:26:04 +08:00
Luis Pater e93e05ae25 refactor: consolidate channel send logic with context-safe handlers
Optimize channel operations by introducing reusable context-aware send functions (`send` and `sendErr`) across `wsrelay`, `handlers`, and `cliproxy`. Ensure graceful handling of canceled contexts during stream operations.
2026-01-28 10:58:35 +08:00
Luis Pater 70897247b2 feat(auth): add support for request_retry and disable_cooling overrides
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
Implement `request_retry` and `disable_cooling` metadata overrides for authentication management. Update retry and cooling logic accordingly across `Manager`, Antigravity executor, and file synthesizer. Add tests to validate new behaviors.
2026-01-26 21:59:08 +08:00
Luis Pater 9c341f5aa5 feat(auth): add skip persistence context key for file watcher events
Introduce `WithSkipPersist` to disable persistence during Manager Update/Register calls, preventing write-back loops caused by redundant file writes. Add corresponding tests and integrate with existing file store and conductor logic.
2026-01-26 18:20:19 +08:00
Luis Pater c32e2a8196 fix(auth): handle context cancellation in executor methods
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2026-01-24 04:56:55 +08:00
Chén Mù f353a54555 Merge pull request #1171 from router-for-me/auth
refactor(auth): remove unused provider execution helpers
2026-01-23 19:43:42 +08:00
Chén Mù 1d6e2e751d Merge pull request #1140 from sxjeru/main
fix(auth): handle quota cooldown in retry logic for transient errors
2026-01-23 19:43:17 +08:00
hkfires cc50b63422 refactor(auth): remove unused provider execution helpers 2026-01-23 19:12:55 +08:00
hkfires 81b369aed9 fix(auth): include requested model in executor metadata 2026-01-23 18:30:08 +08:00
sxjeru 30a59168d7 fix(auth): handle quota cooldown in retry logic for transient errors 2026-01-21 21:48:23 +08:00
hkfires fe5b3c80cb refactor(config): rename oauth-model-mappings to oauth-model-alias 2026-01-15 18:03:26 +08:00
hkfires 8bc6df329f fix(auth): apply API key model mapping to request model 2026-01-15 13:06:41 +08:00
hkfires 847be0e99d fix(auth): use base model name for auth matching by stripping suffix 2026-01-15 13:06:41 +08:00
hkfires f6a2d072e6 refactor(thinking): refine configuration logging 2026-01-15 13:06:41 +08:00