Commit Graph

104 Commits

Author SHA1 Message Date
Luis Pater ee6429cc75 **feat(registry): add Gemini 3 Pro Image Preview model and remove Claude Sonnet 4.5 Thinking**
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added new `Gemini 3 Pro Image Preview` model with detailed metadata and configuration.
- Removed outdated `Claude Sonnet 4.5 Thinking` model definition for cleanup and relevance.
2025-11-26 18:22:40 +08:00
nestharus d0e694d4ed feat(claude): add thinking model variants and beta headers support
- Add Claude thinking model definitions (sonnet-4-5-thinking, opus-4-5-thinking variants)
- Add Thinking support for antigravity models with -thinking suffix
- Add injectThinkingConfig() for automatic thinking budget based on model suffix
- Add resolveUpstreamModel() mappings for thinking variants to actual Claude models
- Add extractAndRemoveBetas() to convert betas array to anthropic-beta header
- Update applyClaudeHeaders() to merge custom betas from request body

Closes #324
2025-11-25 03:33:05 -08:00
Ben Vargas 0895533400 fix(registry): correct Claude Opus 4.5 created timestamp
Update epoch from 1730419200 (2024-11-01) to 1761955200 (2025-11-01).
2025-11-24 12:27:23 -07:00
Ben Vargas 43f007c234 feat(registry): add Claude Opus 4.5 model definition
Add support for claude-opus-4-5-20251101 with 200K context window
and 64K max output tokens.
2025-11-24 12:26:39 -07:00
Luis Pater db81331ae8 **refactor(middleware): extract request logging logic and optimize condition checks**
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added `shouldLogRequest` helper to simplify path-based request logging logic.
- Updated middleware to skip management endpoints for improved security.
- Introduced an explicit `nil` logger check for minimal overhead.
- Updated dependencies in `go.mod`.

**feat(auth): add handling for 404 response with retry logic**

- Introduced support for 404 `not_found` status with a 12-hour backoff period.
- Updated `manager.go` to align state and status messages for 404 scenarios.

**refactor(translator): comment out debug logging in Gemini responses request**
2025-11-20 23:20:40 +08:00
Luis Pater 371324c090 **feat(registry): expand Gemini model definitions and support Vertex AI** 2025-11-20 18:16:26 +08:00
Luis Pater 0586da9c2b **refactor(registry): move Gemini 3 Pro Preview model definition to base set** 2025-11-20 10:51:16 +08:00
Ben Vargas 782bba0bc4 feat(registry): enable gemini-3-pro-preview for gemini-cli provider
Add gemini-3-pro-preview model to GetGeminiCLIModels() to make it
available for OAuth-based Gemini CLI users, matching the model
already available in AI Studio provider.

Model spec:
- ID: gemini-3-pro-preview
- Version: 3.0
- Input: 1M tokens
- Output: 64K tokens
- Thinking: 128-32K tokens (dynamic)
2025-11-19 12:47:39 -07:00
Luis Pater bf116b68f8 **feat(registry): add GPT-5.1 Codex Max model definitions and support**
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced `gpt-5.1-codex-max` variants to model definitions (`low`, `medium`, `high`, `xhigh`).
- Updated executor logic to map effort levels for Codex Max models.
- Added `lastCodexMaxPrompt` processing for `gpt-5.1-codex-max` prompts.
- Defined instructions for `gpt-5.1-codex-max` in a new file: `codex_instructions/gpt-5.1-codex-max_prompt.md`.
2025-11-20 03:12:22 +08:00
Luis Pater 17016ae6a5 **feat(registry): add Gemini 3 Pro Preview model definition**
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-11-18 23:48:21 +08:00
Luis Pater 01b7b60901 **feat(registry): add Gemini 3 Pro Preview model definition** 2025-11-18 23:46:58 +08:00
Luis Pater 23a7633e6d **fix(registry): update Thinking parameters and replace Gemini-3 Preview with Gemini-2.5 Flash Lite**
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-11-18 11:51:52 +08:00
Luis Pater 772fa69515 Fixed: #254
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
**feat(registry): add Kimi-K2-Thinking model to model definitions**
2025-11-14 21:20:54 +08:00
Luis Pater 1ccb01631d **refactor(runtime): centralize reasoning effort logic for GPT models**
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
Extract reasoning effort mapping into a reusable function `setReasoningEffortByAlias` to reduce redundancy and improve maintainability. Introduce support for the "gpt-5.1-none" variant in the registry and runtime executor.
2025-11-14 17:24:40 +08:00
Ben Vargas cfbaed0e90 fix(runtime): remove gpt-5.1 minimal effort variant
Stop advertising and mapping the unsupported gpt-5.1-minimal variant in the model registry and Codex executor, and align bare gpt-5.1 requests to use medium reasoning effort like Codex CLI while preserving minimal for gpt-5.
2025-11-13 19:43:52 -07:00
Luis Pater 75b57bc112 Fixed: #246
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
feat(runtime): add support for GPT-5.1 models and variants

Introduce GPT-5.1 model family, including minimal, low, medium, high, Codex, and Codex Mini variants. Update tokenization and reasoning effort handling to accommodate new models in executor and registry.
2025-11-13 17:42:19 +08:00
TUGOhost 92f4278039 feat: add auto model resolution and model creation timestamp tracking
- Add 'created' field to model registry for tracking model creation time
- Implement GetFirstAvailableModel() to find the first available model by newest creation timestamp
- Add ResolveAutoModel() utility function to resolve "auto" model name to actual available model
- Update request handler to resolve "auto" model before processing requests
- Ensures automatic model selection when "auto" is specified as model name

This enables dynamic model selection based on availability and creation time, improving the user experience when no specific model is requested.
2025-11-11 20:30:09 +08:00
Luis Pater d745f07044 fix(registry): replace Gemini model list with updated stable and preview versions
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-11-08 15:51:57 +08:00
Jeff Nash fcb0293c0d feat(registry): add GPT-5 Codex Mini model variants
Adds three new Codex Mini model variants (mini, mini-medium, mini-high)
that map to codex-mini-latest. Codex Mini supports medium and high
reasoning effort levels only (no low/minimal). Base model defaults to
medium reasoning effort.
2025-11-07 17:07:39 -08:00
Luis Pater a7d105bd69 Fixed: #223
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
fix(registry): add `MiniMax-M2` model to registry definitions
2025-11-08 00:10:51 +08:00
Luis Pater 7516ac4ce7 fix(registry): add gemini-3-pro-preview-11-2025 model to Gemini CLI model definitions
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-11-06 08:47:17 +08:00
Luis Pater e18e288fda fix(registry): Remove gemini-2.5-flash-image Gemini models from gemini cli and add gemini-2.5-flash-image preview to AIStudio
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
These models were likely for internal preview or testing and are no longer relevant for public use.
2025-11-04 03:02:16 +08:00
Luis Pater 07da781336 feat(registry): add client model support check for executor filtering
- Introduced `ClientSupportsModel` function to `ModelRegistry` for verifying client support for specific models.
- Integrated model support validation into executor candidate filtering logic.
- Updated CLIProxy registry interface to include the new support check method.
2025-10-31 09:15:14 +08:00
hkfires 3d7aca22c0 feat(registry): add thinking budget support; populate Gemini models 2025-10-29 19:19:17 +08:00
hkfires 5dced4c0a6 feat(registry): unify Gemini models and add AI Studio set 2025-10-28 19:00:25 +08:00
Luis Pater 2d5d06c809 feat(registry): add Qwen3 Vision Model definition #164
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-10-27 00:41:05 +08:00
Luis Pater d225558dae feat: improve error handling with added status codes and headers
- Updated Execute methods to include enhanced error handling via `StatusCode` and `Headers` extraction.
- Introduced structured error responses for cooling down scenarios, providing additional metadata and retry suggestions.
- Refined quota management, allowing for differentiation between cool-down, disabled, and other block reasons.
- Improved model filtering logic based on client availability and suspension criteria.
2025-10-22 09:01:11 +08:00
Luis Pater 6b23e2da74 feat(claude): add Claude 4.5 Haiku model definition
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-10-16 04:53:07 +08:00
Luis Pater 20787cd107 feat(registry, executor, util): add support for gemini-2.5-flash-image-preview and improve aspect ratio handling
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced `gemini-2.5-flash-image-preview` model to the registry with updated definitions.
- Enhanced Gemini CLI and API executors to handle image aspect ratio adjustments for the new model.
- Added utility function to create base64 white image placeholders based on aspect ratio configurations.
2025-10-10 01:49:58 +08:00
Luis Pater b2cdbbdd47 feat(registry, executor): add support for glm-4.6 model and enhance Gemini CLI token handling
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Added `glm-4.6` model to registry and documentation.
- Updated Gemini CLI executor to pass configuration to `prepareGeminiCLITokenSource` for improved token management.
2025-10-09 20:57:18 +08:00
Luis Pater d45ebff66b feat(registry, executor): add support for gemini-2.5-flash-image model
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced `gemini-2.5-flash-image` model with updated definitions in registry.
- Enhanced model marker detection in Gemini CLI executor to include support for the new model.
2025-10-09 10:06:10 +08:00
Luis Pater bbdd68a8b4 feat(registry/runtime): add Gemini 2.5 model and increase buffer sizes
- Added new "Gemini 2.5 Flash Image Preview" model definition, with enhanced image generation capabilities.
- Increased scanner buffer size to 20,971,520 bytes across executors and translators to handle larger payloads.
2025-10-06 04:44:45 +08:00
hkfires 9abcaf177f feat(registry): Add display names and descriptions for iFlow models 2025-10-05 16:11:40 +08:00
hkfires b839e351c4 feat: Add support for iFlow provider 2025-10-05 15:51:09 +08:00
Luis Pater b2ca49376c feat(models): add support for Claude 4.5 Sonnet model in registry
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced new model definition for `Claude 4.5 Sonnet` with metadata and creation details.
- Ensures compatibility and access to the latest Claude model variant.
2025-09-30 01:58:16 +08:00
hkfires cd0b1be46c fix(log): Reduce noise on metadata updates and provider sync 2025-09-26 21:42:42 +08:00
hkfires 0f55e550cf refactor(registry): Preserve duplicate models in client registration
The `RegisterClient` function previously deduplicated the list of models provided by a client. This could lead to an inaccurate representation of the client's state if it intentionally registered the same model ID multiple times.

This change refactors the registration logic to store the raw, unfiltered list of models, preserving their original order and count.

A new `rawModelIDs` slice tracks the complete list for storage in `clientModels`, while the logic for processing changes continues to use a unique set of model IDs for efficiency. This ensures the registry's state accurately reflects what the client provides.
2025-09-26 19:38:44 +08:00
hkfires e1de04230f fix(registry): Reset client status on model re-registration
When a client re-registers with the model registry, its previous status for a given model (e.g., quota exceeded or suspended) was not being cleared. This could lead to a situation where a client is permanently unable to use a model even after re-registering.

This change ensures that when a client re-registers an existing model, its ID is removed from the model's `QuotaExceededClients` and `SuspendedClients` lists. This effectively resets the client's status for that model, allowing for a fresh start upon reconnection.
2025-09-26 19:19:24 +08:00
hkfires a887a337a5 fix(registry): Handle duplicate model IDs in client registration
The previous model registration logic used a set-like map to track the models associated with a client. This caused issues when a client registered multiple instances of the same model ID, as they were all treated as a single registration.

This commit refactors the registration logic to use count maps for both the old and new model lists. This allows the system to accurately track the number of instances for each model ID provided by a client.

The changes ensure that:
- When a client updates its model list, the exact number of added or removed instances for each model ID is correctly calculated.
- Provider counts are accurately incremented or decremented based on the number of model instances being added, removed, or having their provider changed.
- The registry correctly handles scenarios where a client reduces the number of duplicate model registrations (e.g., from `[A, A]` to `[A]`), properly deregistering the surplus instance.
2025-09-26 18:52:58 +08:00
hkfires 2717ba3e50 fix(registry): Avoid provider update when new provider is empty
When a client re-registered and changed its provider from a non-empty value to an empty string, the logic would still trigger a provider update for the client's models. An empty provider string should not cause an update.

This commit fixes this behavior by adding a check to ensure the new provider is a non-empty string before updating the model's provider information.

Additionally, the logic for detecting a provider change has been simplified by removing an unnecessary variable.
2025-09-26 18:32:47 +08:00
hkfires 63af4c551d fix(registry): Fix provider change logic for new models
When a client changed its provider and registered a new model in the same `RegisterClient` call, the logic would incorrectly attempt to decrement the provider count for the new model from the old provider. This was because the loop iterated over all new model IDs without checking if they were part of the client's previous registration.

This commit adds a check to ensure that a model existed in the client's old model set before attempting to decrement the old provider's usage count. This prevents incorrect state updates in the registry during provider transitions that also introduce new models.
2025-09-26 18:32:47 +08:00
hkfires c675cf5e72 refactor(config): Implement reconciliation for providers and clients
This commit introduces a reconciliation mechanism for handling configuration updates, significantly improving efficiency and resource management.

Previously, reloading the configuration would tear down and recreate all access providers from scratch, regardless of whether their individual configurations had changed. This was inefficient and could disrupt services.

The new `sdkaccess.ReconcileProviders` function now compares the old and new configurations to intelligently manage the provider lifecycle:
- Unchanged providers are kept.
- New providers are created.
- Providers removed from the config are closed and discarded.
- Providers with updated configurations are gracefully closed and recreated.

To support this, a `Close()` method has been added to the `Provider` interface.

A similar reconciliation logic has been applied to the client registration state in `state.RegisterClient`. This ensures that model registrations are accurately tracked when a client's configuration is updated, correctly handling added, removed, and unchanged models. Enhanced logging provides visibility into these operations.
2025-09-26 18:32:47 +08:00
hkfires 3ca01b60a5 refactor(logging): Improve client loading and registration logs 2025-09-26 14:01:41 +08:00
Luis Pater f5dc380b63 rebuild branch
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
2025-09-25 10:32:48 +08:00
Luis Pater 3f69254f43 remove all 2025-09-25 10:31:02 +08:00
Luis Pater a2c5fdaf66 refactor(executor): remove ClientAdapter and legacy fallback logic
- Deleted `ClientAdapter` implementation and associated fallback methods.
- Removed legacy executor logic from `codex`, `claude`, `gemini`, and `qwen` executors.
- Simplified `handlers` by eliminating `UnwrapError` handling and related dependencies.
- Cleaned up `model_registry` by removing logic associated with suspended clients.
- Updated `.gitignore` to ignore `.serena/` directory.
2025-09-24 21:09:36 +08:00
Luis Pater e68a6037e2 feat(auth): enable model suspension and resumption logic in AuthManager
- Added model suspension with reason tracking for 401 (unauthorized) and 402/403 (payment-related) errors.
- Implemented resumption logic upon model quota recovery or auth state changes.
- Enhanced registry to manage suspended clients, including counts and observability data.
- Updated availability computation to exclude suspended clients, ensuring accurate client model tracking.
2025-09-23 09:24:55 +08:00
Luis Pater 4999fce7f4 v6 version first commit 2025-09-22 01:40:24 +08:00
Luis Pater b84cbee77a Add support for forcing GPT-5 Codex model configuration
docker-image / docker (push) Has been cancelled
goreleaser / goreleaser (push) Has been cancelled
- Introduced a new `ForceGPT5Codex` configuration option in settings.
- Added relevant API endpoints for managing `ForceGPT5Codex`.
- Enhanced Codex client to handle GPT-5 Codex-specific logic and mapping.
- Updated example configuration file to include the new option.

Add GPT-5 Codex model support and configuration options in documentation
2025-09-16 04:40:19 +08:00
hkfires 3e09bc9470 Add Gemini 2.5 Flash-Lite Model 2025-09-04 11:59:48 +08:00