Add free plan profile and Cerebras provider #111
Merged
priyanshujain merged 16 commits into master on Mar 20, 2026
Thin wrapper over OpenAI-compatible API at api.cerebras.ai, following the same pattern as Groq and OpenRouter providers.
Add CEREBRAS_API_KEY to ProviderEnvVars and add cerebras case to ListModels switch, reusing listModelsOpenAICompat.
Add blank import for cerebras provider registration and add providerInfo entry with gpt-oss-120b and qwen-3-235b models.
Add gemini-2.5-flash-lite (1M), gpt-oss-120b (65K), qwen-3-235b (65K) context windows. All three are free-tier models with $0 pricing.
$0/mo profile: gemini-2.5-flash (default), gemini-2.5-pro (complex), cerebras/gpt-oss-120b (fast), gemini-2.5-flash-lite (nano).
….5-flash-lite gemini-2.0-flash was deprecated March 6, 2026. Replace all references in profiles and CLI setup. Keep context_window.go entry for backward compat with existing user configs.
Follow existing TestListModels_Groq pattern with httptest server.
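The httptest pattern mentioned in these commits can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `listModels`, `newFakeServer`, and the `model`/`modelList` types are hypothetical names, and the sketch assumes the provider exposes the standard OpenAI-style `GET <base>/models` response (`{"data":[{"id":...}]}`) with bearer-token auth.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// model and modelList mirror the OpenAI-style model listing payload:
// {"data":[{"id":"..."}]}. Other fields are ignored.
type model struct {
	ID string `json:"id"`
}

type modelList struct {
	Data []model `json:"data"`
}

// listModels is a hypothetical stand-in for an OpenAI-compatible lister.
// baseURL is assumed to already include any version prefix.
func listModels(client *http.Client, baseURL, apiKey string) ([]string, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/models", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("list models: unexpected status %d", resp.StatusCode)
	}

	var out modelList
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(out.Data))
	for _, m := range out.Data {
		ids = append(ids, m.ID)
	}
	return ids, nil
}

// newFakeServer starts a local server that accepts only wantKey and
// answers GET /models with the given model IDs.
func newFakeServer(wantKey string, ids ...string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/models" || r.Header.Get("Authorization") != "Bearer "+wantKey {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		var body modelList
		for _, id := range ids {
			body.Data = append(body.Data, model{ID: id})
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(body)
	}))
}

func main() {
	srv := newFakeServer("test-key", "llama3.1-8b", "qwen-3-235b-a22b-instruct-2507")
	defer srv.Close()

	ids, err := listModels(srv.Client(), srv.URL, "test-key")
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)
}
```

Taking the base URL as a parameter is what lets a test point the same function at a local server instead of the live API.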
Replace gemini-2.0-flash with gemini-2.5-flash-lite in config_test.go and config_profiles_test.go to match profile changes.
Add context window tests for gemini-2.5-flash-lite, gpt-oss-120b, qwen-3-235b. Update gemini-2.0-flash references in test fixtures.
Add Free row to multi-provider profile table, add Cerebras under new providers, replace all gemini-2.0-flash references, add deprecation note, update CLI setup flow to show Free first.
gpt-oss-120b doesn't exist on Cerebras. Replace with actual models: qwen-3-235b-a22b-instruct-2507 (most capable) and llama3.1-8b (fast).
Move qwen-3-235b to complex (strongest reasoning), llama3.1-8b to fast (latency-sensitive), keep gemini for default and nano.
Tests ListModels and Chat against live Cerebras API. Skipped when CEREBRAS_API_KEY is not set.
Keep both cerebras and zai additions in registry, context window tests, profiles, and ProfileNames.
- Deprecation note said gemini-2.5-flash-lite instead of gemini-2.0-flash
- Cerebras description listed stale qwen-3-235b instead of llama3.1-8b
- CLI setup description said "single-provider first" but Free is now first
Prevents custom profiles from shadowing the new built-in free profile.
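One way such a shadowing guard could look is sketched below. Everything here is an assumption made for illustration: `builtinProfiles` and `validateCustomProfileName` are hypothetical names, only the "free" entry comes from this PR, and the case-insensitive, whitespace-trimming comparison is a design choice of the sketch rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// builtinProfiles lists reserved profile names. Only "free" is taken from
// the PR text; a real registry would list every built-in profile.
var builtinProfiles = map[string]struct{}{
	"free": {},
}

// validateCustomProfileName rejects custom profile names that are empty or
// that collide with a built-in profile. Matching is case-insensitive and
// ignores surrounding whitespace (an assumption of this sketch).
func validateCustomProfileName(name string) error {
	key := strings.ToLower(strings.TrimSpace(name))
	if key == "" {
		return fmt.Errorf("profile name must not be empty")
	}
	if _, reserved := builtinProfiles[key]; reserved {
		return fmt.Errorf("profile name %q is reserved for a built-in profile", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"free", "my-team"} {
		if err := validateCustomProfileName(name); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", name)
	}
}
```

Rejecting the name at validation time, rather than letting the custom entry silently win or lose at lookup time, keeps the behavior of the built-in profile predictable.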
Summary
- Cerebras provider (provider/cerebras): thin OpenAI-compatible wrapper, same pattern as Groq
- Free profile ($0/mo):
  - default: gemini/gemini-2.5-flash (1M context)
  - complex: cerebras/qwen-3-235b-a22b-instruct-2507 (strongest free model)
  - fast: cerebras/llama3.1-8b (low latency)
  - nano: gemini/gemini-2.5-flash-lite (lightweight)
- Replace gemini-2.0-flash (retired March 6, 2026) with gemini-2.5-flash-lite across all profiles
- CEREBRAS_API_KEY env var, model listing, context windows, and $0 pricing

Test plan
- go build ./... compiles
- go vet ./... clean
- go test ./provider/...: all pass (87.7% coverage), new listModelsCerebras at 100%
- go test ./config/...: all pass (80.4% coverage)
- go test ./internal/cli/...: all pass, profile validation covers free profile
- Integration test (TestCerebrasIntegration_Chat) verified real API call to Cerebras
- No remaining gpt-oss-120b references (was a non-existent model)
- gemini-2.0-flash kept only in context_window.go for backward compat
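For illustration, the final free profile can be written as a role-to-model map. The role names and model references follow the commit notes above; the `freeProfile` variable, the `splitModelRef` helper, and the map-based shape are hypothetical and do not reflect the repository's actual config types.

```go
package main

import (
	"fmt"
	"strings"
)

// freeProfile maps each role to a "provider/model" reference.
// Values follow the PR text; the data structure itself is an assumption.
var freeProfile = map[string]string{
	"default": "gemini/gemini-2.5-flash",
	"complex": "cerebras/qwen-3-235b-a22b-instruct-2507",
	"fast":    "cerebras/llama3.1-8b",
	"nano":    "gemini/gemini-2.5-flash-lite",
}

// splitModelRef splits "provider/model" at the first slash, so model names
// that themselves contain slashes would stay intact in the second part.
func splitModelRef(ref string) (string, string, error) {
	provider, model, ok := strings.Cut(ref, "/")
	if !ok || provider == "" || model == "" {
		return "", "", fmt.Errorf("invalid model reference %q", ref)
	}
	return provider, model, nil
}

func main() {
	// Iterate in a fixed order, since Go map iteration order is random.
	for _, role := range []string{"default", "complex", "fast", "nano"} {
		provider, model, err := splitModelRef(freeProfile[role])
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-8s provider=%s model=%s\n", role, provider, model)
	}
}
```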