
Add free plan profile and Cerebras provider #111

Merged
priyanshujain merged 16 commits into master from free-plan
Mar 20, 2026
Conversation

@priyanshujain
Collaborator

Summary

  • Add Cerebras provider (provider/cerebras) — thin OpenAI-compatible wrapper, same pattern as Groq
  • Add Free ($0/mo) profile using Gemini + Cerebras (no credit card required):
    • Default: gemini/gemini-2.5-flash (1M context)
    • Complex: cerebras/qwen-3-235b-a22b-instruct-2507 (strongest free model)
    • Fast: cerebras/llama3.1-8b (low latency)
    • Nano: gemini/gemini-2.5-flash-lite (lightweight)
  • Replace deprecated gemini-2.0-flash (retired March 6, 2026) with gemini-2.5-flash-lite across all profiles
  • Register CEREBRAS_API_KEY env var, model listing, context windows, and $0 pricing

Test plan

  • go build ./... compiles
  • go vet ./... clean
  • go test ./provider/... — all pass (87.7% coverage), new listModelsCerebras at 100%
  • go test ./config/... — all pass (80.4% coverage)
  • go test ./internal/cli/... — all pass, profile validation covers free profile
  • Integration test (TestCerebrasIntegration_Chat) verified real API call to Cerebras
  • No remaining gpt-oss-120b references (was a non-existent model)
  • gemini-2.0-flash kept only in context_window.go for backward compat

Commits

  • Add Cerebras provider: thin wrapper over the OpenAI-compatible API at api.cerebras.ai, following the same pattern as the Groq and OpenRouter providers.
  • Add CEREBRAS_API_KEY to ProviderEnvVars and a cerebras case to the ListModels switch, reusing listModelsOpenAICompat.
  • Add a blank import for cerebras provider registration and a providerInfo entry with the gpt-oss-120b and qwen-3-235b models.
  • Add gemini-2.5-flash-lite (1M), gpt-oss-120b (65K), and qwen-3-235b (65K) context windows. All three are free-tier models with $0 pricing.
  • Add $0/mo Free profile: gemini-2.5-flash (default), gemini-2.5-pro (complex), cerebras/gpt-oss-120b (fast), gemini-2.5-flash-lite (nano).

  • gemini-2.0-flash was deprecated March 6, 2026: replace all references in profiles and CLI setup; keep the context_window.go entry for backward compat with existing user configs.
  • Follow the existing TestListModels_Groq pattern with an httptest server.
  • Replace gemini-2.0-flash with gemini-2.5-flash-lite in config_test.go and config_profiles_test.go to match the profile changes.
  • Add context window tests for gemini-2.5-flash-lite, gpt-oss-120b, and qwen-3-235b; update gemini-2.0-flash references in test fixtures.
  • Docs: add a Free row to the multi-provider profile table, add Cerebras under new providers, replace all gemini-2.0-flash references, add a deprecation note, and update the CLI setup flow to show Free first.
  • gpt-oss-120b doesn't exist on Cerebras; replace it with actual models: qwen-3-235b-a22b-instruct-2507 (most capable) and llama3.1-8b (fast).
  • Move qwen-3-235b to complex (strongest reasoning) and llama3.1-8b to fast (latency-sensitive); keep Gemini for default and nano.
  • Add integration tests for ListModels and Chat against the live Cerebras API, skipped when CEREBRAS_API_KEY is not set.
  • Merge: keep both the cerebras and zai additions in the registry, context window tests, profiles, and ProfileNames.
  • Doc fixes:
    - The deprecation note said gemini-2.5-flash-lite instead of gemini-2.0-flash.
    - The Cerebras description listed the stale qwen-3-235b instead of llama3.1-8b.
    - The CLI setup description said "single-provider first", but Free is now first.
  • Prevent custom profiles from shadowing the new built-in free profile.
@priyanshujain priyanshujain merged commit c2f3b78 into master Mar 20, 2026
8 checks passed
