feat: Add Google Vertex AI inference provider support (llamastack#2841)
# What does this PR do?
- Add new Vertex AI remote inference provider with litellm integration
- Support for Gemini models through Google Cloud Vertex AI platform
- Uses Google Cloud Application Default Credentials (ADC) for
authentication
- Added Vertex AI models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash
- Updated provider registry to include vertexai provider
- Updated starter template to support Vertex AI configuration
- Added comprehensive documentation and sample configuration
relates to llamastack#2747
## Test Plan
Signed-off-by: Eran Cohen <[email protected]>
Co-authored-by: Francisco Arceo <[email protected]>
```python
description="""Google Vertex AI inference provider enables you to use Google's Gemini models through Google Cloud's Vertex AI platform, providing several advantages:

• Enterprise-grade security: Uses Google Cloud's security controls and IAM
• Better integration: Seamless integration with other Google Cloud services
• Advanced features: Access to additional Vertex AI features like model tuning and monitoring
• Authentication: Uses Google Cloud Application Default Credentials (ADC) instead of API keys

Configuration:
- Set VERTEX_AI_PROJECT environment variable (required)
- Set VERTEX_AI_LOCATION environment variable (optional, defaults to us-central1)
- Use Google Cloud Application Default Credentials or service account key
"""
```
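The configuration steps above can be sketched in a few lines. This is a minimal illustration, not the provider's actual code; the helper name and project id are hypothetical, but the environment variables and the us-central1 default match the description:

```python
import os

# Hedged sketch: how the provider's settings might be resolved from the
# environment. VERTEX_AI_PROJECT is required; VERTEX_AI_LOCATION falls
# back to us-central1, matching the defaults described above.
def vertex_ai_settings() -> dict:
    project = os.environ.get("VERTEX_AI_PROJECT")
    if not project:
        raise ValueError("VERTEX_AI_PROJECT environment variable is required")
    return {
        "project": project,
        "location": os.environ.get("VERTEX_AI_LOCATION", "us-central1"),
    }

os.environ["VERTEX_AI_PROJECT"] = "my-gcp-project"  # hypothetical project id
print(vertex_ai_settings())  # → {'project': 'my-gcp-project', 'location': 'us-central1'}
```

Credentials themselves are not read from the environment here: with ADC, the Google Cloud client libraries locate a service account key or user login on their own.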