Update Google AI Plugin Go docs#191

Draft
chrisraygill wants to merge 6 commits into main from update-google-ai-plugin-docs-go

Conversation

@chrisraygill
Collaborator

No description provided.

@gemini-code-assist
Summary of Changes

Hello @chrisraygill, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a substantial update and expansion to the Go documentation for the Google AI plugin within Genkit. The changes aim to provide a more comprehensive and user-friendly guide for developers, covering a wider array of Google's generative AI models and their advanced features. The updated documentation includes detailed setup instructions, authentication methods, and extensive examples for various model types, multimodal inputs/outputs, and advanced configurations like structured output, reasoning, context caching, and grounding capabilities.

Highlights

  • Expanded Model Support: The documentation now covers a significantly broader range of Google's generative AI models, including the latest Gemini 3 Series, Gemini 2.5 Series, Gemma 3 Series, Imagen 4/3 Series, Veo 3.1/3.0/2.0 Series, and Gemini TTS models, providing detailed information and usage examples for each.
  • Enhanced Setup and Configuration: The setup section has been thoroughly revised to include clear installation instructions for the Go package, detailed configuration steps for initializing the Genkit plugin, and explicit guidance on API key authentication methods (environment variables and plugin configuration).
  • Advanced Generative AI Features: New sections introduce advanced capabilities such as structured output generation with schema limitations, configuration for model thinking and reasoning (Thinking Level for Gemini 3.0, Thinking Budget for Gemini 2.5), context caching, and comprehensive safety settings for content filtering.
  • New Grounding and Multimodal Capabilities: The documentation now includes examples for integrating Google Search, Google Maps, and Code Execution for enhanced model grounding. It also provides extensive examples for multimodal input (video, image, audio, and PDF understanding) and multimodal output (generating both text and images).
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request significantly expands the Go documentation for the Google Generative AI plugin, introducing comprehensive sections on various model types (Language, Embedding, Image, Video, Speech) and multimodal capabilities. The updated documentation includes detailed usage examples for features such as structured output, thinking and reasoning, context caching, safety settings, and grounding with Google Search, Google Maps, and code execution. However, the review comments point out several issues, including incomplete code snippets that are missing necessary import statements or undefined variables, incorrect methods for extracting image URLs and processing structured/audio output, and inconsistencies in heading styles that need to be addressed for clarity and correctness.

Comment on lines 1302 to 1315
// Option 1: Extract only the image (if it's the expected first part)
if len(resp.Message.Content) > 0 && resp.Message.Content[0].IsImage() {
	imageUrl := resp.Message.Content[0].Text
	fmt.Printf("Image URL: %s\n", imageUrl)
}

// Option 2: Extract both text and images
for _, part := range resp.Message.Content {
	if part.IsText() {
		fmt.Printf("Text: %s\n", part.Text)
	} else if part.IsImage() {
		fmt.Printf("Image URL: %s\n", part.Text)
	}
}

high

This code for extracting image and text from a multimodal response is incorrect. For an image part, the URL is in part.Media.URL, not part.Text. Using part.Text for an image part will not work. Also, the fmt package is used but not imported.

import "fmt"

// Option 1: Extract only the image (if it's the expected first part)
if len(resp.Message.Content) > 0 && resp.Message.Content[0].IsImage() {
	imageUrl := resp.Message.Content[0].Media.URL
	fmt.Printf("Image URL: %s\n", imageUrl)
}

// Option 2: Extract both text and images
for _, part := range resp.Message.Content {
	if part.IsText() {
		fmt.Printf("Text: %s\n", part.Text)
	} else if part.IsImage() {
		fmt.Printf("Image URL: %s\n", part.Media.URL)
	}
}

Comment on lines 997 to 1008
**Gemini 3 Series** - Latest experimental models with state-of-the-art reasoning:
- `gemini-3-pro-preview` - Preview of the most capable model for complex tasks
- `gemini-3-flash-preview` - Fast and intelligent model for high-volume tasks
- `gemini-3-pro-image-preview` - Supports image generation outputs

**Gemini 2.5 Series** - Latest stable models with advanced reasoning and multimodal capabilities:
- `gemini-2.5-pro` - Most capable stable model for complex tasks
- `gemini-2.5-flash` - Fast and efficient for most use cases
- `gemini-2.5-flash-lite` - Lightweight version for simple tasks
- `gemini-2.5-flash-image` - Supports image generation outputs

**Gemma 3 Series** - Open models for various use cases:

medium

The document uses a mix of ### headings and bolded text for section titles (e.g., ### Available Models vs. **Gemini 3 Series**). For better document structure and consistency, consider using ### for all similar-level headings. This also applies to **Gemini 2.5 Series**, **Gemma 3 Series**, **Imagen 4 Series**, **Veo 3.1 Series**, etc.

Comment on lines 1068 to 1072
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt("Generate a profile for a fictional character"),
	ai.WithOutputSchema(profile),
)

medium

The model variable is used here but not defined. This makes the code snippet incomplete and confusing. Please define it before use, for example by using ai.WithModelName("googleai/gemini-2.5-flash").

resp, err := genkit.Generate(ctx, g,
	ai.WithModelName("googleai/gemini-2.5-flash"),
	ai.WithPrompt("Generate a profile for a fictional character"),
	ai.WithOutputSchema(profile),
)

Comment on lines 1074 to 1075
// The model output will be in resp.Text which is a JSON string
json.Unmarshal([]byte(resp.Text()), &profile)

medium

Using json.Unmarshal on resp.Text() is not the idiomatic way to get structured output in Genkit for Go. It's better to use the resp.Output() helper function, which is safer and simpler.

// The model output can be unmarshalled using the Output helper.
if err = resp.Output(&profile); err != nil {
    // handle error
}

Collaborator

Bot is right, you know.
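
Whichever helper ends up in the docs, the snippet under review also discards the error returned by json.Unmarshal. As a minimal, stdlib-only sketch of the manual path (not Genkit's API), the following shows decoding with the error checked. The type name CharacterProfile comes from this thread; its fields and the decodeProfile helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// CharacterProfile mirrors the type name used in this thread; its fields are
// hypothetical and chosen only for illustration.
type CharacterProfile struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// decodeProfile (a hypothetical helper) parses model output into a
// CharacterProfile and reports malformed JSON or unexpected fields instead of
// silently ignoring them.
func decodeProfile(text string) (CharacterProfile, error) {
	var p CharacterProfile
	dec := json.NewDecoder(strings.NewReader(text))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&p); err != nil {
		return CharacterProfile{}, fmt.Errorf("decode profile: %w", err)
	}
	return p, nil
}

func main() {
	p, err := decodeProfile(`{"name": "Ada", "age": 36}`)
	fmt.Println(p, err)

	// Truncated output is reported rather than leaving a zero-value struct.
	_, err = decodeProfile(`{"name": "Ada"`)
	fmt.Println(err != nil)
}
```

A dedicated output helper, as suggested above, would hide this boilerplate; the sketch only shows what is lost when the error is ignored.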

Comment on lines 1153 to 1168
```go
resp, err := genkit.Generate(ctx, g, ai.WithModel(modelRef), ai.WithPrompt("Tell me a joke."))
if err != nil {
	return err
// Structure prompts with consistent content at the beginning
baseContext := strings.Repeat("You are a helpful cook... (large context) ...", 50)

// First request - content will be cached
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt(baseContext + "\n\nTask 1..."),
)

// Second request with same prefix - eligible for cache hit
resp, err = genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt(baseContext + "\n\nTask 2..."),
)
```

medium

This code snippet for context caching has a couple of issues:

  1. It uses strings.Repeat, but the strings package is not imported.
  2. The model variable is used but not defined.

Please update the snippet to be a complete, runnable example.

import "strings"

// Structure prompts with consistent content at the beginning
baseContext := strings.Repeat("You are a helpful cook... (large context) ...", 50)

model := googlegenai.GoogleAIModelRef("gemini-2.5-pro", nil)

// First request - content will be cached
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt(baseContext + "\n\nTask 1..."),
)

// Second request with same prefix - eligible for cache hit
resp, err = genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt(baseContext + "\n\nTask 2..."),
)

Comment on lines 1325 to 1332
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt(
		ai.NewTextPart("What happens in this video?"),
		ai.NewMediaPart("video/mp4", "https://example.com/video.mp4"),
	),
)
```

medium

The model variable is used in this snippet but is not defined. This makes the example incomplete. Please define it, for instance: model := googlegenai.GoogleAIModelRef("gemini-2.5-flash", nil). This issue is also present in the other multimodal examples for Image, Audio, and PDF understanding.

model := googlegenai.GoogleAIModelRef("gemini-2.5-flash", nil)
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt(
		ai.NewTextPart("What happens in this video?"),
		ai.NewMediaPart("video/mp4", "https://example.com/video.mp4"),
	),
)

Comment on lines 1443 to 1463
```go
// Start video generation
resp, err := genkit.Generate(ctx, g,
	ai.WithModelName("googleai/veo-3.0-fast-generate-001"),
	ai.WithPrompt("A majestic dragon soaring over a mystical forest at dawn."),
)

// Check progress using the operation ID
op := resp.Operation()
for !op.Done {
	time.Sleep(5 * time.Second)
	op = genkit.CheckOperation(ctx, g, op)
}

if op.Error != nil {
	log.Fatal(op.Error)
}

// Access generated video URI from the operation result
videoURI := op.Output.Message.Content[0].Media.URL
```

medium

This video generation example uses functions from the time and log packages, but they are not imported. Please add the necessary imports to make the snippet runnable.

import (
    "log"
    "time"
)

// Start video generation
resp, err := genkit.Generate(ctx, g,
	ai.WithModelName("googleai/veo-3.0-fast-generate-001"),
	ai.WithPrompt("A majestic dragon soaring over a mystical forest at dawn."),
)

// Check progress using the operation ID
op := resp.Operation()
for !op.Done {
	time.Sleep(5 * time.Second)
	op = genkit.CheckOperation(ctx, g, op)
}

if op.Error != nil {
	log.Fatal(op.Error)
}

// Access generated video URI from the operation result
videoURI := op.Output.Message.Content[0].Media.URL
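
Beyond the missing imports, the polling loop above has no way to stop if the operation never completes or the caller cancels. The following is a stdlib-only sketch of a context-aware polling loop. Operation, CheckFunc, waitForOperation, and pollFake are hypothetical stand-ins written for this illustration; they are not Genkit types or functions.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Operation is a hypothetical stand-in for a long-running operation handle.
type Operation struct {
	Done   bool
	Err    error
	Result string
}

// CheckFunc is a hypothetical refresh call that returns the latest state.
type CheckFunc func(ctx context.Context, op *Operation) (*Operation, error)

// waitForOperation polls check every interval until the operation is done,
// the check itself fails, or ctx is cancelled or times out.
// interval must be positive, because time.NewTicker panics otherwise.
func waitForOperation(ctx context.Context, op *Operation, check CheckFunc, interval time.Duration) (*Operation, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for !op.Done {
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-ticker.C:
		}
		next, err := check(ctx, op)
		if err != nil {
			return nil, fmt.Errorf("check operation: %w", err)
		}
		op = next
	}
	if op.Err != nil {
		return nil, op.Err
	}
	return op, nil
}

// pollFake runs waitForOperation against a fake backend that completes on the
// third check. It returns the result and the number of checks performed.
func pollFake(timeoutMillis int) (string, int, error) {
	calls := 0
	fakeCheck := func(ctx context.Context, op *Operation) (*Operation, error) {
		calls++
		if calls >= 3 {
			return &Operation{Done: true, Result: "https://example.com/video.mp4"}, nil
		}
		return op, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMillis)*time.Millisecond)
	defer cancel()
	op, err := waitForOperation(ctx, &Operation{}, fakeCheck, 10*time.Millisecond)
	if err != nil {
		return "", calls, err
	}
	return op.Result, calls, nil
}

func main() {
	result, calls, err := pollFake(1000)
	fmt.Println(result, calls, err)
}
```

In the documentation snippet, the same shape would apply with the real check call in place of fakeCheck and a timeout sized for video generation.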

Comment on lines 1527 to 1551
```go
import "google.golang.org/genai"

resp, err := genkit.Generate(ctx, g,
	ai.WithModelName("googleai/gemini-2.5-flash-preview-tts"),
	ai.WithConfig(&genai.GenerateContentConfig{
		ResponseModalities: []string{"AUDIO"},
		SpeechConfig: &genai.SpeechConfig{
			VoiceConfig: &genai.VoiceConfig{
				PrebuiltVoiceConfig: &genai.PrebuiltVoiceConfig{
					VoiceName: "Algenib",
				},
			},
		},
	}),
	ai.WithPrompt("Say: Genkit is the best Gen AI library!"),
)

if err != nil {
	log.Fatal(err)
}

// The model output will be a base64 encoded string in resp.Text()
// You can decode this and save it as a PCM file or convert to WAV.
```

medium

This speech model example has a couple of issues:

  1. The log package is used but not imported.
  2. The comment // The model output will be a base64 encoded string in resp.Text() is likely incorrect. For audio responses, the output is typically a media part, with the data encoded in a data URL in resp.Message.Content[0].Media.URL, not in resp.Text(). The current comment is misleading and doesn't show how to actually process the audio.
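
If the audio does arrive as a base64 data URL, as suggested above, the "decode this and convert to WAV" step could look roughly like the sketch below. It uses only the standard library. The helper names decodeDataURL and pcmToWAV are hypothetical, and the audio format (16-bit little-endian mono PCM at 24 kHz) is an assumption that should be checked against the MIME type the model actually returns.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"strings"
)

// decodeDataURL (hypothetical helper) splits a "data:<mime>;base64,<payload>"
// URL into its MIME type (including any parameters) and the decoded bytes.
func decodeDataURL(u string) (string, []byte, error) {
	if !strings.HasPrefix(u, "data:") {
		return "", nil, fmt.Errorf("not a data URL")
	}
	meta, payload, ok := strings.Cut(strings.TrimPrefix(u, "data:"), ",")
	if !ok {
		return "", nil, fmt.Errorf("data URL has no comma separator")
	}
	if !strings.HasSuffix(meta, ";base64") {
		return "", nil, fmt.Errorf("data URL is not base64 encoded")
	}
	data, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return "", nil, fmt.Errorf("decode payload: %w", err)
	}
	return strings.TrimSuffix(meta, ";base64"), data, nil
}

// pcmToWAV (hypothetical helper) prepends a 44-byte WAV header to raw 16-bit
// little-endian PCM samples.
func pcmToWAV(pcm []byte, sampleRate, channels int) []byte {
	const bitsPerSample = 16
	blockAlign := channels * bitsPerSample / 8
	byteRate := sampleRate * blockAlign

	var buf bytes.Buffer
	le := binary.LittleEndian
	buf.WriteString("RIFF")
	binary.Write(&buf, le, uint32(36+len(pcm))) // remaining file size
	buf.WriteString("WAVE")
	buf.WriteString("fmt ")
	binary.Write(&buf, le, uint32(16)) // size of the fmt chunk
	binary.Write(&buf, le, uint16(1))  // audio format 1 = PCM
	binary.Write(&buf, le, uint16(channels))
	binary.Write(&buf, le, uint32(sampleRate))
	binary.Write(&buf, le, uint32(byteRate))
	binary.Write(&buf, le, uint16(blockAlign))
	binary.Write(&buf, le, uint16(bitsPerSample))
	buf.WriteString("data")
	binary.Write(&buf, le, uint32(len(pcm)))
	buf.Write(pcm)
	return buf.Bytes()
}

func main() {
	// Four bytes of silence stand in for real model output.
	url := "data:audio/L16;codec=pcm;rate=24000;base64," +
		base64.StdEncoding.EncodeToString([]byte{0, 0, 0, 0})

	mime, pcm, err := decodeDataURL(url)
	if err != nil {
		panic(err)
	}
	// Assumption: 24 kHz mono; confirm against the returned MIME parameters.
	wav := pcmToWAV(pcm, 24000, 1)
	fmt.Println(mime, len(pcm), len(wav))
}
```

The resulting bytes can be written to a .wav file with os.WriteFile.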

Temperature: genai.Ptr[float32](0.5),
}

// Option 1: Use a model reference with "baked-in" config

Collaborator

Mention that this has strong typing so it ensures that you are using the correct config type for the plugin.

// Option 2: Pass configuration per-request
resp, err = genkit.Generate(ctx, g,
	ai.WithModelName("googleai/gemini-2.5-flash"),
	ai.WithConfig(config), // Pass config explicitly

Collaborator

Whereas this has no strong typing: it works the same way but has no guardrails, so a config of the wrong type only produces an error at runtime.
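
The distinction between the two options can be illustrated with a small, stdlib-only sketch. Every name here (GeminiConfig, ImagenConfig, ModelRef, generateTyped, generateUntyped) is hypothetical and does not reflect the plugin's actual types; the point is only where a mismatched config is caught.

```go
package main

import "fmt"

// GeminiConfig and ImagenConfig are hypothetical config types for two
// different model families.
type GeminiConfig struct{ Temperature float32 }
type ImagenConfig struct{ NumberOfImages int }

// ModelRef is a hypothetical typed reference: the config type is fixed when
// the reference is created, so the compiler checks it.
type ModelRef[C any] struct {
	Name   string
	Config C
}

// generateTyped accepts only references that carry a GeminiConfig. Passing a
// ModelRef[ImagenConfig] would not compile.
func generateTyped(ref ModelRef[GeminiConfig]) string {
	return fmt.Sprintf("%s temperature=%.1f", ref.Name, ref.Config.Temperature)
}

// generateUntyped accepts any value as config, so a mismatch can only be
// detected while the program is running.
func generateUntyped(name string, config any) (string, error) {
	cfg, ok := config.(*GeminiConfig)
	if !ok {
		return "", fmt.Errorf("%s: unsupported config type %T", name, config)
	}
	return fmt.Sprintf("%s temperature=%.1f", name, cfg.Temperature), nil
}

func main() {
	ref := ModelRef[GeminiConfig]{Name: "example-model", Config: GeminiConfig{Temperature: 0.5}}
	fmt.Println(generateTyped(ref))

	// The wrong config type compiles but fails at runtime.
	_, err := generateUntyped("example-model", &ImagenConfig{NumberOfImages: 2})
	fmt.Println(err)
}
```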

resp, err := genkit.Generate(ctx, g,
	ai.WithModel(model),
	ai.WithPrompt("Generate a profile for a fictional character"),
	ai.WithOutputSchema(profile),

Collaborator

@apascal07, Dec 19, 2025

This is not valid. Use either WithOutputType(CharacterProfile{}) or char, resp, err := genkit.GenerateData[CharacterProfile](...).


The following configuration options are available for code execution:

- **codeExecution** _object_

Collaborator

What is this notation? Why bold/italic?

Comment on lines 1303 to 1315
if len(resp.Message.Content) > 0 && resp.Message.Content[0].IsImage() {
	imageUrl := resp.Message.Content[0].Text
	fmt.Printf("Image URL: %s\n", imageUrl)
}

// Option 2: Extract both text and images
for _, part := range resp.Message.Content {
	if part.IsText() {
		fmt.Printf("Text: %s\n", part.Text)
	} else if part.IsImage() {
		fmt.Printf("Image URL: %s\n", part.Text)
	}
}

Collaborator

Can just do resp.Text() or resp.Media() instead of all this.

Comment on lines 1445 to 1451
resp, err := genkit.Generate(ctx, g,
	ai.WithModelName("googleai/veo-3.0-fast-generate-001"),
	ai.WithPrompt("A majestic dragon soaring over a mystical forest at dawn."),
)

// Check progress using the operation ID
op := resp.Operation()

Collaborator

Suggested change
- resp, err := genkit.Generate(ctx, g,
- 	ai.WithModelName("googleai/veo-3.0-fast-generate-001"),
- 	ai.WithPrompt("A majestic dragon soaring over a mystical forest at dawn."),
- )
- // Check progress using the operation ID
- op := resp.Operation()
+ op, err := genkit.GenerateOperation(ctx, g,
+ 	ai.WithModelName("googleai/veo-3.0-fast-generate-001"),
+ 	ai.WithPrompt("A majestic dragon soaring over a mystical forest at dawn."),
+ )

}

// Access generated video URI from the operation result
videoURI := op.Output.Message.Content[0].Media.URL

Collaborator

op.Output.Media() is the preferred way. Should work I think.

2 participants