Implement embedding examples with Google GenAI #1084
WiktoriaStefanczyk1995 wants to merge 1 commit into google-gemini:main
Conversation
Added code examples for embedding content using Google GenAI in multiple languages, including JavaScript, Go, and Python. Included functionality for semantic similarity and embedding length retrieval.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes
Hello @WiktoriaStefanczyk1995, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the existing codebase by adding a comprehensive set of examples for utilizing Google's Generative AI (GenAI) for content embedding. The new examples are designed to be practical and cover fundamental embedding operations, including generating embeddings, determining semantic similarity between texts, and controlling embedding output dimensionality. By offering these examples in JavaScript, Go, and Python, the PR aims to facilitate easier adoption and integration of GenAI embedding capabilities for developers working with various technology stacks.
Code Review
This pull request introduces a new file, WiktoriaStefanczyk1995/cookbook, which appears to be an unintentional concatenation of code snippets in various languages (JavaScript, Go, Python, curl) rather than a structured, usable example. The file is malformed and cannot be executed. The filename also seems to be a local path. My primary feedback is to refactor this PR by separating the examples into individual, correctly formatted files for each language, placing them in the appropriate directory with descriptive names.
https://ai.google.dev/gemini-api/docs/embeddings?hl=pl#supported-task-typesimport { GoogleGenAI } from "@google/genai";

async function main() {

    const ai = new GoogleGenAI({});

    const response = await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: 'What is the meaning of life?',
    });

    console.log(response.embeddings);
}

main();package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"

    "google.golang.org/genai"
)

func main() {
    ctx := context.Background()
    client, err := genai.NewClient(ctx, nil)
    if err != nil {
        log.Fatal(err)
    }

    contents := []*genai.Content{
        genai.NewContentFromText("What is the meaning of life?", genai.RoleUser),
    }
    result, err := client.Models.EmbedContent(ctx,
        "gemini-embedding-001",
        contents,
        nil,
    )
    if err != nil {
        log.Fatal(err)
    }

    embeddings, err := json.MarshalIndent(result.Embeddings, "", " ")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(embeddings))
}curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "models/gemini-embedding-001",
    "content": {"parts":[{"text": "What is the meaning of life?"}]}
  }'curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:batchEmbedContents" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"requests": [{
    "model": "models/gemini-embedding-001",
    "content": {
      "parts":[{
        "text": "What is the meaning of life?"}]}, },
    {
    "model": "models/gemini-embedding-001",
    "content": {
      "parts":[{
        "text": "How much wood would a woodchuck chuck?"}]}, },
    {
    "model": "models/gemini-embedding-001",
    "content": {
      "parts":[{
        "text": "How does the brain work?"}]}, }, ]}' 2> /dev/null | grep -C 5 values
| ```https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:batchEmbedContentsfrom google import genai | ||
client = genai.Client()

result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="What is the meaning of life?")

print(result.embeddings)google.golang.org/genaihttps://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContentsklearn.metrics.pairwiseimport { GoogleGenAI } from "@google/genai";
import * as cosineSimilarity from "compute-cosine-similarity";

async function main() {
    const ai = new GoogleGenAI({});

    const texts = [
        "What is the meaning of life?",
        "What is the purpose of existence?",
        "How do I bake a cake?",
    ];

    const response = await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: texts,
        taskType: 'SEMANTIC_SIMILARITY'
    });

    const embeddings = response.embeddings.map(e => e.values);

    for (let i = 0; i < texts.length; i++) {
        for (let j = i + 1; j < texts.length; j++) {
            const text1 = texts[i];
            const text2 = texts[j];
            const similarity = cosineSimilarity(embeddings[i], embeddings[j]);
            console.log(`Similarity between '${text1}' and '${text2}': ${similarity.toFixed(4)}`);
        }
    }
}

main();package main

import (
    "context"
    "fmt"
    "log"
    "math"

    "google.golang.org/genai"
)

// cosineSimilarity calculates the similarity between two vectors.
func cosineSimilarity(a, b []float32) (float64, error) {
    if len(a) != len(b) {
        return 0, fmt.Errorf("vectors must have the same length")
    }

    var dotProduct, aMagnitude, bMagnitude float64
    for i := 0; i < len(a); i++ {
        dotProduct += float64(a[i] * b[i])
        aMagnitude += float64(a[i] * a[i])
        bMagnitude += float64(b[i] * b[i])
    }

    if aMagnitude == 0 || bMagnitude == 0 {
        return 0, nil
    }

    return dotProduct / (math.Sqrt(aMagnitude) * math.Sqrt(bMagnitude)), nil
}

func main() {
    ctx := context.Background()
    client, _ := genai.NewClient(ctx, nil)
    defer client.Close()

    texts := []string{
        "What is the meaning of life?",
        "What is the purpose of existence?",
        "How do I bake a cake?",
    }

    var contents []*genai.Content
    for _, text := range texts {
        contents = append(contents, genai.NewContentFromText(text, genai.RoleUser))
    }

    result, _ := client.Models.EmbedContent(ctx,
        "gemini-embedding-001",
        contents,
        &genai.EmbedContentRequest{TaskType: genai.TaskTypeSemanticSimilarity},
    )

    embeddings := result.Embeddings

    for i := 0; i < len(texts); i++ {
        for j := i + 1; j < len(texts); j++ {
            similarity, _ := cosineSimilarity(embeddings[i].Values, embeddings[j].Values)
            fmt.Printf("Similarity between '%s' and '%s': %.4f\n", texts[i], texts[j], similarity)
        }
    }
}Similarity between 'What is the meaning of life?' and 'What is the purpose of existence?': 0.9481

Similarity between 'What is the meaning of life?' and 'How do I bake a cake?': 0.7471

Similarity between 'What is the purpose of existence?' and 'How do I bake a cake?': 0.7371result.Embeddingscurl "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"task_type": "SEMANTIC_SIMILARITY",
    "content": {
      "parts":[{
        "text": "What is the meaning of life?"}, {"text": "How much wood would a woodchuck chuck?"}, {"text": "How does the brain work?"}]}
  }'from google import genai
from google.genai import types
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

client = genai.Client()

texts = [
    "What is the meaning of life?",
    "What is the purpose of existence?",
    "How do I bake a cake?"]

result = [
    np.array(e.values) for e in client.models.embed_content(
        model="gemini-embedding-001",
        contents=texts,
        config=types.EmbedContentConfig(task_type="SEMANTIC_SIMILARITY")).embeddings
]

# Calculate cosine similarity. Higher scores = greater semantic similarity.

embeddings_matrix = np.array(result)
similarity_matrix = cosine_similarity(embeddings_matrix)

for i, text1 in enumerate(texts):
    for j in range(i + 1, len(texts)):
        text2 = texts[j]
        similarity = similarity_matrix[i, j]
        print(f"Similarity between '{text1}' and '{text2}': {similarity:.4f}")google.genaipackage main

import (
    "context"
    "fmt"
    "log"

    "google.golang.org/genai"
)

func main() {
    ctx := context.Background()
    // The client uses Application Default Credentials.
    // Authenticate with 'gcloud auth application-default login'.
    client, err := genai.NewClient(ctx, nil)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    contents := []*genai.Content{
        genai.NewContentFromText("What is the meaning of life?", genai.RoleUser),
    }

    result, err := client.Models.EmbedContent(ctx,
        "gemini-embedding-001",
        contents,
        &genai.EmbedContentRequest{OutputDimensionality: 768},
    )
    if err != nil {
        log.Fatal(err)
    }

    embedding := result.Embeddings[0]
    embeddingLength := len(embedding.Values)
    fmt.Printf("Length of embedding: %d\n", embeddingLength)
}from google import genai
from google.genai import types

client = genai.Client()

result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="What is the meaning of life?",
    config=types.EmbedContentConfig(output_dimensionality=768)
)

[embedding_obj] = result.embeddings
embedding_length = len(embedding_obj.values)

print(f"Length of embedding: {embedding_length}")curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "content": {"parts":[{ "text": "What is the meaning of life?"}]},
    "output_dimensionality": 768
  }'import { GoogleGenAI } from "@google/genai";

async function main() {
    const ai = new GoogleGenAI({});

    const response = await ai.models.embedContent({
        model: 'gemini-embedding-001',
        content: 'What is the meaning of life?',
        outputDimensionality: 768,
    });

    const embeddingLength = response.embedding.values.length;
    console.log(`Length of embedding: ${embeddingLength}`);
}

main();result.embeddingsLength of embedding: 768
This file appears to be a collection of code snippets for different languages (JavaScript, Go, Python, curl) and other text, all concatenated together. This makes the file invalid and unusable, and it seems like this might have been an accidental commit.
The examples should be separated into individual, properly structured files. For instance:
- The JavaScript code should be in a .js file.
- The Go code should be in a .go file.
- The Python code should be in a .py file or a Colab notebook (.ipynb), following the repository's style guide for notebooks.
Additionally, the filename WiktoriaStefanczyk1995/cookbook seems incorrect and should be changed to something descriptive within the repository's structure, such as examples/embeddings/getting_started.js or similar.
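As a rough illustration of what a single-language, standalone file could look like, here is a minimal Python sketch of the cosine-similarity step from the semantic similarity example. It is not taken from the PR: the function name, the file layout, and the short placeholder vectors are assumptions made for illustration, and the vectors stand in for real embedding values so that the file runs without an API key.

```python
"""Standalone sketch: pairwise cosine similarity between embedding vectors.

The vectors below are made-up placeholders, not real model output. In a real
example they would come from the `values` field of each returned embedding.
"""
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so report zero similarity.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    texts = [
        "What is the meaning of life?",
        "What is the purpose of existence?",
        "How do I bake a cake?",
    ]
    # Hypothetical 3-dimensional stand-ins for the real embeddings.
    vectors = [
        [0.9, 0.1, 0.2],
        [0.8, 0.2, 0.3],
        [0.1, 0.9, 0.4],
    ]
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            score = cosine_similarity(vectors[i], vectors[j])
            print(f"Similarity between '{texts[i]}' and '{texts[j]}': {score:.4f}")
```

Keeping the similarity helper separate from the API call would also let each language's file be run and checked on its own.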