EmbeddingGemma

Input

Text sequences to embed. The CLI takes a single query (--query) plus one or more documents (--documents). If no documents are given, it falls back to the built-in examples.
Token shape: (batch, sequence_length)

Output

sentence_embedding shape: (batch, 768)
Console shows cosine-similarity ranking between the query and provided documents.
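The ranking step can be illustrated with a minimal sketch. It is not taken from embeddinggemma.py: the helper names `cosine_similarity` and `rank_documents` are hypothetical, and short toy vectors stand in for the real 768-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_emb, doc_embs):
    """Return (document index, score) pairs, most similar first."""
    scores = [cosine_similarity(query_emb, d) for d in doc_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy 4-dimensional vectors (assumption: real embeddings have 768 values).
query = [1.0, 0.0, 1.0, 0.0]
docs = [
    [0.0, 1.0, 0.0, 1.0],  # orthogonal to the query
    [2.0, 0.0, 2.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0, 0.0],  # partial overlap with the query
]

ranking = rank_documents(query, docs)
for index, score in ranking:
    print(f"document {index}: {score:.3f}")
```

Cosine similarity ignores vector length, so the second document ranks first even though its magnitude differs from the query's.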

Requirements

This model requires an additional module.

pip3 install transformers

Usage

The ONNX and prototxt files are downloaded automatically on the first run, so an Internet connection is required at that point.

Run with the built-in demo (default query and planet documents):

$ python3 embeddinggemma.py

Similarity search with custom query and documents:

$ python3 embeddinggemma.py \
	--query "What is the Red Planet?" \
	--documents "Mercury is closest to the Sun" "Mars is called the Red Planet" "Saturn has rings"

Document input: pass one or more strings after --documents. The option can be repeated to group documents, e.g.:

$ python3 embeddinggemma.py --documents "Doc A" "Doc B" --documents "Doc C"
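One way a repeatable --documents option could be handled with argparse is sketched below. This is an assumption about the approach, not the actual parsing code in embeddinggemma.py; the `parse_args` helper and the default query string are hypothetical.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hypothetical default query, for illustration only.
    parser.add_argument("--query", default="Which planet is the Red Planet?")
    # nargs="+" collects one or more strings per occurrence;
    # action="append" keeps each occurrence as its own group.
    parser.add_argument("--documents", nargs="+", action="append", default=None)
    args = parser.parse_args(argv)

    # Flatten the groups into a single list of documents.
    groups = args.documents or []
    args.documents = [doc for group in groups for doc in group]
    return args


args = parse_args(["--documents", "Doc A", "Doc B", "--documents", "Doc C"])
print(args.documents)  # ['Doc A', 'Doc B', 'Doc C']
```

With this scheme an empty list signals that no documents were passed, which is where a fallback to built-in examples could be applied.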

Reference

Framework

PyTorch

Model Format

ONNX opset=17

Netron

embeddinggemma-300m.onnx.prototxt
