gravitee-inference is a Java library designed to make it easy for engineering teams to integrate and deploy AI models within the Gravitee platform—without needing specialized help from AI/ML teams.
- Java 21
- Maven (mvn)
In your pom.xml, add the following dependencies:
<dependency>
    <groupId>io.gravitee.inference.math.native</groupId>
    <artifactId>gravitee-inference-math-native</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
<dependency>
    <groupId>io.gravitee.inference.api</groupId>
    <artifactId>gravitee-inference-api</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
<dependency>
    <groupId>io.gravitee.inference.onnx</groupId>
    <artifactId>gravitee-inference-onnx</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
We support the BERT architecture in ONNX format for various NLP tasks:
- Sequence Classification
- Token Classification
- Fill-mask
- Vector Embedding (e.g., Sentence Similarity)
Use this to determine sentiment or categorize full sentences.
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);
var configuration = Map.of(
    CLASSIFIER_MODE, ClassifierMode.SEQUENCE,
    CLASSIFIER_LABELS, List.of("Negative", "Positive")
);
var onnxConfig = new OnnxBertConfig(
    resource,
    NativeMath.INSTANCE,
    configuration
);
var model = new OnnxBertClassifierModel(onnxConfig);
// Single sentence
List<ClassifierResult> results = model.infer("I am so happy!").results();
results.forEach(result -> {
    System.out.println("Label: " + result.label());
    System.out.println("Score: " + result.score());
});
// Multiple sentences
model.infer(List.of("I am so happy!", "I am so sad!"));
Try this with distilbert-base-uncased-finetuned-sst-2-english.
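If you only need the single most likely class, you can reduce the returned list to the top-scoring entry. The sketch below reuses only the calls shown above and assumes ClassifierResult.score() returns a numeric value usable with java.util.Comparator.comparingDouble:
// Pick the highest-scoring label (assumes score() is numeric)
model.infer("I am so happy!")
    .results()
    .stream()
    .max(Comparator.comparingDouble(ClassifierResult::score))
    .ifPresent(best -> System.out.println(best.label() + " (" + best.score() + ")"));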
Use this to extract structured entities like names, locations, and organizations from text.
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);
var configuration = Map.of(
    Constants.CLASSIFIER_MODE, ClassifierMode.TOKEN,
    Constants.CLASSIFIER_LABELS, List.of(
        "O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"
    ),
    Constants.DISCARD_LABELS, List.of("O", "B-MISC", "I-MISC")
);
var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, configuration);
var model = new OnnxBertClassifierModel(onnxConfig);
List<ClassifierResult> results = model.infer("My name is Laura and I live in Houston, Texas").results();
results.forEach(result -> {
    System.out.println("Label: " + result.label());
    System.out.println("Score: " + result.score());
    System.out.println("Begin: " + result.begin());
    System.out.println("End: " + result.end());
});
// Multiple sentences
model.infer(List.of(
    "My name is Laura and I live in Houston, Texas",
    "My name is Clara and I live in Berkeley, California"
));
Try this with dslim/bert-base-NER.
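To recover the entity text itself, the begin and end values can be used to slice the original input. This is only a sketch: it assumes begin() and end() are character offsets into the input string rather than token indices, so verify this against your model's output before relying on it.
String input = "My name is Laura and I live in Houston, Texas";
model.infer(input).results().forEach(result ->
    // e.g. "B-PER: Laura", "B-LOC: Houston" (assuming character offsets)
    System.out.println(result.label() + ": " + input.substring(result.begin(), result.end()))
);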
Predict masked tokens in a sentence.
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);
var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, Map.of());
var model = new OnnxBertFillMaskInference(onnxConfig);
List<FillMaskResult> results = model.infer("The capital of France is [MASK].");
System.out.println(results.getFirst().label()); // Paris
// Multiple sentences
model.infer(List.of(
    "The capital of France is [MASK].",
    "The capital of [MASK] is London."
));
Try this with google-bert/bert-base-uncased.
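Since the example above reads the top prediction with getFirst(), you can also build masked prompts programmatically. The sketch below reuses only the calls shown in this section; the prompt template is just an illustration:
// Probe the model with the same template for several countries
List<String> countries = List.of("France", "Germany", "Spain");
countries.forEach(country -> {
    List<FillMaskResult> predictions = model.infer("The capital of " + country + " is [MASK].");
    System.out.println(country + " -> " + predictions.getFirst().label());
});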
Convert text into dense vector representations for similarity search or indexing.
var resource = new OnnxBertResource(
    Paths.get("/path/to/your/model.onnx"),
    Paths.get("/path/to/your/tokenizer.json")
);
var onnxConfig = new OnnxBertConfig(resource, NativeMath.INSTANCE, Map.of(
    POOLING_MODE, PoolingMode.MEAN,
    Constants.MAX_SEQUENCE_LENGTH, 512
));
var model = new OnnxBertEmbeddingModel(onnxConfig);
EmbeddingTokenCount embedding = model.infer("The big brown fox jumped over the lazy dog");
System.out.println(embedding.embedding().length); // 384
System.out.println(embedding.tokenCount()); // 11
// Similarity comparison
EmbeddingTokenCount embedding1 = model.infer("The big brown fox jumped over the lazy dog");
EmbeddingTokenCount embedding2 = model.infer("The brown fox jumped over the dog");
System.out.println(
    onnxConfig.gioMaths().cosineScore(embedding1.embedding(), embedding2.embedding())
);
Try this with Xenova/all-MiniLM-L6-v2.
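The same two calls, model.infer(...) and gioMaths().cosineScore(...), are enough for a naive semantic search. The following is a sketch under the assumption that cosineScore returns a numeric similarity where higher means more similar; for larger corpora you would precompute and cache the document embeddings:
// Rank a tiny corpus against a query by cosine similarity
var corpus = List.of(
    "The fox jumped over the dog",
    "Gravitee is an API management platform",
    "Paris is the capital of France"
);
var query = model.infer("Which city is the capital of France?").embedding();

String best = null;
double bestScore = Double.NEGATIVE_INFINITY;
for (var doc : corpus) {
    double score = onnxConfig.gioMaths().cosineScore(query, model.infer(doc).embedding());
    if (score > bestScore) {
        bestScore = score;
        best = doc;
    }
}
System.out.println("Best match: " + best + " (score " + bestScore + ")");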
To run with SIMD math acceleration:
- Add the following to your JVM arguments:
--add-modules jdk.incubator.vector
- Import the corresponding dependency:
<dependency>
    <groupId>io.gravitee.inference.math.simd</groupId>
    <artifactId>gravitee-inference-math-simd</artifactId>
    <version>${gravitee.inference.version}</version>
</dependency>
import io.gravitee.inference.math.simd.factory.SIMDMathFactory;

GioMaths maths = SIMDMathFactory.gioMaths();
The factory resolves at runtime which SIMD capability your CPU supports.
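The SIMD-backed instance can then be used wherever the examples above used NativeMath.INSTANCE. This is a sketch assuming the second OnnxBertConfig argument accepts any GioMaths implementation, which is what the onnxConfig.gioMaths() accessor above suggests:
// Wire the SIMD-backed maths into a model configuration
// (assumption: OnnxBertConfig accepts any GioMaths implementation)
GioMaths simdMaths = SIMDMathFactory.gioMaths();
var onnxConfig = new OnnxBertConfig(resource, simdMaths, configuration);
var model = new OnnxBertClassifierModel(onnxConfig);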