
Spring AI with Ollama LLM Integration

Project Overview

This project demonstrates how to integrate Ollama’s local LLM models with a Spring Boot application. It’s also compatible with OpenAI and other LLM providers supported by Spring AI — you can switch providers simply by updating a few properties in application.properties.
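
For example, switching between Ollama and OpenAI mostly comes down to changing the provider starter dependency and a handful of properties. A minimal sketch using standard Spring AI property keys (the model names here are illustrative, not this project's actual configuration):

  # application.properties - local Ollama
  spring.ai.ollama.chat.options.model=llama3.2:3b

  # application.properties - OpenAI (also swap in the OpenAI starter dependency)
  spring.ai.openai.api-key=${OPENAI_API_KEY}
  spring.ai.openai.chat.options.model=gpt-4o-mini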

The project showcases a complete range of LLM interactions — from simple prompt–response processing to advanced scenarios using Spring AI Advisors for pre- and post-processing of prompts and responses. It also includes embedding generation for Retrieval-Augmented Generation (RAG) use cases.
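
As a rough illustration of this style of interaction (a sketch, not this project's exact code), Spring AI's ChatClient fluent API drives a prompt-response round trip, and advisors hook into the call; SimpleLoggerAdvisor below is one of the advisors Spring AI ships:

  import org.springframework.ai.chat.client.ChatClient;
  import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;

  // Build a client around whichever ChatModel is auto-configured (Ollama here).
  ChatClient chatClient = ChatClient.builder(chatModel)
          .defaultAdvisors(new SimpleLoggerAdvisor())  // logs each request/response pair
          .build();

  String answer = chatClient.prompt()
          .system("You are a concise assistant.")
          .user("What is Spring AI?")
          .call()
          .content();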

Architecture

The environment is orchestrated with Docker Compose, which spins up all the required services:

  • Qdrant - Vector database for storing and retrieving embeddings generated by Ollama models (see the usage sketch after this list)
  • Prometheus & Grafana - Monitoring and visualization of application performance metrics
  • Jaeger - Distributed tracing to observe request flows across the system
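
For the Qdrant piece, a minimal sketch of how a Qdrant-backed store is typically used through Spring AI's VectorStore abstraction (the wiring and texts are illustrative, and the exact SearchRequest API varies slightly between Spring AI versions):

  import java.util.List;
  import org.springframework.ai.document.Document;
  import org.springframework.ai.vectorstore.SearchRequest;
  import org.springframework.ai.vectorstore.VectorStore;

  // vectorStore is the auto-configured Qdrant-backed VectorStore bean.
  vectorStore.add(List.of(new Document("Spring AI integrates with Ollama.")));

  // Similarity search over the stored embeddings (top 3 matches).
  List<Document> hits = vectorStore.similaritySearch(
          SearchRequest.builder().query("Which LLMs does Spring AI support?").topK(3).build());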

Observability dependencies included:

  • Micrometer - exposes application metrics from Spring AI components, which are scraped and visualized by Prometheus and Grafana.
  • OpenTelemetry - provides end-to-end distributed tracing and observability across all services, with traces collected and visualized by Jaeger.
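
Spring AI publishes these metrics and traces automatically; custom application code can join the same pipeline through Micrometer's Observation API. A hypothetical sketch (the observation name is made up):

  import io.micrometer.observation.Observation;

  // observationRegistry is the Spring-managed ObservationRegistry bean.
  // The observation surfaces as a Prometheus metric and a Jaeger span.
  String reply = Observation.createNotStarted("demo.chat.call", observationRegistry)
          .observe(() -> chatClient.prompt().user("Hello").call().content());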

Getting Started

Install Ollama

Visit https://ollama.com/download to download and install Ollama on your machine.

Start Ollama Service

sudo systemctl start ollama

Stop Ollama Service

sudo systemctl stop ollama

Pull models

  • Light model: ollama run llama3.2:1b. Note: as a small (1B-parameter) model, it doesn't follow some system prompts strictly, so it isn't suitable for every use case.

  • Regular model: ollama run llama3.2:3b. This model is better at following system prompts, though still not perfect.

  • Embedding model: ollama pull nomic-embed-text. Used for RAG (Retrieval-Augmented Generation) to convert text into embeddings for similarity search (see the sketch after this list).
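
A minimal sketch of generating an embedding through Spring AI's EmbeddingModel abstraction (backed here by nomic-embed-text; the input text is illustrative):

  import org.springframework.ai.embedding.EmbeddingModel;

  // embeddingModel is the auto-configured Ollama EmbeddingModel bean.
  float[] vector = embeddingModel.embed("Spring AI makes local LLM integration easy.");
  // Vectors like this one are what get stored in Qdrant for similarity search.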

Build the project

gradle clean build

This builds the project and runs the tests, all of which use the local Ollama LLM and a Qdrant container.

Run the project

gradle bootRun

More Information

This project was inspired by the Udemy course From Java Dev to AI Engineer: Spring AI Fast Track, which demonstrates Spring AI using OpenAI models. For more details and reference materials, visit the official course repository: 👉 https://github.com/eazybytes/spring-ai/
