Post Similarity search using vector databases #2261

Merged

iocanel merged 1 commit into quarkusio:main from iocanel:similarity-search-using-vector-databases

Mar 27, 2025

Contributor

iocanel commented Mar 18, 2025

This is a post on how to use Quarkus with vector databases to implement a similarity search example.

iocanel force-pushed the similarity-search-using-vector-databases branch 2 times, most recently from ea893ff to 5008cb9

March 18, 2025 14:15

iocanel requested a review from geoand

March 19, 2025 09:36

geoand reviewed

Contributor

geoand left a comment

Very neat! I've added some comments but I would like @jmartisk to also review

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

iocanel force-pushed the similarity-search-using-vector-databases branch from b99b1de to 6a4a667

March 19, 2025 11:58

Contributor Author

iocanel commented Mar 19, 2025

@geoand applied feedback.

gsmet reviewed

Member

gsmet left a comment

Nice article, I spotted a few typos here and there, HTH.

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              With LLMs becoming increasingly popular we often see them being used even for tasks that are not directly related to text generation.
+              Such case is using LLMs for recommendation systems. In this post we'll see how you can build such a system using https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j]
+              but without using LLMs. More specifically we'll create a simple movie similarity search system using a vector database. The role
+              of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this store is to abstract the underlying vector database through the `EmbeddingStore` interface.

Member

gsmet Mar 19, 2025

Wasn't sure what you wanted to write but store looked odd?

Suggested change

      
-              of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this store is to abstract the underlying vector database through the `EmbeddingStore` interface.
+              of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this story is to abstract the underlying vector database through the `EmbeddingStore` interface.

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              but without using LLMs. More specifically we'll create a simple movie similarity search system using a vector database. The role
+              of https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html[Quarkus Langchain4j] in this store is to abstract the underlying vector database through the `EmbeddingStore` interface.
+              A relevant sample has been recently added to the https://github.com/quarkiverse/quarkus-langchain4j/tree/main/samples/[Quarkus Langchain4j samples].

Member

gsmet Mar 19, 2025

Could you fix LangChain4j case everywhere?

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+                  </dependency>
+              ----
+              To be able to use these dependencies without needing to specify versions, the bom can be add imported to the `dependencyManagement` of the project:

Member

gsmet Mar 19, 2025

Suggested change

      
-              To be able to use these dependencies without needing to specify versions, the bom can be add imported to the `dependencyManagement` of the project:
+              To be able to use these dependencies without needing to specify versions, the bom can be added to the `dependencyManagement` of the project:

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              </dependency>
+              ----
+              To properly use the in process embedding model we need to configure it in the `application.properties` file.

Member

gsmet Mar 19, 2025

Suggested change

      
-              To properly use the in process embedding model we need to configure it in the `application.properties` file.
+              To properly use the in-process embedding model we need to configure it in the `application.properties` file.

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              ----
+              To properly use the in process embedding model we need to configure it in the `application.properties` file.
+              We also need to configure the pgvector dimension an ensure it's aligned with the dimension of the embedding model.

Member

gsmet Mar 19, 2025

Suggested change

      
-              We also need to configure the pgvector dimension an ensure it's aligned with the dimension of the embedding model.
+              We also need to configure the pgvector dimension and ensure it's aligned with the dimension of the embedding model.

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              }
+              ----
+              To use the CSV mapper, we'll need to `jackson` csv dataformat:

Member

gsmet Mar 19, 2025

Suggested change

      
-              To use the CSV mapper, we'll need to `jackson` csv dataformat:
+              To use the CSV mapper, we'll need to add Jackson's CSV dataformat dependency:

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              ==== Bringing it all together ====
+              The only thing that's left is to create a REST endpoint that will allow us to search for similar movies. We could also use a simple UI.
+              Let's start with the REST endpoint. It's pretty straight forward. We need to methods one for movie searching and one for searching similar movies.

Member

gsmet Mar 19, 2025

Suggested change

      
-              Let's start with the REST endpoint. It's pretty straight forward. We need to methods one for movie searching and one for searching similar movies.
+              Let's start with the REST endpoint. It's pretty straightforward. We need two methods, one for searching movies and one for searching similar movies.

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated


+              The key elements of that page are:
+
+              * movie-box: a text filed for entering the movie title

Member

gsmet Mar 19, 2025

Suggested change

      
-              * movie-box: a text filed for entering the movie title
+              * movie-box: a text field for entering the movie title

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              * movie-poster: an image for displaying the movie poster
+              * similar-results: an additional unordered list for displaying the similar movies
+              It's important to remember that the `Movie` entity is using `jackson` to map the CSV columns to the entity fields.

Member

gsmet Mar 19, 2025

Suggested change

      
-              It's important to remember that the `Movie` entity is using `jackson` to map the CSV columns to the entity fields.
+              It's important to remember that the `Movie` entity is using Jackson to map the CSV columns to the entity fields.

_posts/2025-03-18-movie-similarity-search-using-vector-databases.adoc Outdated

+              </html>
+              ----
+              I won't go into much detail about the hmtl code as it's outside the scope of this post.

Member

gsmet Mar 19, 2025

Suggested change

      
-              I won't go into much detail about the hmtl code as it's outside the scope of this post.
+              I won't go into much detail about the HTML code as it's outside the scope of this post.


          post: Similarity search using vector databases

137c925

iocanel force-pushed the similarity-search-using-vector-databases branch from 6a4a667 to 137c925

March 22, 2025 16:59

Contributor Author

iocanel commented Mar 26, 2025

@gsmet @geoand: Forgot to mention that I've applied the feedback.

geoand approved these changes

iocanel merged commit 55f2c32 into quarkusio:main

1 check passed
