
[Example]: Multimodal RAG with LanceDB #2498


Open. AyushExel wants to merge 32 commits into main.

AyushExel commented Aug 11, 2025

Multimodal RAG with LanceDB.

What this does:

  • Fetches a product catalog from a live API (fakestoreapi.com).
  • Embeds product descriptions using a CLIP model (clip-ViT-B-32) and stores them in LanceDB.
  • Implements a RAG agent with a find_products tool that can:
    1. Perform semantic search (e.g., "something for the cold weather").
    2. Perform metadata filtering (e.g., "show me all electronics").
    3. Combine both (e.g., "a cool t-shirt" from the "men's clothing" category under $20).
  • Generates and displays an image collage of the results for instant visual feedback.
  • Creates a Logfire tracing dashboard if an API key is set.
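
To make the three search modes concrete, here is a minimal, dependency-free sketch. It is not the code from this PR and does not use the LanceDB API: the toy three-dimensional vectors stand in for CLIP embeddings, the catalog entries are made up, and the find_products signature shown here is an assumption for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    title: str
    category: str
    price: float
    vector: list[float]  # toy stand-in for a CLIP embedding


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_products(
    catalog: list[Product],
    query_vector: Optional[list[float]] = None,
    category: Optional[str] = None,
    max_price: Optional[float] = None,
    limit: int = 3,
) -> list[Product]:
    """Hypothetical signature: filter on metadata, then rank by similarity."""
    # Metadata filtering: keep only rows that satisfy every given constraint.
    candidates = [
        p
        for p in catalog
        if (category is None or p.category == category)
        and (max_price is None or p.price < max_price)
    ]
    # Semantic search: order the remaining rows by similarity to the query.
    if query_vector is not None:
        candidates.sort(key=lambda p: cosine(query_vector, p.vector), reverse=True)
    return candidates[:limit]


# Made-up catalog; the real example fetches products from fakestoreapi.com.
CATALOG = [
    Product("Slim-fit t-shirt", "men's clothing", 15.99, [0.9, 0.1, 0.0]),
    Product("Winter jacket", "men's clothing", 55.99, [0.1, 0.9, 0.0]),
    Product("Graphic t-shirt", "men's clothing", 22.30, [0.8, 0.2, 0.0]),
    Product("External SSD", "electronics", 64.00, [0.0, 0.1, 0.9]),
]

# Combined query: "t-shirt-like" vector, men's clothing, under $20.
hits = find_products(CATALOG, [1.0, 0.0, 0.0], category="men's clothing", max_price=20)
print([p.title for p in hits])
```

Passing only query_vector gives pure semantic search, passing only category or max_price gives pure metadata filtering, and passing both gives the combined mode. In a vector database the filter and the ranking would typically be pushed down into a single query rather than done in Python.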

Install dependencies:

    pip install lancedb sentence-transformers torch httpx pandas Pillow 'logfire[httpx]'

Set your Google API key (for the agent's text generation):

    export GOOGLE_API_KEY=your_api_key_here

Usage:

First, build the product database from the live API:

    python lancedb_multimodal.py build

Then, ask for a recommendation:

    python lancedb_multimodal.py search "a cool t-shirt in men's clothing under 20 dollars"

hyperlint-ai bot (Contributor) commented Aug 11, 2025

PR Change Summary

Added a new example for using LanceDB in a multimodal e-commerce context, enhancing the documentation for product cataloging and search functionalities.

  • Introduced a new example for LanceDB multimodal e-commerce RAG
  • Included installation instructions and usage examples for building and searching the product database
  • Demonstrated vector search and object storage capabilities with LanceDB

Modified Files

  • docs/dependencies.md

Added Files

  • docs/examples/lancedb-multimodal.md


AyushExel changed the title [Example [Example]: Multimodal RAG with LanceDB Aug 11, 2025
AyushExel marked this pull request as draft August 11, 2025 09:50
AyushExel marked this pull request as ready for review August 11, 2025 11:12
AyushExel (Author) commented

I need some help with managing dependencies. It seems lancedb is not being detected at test time, even though I've added it to the examples dependency list.

Kludex (Member) commented Aug 11, 2025

> I need some help with managing deps. It seems like lancedb is not being detected during test time even though I've added it to examples deps list.

You need to add the dependency that the docs require to the dev = list in the root pyproject.toml.
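
As a rough sketch of what that change might look like, assuming the root pyproject.toml declares its development dependencies in a [dependency-groups] table (PEP 735). The table name, the placeholder comment, and the missing version pin are assumptions here, not taken from the repository:

```toml
# Root pyproject.toml (illustrative fragment, not the actual file)
[dependency-groups]
dev = [
    # ... existing dev dependencies stay as they are ...
    "lancedb",  # assumed entry so the docs example can be imported under test
]
```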

Comment on lines 53 to 58:

    # pyright: reportMissingImports=false
    # pyright: reportInvalidTypeForm=false
    # pyright: reportUnknownVariableType=false
    # pyright: reportUnknownMemberType=false
    # pyright: reportUnknownArgumentType=false
    # pyright: reportUntypedBaseClass=false
Member commented

Why so many?

Member commented

to be clear, we're happy to accept an example from you, but this is just a big middle finger to how we operate.

it's either saying:

  • I don't care about any of your typing, linting, DX work
  • or, we're special, we don't need to do this

Either LanceDB is type-safe, in which case remove these, or LanceDB is not type-safe, in which case the example probably doesn't belong in our docs!

Author commented

I added these while figuring out the dependency issue and fighting CI. Will revert.

AyushExel (Author) commented

Addressed the specific feedback:

  1. Moved lancedb out of a separate dependency group and into dev.
  2. Removed the pyright hotfixes, except in a couple of places involving dynamic types, similar to https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/rag.py#L189 . Not sure if there's a better practice; happy to be educated.
  3. Removed the top-level docstring guide from the code file. This may be worth double-checking, as the other examples have one.
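
On the open question in item 2, one possible pattern (a sketch, not something proposed in this thread) is to convert dynamically typed values into a typed shape at the boundary, so that the rest of the file needs no file-level suppressions. The fetch_untyped_rows function and the ProductRow fields below are hypothetical stand-ins, not part of the PR or of any library:

```python
from typing import Any, TypedDict


class ProductRow(TypedDict):
    title: str
    category: str
    price: float


def fetch_untyped_rows() -> Any:
    """Hypothetical stand-in for a library call whose return type is dynamic."""
    return [{"title": "Example jacket", "category": "men's clothing", "price": "55.99"}]


def load_rows() -> list[ProductRow]:
    """Coerce untyped rows into ProductRow once, at the boundary."""
    rows: list[ProductRow] = []
    for item in fetch_untyped_rows():
        rows.append(
            ProductRow(
                title=str(item["title"]),
                category=str(item["category"]),
                price=float(item["price"]),
            )
        )
    return rows


print(load_rows())
```

Code downstream of load_rows sees only list[ProductRow], so the type checker can follow it without help. Where a suppression is still unavoidable, a line-scoped comment naming a single rule keeps the exception visible and narrow compared with disabling the rule for the whole file.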

3 participants