Loreer is a wise AI assistant with vast knowledge of all lores, whether movie, series, game, or book related. Powered by LLMs.


Loreer

(demo GIF)

Loreer is a sophisticated AI assistant designed to provide comprehensive knowledge on various lores, including movies, series, games, and books. Powered by advanced Large Language Models (LLMs), Loreer utilizes a Retrieval-Augmented Generation (RAG) system to deliver precise and relevant information. This RAG system is built upon League of Legends (LoL) lore and can precisely answer any LoL's lore related questions.
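The core RAG idea can be sketched in a few lines: retrieved lore chunks are combined with the user's question into a single prompt for the LLM. The function and example chunks below are illustrative assumptions, not the repo's actual code.

```python
# Minimal sketch of the RAG prompt-assembly step (illustrative names only;
# the notebooks in this repo define the real pipeline).

def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved lore chunks and the user question into one prompt."""
    context = "\n\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer the question using only the lore context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Fabricated example chunks for illustration.
chunks = [
    "Jinx is a manic criminal from Zaun.",
    "Vi is an enforcer from Piltover.",
]
prompt = build_rag_prompt("Who is Jinx?", chunks)
```

The assembled `prompt` is then sent to the local LLM, which answers grounded in the retrieved context rather than from its parametric memory alone.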

Below is a high-level overview of the system design (see the Loreer-design diagram).

Features

  • Comprehensive lore knowledge
  • Powered by state-of-the-art LLM
  • Efficient data processing and embedding
  • Local RAG system integration

Repo Structure

| Name | Objective | Path |
| --- | --- | --- |
| `00-parse_xml_dumb.ipynb` | Parses the XML file and removes unnecessary pages | Link |
| `01-preprocess_data.ipynb` | Cleans the data and splits it into chunks | Link |
| `02-embed_chunks.ipynb` | Embeds the chunks into vector embeddings for fast retrieval | Link |
| `03-pure_LLM.ipynb` | Runs a quantized Llama-3-8b locally using llama.cpp | Link |
| `04-RAG_system.ipynb` | Integrates the embedded query and the LLM into a single prompt | Link |
| `web_app.py` | A Streamlit web app that uses the RAG system as its backend | Link |
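The chunking step (`01-preprocess_data.ipynb`) can be illustrated with a simple fixed-size splitter with overlap. The chunk size and overlap values below are assumptions for illustration, not the notebook's actual parameters.

```python
# Illustrative fixed-size chunking with overlap, similar in spirit to the
# preprocessing notebook (size/overlap values here are made up).

def chunk_text(text: str, size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into overlapping character chunks ready for embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "".join(chr(65 + i % 26) for i in range(250))  # dummy 250-char document
chunks = chunk_text(doc)
```

Overlap ensures a sentence straddling a chunk boundary still appears whole in at least one chunk, which helps retrieval quality.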

Web App

A web app built with Streamlit serves as the UI. To run it, simply run the command below.

streamlit run web_app.py

Models and data

Models:

A RAG system can involve several models; this repo uses the following:

| Name | Objective | Link |
| --- | --- | --- |
| `meta-llama-3-8b-instruct.Q4_K_M.gguf` | A quantized Llama 3 8B model in GGUF format for use with llama.cpp | HF |
| `Alibaba-NLP/gte-base-en-v1.5` | An embedding model that supports a context length of up to 8192 tokens and ranks high on MTEB | HF |
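Retrieval over the embedded chunks boils down to nearest-neighbour search by cosine similarity. In the real system the vectors come from `Alibaba-NLP/gte-base-en-v1.5`; the tiny 3-d vectors below are fabricated purely to show the mechanics.

```python
import math

# Toy nearest-neighbour retrieval over precomputed embeddings. The 3-d
# vectors are made up; real embeddings from gte-base-en-v1.5 are 768-d.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

chunk_vectors = {
    "Jinx lore chunk": [0.9, 0.1, 0.0],
    "Ionia geography chunk": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of "Who is Jinx?"

best = max(chunk_vectors, key=lambda name: cosine(query_vec, chunk_vectors[name]))
```

A production vector database replaces this linear scan with an approximate index, but the ranking criterion is the same.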

Data:

For data, the LoL Wiki provides an official data dump, which was parsed, cleaned, and embedded into a vector database.
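Parsing such a dump (the job of `00-parse_xml_dumb.ipynb`) amounts to walking the XML tree, collecting page titles and text, and dropping non-lore pages. The inline sample dump below is fabricated; real MediaWiki dumps also carry an XML namespace that must be handled.

```python
import xml.etree.ElementTree as ET

# Sketch of extracting page titles and text from a MediaWiki-style XML dump.
# The tiny inline dump is fabricated for illustration.

SAMPLE_DUMP = """
<mediawiki>
  <page>
    <title>Jinx</title>
    <revision><text>Jinx is a champion from Zaun.</text></revision>
  </page>
  <page>
    <title>File:Logo.png</title>
    <revision><text>image page, not lore</text></revision>
  </page>
</mediawiki>
"""

root = ET.fromstring(SAMPLE_DUMP)
pages = {
    page.findtext("title"): page.findtext("revision/text")
    for page in root.iter("page")
    if not page.findtext("title", "").startswith("File:")  # drop non-lore pages
}
```

For a multi-gigabyte dump, `ET.iterparse` would be used instead of `fromstring` to avoid loading the whole file into memory.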

All links to download the models and the data are available in the repo.

Reproducibility

The repository's code serves as a strong foundation for extending this RAG system to other Wiki/Fandom domains.

Hardware requirements

I managed to run the quantized Llama 3 on a laptop with 16 GB of RAM and an RTX 3060 with 6 GB of VRAM. The full RAG pipeline (retrieval, information extraction and summarization, prompt answering) answers a query in about 2-10 seconds; it is astonishing just how much you can get out of a mid-tier laptop.

This performance is largely thanks to llama.cpp, which provides C++ inference for large language models (LLMs) and significantly reduces the overhead, latency, and resource consumption of a pure-Python stack.


License

This project is licensed under the MIT License.
