This repository demonstrates how to use the retrieval-augmented generation (RAG) pattern in conjunction with an existing SQL database.
The repository consists of two parts:
- setup contains all Ansible-related files, which help you set up this small demo.
- src contains the actual application.
To deploy the demo on a Power10/Power11 LPAR, adjust the inventory.yml accordingly (hostname/IP, password, user, etc.).
On your local machine, install Ansible in a Python environment:
$ python -m pip install ansible
Next, execute the entrypoint playbook to compile llama.cpp and to deploy the code and instruct models as well as the frontend and backend services:
$ ansible-playbook -i setup/inventory.yml setup/entrypoint.yml
The frontend and the backend are two independent services that do not interact with each other. The frontend is based on Gradio and calls the code and instruct models directly. The backend is meant to be used to integrate the SQL-RAG service into a broader (agentic) AI solution.
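To make the interplay of the two models more concrete, here is a minimal sketch of how an SQL-RAG round trip could look. It is not the code shipped in src. It assumes that the code model turns a question into a SQL query and that the instruct model phrases the final answer from the query result. The function names (answer_question, fake_code_model, fake_instruct_model) and the customers table are hypothetical, and the two model calls are replaced by hard-coded stand-ins so that the example runs without any model server.

```python
import sqlite3
from typing import Callable


def answer_question(
    question: str,
    conn: sqlite3.Connection,
    generate_sql: Callable[[str, str], str],
    generate_answer: Callable[[str, list], str],
) -> str:
    """Answer a question by generating SQL, running it, and phrasing the result."""
    # Collect the table definitions so the code model knows the schema.
    schema = "\n".join(
        row[0]
        for row in conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
        )
    )
    # Assumption: the code model returns a single SQL statement as plain text.
    sql = generate_sql(question, schema)
    # Crude guard: refuse anything that does not start with SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    rows = conn.execute(sql).fetchall()
    # Assumption: the instruct model turns the raw rows into a readable answer.
    return generate_answer(question, rows)


# Hypothetical stand-ins for the code and instruct models.
def fake_code_model(question: str, schema: str) -> str:
    return "SELECT COUNT(*) FROM customers"


def fake_instruct_model(question: str, rows: list) -> str:
    return f"There are {rows[0][0]} customers."


# Hypothetical in-memory database used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Linus",)])

result = answer_question(
    "How many customers are there?", conn, fake_code_model, fake_instruct_model
)
print(result)  # There are 2 customers.
```

In a real deployment, the two stand-in functions would be replaced by calls to the deployed code and instruct models. The SELECT check is only a simple safeguard; a read-only database connection would be a more robust choice.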
Note
If you want to execute each step independently, have a look at entrypoint.yml and run each imported playbook separately.
The frontend should be available at http://<HOSTNAME>:<PORT>, as specified in inventory.yml.
More documentation and features to come!
- database: currently, a SQLite database file is shipped, which was created using the db.py and data.py modules. The next release will include a playbook for setting up the database itself, including a choice of which database to use (e.g. PostgreSQL).
- backend API documentation