
Add LLM inference for response generation#43

Merged
edkaya merged 9 commits into main from feature/setup-llm-inference
Jun 10, 2025

Conversation

Collaborator

@edkaya edkaya commented Jun 10, 2025

  • Added BaseChat model
  • Added webui service to make calls to llm
  • Added rag service to implement retrieval logic + prompt generation
  • Introduced the API endpoint in genai -> /genai/generate
    -- Body must contain the query and conversation_id (or another kind of id, which is necessary to make a DB call to fetch the user's chat history; to be discussed further this week).
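A minimal sketch of the request body the endpoint expects, based on the description above (the exact schema is still open, and `build_generate_request` is a hypothetical helper name):

```python
import json


def build_generate_request(query: str, conversation_id: str) -> str:
    """Serialize the body expected by POST /genai/generate.

    conversation_id is the id used server-side to fetch the user's
    chat history; per the PR description it may be replaced by
    another kind of id later.
    """
    body = {
        "query": query,
        "conversation_id": conversation_id,
    }
    return json.dumps(body)


example = build_generate_request("What can I cook with eggs?", "conv-123")
```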

@edkaya edkaya requested a review from esadakcam June 10, 2025 00:12
collection_name
)
# todo: retrieve messages from chat history as BaseMessage
messages = []
Collaborator

I think we have three options for fetching the user's history:

  • The request body contains the history. On the server side, before sending the request to the GenAI service, we can attach the history to the request. The chat flow is then managed by the server, and we can also attach the user's food preferences.
  • The GenAI service can query MongoDB.
  • The GenAI service can request the history from the server.

Personally, I prefer the first one. The flow is easier to manage.
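A rough sketch of the first option (all names hypothetical): the server enriches the outgoing GenAI request with the chat history and, optionally, the user's food preferences before forwarding it.

```python
from typing import Dict, List


def attach_history(request: Dict, history: List[Dict], preferences: List[str]) -> Dict:
    """Enrich the outgoing GenAI request with chat history and preferences.

    history entries follow the {"role": ..., "content": ...} shape
    discussed elsewhere in this PR.
    """
    enriched = dict(request)  # copy so the original request is untouched
    enriched["history"] = history
    enriched["preferences"] = preferences
    return enriched
```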

Collaborator Author

I also thought about the first two options, but the first one makes better sense to me too. That way we would not have to handle conversationId logic in the GenAI service either. Let's use the first one.


from genai.config import Config

BASE_URL = "https://gpu.aet.cit.tum.de/"
Collaborator

Let's make it a config value so that we can easily change the LLM provider.
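One way this could look (the `LLM_BASE_URL` attribute and environment-variable override are assumptions, not the project's actual `Config` shape): the URL moves out of the module constant and into configuration.

```python
import os


class Config:
    # Falls back to the current provider if no override is configured,
    # so behaviour is unchanged by default.
    LLM_BASE_URL = os.getenv("LLM_BASE_URL", "https://gpu.aet.cit.tum.de/")


BASE_URL = Config.LLM_BASE_URL
```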

stop=None,
**kwargs) -> ChatResult:
prompt = "\n".join([
msg.content for msg in messages if isinstance(msg, HumanMessage)
Collaborator

Does this mean we only add the user's messages as context? If so, shouldn't we include the whole chat?

Collaborator Author

Yes, correct. Including the LLM's messages sometimes has a downside: the LLM can hallucinate from its previous responses in the chat history and fail to generate a correct answer for an already-given query (if the earlier response in the history was wrong).

But if we think about our application, I don't think it will be a huge issue, since we prioritize the context first and use full RAG functionality. So I will add AI messages in response generation 👍
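A sketch of the agreed change, with small dataclasses standing in for LangChain's `HumanMessage`/`AIMessage`: the prompt is built from both user and assistant turns instead of user turns only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HumanMessage:
    content: str


@dataclass
class AIMessage:
    content: str


def build_prompt(messages: List[object]) -> str:
    """Join user and assistant turns into a single prompt string."""
    lines = []
    for msg in messages:
        if isinstance(msg, HumanMessage):
            lines.append(f"User: {msg.content}")
        elif isinstance(msg, AIMessage):
            lines.append(f"Assistant: {msg.content}")
    return "\n".join(lines)
```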

@edkaya edkaya requested a review from esadakcam June 10, 2025 11:13
messages (List[Dict]): Full conversation history, each with 'role' and 'content'
Example:
[
{"role": "user", "content": "I have eggs and tomatoes."},
Collaborator

On the server, role is an enum stored in all capital letters. It is better to check the role case-insensitively.
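A minimal example of case-insensitive role handling (`normalize_role` is a hypothetical helper, not existing project code):

```python
VALID_ROLES = {"user", "assistant", "system"}


def normalize_role(role: str) -> str:
    """Lower-case the role so server enums like "USER" match."""
    normalized = role.strip().lower()
    if normalized not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return normalized
```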

Collaborator Author

@esadakcam I changed it, thanks! I just checked the server code, and I think there is a typo in the server enums: it should be ASSISTANT instead of ASISTANT. Since I don't want to break the structure in the server code, could you change it in a follow-up?

@edkaya edkaya merged commit 8ce1842 into main Jun 10, 2025
8 checks passed
@edkaya edkaya deleted the feature/setup-llm-inference branch June 21, 2025 15:06