Conversation
david20571015 commented on Mar 10, 2025
- feat: update proto files
- refactor(rag): combine services to a rag workflow
- doc(config): update execute instructions
- feat(client): add example client implementation for searching with gRPC
- fix(docker): remove wip docker files
Hello @david20571015, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a new Retrieval-Augmented Generation (RAG) workflow, combining search and summarization services into a single, streamlined process. It updates the project's configuration, adds an example client implementation for searching with gRPC, and removes WIP docker files. The changes include refactoring the service architecture, updating protobuf definitions, and providing instructions for executing the new workflow.
Highlights
- Feature: RAG Workflow: Combines the search and summarization services into a single RAG workflow for more efficient content retrieval and summarization.
- Refactor: Service Architecture: Refactors the backend to integrate the RAG workflow, replacing the separate search and summarize services.
- Configuration: Updates: Updates the configuration files to support the new RAG workflow, including new settings for embedding models, prompts, and LLM configurations.
- Client Implementation: Adds an example client implementation for searching with gRPC, demonstrating how to interact with the new RAG service.
- Documentation: Updates the README with instructions on how to configure and execute the new RAG workflow.
Changelog
- .gitignore
- Removes specific config files from being ignored, allowing for easier tracking of configuration changes.
- README.md
- Updates the usage instructions to reflect the new RAG workflow.
- Removes the reference to configs/example.toml.
- Adds an example of how to run the client script.
- configs/config.toml
- Introduces a new configuration file with settings for the RAG workflow, including server settings, embedding model, prompt templates, and LLM configurations.
- configs/example.toml
- Removes the example configuration file, as the settings are now consolidated in config.toml.
- llm_backend/__init__.py
- Removes the separate search and summarize service setup.
- Introduces the RAG service setup.
- Updates imports to reflect the new RAG structure.
- llm_backend/rag/__init__.py
- Introduces the RAG module with necessary imports for config and service.
- llm_backend/rag/config.py
- Defines the configuration settings for the RAG workflow, including embedding models, prompt templates, and LLM configurations.
- Introduces RetrieveConfig, ChatGptConfig, SummarizeConfig, and RagConfig classes.
- Adds validation for prompt templates to ensure they contain required placeholders.
- llm_backend/rag/content_formatters.py
- Introduces content formatters for plain and numbered text formats.
- llm_backend/rag/service.py
- Implements the RAG service, handling requests and orchestrating the RAG workflow.
- llm_backend/rag/workflow.py
- Defines the RAG workflow, including steps for retrieving and summarizing content.
- Uses llama-index to create the RAG pipeline.
- llm_backend/search/__init__.py
- Removes search service related code.
- llm_backend/search/config.py
- Removes search service related code.
- llm_backend/search/service.py
- Removes search service related code.
- llm_backend/summarize/__init__.py
- Removes summarize service related code.
- llm_backend/summarize/service.py
- Removes summarize service related code.
- llm_backend/utils.py
- Removes utility functions.
- protos
- Updates the submodule to the latest commit.
- pyproject.toml
- Updates dependencies, including llama-index-core.
- scripts/client.py
- Adds an example client implementation for searching with gRPC.
- scripts/serve.py
- Updates the server script to use the new RAG service.
- Removes the setup for the separate search and summarize services.
- uv.lock
- Updates the lockfile to reflect the new dependencies.
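The content formatters listed for llm_backend/rag/content_formatters.py are not shown in this summary. A minimal sketch of what "plain" and "numbered" formatters might look like, assuming they join retrieved passages into the numbered input format the system template describes (function names here are assumptions, not the actual implementation):

```python
# Hypothetical sketch of the plain and numbered content formatters;
# the real llm_backend/rag/content_formatters.py may differ.

def format_plain(contents: list[str]) -> str:
    """Join retrieved passages with a blank line between them."""
    return "\n\n".join(contents)

def format_numbered(contents: list[str]) -> str:
    """Prefix each passage with '1.', '2.', ... matching the
    '1.xxx 2.xxx ...' input format the system template expects."""
    return "\n\n".join(
        f"{i}.{text}" for i, text in enumerate(contents, start=1)
    )
```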
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
From scattered sources, knowledge we seek,
A RAG pipeline, making us sleek.
Retrieve and summarize, the workflow's art,
A symphony of data, playing its part.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution. ↩
Code Review
The pull request refactors the backend to use a single RAG (Retrieval-Augmented Generation) workflow, combining the search and summarization services. This simplifies the architecture and improves maintainability. The changes include updating proto files, refactoring the service implementation, updating configurations, and providing an example client implementation. Overall, the changes look good, but there are a few areas that could be improved.
Summary of Findings
- Configuration Clarity: The configuration file (configs/config.toml) contains a large, multi-line string for system_template that includes comments and formatting instructions. This could be made more readable and maintainable by breaking it down into smaller, well-defined sections or using a more structured format.
- Error Handling: In scripts/serve.py, the load_config function exits the program when the config file is not found. While this prevents the server from starting with an invalid configuration, it might be better to handle this more gracefully, such as by providing a default configuration or allowing the user to specify an alternative path.
- Default Values in Config: The configs/config.toml file contains default values for various parameters. It would be beneficial to ensure that these default values are consistent with the default values defined in the code (e.g., in llm_backend/rag/config.py).
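One way to address the "Default Values in Config" finding is to define the defaults once in code and compare the loaded TOML against them. This is an illustrative sketch, not the project's actual approach; the names (DEFAULTS, find_overrides) and default values are assumptions:

```python
# Sketch: keep one source of truth for defaults and surface only the
# keys a config file actually overrides. Values here are assumed, not
# taken from the PR's configs/config.toml.
DEFAULTS = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
}

def find_overrides(config: dict) -> dict:
    """Return only the keys whose values differ from the code defaults,
    making drift between TOML and code defaults easy to spot."""
    return {k: v for k, v in config.items() if DEFAULTS.get(k) != v}
```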
Merge Readiness
The pull request introduces significant refactoring and new features. While the changes appear to be well-structured, it's crucial to ensure thorough testing, particularly around the combined RAG workflow. I recommend addressing the configuration clarity and error handling suggestions before merging. I am unable to directly approve this pull request, and recommend that others review and approve this code before merging.
The system_template in configs/config.toml:

```toml
system_template = """
# Project Mission: My project mission is to extract 5 articles of the same type from the internet each time and provide them to Chat GPT in the same format to generate summaries and digests, making it convenient for the general public to read.
# Input Format: The format during input is as follows: 1.xxx 2.xxx 3.xxx 4.xxx 5.xxx Each news article is numbered with a digit title. There is a blank line between different news articles, but within the same article, there are no line breaks.
# Detailed Project Execution: The detailed execution of the project involves refraining from adding personal opinions. I only generate summaries based on the provided news and refrain from providing responses beyond the scope of the news.
# Audience for My Content: The audience comprises professionals from various fields, as well as students and homemakers. They span a wide range of age groups and have a strong desire for knowledge. However, due to limited time, they cannot personally read a large amount of news information. Therefore, my content typically needs to be transformed into something understandable by the general public, with logical analysis involving specific questions and solutions.

# Assuming you are now a reader, think step by step about what you think the key points of the news would be, and provide the first version of the summary. Then, based on this summary, pose sub-questions and further modify to provide the final summary.
# Answer in Traditional Chinese, and refrain from providing thoughts and content beyond what you've provided. Endeavor to comprehensively describe the key points of the news.
# Responses should strive to be rigorous and formal, with real evidence when answering questions.
# Answers can be as complete and comprehensive as possible, expanding on details and actual content.
# The "Output Format" is: provide an overarching title that summarizes the news content above, then summarizes the content.
"""
```
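Per the "Configuration Clarity" finding, one hypothetical way to make this template more maintainable is to keep each instruction as a named section and assemble the prompt in code. This is an illustrative sketch under that assumption, not the project's implementation; the section names are invented:

```python
# Illustrative: build the system prompt from named sections instead of
# one opaque multi-line string. Section names are assumptions.
SECTIONS = {
    "Project Mission": "Extract 5 articles of the same type and summarize them.",
    "Language": "Answer in Traditional Chinese.",
    "Output Format": "Provide an overarching title, then summarize the content.",
}

def build_system_template(sections: dict[str, str]) -> str:
    """Render each section as a '# Name: text' line, mirroring the
    comment-style headings the original template already uses."""
    return "\n".join(f"# {name}: {text}" for name, text in sections.items())
```

Each section can then be edited or validated independently rather than inside one large TOML string.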
```diff
 model: Annotated[
     str,
-    Field("gpt-3.5-turbo"),
+    Field(DEFAULT_OPENAI_MODEL),
     AfterValidator(is_available_model),
 ]
```
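The AfterValidator(is_available_model) in this snippet presumably rejects unrecognized model names. A plain-Python sketch of such a validator follows; the allowed-model set is an assumption, and the real is_available_model in llm_backend/rag/config.py may check availability differently (e.g. via an API call):

```python
# Assumed set of allowed models -- the real implementation may differ.
AVAILABLE_MODELS = {"gpt-3.5-turbo", "gpt-4"}

def is_available_model(model: str) -> str:
    """Pydantic AfterValidator convention: return the value unchanged
    if valid, otherwise raise ValueError."""
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return model
```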
```diff
 logger.error("Config file not found: %s", e)
-raise
+sys.exit(1)
```
Instead of exiting the program when the config file is not found, consider handling this more gracefully. For example, you could provide a default configuration or allow the user to specify an alternative path. This would make the server more robust and user-friendly.