Conversation
david20571015 commented on Mar 10, 2025
- feat: update proto files
- refactor(rag): combine services to a rag workflow
- doc(config): update execute instructions
- feat(client): add example client implementation for searching with gRPC
- fix(docker): remove wip docker files
Hello @david20571015, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a new Retrieval-Augmented Generation (RAG) workflow, combining search and summarization services into a single, streamlined process. It updates the project's configuration, adds an example client implementation for searching with gRPC, and removes WIP docker files. The changes include refactoring the service architecture, updating protobuf definitions, and providing instructions for executing the new workflow.
Highlights
- Feature: RAG Workflow: Combines the search and summarization services into a single RAG workflow for more efficient content retrieval and summarization.
- Refactor: Service Architecture: Refactors the backend to integrate the RAG workflow, replacing the separate search and summarize services.
- Configuration: Updates: Updates the configuration files to support the new RAG workflow, including new settings for embedding models, prompts, and LLM configurations.
- Client Implementation: Adds an example client implementation for searching with gRPC, demonstrating how to interact with the new RAG service.
- Documentation: Updates the README with instructions on how to configure and execute the new RAG workflow.
Changelog
- .gitignore
- Removes specific config files from being ignored, allowing for easier tracking of configuration changes.
- README.md
- Updates the usage instructions to reflect the new RAG workflow.
- Removes the reference to configs/example.toml.
- Adds an example of how to run the client script.
- configs/config.toml
- Introduces a new configuration file with settings for the RAG workflow, including server settings, embedding model, prompt templates, and LLM configurations.
- configs/example.toml
- Removes the example configuration file, as the settings are now consolidated in config.toml.
- llm_backend/__init__.py
- Removes the separate search and summarize service setup.
- Introduces the RAG service setup.
- Updates imports to reflect the new RAG structure.
- llm_backend/rag/__init__.py
- Introduces the RAG module with necessary imports for config and service.
- llm_backend/rag/config.py
- Defines the configuration settings for the RAG workflow, including embedding models, prompt templates, and LLM configurations.
- Introduces RetrieveConfig, ChatGptConfig, SummarizeConfig, and RagConfig classes.
- Adds validation for prompt templates to ensure they contain required placeholders.
- llm_backend/rag/content_formatters.py
- Introduces content formatters for plain and numbered text formats.
- llm_backend/rag/service.py
- Implements the RAG service, handling requests and orchestrating the RAG workflow.
- llm_backend/rag/workflow.py
- Defines the RAG workflow, including steps for retrieving and summarizing content.
- Uses llama-index to create the RAG pipeline.
- llm_backend/search/__init__.py
- Removes search service related code.
- llm_backend/search/config.py
- Removes search service related code.
- llm_backend/search/service.py
- Removes search service related code.
- llm_backend/summarize/__init__.py
- Removes summarize service related code.
- llm_backend/summarize/service.py
- Removes summarize service related code.
- llm_backend/utils.py
- Removes utility functions.
- protos
- Updates the submodule to the latest commit.
- pyproject.toml
- Updates dependencies, including llama-index-core.
- scripts/client.py
- Adds an example client implementation for searching with gRPC.
- scripts/serve.py
- Updates the server script to use the new RAG service.
- Removes the setup for the separate search and summarize services.
- uv.lock
- Updates the lockfile to reflect the new dependencies.
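The content formatters listed for llm_backend/rag/content_formatters.py are not shown in this summary. A minimal sketch of what "plain" and "numbered" formatters might look like, assuming they join retrieved passages into the numbered input format the system template describes (function names here are assumptions, not the actual implementation):

```python
# Hypothetical sketch of the plain and numbered content formatters;
# the real llm_backend/rag/content_formatters.py may differ.

def format_plain(contents: list[str]) -> str:
    """Join retrieved passages with a blank line between them."""
    return "\n\n".join(contents)

def format_numbered(contents: list[str]) -> str:
    """Prefix each passage with '1.', '2.', ... matching the
    '1.xxx 2.xxx ...' input format the system template expects."""
    return "\n\n".join(
        f"{i}.{text}" for i, text in enumerate(contents, start=1)
    )
```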
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
From scattered sources, knowledge we seek,
A RAG pipeline, making us sleek.
Retrieve and summarize, the workflow's art,
A symphony of data, playing its part.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution. ↩
Code Review
The pull request refactors the backend to use a single RAG (Retrieval-Augmented Generation) workflow, combining the search and summarization services. This simplifies the architecture and improves maintainability. The changes include updating proto files, refactoring the service implementation, updating configurations, and providing an example client implementation. Overall, the changes look good, but there are a few areas that could be improved.
Summary of Findings
- Configuration Clarity: The configuration file (configs/config.toml) contains a large, multi-line string for system_template that includes comments and formatting instructions. This could be made more readable and maintainable by breaking it down into smaller, well-defined sections or using a more structured format.
- Error Handling: In scripts/serve.py, the load_config function exits the program when the config file is not found. While this prevents the server from starting with an invalid configuration, it might be better to handle this more gracefully, such as by providing a default configuration or allowing the user to specify an alternative path.
- Default Values in Config: The configs/config.toml file contains default values for various parameters. It would be beneficial to ensure that these default values are consistent with the default values defined in the code (e.g., in llm_backend/rag/config.py).
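One way to address the "Default Values in Config" finding is to define the defaults once in code and compare the loaded TOML against them. This is an illustrative sketch, not the project's actual approach; the names (DEFAULTS, find_overrides) and default values are assumptions:

```python
# Sketch: keep one source of truth for defaults and surface only the
# keys a config file actually overrides. Values here are assumed, not
# taken from the PR's configs/config.toml.
DEFAULTS = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
}

def find_overrides(config: dict) -> dict:
    """Return only the keys whose values differ from the code defaults,
    making drift between TOML and code defaults easy to spot."""
    return {k: v for k, v in config.items() if DEFAULTS.get(k) != v}
```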
Merge Readiness
The pull request introduces significant refactoring and new features. While the changes appear to be well-structured, it's crucial to ensure thorough testing, particularly around the combined RAG workflow. I recommend addressing the configuration clarity and error handling suggestions before merging. I am unable to directly approve this pull request, and recommend that others review and approve this code before merging.
The system_template in configs/config.toml:

```toml
system_template = """
# Project Mission: My project mission is to extract 5 articles of the same type from the internet each time and provide them to Chat GPT in the same format to generate summaries and digests, making it convenient for the general public to read.
# Input Format: The format during input is as follows: 1.xxx 2.xxx 3.xxx 4.xxx 5.xxx Each news article is numbered with a digit title. There is a blank line between different news articles, but within the same article, there are no line breaks.
# Detailed Project Execution: The detailed execution of the project involves refraining from adding personal opinions. I only generate summaries based on the provided news and refrain from providing responses beyond the scope of the news.
# Audience for My Content: The audience comprises professionals from various fields, as well as students and homemakers. They span a wide range of age groups and have a strong desire for knowledge. However, due to limited time, they cannot personally read a large amount of news information. Therefore, my content typically needs to be transformed into something understandable by the general public, with logical analysis involving specific questions and solutions.

# Assuming you are now a reader, think step by step about what you think the key points of the news would be, and provide the first version of the summary. Then, based on this summary, pose sub-questions and further modify to provide the final summary.
# Answer in Traditional Chinese, and refrain from providing thoughts and content beyond what you've provided. Endeavor to comprehensively describe the key points of the news.
# Responses should strive to be rigorous and formal, with real evidence when answering questions.
# Answers can be as complete and comprehensive as possible, expanding on details and actual content.
# The "Output Format" is: provide an overarching title that summarizes the news content above, then summarizes the content.
"""
```
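Per the "Configuration Clarity" finding, one hypothetical way to make this template more maintainable is to keep each instruction as a named section and assemble the prompt in code. This is an illustrative sketch under that assumption, not the project's implementation; the section names are invented:

```python
# Illustrative: build the system prompt from named sections instead of
# one opaque multi-line string. Section names are assumptions.
SECTIONS = {
    "Project Mission": "Extract 5 articles of the same type and summarize them.",
    "Language": "Answer in Traditional Chinese.",
    "Output Format": "Provide an overarching title, then summarize the content.",
}

def build_system_template(sections: dict[str, str]) -> str:
    """Render each section as a '# Name: text' line, mirroring the
    comment-style headings the original template already uses."""
    return "\n".join(f"# {name}: {text}" for name, text in sections.items())
```

Each section can then be edited or validated independently rather than inside one large TOML string.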
```diff
 model: Annotated[
     str,
-    Field("gpt-3.5-turbo"),
+    Field(DEFAULT_OPENAI_MODEL),
     AfterValidator(is_available_model),
 ]
```
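The AfterValidator(is_available_model) in this snippet presumably rejects unrecognized model names. A plain-Python sketch of such a validator follows; the allowed-model set is an assumption, and the real is_available_model in llm_backend/rag/config.py may check availability differently (e.g. via an API call):

```python
# Assumed set of allowed models -- the real implementation may differ.
AVAILABLE_MODELS = {"gpt-3.5-turbo", "gpt-4"}

def is_available_model(model: str) -> str:
    """Pydantic AfterValidator convention: return the value unchanged
    if valid, otherwise raise ValueError."""
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return model
```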
```diff
 logger.error("Config file not found: %s", e)
-raise
+sys.exit(1)
```
Instead of exiting the program when the config file is not found, consider handling this more gracefully. For example, you could provide a default configuration or allow the user to specify an alternative path. This would make the server more robust and user-friendly.