
rag #9

Merged

david20571015 merged 6 commits into main from rag on Mar 10, 2025
Conversation

@david20571015
Contributor

  • feat: update proto files
  • refactor(rag): combine services to a rag workflow
  • doc(config): update execute instructions
  • feat(client): add example client implementation for searching with gRPC
  • fix(docker): remove wip docker files

Copilot AI review requested due to automatic review settings on March 10, 2025 05:45

@gemini-code-assist gemini-code-assist bot left a comment


Hello @david20571015, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces a new Retrieval-Augmented Generation (RAG) workflow, combining search and summarization services into a single, streamlined process. It updates the project's configuration, adds an example client implementation for searching with gRPC, and removes WIP docker files. The changes include refactoring the service architecture, updating protobuf definitions, and providing instructions for executing the new workflow.

Highlights

  • Feature: RAG Workflow: Combines the search and summarization services into a single RAG workflow for more efficient content retrieval and summarization.
  • Refactor: Service Architecture: Refactors the backend to integrate the RAG workflow, replacing the separate search and summarize services.
  • Configuration: Updates: Updates the configuration files to support the new RAG workflow, including new settings for embedding models, prompts, and LLM configurations.
  • Client Implementation: Adds an example client implementation for searching with gRPC, demonstrating how to interact with the new RAG service.
  • Documentation: Updates the README with instructions on how to configure and execute the new RAG workflow.
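The combined retrieve-then-summarize workflow described in the highlights can be sketched in plain Python. This is an illustrative stand-in, not the PR's actual implementation (which builds the pipeline with llama-index); the `RagWorkflow` class and the toy `retriever`/`summarizer` callables below are invented for demonstration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RagWorkflow:
    """Two-step pipeline: retrieve relevant documents, then summarize them.

    `retriever` and `summarizer` are injected callables standing in for the
    embedding-based retrieval and LLM summarization steps.
    """
    retriever: Callable[[str], list[str]]
    summarizer: Callable[[list[str]], str]

    def run(self, query: str) -> str:
        docs = self.retriever(query)  # step 1: content retrieval
        return self.summarizer(docs)  # step 2: summarization

# Toy stand-ins so the sketch runs without any model or index:
corpus = [
    "grpc servers stream responses",
    "rag combines retrieval and generation",
]
workflow = RagWorkflow(
    retriever=lambda q: [d for d in corpus if any(w in d for w in q.lower().split())],
    summarizer=lambda docs: f"{len(docs)} document(s): " + "; ".join(docs),
)
print(workflow.run("rag retrieval"))  # → 1 document(s): rag combines retrieval and generation
```

The value of this shape is that the gRPC service layer only needs to call `run`, which is what lets the PR collapse the separate search and summarize services into one endpoint.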

Changelog

Click here to see the changelog
  • .gitignore
    • Removes specific config files from being ignored, allowing for easier tracking of configuration changes.
  • README.md
    • Updates the usage instructions to reflect the new RAG workflow.
    • Removes the reference to configs/example.toml.
    • Adds an example of how to run the client script.
  • configs/config.toml
    • Introduces a new configuration file with settings for the RAG workflow, including server settings, embedding model, prompt templates, and LLM configurations.
  • configs/example.toml
    • Removes the example configuration file, as the settings are now consolidated in config.toml.
  • llm_backend/__init__.py
    • Removes the separate search and summarize service setup.
    • Introduces the RAG service setup.
    • Updates imports to reflect the new RAG structure.
  • llm_backend/rag/__init__.py
    • Introduces the RAG module with necessary imports for config and service.
  • llm_backend/rag/config.py
    • Defines the configuration settings for the RAG workflow, including embedding models, prompt templates, and LLM configurations.
    • Introduces RetrieveConfig, ChatGptConfig, SummarizeConfig, and RagConfig classes.
    • Adds validation for prompt templates to ensure they contain required placeholders.
  • llm_backend/rag/content_formatters.py
    • Introduces content formatters for plain and numbered text formats.
  • llm_backend/rag/service.py
    • Implements the RAG service, handling requests and orchestrating the RAG workflow.
  • llm_backend/rag/workflow.py
    • Defines the RAG workflow, including steps for retrieving and summarizing content.
    • Uses llama-index to create the RAG pipeline.
  • llm_backend/search/__init__.py
    • Removes search service related code.
  • llm_backend/search/config.py
    • Removes search service related code.
  • llm_backend/search/service.py
    • Removes search service related code.
  • llm_backend/summarize/__init__.py
    • Removes summarize service related code.
  • llm_backend/summarize/service.py
    • Removes summarize service related code.
  • llm_backend/utils.py
    • Removes utility functions.
  • protos
    • Updates the submodule to the latest commit.
  • pyproject.toml
    • Updates dependencies, including llama-index-core.
  • scripts/client.py
    • Adds an example client implementation for searching with gRPC.
  • scripts/serve.py
    • Updates the server script to use the new RAG service.
    • Removes the setup for the separate search and summarize services.
  • uv.lock
    • Updates the lockfile to reflect the new dependencies.
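The prompt-template validation noted for llm_backend/rag/config.py can be sketched with the standard library. The real config classes use pydantic; the `SummarizeConfig` dataclass and the `{content}` placeholder name below are illustrative assumptions, but the `string.Formatter` placeholder check is a standard technique.

```python
from dataclasses import dataclass
from string import Formatter

def required_placeholders(template: str, required: set[str]) -> str:
    """Validate that a prompt template contains every required placeholder."""
    found = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = required - found
    if missing:
        raise ValueError(f"template missing placeholders: {sorted(missing)}")
    return template

@dataclass
class SummarizeConfig:
    # The real config uses pydantic models; this dataclass stand-in
    # shows the same validate-on-construction behavior.
    user_template: str

    def __post_init__(self) -> None:
        required_placeholders(self.user_template, {"content"})

cfg = SummarizeConfig(user_template="Summarize the following:\n{content}")
print(cfg.user_template.format(content="..."))

# A template without {content} is rejected at construction time:
try:
    SummarizeConfig(user_template="Summarize the news.")
except ValueError as e:
    print(e)
```

Failing fast at config-load time like this surfaces a bad template on startup rather than on the first request.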
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


From scattered sources, knowledge we seek,
A RAG pipeline, making us sleek.
Retrieve and summarize, the workflow's art,
A symphony of data, playing its part.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@david20571015 david20571015 merged commit 2c072bc into main Mar 10, 2025
1 check passed
@david20571015 david20571015 deleted the rag branch March 10, 2025 05:46

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request refactors the backend to use a single RAG (Retrieval-Augmented Generation) workflow, combining the search and summarization services. This simplifies the architecture and improves maintainability. The changes include updating proto files, refactoring the service implementation, updating configurations, and providing an example client implementation. Overall, the changes look good, but there are a few areas that could be improved.

Summary of Findings

  • Configuration Clarity: The configuration file (configs/config.toml) contains a large, multi-line string for system_template that includes comments and formatting instructions. This could be made more readable and maintainable by breaking it down into smaller, well-defined sections or using a more structured format.
  • Error Handling: In scripts/serve.py, the load_config function exits the program when the config file is not found. While this prevents the server from starting with an invalid configuration, it might be better to handle this more gracefully, such as by providing a default configuration or allowing the user to specify an alternative path.
  • Default Values in Config: The configs/config.toml file contains default values for various parameters. It would be beneficial to ensure that these default values are consistent with the default values defined in the code (e.g., in llm_backend/rag/config.py).

Merge Readiness

The pull request introduces significant refactoring and new features. While the changes appear to be well-structured, it's crucial to ensure thorough testing, particularly around the combined RAG workflow. I recommend addressing the configuration clarity and error handling suggestions before merging. I am unable to directly approve this pull request, and recommend that others review and approve this code before merging.

Comment on lines +15 to +26
system_template = """
# Project Mission: My project mission is to extract 5 articles of the same type from the internet each time and provide them to Chat GPT in the same format to generate summaries and digests, making it convenient for the general public to read.
# Input Format: The format during input is as follows: 1.xxx 2.xxx 3.xxx 4.xxx 5.xxx Each news article is numbered with a digit title. There is a blank line between different news articles, but within the same article, there are no line breaks.
# Detailed Project Execution: The detailed execution of the project involves refraining from adding personal opinions. I only generate summaries based on the provided news and refrain from providing responses beyond the scope of the news.
# Audience for My Content: The audience comprises professionals from various fields, as well as students and homemakers. They span a wide range of age groups and have a strong desire for knowledge. However, due to limited time, they cannot personally read a large amount of news information. Therefore, my content typically needs to be transformed into something understandable by the general public, with logical analysis involving specific questions and solutions.

# Assuming you are now a reader, think step by step about what you think the key points of the news would be, and provide the first version of the summary. Then, based on this summary, pose sub-questions and further modify to provide the final summary.
# Answer in Traditional Chinese, and refrain from providing thoughts and content beyond what you've provided. Endeavor to comprehensively describe the key points of the news.
# Responses should strive to be rigorous and formal, with real evidence when answering questions.
# Answers can be as complete and comprehensive as possible, expanding on details and actual content.
# The "Output Format" is: provide an overarching title that summarizes the news content above, then summarizes the content.
"""


Severity: medium

This system_template is a very long string with comments and formatting instructions embedded within it. Consider breaking this down into smaller, more manageable parts, or using a more structured format (e.g., a list of instructions). This would improve readability and maintainability.
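One way to act on this suggestion is to keep each instruction as its own list entry and join them at load time. This is a hypothetical restructuring, not code from the PR; the instruction texts are condensed from the quoted template, and `SYSTEM_INSTRUCTIONS`/`build_system_template` are invented names.

```python
# Each instruction is one list entry; editing or reordering a rule no
# longer means hand-editing a single wall-of-text string.
SYSTEM_INSTRUCTIONS = [
    "Project Mission: extract 5 articles of the same type and summarize them for general readers.",
    "Input Format: articles are numbered 1.-5., separated by blank lines, with no line breaks within an article.",
    "Do not add personal opinions; summarize only the provided news.",
    "Answer in Traditional Chinese, rigorously and formally.",
    "Output Format: an overarching title, followed by the summary of the content.",
]

def build_system_template(instructions: list[str]) -> str:
    """Render the instruction list back into the single prompt string."""
    return "\n".join(f"# {line}" for line in instructions)

print(build_system_template(SYSTEM_INSTRUCTIONS))
```

The rendered output matches the original `# `-prefixed line style, so the LLM-facing prompt is unchanged while the config becomes diff-friendly.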

Comment on lines 77 to 81

    model: Annotated[
        str,
-       Field("gpt-3.5-turbo"),
+       Field(DEFAULT_OPENAI_MODEL),
        AfterValidator(is_available_model),
    ]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: medium

Consider adding a default value for the api_key field in ChatGptConfig. This could be useful for local development or testing purposes. You could also add a check to ensure that the API key is set before making any API calls.

Comment on lines 30 to +31

    logger.error("Config file not found: %s", e)
-   raise
+   sys.exit(1)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: medium

Instead of exiting the program when the config file is not found, consider handling this more gracefully. For example, you could provide a default configuration or allow the user to specify an alternative path. This would make the server more robust and user-friendly.

Suggested change

-   logger.error("Config file not found: %s", e)
-   raise
-   sys.exit(1)
+   sys.exit(1)

Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
