rag #9
Merged
6 commits
4fd424c feat: update proto files (david20571015)
ba9ed84 refactor(rag): combine services to a rag workflow (david20571015)
047913b doc(config): update execute instructions (david20571015)
339abe5 feat(client): add example client implementation for searching with gRPC (david20571015)
4c468c6 Merge branch 'main' into rag (david20571015)
7a32208 fix(docker): remove wip docker files (david20571015)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,4 @@ | ||
| .DS_Store | ||
| configs/* | ||
| !configs/example.toml | ||
| llm_backend/protos/ | ||
|
|
||
| # Byte-compiled / optimized / DLL files | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| [server] | ||
| host = 'localhost' | ||
| port = 50051 | ||
| max_workers = 10 | ||
|
|
||
| [service.retrieve] | ||
| # Name of embedding model. All available models can be found [here](https://huggingface.co/models?language=zh) | ||
| embedding_model = 'intfloat/multilingual-e5-large' | ||
|
|
||
| # The template must contain the `{keywords}` placeholder. | ||
| prompt_template = 'Please search for the content related to the following keywords: {keywords}.' | ||
| similarity_top_k = 5 | ||
|
|
||
| [service.summarize] | ||
| system_template = """ | ||
| # Project Mission: My project mission is to extract 5 articles of the same type from the internet each time and provide them to Chat GPT in the same format to generate summaries and digests, making it convenient for the general public to read. | ||
| # Input Format: The format during input is as follows: 1.xxx 2.xxx 3.xxx 4.xxx 5.xxx Each news article is numbered with a digit title. There is a blank line between different news articles, but within the same article, there are no line breaks. | ||
| # Detailed Project Execution: The detailed execution of the project involves refraining from adding personal opinions. I only generate summaries based on the provided news and refrain from providing responses beyond the scope of the news. | ||
| # Audience for My Content: The audience comprises professionals from various fields, as well as students and homemakers. They span a wide range of age groups and have a strong desire for knowledge. However, due to limited time, they cannot personally read a large amount of news information. Therefore, my content typically needs to be transformed into something understandable by the general public, with logical analysis involving specific questions and solutions. | ||
|
|
||
| # Assuming you are now a reader, think step by step about what you think the key points of the news would be, and provide the first version of the summary. Then, based on this summary, pose sub-questions and further modify to provide the final summary. | ||
| # Answer in Traditional Chinese, and refrain from providing thoughts and content beyond what you've provided. Endeavor to comprehensively describe the key points of the news. | ||
| # Responses should strive to be rigorous and formal, with real evidence when answering questions. | ||
| # Answers can be as complete and comprehensive as possible, expanding on details and actual content. | ||
| # The "Output Format" is: provide an overarching title that summarizes the news content above, then summarizes the content. | ||
| """ | ||
|
|
||
| # The template must contain the `{context_str}` and `{query_str}` placeholders. | ||
| user_template = """ | ||
| {query_str} | ||
| --------------------- | ||
| {context_str}""" | ||
|
|
||
| # The content of `{query_str}` placeholder in the user template. | ||
| query_str = '假設你是一個摘要抓取者,請將以下---內的文字做一篇文章摘要,用文章敘述的方式呈現,不要用列點的,至少要有500字,要有標題。' | ||
|
|
||
| # The transform function from the request strings to the query strings. | ||
| # Must be one of: | ||
| # - 'plain': The query string is the same as the request string. | ||
| # - 'numbered': Add a number (1., 2., ...) to the beginning of each request string. | ||
| content_format = 'plain' | ||
|
|
||
| [service.summarize.llm] | ||
| model = 'gpt-4o-mini' | ||
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| from ..protos.rag_pb2_grpc import ( | ||
| add_RagServiceServicer_to_server as add_RagServiceServicer_to_server, | ||
| ) | ||
| from .config import RagConfig as RagConfig | ||
| from .service import RagService as RagService |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,17 @@ | ||
| from enum import StrEnum | ||
| from typing import Annotated | ||
|
|
||
| from llama_index.llms.openai.utils import ALL_AVAILABLE_MODELS | ||
| from pydantic import AfterValidator, BaseModel, Field | ||
| from pydantic_settings import BaseSettings, SettingsConfigDict | ||
| from pydantic_settings import BaseSettings | ||
|
|
||
| from ..utils import contains_placeholder | ||
| from .content_formatters import ContentFormat | ||
|
|
||
| DEFAULT_EMBEDDING_MODEL = "intfloat/multilingual-e5-large" | ||
| DEFAULT_OPENAI_MODEL = "gpt-4o-mini" | ||
| DEFAULT_SIMILARITY_TOP_K = 10 | ||
| DEFAULT_QUERY_PROMPT_TEMPLATE = ( | ||
| "Please search for the content related to the following keywords: {keywords}." | ||
| ) | ||
| DEFAULT_SYSTEM_TEMPLATE = ( | ||
| "You are an expert Q&A system that is trusted around the world.\n" | ||
| "Always answer the query using the provided context information," | ||
|
|
@@ -30,9 +35,33 @@ | |
| DEFAULT_QUERY_STR = "請用繁體中文總結這幾篇新聞。" | ||
|
|
||
|
|
||
| class ContentFormat(StrEnum): | ||
| PLAIN = "plain" | ||
| NUMBERED = "numbered" | ||
| def contains_placeholder(*placeholders: str): | ||
| def validate_template(template: str): | ||
| for placeholder in placeholders: | ||
| if f"{{{placeholder}}}" not in template: | ||
| raise ValueError(f"Template must contain '{{{placeholder}}}'") | ||
| return template | ||
|
|
||
| return validate_template | ||
|
|
||
|
|
||
| class QDrantConfig(BaseSettings): | ||
| host: str = Field("test", validation_alias="QDRANT_HOST") | ||
| port: int = Field(6333, gt=0, validation_alias="QDRANT_PORT") | ||
| collection: str = Field("news", validation_alias="QDRANT_COLLECTION") | ||
|
|
||
|
|
||
| class RetrieveConfig(BaseModel): | ||
| vector_database: QDrantConfig = QDrantConfig() # type: ignore | ||
| embedding_model: str = Field( | ||
| DEFAULT_EMBEDDING_MODEL, | ||
| description="Name of embedding model." | ||
| "All available models can be found [here](https://huggingface.co/models?library=sentence-transformers&language=zh).", | ||
| ) | ||
| prompt_template: Annotated[ | ||
| str, AfterValidator(contains_placeholder("keywords")) | ||
| ] = DEFAULT_QUERY_PROMPT_TEMPLATE | ||
| similarity_top_k: int = Field(DEFAULT_SIMILARITY_TOP_K, gt=1) | ||
|
|
||
|
|
||
| def is_available_model(model_name: str): | ||
|
|
@@ -43,23 +72,17 @@ def is_available_model(model_name: str): | |
| return model_name | ||
|
|
||
|
|
||
| class ChatgptConfig(BaseSettings): | ||
| model_config = SettingsConfigDict( | ||
| env_file=(".env", ".env.prod"), | ||
| env_file_encoding="utf-8", | ||
| case_sensitive=True, | ||
| extra="ignore", | ||
| ) | ||
|
|
||
| class ChatGptConfig(BaseSettings): | ||
| api_key: str = Field(validation_alias="OPENAI_API_KEY") | ||
| model: Annotated[ | ||
| str, | ||
| Field("gpt-3.5-turbo"), | ||
| Field(DEFAULT_OPENAI_MODEL), | ||
| AfterValidator(is_available_model), | ||
| ] | ||
|
Comment on lines 77 to 81
||
|
|
||
|
|
||
| class SummarizeQueryConfig(BaseModel): | ||
| class SummarizeConfig(BaseModel): | ||
| llm: ChatGptConfig = ChatGptConfig() # type: ignore | ||
| system_template: str = DEFAULT_SYSTEM_TEMPLATE | ||
| user_template: Annotated[ | ||
| str, AfterValidator(contains_placeholder("context_str", "query_str")) | ||
|
|
@@ -71,6 +94,6 @@ class SummarizeQueryConfig(BaseModel): | |
| content_format: ContentFormat = ContentFormat.PLAIN | ||
|
|
||
|
|
||
| class SummarizeConfig(BaseModel): | ||
| chatgpt: ChatgptConfig | ||
| query: SummarizeQueryConfig | ||
| class RagConfig(BaseModel): | ||
| retrieve: RetrieveConfig | ||
| summarize: SummarizeConfig | ||
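The `contains_placeholder` factory in this diff returns a validator that the config attaches to template fields via `AfterValidator`. The sketch below reproduces the function as shown in the diff (with type hints added) and calls it directly, without pydantic, to illustrate what it accepts and rejects. The sample templates are illustrative only.

```python
from collections.abc import Callable


def contains_placeholder(*placeholders: str) -> Callable[[str], str]:
    """Build a validator requiring every `{placeholder}` to appear in a template."""

    def validate_template(template: str) -> str:
        for placeholder in placeholders:
            if f"{{{placeholder}}}" not in template:
                raise ValueError(f"Template must contain '{{{placeholder}}}'")
        return template

    return validate_template


validate_user_template = contains_placeholder("context_str", "query_str")

# A template with both placeholders is returned unchanged.
ok = validate_user_template("{query_str}\n---------------------\n{context_str}")

# A template missing `{context_str}` is rejected at the first missing name.
try:
    validate_user_template("{query_str} only")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Because the validator returns the template unchanged on success, it fits the contract pydantic expects from an "after" validator, which is why it can be reused for both `prompt_template` and `user_template`.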
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| from collections.abc import Callable, Sequence | ||
| from enum import StrEnum | ||
|
|
||
|
|
||
| class ContentFormat(StrEnum): | ||
| PLAIN = "plain" | ||
| NUMBERED = "numbered" | ||
|
|
||
|
|
||
| ContentFormatter = Callable[[Sequence[str]], Sequence[str]] | ||
|
|
||
| CONTENT_FORMATTERS: dict[ContentFormat, ContentFormatter] = { | ||
| ContentFormat.PLAIN: lambda x: x, | ||
| ContentFormat.NUMBERED: lambda x: [ | ||
| f"{i}. {line}" for i, line in enumerate(x, start=1) | ||
| ], | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| import grpc | ||
|
|
||
| from llm_backend.protos import rag_pb2, rag_pb2_grpc | ||
| from llm_backend.rag.config import RagConfig | ||
| from llm_backend.rag.workflow import RagWorkflow | ||
|
|
||
|
|
||
| class RagService(rag_pb2_grpc.RagServiceServicer): | ||
| def __init__(self, config: RagConfig): | ||
| self.workflow = RagWorkflow(config=config) | ||
|
|
||
| async def Rag( | ||
| self, | ||
| request: rag_pb2.RagRequest, | ||
| context: grpc.aio.ServicerContext, | ||
| ): | ||
| result = await self.workflow.run( | ||
| keywords=request.keywords, | ||
| similarity_top_k=request.similarity_top_k, | ||
| ) | ||
|
|
||
| return rag_pb2.RagResponse( | ||
| retrieved_ids=result["retrieved_ids"], | ||
| summary=result["summary"], | ||
| ) |
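The servicer above only delegates to the workflow and maps the returned dictionary onto the response message. The sketch below imitates that shape with stand-in classes so it runs without the generated `rag_pb2` modules or a gRPC server. `StubWorkflow`, the dataclass message types, and the field types (a list of keyword strings, string IDs) are all assumptions made for illustration; the real `RagWorkflow` and proto definitions are not shown in this section.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class RagRequest:
    """Stand-in for rag_pb2.RagRequest (field types are assumed)."""
    keywords: list[str]
    similarity_top_k: int


@dataclass
class RagResponse:
    """Stand-in for rag_pb2.RagResponse (field types are assumed)."""
    retrieved_ids: list[str] = field(default_factory=list)
    summary: str = ""


class StubWorkflow:
    """Hypothetical replacement for RagWorkflow that returns canned results."""

    async def run(self, keywords: list[str], similarity_top_k: int) -> dict:
        ids = [f"doc-{i}" for i in range(1, similarity_top_k + 1)]
        return {"retrieved_ids": ids, "summary": f"Summary for: {', '.join(keywords)}"}


class RagService:
    """Same delegation pattern as the servicer in the diff, minus the gRPC base class."""

    def __init__(self, workflow: StubWorkflow):
        self.workflow = workflow

    async def Rag(self, request: RagRequest, context=None) -> RagResponse:
        result = await self.workflow.run(
            keywords=request.keywords,
            similarity_top_k=request.similarity_top_k,
        )
        return RagResponse(
            retrieved_ids=result["retrieved_ids"],
            summary=result["summary"],
        )


service = RagService(StubWorkflow())
response = asyncio.run(service.Rag(RagRequest(["chips", "tariffs"], 2)))
print(response)
```

Keeping the servicer this thin means the workflow can be exercised in tests like this one without starting a server.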
This `system_template` is a very long string with comments and formatting instructions embedded within it. Consider breaking it down into smaller, more manageable parts, or using a more structured format (e.g., a list of instructions). This would improve readability and maintainability.
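One possible reading of this suggestion, sketched under assumptions: keep the instructions as a list of strings and join them into the final template when the config is built. The `SYSTEM_INSTRUCTIONS` name and the `build_system_template` helper are hypothetical, and only three instructions from the example template are reproduced here.

```python
# Hypothetical structured form: one instruction per list entry.
SYSTEM_INSTRUCTIONS: list[str] = [
    "Answer in Traditional Chinese, and refrain from providing thoughts and content beyond what you've provided.",
    "Responses should strive to be rigorous and formal, with real evidence when answering questions.",
    "Answers can be as complete and comprehensive as possible, expanding on details and actual content.",
]


def build_system_template(instructions: list[str]) -> str:
    """Join instructions into one template, one '# '-prefixed line per instruction."""
    return "\n".join(f"# {text}" for text in instructions)


system_template = build_system_template(SYSTEM_INSTRUCTIONS)
print(system_template)
```

With this layout, adding, removing, or reordering an instruction becomes a one-line change that is easy to review in a diff.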