 # Docs Agent

-The Docs Agent project enables [PaLM API][genai-doc-site] users to launch a chat application
-on a Linux host machine using their documents as a dataset.
+The Docs Agent project enables [Gemini API][genai-doc-site] (previously PaLM API) users to
+launch a chat application on a Linux host machine using their documents as a dataset.

 **Note**: If you want to set up and launch the Docs Agent sample app on your host machine,
 check out the [Set up Docs Agent][set-up-docs-agent] section below.
@@ -15,16 +15,16 @@ can be from various sources such as Markdown, HTML, Google Docs, Gmail, PDF, etc

 The main goal of the Docs Agent project is:

-- You can supply your own set of documents to enable a PaLM 2 model to generate useful,
+- You can supply your own set of documents to enable Google AI models to generate useful,
   relevant, and accurate responses that are grounded on the documented information.

 The Docs Agent sample app is designed to be easily set up and configured in a Linux environment
-and is required that you have access to Google’s [PaLM API][genai-doc-site].
+and requires that you have access to Google’s [Gemini API][genai-doc-site].

 Keep in mind that this approach does not involve “fine-tuning” an LLM (large language model).
 Instead, the Docs Agent sample app uses a mixture of prompt engineering and embedding techniques,
 also known as Retrieval Augmented Generation (RAG), on top of a publicly available LLM model
-like PaLM 2.
+like Gemini Pro.

 ![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)

@@ -46,7 +46,7 @@ easy to process Markdown files into embeddings. However, there is no hard requir
 source documents must exist in Markdown format. What’s important is that the processed content
 is available as embeddings in the vector database.

-### Structure of a prompt to a PaLM 2 model
+### Structure of a prompt to a language model

 To enable an LLM to answer questions that are not part of the public knowledge (which the LLM
 is likely trained on), the Docs Agent project applies a mixture of prompt engineering and
@@ -59,7 +59,7 @@ Once the most relevant content is returned, the Docs Agent server uses the promp
 shown in Figure 3 to augment the user question with a preset **condition** and a list of
 **context**. (When the Docs Agent server starts, the condition value is read from the
 [`config.yaml`][config-yaml] file.) Then the Docs Agent server sends this prompt to a
-PaLM 2 model using the PaLM API and receives a response generated by the model.
+language model using the Gemini API and receives a response generated by the model.

 ![Docs Agent prompt structure](docs/images/docs-agent-prompt-structure-01.png)

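As a rough illustration of the Figure 3 structure, the sketch below assembles a condition, a list of context chunks, and the user question into one prompt. The `condition_text` key is an assumption about `config.yaml`, not a documented field.

```
import yaml  # PyYAML

def build_prompt(question, contexts):
    # Read the preset condition; the "condition_text" key name is assumed here.
    with open("config.yaml") as f:
        condition = yaml.safe_load(f)["condition_text"]
    # Condition first, then the retrieved context, then the user question.
    return condition + "\n\n" + "\n\n".join(contexts) + "\n\nQuestion: " + question
```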
@@ -108,15 +108,15 @@ The following list summarizes the tasks and features of the Docs Agent sample ap
   relevant content given user questions (which are also processed into embeddings using
   the same `embedding-gecko-001` model).
 - **Add context to a user question in a prompt**: Add the list of content returned from
-  the semantic search as context to the user question and send the prompt to a PaLM 2
-  model using the PaLM API.
+  the semantic search as context to the user question and send the prompt to a language
+  model using the Gemini API.
 - **(Experimental) “Fact-check” responses**: This experimental feature composes a
-  follow-up prompt and asks the PaLM 2 model to “fact-check” its own previous response.
-  (See the [Using a PaLM 2 model to fact-check its own response][fact-check-section] section.)
+  follow-up prompt and asks the language model to “fact-check” its own previous response.
+  (See the [Using a language model to fact-check its own response][fact-check-section] section.)
 - **Generate 5 related questions**: In addition to displaying a response to the user
-  question, the web UI displays five questions generated by the PaLM 2 model based on
+  question, the web UI displays five questions generated by the language model based on
   the context of the user question. (See the
-  [Using a PaLM 2 model to suggest related questions][related-questions-section] section.)
+  [Using a language model to suggest related questions][related-questions-section] section.)
 - **Display URLs of knowledge sources**: The vector database stores URLs as metadata for
   embeddings. Whenever the vector database is used to retrieve context (for instance, to
   provide context to user questions), the database can also return the URLs of the sources
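The feature list above maps roughly onto the following sketch: embedding chunks with `embedding-gecko-001`, storing them in Chroma with a source URL as metadata, and running a semantic search over them. The `generate_embeddings` call is from the PaLM-era `google.generativeai` SDK; treat the exact signatures, paths, and names as assumptions.

```
import google.generativeai as genai
import chromadb

def embed(text):
    # PaLM-era embedding call; returns a dict with an "embedding" vector.
    return genai.generate_embeddings(model="models/embedding-gecko-001",
                                     text=text)["embedding"]

client = chromadb.PersistentClient(path="vector_stores/chroma")  # path assumed
collection = client.get_or_create_collection(name="docs_collection")

# Store one text chunk, keeping its source URL as metadata.
chunk = "Flutter transforms the app development process."
collection.add(ids=["chunk-001"], documents=[chunk],
               embeddings=[embed(chunk)],
               metadatas=[{"url": "https://docs.flutter.dev/"}])

# Semantic search: embed the user question with the same model, then query.
hits = collection.query(query_embeddings=[embed("What is Flutter?")], n_results=3)
print(hits["documents"][0], hits["metadatas"][0])  # top chunks and their source URLs
```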
@@ -150,16 +150,16 @@ The following events take place in the Docs Agent sample app:
    text chunks that are most relevant to the user question.
 6. The Docs Agent server adds this list of text chunks as context (plus a condition
    for responses) to the user question and constructs them into a prompt.
-7. The system sends the prompt to a PaLM 2 model via the PaLM API.
-8. The PaLM 2 model generates a response and the Docs Agent server renders it on
+7. The system sends the prompt to a language model via the Gemini API.
+8. The language model generates a response and the Docs Agent server renders it on
    the chat UI.

 Additional events for [“fact-checking” a generated response][fact-check-section]:

 9. The Docs Agent server prepares another prompt that compares the generated response
-   (in step 8) to the context (in step 6) and asks the PaLM model to look for
+   (in step 8) to the context (in step 6) and asks the language model to look for
    a discrepancy in the response.
-10. The PaLM model generates a response that points out one major discrepancy
+10. The language model generates a response that points out one major discrepancy
     (if it exists) between its previous response and the context.
 11. The Docs Agent server renders this response on the chat UI as a call-out note.
 12. The Docs Agent server passes this second response to the vector database to
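A minimal sketch of the follow-up “fact-check” prompt in steps 9 and 10; the wording is illustrative, not the project’s exact prompt.

```
def fact_check_prompt(previous_response, contexts):
    # Ask the model to compare its own response (step 8) against the
    # retrieved context (step 6) and flag any major discrepancy.
    return ("Point out one major discrepancy, if any, between the response "
            "and the context below.\n\n"
            "Response:\n" + previous_response + "\n\n"
            "Context:\n" + "\n\n".join(contexts))
```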
@@ -172,9 +172,9 @@ Additional events for [“fact-checking” a generated response][fact-check-sect
 Additional events for
 [suggesting 5 questions related to the user question][related-questions-section]:

-15. The Docs Agent server prepares another prompt that asks the PaLM model to
+15. The Docs Agent server prepares another prompt that asks the language model to
     generate 5 questions based on the context (in step 6).
-16. The PaLM model generates a response that contains a list of questions related
+16. The language model generates a response that contains a list of questions related
     to the context.
 17. The Docs Agent server renders the questions on the chat UI.

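And a similarly hedged sketch of the related-questions prompt in steps 15 and 16; the wording is illustrative, and the README gives the exact structure of this prompt later on.

```
def related_questions_prompt(contexts):
    # Ask the model for five questions answerable from the retrieved context.
    return ("Generate 5 questions that can be answered from the context "
            "below.\n\n" + "\n\n".join(contexts))
```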
@@ -188,11 +188,11 @@ enhancing the usability of the Q&A experience powered by generative AI.
 **Figure 6**. A screenshot of the Docs Agent chat UI showing the sections generated by
 three distinct prompts.

-### Using a PaLM 2 model to fact-check its own response
+### Using a language model to fact-check its own response

 In addition to using the prompt structure above (shown in Figure 3), we’re currently
 experimenting with the following prompt setup for “fact-checking” responses generated
-by the PaLM model:
+by the language model:

 - Condition:

@@ -247,18 +247,18 @@ database. Once the vector database returns a list of the most relevant content t
 the UI only displays the top URL to the user.

 Keep in mind that this "fact-checking" prompt setup is currently considered **experimental**
-because we’ve seen cases where a PaLM model would end up adding incorrect information into its
+because we’ve seen cases where a language model would end up adding incorrect information into its
 second response as well. However, we saw that adding this second response (which brings attention
-to the PaLM model’s possible hallucinations) seems to improve the usability of the system since it
-serves as a reminder to the users that the PaLM model’s response is far from being perfect, which
+to the language model’s possible hallucinations) seems to improve the usability of the system since it
+serves as a reminder to the users that the language model’s response is far from being perfect, which
 helps encourage the users to take more steps to validate generated responses for themselves.

-### Using a PaLM 2 model to suggest related questions
+### Using a language model to suggest related questions

 The project’s latest web UI includes the “Related questions” section, which displays five
 questions that are related to the user question (see Figure 6). These five questions are also
-generated by a PaLM model (via the PaLM API). Using the list of contents returned from the vector
-database as context, the system prepares another prompt asking the PaLM model to generate five
+generated by a language model (via the Gemini API). Using the list of contents returned from the vector
+database as context, the system prepares another prompt asking the language model to generate five
 questions from the included context.

 The following is the exact structure of this prompt:
@@ -364,7 +364,7 @@ This section provides instructions on how to set up the Docs Agent project on a

    This is a [known issue][poetry-known-issue] in `poetry`.

-5. Set the PaLM API key as a environment variable:
+5. Set the Gemini API key as an environment variable:

    ```
    export PALM_API_KEY=<YOUR_API_KEY_HERE>
@@ -603,8 +603,8 @@ To launch the Docs Agent chat app, do the following:
    already running on port 5000 on your host machine, you can use the `-p` flag to specify
    a different port (for example, `poetry run ./chatbot/launch.sh -p 5050`).

-   **Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
-   variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
+   **Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
+   variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
    `localhost` by running `export HOSTNAME=localhost`.

    Once the app starts running, this command prints output similar to the following:
@@ -659,14 +659,14 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
 [markdown-to-plain-text]: ./scripts/markdown_to_plain_text.py
 [populate-vector-database]: ./scripts/populate_vector_database.py
 [context-source-01]: http://eventhorizontelescope.org
-[fact-check-section]: #using-a-palm-2-model-to-fact-check-its-own-response
-[related-questions-section]: #using-a-palm-2-model-to-suggest-related-questions
+[fact-check-section]: #using-a-language-model-to-fact-check-its-own-response
+[related-questions-section]: #using-a-language-model-to-suggest-related-questions
 [submit-a-rewrite]: #enabling-users-to-submit-a-rewrite-of-a-generated-response
 [like-generate-responses]: #enabling-users-to-like-generated-responses
 [populate-db-steps]: #populate-a-new-vector-database-from-markdown-files
 [start-the-app-steps]: #start-the-docs-agent-chat-app
 [launch-script]: ./chatbot/launch.sh
-[genai-doc-site]: https://developers.generativeai.google/products/palm
+[genai-doc-site]: https://ai.google.dev/docs
 [chroma-docs]: https://docs.trychroma.com/
 [flutter-docs-src]: https://github.com/flutter/website/tree/main/src
 [flutter-docs-site]: https://docs.flutter.dev/