
Commit f35f116

[Release] Docs Agent version 0.2.0
What's changed:
- Enable Docs Agent to work with Gemini models
1 parent 6475319 commit f35f116

11 files changed: +1254 -981 lines changed

demos/palm/python/docs-agent/README.md

Lines changed: 33 additions & 33 deletions
@@ -1,7 +1,7 @@
 # Docs Agent
 
-The Docs Agent project enables [PaLM API][genai-doc-site] users to launch a chat application
-on a Linux host machine using their documents as a dataset.
+The Docs Agent project enables [Gemini API][genai-doc-site] (previously PaLM API) users to
+launch a chat application on a Linux host machine using their documents as a dataset.
 
 **Note**: If you want to set up and launch the Docs Agent sample app on your host machine,
 check out the [Set up Docs Agent][set-up-docs-agent] section below.
@@ -15,16 +15,16 @@ can be from various sources such as Markdown, HTML, Google Docs, Gmail, PDF, etc
 
 The main goal of the Docs Agent project is:
 
-- You can supply your own set of documents to enable a PaLM 2 model to generate useful,
+- You can supply your own set of documents to enable Google AI models to generate useful,
   relevant, and accurate responses that are grounded on the documented information.
 
 The Docs Agent sample app is designed to be easily set up and configured in a Linux environment
-and is required that you have access to Google’s [PaLM API][genai-doc-site].
+and is required that you have access to Google’s [Gemini API][genai-doc-site].
 
 Keep in mind that this approach does not involve “fine-tuning” an LLM (large language model).
 Instead, the Docs Agent sample app uses a mixture of prompt engineering and embedding techniques,
 also known as Retrieval Augmented Generation (RAG), on top of a publicly available LLM model
-like PaLM 2.
+like Gemini Pro.
 
 ![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)
 
@@ -46,7 +46,7 @@ easy to process Markdown files into embeddings. However, there is no hard requir
 source documents must exist in Markdown format. What’s important is that the processed content
 is available as embeddings in the vector database.
 
-### Structure of a prompt to a PaLM 2 model
+### Structure of a prompt to a language model
 
 To enable an LLM to answer questions that are not part of the public knowledge (which the LLM
 is likely trained on), the Docs Agent project applies a mixture of prompt engineering and
@@ -59,7 +59,7 @@ Once the most relevant content is returned, the Docs Agent server uses the promp
 shown in Figure 3 to augment the user question with a preset **condition** and a list of
 **context**. (When the Docs Agent server starts, the condition value is read from the
 [`config.yaml`][config-yaml] file.) Then the Docs Agent server sends this prompt to a
-PaLM 2 model using the PaLM API and receives a response generated by the model.
+language model using the Gemini API and receives a response generated by the model.
 
 ![Docs Agent prompt strcture](docs/images/docs-agent-prompt-structure-01.png)
 
@@ -108,15 +108,15 @@ The following list summarizes the tasks and features of the Docs Agent sample ap
   relevant content given user questions (which are also processed into embeddings using
   the same `embedding-gecko-001` model).
 - **Add context to a user question in a prompt**: Add the list of content returned from
-  the semantic search as context to the user question and send the prompt to a PaLM 2
-  model using the PaLM API.
+  the semantic search as context to the user question and send the prompt to a language
+  model using the Gemini API.
 - **(Experimental) “Fact-check” responses**: This experimental feature composes a
-  follow-up prompt and asks the PaLM 2 model to “fact-check” its own previous response.
-  (See the [Using a PaLM 2 model to fact-check its own response][fact-check-section] section.)
+  follow-up prompt and asks the language model to “fact-check” its own previous response.
+  (See the [Using a language model to fact-check its own response][fact-check-section] section.)
 - **Generate 5 related questions**: In addition to displaying a response to the user
-  question, the web UI displays five questions generated by the PaLM 2 model based on
+  question, the web UI displays five questions generated by the language model based on
   the context of the user question. (See the
-  [Using a PaLM 2 model to suggest related questions][related-questions-section] section.)
+  [Using a language model to suggest related questions][related-questions-section] section.)
 - **Display URLs of knowledge sources**: The vector database stores URLs as metadata for
   embeddings. Whenever the vector database is used to retrieve context (for instance, to
   provide context to user questions), the database can also return the URLs of the sources
@@ -150,16 +150,16 @@ The following events take place in the Docs Agent sample app:
    text chunks that are most relevant to the user question.
 6. The Docs Agent server adds this list of text chunks as context (plus a condition
    for responses) to the user question and constructs them into a prompt.
-7. The system sends the prompt to a PaLM 2 model via the PaLM API.
-8. The PaLM 2 model generates a response and the Docs Agent server renders it on
+7. The system sends the prompt to a language model via the Gemini API.
+8. The language model generates a response and the Docs Agent server renders it on
   the chat UI.
 
 Additional events for [“fact-checking” a generated response][fact-check-section]:
 
 9. The Docs Agent server prepares another prompt that compares the generated response
-   (in step 8) to the context (in step 6) and asks the PaLM model to look for
+   (in step 8) to the context (in step 6) and asks the language model to look for
    a discrepancy in the response.
-10. The PaLM model generates a response that points out one major discrepancy
+10. The language model generates a response that points out one major discrepancy
    (if it exists) between its previous response and the context.
 11. The Docs Agent server renders this response on the chat UI as a call-out note.
 12. The Docs Agent server passes this second response to the vector database to
@@ -172,9 +172,9 @@ Additional events for [“fact-checking” a generated response][fact-check-sect
 Additional events for
 [suggesting 5 questions related to the user question][related-questions-section]:
 
-15. The Docs Agent server prepares another prompt that asks the PaLM model to
+15. The Docs Agent server prepares another prompt that asks the language model to
    generate 5 questions based on the context (in step 6).
-16. The PaLM model generates a response that contains a list of questions related
+16. The language model generates a response that contains a list of questions related
   to the context.
 17. The Docs Agent server renders the questions on the chat UI.
 
@@ -188,11 +188,11 @@ enhancing the usability of the Q&A experience powered by generative AI.
 **Figure 6**. A screenshot of the Docs Agent chat UI showing the sections generated by
 three distinct prompts.
 
-### Using a PaLM 2 model to fact-check its own response
+### Using a language model to fact-check its own response
 
 In addition to using the prompt structure above (shown in Figure 3), we‘re currently
 experimenting with the following prompt setup for “fact-checking” responses generated
-by the PaLM model:
+by the language model:
 
 - Condition:
 
@@ -247,18 +247,18 @@ database. Once the vector database returns a list of the most relevant content t
 the UI only displays the top URL to the user.
 
 Keep in mind that this "fact-checking" prompt setup is currently considered **experimental**
-because we‘ve seen cases where a PaLM model would end up adding incorrect information into its
+because we‘ve seen cases where a language model would end up adding incorrect information into its
 second response as well. However, we saw that adding this second response (which brings attention
-to the PaLM model’s possible hallucinations) seems to improve the usability of the system since it
-serves as a reminder to the users that the PaLM model‘s response is far from being perfect, which
+to the language model’s possible hallucinations) seems to improve the usability of the system since it
+serves as a reminder to the users that the language model‘s response is far from being perfect, which
 helps encourage the users to take more steps to validate generated responses for themselves.
 
-### Using a PaLM 2 model to suggest related questions
+### Using a language model to suggest related questions
 
 The project‘s latest web UI includes the “Related questions” section, which displays five
 questions that are related to the user question (see Figure 6). These five questions are also
-generated by a PaLM model (via the PaLM API). Using the list of contents returned from the vector
-database as context, the system prepares another prompt asking the PaLM model to generate five
+generated by a language model (via the Gemini API). Using the list of contents returned from the vector
+database as context, the system prepares another prompt asking the language model to generate five
 questions from the included context.
 
 The following is the exact structure of this prompt:
@@ -364,7 +364,7 @@ This section provides instructions on how to set up the Docs Agent project on a
 
    This is a [known issue][poetry-known-issue] in `poetry`.
 
-5. Set the PaLM API key as a environment variable:
+5. Set the Gemini API key as a environment variable:
 
    ```
   export PALM_API_KEY=<YOUR_API_KEY_HERE>
@@ -603,8 +603,8 @@ To launch the Docs Agent chat app, do the following:
   already running on port 5000 on your host machine, you can use the `-p` flag to specify
   a different port (for example, `poetry run ./chatbot/launch.sh -p 5050`).
 
-  **Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
-  variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
+  **Note**: If this `poetry run ./chatbot/launch.sh` command fails to run, check the `HOSTNAME` environment
+  variable on your host machine (for example, `echo $HOSTNAME`). If this variable is unset, try setting it to
   `localhost` by running `export HOSTNAME=localhost`.
 
 Once the app starts running, this command prints output similar to the following:
@@ -659,14 +659,14 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
 [markdown-to-plain-text]: ./scripts/markdown_to_plain_text.py
 [populate-vector-database]: ./scripts/populate_vector_database.py
 [context-source-01]: http://eventhorizontelescope.org
-[fact-check-section]: #using-a-palm-2-model-to-fact-check-its-own-response
-[related-questions-section]: #using-a-palm-2-model-to-suggest-related-questions
+[fact-check-section]: #using-a-language-model-to-fact-check-its-own-response
+[related-questions-section]: #using-a-language-model-to-suggest-related-questions
 [submit-a-rewrite]: #enabling-users-to-submit-a-rewrite-of-a-generated-response
 [like-generate-responses]: #enabling-users-to-like-generated-responses
 [populate-db-steps]: #populate-a-new-vector-database-from-markdown-files
 [start-the-app-steps]: #start-the-docs-agent-chat-app
 [launch-script]: ./chatbot/launch.sh
-[genai-doc-site]: https://developers.generativeai.google/products/palm
+[genai-doc-site]: https://ai.google.dev/docs
 [chroma-docs]: https://docs.trychroma.com/
 [flutter-docs-src]: https://github.com/flutter/website/tree/main/src
 [flutter-docs-site]: https://docs.flutter.dev/
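
The prompt structure this README describes (a preset condition, then retrieved context, then the user question) is easy to see in code. The following is a minimal sketch of that assembly, not code from this commit: `build_prompt` and the condition text are illustrative stand-ins, since in Docs Agent the condition comes from `config.yaml` and the context chunks come from the Chroma vector database.

```python
# Minimal sketch of the RAG prompt assembly described in the README
# (illustrative, not part of this commit). In Docs Agent the condition is
# read from config.yaml and the context chunks come from Chroma.
def build_prompt(question: str, context_chunks: list[str]) -> str:
    condition = (
        "Read the context below and answer the question at the end. "
        "If the context is not relevant, say that you don't know."
    )
    context = "\n\n".join(context_chunks)
    return f"{condition}\n\n{context}\n\nQuestion: {question}"

# Example with a stand-in chunk, as if returned by the semantic search:
print(build_prompt(
    "How do I launch the chat app?",
    ["To launch the app, run: poetry run ./chatbot/launch.sh"],
))
```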

demos/palm/python/docs-agent/chatbot/chatui.py

Lines changed: 16 additions & 6 deletions
@@ -146,14 +146,24 @@ def ask_model(question):
     query_result = docs_agent.query_vector_store(question)
     context = query_result.fetch_formatted(Format.CONTEXT)
     context_with_instruction = docs_agent.add_instruction_to_context(context)
-    response = docs_agent.ask_text_model_with_context(
-        context_with_instruction, question
-    )
+    if "gemini" in docs_agent.get_language_model_name():
+        response = docs_agent.ask_content_model_with_context(
+            context_with_instruction, question
+        )
+    else:
+        response = docs_agent.ask_text_model_with_context(
+            context_with_instruction, question
+        )
 
     ### PROMPT 2: FACT-CHECK THE PREVIOUS RESPONSE.
-    fact_checked_response = docs_agent.ask_text_model_to_fact_check(
-        context_with_instruction, response
-    )
+    if "gemini" in docs_agent.get_language_model_name():
+        fact_checked_response = docs_agent.ask_content_model_to_fact_check(
+            context_with_instruction, response
+        )
+    else:
+        fact_checked_response = docs_agent.ask_text_model_to_fact_check(
+            context_with_instruction, response
+        )
 
     ### PROMPT 3: GET 5 RELATED QUESTIONS.
     # 1. Use the response from Prompt 1 as context and add a custom condition.
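
The `"gemini" in ...get_language_model_name()` check now appears before both prompts. If this pattern grows, one way to keep `ask_model` flat is a small dispatch helper; the following is an illustrative refactor under that assumption, not part of this commit:

```python
# Illustrative refactor (not in this commit): route a prompt to the Gemini
# content method or the legacy PaLM text method based on the configured
# model name, so the branch lives in one place.
def ask_with_context(docs_agent, context, question):
    if "gemini" in docs_agent.get_language_model_name():
        return docs_agent.ask_content_model_with_context(context, question)
    return docs_agent.ask_text_model_with_context(context, question)

def fact_check_with_context(docs_agent, context, prev_response):
    if "gemini" in docs_agent.get_language_model_name():
        return docs_agent.ask_content_model_to_fact_check(context, prev_response)
    return docs_agent.ask_text_model_to_fact_check(context, prev_response)
```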

demos/palm/python/docs-agent/chatbot/templates/chatui/result.html

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ <h2>Question</h2>
   <p>{{ question | replace("+", " ") | replace("%3F", "?")}}</p>
 </div>
 <div class="response-text" id="response-box">
-  <h2>PaLM's answer</h2>
+  <h2>Generated answer</h2>
   <span id="palm-response">
     {{ response_in_html | safe }}
   </span>

demos/palm/python/docs-agent/chroma.py

Lines changed: 0 additions & 1 deletion
@@ -76,7 +76,6 @@ def get_collection(self, name, embedding_function=None, embedding_model=None):
             )
         )
     else:
-        print("Embedding model: " + str(embedding_model))
         try:
             palm = PaLM(embed_model=embedding_model, find_models=False)
             # We cannot redefine embedding_function with def and

demos/palm/python/docs-agent/config.yaml

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,8 @@
 # embedding_model: The PaLM embedding model used to generate embeddings.
 #
 api_endpoint: "generativelanguage.googleapis.com"
-embedding_model: "models/embedding-gecko-001"
+language_model: "models/gemini-pro"
+embedding_model: "models/embedding-001"
 
 
 ### Docs Agent environment
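
These two keys drive the model-selection branching added in `docs_agent.py` below. As a quick sanity check, you can read them back and report which path would be taken; this snippet is illustrative and assumes the `pyyaml` package is installed:

```python
# Illustrative check (assumes pyyaml): read the new config keys and report
# which API path docs_agent.py would take for these values.
import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

language_model = config.get("language_model")    # e.g. "models/gemini-pro"
embedding_model = config.get("embedding_model")  # e.g. "models/embedding-001"

if language_model and "gemini" in language_model:
    print(f"{language_model}: Gemini content model path")
elif language_model:
    print(f"{language_model}: PaLM text model path")
else:
    print("No language_model set: default PaLM initialization")
```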

demos/palm/python/docs-agent/docs_agent.py

Lines changed: 58 additions & 4 deletions
@@ -34,6 +34,7 @@
 
 # Select your PaLM API endpoint.
 PALM_API_ENDPOINT = "generativelanguage.googleapis.com"
+LANGUAGE_MODEL = None
 EMBEDDING_MODEL = None
 
 # Set up the path to the chroma vector database.
@@ -54,13 +55,34 @@
 MODEL_ERROR_MESSAGE = config_values.returnConfigValue("model_error_message")
 LOG_LEVEL = config_values.returnConfigValue("log_level")
 PALM_API_ENDPOINT = config_values.returnConfigValue("api_endpoint")
+LANGUAGE_MODEL = config_values.returnConfigValue("language_model")
 EMBEDDING_MODEL = config_values.returnConfigValue("embedding_model")
 
 # Select the number of contents to be used for providing context.
 NUM_RETURNS = 5
 
 # Initialize the PaLM instance.
-palm = PaLM(api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT)
+if LANGUAGE_MODEL != None and EMBEDDING_MODEL != None:
+    if "gemini" in LANGUAGE_MODEL:
+        palm = PaLM(
+            api_key=API_KEY,
+            api_endpoint=PALM_API_ENDPOINT,
+            content_model=LANGUAGE_MODEL,
+            embed_model=EMBEDDING_MODEL,
+        )
+    else:
+        palm = PaLM(
+            api_key=API_KEY,
+            api_endpoint=PALM_API_ENDPOINT,
+            text_model=LANGUAGE_MODEL,
+            embed_model=EMBEDDING_MODEL,
+        )
+elif EMBEDDING_MODEL != None:
+    palm = PaLM(
+        api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT, embed_model=EMBEDDING_MODEL
+    )
+else:
+    palm = PaLM(api_key=API_KEY, api_endpoint=PALM_API_ENDPOINT)
 
 
 class DocsAgent:
@@ -79,8 +101,11 @@ def __init__(self):
         self.prompt_condition = CONDITION_TEXT
         self.fact_check_question = FACT_CHECK_QUESTION
         self.model_error_message = MODEL_ERROR_MESSAGE
+        # Models settings
+        self.language_model = LANGUAGE_MODEL
+        self.embedding_model = EMBEDDING_MODEL
 
-    # Use this method for talking to PaLM (Text)
+    # Use this method for talking to a PaLM text model
     def ask_text_model_with_context(self, context, question):
         new_prompt = f"{context}\n\nQuestion: {question}"
         # Print the prompt for debugging if the log level is VERBOSE.
@@ -101,7 +126,22 @@ def ask_text_model_with_context(self, context, question):
             return self.model_error_message
         return response.result
 
-    # Use this method for talking to PaLM (Chat)
+    # Use this method for talking to a Gemini content model
+    def ask_content_model_with_context(self, context, question):
+        new_prompt = context + "\n\nQuestion: " + question
+        # Print the prompt for debugging if the log level is VERBOSE.
+        if LOG_LEVEL == "VERBOSE":
+            self.print_the_prompt(new_prompt)
+        try:
+            response = palm.generate_content(new_prompt)
+        except google.api_core.exceptions.InvalidArgument:
+            return self.model_error_message
+        for chunk in response:
+            if str(chunk.candidates[0].content) == "":
+                return self.model_error_message
+        return response.text
+
+    # Use this method for talking to a PaLM chat model
     def ask_chat_model_with_context(self, context, question):
         try:
             response = palm.chat(
@@ -116,12 +156,18 @@ def ask_chat_model_with_context(self, context, question):
             return self.model_error_message
         return response.last
 
-    # Use this method for asking PaLM (Text) for fact-checking
+    # Use this method for asking a PaLM text model for fact-checking
     def ask_text_model_to_fact_check(self, context, prev_response):
         question = self.fact_check_question + "\n\nText: "
         question += prev_response
         return self.ask_text_model_with_context(context, question)
 
+    # Use this method for asking a Gemini content model for fact-checking
+    def ask_content_model_to_fact_check(self, context, prev_response):
+        question = self.fact_check_question + "\n\nText: "
+        question += prev_response
+        return self.ask_content_model_with_context(context, question)
+
     # Query the local Chroma vector database using the user question
     def query_vector_store(self, question):
         return self.collection.query(question, NUM_RETURNS)
@@ -142,6 +188,14 @@ def add_custom_instruction_to_context(self, condition, context):
     def generate_embedding(self, text):
         return palm.embed(text)
 
+    # Get the name of the language model used in this Docs Agent setup
+    def get_language_model_name(self):
+        return self.language_model
+
+    # Get the name of the embedding model used in this Docs Agent setup
+    def get_embedding_model_name(self):
+        return self.embedding_model
+
     # Print the prompt on the terminal for debugging
     def print_the_prompt(self, prompt):
         print("#########################################")
