Commit 87138c4

[Docs Agent] Release of Docs Agent v.0.3.4
What's changed

- **Enhanced file processing:** Process an entire directory with the `agent helpme` command and export results to a `responses.yaml` file.
- **Improved session control:** Manage conversation context using the `--new` and `--cont` flags with `agent helpme`.
- **Debugging tools:** Capture detailed logs for troubleshooting and export them to CSV with the new `agent write-logs-to-csv` command.
- **Documentation expansion:** Explore updated configuration references, recent release notes, and a new concept doc on chunking.
- **Bug fixes:** Added fixes for the FIDL splitter, file handling, and blank results, and updates to various README files.

1 parent f2cd674 commit 87138c4

35 files changed

+1430
-394
lines changed

examples/gemini/python/docs-agent/README.md

Lines changed: 50 additions & 71 deletions
Original file line number | Diff line number | Diff line change
@@ -9,10 +9,10 @@ and project deployment.
99

1010
## Overview
1111

12-
Docs Agent apps use a technique known as Retrieval Augmented Generation (RAG), which allows
13-
you to bring your own documents as knowledge sources to AI language models. This approach
14-
helps the AI language models to generate relevant and accurate responses that are grounded
15-
in the information that you provide and control.
12+
Docs Agent apps use a technique known as Retrieval Augmented Generation (RAG), which
13+
allows you to bring your own documents as knowledge sources to AI language models.
14+
This approach helps the AI language models to generate relevant and accurate responses
15+
that are grounded in the information that you provide and control.
1616

1717
![Docs Agent architecture](docs/images/docs-agent-architecture-01.png)
1818

@@ -26,66 +26,43 @@ the [Set up Docs Agent][set-up-docs-agent] section below.
2626

2727
The following list summarizes the tasks and features supported by Docs Agent:
2828

29-
- **Process Markdown**: Split Markdown files into small plain text files. (See the
30-
Python scripts in the [`preprocess`][preprocess-dir] directory.)
31-
- **Generate embeddings**: Use an embedding model to process small plain text files
32-
into embeddings, and store them in a vector database. (See the
33-
[`populate_vector_database.py`][populate-vector-database] script.)
34-
- **Perform semantic search**: Compare embeddings in the vector database to retrieve
35-
most relevant content given user questions.
36-
- **Add context to a user question**: Add a list of text chunks returned from
37-
a semantic search as context in a prompt. (See the
38-
[Structure of a prompt to a language model][prompt-structure] section.)
39-
- **(Experimental) “Fact-check” responses**: This experimental feature composes
29+
- **Process Markdown**: Split Markdown files into small plain text chunks. (See
30+
[Docs Agent chunking process][chunking-process].)
31+
- **Generate embeddings**: Use an embedding model to process text chunks into embeddings
32+
and store them in a vector database.
33+
- **Perform semantic search**: Compare embeddings in a vector database to retrieve
34+
chunks that are most relevant to user questions.
35+
- **Add context to a user question**: Add chunks returned from a semantic search as
36+
[context][prompt-structure] to a prompt.
37+
- **Fact-check responses**: This [experimental feature][fact-check-section] composes
4038
a follow-up prompt and asks the language model to “fact-check” its own previous response.
41-
(See the [Using a language model to fact-check its own response][fact-check-section]
42-
section.)
43-
- **Generate related questions**: In addition to displaying a response to the user
44-
question, the web UI displays 5 questions generated by the language model based on
45-
the context of the user question. (See the
46-
[Using a language model to suggest related questions][related-questions-section]
47-
section.)
48-
- **Return URLs of documentation sources**: Docs Agent's vector database stores URLs
49-
as metadata next to embeddings. Whenever the vector database is used to retrieve
50-
text chunks for context, the database can also return the URLs of the sources used
51-
to generate the embeddings.
52-
- **Collect feedback from users**: Docs Agent's chatbot web UI includes buttons that
53-
allow users to [like generated responses][like-generated-responses] or
54-
[submit rewrites][submit-a-rewrite].
39+
- **Generate related questions**: In addition to answering a question, Docs Agent can
40+
[suggest related questions][related-questions-section] based on the context of the
41+
question.
42+
- **Return URLs of source documents**: URLs are stored as chunks' metadata. This enables
43+
Docs Agent to return the URLs of the source documents.
44+
- **Collect feedback from users**: Docs Agent's web app has buttons that allow users
45+
to [like responses][like-generated-responses] or [submit rewrites][submit-a-rewrite].
5546
- **Convert Google Docs, PDF, and Gmail into Markdown files**: This feature uses
56-
Apps Script to convert Google Docs, PDF, and Gmail into Markdown files, which then
57-
can be used as input datasets for Docs Agent. (See the
58-
[`apps_script`][apps-script-readme] directory.)
59-
- **Run benchmark test to monitor the quality of AI-generated responses**: Using
60-
Docs Agent, you can run [benchmark test][benchmark-test] to measure and compare
61-
the quality of text chunks, embeddings, and AI-generated responses.
62-
- **Use the Semantic Retrieval API and AQA model**: You can use Gemini's
63-
[Semantic Retrieval API][semantic-api] to upload source documents to an online
64-
corpus and use the [AQA model][aqa-model] that is specifically created for answering
65-
questions using an online corpus.
66-
- **Manage online corpora using the Docs Agent CLI**: The Docs Agent CLI enables you
67-
to create, populate, update and delete online corpora using the Semantic Retrieval AI.
68-
For the list of all available Docs Agent command lines, see the
69-
[Docs Agent CLI reference][cli-reference] page.
70-
- **Run the Docs Agent CLI from anywhere in a terminal**: You can set up the
71-
Docs Agent CLI to ask questions to the Gemini model from anywhere in a terminal.
72-
For more information, see the [Set up Docs Agent CLI][cli-readme] page.
73-
- **Support the Gemini 1.5 models**: You can use the new Gemini 1.5 models,
74-
`gemini-1.5-pro-latest` and `text-embedding-004`, with Docs Agent today.
75-
For the moment, the following `config.yaml` setup is recommended:
76-
77-
```
78-
models:
79-
- language_model: "models/aqa"
80-
embedding_model: "models/text-embedding-004"
81-
api_endpoint: "generativelanguage.googleapis.com"
82-
...
83-
app_mode: "1.5"
84-
db_type: "chroma"
85-
```
86-
87-
The setup above uses 3 Gemini models to their strength: AQA (`aqa`),
88-
Gemini 1.0 Pro (`gemini-pro`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`).
47+
[Apps Script][apps-script-readme] to convert Google Docs, PDF, and Gmail into
48+
Markdown files, which then can be used as input datasets for Docs Agent.
49+
- **Run benchmark test**: Docs Agent can [run benchmark test][benchmark-test] to measure
50+
and compare the quality of text chunks, embeddings, and AI-generated responses.
51+
- **Use the Semantic Retrieval API and AQA model**: Docs Agent can use Gemini's
52+
[Semantic Retrieval API][semantic-api] to upload source documents to online corpora
53+
and use the [AQA model][aqa-model] for answering questions.
54+
- **Manage online corpora using the Docs Agent CLI**: The [Docs Agent CLI][cli-reference]
55+
lets you create, update and delete online corpora using the Semantic Retrieval API.
56+
- **Prevent duplicate chunks and delete obsolete chunks in databases**: Docs Agent
57+
uses [metadata in chunks][chunking-process] to prevent uploading duplicate chunks
58+
and delete obsolete chunks that are no longer present in the source.
59+
- **Run the Docs Agent CLI from anywhere in a terminal**:
60+
[Set up the Docs Agent CLI][cli-readme] to make requests to the Gemini models
61+
from anywhere in a terminal.
62+
- **Support the Gemini 1.5 models**: Docs Agent works with the Gemini 1.5 models,
63+
`gemini-1.5-pro-latest` and `text-embedding-004`. The new ["1.5"][new-15-mode] web app
64+
mode uses all three Gemini models to their strength: AQA (`aqa`), Gemini 1.0 Pro
65+
(`gemini-pro`), and Gemini 1.5 Pro (`gemini-1.5-pro-latest`).
8966

9067
For more information on Docs Agent's architecture and features,
9168
see the [Docs Agent concepts][docs-agent-concepts] page.
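
As an illustration of the "prevent duplicate chunks and delete obsolete chunks" feature listed above, the following is a minimal sketch of how chunk metadata could drive such a sync. It is not Docs Agent's actual implementation: the `chunk_id` and `plan_sync` helpers, and the choice of a SHA-256 hash over the source URL and chunk text, are assumptions made for this example.

```python
import hashlib


def chunk_id(source_url: str, text: str) -> str:
    """Hypothetical identifier: a hash of the source URL and the chunk text."""
    return hashlib.sha256(f"{source_url}\n{text}".encode("utf-8")).hexdigest()


def plan_sync(
    source_chunks: list[tuple[str, str]], stored_ids: set[str]
) -> tuple[set[str], set[str]]:
    """Compare chunks from the current source with IDs already in the database.

    Returns (ids_to_upload, ids_to_delete):
    - ids_to_upload: present in the source but not yet stored (no duplicates).
    - ids_to_delete: stored but no longer present in the source (obsolete).
    """
    current_ids = {chunk_id(url, text) for url, text in source_chunks}
    return current_ids - stored_ids, stored_ids - current_ids
```

With identifiers like these, an unchanged chunk maps to an ID that is already stored and is skipped, while an edited or removed chunk leaves behind a stored ID that no longer matches anything in the source.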
@@ -122,26 +99,26 @@ Update your host machine's environment to prepare for the Docs Agent setup:
12299

123100
1. Update the Linux package repositories on the host machine:
124101

125-
```posix-terminal
102+
```
126103
sudo apt update
127104
```
128105

129106
2. Install the following dependencies:
130107

131-
```posix-terminal
108+
```
132109
sudo apt install git pipx python3-venv
133110
```
134111

135112
3. Install `poetry`:
136113

137-
```posix-terminal
114+
```
138115
pipx install poetry
139116
```
140117

141118
4. To add `$HOME/.local/bin` to your `PATH` variable, run the following
142119
command:
143120

144-
```posix-terminal
121+
```
145122
pipx ensurepath
146123
```
147124

@@ -157,7 +134,7 @@ Update your host machine's environment to prepare for the Docs Agent setup:
157134

158135
6. Update your environment:
159136

160-
```posix-termainl
137+
```
161138
source ~/.bashrc
162139
```
163140

@@ -202,25 +179,25 @@ Clone the Docs Agent project and install dependencies:
202179

203180
1. Clone the following repo:
204181

205-
```posix-terminal
182+
```
206183
git clone https://github.com/google/generative-ai-docs.git
207184
```
208185

209186
2. Go to the Docs Agent project directory:
210187

211-
```posix-terminal
188+
```
212189
cd generative-ai-docs/examples/gemini/python/docs-agent
213190
```
214191

215192
3. Install dependencies using `poetry`:
216193

217-
```posix-terminal
194+
```
218195
poetry install
219196
```
220197

221198
4. Enter the `poetry` shell environment:
222199

223-
```posix-terminal
200+
```
224201
poetry shell
225202
```
226203

@@ -437,3 +414,5 @@ Meggin Kearney (`@Meggin`), and Kyo Lee (`@kyolee415`).
437414
[oauth-client]: https://ai.google.dev/docs/oauth_quickstart#set-cloud
438415
[cli-readme]: docs_agent/interfaces/README.md
439416
[cli-reference]: docs/cli-reference.md
417+
[chunking-process]: docs/chunking-process.md
418+
[new-15-mode]: docs/config-reference.md#app_mode

examples/gemini/python/docs-agent/docs/chunking-process.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
# Docs Agent chunking process
2+
3+
This page describes Docs Agent's chunking process and potential optimizations.
4+
5+
Currently, Docs Agent utilizes Markdown headings (`#`, `##`, and `###`) to
6+
split documents into smaller, manageable chunks. However, the Docs Agent team
7+
is actively developing more advanced strategies to improve the quality and
8+
relevance of these chunks for retrieval.
9+
10+
## Chunking technique
11+
12+
In Retrieval Augmented Generation ([RAG][rag]) based systems, ensuring each
13+
chunk contains the right information and context is crucial for accurate
14+
retrieval. The goal of an effective chunking process is to ensure that each
15+
chunk encapsulates a focused topic, which enhances the accuracy of retrieval
16+
and ultimately leads to better answers. At the same time, the Docs Agent team
17+
acknowledges the importance of a flexible approach that allows for
18+
customization based on specific datasets and use cases.
19+
20+
Key characteristics in Docs Agent’s chunking process include:
21+
22+
- **Docs Agent splits documents based on Markdown headings.** However,
23+
this approach has limitations, especially when dealing with large sections.
24+
- **Docs Agent chunks are smaller than 5000 bytes (characters).** This size
25+
limit is set by the embedding model used in generating embeddings.
26+
- **Docs Agent enhances chunks with additional metadata.** The metadata helps
27+
Docs Agent to execute operations efficiently, such as preventing duplicate
28+
chunks in databases and deleting obsolete chunks that are no longer
29+
present in the source.
30+
- **Docs Agent retrieves the top 5 chunks and displays the top chunk's URL.**
31+
However, this is adjustable in Docs Agent’s configuration (see the `widget`
32+
and `experimental` app modes).
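
The first two characteristics above (splitting on `#`, `##`, and `###` headings, and keeping chunks under 5000 bytes) can be sketched roughly as follows. This is a simplified illustration, not the Docs Agent splitter: the function names are hypothetical, and the sketch does not handle `#` lines inside fenced code blocks or decide how an oversized section should be split further.

```python
import re

MAX_CHUNK_BYTES = 5000  # size limit stated in the list above

# Matches level 1-3 Markdown headings only; `####` and deeper do not match.
HEADING = re.compile(r"^#{1,3} ")


def split_by_headings(markdown: str) -> list[str]:
    """Start a new chunk at every `#`, `##`, or `###` heading."""
    chunks: list[list[str]] = [[]]
    for line in markdown.splitlines():
        # Open a new chunk at a heading, unless the current chunk is still empty.
        if HEADING.match(line) and chunks[-1]:
            chunks.append([])
        chunks[-1].append(line)
    return ["\n".join(c).strip() for c in chunks if any(s.strip() for s in c)]


def oversized(chunks: list[str]) -> list[str]:
    """Return chunks whose UTF-8 size is not smaller than the limit."""
    return [c for c in chunks if len(c.encode("utf-8")) >= MAX_CHUNK_BYTES]
```

A section that `oversized` reports would need a secondary strategy (for example, splitting on paragraphs), which is the limitation with large sections that the first bullet mentions.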
33+
34+
The Docs Agent team continues to explore various optimizations to enhance
35+
the functionality and effectiveness of the chunking process. These efforts
36+
include refining the chunking algorithm itself and developing advanced
37+
post-processing techniques, for instance, reconstructing chunks to original
38+
documents after retrieval.
39+
40+
Additionally, the team has been exploring methods for co-optimizing content
41+
structure and chunking strategies, which aims to maximize retrieval
42+
effectiveness by ensuring the structure of the source document itself
43+
complements the chunking process.
44+
45+
## Chunks retrieval
46+
47+
Docs Agent employs two distinct approaches for storing and retrieving chunks:
48+
49+
- **The local database approach uses a [Chroma][chroma] vector database.**
50+
This approach grants greater control over the chunking and retrieval
51+
process. This option is recommended for development and experimental
52+
setups.
53+
- **The online corpus approach uses Gemini’s
54+
[Semantic Retrieval API][semantic-retrieval].** This approach provides
55+
the advantages of centrally hosted online databases, ensuring
56+
accessibility for all users throughout the organization. This approach
57+
has some drawbacks, as control is reduced because the API may dictate
58+
how chunks are selected and where customization can be applied.
59+
60+
Choosing between these approaches depends on the needs of the user’s
61+
deployment: it means balancing control and transparency against
62+
possible improvements in performance, broader reach, and ease of use.
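
To make the retrieval step concrete, here is a plain-Python sketch of ranking chunks by cosine similarity and keeping the top 5, as described in the key characteristics earlier on this page. Both Chroma and the Semantic Retrieval API perform this search internally; the code below only illustrates the idea and does not reflect how either is implemented. The `embedding` and `url` keys are assumed field names for this example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], chunks: list[dict], k: int = 5) -> list[dict]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query, c["embedding"]),
        reverse=True,
    )
    return ranked[:k]
```

The URL to display would then be the `url` of the first item in the returned list.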
63+
64+
<!-- Reference links -->
65+
66+
[rag]: concepts.md
67+
[chroma]: https://docs.trychroma.com/
68+
[semantic-retrieval]: https://ai.google.dev/gemini-api/docs/semantic_retrieval

examples/gemini/python/docs-agent/docs/cli-reference.md

Lines changed: 57 additions & 6 deletions
@@ -3,10 +3,15 @@
33
This page provides a list of the Docs Agent command lines and their usages
44
and examples.
55

6-
**Important**: All `agent` commands in this page need to run in the
7-
`poetry shell` environment.
6+
The Docs Agent CLI helps developers to manage the Docs Agent project and
7+
interact with language models. It can handle various tasks such as
8+
processing documents, populating vector databases, launching the chatbot,
9+
running benchmark test, sending prompts to language models, and more.
810

9-
## Processing of Markdown files
11+
**Important**: All `agent` commands need to run in the `poetry shell`
12+
environment.
13+
14+
## Processing documents
1015

1116
### Chunk Markdown files into small text chunks
1217

@@ -53,7 +58,16 @@ The command below deletes development databases specified in the
5358
agent cleanup-dev
5459
```
5560

56-
## Docs Agent chatbot web app
61+
### Write logs to a CSV file
62+
63+
The command below writes the summaries of all captured debugging information
64+
(in the `logs/debugs` directory) to a `.csv` file:
65+
66+
```sh
67+
agent write-logs-to-csv
68+
```
69+
70+
## Launching the chatbot web app
5771

5872
### Launch the Docs Agent web app
5973

@@ -89,7 +103,7 @@ a log view page (which is accessible at `<APP_URL>/logs`):
89103
agent chatbot --enable_show_logs
90104
```
91105

92-
## Docs Agent benchmark test
106+
## Running benchmark test
93107

94108
### Run the Docs Agent benchmark test
95109

@@ -158,7 +172,44 @@ absolute or relative path, for example:
158172
agent helpme write comments for this C++ file? --file ../my-project/test.cc
159173
```
160174

161-
## Online corpus management
175+
### Ask for advice in a session
176+
177+
The command below starts a new session (`--new`), which tracks responses,
178+
before running the `agent helpme` command:
179+
180+
```sh
181+
agent helpme <REQUEST> --file <PATH_TO_FILE> --new
182+
```
183+
184+
For example:
185+
186+
```sh
187+
agent helpme write a draft of all features found in this README file? --file ./README.md --new
188+
```
189+
190+
After starting a session, use the `--cont` flag to include the previous
191+
responses as context to the request:
192+
193+
```sh
194+
agent helpme <REQUEST> --cont
195+
```
196+
197+
For example:
198+
199+
```sh
200+
agent helpme write a concept doc that delves into more details of these features? --cont
201+
```
202+
203+
### Ask for advice using RAG
204+
205+
The command below uses a local or online vector database (specified in
206+
the `config.yaml` file) to retrieve relevant context for the request:
207+
208+
```sh
209+
agent helpme <REQUEST> --file <PATH_TO_FILE> --rag
210+
```
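
For scripting these `agent helpme` variants, a small helper that assembles the argument list may be convenient. The sketch below uses only the flags documented above (`--file`, `--new`, `--cont`, `--rag`); the `helpme_command` function itself is hypothetical and not part of Docs Agent, and it does not check which flag combinations the CLI accepts.

```python
from typing import Optional


def helpme_command(
    request: str,
    file: Optional[str] = None,
    new: bool = False,
    cont: bool = False,
    rag: bool = False,
) -> list[str]:
    """Build the argument list for an `agent helpme` invocation."""
    args = ["agent", "helpme", *request.split()]
    if file is not None:
        args += ["--file", file]
    if new:
        args.append("--new")   # start a new session
    if cont:
        args.append("--cont")  # reuse previous responses as context
    if rag:
        args.append("--rag")   # retrieve context from the vector database
    return args
```

The resulting list can be passed to `subprocess.run` from within the `poetry shell` environment, for example a `--new` request followed by one or more `--cont` requests.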
211+
212+
## Managing online corpora
162213

163214
### List all existing online corpora
164215
