
Commit 041712d

RAG and agentic AI (#688)
1 parent b255835 commit 041712d

File tree: 11 files changed, 12 additions and 12 deletions


api-reference/partition/chunking.mdx (1 addition, 1 deletion)

@@ -3,7 +3,7 @@ title: Chunking strategies
 ---
 
 Chunking functions use metadata and document elements detected with partition functions to split a document into
-appropriately-sized chunks for uses cases such as Retrieval Augmented Generation (RAG).
+appropriately-sized chunks for uses cases such as retrieval-augmented generation (RAG).
 
 If you are familiar with chunking methods that split long text documents into smaller chunks, you'll notice that
 Unstructured methods slightly differ, since the partitioning step already divides an entire document into its structural elements.
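The element-based chunking this file describes can be sketched in plain Python. This is an illustrative toy, not the `unstructured` API: the function name, element strings, and character budget are all invented. The point it shows is that already-partitioned elements are packed into appropriately-sized chunks, and no element is split across two chunks.

```python
# Toy sketch of element-based chunking (not the unstructured API):
# pack pre-partitioned elements into chunks under a character budget,
# never splitting an element across two chunks.
def chunk_elements(elements, max_characters=500):
    chunks, current, length = [], [], 0
    for text in elements:
        # Start a new chunk if this element would push past the budget.
        if current and length + len(text) + 1 > max_characters:
            chunks.append("\n".join(current))
            current, length = [], 0
        current.append(text)
        length += len(text) + 1  # +1 for the joining newline
    if current:
        chunks.append("\n".join(current))
    return chunks

elements = ["Title: Chunking", "Paragraph one. " * 18, "Paragraph two. " * 18]
chunks = chunk_elements(elements, max_characters=300)
```

A real chunker, such as the strategies this page documents, also handles elements larger than the budget and can keep titles with their following text; the sketch only shows the packing idea.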

api-reference/workflow/overview.mdx (1 addition, 1 deletion)

@@ -3,7 +3,7 @@ title: Overview
 ---
 
 The [Unstructured UI](/ui/overview) features a no-code user interface for transforming your unstructured data into data that is ready
-for Retrieval Augmented Generation (RAG).
+for retrieval-augmented generation (RAG).
 
 The Unstructured Workflow Endpoint, part of the [Unstructured API](/api-reference/overview), enables a full range of partitioning, chunking, embedding, and
 enrichment options for your files and data. It is designed to batch-process files and data in remote locations; send processed results to

open-source/core-functionality/chunking.mdx (1 addition, 1 deletion)

@@ -1,6 +1,6 @@
 ---
 title: Chunking
-description: Chunking functions in `unstructured` use metadata and document elements detected with `partition` functions to post-process elements into more useful "chunks" for uses cases such as Retrieval Augmented Generation (RAG).
+description: Chunking functions in `unstructured` use metadata and document elements detected with `partition` functions to post-process elements into more useful "chunks" for uses cases such as retrieval-augmented generation (RAG).
 ---
 
 ## Chunking Basics

open-source/core-functionality/overview.mdx (1 addition, 1 deletion)

@@ -14,4 +14,4 @@ After reading this section, you should understand the following:
 
 * How to prepare data for downstream use cases using staging functions
 
-* How to chunk partitioned documents for use cases such as Retrieval Augmented Generation (RAG).
+* How to chunk partitioned documents for use cases such as retrieval-augmented generation (RAG).

open-source/how-to/embedding.mdx (1 addition, 1 deletion)

@@ -21,7 +21,7 @@ These vectors are stored or _embedded_ next to the data itself.
 
 These vector embeddings allow _vector databases_ to more quickly and efficiently analyze and process these inherent
 properties and relationships between data. For example, you can save the extracted text along with its embeddings in a _vector store_.
-When a user queries a retrieval augmented generation (RAG) application, the application can use a vector database to perform a similarity search in that vector store
+When a user queries a retrieval-augmented generation (RAG) application, the application can use a vector database to perform a similarity search in that vector store
 and then return the documents whose embeddings are the closest to that user's query.
 
 Learn more about [chunking](https://unstructured.io/blog/chunking-for-rag-best-practices) and
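The similarity search this hunk describes can be illustrated with a small pure-Python sketch. Everything here is a toy assumption: the three-dimensional vectors and document names are invented, whereas a real application would get embeddings from an embedding model and query a vector database.

```python
import math

# Toy vector store: document name -> embedding. Real embeddings have
# hundreds or thousands of dimensions and come from an embedding model.
store = {
    "chunking_guide": [0.9, 0.1, 0.0],
    "embedding_guide": [0.1, 0.9, 0.2],
    "billing_faq": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    # Cosine similarity: close to 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = [0.85, 0.15, 0.05]  # pretend embedding of the user's question
# Rank documents by similarity to the query; the top hit is returned first.
ranked = sorted(store, key=lambda name: cosine(query, store[name]), reverse=True)
```

A vector database performs the same ranking, but over millions of vectors with approximate nearest-neighbor indexes rather than an exhaustive sort.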

open-source/introduction/overview.mdx (1 addition, 1 deletion)

@@ -35,7 +35,7 @@ and use cases.
 
 * Pretraining models
 * Fine-tuning models
-* Retrieval Augmented Generation (RAG)
+* Retrieval-augmented generation (RAG)
 * Traditional ETL
 
 <Note>GPU usage is not supported for the Unstructured open source library.</Note>

snippets/concepts/glossary.mdx (2 additions, 2 deletions)

@@ -36,11 +36,11 @@ High-level overview of available strategies and models in `Unstructured` library
 
 LLMs, like GPT, are trained on vast amounts of data and can comprehend and generate human-like text. They have achieved state-of-the-art results across many NLP tasks and can be fine-tuned to cater to specific domains or requirements.
 
-## Retrieval augmented generation (RAG)
+## Retrieval-augmented generation (RAG)
 
 Large language models (LLMs) like OpenAI’s ChatGPT and Anthropic’s Claude have revolutionized the AI landscape with their prowess. However, they inherently suffer from significant drawbacks. One major issue is their static nature, which means they’re “frozen in time.” Despite this, LLMs might often respond to newer queries with unwarranted confidence, a phenomenon known as “hallucination.” Such errors can be highly detrimental, mainly when these models serve critical real-world applications.
 
-Retrieval augmented generation (RAG) is a groundbreaking technique designed to counteract the limitations of foundational LLMs. By pairing an LLM with an RAG pipeline, we can enable users to access the underlying data sources that the model uses. This transparent approach ensures that an LLM’s claims can be verified for accuracy and builds a trust factor among users.
+Retrieval-augmented generation (RAG) is a groundbreaking technique designed to counteract the limitations of foundational LLMs. By pairing an LLM with an RAG pipeline, we can enable users to access the underlying data sources that the model uses. This transparent approach ensures that an LLM’s claims can be verified for accuracy and builds a trust factor among users.
 
 Moreover, RAG offers a cost-effective solution. Instead of bearing the extensive computational and financial burdens of training custom models or fine-tuning existing ones, RAG can, in many situations, serve as a sufficient alternative. This reduction in resource consumption is particularly beneficial for organizations that need more means to develop and deploy foundational models from scratch.
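The retrieve-then-generate loop the glossary describes can be sketched end to end. Everything in this sketch is hypothetical: the keyword-overlap retriever stands in for vector-similarity search, and `generate` is a placeholder for any LLM call, echoing its prompt so the grounding context stays visible.

```python
import re

def tokens(text):
    # Lowercase word tokens; ignores punctuation.
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, corpus):
    # Toy keyword-overlap retrieval; real RAG pipelines rank by
    # embedding similarity instead.
    q = tokens(question)
    return max(corpus, key=lambda doc: len(q & tokens(doc)))

def generate(prompt):
    # Placeholder for an LLM call; returns the prompt so the underlying
    # source the answer is grounded in remains verifiable.
    return "Answer based on:\n" + prompt

corpus = [
    "Chunking splits partitioned documents into pieces sized for retrieval.",
    "Embeddings are vectors that encode the meaning of text.",
]
question = "What does chunking do to documents?"
context = retrieve(question, corpus)
answer = generate(f"Context: {context}\nQuestion: {question}")
```

Because the retrieved context travels with the prompt, the model's claim can be traced back to its source, which is the transparency argument the glossary makes for RAG.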
snippets/quickstarts/single-file-ui.mdx (1 addition, 1 deletion)

@@ -132,7 +132,7 @@ import EnrichmentImagesTablesHiResOnly from '/snippets/general-shared-text/enric
 allowfullscreen
 ></iframe>
 
-- Add a **Chunker** node after the **Partitioner** node, to chunk the partitioned data into smaller pieces for your retrieval augmented generation (RAG) applications.
+- Add a **Chunker** node after the **Partitioner** node, to chunk the partitioned data into smaller pieces for your retrieval-augmented generation (RAG) applications.
 To do this, click the add (**+**) button to the right of the **Partitioner** node, and then click **Enrich > Chunker**. Click the new **Chunker** node and
 specify its settings. For help, click the **FAQ** button in the **Chunker** node's pane. [Learn more about chunking and chunker settings](/ui/chunking).
 - Add an **Enrichment** node after the **Chunker** node, to apply enrichments to the chunked data such as image summaries, table summaries, table-to-HTML transforms, and

ui/embedding.mdx (1 addition, 1 deletion)

@@ -9,7 +9,7 @@ These vectors are stored or _embedded_ next to the text itself. These vector emb
 an _embedding provider_.
 
 You typically save these embeddings in a _vector store_.
-When a user queries a retrieval augmented generation (RAG) application, the application can use a vector database to perform
+When a user queries a retrieval-augmented generation (RAG) application, the application can use a vector database to perform
 a [similarity search](https://www.pinecone.io/learn/what-is-similarity-search/) in that vector store
 and then return the items whose embeddings are the closest to that user's query.
 
ui/overview.mdx (1 addition, 1 deletion)

@@ -2,7 +2,7 @@
 title: Overview
 ---
 
-The Unstructured user interface (UI) is a no-code user interface, pay-as-you-go platform for transforming your unstructured data into data that is ready for Retrieval Augmented Generation (RAG).
+The Unstructured user interface (UI) is a no-code user interface, pay-as-you-go platform for transforming your unstructured data into data that is ready for retrieval-augmented generation (RAG).
 
 <Tip>To start using the Unstructured UI right away, skip ahead to the [quickstart](/ui/quickstart).</Tip>
 