Commit e9f13f5

Address feedback and tweak docs
1 parent 2d7b453 commit e9f13f5

6 files changed: 38 additions and 65 deletions

.github/prompts/testcov.prompt.md

Lines changed: 0 additions & 27 deletions
This file was deleted.

README.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ The repo includes sample data so it's ready to try end to end. In this sample ap
 - Chat (multi-turn) and Q&A (single turn) interfaces
 - Renders citations and thought process for each answer
 - Includes settings directly in the UI to tweak the behavior and experiment with options
-- Integrates Azure AI Search for indexing and retrieval of documents, with support for [many document formats](/docs/data_ingestion.md#supported-document-formats) as well as [cloud-based data ingestion](/docs/data_ingestion.md#cloud-based-ingestion)
+- Integrates Azure AI Search for indexing and retrieval of documents, with support for [many document formats](/docs/data_ingestion.md#supported-document-formats) as well as [cloud data ingestion](/docs/data_ingestion.md#cloud-data-ingestion)
 - Optional usage of [multimodal models](/docs/multimodal.md) to reason over image-heavy documents
 - Optional addition of [speech input/output](/docs/deploy_features.md#enabling-speech-inputoutput) for accessibility
 - Optional automation of [user login and data access](/docs/login_and_acl.md) via Microsoft Entra
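
Note on the link fragment in this hunk: the new README link targets `#cloud-data-ingestion`, while the heading renamed in `docs/data_ingestion.md` in this same commit is "Cloud ingestion". The sketch below approximates how GitHub usually derives heading anchors (lowercase, punctuation dropped, spaces turned into hyphens). `heading_slug` is a hypothetical helper, not part of the repo, and the rule is an assumption about GitHub's renderer. If it holds, the README fragment may not resolve unless the file defines that anchor some other way.

```python
import re


def heading_slug(heading: str) -> str:
    """Approximate a GitHub-style heading anchor (assumed rule, not an official API)."""
    text = heading.strip().lower()
    # Drop everything except word characters, hyphens and spaces.
    text = re.sub(r"[^\w\- ]", "", text)
    # Each remaining space becomes a hyphen.
    return text.replace(" ", "-")


# Headings taken from the diff on this page.
for heading in (
    "Cloud ingestion",
    "Enabling cloud data ingestion",
    "Enabling speech input/output",
):
    print(f"{heading!r} -> #{heading_slug(heading)}")
```

The third heading reproduces the `#enabling-speech-inputoutput` fragment already used in the README, which suggests the approximation is adequate for these headings.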

azure.yaml

Lines changed: 30 additions & 30 deletions
@@ -41,36 +41,36 @@ services:
           interactive: false
           continueOnError: false
   # Un-comment this section if using USE_CLOUD_INGESTION option
-  # document-extractor:
-  #   project: ./app/functions/document_extractor
-  #   language: py
-  #   host: function
-  #   hooks:
-  #     prepackage:
-  #       shell: pwsh
-  #       run: python ../../../scripts/copy_prepdocslib.py
-  #       interactive: false
-  #       continueOnError: false
-  # figure-processor:
-  #   project: ./app/functions/figure_processor
-  #   language: py
-  #   host: function
-  #   hooks:
-  #     prepackage:
-  #       shell: pwsh
-  #       run: python ../../../scripts/copy_prepdocslib.py
-  #       interactive: false
-  #       continueOnError: false
-  # text-processor:
-  #   project: ./app/functions/text_processor
-  #   language: py
-  #   host: function
-  #   hooks:
-  #     prepackage:
-  #       shell: pwsh
-  #       run: python ../../../scripts/copy_prepdocslib.py
-  #       interactive: false
-  #       continueOnError: false
+  document-extractor:
+    project: ./app/functions/document_extractor
+    language: py
+    host: function
+    hooks:
+      prepackage:
+        shell: pwsh
+        run: python ../../../scripts/copy_prepdocslib.py
+        interactive: false
+        continueOnError: false
+  figure-processor:
+    project: ./app/functions/figure_processor
+    language: py
+    host: function
+    hooks:
+      prepackage:
+        shell: pwsh
+        run: python ../../../scripts/copy_prepdocslib.py
+        interactive: false
+        continueOnError: false
+  text-processor:
+    project: ./app/functions/text_processor
+    language: py
+    host: function
+    hooks:
+      prepackage:
+        shell: pwsh
+        run: python ../../../scripts/copy_prepdocslib.py
+        interactive: false
+        continueOnError: false
 hooks:
   preprovision:
     windows:

docs/data_ingestion.md

Lines changed: 4 additions & 4 deletions
@@ -2,7 +2,7 @@

 The [azure-search-openai-demo](/) project can set up a full RAG chat app on Azure AI Search and OpenAI so that you can chat on custom data, like internal enterprise data or domain-specific knowledge sets. For full instructions on setting up the project, consult the [main README](/README.md), and then return here for detailed instructions on the data ingestion component.

-The chat app provides two ways to ingest data: manual ingestion and cloud-based ingestion. Both approaches use the same code for processing the data, but the manual ingestion runs locally while cloud ingestion runs in Azure Functions as Azure AI Search custom skills.
+The chat app provides two ways to ingest data: manual ingestion and cloud ingestion. Both approaches use the same code for processing the data, but the manual ingestion runs locally while cloud ingestion runs in Azure Functions as Azure AI Search custom skills.

 - [Supported document formats](#supported-document-formats)
 - [Ingestion stages](#ingestion-stages)
@@ -13,7 +13,7 @@ The chat app provides two ways to ingest data: manual ingestion and cloud-based
 - [Categorizing data for enhanced search](#enhancing-search-functionality-with-data-categorization)
 - [Indexing additional documents](#indexing-additional-documents)
 - [Removing documents](#removing-documents)
-- [Cloud-based ingestion](#cloud-based-ingestion)
+- [Cloud ingestion](#cloud-ingestion)
 - [Custom skills pipeline](#custom-skills-pipeline)
 - [Indexing of additional documents](#indexing-of-additional-documents)
 - [Removal of documents](#removal-of-documents)
@@ -36,7 +36,7 @@ In order to ingest a document format, we need a tool that can turn it into text.

 ## Ingestion stages

-The ingestion pipeline consists of three main stages that transform raw documents into searchable content in Azure AI Search. These stages apply to both [local ingestion](#local-ingestion) and [cloud-based ingestion](#cloud-based-ingestion).
+The ingestion pipeline consists of three main stages that transform raw documents into searchable content in Azure AI Search. These stages apply to both [local ingestion](#local-ingestion) and [cloud ingestion](#cloud-ingestion).

 ### Document extraction

@@ -132,7 +132,7 @@ To remove all documents, use `./scripts/prepdocs.sh --removeall` or `./scripts/p

 You can also remove individual documents by using the `--remove` flag. Open either `scripts/prepdocs.sh` or `scripts/prepdocs.ps1` and replace `/data/*` with `/data/YOUR-DOCUMENT-FILENAME-GOES-HERE.pdf`. Then run `scripts/prepdocs.sh --remove` or `scripts/prepdocs.ps1 --remove`.

-## Cloud-based ingestion
+## Cloud ingestion

 This project includes an optional feature to perform data ingestion in the cloud using Azure Functions as custom skills for Azure AI Search indexers. This approach offloads the ingestion workload from your local machine to the cloud, allowing for more scalable and efficient processing of large datasets.
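
For readers unfamiliar with what "Azure Functions as custom skills" involves, here is a minimal sketch of the request and response envelope that an Azure AI Search custom Web API skill exchanges with an indexer. It is not the project's function code. The envelope keys (`values`, `recordId`, `data`, `errors`, `warnings`) follow the documented skill contract, while `run_custom_skill`, `count_characters`, and the `text` and `length` fields are hypothetical names used only for illustration.

```python
from typing import Any, Callable

Record = dict[str, Any]


def run_custom_skill(payload: Record, process: Callable[[Record], Record]) -> Record:
    """Apply `process` to each input record and wrap the results in the skill envelope."""
    results = []
    for record in payload.get("values", []):
        record_id = record.get("recordId")
        try:
            data = process(record.get("data", {}))
            results.append(
                {"recordId": record_id, "data": data, "errors": [], "warnings": []}
            )
        except Exception as exc:
            # A failure is reported per record so the other records still succeed.
            results.append(
                {
                    "recordId": record_id,
                    "data": {},
                    "errors": [{"message": str(exc)}],
                    "warnings": [],
                }
            )
    return {"values": results}


def count_characters(data: Record) -> Record:
    """Hypothetical processing step standing in for extraction or chunking."""
    return {"length": len(data["text"])}


request = {
    "values": [
        {"recordId": "1", "data": {"text": "hello"}},
        {"recordId": "2", "data": {}},  # missing "text" triggers a per-record error
    ]
}
response = run_custom_skill(request, count_characters)
print(response)
```

Each output record carries the same `recordId` as its input, which is how the indexer matches results back to documents.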

docs/deploy_features.md

Lines changed: 2 additions & 2 deletions
@@ -322,9 +322,9 @@ Alternatively you can use the browser's built-in [Speech Synthesis API](https://
 azd env set USE_SPEECH_OUTPUT_BROWSER true
 ```

-## Enabling cloud-based data ingestion
+## Enabling cloud data ingestion

-By default, this project runs a local script in order to ingest data. Once you move beyond the sample documents, you may want cloud-based ingestion, which uses Azure AI Search indexers and custom Azure AI Search skills based off the same code used by the local ingestion. That approach scales better to larger amounts of data.
+By default, this project runs a local script in order to ingest data. Once you move beyond the sample documents, you may want cloud ingestion, which uses Azure AI Search indexers and custom Azure AI Search skills based off the same code used by the local ingestion. That approach scales better to larger amounts of data.

 To enable cloud ingestion:

tests/test_mediadescriber.py

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ def mock_get(self, url, **kwargs):
             "startPageNumber": 1,
             "endPageNumber": 1,
             "unit": "pixel",
-            "pages": [{"pageNumber": 0}],
+            "pages": [{"pageNumber": 1}],
         }
     ],
 },
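
A likely reading of this one-line fix: the mocked response declares `startPageNumber` and `endPageNumber` as 1, so a page entry numbered 0 falls outside that range. The sketch below checks that consistency. `pages_within_range` is a hypothetical helper, not part of the test suite, and the two dictionaries reproduce only the fields visible in the hunk.

```python
def pages_within_range(content: dict) -> bool:
    """Return True when every page number lies between the declared start and end pages."""
    start = content["startPageNumber"]
    end = content["endPageNumber"]
    return all(start <= page["pageNumber"] <= end for page in content["pages"])


# Fixture fields as they appear before and after the commit.
before = {
    "startPageNumber": 1,
    "endPageNumber": 1,
    "unit": "pixel",
    "pages": [{"pageNumber": 0}],
}
after = {
    "startPageNumber": 1,
    "endPageNumber": 1,
    "unit": "pixel",
    "pages": [{"pageNumber": 1}],
}

print(pages_within_range(before))  # False
print(pages_within_range(after))   # True
```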
