Commit 29db1c7

Review main-docs/set_env_for_training_data_and_reference_doc.md [Checked] (#72)
1 parent 0a73205 commit 29db1c7

1 file changed: 41 additions and 31 deletions

@@ -1,55 +1,65 @@
1-
# Set env variables for training data and reference doc for Pro mode
2-
Folders [document_training](../data/document_training/) and [field_extraction_pro_mode](../data/field_extraction_pro_mode) contain the manually labeled data for training and reference doc for Pro mode as a quick sample. Before using these knowledge source files, you need an Azure Storage blob container to store them. Let's follow below steps to prepare the data environment:
3-
4-
1. *Create an Azure Storage Account:* If you don’t already have one, follow the guide to [create an Azure Storage Account](https://aka.ms/create-a-storage-account).
5-
> If you already have an account, you can skip this step.
6-
2. *Install Azure Storage Explorer:* Azure Storage Explorer is a tool which makes it easy to work with Azure Storage data. Install it and login with your credential, follow the [guide](https://aka.ms/download-and-install-Azure-Storage-Explorer).
7-
3. *Create or Choose a Blob Container:* Create a blob container from Azure Storage Explorer or use an existing one.
8-
<img src="./create-blob-container.png" width="600" />
9-
4. *Set SAS URL Related Environment Variables in ".env" File:* Depending on the sample that you will run, you will need to set required environment variables in [.env](../notebooks/.env). There are two options to set up environment variables to utilize required Shared Access Signature (SAS) URL.
10-
- Option A - Generate a SAS URL manually on Azure Storage Explorer
11-
- Right-click on blob container and select the `Get Shared Access Signature...` in the menu.
12-
- Check the required permissions: `Read`, `Write` and `List`
13-
- We will need `Write` for uploading, modifying, or appending blobs
14-
- Click the `Create` button.
15-
<img src="./get-access-signature.png" height="600" /> <img src="./choose-signature-options.png" height="600" />
16-
- *Copy the SAS URL:* After creating the SAS, click `Copy` to get the URL with token. This will be used as the value for **TRAINING_DATA_SAS_URL** or **REFERENCE_DOC_SAS_URL** when running the sample code.
1+
# Set Environment Variables for Training Data and Reference Documents in Pro Mode
2+
3+
The folders [document_training](../data/document_training/) and [field_extraction_pro_mode](../data/field_extraction_pro_mode) contain manually labeled data used for training data in Standard mode and reference documents in Pro mode as quick samples. Before using these knowledge source files, you need an Azure Storage blob container to store them. Please follow the steps below to prepare your data environment:
4+
5+
1. **Create an Azure Storage Account:**
6+
If you don’t already have one, follow the guide to [create an Azure Storage Account](https://aka.ms/create-a-storage-account).
7+
> If you already have an account, you can skip this step.
8+
9+
2. **Install Azure Storage Explorer:**
10+
Azure Storage Explorer is a tool that simplifies working with Azure Storage data. Install it and log in with your credentials by following the [installation guide](https://aka.ms/download-and-install-Azure-Storage-Explorer).
11+
12+
3. **Create or Choose a Blob Container:**
13+
Using Azure Storage Explorer, create a new blob container or use an existing one.
14+
<img src="./create-blob-container.png" width="600" />
15+
16+
4. **Set SAS URL-related Environment Variables in the `.env` File:**
17+
Depending on the sample you plan to run, configure the required environment variables in the [.env](../notebooks/.env) file. There are two options to set up environment variables that utilize the required Shared Access Signature (SAS) URL.
18+
19+
- **Option A - Generate a SAS URL Manually via Azure Storage Explorer**
20+
- Right-click on the blob container and select **Get Shared Access Signature...** from the menu.
21+
- Select the permissions: **Read**, **Write**, and **List**.
22+
- Note: **Write** permission is required for uploading, modifying, or appending blobs.
23+
- Click the **Create** button.
24+
<img src="./get-access-signature.png" height="600" /> <img src="./choose-signature-options.png" height="600" />
25+
- **Copy the SAS URL:** After creating the SAS, click **Copy** to get the URL with the token. This URL will be used as the value for either **TRAINING_DATA_SAS_URL** or **REFERENCE_DOC_SAS_URL** when running the sample code.
17 26
<img src="./copy-access-signature.png" width="600" />
18 27
19-
- Set the following in [.env](../notebooks/.env).
20-
> NOTE: **REFERENCE_DOC_SAS_URL** can be the same as the **TRAINING_DATA_SAS_URL** to re-use the same blob container
21-
- For [analyzer_training](../notebooks/analyzer_training.ipynb): Add the SAS URL as value of **TRAINIGN_DATA_SAS_URL**.
28+
- Set the following variables in the [.env](../notebooks/.env) file:
29+
> **Note:** The value for **REFERENCE_DOC_SAS_URL** can be the same as **TRAINING_DATA_SAS_URL** to reuse the same blob container.
30+
- For [analyzer_training](../notebooks/analyzer_training.ipynb): Add the SAS URL as the value of **TRAINING_DATA_SAS_URL**.
22 31
```env
23 32
TRAINING_DATA_SAS_URL=<Blob container SAS URL>
24 33
```
25-
- For [field_extraction_pro_mode](../notebooks/field_extraction_pro_mode.ipynb): Add the SAS URL as value of **REFERENCE_DOC_SAS_URL**.
34+
- For [field_extraction_pro_mode](../notebooks/field_extraction_pro_mode.ipynb): Add the SAS URL as the value of **REFERENCE_DOC_SAS_URL**.
26 35
```env
27 36
REFERENCE_DOC_SAS_URL=<Blob container SAS URL>
28 37
```
29-
- Option B - Auto-generate the SAS URL via code in sample notebooks
30-
- Instead of manually creating a SAS URL, you can set storage account and container information, and let the code generate a temporary SAS URL at runtime.
31-
> NOTE: **TRAINING_DATA_STORAGE_ACCOUNT_NAME** and **TRAINING_DATA_CONTAINER_NAME** can be the same as the **REFERENCE_DOC_STORAGE_ACCOUNT_NAME** and **REFERENCE_DOC_CONTAINER_NAME** to re-use the same blob container
32-
- For [analyzer_training](../notebooks/analyzer_training.ipynb): Add the storage account name as `TRAINING_DATA_STORAGE_ACCOUNT_NAME` and the container name under that storage account as `TRAINING_DATA_CONTAINER_NAME`.
38+
39+
- **Option B - Auto-generate the SAS URL via Code in Sample Notebooks**
40+
- Instead of manually creating a SAS URL, you can specify the storage account and container information and let the code generate a temporary SAS URL at runtime.
41+
> **Note:** **TRAINING_DATA_STORAGE_ACCOUNT_NAME** and **TRAINING_DATA_CONTAINER_NAME** can be the same as **REFERENCE_DOC_STORAGE_ACCOUNT_NAME** and **REFERENCE_DOC_CONTAINER_NAME** to reuse the same blob container.
42+
- For [analyzer_training](../notebooks/analyzer_training.ipynb): Add the storage account name as `TRAINING_DATA_STORAGE_ACCOUNT_NAME` and the container name under that storage account as `TRAINING_DATA_CONTAINER_NAME`.
33 43
```env
34 44
TRAINING_DATA_STORAGE_ACCOUNT_NAME=<your-storage-account-name>
35 45
TRAINING_DATA_CONTAINER_NAME=<your-container-name>
36 46
```
37-
- For [field_extraction_pro_mode](../notebooks/field_extraction_pro_mode.ipynb): Add the storage account name as `REFERENCE_DOC_STORAGE_ACCOUNT_NAME` and the container name under that storage account as `REFERENCE_DOC_CONTAINER_NAME`.
47+
- For [field_extraction_pro_mode](../notebooks/field_extraction_pro_mode.ipynb): Add the storage account name as `REFERENCE_DOC_STORAGE_ACCOUNT_NAME` and the container name under that storage account as `REFERENCE_DOC_CONTAINER_NAME`.
38 48
```env
39 49
REFERENCE_DOC_STORAGE_ACCOUNT_NAME=<your-storage-account-name>
40 50
REFERENCE_DOC_CONTAINER_NAME=<your-container-name>
41 51
```
42 52
43-
5. *Set Folder Prefix in ".env" File:* Depending on the sample that you will run, you will need to set required environment variables in [.env](../notebooks/.env).
44-
- For [analyzer_training](../notebooks/analyzer_training.ipynb): Add a prefix for **TRAINING_DATA_PATH**. You can choose any folder name you like for **TRAINING_DATA_PATH**. For example, you could use "training_files".
53+
5. **Set Folder Prefixes in the `.env` File:**
54+
Depending on the sample you will run, set the required environment variables in the [.env](../notebooks/.env) file.
55+
56+
- For [analyzer_training](../notebooks/analyzer_training.ipynb): Add a prefix for **TRAINING_DATA_PATH**. You can choose any folder name within the blob container. For example, use `training_files`.
45 57
```env
46 58
TRAINING_DATA_PATH=<Designated folder path under the blob container>
47 59
```
48-
- For [field_extraction_pro_mode](../notebooks/field_extraction_pro_mode.ipynb): Add a prefix for **REFERENCE_DOC_PATH**. You can choose any folder name you like for **REFERENCE_DOC_PATH**. For example, you could use "reference_docs".
60+
- For [field_extraction_pro_mode](../notebooks/field_extraction_pro_mode.ipynb): Add a prefix for **REFERENCE_DOC_PATH**. You can choose any folder name within the blob container. For example, use `reference_docs`.
49 61
```env
50 62
REFERENCE_DOC_PATH=<Designated folder path under the blob container>
51 63
```
52 64
53-
Now, we have completed the preparation of the data environment. Next, we could create an analyzer through code.
54-
55-
65+
Once these steps are completed, your data environment is ready. You can proceed to create an analyzer through code.
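
The `.env` settings described in the updated document (Option A, Option B, and the folder prefix) can be sanity-checked before running a notebook. The following is a minimal sketch and is not taken from this commit or from the sample notebooks. The helper names `parse_env` and `resolve_source` are hypothetical, and two behaviors are assumptions made for illustration: that a SAS URL (Option A) takes precedence when both options are set, and that the `*_PATH` prefix is always required.

```python
def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so SAS tokens containing "=" stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_source(env, prefix):
    """Report which option is configured for `prefix`.

    `prefix` is "TRAINING_DATA" or "REFERENCE_DOC", matching the variable
    names in the document. Precedence of Option A over Option B is an
    assumption of this sketch, not confirmed behavior of the notebooks.
    """
    path = env.get(f"{prefix}_PATH")
    if not path:
        raise ValueError(f"{prefix}_PATH is not set")

    sas_url = env.get(f"{prefix}_SAS_URL")
    if sas_url:
        # Option A: a manually generated SAS URL.
        return {"option": "A", "sas_url": sas_url, "path": path}

    account = env.get(f"{prefix}_STORAGE_ACCOUNT_NAME")
    container = env.get(f"{prefix}_CONTAINER_NAME")
    if account and container:
        # Option B: the notebook generates a temporary SAS URL at runtime.
        return {"option": "B", "account": account,
                "container": container, "path": path}

    raise ValueError(
        f"Set {prefix}_SAS_URL, or both {prefix}_STORAGE_ACCOUNT_NAME "
        f"and {prefix}_CONTAINER_NAME"
    )


# Example with placeholder values (not real credentials).
sample = """
TRAINING_DATA_STORAGE_ACCOUNT_NAME=examplestorage
TRAINING_DATA_CONTAINER_NAME=examplecontainer
TRAINING_DATA_PATH=training_files
"""
print(resolve_source(parse_env(sample), "TRAINING_DATA"))
```

To check a real configuration, the contents of the `.env` file could be read with `open(...).read()` and passed to `parse_env`. The result only shows which option would apply under the assumptions above.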
