Commit 58f1763

Nov2022 release (#9)

* CogServices requires custom domain for VNET
* VNET enhancement to simplify provision task
* KV integration in the index
* Missing DFS entry for Search SPA
* Remove the no document cover case.
* Improved readibility
* Tiny update on demo script. To improve
* Search private env. for VNET
* Search requires spec. Sku for VNET
* Functions .NET Core 6.0
* Improved parameters mgt
* SupportCoverImage enhancement
* FormRecognizer enhancement
* Add folders to showcase the Folders refinement
* Icon would be the default cover image.
* Params for container in the right place
* Added Parameters placeholder
* Remove Tags from response, moved to UI config
* Re-Factor AzureSearch services
* Exclude kmdev & kmprod
* Architecture sample data
* Unused field
* Support Facet operator for future use
* SVG icon for results rendering
* Facet operator support for future use
* Create file-shares
* Preparation for Document Translation support
* New translation data source and indexer
* Storage - EnableHierarchicalNamespace
* Id instead of ObjectId
* Unused import
* StorageContainersAsString - ContainersList
* Document Translation - first round
* Nov2002 - changes log
* DocumentTranslation round #2
* Adding a new skill documentation
* Removed TODO tag
* PlaceHolder for future pivoted tables
* Remove headers argument
* Add functions keys command
* Adding Authenication related code
* Detecting Japanese language issue. Renamed settings.
* Renamed settings. More unit tests (i.e. Japanese detection)
* Exclude overlay content
* Upgrade bootstrap icons
* Upgrade bootstrap to 5.2.3
* Use document_shape in all indexers config
* Move embedded and converted into document complex type
* Add an option to disable EasyAuth at startup
* Add an option to delete aliases while deleting an index
* Add VS workspace
* Document Translation enhancements
* Document Translation in skillsets
* Convey the Translation flag
* NotFound rather than BadRequest response in cover image
* Enforce the embedded flag document/embedded
* NuGet updates
* EasyAuth optional at startup
* Document Translation Support in the UI
* Documentation update
* PII Detection integration
* Move filters to a separate row
* Changes log update

Co-authored-by: Nicolas Uthurriague <[email protected]>
Co-authored-by: harikanagidi <[email protected]>
1 parent e6a1a28 commit 58f1763

185 files changed, +3765 -1397 lines changed

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -276,3 +276,8 @@ profile.arm.json
 99-*
 dstoolkit*.json
 dstk*.json
+kmdev.json
+kmprod.json
+
+!overlay/OVERLAY.md
+overlay/*

CHANGES.md

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 # Changes log

+## November 2022
+
+- UI & API improvements
+- Simple Document Translation support
+- Form Recognizer
+- Not relying on direct content url anymore
+- Projection of the outputs to metadata storage for later use (UI)
+- PII Detection simple integration
+- VNET enhancements
+- Auto-creation of storage file-shares
+- Bugs fixing
+
 ## October 2022

 - Support for full VNET integration

README.md

Lines changed: 9 additions & 3 deletions
@@ -47,7 +47,7 @@ This KM solution accelerator aims to provide you with a workable end-to-end Know
 - Content security model (permissions)
 - Modular User Interface

-With this cloud-based accelerator you will get an end-to-end solution with the tools to deploy, extend, operate & monitor it.
+With this cloud-based accelerator you will get an end-to-end solution with the tools to deploy, extend, operate & monitor.

 In that respect, the solution provides
 - Azure Web App Authentication support
@@ -83,7 +83,13 @@ Below is a non-exhaustive list of key highlights:
 - Having an HTML representation of a document could ease some NLP work.
 - Table of contents is a common structure which we expose in the HTML representation of a PDF.

-* **Tables extraction**: tabular information are common in unstructured data. The solution will extract, index and project tables to a dedicated knowledge store.
+* **Tables extraction**: tabular information are common in unstructured data corpus. The solution will extract, index and project tables to a dedicated knowledge store (optional).
+
+* **Translation**": there are two translation features in this solution
+* **Text Translation** : non-native content and title are normalized to a define language (default is english)
+* **Document Translation** : for non-native documents, the solution will translate them. They will follow the same Document processing as any document. Translated documents will provide you with translated tables for instance.
+
+* **Text Analytics** : extract Entities (Named, Linked) from any document and OCR'ed image text.

 * **Export to Excel**: popular ask when exploring unstructured data.

@@ -188,7 +194,7 @@ This solution is inspired from the original work of the
 Core contributors to this solution accelerator are
 - [Nicolas Uthurriague](https://github.com/puthurr)
 - [Edoardo Quasso](https://github.com/EdoQuasso) for the Azure Cognitive Functions (Python)
-- [Timm Walz](https://github.com/nonstoptimm)
+- [Harika Nagidi](https://github.com/harikanagidi) for VNET support and deployment improvements.

 # Special Thanks

configuration/config/Authentication.md

Lines changed: 26 additions & 5 deletions
@@ -15,26 +15,26 @@ The platform would be Web with the below Redirect URIs
 - "https://{{config.name}}ui.azurewebsites.net"
 - "https://{{config.name}}ui.azurewebsites.net/.auth/login/aad/callback",

+__Check the ID Tokens__

 ## API Permissions

 All permissions would be Delegated. Minimum permission required is

-Microsoft Graph / User.Read (Sign In and read user profile)
+- Microsoft Graph / User.Read (Sign In and read user profile)

 You might want to add more permissions as you see fit. Microsoft Graph has extensive set of APIs for collaboration.

 ## Expose an API section

-Application ID URI would show
+Application ID URI would show (typically)

 - api://{{config.clientId}}

-Scopes defined by this API would list the user_impersonation scope
+__Scopes defined by this API__ should list the user_impersonation scope

 - api://{{config.clientId}}/user_impersonation

-
 ## Manifest section - Emit Security Groups Claims

 To secure content, our solution accelerator would look up for the Security Groups (SGs) the user is member of so that a user would only see the content he is allowed to see.
@@ -87,9 +87,9 @@ In the configuration/config/webapps/webappui.json, you will find the below entri
 "slotSetting": false
 }
 ```
-
 The client secret app settings is not deployed as part of our solution accelerator.

+
 # Publishing your webapp settings

 ```ps
@@ -108,3 +108,24 @@ As you consented the application to read your profile upon the first connection,

 To decode the security JWT token you may use [jwt.io](https://jwt.io). It will highlight among other things your security groups membership.

+# Non-Azure EasyAuth Authentication (non default)
+
+In the UI webapp settings, change AzureEasyAuthIntegration to false.
+
+Add the below settings
+```json
+{
+"name": "AzureAd:CallbackPath",
+"value": "/signin-oidc",
+"slotSetting": false
+},
+{
+"name": "AzureAd:ClientSecret",
+"value": "<YOUR AZURE APP SECRET HERE>",
+"slotSetting": false
+}
+```
+
+Restart the UI web app.
+
+Your UI application will now authenticate the users by itself.
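
The Authentication.md text in this diff points to jwt.io for inspecting the token and its security-group membership. As a rough local alternative, the sketch below decodes a JWT payload with Python's standard library only. It does not verify the signature, so it is meant for inspection, never for authorization decisions. The `groups` claim name and the sample token are assumptions made for illustration; they are not taken from this repository.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the (unverified) payload claims of a JWT."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload_b64 = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(data: dict) -> str:
    """Helper used only to build a fake token for the demo below."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical, unsigned sample token (not a real Azure AD token).
sample_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"name": "Jane Doe", "groups": ["group-id-1", "group-id-2"]}),
    "signature-placeholder",
])

claims = decode_jwt_payload(sample_token)
print(claims.get("groups", []))
```

Decoding locally avoids pasting a live token into a third-party site, which may matter when the token is still valid.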

configuration/config/cogservices/config.json

Lines changed: 5 additions & 4 deletions
@@ -2,6 +2,7 @@
 "ServiceEndPoint":"Microsoft.CognitiveServices",
 "GroupId":"CognitiveService",
 "Apptype":"CognitiveService",
+"PrivateDNSZone":"privatelink.cognitiveservices.azure.com",
 "Parameters" : {
 "cogSvcLanguage": "{{config.name}}coglanguage",
 "cogSvcVision": "{{config.name}}cogvision",
@@ -17,7 +18,7 @@
 "Sku":"S1",
 "Parameter":"cogSvcVision",
 "EnablePrivateAccess":true,
-"AddExistingSubnets":true
+"AccessSubnetRestriction":true
 },
 {
 "Name": "{{config.name}}cogform",
@@ -27,7 +28,7 @@
 "Sku":"S0",
 "Parameter":"cogSvcForm",
 "EnablePrivateAccess":true,
-"AddExistingSubnets":true
+"AccessSubnetRestriction":true
 },
 {
 "Name": "{{config.name}}cogtranslate",
@@ -37,7 +38,7 @@
 "Sku":"S1",
 "Parameter":"cogSvcTranslate",
 "EnablePrivateAccess":true,
-"AddExistingSubnets":true
+"AccessSubnetRestriction":true
 },
 {
 "Name": "{{config.name}}coglanguage",
@@ -47,7 +48,7 @@
 "Sku":"S",
 "Parameter":"cogSvcLanguage",
 "EnablePrivateAccess":true,
-"AddExistingSubnets":true
+"AccessSubnetRestriction":true
 }
 ]
 }

configuration/config/containerregistry/config.json

Lines changed: 2 additions & 1 deletion
@@ -2,6 +2,7 @@
 "GroupId":"ContainerRegistry",
 "ServiceEndPoint":"Microsoft.ContainerRegistry",
 "Apptype":"ContainerRegistry",
+"PrivateDNSZone":"privatelink.azurecr.io",
 "Parameters" : {
 "acr": "{{config.name}}acr.azurecr.io",
 "acr_prefix": "{{config.name}}acr"
@@ -11,7 +12,7 @@
 "Name": "{{config.name}}acr",
 "ResourceGroup": "{{config.resourceGroupName}}",
 "EnablePrivateAccess":true,
-"AddExistingSubnets":true,
+"AccessSubnetRestriction":true,
 "IPAddressToAdd":[]
 }
 ]
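
The cogservices and containerregistry configs above both rename `AddExistingSubnets` to `AccessSubnetRestriction`. For anyone carrying a local copy of the older config, a minimal migration sketch might look like the following. It walks the JSON recursively, so it does not depend on where the service entries live. The `Services` key in the sample is an assumption, since the diff does not show the name of the enclosing array.

```python
import json

OLD_KEY = "AddExistingSubnets"
NEW_KEY = "AccessSubnetRestriction"


def rename_key(node, old=OLD_KEY, new=NEW_KEY):
    """Recursively rename `old` to `new` in nested dicts and lists."""
    if isinstance(node, dict):
        return {(new if key == old else key): rename_key(value, old, new)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_key(item, old, new) for item in node]
    return node


# Trimmed, hypothetical config; the "Services" array name is an assumption.
legacy = json.loads("""
{
  "GroupId": "CognitiveService",
  "Services": [
    {"Parameter": "cogSvcVision", "EnablePrivateAccess": true, "AddExistingSubnets": true},
    {"Parameter": "cogSvcForm", "EnablePrivateAccess": true, "AddExistingSubnets": true}
  ]
}
""")

migrated = rename_key(legacy)
print(json.dumps(migrated, indent=2))
```

The function returns a new structure rather than mutating its input, so the legacy document stays available for comparison.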

configuration/config/functions/config.json

Lines changed: 31 additions & 13 deletions
@@ -1,31 +1,36 @@
 {
 "GroupId": "Functions",
+"Apptype": "FunctionApp",
+"PrivateDNSZone":"privatelink.azurewebsites.net",
+"Parameters" : [],
 "AppPlans": [
 {
 "Id": "skillsplan",
 "Name": "{{config.name}}skillsplan",
 "Sku": "{{param.pricing.premium}}",
 "ResourceGroup": "{{config.resourceGroupName}}",
 "IsLinux": false,
-"FunctionApps": [
+"Services": [
 {
 "Id": "geolocations",
 "Name": "{{config.name}}geolocations",
 "Path": "src\\CognitiveSearch.Skills\\C#\\Geo\\GeoLocations",
 "Version": 4,
+"DotnetVersion": "V6.0",
 "Functions": [
 {
 "Name": "locations"
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 },
 {
 "Id": "text",
 "Name": "{{config.name}}text",
 "Path": "src\\CognitiveSearch.Skills\\C#\\Text.Function",
 "Version": 4,
+"DotnetVersion": "V6.0",
 "Functions": [
 {
 "Name": "TextMesh"
@@ -41,13 +46,14 @@
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 },
 {
 "Id": "entities",
 "Name": "{{config.name}}entities",
 "Path": "src\\CognitiveSearch.Skills\\C#\\Entities.Function",
 "Version": 4,
+"DotnetVersion": "V6.0",
 "Functions": [
 {
 "Name": "concatenation"
@@ -60,7 +66,7 @@
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 }
 ]
 },
@@ -70,19 +76,20 @@
 "Sku": "{{param.pricing.premium}}",
 "ResourceGroup": "{{config.resourceGroupName}}",
 "IsLinux": false,
-"FunctionApps": [
+"Services": [
 {
 "Id": "imgext",
 "Name": "{{config.name}}imgext",
 "Path": "src\\CognitiveSearch.Skills\\C#\\Image\\Image.Extraction",
 "Version": 4,
+"DotnetVersion": "V6.0",
 "Functions": [
 {
 "Name": "DurableImageExtractionSkill_HttpStart"
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 }
 ]
 },
@@ -92,32 +99,34 @@
 "Sku": "{{param.pricing.premium}}",
 "ResourceGroup": "{{config.resourceGroupName}}",
 "IsLinux": false,
-"FunctionApps": [
+"Services": [
 {
 "Id": "mtda",
 "Name": "{{config.name}}mtda",
 "Path": "src\\CognitiveSearch.Skills\\C#\\Metadata\\Assignment",
 "Version": 4,
+"DotnetVersion": "V6.0",
 "Functions": [
 {
 "Name": "Assign"
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 },
 {
 "Id": "mtdext",
 "Name": "{{config.name}}mtdext",
 "Path": "src\\CognitiveSearch.Skills\\C#\\Metadata\\Extraction",
 "Version": 4,
+"DotnetVersion": "V6.0",
 "Functions": [
 {
 "Name": "MetadataExtractionSkill"
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 }
 ]
 },
@@ -127,7 +136,7 @@
 "Sku": "{{param.pricing.premium}}",
 "ResourceGroup": "{{config.resourceGroupName}}",
 "IsLinux": true,
-"FunctionApps": [
+"Services": [
 {
 "Id": "vision",
 "Name": "{{config.name}}vision",
@@ -158,7 +167,7 @@
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 }
 ]
 },
@@ -168,7 +177,7 @@
 "Sku": "{{param.pricing.premium}}",
 "ResourceGroup": "{{config.resourceGroupName}}",
 "IsLinux": true,
-"FunctionApps": [
+"Services": [
 {
 "Id": "language",
 "Name": "{{config.name}}language",
@@ -188,15 +197,24 @@
 {
 "Name": "LanguageDetection"
 },
+{
+"Name": "PIIDetection"
+},
 {
 "Name": "Summarization"
 },
 {
 "Name": "Translator"
+},
+{
+"Name": "DocumentTranslation"
+},
+{
+"Name": "DocumentTranslationHttpStart"
 }
 ],
 "AccessIPRestriction": false,
-"AccessSubnetRestriction": false
+"AccessSubnetRestriction": true
 }
 ]
 }
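
This functions config diff renames the per-plan `FunctionApps` array to `Services` and flips `AccessSubnetRestriction` to `true` for every service. Tooling that reads the file may need to tolerate both shapes during a transition. The sketch below is one possible way to do that and to flag services that are still unrestricted. The helper names, the `legacyplan` Id and the trimmed sample are hypothetical; only the key names come from the diff.

```python
import json


def iter_services(config: dict):
    """Yield (plan_id, service) pairs, accepting the old and new array key."""
    for plan in config.get("AppPlans", []):
        # "Services" is the new name; fall back to the pre-rename "FunctionApps".
        services = plan.get("Services", plan.get("FunctionApps", []))
        for service in services:
            yield plan.get("Id"), service


def unrestricted_services(config: dict) -> list:
    """Return the Ids of services whose AccessSubnetRestriction is not true."""
    return [service.get("Id")
            for _, service in iter_services(config)
            if not service.get("AccessSubnetRestriction", False)]


# Trimmed, hypothetical sample mixing the new and the legacy layout.
sample = json.loads("""
{
  "GroupId": "Functions",
  "AppPlans": [
    {"Id": "skillsplan",
     "Services": [
       {"Id": "geolocations", "AccessSubnetRestriction": true}
     ]},
    {"Id": "legacyplan",
     "FunctionApps": [
       {"Id": "text", "AccessSubnetRestriction": false}
     ]}
  ]
}
""")

print(unrestricted_services(sample))
```

A check like this could run before provisioning to confirm that a hand-edited config matches the VNET expectations of the release.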

configuration/config/functions/imgext.json

Lines changed: 2 additions & 2 deletions
@@ -15,8 +15,8 @@
 "slotSetting": false
 },
 {
-"name": "ContainerName",
-"value": "{{param.documentsStorageContainerName}},{{param.imagesStorageContainerName}}",
+"name": "ContainersList",
+"value": "{{param.StorageContainersAsString}}",
 "slotSetting": false
 },
 {
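
The imgext setting above moves from a `ContainerName` value built by joining two container parameters with a comma to a single `ContainersList` value fed by `StorageContainersAsString`. Assuming the new parameter is also a comma-separated string (the diff does not show its definition), a consumer could split it roughly as sketched here. The actual skill is written in C#; this Python version and the sample value are illustrative only.

```python
def parse_containers_list(value: str) -> list:
    """Split a comma-separated setting into trimmed, de-duplicated names."""
    names = []
    for name in value.split(","):
        name = name.strip()
        # Skip empty segments and repeats while keeping the original order.
        if name and name not in names:
            names.append(name)
    return names


# Hypothetical setting value; real container names come from the deployment parameters.
setting_value = "documents, images,,documents"
containers = parse_containers_list(setting_value)
print(containers)
```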
