This repository was archived by the owner on Jan 15, 2025. It is now read-only.

Commit a4b06fc
Author: Eyal
1 parent: d59abb5

Orchestrator CLI draft spec (#1021)

* orchestrator CLI initial spec draft

File tree

2 files changed: +164 -0 lines changed

specs/NLRModels.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
## -- DRAFT --

# Natural Language Representation Model

Turing Natural Language Representation (Turing *NLR*) models are generic language representation models that have been trained on more sophisticated pretraining tasks for both monolingual and multilingual scenarios. Turing NLR models serve as a natural replacement for BERT-like models.

## Models

### Turing NLR v3

Turing NLR v3 is the latest (monolingual) NLR model and belongs to the transformer (BERT-like) family of models. It is a hybrid model with both representation and generation capabilities. The model was pretrained with a bidirectional LM (via autoencoding) and a sequence-to-sequence LM (via partially autoregressive modeling) using a Pseudo-Masked Language Model. See [1][1] for details.

**TBD - add more**

## References

* [UniLMv2 Paper](https://arxiv.org/abs/2002.12804)

[1]: https://arxiv.org/abs/2002.12804 "UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training"

specs/OrchestratorCLI.md

Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
## *--DRAFT--*

# Orchestrator CLI Spec

## Summary

The Orchestrator command is used by Bot Framework developers to add language understanding functionality based on [Orchestrator][1] technology to their bots.

The Orchestrator command may also be used in an advanced mode (see the *interactive* sub-command) for refined experimentation and fine-tuning of derived language models.

## Requirements
### General and Common

Use the following common guidelines for all commands (unless explicitly excluded).

- All commands return:
  - Success or Error status
  - Error code and message, if available
- Upon successful completion:
  - Print back the affected values (if too long, name the container, e.g. a filename, or use an ellipsis ...)
  - Print the server's detailed response (e.g. training status)
- All commands must display a synopsis with an example in help
- All commands must print which flags/arguments are optional or mandatory, and their default values
- All lengthy operations (> 5 seconds) shall print a progress indicator (dots at 5-second intervals)
- All flags are assumed to take a value pair (e.g. -o, --out expects -o, --out <filename>)
- All flags marked `mandatory, default to...` are mandatory *unless* specified in the config file, in which case they default to the values provided there
- Use *camelCase* for all parameters
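The progress-indicator guideline above (dots at 5-second intervals for operations longer than 5 seconds) could be implemented along these lines; `run_with_progress` is a hypothetical helper name, not part of the spec:

```shell
# Sketch of the progress-indicator guideline: while an operation that takes
# longer than 5 seconds runs in the background, print a dot every 5 seconds.
# 'run_with_progress' is a hypothetical helper, not part of the spec.
run_with_progress() {
  "$@" &                          # start the long-running operation
  local pid=$!
  while kill -0 "$pid" 2>/dev/null; do
    sleep 5
    kill -0 "$pid" 2>/dev/null && printf '.'
  done
  wait "$pid"
  printf '\n'
}

run_with_progress sleep 7        # prints a dot while the 7-second step runs
```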
### Prerequisites

* BF CLI environment (OCliff, etc.). See the [Foundation Dev Spec][5] for more.
* Access to downloadable [NLR][4] base models

### Use Cases

Orchestrator first ships as a LUIS alternative for the *Dispatch scenario*, i.e. for mapping utterances to intents only. Entity recognition is on the roadmap as a future capability.

#### Primary Workflow

The mainstream bot language recognition development cycle with Orchestrator is assumed to be generally as follows:

1. Create an intent-utterance, example-based .lu definition, referred to as a *label file*, using the practices described in [Language Understanding][2] for dispatch (e.g. author the .lu file directly, or work within the [Composer][3] GUI experience).
2. Download a Natural Language Representation ([NLR][4]) base model (referred to as the *base model*).
3. Combine the .lu label file from (1) with the base model from (2) to create a *snapshot* file with a .blu extension.
4. Create another test .lu file, similar to (1), with utterances that are similar but not identical to the ones specified in the example-based .lu definition in (1).
5. Test the quality of utterance-to-intent recognition.
6. Examine the report to ensure that the recognition quality is satisfactory. For more on report interpretation see **TBD**.
7. If it is not, adjust the label file from (1) and repeat steps 2-6 until satisfied.
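As a concrete illustration of step 1, a minimal example-based .lu label file for the dispatch scenario might look like the following; the intent names and utterances are illustrative only, not taken from the spec:

```shell
# Write a minimal .lu label file: '# <Intent>' lines declare intents (labels),
# '- <utterance>' lines give example utterances for the preceding intent.
# The intents and utterances here are illustrative, not from the spec.
cat > labels.lu <<'EOF'
# BookFlight
- book a flight to cairo
- fly me to seattle tomorrow

# CheckWeather
- what is the weather in paris
- will it rain today
EOF
```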
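Under the draft sub-command names and flags from the command table later in this spec (all subject to change), steps 2 through 5 could look like the session below. `bf` is stubbed with a shell function here so the sketch stays self-contained rather than invoking the real CLI:

```shell
# Hypothetical end-to-end session for steps 2-5. Sub-command names and flags
# follow the draft command table in this spec and may change; 'bf' is stubbed
# so the sketch is self-contained rather than a real CLI invocation.
bf() { echo "would run: bf $*"; }

# (2) download the NLR base model
bf orchestrator:basemodel:get --out ./model

# (3) combine the .lu label file with the base model into a .blu snapshot
bf orchestrator:create --in ./labels.lu --model ./model --out ./generated

# (5) test recognition quality against a held-out test .lu file
bf orchestrator:test --in ./generated/labels.blu --model ./model --test ./test.lu --out ./report

# spot-check a single utterance against the snapshot
bf orchestrator:query --phrase "book a flight to cairo" --model ./generated/labels.blu
```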
For more information on Machine Learning design practices see [References](#References).

#### Variation: Composer Dialogs

If designing a Composer- or Adaptive-based multi-folder, multi-dialog bot that requires processing across directories and generating the corresponding .dialog files, use the *build* command.

The *build* command does... **TBD**

#### Advanced

The advanced language recognition development cycle assumes some level of understanding of machine learning concepts, interactive iteration over the language example definitions, and potentially evaluation of different models.

For the advanced scenario, please refer to the following documentation: **TBD**

## Design Specifications

### Command Line Form

At the root, *bf orchestrator* shall print a synopsis and a brief description of each sub-command. Similarly, each sub-command's --help shall print additional usage details.

| Sub-Command | Options | Comments |
| ----------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| basemodel | :get<br />:list<br />:compare **TBD** | Downloads the default, generally optimal base model, the Natural Language Representation (NLR), as the basis for creating the snapshot file. See [NLR Models][4] for more on *NLR*.<br />Other models may work better in some scenarios, and there are performance tradeoffs (speed, memory usage); one may experiment with different models. It is recommended, however, to first get familiar with advanced usage (**TBD** see advanced command).<br />To see the list of available models, use the list command. |
| create | -i, --in=in Path to the source label files from which the Orchestrator example file will be created. Defaults to the current working directory.<br /><br />-m, --model=model Path to the Orchestrator model directory.<br /><br />-o, --out=out Path where the generated Orchestrator snapshot file will be placed. Defaults to the current working directory. | Creates an Orchestrator snapshot .blu file from .lu/.qna files |
| test | -a, --ambiguous=ambiguous Ambiguous threshold, defaults to 0.2<br />-i, --in=in Path to a previously created snapshot file<br />-l, --low_confidence=low_confidence Low-confidence threshold, defaults to 0.5<br />-m, --model=model Directory hosting the Orchestrator config and model files.<br />-o, --out=out Directory where analysis and output files will be placed.<br />-p, --multi_label=multi_label Plural/multi-label prediction threshold, defaults to 1<br />-t, --test=test Path to a test file.<br />-u, --unknown=unknown Unknown label threshold, defaults to 0.3<br />-t, --prediction=prediction | Runs tests on label and model files.<br /><br />**TBD:** How do we distinguish evaluate/assess/test, with a mode flag or implicitly with an explanation? E.g. if the test file is not specified, run in assessment mode. Add "See Also: explain"; add a link to a detailed discussion... |
| build | -i, --in=in Path to a .lu file or a folder of .lu files.<br />-m, --model=model Path to the Orchestrator model.<br /><br />-o, --out=out Path where the Orchestrator snapshot/dialog file(s) will be placed. Defaults to the current working directory.<br /><br />--dialog Generate multi-language or cross-trained Orchestrator recognizers.<br /><br />--luconfig=luconfig Path to luconfig.json. | **TBD: Explain the build command** |
| interactive | Enter **advanced** interactive mode | See the full reference here: **TBD** |
| query | --phrase \<phrase><br />--model \<snapshot file> | Queries a snapshot .blu file for a given phrase to find the corresponding label. |
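The *test* thresholds can be read as score cut-offs. The semantics sketched below are an assumption for illustration only (the spec does not yet define them): a top-intent score under the unknown threshold (default 0.3) maps to the unknown label, and one under the low-confidence threshold (default 0.5) is flagged as low confidence:

```shell
# Assumed (not spec-confirmed) reading of two of the 'test' thresholds:
#   top-intent score < --unknown (0.3)        -> unknown label
#   top-intent score < --low_confidence (0.5) -> low-confidence result
classify() {
  awk -v s="$1" 'BEGIN {
    if (s < 0.3)      print "unknown"
    else if (s < 0.5) print "low_confidence"
    else              print "confident"
  }'
}

classify 0.25   # unknown
classify 0.45   # low_confidence
classify 0.80   # confident
```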
*Standard global commands such as --help and --force are assumed.*

#### Porting Map [If Applicable]

Dispatch CLI is the conceptual predecessor to Orchestrator's dispatch functionality.

**TBD** Need to try to map either concepts or commands to the older tool.

**Command Group:** Orchestrator

| Sub-Command | Options | Old Command | Comments |
| ----------- | ------- | ----------- | -------- |
| | | | |
| | | | |

## Special Considerations

The Orchestrator command group starts out as a plugin while in preview and will eventually be promoted to a command group parallel to luis and qnamaker. It will evolve to include additional [NLR][4] base models. Unlike the LU services mentioned, it is a local implementation with no service behind it, so some advanced tuning capability will likely be required to exploit its full power. In that case, users will need more advanced knowledge of machine learning and model optimization, and should be comfortable with the advanced functionality described above.

## Issues

While actual issues should be tracked in GitHub issues, initial placeholders may be placed here transiently for convenience.

* Fix TBD and additional references
* ...

## References

* [Orchestrator](https://aka.ms/bf-orchestrator)
* [Language Understanding](https://docs.microsoft.com/en-us/composer/concept-language-understanding)
* [Composer](https://docs.microsoft.com/en-us/composer/introduction)
* [Natural Language Representation Models](./NLRModels.md)
* [Wikipedia: Training, validation, and test sets](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets)
* [Machine Learning Mastery](https://machinelearningmastery.com/difference-test-validation-datasets/)

[1]: https://aka.ms/bf-orchestrator "Orchestrator"
[2]: https://docs.microsoft.com/en-us/composer/concept-language-understanding "Language Understanding"
[3]: https://docs.microsoft.com/en-us/composer/introduction "Composer"
[4]: ./NLRModels.md "Natural Language Representation Models"
[5]: ./FoundationDevSpec.md "Foundation dev spec"
