File tree Expand file tree Collapse file tree 1 file changed +57
-0
lines changed
Expand file tree Collapse file tree 1 file changed +57
-0
lines changed Original file line number Diff line number Diff line change 1+ # Details of Execution
2+
3+ The lifecycle of a span query includes:
4+
5+ - Input from client to the generate
6+ - Output to client of the generate
7+ - By-products: what the model server caches as a result of that generate
8+
9+ ## Input Concerns
10+
11+ Input from client to the generate.
12+
13+ ### Messages
14+
15+ ```
16+ (system m): a message with role "system" and content m
17+ (user m): a message with role "user" and content m
18+ (assistant m): a message with role "assistant" and content m
19+ ```
20+
21+ ### Notes on the Effect of Chat Templates
22+
23+ We assume that when mapping ` A ` to ` a ` the chat template is applied
24+ then the tokenizer.
25+
26+ ### Terminology
27+
28+ The terminology below has capitalized letters representing strings and
29+ lowercase letters representing token sequences. For example:
30+
31+ ```
32+ A, B, C: these represent messages
33+ a, b, c: these represent corresponding token sequences, with chat template applied
34+ _: ensure that the preceding sequence both starts and ends on a block boundary
35+ +: special token for begin span
36+ x: special token for restore cross attention
37+ ```
38+
39+ ### Rules
40+
41+ ```
42+ (seq A B C) -> abc
43+ (plus A B C) -> (+a)_(+b)_(+c)_ meaning add + to each and ensure each starts and ends on a block boundary
44+ (cross A B C) -> ab(xc)_ meaning add x before the last element and ensure (xc) starts and ends on a block boundary
45+ ```
46+
47+ ### Examples
48+
49+ ```
50+ (cross A (plus B C) D) -> a(+b)_(+c)_(xd)_
51+ ```
52+
53+ ## By-product of generate
54+
55+ What the model server caches as a result of that generate.
56+
57+ TODO
You can’t perform that action at this time.
0 commit comments