Commit 1d47a8e
[Evals] Optimize tracing for Phoenix (elastic#238599)
## What
Optimizes inference instrumentation for use in Phoenix, by no longer
requiring a single root trace. Additionally, retry on missing scores in
the criteria evaluator.
## Why
We previously required a single root trace for its events to be
exported. This was done to not have to ingest all unrelated HTTP
requests that do not have any value for what we use Phoenix for:
debugging LLM-based workflows. The change we are making here is that we
use context to flag events as inference-related, instead of requiring it
to be a span. We don't need a single `RunExperiment` span anymore for
this reason, which messes up the token metrics in Phoenix for
evaluations.
I also added retries around scoring - in some cases the LLM did not
score all criteria. To that end, I've added `executeUntilValid` which
will ask the LLM to keep calling a tool until it no longer encounters
errors.
Some other minor changes:
- experiments can now contain metadata
- export types for `@kbn/evals` worker fixtures
I've targeted this at 9.2.0 as it is a dev-only change (mostly) and I
want to limit the chance of conflicts.1 parent 9d3f206 commit 1d47a8e
File tree
17 files changed
+648
-79
lines changed17 files changed
+648
-79
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
14 | | - | |
| 14 | + | |
Lines changed: 3 additions & 22 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
10 | | - | |
11 | | - | |
12 | | - | |
13 | | - | |
14 | | - | |
| 9 | + | |
15 | 10 | | |
16 | 11 | | |
17 | | - | |
18 | | - | |
19 | 12 | | |
20 | 13 | | |
21 | 14 | | |
22 | 15 | | |
23 | 16 | | |
24 | | - | |
| 17 | + | |
25 | 18 | | |
26 | 19 | | |
27 | 20 | | |
| |||
33 | 26 | | |
34 | 27 | | |
35 | 28 | | |
36 | | - | |
37 | | - | |
38 | | - | |
39 | | - | |
40 | | - | |
41 | | - | |
42 | | - | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | | - | |
| 29 | + | |
49 | 30 | | |
50 | 31 | | |
51 | 32 | | |
| |||
Lines changed: 49 additions & 23 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
11 | | - | |
12 | | - | |
| 11 | + | |
| 12 | + | |
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
58 | | - | |
59 | | - | |
60 | | - | |
61 | | - | |
62 | | - | |
63 | | - | |
64 | | - | |
65 | | - | |
66 | | - | |
67 | | - | |
68 | | - | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
69 | 66 | | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
74 | 72 | | |
75 | 73 | | |
76 | 74 | | |
| |||
85 | 83 | | |
86 | 84 | | |
87 | 85 | | |
88 | | - | |
89 | | - | |
90 | | - | |
91 | | - | |
92 | | - | |
93 | | - | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
94 | 120 | | |
95 | 121 | | |
96 | 122 | | |
| |||
Lines changed: 7 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
14 | | - | |
| 14 | + | |
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| |||
86 | 86 | | |
87 | 87 | | |
88 | 88 | | |
| 89 | + | |
89 | 90 | | |
90 | 91 | | |
91 | 92 | | |
| 93 | + | |
92 | 94 | | |
93 | 95 | | |
94 | 96 | | |
| |||
98 | 100 | | |
99 | 101 | | |
100 | 102 | | |
| 103 | + | |
101 | 104 | | |
102 | 105 | | |
103 | 106 | | |
| 107 | + | |
104 | 108 | | |
105 | 109 | | |
106 | 110 | | |
107 | | - | |
| 111 | + | |
108 | 112 | | |
109 | 113 | | |
110 | 114 | | |
| |||
115 | 119 | | |
116 | 120 | | |
117 | 121 | | |
| 122 | + | |
118 | 123 | | |
119 | 124 | | |
120 | 125 | | |
| |||
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| 10 | + | |
10 | 11 | | |
11 | 12 | | |
12 | 13 | | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
16 | | - | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
10 | | - | |
| 10 | + | |
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
14 | 19 | | |
| 20 | + | |
15 | 21 | | |
16 | 22 | | |
17 | 23 | | |
| |||
31 | 37 | | |
32 | 38 | | |
33 | 39 | | |
| 40 | + | |
| 41 | + | |
34 | 42 | | |
35 | 43 | | |
36 | 44 | | |
| |||
54 | 62 | | |
55 | 63 | | |
56 | 64 | | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
| 32 | + | |
32 | 33 | | |
33 | 34 | | |
Lines changed: 2 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| 13 | + | |
| 14 | + | |
0 commit comments