Commit 45c09f1

eval driven system design cookbook updates merge
2 parents 384d120 + f92933b commit 45c09f1

2 files changed: 34 additions, 0 deletions

examples/partners/eval_driven_system_design/receipt_inspection.ipynb

Lines changed: 29 additions & 0 deletions
@@ -112,6 +112,7 @@
 "source": [
 "## Project Lifecycle\n",
 "\n",
+<<<<<<< HEAD
 "Not every project will proceed in the same way, but projects generally have some \n",
 "important components in common.\n",
 "\n",
@@ -121,6 +122,10 @@
 "represents the ongoing nature of problem understanding - uncovering more about\n",
 "the customer domain will influence every step of the process. We wil examine \n",
 "several of these iterative cycles of refinement in detail below. \n",
+=======
+"Not every project will proceed in the same way, but projects generally have some common\n",
+"important components.\n",
+>>>>>>> origin/main
 "\n",
 "### 1. Understand the Problem\n",
 "\n",
@@ -140,11 +145,18 @@
 "It's very rare that a real-world project will start with all the data necessary to get\n",
 "to a satisfactory solution, much less to establish confidence.\n",
 "\n",
+<<<<<<< HEAD
 "In our case, we're going to assume that we have a decent sample of system *inputs*, \n",
 "in the form of but receipt images, but start without any fully annotated data. We find \n",
 "this is a not-unusual situation when automating an existing process. Instead, \n",
 "we'll walk through the process of building that out as we go along by collaborating with\n",
 "domain experts, and make our evals progressively more comprehensive.\n",
+=======
+"In our case, we're going to assume that we have a decent sample of system *inputs*\n",
+"(here, photographs of receipts), but start without any fully annotated data. We'll walk\n",
+"through the process of incrementally expanding our test and training sets as we go along\n",
+"and make our evals progressively more comprehensive.\n",
+>>>>>>> origin/main
 "\n",
 "### 3. Build an End-to-End V0 System\n",
 "\n",
@@ -402,7 +414,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
+<<<<<<< HEAD
 "![Walmart_image](../../../images/Supplies_20240322_220858_Raven_Scan_3_jpeg.rf.50852940734939c8838819d7795e1756.jpg)"
+=======
+"<img src=\"../../../images/Supplies_20240322_220858_Raven_Scan_3_jpeg.rf.50852940734939c8838819d7795e1756.jpg\" alt=\"Walmart_image\" width=\"400\"/>"
+>>>>>>> origin/main
 ]
 },
 {
@@ -505,6 +521,7 @@
 "source": [
 "### Action Decision\n",
 "\n",
+<<<<<<< HEAD
 "Next, we need to close the loop and get to an actual decision based on receipts. \n",
 "\n",
 "Ordinarily one would start with the most capable model - `o3`, at this time - for a \n",
@@ -521,6 +538,10 @@
 "\n",
 "Otherwise, this is pretty similar to the last, so we'll present the code without \n",
 "further comment."
+=======
+"Next, we need to close the loop and get to an actual decision based on receipts. This\n",
+"looks pretty similar, so we'll present the code without comment."
+>>>>>>> origin/main
 ]
 },
 {
@@ -909,10 +930,14 @@
 "metadata": {},
 "source": [
 "After you run that eval you'll be able to view it in the UI, and should see something\n",
+<<<<<<< HEAD
 "like the below. \n",
 "\n",
 "(Note, if you have a Zero-Data-Retention agreement, this data is not stored\n",
 "by OpenAI, so will not be available in this interface.)\n",
+=======
+"like:\n",
+>>>>>>> origin/main
 "\n",
 "![Summary UI](../../../images/partner_summary_ui.png)\n",
 "\n",
@@ -1642,7 +1667,11 @@
 "ARE NOT TRAVEL-RELATED, THEN IT MUST BE AUDITED.\n",
 "```\n",
 "\n",
+<<<<<<< HEAD
 "4. We added three examples, JSON input/output pairs wrapped in XML tags.\n",
+=======
+"3. We added three examples, JSON input/output pairs wrapped in XML tags.\n",
+>>>>>>> origin/main
 "\n",
 "With our prompt revisions, we'll regenerate the data to evaluate and re-run the same\n",
 "eval to compare our results:"

registry.yaml

Lines changed: 5 additions & 0 deletions
@@ -9,13 +9,18 @@
 date: 2025-06-01
 authors:
 - shikhar-cyber
+<<<<<<< HEAD
 - moredatarequired
 - tooluser
 - eddiesiegel
 tags:
 - evals
 - API Flywheel
 - completions
+=======
+tags:
+- evals
+>>>>>>> origin/main
 - responses
 - functions
 - tracing
