Commit 0714228

committed
wip
1 parent 2be0980 commit 0714228

1 file changed

+38
-38
lines changed

docs/docs/codeflash-concepts/benchmarking.md

Lines changed: 38 additions & 38 deletions
@@ -1,41 +1,42 @@
1-
## How codeflash decides if an optimization is faster
1+
## How Codeflash measures the runtime of code
22

3-
Codeflash reports benchmarking results that look something like this.
3+
Codeflash reports benchmarking results that look like this.
44

55
⏱️ Runtime : 32.8 microseconds → 29.2 microseconds (best of 315 runs)
66

7-
In this document we explain how we measure the runtime of code, how we determine if an optimization is faster, why we measure
8-
timing as bes of N runs, how we measure the runtime of a wide variety of codes.
7+
In this document we explain:
8+
- how we measure the runtime of code
9+
- how we determine if an optimization is actually faster
10+
- why we measure the timing as best of N runs
11+
- how we measure the runtime when we run on a wide variety of test cases.
912

10-
## Design of Codeflash auto-benchmarking
13+
## Goals of Codeflash auto-benchmarking
1114

12-
A core part of the design of Codeflash is that it does not make strong assumptions
13-
of what types of optimizations are faster. Codeflash automatically benchmarks the code
14-
on a variety of inputs and determines empirically if the optimization is actually faster.
15+
A core design principle of Codeflash is that it does not make assumptions
16+
about which types of optimizations might be faster. It generates multiple possible optimizations with LLMs and then automatically benchmarks the code
17+
on a variety of inputs to verify empirically whether each optimization is actually faster.
1518

16-
The aims behind the design of Codeflash auto-benchmarking are:
17-
- Be able to accurately measure the runtime of code.
18-
- Be able to measure runtime of a wide variety of codes.
19-
- Be able to measure runtime of code on a variety of inputs.
20-
- Do all the above on real machine, where there might be other processes running, creating timing measurement noise.
21-
- Be able to make a binary decision on whether an optimization is faster or not.
19+
The goals of Codeflash auto-benchmarking are:
20+
- Accurately measure the runtime of code.
21+
- Measure the runtime of a wide variety of code.
22+
- Measure runtime on a variety of inputs.
23+
- Do all of the above on a real machine, where other running processes may cause timing measurement noise.
24+
- Make a binary decision on whether an optimization is faster or not.
2225

23-
A useful train analogy -
24-
(timing decision is a binary decision)
25-
Imagine that you are a train supervisor who is comparing that between two trains, Train A and Train B, which one is faster.
26+
## A useful train analogy
2627

27-
[//]: # (Your objective is to figure out which train is the fastest to go from Seattle to San Diego. )
28+
Imagine that you are a boss at a train company who wants to purchase a train to run between the two cities of San Francisco and Los Angeles.
29+
You are deciding between two trains, Train A and Train B, and want to run the train that is the fastest between the two cities.
2830

29-
[//]: # (The route first goes to San Francisco, Los Angeles and then ends at San Diego.)
3031
You can measure the speed of the trains by timing how long it takes to go from San Francisco to Los Angeles.
3132

3233
Unfortunately, there are real life factors that can affect the speed of the trains. There might
33-
be rail traffic, weather conditions, terrain or other factors that can slow down the trains.
34+
be rail traffic, unfavorable weather conditions, hills or other factors that can slow down the trains.
3435

3536
To settle the contest, you ask a train driver to race the two trains and run the trains as fast as possible.
36-
You run both the trains A and B from San Francisco to Los Angeles and measure the time it takes.
37+
You measure the time it takes to go from San Francisco to Los Angeles.
3738

38-
Now the train A took 5% less time to do Seattle->San Diego than train B. But the driver complaints that
39+
Now train A took 5% less time than train B. But the driver complains that
3940
train B's run had poor weather on the way so they can't make conclusions yet. It is very important to definitively
4041
know which train is faster.
4142

@@ -46,33 +47,32 @@ This gives us timing data looking like the following. The units are in hours.
4647

4748
![img_2.png](img_2.png)
4849

49-
Now our task becomes seemingly harder to decide which train is faster because now there are 50x2 data points.
50+
Now the task of deciding which train is faster becomes harder, since there are 50x2 data points.
5051

51-
Unfortunately, the timing data is also noisy. Other trains might be running on the tracks, the weather might change,
52-
or the train might be delayed for some other reason. This makes it hard to determine which train is faster.
52+
Unfortunately, the timing data is noisy. Other trains might be running on the tracks, the weather might change, and so on.
53+
This makes it hard to determine which train is faster in reality.
5354

54-
The crucial point is that, the noise in the timing data is not the fault of the train.
55-
If we think about which train is fast - speed is a property of the train and not the hindrances.
56-
The ideal way to decide the time would be to clear out all the hindrances and measure the time.
57-
That way we would have a clean data set that is not noisy.
55+
The crucial point here is that the noise in the timing data is not the fault of the train.
56+
The speed of a train is an intrinsic property of the train, not of the external hindrances.
57+
An important property of the noise is that it is only additive in nature, i.e. when there is a hindrance, the time taken only increases.
58+
There is no negative noise, which would make the trains go faster.
59+
The ideal way to decide the run time would be to clear out all the hindrances and rerun the race.
60+
That way we would have a clean data set that is not noisy, and the run times would be the true speed of the train.
5861

5962
But in real life, we cannot do that. The best we can do is to try to minimize the noise,
60-
and get the "signal" which is the speed of the train, and not the noise which is the time added by hindrances.
63+
to get close to the "signal" which is the intrinsic speed of the train, and not the noise which is the time added by hindrances.
6164
Luckily, we can do that. When we repeat the race multiple times, we get multiple data points.
62-
There will be a lot of cases where when the train goes between two stations, all the conditions are favorable,
65+
There will be many runs between the two stations in which all the conditions are favorable,
6366
and the train is able to run at its maximum speed. This will be when the noise is the least, and the
6467
measured time will be the smallest. This is the "signal" we are looking for - the fastest speed that the train
65-
can achieve. The noise is only additive noise, i.e. when there is a hindrance, the time taken only increases, there is no
66-
negative noise, which would make the train go faster.
68+
can achieve over the whole route. Comparing this speed across the two trains tells us which one is faster.
6769

68-
So the key idea is that we find the minimum time that the train can achieve at a sector. That is very close to the fastest speed that the train can achieve.
70+
So the key idea is to find the minimum time that the train can achieve between the two cities. That minimum corresponds very closely to the fastest speed that the train can achieve.
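
The train analogy can be sketched as a small simulation. This is only an illustration and not part of Codeflash: the intrinsic times (6.0 and 6.3 hours), the exponential delay model, and the helper name `simulate_runs` are all assumptions chosen for the example. The only property taken from the text is that the noise is additive, so every measured time is at least the intrinsic time.

```python
import random


def simulate_runs(true_hours, n_runs, rng):
    """Simulate n_runs trips: intrinsic time plus a non-negative delay."""
    # expovariate never returns a negative value, so the noise is purely additive.
    return [true_hours + rng.expovariate(2.0) for _ in range(n_runs)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical intrinsic times, in hours.
TRUE_A, TRUE_B = 6.0, 6.3

train_a = simulate_runs(TRUE_A, 50, rng)
train_b = simulate_runs(TRUE_B, 50, rng)

# The minimum over many runs is the measurement with the least noise.
best_a, best_b = min(train_a), min(train_b)

# The mean includes the average delay, so it overestimates the intrinsic time.
mean_a = sum(train_a) / len(train_a)
mean_b = sum(train_b) / len(train_b)

faster = "A" if best_a < best_b else "B"
print(f"best A = {best_a:.3f} h, mean A = {mean_a:.3f} h")
print(f"best B = {best_b:.3f} h, mean B = {mean_b:.3f} h")
print(f"faster train by best-of-50: {faster}")
```

Because the delays can only add time, the minimum can never fall below the intrinsic time, and with enough runs it gets close to it, while the mean stays inflated by the average delay.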
6971

7072
## How Codeflash benchmarks code
7173

72-
From the above, it is clear that we want to measure the fastest speed, which corresponds to the minimum time that the train can achieve.
73-
This has the least amount of additive noise, and is the most accurate measure of the intrinsic speed of the train.
74-
75-
The same idea applies to Codeflash . With processors, there are many different types of noise that can increase the runtime of a function.
74+
The idea of measuring the fastest speed, which minimizes the noise, is the same idea that Codeflash uses to measure the runtime of code.
75+
With computer processors, there are many different types of noise that can increase the runtime of a function.
7676
The noise can be caused by:
7777
- The hardware - there can be cache misses, CPU frequency scaling up and down, etc.
7878
- The operating system - there can be context switches, memory allocation, etc.
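
A best-of-N measurement of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Codeflash's actual implementation: the helper `best_of_n`, the default of 50 runs, and the two example functions are hypothetical. It uses only the standard library's `time.perf_counter_ns`.

```python
import time


def best_of_n(func, args=(), n_runs=50):
    """Return the smallest wall-clock time, in nanoseconds, over n_runs calls."""
    best = None
    for _ in range(n_runs):
        start = time.perf_counter_ns()
        func(*args)
        elapsed = time.perf_counter_ns() - start
        # Keep the run with the least additive noise.
        if best is None or elapsed < best:
            best = elapsed
    return best


def sum_squares_loop(n):
    """Hypothetical original code: sum of i*i for i in range(n)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_formula(n):
    """Hypothetical candidate optimization using the closed form."""
    return (n - 1) * n * (2 * n - 1) // 6


# Check that both versions agree before comparing runtimes.
assert sum_squares_loop(1000) == sum_squares_formula(1000)

original_ns = best_of_n(sum_squares_loop, (1000,))
candidate_ns = best_of_n(sum_squares_formula, (1000,))
print(f"original: {original_ns} ns, candidate: {candidate_ns} ns (best of 50 runs)")
```

Taking the minimum rather than the mean follows the reasoning above: cache misses, frequency scaling, and context switches can only lengthen a run, so the shortest run is the one closest to the code's intrinsic runtime.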
