The above code runs the function multiple times on different inputs and takes the minimum time for each input.
+

In this document we explain -
- how we measure the runtime of code
- how we determine if an optimization is actually faster
@@ -71,61 +92,65 @@ So the key idea is that we find the minimum time that the train can achieve betw
## How Codeflash benchmarks code

-
The idea of measuring the fastest speed, which minimizes the noise, is the same idea that Codeflash uses to measure the runtime of code.
+
The idea of measuring the fastest speed, which minimizes the external noise, is the same idea that Codeflash uses to measure the runtime of code.
With computer processors, there are many different types of noise that can increase the runtime of a function.
The noise can be caused by -
- The hardware - there can be cache misses, CPU frequency scaling up and down, etc.
- The operating system - there can be context switches, memory allocation, etc.
- The language - there can be garbage collection, thread scheduling, etc.

Codeflash tries to minimize the noise by running the function multiple times and taking the minimum time.
-
This is when the function is not slowed down by any hindrances. The processor frequency is at its maximum,
-
cache misses are not happening, the operating system is not doing any context switches etc.
+
This happens when the function is not slowed down by any hindrances. The processor frequency is at its maximum,
+
cache misses are minimal, and the operating system is not doing any context switches, etc.
This is the fastest speed that the function can achieve, and is the most accurate measure of the intrinsic speed of the function.

When Codeflash wants to measure if an optimization is faster than the original function, it runs the two functions
multiple times and takes the minimum time for both the functions. This is the most accurate measurement of the
intrinsic speed of the function, which is the signal we are looking for. We can now compare the two functions and see which one is faster.

We have found that when we run the function several times, the chance of getting "lucky" when the function is not
-
slowed down by any hindrances gets very high. There codeflash tries to run the function as many times as reasonably possible.
-
Currently we loop the code for 10 seconds and a minimum of 5 loops, which gives us a good balance between accuracy of runtime of and the time it takes to run the function.
+
slowed down by any hindrances becomes very high. To maximize this luck, Codeflash tries to run the function as many times as reasonably possible.
+
Currently, we loop the code for 10 seconds with a minimum of 5 loops, which gives us a good balance between accuracy of time measurement and the time it takes to run the function.
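
The timing loop described above can be sketched in Python as follows. This is a minimal illustration and not Codeflash's actual implementation: the helper name `min_runtime_ns` and its parameters are hypothetical, and only the 10-second budget and the 5-loop minimum are taken from the text.

```python
import time


def min_runtime_ns(func, *args, budget_seconds=10.0, min_loops=5):
    """Return the fastest observed wall-clock time of func(*args), in nanoseconds.

    Hypothetical helper: it keeps timing the call until the time budget is
    spent AND at least `min_loops` measurements have been taken.
    """
    best = None
    loops = 0
    deadline = time.perf_counter() + budget_seconds
    while loops < min_loops or time.perf_counter() < deadline:
        start = time.perf_counter_ns()
        func(*args)
        elapsed = time.perf_counter_ns() - start
        # Keep only the minimum: the run least disturbed by external noise.
        if best is None or elapsed < best:
            best = elapsed
        loops += 1
    return best
```

For example, `min_runtime_ns(sorted, [3, 1, 2], budget_seconds=0.5)` would time `sorted([3, 1, 2])` repeatedly for about half a second and report the fastest run.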

## What happens when there are multiple inputs to a function?

The above idea works well when there is only one input to a function. But what if there are multiple inputs?

-
Lets consider the train analogy again. Now the train goes between multiple stations. It first starts from Seattle up north,
+
Let's consider the train analogy again. Now the train race is extended between multiple stations. It first starts from Seattle up north,
and then goes south to San Francisco, then Los Angeles, and finally terminates at San Diego. We want to again measure
-
which train is the faster one on this route.
+
which train is the faster one for this route.

We can only measure the time taken by the train to go from one station to the next.

Here is what the timing data looks like. The units are in hours.


-
Now our task becomes seemingly harder to decide which train is faster because now there are 50x3x2 data points to consider.
+
Now the task of deciding which train is faster becomes even harder, since there are now 50x3x2 data points to consider.
Unfortunately, the timing data is also noisy. Running the same train on the same route might not give the same time, because
of external factors like weather, traffic, etc. This makes it hard to determine which train is faster.

So, which train is faster?

The above insight of measuring the fastest speed of the train is still applicable here. But since there are multiple
sectors, we need to measure the fastest speed of the train separately for each sector. This is because one sector might
-
have hills or winding tracks, which might slow down the train. But these will affect both the trains equally.
+
have hills or winding tracks, which might slow down the train.

-
So to find the train that is fastest between the two stations, we find the minimum time taken by the train to go from one station to the next.
+
So, we divide the route into sectors between the stations,
+
and measure the fastest speed of the train for each sector. To find the train that is fastest between the two stations, we find the minimum time taken by the train to go from one station to the next.
We then sum the minimum times for all the sectors to get the total time taken by the train to go from the first station to the last station.
The train that has the smallest sum of minimum times is the fastest train, since this measures the intrinsic speed of the
-
train on a given route.
+
train on a given route. The reason to calculate the minimum time for each sector is to increase our "luck" of not
+
getting slowed down by any hindrances. The chance of encountering external noise is lower in one sector than in the whole route.
+
Measuring the minimum time for each sector therefore makes the time measurement more accurate.
+

+

+
This is the same idea that Codeflash applies to functions. For a workload composed of multiple inputs,
+
it measures the intrinsic speed of a function on different inputs separately.
+
The total intrinsic runtime of the workload is the sum of the intrinsic runtime of the function on each input.
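
As a rough sketch of this summation, assuming the timings for each input have already been collected, the calculation could look like the snippet below. The function name `total_intrinsic_runtime`, the input labels, and all timing values are hypothetical and chosen only for illustration.

```python
def total_intrinsic_runtime(samples_by_input):
    """Sum the minimum measured time over every input of a workload."""
    return sum(min(samples) for samples in samples_by_input.values())


# Hypothetical timings in nanoseconds: three noisy runs per input.
original = {"input_a": [120, 100, 135], "input_b": [310, 300, 298]}
optimized = {"input_a": [95, 90, 140], "input_b": [280, 305, 270]}

original_total = total_intrinsic_runtime(original)    # 100 + 298
optimized_total = total_intrinsic_runtime(optimized)  # 90 + 270
```

Note that the minimum is taken per input before summing, which mirrors taking the minimum per sector in the train analogy rather than the minimum over whole trips.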

-
This is the same idea that Codeflash applies to functions. It measures the intrinsic speed of a function on separate inputs.
-
It then assumes a workload is composed of multiple inputs, and measures the intrinsic speed of the function on each input.
-
Then the instrinsic runtime of the function on the workload which consists of multipe inputs
-
is the sum of the intrinsic runtime of the function on each input.
(make drawings for each of the concepts)

-
We have found that this approach is very accurate and is the best way to measure the speed of a function, even in noisy Virtual machines.
-
We use a noise floor of 5% of the runtime, and only of the optimization is at least 5% faster than the original function, we consider it to be a significant improvement.
-
This technique gets rid of most of the measurement noise, and gives us a very accurate measure of the intrinsic speed of the function.
+
We have found this approach to be very accurate and a great way to measure the speed of a function, even in noisy virtual machines.
+
We use a noise floor of 5% of the runtime (10% on GitHub Actions), and we consider an optimization a significant improvement only if it is at least 5% faster than the original function.
+
This technique gets rid of most of the measurement noise, and gives us a very accurate measure of the noise-free intrinsic speed of the function.
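
The noise-floor check can be illustrated with a short sketch. The text does not spell out exactly how "faster" is computed, so this example assumes it means the runtime reduction relative to the original runtime; the function name `is_significant_speedup` and its parameters are hypothetical.

```python
def is_significant_speedup(original_ns, optimized_ns, noise_floor=0.05):
    """Return True if the runtime reduction is at least `noise_floor`.

    Assumption: the reduction is measured as a fraction of the original
    runtime. The actual definition used by Codeflash may differ.
    """
    if original_ns <= 0:
        raise ValueError("original_ns must be positive")
    reduction = (original_ns - optimized_ns) / original_ns
    return reduction >= noise_floor
```

Under this assumption, going from 1000 ns to 940 ns (a 6% reduction) would pass the default 5% floor, but would not pass a 10% floor such as the one mentioned for GitHub Actions.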