
Commit bebb858

Nicholas Clarke authored and ch1bo committed
Add a 'principle' about optimisation.
There are some key principles which we used when putting together the Haskell node - it feels like these should be captured somewhere! This one in particular seems very relevant to people designing alternate node implementations.
1 parent 5b0e2e0 commit bebb858

File tree

1 file changed: +79 -0 lines changed


src/principles/optimise-worst.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
# Principles

## Optimise only for the worst case

Algorithms often possess different performance characteristics in the best,
average, and worst cases. Quicksort, for example, has `O(n*log(n))` time
complexity in the average case, but `O(n^2)` in the worst case.

Often in software we are interested in optimising for the average case, since
over time it tends to result in the highest overall performance. In Cardano,
however, we take a different approach: we want to optimise only the worst-case
performance, and, indeed, pick algorithms where the _worst case is the same as
the best (or average) case_.
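
As an illustrative sketch (not code from the node), a naive quicksort in
Haskell hits its quadratic worst case on already-sorted input, whereas
`Data.List.sort` (a merge sort) stays `O(n*log(n))` even in the worst case, so
its worst and average cases coincide:

```haskell
import Data.List (sort)

-- Naive quicksort: always picks the head as the pivot. On already-sorted
-- (or reverse-sorted) input every partition is maximally unbalanced, so the
-- number of comparisons degrades to O(n^2).
quicksort :: Ord a => [a] -> [a]
quicksort []       = []
quicksort (p : xs) = quicksort smaller ++ [p] ++ quicksort larger
  where
    smaller = [x | x <- xs, x < p]
    larger  = [x | x <- xs, x >= p]

main :: IO ()
main = do
  let adversarial = [1 .. 10000] :: [Int]  -- worst-case input for quicksort
  print (length (quicksort adversarial))   -- quadratic number of comparisons
  print (length (sort adversarial))        -- merge sort: O(n*log(n)) worst case
```

An algorithm whose worst case matches its average case cannot be pushed into a
pathological regime by an adversarially chosen input.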

### Motivation

There are two motivating ideas behind this principle:

1. _Performance will come to be relied upon_. This is an instance of Hyrum's
   law in action:

   > With a sufficient number of users of an API, it does not matter what you
   > promise in the contract: all observable behaviors of your system will be
   > depended on by somebody.

   If we manage to increase average performance, tools will be developed that
   come to expect this behaviour. If blocks can be processed faster, for
   example, there may be pressure to use that extra time to allow larger script
   execution budgets.

2. Forcing honest nodes to do more work provides an attack opportunity for the
   adversary. In the worst case, an adversary could force nodes to fail to
   adopt or forge blocks, and hence potentially gain control of the chain.

### Example: UTxO locality

There is some evidence that the UTxO set exhibits a degree of temporal
locality. That is, there are a number of long-lived UTxO entries that are
unlikely ever to be spent, while recently created entries are quite likely to
be spent soon.

We could choose to take advantage of this by organising the UTxO set along the
lines of an MRU (most recently used) cache. This would likely speed up UTxO
lookups in the average case, particularly if the UTxO set were stored on disk.
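
For concreteness, here is a minimal Haskell sketch of what such an arrangement
might look like; the names (`UtxoStore`, `lookupUtxo`, and so on) are
hypothetical and not taken from the node's code. Lookups that hit the small
recently-used cache are cheap, while anything else falls through to the full
(possibly on-disk) set:

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified types for illustration only.
type TxIn  = Int
type TxOut = String

data UtxoStore = UtxoStore
  { recentCache :: Map.Map TxIn TxOut  -- small cache of recently created entries
  , backingSet  :: Map.Map TxIn TxOut  -- full UTxO set (imagine this on disk)
  }

-- Fast path: the entry was created recently and is still in the cache.
-- Slow path: fall back to the much larger backing set, which may require I/O.
lookupUtxo :: TxIn -> UtxoStore -> Maybe TxOut
lookupUtxo txin store =
  case Map.lookup txin (recentCache store) of
    Just out -> Just out                            -- average case: cheap
    Nothing  -> Map.lookup txin (backingSet store)  -- worst case: expensive
```

A lookup that misses the cache pays the full cost of the slow path every time.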

However, an attacker could then deliberately craft a number of transactions
that spend only UTxO entries created a long time ago. A block filled with such
transactions, while perfectly legitimate, might take significantly longer to
process than a regular block. The attacker could use this delay to gain an
advantage in block forging and thus magnify their effective stake.

### Considerations

Whilst we have written the above as a general principle, there are of course
various considerations that affect how closely we want to follow it:

1. The situation we most want to avoid is one where an attacker can control an
   input such that they can force the worst-case behaviour. The UTxO locality
   example above is one such case.
2. Still problematic are cases where an attacker cannot control the performance
   but can predict it, or, less serious still, merely observe it. Either way, a
   prepared attacker could take advantage of the degraded performance to launch
   an attack on the chain.
3. Either of these cases is exacerbated if the same behaviour is coordinated
   across all nodes, since this allows an attacker to exploit a performance
   drop across the entire network.

In situations where the inputs to a function are random or not observable by
an adversary, or where the effects are purely local, it may therefore still be
sensible to optimise for the average case. One example might be local state
query computations, since these are triggered only by a (trusted) local
connection and do not lie on a critical path for block forging or adoption.

The key point is that, in the Cardano setting, optimisations should be
carefully considered, and it is certainly not the case that better
average-case performance is always desirable!
