Conversation
This adds logic to the awst validation step that ensures there are no colliding state keys. It should be noted that collisions are still possible with dynamic types and same-sized static types, which is out of scope for this PR
|
|
Checking stubs for changes and corresponding version bump in origin/fix/error_on_key_collision Last puyapy release: v5.7.1 💡 Stub version change: 3.4.0 -> 3.5.0 |
There was a problem hiding this comment.
Pull request overview
Adds AWST validation to error on colliding app-storage definition keys so that duplicate global/local/box storage keys can’t silently overwrite each other during compilation.
Changes:
- Track previously-seen
AppStorageDefinitionkeys per storage kind and emit a compilation error on duplicates (with prior definition location logged). - Add regression tests covering duplicate global state, local state, box key, box prefix, and inheritance scenarios.
- Add a changelog fragment documenting the new compilation error behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_awst_validation.py | New tests asserting compilation fails with a clear “duplicate … key” message for each storage kind and across inheritance. |
| src/puya/awst/validation/storage.py | Implements duplicate storage-definition key detection (global/local/box) during AWST validation. |
| changelog.d/20260227_085026_joepolny_error_on_key_collision.md | Documents the fix in the “Fixed” section. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Now that there is validation against duplicate keys, we need split out these test cases
776bf29 to
fbf1c78
Compare
| self.first = GlobalState(UInt64, key=b"dup") ## N: previous definition of global state key was here | ||
| self.second = GlobalState(UInt64, key=b"dup") ## E: duplicate global state key b'dup' |
There was a problem hiding this comment.
If a developer chooses to have duplicate keys for some reason I don't think an error is the way to go (a warning feels ok tbh).
This is different from automatically assigned keys conflicting with each other. In that case we should either error or switch to a different key-derivation strategy.
There was a problem hiding this comment.
It's never the desired behavior to have two properties with the same key. The values would just overwrite each other but that is not clear from the POV of the high-level language. TEALScript did have a allowPotentialCollisions option that allowed the developers to override this behavior so maybe we could add that option, but I think error by default is the way to go
There was a problem hiding this comment.
It's never the desired behavior to have two properties with the same key.
We have used this before in testing, to effectively treat a storage value like a C union. A bit contrived perhaps, but it was useful at the time.
Another use case is for contract upgrades (ignore for now that mutable contracts are not recommended), when you want to migrate data between versions, you might want to read the old value as a certain type and write it back out as the new type (with alterations). I can see you're only looking at definitions in this PR, so you could still do this with stack-only definitions - and that would be the preferred approach.
Overall I agree that the most likely case is that a duplicate key is an error, but without a way to suppress a compiler error(*), I'm not sure we should treat it as such unconditionally. If we make it a warning, then we have the -Werror option that can be utilised - and there is a plan to add specific warning suppressions.
(*) There's no plan to add the ability to suppress a compiler error, for hopefully obvious reasons.
There was a problem hiding this comment.
Ah I didn't think about updates. I still feel pretty strongly that it should be an error out of the box with a flag to disable like TEALScript. As a developer myself and someone that has worked with other people writing contracts, I have written and seen multiple accidental collisions occur. I have never personally seen it done on purpose. Having collisions can lead to severe vulnerabilities that are difficult to debug, which is why I think an error is justified
There was a problem hiding this comment.
Yeah as long as there's a way around it for those edge cases, I think definitive key collisions being an error is reasonable. Technically a breaking change though to make it an error, maybe we should make it default to a warning to start and flip it to error in the next breaking change release?
| self.first = BoxMap(UInt64, UInt64, key_prefix=b"dup") ## N: previous definition of box key or prefix was here | ||
| self.second = BoxMap(UInt64, UInt64, key_prefix=b"dup") ## E: duplicate box key or prefix b'dup' |
There was a problem hiding this comment.
Should we detect a Box instance with a key that shares a prefix with a declared BoxMap? That feels more error prone than explicitly deciding two global states should have the same state key.
There was a problem hiding this comment.
Yeah because a common pattern is to fetch all boxes with a prefix and treat them as a map on the client side. Having a key collide with a prefix would result in a single box erroneously being seen as a value in the map despite being completely separate.
There was a problem hiding this comment.
I see that the code already supports that. We should add a test for this case :)
| class BaseContract(ARC4Contract): | ||
| def __init__(self) -> None: | ||
| self.base_counter = GlobalState(UInt64, key=b"dup") ## N: previous definition of global state key was here | ||
|
|
||
|
|
||
| class Contract(BaseContract): | ||
| def __init__(self) -> None: | ||
| super().__init__() | ||
| self.child_counter = GlobalState(UInt64, key=b"dup") ## E: duplicate global state key b'dup' |
There was a problem hiding this comment.
I don't think that explicit keys should be the preferred way to write this code. This is tempting a developer to cause this error :(
There was a problem hiding this comment.
Explicit keys are required to have shortened names on-chain (or you use shortened property names, but that hurts UX/readability). Contract inheritance with two global state values having the same key is the exact bug I ran into.
There was a problem hiding this comment.
In that case the issue is that the current key-derivation strategy we currently use (just use the property name) is insufficient!
I still stand on the side of "developers should not have to mangle global/local state keys most of the time".
There was a problem hiding this comment.
The problem is that you generally want short key names on-chain because they eat into the size budget you have for values. For boxes you are paying for those bytes and for global/local you are bloating the chain via deltas. We could have Puya try to come up with short keys, but I think this may be confusing especially if the same contract might generate different keys depending on the other contracts its composed with. I also think that is a lot more complex to implement.
I think the default approach of using the members names is good. It is a very safe default but its non-optimal. The question is then how do we ensure optimizations are done safely. I think developers defining keys manually but having guardrails to protect from foot guns is a good approach.
Also, regardless of the default strategy, defining explicit key names is a feature of Puya and we should try to make it as safe as possible.
| if kind_label is None: | ||
| return | ||
|
|
||
| existing = self._seen_definitions[defn.kind].get(defn.key.value) |
There was a problem hiding this comment.
For Boxes this should check if there's any other box with the same key or if there's any key_prefix that is a prefix of this key.
For BoxMap it should check if there are key_prefixes that are prefixes of each other.
There was a problem hiding this comment.
For BoxMap key is the prefix:
Line 2491 in e9dece6
There was a problem hiding this comment.
What I mean is that Box(key="my_key") should conflict with BoxMap(key_prefix="my_")!
There was a problem hiding this comment.
Ahh yeah good call. Although I'm not sure how we'd do that here since AppStorageDefinition.key is ambiguous. It could be either a prefix or a key.
There was a problem hiding this comment.
If it's a map then AppStorageDefinition.key_wtype will be set.
That said, overlaps between individual box keys and box-map key-prefixes is not a guarantee of a collision. We can consider fixed sized key types as the most basic example, in the above if BoxMap(key_prefix="my_") has keys of FixedBytes[2] then it can't overlap with Box(key="my_key").
It's even more complicated with dynamic length keys.
Finally, even if there is a chance of overlap just based on types, there may be application specific reasons there won't be any overlap due to certain key values not being possible. Trivial example: the key is a String, but is always checked to be at least a certain minimum size - we don't have a type that encodes such a constraint.
I think maybe anything involving potential key-prefix vs key or key-prefix vs key-prefix overlap should be a warning at most, it gets complicated otherwise.
There was a problem hiding this comment.
If it's a map then AppStorageDefinition.key_wtype will be set.
Ah nice yeah we can use that.
I think maybe anything involving potential key-prefix vs key or key-prefix vs key-prefix overlap should be a warning at most, it gets complicated otherwise.
I don't think the complexity should impact whether it's a warning or error. I think it MUCH more desirable to throw an error too often rather than not enough. It's very rare developers overlap keys intentionally. If they do, having to set a single flag once seems like a reasonable trade-off.
It's hard to have 100% certainty with this and we may miss some edge cases along the way, but preventing even just one vulnerability in a smart contract makes it worth it imo.
fwiw TEALScript has quite a complex prefix vs key vs same-size types vs dynamic type collision detection that throws an error. No one has ever complained about it and it has saved me before.
There was a problem hiding this comment.
I don't think the complexity should impact whether it's a warning or error.
I agree, it should be based on the robustness of the check. When the code is otherwise compilable, an error should be reserved for cases of absolute certainty that there is a problem. Warnings should be treated as a strong caution - contract authors reviewing any compilation warnings before deployment should be treated as a security critical activity.
fwiw TEALScript has quite a complex prefix vs key vs same-size types vs dynamic type collision detection that throws an error. No one has ever complained about it and it has saved me before.
As it stands, the current logic in this PR will definitely have trivial false positives and also false negatives, so if we wanted to get a small and simple fix in, we could just ignore maps for now. Alternatively we handle it in one go, in which case maybe you can import the logic from TealScript into this PR so we can see how it looks in context?
| if kind_label is None: | ||
| return | ||
|
|
||
| existing = self._seen_definitions[defn.kind].get(defn.key.value) |
There was a problem hiding this comment.
minor: usually we would use setdefault for this pattern - then check if existing is not defn, this way the key -> value mapping is not duplicated, and no need for an else branch
| location=existing.source_location, | ||
| ) | ||
| logger.error( | ||
| f"duplicate {kind_label} {defn.key.value!r}", |
There was a problem hiding this comment.
I would remove the {defn.key.value!r} part from the error message, as it may not render well in non UTF-8 cases. Additionally when the error is logged the output includes the source at the associated source location which should include the key.
| kind_labels = { | ||
| AppStorageKind.app_global: "global state key", | ||
| AppStorageKind.account_local: "local state key", | ||
| AppStorageKind.box: "box key or prefix", | ||
| } | ||
| kind_label = kind_labels.get(defn.kind) | ||
| if kind_label is None: | ||
| return |
There was a problem hiding this comment.
| kind_labels = { | |
| AppStorageKind.app_global: "global state key", | |
| AppStorageKind.account_local: "local state key", | |
| AppStorageKind.box: "box key or prefix", | |
| } | |
| kind_label = kind_labels.get(defn.kind) | |
| if kind_label is None: | |
| return | |
| match defn.kind: | |
| case AppStorageKind.app_global: | |
| kind = "global state" | |
| case AppStorageKind.account_local: | |
| kind = "local state" | |
| case AppStorageKind.box: | |
| kind = "box" | |
| case unexpected: | |
| typing.assert_never(unexpected) | |
| if defn.key_wtype: | |
| kind_label = f"{kind} prefix" | |
| else: | |
| kind_label = f"{kind} key" |
Can you clarify what this means? I'm not sure how it applies to the validation being done in this PR |
This PR is only checking for explicit key/prefix collisions. Another type of collision can occur with map keys. For example, Map key collisions are definitely more of an edge case, but this is the kind of domain where one exploited edge case can lead to millions of dollars lost. |
|
An aside from something I've discussed w/Joe yesterday: What should we do when diamond inheritance means there are two paths to a given contract state variable? If we adhere to python semantics this means that there's a single variable (since regular python objects behave like dicts). OTOH slot-based classes error when this happens: $ python3
Python 3.9.6 (default, Jan 9 2026, 11:03:41)
[Clang 17.0.0 (clang-1700.6.4.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A():
... def __init__(self):
... self.x = 1
... __slots__ = ('x',)
...
>>> class B:
... def __init__(self):
... self.x = 2
... __slots__ = ('x',)
...
>>> class C(A, B):
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: multiple bases have instance lay-out conflictOur named tuples and arc4 structs are basically slot-like classes (but we don't admit inheritance for them). Our contracts are mostly dict-like. |
This adds logic to the awst validation step that ensures there are no colliding state keys. It should be noted that collisions are still possible with dynamic types and same-sized static types, which is out of scope for this PR.
Closes algorandfoundation/puya-ts#335