Skip to content

fix: throw error on key collision#664

Open
joe-p wants to merge 4 commits intomainfrom
fix/error_on_key_collision
Open

fix: throw error on key collision#664
joe-p wants to merge 4 commits intomainfrom
fix/error_on_key_collision

Conversation

@joe-p
Copy link
Copy Markdown
Contributor

@joe-p joe-p commented Feb 27, 2026

This adds logic to the awst validation step that ensures there are no colliding state keys. It should be noted that collisions are still possible with dynamic types and same-sized static types, which is out of scope for this PR.

Closes algorandfoundation/puya-ts#335

This adds logic to the awst validation step that ensures there are no
colliding state keys. It should be noted that collisions are still
possible with dynamic types and same-sized static types, which is out of
scope for this PR
@engineering-ci
Copy link
Copy Markdown

Name Status O0 bytes O1 bytes O2 bytes O0 ops O1 ops O2 ops

@engineering-ci
Copy link
Copy Markdown

engineering-ci bot commented Feb 27, 2026

Checking stubs for changes and corresponding version bump in origin/fix/error_on_key_collision

Last puyapy release: v5.7.1
This branch stubs version: 3.5.0
Last released stubs version: 3.4.0
Main stubs version: 3.5.0

💡 Stub version change: 3.4.0 -> 3.5.0
✅ Stub files unchanged
✅ Stub check passed!

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds AWST validation to error on colliding app-storage definition keys so that duplicate global/local/box storage keys can’t silently overwrite each other during compilation.

Changes:

  • Track previously-seen AppStorageDefinition keys per storage kind and emit a compilation error on duplicates (with prior definition location logged).
  • Add regression tests covering duplicate global state, local state, box key, box prefix, and inheritance scenarios.
  • Add a changelog fragment documenting the new compilation error behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
tests/test_awst_validation.py New tests asserting compilation fails with a clear “duplicate … key” message for each storage kind and across inheritance.
src/puya/awst/validation/storage.py Implements duplicate storage-definition key detection (global/local/box) during AWST validation.
changelog.d/20260227_085026_joepolny_error_on_key_collision.md Documents the fix in the “Fixed” section.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Now that there is validation against duplicate keys, we need split out
these test cases
@joe-p joe-p force-pushed the fix/error_on_key_collision branch from 776bf29 to fbf1c78 Compare February 27, 2026 14:15
@joe-p joe-p marked this pull request as draft February 27, 2026 14:19
@engineering-ci
Copy link
Copy Markdown

engineering-ci bot commented Feb 27, 2026

Coverage

Tests Skipped Failures Errors Time
1492 3 💤 0 ❌ 0 🔥 13m 57s ⏱️

@joe-p joe-p marked this pull request as ready for review February 27, 2026 14:42
Comment on lines +7 to +8
self.first = GlobalState(UInt64, key=b"dup") ## N: previous definition of global state key was here
self.second = GlobalState(UInt64, key=b"dup") ## E: duplicate global state key b'dup'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a developer chooses to have duplicate keys for some reason I don't think an error is the way to go (a warning feels ok tbh).

This is different from automatically assigned keys conflicting with each other. In that case we should either error or switch to a different key-derivation strategy.

Copy link
Copy Markdown
Contributor Author

@joe-p joe-p Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's never the desired behavior to have two properties with the same key. The values would just overwrite each other but that is not clear from the POV of the high-level language. TEALScript did have a allowPotentialCollisions option that allowed the developers to override this behavior so maybe we could add that option, but I think error by default is the way to go

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's never the desired behavior to have two properties with the same key.

We have used this before in testing, to effectively treat a storage value like a C union. A bit contrived perhaps, but it was useful at the time.

Another use case is for contract upgrades (ignore for now that mutable contracts are not recommended), when you want to migrate data between versions, you might want to read the old value as a certain type and write it back out as the new type (with alterations). I can see you're only looking at definitions in this PR, so you could still do this with stack-only definitions - and that would be the preferred approach.

Overall I agree that the most likely case is that a duplicate key is an error, but without a way to suppress a compiler error(*), I'm not sure we should treat it as such unconditionally. If we make it a warning, then we have the -Werror option that can be utilised - and there is a plan to add specific warning suppressions.

(*) There's no plan to add the ability to suppress a compiler error, for hopefully obvious reasons.

Copy link
Copy Markdown
Contributor Author

@joe-p joe-p Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I didn't think about updates. I still feel pretty strongly that it should be an error out of the box with a flag to disable like TEALScript. As a developer myself and someone that has worked with other people writing contracts, I have written and seen multiple accidental collisions occur. I have never personally seen it done on purpose. Having collisions can lead to severe vulnerabilities that are difficult to debug, which is why I think an error is justified

Copy link
Copy Markdown
Contributor

@achidlow achidlow Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah as long as there's a way around it for those edge cases, I think definitive key collisions being an error is reasonable. Technically a breaking change though to make it an error, maybe we should make it default to a warning to start and flip it to error in the next breaking change release?

Comment on lines +37 to +38
self.first = BoxMap(UInt64, UInt64, key_prefix=b"dup") ## N: previous definition of box key or prefix was here
self.second = BoxMap(UInt64, UInt64, key_prefix=b"dup") ## E: duplicate box key or prefix b'dup'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we detect a Box instance with a key that shares a prefix with a declared BoxMap? That feels more error prone than explicitly deciding two global states should have the same state key.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah because a common pattern is to fetch all boxes with a prefix and treat them as a map on the client side. Having a key collide with a prefix would result in a single box erroneously being seen as a value in the map despite being completely separate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the code already supports that. We should add a test for this case :)

Comment on lines +45 to +53
class BaseContract(ARC4Contract):
def __init__(self) -> None:
self.base_counter = GlobalState(UInt64, key=b"dup") ## N: previous definition of global state key was here


class Contract(BaseContract):
def __init__(self) -> None:
super().__init__()
self.child_counter = GlobalState(UInt64, key=b"dup") ## E: duplicate global state key b'dup'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that explicit keys should be the preferred way to write this code. This is tempting a developer to cause this error :(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit keys are required to have shortened names on-chain (or you use shortened property names, but that hurts UX/readability). Contract inheritance with two global state values having the same key is the exact bug I ran into.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case the issue is that the current key-derivation strategy we currently use (just use the property name) is insufficient!
I still stand on the side of "developers should not have to mangle global/local state keys most of the time".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that you generally want short key names on-chain because they eat into the size budget you have for values. For boxes you are paying for those bytes and for global/local you are bloating the chain via deltas. We could have Puya try to come up with short keys, but I think this may be confusing especially if the same contract might generate different keys depending on the other contracts its composed with. I also think that is a lot more complex to implement.

I think the default approach of using the members names is good. It is a very safe default but its non-optimal. The question is then how do we ensure optimizations are done safely. I think developers defining keys manually but having guardrails to protect from foot guns is a good approach.

Also, regardless of the default strategy, defining explicit key names is a feature of Puya and we should try to make it as safe as possible.

if kind_label is None:
return

existing = self._seen_definitions[defn.kind].get(defn.key.value)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Boxes this should check if there's any other box with the same key or if there's any key_prefix that is a prefix of this key.

For BoxMap it should check if there are key_prefixes that are prefixes of each other.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For BoxMap key is the prefix:

"""for maps, this is the prefix"""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that Box(key="my_key") should conflict with BoxMap(key_prefix="my_")!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yeah good call. Although I'm not sure how we'd do that here since AppStorageDefinition.key is ambiguous. It could be either a prefix or a key.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a map then AppStorageDefinition.key_wtype will be set.

That said, overlaps between individual box keys and box-map key-prefixes is not a guarantee of a collision. We can consider fixed sized key types as the most basic example, in the above if BoxMap(key_prefix="my_") has keys of FixedBytes[2] then it can't overlap with Box(key="my_key").

It's even more complicated with dynamic length keys.

Finally, even if there is a chance of overlap just based on types, there may be application specific reasons there won't be any overlap due to certain key values not being possible. Trivial example: the key is a String, but is always checked to be at least a certain minimum size - we don't have a type that encodes such a constraint.

I think maybe anything involving potential key-prefix vs key or key-prefix vs key-prefix overlap should be a warning at most, it gets complicated otherwise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a map then AppStorageDefinition.key_wtype will be set.

Ah nice yeah we can use that.

I think maybe anything involving potential key-prefix vs key or key-prefix vs key-prefix overlap should be a warning at most, it gets complicated otherwise.

I don't think the complexity should impact whether it's a warning or error. I think it MUCH more desirable to throw an error too often rather than not enough. It's very rare developers overlap keys intentionally. If they do, having to set a single flag once seems like a reasonable trade-off.

It's hard to have 100% certainty with this and we may miss some edge cases along the way, but preventing even just one vulnerability in a smart contract makes it worth it imo.

fwiw TEALScript has quite a complex prefix vs key vs same-size types vs dynamic type collision detection that throws an error. No one has ever complained about it and it has saved me before.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the complexity should impact whether it's a warning or error.

I agree, it should be based on the robustness of the check. When the code is otherwise compilable, an error should be reserved for cases of absolute certainty that there is a problem. Warnings should be treated as a strong caution - contract authors reviewing any compilation warnings before deployment should be treated as a security critical activity.

fwiw TEALScript has quite a complex prefix vs key vs same-size types vs dynamic type collision detection that throws an error. No one has ever complained about it and it has saved me before.

As it stands, the current logic in this PR will definitely have trivial false positives and also false negatives, so if we wanted to get a small and simple fix in, we could just ignore maps for now. Alternatively we handle it in one go, in which case maybe you can import the logic from TealScript into this PR so we can see how it looks in context?

if kind_label is None:
return

existing = self._seen_definitions[defn.kind].get(defn.key.value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: usually we would use setdefault for this pattern - then check if existing is not defn, this way the key -> value mapping is not duplicated, and no need for an else branch

location=existing.source_location,
)
logger.error(
f"duplicate {kind_label} {defn.key.value!r}",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the {defn.key.value!r} part from the error message, as it may not render well in non UTF-8 cases. Additionally when the error is logged the output includes the source at the associated source location which should include the key.

Comment on lines +43 to +50
kind_labels = {
AppStorageKind.app_global: "global state key",
AppStorageKind.account_local: "local state key",
AppStorageKind.box: "box key or prefix",
}
kind_label = kind_labels.get(defn.kind)
if kind_label is None:
return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
kind_labels = {
AppStorageKind.app_global: "global state key",
AppStorageKind.account_local: "local state key",
AppStorageKind.box: "box key or prefix",
}
kind_label = kind_labels.get(defn.kind)
if kind_label is None:
return
match defn.kind:
case AppStorageKind.app_global:
kind = "global state"
case AppStorageKind.account_local:
kind = "local state"
case AppStorageKind.box:
kind = "box"
case unexpected:
typing.assert_never(unexpected)
if defn.key_wtype:
kind_label = f"{kind} prefix"
else:
kind_label = f"{kind} key"

@daniel-makerx
Copy link
Copy Markdown
Contributor

It should be noted that collisions are still possible with dynamic types and same-sized static types, which is out of scope for this PR.

Can you clarify what this means? I'm not sure how it applies to the validation being done in this PR

@joe-p
Copy link
Copy Markdown
Contributor Author

joe-p commented Mar 3, 2026

Can you clarify what this means? I'm not sure how it applies to the validation being done in this PR

This PR is only checking for explicit key/prefix collisions.

Another type of collision can occur with map keys. For example, uint64, AVMBytes, byte[8], (uint32,uint32) could all overwrite each other but it is not at all obvious to the developer writing in the high level language unless they are intmately familiar with how ABI encoding and box keys work. The same is true for prefixes. A BoxMap with a prefix could be overwritten by a single box of AVMBytes for example.

Map key collisions are definitely more of an edge case, but this is the kind of domain where one exploited edge case can lead to millions of dollars lost.

@iglosiggio
Copy link
Copy Markdown
Member

iglosiggio commented Mar 11, 2026

An aside from something I've discussed w/Joe yesterday: What should we do when diamond inheritance means there are two paths to a given contract state variable? If we adhere to python semantics this means that there's a single variable (since regular python objects behave like dicts).

OTOH slot-based classes error when this happens:

$ python3
Python 3.9.6 (default, Jan  9 2026, 11:03:41)
[Clang 17.0.0 (clang-1700.6.4.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A():
...     def __init__(self):
...         self.x = 1
...     __slots__ = ('x',)
...
>>> class B:
...     def __init__(self):
...         self.x = 2
...     __slots__ = ('x',)
...
>>> class C(A, B):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: multiple bases have instance lay-out conflict

Our named tuples and arc4 structs are basically slot-like classes (but we don't admit inheritance for them). Our contracts are mostly dict-like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Colliding state keys are not detected

5 participants