IPIP-512: Limit Identity CID Size to 128 Bytes in UnixFS Contexts #512
Open: lidel wants to merge 1 commit into `main` from `doc/identity-cid-size-limit`
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
--- | ||
title: "IPIP-0512: Limit Identity CID Size to 128 Bytes in UnixFS Contexts" | ||
date: 2025-09-09 | ||
ipip: proposal | ||
editors: | ||
- name: Marcin Rataj | ||
github: lidel | ||
affiliation: | ||
name: Interplanetary Shipyard | ||
url: https://ipshipyard.com/ | ||
relatedIssues: | ||
- https://github.com/ipfs/boxo/pull/1018 | ||
- https://github.com/multiformats/cid/issues/21 | ||
- https://github.com/multiformats/multihash/issues/130 | ||
thanks: | ||
- name: Rod Vagg | ||
github: rvagg | ||
- name: Volker Mische | ||
github: vmx | ||
- name: Alex Potsides | ||
github: achingbrain | ||
affiliation: | ||
name: Interplanetary Shipyard | ||
url: https://ipshipyard.com/ | ||
order: 512 | ||
tags: ['ipips'] | ||
--- | ||
|
||
## Summary | ||
|
||
This IPIP establishes a 128-byte maximum digest size limit for identity CIDs (multihash code `0x00`) in UnixFS contexts to prevent abuse and clarify appropriate usage boundaries. | ||
|
||
## Motivation | ||
|
||
Identity CIDs are unique in that they inline data directly into the CID itself rather than hashing it. Without clear limits, this creates several problems: | ||
|
||
1. **Resource Exhaustion**: Poorly written clients could encode large payloads as identity CIDs and propagate them through the network, consuming bandwidth and resources without providing value. | ||
|
||
2. **Security Vulnerabilities**: Identity CIDs provide no integrity verification and are vulnerable to bit flips. Large identity CIDs amplify this risk. | ||
|
||
3. **Unclear Boundaries**: The ecosystem lacks clear guidelines on when identity CIDs are appropriate, leading to potential misuse. | ||
|
||
4. **CIDs as Data Containers**: Without limits, identity CIDs could embed arbitrary amounts of data, effectively turning CIDs from content addresses into data containers. | ||
|
||
As discussed in [ipfs/boxo#1018](https://github.com/ipfs/boxo/pull/1018), the community consensus is that large identity CIDs are problematic and a reasonable limit is needed. | ||
|
||
## Detailed design | ||
|
||
This IPIP adds a new section to the UnixFS specification documenting the 128-byte digest size limit for identity CIDs: | ||
|
||
### Changes to UnixFS Specification | ||
|
||
Add new section "Identity CID Size Limit" that specifies: | ||
|
||
- Identity CIDs (multihash code `0x00`) are experimental and limited to a 128-byte digest size | ||
- Implementations MUST NOT produce identity CIDs with digests exceeding 128 bytes | ||
- Implementations MUST reject identity CIDs exceeding 128 bytes when reading | ||
- Implementations SHOULD automatically convert to regular blocks if data modifications would exceed the limit | ||
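
A minimal sketch of the read-side check described above, assuming a base32 CIDv1 string as input. The names `check_identity_cid`, `read_varint`, and `MAX_IDENTITY_DIGEST_SIZE` are hypothetical and are not taken from boxo, Kubo, or Helia; a real implementation would rely on its existing CID and multihash libraries.

```python
import base64

MAX_IDENTITY_DIGEST_SIZE = 128  # limit proposed by this IPIP
IDENTITY_MULTIHASH_CODE = 0x00  # multihash code for "identity"


def read_varint(buf, pos):
    """Decode an unsigned varint at buf[pos:]; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return value, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long")


def check_identity_cid(cid):
    """Raise ValueError if cid is an identity CID whose digest exceeds the limit."""
    # Assumption: only multibase 'b' (base32 lower) CIDv1 strings are handled here.
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 CIDv1 strings")
    body = cid[1:].upper()
    raw = base64.b32decode(body + "=" * (-len(body) % 8))

    version, pos = read_varint(raw, 0)
    if version != 1:
        raise ValueError("expected CIDv1")
    _codec, pos = read_varint(raw, pos)
    mh_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)

    if digest_len != len(raw) - pos:
        raise ValueError("declared digest length does not match payload")
    if mh_code == IDENTITY_MULTIHASH_CODE and digest_len > MAX_IDENTITY_DIGEST_SIZE:
        raise ValueError(
            f"identity digest is {digest_len} bytes, limit is {MAX_IDENTITY_DIGEST_SIZE}"
        )
```

The size check reads the declared length from the multihash header, so it can run before any buffer sized from that length is allocated.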
|
||
### Test Fixtures | ||
|
||
Add invalid test case for a 129-byte identity CID that implementations MUST reject. | ||
|
||
## Design rationale | ||
|
||
The 128-byte limit was chosen based on several factors: | ||
|
||
1. **Alignment with Existing Constraints**: The limit matches `DefaultMaxDigestSize` already used for cryptographic hashes in the ecosystem. 128 bytes is a sensible limit that accommodates the digest sizes of the longest popular hash functions (e.g., SHA-512 produces 64-byte digests), while preventing unbounded growth. | ||
|
||
2. **Community Consensus**: Key maintainers expressed support for this limit: | ||
- [@rvagg](https://github.com/ipfs/boxo/pull/1018#issuecomment-3240647923): "128 seems reasonable to me. I'm happy to have them squished down their happy-path use to a size where they're more likely being used for their size-saving utility" | ||
- [@vmx](https://github.com/ipfs/boxo/pull/1018#issuecomment-3241779136): "I'm not a fan of large identity CIDs... 128 bytes sound reasonable to me" | ||
- [@achingbrain](https://github.com/ipfs/boxo/pull/1018#discussion_r2318132492): "It looks fine at first glance 👍" (confirming Helia compatibility) | ||
|
||
3. **Practical Usage**: 128 bytes is sufficient for legitimate use cases (small inline data) while preventing abuse. | ||
|
||
4. **Implementation Precedent**: This limit has been implemented and tested in [ipfs/boxo#1018](https://github.com/ipfs/boxo/pull/1018) and included in Kubo 0.38 RC1 for broader testing. | ||
|
||
### User benefit | ||
|
||
- **Protection from Resource Exhaustion**: Users are protected from malicious or poorly-written clients that might otherwise propagate large identity CIDs. | ||
- **Clear Guidelines**: Developers have explicit boundaries for appropriate identity CID usage. | ||
- **Consistent Behavior**: All conforming implementations will handle identity CIDs consistently. | ||
- **No Wasted Resources**: Avoids unnecessary roundtrips where clients send data to remote services only to have the deserialized bytes sent back, when the client already had the data and could have avoided the entire network operation. | ||
|
||
### Compatibility | ||
|
||
Identity CIDs have always been marked as experimental, and this change does not impact users who used default settings in software like Kubo or Helia, which never produced identity CIDs by default. | ||
|
||
This is a breaking change only for existing identity CIDs with digest sizes exceeding 128 bytes. However: | ||
|
||
- Existing valid identity CIDs (≤128 bytes) remain unaffected | ||
- The change has been tested in Kubo 0.38 RC1 to gather feedback | ||
- Most users are unaffected as identity CIDs require explicit opt-in | ||
|
||
Implementations upgrading to support this IPIP will need to: | ||
1. Add validation to reject oversized identity CIDs when reading | ||
2. Prevent creation of identity CIDs exceeding the limit | ||
3. Consider automatic conversion to regular blocks when data grows | ||
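
Steps 2 and 3 can be sketched as a single write-side decision: inline the data only while it fits, otherwise fall back to a hashed block. This is an illustration under stated assumptions, not how any existing implementation does it. `multihash_for` and `allow_inline` are hypothetical names, and SHA2-256 (multihash code `0x12`) is assumed as the fallback hash.

```python
import hashlib

MAX_IDENTITY_DIGEST_SIZE = 128  # limit proposed by this IPIP
IDENTITY_CODE = 0x00            # multihash code for "identity"
SHA2_256_CODE = 0x12            # multihash code for sha2-256 (assumed fallback)


def encode_varint(n):
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def multihash_for(data, allow_inline=True):
    """Return multihash bytes for data, inlining only when within the limit."""
    if allow_inline and len(data) <= MAX_IDENTITY_DIGEST_SIZE:
        # Identity multihash: the "digest" is the data itself.
        return encode_varint(IDENTITY_CODE) + encode_varint(len(data)) + data
    # Data is too large to inline (or inlining is disabled): hash it instead.
    digest = hashlib.sha256(data).digest()
    return encode_varint(SHA2_256_CODE) + encode_varint(len(digest)) + digest
```

Calling `multihash_for` again after each modification gives the conversion behavior: once the data grows past 128 bytes, the result switches from an identity multihash to a SHA2-256 one, and the caller then has to store the data as a regular block.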
|
||
### Security | ||
|
||
This change improves security by: | ||
|
||
1. **Preventing Unbounded Resource Consumption**: Limits the amount of data that can be inlined in CIDs | ||
2. **Reducing Attack Surface**: Smaller identity CIDs reduce the impact of bit flip vulnerabilities | ||
3. **Clear Security Boundaries**: Explicit limits help security audits and threat modeling | ||
4. **Mitigating Known Vulnerabilities**: The go-car library previously had a vulnerability ([GHSA-9x4h-8wgm-8xfg](https://github.com/ipld/go-car/security/advisories/GHSA-9x4h-8wgm-8xfg)) where decoding user-controlled identity CIDs could cause excessive memory allocation, leading to denial of service. While go-car mitigated this by capping allocations at 1MiB, establishing a 128-byte limit at the UnixFS specification level ensures all implementations are protected from this class of vulnerabilities by default. | ||
|
||
### Alternatives | ||
|
||
Several alternatives were considered: | ||
|
||
1. **No Limit**: Rejected due to resource exhaustion and abuse potential | ||
2. **Smaller Limit (32-64 bytes)**: Would break more existing use cases | ||
3. **Larger Limit (256+ bytes)**: As noted by @rvagg, "the higher you go, the harder it is to justify their use" | ||
4. **Complete Deprecation**: Too disruptive; identity CIDs have legitimate uses for tiny data | ||
|
||
## Test fixtures | ||
|
||
### Valid Identity CID (128 bytes) | ||
|
||
- CID: `bafkqbaabijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbeeqscijbee` | ||
- Content: 128 'B' characters | ||
- Expected: Implementations MUST accept this CID | ||
|
||
### Invalid Identity CID (129 bytes) | ||
|
||
- CID: `bafkqbaibifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqkbifaucqi` | ||
- Content: 129 'A' characters | ||
- Expected: Implementations MUST reject this CID with an appropriate error | ||
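
The two fixtures can be regenerated with a short script. This is a sketch that assumes CIDv1, the `raw` codec (`0x55`), and lowercase base32 multibase, which matches the `bafkq` prefix of the CIDs above; `identity_cid` and `encode_varint` are hypothetical helper names.

```python
import base64


def encode_varint(n):
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def identity_cid(payload):
    """Build a base32 CIDv1 (raw codec) that inlines payload via identity multihash."""
    raw = (
        bytes([0x01, 0x55, 0x00])      # CID version 1, raw codec, identity multihash
        + encode_varint(len(payload))  # digest length
        + payload                      # inlined data
    )
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


valid_fixture = identity_cid(b"B" * 128)    # begins with "bafkqbaab"
invalid_fixture = identity_cid(b"A" * 129)  # begins with "bafkqbaib"
print(valid_fixture)
print(invalid_fixture)
```

The only difference in the header is the length varint: `0x80 0x01` for 128 bytes and `0x81 0x01` for 129 bytes.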
|
||
### Copyright | ||
|
||
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). |
If users do have identity CIDs, what is the recommendation for converting them to non-identity CIDs if we block on read?
These details are left up to implementation, but as an example:
--hash=identity