Subdomain support for CIDs longer than 63 #7318

@lidel

I hoped to punt this until we need to switch away from sha256 in CIDs, but we may need to solve this problem sooner than expected, because ED25519 keys will become the new default soon (#6916).

Problem: DNS label limit of 63

RFC 1034: "each node has a label, which is zero to 63 octets in length"

The default CIDv1 in Base32 with a sha256 multihash and an RSA libp2p-key fits within the limit, but if we use an ED25519 libp2p-key then we are 2 characters over it.

A label longer than 63 characters means the hostname cannot be resolved:

$ ping bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk.ipns.dweb.link
ping: bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk.ipns.dweb.link: Name or service not known
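The failure above can be checked without touching DNS. A minimal sketch, assuming the only rule of interest is the RFC 1034 label limit; the helper name `oversized_labels` is hypothetical and not part of go-ipfs:

```python
MAX_DNS_LABEL = 63  # RFC 1034 limit, in octets


def oversized_labels(hostname: str, limit: int = MAX_DNS_LABEL) -> list[str]:
    """Return every label in `hostname` that exceeds the DNS label limit."""
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if len(label.encode("ascii")) > limit]


host = "bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk.ipns.dweb.link"
for label in oversized_labels(host):
    # The ED25519 key label is 65 characters, 2 over the limit.
    print(f"{len(label)} characters: {label}")
```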

And links are not picked up by tools like Slack:

[Screenshot: oops-2020-05-14--17-59-27]

Note: I used ED25519 as an example, but the problem is not limited to that single type of CID. Even if we find a way to fit ED25519 in a single label, the problem remains for CIDs with a multihash created with longer hash functions.
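To see where the limit bites, the length of a Base32 CIDv1 can be estimated from the digest size alone. This is a rough sketch, assuming every CID field (version, codec, hash function code, digest length) encodes as a one-byte varint, which holds for the three cases listed; the function name is hypothetical:

```python
def cidv1_base32_length(digest_len: int) -> int:
    """Length of a Base32 CIDv1 string for a digest of `digest_len` bytes.

    Assumes one-byte varints for version, codec, hash code and digest length.
    """
    cid_bytes = 4 + digest_len
    bits = cid_bytes * 8
    # Unpadded base32 carries 5 bits per character; add 1 for the "b" prefix.
    return 1 + (bits + 4) // 5


cases = {
    "sha2-256 (32-byte digest)": 32,
    "ED25519 key inlined via identity multihash (36 bytes)": 36,
    "sha2-512 (64-byte digest)": 64,
}
for name, digest_len in cases.items():
    length = cidv1_base32_length(digest_len)
    verdict = "fits" if length <= 63 else "too long for one label"
    print(f"{name}: {length} characters, {verdict}")
```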

Solved: IPNS-specific fix for ED25519 keys

In parallel to the generic fix, we could represent ED25519 keys in a way that fits under 63 characters, solving the UX issue for IPNS websites loaded from public gateways.

Done: #7441 – we support {cidv1base36}.ipns.dweb.link, which fits within a single label.
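Why Base36 helps can be shown by re-encoding the example key from above. This is only an illustrative sketch, not the go-ipfs implementation: it assumes the multibase prefixes `b` (base32) and `k` (base36, lowercase) and that leading zero bytes map to leading `0` digits; the function name is hypothetical.

```python
import base64

BASE36_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def base32_cid_to_base36(cid: str) -> str:
    """Re-encode a multibase base32 CID string as multibase base36."""
    if not cid.startswith("b"):
        raise ValueError("expected a multibase base32 string (prefix 'b')")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # b32decode requires padded input
    raw = base64.b32decode(body)

    # Leading zero bytes vanish in the integer conversion, so keep them
    # explicitly as leading "0" digits (assumed multibase behaviour).
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    number = int.from_bytes(raw, "big")
    digits = ""
    while number:
        number, rem = divmod(number, 36)
        digits = BASE36_ALPHABET[rem] + digits
    return "k" + "0" * leading_zeros + digits


b32 = "bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk"
b36 = base32_cid_to_base36(b32)
print(len(b32), "->", len(b36))
```

Base36 carries a little over 5.1 bits per character versus 5 for Base32, which is just enough to bring a 40-byte ED25519 CID under the 63-character limit.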

Open Problem: generic solution for long CIDs

I am happy to open a PR with a fix, but I am unsure whether I have the best fix in mind, and would love to gather feedback first.

❓ (A) support split CIDs (but have broken TLS)

The first idea I have is to split the label when the max is reached.
To maximize entropy for Origin isolation, the remainder should be on the left side:
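A minimal sketch of the split, assuming labels are filled from the right so that the label next to the gateway domain is always full and the remainder ends up leftmost; `split_cid_label` is a hypothetical helper name:

```python
MAX_DNS_LABEL = 63


def split_cid_label(cid: str, limit: int = MAX_DNS_LABEL) -> str:
    """Split `cid` into dot-separated labels, filling labels from the right."""
    labels = []
    rest = cid
    while len(rest) > limit:
        labels.insert(0, rest[-limit:])  # take a full label off the right end
        rest = rest[:-limit]
    labels.insert(0, rest)  # the remainder becomes the leftmost label
    return ".".join(labels)


cid = "bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk"
print(split_cid_label(cid) + ".ipns.dweb.link")
```

For the 65-character example this produces a 2-character label followed by a 63-character one.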

Pros:

  • 👍 each long CID gets own Origin – we keep isolation
  • 👍 path redirect provided by subdomain gateway can take care of splitting
  • 👍 future-proof solution for longer hashes such as sha2-512
    • the next limit is pretty far away: the maximum length of full domain name: 253 characters, including dots
    • sha2-512 on dweb.link is 121 characters

Cons:

  • 💢 decreased entropy in security guarantees provided by origin isolation
  • 💢 wildcard TLS certificate does not pass validation for more than a single level of labels
  • 💢 copying & pasting CID as-is no longer works on public gateways (user needs to put . in the middle etc)
    • Note: to make it easier UX-wise, we should allow . anywhere inside the CID, but internally merge the labels and return a redirect to the canonical version that splits at a deterministic position (enforcing the maximum label length for Origin).
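The merge-and-redirect behaviour from the note above could look roughly like this. It is a sketch under assumptions: the gateway knows its own suffix (for example `ipns.dweb.link`), the canonical form fills labels from the right, and both function names are hypothetical.

```python
from typing import Optional

MAX_DNS_LABEL = 63  # RFC 1034 label limit
MAX_DNS_NAME = 253  # full domain name limit, including dots


def canonical_host(host: str, gateway_suffix: str) -> str:
    """Merge the CID labels in `host` and re-split them deterministically."""
    suffix = "." + gateway_suffix
    if not host.endswith(suffix):
        raise ValueError(f"{host!r} is not under {gateway_suffix!r}")
    # Accept "." anywhere inside the CID by merging the labels back together.
    cid = host[: -len(suffix)].replace(".", "")
    if not cid:
        raise ValueError("no CID found in hostname")
    labels = []
    while len(cid) > MAX_DNS_LABEL:
        labels.insert(0, cid[-MAX_DNS_LABEL:])
        cid = cid[:-MAX_DNS_LABEL]
    labels.insert(0, cid)
    canonical = ".".join(labels) + suffix
    if len(canonical) > MAX_DNS_NAME:
        raise ValueError("hostname exceeds the 253-character limit")
    return canonical


def redirect_for(host: str, gateway_suffix: str) -> Optional[str]:
    """Return the canonical host to redirect to, or None if already canonical."""
    canonical = canonical_host(host, gateway_suffix)
    return None if canonical == host else canonical


cid = "bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk"
pasted = cid[:30] + "." + cid[30:] + ".ipns.dweb.link"  # user put "." mid-CID
print(redirect_for(pasted, "ipns.dweb.link"))
```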

❓ (B) redirect long CIDs to an "insecure" subdomain

This would make it possible for content to load, but longer CIDs would not get Origin isolation per CID.

To make this a bit clearer and more idiomatic, we could present this as a "cross origin resource sharing" endpoint that allows CORS requests, supports loading everything from a single origin, and has paths locked down in browsers as noted in ipfs/in-web-browsers#157.

Think in terms of

  • https://dweb.link/ipfs/superlongcid redirecting to https://cors.dweb.link/superlongcid
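The redirect rule might be sketched as follows. The `cors.` host and the target path shape come from the example above; everything else (the function name, handling only `/ipfs/` paths, keeping the subdomain redirect for short CIDs) is an assumption for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

MAX_DNS_LABEL = 63


def redirect_target(url: str, limit: int = MAX_DNS_LABEL) -> str:
    """Map a path-gateway /ipfs/ URL to a subdomain URL or the shared origin."""
    parts = urlsplit(url)
    segments = parts.path.split("/", 3)  # ["", "ipfs", "<cid>", "<rest>"]
    if len(segments) < 3 or segments[1] != "ipfs" or not segments[2]:
        raise ValueError("not an /ipfs/ path-gateway URL")
    cid = segments[2]
    rest = "/" + segments[3] if len(segments) > 3 else ""
    if len(cid) <= limit:
        # Short CID: keep per-CID Origin isolation on its own subdomain.
        host, path = f"{cid}.ipfs.{parts.netloc}", rest or "/"
    else:
        # Long CID: fall back to the shared, non-isolated origin.
        host, path = f"cors.{parts.netloc}", f"/{cid}{rest}"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


long_cid = "b" + "a" * 109  # stand-in with the length of a sha2-512 CID
print(redirect_target(f"https://dweb.link/ipfs/{long_cid}"))
```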

Pros:

  • 👍 does not break TLS wildcard certs (easy setup for gateway operators)
  • 👍 useful outside this problem: provides an idiomatic way of exposing a path gateway on subdomain gateways (for use when origin isolation is not needed)

Cons:

  • 💢 long CIDs don't get Origin isolation

❓ (C) swap DAG root with CID that uses shorter hash function

Pros:

  • 👍 "just works"

Cons:

  • 💢 decreased entropy
  • 💢 newly created root blocks need to be persisted somehow: if I bookmark the page loaded via shortened CID and then the root block gets garbage-collected, the address is dead.
    • potential fix: for interop, we could always create a redundant sha256 root block for every DAG that uses a longer hash function

❓ (D) leverage HTTP proxy mode (on localhost)

When the Gateway port is used as an HTTP proxy, the local client does not perform a DNS lookup; instead, the original URL is sent in the HTTP request to the proxy for processing.

Because the HTTP proxy IS the go-ipfs node in that scenario, it does not do a DNS lookup either: it extracts the original (long) CID and resolves it, without any involvement of DNS.

As long as user agents are not overzealous in validating URLs, this would allow for long (>63) CIDs on subdomains.
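In proxy mode the request line carries the absolute URL, so the CID can be read straight from it. A minimal sketch of that extraction, assuming an absolute-form request target and a known gateway suffix such as `ipns.localhost`; the function name is hypothetical and this is not how go-ipfs is implemented:

```python
from urllib.parse import urlsplit


def cid_from_proxy_request(request_line: str, gateway_suffix: str) -> str:
    """Extract the CID label(s) from an absolute-form proxy request line."""
    method, target, _version = request_line.split(" ", 2)
    # No DNS lookup happens here: the hostname is only parsed as text, so a
    # label longer than 63 characters is not a problem at this stage.
    host = urlsplit(target).hostname or ""
    suffix = "." + gateway_suffix
    if not host.endswith(suffix):
        raise ValueError(f"{host!r} is not under {gateway_suffix!r}")
    return host[: -len(suffix)]


cid = "bafzaajaiaejca4syrpdu6gdx4wsdnokxkprgzxf4wrstuc34gxw5k5jrag2so5gk"
line = f"GET http://{cid}.ipns.localhost:8080/ HTTP/1.1"
print(cid_from_proxy_request(line, "ipns.localhost"))
```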

This is important, because it enables the localhost gateway (used by Brave) to resolve long CIDs correctly without any additional hacks.

UX details tbd. This could be the solution for localhost gateway, but for public ones we still need something else.

Other ideas?

Would love to find a better way to work around this.

cc @aschmahmann @Stebalien ipfs/in-web-browsers#89

Labels: P1, kind/bug, status/in-progress, topic/cidv1b32, topic/ed25519, topic/gateway