| 1 | +--- |
| 2 | +title: "Maven @ IPFS" |
| 3 | +date: 2026-01-22T10:55:53+01:00 |
| 4 | +draft: false |
| 5 | +authors: cstamas |
| 6 | +author: cstamas ([@cstamas](https://bsky.app/profile/cstamas.bsky.social)) |
| 7 | +categories: |
| 8 | + - Blog |
| 9 | +tags: |
| 10 | + - maven |
| 11 | + - ipfs |
| 12 | +projects: |
| 13 | + - Maven |
| 14 | +--- |
| 15 | + |
| 16 | +Lately I have been toying with IPFS to achieve content sharing without centralized infrastructure. In other words, instead |
| 17 | +of free-riding on some (centralized) infrastructure that may be a public good or a commercial offering, I want to solve publishing |
| 18 | +(and also "owning" and "hosting" the data) by myself. This kinda reminded me of the early 2000s, when many ran |
| 19 | +Maven repositories in their own basements, making access to their repositories fragile, with longevity, accessibility |
| 20 | +and uptime totally unpredictable. That was before Central, which consolidated things but also turned itself into a single point of |
| 21 | +failure. And hence, it ended up over- and misused, see [Maven Central and the Tragedy of the Commons](https://www.sonatype.com/blog/maven-central-and-the-tragedy-of-the-commons). |
| 22 | + |
| 23 | +Hence, I wanted a plan B for our Maveniverse organization. What if we, aside from continuing to publish to Central, offer |
| 24 | +other ways to get artifacts for those interested in them, let them keep the artifacts in whatever way they want (not possible with Central, |
| 25 | +where mirroring is still not an option), and give them full access to the artifacts (i.e. indexing, scanning, whatever)? |
| 26 | + |
| 27 | +## Enter IPFS |
| 28 | + |
| 29 | +I will not spend much time explaining IPFS, as it already has nice [documentation](https://docs.ipfs.tech/). |
| 30 | +If nothing else, at least read [this page](https://docs.ipfs.tech/concepts/ipfs-solves/). |
| 31 | + |
| 32 | +In short, IPFS is a decentralized network of IPFS nodes that implements Content Addressable Storage (CAS) and |
| 33 | +more. The first term worth knowing is [CID](https://docs.ipfs.tech/concepts/content-addressing/#what-is-a-cid), |
| 34 | +which, in very simplified form, can be understood as a "hash" (derived solely from the content) pointing to some |
| 35 | +content (any content). The content behind a CID can be one of several things (but once set, it is immutable): it can |
| 36 | +be the contents of a JAR file, or in fact of any file, but it can also be a DAG backed by an IPLD schema [file system](https://docs.ipfs.tech/concepts/file-systems/). |
| 37 | +This means that a CID may even point to a "file hierarchy", like on Linux systems, with directories and files and everything. |
| 38 | +These structures are colloquially called "Merkle Trees", and are built "bottom up", from the leaves (files) toward the root. |
| 39 | +Hence, in case of a change (e.g. another file added to a directory), the CIDs of existing files remain unchanged, but because their |
| 40 | +shared parent directory changed, its CID and the root CID will change. |
| 41 | +Producing a CID is free for everyone: just run an IPFS node and upload some content to it, and you will get a CID for it. |
| 42 | +An important thing to consider is that if two people independently upload some content and both end up with the same CID |
| 43 | +(with some fine print, more on that later), it means they both independently published bit-by-bit the _same content_ (IPFS is |
| 44 | +content addressable). |
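
The "bottom up" behaviour described above can be illustrated with a toy Merkle tree. This is only a sketch under loud assumptions: it uses plain SHA-256 hex digests as stand-ins for CIDs, and the helper names `file_id` and `dir_id` are made up for illustration. Real CIDs are computed differently (multihash, codecs, chunking), so the values produced here are not valid CIDs.

```python
import hashlib


def file_id(content: bytes) -> str:
    """Toy stand-in for a file CID: SHA-256 of the raw bytes (NOT a real CID)."""
    return hashlib.sha256(content).hexdigest()


def dir_id(entries: dict[str, str]) -> str:
    """Toy stand-in for a directory CID: hash over the sorted (name, child id) pairs."""
    h = hashlib.sha256()
    for name in sorted(entries):
        h.update(f"{name}={entries[name]}\n".encode())
    return h.hexdigest()


# A tiny "repository": one directory holding one artifact.
jar_v1 = file_id(b"contents of app-1.0.jar")
root_before = dir_id({"app-1.0.jar": jar_v1})

# Adding a second artifact leaves the first leaf id untouched, but changes the root id.
jar_v2 = file_id(b"contents of app-1.1.jar")
root_after = dir_id({"app-1.0.jar": jar_v1, "app-1.1.jar": jar_v2})

# Two independent "publishers" hashing the same bytes arrive at the same id.
same_leaf = file_id(b"contents of app-1.0.jar") == jar_v1
```

The leaf id depends only on the file bytes, while the directory id depends on its children, which is why a change anywhere propagates up to the root but never sideways to existing leaves.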
| 45 | + |
| 46 | +Having the CID is like having the "address", but where is the content behind the address? For a start, it is on the node |
| 47 | +you uploaded the content to in order to get the CID. With "pinning", you can make your node pull down the content behind any CID |
| 48 | +(assuming it is reachable). Basically you maintain your node, letting it hold on to the content you want/need, and merely |
| 49 | +cache the rest (IPFS node storage performs regular garbage collection, dropping unreachable or stale content). Furthermore, |
| 50 | +there are (paid or free) services for "pinning content". By using those services, you can make sure content is "pinned" |
| 51 | +and swiftly served to any node wanting it. But this is out of scope for this article. |
| 52 | + |
| 53 | +Another term is [IPNS](https://docs.ipfs.tech/concepts/ipns/), which is like a "mutable CID" (maybe think of it like DNS). Creating an IPNS entry |
| 54 | +requires a private cryptographic key, and each key can produce _one IPNS entry_. At the same time, one node (or user) |
| 55 | +can manage as many keys as they want. And this is the important thing: as I explained above, anyone can create a CID, but if a |
| 56 | +consumer wants to know "which CID did you publish", all they have to do is resolve the IPNS entry to get the right CID, as the IPNS |
| 57 | +entry is a function of the private key and cannot be faked. Neither CID nor IPNS is "human friendly", kinda, but there |
| 58 | +are solutions for that like [DNSLink](https://dnslink.dev/), where IPNS can be exposed via DNS and a "human friendly" domain. |
| 59 | + |
| 60 | +{{% pageinfo color="info" %}} |
| 61 | + |
| 62 | +I have to note several key things to reassure IPFS users: |
| 63 | +* IPFS works in a similar fashion to BitTorrent: it uses a DHT and various other means to let nodes discover each other. In short, |
| 64 | + the more nodes the merrier. In other words, "popular" content may be cached on multiple nodes, and hence will be |
| 65 | + faster to fetch. |
| 66 | +* IPFS nodes take part in directing traffic (i.e. passing messages) among each other, telling about discovered nodes, |
| 67 | + and offering local content, if asked for. |
| 68 | +* An important thing to note is that if you run your own node, **it will store only the content you tell it to store** |
| 69 | + (by locally pushing or pinning); it will NOT store random content from the internet. |
| 70 | +* There is pretty much nothing needed on your side (network setup or the like) to publish on IPFS. |
| 71 | + |
| 72 | +{{% /pageinfo %}} |
| 73 | + |
| 74 | +## Rounding it up |
| 75 | + |
| 76 | +So what does this give us? |
| 77 | +* A CID points to content/structure and is immutable (the same CID will always return the same content, if the content is accessible) |
| 78 | +* IPNS points to the "up to date" CID, and you can be sure the entry was published by the key owner and nobody else |
| 79 | +* DNS points to IPNS, and again, assuming you trust the domain owner, you can then delegate your trust to the IPNS entry it points to. This trust delegation |
| 80 | + is very similar to Central, where you need to provide proof to get a publishing namespace (which is ideally a reversed domain). |
| 81 | + |
| 82 | +In short, we have a series of indirections: `domain -> IPNS -> CID`. If you get to the CID by hopping over these stops, all is |
| 83 | +fine. But what happens if your private key (used for publishing IPNS) is compromised? Just create a new key, republish the |
| 84 | +content with it, and update the DNSLink for your domain (and of course, communicate it). After all, we can still GPG sign |
| 85 | +artifacts, so IPNS + GPG is good enough. |
| 86 | + |
| 87 | +An example of this setup can be seen at [ipfs.maveniverse.eu](https://ipfs.maveniverse.eu/). |
| 88 | + |
| 89 | +Details: |
| 90 | +* the `ipfs.maveniverse.eu` domain uses DNSLink to publish a TXT record with the IPNS entry (try `dig _dnslink.ipfs.maveniverse.eu TXT`) |
| 91 | +* the IPNS entry is in the form of `/ipns/xxxxxxxxxx` |
| 92 | +* the IPFS node can resolve the `/ipns/xxxxx` address to an `/ipfs/xxxxx` CID. |
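
The first hop of the `domain -> IPNS -> CID` chain can be sketched as parsing the TXT value that `dig` returns. This is a minimal sketch, assuming the usual DNSLink value shape `dnslink=/<namespace>/<identifier>` and accepting only the `ipns` and `ipfs` namespaces. The names `DnsLink` and `parse_dnslink` are hypothetical, and the actual DNS lookup and the IPNS resolution (done by the IPFS node) are deliberately left out.

```python
from typing import NamedTuple


class DnsLink(NamedTuple):
    namespace: str  # "ipns" or "ipfs"
    value: str      # IPNS name for "ipns", CID for "ipfs"
    path: str       # optional remainder, "" if absent


def parse_dnslink(txt: str) -> DnsLink:
    """Parse a DNSLink TXT value such as 'dnslink=/ipns/<name>'."""
    txt = txt.strip().strip('"')  # dig prints TXT values wrapped in quotes
    prefix = "dnslink="
    if not txt.startswith(prefix):
        raise ValueError(f"not a DNSLink record: {txt!r}")
    parts = txt[len(prefix):].split("/", 3)
    # Expected shape: ['', namespace, value] plus an optional path element.
    if len(parts) < 3 or parts[0] != "" or parts[1] not in ("ipns", "ipfs") or not parts[2]:
        raise ValueError(f"unsupported DNSLink value: {txt!r}")
    path = "/" + parts[3] if len(parts) == 4 else ""
    return DnsLink(parts[1], parts[2], path)
```

For example, `parse_dnslink('"dnslink=/ipns/k51example"')` yields the namespace `ipns` and the placeholder value `k51example`, which would then be handed to an IPFS node for resolution to a CID.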
| 93 | + |
| 94 | +## Maven @ IPFS |
| 95 | + |
| 96 | +Maven release repositories seem like perfect candidates to be put onto IPFS: they are immutable. Or to be more precise, |
| 97 | +the leaves (artifacts) in a repository remain immutable, and deploying more artifacts changes only their parent and the parent's parents, |
| 98 | +so in essence the root CID changes. Aside from the parents, G and A level metadata changes as well, but those are not leaves. |
| 99 | + |
| 100 | +The Maveniverse [IPFS](https://github.com/maveniverse/ipfs) extension provides support for the setup above, and is even |
| 101 | +usable on CI to consume (and later to deploy). |
| 102 | + |
| 103 | +The extension requires Java 11+ and a reachable Kubo RPC API (simplest is to have it running on localhost), and |
| 104 | +adds the following components to Maven: |
| 105 | +* an IPFS transporter supporting `ipfs:/` URLs |
| 106 | +* IPFS publishing support via a lifecycle participant |
| 107 | + |
| 108 | +The IPFS URL looks like `ipfs:/name[/subpath]`, where the parts are: |
| 109 | +* the protocol `ipfs:/` is fixed and must be present |
| 110 | +* for consuming, the `name` element should be **resolvable**. It can be a CID, an IPNS name or a DNSLink-ed domain. |
| 111 | +* for deploying, the `name` element, aside from the above, should be the name of a **private key** present in the IPFS node used to publish IPNS |
| 112 | +* the optional `/subpath` defines the path prefix within `name` |
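
To make the URL shape concrete, here is a small sketch that splits `ipfs:/name[/subpath]` into its parts. It is not the extension's actual parsing code; the function name `parse_ipfs_url` and the `releases` subpath in the usage note are assumptions made for illustration only.

```python
def parse_ipfs_url(url: str) -> tuple[str, str]:
    """Split 'ipfs:/name[/subpath]' into (name, subpath); subpath is '' if absent."""
    scheme = "ipfs:/"
    if not url.startswith(scheme):
        raise ValueError(f"expected an {scheme} URL, got {url!r}")
    rest = url[len(scheme):]
    # Everything up to the first '/' is the name, the remainder is the subpath.
    name, _, subpath = rest.partition("/")
    if not name:
        raise ValueError(f"missing name in {url!r}")
    return name, subpath.strip("/")
```

With this sketch, `parse_ipfs_url("ipfs:/ipfs.maveniverse.eu/releases")` returns `("ipfs.maveniverse.eu", "releases")`, and a URL without a subpath returns an empty string as the second element.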
| 113 | + |
| 114 | +I have to mention that when using a CID for `name`, it is the user's responsibility to _ensure_ the proper CID is used, since, as explained |
| 115 | +above, CIDs can be created by anyone, and the content may contain fake or even malicious artifacts. When using an IPNS record, |
| 116 | +it is similar: users have to ensure that they resolve the proper IPNS (but once trust is established, all is good). Finally, |
| 117 | +when using an (IPFS resolvable) domain, the same level of trust can be established as in the case of Central: one can |
| 118 | +safely assume that the domain owner publishes the right thing (same as on Central). |
| 119 | + |
| 120 | +A little digression here: in case `name` is a domain, I was thinking about **limiting** Maven @ IPFS to get only artifacts |
| 121 | +from the domain's namespace, for example `maveniverse.eu` should offer **only the `eu.maveniverse` namespace**. Any ideas welcome! |
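
One possible shape for such a check, purely as a sketch of the idea and not something the extension does today: reverse the domain labels to get the namespace, then accept only groupIds equal to it or nested below it. The function names `namespace_for` and `group_allowed` are hypothetical.

```python
def namespace_for(domain: str) -> str:
    """Reverse the DNS labels: 'maveniverse.eu' -> 'eu.maveniverse'."""
    return ".".join(reversed(domain.lower().strip(".").split(".")))


def group_allowed(domain: str, group_id: str) -> bool:
    """True if group_id is the domain's namespace or nested below it."""
    ns = namespace_for(domain)
    # Compare on a label boundary so 'eu.maveniversex' does not match 'eu.maveniverse'.
    return group_id == ns or group_id.startswith(ns + ".")
```

An open question this sketch does not answer is which domain to reverse: the DNSLink-ed host `ipfs.maveniverse.eu` would reverse to `eu.maveniverse.ipfs`, so some rule for dropping the leading host label would still be needed.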
| 122 | + |
| 123 | +{{% pageinfo color="info" %}} |
| 124 | + |
| 125 | +Important: the workflow Maven @ IPFS currently implements works at small scale, i.e. at Maveniverse forge level, which has a handful |
| 126 | +of megabytes of artifacts. Due to refresh/pinning (downloading the whole blob), the current workflow does not scale, but it works |
| 127 | +pretty nicely at small scale. |
| 128 | + |
| 129 | +{{% /pageinfo %}} |
| 130 | + |
| 131 | +## Mimir @ IPFS |
| 132 | + |
| 133 | +As mentioned above, in repositories published with Maven @ IPFS, the _leaves will not change_. That means that their |
| 134 | +CIDs remain unchanged. The next level would be _IPFS global caching_, for example with Mimir (which already offers a similar |
| 135 | +service on LAN using JGroups). Here, some translation needs to be done that begins with a GAV and ends with a CID. |
| 136 | + |
| 137 | +Once something is in place, I will report back! Cheers and have fun! |