We are currently capturing the number of clients observed in the IPFS public DHT network and we report this as part of our weekly reports (currently in this repo - see example for Week 17 as well as at probelab.io: https://probelab.io/ipfsdht/#client-vs-server-node-estimate.
As per this discussion thread in Slack, this is great, but only captures part of the story, i.e., it focuses on the public IPFS DHT only, which in turn, means that it is mostly focusing on Kubo. However, IPFS is more than the kubo implementation and more than the public IPFS DHT. A request from @BigLep is to be able to "show the number of peer ids observed across various "networks" and break out by implementation".
In order to go about doing this, we'd need to identify data sources (i.e., how to collect the data) from different: i) IPFS implementations (e.g., Kubo, Helia, Iroh), and ii) networks that run IPFS nodes (e.g., the IPFS DHT, the Lotus DHT, cid.contact/IPNI, etc). We should also ideally deduplicate the PeerIDs to avoid double-counting a peer that participates in more than one network (?).
I'm starting this issue to capture first what we want to target and then come up with data collection ideas (e.g., through measurement tools, logs etc.).
cc: @BigLep @dennis-tra