
bug: fs.ls seems to query too many blocks #260

Description

@SgtPooki

We need to investigate how wide DAGs are handled in @helia/verified-fetch and add some tests to ensure we're not unnecessarily loading child blocks (or can choose not to) when doing fs.ls.

This issue should probably belong in @helia/unixfs, but I'm opening it here because this is where the issue was first observed.

Potentially related to ipfs/js-ipfs-unixfs#437
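
A rough way to observe the behaviour, assuming the usual createHelia / @helia/unixfs / blockstore-core APIs (the CountingBlockstore wrapper below is purely illustrative): count how many blocks land in the blockstore while fs.ls iterates the ~10k entry directory discussed in the Slack log below.

```ts
import { createHelia } from 'helia'
import { unixfs } from '@helia/unixfs'
import { MemoryBlockstore } from 'blockstore-core'
import { CID } from 'multiformats/cid'

// illustrative wrapper: count every block that gets written locally while listing
class CountingBlockstore extends MemoryBlockstore {
  puts = 0

  async put (cid: CID, bytes: Uint8Array, options?: any): Promise<CID> {
    this.puts++
    return super.put(cid, bytes, options)
  }
}

const blockstore = new CountingBlockstore()
const helia = await createHelia({ blockstore })
const fs = unixfs(helia)

// the ~10k entry directory discussed in the Slack log below
const dir = CID.parse('bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey')

let entries = 0

for await (const entry of fs.ls(dir)) {
  entries++
}

// if listing only needed the directory's own blocks, the block count should be
// much smaller than the entry count; roughly one block per entry suggests
// child roots are being fetched too
console.log(`listed ${entries} entries, fetched ${blockstore.puts} blocks`)
```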


Based on a discussion on Slack:

Slack discussion log

Arkadiy - Internet Archive [Thursday at 12:58 PM]
ok it looks like the stall on the ls iterator happens due to this:

Daniel Norman [Thursday at 12:59 PM]
Nice! So we probably just need some better concurrency control

Arkadiy - Internet Archive [Thursday at 1:07 PM]
I have also identified ~1370 as the magic number that breaks every time; I am guessing the requests are dispatched all at once so that must be hitting some hard ceiling

Arkadiy - Internet Archive [Thursday at 1:43 PM]
also this is Chromium-specific, Firefox actually does the whole batch fine

SgtPooki [Thursday at 2:30 PM]
IIRC: there is no way to query the browser to determine if the network request queue is full, blocked, or reaching some limit. Alex investigated this a bit when looking into the WebTransport bug in Chrome a while ago

Arkadiy - Internet Archive [Friday at 9:25 AM]
ok I actually have a deployed test for you

SgtPooki [Friday at 9:45 AM]
FYI you may want to switch from the default memoryStore to a persistent store: https://github.com/parkan/helia-ia-frontend/blob/3f5382d728e41b1cffc14058def592e6b65eb005/sw/sw.ts#L118-L128
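
A minimal sketch of that wiring, assuming the blockstore-idb / datastore-idb packages and that createVerifiedFetch accepts an existing Helia node; the store names are arbitrary:

```ts
import { createHelia } from 'helia'
import { createVerifiedFetch } from '@helia/verified-fetch'
import { IDBBlockstore } from 'blockstore-idb'
import { IDBDatastore } from 'datastore-idb'

// persistent stores so blocks already fetched by the service worker survive
// restarts and page reloads
const blockstore = new IDBBlockstore('helia/blocks')
const datastore = new IDBDatastore('helia/data')

await blockstore.open()
await datastore.open()

const helia = await createHelia({ blockstore, datastore })

// verified-fetch is then created on top of the persistent node
const verifiedFetch = await createVerifiedFetch(helia)
```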

Arkadiy - Internet Archive [Friday at 10:07 AM]
thanks, this is probably better than my currently hand-managed app-layer cache

SgtPooki [Friday at 10:19 AM]
I did initially, but I can’t after

SgtPooki [Friday at 10:32 AM]
yea thats what i got before too.. and if I refresh, it seems to work. One issue a persistent storage should solve for you: re-clicking browse content wouldn’t start over at zero.. it would have all the content it already got

Arkadiy - Internet Archive [Friday at 11:33 AM]
it already does not start over at zero as far as I can tell

SgtPooki [Friday at 12:36 PM]
yea i forked and bumped up timeout and am seeing the hang more obviously. I would recommend enabling the debugger in the SW to try to see more about what’s happening:

import { enable } from 'weald'

enable('helia*,helia*:trace,libp2p*,libp2p*:trace')

that will get you a TON of logs.. but should hopefully help narrow in on what is hanging.. when you start seeing repeated logs, look at the last logs before the repeated items, or when logs stop, look at the last item

Arkadiy - Internet Archive [Friday at 12:37 PM]
yeah I tried your suggested magic domain thing and it didn't work, I think because I register my own service worker?

Arkadiy - Internet Archive [Yesterday at 11:55 AM]
IndexedDB blockstore seems to be working well, still need to figure out how to tune it however -- the API docs link (https://ipfs.github.io/js-stores/modules/blockstore_idb.html) is 404

Arkadiy - Internet Archive [Yesterday at 12:29 PM]
hmmm ok I tried updating to the latest @helia/verified-fetch and it broke esbuild treeshaking due to how the modules are initialized (I think), I previously had my bundle size for the sw at <600KiB but now it refuses to shake out kad etc even though I am only using http and verified fetch

Arkadiy - Internet Archive [Yesterday at 1:29 PM]
also... this is trying to iterate a multi-thousand file directory (edited)

Arkadiy - Internet Archive [Yesterday at 3:09 PM]
is the top layer of a ~10k file unixfs directory really on the order of 10-100 MiB?

Arkadiy - Internet Archive [Yesterday at 3:17 PM]
re: bundle size, the last version we can successfully shake is ~2.1.3 (~790KB bundle), changes after this blow up to ~1.2MB with all possible esbuild optimizations enabled (edited)

SgtPooki [Yesterday at 3:22 PM]
so.. with the multiple requests: currently we only have a block-based block broker.. i have been wanting for a while to do car requests, and stream those into the block store.. idk that it would solve your issues because of the size of it, but I know there’s a need.

The bundle size issue, is that with @helia/verified-fetch specifically?

Arkadiy - Internet Archive [Yesterday at 3:27 PM]
let's set the bundle size aside for the moment, it's relatively marginal at the end of the day, I just wanted to note it down since you are so generously reading what I am posting

SgtPooki [Yesterday at 3:54 PM]
you can do range requests today, even on car files with @helia/verified-fetch IIRC.. but range requests on car files don’t make practical sense today AFAIK.. we started a discussion a month or few back about a CarV3 spec that would support an index and therefore allow for some range requests and better functionality.. but I don’t think we have enough funding support to prioritize that at the moment.. nor feedback from folks who would be interested. I know Storacha is also interested in this.. and some better data-lineage support.. I’m not sure where the carv3 discussion ended up.. cc @gammazero @lidel

As for the delegated routing and multiple trustless gateway requests falling over.. is it because there are no concurrency controls for the requests? I feel like I would need to dive in here deeper, but if you have some ideas of how we can fix things for you, we’re happy to do so.. we might need to move this to a github issue so we can track the discussion better, but I know i’m much more responsive on here

Arkadiy - Internet Archive [Yesterday at 3:59 PM]
re: first point, understood -- I think we can go deeper on this at a later date. I do think broad "get this big file as one piece car-stream" is a separate concern from "iterating 10000 files in a unixfs graph takes 12000 requests"; prioritizing these probably depends on complexity of implementation (the second problem seems more solvable without structural changes at the format level)

re: the second point, yes, the behavior seems to be to spray requests without much concurrency control... I am currently focusing on getting things working to demo grade with a single overprovisioned SP, but we have a lot of material onboarded to riffraff via RIBS and of course ideally we'd be able to source from both in some fashion

SgtPooki [Yesterday at 3:59 PM]
I know you can implement your own blockBroker in order to fetch blocks in a way that works for you.. so you could optimize the actual fetching
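
A rough sketch of what such a broker could look like -- a concurrency cap wrapped around the stock trustless gateway broker. This assumes the factory/retrieve shape of @helia/block-brokers at the time of writing, only forwards retrieve (a real wrapper should forward the broker's other methods too), and the limit of 32 is arbitrary:

```ts
import { createHelia } from 'helia'
import { trustlessGateway } from '@helia/block-brokers'
import type { CID } from 'multiformats/cid'

// cap the number of in-flight block requests instead of dispatching them all
// at once; only `retrieve` is forwarded in this sketch
function limitedTrustlessGateway (maxInFlight = 32) {
  return (components: any): any => {
    const inner: any = trustlessGateway()(components)
    let active = 0
    const waiting: Array<() => void> = []

    return {
      async retrieve (cid: CID, options?: any): Promise<Uint8Array> {
        if (active >= maxInFlight) {
          // wait until an earlier request finishes and wakes us up
          await new Promise<void>(resolve => { waiting.push(resolve) })
        }

        active++

        try {
          return await inner.retrieve(cid, options)
        } finally {
          active--
          waiting.shift()?.()
        }
      }
    }
  }
}

const helia = await createHelia({
  blockBrokers: [limitedTrustlessGateway(32)]
})
```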

Arkadiy - Internet Archive [Yesterday at 4:02 PM]
I've made a lot of changes btw

SgtPooki [Yesterday at 4:02 PM]
@achingbrain might have some better ideas or be able to help make the performance optimizations you need without as much guessing as i’m doing

Arkadiy - Internet Archive [Yesterday at 4:02 PM]
so if you haven't looked yet don't without pulling in uhhh 10 mins

SgtPooki [Yesterday at 4:02 PM]
haha! I haven’t looked since last thur/fri. I will check in again later.. I do wanna help unblock you. i love the internet archive and want to help you succeed here

Arkadiy - Internet Archive [Yesterday at 4:02 PM]
the good news is that 50% of what I was trying to do was actually possible/implemented in helia libs

SgtPooki [Yesterday at 4:03 PM]
nice

Arkadiy - Internet Archive [Yesterday at 4:03 PM]
but it was really hard to find those...

SgtPooki [Yesterday at 4:03 PM]
yeaa.. there is a lot out there.. we need a waaaaay more thorough onboarding guide

Arkadiy - Internet Archive [Yesterday at 4:03 PM]
which is why I ended up rewriting it

SgtPooki [Yesterday at 4:07 PM]
is that CID you’ve been testing with hosted on a filecoin SP?

Arkadiy - Internet Archive [Yesterday at 4:07 PM]
(that and being able to reasonably negotiate block broker flows)

SgtPooki [Yesterday at 4:28 PM]
yea, i think kubo recently updated to announce only root level content too instead of all blocks.. (or it’s in progress)

Adin Schmahmann [Yesterday at 4:50 PM]
> is the top layer of a ~10k file unixfs directory really on the order of 10-100 MiB?

For bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey it's ~900kB. You can see it via:

curl "https://ia.dcentnetworks.nl/ipfs/bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey?format=car&dag-scope=entity"
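
That entity-scoped CAR could also be streamed straight into a browser node's blockstore, along the lines of the "car requests into the block store" idea above. A sketch assuming @ipld/car and browser-readablestream-to-it; note that a real implementation must verify each block's bytes against its CID before storing them, since the gateway response is untrusted:

```ts
import { createHelia } from 'helia'
import { CarBlockIterator } from '@ipld/car'
import toIt from 'browser-readablestream-to-it'

const helia = await createHelia()

// one request for the whole directory level instead of one request per block
const url = 'https://ia.dcentnetworks.nl/ipfs/bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey?format=car&dag-scope=entity'
const res = await fetch(url)

if (!res.ok || res.body == null) {
  throw new Error(`car request failed: ${res.status}`)
}

const blocks = await CarBlockIterator.fromIterable(toIt(res.body))

for await (const { cid, bytes } of blocks) {
  // WARNING: verify that `bytes` actually hash to `cid` before trusting them;
  // verification is omitted from this sketch
  await helia.blockstore.put(cid, bytes)
}

// a subsequent fs.ls of the directory can now be served from the local blockstore
```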

lidel [Yesterday at 6:38 PM]
I strongly suspect the problem with directory listing in the JS implementation is fetching unnecessary child blocks (likely to read their type or size -- @SgtPooki mind confirming that is what is happening?)

Note: the Go implementation in boxo/gateway stopped doing that years ago, and is instant, only fetching 900KiB. Opening https://ipfs.io/ipfs/bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey/ requires only the 900 KiB @aschmahmann linked above (my node hosts it, so it works on the public gw now).

You can also see the difference in the CLI, by passing --resolve-type=false --size=false --stream to ipfs ls -- it will only use the blocks necessary to enumerate the directory and won't fetch roots of children:

$ curl -s "https://ia.dcentnetworks.nl/ipfs/bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey?format=car&dag-scope=entity" | ipfs dag import --pin-roots=false
$ time ipfs ls --resolve-type=false --size=false --stream bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey
...
0.31s user 0.14s system 138% cpu 0.324 total (edited)
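
A JS analogue of that --resolve-type=false --size=false listing would be to read the links straight out of the root dag-pb block and never touch the children. A sketch assuming @ipld/dag-pb and the Helia blockstore API, and that the directory is a plain (non-HAMT-sharded) unixfs directory -- a sharded directory would additionally require walking its shard blocks:

```ts
import { createHelia } from 'helia'
import * as dagPb from '@ipld/dag-pb'
import { CID } from 'multiformats/cid'

const helia = await createHelia()

const dir = CID.parse('bafybeia67mubwee37paqwwrr2lsi52cyeajtouctmhwugt7or2srslcaey')

// fetch (or read from the local cache) only the root block of the directory
const bytes = await helia.blockstore.get(dir)
const node = dagPb.decode(bytes)

for (const link of node.Links) {
  // Name and Hash come straight from the directory block; Tsize is the
  // cumulative DAG size of the child, not its unixfs file size
  console.log(link.Name, link.Hash.toString(), link.Tsize)
}
```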

Metadata

Labels

P0 - Critical: Tackled by core team ASAP
dif/medium - Prior experience is likely helpful
effort/days - Estimated to take multiple days, but less than a week
kind/bug - A bug in existing code (including security flaws)
need/analysis - Needs further analysis before proceeding
status/blocked - Unable to be worked further until needs are met
