-
Notifications
You must be signed in to change notification settings - Fork 2
Remove hardcoded paths #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I will add more tests, but I wanted to share as soon as possible to get comments about the concepts. |
platypii
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over all I like the refactor. The prop changes are good. The routing changes are good. 👍
The only part I don't really like is using classes for filesystem and source. I would prefer a more functional-programming style where the Source is a type union, and then a bunch of util functions operate on that. Similar to the file operations in utils.ts. Mostly an aesthetic choice. And it might be a little awkward here. But using serializable, immutable objects, and then pure functions operating on those objects, often makes life easier, either with web workers, react, or for unit testing. I could be convinced here though.
|
The main reason I used classes is the If we decide to only support a given set of filesystems and not allow extending it, indeed, we could switch to a type union and pure functions. Otherwise, I'm not sure how to implement it easily (should we have a global registry of filesystems, ie a "service locator", and provide a method to register a new file system? it seems complex). What do you think? |
platypii
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not a fan of the abstract classes. It will become a problem if it needs to be sent to a webworker, since classes aren't serializable. I'd lean toward an interface with the methods (getFileName, etc) and then have implementations of the interface. This is how I implemented local/s3 filesystems in the other repo.
All that being said, I say we merge this. These are good changes, and we can make smaller refactors later if we want.
|
OK! So, let's move from classes to interfaces in another PR. |
This PR implements the concept of filesystem and source.
A source is a URL (
https://hyperparam.blob.core.windows.net/hyperparam/starcoderdata-js-00000-of-00065.parquet), or a path (folder/file.txtor even an empty path: ``).A filesystem can check if it supports a given source, and if so, it provides features to handle it: determine if it's a file or a directory, split the source to populate the breadcrumb navigation, get the list of files in case of a directory, get the resolvable URL in case of a file, etc.
I implemented:
folder1/text.txt, and handles files and foldersbut it's extensible to:
https://hyperparam-public.s3.amazonaws.com/wiki-en-00000-of-00041.parquet URLs)
It would also help handle multiple authentications depending on the file system.
The PR is quite big, sorry about that! I think it should help provide a better abstraction that
parsedKey(#27) and help decouple the app and the components, since we use dependency inversion to create the URLs, for example (so: no more hardcode/files?key=, which fixes #22) .