Add support for loading datasets from Hugging Face using from_huggingface_oscur #49
simonprovost merged 5 commits into main
Such a needed feature for Urban Mapper! Thanks @soniacq for this one 1️⃣
Overall, nice PR. Let's just widen its scope so that it can target any dataset from any HF Hub repository.
More work will need to be done before any potential merge, as per the comments below, but here are a few I could not attach to specific lines of code:
- Rework the `loader` basic Jupyter notebook: following #45, it will need to show this better integration, or show both so that users can do it themselves if they want. I guess it's not bad to show both ways (our integration, and a manual one as per #45). Your call @soniacq. <-------- A couple of lines of code to add to the notebook.
- Update the documentation guide for the loader to show a full section on this nice integration of HF. See more in `docs/user-guide/modules/1-loaders.md`. <-------- One to three paragraphs at most and one title.
- Update the API reference to point to the `Loader_Factory` now that we have a new `from_X` method thanks to your PR. See more in `docs/api/loaders.md`. <------ One line.
Cheers and thanks once more 🎉
PS: Feel free to resolve the comments when they are done, it notifies my email so that I can come back as soon as possible to review again!
Hey @soniacq! Thanks so much for this one, this is so cool 😎 I was just passing by to see whether you had left me a comment asking for a re-review, so I'll take the opportunity: I can see the compilation does not go through. Easy peasy! Now that the integration of HF will be mandatory in the

Let me know if you have any questions (Discord is fine, as always).

Cheers
@soniacq My mistake for forgetting to also say that, when adding a new package, the requirements txt files also need updating (in the long term this will vanish).

    uv export --dev --no-hashes | awk '{print $1}' FS=' ;' > requirements-dev.txt

Note: (1) above will vanish once Urban Mapper gets its first release on PyPI. (2) will never vanish, as this is how Read the Docs works, unless this gets merged at some point: astral-sh/uv#10074

In the next review I intend to look at the generated preview documentation (when it compiles) 👌

Cheers
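
For readers unfamiliar with the `awk` filter in the export command above: with `FS=' ;'`, it prints only the part of each line that precedes the first ` ;`. My reading is that this drops anything trailing the pinned requirement. A minimal Python sketch of the same filtering follows; the sample lines (package pins and marker) are made up for illustration and are not taken from the real export.

```python
def strip_after_separator(lines):
    """Keep the text before the first ' ;' on each line.

    Mirrors awk '{print $1}' FS=' ;' for plain input lines.
    Lines without the separator pass through unchanged.
    """
    return [line.split(" ;", 1)[0] for line in lines]


# Illustrative input only (hypothetical pins and marker).
exported = [
    "dill==0.3.9",
    "pandas==2.2.3 ; python_full_version >= '3.10'",
]

cleaned = strip_after_separator(exported)
for line in cleaned:
    print(line)
```

The helper name `strip_after_separator` is an assumption for this sketch; the project itself only uses the shell one-liner.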
@simonprovost do we really need
UrbanMapper explicitly asks for dill>=0.3.9, which is incompatible with all available
@soniacq Go for it! I do not think this will cause any issue for saved Urban Pipelines in an alpha version ✅

Cheers
ad6e34f to eab981e
All right @soniacq, amazing PR, thanks for this contribution; it is certainly going to be helpful to more than one person (I hope so!) 💪 Let me summarize what I've modified (as agreed on Discord):
Feel free to modify; I am a bit tired and, I guess, not an expert with docs either, so if changes are needed, feel free to make them ✅
Next, something you did not know about: I reworked the history entirely, because it was becoming a spaghetti kind of history, with multiple commits changing the very same file, which in practice is best avoided given that it adds noisy commits. If we do that on 100 PRs, the history will not make sense anymore. I recommend this article on fixup commits for next time, but no rush: https://github.com/TheAssemblyArmada/Thyme/wiki/Using-Fixup-Commits. Meanwhile, we went from 10 commits to three.

To avoid having the "contribution" reset, feel free to reset

Steps:

    git reset --soft HEAD~n  # n being three, I believe

Then commit again to have a clean history; what I pushed is an example of a clean history.

Lastly, I have successfully rebased with main (

As a result, to merge the PR, the last thing we need to do now is update the Loader Basic example notebook, reworking it all so that the

Cheers
eab981e to 6dc0ec3
…LoaderFactory to ensure latitude and longitude columns are specified.
…ples to the notebook.
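
One of the commits above makes the factory insist that latitude and longitude columns are specified. A minimal sketch of that kind of early check is given here, under stated assumptions: `SketchLoaderFactory`, its method names, and the repository id are hypothetical stand-ins for illustration, not UrbanMapper's actual `LoaderFactory` API, and nothing is downloaded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchLoaderFactory:
    """Hypothetical stand-in, not UrbanMapper's real LoaderFactory."""

    repo_id: Optional[str] = None
    longitude_column: Optional[str] = None
    latitude_column: Optional[str] = None

    def from_huggingface(self, repo_id: str) -> "SketchLoaderFactory":
        # Accept any non-empty repository id rather than one fixed source.
        if not repo_id:
            raise ValueError("repo_id must be a non-empty string")
        self.repo_id = repo_id
        return self

    def with_columns(
        self, longitude_column: str, latitude_column: str
    ) -> "SketchLoaderFactory":
        # Record which columns hold the coordinates.
        self.longitude_column = longitude_column
        self.latitude_column = latitude_column
        return self

    def validate(self) -> None:
        # Fail early, before any download, if something is missing.
        if self.repo_id is None:
            raise ValueError("call from_huggingface(...) first")
        if not self.longitude_column or not self.latitude_column:
            raise ValueError("latitude and longitude columns must be specified")


# Usage: "some-owner/some-dataset" is a placeholder id, not a real repository.
factory = (
    SketchLoaderFactory()
    .from_huggingface("some-owner/some-dataset")
    .with_columns(longitude_column="lon", latitude_column="lat")
)
factory.validate()
```

Checking in a separate `validate` step keeps the error close to the configuration mistake instead of surfacing it after a long download; whether the real implementation is structured this way is not something this thread confirms.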
simonprovost left a comment
Brilliant! Love this PR so much, thanks @soniacq for this addition to UM 🎉
Merging now; we've covered enough and it seems pretty reliable, so let's gooo!
Cheers!!!!
WIP
Add support for loading datasets from Hugging Face using from_huggingface_oscur method in LoaderFactory
📚 Documentation preview 📚: https://UrbanMapper--49.org.readthedocs.build/en/49/