Replies: 2 comments
-
Future thinking:
Independent versus dependent pipelines.
How we format the extractors: make them directly applicable to pubget outputs, not just our idiosyncratic data pond, for example by being able to read labelbuddy JSONL files.
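A minimal sketch of what "being able to read JSONL files" could look like, kept independent of any particular data layout. Everything named here is hypothetical and not part of pubget, pubextract, or neurostore-text-extraction: the helpers `iter_jsonl_docs` and `apply_extractor`, the toy `count_words` extractor, and the assumption that each JSONL line is an object with a `text` field and an optional `metadata` field.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List


def iter_jsonl_docs(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def apply_extractor(
    docs: Iterable[Dict[str, Any]],
    extractor: Callable[[str], Any],
    text_key: str = "text",  # assumed field name; adjust to the real export
) -> List[Dict[str, Any]]:
    """Run a text extractor on each document and keep its metadata alongside."""
    results = []
    for doc in docs:
        results.append(
            {
                "metadata": doc.get("metadata", {}),
                "extracted": extractor(doc.get(text_key, "")),
            }
        )
    return results


def count_words(text: str) -> int:
    """Toy extractor used only for illustration."""
    return len(text.split())
```

Because `iter_jsonl_docs` accepts any iterable of strings, an open file handle can be passed directly, for example `apply_extractor(iter_jsonl_docs(f), count_words)` inside a `with open(path, encoding="utf-8") as f:` block. Keeping the extractor a plain function of text is one way to let the same code run on different input sources.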
-
Apply to labelbuddy files as well (for development).
-
https://github.com/litmining/pubextract is a plugin for pubget developed by @jeromedockes
The original goal was for it to be a place to store extractors that can be applied to pubget outputs, natively as plugins.
However, there are concerns about a lack of maintenance and about redundancy with this repository, neurostore-text-extraction.
JB is also interested in having a generic place to store text extractors.
It's not clear whether these two projects are fully mergeable (neurostore-text-extraction focuses primarily on production-level extractors), but there are some things we could do to make this library more generically useful, such as: