feat: custom column generator #13
This will be removed, just wanted to add an example script to test the PR somewhere.
At the same time we should maybe think of having such a folder somewhere?
d7dba9a to 59f4ccd
    original_columns = set(data.columns)
    try:
        result = self.config.generator_function(data)
If we did something like

    result = self.config.generator_function(data, resource_provider=self.resource_provider)

the user could then optionally leverage LLMs and other resources in their custom generate method. I'm exploring something like this in the plugin decorator.
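One way the "optionally" part of this suggestion might work is to inspect the user's function and pass `resource_provider` only when the function can accept it, so existing single-argument callables keep working. This is a minimal sketch, not the PR's implementation: the helper name `call_generator` is hypothetical, and the data and provider are treated as opaque objects.

```python
import inspect
from typing import Any, Callable


def call_generator(fn: Callable[..., Any], data: Any, resource_provider: Any) -> Any:
    """Call fn(data), forwarding resource_provider only if fn accepts it.

    Hypothetical helper: the name and behavior are assumptions for
    illustration, not part of the project under review.
    """
    params = inspect.signature(fn).parameters
    # The function opts in either by naming the parameter or by taking **kwargs.
    accepts_provider = "resource_provider" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_provider:
        return fn(data, resource_provider=resource_provider)
    return fn(data)
```

With this shape, a user function written as `def gen(data)` and one written as `def gen(data, resource_provider=None)` could both be registered without a separate config flag, at the cost of some signature introspection at call time.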
I'm thinking about the best way of doing this... For calling LLMs in particular, it should probably also leverage the rest of what's in place (threading, etc.).
Should it inherit from WithLLMGeneration perhaps? 🤔
Also, LLM generators are typically CELL_BY_CELL... It's not clear to me how to specify the generation strategy here.
Oh yeah, good point. Maybe we should split that mixin up a bit so that we can just get the useful properties (and not the shared generate method)
We could let the user choose the generation strategy? Haven't thought about how messy that would make the implementation, but it would basically define the signature of their generate method.
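To make the "strategy defines the signature" idea concrete, here is a rough sketch of how a dispatcher might treat the two strategies named in this thread. Everything in it is an assumption for illustration: the enum is redefined locally rather than imported from the project, rows are plain dicts standing in for a DataFrame, and `run_custom_generator` is a hypothetical name.

```python
from enum import Enum
from typing import Any, Callable

Row = dict[str, Any]


class GenerationStrategy(Enum):
    # Local stand-in for the project's enum; member names follow the thread.
    FULL_COLUMN = "full_column"
    CELL_BY_CELL = "cell_by_cell"


def run_custom_generator(
    fn: Callable[..., Any], rows: list[Row], strategy: GenerationStrategy
) -> list[Row]:
    """Dispatch to the user function according to the chosen strategy.

    FULL_COLUMN: fn receives all rows at once and returns all rows.
    CELL_BY_CELL: fn receives one row and returns one row.
    """
    if strategy is GenerationStrategy.FULL_COLUMN:
        return fn(rows)
    if strategy is GenerationStrategy.CELL_BY_CELL:
        # Sequential here for simplicity; this is where existing
        # threading machinery could be plugged in instead.
        return [fn(row) for row in rows]
    raise ValueError(f"Unsupported strategy: {strategy}")
```

The user would then write either a batch function or a per-row function, and the declared strategy tells the generator which calling convention to use.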
Adding a CustomColumnGenerator that works essentially the same as the LocalCallableValidator while being more generic.
Still a few points to address:
- GenerationStrategy.FULL_COLUMN?
Closes #6.