Contributor

@andreatgretel commented Nov 5, 2025

Adding a CustomColumnGenerator that works essentially the same as the LocalCallableValidator while being more generic.

Still a few points to address:

  • what if the user wants to call an LLM inside of the generator function?
  • should we support strategies other than GenerationStrategy.FULL_COLUMN?
  • should we support multiple columns being added at once?

Closes #6.

Contributor Author

This will be removed; I just wanted to add an example script somewhere to test the PR.

That said, maybe we should think about having a folder like this somewhere?

@andreatgretel force-pushed the andreatgretel/custom-column-generator branch from d7dba9a to 59f4ccd on November 5, 2025 14:14

original_columns = set(data.columns)
try:
    result = self.config.generator_function(data)
Contributor

If we did something like

result = self.config.generator_function(data, resource_provider=self.resource_provider)

the user could then optionally leverage LLMs and other resources in their custom generate method. I'm exploring something like this in the plugin decorator.
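
A minimal sketch of how the call site could stay backward compatible, assuming the generator function opts in by declaring a `resource_provider` keyword. `call_generator`, `FakeResourceProvider`, and its `complete` method are hypothetical names for illustration, and a plain dict of column lists stands in for the real data object:

```python
import inspect
from typing import Any, Callable


def call_generator(func: Callable[..., Any], data: Any, resource_provider: Any) -> Any:
    """Call func(data), passing resource_provider only if func declares it."""
    params = inspect.signature(func).parameters
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if "resource_provider" in params or accepts_kwargs:
        return func(data, resource_provider=resource_provider)
    return func(data)


class FakeResourceProvider:
    """Hypothetical stand-in; the real resource provider is not shown here."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


def plain_generator(data: dict) -> dict:
    # Ignores resources entirely, like the current one-argument signature.
    return {**data, "length": [len(name) for name in data["name"]]}


def llm_generator(data: dict, resource_provider: FakeResourceProvider) -> dict:
    # Opts in to resources by declaring the keyword argument.
    return {**data, "greeting": [resource_provider.complete(n) for n in data["name"]]}
```

With this shape, existing one-argument functions keep working, and only functions that ask for `resource_provider` receive it.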

Contributor Author

I'm wondering what the best way to do this is... For calling LLMs in particular, it should probably also leverage the rest of what's already in place (threading, etc.).
Should it perhaps inherit from WithLLMGeneration? 🤔

Also, LLM generators are typically CELL_BY_CELL... It's not clear to me how to specify the generation strategy here.
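
To make the threading concern concrete, here is a rough sketch of fanning a per-row function out over a thread pool with the standard library's `concurrent.futures`. It is not the project's existing machinery; `run_cell_by_cell` and `fake_llm_cell` are hypothetical names, and rows are plain dicts:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_cell_by_cell(
    func: Callable[[dict[str, Any]], Any],
    rows: list[dict[str, Any]],
    max_workers: int = 4,
) -> list[Any]:
    """Apply func to each row concurrently; results keep the input row order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(func, rows))


def fake_llm_cell(row: dict[str, Any]) -> str:
    # Hypothetical stand-in for a per-row LLM call.
    return f"summary of {row['text']}"
```

A full-column function would bypass this path entirely, which is why the strategy has to be known before the function is called.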

Contributor

Oh yeah, good point. Maybe we should split that mixin up a bit so that we can get just the useful properties (and not the shared generate method).

We could let the user choose the generation strategy? I haven't thought about how messy that would make the implementation, but it would basically define the signature of their generate method.
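
One possible shape for this, as a sketch only: the chosen strategy decides what the user's function receives and returns. The enum below is a local stand-in that reuses the `FULL_COLUMN` and `CELL_BY_CELL` names from this thread; `run_custom_generator`, `add_rank`, and `add_label` are hypothetical, and a list of row dicts stands in for the real data object:

```python
from enum import Enum
from typing import Any, Callable


class GenerationStrategy(Enum):
    # Local stand-in; member names mirror those mentioned in this thread.
    FULL_COLUMN = "full_column"
    CELL_BY_CELL = "cell_by_cell"


Row = dict[str, Any]
Table = list[Row]  # list of rows, standing in for the real data object


def run_custom_generator(
    strategy: GenerationStrategy, func: Callable[[Any], Any], data: Table
) -> Table:
    if strategy is GenerationStrategy.FULL_COLUMN:
        # The user function sees the whole table: func(table) -> table.
        return func(data)
    if strategy is GenerationStrategy.CELL_BY_CELL:
        # The user function sees one row at a time: func(row) -> row.
        return [func(row) for row in data]
    raise ValueError(f"Unsupported strategy: {strategy!r}")


def add_rank(table: Table) -> Table:
    # Needs the whole table because a rank depends on every score.
    ordered = sorted(row["score"] for row in table)
    return [{**row, "rank": ordered.index(row["score"]) + 1} for row in table]


def add_label(row: Row) -> Row:
    # Only needs the current row.
    return {**row, "label": "high" if row["score"] >= 50 else "low"}
```

The trade-off is that the function's expected signature can only be validated once the strategy is known, so a mismatch between the two surfaces at call time unless it is checked up front.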

Development

Successfully merging this pull request may close these issues.

CustomColumnGenerator for quick customization

3 participants