refactor: update required resources treatment and use subclasses over mixins #184

johnnygreco · 2026-01-07T22:51:24Z

The problem we are solving

ConfigurableTaskMetadata requires you to specify the resources required to execute the task. In practice, however, all tasks always have access to all resources. While this issue isn’t a big deal, it is confusing for plugin builders who don’t have the above context.

Changes

Remove required_resources from metadata
Always assume all tasks have access to all resources.
Use subclasses (instead of mixins) to simplify plugin development.

An important note: I think we still need some way for a generator to specify its required resources. For example, say we want to filter plugin types to only grab the ones that need the model registry or just need the datastore. The solution implemented here is an abstract method called get_required_resources that must be implemented on generators. This effectively pushes the complication to a lower level, where most developers won't need to worry about it. I'll highlight some places in the code to show what I mean.
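To make the idea above concrete, here is a minimal sketch, not the code in this PR. It assumes get_required_resources is a classmethod returning a set; the thread does not say either way. Apart from get_required_resources, ColumnGeneratorWithModelRegistry, and the model registry / datastore resource names, every name below (ResourceType, ColumnGenerator, ExpressionGenerator, MyPluginGenerator, filter_by_resource) is a hypothetical stand-in.

```python
from abc import ABC, abstractmethod
from enum import Enum


class ResourceType(str, Enum):
    # Hypothetical enum; the two members mirror the resources named in the description.
    MODEL_REGISTRY = "model_registry"
    DATASTORE = "datastore"


class ColumnGenerator(ABC):
    """Hypothetical base class: every generator must declare its resources."""

    @classmethod
    @abstractmethod
    def get_required_resources(cls) -> set[ResourceType]:
        ...


class ColumnGeneratorWithModelRegistry(ColumnGenerator):
    """Intermediate subclass: answers the question once for all model-backed generators."""

    @classmethod
    def get_required_resources(cls) -> set[ResourceType]:
        return {ResourceType.MODEL_REGISTRY}


class ExpressionGenerator(ColumnGenerator):
    """Hypothetical generator that needs no resources at all."""

    @classmethod
    def get_required_resources(cls) -> set[ResourceType]:
        return set()


class MyPluginGenerator(ColumnGeneratorWithModelRegistry):
    """A plugin author subclasses and never has to think about resources."""


def filter_by_resource(
    generators: list[type[ColumnGenerator]], resource: ResourceType
) -> list[type[ColumnGenerator]]:
    # Keep only the generator classes that declare the given resource.
    return [g for g in generators if resource in g.get_required_resources()]
```

With this shape, filtering for generators that need the model registry is a one-liner over the generator classes, and plugin builders only see the subclass they inherit from.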

src/data_designer/engine/column_generators/generators/base.py

johnnygreco · 2026-01-07T22:57:06Z

src/data_designer/engine/column_generators/generators/llm_completion.py



-class WithChatCompletionGeneration(WithModelGeneration):
+class ColumnGeneratorWithSingleModelChatCompletion(ColumnGeneratorWithSingleModel[TaskConfigT]):


This class name is getting a bit long for my liking lol

Maybe we drop "Single" from these names, assuming we mostly work with one model by default?

src/data_designer/engine/column_generators/generators/base.py

nabinchha · 2026-01-08T01:33:51Z

src/data_designer/engine/column_generators/generators/base.py

        logger.info(f"  |-- column name: {self.config.name!r}")
        logger.info(f"  |-- model config:\n{self.model_config.model_dump_json(indent=4)}")
-        if self.model_config.provider is None:
-            logger.info(f"  |-- default model provider: {self._get_provider_name()!r}")


We're losing this log message, which was added at some point to indicate the default model provider being used when the model config itself doesn't reference a provider.

The update makes it so this message is always logged, right? At least that was my intention. WDYT?

nabinchha · 2026-01-08T01:36:04Z

src/data_designer/engine/column_generators/generators/llm_completion.py



-class WithChatCompletionGeneration(WithModelGeneration):
+class ColumnGeneratorWithSingleModelChatCompletion(ColumnGeneratorWithSingleModel[TaskConfigT]):


Maybe we drop "Single" from these names, assuming we mostly work with one model by default?

nabinchha · 2026-01-08T01:38:41Z

src/data_designer/engine/dataset_builders/column_wise_builder.py

            )

-    def _fan_out_with_threads(self, generator: WithModelGeneration, max_workers: int) -> None:
+    def _fan_out_with_threads(self, generator: ColumnGeneratorWithSingleModel, max_workers: int) -> None:


Should this be at the ColumnGeneratorWithModelRegistry level?

johnnygreco · 2026-01-09T16:01:07Z

src/data_designer/config/column_types.py

-def column_type_used_in_execution_dag(column_type: str | DataDesignerColumnType) -> bool:
-    """Return True if the column type is used in the workflow execution DAG."""
-    column_type = resolve_string_enum(column_type, DataDesignerColumnType)
-    dag_column_types = {
-        DataDesignerColumnType.EXPRESSION,
-        DataDesignerColumnType.LLM_CODE,
-        DataDesignerColumnType.LLM_JUDGE,
-        DataDesignerColumnType.LLM_STRUCTURED,
-        DataDesignerColumnType.LLM_TEXT,
-        DataDesignerColumnType.VALIDATION,
-        DataDesignerColumnType.EMBEDDING,
-    }
-    dag_column_types.update(plugin_manager.get_plugin_column_types(DataDesignerColumnType))
-    return column_type in dag_column_types
-
-
-def column_type_is_model_generated(column_type: str | DataDesignerColumnType) -> bool:
-    """Return True if the column type is a model-generated column."""
-    column_type = resolve_string_enum(column_type, DataDesignerColumnType)
-    model_generated_column_types = {
-        DataDesignerColumnType.LLM_TEXT,
-        DataDesignerColumnType.LLM_CODE,
-        DataDesignerColumnType.LLM_STRUCTURED,
-        DataDesignerColumnType.LLM_JUDGE,
-        DataDesignerColumnType.EMBEDDING,
-    }
-    model_generated_column_types.update(
-        plugin_manager.get_plugin_column_types(
-            DataDesignerColumnType,
-            required_resources=["model_registry"],
-        )
-    )
-    return column_type in model_generated_column_types
-
-


Moved both of these to engine.
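For readers wondering what an engine-side version of column_type_is_model_generated could look like once generators declare their own resources, here is one possible sketch. It is an assumption, not the code that was moved: GENERATOR_REGISTRY, the column type strings, and the two generator classes are hypothetical, and only the "model_registry" resource name and the get_required_resources method come from this PR.

```python
from abc import ABC, abstractmethod


class ColumnGenerator(ABC):
    """Hypothetical base class with the abstract method described in this PR."""

    @staticmethod
    @abstractmethod
    def get_required_resources() -> list[str]:
        ...


class LLMTextGenerator(ColumnGenerator):
    @staticmethod
    def get_required_resources() -> list[str]:
        return ["model_registry"]


class ExpressionGenerator(ColumnGenerator):
    @staticmethod
    def get_required_resources() -> list[str]:
        return []


# Hypothetical lookup from column type name to generator class.
# The keys are illustrative and not the real DataDesignerColumnType values.
GENERATOR_REGISTRY: dict[str, type[ColumnGenerator]] = {
    "llm-text": LLMTextGenerator,
    "expression": ExpressionGenerator,
}


def column_type_is_model_generated(column_type: str) -> bool:
    """Return True if the generator for this column type needs the model registry."""
    generator_cls = GENERATOR_REGISTRY[column_type]
    return "model_registry" in generator_cls.get_required_resources()
```

The design point is that the hardcoded set of model-generated column types goes away: a plugin generator is classified correctly as soon as it declares its resources.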

johnnygreco changed the title ~~refactor: remove required resources from metadata and leverage subclasses over mixins~~ refactor: update required resources treatment and leverage subclasses over mixins Jan 7, 2026

johnnygreco changed the title ~~refactor: update required resources treatment and leverage subclasses over mixins~~ refactor: update required resources treatment and use subclasses over mixins Jan 7, 2026

johnnygreco requested review from mikeknep and nabinchha January 7, 2026 22:59

johnnygreco assigned andreatgretel and unassigned andreatgretel Jan 7, 2026

johnnygreco requested a review from andreatgretel January 7, 2026 23:01

johnnygreco force-pushed the johnny/refactor/remove-required-resources-from-metadata branch from d7d6e9c to c4f717b Compare January 7, 2026 23:01

nabinchha reviewed Jan 8, 2026

johnnygreco added 4 commits January 9, 2026 10:17

removing required resources

579a274

fix tests

0077ae3

add get required resources method to base column generator

55453da

move classification functions to engine; remove required resources

1b18b73

johnnygreco force-pushed the johnny/refactor/remove-required-resources-from-metadata branch from c4f717b to 1b18b73 Compare January 9, 2026 15:56

drop single from subclass names

a401abf
