-
Notifications
You must be signed in to change notification settings - Fork 51
chore: Make column_type a pydantic field rather than a property #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| DataDesignerColumnType = create_str_enum_from_discriminated_type_union( | ||
| enum_name="DataDesignerColumnType", | ||
| type_union=ColumnConfigT, | ||
| discriminator_field_name="column_type", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's where we dynamically create this StrEnum.
| columns: | ||
| - name: code_id | ||
| sampler_type: uuid | ||
| column_type: sampler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yaml configs now have the column_type field.
src/data_designer/config/columns.py
Outdated
| def column_type_is_in_dag(column_type: Union[str, DataDesignerColumnType]) -> bool: | ||
| column_type = resolve_string_enum(column_type, DataDesignerColumnType) | ||
| return column_type in [ | ||
| DataDesignerColumnType.EXPRESSION, | ||
| DataDesignerColumnType.LLM_CODE, | ||
| DataDesignerColumnType.LLM_JUDGE, | ||
| DataDesignerColumnType.LLM_STRUCTURED, | ||
| DataDesignerColumnType.LLM_TEXT, | ||
| DataDesignerColumnType.VALIDATION, | ||
| ] | ||
|
|
||
|
|
||
| def column_type_is_llm_generated(column_type: Union[str, DataDesignerColumnType]) -> bool: | ||
| column_type = resolve_string_enum(column_type, DataDesignerColumnType) | ||
| return column_type in [ | ||
| DataDesignerColumnType.LLM_TEXT, | ||
| DataDesignerColumnType.LLM_CODE, | ||
| DataDesignerColumnType.LLM_STRUCTURED, | ||
| DataDesignerColumnType.LLM_JUDGE, | ||
| ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are defined as functions instead of methods on DataDesignerColumnType. We'll need to make the lists they check dynamic once plugins are implemented.
| """ | ||
|
|
||
| columns: list[ColumnConfigT] = Field(min_length=1) | ||
| columns: list[Annotated[ColumnConfigT, Field(discriminator="column_type")]] = Field(min_length=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we can do this
a436172 to
17b984f
Compare
| DataDesignerColumnType.EXPRESSION, | ||
| DataDesignerColumnType.LLM_CODE, | ||
| DataDesignerColumnType.LLM_JUDGE, | ||
| DataDesignerColumnType.LLM_STRUCTURED, | ||
| DataDesignerColumnType.LLM_TEXT, | ||
| DataDesignerColumnType.VALIDATION, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does our type checker have an issue with this? Since DataDesignerColumnType is dynamically resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not running ty yet, but my IDE seems happy with all of it. If you can pull the branch to check your IDE settings, that might be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's change these to sets
| import networkx as nx | ||
|
|
||
| from data_designer.config.columns import ColumnConfigT | ||
| from data_designer.config.columns import ColumnConfigT, column_type_is_in_dag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: column_type_is_in_dag -> column_type_used_in_execution_dag or something like that?... to be specific about what this dag is.
nabinchha
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some nits, thanks!
nabinchha
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
![]()
|
Thank you for your submission! We ask that you sign our Developer Certificate of Origin before we can accept your contribution. You can sign the DCO by adding a comment below using this text: I have read the DCO document and I hereby sign the DCO. You can retrigger this bot by commenting recheck in this Pull Request. Posted by the DCO Assistant Lite bot. |
This PR is in preparation for the plugin framework.
Main Change
column_typeis now a pydantic field of column configs (as opposed to a property).The reasons for this change are:
This makes it more straightforward for the plugin framework to dynamically update the column types. For instance, the
column_typefield is now given as aLiteral["column-type"], which can be defined before theDataDesignerColumnTypestring enum is created. And we can use the type union (i.e., ColumnConfigT) to createDataDesignerColumnTypeafter all the configs have been defined.We can now use this field as a type union discriminator, which is pydantic's most reliable way to identify concrete objects from serialized configs.
Other Notes
column_typefield. The user never needs to set this in the SDK, but it is required in yaml configs (this is actually a good thing, though, since it makes it easy to identify column types in the config).