|
1 | 1 | !!! warning "Experimental Feature" |
2 | | - The plugin system is currently **experimental** and under active development. The documentation, examples, and plugin interface are subject to significant changes in future releases. If you encounter any issues, have questions, or have ideas for improvement, please [open an issue on GitHub](https://github.com/NVIDIA-NeMo/DataDesigner/issues/new/choose). |
| 2 | + The plugin system is currently **experimental** and under active development. The documentation, examples, and plugin interface are subject to significant changes in future releases. If you encounter any issues, have questions, or have ideas for improvement, please consider starting [a discussion on GitHub](https://github.com/NVIDIA-NeMo/DataDesigner/discussions). |
3 | 3 |
|
4 | 4 |
|
5 | 5 | # Example Plugin: Index Multiplier |
@@ -109,17 +109,17 @@ class IndexMultiplierColumnGenerator(ColumnGenerator[IndexMultiplierColumnConfig |
109 | 109 | - Generic type `ColumnGenerator[IndexMultiplierColumnConfig]` connects the task to its config |
110 | 110 | - `metadata()` describes your generator and its requirements |
111 | 111 | - `generation_strategy` can be either `FULL_COLUMN` or `CELL_BY_CELL` |
112 | | -- `required_resources` lists any required resources (models, artifact storage, etc.). This parameter will change in the future, so keeping it as `None` is safe for now. |
113 | | -- Access configuration parameters via `self.config` |
| 112 | +- You have access to the configuration parameters via `self.config` |
| 113 | +- `required_resources` lists any required resources (models, artifact storage, etc.). This parameter is expected to change in the near future, so leaving it as `None` is safe for now. If your task uses the model registry, however, adding `data_designer.engine.resources.ResourceType.MODEL_REGISTRY` enables automatic model health checks for your column generation task. |
114 | 114 |
|
115 | 115 | !!! info "Understanding generation_strategy" |
116 | 116 | The `generation_strategy` specifies how the column generator will generate data. |
117 | 117 |
|
118 | | - - **`FULL_COLUMN`**: Generates the entire column at once |
119 | | - - `generate` must take a `pd.DataFrame` as input and return a `pd.DataFrame` |
| 118 | + - **`FULL_COLUMN`**: Generates the full column (at the batch level) in a single call to `generate` |
| 119 | + - `generate` must take as input a `pd.DataFrame` with all previous columns and return a `pd.DataFrame` with the generated column appended |
120 | 120 |
|
121 | 121 | - **`CELL_BY_CELL`**: Generates one cell at a time |
122 | | - - `generate` must take a `dict` as input and return a `dict` |
| 122 | + - `generate` must take as input a `dict` with key/value pairs for all previous columns and return a `dict` with an additional key/value for the generated cell |
123 | 123 | - Supports concurrent workers via a `max_parallel_requests` parameter on the configuration |
124 | 124 |
|
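The two `generate` contracts described in the info box above can be sketched with plain functions. This is a minimal illustration, not the plugin interface itself: the function names, the `base` input column, the `scaled` output column, and the `MULTIPLIER` constant are all hypothetical, and a real generator would implement `generate` on a `ColumnGenerator` subclass and read its parameters from `self.config`.

```python
import pandas as pd

# Hypothetical values; a real generator would read these from self.config.
MULTIPLIER = 3
NEW_COLUMN = "scaled"


def generate_full_column(data: pd.DataFrame) -> pd.DataFrame:
    """FULL_COLUMN shape: take the whole batch, return it with one column appended."""
    result = data.copy()  # leave the caller's DataFrame untouched
    result[NEW_COLUMN] = result["base"] * MULTIPLIER
    return result


def generate_cell(row: dict) -> dict:
    """CELL_BY_CELL shape: take one row as a dict, return it with one key added."""
    return {**row, NEW_COLUMN: row["base"] * MULTIPLIER}


# A small batch with one previously generated column.
batch = pd.DataFrame({"base": [1, 2, 3]})

full = generate_full_column(batch)
cells = [generate_cell(row) for row in batch.to_dict(orient="records")]
```

Both functions produce the same values. The difference is the unit of work: the full-column version sees the entire batch in one call, while the cell version sees one row at a time, which is what lets the engine run several calls concurrently.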
125 | 125 | ## Step 4: Create the plugin object |
|