-
Notifications
You must be signed in to change notification settings - Fork 186
[hipDNN] [ALMIOPEN-834] - Initial version of EngineConfig Knob RFC #3667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
[hipDNN] [ALMIOPEN-834] - Initial version of EngineConfig Knob RFC #3667
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks really good!
My big piece of feedback, especially for the frontend, I don't think a vector is the correct data structure for holding the knobs and knob info, and it should probably be a map instead.
My suggestion:
In the frontend use the id as the hash in a map for the knob info, and turn KnobSetting into KnobValue and have the user pass in a map of knobId to KnobValue
| } | ||
|
|
||
| table EnumConstraints { | ||
| valid_values: [string]; // Required: list of valid enum strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit weird to me, I think that the EnumValue and EnumConstraints should be the same type. Unless we're always guaranteed a string to enum conversion function it could be cumbersome to programmatically check if the value you're using is supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let's chat about this. I think I agree, but I'm not sure what the best solve would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can chat, but I would expect this to be the list of enum values as uint or something similar.
| else | ||
| { | ||
| // Use default for other knobs | ||
| customKnobSettings.push_back(info.toDefaultKnob()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the obstacles for having the default being automatically set? Is there a function where we could loop over the knob infos and fill in the defaults if they're not overriden?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything not set explicitly should be the default, and it shouldn't be required to be set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change this to a map, I think we can just replace the defaults (generated by knobToDefaultKnobSetting) with custom values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use a map, but it shouldn't be providing any values from the frontend for defaults (unless user sets it).
I think it's better to make the API just a set of explicitly overridden settings.
Doing it this way makes it good for serialization (only save what the user explicitly wanted), and also reduce versioning issues if the plugin authors change their engine configuration settings.
| }; | ||
|
|
||
| // Knob information class - describes available knobs for an engine | ||
| class Knob { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but I think it'd make things conceptually easier to follow if this name lined up with what's in the flatbuffers schema, ie: KnobInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've got a few options:
- We get
Knoband setKnobSetting - We get
KnobInfoand setKnob - We get
KnobInfoand setKnobSetting
I'll come back through and rename things when we decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on matching with schema, but can chat on the specific name choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Votes cast, first one wins.
| if(info.getKnobIdStr() == "miopen.conv.tile_size") | ||
| { | ||
| // Override tile size | ||
| customKnobSettings.emplace_back(info.getKnobId(), 32); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needing to iterate through these is a little clunky, it'd be nice if there were some sort of map-like interface, possibly mapping knob id to its constraints. That also has the advantage of making it so you can only ever have one knob info for a given knob id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of being able to write:
if(knobs.contains(knobIdToStr("miopen.conv.tile_size")){
customKnobSettings.emplace_back(info.getKnobId(), 32);
}Or a more complex type than vector as well that could give you syntax like:
if(knobs.supports("miopen.conv.tile_size"))
I think the knob settings should probably be a map too if that's possible. If anything, I think that's more important since it's managed by the user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that. Doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think we should be requiring these get documented by plugin authors, and having a header in the frontend for defining these for easy use.
This would allow us to still support custom plugin knobs, but make explicit choices available and easy to consume for users of hipDNN frontend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We likely should support both setting them via name, or Id, and manage the conversion ourselves using the hashing.
| 1. **Multiple Value Types**: int64, float64, bool, string, and enumerated types | ||
| 2. **Flexible Validation**: Per-knob validation rules appropriate to the value type | ||
| 3. **Plugin Autonomy**: Plugins can define custom knobs without modifying core hipDNN | ||
| 4. **Namespace Isolation**: Different plugins can have knobs with the same semantic meaning without conflicts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see where it's important to avoid conflicts between knobs within an engine, but where is it necessary across engines? Aren't you always setting or getting from a specific engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's a good thing for us to chat about. I was thinking for serialization purposes it might avoid headaches, but maybe not?
| display_name: string; // Human-readable name | ||
| description: string; // Help text | ||
| value_type: KnobValue; // Discriminator for union | ||
| default_value: KnobValue; // Default setting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we need this.
I think we want to know the default in user documentation, but I would expect this API to be just a set of overrides, and we don't need to provide any that are default back (unless the user decides to set it themselves).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's worth keeping just for logging purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it has utility because the knobs are plugin vendor supplied, and we don't control how they're documented, so this is a bit of a forcing function.
| void applyKnobSettings(const KnobSettings& settings) override { | ||
| // Extract and apply settings | ||
| if (auto tileSize = settings.getInt("miopen.conv.tile_size")) { | ||
| _tileSize = *tileSize; | ||
| } | ||
| if (auto algo = settings.getEnum("miopen.conv.algorithm")) { | ||
| _algorithm = parseAlgorithm(*algo); | ||
| } | ||
| if (auto reuse = settings.getBool("miopen.conv.reuse_workspace")) { | ||
| _reuseWorkspace = *reuse; | ||
| } | ||
| } | ||
|
|
||
| private: | ||
| int64_t _tileSize = 16; | ||
| Algorithm _algorithm = Algorithm::Winograd; | ||
| bool _reuseWorkspace = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these properties exist at the execution plan level.
They should be serialized with the execution plan, and the engine shouldn't be stateful with them I believe.
We likely need some way to register the properties that are available at the engine, but they get set and stored with the plan itself. The properties would then be used during the build stage, or execute stage of the plan.
It would be up to the plan to decide if it needs to keep the properties around.
Some are likely build time properties, and others are runtime properties.
| // Get default value (type-specific) | ||
| std::optional<int64_t> getDefaultInt() const; | ||
| std::optional<double> getDefaultFloat() const; | ||
| std::optional<bool> getDefaultBool() const; | ||
| std::optional<std::string> getDefaultString() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like that if we remove the default value then this goes away.
| // Option 1: Use default knob values | ||
| auto defaultKnobSettings = Graph::knobToDefaultKnobSettings(knobs); | ||
| graph.create_execution_plan_ext(MIOPEN_ENGINE, defaultKnobSettings); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the above, I think no reason to have any default's.
The current API that does it without knobs should pass no knobs into the backend, and will use defaults for all properties.
I guess this also means for the documentation we should make it clear that all knob settings are optional, and a default must always be provided by the plugin authors.
|
This looks really good @mousdahl-amd! Some higher-level thoughts/questions:
Just some thoughts. The RFC looks good to me as-is too. And sorry for the last minute review. |
| min_value: int64; | ||
| max_value: int64; | ||
| stride: int64 = 1; // Step size (default 1) | ||
| valid_values: [int64]; // Optional: explicit list of valid values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this for the user to use to pick the value? If the user sets something invalid, is it the plugin that would reject the knob setting?
| To prevent conflicts between plugins, we recommend a hierarchical naming scheme: | ||
|
|
||
| ``` | ||
| <plugin_name>.<engine_name>.<knob_name> | ||
|
|
||
| Examples: | ||
| - miopen.conv.tile_size | ||
| - miopen.conv.algorithm | ||
| - rocblas.gemm.transpose_algorithm | ||
| - custom_plugin.matmul.block_size | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a requirement that these are declared in the frontend also will allow us to check for overlaps.
For custom plugins we can't enforce this, but that's not a big issue.
I think it's also fine if they overlap, they don't actually have to be globally unique as far as I can tell.
|
|
||
| **Rationale**: | ||
| - **Plugin Autonomy**: Plugins can define custom knobs without modifying core hipDNN | ||
| - **Collision Avoidance**: Namespace prefixes prevent conflicts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should strip this concern out of the RFC.
I don't see a reason why an overlapping knob would be problematic, it just needs to be unique within the engine itself.
| ### 5.4 Validation at Finalization | ||
|
|
||
| **Decision**: Validate knob settings during `hipdnnBackendFinalize(engineCfg)`. | ||
|
|
||
| **Rationale**: | ||
| - **Early Error Detection**: Catch invalid settings before execution | ||
| - **Clear Error Messages**: Can provide detailed validation errors | ||
| - **Consistency**: Matches existing finalization pattern in hipDNN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have sanity checks in the frontend from the provided definition if we can manage it.
Since it's provided from the backend, I think we could do it at that level, and also the plugin level itself as a last resort (this will catch issues with breaking engine config changes that were serialized).
| ### 5.5 Serialization Support | ||
|
|
||
| **Decision**: Engine configurations with knob settings are serializable via Flatbuffers. | ||
|
|
||
| **Rationale**: | ||
| - **Persistence**: Users can save optimal configurations | ||
| - **Reproducibility**: Configurations can be shared across runs/systems | ||
| - **Debugging**: Serialized configs can be inspected and modified | ||
| - **Integration**: Works naturally with existing Flatbuffer infrastructure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is implied, but we might want to call out that these will not be serialized with the graph, but instead serialized with the execution plans.
Serializing execution plans isn't existing functionality today, but will exist in the future.
Motivation
We want the ability to expose engine configuration parameters (knobs) to consumers of hipDNN. This RFC presents a path forward.
Submission Checklist