Description
In the existing WebNN spec, conv2d supports two input operand layouts defined by MLInputOperandLayout and four filter operand layouts defined by MLConv2dFilterOperandLayout.
enum MLInputOperandLayout {
  "nchw",
  "nhwc"
};
enum MLConv2dFilterOperandLayout {
  "oihw",
  "hwio",
  "ohwi",
  "ihwo"
};
This can complicate the implementation, especially when a native ML framework or OS API doesn't support some of these layouts. For each unsupported layout, the implementation may need to insert transpose operations into the graph around the conv2d operation to convert the unsupported layout to a supported one. This can easily lead to an inefficient graph representation with redundant transpose operations. Alternatively, the implementation may need to optimize the graph with techniques such as "transpose sink", which requires a more complex implementation. This issue was raised in Chromium CL review.
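To make the redundancy concrete, here is a minimal sketch in plain Python, not taken from any WebNN implementation. The helper names (`layout_permutation`, `compose`, `is_identity`) are hypothetical, and the scenario assumes a backend that only supports "nhwc" running two back-to-back conv2d operations declared as "nchw". The transpose inserted after the first conv2d and the one inserted before the second cancel each other, which is the kind of pair a graph optimizer would have to detect and remove.

```python
def layout_permutation(src: str, dst: str) -> list[int]:
    """Return perm such that output axis i is taken from input axis perm[i]."""
    if len(set(src)) != len(src) or sorted(src) != sorted(dst):
        raise ValueError(f"cannot convert layout {src!r} to {dst!r}")
    return [src.index(axis) for axis in dst]


def compose(first: list[int], second: list[int]) -> list[int]:
    """Permutation equivalent to transposing by `first`, then by `second`."""
    return [first[i] for i in second]


def is_identity(perm: list[int]) -> bool:
    """True when the permutation leaves every axis in place."""
    return perm == list(range(len(perm)))


# Hypothetical scenario: the backend supports only "nhwc", the graph uses "nchw".
to_nhwc = layout_permutation("nchw", "nhwc")  # [0, 2, 3, 1]
to_nchw = layout_permutation("nhwc", "nchw")  # [0, 3, 1, 2]

# After the first conv2d the output is converted back to "nchw"; before the
# second conv2d it is converted to "nhwc" again. The pair is a no-op.
redundant_pair = compose(to_nchw, to_nhwc)
pair_is_removable = is_identity(redundant_pair)
```

Without an optimization pass that recognizes such pairs, both transposes stay in the graph and are executed at runtime.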
To simplify the implementation, the proposal is to reduce the number of supported operand layouts, for example by keeping only the default one. Because WebNN supports the transpose operation, layout adaptation and graph-level optimization can be handled by ML frameworks, which usually already support such functionality.
Thanks @wacky6 for this idea.