Unary operators implemented. #7
base: master
… neg kernel yet though
ggml_tensor * dst,
webgpu_pipeline & pipeline,
bool in_place,
const std::vector<uint32_t> & xielu_params = {}) {
Instead of calling this `xielu_params`, call it `extra_params`, so it's more general for other future unary operators.
"SHADER_NAME": "xielu_f16",
"REPLS": {
"TYPE": "f16",
"FUNC": "dst[dst_i] = select(((exp(min(src[src_i], f16(params.eps))) - 1.0h) - src[src_i]) * f16(params.alpha_n) + f16(params.beta) * src[src_i], f16(params.alpha_p) * src[src_i] * src[src_i] + f16(params.beta) * src[src_i], src[src_i] > 0.0h);",
OK, after thinking about this more, my opinion is that we don't want to duplicate these relatively complicated statements across the different variants. I realize that this is what I do in `bin_op.tmpl.wgsl`, but in that case the operation is a single character.
So, my opinion here is that there are two options:
- Add a `DECL` for each operator with a custom `update` function, and reuse that for all the variants of a given operator.
- Modify `embed_wgsl.py` to add the ability to reuse `REPLS` across different variants. For example, the `REPLS` block could be specified like this:
"REPLS": {
"FUNC": "XIELU_FUNC"
}
And somewhere else, have a new block that's something like this:
#define REPLS
{
"XIELU_FUNC": ...
"EXP_FUNC": ...
....
}
Which you would then expand before calling `replace_placeholders` in `embed_wgsl.py`.
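To make the second option concrete, here is a minimal sketch of what the expansion step might look like. Everything in it is an assumption for illustration: `SHARED_REPLS`, `expand_shared_repls`, the shortened WGSL bodies, the template string, and the `{{KEY}}` placeholder syntax are hypothetical, and the `replace_placeholders` shown is only a stand-in, not the actual function in `embed_wgsl.py`.

```python
# Hypothetical shared table. The key names follow the comment above; the
# WGSL bodies are shortened placeholders, not the real kernels.
SHARED_REPLS = {
    "XIELU_FUNC": "dst[dst_i] = xielu(src[src_i]);",
    "EXP_FUNC": "dst[dst_i] = exp(src[src_i]);",
}


def expand_shared_repls(repls, shared):
    """Return a copy of repls where any value that names a shared entry
    is swapped for that entry's text; other values pass through unchanged."""
    return {
        key: shared[value] if isinstance(value, str) and value in shared else value
        for key, value in repls.items()
    }


def replace_placeholders(shader_text, repls):
    """Stand-in for the real function (assumes {{KEY}} placeholder syntax)."""
    for key, value in repls.items():
        shader_text = shader_text.replace("{{" + key + "}}", value)
    return shader_text


# Two variants of the same operator that share one FUNC definition.
variants = [
    {"SHADER_NAME": "xielu_f32", "REPLS": {"TYPE": "f32", "FUNC": "XIELU_FUNC"}},
    {"SHADER_NAME": "xielu_f16", "REPLS": {"TYPE": "f16", "FUNC": "XIELU_FUNC"}},
]
template = "fn update(src_i: u32, dst_i: u32) { {{FUNC}} } // {{TYPE}}"

# Expand the shared names first, then do the usual placeholder replacement.
shaders = {
    v["SHADER_NAME"]: replace_placeholders(
        template, expand_shared_repls(v["REPLS"], SHARED_REPLS)
    )
    for v in variants
}
```

With this shape, each variant only names the shared entry, so the long expression lives in one place. Note that the `xielu_f16` statement above uses type-specific pieces such as `f16(...)` and `1.0h`, so a shared entry would still need some way to stay type-generic; this sketch does not address that.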
Let me know what you think of both of these options and if you think either one would be better.