@Jaddyen Jaddyen commented Jul 25, 2025

This adds a pass that adds a getBufferForName function to EmitC classes that enables runtime lookup of field buffers by their string names.
This allows us to get the following C++ emission:

#include <map>
#include <string>
class mainClass {
 public:
  float[1] fieldName0;
  float[1] fieldName1;
  float[1] fieldName2;
  
  const std::map<std::string, char*> v2 = {{"another_feature", reinterpret_cast<char*>(&fieldName0)}, { "some_feature", reinterpret_cast<char*>(&fieldName1)}, { "output_0", reinterpret_cast<char*>(&fieldName2)}};

  void execute() {
    size_t v1 = 0;
    float[1] v2 = fieldName0;
    float[1] v3 = fieldName1;
    float[1] v4 = fieldName2;
    float v5 = v3[v1];
    float v6 = v2[v1];
    float v7 = v5 + v6;
    v4[v1] = v7;
    return;
  }

};
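
To make the intended use concrete, here is a minimal hand-written sketch of how a caller might bind inputs and outputs by name through such a map. It is not the pass's actual output: the class name `MainClassSketch`, the member name `buffers`, and the helpers `getBufferForName` and `runOnce` are hypothetical, and the fields use the standard `float name[1]` declarator so that the snippet compiles. It also assumes the caller knows each buffer's element type.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the emitted class; all names are illustrative.
class MainClassSketch {
 public:
  MainClassSketch() = default;
  // The map stores pointers into this object, so copies are disallowed here.
  MainClassSketch(const MainClassSketch&) = delete;

  float fieldName0[1] = {0.0f};
  float fieldName1[1] = {0.0f};
  float fieldName2[1] = {0.0f};

  // Name-to-buffer map, built once when the object is constructed.
  const std::map<std::string, char*> buffers = {
      {"another_feature", reinterpret_cast<char*>(&fieldName0)},
      {"some_feature", reinterpret_cast<char*>(&fieldName1)},
      {"output_0", reinterpret_cast<char*>(&fieldName2)}};

  void execute() { fieldName2[0] = fieldName1[0] + fieldName0[0]; }
};

// Returns the buffer registered under `name`, or nullptr if there is none.
char* getBufferForName(const MainClassSketch& model, const std::string& name) {
  auto it = model.buffers.find(name);
  return it == model.buffers.end() ? nullptr : it->second;
}

// Writes two inputs by name, runs the model, and reads the output by name.
float runOnce(MainClassSketch& model, float a, float b) {
  char* in0 = getBufferForName(model, "another_feature");
  char* in1 = getBufferForName(model, "some_feature");
  char* out = getBufferForName(model, "output_0");
  assert(in0 && in1 && out && "all three names must be registered");
  std::memcpy(in0, &a, sizeof(float));
  std::memcpy(in1, &b, sizeof(float));
  model.execute();
  float result = 0.0f;
  std::memcpy(&result, out, sizeof(float));
  return result;
}
```

The copy constructor is deleted in this sketch because a copied map would still point into the original object's fields.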

@Jaddyen Jaddyen requested review from ilovepi, jpienaar and mtrofin July 25, 2025 06:04
@jpienaar
Member

Shouldn't it be "char* getBufferForName"? (It returns a string.)

@mtrofin
Member

mtrofin commented Jul 25, 2025

The map should be declared as a member of the class.

Contributor

Suggested change
- This would require that the class has fields with attributes and a function named `execute`.
+ This requires that the class has fields with attributes and a function named `execute`.

Why is execute a requirement?

Contributor Author

I insert the new function before the execute function.
Ideally, we could have some other insertion point, but the execute function made the most sense since it is always added when we run wrap-emitc-func-in-class.

Comment on lines 57 to 60
Contributor

Since this code is repeated, I'd suggest making it a helper function or a lambda. Then the code here can be

  if (!hasMap)
    addHeader(kMapLibraryHeader);
  if (!hasString)
    addHeader(kStringLibraryHeader);
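
A minimal sketch of the helper-lambda idea, using a plain vector of header names in place of the module's include ops. The function name `ensureHeaders`, the constant `kStringLibraryHeader`, and the vector representation are assumptions made for illustration; the pass itself would work on MLIR operations instead.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical header names; the pass's real constants may be spelled differently.
const std::string kMapLibraryHeader = "map";
const std::string kStringLibraryHeader = "string";

// Adds the map and string headers only if they are not already present.
// `includes` stands in for the include ops found in the module.
void ensureHeaders(std::vector<std::string>& includes) {
  auto hasHeader = [&](const std::string& name) {
    return std::find(includes.begin(), includes.end(), name) != includes.end();
  };
  auto addHeader = [&](const std::string& name) {
    includes.insert(includes.begin(), name);
  };

  if (!hasHeader(kMapLibraryHeader))
    addHeader(kMapLibraryHeader);
  if (!hasHeader(kStringLibraryHeader))
    addHeader(kStringLibraryHeader);

  // Postcondition: both headers are now available.
  assert(hasHeader(kMapLibraryHeader) && hasHeader(kStringLibraryHeader));
}
```

Calling the function twice on the same list leaves it unchanged the second time, which is the property the duplicated checks were after.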

Comment on lines 38 to 39
Contributor

Suggested change
- bool hasMap = false;
- bool hasString = false;
+ bool hasMapHdr = false;
+ bool hasStringHdr = false;

nit: hasString is a pretty common name that usually isn't about the header. Let's just make it completely unambiguous.

Comment on lines 52 to 53
Contributor

If you're going to early exit, you might as well put it in the loop.

Contributor

Should this be a cast<>? If you know it will be of this type (e.g. can't fail) then cast<> is appropriate. Otherwise, I think you still need to check dyn_cast<> for success.

Contributor Author

ack, thanks for the pointer!
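
The distinction can be illustrated with standard C++ casts. This is an analogy only: LLVM's `cast<>` and `dyn_cast<>` rely on LLVM's own `classof`-based RTTI rather than C++ RTTI, and the types `Attr`, `StringAttrLike`, and `IntAttrLike` below are hypothetical stand-ins, not MLIR classes.

```cpp
#include <cassert>

struct Attr {
  virtual ~Attr() = default;
};
struct StringAttrLike : Attr {
  const char* value = "some_feature";
};
struct IntAttrLike : Attr {
  int value = 0;
};

// dyn_cast-style: the type is not guaranteed, so the result must be checked.
const char* nameOrNull(const Attr& attr) {
  if (const auto* str = dynamic_cast<const StringAttrLike*>(&attr))
    return str->value;
  return nullptr;
}

// cast-style: the caller guarantees the type, so a mismatch is a programming
// error that is caught by an assertion instead of being handled at runtime.
const char* nameChecked(const Attr& attr) {
  assert(dynamic_cast<const StringAttrLike*>(&attr) &&
         "expected a string attribute");
  return static_cast<const StringAttrLike&>(attr).value;
}
```

The unchecked pattern to avoid is the middle ground: a `dyn_cast<>`-style conversion whose result is dereferenced without a null test.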

Comment on lines 117 to 119
Contributor

Suggested change
- std::string indexPath = stringAttr.getValue().str();
- fieldNames.emplace_back(indexPath, fieldOp.getName().str());
+ fieldNames.emplace_back(stringAttr.getValue().str(), fieldOp.getName().str());

You can avoid a copy here; otherwise you can probably use std::move() to similar effect.
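
Both variants can be sketched with standard containers. The alias `FieldNames` and the two helper functions are hypothetical, and plain C strings stand in for the attribute and op accessors used in the pass.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using FieldNames = std::vector<std::pair<std::string, std::string>>;

// Passes temporaries straight to emplace_back, so no named intermediate
// string has to be copied into the pair.
void addField(FieldNames& fieldNames, const char* attrValue,
              const char* fieldName) {
  assert(attrValue && fieldName && "both names must be non-null");
  fieldNames.emplace_back(std::string(attrValue), std::string(fieldName));
}

// Alternative when a named local is useful: move it into the container.
void addFieldMoved(FieldNames& fieldNames, const char* attrValue,
                   const char* fieldName) {
  assert(attrValue && fieldName && "both names must be non-null");
  std::string indexPath(attrValue);
  fieldNames.emplace_back(std::move(indexPath), std::string(fieldName));
}
```

After `std::move`, `indexPath` is left in a valid but unspecified state, so the second form only makes sense when the local is not read again.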

Comment on lines 109 to 116
Contributor

Instead of nesting so deeply w/ dyn_cast<>, can you just early exit if the cast fails?
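
A small sketch of the early-exit shape, again using standard C++ in place of LLVM's casting utilities. The types `Attr`, `StrAttr`, and `IntAttr` and the function `collectNames` are hypothetical; each failed check skips the element with `continue` so the useful work stays at one indentation level.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Attr {
  virtual ~Attr() = default;
};
struct StrAttr : Attr {
  explicit StrAttr(std::string v) : value(std::move(v)) {}
  std::string value;
};
struct IntAttr : Attr {
  int value = 0;
};

// Collects the values of non-empty string attributes. Anything that fails a
// check is skipped immediately instead of opening another nested block.
std::vector<std::string> collectNames(const std::vector<const Attr*>& attrs) {
  std::vector<std::string> names;
  for (const Attr* attr : attrs) {
    if (!attr)
      continue;
    const auto* str = dynamic_cast<const StrAttr*>(attr);
    if (!str)
      continue;
    if (str->value.empty())
      continue;
    names.push_back(str->value);
  }
  assert(names.size() <= attrs.size());
  return names;
}
```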

Comment on lines 131 to 123
Contributor

I'd recommend doing this w/ a string_stream. You can avoid a lot of copies and you can use things like formatv() to make the code easier to read.
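
As a rough illustration of the stream-based approach, the sketch below builds the map initializer text with `std::ostringstream`. In LLVM code the equivalent would more likely be `llvm::raw_string_ostream`; the standard stream is used here only so the snippet is self-contained. The function name `buildMapInitializer` and the pair layout (name first, field second) are assumptions.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Builds a brace-enclosed initializer list that maps each name to the address
// of its field, writing into one stream instead of concatenating strings.
std::string buildMapInitializer(
    const std::vector<std::pair<std::string, std::string>>& fieldNames) {
  std::ostringstream os;
  os << "{";
  const char* separator = "";
  for (const auto& entry : fieldNames) {
    assert(!entry.second.empty() && "field name must not be empty");
    os << separator << "{\"" << entry.first
       << "\", reinterpret_cast<char*>(&" << entry.second << ")}";
    separator = ", ";
  }
  os << "}";
  return os.str();
}
```

The separator trick avoids a trailing comma without needing an index or a second pass over the string.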

Comment on lines 142 to 167
Contributor

For blocks like this, where you're basically spelling out the C++ code, you may want to write that down in a comment, so it's easy to see what operations you're doing.

Contributor Author

Since we now simply return the full map, I have reduced the amount of C++ code I'm spelling out.
I appreciate the pointer!

Contributor

Should the \22 be in the output? I'd expect maybe \", but IDK if that's going to work correctly. It doesn't seem right to me at least...

Contributor Author

I think it is some weird parsing going on.
I chose to not take it too seriously since it didn't change the output when it came to cpp.
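
For context, `\22` reads as a two-digit hexadecimal escape: 0x22 is the ASCII code of the double-quote character, and this appears to be how quotes inside string attributes are printed in MLIR's textual form. The sketch below decodes such escapes to show that `\22some_feature\22` denotes `"some_feature"`. The helpers `hexValue` and `decodeHexEscapes` are hypothetical and not part of any MLIR API.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Converts one hexadecimal digit to its numeric value.
static int hexValue(char c) {
  assert(std::isxdigit(static_cast<unsigned char>(c)) && "not a hex digit");
  if (c >= '0' && c <= '9')
    return c - '0';
  return std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
}

// Replaces each backslash followed by two hex digits with the byte it encodes;
// all other characters are copied through unchanged.
std::string decodeHexEscapes(const std::string& text) {
  std::string out;
  for (std::size_t i = 0; i < text.size(); ++i) {
    const bool isEscape =
        text[i] == '\\' && i + 2 < text.size() &&
        std::isxdigit(static_cast<unsigned char>(text[i + 1])) &&
        std::isxdigit(static_cast<unsigned char>(text[i + 2]));
    if (!isEscape) {
      out.push_back(text[i]);
      continue;
    }
    out.push_back(static_cast<char>(hexValue(text[i + 1]) * 16 +
                                    hexValue(text[i + 2])));
    i += 2;
  }
  return out;
}
```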

@ajaden-codes ajaden-codes force-pushed the add-ref-map branch 2 times, most recently from f4512f1 to 94b0c34 Compare July 28, 2025 16:13
@Jaddyen
Contributor Author

Jaddyen commented Jul 29, 2025

The map should be declared as a member of the class.

We could:

  1. Declare the map as a member of the class, pass it as an argument to the function then return it from the function or
  2. Initialize the map within the function and return it from the function.

@Jaddyen Jaddyen marked this pull request as ready for review July 29, 2025 04:15
@mtrofin
Member

mtrofin commented Jul 29, 2025

Let's do 1. 2 would mean re-creating the map at every call to that function, which would need to be renamed to indicate that (create not get), but more importantly, it's not necessary to re-create the map as the data inside of it doesn't change and shouldn't change.
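
The difference between the two options can be sketched as follows. The class `ModelSketch` and the method names `getBufferMap` and `createBufferMap` are hypothetical, chosen to mirror the naming point made above: a getter hands out a reference to a map that was built once, while a function that rebuilds the map on each call is better described as "create".

```cpp
#include <cassert>
#include <map>
#include <string>

class ModelSketch {
 public:
  ModelSketch() = default;
  // The member map points into this object, so copying is disallowed here.
  ModelSketch(const ModelSketch&) = delete;

  float input[1] = {0.0f};
  float output[1] = {0.0f};

  // Option 1: the map is a member built once at construction; the getter
  // returns a reference and does no work per call.
  const std::map<std::string, char*>& getBufferMap() const {
    assert(bufferMap.size() == 2 && "map is fixed after construction");
    return bufferMap;
  }

  // Option 2: the map is rebuilt and returned by value on every call.
  std::map<std::string, char*> createBufferMap() {
    return {{"input", reinterpret_cast<char*>(&input)},
            {"output", reinterpret_cast<char*>(&output)}};
  }

 private:
  const std::map<std::string, char*> bufferMap = {
      {"input", reinterpret_cast<char*>(&input)},
      {"output", reinterpret_cast<char*>(&output)}};
};
```

Both functions expose the same bindings; only the first avoids allocating a new map for each lookup.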

Comment on lines 123 to 119
Contributor

Should we stop in the error case? I'd assume you'd want to return an error code here, instead of continuing.

Comment on lines 111 to 112
Contributor

Do you think this would be more readable as a formatv()? IMO, structures like this are harder to read when constructed via stream (e.g. it's easier to misread them or miss a detail). WDYT?

Comment on lines 63 to 68
Contributor

LLVM style is to omit braces for single statement bodies. I'm not sure if MLIR deviates (I'm fine either way), but we should be consistent, and you have a different convention a few lines above.

@github-actions

github-actions bot commented Jul 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Jaddyen Jaddyen requested a review from ilovepi July 29, 2025 21:11
@Jaddyen Jaddyen marked this pull request as draft July 29, 2025 21:27
@Jaddyen Jaddyen marked this pull request as ready for review August 1, 2025 21:54
@aniragil
Contributor

aniragil commented Aug 3, 2025

[Apologies if I missed some earlier community discussion on this]
This patch seems to be one in a series aimed at supporting specific MLGO features. Would be good if we could separate generic contributions that benefit most/all EmitC users (e.g. adding an emitc.class op) from downstream-specific ones. For instance, the pass added here seems to perform a rather specific transformation and rely on existing dialect components. Could you elaborate on why it belongs upstream in MLIR core?
If you believe these patterns (reflection map, func-to-class for AoT) to be beneficial for many EmitC users, would be great if you could post an RFC on the MLIR Discourse to facilitate a wider discussion in the community.

+@marbre

@marbre
Member

marbre commented Aug 4, 2025

Thanks @aniragil!

While there has been some discussion dating back to 2023 on what MLGO would need, resulting in the efforts by @simon-camp to add an upstream-supported lowering to EmitC (PR #11754), it isn't clear to me what else is needed. Therefore, I would appreciate discussing this based on an RFC, as suggested by @aniragil.

@mtrofin
Member

mtrofin commented Aug 4, 2025

The RFC in question is this one. The MLGO use case was used as one of the motivations, especially since MLGO is in-tree. The additional requirements (for MLGO) were listed at a high level; this patch here is for the "ability to bind by name" part.

Perhaps we should make that relation to the RFC clearer in this patch description?

@marbre
Member

marbre commented Aug 4, 2025

That RFC was specifically about upstreaming the TOSA to EmitC conversions and the reference implementation, both implemented in https://github.com/iml130/mlir-emitc/. It is correct that the MLGO use case was highlighted as a motivation, but that specific RFC never got a lot of traction and was never accepted. I think what Gil is asking for is a separate, more detailed RFC with regard to what is needed and what operations or conversions need to be implemented. It can of course refer to the linked thread and re-use arguments.

@mtrofin
Member

mtrofin commented Aug 4, 2025

That RFC was specifically about upstreaming the TOSA to EmitC conversions and the reference implementation, both implemented in https://github.com/iml130/mlir-emitc/. It is correct that the MLGO use case was highlighted as a motivation, but that specific RFC never got a lot of traction and was never accepted.

Right, and IIRC there was no explicit RFC signoff process at the time anyway. On that point, asking to learn (and to make sure we follow the right steps): is there an explicit signoff now in MLIR, or, like in LLVM, is that only an escalation when there are disagreements?

I think what Gil is asking for is a separate, more detailed RFC with regards to what is needed and what operations or conversions need to be implemented. It can of course refer to the linked thread and re-use arguments.

We should have one up today. I am concerned with timing here, though, and would love it if we could find a way to make progress in the meantime. I'm assuming there's no objection to other work continuing, like lowering opcodes (that's quite generic), while folks look at the pieces that are more MLGO-specific and covered by the RFC, correct?

@simon-camp
Contributor

As this pass is very use case specific, it could also be moved to the MLGO side of the repo together with a custom opt tool to run it.

@Jaddyen
Contributor Author

Jaddyen commented Aug 5, 2025

Here is an RFC.

@aniragil
Contributor

aniragil commented Aug 6, 2025

I am concerned with timing here, though, and would love it if we could find a way to make progress in the meantime.

One way to try and speed things up is to request more EmitC folks in the community for review in addition to @marbre and myself, e.g. @simon-camp, @mgehre-amd, @jacquesguan.

I'm assuming there's no objection to other work continuing, like lowering opcodes (that's quite generic), while folks look at the pieces that are more MLGO-specific and covered by the RFC, correct?

It's eventually up to the reviewers, so best to upload and have concrete discussions per case. For instance, malloc and memcpy were such generic contributions.
