Conversation
benchmarks/CMakeCache.txt
Outdated
There was a problem hiding this comment.
CMakeFiles, CMake caches and generated makefiles should be added to gitignore
|
|
||
| Operation *op = getOperation(); | ||
| if (!op->hasTrait<OpTrait::SymbolTable>()) { | ||
| op->emitError() << " was scheduled to be run under the inliner, but does " |
There was a problem hiding this comment.
| op->emitError() << " was scheduled to be run under the inliner, but does " | |
| op->emitError() << " was scheduled to be run under the inliner, but does not " |
| @@ -0,0 +1,38 @@ | |||
| /* Top level routine for automatic kernel fusion. Creates "fusion sets", i.e. | |||
| * sets of subkernels that are to be fused together. Major steps: | |||
There was a problem hiding this comment.
| * sets of subkernels that are to be fused together. Major steps: | |
| * sets of subkernels that are to be fused together. |
Or finish the comment, I'm fine either way :)
| if (write == bufferStores.end()) | ||
| return WalkResult::advance(); | ||
|
|
||
| if (!write->second.size()) |
There was a problem hiding this comment.
What happens here if loadMem's rank is 0?
There was a problem hiding this comment.
Never mind, I see that an interrupt result means that there could be a read after write.
| ploopChains.clear(); | ||
| ploopChains.push_back({}); | ||
|
|
||
| bool noSideEffects = true; |
There was a problem hiding this comment.
It would be good to add some comments about this logic. It looks like it will always group all parallels in a block into chains of 2: (i, i+1), (i+2, i+3), ...
|
|
||
| // {{{ generic -> einsum | ||
|
|
||
| typedef struct EinsumArg { |
There was a problem hiding this comment.
Prefer C++ style struct declarations like
struct EinsumArg
{
};
| if ( | ||
| !isa<arith::AddFOp>(op) && | ||
| !isa<arith::MulFOp>(op) && | ||
| !isa<linalg::YieldOp>(op) | ||
| ) { |
There was a problem hiding this comment.
| if ( | |
| !isa<arith::AddFOp>(op) && | |
| !isa<arith::MulFOp>(op) && | |
| !isa<linalg::YieldOp>(op) | |
| ) { | |
| if (!isa<arith::AddFOp, arith::MulFOp, linalg::YieldOp>(op)) { |
| !isa<arith::MulFOp>(op) && | ||
| !isa<linalg::YieldOp>(op) | ||
| ) { | ||
| op.dump(); |
There was a problem hiding this comment.
| op.dump(); | |
| llvm::outs() << "Note: cannot convert following linalg.generic to einsum:\n"; | |
| op.dump(); |
OK to dump the op but should give a message for why it's being printed. Or just get rid of the dump since this is only one possible case for why a generic isn't a valid einsum.
| } | ||
| } | ||
|
|
||
| std::set<char> outputIndices; |
There was a problem hiding this comment.
This part is nice and elegant
| FusedEinsum fusedEinsum; | ||
|
|
||
| std::vector<EinsumSpecification> fused_einsums; | ||
| for (auto outer = einsums.rbegin(); outer != einsums.rend(); ++outer) { |
There was a problem hiding this comment.
This function should have some comments explaining the steps
No description provided.