@@ -23,11 +23,6 @@ the resulting `memref` IR has no memory leaks.
 
 ## Deprecated Passes
 
-The old dialect conversion-based bufferization passes have been deprecated and
-should not be used anymore. Most of those passes have already been removed from
-MLIR. One-Shot Bufferize produces in better bufferization results with fewer
-memory allocations and buffer copies.
-
 The buffer deallocation pass has been deprecated in favor of the ownership-based
 buffer deallocation pipeline. The deprecated pass has some limitations that may
 cause memory leaks in the resulting IR.
@@ -276,18 +271,13 @@ semantics (i.e., tensor result or tensor operand) that is not bufferizable
 `to_memref`/`to_tensor` ops around the bufferization boundary.
 
 One-Shot Bufferize can be configured to bufferize only ops from a set of
-dialects with `dialect-filter`. This can be useful for gradually migrating from
-dialect conversion-based bufferization to One-Shot Bufferize. One-Shot Bufferize
-must run first in such a case, because dialect conversion-based bufferization
-generates `to_tensor` ops without the `restrict` unit attribute, which One-Shot
-Bufferize cannot analyze.
+dialects with `dialect-filter`.
 
 One-Shot Bufferize can also be called programmatically with
 [`bufferization::runOneShotBufferize`](https://github.com/llvm/llvm-project/blob/ae2764e835a26bad9774803eca0a6530df2a3e2d/mlir/include/mlir/Dialect/Bufferization/Transforms/OneShotAnalysis.h#L167).
 Alternatively,
 [`bufferization::bufferizeOp`](https://github.com/llvm/llvm-project/blob/ae2764e835a26bad9774803eca0a6530df2a3e2d/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h#L78)
-skips the analysis and inserts a copy on every buffer write, just like the
-dialect conversion-based bufferization.
+skips the analysis and inserts a copy on every buffer write.
 
 By default, function boundaries are not bufferized. This is because there are
 currently limitations around function graph bufferization: recursive
@@ -484,259 +474,3 @@ conflict detection algorithm, interested users may want to refer to:
 * [Original design document](https://discourse.llvm.org/uploads/short-url/5kckJ3DftYwQokG252teFgw3sYa.pdf)
 * [ODM talk](https://youtu.be/TXEo59CYS9A), ([slides](https://mlir.llvm.org/OpenMeetings/2022-01-13-One-Shot-Bufferization.pdf)).
 * [LLVM Dev Meeting 2023 tutorial slides](https://m-sp.org/downloads/llvm_dev_2023.pdf)
-
-## Migrating from Dialect Conversion-based Bufferization
-
-Both dialect conversion-based bufferization and One-Shot Bufferize generate
-`to_tensor`/`to_memref` ops at the bufferization boundary (when run with
-`allow-unknown-ops`). They can be combined and run in sequence. However,
-One-Shot Bufferize must run first because it cannot analyze those boundary ops.
-To update existing code step-by-step, it may be useful to specify a dialect
-filter for One-Shot Bufferize, so that dialects can be switched over one-by-one.
-
-## Dialect Conversion-based Bufferization
-
-Disclaimer: Most dialect conversion-based bufferization has been migrated to
-One-Shot Bufferize. New users should use One-Shot Bufferize (with or without
-analysis). The following documentation is only for existing users of dialect
-conversion-based bufferization.
-
-This system is a simple application of MLIR's dialect conversion infrastructure.
-The bulk of the code related to bufferization is a set of ordinary
-`ConversionPattern`'s that dialect authors write for converting ops that operate
-on `tensor`'s to ops that operate on `memref`'s. A set of conventions and best
-practices are followed that allow these patterns to be run across multiple
-independent passes (rather than requiring a single huge atomic conversion pass),
-which makes the compilation pipelines scalable, robust, and easy to debug.
-
-This document is targeted at people looking to utilize MLIR's bufferization
-functionality, along with people who want to extend it to cover their own ops.
-
-<a name="the-talk">**NOTE:**</a> Before reading this document, please watch the
-talk "Type Conversions the Not-So-Hard-Way: MLIR's New Bufferization
-Infrastructure"
-([slides](https://drive.google.com/file/d/1FVbzCXxZzS9LBLuvpPNLWJD-XDkt54ky/view?usp=sharing),
-[recording](https://drive.google.com/file/d/1VfVajitgf8ZPnd-HRkJvaJiFLhBsluXN/view?usp=sharing)).
-That talk gives a high-level overview of the bufferization infrastructure and
-important conceptual details related to using the MLIR dialect conversion
-infrastructure.
-
-### Bufferization's place in a compilation pipeline
-
-Bufferization itself does not free any of the buffers that have been allocated,
-nor does it do anything particularly intelligent with the placement of buffers
-w.r.t. control flow. Thus, a realistic compilation pipeline will usually consist
-of:
-
-1.  Bufferization
-1.  Buffer optimizations such as `buffer-hoisting`, `buffer-loop-hoisting`, and
-    `promote-buffers-to-stack`, which do optimizations that are only exposed
-    after bufferization.
-1.  Finally, running the [ownership-based buffer deallocation](OwnershipBasedBufferDeallocation.md)
-    pass.
-
-After buffer deallocation has been completed, the program will be quite
-difficult to transform due to the presence of the deallocation ops. Thus, other
-optimizations such as linalg fusion on memrefs should be done before that stage.
-
-### General structure of the bufferization process
-
-Bufferization consists of running multiple *partial* bufferization passes,
-followed by one *finalizing* bufferization pass.
-
-There is typically one partial bufferization pass per dialect (though other
-subdivisions are possible). For example, for a dialect `X` there will typically
-be a pass `X-bufferize` that knows how to bufferize all the ops in that dialect.
-By running pass `X-bufferize` for each dialect `X` in the program, all the ops
-in the program are incrementally bufferized.
-
-Partial bufferization passes create programs where only some ops have been
-bufferized. These passes will create *materializations* (also sometimes called
-"casts") that convert between the `tensor` and `memref` type, which allows
-bridging between ops that have been bufferized and ops that have not yet been
-bufferized.
-
-Finalizing bufferizations complete the bufferization process, and guarantee that
-there are no tensors remaining in the program. This involves eliminating the
-materializations. The pass `finalizing-bufferize` provides a minimal pass that
-only eliminates materializations and issues an error if any unbufferized ops
-exist in the program.
-
-However, it is possible for a finalizing bufferization to do more than just
-eliminate materializations. By adding patterns (just as a partial bufferization
-would), it is possible for a finalizing bufferization pass to simultaneously
-bufferize ops and eliminate materializations. This has a number of disadvantages
-discussed in the talk and should generally be avoided.
-
-### Example
-
-As a concrete example, we will look at the bufferization pipeline from the
-`mlir-npcomp` reference backend
-([code](https://github.com/llvm/mlir-npcomp/blob/97d6d04d41216e73d40b89ffd79620973fc14ce3/lib/RefBackend/RefBackend.cpp#L232)).
-The code, slightly simplified and annotated, is reproduced here:
-
-```c++
-// Partial bufferization passes.
-pm.addPass(createTensorConstantBufferizePass());
-pm.addNestedPass<func::FuncOp>(createTCPBufferizePass()); // Bufferizes the downstream `tcp` dialect.
-pm.addNestedPass<func::FuncOp>(createLinalgBufferizePass());
-pm.addNestedPass<func::FuncOp>(createTensorBufferizePass());
-pm.addPass(createFuncBufferizePass());
-
-// Finalizing bufferization pass.
-pm.addNestedPass<func::FuncOp>(createFinalizingBufferizePass());
-```
-
-Looking first at the partial bufferization passes, we see that there are a
-sequence of `FuncOp` passes (which run in parallel on functions). These function
-passes are bracketed by `arith-bufferize` and `func-bufferize`, which are module
-passes (and thus serialize the parallel compilation process). These two passes
-must be module passes because they make changes to the top-level module.
-
-The bulk of the bufferization work is done by the function passes. Most of these
-passes are provided as part of the upstream MLIR distribution and bufferize
-their respective dialects (e.g. `abc-bufferize` bufferizes the `abc` dialect).
-The `tcp-bufferize` pass is an exception -- it is a partial bufferization pass
-used to bufferize the downstream `tcp` dialect, and fits in perfectly with all
-the other passes provided upstream.
-
-The last pass is the finalizing bufferization pass. The `mlir-npcomp` reference
-backend has arranged that all ops are bufferized by partial bufferizations, so
-that the upstream `finalizing-bufferize` pass can be used as the finalizing
-bufferization pass. This gives excellent diagnostics when something goes wrong
-with the bufferization process, such as due to an op that wasn't handled by any
-pattern.
-
-### How to write a partial bufferization pass
-
-The contract of a partial bufferization pass is that a subset of ops (or kinds
-of ops, customizable by a ConversionTarget) get bufferized.
-
-A partial bufferization pass is just a pass that uses the
-[dialect conversion](DialectConversion.md) framework to apply
-`ConversionPattern`s with a `tensor` to `memref` type conversion.
-
-To describe how to write such a pass, we will walk through an example, the
-`tensor-bufferize` pass
-([code](https://github.com/llvm/llvm-project/blob/bc8acf2ce8ad6e8c9b1d97b2e02d3f4ad26e1d9d/mlir/lib/Dialect/Tensor/Transforms/Bufferize.cpp#L23),
-[test](https://github.com/llvm/llvm-project/blob/bc8acf2ce8ad6e8c9b1d97b2e02d3f4ad26e1d9d/mlir/test/Dialect/Tensor/bufferize.mlir#L1))
-that bufferizes the `tensor` dialect. Note that these passes have been replaced
-with a `BufferizableOpInterface`-based implementation in the meantime, so we
-have to take a looker at an older version of the code.
-
-The bulk of the code in the pass will be a set of conversion patterns, with a
-simple example being
-[BufferizeCastOp](https://github.com/llvm/llvm-project/blob/2bf6e443e54604c7818c4d1a1837f3d091023270/mlir/lib/Dialect/Tensor/Transforms/Bufferize.cpp#L23)).
-
-```
-class BufferizeCastOp : public OpConversionPattern<tensor::CastOp> {
-public:
-  using OpConversionPattern::OpConversionPattern;
-  LogicalResult
-  matchAndRewrite(tensor::CastOp op, OpAdaptor adaptor,
-                  ConversionPatternRewriter &rewriter) const override {
-    auto resultType = getTypeConverter()->convertType(op.getType());
-    rewriter.replaceOpWithNewOp<MemRefCastOp>(op, resultType, adaptor.source());
-    return success();
-  }
-};
-```
-
-See [the talk](#the-talk) for more details on how to write these patterns.
-
-The
-[pass itself](https://github.com/llvm/llvm-project/blob/bc8acf2ce8ad6e8c9b1d97b2e02d3f4ad26e1d9d/mlir/lib/Dialect/Tensor/Transforms/Bufferize.cpp#L57)
-is very small, and follows the basic pattern of any dialect conversion pass.
-
-```
-void mlir::populateTensorBufferizePatterns(
-    const BufferizeTypeConverter &typeConverter, RewritePatternSet &patterns) {
-  patterns.add<BufferizeCastOp, BufferizeExtractOp>(typeConverter,
-                                                    patterns.getContext());
-}
-
-struct TensorBufferizePass : public TensorBufferizeBase<TensorBufferizePass> {
-  void runOnOperation() override {
-    auto *context = &getContext();
-    BufferizeTypeConverter typeConverter;
-    RewritePatternSet patterns(context);
-    ConversionTarget target(*context);
-
-    populateTensorBufferizePatterns(typeConverter, patterns);
-    target.addIllegalOp<tensor::CastOp, tensor::ExtractOp>();
-    target.addLegalDialect<func::FuncDialect>();
-
-    if (failed(
-            applyPartialConversion(getOperation(), target, std::move(patterns))))
-      signalPassFailure();
-  }
-};
-```
-
-The pass has all the hallmarks of a dialect conversion pass that does type
-conversions: a `TypeConverter`, a `RewritePatternSet`, and a `ConversionTarget`,
-and a call to `applyPartialConversion`. Note that a function
-`populateTensorBufferizePatterns` is separated, so that power users can use the
-patterns independently, if necessary (such as to combine multiple sets of
-conversion patterns into a single conversion call, for performance).
-
-One convenient utility provided by the MLIR bufferization infrastructure is the
-`BufferizeTypeConverter`, which comes pre-loaded with the necessary conversions
-and materializations between `tensor` and `memref`.
-
-In this case, the `BufferizationOpsDialect` is marked as legal, so the
-`bufferization.to_tensor` and `bufferization.to_memref` ops, which are inserted
-automatically by the dialect conversion framework as materializations, are
-legal. There is a helper `populateBufferizeMaterializationLegality`
-([code](https://github.com/llvm/llvm-project/blob/a0b65a7bcd6065688189b3d678c42ed6af9603db/mlir/include/mlir/Transforms/Bufferize.h#L53))
-which helps with this in general.
-
-### Other partial bufferization examples
-
--   `func-bufferize`
-    ([code](https://github.com/llvm/llvm-project/blob/2f5715dc78328215d51d5664c72c632a6dac1046/mlir/lib/Dialect/Func/Transforms/FuncBufferize.cpp#L1),
-    [test](https://github.com/llvm/llvm-project/blob/2f5715dc78328215d51d5664c72c632a6dac1046/mlir/test/Dialect/Func/func-bufferize.mlir#L1))
-
-    -   Bufferizes `func`, `call`, and `BranchOpInterface` ops.
-    -   This is an example of how to bufferize ops that have multi-block
-        regions.
-    -   This is an example of a pass that is not split along dialect
-        subdivisions.
-
-### How to write a finalizing bufferization pass
-
-The contract of a finalizing bufferization pass is that all tensors are gone
-from the program.
-
-The easiest way to write a finalizing bufferize pass is to not write one at all!
-MLIR provides a pass `finalizing-bufferize` which eliminates the
-`bufferization.to_tensor` / `bufferization.to_memref` materialization ops
-inserted by partial bufferization passes and emits an error if that is not
-sufficient to remove all tensors from the program.
-
-This pass is sufficient when partial bufferization passes have bufferized all
-the ops in the program, leaving behind only the materializations. When possible,
-it is recommended to structure your pass pipeline this way, as this has the
-significant advantage that if an op does not get bufferized (due to a missing
-pattern, bug in the code, etc.), `finalizing-bufferize` will emit a nice clean
-error, and the IR seen by `finalizing-bufferize` will only contain only one
-unbufferized op.
-
-However, before the current bufferization infrastructure was put in place,
-bufferization could only be done as a single finalizing bufferization mega-pass
-that used the `populate*BufferizePatterns` functions from multiple dialects to
-simultaneously bufferize everything at once. Thus, one might see code in
-downstream projects structured this way. This structure is not recommended in
-new code. A helper, `populateEliminateBufferizeMaterializationsPatterns`
-([code](https://github.com/llvm/llvm-project/blob/a0b65a7bcd6065688189b3d678c42ed6af9603db/mlir/include/mlir/Transforms/Bufferize.h#L58))
-is available for such passes to provide patterns that eliminate
-`bufferization.to_tensor` and `bufferization.to_memref`.
-
-### Changes since [the talk](#the-talk)
-
--   `func-bufferize` was changed to be a partial conversion pass, and there is a
-    new `finalizing-bufferize` which serves as a general finalizing
-    bufferization pass.
--   Most partial bufferization passes have been reimplemented in terms of
-    `BufferizableOpInterface`. New users should use One-Shot Bufferize instead
-    of dialect conversion-based bufferization.