Skip to content

[Docs] The end-to-end optimization tutorial doesn't work on the latest code #18481

@cmpute

Description

@cmpute

Thanks for participating in the TVM community! We use https://discuss.tvm.ai for any general usage questions and discussions. The issue tracker is used for actionable items such as feature proposals discussion, roadmaps, and bug tracking. You are always welcomed to post on the forum first 😸

Issues that are inactive for a period of time may get closed. We adopt this policy so that we won't lose track of actionable issues that may fall at the bottom of the pile. Feel free to reopen a new one if you feel there is an additional problem that needs attention when an old one gets closed.

Documentation Title & Type

End-to-End Optimize Model

Additions/Changes Requested

When running the tutorial with the latest code, the following exception is raised:

# Metadata omitted. Use show_meta=True in script() method to show it.

Traceback (most recent call last):
  File "//workspace/test_tvm2.py", line 72, in <module>
    ex = tvm.compile(mod, target="cuda")
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/driver/build_module.py", line 104, in compile
    return tvm.relax.build(
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/relax/vm_build.py", line 263, in build
    return _vmlink(
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/relax/vm_build.py", line 158, in _vmlink
    lib = tvm.tir.build(tir_mod, target=target, pipeline=tir_pipeline)
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/tir/build.py", line 226, in build
    mod = pipeline(mod)
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/ir/transform.py", line 167, in __call__
    return _ffi_transform_api.RunPass(self, mod)
  File "python/tvm_ffi/cython/function.pxi", line 678, in core.Function.__call__
  File "<unknown>", line 0, in tvm::transform::Pass::operator()(tvm::IRModule) const
  File "<unknown>", line 0, in tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  File "<unknown>", line 0, in tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  File "<unknown>", line 0, in std::_Function_handler<tvm::IRModule (tvm::IRModule, tvm::transform::PassContext), tvm::transform::__TVMFFIStaticInitFunc4()::{lambda(tvm::ffi::TypedFunction<tvm::IRModule (tvm::ffi::RValueRef<tvm::IRModule, void>, tvm::transform::PassContext)>, tvm::transform::PassInfo)#1}::operator()(tvm::ffi::TypedFunction<tvm::IRModule (tvm::ffi::RValueRef<tvm::IRModule, void>, tvm::transform::PassContext)>, tvm::transform::PassInfo) const::{lambda(tvm::IRModule, tvm::transform::PassContext)#1}>::_M_invoke(std::_Any_data const&, tvm::IRModule&&, tvm::transform::PassContext&&)
  File "<unknown>", line 0, in tvm::transform::__TVMFFIStaticInitFunc4()::{lambda(tvm::ffi::TypedFunction<tvm::IRModule (tvm::ffi::RValueRef<tvm::IRModule, void>, tvm::transform::PassContext)>, tvm::transform::PassInfo)#1}::operator()(tvm::ffi::TypedFunction<tvm::IRModule (tvm::ffi::RValueRef<tvm::IRModule, void>, tvm::transform::PassContext)>, tvm::transform::PassInfo) const::{lambda(tvm::IRModule, tvm::transform::PassContext)#1}::operator()(tvm::IRModule, tvm::transform::PassContext) const
  File "python/tvm_ffi/cython/function.pxi", line 732, in core.tvm_ffi_callback
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/tir/pipeline.py", line 123, in _pipeline
    mod = tvm.ir.transform.Sequential(passes)(mod)
  File "/opt/mlc-llm/3rdparty/tvm/python/tvm/ir/transform.py", line 167, in __call__
    return _ffi_transform_api.RunPass(self, mod)
  File "python/tvm_ffi/cython/function.pxi", line 678, in core.Function.__call__
  File "<unknown>", line 0, in tvm::transform::Pass::operator()(tvm::IRModule) const
  File "<unknown>", line 0, in tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  File "<unknown>", line 0, in tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  File "<unknown>", line 0, in tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  File "<unknown>", line 0, in tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  File "<unknown>", line 0, in std::_Function_handler<tvm::IRModule (tvm::IRModule, tvm::transform::PassContext), tvm::tir::transform::VerifyMemory()::{lambda(tvm::IRModule, tvm::transform::PassContext)#1}>::_M_invoke(std::_Any_data const&, tvm::IRModule&&, tvm::transform::PassContext&&)
  File "<unknown>", line 0, in tvm::tir::transform::VerifyMemory()::{lambda(tvm::IRModule, tvm::transform::PassContext)#1}::operator()(tvm::IRModule, tvm::transform::PassContext) const [clone .constprop.0]
  File "<unknown>", line 0, in tvm::runtime::detail::LogFatal::~LogFatal() [clone .constprop.0]
  File "<unknown>", line 0, in tvm::runtime::detail::LogFatal::Entry::Finalize()
RuntimeError: Memory verification failed with the following errors:
    Variable `lv3` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `pool_max` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `pool_max` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
    Variable `pool_max` is directly accessed by host memory (it is not contained in a thread environment or in the function arguments.
  Did you forget to bind?
# from tvm.script import tir as T

@T.prim_func
def max_pool2d(lv3: T.Buffer((T.int64(1), T.int64(64), T.int64(112), T.int64(112)), "float32"), pool_max: T.Buffer((T.int64(1), T.int64(64), T.int64(56), T.int64(56)), "float32")):
    T.func_attr({"op_pattern": 4, "target": T.target({"arch": "sm_87", "host": {"keys": ["arm_cpu", "cpu"], "kind": "llvm", "mtriple": "aarch64-unknown-linux-gnu", "tag": ""}, "keys": ["cuda", "gpu"], "kind": "cuda", "max_num_threads": 1024, "tag": "", "thread_warp_size": 32}), "tir.noalias": True})
    pad_temp = T.allocate([831744], "float32", "global")
    pad_temp_1 = T.Buffer((T.int64(831744),), data=pad_temp)
    for ax1, ax2, ax3 in T.grid(64, 114, 114):
        lv3_1 = T.Buffer((T.int64(802816),), data=lv3.data)
        pad_temp_1[ax1 * 12996 + ax2 * 114 + ax3] = T.if_then_else(1 <= ax2 and ax2 < 113 and 1 <= ax3 and ax3 < 113, lv3_1[ax1 * 12544 + ax2 * 112 + ax3 - 113], T.float32(-340282346638528859811704183484516925440.0))
    for ax1, ax2, ax3, rv0, rv1 in T.grid(64, 56, 56, 3, 3):
        cse_v1: T.int32 = ax1 * 3136 + ax2 * 56 + ax3
        pool_max_1 = T.Buffer((T.int64(200704),), data=pool_max.data)
        if rv0 == 0 and rv1 == 0:
            pool_max_1[cse_v1] = T.float32(-340282346638528859811704183484516925440.0)
        pool_max_1[cse_v1] = T.max(pool_max_1[cse_v1], pad_temp_1[ax1 * 12996 + ax2 * 228 + rv0 * 114 + ax3 * 2 + rv1])

Unfortunately I'm not familiar enough with TVM to propose a solution — please help :)

Triage

Please refer to the list of label tags here to find the relevant tags and add them below in a bullet format (example below).

  • needs-triage

Metadata

Metadata

Assignees

No one assigned

    Labels

    needs-triage — PRs or issues that need to be investigated by maintainers to find the right assignees to address it
    type: doc

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions