Update on "[llm] Add a generic text only LLM runner"

larryliu0820 · larryliu0820 · commit 13125f436d67 · 2025-06-05T23:53:18.000-07:00
Introducing `text_llm_runner`, which can run all text-only, decoder-only LLM models supported by ExecuTorch.

* Metadata is read from the .pte file and used to construct the runner object.
* examples/models/llama/runner.h[.cpp] now contains only a simple wrapper around `text_llm_runner.h[.cpp]`.

In follow-up PRs I will move examples/models/phi-3-mini/runner to the generic runner. I will also look into the QNN and MediaTek runners.

Differential Revision: [D75910889](https://our.internmc.facebook.com/intern/diff/D75910889/)

[ghstack-poisoned]
diff --git a/extension/llm/runner/test/CMakeLists.txt b/extension/llm/runner/test/CMakeLists.txt
@@ -25,4 +25,5 @@ et_cxx_test(
   ${_test_srcs}
   EXTRA_LIBS
   executorch
+  extension_llm_runner
 )
diff --git a/test/run_oss_cpp_tests.sh b/test/run_oss_cpp_tests.sh
@@ -32,6 +32,7 @@ build_executorch() {
   if [ -x "$(command -v glslc)" ]; then
     BUILD_VULKAN="ON"
   fi
+  # -DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON \  TODO(larryliu0820): Fix the name collision between Abseil and XNNPACK and turn this on.
   cmake . \
     -DCMAKE_INSTALL_PREFIX=cmake-out \
     -DEXECUTORCH_USE_CPP_CODE_COVERAGE=ON \
@@ -40,7 +41,6 @@ build_executorch() {
     -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
     -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
     -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
-    # -DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON \  TODO(larryliu0820): Fix the name collision between Abseil and XNNPACK and turn this on.
     -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
     -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
     -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
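
The commit message does not say why the commented-out flag in `run_oss_cpp_tests.sh` moves above the `cmake` call. One plausible reason is that in a POSIX shell a `#` comment in the middle of a backslash-continued command ends that command, so the flags after it never reach the command. The sketch below shows that behavior in isolation. `count_args` and the `-DOPTION_*` flags are hypothetical stand-ins for `cmake` and its options and are not part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for cmake: prints how many arguments it received.
count_args() { echo "$#"; }

# Comment placed before the command: the continuation is unbroken,
# so both flags are passed.
good_invocation() {
  # -DOPTION_B=ON \  TODO: re-enable later
  count_args \
    -DOPTION_A=ON \
    -DOPTION_C=ON
}

# Comment placed inside the continuation: the comment runs to the end of
# its line, which terminates the command after -DOPTION_A=ON. The next
# line is then run as a separate (nonexistent) command.
bad_invocation() {
  count_args \
    -DOPTION_A=ON \
    # -DOPTION_B=ON \  TODO: re-enable later
    -DOPTION_C=ON
}

good=$(good_invocation)
bad=$(bad_invocation 2>/dev/null)  # hide the "command not found" message

echo "comment before command: $good argument(s)"
echo "comment mid-command:    $bad argument(s)"
```

With the comment placed first, as in the updated script, every `-D` flag after it stays part of the single `cmake` invocation.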
