diff --git a/backends/apple/mps/setup.md b/backends/apple/mps/setup.md
index 697d93ea659..7cd4c240a43 100644
--- a/backends/apple/mps/setup.md
+++ b/backends/apple/mps/setup.md
@@ -116,7 +116,7 @@ python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp
cd executorch
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
```
-2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./sdk-etdump.md).
+2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./etdump.md).
```
./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs
```
diff --git a/docs/source/build-run-coreml.md b/docs/source/build-run-coreml.md
index 6e0cc802df6..9751dc066f2 100644
--- a/docs/source/build-run-coreml.md
+++ b/docs/source/build-run-coreml.md
@@ -100,7 +100,7 @@ python3 -m examples.apple.coreml.scripts.export --model_name mv3 --generate_etre
# Builds `coreml_executor_runner`.
./examples/apple/coreml/scripts/build_executor_runner.sh
```
-3. Run and generate an [ETDump](./sdk-etdump.md).
+3. Run and generate an [ETDump](./etdump.md).
```bash
cd executorch
@@ -108,7 +108,7 @@ cd executorch
./coreml_executor_runner --model_path mv3_coreml_all.pte --profile_model --etdump_path etdump.etdp
```
-4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the [ETDump](./sdk-etdump.md) you have sourced from the runtime along with the optionally generated [ETRecord](./etrecord.rst) from step 1 or execute the following command in your terminal to display the profiling data table.
+4. Create an instance of the [Inspector API](./model-inspector.rst) by passing in the [ETDump](./etdump.md) you have sourced from the runtime, along with the optionally generated [ETRecord](./etrecord.rst) from step 1, or execute the following command in your terminal to display the profiling data table.
```bash
python examples/apple/coreml/scripts/inspector_cli.py --etdump_path etdump.etdp --etrecord_path mv3_coreml.bin
```
diff --git a/docs/source/delegate-debugging.md b/docs/source/delegate-debugging.md
index f8d572d4da1..e4e6b0ddcc9 100644
--- a/docs/source/delegate-debugging.md
+++ b/docs/source/delegate-debugging.md
@@ -127,7 +127,7 @@ A demo of the runtime code can be found [here](https://github.com/pytorch/execut
## Surfacing custom metadata from delegate events

-As seen in the runtime logging API's above, users can log an array of bytes along with their delegate profiling event. We make this data available for users in post processing via the [Inspector API](./sdk-inspector.rst).
+As seen in the runtime logging APIs above, users can log an array of bytes along with their delegate profiling event. We make this data available for users in post-processing via the [Inspector API](./model-inspector.rst).

Users can pass a metadata parser when creating an instance of the Inspector. The parser is a callable that deserializes the data and returns a list of strings or a dictionary containing key-value pairs. The deserialized data is then added back to the corresponding event in the event block for user consumption.
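For instance, a minimal sketch of such a parser, assuming the payload is a UTF-8 "key=value" string and that the Inspector accepts it through a `delegate_metadata_parser` keyword (verify the exact argument name against the constructor), could look like this:

```python
from typing import Dict, List

from executorch.devtools import Inspector


def parse_delegate_metadata(delegate_metadatas: List[bytes]) -> Dict[str, str]:
    """Hypothetical parser: each blob is assumed to be a UTF-8 "key=value" string."""
    parsed = {}
    for blob in delegate_metadatas:
        key, _, value = blob.decode("utf-8").partition("=")
        parsed[key] = value
    return parsed


# `delegate_metadata_parser` is the assumed keyword; confirm against the Inspector constructor.
inspector = Inspector(
    etdump_path="etdump.etdp",
    delegate_metadata_parser=parse_delegate_metadata,
)
```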
Here's an example of how to write this parser:
diff --git a/docs/source/devtools-overview.md b/docs/source/devtools-overview.md
index e18b7f16c64..259eaf562c3 100644
--- a/docs/source/devtools-overview.md
+++ b/docs/source/devtools-overview.md
@@ -36,10 +36,10 @@ ETDump (ExecuTorch Dump) is the binary blob that is generated by the runtime aft
If you only care about looking at the raw performance data without linking back to source code and other extensive features, an ETDump alone will be enough to leverage the basic features of the Developer Tools. For the full experience, it is recommended that the users also generate an ETRecord.
```
-More details are available in the [ETDump documentation](sdk-etdump.md) on how to generate and store an ETDump from the runtime.
+More details are available in the [ETDump documentation](etdump.md) on how to generate and store an ETDump from the runtime.

### Inspector APIs
The Inspector Python APIs are the main user entry point into the Developer Tools. They join the data sourced from ETDump and ETRecord to give users access to all the performance and debug data sourced from the runtime along with linkage back to eager model source code and module hierarchy in an easy-to-use API.

-More details are available in the [Inspector API documentation](sdk-inspector.rst) on how to use the Inspector APIs.
+More details are available in the [Inspector API documentation](model-inspector.rst) on how to use the Inspector APIs.
diff --git a/docs/source/etdump.md b/docs/source/etdump.md
new file mode 100644
index 00000000000..42391cf40e9
--- /dev/null
+++ b/docs/source/etdump.md
@@ -0,0 +1,44 @@
+# Prerequisite | ETDump - ExecuTorch Dump
+
+ETDump (ExecuTorch Dump) is one of the core components of the ExecuTorch Developer Tools. It is the mechanism through which all forms of profiling and debugging data are extracted from the runtime. Users can't parse ETDump directly; instead, they should pass it into the Inspector API, which deserializes the data, offering interfaces for flexible analysis and debugging.
+
+
+## Generating an ETDump
+
+Generating an ETDump is a relatively straightforward process. Users can follow the steps detailed below to integrate it into their application that uses ExecuTorch.
+
+1. ***Include*** the ETDump header in your code.
+```C++
+#include <executorch/devtools/etdump/etdump_flatcc.h>
+```
+
+2. ***Create*** an instance of the `ETDumpGen` class and pass it into the `load_method` call that is invoked in the runtime.
+
+```C++
+torch::executor::ETDumpGen etdump_gen = torch::executor::ETDumpGen();
+Result<Method> method =
+    program->load_method(method_name, &memory_manager, &etdump_gen);
+```
+
+3. ***Dump Out the ETDump Buffer*** - after the inference iterations have been completed, users can dump out the ETDump buffer. If users are on a device which has a filesystem, they could just write it out to the filesystem. For more constrained embedded devices, users will have to extract the ETDump buffer from the device through a mechanism that best suits them (e.g. UART, JTAG, etc.).
+
+```C++
+etdump_result result = etdump_gen.get_etdump_data();
+if (result.buf != nullptr && result.size > 0) {
+  // On a device with a file system users can just write it out
+  // to the file-system.
+  FILE* f = fopen(FLAGS_etdump_path.c_str(), "w+");
+  fwrite((uint8_t*)result.buf, 1, result.size, f);
+  fclose(f);
+  free(result.buf);
+}
+```
+
+4. ***Compile*** your binary using CMake with the `ET_EVENT_TRACER_ENABLED` pre-processor flag to enable events to be traced and logged into ETDump inside the ExecuTorch runtime.
This flag needs to be added to the ExecuTorch library and any operator library that you are compiling into your binary. For reference, you can take a look at `examples/sdk/CMakeLists.txt`. The lines of interest are:
+```
+target_compile_options(executorch INTERFACE -DET_EVENT_TRACER_ENABLED)
+target_compile_options(portable_ops_lib INTERFACE -DET_EVENT_TRACER_ENABLED)
+```
+## Using an ETDump
+
+Pass this ETDump into the [Inspector API](./model-inspector.rst) to access this data and do post-run analysis.
diff --git a/docs/source/etrecord.rst b/docs/source/etrecord.rst
index 63546f43ca6..1ab84a6ee10 100644
--- a/docs/source/etrecord.rst
+++ b/docs/source/etrecord.rst
@@ -18,7 +18,7 @@ them to debug and visualize their model.
* Delegate debug handle maps

The ``ETRecord`` object itself is intended to be opaque to users and they should not access any components inside it directly.
-It should be provided to the `Inspector API <sdk-inspector.html>`__ to link back performance and debug data sourced from the runtime back to the Python source code.
+It should be provided to the `Inspector API <model-inspector.html>`__ to link performance and debug data sourced from the runtime back to the Python source code.

Generating an ``ETRecord``
--------------------------
@@ -37,4 +37,4 @@ they are interested in working with via our tooling.
Using an ``ETRecord``
---------------------
-Pass the ``ETRecord`` as an optional argument into the `Inspector API <sdk-inspector.html>`__ to access this data and do post-run analysis.
+Pass the ``ETRecord`` as an optional argument into the `Inspector API <model-inspector.html>`__ to access this data and do post-run analysis.
diff --git a/docs/source/extension-module.md b/docs/source/extension-module.md
index 878356ba5d9..ee3ffd29a57 100644
--- a/docs/source/extension-module.md
+++ b/docs/source/extension-module.md
@@ -134,7 +134,7 @@ Most of the ExecuTorch APIs, including those described above, return either `Res
### Profile the Module

-Use [ExecuTorch Dump](sdk-etdump.md) to trace model execution. Create an instance of the `ETDumpGen` class and pass it to the `Module` constructor. After executing a method, save the `ETDump` to a file for further analysis. You can capture multiple executions in a single trace if desired.
+Use [ExecuTorch Dump](etdump.md) to trace model execution. Create an instance of the `ETDumpGen` class and pass it to the `Module` constructor. After executing a method, save the `ETDump` to a file for further analysis. You can capture multiple executions in a single trace if desired.

```cpp
#include <executorch/devtools/etdump/etdump_flatcc.h>
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 048fab70a26..1e1060f70b7 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -203,10 +203,10 @@ Topics in this section will help you get started with ExecuTorch.
   devtools-overview
   bundled-io
   etrecord
-   sdk-etdump
-   sdk-profiling
-   sdk-debugging
-   sdk-inspector
+   etdump
+   runtime-profiling
+   model-debugging
+   model-inspector
   memory-planning-inspection
   delegate-debugging
   devtools-tutorial
diff --git a/docs/source/llm/getting-started.md b/docs/source/llm/getting-started.md
index 1cfeab6e5e6..272098d4445 100644
--- a/docs/source/llm/getting-started.md
+++ b/docs/source/llm/getting-started.md
@@ -774,7 +774,7 @@ Run the export script and the ETRecord will be generated as `etrecord.bin`.
##### ETDump generation

-An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [the ETDump docs](../sdk-etdump.md).
+An ETDump is an artifact generated at runtime containing a trace of the model execution.
For more information, see [the ETDump docs](../etdump.md).

Include the ETDump header in your code.
```cpp
#include <executorch/devtools/etdump/etdump_flatcc.h>
@@ -843,7 +843,7 @@ This prints the performance data in a tabular format in “inspector_out.txt”,
![](../_static/img/llm_manual_print_data_tabular.png)
View in full size

-To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../sdk-inspector.md).
+To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../model-inspector.md).

## Custom Kernels
With the ExecuTorch custom operator APIs, custom operator and kernel authors can easily bring in their kernel into PyTorch/ExecuTorch.
diff --git a/docs/source/model-debugging.md b/docs/source/model-debugging.md
new file mode 100644
index 00000000000..5475a703bd7
--- /dev/null
+++ b/docs/source/model-debugging.md
@@ -0,0 +1,82 @@
+# Debugging Models in ExecuTorch
+
+With the ExecuTorch Developer Tools, users can debug their models for numerical inaccuracies and extract model outputs from their device to do quality analysis (such as signal-to-noise ratio, mean squared error, etc.).
+
+Currently, ExecuTorch supports the following debugging flows:
+- Extraction of model level outputs via ETDump.
+- Extraction of intermediate outputs (outside of delegates) via ETDump:
+  - Linking of these intermediate outputs back to the eager model Python code.
+
+
+## Steps to debug a model in ExecuTorch
+
+### Runtime
+For a real example reflecting the steps below, please refer to [example_runner.cpp](https://github.com/pytorch/executorch/blob/main/examples/devtools/example_runner/example_runner.cpp).
+
+1. [Optional] Generate an [ETRecord](./etrecord.rst) while exporting your model. When provided, this enables users to link profiling information back to the eager model source code (with stack traces and module hierarchy).
+2. Integrate [ETDump generation](./etdump.md) into the runtime and set the debugging level by configuring the `ETDumpGen` object. Then, provide an additional buffer to which intermediate outputs and program outputs will be written. Currently, we support two levels of debugging:
+    - Program level outputs
+    ```C++
+    Span<uint8_t> buffer((uint8_t*)debug_buffer, debug_buffer_size);
+    etdump_gen.set_debug_buffer(buffer);
+    etdump_gen.set_event_tracer_debug_level(
+        EventTracerDebugLogLevel::kProgramOutputs);
+    ```
+
+    - Intermediate outputs of executed (non-delegated) operations (will include the program level outputs too)
+    ```C++
+    Span<uint8_t> buffer((uint8_t*)debug_buffer, debug_buffer_size);
+    etdump_gen.set_debug_buffer(buffer);
+    etdump_gen.set_event_tracer_debug_level(
+        EventTracerDebugLogLevel::kIntermediateOutputs);
+    ```
+3. Build the runtime with the pre-processor flag that enables tracking of debug events. Instructions are in the [ETDump documentation](./etdump.md).
+4. Run your model and dump out the ETDump buffer as described [here](./etdump.md). (Do so similarly for the debug buffer if configured above.)
+
+
+### Accessing the debug outputs post-run using the Inspector APIs
+Once a model has been run, using the generated ETDump and debug buffers, users can leverage the [Inspector APIs](./model-inspector.rst) to inspect these debug outputs.
+
+```python
+from executorch.devtools import Inspector
+
+# Create an Inspector instance with the ETDump and the debug buffer.
+inspector = Inspector(etdump_path=etdump_path,
+                      buffer_path=buffer_path,
+                      # etrecord is optional; if provided, it'll link back
+                      # the runtime events to the eager model Python source code.
+                      etrecord=etrecord_path)
+
+# Accessing program outputs is as simple as this:
+for event_block in inspector.event_blocks:
+    if event_block.name == "Execute":
+        print(event_block.run_output)
+
+# Accessing intermediate outputs from each event (an event here is essentially an instruction that executed in the runtime).
+for event_block in inspector.event_blocks:
+    if event_block.name == "Execute":
+        for event in event_block.events:
+            print(event.debug_data)
+            # If an ETRecord was provided by the user during Inspector initialization, users
+            # can print the stacktraces and module hierarchy of these events.
+            print(event.stack_traces)
+            print(event.module_hierarchy)
+```
+
+We've also provided a simple set of utilities that let users perform quality analysis of their model outputs with respect to a set of reference outputs (possibly from the eager mode model).
+
+
+```python
+from executorch.devtools.inspector import compare_results
+
+# Run a simple quality analysis between the model outputs sourced from the
+# runtime and a set of reference outputs.
+#
+# Setting plot to True will result in the quality metrics being graphed
+# and displayed (when run from a notebook) and will be written out to the
+# filesystem. A dictionary will always be returned which will contain the
+# results.
+for event_block in inspector.event_blocks:
+    if event_block.name == "Execute":
+        compare_results(event_block.run_output, ref_outputs, plot=True)
+```
diff --git a/docs/source/model-inspector.rst b/docs/source/model-inspector.rst
new file mode 100644
index 00000000000..d80a8960b1b
--- /dev/null
+++ b/docs/source/model-inspector.rst
@@ -0,0 +1,159 @@
+Inspector APIs
+==============
+
+Overview
+--------
+
+The Inspector APIs provide a convenient interface for analyzing the
+contents of `ETRecord <etrecord.html>`__ and
+`ETDump <etdump.html>`__, helping developers get insights about model
+architecture and performance statistics. It’s built on top of the `EventBlock Class <#eventblock-class>`__ data structure,
+which organizes a group of `Event <#event-class>`__\ s for easy access to details of profiling events.
+
+There are multiple ways in which users can interact with the Inspector
+APIs:
+
+* By using `public methods <#inspector-methods>`__ provided by the ``Inspector`` class.
+* By accessing the `public attributes <#inspector-attributes>`__ of the ``Inspector``, ``EventBlock``, and ``Event`` classes.
+* By using a `CLI <#cli>`__ tool for basic functionalities.
+
+Please refer to the `e2e use case doc <tutorials/devtools-integration-tutorial.html>`__ to get an understanding of how to use these in a real-world example.
+
+
+Inspector Methods
+-----------------
+
+Constructor
+~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.__init__
+
+**Example Usage:**
+
+.. code:: python
+
+    from executorch.devtools import Inspector
+
+    inspector = Inspector(etdump_path="/path/to/etdump.etdp", etrecord="/path/to/etrecord.bin")
+
+to_dataframe
+~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.to_dataframe
+
+
+print_data_tabular
+~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.print_data_tabular
+
+.. _example-usage-1:
+
+**Example Usage:**
+
+.. code:: python
+
+    inspector.print_data_tabular()
+
+.. image:: _static/img/print_data_tabular.png
+Note that the unit of delegate profiling events is "cycles". We're working on providing a way to set different units in the future.
+
+
+find_total_for_module
+~~~~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.find_total_for_module
+
+..
_example-usage-2: + +**Example Usage:** + +.. code:: python + + print(inspector.find_total_for_module("L__self___conv_layer")) + +:: + + 0.002 + + +get_exported_program +~~~~~~~~~~~~~~~~~~~~ + +.. autofunction:: executorch.devtools.Inspector.get_exported_program + +.. _example-usage-3: + +**Example Usage:** + +.. code:: python + + print(inspector.get_exported_program()) + +:: + + ExportedProgram: + class GraphModule(torch.nn.Module): + def forward(self, arg0_1: f32[4, 3, 64, 64]): + # No stacktrace found for following nodes + _param_constant0 = self._param_constant0 + _param_constant1 = self._param_constant1 + + ### ... Omit part of the program for documentation readability ... ### + + Graph signature: ExportGraphSignature(parameters=[], buffers=[], user_inputs=['arg0_1'], user_outputs=['aten_tan_default'], inputs_to_parameters={}, inputs_to_buffers={}, buffers_to_mutate={}, backward_signature=None, assertion_dep_token=None) + Range constraints: {} + Equality constraints: [] + + +Inspector Attributes +-------------------- + +``EventBlock`` Class +~~~~~~~~~~~~~~~~~~~~ + +Access ``EventBlock`` instances through the ``event_blocks`` attribute +of an ``Inspector`` instance, for example: + +.. code:: python + + inspector.event_blocks + +.. autoclass:: executorch.devtools.inspector.EventBlock + +``Event`` Class +~~~~~~~~~~~~~~~ + +Access ``Event`` instances through the ``events`` attribute of an +``EventBlock`` instance. + +.. autoclass:: executorch.devtools.inspector.Event + +**Example Usage:** + +.. code:: python + + for event_block in inspector.event_blocks: + for event in event_block.events: + if event.name == "Method::execute": + print(event.perf_data.raw) + +:: + + [175.748, 78.678, 70.429, 122.006, 97.495, 67.603, 70.2, 90.139, 66.344, 64.575, 134.135, 93.85, 74.593, 83.929, 75.859, 73.909, 66.461, 72.102, 84.142, 77.774, 70.038, 80.246, 59.134, 68.496, 67.496, 100.491, 81.162, 74.53, 70.709, 77.112, 59.775, 79.674, 67.54, 79.52, 66.753, 70.425, 71.703, 81.373, 72.306, 72.404, 94.497, 77.588, 79.835, 68.597, 71.237, 88.528, 71.884, 74.047, 81.513, 76.116] + + +CLI +--- + +Execute the following command in your terminal to display the data +table. This command produces the identical table output as calling the +`print_data_tabular <#print-data-tabular>`__ mentioned earlier: + +.. code:: bash + + python3 -m devtools.inspector.inspector_cli --etdump_path --etrecord_path + +Note that the `etrecord_path` argument is optional. + +We plan to extend the capabilities of the CLI in the future. diff --git a/docs/source/runtime-overview.md b/docs/source/runtime-overview.md index 6766e678e0e..1a421fdcc0a 100644 --- a/docs/source/runtime-overview.md +++ b/docs/source/runtime-overview.md @@ -33,7 +33,7 @@ The runtime is also responsible for: semantics of those operators. * Dispatching predetermined sections of the model to [backend delegates](compiler-delegate-and-partitioner.md) for acceleration. -* Optionally gathering [profiling data](sdk-profiling.md) during load and +* Optionally gathering [profiling data](runtime-profiling.md) during load and execution. 
## Design Goals
@@ -159,7 +159,7 @@ For more details about the ExecuTorch runtime, please see:
* [Simplified Runtime APIs Tutorial](extension-module.md)
* [Runtime Build and Cross Compilation](runtime-build-and-cross-compilation.md)
* [Runtime Platform Abstraction Layer](runtime-platform-abstraction-layer.md)
-* [Runtime Profiling](sdk-profiling.md)
+* [Runtime Profiling](runtime-profiling.md)
* [Backends and Delegates](compiler-delegate-and-partitioner.md)
* [Backend Delegate Implementation](runtime-backend-delegate-implementation-and-linking.md)
* [Kernel Library Overview](kernel-library-overview.md)
diff --git a/docs/source/runtime-profiling.md b/docs/source/runtime-profiling.md
new file mode 100644
index 00000000000..c228971d28c
--- /dev/null
+++ b/docs/source/runtime-profiling.md
@@ -0,0 +1,23 @@
+# Profiling Models in ExecuTorch
+
+Profiling in ExecuTorch gives users access to these runtime metrics:
+- Model Load Time.
+- Operator Level Execution Time.
+- Delegate Execution Time.
+  - If the delegate that the user is calling into has been integrated with the [Developer Tools](./delegate-debugging.md), then users will also be able to access delegated operator execution time.
+- End-to-end Inference Execution Time.
+
+One unique aspect of ExecuTorch profiling is the ability to link every operator executed at runtime back to the exact line of Python code from which it originated. This capability enables users to easily identify hotspots in their model, trace them back to the exact line of Python code, and optimize them if they choose to.
+
+We provide access to all the profiling data via the Python [Inspector API](./model-inspector.rst). The data mentioned above can be accessed through these interfaces, allowing users to perform any post-run analysis of their choice.
+
+## Steps to Profile a Model in ExecuTorch
+
+1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model. If provided, this enables users to link profiling details back to the eager model source code (with stack traces and module hierarchy).
+2. Build the runtime with the pre-processor flags that enable profiling, as detailed in the [ETDump documentation](./etdump.md).
+3. Run your Program on the ExecuTorch runtime and generate an [ETDump](./etdump.md).
+4. Create an instance of the [Inspector API](./model-inspector.rst) by passing in the ETDump you have sourced from the runtime along with the optionally generated ETRecord from step 1.
+    - Through the Inspector API, users can do a wide range of analysis, from printing out performance details to doing finer-grained calculations at the module level.
+
+
+Please refer to the [Developer Tools tutorial](./tutorials/devtools-integration-tutorial.rst) for a step-by-step walkthrough of the above process on a sample model.
diff --git a/docs/source/sdk-debugging.md b/docs/source/sdk-debugging.md
index 80358fe99a1..3e975875f21 100644
--- a/docs/source/sdk-debugging.md
+++ b/docs/source/sdk-debugging.md
@@ -1,82 +1,3 @@
# Debugging Models in ExecuTorch
-With the ExecuTorch Developer Tools, users can debug their models for numerical inaccurcies and extract model outputs from their device to do quality analysis (such as Signal-to-Noise, Mean square error etc.).
-
-Currently, ExecuTorch supports the following debugging flows:
-- Extraction of model level outputs via ETDump.
-- Extraction of intermediate outputs (outside of delegates) via ETDump:
-  - Linking of these intermediate outputs back to the eager model python code.
- - -## Steps to debug a model in ExecuTorch - -### Runtime -For a real example reflecting the steps below, please refer to [example_runner.cpp](https://github.com/pytorch/executorch/blob/main/examples/devtools/example_runner/example_runner.cpp). - -1. [Optional] Generate an [ETRecord](./etrecord.rst) while exporting your model. When provided, this enables users to link profiling information back to the eager model source code (with stack traces and module hierarchy). -2. Integrate [ETDump generation](./sdk-etdump.md) into the runtime and set the debugging level by configuring the `ETDumpGen` object. Then, provide an additional buffer to which intermediate outputs and program outputs will be written. Currently we support two levels of debugging: - - Program level outputs - ```C++ - Span buffer((uint8_t*)debug_buffer, debug_buffer_size); - etdump_gen.set_debug_buffer(buffer); - etdump_gen.set_event_tracer_debug_level( - EventTracerDebugLogLevel::kProgramOutputs); - ``` - - - Intermediate outputs of executed (non-delegated) operations (will include the program level outputs too) - ```C++ - Span buffer((uint8_t*)debug_buffer, debug_buffer_size); - etdump_gen.set_debug_buffer(buffer); - etdump_gen.set_event_tracer_debug_level( - EventTracerDebugLogLevel::kIntermediateOutputs); - ``` -3. Build the runtime with the pre-processor flag that enables tracking of debug events. Instructions are in the [ETDump documentation](./sdk-etdump.md). -4. Run your model and dump out the ETDump buffer as described [here](./sdk-etdump.md). (Do so similarly for the debug buffer if configured above) - - -### Accessing the debug outputs post run using the Inspector API's -Once a model has been run, using the generated ETDump and debug buffers, users can leverage the [Inspector API's](./sdk-inspector.rst) to inspect these debug outputs. - -```python -from executorch.devtools import Inspector - -# Create an Inspector instance with etdump and the debug buffer. -inspector = Inspector(etdump_path=etdump_path, - buffer_path = buffer_path, - # etrecord is optional, if provided it'll link back - # the runtime events to the eager model python source code. - etrecord = etrecord_path) - -# Accessing program outputs is as simple as this: -for event_block in inspector.event_blocks: - if event_block.name == "Execute": - print(event_blocks.run_output) - -# Accessing intermediate outputs from each event (an event here is essentially an instruction that executed in the runtime). -for event_block in inspector.event_blocks: - if event_block.name == "Execute": - for event in event_block.events: - print(event.debug_data) - # If an ETRecord was provided by the user during Inspector initialization, users - # can print the stacktraces and module hierarchy of these events. - print(event.stack_traces) - print(event.module_hierarchy) -``` - -We've also provided a simple set of utilities that let users perform quality analysis of their model outputs with respect to a set of reference outputs (possibly from the eager mode model). - - -```python -from executorch.devtools.inspector import compare_results - -# Run a simple quality analysis between the model outputs sourced from the -# runtime and a set of reference outputs. -# -# Setting plot to True will result in the quality metrics being graphed -# and displayed (when run from a notebook) and will be written out to the -# filesystem. A dictionary will always be returned which will contain the -# results. 
-for event_block in inspector.event_blocks: - if event_block.name == "Execute": - compare_results(event_blocks.run_output, ref_outputs, plot = True) -``` +Please update your link to . This URL will be deleted after v0.4.0. diff --git a/docs/source/sdk-etdump.md b/docs/source/sdk-etdump.md index c58efb40de7..a765d4cf1b4 100644 --- a/docs/source/sdk-etdump.md +++ b/docs/source/sdk-etdump.md @@ -1,44 +1,3 @@ # Prerequisite | ETDump - ExecuTorch Dump -ETDump (ExecuTorch Dump) is one of the core components of the ExecuTorch Developer Tools. It is the mechanism through which all forms of profiling and debugging data is extracted from the runtime. Users can't parse ETDump directly; instead, they should pass it into the Inspector API, which deserializes the data, offering interfaces for flexible analysis and debugging. - - -## Generating an ETDump - -Generating an ETDump is a relatively straightforward process. Users can follow the steps detailed below to integrate it into their application that uses ExecuTorch. - -1. ***Include*** the ETDump header in your code. -```C++ -#include -``` - -2. ***Create*** an Instance of the ETDumpGen class and pass it into the `load_method` call that is invoked in the runtime. - -```C++ -torch::executor::ETDumpGen etdump_gen = torch::executor::ETDumpGen(); -Result method = - program->load_method(method_name, &memory_manager, &etdump_gen); -``` - -3. ***Dump Out the ETDump Buffer*** - after the inference iterations have been completed, users can dump out the ETDump buffer. If users are on a device which has a filesystem, they could just write it out to the filesystem. For more constrained embedded devices, users will have to extract the ETDump buffer from the device through a mechanism that best suits them (e.g. UART, JTAG etc.) - -```C++ -etdump_result result = etdump_gen.get_etdump_data(); -if (result.buf != nullptr && result.size > 0) { - // On a device with a file system users can just write it out - // to the file-system. - FILE* f = fopen(FLAGS_etdump_path.c_str(), "w+"); - fwrite((uint8_t*)result.buf, 1, result.size, f); - fclose(f); - free(result.buf); - } -``` - -4. ***Compile*** your binary using CMake with the `ET_EVENT_TRACER_ENABLED` pre-processor flag to enable events to be traced and logged into ETDump inside the ExecuTorch runtime. This flag needs to be added to the ExecuTorch library and any operator library that you are compiling into your binary. For reference, you can take a look at `examples/sdk/CMakeLists.txt`. The lines of interest are: -``` -target_compile_options(executorch INTERFACE -DET_EVENT_TRACER_ENABLED) -target_compile_options(portable_ops_lib INTERFACE -DET_EVENT_TRACER_ENABLED) -``` -## Using an ETDump - -Pass this ETDump into the [Inspector API](./sdk-inspector.rst) to access this data and do post-run analysis. +Please update your link to . This URL will be deleted after v0.4.0. diff --git a/docs/source/sdk-inspector.rst b/docs/source/sdk-inspector.rst index 4d46915a8af..0019528f419 100644 --- a/docs/source/sdk-inspector.rst +++ b/docs/source/sdk-inspector.rst @@ -1,159 +1,4 @@ Inspector APIs ============== -Overview --------- - -The Inspector APIs provide a convenient interface for analyzing the -contents of `ETRecord `__ and -`ETDump `__, helping developers get insights about model -architecture and performance statistics. It’s built on top of the `EventBlock Class <#eventblock-class>`__ data structure, -which organizes a group of `Event <#event-class>`__\ s for easy access to details of profiling events. 
- -There are multiple ways in which users can interact with the Inspector -APIs: - -* By using `public methods <#inspector-methods>`__ provided by the ``Inspector`` class. -* By accessing the `public attributes <#inspector-attributes>`__ of the ``Inspector``, ``EventBlock``, and ``Event`` classes. -* By using a `CLI <#cli>`__ tool for basic functionalities. - -Please refer to the `e2e use case doc `__ get an understanding of how to use these in a real world example. - - -Inspector Methods ------------------ - -Constructor -~~~~~~~~~~~ - -.. autofunction:: executorch.devtools.Inspector.__init__ - -**Example Usage:** - -.. code:: python - - from executorch.devtools import Inspector - - inspector = Inspector(etdump_path="/path/to/etdump.etdp", etrecord="/path/to/etrecord.bin") - -to_dataframe -~~~~~~~~~~~~~~~~ - -.. autofunction:: executorch.devtools.Inspector.to_dataframe - - -print_data_tabular -~~~~~~~~~~~~~~~~~~ - -.. autofunction:: executorch.devtools.Inspector.print_data_tabular - -.. _example-usage-1: - -**Example Usage:** - -.. code:: python - - inspector.print_data_tabular() - -.. image:: _static/img/print_data_tabular.png -Note that the unit of delegate profiling events is "cycles". We're working on providing a way to set different units in the future. - - -find_total_for_module -~~~~~~~~~~~~~~~~~~~~~ - -.. autofunction:: executorch.devtools.Inspector.find_total_for_module - -.. _example-usage-2: - -**Example Usage:** - -.. code:: python - - print(inspector.find_total_for_module("L__self___conv_layer")) - -:: - - 0.002 - - -get_exported_program -~~~~~~~~~~~~~~~~~~~~ - -.. autofunction:: executorch.devtools.Inspector.get_exported_program - -.. _example-usage-3: - -**Example Usage:** - -.. code:: python - - print(inspector.get_exported_program()) - -:: - - ExportedProgram: - class GraphModule(torch.nn.Module): - def forward(self, arg0_1: f32[4, 3, 64, 64]): - # No stacktrace found for following nodes - _param_constant0 = self._param_constant0 - _param_constant1 = self._param_constant1 - - ### ... Omit part of the program for documentation readability ... ### - - Graph signature: ExportGraphSignature(parameters=[], buffers=[], user_inputs=['arg0_1'], user_outputs=['aten_tan_default'], inputs_to_parameters={}, inputs_to_buffers={}, buffers_to_mutate={}, backward_signature=None, assertion_dep_token=None) - Range constraints: {} - Equality constraints: [] - - -Inspector Attributes --------------------- - -``EventBlock`` Class -~~~~~~~~~~~~~~~~~~~~ - -Access ``EventBlock`` instances through the ``event_blocks`` attribute -of an ``Inspector`` instance, for example: - -.. code:: python - - inspector.event_blocks - -.. autoclass:: executorch.devtools.inspector.EventBlock - -``Event`` Class -~~~~~~~~~~~~~~~ - -Access ``Event`` instances through the ``events`` attribute of an -``EventBlock`` instance. - -.. autoclass:: executorch.devtools.inspector.Event - -**Example Usage:** - -.. 
code:: python - - for event_block in inspector.event_blocks: - for event in event_block.events: - if event.name == "Method::execute": - print(event.perf_data.raw) - -:: - - [175.748, 78.678, 70.429, 122.006, 97.495, 67.603, 70.2, 90.139, 66.344, 64.575, 134.135, 93.85, 74.593, 83.929, 75.859, 73.909, 66.461, 72.102, 84.142, 77.774, 70.038, 80.246, 59.134, 68.496, 67.496, 100.491, 81.162, 74.53, 70.709, 77.112, 59.775, 79.674, 67.54, 79.52, 66.753, 70.425, 71.703, 81.373, 72.306, 72.404, 94.497, 77.588, 79.835, 68.597, 71.237, 88.528, 71.884, 74.047, 81.513, 76.116] - - -CLI ---- - -Execute the following command in your terminal to display the data -table. This command produces the identical table output as calling the -`print_data_tabular <#print-data-tabular>`__ mentioned earlier: - -.. code:: bash - - python3 -m devtools.inspector.inspector_cli --etdump_path --etrecord_path - -Note that the `etrecord_path` argument is optional. - -We plan to extend the capabilities of the CLI in the future. +Please update your link to . This URL will be deleted after v0.4.0. diff --git a/docs/source/sdk-profiling.md b/docs/source/sdk-profiling.md index b618dbf8f18..9c99a979757 100644 --- a/docs/source/sdk-profiling.md +++ b/docs/source/sdk-profiling.md @@ -1,23 +1,3 @@ # Profiling Models in ExecuTorch -Profiling in ExecuTorch gives users access to these runtime metrics: -- Model Load Time. -- Operator Level Execution Time. -- Delegate Execution Time. - - If the delegate that the user is calling into has been integrated with the [Developer Tools](./delegate-debugging.md), then users will also be able to access delegated operator execution time. -- End-to-end Inference Execution Time. - -One uniqe aspect of ExecuTorch Profiling is the ability to link every runtime executed operator back to the exact line of python code from which this operator originated. This capability enables users to easily identify hotspots in their model, source them back to the exact line of Python code, and optimize if chosen to. - -We provide access to all the profiling data via the Python [Inspector API](./sdk-inspector.rst). The data mentioned above can be accessed through these interfaces, allowing users to perform any post-run analysis of their choice. - -## Steps to Profile a Model in ExecuTorch - -1. [Optional] Generate an [ETRecord](./etrecord.rst) while you're exporting your model. If provided this will enable users to link back profiling details to eager model source code (with stack traces and module hierarchy). -2. Build the runtime with the pre-processor flags that enable profiling. Detailed in the [ETDump documentation](./sdk-etdump.md). -3. Run your Program on the ExecuTorch runtime and generate an [ETDump](./sdk-etdump.md). -4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the ETDump you have sourced from the runtime along with the optionally generated ETRecord from step 1. - - Through the Inspector API, users can do a wide range of analysis varying from printing out performance details to doing more finer granular calculation on module level. - - -Please refer to the [Developer Tools tutorial](./tutorials/devtools-integration-tutorial.rst) for a step-by-step walkthrough of the above process on a sample model. +Please update your link to . This URL will be deleted after v0.4.0. 
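The profiling and debugging flows introduced above both start with an optional export-time ETRecord. A minimal sketch of that step, using a toy module and assumed file names (and assuming the `generate_etrecord` helper exported from `executorch.devtools` keeps the signature used here), might look like:

```python
import copy

import torch
from executorch.devtools import generate_etrecord
from executorch.exir import to_edge
from torch.export import export


class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y


# Export and lower a toy model to the edge dialect.
edge_program = to_edge(export(Add(), (torch.ones(1), torch.ones(1))))

# Keep a copy of the edge dialect program before it is consumed by to_executorch().
edge_program_copy = copy.deepcopy(edge_program)
et_program = edge_program.to_executorch()

# Write out the ETRecord that the Inspector can later pair with an ETDump.
generate_etrecord("etrecord.bin", edge_program_copy, et_program)

# Save the ExecuTorch program that the runtime will execute.
with open("add.pte", "wb") as f:
    f.write(et_program.buffer)
```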
diff --git a/docs/source/tutorials_source/devtools-integration-tutorial.py b/docs/source/tutorials_source/devtools-integration-tutorial.py index 92d8e326004..dece18fa8ce 100644 --- a/docs/source/tutorials_source/devtools-integration-tutorial.py +++ b/docs/source/tutorials_source/devtools-integration-tutorial.py @@ -20,7 +20,7 @@ # This tutorial will show a full end-to-end flow of how to utilize the Developer Tools to profile a model. # Specifically, it will: # -# 1. Generate the artifacts consumed by the Developer Tools (`ETRecord <../etrecord.html>`__, `ETDump <../sdk-etdump.html>`__). +# 1. Generate the artifacts consumed by the Developer Tools (`ETRecord <../etrecord.html>`__, `ETDump <../etdump.html>`__). # 2. Create an Inspector class consuming these artifacts. # 3. Utilize the Inspector class to analyze the model profiling result. @@ -213,7 +213,7 @@ def forward(self, x): # Analyzing with an Inspector # --------------------------- # -# ``Inspector`` provides 2 ways of accessing ingested information: `EventBlocks <../sdk-inspector#eventblock-class>`__ +# ``Inspector`` provides 2 ways of accessing ingested information: `EventBlocks <../model-inspector#eventblock-class>`__ # and ``DataFrames``. These mediums give users the ability to perform custom # analysis about their model performance. # @@ -282,7 +282,7 @@ def forward(self, x): ###################################################################### # Note: ``find_total_for_module`` is a special first class method of -# `Inspector <../sdk-inspector.html>`__ +# `Inspector <../model-inspector.html>`__ ###################################################################### # Conclusion @@ -297,5 +297,5 @@ def forward(self, x): # # - `ExecuTorch Developer Tools Overview <../devtools-overview.html>`__ # - `ETRecord <../etrecord.html>`__ -# - `ETDump <../sdk-etdump.html>`__ -# - `Inspector <../sdk-inspector.html>`__ +# - `ETDump <../etdump.html>`__ +# - `Inspector <../model-inspector.html>`__ diff --git a/extension/pybindings/pybindings.pyi b/extension/pybindings/pybindings.pyi index 51c134de1ff..0b7be42ca7a 100644 --- a/extension/pybindings/pybindings.pyi +++ b/extension/pybindings/pybindings.pyi @@ -136,7 +136,7 @@ def _load_for_executorch( Args: path: File path to the ExecuTorch program as a string. enable_etdump: If true, enables an ETDump which can store profiling information. - See documentation at https://pytorch.org/executorch/stable/sdk-etdump.html + See documentation at https://pytorch.org/executorch/stable/etdump.html for how to use it. debug_buffer_size: If non-zero, enables a debug buffer which can store intermediate results of each instruction in the ExecuTorch program.
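Finally, as a usage illustration of the `_load_for_executorch` binding documented above, loading a program with ETDump collection enabled from Python might look like the following sketch (the keyword usage and the `forward` call are assumptions based on the .pyi; extracting the resulting ETDump buffer is not shown):

```python
import torch

from executorch.extension.pybindings.portable_lib import _load_for_executorch

# Load a previously exported program with ETDump collection enabled
# (see the enable_etdump parameter documented above).
module = _load_for_executorch("add.pte", enable_etdump=True)

# Run one inference; profiling events are recorded while the method executes.
outputs = module.forward([torch.ones(1), torch.ones(1)])
print(outputs)
```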