Conversation

@kirklandsign (Contributor)

No description provided.

dbort and others added 30 commits September 20, 2024 12:01
Summary:
Fix the build errors I introduced in #5318

Pull Request resolved: #5513

Test Plan: Built the model using the instructions at https://github.com/pytorch/executorch/blob/main/examples/models/phi-3-mini/README.md

Reviewed By: helunwencser

Differential Revision: D63133231

Pulled By: dbort

fbshipit-source-id: 81d5402e76fd86c919a13a1b1b83687feca72ab7
Summary:
Pull Request resolved: #5514

#5475 moved the images to the static folder; #5493 updated some of the references but missed the delegate-specific docs.

This diff fixes that by replacing the broken image URLs.

Reviewed By: cmodi-meta, svekars, kirklandsign

Differential Revision: D63135522

fbshipit-source-id: 10071b0428c203eef38500852e64f71e13ec1315
Summary:
It took me a bit to figure out how to make CMake work; I had it installed via Homebrew, but Xcode couldn't find it.

Add instructions for installing cmake globally in a way that Xcode can find it.

Pull Request resolved: #5533

Reviewed By: shoumikhin

Differential Revision: D63153558

Pulled By: dbort

fbshipit-source-id: b7df7c0b9723c6e60cafea67a39dd326276a6148
Summary: Pull Request resolved: #5505

Reviewed By: SS-JIA

Differential Revision: D63000056

fbshipit-source-id: 959127e874b30c7ebc069499d99e8c5881b3b272
Summary:
Pull Request resolved: #5444

TIL about the `target` clang/GCC function attribute, which allows building a particular function under an `-march` flag instead of applying that flag to the whole file.
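For reference, a minimal, hypothetical sketch of the attribute (an AVX2 example for illustration; not the function touched in this diff):

```
// Illustrative only: this function is compiled as if the file had been built
// with -mavx2, while the rest of the translation unit keeps its default flags.
#include <immintrin.h>

__attribute__((target("avx2")))
void add_f32_avx2(const float* a, const float* b, float* out, int n) {
  int i = 0;
  for (; i + 8 <= n; i += 8) {
    _mm256_storeu_ps(
        out + i, _mm256_add_ps(_mm256_loadu_ps(a + i), _mm256_loadu_ps(b + i)));
  }
  for (; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```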
ghstack-source-id: 243858419

Reviewed By: malfet

Differential Revision: D62905047

fbshipit-source-id: a89c8169fea315aa653bbca819a672357c3dff77
…singlethreaded local workgroup (#5504)

Summary:
Pull Request resolved: #5504

Using new load_texel_lpos for simpler updating

Reviewed By: SS-JIA

Differential Revision: D62990822

fbshipit-source-id: 9163b807d9095ebdb089f08aa6ea20fbbb563d02
Summary: Pull Request resolved: #5534

Reviewed By: dltn

Differential Revision: D63199241

fbshipit-source-id: 9b097c3cc6c492fc13d9183c8679de9d00c80d21
Summary:
Pull Request resolved: #5535

.

Reviewed By: dltn

Differential Revision: D63201286

fbshipit-source-id: 1767a1c0cf876f7a3b6b4534a83c912c3de0eabf
Summary: Pull Request resolved: #5539

Reviewed By: dltn

Differential Revision: D63222601

fbshipit-source-id: 9a0bd6e10f7d2d2c769b7617e0b0e605af616eb0
Summary:
Removed [!TIP] format and replaced with a subheader

Pull Request resolved: #5517

Reviewed By: kirklandsign

Differential Revision: D63145938

Pulled By: cmodi-meta

fbshipit-source-id: 5503901957b8aeffe23cdc756bd4c73a007dd35e
Summary:
This logs the metrics from the size command when building with run.sh

Pull Request resolved: #5342

Reviewed By: manuelcandales

Differential Revision: D62874679

Pulled By: digantdesai

fbshipit-source-id: f69bfa12c48101e540e684a590f78b546903cb42
Summary:
Adding "px" unit for PyTorch site (i.e. https://pytorch.org/executorch/main/llm/llama-demo-android.html) will have same image widths as readme in github

Pull Request resolved: #5540

Reviewed By: Riandy, kirklandsign

Differential Revision: D63226892

Pulled By: cmodi-meta

fbshipit-source-id: 5cfa30ee0ab156c1e004405cdc7dd99a0f61d2c2
Summary:
The code under examples/... is a proxy for user code, and users should never declare code under the `torch::` or `executorch::` namespaces.

Move this code under the `example::` namespace to make it more clear that users should use their own namespaces when writing code like this.
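For illustration, a minimal sketch of the convention (the class and members here are hypothetical, not code from this diff):

```
// Example/user code declares its own namespace; it may *use* names from the
// executorch:: namespace but should never define anything inside torch:: or
// executorch::.
namespace example {

class Runner {
 public:
  explicit Runner(const char* model_path) : model_path_(model_path) {}
  // Members holding executorch::runtime::Program / Method handles would live here.

 private:
  const char* model_path_;
};

}  // namespace example
```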

Pull Request resolved: #5478

Test Plan: - Built using the instructions at https://github.com/pytorch/executorch/blob/main/examples/mediatek/README.md

Reviewed By: JacobSzwejbka, cccclai

Differential Revision: D62992974

Pulled By: dbort

fbshipit-source-id: b01f1b33d2853a0555ae19d79769a5bb6d0ba853
Summary:
examples/ code should use the new `executorch::` namespaces.

Pull Request resolved: #5516

Test Plan: Built the app using the instructions at https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/README.md

Reviewed By: larryliu0820

Differential Revision: D63138639

Pulled By: dbort

fbshipit-source-id: fffb6d35d425dd733eead1b24ee8b9f2831e65c0
Summary:
Pull Request resolved: #5546

The last prompt sent was being included in `getConversationHistory()` and also added again prior to sending it with `generate()`. It looks like this got moved during the rebasing.

To fix this we now call `getConversationHistory()` prior to adding the rawPrompt to a Message.

Regarding the model response, I noticed that this did not really change the quality of the output (tested with Llama 3.1).

Reviewed By: Riandy

Differential Revision: D62761977

fbshipit-source-id: 2f975983965fe837147f1ffb8b5dcfa8f2061895
Summary:
Example code should use the new `executorch::` namespace wherever possible, and should not define code under the `torch::` namespace.

Pull Request resolved: #5512

Test Plan: - Built llava changes with `bash .ci/scripts/test_llava.sh`

Reviewed By: JacobSzwejbka, larryliu0820

Differential Revision: D63133181

Pulled By: dbort

fbshipit-source-id: 5796b85eef053f3b3e4ba0e27a3a26ae48747b5a
Summary:
Use the names in the new `executorch::` namespace.

Pull Request resolved: #5495

Test Plan:
```
./examples/devtools/build_example_runner.sh
```

Reviewed By: larryliu0820

Differential Revision: D63047148

Pulled By: dbort

fbshipit-source-id: e0e3af1c130aaf409ecc142c28d75f0a44d88fa3
Summary:
Pull Request resolved: #5548

Converting the input to and from float32 is faster than not using the op. h/t to torchchat, which does this already (though it had a bug, which I sent a patch for).
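As a rough, hypothetical sketch of the pattern being described (not the actual kernel; the helpers below are made up): widen bf16 to float32, compute in float32, and narrow back.

```
// Illustrative only: bf16 values are widened to float32, the math is done in
// float32, and the result is narrowed back (truncating; rounding omitted).
#include <cstddef>
#include <cstdint>
#include <cstring>

static float bf16_to_f32(uint16_t x) {
  uint32_t bits = static_cast<uint32_t>(x) << 16;  // bf16 is the top half of a float32
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

static uint16_t f32_to_bf16(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return static_cast<uint16_t>(bits >> 16);
}

void add_bf16(const uint16_t* a, const uint16_t* b, uint16_t* out, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    out[i] = f32_to_bf16(bf16_to_f32(a[i]) + bf16_to_f32(b[i]));
  }
}
```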

Reviewed By: kimishpatel

Differential Revision: D63158951

fbshipit-source-id: 58c90d141ee403536c03a3b731f8547790fc9440
Summary:
This PR adds a CI job for phi-3-mini

Pull Request resolved: #5532

Test Plan: The CI Job is green: https://github.com/pytorch/executorch/actions/runs/10967809307/job/30458161933?pr=5532

Reviewed By: iseeyuan

Differential Revision: D63157703

Pulled By: helunwencser

fbshipit-source-id: fc7f54e166062443f396e7a304712f7b60e5db90
Summary:
Preview in GitHub for consistency: https://github.com/pytorch/executorch/blob/fix-images/examples/demo-apps/android/LlamaDemo/README.md

Pull Request resolved: #5550

Test Plan:
Doc preview: https://docs-preview.pytorch.org/pytorch/executorch/5550/llm/llama-demo-android.html
Rendered GitHub preview: https://github.com/pytorch/executorch/blob/fix-images/examples/demo-apps/android/LlamaDemo/README.md?rgh-link-date=2024-09-23T20%3A12%3A32Z

Reviewed By: cmodi-meta, kirklandsign

Differential Revision: D63281004

Pulled By: svekars

fbshipit-source-id: cc5710624cb9bdab9056558c94f127b3bc12b96c
…5515)

Summary:
Pull Request resolved: #5515

Storing QMat2 in a texture gives rise to two main problems:

 - Indexing is a mess, and additional computation is required to account for the fact that we are reading ivec4s and only using half of the values
 - There is no texel fetching in int8; the texel is read as int32 and needs to be cast

Keeping QMat2 in a buffer performs better because, although reading from buffers is slower, removing the extra computation compensates for this.
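To illustrate the extra index math in a generic way (a hypothetical CPU-side sketch, not the Vulkan shader): with 4-bit weights packed two per byte, every fetched word carries more than one logical value, so masking and shifting are needed to recover each weight.

```
// Illustrative only: recover the idx-th 4-bit quantized weight from a packed buffer.
#include <cstddef>
#include <cstdint>

inline int8_t unpack_int4(const uint8_t* packed, size_t idx) {
  const uint8_t byte = packed[idx / 2];
  const uint8_t nibble = (idx % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
  // Sign-extend the 4-bit value into an int8_t.
  return static_cast<int8_t>(nibble << 4) >> 4;
}
```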


This diff also moves the scales_and_zeros tensor to Channels Packed in texture implementations because it just makes more sense; I had done some terrible indexing shenanigans before.

ghstack-source-id: 244258611
exported-using-ghexport

Reviewed By: yipjustin

Differential Revision: D62504978

fbshipit-source-id: df2fdf87f75140be0a316576c8ffad67feefd6d7
Summary:
Pull Request resolved: #5499

This seems to be blocking bfloat16 stories110M as exported by torchchat (and we should have op coverage for bfloat16 anyway).
ghstack-source-id: 243857968

Reviewed By: larryliu0820

Differential Revision: D63054001

fbshipit-source-id: 530b479872643f878912592c7b260d71e6e05804
Summary:
Pull Request resolved: #5500

ghstack-source-id: 243857969

Reviewed By: digantdesai, larryliu0820

Differential Revision: D63057744

fbshipit-source-id: 9e1fb6f6479adb1575c5aed61b9da3c774586ba3
Summary:
Pull Request resolved: #5519

Discovered these were missing while trying to use the following diff.
ghstack-source-id: 243867517
exported-using-ghexport

Reviewed By: digantdesai

Differential Revision: D63147276

fbshipit-source-id: bf75fb0fe452e2e68a34271ca1250cdb90657e5a
#5520)

Summary:
Pull Request resolved: #5520

reserved
ghstack-source-id: 243867516
exported-using-ghexport

Reviewed By: larryliu0820

Differential Revision: D63147278

fbshipit-source-id: d5aefbf2509a1eca4c32bbbe7224e7b996fa1e57
Summary:
Pull Request resolved: #5553

Adding support to load Llama Guard model and run prompt classification task

Reviewed By: cmodi-meta, kirklandsign

Differential Revision: D63148252

fbshipit-source-id: 482559e694da05bdec75b9a2dbd76163c686e47d
Summary:
Pull Request resolved: #5560

## Context

Refactor operator test code generation scripts, such that components can be re-used to generate operator benchmarks.

In broad strokes, the refactors implemented by this diff are as follows:

* Improve granularity of Python modules
* Replace `test` with `correctness_test`, to make it clear that we are generating correctness tests. Note that I haven't changed the top-level target name `compute_graph_op_tests_bin`, since I believe renaming it would be too verbose.
ghstack-source-id: 244283559
exported-using-ghexport

Reviewed By: nathanaelsee

Differential Revision: D63286131

fbshipit-source-id: 1177ea381e6381045f1c97491dd7ec006690f574
Summary:
Pull Request resolved: #5561

## Context

Use the automatic test generation infrastructure to generate operator benchmarks. The overall concept is the same as the test generation; we just structure the generated code in the style of the google benchmark library instead of GTEST.
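For context, a hypothetical skeleton of what a generated benchmark looks like in the google benchmark style (names made up; the real generated code differs):

```
#include <benchmark/benchmark.h>

// Each generated case builds the compute graph once, then times repeated
// executions of the operator under benchmark.
static void BM_conv2d_texture3d(benchmark::State& state) {
  // Setup: build the ComputeGraph, allocate staging tensors, prepack weights, etc.
  for (auto _ : state) {
    // Execute the operator once per iteration.
  }
}
BENCHMARK(BM_conv2d_texture3d);

BENCHMARK_MAIN();
```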
ghstack-source-id: 244287193

Reviewed By: derekxu, nathanaelsee

Differential Revision: D63286132

fbshipit-source-id: 25c379accf6664dfca8232db81772b638b41c758
Summary:
Add separate tests for Ethos-U85 to all backend operator tests.
Updated ethos-u-vela version to support more operators.

Signed-off-by: Per Åstrand <[email protected]>
Signed-off-by: Tom Allsop <[email protected]>

Pull Request resolved: #5346

Reviewed By: manuelcandales

Differential Revision: D62875027

Pulled By: digantdesai

fbshipit-source-id: 3bf238d81957258ee93ae235d575beff8a575191
Summary:
Pull Request resolved: #5565

Swap to using method meta so we can be finer grained about this check
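A hedged sketch of a finer-grained check via method metadata (assuming the `executorch::runtime::MethodMeta` API as used in the example runners; the actual check in this diff may differ):

```
#include <executorch/runtime/executor/program.h>

using executorch::runtime::MethodMeta;
using executorch::runtime::Program;
using executorch::runtime::Result;

// Sum the memory-planned buffer sizes for a method instead of relying on a
// single coarse heuristic.
bool planned_buffers_fit(const Program& program, const char* method, size_t budget) {
  Result<MethodMeta> meta = program.method_meta(method);
  if (!meta.ok()) {
    return false;
  }
  size_t total = 0;
  for (size_t i = 0; i < meta->num_memory_planned_buffers(); ++i) {
    total += static_cast<size_t>(meta->memory_planned_buffer_size(i).get());
  }
  return total <= budget;
}
```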

Reviewed By: dbort

Differential Revision: D62983475

fbshipit-source-id: c4599c5ecad0409cd8b2670464c4e9e8809b49ad
jackzhxng and others added 27 commits October 5, 2024 16:44
Summary:
Update the output of executor runner on the tutorial `add.pte`

Pull Request resolved: #5452

Reviewed By: mergennachin

Differential Revision: D63934482

Pulled By: dvorjackz

fbshipit-source-id: 4933508d21f6965e0fcddd0caf5bf0ec1acc69ee
Summary: Pull Request resolved: #5909

Reviewed By: mergennachin

Differential Revision: D63930714

Pulled By: dvorjackz

fbshipit-source-id: 46134ac5750315606dffbf246da55e712dd192e8
Summary:
The script didn't handle invalid JSON well when the benchmark results were empty (https://github.com/pytorch/executorch/actions/runs/11197448597/job/31128464106). It's better to upload whatever it finds instead of crashing in this case.

Looking a bit closer at the benchmark jobs, there are some cases where the `benchmark_results.json` file is empty. I'm not sure why yet, but it can be fixed in another PR:

* https://github.com/pytorch/executorch/actions/runs/11197448597/job/31127998979#step:15:286 (stories110M, qnn)
* https://github.com/pytorch/executorch/actions/runs/11197448597/job/31127999221#step:15:946 (vit, xnnpack)

The test spec already waits for 3 minutes to see if the file is there before giving up.

Pull Request resolved: #5916

Reviewed By: guangy10

Differential Revision: D63950793

Pulled By: huydhn

fbshipit-source-id: 160d1465395c025e028b0e2cb69b5837a8f0208a
Summary:
Pull Request resolved: #5882

Lots of things are redundant and a few need to move to utils. Subsequent changes will split the export function and separate the run part.

Main changes:
- call `fuse_pt2` after `convert_pt2` instead of `quantize_pt2`, and avoid calling `convert_pt2` twice
- move `print_ops_info` into `export_to_cadence`
- remove the need to call `export_to_edge` in `export_model`
- move the serialization utils to `utils.py`

Reviewed By: zonglinpeng

Differential Revision: D63795843

fbshipit-source-id: 7eb482b0daccf64d3f1ca73ffb5e5148584a6678
Summary:
Debug coreml failure, no need to review

Pull Request resolved: #5906

Reviewed By: cccclai

Differential Revision: D63950443

Pulled By: huydhn

fbshipit-source-id: 5c2c2ad0b140bf9b33d52c9631ff3cc4a576210f
Summary:
- e2e script for https://github.com/apple/ml-fastvit (fastvit_s18)
- add pass to handle mismatched tensor shape for broadcast ops when doing layout transform
- add ParamObserver for params with lots of outliers
- refactor & breakage fix

Pull Request resolved: #5543

Reviewed By: kimishpatel

Differential Revision: D63965451

Pulled By: cccclai

fbshipit-source-id: 40cd85f60a8a539e6600cac1bfe16cdac4bb0465
Summary:
Pull Request resolved: #5903

As title; these were tested on a more recent version, and 2.13 likely breaks.

Reviewed By: mergennachin

Differential Revision: D63922556

fbshipit-source-id: 926fe6fd172d65f02d470f689016c687f1504647
Summary:
Pull Request resolved: #5935

.

Reviewed By: kirklandsign

Differential Revision: D63988442

fbshipit-source-id: a0517166adc9ef3a1b2ee26d6003293b96ff7314
Summary:
Pull Request resolved: #5868

The schema file copy is an actual AOT step. It's a workaround, but it has been here a while; it's probably easier to have it as part of the build script instead of doing it manually every time.

Reviewed By: kirklandsign

Differential Revision: D63878765

fbshipit-source-id: 7fc4b848b72e2859364e21cbea3dd1b433d6cd8e
Summary:
Pull Request resolved: #5861

We should use this option when exporting 1B/3B models as bf16 because the KV cache is always fp32. Otherwise, we see regressed performance for 1B/3B in bf16 format.
ghstack-source-id: 246391007

Reviewed By: mergennachin

Differential Revision: D63871048

fbshipit-source-id: 6b3ff80dbc689a04c70e2fcc5c98698bb74f899b
Summary:
Fix ExecuTorch capitalization.

Pull Request resolved: #5945

Reviewed By: kirklandsign

Differential Revision: D63997153

Pulled By: shoumikhin

fbshipit-source-id: 5a3bd82d295c709ff9217b6325582e5cc2fab09c
Summary:
Pull Request resolved: #5932

TSIA
ghstack-source-id: 246607061
exported-using-ghexport
bypass-github-export-checks

Reviewed By: SS-JIA

Differential Revision: D63918660

fbshipit-source-id: 3034b31899e8079ccb1d443bade4a0185997bb7a
Summary:
Pull Request resolved: #5879

Port from LiteInterpreter's [`flip.glsl`](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/vulkan/glsl/flip.glsl).
```
- func: flip(Tensor self, int[] dims) -> Tensor
```

Will use this to verify AHB image is interpreted correctly.
ghstack-source-id: 246657861

Reviewed By: SS-JIA

Differential Revision: D63843843

fbshipit-source-id: e48ad206e7e563538142fa587b3acaf9c2919972
Summary:
Pull Request resolved: #5799

## Context

As title, this diff adds an implementation for a fused SDPA + KV-cache update operator which will be used in LLaMA models. Currently the SDPA portion of the operator is implemented via its constituent operators, but a future optimization opportunity would be to implement a single flash attention shader.

## Reference Implementation

For future reference, a reference implementation of the SDPA + KV cache update mechanism is shown below. This reference implementation was originally used to check intermediate outputs but in the end I decided to compare against the `sdpa_with_kv_cache` operator in `extension/llm` for simplicity.

```
at::Tensor convert_boolean_attn_mask(
    const at::Tensor& attn_mask,
    caffe2::TypeMeta dtype) {
  // Convert boolean mask to additive mask; need to invert mask to indicate what
  // to mask *out*.
  if (attn_mask.dtype() == at::kBool) {
    return at::where(
        attn_mask.logical_not(),
        -std::numeric_limits<double>::infinity(),
        at::scalar_tensor(
            0.0, at::TensorOptions().dtype(dtype).device(attn_mask.device())));
  }
  // Otherwise, attn_mask represents an additive attention tensor
  return attn_mask;
}

at::Tensor construct_attention_mask(
    const at::Tensor& q,
    const at::Tensor& k_cache,
    const int start_pos) {
  const int max_seq_len = k_cache.size(1);
  const int seq_len = q.size(1);
  at::Tensor attn_mask_base =
      at::ones({max_seq_len, start_pos + seq_len}, q.options().dtype(at::kBool))
          .tril();

  at::Tensor attn_mask_sliced =
      at::slice(attn_mask_base, 0, start_pos, start_pos + seq_len);

  attn_mask_sliced = convert_boolean_attn_mask(attn_mask_sliced, q.dtype());

  return attn_mask_sliced;
}

std::vector<at::Tensor> sdpa_reference_impl(
    const at::Tensor& q_projected,
    const at::Tensor& k_projected,
    const at::Tensor& v_projected,
    at::Tensor& key_cache,
    at::Tensor& value_cache,
    const int64_t start_pos,
    const int64_t seq_len,
    const c10::optional<at::Tensor> __attn_mask_ignored,
    const double dropout_p,
    const bool is_causal,
    const c10::optional<double> scale) {
  at::Tensor attn_mask =
      construct_attention_mask(q_projected, key_cache, start_pos);

  at::Tensor key_cache_updated = at::slice_scatter(
      key_cache, k_projected, 1, start_pos, start_pos + k_projected.size(1));
  at::Tensor value_cache_updated = at::slice_scatter(
      value_cache, v_projected, 1, start_pos, start_pos + v_projected.size(1));

  at::Tensor key_cache_sliced =
      at::slice(key_cache_updated, 1, 0, start_pos + q_projected.size(1));

  at::Tensor value_cache_sliced =
      at::slice(value_cache_updated, 1, 0, start_pos + q_projected.size(1));

  at::Tensor q_transposed = q_projected.transpose(1, 2);
  at::Tensor k_transposed = key_cache_sliced.transpose(1, 2);
  at::Tensor v_transposed = value_cache_sliced.transpose(1, 2);

  // Skip doing repeat_interleave; assume that num_attention_heads ==
  // num_kv_heads

  float scale_factor = 1.0 / sqrt(q_transposed.size(-1));

  at::Tensor k_transposed_2 = k_transposed.transpose(-2, -1);

  at::Tensor attn_weight_prescale = at::matmul(q_transposed, k_transposed_2);
  at::Tensor attn_weight = attn_weight_prescale * scale_factor + attn_mask;

  at::Tensor attn_weight_softmax = at::softmax(attn_weight, -1);
  at::Tensor out = at::matmul(attn_weight_softmax, v_transposed);

  return {
      out.transpose(1, 2),
      key_cache_sliced,
      value_cache_sliced,
      q_transposed,
      k_transposed,
      v_transposed,
      k_transposed_2,
      attn_weight_prescale,
      attn_weight,
      attn_weight_softmax,
      out,
  };
}
```
ghstack-source-id: 246640547

Reviewed By: kimishpatel

Differential Revision: D63724114

fbshipit-source-id: c85afc2f8eade8e0ac6e348eabbe608e5a0efce6
Summary:
Changing `xnnpack.passes` to `xnnpack._passes` to indicate that these passes are not covered under the API stability guarantee.

Pull Request resolved: #5917

Reviewed By: Olivia-liu, helunwencser

Differential Revision: D63925008

fbshipit-source-id: 3d9f13c0a3bd61c66d07cebd62047a3e24f8af1d
Summary:
Fix XNNPack tutorial to use `source_fn_stack` instead of `source_fn`.
Pull Request resolved: #5948

Reviewed By: dvorjackz

Differential Revision: D63950962

fbshipit-source-id: 5b4ced1c7edee4f5d60e9bffb8ab7a4a82788fcb
Summary:
Add missing `export_for_training` import in bundledio tutorial.
Pull Request resolved: #5949

Reviewed By: Olivia-liu, dvorjackz

Differential Revision: D63951201

fbshipit-source-id: 42648d6d3e09edfd77714cf4a474d64dcd6c5e07
Summary:
Bumps [numpy](https://github.com/numpy/numpy) from 1.21.3 to 1.22.0.
Release notes (sourced from [numpy's releases](https://github.com/numpy/numpy/releases)):

> ## NumPy 1.22.0 Release Notes
>
> NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:
>
> * Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
> * A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
> * NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
> * New methods for `quantile`, `percentile`, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
> * A new configurable allocator for use by downstream projects.
>
> These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.
>
> The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.
>
> ## Expired deprecations
>
> ### Deprecated numeric style dtype strings have been removed
>
> Using the strings `"Bytes0"`, `"Datetime64"`, `"Str0"`, `"Uint32"`, and `"Uint64"` as a dtype will now raise a `TypeError`. ([gh-19539](https://redirect.github.com/numpy/numpy/pull/19539))
>
> ### Expired deprecations for `loads`, `ndfromtxt`, and `mafromtxt` in npyio
>
> `numpy.loads` was deprecated in v1.15, with the recommendation that users use `pickle.loads` instead. `ndfromtxt` and `mafromtxt` were both deprecated in v1.17 - users should use `numpy.genfromtxt` instead with the appropriate value for the `usemask` parameter. ([gh-19615](https://redirect.github.com/numpy/numpy/pull/19615))

... (truncated)

Commits:

* [`4adc87d`](https://github.com/numpy/numpy/commit/4adc87dff15a247e417d50f10cc4def8e1c17a03) Merge pull request [#20685](https://redirect.github.com/numpy/numpy/issues/20685) from charris/prepare-for-1.22.0-release
* [`fd66547`](https://github.com/numpy/numpy/commit/fd66547557f57c430d41be2fc0764f74a62e8ccf) REL: Prepare for the NumPy 1.22.0 release.
* [`125304b`](https://github.com/numpy/numpy/commit/125304b035effcd82e366e601b102e7347eaa9ba) wip
* [`c283859`](https://github.com/numpy/numpy/commit/c283859128b1a4b57014581570a23ed7950a24ea) Merge pull request [#20682](https://redirect.github.com/numpy/numpy/issues/20682) from charris/backport-20416
* [`5399c03`](https://github.com/numpy/numpy/commit/5399c03d4a069fe81a1616be0184c9749d7271ee) Merge pull request [#20681](https://redirect.github.com/numpy/numpy/issues/20681) from charris/backport-20954
* [`f9c45f8`](https://github.com/numpy/numpy/commit/f9c45f8ebf31340b1a5a0371bfca25afcfc4794e) Merge pull request [#20680](https://redirect.github.com/numpy/numpy/issues/20680) from charris/backport-20663
* [`794b36f`](https://github.com/numpy/numpy/commit/794b36f7e1bf2a8c42774ab0db86a74bd32f674b) Update armccompiler.py
* [`d93b14e`](https://github.com/numpy/numpy/commit/d93b14e3d7abaa1d837825e51671f817788e120f) Update test_public_api.py
* [`7662c07`](https://github.com/numpy/numpy/commit/7662c0789cc6a70d5ad4d950ee2e95f3afef7df6) Update __init__.py
* [`311ab52`](https://github.com/numpy/numpy/commit/311ab52488a7d096ac3bc4c2de0fdae17ecd13ef) Update armccompiler.py
* Additional commits viewable in the [compare view](https://github.com/numpy/numpy/compare/v1.21.3...v1.22.0)

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=numpy&package-manager=pip&previous-version=1.21.3&new-version=1.22.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `dependabot rebase`.


 ---

Dependabot commands and options:

You can trigger Dependabot actions by commenting on this PR:
- `dependabot rebase` will rebase this PR
- `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `dependabot merge` will merge this PR after your CI passes on it
- `dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `dependabot cancel merge` will cancel a previously requested merge and block automerging
- `dependabot reopen` will reopen this PR if it is closed
- `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/pytorch/executorch/network/alerts).

> **Note**
> Automatic rebases have been disabled on this pull request as it has been open for over 30 days.

Pull Request resolved: #4514

Reviewed By: tarun292

Differential Revision: D64004841

Pulled By: larryliu0820

fbshipit-source-id: 30509aaa9b7b74ab7377e66430447cdd545c03e5
Summary:
Pull Request resolved: #5941

Pull Request resolved: #5875

as title

also bump pytorch pin

clone of D63845664, this time does not export to a fork

Reviewed By: mergennachin, tugsbayasgalan

Differential Revision: D63994058

fbshipit-source-id: 9a616770607ea73c71465d2728406b35151f81f0
Summary: Pull Request resolved: #5952

Reviewed By: kirklandsign

Differential Revision: D64005568

fbshipit-source-id: 7cd8ab9fe33d5745064aca7720d34ce1d9f4f06b
Summary:
Pull Request resolved: #5953

Improve the logic used to generate a debug handle for delegate nodes. The existing logic is causing conflicts with other nodes' debug handles.

Reviewed By: Olivia-liu

Differential Revision: D63779522

fbshipit-source-id: ff1583ee21bf05748c91a1c8b8e489996f80d7d9
Summary:
Pull Request resolved: #5904

Move the Arm backend out of the `torch::` namespace, and update to avoid using the `torch::` or `exec_aten::` namespaces.

Also, move the VelaBinStream code into a namespace to avoid declaring symbols in the global namespace.

Reviewed By: mergennachin

Differential Revision: D63923290

fbshipit-source-id: a98e4c1ede8072a9fd96fb7fc2e2fcc4e82bc28a
Summary:
* Rename `_portable_lib.cpython-3.<distribution info>.so` to `_portable_lib.so` so it can be found by CMake `find_library()`. This can be achieved by setting `SETUPTOOLS_EXT_SUFFIX`.
* Since `executorch-config.cmake` is also being used to find installed libraries such as `executorch.a` and `xnnpack_backend.a`, add a condition to tell whether `executorch-config.cmake` is being used from cmake-out or from site-packages.

Pull Request resolved: #5961

Reviewed By: metascroy

Differential Revision: D64014291

Pulled By: larryliu0820

fbshipit-source-id: 2757f2883d3f836e9efd45676f792c12f742e63d
kirklandsign had a problem deploying to upload-benchmark-results on October 8, 2024 01:07 with GitHub Actions (Failure)
kirklandsign deleted the mtk-5 branch on October 29, 2024 22:55
Labels: CLA Signed