
Conversation

@manuelcandales (Contributor)

Enables Float16 in the Metal backend

Copilot AI review requested due to automatic review settings November 21, 2025 20:31
pytorch-bot bot commented Nov 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15947

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 21, 2025
@manuelcandales manuelcandales requested review from Gasoonjia and removed request for cccclai, Copilot, jackzhxng and shoumikhin November 21, 2025 20:33
Copilot finished reviewing on behalf of manuelcandales November 21, 2025 20:35
@manuelcandales manuelcandales added the release notes: none Do not include this in the release notes label Nov 21, 2025
Copilot AI review requested due to automatic review settings November 21, 2025 20:50
Copilot finished reviewing on behalf of manuelcandales November 21, 2025 20:54
Copilot AI left a comment


Pull request overview

This PR enables Float16 (Half precision) support in the Metal backend, complementing the existing BFloat16 support. The changes add Float16 as a supported data type across the Metal backend infrastructure and provide conversion utilities for transforming Float32 tensors to Float16.

Key Changes:

  • Added convert_to_float16 utility function mirroring the existing convert_to_bfloat16 pattern (a rough sketch follows this list)
  • Enabled Float16 dtype code (5) in Metal backend type system and validation
  • Extended Metal operations (matrix multiplication, convolution, attention) to handle Float16 data
  • Updated CI/CD workflows to test both float16 and bfloat16 dtypes
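
To make the first item concrete, here is a minimal buffer-level sketch of a Float32→Float16 conversion. It is illustrative only: the PR's actual convert_to_float16 in extension/llm/runner/util.h operates on ExecuTorch tensors, and the names, signature, and rounding behavior below are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Scalar float32 -> IEEE 754 binary16 bit pattern. For brevity this
// truncates the mantissa (no round-to-nearest-even), flushes subnormals
// to signed zero, and maps NaN to infinity.
inline uint16_t float_to_half_bits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  uint32_t sign = (bits >> 16) & 0x8000u;
  int32_t exp = static_cast<int32_t>((bits >> 23) & 0xFFu) - 127 + 15;
  uint32_t mant = (bits >> 13) & 0x3FFu;
  if (exp <= 0) return static_cast<uint16_t>(sign);             // underflow -> signed zero
  if (exp >= 31) return static_cast<uint16_t>(sign | 0x7C00u);  // overflow/NaN -> infinity
  return static_cast<uint16_t>(sign | (static_cast<uint32_t>(exp) << 10) | mant);
}

// Hypothetical buffer-level analogue of the PR's convert_to_float16:
// walk the float32 data and emit half-precision bit patterns.
inline std::vector<uint16_t> convert_to_float16(const float* data, size_t n) {
  std::vector<uint16_t> out(n);
  for (size_t i = 0; i < n; ++i) {
    out[i] = float_to_half_bits(data[i]);
  }
  return out;
}
```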

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Summary per file:

  • extension/llm/runner/util.h: Added convert_to_float16 helper function for Float→Half tensor conversion
  • extension/asr/runner/runner.cpp: Added Float16 conversion support in ASR audio feature preprocessing
  • backends/apple/metal/runtime/shims/utils.h: Uncommented FLOAT16 enum value to enable Float16 dtype
  • backends/apple/metal/runtime/shims/utils.cpp: Added Float16 to supported dtypes validation
  • backends/apple/metal/runtime/shims/et_metal_ops.mm: Added Float16 handling in mm, convolution, and attention operations
  • backends/aoti/utils.h: Added dtype code 5 mapping to ScalarType::Half
  • backends/aoti/common_shims.h: Added aoti_torch_dtype_float16 function declaration
  • backends/aoti/common_shims.cpp: Implemented aoti_torch_dtype_float16 returning dtype code 5
  • .github/workflows/metal.yml: Added dtype matrix parameter (float16/bfloat16) for testing
  • .github/workflows/cuda.yml: Added bfloat16 dtype parameter to script calls for consistency
  • .ci/scripts/export_model_artifact.sh: Added dtype parameter and validation to export script
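
For context on the backends/aoti rows above, the shim addition amounts to exposing the numeric code for Half (5, matching PyTorch's ScalarType::Half) so AOTI-generated code can query it. A sketch, with the exact signature and linkage assumed from the naming convention of the existing shims:

```cpp
#include <cstdint>

// Sketch of the new shim: reports the dtype code for Float16/Half.
// AOTI-generated code compares tensor dtype codes against this value,
// and the backend maps code 5 back to ScalarType::Half.
extern "C" int32_t aoti_torch_dtype_float16() {
  return 5; // ScalarType::Half
}
```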


ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
script: |
-  source .ci/scripts/test_model_e2e.sh cuda "${{ matrix.model.repo }}/${{ matrix.model.name }}" "${{ matrix.quant }}" "${RUNNER_ARTIFACT_DIR}"
+  source .ci/scripts/test_model_e2e.sh cuda bfloat16 "${{ matrix.model.repo }}/${{ matrix.model.name }}" "${{ matrix.quant }}" "${RUNNER_ARTIFACT_DIR}"
Copilot AI commented Nov 21, 2025


The script test_model_e2e.sh is being called with a new signature that includes dtype as the second parameter: test_model_e2e.sh cuda bfloat16 "${{ matrix.model.repo }}/${{ matrix.model.name }}" "${{ matrix.quant }}". However, based on the current implementation of test_model_e2e.sh (lines 60-62), it expects the old signature: <device> <hf_model> <quant_name> [model_dir]. This mismatch will cause the script to fail as it will interpret bfloat16 as the HF model name and the actual model name as the quant parameter. The test_model_e2e.sh script needs to be updated to accept and handle the dtype parameter in the same way export_model_artifact.sh was updated.

echo "::endgroup::"
-${CONDA_RUN} bash .ci/scripts/test_model_e2e.sh metal "${{ matrix.model.repo }}/${{ matrix.model.name }}" "${{ matrix.quant }}" "${RUNNER_ARTIFACT_DIR}"
+${CONDA_RUN} bash .ci/scripts/test_model_e2e.sh metal "${{ matrix.dtype }}" "${{ matrix.model.repo }}/${{ matrix.model.name }}" "${{ matrix.quant }}" "${RUNNER_ARTIFACT_DIR}"
Copilot AI commented Nov 21, 2025


The script test_model_e2e.sh is being called with a new signature that includes dtype as the second parameter: test_model_e2e.sh metal "${{ matrix.dtype }}" "${{ matrix.model.repo }}/${{ matrix.model.name }}" "${{ matrix.quant }}". However, the current implementation of test_model_e2e.sh (lines 60-62) expects the old signature: <device> <hf_model> <quant_name> [model_dir]. This mismatch will cause the script to fail as it will interpret dtype as the HF model name and the actual model name as the quant parameter. The test_model_e2e.sh script needs to be updated to accept and handle the dtype parameter consistently with how export_model_artifact.sh was updated.

@manuelcandales manuelcandales had a problem deploying to upload-benchmark-results November 21, 2025 21:39 — with GitHub Actions Failure
ET_LOG(Debug, "aoti_torch_mps_mm_out: self_tensor scalar_type=%d, SupportedDTypes::FLOAT32=%d, SupportedDTypes::FLOAT16=%d, SupportedDTypes::BFLOAT16=%d",
dtype, static_cast<int32_t>(SupportedDTypes::FLOAT32), static_cast<int32_t>(SupportedDTypes::FLOAT16), static_cast<int32_t>(SupportedDTypes::BFLOAT16));

if (dtype == static_cast<int32_t>(SupportedDTypes::FLOAT32)) {
A contributor left a comment:


I'm not very familiar with ET coding practice, but considering you have to modify it in two places, why not have a to_mps_dtype(SupportedDtypes) inline function and call it both here and a few hundred lines below?
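
A sketch of the suggested helper, assuming the Metal Performance Shaders MPSDataType constants and the SupportedDTypes enum seen in the log statement above, plus ExecuTorch's ET_CHECK_MSG macro for the failure path; the name and placement are the reviewer's suggestion, not the PR's code:

```cpp
#import <MetalPerformanceShaders/MetalPerformanceShaders.h>

// Suggested helper: one switch from the backend's dtype codes to MPS
// dtypes, shared by the mm, convolution, and attention paths.
inline MPSDataType to_mps_dtype(int32_t dtype) {
  switch (static_cast<SupportedDTypes>(dtype)) {
    case SupportedDTypes::FLOAT32:
      return MPSDataTypeFloat32;
    case SupportedDTypes::FLOAT16:
      return MPSDataTypeFloat16;
    case SupportedDTypes::BFLOAT16:
      return MPSDataTypeBFloat16;
    default:
      ET_CHECK_MSG(false, "unsupported dtype code: %d", dtype);
      return MPSDataTypeInvalid; // unreachable; keeps the compiler happy
  }
}
```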

Comment on lines 201 to 203
} // namespace llm
} // namespace extension
} // namespace executorch
A contributor left a comment:


Nit: if ET is a C++17-compatible project, why not use nested namespaces?
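
For illustration, the C++17 form the reviewer is suggesting, applied to the closers quoted above:

```cpp
// C++17 nested namespace definition: one declaration replaces three
// nested blocks, and a single closer replaces the three above.
namespace executorch::extension::llm {

// ... declarations ...

} // namespace executorch::extension::llm
```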


Labels

  • CLA Signed - This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
  • release notes: none - Do not include this in the release notes
