fix(deps): update dependency torch to v1.13.1#183
Merged
github-actions[bot] merged 1 commit into master, Mar 8, 2026
This PR contains the following updates:
2.8.0 → 1.13.1 (Release Notes)
pytorch/pytorch (torch)
v1.13.1: PyTorch 1.13.1 Release, small bug fix release
This release is meant to fix the following issues (regressions / silent correctness):
The release tracker should contain all relevant pull requests related to this release, as well as links to related issues.
v1.13.0: PyTorch 1.13: beta versions of functorch and improved support for Apple's new M1 chips are now available
PyTorch 1.13 Release Notes
Highlights
We are excited to announce the release of PyTorch 1.13! This includes stable versions of BetterTransformer. We deprecated CUDA 10.2 and 11.3 and completed migration of CUDA 11.6 and 11.7. Beta includes improved support for Apple M1 chips and functorch, a library that offers composable vmap (vectorization) and autodiff transforms, being included in-tree with the PyTorch release. This release is composed of over 3,749 commits and 467 contributors since 1.12.1. We want to sincerely thank our dedicated community for your contributions.
Summary:
The BetterTransformer feature set supports fastpath execution for common Transformer models during inference out-of-the-box, without the need to modify the model. Additional improvements include accelerated add+matmul linear algebra kernels for sizes commonly used in Transformer models, and Nested Tensors are now enabled by default.
Timely deprecating older CUDA versions allows us to introduce the latest CUDA versions as they are released by Nvidia®, and hence allows support for C++17 in PyTorch and the new NVIDIA Open GPU Kernel Modules.
Previously, functorch was released out-of-tree in a separate package. After installing PyTorch, a user will be able to `import functorch` and use functorch without needing to install another package.
PyTorch is offering native builds for Apple® silicon machines that use Apple's new M1 chip as a beta feature, providing improved support across PyTorch's APIs.
You can check the blogpost that shows the new features here.
Backwards Incompatible changes
Python API
uint8 and all integer dtype masks are no longer allowed in Transformer (#87106)
Prior to 1.13, `key_padding_mask` could be set to uint8 or other integer dtypes in `TransformerEncoder` and `MultiheadAttention`, which might generate unexpected results. In this release, these dtypes are no longer allowed for the mask. Please convert them to `torch.bool` before using.
1.12.1
1.13
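The stripped before/after comparison can be sketched as follows (the mask values here are illustrative, not from the release notes):

```python
import torch

# Hypothetical integer padding mask (accepted before 1.13, rejected now).
key_padding_mask = torch.tensor([[0, 0, 1], [0, 1, 1]], dtype=torch.uint8)

# Convert to torch.bool before passing it to TransformerEncoder or
# MultiheadAttention; nonzero entries become True (masked positions).
bool_mask = key_padding_mask.to(torch.bool)
```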
Updated
`torch.floor_divide` to perform floor division (#78411)
Prior to 1.13, `torch.floor_divide` erroneously performed truncation division (i.e. truncated the quotients). In this release, it has been fixed to perform floor division. To replicate the old behavior, use `torch.div` with `rounding_mode='trunc'`.
1.12.1
1.13
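A minimal sketch of the behavior change, using negative operands where floor and truncation division differ:

```python
import torch

a, b = torch.tensor(-7), torch.tensor(2)

# 1.13 behavior: true floor division, rounding toward -inf.
floor_result = torch.floor_divide(a, b)                 # -> -4

# Pre-1.13 behavior (truncation toward zero) can be replicated with:
trunc_result = torch.div(a, b, rounding_mode='trunc')   # -> -3
```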
Fixed
`torch.index_select` on CPU to error when an index is out of bounds and the `source` tensor is empty (#77881)
Prior to 1.13, `torch.index_select` would return an appropriately sized tensor filled with random values on CPU if the source tensor was empty. In this release, the bug has been fixed so that it errors out. A consequence of this is that `torch.nn.Embedding`, which utilizes `index_select`, will error out rather than returning an empty tensor when `embedding_dim=0` and `input` contains out-of-bounds indices. The old behavior cannot be reproduced with `torch.nn.Embedding`; however, since an Embedding layer with `embedding_dim=0` is a corner case, this behavior is unlikely to be relied upon.
1.12.1
1.13
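As an illustrative sketch of the new behavior (any index into an empty source is out of bounds and now raises):

```python
import torch

src = torch.empty(0)       # empty source tensor
idx = torch.tensor([0])    # necessarily out of bounds for an empty tensor

# Since 1.13 this raises instead of returning uninitialized values.
try:
    torch.index_select(src, 0, idx)
except (IndexError, RuntimeError) as e:
    print("index out of bounds:", e)
```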
Disallow overflows when tensors are constructed from scalars (#82329)
Prior to this PR, overflows during tensor construction from scalars would not throw an error. In 1.13, such cases will error.
1.12.1
1.13
Error on indexing a cpu tensor with non-cpu indices (#69607)
Prior to 1.13, `cpu_tensor[cuda_indices]` was a valid program that would return a cpu tensor. The original use case for mixed device indexing was `non_cpu_tensor[cpu_indices]`, and allowing the opposite (`cpu_tensor[non_cpu_indices]`) was unintentional. This behavior appears to be rarely used, and a refactor of our indexing kernels made it difficult to represent an op that takes in (cpu_tensor, non_cpu_tensor) and returns another cpu_tensor, so it is now an error. To replicate the old behavior for `base[indices]`, ensure that either `indices` lives on the CPU device, or that `base` and `indices` both live on the same device.
1.12.1
1.13
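The workaround can be sketched as follows; the CUDA branch is only exercised when a GPU is available, and the tensors are illustrative:

```python
import torch

base = torch.arange(6.)          # CPU tensor
indices = torch.tensor([0, 2])

if torch.cuda.is_available():
    # Since 1.13, base[gpu_indices] errors out for a CPU base; move the
    # indices back to the CPU to align devices explicitly.
    gpu_indices = indices.cuda()
    result = base[gpu_indices.cpu()]
else:
    result = base[indices]       # same-device indexing is always fine
```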
Remove deprecated `torch.eig`, `torch.matrix_rank`, `torch.lstsq` (#70982, #70981, #70980)
The deprecation cycle for the above functions has been completed and they have been removed in the 1.13 release.
torch.nn
Enforce that the `bias` has the same dtype as `input` and `weight` for convolutions on CPU (#83686)
To align with the implementation on other devices, the CPU implementation for convolutions was updated to enforce that the `dtype` of the `bias` matches the `dtype` of the `input` and `weight`.
1.12.1
1.13
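A minimal sketch of the fix, assuming a functional conv call with a mismatched-dtype bias (the shapes are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 2, 8)                     # float32 input
weight = torch.randn(3, 2, 3)                # float32 weight
bias = torch.randn(3, dtype=torch.float64)   # mismatched float64 bias

# Since 1.13 the CPU conv kernels require bias.dtype == input.dtype;
# casting the bias restores the pre-1.13 behavior.
out = F.conv1d(x, weight, bias.to(x.dtype))
```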
Autograd
Disallow setting the `.data` of a tensor that `requires_grad=True` with an integer tensor (#78436)
Setting the `.data` of a tensor that `requires_grad` with an integer tensor now raises an error.
1.12.1
1.13
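A sketch of the change and the obvious workaround (cast to a floating dtype first); the values are illustrative:

```python
import torch

t = torch.randn(3, requires_grad=True)

new_values = torch.tensor([1, 2, 3])     # int64
t.data = new_values.to(torch.float32)    # OK: dtype stays floating point

# Assigning an integer tensor directly now raises, because an integer
# tensor can never require grad.
try:
    t.data = torch.tensor([4, 5, 6])
except RuntimeError as e:
    print("rejected:", e)
```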
Added variable_list support to ExtractVariables struct (#84583)
Prior to this change, C++ custom autograd Functions considered tensors passed in a TensorList not to be tensors for the purposes of recording the backward graph. After this change, custom Functions that receive a TensorList must modify their backward functions to also compute gradients for these additional tensor inputs. Note that this behavior now differs from that of custom autograd Functions in Python.
1.12.1
1.13
Don't detach when making views; force kernel to detach (#84893)
View operations registered as CompositeExplicitAutograd kernels are no longer allowed to return input tensors as-is. You must explicitly create a new tensor (e.g., using `.alias()`).
1.12.1
1.13
ONNX
`torch.onnx.register_custom_op_symbolic` now only registers the symbolic function at the specified opset version (#85636)
This updates `register_custom_op_symbolic`'s behavior to register the symbolic function at a single version only, which is more aligned with the semantics of the API signature. Previously, the API registered a symbolic function to all versions up to the specified version. As a result of this change, users need to register a symbolic function at the exact version when they want to override an existing symbolic function. Users are not affected if (1) an implementation does not exist for the op, or (2) the symbolic function is already registered at the exact version used for export.
1.12.1
1.13
Default ONNX opset is updated to 14 (#83284)
The update is done regularly to ensure we stay in sync with ONNX updates. Users can specify `opset_version` in `torch.onnx.export` to maintain opset version 13.
`torch.onnx.symbolic_registry` is removed (#84382)
We removed the `symbolic_registry` module and hid it as an internal implementation detail. Users previously relying on the `register_op` function to register custom symbolic functions should move to the `torch.onnx.register_custom_op_symbolic` API.
`ScalarType` and global variables in `torch.onnx.symbolic_helper` are removed (#82995)
The `ScalarType` class in `torch.onnx.symbolic_helper`, along with the global variables `cast_pytorch_to_onnx`, `pytorch_name_to_type`, `scalar_name_to_pytorch`, `scalar_type_to_onnx` and `scalar_type_to_pytorch_type`, are removed from the module. Users previously using these global variables for PyTorch JIT-ONNX type conversion in symbolic functions should move to the `torch.onnx.JitScalarType` class.
1.12.1
1.13
Distributed
In c10d collectives, input tensors dtype must now be the same (#84664)
We added a check to validate that the dtype matches across all input tensors. Previously, users were allowed to pass in tensors with different dtypes for c10d collectives. Now, passing in tensors with different dtypes will throw a RuntimeError with the following message: "Invalid usage of tensors with different dtypes Found `torch.float` and `torch.half`". Users can use `tensor.to(dtype={some_dtype})` to fix this.
1.12.1
1.13
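The fix can be sketched without initializing a process group, by harmonizing dtypes before the collective call (the choice of float32 here is arbitrary):

```python
import torch

# Mixed-dtype inputs that a c10d collective would now reject.
tensors = [torch.ones(2, dtype=torch.float32),
           torch.ones(2, dtype=torch.float16)]

# Cast everything to a single dtype before calling the collective
# (e.g. before dist.all_reduce on each tensor).
tensors = [t.to(torch.float32) for t in tensors]
```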
Users doing wildcard imports of torch.distributed.distributed_c10d will no longer get non-public symbols (#84872)
We limit the usage of c10d APIs to public APIs, so if a user does a wildcard import and calls an internal API, it will fail. Please see the example below:
1.12.1
1.13
Process Group C++ extensions must use absolute path when importing ProcessGroup.hpp (#86257), ProcessGroup::Work object moved out of work to its own Work class (#83680):
Details of the changes and the updated tutorial can be found in the PyTorch tutorial PR #2099
1.12.1
1.13
Quantization
Add required `example_inputs` argument to `prepare_fx` and `prepare_qat_fx` (#249) (#77608)
We added an additional required `example_inputs` argument to the `prepare_fx` and `prepare_qat_fx` APIs; it can be used to do type inference to figure out the type information for each fx Node in the graph.
1.12.1
1.13
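A sketch of the new call shape, assuming the FX graph mode quantization APIs and the `"fbgemm"` backend are available on the host (the model here is a toy example):

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
example_inputs = (torch.randn(1, 4),)   # the new required argument

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
prepared = prepare_fx(model, qconfig_mapping, example_inputs)
```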
Stop moving models to CPU in quantization convert (#80555)
Previously, we automatically moved the model to CPU in `torch.ao.quantization.fx.convert` to work around the issue where certain functions called by convert expect CPU arguments. This commit pushes the responsibility to the caller, since it is the user's decision which device to use.
1.12.1
1.13
Replace the `is_reference` flag of the `torch.ao.quantize_fx.convert_fx` function with the `convert_to_reference` function (#80091, #81326)
This PR removes the `is_reference` flag from the existing `convert_fx` API and replaces it with a new `convert_to_reference` function. This separates (1) converting the prepared model to a reference model from (2) lowering the reference model to a quantized model, enabling users to call their custom lowering function for custom backends.
1.12.1
1.13
Add default configs for fixed qparams ops (#80184)
This commit adds qconfigs with special observers for fixed qparams ops (operators whose corresponding quantized version has fixed quantization parameters for output), like sigmoid, in `get_default_qconfig_mapping` and `get_default_qat_qconfig_mapping`. For correctness, we also require users to use these special observers if we detect these fixed qparams ops in prepare.
1.12.1 (fails after this PR):
1.13
Replace `qconfig_dict` with a typed `QConfigMapping` object (#78452, #79618)
Previously, FX graph mode quantization configurations were specified through a dictionary of qconfigs. However, this API was not in line with other core APIs in PyTorch. This commit replaces the dictionary with a config object that users create and pass to prepare and convert. This leads to better type safety and a better user experience in notebook settings due to improved auto-completion.
1.12.1 (deprecated)
1.13
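The typed replacement can be sketched as below; the `"head"` module name is a hypothetical submodule used only to show the chained-setter style:

```python
from torch.ao.quantization import QConfigMapping, get_default_qconfig

# Typed replacement for the old qconfig_dict; setters chain fluently.
qconfig_mapping = (
    QConfigMapping()
    .set_global(get_default_qconfig("fbgemm"))
    .set_module_name("head", None)   # hypothetical submodule, left unquantized
)
```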
Replace `*custom_config_dict` with typed config objects (#79066)
This commit replaces the following config dicts with Python objects:
This leads to better type safety and a better user experience in notebook settings due to improved auto-completion.
1.12.1
1.13
Remove `remove_quant_dequant_pairs` and fix tests (#84203)
This PR removed some passes in `convert_fx` and also fixes the way we quantize the layer_norm operator, so the `qconfig` for the layer_norm op needs to be updated as well.
1.12.1
1.13
Align observer dtype with reference model spec (#85345)
Before this PR, the `dtype` attribute of observers was not clearly defined. It originally meant `interface_dtype` in the eager mode workflow, which is how the codebase before this PR used it. In the new reference model spec, the `dtype` attribute of an observer represents the `dtype` value which needs to be passed into a `quantize` function in the reference model spec. This PR aligns the codebase to this definition of `dtype`.
1.12.1
1.13
Composability
Changed the backend C++ kernel representation for some operators that take in lists of tensors (#73350)
If an operator in ATen takes in a list of tensors, and is marked as "structured" in native_functions.yaml (example), then previously TensorList was represented as `at::TensorList`, or `c10::ArrayRef<at::Tensor>`. Now, it is represented as a more efficient type: `const ITensorListRef&`.
1.12.1
1.13
C++ API
Lowered randint default dtype to the C++ API (#81410)
Prior to 1.13, the default for the `dtype` argument of `torch.randint`, `torch.long`, was set via manual python binding. However, in the C++ API, `torch::randint` would default to the global default data type, which is usually `float`. In 1.13 we changed the default for `dtype` in the C++ API to `int64` in order to match the python API. To reproduce the old behavior, one can set the `dtype` argument.
1.12.1
1.13
Enabled `dim=None` for `torch.{std, var, std_mean, var_mean}` (#81845, #82765, #82912)
Prior to 1.13, a C++ API call with argument types `torch::{std, var, std_mean, var_mean}(Tensor, OptionalIntArrayRef, int64_t, bool)` used to resolve to the `{std, var, std_mean, var_mean}.correction` overload. In this release, it resolves to the `{std, var, std_mean, var_mean}.dim` overload. With the `.correction` overload, the third argument of type `int64_t` could be used to pass a correction δN other than 1. In order to call the `{std, var, std_mean, var_mean}.correction` overload in 1.13, the old `int64_t` argument can be wrapped in a `c10::optional`.
1.12.1
1.13
Deprecations
Distributed
We are deprecating the following APIs of c10d: `*_coalesced` APIs (#85959), `*_multigpu` APIs (#85961) and `ProcessGroupRoundRobin` (#85158)
We added warnings when users call c10d's `*_coalesced`, `*_multigpu` and `ProcessGroupRoundRobin` APIs. Previously, users could use these APIs without any warnings, but now they will see warnings like "torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions". There are still workarounds for the `*_coalesced` APIs, but no workarounds will be provided for the other two.
1.12.1
1.13
We are deprecating passing `optim_input` into the FSDP optimizer state checkpointing APIs. The user can simply not pass the `optim_input` argument, and all behavior is preserved. No fix is needed from the user's side for now.
1.12.1
1.13
LinAlg
Deprecate `torch.lu` in favor of `linalg.lu_factor` (#77636)
The new operation has a cleaner API and better docs. The update rule is as follows:
1.12.1
1.13
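The update rule can be sketched as a one-line substitution (the matrix here is an arbitrary well-conditioned example):

```python
import torch

A = torch.randn(3, 3) + 3 * torch.eye(3)   # illustrative input

# Old (deprecated):  LU, pivots = torch.lu(A)
# New:
LU, pivots = torch.linalg.lu_factor(A)
```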
Deprecate `torch.lu_solve` in favor of `linalg.lu_solve` (#77637)
The new operation has a notation consistent with `linalg.solve`, and has an extra parameter `adjoint=False`. The update rule is as follows:
1.12.1
1.13
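A sketch of the migrated call, noting the changed argument order (the system solved here is an arbitrary example):

```python
import torch

A = torch.randn(3, 3) + 3 * torch.eye(3)
b = torch.randn(3, 2)

LU, pivots = torch.linalg.lu_factor(A)

# Old (deprecated): x = torch.lu_solve(b, LU, pivots)
# New (argument order consistent with linalg.solve):
x = torch.linalg.lu_solve(LU, pivots, b)
```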
ONNX
Monkey-patched convenience methods on `torch._C.Graph`, `torch._C.Block` and `torch._C.Node` are deprecated (#83006)
Deprecated methods include `Graph.op()`, `Graph.constant()`, `Graph.at()`, `Block.op()`, and `Node.__getitem__()`. Previously, these methods were patched into the classes above when users call `torch.onnx.export()` and are typically used in custom symbolic functions. Users can continue to expect `g.op()` and `g.at()` in symbolic functions to work. The `g` parameter has been substituted by the `GraphContext` object (#84728). The methods are now exposed by the `GraphContext` class with APIs unchanged. Users should not rely on the `Graph.op()`, `Graph.constant()`, `Graph.at()`, `Block.op()`, `Node.__getitem__()` methods when directly interacting with the C classes. Users should use only the `op()` and `at()` methods of the `GraphContext` object, as other fields in the class will change in future releases.
New features
Python API
- `scatter_add` on CUDA for all input sizes (#79466)
- `torch.concatenate` that aliases `torch.cat` (#85073)
- `Tensor.is_cpu()` that returns whether a tensor is on CPU (#78887)
- `force` kwarg to `Tensor.numpy()` that enables returning a numpy `ndarray` that does not share storage with the tensor (#78564)
- `torch.special.{airy_ai, bessel_j0, bessel_j1, bessel_y0, bessel_y1, modified_bessel_i0, modified_bessel_i1, modified_bessel_k0, modified_bessel_k1, scaled_modified_bessel_k0, scaled_modified_bessel_k1, spherical_bessel_j0}` (#78900), (#78901), (#78902), (#78912), (#78451)
- `torch.special.{chebyshev_polynomial_t, chebyshev_polynomial_u, chebyshev_polynomial_v, chebyshev_polynomial_w, hermite_polynomial_h, hermite_polynomial_he, laguerre_polynomial_l, legendre_polynomial_p, shifted_chebyshev_polynomial_t, shifted_chebyshev_polynomial_u, shifted_chebyshev_polynomial_v, shifted_chebyshev_polynomial_w}` (#78196), (#78293), (#78304), (#78366), (#78352), (#78357)
- `weights_only` option to `torch.load` that restricts load to state_dict only, enabling safe loading. This can also be set using the `TORCH_FORCE_WEIGHTS_ONLY_LOAD` environment variable (#86812)
Build
- `-Werror=unused-but-set-variable` build flag (#79305)
- `-Werror=type-limits` in Bazel CPU build (#79139)
- `-Werror=unused-variable` in Bazel CPU build (#79156)
- `-Wconstant-conversion` to catch errors detected in #75400 (#80461)
- `-Werror=non-virtual-dtor` build flag (#81012)
- `-Wunused-local-typedef` build flag (#86154)
Complex
- `torch.{index_select, index_add}` (#79217), (#79897)
- `torch.roll` (#79970), `torch.fft.{fftshift, ifftshift}` (#79970), `torch.{acos, acosh, asinh, atanh}` (#80030), `torch.{cos, sinh, cosh, tanh}` (#78718), `torch.sqrt, rsqrt` (#77490), `torch.{triu, tril, diag, trace}` (#78062)
- `torch.where` (#78665), `torch.{where, pow, masked_fill, sgn, tan, angle}` (#78665)
- `torch.nn.ConvTranspose1d` (#79694)
torch.nn
- `pop` function to `nn.Sequential` and `nn.ModuleList` (#81601)
- `nn.Module` (#80811)
torch.optim
- `maximize` kwarg for `optim.SparseAdam` (#80336), `optim.ASGD` (#81875), `optim.Rprop` (#81864), `optim.RMSprop` (#80326)
- `differentiable` kwarg for `optim.SGD` (#80938), `optim.Adam` (#82205), `optim.RMSprop` (#83578)
- `optim.Adam` (#80279), `optim.AdamW` (#80280), `optim.Adamax` (#80319), `optim.RMSprop` (#83860), `optim.Rprop` (#83858), `optim.{RMSprop, ASGD}` (#83860), (#84472)
- `optim.lr_scheduler.PolynomialLR` (#82769)
BetterTransformer
ForEach
- foreach `maximum` and `minimum` (#82523)
LinAlg
- `linalg.lu_solve`, `linalg.solve_ex`, `linalg.vecdot`, `linalg.vander` (#77634, #80073, #70542, #76303)
Sparse
- `torch.sparse.spdiags` for easier creation of diagonal sparse matrices (#78439)
torch.fx
JIT
- `torch.ops.nvprims` namespace for nvFuser-specific prims (#82155)
- `conv_transpose2d.input, convolution, convolution_backward` (#77283, #83557, #80860)
- `aten::_convolution` when it is 2D conv in NNC (#84038)
- `ProcessGroup::Work.wait()` API to TorchScript (#83303)
ONNX
- `prim::PythonOp` for Autograd Function Export (#74765)
AMD
CUDA
Intel
MPS
- `aten::index_add.out` operator for MPS backend (#79935)
- `aten::prelu` operator for MPS backend (#82401)
- `aten::bitwise-not` operator native support for MPS backend (#83678)
- `aten::tensor::index_put` operator for MPS backend (#85672)
- `aten::upsample_nearest1d` operator for MPS backend (#81303)
- `aten::bitwise_{and|or|xor}` operators for MPS backend (#82307)
Configuration
📅 Schedule: Branch creation - At 12:00 AM through 04:59 AM and 10:00 PM through 11:59 PM, Monday through Friday ( * 0-4,22-23 * * 1-5 ), Only on Sunday and Saturday ( * * * * 0,6 ) (UTC), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled because a matching PR was automerged previously.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.