Commit b8c0b33

cthi authored and facebook-github-bot committed
Split the quantize_ops_gpu (#4842)
Summary:
Pull Request resolved: #4842
X-link: facebookresearch/FBGEMM#1870

It is finally time to split this infamous target :)

- OSS should be minimally impacted; the one small change was merging `cublas_utils.h` into the one kernel that uses it.

Reviewed By: q10
Differential Revision: D82031570
fbshipit-source-id: 4c9d0a94a56e117777aa998f497dc2822f96c7ae
1 parent 2c8ef86 commit b8c0b33

File tree

2 files changed (+14, −24 lines)


fbgemm_gpu/experimental/gen_ai/src/quantize/cublas_utils.h

Lines changed: 0 additions & 22 deletions
This file was deleted.

fbgemm_gpu/experimental/gen_ai/src/quantize/cutlass_extensions/f8f8bf16_cublas.cu

Lines changed: 14 additions & 2 deletions
@@ -10,13 +10,25 @@
 #include <ATen/cuda/CUDAContext.h>
 #include <c10/core/ScalarType.h>
 #include <c10/cuda/CUDAGuard.h>
-
-#include "cublas_utils.h"
+#include <cublas_v2.h>
 
 namespace fbgemm_gpu {
 
 #if CUDART_VERSION >= 12000
 
+#define CUBLAS_WORKSPACE_SIZE 4194304
+
+namespace {
+
+inline void checkCublasStatus(cublasStatus_t status) {
+  if (status != CUBLAS_STATUS_SUCCESS) {
+    printf("cuBLAS API failed with status %d\n", status);
+    throw std::logic_error("cuBLAS API failed");
+  }
+}
+
+} // namespace
+
 at::Tensor f8f8bf16_cublas(
     at::Tensor A, // FP8
     at::Tensor B, // FP8
