-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[llvm] Add API to get decompressed size of a zstd compressed buffer. #107020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-llvm-support Author: Udit Agarwal (uditagarwal97) ChangesUnlike This PR adds a wrapper around this API in Full diff: https://github.com/llvm/llvm-project/pull/107020.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Support/Compression.h b/llvm/include/llvm/Support/Compression.h
index 2a8da9e96d356f..da7b6f88d955b0 100644
--- a/llvm/include/llvm/Support/Compression.h
+++ b/llvm/include/llvm/Support/Compression.h
@@ -71,6 +71,8 @@ Error decompress(ArrayRef<uint8_t> Input, uint8_t *Output,
Error decompress(ArrayRef<uint8_t> Input, SmallVectorImpl<uint8_t> &Output,
size_t UncompressedSize);
+// Get the size of the decompressed data.
+Error getDecompressedSize(ArrayRef<uint8_t> Input, size_t &UncompressedSize);
} // End of namespace zstd
enum class Format {
diff --git a/llvm/lib/Support/Compression.cpp b/llvm/lib/Support/Compression.cpp
index badaf68ab59cd0..ab012dfb5612b1 100644
--- a/llvm/lib/Support/Compression.cpp
+++ b/llvm/lib/Support/Compression.cpp
@@ -224,6 +224,20 @@ Error zstd::decompress(ArrayRef<uint8_t> Input,
return E;
}
+Error zstd::getDecompressedSize(ArrayRef<uint8_t> Input,
+ size_t &UncompressedSize) {
+
+ unsigned long long Res = ZSTD_getFrameContentSize(Input.data(), Input.size());
+
+ // ZSTD_getFrameContentSize returns unsigned long long, but the size
+ // of uncompressed data should be bounded by size_t.
+ UncompressedSize = static_cast<size_t>(Res);
+
+ return ZSTD_isError(Res) ? make_error<StringError>(ZSTD_getErrorName(Res),
+ inconvertibleErrorCode())
+ : Error::success();
+}
+
#else
bool zstd::isAvailable() { return false; }
void zstd::compress(ArrayRef<uint8_t> Input,
@@ -240,4 +254,8 @@ Error zstd::decompress(ArrayRef<uint8_t> Input,
size_t UncompressedSize) {
llvm_unreachable("zstd::decompress is unavailable");
}
+Error zstd::getDecompressedSize(ArrayRef<uint8_t> Input,
+ size_t &UncompressedSize) {
+ llvm_unreachable("zstd::getDecompressedSize is unavailable");
+}
#endif
diff --git a/llvm/unittests/Support/CompressionTest.cpp b/llvm/unittests/Support/CompressionTest.cpp
index 5d326cafbe3a1c..058be206d28b1f 100644
--- a/llvm/unittests/Support/CompressionTest.cpp
+++ b/llvm/unittests/Support/CompressionTest.cpp
@@ -73,8 +73,15 @@ static void testZstdCompression(StringRef Input, int Level) {
SmallVector<uint8_t, 0> Uncompressed;
zstd::compress(arrayRefFromStringRef(Input), Compressed, Level);
+ // Check that getDecompressedSize returns the size of the original buffer.
+ size_t DecompressedSize;
+ Error E = zstd::getDecompressedSize(Compressed, DecompressedSize);
+ EXPECT_FALSE(std::move(E));
+
+ EXPECT_EQ(DecompressedSize, Input.size());
+
// Check that uncompressed buffer is the same as original.
- Error E = zstd::decompress(Compressed, Uncompressed, Input.size());
+ E = zstd::decompress(Compressed, Uncompressed, Input.size());
EXPECT_FALSE(std::move(E));
EXPECT_EQ(Input, toStringRef(Uncompressed));
|
|
Ping |
1 similar comment
|
Ping |
llvm/lib/Support/Compression.cpp
Outdated
| return E; | ||
| } | ||
|
|
||
| Error zstd::getDecompressedSize(ArrayRef<uint8_t> Input, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more conventional to return Expected<uint64_t> and avoid output parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied in c9046d7
|
Do you have an in-tree use case for this API? If not, I don't think we want to add a wrapper here. |
There are efforts going on to upstream SYCL RT to LLVM (https://discourse.llvm.org/t/rfc-sycl-runtime-upstreaming/74479/4). In SYCL offloading, we use Without any API to get decompressed size (proposed changes in this PR), the component using ZSTD compression has to always pass on the decompressed size (original blob size) to the component doing decompression - that's what we currently do in Currently, in SYCL RT, we had to link ZSTD library along with |
|
As an alternative, you might use |
Unlike
zlib(AFAIK),zstdstores the size of the original, uncompressed buffer in the header and provides the following API to retrieve it:This PR adds a wrapper around this API in
llvm::compression::zstdnamespace so that the users won't have to keep track of the original, uncompressed size when passing around zstd-compressed buffers.