feat(c_api): add Error_t instead bool as return result#1299
feat(c_api): add Error_t instead bool as return result#1299LHT129 merged 1 commit intoantgroup:mainfrom
Conversation
Reviewer's GuideThe PR refactors the C API to return a structured Error_t instead of bool, adding status codes and helper functions for error conversion, updating all API functions to return detailed errors or SUCCESS, and adjusting tests accordingly. Entity relationship diagram for Error_t and API error codeserDiagram
ERROR_T {
int code
char message
}
ERROR_CODE {
int VSAG_SUCCESS
int VSAG_UNKNOWN_ERROR
int VSAG_INTERNAL_ERROR
int VSAG_INVALID_ARGUMENT
int VSAG_WRONG_STATUS
int VSAG_BUILD_TWICE
int VSAG_INDEX_NOT_EMPTY
int VSAG_UNSUPPORTED_INDEX
int VSAG_UNSUPPORTED_INDEX_OPERATION
int VSAG_DIMENSION_NOT_EQUAL
int VSAG_INDEX_EMPTY
int VSAG_NO_ENOUGH_MEMORY
int VSAG_READ_ERROR
int VSAG_MISSING_FILE
int VSAG_INVALID_BINARY
}
ERROR_T ||--o| ERROR_CODE : uses code
Class diagram for updated Error_t and API function signaturesclassDiagram
class Error_t {
int code
char message[1024]
}
class VsagIndex {
+VsagIndex(std::shared_ptr<vsag::Index> index)
std::shared_ptr<vsag::Index> index_
}
vsag_index_t <|-- VsagIndex
class API_Functions {
+vsag_index_factory(const char* index_name, const char* index_param) : vsag_index_t
+vsag_index_destroy(vsag_index_t index) : Error_t
+vsag_index_build(vsag_index_t index, const float* data, const int64_t* ids, uint64_t dim, uint64_t count) : Error_t
+vsag_index_knn_search(vsag_index_t index, const float* query, uint64_t dim, uint64_t k, float* results, int64_t* ids) : Error_t
+vsag_serialize_file(vsag_index_t index, const char* file_path) : Error_t
+vsag_deserialize_file(vsag_index_t index, const char* file_path) : Error_t
}
VsagIndex <.. API_Functions : uses
Error_t <.. API_Functions : returns
Flow diagram for error handling in API functionsflowchart TD
A[API function called] --> B{Try operation}
B -- Success --> C["Return SUCCESS (Error_t with code VSAG_SUCCESS)"]
B -- Failure (exception) --> D["Return make_error(exception)"]
B -- Failure (operation error) --> E["Return make_error(error)"]
File-Level Changes
Assessment against linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary of ChangesHello @LHT129, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the error handling mechanism within the C API by replacing boolean return values with a more informative Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- The VSAG_* error codes are defined as non-const int globals in the header, which can lead to ODR violations—consider using an enum or constexpr/static inline constants instead.
- The memcpy calls for copying error messages assume the input fits within the 1024-byte buffer; add a bound check or use strncpy/strncat patterns to avoid potential overflow.
- SUCCESS is declared as a mutable global Error_t—consider making it a constexpr or static inline constant (or provide a function) to ensure it’s immutable and safe across translation units.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The VSAG_* error codes are defined as non-const int globals in the header, which can lead to ODR violations—consider using an enum or constexpr/static inline constants instead.
- The memcpy calls for copying error messages assume the input fits within the 1024-byte buffer; add a bound check or use strncpy/strncat patterns to avoid potential overflow.
- SUCCESS is declared as a mutable global Error_t—consider making it a constexpr or static inline constant (or provide a function) to ensure it’s immutable and safe across translation units.
## Individual Comments
### Comment 1
<location> `include/vsag/vsag_c_api.h:21-35` </location>
<code_context>
extern "C" {
#include <stdint.h>
+
+int VSAG_SUCCESS = 0;
+int VSAG_UNKNOWN_ERROR = -1;
+int VSAG_INTERNAL_ERROR = -2;
+int VSAG_INVALID_ARGUMENT = -3;
+int VSAG_WRONG_STATUS = -4;
+int VSAG_BUILD_TWICE = -5;
+int VSAG_INDEX_NOT_EMPTY = -6;
+int VSAG_UNSUPPORTED_INDEX = -7;
+int VSAG_UNSUPPORTED_INDEX_OPERATION = -8;
+int VSAG_DIMENSION_NOT_EQUAL = -9;
+int VSAG_INDEX_EMPTY = -10;
+int VSAG_NO_ENOUGH_MEMORY = -11;
+int VSAG_READ_ERROR = -12;
+int VSAG_MISSING_FILE = -13;
+int VSAG_INVALID_BINARY = -14;
+
+typedef struct Error {
</code_context>
<issue_to_address>
**suggestion:** Error code constants should be declared as macros or const, not as global variables.
Using global variables for error codes can cause unintended modifications and linkage problems. Prefer #define macros or const int to keep error codes immutable and properly scoped.
```suggestion
#define VSAG_SUCCESS 0
#define VSAG_UNKNOWN_ERROR -1
#define VSAG_INTERNAL_ERROR -2
#define VSAG_INVALID_ARGUMENT -3
#define VSAG_WRONG_STATUS -4
#define VSAG_BUILD_TWICE -5
#define VSAG_INDEX_NOT_EMPTY -6
#define VSAG_UNSUPPORTED_INDEX -7
#define VSAG_UNSUPPORTED_INDEX_OPERATION -8
#define VSAG_DIMENSION_NOT_EQUAL -9
#define VSAG_INDEX_EMPTY -10
#define VSAG_NO_ENOUGH_MEMORY -11
#define VSAG_READ_ERROR -12
#define VSAG_MISSING_FILE -13
#define VSAG_INVALID_BINARY -14
```
</issue_to_address>
### Comment 2
<location> `src/vsag_c_api.cpp:23` </location>
<code_context>
+#include <cstring>
#include <fstream>
+Error_t SUCCESS = {VSAG_SUCCESS, "success"};
+
class VsagIndex {
</code_context>
<issue_to_address>
**suggestion:** The global SUCCESS variable should be declared as const to prevent accidental modification.
Declaring SUCCESS as const will ensure its value cannot be changed and make its purpose as a constant clear.
```suggestion
const Error_t SUCCESS = {VSAG_SUCCESS, "success"};
```
</issue_to_address>
### Comment 3
<location> `src/vsag_c_api.cpp:38-39` </location>
<code_context>
+ Error_t err;
+ err.code = static_cast<int>(error.type);
+ const auto& msg = error.message;
+ memcpy(err.message, msg.c_str(), msg.size() + 1);
+ err.message[msg.size()] = '\0';
+ return err;
+}
</code_context>
<issue_to_address>
**issue (bug_risk):** Potential buffer overflow when copying error messages.
Copying msg.size() + 1 bytes into a 1024-byte buffer risks overflow if msg.size() > 1023. Use snprintf or std::strncpy to safely copy and null-terminate the message.
</issue_to_address>
### Comment 4
<location> `src/vsag_c_api.cpp:48-49` </location>
<code_context>
+ Error_t err;
+ err.code = static_cast<int>(vsag::ErrorType::INTERNAL_ERROR);
+ const auto* msg = e.what();
+ memcpy(err.message, msg, strlen(msg) + 1);
+ err.message[strlen(msg)] = '\0';
+ return err;
+}
</code_context>
<issue_to_address>
**issue (bug_risk):** Potential buffer overflow when copying exception messages.
If msg is longer than 1023 characters, err.message will overflow. Use snprintf or std::strncpy to safely copy and null-terminate the message within the buffer size.
</issue_to_address>
### Comment 5
<location> `src/vsag_c_api.cpp:143-91` </location>
<code_context>
vsag_serialize_file(vsag_index_t index, const char* file_path) {
try {
auto* vsag_index = static_cast<VsagIndex*>(index);
if (vsag_index != nullptr) {
std::ofstream file(file_path, std::ios::binary);
- vsag_index->index_->Serialize(file);
</code_context>
<issue_to_address>
**issue (bug_risk):** No error returned if vsag_index is nullptr.
Returning SUCCESS when vsag_index is nullptr may hide errors. Please return an error code like VSAG_INVALID_ARGUMENT in this case.
</issue_to_address>
### Comment 6
<location> `src/vsag_c_api_test.cpp:51-52` </location>
<code_context>
}
- bool ret = vsag_index_build(index, datas.data(), ids.data(), dim, num_vectors);
- REQUIRE(ret);
+ Error_t ret = vsag_index_build(index, datas.data(), ids.data(), dim, num_vectors);
+ REQUIRE(ret.code == VSAG_SUCCESS);
auto func = [&](vsag_index_t index1) {
const char* hgraph_search_parameters = R"(
</code_context>
<issue_to_address>
**suggestion (testing):** Missing negative/error case tests for vsag_index_build.
Please add tests for invalid inputs, such as null pointers, zero count, and mismatched dimensions, to ensure vsag_index_build returns the correct error code and message.
```suggestion
Error_t ret = vsag_index_build(index, datas.data(), ids.data(), dim, num_vectors);
REQUIRE(ret.code == VSAG_SUCCESS);
// Negative/Error case tests for vsag_index_build
// 1. Null datas pointer
Error_t ret_null_datas = vsag_index_build(index, nullptr, ids.data(), dim, num_vectors);
REQUIRE(ret_null_datas.code != VSAG_SUCCESS);
// 2. Null ids pointer
Error_t ret_null_ids = vsag_index_build(index, datas.data(), nullptr, dim, num_vectors);
REQUIRE(ret_null_ids.code != VSAG_SUCCESS);
// 3. Zero num_vectors
Error_t ret_zero_vectors = vsag_index_build(index, datas.data(), ids.data(), dim, 0);
REQUIRE(ret_zero_vectors.code != VSAG_SUCCESS);
// 4. Mismatched dimensions (e.g., dim = 0)
Error_t ret_zero_dim = vsag_index_build(index, datas.data(), ids.data(), 0, num_vectors);
REQUIRE(ret_zero_dim.code != VSAG_SUCCESS);
// 5. Mismatched data size (datas too small)
std::vector<float> small_datas(dim * (num_vectors - 1), 0.0f);
Error_t ret_small_datas = vsag_index_build(index, small_datas.data(), ids.data(), dim, num_vectors);
REQUIRE(ret_small_datas.code != VSAG_SUCCESS);
// 6. Mismatched data size (ids too small)
std::vector<int64_t> small_ids(num_vectors - 1, 0);
Error_t ret_small_ids = vsag_index_build(index, datas.data(), small_ids.data(), dim, num_vectors);
REQUIRE(ret_small_ids.code != VSAG_SUCCESS);
```
</issue_to_address>
### Comment 7
<location> `src/vsag_c_api_test.cpp:52` </location>
<code_context>
+ Error_t ret = vsag_index_knn_search(index1, datas.data() + i * dim, dim, topk, hgraph_search_parameters, scores.data(), results.data());
</code_context>
<issue_to_address>
**suggestion (testing):** No test coverage for error scenarios in vsag_index_knn_search.
Add tests for vsag_index_knn_search with invalid inputs to verify correct error codes and messages are returned.
</issue_to_address>
### Comment 8
<location> `src/vsag_c_api_test.cpp:86-52` </location>
<code_context>
const char* file_name = "/tmp/test_c_api.vsag";
ret = vsag_serialize_file(index, file_name);
- REQUIRE(ret);
+ REQUIRE(ret.code == VSAG_SUCCESS);
vsag_index_t index2 = vsag_index_factory(index_name, index_param);
ret = vsag_deserialize_file(index2, file_name);
</code_context>
<issue_to_address>
**suggestion (testing):** Missing tests for vsag_serialize_file error conditions.
Add tests for error scenarios, such as null index, invalid file path, and read-only file locations. Ensure the Error_t returned has the expected error code and message.
Suggested implementation:
```cpp
const char* file_name = "/tmp/test_c_api.vsag";
ret = vsag_serialize_file(index, file_name);
REQUIRE(ret.code == VSAG_SUCCESS);
vsag_index_t index2 = vsag_index_factory(index_name, index_param);
ret = vsag_deserialize_file(index2, file_name);
REQUIRE(ret.code == VSAG_SUCCESS);
REQUIRE(index2 != nullptr);
// Error scenario 1: null index
Error_t ret_null_index = vsag_serialize_file(nullptr, file_name);
REQUIRE(ret_null_index.code == VSAG_ERR_INVALID_ARGUMENT);
REQUIRE(std::string(ret_null_index.message).find("index is null") != std::string::npos);
// Error scenario 2: invalid file path
const char* invalid_file_name = "/invalid_path/test_c_api.vsag";
Error_t ret_invalid_path = vsag_serialize_file(index, invalid_file_name);
REQUIRE(ret_invalid_path.code == VSAG_ERR_IO);
REQUIRE(std::string(ret_invalid_path.message).find("failed to open file") != std::string::npos);
// Error scenario 3: read-only file location
const char* readonly_file_name = "/etc/test_c_api.vsag";
Error_t ret_readonly = vsag_serialize_file(index, readonly_file_name);
REQUIRE(ret_readonly.code == VSAG_ERR_IO);
REQUIRE(std::string(ret_readonly.message).find("permission denied") != std::string::npos);
func(index2);
```
- Ensure that the error codes `VSAG_ERR_INVALID_ARGUMENT` and `VSAG_ERR_IO` exist and are returned by `vsag_serialize_file` for these error conditions.
- The error messages checked in the tests ("index is null", "failed to open file", "permission denied") should match the actual messages returned by your implementation. Adjust the string checks if your error messages differ.
- If your test environment does not allow writing to `/etc/`, choose another read-only location for the read-only file test.
</issue_to_address>
### Comment 9
<location> `src/vsag_c_api_test.cpp:89-52` </location>
<code_context>
- REQUIRE(ret);
+ REQUIRE(ret.code == VSAG_SUCCESS);
vsag_index_t index2 = vsag_index_factory(index_name, index_param);
ret = vsag_deserialize_file(index2, file_name);
- REQUIRE(ret);
+ REQUIRE(ret.code == VSAG_SUCCESS);
REQUIRE(index2 != nullptr);
</code_context>
<issue_to_address>
**suggestion (testing):** No negative tests for vsag_deserialize_file.
Please add tests for vsag_deserialize_file that cover deserialization from a non-existent file, a corrupted file, and a null index, verifying that the returned Error_t contains the correct error code and message.
Suggested implementation:
```cpp
ret = vsag_deserialize_file(index2, file_name);
REQUIRE(ret.code == VSAG_SUCCESS);
REQUIRE(index2 != nullptr);
// Negative test: deserialization from a non-existent file
const char* non_existent_file = "/tmp/non_existent_file.vsag";
ret = vsag_deserialize_file(index2, non_existent_file);
REQUIRE(ret.code == VSAG_ERR_FILE_NOT_FOUND);
REQUIRE(std::string(ret.message).find("not found") != std::string::npos);
// Negative test: deserialization from a corrupted file
const char* corrupted_file = "/tmp/corrupted_file.vsag";
{
// Write some invalid/corrupted data to the file
FILE* f = fopen(corrupted_file, "wb");
REQUIRE(f != nullptr);
const char* bad_data = "corrupted data";
fwrite(bad_data, 1, strlen(bad_data), f);
fclose(f);
}
ret = vsag_deserialize_file(index2, corrupted_file);
REQUIRE(ret.code == VSAG_ERR_DESERIALIZE);
REQUIRE(std::string(ret.message).find("corrupt") != std::string::npos);
// Negative test: deserialization with null index
ret = vsag_deserialize_file(nullptr, file_name);
REQUIRE(ret.code == VSAG_ERR_INVALID_ARGUMENT);
REQUIRE(std::string(ret.message).find("null index") != std::string::npos);
```
- Ensure that the error codes `VSAG_ERR_FILE_NOT_FOUND`, `VSAG_ERR_DESERIALIZE`, and `VSAG_ERR_INVALID_ARGUMENT` exist and are used by `vsag_deserialize_file`.
- Make sure the error messages returned by `vsag_deserialize_file` contain the expected substrings ("not found", "corrupt", "null index").
- If the error codes or messages differ, adjust the checks accordingly.
- Clean up the corrupted file after the test if your test framework requires it.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
src/vsag_c_api.cpp
Outdated
| #include <cstring> | ||
| #include <fstream> | ||
|
|
||
| Error_t SUCCESS = {VSAG_SUCCESS, "success"}; |
There was a problem hiding this comment.
suggestion: The global SUCCESS variable should be declared as const to prevent accidental modification.
Declaring SUCCESS as const will ensure its value cannot be changed and make its purpose as a constant clear.
| Error_t SUCCESS = {VSAG_SUCCESS, "success"}; | |
| const Error_t SUCCESS = {VSAG_SUCCESS, "success"}; |
There was a problem hiding this comment.
Code Review
This pull request is a great improvement for error handling in the C API, replacing boolean return values with a more informative Error_t struct. My review focuses on ensuring the new error handling mechanism is robust and safe. I've identified a few critical issues related to potential linker errors and buffer overflows in the error message handling, which could compromise security and stability. I've also pointed out an inconsistency in error handling for deserialization and a minor improvement for a global constant. Overall, the changes are in the right direction, and with these fixes, the C API will be much more reliable.
include/vsag/vsag_c_api.h
Outdated
| int VSAG_SUCCESS = 0; | ||
| int VSAG_UNKNOWN_ERROR = -1; | ||
| int VSAG_INTERNAL_ERROR = -2; | ||
| int VSAG_INVALID_ARGUMENT = -3; | ||
| int VSAG_WRONG_STATUS = -4; | ||
| int VSAG_BUILD_TWICE = -5; | ||
| int VSAG_INDEX_NOT_EMPTY = -6; | ||
| int VSAG_UNSUPPORTED_INDEX = -7; | ||
| int VSAG_UNSUPPORTED_INDEX_OPERATION = -8; | ||
| int VSAG_DIMENSION_NOT_EQUAL = -9; | ||
| int VSAG_INDEX_EMPTY = -10; | ||
| int VSAG_NO_ENOUGH_MEMORY = -11; | ||
| int VSAG_READ_ERROR = -12; | ||
| int VSAG_MISSING_FILE = -13; | ||
| int VSAG_INVALID_BINARY = -14; |
There was a problem hiding this comment.
Defining non-const global variables in a header file will lead to multiple definition errors at link time if this header is included in more than one translation unit. These error codes should be defined in a way that prevents this, for example by using an enum.
Using an enum groups the constants logically and is a common practice for C APIs.
typedef enum VSAG_ERROR_CODE {
VSAG_SUCCESS = 0,
VSAG_UNKNOWN_ERROR = -1,
VSAG_INTERNAL_ERROR = -2,
VSAG_INVALID_ARGUMENT = -3,
VSAG_WRONG_STATUS = -4,
VSAG_BUILD_TWICE = -5,
VSAG_INDEX_NOT_EMPTY = -6,
VSAG_UNSUPPORTED_INDEX = -7,
VSAG_UNSUPPORTED_INDEX_OPERATION = -8,
VSAG_DIMENSION_NOT_EQUAL = -9,
VSAG_INDEX_EMPTY = -10,
VSAG_NO_ENOUGH_MEMORY = -11,
VSAG_READ_ERROR = -12,
VSAG_MISSING_FILE = -13,
VSAG_INVALID_BINARY = -14
} VSAG_ERROR_CODE;
src/vsag_c_api.cpp
Outdated
| err.code = static_cast<int>(error.type); | ||
| const auto& msg = error.message; | ||
| memcpy(err.message, msg.c_str(), msg.size() + 1); | ||
| err.message[msg.size()] = '\0'; |
There was a problem hiding this comment.
This block has two issues:
- Buffer Overflow Vulnerability:
memcpycan cause a buffer overflow if the error message is too long. This is a serious security risk. It's safer to usesnprintf. - Incorrect Error Code Mapping: The C API error codes are negative, but
vsag::ErrorTypevalues are positive. The error code should be negated to match the C API specification.
The line err.message[msg.size()] = '\0'; is also redundant and potentially unsafe.
| err.code = static_cast<int>(error.type); | |
| const auto& msg = error.message; | |
| memcpy(err.message, msg.c_str(), msg.size() + 1); | |
| err.message[msg.size()] = '\0'; | |
| err.code = -static_cast<int>(error.type); | |
| const auto& msg = error.message; | |
| // Use snprintf for safe string copying | |
| snprintf(err.message, sizeof(err.message), "%s", msg.c_str()); |
src/vsag_c_api.cpp
Outdated
| const auto* msg = e.what(); | ||
| memcpy(err.message, msg, strlen(msg) + 1); | ||
| err.message[strlen(msg)] = '\0'; |
There was a problem hiding this comment.
memcpy with strlen(msg) + 1 can cause a buffer overflow if the exception message from e.what() is too long. This is a serious security risk. It's safer to use snprintf to copy the string, which prevents overflows. The redundant null termination on line 49 can also cause an out-of-bounds write.
| const auto* msg = e.what(); | |
| memcpy(err.message, msg, strlen(msg) + 1); | |
| err.message[strlen(msg)] = '\0'; | |
| const auto* msg = e.what(); | |
| // Use snprintf for safe string copying | |
| snprintf(err.message, sizeof(err.message), "%s", msg); |
| @@ -130,9 +164,9 @@ vsag_deserialize_file(vsag_index_t index, const char* file_path) { | |||
| vsag_index->index_->Deserialize(file); | |||
There was a problem hiding this comment.
The return value of vsag_index->index_->Deserialize(file) is being ignored. This function returns a tl::expected<void, Error> which indicates success or failure. By ignoring it, a deserialization error would go unnoticed, and the function would incorrectly return SUCCESS. You should check the returned value and propagate the error if one occurred, similar to how vsag_serialize_file is implemented. For example:
auto deserialize_result = vsag_index->index_->Deserialize(file);
file.close();
if (!deserialize_result.has_value()) {
return make_error(deserialize_result.error());
}
src/vsag_c_api.cpp
Outdated
| #include <cstring> | ||
| #include <fstream> | ||
|
|
||
| Error_t SUCCESS = {VSAG_SUCCESS, "success"}; |
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (32.14%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. @@ Coverage Diff @@
## main #1299 +/- ##
==========================================
- Coverage 92.08% 91.98% -0.11%
==========================================
Files 320 320
Lines 17734 17751 +17
==========================================
- Hits 16331 16328 -3
- Misses 1403 1423 +20
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
36f08c1 to
fb00dcf
Compare
774a236 to
35f7ccd
Compare
Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
closed: #1296
Summary by Sourcery
Replace simple boolean returns in the C API with a structured Error_t type carrying error codes and messages, define a set of VSAG error codes, implement error conversion helpers, and update all API functions and tests to use the new error reporting mechanism
New Features:
Enhancements:
Tests: