feat(index): add write function for Serialize#1291
Conversation
Reviewer's GuideThis PR adds a new serialization API that accepts a user‐provided write function. It extends the Index interface, provides implementations in InnerIndexInterface and IndexImpl, updates algorithm feature flags to advertise support, and adds a test to verify write‐function–based serialization and round‐trip deserialization. Sequence diagram for serialization using a user-provided write functionsequenceDiagram
participant User
participant IndexImpl
participant InnerIndexInterface
participant WriteFuncStreamWriter
User->>IndexImpl: Serialize(write_func)
IndexImpl->>InnerIndexInterface: Serialize(write_func)
InnerIndexInterface->>WriteFuncStreamWriter: Create writer with write_func
InnerIndexInterface->>WriteFuncStreamWriter: Serialize(writer)
WriteFuncStreamWriter->>User: write_func(offset, size, data)
Note over User,WriteFuncStreamWriter: Data is written via user-provided function
ER diagram for updated IndexFeature enumerDiagram
INDEX_FEATURE {
SUPPORT_METRIC_TYPE_COSINE string
SUPPORT_SERIALIZE_FILE string
SUPPORT_SERIALIZE_BINARY_SET string
SUPPORT_SERIALIZE_WRITE_FUNC string
SUPPORT_DESERIALIZE_FILE string
SUPPORT_DESERIALIZE_BINARY_SET string
SUPPORT_DESERIALIZE_READER_SET string
}
ALGORITHM ||--o{ INDEX_FEATURE : supports
Class diagram for new and updated serialization methods in Index and related classesclassDiagram
class Index {
<<abstract>>
+Serialize() tl::expected<BinarySet, Error>
+Serialize(WriteFuncType write_func) tl::expected<void, Error>
}
class IndexImpl {
+Serialize() tl::expected<BinarySet, Error>
+Serialize(WriteFuncType write_func) tl::expected<void, Error>
+Serialize(std::ostream& out_stream) tl::expected<void, Error>
}
class InnerIndexInterface {
+Serialize() tl::expected<BinarySet, Error>
+Serialize(WriteFuncType write_func) void
+Serialize(std::ostream& out_stream) void
+Serialize(StreamWriter& writer) void
}
Index <|-- IndexImpl
IndexImpl o-- InnerIndexInterface
class WriteFuncType {
std::function<void(OffsetType, SizeType, const void*)>
}
class OffsetType {
uint64_t
}
class SizeType {
uint64_t
}
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary of ChangesHello @LHT129, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- Align the WriteFuncType Serialize overload’s error handling with the tl::expected return pattern instead of throwing runtime exceptions for unsupported indexes.
- Extract common serialization logic between the ostream and WriteFuncType overloads into a shared helper to avoid duplication in InnerIndexInterface.
- Verify that all index implementations supporting file-based serialization also advertise SUPPORT_SERIALIZE_WRITE_FUNC to keep feature flags consistent.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Align the WriteFuncType Serialize overload’s error handling with the tl::expected return pattern instead of throwing runtime exceptions for unsupported indexes.
- Extract common serialization logic between the ostream and WriteFuncType overloads into a shared helper to avoid duplication in InnerIndexInterface.
- Verify that all index implementations supporting file-based serialization also advertise SUPPORT_SERIALIZE_WRITE_FUNC to keep feature flags consistent.
## Individual Comments
### Comment 1
<location> `src/algorithm/inner_index_interface.cpp:147-153` </location>
<code_context>
}
+void
+InnerIndexInterface::Serialize(WriteFuncType write_func) const {
+ std::string time_record_name = this->GetName() + " Serialize";
+ SlowTaskTimer t(time_record_name);
+
+ uint64_t num_bytes = this->CalSerializeSize();
+ WriteFuncStreamWriter writer(write_func, 0);
+ this->Serialize(writer);
+}
+
</code_context>
<issue_to_address>
**issue:** Missing error handling for WriteFuncType serialization.
This method should return a tl::expected or similar error type to match the error handling approach used in other Serialize methods.
</issue_to_address>
### Comment 2
<location> `src/index/index_impl.h:380-381` </location>
<code_context>
}
+ [[nodiscard]] tl::expected<void, Error>
+ Serialize(WriteFuncType write_func) const override {
+ SAFE_CALL(this->inner_index_->Serialize(write_func));
+ }
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Return value from SAFE_CALL is not propagated.
Please return the result of inner_index_->Serialize to ensure error information is not lost.
</issue_to_address>
### Comment 3
<location> `src/algorithm/inner_index_interface.h:329-333` </location>
<code_context>
virtual void
Serialize(std::ostream& out_stream) const;
+ virtual void
+ Serialize(WriteFuncType write_func) const;
+
virtual void
</code_context>
<issue_to_address>
**issue:** Inconsistent return type for Serialize with WriteFuncType.
Standardize the return type of Serialize(WriteFuncType) to match other implementations and improve error handling.
</issue_to_address>
### Comment 4
<location> `include/vsag/index.h:649-652` </location>
<code_context>
+ *
+ * @param write_func is a function to write serialized index
+ */
+ [[nodiscard]] virtual tl::expected<void, Error>
+ Serialize(WriteFuncType write_func) const {
+ throw std::runtime_error("Index doesn't support Serialize with write function");
+ }
</code_context>
<issue_to_address>
**suggestion:** Default implementation throws runtime_error but returns tl::expected.
Instead of throwing, return tl::unexpected(Error) to align with tl::expected error handling.
```suggestion
[[nodiscard]] virtual tl::expected<void, Error>
Serialize(WriteFuncType write_func) const {
return tl::unexpected(Error("Index doesn't support Serialize with write function"));
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| InnerIndexInterface::Serialize(WriteFuncType write_func) const { | ||
| std::string time_record_name = this->GetName() + " Serialize"; | ||
| SlowTaskTimer t(time_record_name); | ||
|
|
||
| uint64_t num_bytes = this->CalSerializeSize(); | ||
| WriteFuncStreamWriter writer(write_func, 0); | ||
| this->Serialize(writer); |
There was a problem hiding this comment.
issue: Missing error handling for WriteFuncType serialization.
This method should return a tl::expected or similar error type to match the error handling approach used in other Serialize methods.
| Serialize(WriteFuncType write_func) const override { | ||
| SAFE_CALL(this->inner_index_->Serialize(write_func)); |
There was a problem hiding this comment.
issue (bug_risk): Return value from SAFE_CALL is not propagated.
Please return the result of inner_index_->Serialize to ensure error information is not lost.
| virtual void | ||
| Serialize(std::ostream& out_stream) const; | ||
|
|
||
| virtual void | ||
| Serialize(WriteFuncType write_func) const; |
There was a problem hiding this comment.
issue: Inconsistent return type for Serialize with WriteFuncType.
Standardize the return type of Serialize(WriteFuncType) to match other implementations and improve error handling.
There was a problem hiding this comment.
Code Review
This pull request introduces a new serialization method using a write function, which is a great addition for flexibility. The implementation is generally clean and includes a good test case. I have two suggestions for improvement: one is to remove an unused variable, and the other is to align the default implementation of the new virtual function with its return type to avoid throwing exceptions, which would improve the API's robustness.
| Serialize(WriteFuncType write_func) const { | ||
| throw std::runtime_error("Index doesn't support Serialize with write function"); | ||
| } |
There was a problem hiding this comment.
The default implementation of this function throws a std::runtime_error, but its signature indicates it should return a tl::expected<void, Error>. While this seems to be a pattern in this class, it's generally better to return an error value that matches the function signature, like tl::unexpected, to avoid surprising callers who might not expect an exception from a function returning tl::expected. This would make the interface safer to use directly.
| Serialize(WriteFuncType write_func) const { | |
| throw std::runtime_error("Index doesn't support Serialize with write function"); | |
| } | |
| Serialize(WriteFuncType write_func) const { | |
| return tl::unexpected(vsag::Error(vsag::ErrorType::UNSUPPORTED_INDEX_OPERATION, "Index doesn't support Serialize with write function")); | |
| } |
| std::string time_record_name = this->GetName() + " Serialize"; | ||
| SlowTaskTimer t(time_record_name); | ||
|
|
||
| uint64_t num_bytes = this->CalSerializeSize(); |
1e9fbdf to
39aac7b
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. @@ Coverage Diff @@
## main #1291 +/- ##
==========================================
+ Coverage 91.28% 91.99% +0.71%
==========================================
Files 320 320
Lines 17693 17702 +9
==========================================
+ Hits 16151 16285 +134
+ Misses 1542 1417 -125
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
3807d74 to
2d355a8
Compare
Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
closed: #1282
Summary by Sourcery
Add support for serializing indexes via a custom write function
New Features:
Tests: