@dranikpg dranikpg commented Nov 5, 2025

Ok, so this changed a lot. Add active path for serializing hashes:

  • Generic EstimateSerializedSize
  • Generic Serialize
  • Propagating ExternalRep in CompactObj, including cooling
  • Handle upload

Signed-off-by: Vladislav Oleshko <[email protected]>
@dranikpg dranikpg requested a review from romange November 18, 2025 14:46
@dranikpg dranikpg marked this pull request as ready for review November 18, 2025 14:46
romange previously approved these changes Nov 19, 2025

// Prerequisite: IsCool() is true.
// Keeps the cool record only as an external value and discards the in-memory part.
void KeepExternal(size_t offset, size_t sz);
Collaborator

KeepExternal is not clear imho.

Freeze - a more creative option, and maybe DropCool is more direct explaining what it does?

Contributor Author

Ok.. The names just become more and more cryptic 😃

Contributor Author

Renamed

Comment on lines 531 to 532
auto estimated = EstimateSerializedSize(*value);
if (OccupiesWholePages(estimated->first)) {
Contributor Author

This was a bug: we have to compute it from the estimated serialization size, not from value->Size(). This might become an issue with large maps.
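To illustrate the difference, here is a minimal sketch, not the PR's code. It assumes that a Size()-like call on a map reports something other than the serialized byte count (for example, the number of entries). The page size, the occupancy threshold, `MapSketch`, and the `...Sketch` helpers are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants; the real values live in the tiering code.
constexpr std::size_t kPageSizeSketch = 4096;
constexpr std::size_t kMinOccupancySketch = kPageSizeSketch / 2;

// Stand-in for OccupiesWholePages: true when the payload is large
// enough to justify dedicated pages (assumed threshold).
inline bool OccupiesWholePagesSketch(std::size_t bytes) {
  return bytes >= kMinOccupancySketch;
}

// Hypothetical value: Size()-like count vs. bytes the serializer would write.
struct MapSketch {
  std::size_t entries;
  std::size_t serialized_bytes;
};

// The buggy variant feeds the entry count into a byte-based check.
inline bool DecideByEntryCount(const MapSketch& m) {
  return OccupiesWholePagesSketch(m.entries);
}

// The fixed variant feeds the estimated serialized size.
inline bool DecideBySerializedSize(const MapSketch& m) {
  return OccupiesWholePagesSketch(m.serialized_bytes);
}
```

With a map of 100 entries that serializes to 5000 bytes, the two variants disagree, which is the kind of mismatch the fix avoids.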

Signed-off-by: Vladislav Oleshko <[email protected]>
Comment on lines +92 to +94
if (pv.IsInline())
return {};
return std::make_pair(pv.GetRawString().view().size(), CompactObj::ExternalRep::STRING);
Contributor Author

Because we can't call GetRawString unconditionally. I'll try to make it nicer in the future; for now it's just becoming too much for a single PR.

@dranikpg dranikpg requested a review from romange November 20, 2025 11:47
Collaborator

@romange romange left a comment

It is a big PR, I wish it could be split into multiple parts.

{"third key", "third value"},
{"fourth key", "fourth value"},
{"fifth key", "fifth value"}};
const vector<std::pair<string, string>> kBase = {{"first key", "first value"},
Collaborator

curious, what's the reason for string?

Contributor Author

Turning a listpack into vector<pair<string_view, string_view>> is even more code, so I just changed the type to string. In the future, the serialized map constructor needs to be updated either way to be copy-less, but it's too much for this PR. It's all temporary.

if (pv.Encoding() == kEncodingListPack) {
auto* lp = static_cast<uint8_t*>(pv.RObjPtr());
size_t bytes = lpBytes(lp);
bytes += lpLength(lp) * 2 * 4;
Collaborator

can you add a comment explaining the formula?

// TODO(vlad): Maybe split into different accessors?
// Do NOT enforce rules depending on dynamic runtime values as this is called
// when scheduling stash and just before succeeding and is expected to return the same results
optional<pair<size_t /*size*/, CompactObj::ExternalRep>> EstimateSerializedSize(
Collaborator

It does more than estimate a size, and with both an optional and a precise ExternalRep the behavior is confusing. Maybe call it GetSerializationDescriptor, which would return

struct SerializationDescriptor {
  size_t estimated_size;
  CompactObj::ExternalRep repr;
};

and add NONE to the ExternalRep enum? Alternatively, add is_valid() { return size > 0; } and return size=0 for unfit objects.
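A rough sketch of how the two suggestions could look together, under stated assumptions: `ExternalRepSketch` is a hypothetical stand-in for CompactObj::ExternalRep (only STRING appears in the diff; NONE is the proposed addition and MAP is invented for illustration), and `DescribeString` is a hypothetical helper mirroring the inline-string guard from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for CompactObj::ExternalRep with the proposed NONE.
enum class ExternalRepSketch { NONE, STRING, MAP };

// The descriptor carries its own "unfit" state instead of being
// wrapped in std::optional.
struct SerializationDescriptor {
  std::size_t estimated_size = 0;
  ExternalRepSketch repr = ExternalRepSketch::NONE;

  bool is_valid() const { return estimated_size > 0; }
};

// Hypothetical helper: inline strings are unfit for offloading,
// everything else is described by its raw size.
inline SerializationDescriptor DescribeString(std::string_view raw, bool is_inline) {
  if (is_inline)
    return {};  // default state: size 0, repr NONE
  return {raw.size(), ExternalRepSketch::STRING};
}
```

Callers then test `is_valid()` rather than unwrapping an optional. One trade-off of the size-based check is that a legitimately empty payload also reads as invalid, so checking `repr` against NONE may be the safer of the two variants.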

Collaborator

Personally, I am not a big fan of chaining optionals, or other types like std::expected, one within another, especially when we control the wrapped class and it can describe the "undef" state itself. For example, in our codebase, we have using Result = std::optional<ResultType>; where ResultType is another optional, or std::optional<facade::ErrorReply> where ErrorReply can hold an empty state. These levels of indirection decrease readability, imho.
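A small generic illustration of the readability concern, using only the standard library. `Inner`, `Result`, and `Lookup` are hypothetical names and are not taken from the codebase.

```cpp
#include <cassert>
#include <optional>
#include <utility>

using Inner = std::optional<int>;
using Result = std::optional<Inner>;

// Three distinct outcomes hide behind one return type:
//   1. outer empty           -> nothing found
//   2. outer set, inner empty -> found, but no payload
//   3. both set               -> found with payload
inline Result Lookup(bool found, bool has_payload) {
  if (!found)
    return std::nullopt;
  if (!has_payload)
    return Result{std::in_place, Inner{}};
  return Result{std::in_place, Inner{42}};
}
```

A caller has to write `r && *r` and then `**r` to reach the value, and `r.has_value()` alone does not say whether a payload exists. A wrapped type that can express its own empty state collapses this to a single check.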

Contributor Author

I agree here, but in a properly structured code base type composition is almost always preferable.

stats->tiered_used_bytes += segment.length;
stats_.total_stashes++;

CompactObj::ExternalRep rep = EstimateSerializedSize(*pv)->second;
Collaborator

Add a DCHECK before dereferencing.

Contributor Author

Accessing a nullopt will trigger an assert either way.
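For context, a minimal sketch of the explicit-check variant, with plain `assert` standing in for DCHECK. `RepSketch`, `Estimate`, and both helpers are hypothetical. In standard C++, `operator->` on an empty std::optional is undefined behavior, so whether it traps depends on how the standard library was built, whereas `value()` is specified to throw std::bad_optional_access.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical stand-ins for the pair returned by EstimateSerializedSize.
enum class RepSketch { STRING, MAP };
using Estimate = std::optional<std::pair<std::size_t, RepSketch>>;

// Explicit precondition check before dereferencing; in debug builds this
// fails loudly with a message instead of relying on library hardening.
inline RepSketch RepOrAssert(const Estimate& est) {
  assert(est.has_value() && "estimate must exist before dereferencing");
  return est->second;
}

// value() is the checked accessor: it throws on an empty optional.
inline bool ThrowsOnEmpty(const Estimate& est) {
  try {
    (void)est.value();
    return false;
  } catch (const std::bad_optional_access&) {
    return true;
  }
}
```

The explicit check also documents the invariant at the call site, which is part of what the review comment is asking for.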

@dranikpg dranikpg requested a review from romange November 21, 2025 09:41
@dranikpg
Contributor Author

Removed the optional and just kept a pair.
