Collaborator

@twoeths twoeths commented Dec 22, 2025

Motivation

  • we tightly couple all TreeView implementations with BaseTreeView, which makes them difficult to enhance

    • we are not able to extend it, for example to store extra fields like length or a cached readonly array in child TreeViews, see feat: impl getAll / push / sliceTo / sliceFrom API for list tree view #100 (comment)
    • when accessing a child we return a shallow copy of the TreeView using {.base_view=...}, which risks double frees or dangling pointers. Instead we could/should store a reference to the child TreeView itself and just return it
    • it's not easy to implement new types, for example ContainerNodeStruct
  • we are not able to get a child TreeView, modify it and let the parent commit() like in the current lodestar. We can resolve that by storing child TreeViews as references instead

  • code isolation: the implementation of a TreeView should be isolated to its own struct only; with this PR it ends up really close to our TypeScript implementation
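A minimal sketch of the copy-versus-reference point above. `ChildView`, `ParentView`, `getCopy` and `getRef` are hypothetical names used only for illustration, not the types in this PR; the sketch only shows why a returned reference lets the parent observe child edits while a shallow copy does not.

```zig
const std = @import("std");

// Hypothetical stand-in for a child TreeView; not the PR's actual type.
const ChildView = struct {
    value: u32 = 0,

    fn set(self: *ChildView, new_value: u32) void {
        self.value = new_value;
    }
};

// Hypothetical parent that owns its child.
const ParentView = struct {
    child: ChildView = .{},

    // Shallow copy: edits made through the result never reach the parent.
    fn getCopy(self: *const ParentView) ChildView {
        return self.child;
    }

    // Reference: edits are visible to the parent, so a later parent
    // commit() could pick them up.
    fn getRef(self: *ParentView) *ChildView {
        return &self.child;
    }
};

test "a returned reference lets the parent see child edits" {
    var parent = ParentView{};

    var copy = parent.getCopy();
    copy.set(5);
    try std.testing.expect(parent.child.value == 0);

    parent.getRef().set(5);
    try std.testing.expect(parent.child.value == 5);
}
```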

This PR implements ContainerTreeView using a tuple (instead of a Map), which gives us more freedom for other TreeViews: given a comptime index we can get the type of child_data[index]
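To illustrate the tuple idea, here is a rough sketch under stated assumptions: `ChildView`, `ChildData`, `ContainerView` and `FieldType` are made-up names, and the real container presumably derives its tuple from the SSZ type rather than hard-coding it. It shows that each tuple slot keeps its own type and that a comptime index is enough to recover it.

```zig
const std = @import("std");

// Hypothetical composite child view (illustration only).
const ChildView = struct { value: u32 };

// One slot per field: a basic field stores its native value, a composite
// field stores a pointer to its child view. A Map would force one value type.
const ChildData = struct { ?u64, ?*ChildView };

const ContainerView = struct {
    child_data: ChildData = .{ null, null },

    // Resolve the slot type from a comptime index.
    fn FieldType(comptime index: usize) type {
        return @TypeOf(@as(ChildData, undefined)[index]);
    }

    fn get(self: *const ContainerView, comptime index: usize) FieldType(index) {
        return self.child_data[index];
    }
};

test "tuple slots keep their own types" {
    var child = ChildView{ .value = 7 };
    var view = ContainerView{};
    view.child_data[0] = 42;
    view.child_data[1] = &child;

    try std.testing.expect(ContainerView.FieldType(0) == ?u64);
    try std.testing.expect(ContainerView.FieldType(1) == ?*ChildView);
    try std.testing.expect(view.get(0).? == 42);
    try std.testing.expect(view.get(1).?.value == 7);
}
```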

Description

isolate each TreeView implementation to its own struct

  • implement ContainerTreeView using tuple
    • store child TreeView as either reference or native basic type inside the tuple
  • refactor basic array:
    • track a Map of nodes
    • add a length field so that the length is only written to the tree once per commit()
  • refactor composite array:
    • track a Map of child TreeView references + a Map of nodes
    • add a length field so that the length is only written to the tree once per commit()
  • only organize reusable code where it makes sense, for example getChildNode() setChildNode() getLength() setLength()
  • drop BaseTreeView
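The "track length, write once per commit()" item can be sketched roughly as follows. This is an assumption-laden toy: `ListView`, `tree_len` and `writes` are hypothetical, with `tree_len` standing in for the length stored in the tree and `writes` counting how often it is touched.

```zig
const std = @import("std");

// Hypothetical list view: `len` is the cached length, `tree_len` stands in
// for the length stored in the tree, `writes` counts tree updates.
const ListView = struct {
    len: usize,
    tree_len: usize,
    writes: usize = 0,

    fn init(tree_len: usize) ListView {
        return .{ .len = tree_len, .tree_len = tree_len };
    }

    fn push(self: *ListView) void {
        self.len += 1; // only the cached length changes here
    }

    fn commit(self: *ListView) void {
        if (self.len == self.tree_len) return; // nothing to flush
        self.tree_len = self.len;
        self.writes += 1;
    }
};

test "length reaches the tree once per commit" {
    var view = ListView.init(0);
    view.push();
    view.push();
    view.push();
    view.commit();
    try std.testing.expect(view.tree_len == 3);
    try std.testing.expect(view.writes == 1);
}
```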

part of #78

@gemini-code-assist

Summary of Changes

Hello @twoeths, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the ContainerTreeView implementation to align with the ssz-ts design principles. The primary goal is to enhance the flexibility and extensibility of TreeView implementations by removing the tight coupling with BaseTreeView. This is achieved by directly managing child TreeView instances or basic type nodes within the container, introducing a change tracking mechanism for efficient updates, and updating the get and set operations accordingly. The changes facilitate future enhancements and better support for complex SSZ types.

Highlights

  • Decoupling from BaseTreeView: The ContainerTreeView implementation has been refactored to remove its tight coupling with BaseTreeView, allowing for more specialized and flexible data structure management.
  • Direct Child Management: Child TreeView instances or basic type Node.Ids are now stored directly within the ContainerTreeView using a dynamically generated tuple (child_data), improving control and performance over child elements.
  • Optimized Commit Mechanism: A new changed hash map tracks modifications to child fields, enabling commit operations to efficiently update only the affected parts of the Merkle tree, rather than recomputing the entire structure.
  • Alignment with ssz-ts: This refactoring brings the Zig implementation of ContainerTreeView closer to the design patterns found in the ssz-ts library, addressing previous limitations and enhancing compatibility.



@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a significant refactoring of the ContainerTreeView implementation to align it with the ssz-ts version, decoupling it from BaseTreeView. The changes are extensive and introduce a more self-contained and feature-rich ContainerTreeView.

I've found a couple of critical issues related to memory safety and correctness in the new implementation that need to be addressed. One is a bug in the commit function that could lead to reading uninitialized memory, and the other is an unsafe shallow copy in the set function for composite types, which could cause memory corruption. I've also included a medium-severity suggestion regarding adherence to the project's style guide for struct initialization, which could improve performance and reduce stack usage.

@twoeths twoeths marked this pull request as ready for review December 24, 2025 08:22
@twoeths twoeths requested a review from a team as a code owner January 4, 2026 08:12
.pool = self.pool,
.child_data = .{null} ** ST.chunk_count,
.original_nodes = .{null} ** ST.chunk_count,
.root = self.root,
Contributor

if we copy self.root without ref, will we run into double unrefs if the original self.root was unrefed?

Collaborator Author

good catch, resolved it by calling init() instead, see 27d8022
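For readers following the ref-count concern, a small sketch of the failure mode and the fix. It assumes that `init()` takes its own reference on the root, which this thread does not spell out; `Node`, `View`, `ref` and `unref` are hypothetical stand-ins, not the pool API.

```zig
const std = @import("std");

// Hypothetical ref-counted node; the real pool API is not shown in this thread.
const Node = struct {
    refs: usize = 0,

    fn ref(self: *Node) void {
        self.refs += 1;
    }

    fn unref(self: *Node) void {
        std.debug.assert(self.refs > 0); // a double unref would trip this
        self.refs -= 1;
    }
};

const View = struct {
    root: *Node,

    // Assumption: init() takes its own reference, so every view can
    // release the root independently in deinit().
    fn init(root: *Node) View {
        root.ref();
        return .{ .root = root };
    }

    fn deinit(self: *View) void {
        self.root.unref();
    }
};

test "two views built through init() unref safely" {
    var node = Node{};
    var a = View.init(&node);
    var b = View.init(a.root); // not a field-by-field copy of `a`
    try std.testing.expect(node.refs == 2);
    a.deinit();
    b.deinit();
    try std.testing.expect(node.refs == 0);
}
```

Copying `.root = self.root` by hand would leave the count at 1 with two owners, and the second `deinit()` would hit the assertion.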

Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the TreeView implementations to better align with the TypeScript ssz-ts library, removing the tightly coupled BaseTreeView abstraction and giving each TreeView type more freedom in its implementation.

Key Changes:

  • Removed BaseTreeView and TreeViewData abstractions in favor of isolated TreeView implementations
  • Changed TreeView.init() to return pointers (*Self) instead of values (Self) for better memory management
  • Implemented ContainerTreeView using tuples to store child data/views instead of hash maps
  • Added utility modules for shared functionality (clone_opts.zig, child_nodes.zig, assert.zig)
  • Refactored list and array views to track length internally and only update tree on commit()

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/int/ssz/tree_view/list_composite.zig | Updated to use pointer-based TreeView API, changed getRoot() calls, and fixed deref patterns |
| test/int/ssz/tree_view/list_basic.zig | Updated getAll() to not require allocator parameter, changed to pointer-based API |
| test/int/ssz/tree_view/container.zig | Updated Field() to return pointer types for composite fields, adjusted getRoot() usage |
| test/int/ssz/tree_view/bit_vector.zig | Changed from base_view.data to direct data field access |
| test/int/ssz/tree_view/bit_list.zig | Changed from base_view.data to direct data field access |
| test/int/ssz/tree_view/array_composite.zig | Updated to pointer-based API and removed unnecessary deinit calls for borrowed references |
| test/int/ssz/tree_view/array_basic.zig | Updated getAll() signature and pointer-based API usage |
| src/ssz/tree_view/utils/clone_opts.zig | New utility file defining CloneOpts struct for clone operations |
| src/ssz/tree_view/utils/child_nodes.zig | New utility file with shared child node management functions |
| src/ssz/tree_view/utils/assert.zig | New utility file for compile-time TreeView type assertions |
| src/ssz/tree_view/root.zig | Removed BaseTreeView and TreeViewData exports |
| src/ssz/tree_view/list_composite.zig | Refactored to store chunks directly and track length internally |
| src/ssz/tree_view/list_basic.zig | Refactored to store chunks directly and track length internally |
| src/ssz/tree_view/container.zig | Complete rewrite using tuple-based child storage with test added |
| src/ssz/tree_view/chunks.zig | Refactored BasicPackedChunks and CompositeChunks to be self-contained |
| src/ssz/tree_view/bit_vector.zig | Refactored to store BitArray data directly |
| src/ssz/tree_view/bit_list.zig | Refactored to store BitArray data directly |
| src/ssz/tree_view/bit_array.zig | Refactored BitArray to be self-contained with own fields |
| src/ssz/tree_view/base.zig | Deleted file - BaseTreeView and TreeViewData removed |
| src/ssz/tree_view/array_composite.zig | Refactored to store chunks directly |
| src/ssz/tree_view/array_basic.zig | Refactored to store chunks directly |
| src/ssz/root.zig | Removed BaseTreeView and TreeViewData exports |


Contributor

@spiral-ladder spiral-ladder left a comment

Partial review with @wemeetagain on a recorded call

Comment on lines 195 to 197
fn getLength(self: *Self) !usize {
    return self.chunks.getLength();
}
Contributor

Suggested change
- fn getLength(self: *Self) !usize {
-     return self.chunks.getLength();
- }

Consider removing this entirely since we only call this during init(). Prefer self.chunks.getLength() directly.

Collaborator Author

removed in ee71eb2

Comment on lines 204 to 206
if (self._len > ST.limit) {
    return error.LengthOverLimit;
}
Contributor

Suggested change
- if (self._len > ST.limit) {
-     return error.LengthOverLimit;
- }
+ if (self._len >= ST.limit) {
+     return error.LengthOverLimit;
+ }

This seems like a redundant check, since every update to self._len should already include it for correctness (and we do check when we update it above). Also, this should be inclusive of ST.limit. If we want to keep this, we can turn it into an assertion.

Collaborator Author

addressed in cd9e0ff
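One way to read the "check on update, assert on commit" idea is sketched below. This is not the change made in cd9e0ff, whose contents are not shown here. The names `ListView` and `limit` are hypothetical, and the sketch assumes a list may hold exactly `limit` elements, so only a push at `len == limit` is rejected.

```zig
const std = @import("std");

// Hypothetical stand-in for ST.limit.
const limit: usize = 2;

const ListView = struct {
    len: usize = 0,

    fn push(self: *ListView) error{LengthOverLimit}!void {
        // Validate at the point where the length changes...
        if (self.len >= limit) return error.LengthOverLimit;
        self.len += 1;
    }

    fn commit(self: *ListView) void {
        // ...so commit() can treat the bound as an invariant.
        std.debug.assert(self.len <= limit);
    }
};

test "push rejects growth past the limit" {
    var view = ListView{};
    try view.push();
    try view.push();
    try std.testing.expectError(error.LengthOverLimit, view.push());
    view.commit();
    try std.testing.expect(view.len == limit);
}
```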

if (existing) |child_value| {
    return child_value;
} else {
    const node = try self.root.getNodeAtDepth(self.pool, ST.chunk_depth, field_index);
Contributor

Suggested change
- const node = try self.root.getNodeAtDepth(self.pool, ST.chunk_depth, field_index);
+ const node = try self.root.getNodeAtDepth(self.pool, ST.chunk_depth, field_index);
+ errdefer self.pool.unref(node);

missing errdefer here + other similar areas

Collaborator Author

I don't see why we have to errdefer unref() here: the get() API simply returns a borrowed reference to the child, and getNodeAtDepth() does not create a new ref at all
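The distinction being argued can be sketched generically. `Node`, `getBorrowed` and `getOwned` are hypothetical; whether `getNodeAtDepth()` takes a reference is the author's claim, not something this sketch verifies. The point is only that `errdefer` cleanup is needed when a function has acquired something it must give back on failure.

```zig
const std = @import("std");

// Hypothetical node with a visible ref count.
const Node = struct { refs: usize = 0 };

// Borrowed: no reference is taken, so there is nothing to undo on error.
fn getBorrowed(node: *Node) *Node {
    return node;
}

// Owned: a reference is taken, so errdefer must release it if a later
// step in the same function fails.
fn getOwned(node: *Node, fail_later: bool) !*Node {
    node.refs += 1;
    errdefer node.refs -= 1;
    if (fail_later) return error.LaterStepFailed;
    return node;
}

test "errdefer only matters for owned references" {
    var node = Node{};

    const borrowed = getBorrowed(&node);
    try std.testing.expect(borrowed.refs == 0);

    try std.testing.expectError(error.LaterStepFailed, getOwned(&node, true));
    try std.testing.expect(node.refs == 0); // released by errdefer

    const owned = try getOwned(&node, false);
    try std.testing.expect(owned.refs == 1);
}
```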

Comment on lines +156 to +159
try child_view.commit();
const child_changed = if (self.original_nodes[i]) |orig_node| blk: {
    break :blk orig_node != child_view.getRoot();
} else true;
Contributor

Suggested change
- try child_view.commit();
- const child_changed = if (self.original_nodes[i]) |orig_node| blk: {
-     break :blk orig_node != child_view.getRoot();
- } else true;
+ const child_changed = try child_view.commit();

Could we perhaps return a bool here instead of using original_nodes to track if the child changed?

Collaborator Author

sounds good, it involves changes across all TreeViews and could avoid having to store child_nodes
will track as a separate issue
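A rough sketch of what the proposed `commit()` returning a bool could look like, purely as an illustration for the follow-up issue. `ChildView`, `ParentView`, `pending` and `rehashes` are hypothetical, and a `u64` stands in for a node id.

```zig
const std = @import("std");

// Hypothetical child view; `root` stands in for a node id.
const ChildView = struct {
    root: u64,
    pending: ?u64 = null,

    fn set(self: *ChildView, new_root: u64) void {
        self.pending = new_root;
    }

    // Returns true only when the committed root differs from the old one.
    fn commit(self: *ChildView) error{OutOfMemory}!bool {
        const next = self.pending orelse return false;
        self.pending = null;
        if (next == self.root) return false;
        self.root = next;
        return true;
    }
};

// Hypothetical parent: no original_nodes bookkeeping is needed.
const ParentView = struct {
    child: ChildView,
    rehashes: usize = 0,

    fn commit(self: *ParentView) !void {
        const child_changed = try self.child.commit();
        if (child_changed) self.rehashes += 1;
    }
};

test "parent only reacts when the child root changed" {
    var parent = ParentView{ .child = .{ .root = 1 } };
    try parent.commit();
    try std.testing.expect(parent.rehashes == 0);

    parent.child.set(2);
    try parent.commit();
    try std.testing.expect(parent.rehashes == 1);

    parent.child.set(2); // same root again
    try parent.commit();
    try std.testing.expect(parent.rehashes == 1);
}
```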

Comment on lines +122 to +127
            self.child_data[i] = null;
        }
    }
    inline for (0..ST.chunk_count) |i| {
        // these nodes are unref by root
        self.original_nodes[i] = null;
Contributor

Suggested change
-             self.child_data[i] = null;
-         }
-     }
-     inline for (0..ST.chunk_count) |i| {
-         // these nodes are unref by root
-         self.original_nodes[i] = null;
+         }
+     }
+     inline for (0..ST.chunk_count) |i| {
+         // these nodes are unref by root

Do we need to set these to null if we're only calling this at deinit? Plus, could we perhaps just inline this function since deinit() is the only place we use this?

Collaborator Author

> Do we need to set these to null if we're only calling this at deinit?

yes, deinit is for cleaning up everything, plus we don't want to keep dangling pointers there when the child TreeViews are also deinited

> Plus, could we perhaps just inline this function since deinit() is the only place we use this?

that's one of the steps you see when you look across the different TreeView structs, so I prefer to leave it there
I made them private to make it easier to reason about, see 28c0ca6

Contributor

perhaps rename this to vector_basic.zig? Same comment with array_composite.zig

Collaborator Author

not part of this work, happy to do it in a separate PR

3 participants