Contributor

@jblomer jblomer commented Nov 29, 2024

Demonstrates how to use the RNTuple API to build a special purpose container that reads large vectors piece-wise.

Original idea & code from @gemmeren

@jblomer jblomer self-assigned this Nov 29, 2024
@jblomer jblomer requested a review from couet as a code owner November 29, 2024 10:01
@jblomer jblomer marked this pull request as draft November 29, 2024 10:01
@jblomer jblomer force-pushed the ntuple-tutorial-streaming-vector branch from 77f0ca8 to 26e598e Compare November 29, 2024 10:10
Make RNTupleGlobalRange and RNTupleClusterRange copyable and
default-constructible.
Workaround. The underlying problem, moving the RField, is still broken.
To be addressed at a later point.
@jblomer jblomer force-pushed the ntuple-tutorial-streaming-vector branch from 26e598e to 570e576 Compare November 29, 2024 11:38
Member

@vepadulano vepadulano left a comment

This is a very nice addition, thanks! A few considerations from my side:


public:
iterator(RNTupleClusterRange::RIterator index, RNTupleView<T> &view) : fIndex(index), fView(view) {}
~iterator() = default;
Member

If we have to define this destructor, we should follow the rule of five.
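
To illustrate the concern, here is a minimal sketch with hypothetical stand-in types (`WithDeclaredDtor`, `RuleOfZero`, `RuleOfFive` are not the tutorial's classes, and the `std::unique_ptr` member is only an assumption used to make the effect visible). Declaring a destructor, even as `= default`, suppresses the implicitly generated move operations, so the class either has to declare all five special members or none of them.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical stand-in: a user-declared destructor suppresses the implicit
// move constructor and move assignment; the copy operations are deleted
// because std::unique_ptr is not copyable.
struct WithDeclaredDtor {
   std::unique_ptr<int> fPayload;
   ~WithDeclaredDtor() = default;
};

// Rule of zero: declare no special members and let the compiler generate them.
struct RuleOfZero {
   std::unique_ptr<int> fPayload;
};

// Rule of five: if the destructor is declared, spell out the other four too.
struct RuleOfFive {
   std::unique_ptr<int> fPayload;
   RuleOfFive() = default;
   RuleOfFive(const RuleOfFive &) = delete;
   RuleOfFive &operator=(const RuleOfFive &) = delete;
   RuleOfFive(RuleOfFive &&) = default;
   RuleOfFive &operator=(RuleOfFive &&) = default;
   ~RuleOfFive() = default;
};

static_assert(!std::is_move_constructible<WithDeclaredDtor>::value,
              "declared destructor: no implicit move, copy is deleted");
static_assert(std::is_move_constructible<RuleOfZero>::value,
              "rule of zero keeps the implicit move operations");
static_assert(std::is_move_constructible<RuleOfFive>::value,
              "rule of five restores the move operations explicitly");
```

For the iterator in this PR, simply dropping the defaulted destructor would probably be the smaller change.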

Comment on lines +116 to +120
RStreamingVector(RNTupleCollectionView &&vectorView)
: fVectorView(std::move(vectorView)), fItemView(fVectorView.GetView<T>("_0"))
{
}
~RStreamingVector() = default;
Member

Similar comment about the rule of five.

/// \file
/// \ingroup tutorial_ntuple
///
/// Example of a streaming vector: a special purpose container that reads large vectors piece-wise.
Member

I feel like the tutorial is lacking some explanation of the interplay between the various components, especially in places like the RStreamingVector::iterator constructor and operator*. We should probably also point the user to RNTupleCollectionView, so that they can better understand the rest of the tutorial.

Test Results

    18 files      18 suites   3d 23h 25m 59s ⏱️
 2 688 tests  2 687 ✅ 0 💤 1 ❌
46 547 runs  46 544 ✅ 0 💤 3 ❌

For more details on these failures, see this check.

Results for commit 570e576.

public:
class iterator {
RNTupleClusterRange::RIterator fIndex;
RNTupleView<T> &fView;
Member

should store a pointer to make the type copyable
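
A minimal sketch of the difference, using hypothetical stand-in types (`View`, `IterWithRef`, `IterWithPtr` are illustrative names, not the tutorial's classes): a reference member still allows copy construction, but it deletes the implicit copy assignment operator, which standard iterator requirements rely on. A pointer member keeps the type fully copyable.

```cpp
#include <type_traits>

// Hypothetical stand-in for RNTupleView<T>.
struct View {};

// Holds the view by reference: copy construction works, but the implicit
// copy assignment operator is deleted because references cannot be rebound.
struct IterWithRef {
   View &fView;
   explicit IterWithRef(View &view) : fView(view) {}
};

// Holds the view by pointer: both copy construction and copy assignment work.
struct IterWithPtr {
   View *fView;
   explicit IterWithPtr(View &view) : fView(&view) {}
};

static_assert(std::is_copy_constructible<IterWithRef>::value,
              "a reference member still allows copy construction");
static_assert(!std::is_copy_assignable<IterWithRef>::value,
              "a reference member deletes copy assignment");
static_assert(std::is_copy_assignable<IterWithPtr>::value,
              "a pointer member keeps the type copy-assignable");
```

Either way the iterator does not own the view, so the view has to outlive every iterator that refers to it.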

Contributor

Should this be copyable, though? To me the semantics would be a bit fuzzy: would the copy automatically get the same window of entries loaded? Also, copying a std::vector performs a deep copy, but to my understanding that would not be the case here.

Member

Doesn't that hint that the class is more of an RStreamingVectorView and/or RVectorStreamingView?

bool operator!=(const iterator &rh) const { return fIndex != rh.fIndex; }
};

RStreamingVector(RNTupleCollectionView &&vectorView)
Member

Should take the view by value (cf. the previous discussion on taking move-only types by value vs. by rvalue reference).
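
A minimal sketch of the two signatures, using hypothetical stand-in types (`MoveOnlyView` stands in for RNTupleCollectionView; `ByValue` and `ByRvalueRef` are illustrative names). The referenced discussion is not part of this thread, so the rationale in the comments is an assumption: with a by-value parameter, ownership is transferred at the call boundary, whereas an rvalue reference parameter only binds to the caller's object and leaves it to the callee whether anything is actually moved.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a move-only view type.
struct MoveOnlyView {
   MoveOnlyView() = default;
   MoveOnlyView(MoveOnlyView &&) = default;
   MoveOnlyView &operator=(MoveOnlyView &&) = default;
};

// Sink parameter taken by value: the parameter is its own object,
// move-constructed from the caller's argument before the body runs.
struct ByValue {
   MoveOnlyView fView;
   explicit ByValue(MoveOnlyView view) : fView(std::move(view)) {}
};

// Sink parameter taken by rvalue reference: the parameter merely refers to
// the caller's object; the move happens only where the callee writes std::move.
struct ByRvalueRef {
   MoveOnlyView fView;
   explicit ByRvalueRef(MoveOnlyView &&view) : fView(std::move(view)) {}
};

static_assert(std::is_constructible<ByValue, MoveOnlyView &&>::value,
              "by value accepts an rvalue");
static_assert(std::is_constructible<ByRvalueRef, MoveOnlyView &&>::value,
              "by rvalue reference accepts an rvalue");
static_assert(!std::is_constructible<ByRvalueRef, MoveOnlyView &>::value,
              "an rvalue reference does not bind to an lvalue");
```

Call sites look the same in both cases, for example `ByValue v{MoveOnlyView{}};` or `ByValue v{std::move(existingView)};`.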


void ReadRNTupleStreamingVector()
{
auto reader = RNTupleReader::Open(kNTupleName, kFileName);
Member

Should open the reader with the cluster cache disabled; otherwise all pages will still end up in memory, I think.

@couet couet removed their request for review December 4, 2024 09:40
Contributor Author

jblomer commented Aug 26, 2025

Replaced by #19748

@jblomer jblomer closed this Aug 26, 2025