
[skip-ci][DF] Add translations of TTree::Draw to RDataFrame #20284

Merged
martamaja10 merged 1 commit into root-project:master from martamaja10:rosetta
Nov 7, 2025

@martamaja10
Contributor

@martamaja10 martamaja10 commented Nov 4, 2025

First attempt at adding a "Rosetta stone" table. Open to discussion on what else to include right now and what should be removed (note the last two rows of the table). The location of the table can also be changed.

Screenshots updated after second round of fixes (post Stephan's and Vincenzo's review)

[Screenshot, 2025-11-06 at 11:30:12]

@martamaja10 martamaja10 changed the title [NFC][DF] Add translations of TTree::Draw to RDataFrame [skip-ci][DF] Add translations of TTree::Draw to RDataFrame Nov 4, 2025
@github-actions

github-actions bot commented Nov 4, 2025

Test Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit 10c84e0.

Member

@hageboeck hageboeck left a comment

Very nice, thanks!

@martamaja10 martamaja10 self-assigned this Nov 5, 2025
@silverweed
Contributor

Would it make sense to have 3 columns on the table: one with TTree::Draw, one with "RDF one-liner" (interpreted, as few intermediate variables as possible) and one with "RDF fully compiled", with proper lambdas/types and not as brief as the second column?

@martamaja10
Contributor Author

Would it make sense to have 3 columns on the table: one with TTree::Draw, one with "RDF one-liner" (interpreted, as few intermediate variables as possible) and one with "RDF fully compiled", with proper lambdas/types and not as brief as the second column?

Thanks for this comment, Jack! You're right: having the minimal version here could be better, and showing the fully compiled version separately would also make sense. My worry with 3 columns is that the table becomes hard to read quickly. Maybe this PR should focus on the simplest translation, and we add to it later? What do you think @vepadulano @hageboeck?

@hageboeck
Member

hageboeck commented Nov 5, 2025

Maybe this PR should focus on the simplest translation and then we add on top of it later? What do you think vepadulano hageboeck ?

We could surely merge it now with these two columns, wait one night for it to show up in the docs, and subsequently test if the space allows us to have the third column.
Vincenzo had the same question for a Python column, so we would arrive already at 4 columns.

Maybe one could make a table with the most concise one-liners (actually, one statement), in C++ and Python. And then we could think about a second table with the "expert" fully compiled versions?
The reason I make a distinction between one-liner and one statement is that this:

auto histo = df.Filter(...)
               .Define(....)
               .Histo1D(...);

seems easier to present and read than:

auto histo = df.Filter(...).Define(....).Histo1D(...);

@hageboeck hageboeck added the skip ci Skip the full builds on the actions runners label Nov 5, 2025
@silverweed
Contributor

silverweed commented Nov 5, 2025

Let's not forget that if the table is meant to be viewed on a webpage, we can also make it dynamic: keep it at 2 columns but let the user decide which "flavor" of RDF is displayed with a dropdown menu or similar, much like what e.g. Microsoft does for the WPF docs (notice the little dropdown menu that lets you select the programming language).

@ferdymercury
Collaborator

Thanks for this!

It's not for TTree::Draw but it would be nice to also document a translation of:

tree->SetBranchAddress("time", &time);
auto prevTime = 0;
for(Long64_t i = 0; i<nEntries; ++i) {
    tree->GetEntry(i);
    hist->Fill(time - prevTime);
    prevTime = time;
}

(if a histogram of time differences of consecutive entries can be achieved at all with non-multithreaded RDF)

@hageboeck
Member

Thanks for this!

It's not for TTree::Draw but it would be nice to also document a translation of:

tree->SetBranchAddress("time", &time);
auto prevTime = 0;
for(Long64_t i = 0; i<nEntries; ++i) {
    tree->GetEntry(i);
    hist->Fill(time - prevTime);
    prevTime = time;
}

(if a histogram of time differences of consecutive entries can be achieved at all with non-multithreaded RDF)

Yes, you can, but you need a mutable lambda:

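  ROOT::RDataFrame rdf(10);
  // The previous entry number is stored inside the lambda via the init-capture,
  // so no outer variable is needed. 0ULL keeps it the same type as "rdfentry_".
  auto histo = rdf.Define("timediff", [old = 0ULL](unsigned long long entry) mutable {
      const auto result = entry - old;
      old = entry;
      return result;
      }, {"rdfentry_"}).Histo1D({"name", "title", 10, 0, 10}, "timediff");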
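The mutable-lambda technique can be tried outside ROOT. The following is a minimal standard-library sketch, not the RDataFrame implementation: `MakeConsecutiveDiff` and `ConsecutiveDiffs` are hypothetical names introduced here for illustration, and the plain loop only stands in for a sequential (non-multithreaded) event loop. It uses a signed 64-bit type, an assumption chosen so that a decreasing value yields a negative difference instead of wrapping around.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns a stateful callable. Each call yields
// (value - previous value); the previous value starts at 0, mirroring
// `prevTime = 0` in the TTree loop quoted in this thread.
inline auto MakeConsecutiveDiff()
{
   // The init-capture gives the closure its own `prev` member;
   // `mutable` allows operator() to update it between calls.
   return [prev = std::int64_t{0}](std::int64_t value) mutable {
      const std::int64_t diff = value - prev;
      prev = value;
      return diff;
   };
}

// Hypothetical stand-in for a sequential event loop: feeds the values
// to the stateful callable strictly in order.
inline std::vector<std::int64_t> ConsecutiveDiffs(const std::vector<std::int64_t> &values)
{
   auto diff = MakeConsecutiveDiff();
   std::vector<std::int64_t> out;
   out.reserve(values.size());
   for (const auto v : values)
      out.push_back(diff(v));
   assert(out.size() == values.size()); // one difference per input value
   return out;
}
```

Calling `ConsecutiveDiffs({3, 5, 10, 10})` yields `3, 2, 5, 0`. The result depends on the order in which values arrive, which is why the question above is restricted to non-multithreaded RDF: the captured state is only meaningful when entries are processed one after another.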

@vepadulano
Member

Maybe this PR should focus on the simplest translation and then we add on top of it later? What do you think @vepadulano @hageboeck ?

I believe this PR should show the simplest/shortest way to represent in RDF an equivalent TTree::Draw query. Doing it in C++ or Python would be equivalent, it's fine to start with C++. Probably the string-expressions are the shortest way of representing the TTree::Draw queries in RDF (being those queries string expressions in the first place). If in any case having a C++ lambda makes the code shorter, then it's also fine to use that.

Member

@vepadulano vepadulano left a comment

Thank you so much! This is great! I have left a few comments

Member

@vepadulano vepadulano left a comment

Amazing thanks!

@pcanal
Member

pcanal commented Nov 6, 2025

This is a great start! Thanks. For later PRs, there are several other cases (see for example dt_DrawTest.C) that we should also tackle, in particular array element selection and indexing.

@martamaja10 martamaja10 marked this pull request as ready for review November 7, 2025 07:48
@martamaja10 martamaja10 merged commit 5b1edbc into root-project:master Nov 7, 2025
7 checks passed