-
Notifications
You must be signed in to change notification settings - Fork 62
Add a c++ implementation for podio-dump
#620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I wonder how an RDataFrame-based python version (with pre-compiled functions) would fare on a performance vs. comfort scale |
c7d24b2 to
6b34936
Compare
6b34936 to
04b0bd2
Compare
04b0bd2 to
b49dd73
Compare
|
I have rebased this onto the latest version of podio. The main difference to before is that There are some formatting differences wrt to the python implmentation, which are pretty much all related to the alignment of numerical outputs in the tables. Using ROOT 6.34.04 and python 3.11.11 I get the following times (simply using the test times here): Now and Before and So, depending on which ROOT backend is used the speedup is somewhere between 2x and 30x. For SIO it is another factor 10 on top of that (for the test files). Even if the speedup is "only" a factor two for the most prevalent file format at the moment, I still think this would be worth it, also considering the increased implementation side, since many things that came packaged for python are now handrolled here. I have checked |
tools/src/podio-dump-tool.cpp
Outdated
| } | ||
|
|
||
| template <typename T> | ||
| std::string getTypeString() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| std::string getTypeString() { | |
| constexpr std::string getTypeString() { |
I'm not sure this will change anything...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made it a consteval std::string_view, so it's definitely evaluated at compile time.
| @@ -0,0 +1,213 @@ | |||
| #!/usr/bin/env python3 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to podio-dump-legacy.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to podio-dump-legacy
tools/src/podio-dump-tool.cpp
Outdated
| const auto collNames = frame.getAvailableCollections(); | ||
| for (const auto& name : podio::utils::sortAlphabeticaly(collNames)) { | ||
| const auto coll = frame.get(name); | ||
| print_flush("{}\n", name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that it is much faster than before, is it needed to flush so much? Maybe let the receiver (terminal) flush whenever they want for better throughput (for example with large files)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to check. This might also just be a remnant from porting the python implementation, where flushing on the python side was necessary, because otherwise python and c++ streams (from coll.print()) would not be synchronized properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work without.
tools/src/podio-dump-tool.cpp
Outdated
| } | ||
|
|
||
| template <typename... Args> | ||
| void print_flush(fmt::format_string<Args...> fmtstr, Args&&... args) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| void print_flush(fmt::format_string<Args...> fmtstr, Args&&... args) { | |
| void print_flush(const fmt::format_string<Args...>& fmtstr, Args&&... args) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole function has been removed, as it is no longer necessary.
podio-dumppodio-dump
a22da4d to
8f723df
Compare
tools/src/argparseUtils.h
Outdated
| inline std::vector<std::string> splitString(const std::string& str, const char delimiter) { | ||
| std::vector<std::string> tokens; | ||
| std::string token; | ||
| for (char ch : str) { | ||
| if (ch == delimiter) { | ||
| if (!token.empty()) { | ||
| tokens.push_back(token); | ||
| token.clear(); | ||
| } | ||
| } else { | ||
| token += ch; | ||
| } | ||
| } | ||
| if (!token.empty()) { | ||
| tokens.push_back(token); | ||
| } | ||
| return tokens; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| inline std::vector<std::string> splitString(const std::string& str, const char delimiter) { | |
| std::vector<std::string> tokens; | |
| std::string token; | |
| for (char ch : str) { | |
| if (ch == delimiter) { | |
| if (!token.empty()) { | |
| tokens.push_back(token); | |
| token.clear(); | |
| } | |
| } else { | |
| token += ch; | |
| } | |
| } | |
| if (!token.empty()) { | |
| tokens.push_back(token); | |
| } | |
| return tokens; | |
| } | |
| inline std::vector<std::string> splitString(const std::string& str, const char delimiter) { | |
| std::vector<std::string> result; | |
| for (const auto part : std::ranges::views::split(str, delimiter)) { | |
| result.emplace_back(part.begin(), part.end()); | |
| } | |
| return result; | |
| } |
There is std::ranges::views::split to do exactly what this function does. In principle it should be possible to replace all the occurrences of splitString with std::ranges::views::split but more code would have to be modified to use views so this is a partial solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made this return a vector<string_view> and adapted the parseSizeOrExit to take a string_view.
| } | ||
| return number; | ||
| } catch (const std::invalid_argument&) { | ||
| std::cerr << "'" << str << "' cannot be parsed into an integer number" << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've noticed that now passing a bad range like -e 1: fails with '' cannot be parsed into an integer number while before this would fail with:
usage: podio-dump [-h] [-c CATEGORY] [-e ENTRIES] [-d] [--dump-edm DUMP_EDM] [--version]
inputfile
podio-dump: error: argument -e/--entries: '0:' cannot be parsed into a list of entries
Not a huge problem since it is stated in the help how to correctly pass a range, but the error message is a bit useless in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be back to (almost) what it was before now. I have put in an explicit check for an empty string after the colon. Ideally parseSizeOrExit would return an optional, but that would require quite some other work as well.
| #include <fmt/core.h> | ||
|
|
||
| #include <algorithm> | ||
| #include <iostream> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #include <iostream> |
Marked by clangd
|
I don't have any more comments and probably won't have another look in detail unless I find something. I have been using this and it feels soooo much better to use, certainly way faster even in debug builds it is very fast. It also crashes faster when reading an incompatible file 😄 |
IIRC, it was really the loop in python over all the collections that was slow (and the |
Drop-in replacement of podio-dump python implementation which gets moved to podio-dump.py because it supports dumping the pre-release legacy files
Co-authored-by: Juan Miguel Carceller <[email protected]>
Co-authored-by: Juan Miguel Carceller <[email protected]>
564140a to
f0217c0
Compare
|
This is now in the Key4hep nightlies on Alma 9 and Ubuntu 24. On Ubuntu 22, it doesn't build with GCC 11 anymore with multiple failures: I wasn't aware of this, but actually building with GCC 11 (with tests) hasn't been possible since #626. I think probably the way to go forward with this is to request GCC > 11 rather than trying to make the errors go away. |
|
I am not entirely sure how to best deal with this, but gcc11 is still the system compiler on Alma9, so if we can keep support for it alive, I would be all for it. |
|
On the other side, GCC 15 is going to be released in ~ 1 month, so by now GCC 11 is already 4 years old. In addition no one seems to be building podio with tests since #626 is from last year's summer. |
|
Well, it got merged in December and the v01-02 tag does not yet include it, so if people only build tagged versions they will not have encountered it yet. The main question is whether we consider gcc11 as c++20 compatible (enough) or not. Since we switched to c++20 after v01-02 that would also give us a clean cut and a reason to abandon gcc11. |
BEGINRELEASENOTES
podio-dumppython implementation by apodio-dumpwrapper script around the newpodio-dump-toolc++ implementation. This is an (almost) drop-in replacement, only legacy files (i.e. files written before the introduction of theFrame) cannot be read by this.podio-dump-legacy.podio-dump-tooluses that for formatting.std::formatyet to easily replace this.ENDRELEASENOTES
This has taken a bit to develop (see below for the original description). The way it is now is a combination of two tools that are wrapped in a thin bash script that acts as the entry point and is called
podio-dump. This is effectively only necessary because the datamodel definitions are stored as JSON inside the file and it's just way easier to transform that back to YAML in python than it is in c++. Hence, the bash script only checks for the corresponding flag and if that is present pipes the output of the c++podio-dump-toolto a python script (json-to-yaml). Otherwise it simply calls thepodio-dump-toolpassing on all the arguments.The dependency on fmtlib is necessary because it supports formatting of ranges and easy opt-in for implementations via
operator<<. Additionally,std::print(andprintln) are only scheduled for c++23. Once there is more / all necessary support in<format>, adjusting to that and dropping the fmtlib dependency should be possible.Original description
This is an attempt at making
podio-dumpquicker after several complaints (e.g. key4hep/EDM4hep#312). After some "profiling" it turns out that the slowest part in the python implementation is the loop over all the collections which can be significantly sped up by going to c++. In my local timings the current (python based)podio-dumpis almost ten times slower than this (c++ based)podio-dump-cpp) for dumping theexample_frame.rootfile from the tests (times viatime)podio-dumppodio-dump-cppThe main disadvantages of the c++ implementation are that we need quite a bit of boilerplate for things that are trivial in python, e.g.:
tabulatefunctionalityiostreamandiomanipis bordering on masochism, I have decided to pull infmtfor now. In principle c++20 has similar functionality instd::format(but nofmt::printthat only comes with c++26). However, that requires gcc >=13 and clang >= 16.podio/tools/podio-dump
Lines 99 to 100 in d275460
in c++ this would require at least one other library to be pulled in
Since dumping the datamodel would require quite a bit of work in c++, I would be in favor of keeping that in python in a separate tool, while the other functionality could be covered by the c++ implementation.
TODO: