DRAFT: csv export support #5

juliangaal · 2024-08-08T12:07:12Z

Hi there,

after #2 I took the liberty to start working on csv export support

Current changes:

csv(stream) that matches print(stream)
get_summary (formerly get_summary_table) extended to support ResultFormat: TABLE, CSV, DB, JSON
result_csv implemented to print summary
- file system operations later
- exporting the "detailed" overview requires further discussion
- currently includes info and summary table in the same csv string (default for printing currently), see comments below

Issues and Discussion

Please provide clang-format file. Formatting is tab-based and hard to adhere to, I did my best.
IMO placing units in the header makes exporting to different formats (CSV, JSON, DB) easier down the line. This way there is no need to write a function that does, e.g. table_percentage yet doesn't append "%". It also makes result file parsing much simpler.
```
filename,function,line,calls,ae[1-99]%,ae[0-100]%,time ae[0-100],time a[0-100]
ctrack.cpp,expensiveOperation,6,10,99.95%,99.95%,576.21 ms,576.21 ms
```
would become
```
filename,function,line,calls,ae[1-99]%,ae[0-100]%,time ae[0-100]ms,time a[0-100]ms
ctrack.cpp,expensiveOperation,6,10,99.95,99.95,576.21,576.21
```
Now, when parsing, there is no need to handle units explicitely
I think splitting into an 'info.csv' and 'results.csv' is necessary, because each table/csv has its own header.

How do you want to export the "details" table in the future? That part I haven't touched at all yet. By extending the headers to, e.g. min (fastest [0-1]%), min (center[1-99]%)

+---------+---------+----------+----------+----------+-----------+-----------+----------+---------+---------+
|   fastest[0-1]%   |                           center[1-99]%                           | slowest[99-100]%  |
+---------+---------+----------+----------+----------+-----------+-----------+----------+---------+---------+
|   min   |  mean   |   min    |   mean   |   med    |  time a   |  time ae  |   max    |  mean   |   max   |
+---------+---------+----------+----------+----------+-----------+-----------+----------+---------+---------+
| 0.00 ns | 0.00 ns | 51.12 ms | 57.14 ms | 58.00 ms | 571.36 ms | 571.36 ms | 59.01 ms | 0.00 ns | 0.00 ns |
+---------+---------+----------+----------+----------+-----------+-----------+----------+---------+---------+

Compaile · 2024-08-08T12:57:06Z

thanks, for the pr we will review it shortly.

one quick thing to add alredy.
Leaving the units only in the header wont be great for terminal ussage.

times can be ns can be us can be ms or seconds.
it will be hard to read if people have something like 10000000 ms in a slow code path.

but i aggree for csv it would be better so i'll think about it

Compaile · 2024-08-12T08:29:52Z

Hi @juliangaal ,
I like the general structure of the pr!. Here are some topics which you mentioned and my suggestions:
a) For the file-based export, the detailed overview can just be extra columns. At the end of the day, the summary overview is just a short version of the details table merged. So I would just add the extra columns for the missing entries (I can also take care of that if you want).
b) Regarding the units, I see your point. I still want to have the units in string (logfile) or std::cout export. So we could add that as a parameter overload. What unit should we use as default if we leave them out? ms? µs? Or is this a setting?
c) Can you explain why splitting into info and result.csv would be necessary?
d) I agree we should provide a clang-format file. Will add that to the todo.

juliangaal · 2024-08-13T10:32:03Z

a) sure, I would appreciate that
b) My latest commit implemented just that, for summary and detail. Regarding "leaving out the units", I would only leave them out of the data, but place the unit in the header? That way the user doesn't have to chose, and future self will not have to answer the question which unit was used during generation...
c) I guess this is partly personal preference, but I believe it makes parsing easier. Consider this (current csv output of get_summary):

Start,End,time total,time ctracked,time ctracked %  <-- info header
2024-08-13 11:53:16,2024-08-13 11:53:17,565.93,565.58,99.94  <-- info
filename,function,line,calls,ae[1-99]%,ae[0-100]%,time ae[0-100],time a[0-100] <-/ summary header
ctrack.cpp,expensiveOperation,6,10,99.94,99.94,565.58,565.58 <-- summary

Processing this with e.g. pandas means discarding the first two rows just to access the other (for benchmarking more useful) data. We could of course write the data "next to" each other in the csv file

Compaile · 2024-08-16T07:27:06Z

Here's the text with spelling corrections:
Perfect.
a) I will add it to your PR soon.
b)
Yes, we can put them in the header when we don't want them inside the rows. However, when you have a big project and tracked many functions, it will be hard to set a good unit value in the header.
As an example, let's say we have functions which need 50 seconds and some which need 100 μs.
If we choose μs as header, the 50 seconds entry would show 50000000000, which is hard to read. In CSV, I totally agree it makes sense. What default unit should we use for CSV (since it needs to be constant) for all entries? Milliseconds?
c) I prefer to just add some more rows and write it next to the other data. That's more common than having 2 CSV files.
So I would just add the info entries (Start, End, time total, time tracked, time tracked %) to the start or end of the CSV lines. Could you take care of this?

Then i will add a) and merge. Thanks for the nice idea and implementation

juliangaal · 2024-08-19T08:31:15Z

regarding b) couldn't we just provide an option where the dev can pass, e.g. std::chrono::milliseconds?

In general, I am just a little confused about your goals with appending rows to the right of the csv file. E.g.

Start,End,time total,time ctracked,time ctracked %  <-- info header
2024-08-13 11:53:16,2024-08-13 11:53:17,565.93,565.58,99.94  <-- info
filename,function,line,calls,ae[1-99]%,ae[0-100]%,time ae[0-100],time a[0-100] <-/ summary header
ctrack.cpp,expensiveOperation,6,10,99.94,99.94,565.58,565.58 <-- summary

would have to be changed to

Start,End,time total,time ctracked,time ctracked, filename,function,line,calls,ae[1-99]%,ae[0-100]%,time ae[0-100],time a[0-100] 
2024-08-13 11:53:16,2024-08-13 11:53:17,565.93,565.58,99.94,,,,,,,,
,,,,,,,,ctrack.cpp,expensiveOperation,6,10,99.94,99.94,565.58,565.58 
,,,,,,,,ctrack.cpp,expensiveOperation2,6,10,99.94,99.94,565.58,565.58 
....

is this the format you are intending to provide?

Compaile · 2024-11-12T10:47:29Z

Sorry for the incredible long wait. We now allocated more dedicated ressources to ctrack. If you dont mind i would finish up this pr and implement the last missing things?

Asperamanca · 2025-04-28T19:55:49Z

Please consider an option to export without decimal separator. Different locales use different separators. Exporting numbers in such a unit that whole numbers are sufficient makes the output much easier to import. My guess is that CSV is targeted more for import into tools for further processing; therefore readability is not the major concern (if you want that, you have the nice ASCII tables).
If you insist on the decimal separator, please make it configurable.

Compaile · 2025-04-28T20:02:31Z

good point.
We will adjust that and finish the csv and json export very soon. Would you agree that ";" is the best default separator ? (+ make it configurable)

Asperamanca · 2025-04-29T09:10:43Z

The ";" sounds like a reasonable default. Cursory research on decimal and thousands separators (please don't use those on a CSV export, either!) reveals that commonly used characters are ,.'

juliangaal mentioned this pull request Aug 8, 2024

csv export #2

Closed

feat: basic csv formatting for summary

2acda1b

juliangaal force-pushed the develop branch from 15156d2 to 2acda1b Compare August 8, 2024 12:29

feat: support displaying/ignoring of units in summary/detail

80bfb9b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

DRAFT: csv export support #5

DRAFT: csv export support #5

Uh oh!

juliangaal commented Aug 8, 2024 •

edited

Loading

Uh oh!

Compaile commented Aug 8, 2024

Uh oh!

Compaile commented Aug 12, 2024

Uh oh!

juliangaal commented Aug 13, 2024 •

edited

Loading

Uh oh!

Compaile commented Aug 16, 2024

Uh oh!

juliangaal commented Aug 19, 2024

Uh oh!

Compaile commented Nov 12, 2024

Uh oh!

Asperamanca commented Apr 28, 2025

Uh oh!

Compaile commented Apr 28, 2025

Uh oh!

Asperamanca commented Apr 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

DRAFT: csv export support #5

Are you sure you want to change the base?

DRAFT: csv export support #5

Uh oh!

Conversation

juliangaal commented Aug 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Current changes:

Issues and Discussion

Uh oh!

Compaile commented Aug 8, 2024

Uh oh!

Compaile commented Aug 12, 2024

Uh oh!

juliangaal commented Aug 13, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Compaile commented Aug 16, 2024

Uh oh!

juliangaal commented Aug 19, 2024

Uh oh!

Compaile commented Nov 12, 2024

Uh oh!

Asperamanca commented Apr 28, 2025

Uh oh!

Compaile commented Apr 28, 2025

Uh oh!

Asperamanca commented Apr 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

juliangaal commented Aug 8, 2024 •

edited

Loading

juliangaal commented Aug 13, 2024 •

edited

Loading