Add a tool to get graph statistics#556
Conversation
ba310c9 to
f619e39
Compare
|
I thought the goal was the unify our tools into a common framework. This seems to just create more tools that will get out of date. Also, can you add tests? |
|
This is a separate PR. The one to unify things is still in draft. This tool was needed to unblock some work for other people. |
I'd prefer we start with writing the library API and adding tests. If people want to run code that doesn't have tests, that is up to them and they can just take your branch. But we should try to uphold some quality for things in master. |
|
I can definitely write tests for this. |
bd94e80 to
f00ae8c
Compare
dca3a72 to
cc9837f
Compare
706c0ff to
d4b8999
Compare
witchel
left a comment
There was a problem hiding this comment.
Thanks very much!
I think a unified tool to collect and print information on graphs will be really useful.
I do agree with Donald that we need a unification rather than another tool.
I also agree with him that tests are necessary. There are somewhat easy ways of building graphs in memory, you can see TestCreate in ingest.cpp. That starts from an empty in-memory graph and adds nodes, edges and properties.
| real_used_space += sizeof(mainType); | ||
| } |
There was a problem hiding this comment.
Is this calculation correct for Boolean types? Is their sizeof == 8?
We shouldn't have any Booleans in V2 RDGs, but might as well be correct.
| basic_raw_stats.insert(std::pair("Node-Schema-Size", total_num_node_props)); | ||
| basic_raw_stats.insert(std::pair("Edge-Schema-Size", total_num_edge_props)); |
There was a problem hiding this comment.
See Time.h:BytesToStr (in libsupport) for some good formatting options for sizes and times.
| } else { | ||
| prop_field = g->GetEdgeProperty(prop_name).value()->chunk(0); | ||
| } | ||
| auto bit_width = arrow::bit_width(dtype->id()); |
There was a problem hiding this comment.
Oh, maybe this makes your space calculation for Boolean types work?
amberhassaan
left a comment
There was a problem hiding this comment.
A few important comments.
96030d9 to
148c34d
Compare
ad7631c to
9ceb2fe
Compare
|
@GioLomia : my changes have diverged quite a bit from current state of the code, so you shouldn't push to this PR for now. We can revisit and decide what to do with the tool once I'm finished with my changes (and analysis of grouping memory usage). |
This PR is meant to add an executable script that will allow us to inspect our memory performance and benchmark and changes we make the underlying memory structure.