Hi, thank you for sharing this great project!
I’m currently studying the codebase and noticed that the script relies on the following JSON files:
kv_store_text_chunks.json
kv_store_entities.json
kv_store_hyperedges.json
Could you kindly clarify:
How are these JSON files generated from the original dataset?
Is there an existing script or code reference for preprocessing the raw data into these formats?
Any guidance or pointers would be greatly appreciated. Thanks in advance!