[Question] How to produce a dataset with millions of objects in batches #677

Hi, we love using Hollow; it is very nice.

I want to know if there is a proper way to produce data in batches. For example, I have 10 million objects to produce, and I want to split them into 10 parts and produce 1 million objects at a time. I need to produce data in batches because my VM does not have enough memory to hold 10 million objects at once.

I am using `Incremental` with `withNumStatesBetweenSnapshots` so that a snapshot is published only at the beginning and at the end, which makes it run "in batches". But I ran into a problem: sometimes `Incremental` does not publish the dataset, because some batches do not change it.
I have forked [hollow-reference-implementation](https://github.com/Netflix/hollow-reference-implementation) and written 2 test cases to show what we are looking for. You can check my test cases: [ProducerTest](https://github.com/Q-Bug4/hollow-reference-implementation/blob/test-incremental/src/test/java/how/hollow/producer/ProducerTest.java)
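
To make the batching part of the question concrete, here is a minimal sketch in plain Java with no Hollow dependency. It only shows the chunking loop: the source is drained in fixed-size batches so that at most one batch is held in memory at a time. The names `BatchCycleSketch`, `runInBatches`, and the `cycle` callback are hypothetical and do not come from Hollow or from the linked test cases. In a real producer the callback would presumably wrap one incremental cycle, which is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;

public class BatchCycleSketch {

    /**
     * Drains the source in chunks of at most batchSize elements and hands each
     * chunk to the cycle callback. Returns the number of batches processed.
     * The callback is a hypothetical stand-in for "run one producer cycle".
     */
    public static <T> int runInBatches(Iterator<T> source, int batchSize,
                                       Consumer<List<T>> cycle) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                cycle.accept(batch);
                batches++;
                // Start a fresh list so the previous batch can be garbage collected.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            cycle.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for a lazily read source of 10 million objects.
        Iterator<Integer> source = IntStream.range(0, 25).iterator();
        int batches = runInBatches(source, 10,
                batch -> System.out.println("cycle with " + batch.size() + " objects"));
        System.out.println("batches: " + batches);
    }
}
```

The sketch does not address the problem described above, where a batch that leaves the dataset unchanged produces no published state. It only pins down what "in batches" means here.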

