Benchmarking Data Collector Strategies mesa-frames #168

Ben-geo · 2025-08-25T19:36:01Z

Ben-geo
Aug 25, 2025
Maintainer

We benchmarked different data collection + flushing strategies on the Boltzmann Wealth Model (100 steps) with up to 1M agents.
The goal was to evaluate trade-offs between execution time, memory usage, and CPU utilization for each approach.

🔬 Data Collector Strategies Compared

mesa-frames (pl native)
Default mesa-frames run with no data collection.
mesa-frames (pl native) with data collector Collects every step CSV
Collects data every step and immediately flushes to a CSV file.
mesa-frames (pl native) with data collector Collects every 10th step CSV
Collects data every 10th step and immediately flushes to a CSV file.
mesa-frames (pl native) with data collector Collects every step in Memory
Collects every step but does not flush, instead keeps everything in memory.
mesa-frames (pl native) with data collector Collects every step with deferred flush CSV
Collects every step and flushes every 100 steps (no concatenation: each step's frame is written to its own file, one by one).
mesa-frames (pl native) with data collector Collects every step with custom Deferred Flush
Collects every step and flushes every 100 steps (the frames are concatenated first and written as a single file).
mesa-frames (pl native) with data collector Collects every step in parquet
Collects every step and flushes as a Parquet file.
mesa-frames (pl native) with data collector Collects every step with async Flush
Collects every step and flushes asynchronously.
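
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the three metrics (execution time, memory, CPU utilization) could be gathered for one run. It is not the harness behind the results in this post: `measure` and `toy_run` are hypothetical names, and `toy_run` is a plain-Python stand-in for a model run. Note that `tracemalloc` only sees allocations made through Python's allocator, so for Polars-backed frames a process-level measure such as resident set size would be needed instead.

```python
import time
import tracemalloc


def measure(run, *args):
    """Run `run(*args)` once and report wall time, CPU time and peak Python memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    run(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "wall_s": wall,
        "cpu_s": cpu,
        # Near 1.0: the process was busy. Well below 1.0: it was waiting (e.g. on disk).
        # process_time() sums CPU time over all threads, so this can exceed 1.0.
        "cpu_util": cpu / wall if wall > 0 else 0.0,
        "peak_mib": peak / 2**20,
    }


def toy_run(n_agents, n_steps):
    # Hypothetical stand-in for model.run_model(): keeps one list per step in memory.
    history = []
    for _ in range(n_steps):
        history.append([i % 7 for i in range(n_agents)])
    return history


if __name__ == "__main__":
    for n in (1_000, 5_000):
        print(n, measure(toy_run, n, 20))
```

Tracing allocations slows the run down, so timing and memory are probably best measured in separate passes.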

Plots

📊 Results Summary

Execution Time

Async Flush showed the best performance, scaling efficiently as the number of agents increased while also saving data.
Every step CSV (immediate flush) took the longest. Every step with custom Deferred Flush performed slightly better but was still slow.
Every step in Memory showed almost no difference from pl native, which suggests that writing files is the bottleneck.

Memory Usage

Async Flush and Custom Deferred Flush used the most memory.
Memory usage drops at around 700,000 agents; this occurred in every test we conducted.

CPU Utilization

Async Flush kept the CPU consistently busy, minimizing idle time.
Deferred flush strategies, on the other hand, left the CPU underutilized for long periods.

✅ Conclusion

After evaluating runtime, memory, and CPU trade-offs, we decided to adopt:

➡️ mesa-frames (pl native) with data collector Collects every step with async Flush

Even though it consumes more memory, the significant reduction in execution time and high CPU utilization make Async Flush the best choice overall.
The performance gains outweigh the additional memory cost, especially for large-scale runs with mesa-frames (pl native).

quaquel · 2025-08-26T05:18:43Z

quaquel
Aug 26, 2025
Maintainer

Cool!

Can you elaborate a bit on the async flush and deferred flush? Specifically, can you give some technical details or code examples so I can wrap my head around it? It would be cool to test this with normal mesa at some point as well.

3 replies

Ben-geo Aug 26, 2025
Maintainer Author

deferred flush - instead of flushing right after each collection, we flush all "n" steps together once the loop has finished

# normal flush - here we call .write_csv("") 100 times, once after every step
    def step(self):
        self.agents.do("step")
        self.dc.collect()
    def run_model(self, n):
        for _ in range(n):
            self.step()
            self.dc.flush()

# deferred flush - here we also call .write_csv("") 100 times, but only after the last step
    def step(self):
        self.agents.do("step")
        self.dc.collect()
    def run_model(self, n):
        for _ in range(n):
            self.step()
        self.dc.flush()

# custom deferred flush - here we call .write_csv("") once, since we concatenate the 100 steps and save them as one df
    def step(self):
        self.agents.do("step")
        self.dc.collect()
    def run_model(self, n):
        for _ in range(n):
            self.step()
        self.dc.flush()  # same call, but flush itself is different

    # requires `import polars as pl`; `uri` is the output location, defined elsewhere
    def flush(self):
        # concatenate the buffered per-step frames, then write one file per reporter
        agent_frame = pl.concat(self._agent_frames)
        agent_frame.collect().write_csv(f"{uri}/agent_reporter_{self._batch}.csv")

        model_frame = pl.concat(self._model_frames)
        model_frame.collect().write_csv(f"{uri}/model_reporter_{self._batch}.csv")
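
To make the difference between the three variants above concrete, here is a self-contained sketch that counts how many files each one writes. It is only an illustration: `ToyCollector`, `flush_concat` and `run` are hypothetical names, and plain lists plus the standard-library `csv` module stand in for the Polars frames and `.write_csv` used in mesa-frames.

```python
import csv
import tempfile
from pathlib import Path


class ToyCollector:
    """Hypothetical stand-in for the data collector: buffers rows, writes CSV on flush."""

    def __init__(self, out_dir):
        self.out_dir = Path(out_dir)
        self._frames = []  # one list of rows per collected step
        self.writes = 0    # number of files written so far

    def collect(self, step, wealth):
        self._frames.append([(step, i, w) for i, w in enumerate(wealth)])

    def _write(self, rows):
        path = self.out_dir / f"agents_{self.writes}.csv"
        with path.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["step", "agent_id", "wealth"])
            writer.writerows(rows)
        self.writes += 1

    def flush(self):
        # One file per buffered step (no concatenation).
        for frame in self._frames:
            self._write(frame)
        self._frames.clear()

    def flush_concat(self):
        # Concatenate every buffered step, then write a single file.
        if self._frames:
            self._write([row for frame in self._frames for row in frame])
        self._frames.clear()


def run(strategy, n_steps=100, n_agents=10):
    """Run a toy loop with the given flush strategy; return the number of files written."""
    with tempfile.TemporaryDirectory() as tmp:
        dc = ToyCollector(tmp)
        for step in range(n_steps):
            dc.collect(step, [1] * n_agents)
            if strategy == "immediate":
                dc.flush()
        if strategy == "deferred":
            dc.flush()
        elif strategy == "custom_deferred":
            dc.flush_concat()
        return dc.writes


if __name__ == "__main__":
    for name in ("immediate", "deferred", "custom_deferred"):
        print(name, run(name))  # 100, 100 and 1 files respectively
```

In this sketch both deferred variants hold all 100 steps in memory until the end; they differ only in whether the final write is 100 small files or one large one.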

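async flush - we are still refining this in #167, but the initial idea was this:

    import threading

    def flush_async(self):
        # run flush() on a background daemon thread so the model loop does not wait for the write
        threading.Thread(target=self.flush, daemon=True).start()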

hope it makes more sense now!
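
The one-line `flush_async` above leaves two things open: the writer thread reads the same buffers that `collect()` keeps appending to, and a daemon thread is stopped abruptly at interpreter exit, so a write may be cut short. Below is one possible way to handle both, as a sketch only; it is not necessarily what #167 does. `AsyncToyCollector` and `close` are hypothetical names, and the standard-library `csv` module stands in for Polars.

```python
import csv
import tempfile
import threading
from pathlib import Path


class AsyncToyCollector:
    """Hypothetical collector that hands a snapshot of its buffer to a writer thread."""

    def __init__(self, out_dir):
        self.out_dir = Path(out_dir)
        self._frames = []
        self._batch = 0
        self._threads = []

    def collect(self, step, wealth):
        self._frames.extend((step, i, w) for i, w in enumerate(wealth))

    def _write(self, rows, batch):
        path = self.out_dir / f"agent_reporter_{batch}.csv"
        with path.open("w", newline="") as f:
            csv.writer(f).writerows(rows)

    def flush_async(self):
        # Swap the buffer out on the calling thread: the writer thread then owns
        # its rows, and collect() keeps appending to a fresh list.
        rows, self._frames = self._frames, []
        t = threading.Thread(target=self._write, args=(rows, self._batch))
        self._batch += 1
        self._threads.append(t)
        t.start()

    def close(self):
        # Wait for every pending write before the program exits.
        for t in self._threads:
            t.join()
        self._threads.clear()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dc = AsyncToyCollector(tmp)
        for step in range(100):
            dc.collect(step, [1, 2, 3])
            dc.flush_async()
        dc.close()
        print(len(list(Path(tmp).glob("*.csv"))))  # one file per flush_async() call
```

Each snapshot stays in memory until its thread has finished writing, so a design like this trades memory for not blocking the model loop.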

quaquel Aug 26, 2025
Maintainer

Thanks for the great clarification. It makes perfect sense that the async flush works better in the case of individual runs. However, I am not sure what happens if you have, say, 4 cores and run 4 models in parallel. But then, I also don't know how Polars uses CPU resources. For traditional Mesa, however, it would be worth testing this additional scenario as well.
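
A rough sketch of how the 4-models-on-4-cores scenario could be timed, assuming one process per model. `run_one` and `time_runs` are hypothetical names, and `run_one` is a pure-Python toy wealth-exchange loop rather than a mesa-frames or Mesa model, so it says nothing yet about how async flushing behaves under contention.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def run_one(seed, n_agents=20_000, n_steps=20):
    """Hypothetical stand-in for one model run; returns total wealth as a sanity check."""
    wealth = [1] * n_agents
    state = seed
    for _ in range(n_steps):
        for i in range(n_agents):
            # Cheap deterministic pseudo-random partner choice (linear congruential step).
            state = (state * 1103515245 + 12345) % 2**31
            j = state % n_agents
            if wealth[i] > 0:
                wealth[i] -= 1
                wealth[j] += 1
    return sum(wealth)


def time_runs(n_models=4, workers=4):
    """Time n_models runs back to back, then the same runs spread over `workers` processes."""
    start = time.perf_counter()
    sequential = [run_one(seed) for seed in range(n_models)]
    t_seq = time.perf_counter() - start

    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parallel = list(pool.map(run_one, range(n_models)))
    t_par = time.perf_counter() - start

    # The last element checks that both modes produced identical results.
    return t_seq, t_par, sequential == parallel
```

`time_runs()` should be called from under an `if __name__ == "__main__":` guard, since process pools re-import the main module on some platforms. For a Polars-backed model the picture may differ: Polars runs its own thread pool, sized by default to the number of logical cores (configurable through the `POLARS_MAX_THREADS` environment variable), so four such processes could oversubscribe a 4-core machine.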

Ben-geo Aug 27, 2025
Maintainer Author

Oh right! We did not test multiple models in parallel. We will definitely test that in the future as well!

adamamer20 · 2025-08-27T12:43:28Z

adamamer20
Aug 27, 2025
Maintainer

Great job on the comprehensive analysis, Ben!
This is exactly how communication around these benchmarking strategies should look 👍 For now, since our main focus is single-run efficiency, I think this is the right approach. We can always circle back and refine further if we hit other bottlenecks down the road.

0 replies
