discussion for 'a unified theory of garbage collection' (6 november 2025) #602

mercure67 · 2025-10-30T23:55:59Z

mercure67
Oct 30, 2025

discussion thread for: A unified theory of garbage collection
discussion leads: helen ge, Jeffrey Qian

az275 · 2025-11-05T18:06:32Z

az275
Nov 5, 2025

Critique

I found the central thesis of this paper - that tracing and reference counting are duals, and that this enables a framework for analyzing the tradeoffs between GC design decisions - to be quite intuitive and accessible. The survey of GC algorithms lends concreteness to the observations. I thought there was sufficient attention given to the problem of cycle collection, and the fix-point formulation was pretty cool. My main criticism of this paper is that the cost analysis section was pretty meaningless to me; maybe I'm just not used to reading things that only present a framework for cost modeling instead of experimental evaluation, but I had a hard time translating from the formulas presented to an actual understanding of the tradeoffs between different strategies. There were also assumptions/oversimplifications made, such as ignoring memory fragmentation, which (while necessary for a contained theoretical discussion) are very real problems in practice.

Questions

How to empirically evaluate GC algorithms? Obviously performance is workload dependent and GCs are tailored to workloads, but what should we measure? What would need to go into designing a benchmark suite (and what are some examples of ones that are used today)?

1 reply

jeffreyqdd Nov 6, 2025

Maybe it's just me, but I thought the paper was quite poetic, claiming that tracing and reference counting are "duals" that operate on "matter" (live objects) and "anti-matter" (dead objects) respectively. That aside, I also would have liked to see the paper take a more empirical perspective.

SyphonArch · 2025-11-05T22:24:14Z

SyphonArch
Nov 5, 2025

Critique:
I like how the paper brings together different garbage collection ideas under one framework. It makes the similarities between tracing and reference counting much clearer, and the cost model gives a structured way to think about trade-offs. Still, I'm not sure how much this neat theory helps with the messy details of real systems, like caching effects or when to trigger collection. I also wonder if the "everything is a hybrid" message holds up with modern collectors that use regions or mix moving and non-moving strategies. The main idea feels elegant, but I'm less sure how directly it applies in practice.

Discussion questions:

Does the tracing/reference-counting comparison still feel useful for today's garbage collectors, or has it become more of a historical framing?
If we think of modern systems as hybrids, what are the key practical trade-offs that developers actually tune - frequency of collection, pause times, or something else?

1 reply

jeffreyqdd Nov 6, 2025

I think the paper does a really good job of highlighting the strengths of each garbage collection approach and ultimately showing that borrowing ideas from the “dual” system can lead to more efficient designs. I think reference counting systems still have a place in today's compiler infrastructure, but we must think carefully about how to blend them into the GC/language design.

It's also worth noting that languages like Rust offer reference counting through their Rc and Arc types, showing that even in a language that resolves most memory management statically, runtime concepts are still useful.

Mond45 · 2025-11-05T22:45:59Z

Mond45
Nov 5, 2025

Critique:

I find the unification of tracing and reference-counting-based garbage collection to be quite intuitive, since the two naturally complement each other. For instance, tracing can handle cycles that RC cannot, and it makes sense that real high-performance GCs combine both approaches. The theoretical formulation of the unification, however, feels somewhat forced in that the RC algorithm is defined with buffered decrement operations rather than immediate ones, which doesn't align with how we typically think of simple RC.

That said, I appreciate the fix-point formulation and the corresponding work-list algorithm (which, admittedly, is very common in compiler implementations). The result that tracing and RC compute the least and greatest fix-points respectively is elegant, though the practical implications of this duality remain unclear.

As for the cost analysis, the model seems to rely on strong assumptions, namely the lack of fragmentation and steady-state application behavior. I would be interested to see how well the cost model aligns with real program benchmarks, and how the analysis might change if those assumptions were removed. Still, the paper presents a comprehensive survey of GC techniques and serves as a valuable reference for understanding GC design.

Discussion questions:

How would the cost analysis model change if the steady-state assumption were removed? How realistic is that assumption for real applications?
How could fragmentation be incorporated into the cost model?
How well does the cost model predict real benchmark behavior across different workloads and collector implementations?

1 reply

jeffreyqdd Nov 6, 2025

I have the same concerns as you about the cost model. Assuming no memory fragmentation seems too strong to me, since it removes one of the main advantages of compacting or copying collectors.

jku20 · 2025-11-05T23:57:32Z

jku20
Nov 5, 2025

Critique: In sections 3 and 4, the authors make a strong argument for the similarity of reference counting and tracing algorithms, one working from an overapproximation of liveness and the other working from an underapproximation. The authors then use these two techniques (along with many other technical tricks) to build various garbage collectors. However, by sections 5 and 6, I feel like the argument that garbage collectors can all be modeled by these various notions of approximations of liveness weakens. This is especially true of section 6.3 on the train algorithm: the interpretation of each car (and train) as a macro-node with references to other cars and trains is reasonable at a high level, but at the same time, the algorithm's procedure itself seemed to diverge heavily from the examples provided in section 4. In part this might be because the algorithm simply is way more complex, but I didn't see a concrete way to break it down into parts which directly resembled those of the algorithms presented in sections 3 and 4.

Discussion Questions: Much of what the authors were doing felt like describing how various garbage collection algorithms could be built out of smaller uniform building blocks (e.g. tracing and rc collectors, macro nodes, remembered sets). Is there a concrete, small set of these algorithms and techniques which can be given a consistent interface and modularly put together to create the different garbage collectors described? In another phrasing, is it possible to make the arguments the authors make about how various garbage collectors can be interpreted using their theory more precise by constructing a set of primitives from the authors' theory to make a powerful "build your own garbage collector" language?

1 reply

mercure67 Nov 6, 2025
Author

the modularity aspect is pretty interesting... but i wonder if the similarities that the authors notice can be translated to implementation. i'm not really sure about that.

YoruCathy · 2025-11-06T01:00:57Z

YoruCathy
Nov 6, 2025

Critique:
The paper compellingly shows that tracing and reference counting are algorithmic duals, offering a unified perspective on garbage collection. However, it focuses more on theory than on empirical evaluation, leaving practical performance trade-offs underexplored.

Discussion Question:
How could the duality between tracing and reference counting inspire adaptive garbage collectors that switch strategies at runtime?

1 reply

jeffreyqdd Nov 6, 2025

This is an interesting question. Some modern runtimes do combine RC and tracing; Python (CPython) is one example. CPython tracks how many references point to each object in a field called ob_refcnt, and its gc module uses a generational tracing-based collector to reap objects caught in cycles. I'm interested in how tuning the heuristics for which "mode" to use would affect program performance.
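To make the Python point concrete, here is a minimal sketch, assuming CPython (other Python implementations manage memory differently). The names Node, extra_reference_delta, and cycle_survives_refcounting are made up for illustration; sys.getrefcount, gc.disable, gc.collect, and weakref.ref are standard-library calls. It shows the count changing when an alias is created, and a two-object cycle that reference counting alone never frees but the gc module's cycle detector does.

```python
import gc
import sys
import weakref


class Node:
    """Hypothetical tiny object that can point at another Node."""

    def __init__(self):
        self.other = None


def extra_reference_delta():
    """Return how much the reported count grows when one alias is added."""
    x = []
    before = sys.getrefcount(x)
    alias = x                      # one more strong reference to the same list
    after = sys.getrefcount(x)
    del alias
    return after - before


def cycle_survives_refcounting():
    """Return (alive after RC only, alive after an explicit gc.collect())."""
    gc.disable()                   # pause the cycle detector; RC keeps running
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a    # two-object cycle
        probe = weakref.ref(a)     # a weak reference does not keep `a` alive
        del a, b                   # drop the only external references
        alive_after_rc = probe() is not None   # counts never reach zero
        gc.collect()               # run the tracing-style cycle detector
        alive_after_gc = probe() is not None
        return alive_after_rc, alive_after_gc
    finally:
        gc.enable()
```

Under these assumptions, cycle_survives_refcounting() reports that the cycle is still alive after the references are dropped and gone after gc.collect(), which is the "RC plus backup tracing" hybrid in miniature.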

I think ultimately, the strategy differs for each program, and the system administrator would need to weigh tradeoffs for each workload distribution.

maheshejs · 2025-11-06T01:06:57Z

maheshejs
Nov 6, 2025

Critique
This was an interesting read, giving a unified model of garbage collection that systematically explains and explores the design space of collectors. The paper is comprehensive (I'm not sure I grasped all the details), as it categorizes high-performance garbage collectors and provides cost analyses that guide a more principled selection of designs. I found the qualitative comparison between the two main approaches, reference counting and tracing, very insightful, and it struck me that the two are algorithmic duals of each other.

Question
Most modern languages rely on automatic, dynamic memory management. How does this unified model of garbage collection inform language design when choosing an appropriate collector? What key characteristics should collectors for general-purpose languages have?

1 reply

jeffreyqdd Nov 6, 2025

I think the unified model encourages language designers to consider the trade-offs of each collector "flavor". For example, if we had a language for latency-sensitive environments, we might prefer more predictable collector runs over tracing collectors that "amortize" their cost over many allocations.

adnan-armouti · 2025-11-06T01:49:18Z

adnan-armouti
Nov 6, 2025

Critique: 
This paper presents a unifying framework that treats tracing and reference counting as duals, i.e., tracing grows from an under-approximation to the least fix-point, while RC shrinks from an over-approximation to the greatest fix-point. The gap in between is cycles. Based on this, collectors are “hybrids” that separate “who traces vs. who counts” responsibilities across storage divisions. While this framing is intuitive, I can’t help but agree with my classmates: it feels as though the cost model abstracts away fragmentation and barrier/cache effects, so you’re left wondering about the practical implications/applications of this work in real systems.

Question(s):

Is there (or can there be) a practical decision rule for when a design should lean “more tracing” vs. “more counting”? Could this be adaptive at runtime? What evaluation measures should that decision be based on?
The paper provides two strategies for cycles in hybrids: a backup global trace vs. trial deletion. Given that this paper was published over two decades ago, I’m wondering what modern systems do? Do they mix both, or bias towards one under certain conditions?

1 reply

mercure67 Nov 6, 2025
Author

the paper is ~20 years old, so one does wonder whether advancements in GC have invalidated the authors' theory.

the bit about a practical decision rule is in line with what a lot of other people are saying, it merits discussing this together.

zc579 · 2025-11-06T01:57:26Z

zc579
Nov 6, 2025

Critique
This paper presents a mathematically rigorous formalization of garbage collection. It succeeds in describing diverse GC algorithms within a single algebraic model, but it offers little concrete guidance for actual system implementers. Key practical concerns (write-barrier optimizations, heap fragmentation, synchronization overhead in concurrent environments, and tuning for latency versus throughput) are abstracted away. Thus, I would argue that the paper's main problem is that it is overly theoretical and detached from engineering practice.
Question
Can this unified theory effectively guide the design of modern garbage collectors used in systems such as the JVM, CLR, or Rust?

0 replies

tf-mac · 2025-11-06T03:18:26Z

tf-mac
Nov 6, 2025

Critique:
The paper offers an interesting perspective on tracing and reference counting: they are closely linked algorithms, each dealing with one of two types of garbage. The authors therefore conclude that any optimal garbage collection algorithm must be a mix of the two. My critique concerns their cost models: they offer several formulas, but make no significant use of them beyond abstract comparison of algorithms. Also, if any optimal garbage collection model must be a mix of tracing and reference counting, could there not be a third algorithm? How are they sure this covers the space of algorithms?

Question
How do they derive their cost models, how do they leave out relevant coefficients, and what are some immediate implications of these models?

0 replies

natetyoung · 2025-11-06T03:46:33Z

natetyoung
Nov 6, 2025

Critique
This paper presents a unified view of garbage collection, arguing that every approach is fundamentally about graph traversal algorithms to maintain "counts of references by live objects" or some proxy of them: tracing starts with an underestimate and traverses forward, incrementing, from roots (i.e. objects known to be live); traditional reference counting starts with an overestimate and traverses forward, decrementing, from objects known to be dead. The paper then goes on to describe how a huge number of garbage collection algorithms are some hybrid of these approaches, with variations in memory region or granularity, and characterizes their costs.
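The two traversals described above can be sketched on a toy object graph. This is a simplified illustration rather than the paper's pseudocode: the graph, the node names, and the functions tracing_counts and rc_counts are all hypothetical. Tracing starts every count at zero and increments forward from the roots; reference counting starts from the full in-edge counts and decrements forward from objects whose count is zero.

```python
from collections import deque

# Hypothetical heap: a -> b is live, c <-> d is a garbage cycle,
# e -> b is acyclic garbage that still points at a live object.
GRAPH = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": ["b"]}
ROOTS = ["a"]


def tracing_counts(graph, roots):
    """Start from an underestimate (all zeros) and grow it from the roots."""
    counts = {node: 0 for node in graph}
    work = deque()
    for root in roots:
        counts[root] += 1
        if counts[root] == 1:          # first time this object is reached
            work.append(root)
    while work:
        node = work.popleft()
        for succ in graph[node]:
            counts[succ] += 1
            if counts[succ] == 1:
                work.append(succ)
    return counts


def rc_counts(graph, roots):
    """Start from an overestimate (every in-edge) and shrink it from dead objects."""
    counts = {node: 0 for node in graph}
    for root in roots:
        counts[root] += 1
    for succs in graph.values():
        for succ in succs:
            counts[succ] += 1
    work = deque(node for node, count in counts.items() if count == 0)
    while work:
        node = work.popleft()          # known dead: retract its outgoing edges
        for succ in graph[node]:
            counts[succ] -= 1
            if counts[succ] == 0:
                work.append(succ)
    return counts
```

On this graph both functions agree on a, b, and e, but rc_counts leaves c and d with a count of 1 while tracing_counts leaves them at 0, so the only disagreement is the cyclic garbage.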

Question
Does their "fixpoint" formulation of reference counting imply some other way of doing garbage collection? Is there a slightly different formulation ("number of paths from the roots to each node" instead of "number of in-edges from live nodes", perhaps) that would induce a different novel algorithm? In particular, is there some view of a proxy of reference counts which can be found through sparse linear system solving (and would this be useful in any way)?

1 reply

mercure67 Nov 6, 2025
Author

while the framework presented is pretty general, i also found myself wondering if it is still too specific; and if, by describing GC algorithms as part of their framework, they foreclose thinking of alternatives which are outside of it!

pedropontesgarcia · 2025-11-06T04:23:07Z

pedropontesgarcia
Nov 6, 2025

Critique: Wow, this is a super cool and mathy paper. I really appreciate the formalization of garbage collection, and one more time (surprise!) it turns out that different looking "things" in CS are mathematically the same (like recursion and iteration, or the lambda calculus and the Turing machine). It is compelling that most garbage collectors are hybrids drawing from either side, and I found the cost analysis framework interesting and applicable.

Question: Did this unified theory ever help in any quantifiable way? As a mathy person, I find it very cool either way, but I'm curious if, especially coming from IBM, there were any interesting applications of this model to garbage collector design.

1 reply

CynyuS Nov 6, 2025

It's actually so cool that most concepts in CS boil down to the same core algorithm - and I too enjoyed the duality feature of this paper! My main critique of this paper is that they don't really evaluate this duality in some quantifiable way, and I think that it would be super cool to find out which workloads or benchmarks actually exhibit this duality.

magg1egao · 2025-11-06T05:55:31Z

magg1egao
Nov 6, 2025

Critique:
This paper discusses a unified theoretical framework for understanding tracing and reference counting garbage collectors, which are traditionally seen as fundamentally different. I found the paper quite insightful, as the authors are essentially trying to reframe decades' worth of garbage collection research under one conceptual umbrella. I think the paper really did show that all efficient garbage collectors are hybrids, mixing tracing and reference counting. For example, a generational collector uses a write barrier (which is a reference counting idea) to track old objects pointing to new ones, and then traces the new generation. The uniform cost model is also quite nice, since it can quantify the trade-offs between hybrid collectors. However, the model seems to assume steady-state behavior, which feels limiting in relation to modern, highly dynamic workloads. Although I know this paper’s primary goal was just to introduce this theory, I think it would have been nice if there was a little more discussion about the practicality and adaptability of this topic.
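The generational example can be sketched as follows. This is a toy model under simplifying assumptions, not any real collector: Obj, Heap, write, minor_collect, and demo_minor_collection are hypothetical names, every survivor is promoted after one minor collection, and the old generation is never collected. The write barrier plays the counting role by recording old-to-young pointers in a remembered set, and the minor collection plays the tracing role over the young generation only.

```python
class Obj:
    """Hypothetical heap object with a generation flag and pointer fields."""

    def __init__(self, name):
        self.name = name
        self.old = False
        self.fields = []


class Heap:
    def __init__(self):
        self.young = set()
        self.remembered = set()    # old objects that may point into the young gen

    def alloc(self, name):
        obj = Obj(name)
        self.young.add(obj)
        return obj

    def write(self, src, dst):
        # Write barrier: record old -> young pointers instead of scanning
        # the whole old generation at collection time.
        if src.old and not dst.old:
            self.remembered.add(src)
        src.fields.append(dst)

    def minor_collect(self, roots):
        # Trace only the young generation, starting from the roots and from
        # the young targets of remembered old objects.
        work = [obj for obj in roots if not obj.old]
        for src in self.remembered:
            work.extend(f for f in src.fields if not f.old)
        live = set()
        while work:
            obj = work.pop()
            if obj in live:
                continue
            live.add(obj)
            work.extend(f for f in obj.fields if not f.old)
        dead = self.young - live
        for obj in live:
            obj.old = True         # promote every survivor (simplification)
        self.young = set()
        self.remembered.clear()    # no young objects remain to be pointed at
        return sorted(obj.name for obj in dead)


def demo_minor_collection():
    heap = Heap()
    parent = heap.alloc("parent")
    heap.minor_collect(roots=[parent])      # parent survives and is promoted
    kept = heap.alloc("y1")
    lost = heap.alloc("y2")                 # never stored anywhere reachable
    heap.write(parent, kept)                # barrier remembers `parent`
    return heap.minor_collect(roots=[parent])
```

In the demo, y1 is reachable only through the old object, so it survives solely because the barrier recorded that pointer, while y2 is reclaimed.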

Question:
If, as the paper claims, tracing and reference counting are duals, does that imply an “optimal” collector could continuously transition between the two different modes depending on the runtime conditions? Or if one were to put this hybridization into practice, are there inherent structural limits that would prevent this?

0 replies

Jacqueline-Wen · 2025-11-06T07:48:14Z

Jacqueline-Wen
Nov 6, 2025

Critique

This paper provides a new perspective on tracing and reference counting, two garbage collection techniques that are usually seen as inherently different: it argues that they are duals of each other. The authors were able to show that all high-performance garbage collectors are actually hybrids of both tracing and reference counting. I also appreciated how thorough the paper was. However, while duality is elegant in theory, there are practical differences in performance between the duals (such as differences in locality, cache behavior, and pause times). I’m not sure whether the researchers fully accounted for that in the paper.

Questions

How would the theory of duality in this paper mesh with the practicality of parallelism?
Were the insights of this paper used to build more performant garbage collectors?

0 replies

Smubge · 2025-11-06T12:40:12Z

Smubge
Nov 6, 2025

Critique

I thought it was very cool that they took two different garbage collection techniques and shed new light on them. The paper, while very thorough overall, was also written in a way that one could read through it and understand it. I wonder what the practical value of this paper is. I am very interested in the impact of this paper on modern garbage collectors, and whether it significantly influenced the techniques used today.

Questions

What is the impact of this paper on modern-day garbage collectors?
Something I noticed was that the authors specifically defined garbage collection as a runtime process of "storage reclamation" (p. 50). Languages such as Rust use compile-time ownership/borrowing, which makes me question whether this paper could be generalized to benefit these languages.

0 replies

SerenaYZhang · 2025-11-10T19:59:48Z

SerenaYZhang
Nov 10, 2025

Critique

I thought it was really clever for the authors of the paper to connect two different garbage collection techniques and realize that they are duals of each other. I also found it cool that they showed that all high-performance collectors are actually hybrids combining both techniques, explaining why optimized implementations of both approaches converge toward similar performance characteristics.
However, one critique I have with the paper is with the cost model. While comprehensive, the cost model depends on parameters that can be difficult to measure accurately in real-world systems, potentially limiting its immediate practical applicability.

Questions

Did this paper's findings influence or inspire any aspect of future garbage collectors?
I wonder what other algorithms/techniques have been found to be duals of each other in computer science history.

0 replies

SolidLao · 2025-11-11T15:40:48Z

SolidLao
Nov 11, 2025

Critique

This is a very interesting paper. It shows that two algorithms for garbage collection (tracing and reference counting), which used to be viewed as fundamentally different, are actually duals of each other. Moreover, different collectors are in fact hybrids of tracing and reference counting. I did not know this before, and this makes the paper very interesting to me.

Questions

It seems like there would be different performance trade-offs for different collectors, depending on whether they are closer to tracing, closer to reference counting, or somewhere in between. There is some cost modeling and discussion in the paper, but I would expect some clear conclusions supported by experiments. Given these different performance trade-offs, it would also be interesting to propose strategies for deciding which collector is more suitable for a given scenario and target workload.

0 replies

Replies: 16 comments · 9 replies
