object_tracer may not be the right parameter name in scan_object_and_trace_edges #1445

@wks

TL;DR:

  1. This is not urgent. It may just need a name change.
  2. But if we admit both Scanning::scan_object and Scanning::scan_object_and_trace_edges visit edges, we may consider merging them.

Edge, or node?

If we think carefully, both Scanning::scan_object and Scanning::scan_object_and_trace_edges identify edges in the object graph. Their difference is merely how the edges are represented and how the edges are processed.

  • Scanning::scan_object calls back to SlotVisitor::visit_slot and passes a Slot instance.
  • Scanning::scan_object_and_trace_edges calls back to ObjectTracer::trace_object and passes an ObjectReference.

At first glance, it looks like the former visits an edge, and the latter visits a node. For our existing GC algorithms, ObjectTracer::trace_object actually calls ProcessEdgesWork::trace_object, which in turn forwards to the trace_object method (and its variants, such as trace_object_with_opportunistic_copy) of spaces. This further strengthens the impression that "trace_object visits objects".

But it actually does not. Both Scanning::scan_object and Scanning::scan_object_and_trace_edges visit edges, and the only difference is that the edges are represented differently. Inside Scanning::scan_object_and_trace_edges, the VM binding calls object_tracer.trace_object(target) to tell the MMTk core that there is an edge from the current object to the target object. Its return value is an ObjectReference, and it tells the VM binding how to update the field. That is still about edges.
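
To make the comparison concrete, here is a minimal sketch of the two callback shapes. The types below are simplified stand-ins written for illustration, not the real mmtk-core definitions: `ToyObject` and `ToyForwarder` are hypothetical, a slot is modelled as a plain `&mut ObjectReference`, and the `tls` argument is omitted. Under those assumptions, both scanning functions report the same edges and leave the fields in the same state.

```rust
// Simplified stand-ins for illustration only; not the real mmtk-core types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

/// Slot-style visitor: receives the location of a reference field.
pub trait SlotVisitor {
    fn visit_slot(&mut self, slot: &mut ObjectReference);
}

/// Value-style visitor: receives the field's value and returns the value
/// that the VM binding should store back into the field.
pub trait ObjectTracer {
    fn trace_object(&mut self, target: ObjectReference) -> ObjectReference;
}

/// Hypothetical object with two reference fields.
pub struct ToyObject {
    pub fields: [ObjectReference; 2],
}

/// Reports each edge as a slot; the visitor updates the slot itself.
pub fn scan_object(object: &mut ToyObject, visitor: &mut impl SlotVisitor) {
    for slot in object.fields.iter_mut() {
        visitor.visit_slot(slot);
    }
}

/// Reports each edge as its target; the binding stores the returned value.
pub fn scan_object_and_trace_edges(object: &mut ToyObject, tracer: &mut impl ObjectTracer) {
    for field in object.fields.iter_mut() {
        let target = *field;
        let new_target = tracer.trace_object(target);
        *field = new_target;
    }
}

/// Hypothetical collector that "forwards" every target by a fixed offset
/// and counts how many edges it was told about.
pub struct ToyForwarder {
    pub edges_seen: usize,
}

impl ToyForwarder {
    fn forward(&mut self, target: ObjectReference) -> ObjectReference {
        self.edges_seen += 1;
        ObjectReference(target.0 + 0x1000)
    }
}

impl SlotVisitor for ToyForwarder {
    fn visit_slot(&mut self, slot: &mut ObjectReference) {
        *slot = self.forward(*slot);
    }
}

impl ObjectTracer for ToyForwarder {
    fn trace_object(&mut self, target: ObjectReference) -> ObjectReference {
        self.forward(target)
    }
}
```

In this sketch the only difference is who performs the load and the store: the visitor (slot style) or the VM binding (value style). Either way the callback is invoked once per edge.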

For this reason, I think the object_tracer in the function Scanning::scan_object_and_trace_edges(tls, object, object_tracer) is a misnomer. It doesn't trace an object; it visits an edge, albeit one represented as an ObjectReference.

It's all trace_object's fault, or is it?

Back in #628, I introduced the Scanning::scan_object_and_trace_edges method and the ObjectTracer trait to support the so-called "node-enqueuing tracing" which is necessary for CRuby. In that PR, I named the trait ObjectTracer simply because its instance provides the trace_object method. An ObjectTracer provides the trace_object method. How simple that is!

But if the trace_object method is really about edges, why was it called trace_object in the first place? The name trace_object exists in all spaces, and some spaces have multiple variants of it, such as ImmixSpace::trace_object_without_moving and ImmixSpace::trace_object_with_opportunistic_copy. Its origin may be traced all the way back to JikesRVM.

MarkSweepSpace.traceObject has had that name since its first version, written by @steveblackburn 22 years ago. Its documentation says:

  /**
   * Trace a reference to an object under a mark sweep collection
   * policy.  If the object header is not already marked, mark the
   * object in either the bitmap or by moving it off the treadmill,
   * and enqueue the object for subsequent processing. The object is
   * marked as (an atomic) side-effect of checking whether already
   * marked.
   *
   * @param object The object to be traced.
   * @return The object (there is no object forwarding in this
   * collector, so we always return the same object: this could be a
   * void method but for compliance to a more general interface).
   */
  public final VM_Address traceObject(VM_Address object, byte tag)
    throws VM_PragmaInline {
    if (MarkSweepHeader.testAndMark(object, markState)) {
      MarkSweepLocal.internalMarkObject(object, tag);
      VM_Interface.getPlan().enqueue(object);
    }
    return object;
  }

It says "trace a reference to an object ..." This acknowledges that it visits an edge (represented as a reference to an object) rather than the object itself.

But CopySpace.traceObject was originally named CopySpace.forwardObject:

  /**
   * Forward an object.
   *
   * @param object The object to be forwarded.
   * @return The forwarded object.
   */
  public static ObjectReference forwardObject(ObjectReference object) 

... before it was renamed to traceObject in this commit:

  /**
   * Trace an object under a copying collection policy.
   * If the object is already copied, the copy is returned.
   * Otherwise, a copy is created and returned.
   * In either case, the object will be marked on return.
   *
   * @param trace The trace being conducted.
   * @param object The object to be forwarded.
   * @return The forwarded object.
   */
  public ObjectReference traceObject(TraceLocal trace, ObjectReference object)

But this time it says "trace an object under ...", as if it works with the object itself rather than an edge, which contradicts the documentation of MarkSweepSpace.traceObject.

Does it matter?

  • From a Space's point of view, it works with an object, marking it or forwarding it, and returns where the object is moved to (if moved at all). It doesn't need to care about edges at all. The transitive closure will take care of all edges.
  • But from Scanning::scan_object_and_trace_edges's point of view, it is visiting edges. During transitive closure, visiting an edge happens to have the side effects of marking or forwarding the target object and updating the field.

So I come to the conclusion that

  • Marking and/or forwarding an object is about the object.
  • Returning the new address of the object is also about the object.
  • Updating a field is about the edge.

and

  • Space::trace_object is about objects, but
  • Scanning::scan_object_and_trace_edges is about edges.
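
The split can be sketched as follows. This is a toy model under stated assumptions, not mmtk-core code: `ToyCopySpace`, `forwarded_count`, `visit_edge`, the forwarding table, and the fixed 16-byte object size are all hypothetical. The space-level `trace_object` only knows about objects, while the edge-level helper is the only place that touches a field.

```rust
use std::collections::HashMap;

// Simplified stand-in for illustration only; not the real mmtk-core type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Hypothetical copying space. It knows about objects and where they were
/// moved to, but nothing about the fields that point to them.
pub struct ToyCopySpace {
    forwarding: HashMap<ObjectReference, ObjectReference>,
    next_free: usize,
}

impl ToyCopySpace {
    pub fn new(to_space_start: usize) -> Self {
        ToyCopySpace {
            forwarding: HashMap::new(),
            next_free: to_space_start,
        }
    }

    /// Object-level operation: forward the object if it has not been
    /// forwarded yet, and return its (possibly new) address.
    pub fn trace_object(&mut self, object: ObjectReference) -> ObjectReference {
        if let Some(&new_object) = self.forwarding.get(&object) {
            return new_object;
        }
        let new_object = ObjectReference(self.next_free);
        self.next_free += 16; // assumed fixed object size, for the toy model
        self.forwarding.insert(object, new_object);
        new_object
    }

    /// Number of distinct objects forwarded so far.
    pub fn forwarded_count(&self) -> usize {
        self.forwarding.len()
    }
}

/// Edge-level operation: load the field, ask the space about the target
/// object, and store the result back into the field.
pub fn visit_edge(space: &mut ToyCopySpace, field: &mut ObjectReference) {
    let target = *field;
    let new_target = space.trace_object(target);
    *field = new_target;
}
```

If two fields point to the same object, `visit_edge` runs twice (two edges) but the object is forwarded only once (one object), which is the distinction drawn above.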

What should we do?

Maybe just rename the parameter object_tracer and the trait ObjectTracer, because what they actually do is visit edges.

Should we merge scan_object and scan_object_and_trace_edges?

Both scan_object and scan_object_and_trace_edges visit edges, so why does one mention _and_trace_edges while the other one doesn't? They just visit edges in different ways. Some VMs can only do it one way, while others can do it the other way, or both ways.

It may be helpful for transitive closure at least. In other use cases, such as write barriers, we may just let it enumerate edges in the form of ObjectReference as scan_object_and_trace_edges currently does.
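
Purely as a discussion aid, one possible shape for a merged interface is sketched below. Everything in it is hypothetical (`EdgeVisitor`, `visit_edge_by_value`, `visit_edge_by_slot`, `scan_with_slots`, `scan_with_values`, `TargetCollector`); none of these names exist in mmtk-core, and a slot is again modelled as a plain `&mut ObjectReference`. The idea is a single visitor that accepts an edge in either representation, so each VM can report edges whichever way it is able to.

```rust
// Simplified stand-in for illustration only; not the real mmtk-core type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

/// Hypothetical merged visitor: one trait, two representations of an edge.
pub trait EdgeVisitor {
    /// Edge given as its target; returns the value to store in the field.
    fn visit_edge_by_value(&mut self, target: ObjectReference) -> ObjectReference;

    /// Edge given as a slot; by default expressed via the by-value form.
    fn visit_edge_by_slot(&mut self, slot: &mut ObjectReference) {
        *slot = self.visit_edge_by_value(*slot);
    }
}

/// A VM that can expose field locations reports slots.
pub fn scan_with_slots(fields: &mut [ObjectReference], visitor: &mut impl EdgeVisitor) {
    for slot in fields.iter_mut() {
        visitor.visit_edge_by_slot(slot);
    }
}

/// A VM that can only load and store field values reports targets and
/// writes the returned value back itself.
pub fn scan_with_values(fields: &mut [ObjectReference], visitor: &mut impl EdgeVisitor) {
    for i in 0..fields.len() {
        fields[i] = visitor.visit_edge_by_value(fields[i]);
    }
}

/// A non-moving consumer (for example, something a write barrier might
/// use) that only enumerates targets and never changes a field.
pub struct TargetCollector {
    pub targets: Vec<ObjectReference>,
}

impl EdgeVisitor for TargetCollector {
    fn visit_edge_by_value(&mut self, target: ObjectReference) -> ObjectReference {
        self.targets.push(target);
        target
    }
}
```

Whether a real merged API should look like this is an open question; the sketch only shows that one visitor can serve both kinds of VM.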

TODO: We should have had an issue on GitHub or some discussions on Zulip. Add links here.
