Refchecks and array ownership #152
Description
This issue is about how we should handle resizing of numpy arrays in our LGDO objects. Right now, we are using refcheck=False, which means that when the Array capacity changes, the data is copied to a new array and the Array is unlinked from any existing references to the underlying numpy array. In the past we used refcheck=True, which is much nicer when we want to use Arrays as wrappers or views for other data structures. However, it can prevent resizing in ways that are sometimes confusing, and it currently breaks the following line in our tests:
wft = WaveformTable(size=10, dt=np.zeros(5), t0=np.zeros(5), values=np.zeros((5, 50)))
It is also worth noting that refcheck=False still occasionally runs into ownership issues (for example when unpickling, which is used for multiprocessing), so using np.resize(ar, shape) is preferable to ar.resize(shape, refcheck=False). For the purposes of thinking about this, I will refer to resizing with refcheck=True as the Array not owning its data, and creating a copy when resizing as the Array owning its data.
Unfortunately, I'm not sure there's a way to handle this that won't create confusing behavior in at least some contexts. We could also add a boolean member for tracking ownership and have ways to control it; however, we would then have to answer questions about default behaviors. Note that resizing currently only happens in Array, and all the other collection objects either inherit from or contain Arrays, but we'd still have to figure out how to propagate ownership (e.g., does a Table have its own ownership, or is it delegated to each column?). The way I see it, here are our choices:
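To make the distinction concrete, here is a minimal sketch using plain NumPy arrays (the names `base`, `view`, and `grown` are only for illustration and are not part of LGDO). It shows an in-place resize being refused while a view exists, a resize with refcheck=False still failing on an array that does not own its data, and np.resize returning a separate array instead. One caveat worth keeping in mind: np.resize pads by repeating the input, whereas ndarray.resize pads with zeros.

```python
import numpy as np

base = np.arange(5.0)
view = base[:3]  # shares memory with `base` and does not own its data

# 1) With the default refcheck=True, NumPy refuses to resize in place
#    because `view` still references `base`.
try:
    base.resize(10)
    refcheck_blocked = False
except ValueError:
    refcheck_blocked = True

# 2) refcheck=False does not help when the array does not own its data.
try:
    view.resize(10, refcheck=False)
    non_owner_blocked = False
except ValueError:
    non_owner_blocked = True

# 3) np.resize never modifies its input; it builds a new array.
#    Extra elements are filled by repeating the input, not with zeros.
grown = np.resize(view, 10)

print(refcheck_blocked, non_owner_blocked)
print(grown)
print(np.shares_memory(grown, base))
```

After the call to np.resize, `base` and `view` are untouched, which is exactly the "unlinking" that makes wrapper or view semantics hard to preserve.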
1. Do not own memory
   a) Be strict about it (will prevent some code from working)
   b) Provide a way to decouple our object from any references by making new copies
2. Own memory
   a) Create copies if arrays are fed into initialization or a setter (no wrappers/views allowed)
   b) Create copies only when changing capacity, i.e. when growing the array (current behavior; may cause confusion with wrappers/views)
3. Flag to make ownership optional (need to decide between 2a and 2b for when we own things)
   a) Default to always owning (need flag to create wrappers/views)
   b) Default to never owning (need flag if you want to increase size; similar to 1b, only difference is what happens if a view_as exists)
   c) Default to owning only if the array is created by the LGDO object (could get confusing)
Personally, I would lean towards either 1b or 3a, with copies of arrays created from the start unless you explicitly tell it to be non-owning. This would mean that if you want either owning/non-owning behavior you have to be explicit about it.
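As a rough illustration of what 3a combined with 2a could look like, here is a sketch built around a hypothetical `ArraySketch` class. The class, its `owns_data` flag, and the `detach` method are assumptions made up for this example; none of them exist in LGDO today. It assumes the wrapped array has at least one dimension.

```python
import numpy as np


class ArraySketch:
    """Hypothetical stand-in for an LGDO Array; not the real lgdo API."""

    def __init__(self, nda, owns_data=True):
        self.owns_data = owns_data
        if owns_data:
            # 2a: copy on input so no outside reference to our buffer exists
            self.nda = np.array(nda, copy=True)
        else:
            # explicit opt-in to wrapper/view behavior
            self.nda = np.asarray(nda)

    def resize(self, new_len):
        if not self.owns_data:
            # 1a: a non-owning wrapper refuses rather than silently unlinking
            raise ValueError("non-owning array: call detach() before resizing")
        # Allocate a zero-filled buffer and copy over what fits
        new = np.zeros((new_len,) + self.nda.shape[1:], dtype=self.nda.dtype)
        n = min(new_len, len(self.nda))
        new[:n] = self.nda[:n]
        self.nda = new

    def detach(self):
        """1b: decouple from outside references by taking a private copy."""
        self.nda = self.nda.copy()
        self.owns_data = True


src = np.arange(5)

owner = ArraySketch(src)                     # owning by default (3a)
owner.resize(8)

wrapper = ArraySketch(src, owns_data=False)  # explicit non-owning view
wrapper.detach()                             # explicit switch to owning
wrapper.resize(8)
```

With this shape of API, both the copy and the view are the result of an explicit choice by the caller, and a resize never unlinks a wrapper without the caller asking for it. How a Table would propagate such a flag to its columns is left open here.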