Removing the 'ak.singletons' and 'ak.firsts' functions: any complaints? #710
-
The semantics of this (what it's supposed to do) could be better thought out. It was intended for cases in which missing values are present, because it was written to handle the output of ak.argmin and ak.argmax. (Here are some early thoughts about the problem: #203.) The problem is basically this:
>>> import awkward as ak
>>> import numpy as np
>>> # We want to find the max of one_quantity and look at the corresponding quantity in "another."
>>> one_quantity = ak.Array([[1, 2, 3], [], [5, 4]])
>>> another = ak.Array([[1.1, 2.2, 3.3], [], [5.5, 4.4]])
>>> # As of Awkward 1, argmax gives one dimensional output with Nones, not singletons/empties.
>>> # (This is required for conformance with NumPy.)
>>> ak.argmax(one_quantity, axis=1)
<Array [2, None, 0] type='3 * ?int64'>
>>> # Applying that to "another" is not what we want, since it applies to the first dimension.
>>> # It rearranges the lists, rather than picking out the corresponding element from each list.
>>> another[ak.argmax(one_quantity, axis=1)]
<Array [[5.5, 4.4], None, [1.1, 2.2, 3.3]] type='3 * option[var * float64]'>
>>> # ak.singletons gets us to that other format.
>>> ak.singletons(ak.argmax(one_quantity, axis=1))
<Array [[2], [], [0]] type='3 * var * int64'>
>>> # And we get what we want: the corresponding value from each list.
>>> another[ak.singletons(ak.argmax(one_quantity, axis=1))]
<Array [[3.3], [], [5.5]] type='3 * var * float64'>
>>> # Which we then have to pair with ak.firsts to turn it back into flat-with-missing-values.
>>> ak.firsts(another[ak.singletons(ak.argmax(one_quantity, axis=1))])
<Array [3.3, None, 5.5] type='3 * ?float64'>
However, that's non-obvious and verbose. I don't remember where I first saw it, but some users of Awkward found a much simpler solution:
>>> # The 'keepdims' argument was included for NumPy compatibility...
>>> ak.argmax(one_quantity, axis=1, keepdims=True)
<Array [[2], [None], [0]] type='3 * var * ?int64'>
>>> # But it gives us nearly what ak.singletons does, which simplifies the max-by-another problem.
>>> another[ak.argmax(one_quantity, axis=1, keepdims=True)]
<Array [[3.3], [None], [5.5]] type='3 * var * ?float64'>
>>> # And then we can do the "ak.firsts" thing with a simple slice.
>>> another[ak.argmax(one_quantity, axis=1, keepdims=True)][:, 0]
<Array [3.3, None, 5.5] type='3 * ?float64'>
So in truth, For instance, I wonder if you can get what you want by turning a regular axis created with
>>> vals = ak.Array([[43, 15, 10.5], [11.5], [50, 5]])
>>> idx = ak.Array([2, 0, 1])
>>> # np.newaxis makes a new axis with length 1.
>>> idx[:, np.newaxis]
<Array [[2], [0], [1]] type='3 * 1 * int64'>
>>> # But we're about to do a jagged slice, so turn the length-1 dimension into a variable-length one.
>>> ak.from_regular(idx[:, np.newaxis])
<Array [[2], [0], [1]] type='3 * var * int64'>
>>> # Use this to slice "vals". It picks out the nth item from each list.
>>> vals[ak.from_regular(idx[:, np.newaxis])]
<Array [[10.5], [11.5], [5]] type='3 * var * float64'>
>>> # It works exactly the same way if you have any missing data.
>>> idx = ak.Array([2, None, 1])
>>> vals[ak.from_regular(idx[:, np.newaxis])]
<Array [[10.5], [None], [5]] type='3 * var * ?float64'>
>>> # And that shape is exactly what you want if you're going to be removing this dimension.
>>> vals[ak.from_regular(idx[:, np.newaxis])][:, 0]
<Array [10.5, None, 5] type='3 * ?float64'>
So maybe
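To make the target of these slices concrete, here is a minimal pure-Python sketch of the result they compute. It models the semantics on plain nested lists and says nothing about how Awkward Array implements them. The names `argmax_per_list` and `pick_per_list` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Optional, Sequence


def argmax_per_list(lists: Sequence[Sequence[float]]) -> List[Optional[int]]:
    """Position of the largest item in each list, or None for an empty list.

    Ties go to the first occurrence, as in NumPy's argmax.
    """
    return [
        max(range(len(row)), key=row.__getitem__) if len(row) > 0 else None
        for row in lists
    ]


def pick_per_list(lists, indices):
    """Item at the given position in each list; a None index yields None."""
    if len(lists) != len(indices):
        raise ValueError("need exactly one index per list")
    return [None if i is None else row[i] for row, i in zip(lists, indices)]


# Same data as the sessions above, as plain Python lists.
one_quantity = [[1, 2, 3], [], [5, 4]]
another = [[1.1, 2.2, 3.3], [], [5.5, 4.4]]

best = argmax_per_list(one_quantity)   # [2, None, 0]
picked = pick_per_list(another, best)  # [3.3, None, 5.5]
```

Both the singletons/firsts chain and the keepdims slice arrive at this flat-with-missing-values result; the difference is only in how many steps it takes to express it.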
-
Thanks for the detailed answer! I need to remember to use Maybe
-
But you did raise an important point, that I'll convert this into a Discussion to let others weigh in on it. Now would be a good time to schedule them for removal in 1.2.0 or 1.3.0 (see Roadmap).
-
I currently use
local_nonzero = ak.local_index(is_pid_i)[is_pid_i]
j = ak.firsts(local_nonzero)
I wonder if there's a better way to do this that doesn't require local_index, but I suspect not because somewhere the local indices need to be computed. Clearly, without
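For readers unfamiliar with this idiom, the sketch below models what the two lines compute, using plain Python lists: the position of the first True entry in each list, or None when a list has no True entry. The function name `first_true_index` and the sample mask are hypothetical, and this is a model of the semantics, not of Awkward's implementation.

```python
from typing import List, Optional, Sequence


def first_true_index(mask: Sequence[Sequence[bool]]) -> List[Optional[int]]:
    """Position of the first True in each list, or None if there is none."""
    out: List[Optional[int]] = []
    for row in mask:
        # Models ak.local_index(mask)[mask]: positions of the True entries.
        local_nonzero = [i for i, flag in enumerate(row) if flag]
        # Models ak.firsts: first position, or None when nothing matched.
        out.append(local_nonzero[0] if local_nonzero else None)
    return out


# Hypothetical mask standing in for is_pid_i.
is_pid_i = [[False, True, True], [], [False, False], [True]]
j = first_true_index(is_pid_i)  # [1, None, None, 0]
```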
-
Conclusion: See #1189 for an example of something that wasn't well thought-through.
-
Hi,
I'm using Python 3.7.3 and awkward 1.0.2, and I see that ak.singletons() only adds a dimension when there is a missing element, which means I have two cases to deal with when using one array as an index for another.
The first case (which doesn't give me my desired output) where the first array has inner arrays of size at least 1 (and thus each value in the second index array is defined):
And another case (which does give me my desired output) where the first array has at least one empty inner array (and thus at least one missing element in the indexing array):
Is this intended? If so, is there a more awkward-like way of doing vals[idx]?
Thanks!
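As a point of reference for the question, here is a small pure-Python sketch of the conversion between the two formats discussed in this thread: flat-with-missing-values and lists of length 0 or 1. The functions `singletons` and `firsts` below are stand-ins written for illustration, not the library's code, and wrapping every value even when no None is present is an assumption about the intended behaviour.

```python
def singletons(values):
    """None becomes an empty list; any other value becomes a length-1 list."""
    return [[] if v is None else [v] for v in values]


def firsts(lists):
    """First item of each list, or None for an empty list."""
    return [row[0] if len(row) > 0 else None for row in lists]


idx = [2, None, 0]
wrapped = singletons(idx)     # [[2], [], [0]]
round_trip = firsts(wrapped)  # [2, None, 0]
```

Under this model the extra dimension appears whether or not any element is missing, so a single code path covers both of the cases described above.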