mmore500
diff --git a/‎blog_professional/_posts/2024-01-02-a290255-oeis.md
Lines changed: 158 additions & 0 deletions b/‎blog_professional/_posts/2024-01-02-a290255-oeis.md
Lines changed: 158 additions & 0 deletions
diff --git a/‎resources/2024-01-02-a290255-oeis.png
670 KB b/‎resources/2024-01-02-a290255-oeis.png
670 KB
@@ -0,0 +1,158 @@
+---
+layout: post
+title:  "Closed-form Bit Magic for OEIS A290255"
+date:   2024-01-02
+no_toc: true
+---
+
+Ah, the [On-Line Encyclopedia of Integer Sequences](https://oeis.org/).
+
+![OEIS elmo](/resources/2024-01-02-a290255-oeis.png){:style="width: 100%;"}
+
+The next best thing to magic.
+
+Not only can you usually pick up a nice closed-form expression for whatever numbers you paste in, but more often than not you'll get a peek into your numbers' natural habitat in some strange (and often frightening) realm of mathematics.
+Careful!
+Don't think too hard about why your problem's secret isomorphism to [gray-coded](https://en.wikipedia.org/wiki/Gray_code) [Hadamard matrices](https://en.wikipedia.org/wiki/Hadamard_matrix) lest your mind become boggled.
+Just close that door back up, and you can go about coding in your new [top-tier arcana](https://en.wikipedia.org/wiki/Fast_inverse_square_root).
+
+For those unfamiliar, OEIS is a database of integer sequences.
+You plug in a few terms and get back a list of possible generating sequences with references to the literature and links to related sequences.
+(The New York Times did a nice feature on OEIS a few months back that's [well worth a read](https://www.nytimes.com/2023/05/21/science/math-puzzles-integer-sequences.html).)
+
+Used OEIS today to work some problems related to [hereditary stratigraphy](https://mmore500.com/0001/01/01/hstrat.html), and crossed paths with [A290255](https://oeis.org/A290255).
+
+```python3
+A290255 = [0,1,0,2,1,0,0,3,2,1,1,0,0,0,0,4,3,2,2,1,1,1,1,0,
+ 0,0,0,0,0,0,0,5,4,3,3,2,2,2,2,1,1,1,1,1,1,1,1,0,0,
+ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,5,4,4,3,3,3,3,2,2,2,
+ 2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,
+ 0,0,0,0]
+```
+
+This sequence ended up being the magic ingredient I needed, but OEIS listed no closed-form expression for the sequence.
+
+The description --- though --- gave enough *aha!* to light the way.
+Turns out, this sequence is just the "number of 0's following directly the first 1 in the binary representation of n."
+
+## Implementation
+
+I needed a closed-form expression for my use case, and here's what I came up with.
+
+```python3
+def bit_floor(n: int) -> int:
+    """Calculate the largest power of two not greater than n.
+
+    If zero, returns zero.
+    """
+    mask = 1 << (n >> 1).bit_length()
+    return n & mask
+
+
+def bit_drop_msb(n: int) -> int:
+    """Drop most significant bit from binary representation of integer n."""
+    return n & (~bit_floor(n))
+
+
+def bit_count_immediate_zeros(x: int) -> int:
+    """Count the number of zeros immediately following the first binary digit
+    of x."""
+    # https://oeis.org/A290255
+    return x.bit_length() - bit_drop_msb(x).bit_length() - bool(x)
+```
+
+We can smoosh it all together to get a sufficiently ghastly one-liner.
+
+```python3
+
+This all comes together as
+```python3
+def bit_count_immediate_zeros(n: int) -> int:
+    return n.bit_length() - (n & (~(1 << (n >> 1).bit_length()))).bit_length() - bool(n)
+
+```
+
+Note that if you're using *numpy* and might have *numpy* data types floating around, you'll want to explicitly cast inputs to `int` to get access to Python's built-in bit manipulation methods.
+
+## How Does it Work?
+
+How does this work?
+The trick is dropping the most significant bit.
+This will expose the immediate 0's (i.e., directly following the leading 1), if any, and they'll collapse away in Python's integer representation.
+Afterwards, the next following 1 will be left most significant bit if present.
+We can test how many bits were stripped away by subtracting the new bit length from the old bit length.
+
+Note that we need to account for the leading 1 in the bit length, which is taken care of by the last term.
+We'd like to subtract one from the bit length of the original number, but only if there was a bit there in the first place.
+The `bool(x)` tacked on the end does this for us --- it will evaluate as 1 if `x` is positive and 0 if `x` is zero.
+
+What about `bit_drop_msb` and `bit_floor`?
+
+In `bit_drop_msb`, the call to `bit_floor` isolates the leading one in the binary representation of `n`.
+This leading bit lets us easily generate an inverse mask --- i.e., with all other non-leading bits set to 1.
+Masking the original `n` and keeping only masked bits leaves everything else in place except for the most significant bit.
+
+In `bit_floor`, our goal is to find the largest power of two not greater than `n`.
+This amounts to isolating the most significant bit of `n`.
+A simple way to do this would be counting the number of bits in `n` and then shifting 1 up to that power of two.
+Something like `1 << (n.bit_length() - 1)`.
+Note that we have to subtract one because the 1 we are shifting up is already "shifted" into the first (i.e., 0th) position.
+
+The `n >> 1` in `bit_floor` is a trick for zero-handling.
+If `n` is zero, `n >> 1` will simply nop, and `bit_floor` will return zero.
+Otherwise, we can just shift 1 up to the power of two of `n`'s most significant bit.
+We no longer have to subtract one due to the `n >> 1` downshift.
+
+## Validation
+
+Let's check against the reference sequence entries.
+
+```python3
+>>> import itertools as it
+>>> juxtaposed = zip(
+...    map(bit_count_immediate_zeros, it.count(1)),  # oeis is 1-indexed
+...    A290255,
+... )
+>>> all(it.starmap(int.__eq__, juxtaposed))
+True
+```
+
+Okay, let's do a little more validation by comparing against a string-based implementation.
+
+```python3
+def bit_count_immediate_zeros_str(n: int) -> int:
+    """Count the number of zeros immediately following the first binary digit
+    of x."""
+    return f"{n:b}1"[1:].index("1")  # fstring "1" ensures a trailing 1
+```
+
+We can test our alternate implementation against the reference sequence entries.
+
+```python3
+>>> juxtaposed = zip(
+...    map(bit_count_immediate_zeros_str, it.count(1)),  # oeis is 1-indexed
+...    A290255,
+... )
+>>> all(it.starmap(int.__eq__, juxtaposed))
+True
+```
+
+And then test against one another.
+
+```python3
+>>> test_indices = [
+...     0, 1, 2,
+...     *map(int, np.linspace(0, 2**32, 10**5, dtype=int)),
+...     *map(int, np.geomspace(1, 2**32, 10**5, dtype=int)),
+...     *map(int, np.random.randint(0, 2**32, 10**5)),
+... ]
+>>> juxtaposed = zip(
+...    map(bit_count_immediate_zeros, test_indices),
+...    map(bit_count_immediate_zeros_str, test_indices),
+... )
+>>> all(it.starmap(int.__eq__, juxtaposed))
+True
+```
+
+Yup, everything checks out.
+Both implementations agree over integers sampled up to `2**32`!