Brad Caldwell (@Caldwbr) - 6 m
Nice, so it sounds like you're saying 3D geometry doesn't use receptive-field voxels, but is more just learned. As much as the screen does not seem to be composed of voxels, yet we seem to be able to select any point on it and know where it is relative to any other point. And there's shape inertia to the screen. For example, you may scrub thru several dried oak-leaf screens at 20 Hz, and they are ordered by shape similarity. So, you have a keen sense of 3D spatial proximity that seems to beg for voxel receptive fields.

Death Star Robot 🇺🇸 🇹🇼 🇮🇱 🇺🇦 (@DeathStarRobot)
Well, think of it as a volumetric television with Gaussian splats. In a TV the pixels differ in terms of color phase---red, green, and blue pixels each have a frequency, which is an amount, a delta of the energy given to the display for that pixel. Phase Wave Differential ≈ Splat.
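The volumetric-TV analogy can be made concrete with a toy sketch. Everything below is illustrative and my own construction, not from the sources: each "splat" is a soft 3D Gaussian blob with a center, an extent (covariance), and a color, and the volume's color at any point is a density-weighted mix of nearby splats.

```python
import numpy as np

class Splat:
    """One 3D Gaussian 'splat': a soft blob of color in a volume.
    Toy illustration of the volumetric-TV analogy; not any real API."""
    def __init__(self, mean, cov, color):
        self.mean = np.asarray(mean, float)    # 3D center of the blob
        self.cov = np.asarray(cov, float)      # 3x3 covariance (shape/extent)
        self.color = np.asarray(color, float)  # RGB amplitude

    def density(self, p):
        """Unnormalized Gaussian density of this splat at point p."""
        d = np.asarray(p, float) - self.mean
        return float(np.exp(-0.5 * d @ np.linalg.inv(self.cov) @ d))

def render_point(splats, p):
    """Color at point p: density-weighted sum of all splats' colors.
    There is no voxel grid; the 'pixels' are overlapping soft blobs."""
    return sum(s.density(p) * s.color for s in splats)
```

The point of the sketch is that nothing here is addressed by a fixed grid cell: any point in the volume can be queried, and its appearance falls out of the continuous fields around it.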


That's a great question that gets to the heart of how these concepts depart from traditional computer-graphics ideas like static voxel grids. You're right to point out that our ability to perceive specific points in space, understand their relation to each other, and group shapes by similarity (like scrubbing through leaves) strongly suggests a highly structured internal representation.

Based on the sources and our discussion, here's how those points are addressed within the proposed framework:

  1. Is 3D geometry learned, not voxel-based?

    ◦ Yes, the sources suggest that the brain's representation of reality is a dynamic rendering or simulation, rather than a static, pixel-based image. It's emphasized that it's not like breaking things down into pixels.

    ◦ The concept of spatial volume itself is presented as a learned experience, not an innate feature. The brain develops its understanding of volume through interaction with the environment.

    ◦ Instead of fixed physical voxels, the representation is described as a 3D grid pattern defined by phases & frequencies or a 3D phase field.

    ◦ This internal 3D model is constructed by integrating information from various senses and updating in real time.

  2. How can we select points and know their relative positions without voxels?

    ◦ While not using traditional voxels, the brain does create an internal 3D representation that allows for parsing and processing spatial relationships.

    ◦ This involves processes like 3D semantic segmentation and object segmentation. This is how the brain figures out which points or features belong together as a distinct object.

    ◦ The ability to understand spatial proximity and object boundaries within the rendered 3D environment is part of this segmentation and classification process.

    ◦ The "addressing" isn't a fixed physical location, but rather how patterns and concepts are mapped in a high-dimensional semantic space [previously discussed], mediated by oscillatory timing and phase relationships [previously discussed]. The brain is described as a 3D neural network where temporal data affects itself in every direction, contributing to this spatial understanding.

  3. What about shape inertia and ordering by similarity?

    ◦ The "inertia" or consistency of perceived shapes comes from the brain's ability to learn and remember patterns.

    ◦ These learned patterns, which represent shapes and objects, are stored in the synaptic configurations and morphology of neurons and are activated through specific firing patterns and phase changes.

    ◦ The brain continuously compares incoming sensory patterns with previously learned memories. This comparison of learned features allows for recognizing similarity between different instances of a shape (like various oak leaves).

    ◦ The concept of scale-invariant vector patterns for features like edges suggests that basic components of shapes are encoded in a way that can be recognized regardless of their size, and these can be linked to other vector tokens representing properties like size, depth, and color. This would facilitate sorting by similarity even across different scales or perspectives.

  4. Does this beg for voxel receptive fields?

While our keen sense of 3D space feels like it might require a static, voxel-like structure, the sources propose a different mechanism.

The term "receptive field" is used, but more in the context of how sensor arrays of neurons can magnify their detection capabilities by coordinating with adjacent oscillating neurons.

Neurons' dendrites and synaptic connections are described as encoding learned patterns and context, acting as a sort of localized embedding or vector space that allows them to selectively respond to certain stimuli. This is a learned selectivity for specific patterns rather than a fixed spatial receptive field in a grid.
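The "localized embedding" idea above can be sketched as a unit whose response is simply how well the input matches its learned pattern. This is a deliberately minimal stand-in (cosine similarity against a stored weight vector); the sources don't specify any particular mechanism, so treat the function and its arguments as hypothetical.

```python
import numpy as np

def selectivity(weights, stimulus):
    """Toy 'learned receptive field': a unit's response is the cosine
    similarity between its learned pattern (weights) and the input.
    Selectivity comes from the learned vector, not a grid location."""
    w = np.asarray(weights, float)
    s = np.asarray(stimulus, float)
    return float(w @ s / (np.linalg.norm(w) * np.linalg.norm(s)))

# A unit whose weights encode a diagonal-edge-like pattern responds
# strongly to matching input and not at all to an orthogonal one.
edge = [1.0, 0.0, -1.0, 0.0]
```

Such a unit fires for its pattern wherever that pattern appears, which is the sense in which its "receptive field" is learned selectivity rather than a fixed patch of space.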

The tomographic approach described by NAPOT reconstructs a 3D volume from multiple dynamic "slices" or perspectives (represented by wave cross-sections or pattern projections from neural arrays). This dynamic reconstruction, based on learned patterns and their phase relationships, allows for a sense of continuous 3D space and object perception without needing a fixed voxel grid.
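The flavor of that reconstruction can be shown with ordinary unfiltered back-projection, the simplest tomographic technique: smear each 2D projection back through the volume and sum. This is standard tomography math, not NAPOT's actual mechanism (which the sources describe only qualitatively), and the three axis-aligned projections are my simplifying assumption.

```python
import numpy as np

def backproject(proj_xy, proj_xz, proj_yz):
    """Crude tomographic sketch: recover a rough 3D volume from three
    axis-aligned 2D projections by smearing each back along its lost
    axis and averaging. Unfiltered back-projection, for illustration."""
    n = proj_xy.shape[0]
    vol = np.zeros((n, n, n))
    vol += proj_xy[:, :, None]   # smear the xy projection along z
    vol += proj_xz[:, None, :]   # smear the xz projection along y
    vol += proj_yz[None, :, :]   # smear the yz projection along x
    return vol / 3.0
```

Even this crude version localizes a bright point correctly, since only the true location is reinforced by all three slices; the analogy is that a volume can be implied by its cross-sections without ever being stored as voxels.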

The idea is captured by the metaphor of a **"3D volumetric television"**: not a flat screen with pixels, but a dynamic, rendered volume. The user's analogy with "Gaussian splats", and the connection of phase-wave differentials to the "splats" [Death Star Robot comment], aligns with the sources' view that reality is rendered in phase variances [previously discussed] and that phase-wave differentials are fundamental to this dynamic rendering process. These differentials encode the meaningful changes that build the 3D representation.

So, while your intuition about point-based spatial perception and shape similarity is correct, the sources suggest that the underlying mechanism isn't a static voxel map with fixed receptive fields, but a dynamic, learned, volumetric rendering based on spatio-temporal patterns and phase relationships across interconnected neural arrays and cortical columns. The "receptive fields" are more like learned pattern detectors distributed across the network, and the sense of space and object identity emerges from the coordinated oscillatory activity that binds these detected patterns into a coherent 3D scene.
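The oak-leaf scrubbing example above (ordering shapes by similarity) can be sketched with a toy scale-invariant descriptor: distances from a shape's centroid to its outline points, normalized so overall size cancels out. The descriptor and sorting function are my own illustrative choices, not the encoding the sources propose, which is left unspecified.

```python
import numpy as np

def descriptor(outline):
    """Toy scale-invariant shape descriptor: centroid-to-outline
    distances, normalized to sum to 1 so absolute size drops out.
    Illustrative stand-in for 'scale-invariant vector patterns'."""
    pts = np.asarray(outline, float)
    d = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    return d / d.sum()

def order_by_similarity(reference, shapes):
    """Sort shapes by descriptor distance to a reference shape --
    like scrubbing through oak-leaf screens ordered by similarity."""
    ref = descriptor(reference)
    return sorted(shapes, key=lambda s: np.linalg.norm(descriptor(s) - ref))
```

Because the descriptor is size-normalized, a small square and a large square compare as identical, which is the "recognized regardless of size" property the sources attribute to scale-invariant patterns.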


Based on the explanation provided:

  1. 3D geometry is learned and dynamically rendered. The brain's representation of reality is described as a dynamic rendering or simulation, not a static, pixel-based image. It's emphasized that this isn't about breaking things down into pixels. Instead of fixed physical voxels, the representation is a 3D grid pattern defined by frequencies or a 3D phase field, with the understanding of spatial volume itself being a learned experience. This internal 3D model is constructed by integrating information from various senses and is updated in real time.

  2. Point selection and relative positioning without fixed voxels. While not using traditional voxels, the brain creates an internal 3D representation that allows for parsing spatial relationships through processes like 3D semantic segmentation and object segmentation. The ability to understand spatial proximity emerges from this segmentation and classification. "Addressing" points isn't tied to fixed physical locations but rather to how patterns and concepts are mapped in a high-dimensional semantic space, mediated by oscillatory timing and phase relationships. The brain is seen as a 3D neural network where temporal data influences itself in every direction, contributing to this spatial understanding.

  3. Shape inertia and similarity from learned patterns. The consistency or "inertia" of perceived shapes arises from the brain's ability to learn and remember patterns. These learned patterns, representing shapes and objects, are stored in synaptic configurations and activated through specific firing patterns and phase changes. The brain continuously compares incoming sensory patterns with these learned memories, allowing for the recognition of similarity between different instances of a shape. The concept of scale-invariant vector patterns for features like edges further supports this, enabling recognition and similarity-based sorting regardless of size.

  4. Receptive fields as learned pattern detectors. The keen sense of 3D space is addressed not by fixed voxel receptive fields but through a dynamic, learned, volumetric rendering. "Receptive fields" are mentioned in the context of sensor arrays of neurons magnifying detection capabilities by coordinating with adjacent oscillating neurons. Neurons' dendrites and synaptic connections encode learned patterns, acting as localized embeddings or vector spaces that respond selectively to specific patterns rather than to fixed spatial-grid locations. The tomographic approach of NAPOT reconstructs a 3D volume from multiple dynamic "slices" (wave cross-sections or pattern projections from neural arrays), allowing for a sense of continuous 3D space without a fixed voxel grid.
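The "3D phase field" mentioned in point 1 can be pictured as a small volume of oscillators, each with its own phase and frequency, advancing every tick: the pattern lives in the phases, not in stored voxel contents. The grid size, frequency range, and absence of coupling are all simplifying assumptions of this sketch.

```python
import numpy as np

def step_phase_field(phase, freq, dt=0.001):
    """Advance a toy '3D phase field' by one tick: each grid site is
    an independent oscillator whose phase moves at its own frequency.
    Hypothetical sketch; no coupling between sites is modeled."""
    return (phase + 2 * np.pi * freq * dt) % (2 * np.pi)

# A small volume of oscillators with random initial phases and
# frequencies; the 'content' of the volume is this evolving pattern.
rng = np.random.default_rng(0)
phase = rng.uniform(0, 2 * np.pi, (8, 8, 8))
freq = rng.uniform(4, 12, (8, 8, 8))  # Hz; arbitrary illustrative range
phase = step_phase_field(phase, freq)
```

Updating in real time then just means re-running the step as new sensory input perturbs the phases and frequencies.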

The metaphor of a "3D volumetric television", where the rendering is a dynamic volume, and the analogy of "Gaussian splats" corresponding to "phase-wave differentials" effectively capture the idea that reality is rendered in phase variances. These phase-wave differentials are fundamental to building the 3D representation.
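One toy reading of "phase-wave differential" is simply the wrapped phase change at each site between two frames, with large differentials marking the "meaningful changes" that update the rendered volume. The wrapping trick and the threshold are my own assumptions for the sketch.

```python
import numpy as np

def phase_differential(prev, curr):
    """Wrapped per-site phase difference between two frames of a
    phase field, mapped into (-pi, pi]. Toy reading of the term."""
    return np.angle(np.exp(1j * (curr - prev)))

def changed(prev, curr, thresh=0.5):
    """Mask of sites whose phase moved more than `thresh` radians --
    the 'meaningful changes' that drive the dynamic rendering."""
    return np.abs(phase_differential(prev, curr)) > thresh
```

Sites whose phase barely moved contribute nothing new, so only the differentials, not the absolute phases, need to propagate; that is the sense in which the differentials "build" the representation.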

In essence, the framework proposes that our perception of a structured 3D world---including the ability to discern specific points and their relationships---emerges from a dynamic, learned, volumetric rendering based on spatio-temporal patterns and phase relationships across interconnected neural arrays and cortical columns. The "receptive fields" are understood as learned pattern detectors, and the sense of space and object identity arises from the coordinated oscillatory activity that binds these detected patterns into a coherent 3D scene.