This repository was archived by the owner on Nov 22, 2023. It is now read-only.

Swap to continuous density field strategy for signal management #511

@alice-i-cecile


After talking over the design and benchmarking results with @plof27 today, we've decided on a strategy to pursue for signal propagation (#1).

Key findings:

  1. Signal diffusion takes wildly more time than any other operation. See the trace in Add benchmark for signal production-diffusion-decay cycle #506.
  2. Signals are accessed very rarely compared to diffusion costs. Consider a worst case for lookups. Suppose we have 100 different signal types (very low), a unit on each tile (very high), and each unit must act every frame by following a signal gradient (very high). That gives us 7 lookups per tile: the unit's own tile plus its 6 neighbors. For diffusion, by contrast, we have 7 * 100 lookups and 100 mutations for each tile. This is borne out by the trace above using the MVP implementation.
  3. Signals are generally quite sparse. In factory builders, you want a large variety of items and structures. Most of the time, that information is going to be irrelevant to anything that isn't fairly nearby: they'll always have something better to do. As a result, most signals should be treated as 0 in most places.
  4. The actual signal propagation logic can be quite simple and still work well!
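Finding 3 points toward a sparse per-tile representation: store only the nonzero signals and read missing entries as zero. A minimal sketch in Rust (the `SignalId` and `TileSignals` names here are illustrative, not from the codebase):

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a signal type.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct SignalId(u32);

/// Sparse per-tile signal storage: absent entries read as 0.0.
#[derive(Default)]
struct TileSignals {
    strengths: HashMap<SignalId, f32>,
}

impl TileSignals {
    /// Missing signals are implicitly zero, so sparse tiles cost nothing to store.
    fn get(&self, id: SignalId) -> f32 {
        self.strengths.get(&id).copied().unwrap_or(0.0)
    }

    /// Drop entries that have decayed below a threshold ("sparsification").
    fn sparsify(&mut self, threshold: f32) {
        self.strengths.retain(|_, s| *s >= threshold);
    }
}

fn main() {
    let mut tile = TileSignals::default();
    tile.strengths.insert(SignalId(7), 0.004);
    tile.strengths.insert(SignalId(9), 0.8);
    // After sparsification, the nearly-zero signal is treated as exactly 0.
    tile.sparsify(0.01);
    assert_eq!(tile.get(SignalId(7)), 0.0);
    assert_eq!(tile.get(SignalId(9)), 0.8);
    println!("nonzero signals on this tile: {}", tile.strengths.len());
}
```

With 99% of signals at zero, each tile's map stays tiny, and both reads and decay passes only touch the handful of live entries.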

For some more realistic numbers from the initial MVP:

  • 10% of tiles have units in them
  • actions take 0.5 seconds to complete on average
  • half of them involve signal gradient lookups
  • a new goal must be chosen (involving querying every signal on the tile) once every 5 seconds
  • each signal must be propagated to 5 neighbors (some cells are filled)
  • half of all tiles will have a signal emitter
  • 1000 different signal types
  • signals are 0 in 99% of all tiles

For a discretized strategy, that gives us:

  • 0.5 signal emissions per tile per signals pass
  • 1000 signals * 0.01 density * 5 neighbors = 50 signal diffusions per tile per signals pass
  • 1000 signals * 0.01 density = 10 decay and sparsification checks per tile per signals pass
  • 0.1 units per tile * (5 + 1) neighbors * 0.5 lookup fraction / 0.5 seconds per action = 0.6 local gradient lookups for actions per tile per second
  • 0.1 units per tile * 1000 signals * 0.01 density / 5 seconds = 0.2 goal lookups per tile per second
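These numbers can be re-derived directly from the assumptions above (a standalone sketch; the function and variable names are mine, and the gradient-lookup line includes the "half of actions involve gradient lookups" factor):

```rust
/// Per-tile, per-signals-pass costs of the discretized strategy:
/// (emissions, diffusions, decay checks).
fn per_pass_costs() -> (f64, f64, f64) {
    let emitter_density = 0.5; // half of all tiles have a signal emitter
    let signal_types = 1000.0;
    let signal_density = 0.01; // signals are nonzero in 1% of tiles
    let neighbors = 5.0; // open neighbors a signal propagates to

    let emissions = emitter_density;
    let diffusions = signal_types * signal_density * neighbors;
    let decay_checks = signal_types * signal_density;
    (emissions, diffusions, decay_checks)
}

/// Per-tile, per-second lookup rates: (local gradient lookups, goal lookups).
/// These depend on unit behavior, not on how often the signals pass runs.
fn per_second_lookups() -> (f64, f64) {
    let unit_density = 0.1; // units per tile
    let action_seconds = 0.5; // average action duration
    let gradient_fraction = 0.5; // half of actions involve gradient lookups
    let goal_period = 5.0; // seconds between goal re-selection
    let neighbors = 5.0;
    let signal_types = 1000.0;
    let signal_density = 0.01;

    let gradient = unit_density * (neighbors + 1.0) * gradient_fraction / action_seconds;
    let goals = unit_density * signal_types * signal_density / goal_period;
    (gradient, goals)
}

fn main() {
    let (emissions, diffusions, decay_checks) = per_pass_costs();
    let (gradient, goals) = per_second_lookups();
    assert!((emissions - 0.5).abs() < 1e-9);
    assert!((diffusions - 50.0).abs() < 1e-9);
    assert!((decay_checks - 10.0).abs() < 1e-9);
    assert!((gradient - 0.6).abs() < 1e-9);
    assert!((goals - 0.2).abs() < 1e-9);
    println!("per pass: {emissions} emissions, {diffusions} diffusions, {decay_checks} decay checks");
}
```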

We can reduce how often our signals pass runs, but once every two seconds is probably the slowest we can go before the results become perceptibly wrong. At that rate:

  • 0.25 signal emissions per tile per second
  • 25 signal diffusions per tile per second
  • 5 decay and sparsification checks per tile per second
  • 0.6 local gradient lookups for actions per tile per second
  • 0.2 goal lookups per tile per second

Obviously signal diffusion is dominating, but decay matters too!

Ideally our map is:

  • 10^3 tile map radius (big factory builder!)
  • 10^6 tiles in size as a result (hey, that's still feasible with individual entities for tiles and structures!)

Every second, we must process 10^6 * 25 = 2.5 * 10^7 diffusion steps, which gives us 4e-8 seconds, aka 40 nanoseconds, for each. If we want to use a GPU with matrix acceleration, we can't easily take advantage of sparsity. Thankfully though, unlike traditional pathfinding, all operations are linear with respect to map size, and scale very slowly with unit count (because there are so many fewer lookups than diffusion operations).
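A quick sanity check on that budget arithmetic (the helper name here is hypothetical):

```rust
/// Nanoseconds of budget per diffusion step, given a tile count and a
/// per-tile, per-second diffusion rate.
fn budget_ns_per_step(tiles: f64, diffusions_per_tile_per_second: f64) -> f64 {
    1.0e9 / (tiles * diffusions_per_tile_per_second)
}

fn main() {
    // 10^6 tiles at 25 diffusions per tile per second = 2.5 * 10^7 steps/s.
    let ns = budget_ns_per_step(1.0e6, 25.0);
    assert!((ns - 40.0).abs() < 1e-6);
    println!("{ns} ns per diffusion step");
}
```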

So, instead of doing a discretized storage approach, we propose the use of a density field strategy. The fundamentals are fairly straightforward, and come from applied math.

  1. Track relevant objects (signal emitters, barriers, special signal modifiers).
  2. For each object type, determine a differential equation of how they affect the field strength near them.
  3. Combine these equations to determine (for each signal type) an approximation that we can use to get the field strength at a position and time.
  4. Use this approximation to lazily look up the values when we need them.
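As a sketch of step 4: suppose each emitter's contribution follows the standard 2D diffusion (heat) kernel with diffusivity D and an exponential decay rate; then the field at a point is just a sum over emitters, evaluated only when a lookup actually happens. Everything here (the `Emitter` type, the Gaussian kernel, the decay model) is an illustrative assumption, not the settled math:

```rust
/// A point emitter for one signal type (hypothetical representation).
struct Emitter {
    x: f64,
    y: f64,
    strength: f64,
    age: f64, // seconds since emission started; used as the diffusion time
}

/// Fundamental solution of the 2D diffusion equation with exponential decay:
/// u(r, t) = S * exp(-decay * t) * exp(-r^2 / (4 D t)) / (4 * pi * D * t)
fn field_at(emitters: &[Emitter], x: f64, y: f64, diffusivity: f64, decay: f64) -> f64 {
    emitters
        .iter()
        .map(|e| {
            let r2 = (x - e.x).powi(2) + (y - e.y).powi(2);
            let four_dt = 4.0 * diffusivity * e.age;
            e.strength * (-decay * e.age).exp() * (-r2 / four_dt).exp()
                / (std::f64::consts::PI * four_dt)
        })
        .sum()
}

fn main() {
    let emitters = vec![
        Emitter { x: 0.0, y: 0.0, strength: 10.0, age: 2.0 },
        Emitter { x: 5.0, y: 0.0, strength: 10.0, age: 2.0 },
    ];
    // Lazily evaluate the field only where a unit actually asks for it.
    let near = field_at(&emitters, 0.5, 0.0, 1.0, 0.1);
    let far = field_at(&emitters, 20.0, 0.0, 1.0, 0.1);
    // The field falls off with distance, so its gradient points back
    // toward the emitters.
    assert!(near > far);
}
```

Barriers and other modifiers would perturb this closed form, which is where the real modeling work in step 2 comes in.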

We only actually care about the value of these signals where these fields are evaluated! And because they're evaluated so rarely (see above), we have a massive budget to work with.

Let's pessimistically round up to 1 lookup per tile per second. At 10^6 tiles, that means we have 1000 nanoseconds to do each computation! As a bonus, in many cases we only want very approximate results (one significant figure is basically fine for following a gradient upstream or choosing a goal at random), and so we can stop the computation very early.
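Because one significant figure is enough, the sum over emitters can stop early: visit emitters in order of distance from the query point and truncate once the largest possible remaining contribution is below the tolerance. A rough sketch, assuming a distance-decaying kernel and roughly uniform emitter strengths (both assumptions, not decided design):

```rust
/// Hypothetical emitter: position along one axis plus an emission strength.
struct Emitter {
    x: f64,
    strength: f64,
}

/// Approximate field value at `x`, stopping once the remaining terms are
/// provably below `tolerance`. Assumes `emitters` is sorted by distance
/// from `x` and that the kernel decays monotonically with distance.
fn approx_field(emitters: &[Emitter], x: f64, tolerance: f64) -> f64 {
    let kernel = |d2: f64| (-d2).exp(); // illustrative Gaussian-style kernel
    let mut total = 0.0;
    for (i, e) in emitters.iter().enumerate() {
        let d2 = (x - e.x).powi(2);
        // Upper bound on everything not yet summed: every remaining emitter
        // is at least this far away, so (with roughly uniform strengths)
        // each contributes at most this much.
        let remaining_bound = kernel(d2) * (emitters.len() - i) as f64 * e.strength;
        if remaining_bound < tolerance {
            break;
        }
        total += e.strength * kernel(d2);
    }
    total
}

fn main() {
    // Emitters pre-sorted by distance from the query point x = 0.0.
    let emitters = vec![
        Emitter { x: 1.0, strength: 5.0 },
        Emitter { x: 3.0, strength: 5.0 },
        Emitter { x: 50.0, strength: 5.0 }, // far enough to be skipped entirely
    ];
    let approx = approx_field(&emitters, 0.0, 1e-3);
    let exact: f64 = emitters
        .iter()
        .map(|e| e.strength * (-(e.x).powi(2)).exp())
        .sum();
    assert!((approx - exact).abs() < 1e-3);
}
```

A spatial index (grid buckets or a k-d tree) would supply the distance-sorted order cheaply; the key point is that far-away emitters never get touched.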

Of course, the big challenge is actually writing the math to do this, especially as we add more game mechanics. But diffusion equations are a very well-studied field, so I'm feeling quite optimistic.

    Labels

      • architectural: Changes to the organization and patterns of the code
      • controversial: Needs careful analysis and testing before proceeding
      • gameplay: Game mechanics and systems
      • performance: Code go brrr
