perf: Overwhelming Audit w/ Multi-Threading #8203

Open
AzmodiusX wants to merge 26 commits into cataclysmbn:main from AzmodiusX:Map_Threading

@AzmodiusX AzmodiusX commented Feb 19, 2026

Purpose of change (The Why)

Always looking for performance gains.
Notably, monmove seems to be an incredible source of lag.
Many things that could use distance culling don't.

Describe the solution (The How)

Implemented a robust thread pool.
Map cache, scent map, and light map are all threaded.
vehmove, monmove, and process_items all gained some form of performance upgrade.
Not every change was multi-threading related, but many were.
Also added several performance options. One of them splits the setting that blocks movement of sleeping monsters into two, with a separate NPC-specific setting. The NPC version also forces NPCs to sleep.
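
For readers unfamiliar with the pattern, here is a minimal sketch of what a fixed-size thread pool can look like. It is not the pool from this PR: `simple_thread_pool`, `submit`, and `worker_loop` are hypothetical names, and the design (one shared queue guarded by a mutex and condition variable) is assumed only for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool for illustration; not the class added by the PR.
class simple_thread_pool {
public:
    explicit simple_thread_pool(std::size_t n_workers) {
        assert(n_workers > 0);
        for (std::size_t i = 0; i < n_workers; ++i) {
            workers_.emplace_back([this] { worker_loop(); });
        }
    }

    ~simple_thread_pool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (std::thread &t : workers_) {
            t.join(); // workers drain the queue before exiting
        }
    }

    simple_thread_pool(const simple_thread_pool &) = delete;
    simple_thread_pool &operator=(const simple_thread_pool &) = delete;

    // Queue a job; the returned future becomes ready once the job has run.
    std::future<void> submit(std::function<void()> job) {
        auto task = std::make_shared<std::packaged_task<void()>>(std::move(job));
        std::future<void> done = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return done;
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) {
                    return; // stopping_ is set and nothing is left to run
                }
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(); // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};
```

A caller would submit one job per independent unit of work (for example one submap of a cache rebuild) and call `get()` on each future before reading the results. Jobs must not write to shared state without their own synchronization.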

Describe alternatives you've considered

Focusing on non-threaded improvements
Threading increases code complexity and maintenance burden, so it has to be worth it.

Testing

I used Tracy for profiling. Auto saves are off.
Load into a world, in the middle of a city with several mid-sized fires. There are many enemies, vehicles, and items nearby. The player is in complete debug mode, meaning nothing interacts with them. This removes many confounding variables such as combat or aggro checks. Though there are some downsides to this setup, it is very reliable and standardized. I make the player wait 2 in-game hours (7200 turns) while profiling, then stop profiling right away.
At most steps, individual commits were timed. Applied sequentially, each change was at worst timing-neutral (the structural ones), and most were an improvement.
The final comparison put do_turn at 66.36% of the baseline time. That's about 34% less time per turn, or roughly 1.5x throughput (about 50% faster).
Necropolis got even more extreme gains, at 56.5% of the baseline time.
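
Since "X% time" and "Y% faster" are easy to mix up, here is a small sketch of the arithmetic behind those figures. The function names `time_saved` and `speedup` are made up for this example; the input is the new time divided by the baseline time.

```cpp
#include <cassert>
#include <cmath>

// Fraction of wall time saved when the new build takes `time_ratio`
// of the baseline time (e.g. 0.6636 for "66.36% time").
double time_saved(double time_ratio) {
    return 1.0 - time_ratio;
}

// Throughput multiplier: how many times more turns fit in the same wall time.
double speedup(double time_ratio) {
    assert(time_ratio > 0.0);
    return 1.0 / time_ratio;
}
```

With the numbers above, `speedup(0.6636)` is about 1.51 and `speedup(0.565)` is about 1.77, while `time_saved` gives about 0.34 and 0.44 respectively.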

I've also tested the following:
Vehicle collisions work correctly
Light sources activate and deactivate as expected; no "ghost lights"
Fire processing and monster movement still occur correctly near the edge of vision

I did another 2 hour test in necropolis, and the performance gains were more limited. The comparison is 88.5% of the baseline time (about 13% faster).
Honestly, what is going on in necropolis?

Those were old numbers so I can flex the massive improvement for worst case. Wowee.

Additional context

I'm sure I missed something, there's a lot here.
Of note: the graphics performance costs are actually marginal by my estimates. SDL2 isn't really the bottleneck for most setups.
Though, even if it were, SDL2 is not thread safe. Only some minor gains could be had from doing some setup steps via multi-threading beforehand. I think the largest benefit for that would be latency, not total game speed (which seems to be a more common issue).

Fair warning to people profiling this PR:
I added a lot of profiling. It might be a bit overwhelming, but it sure is informative. For a PR of this complexity, it seems valuable to keep.

Map cache, scent map, and light map all threaded.
Made thread pool
@github-actions github-actions bot added the src (changes related to source code) label Feb 19, 2026

autofix-ci bot commented Feb 19, 2026

Autofix has fixed code style violations in this PR.

I edit commits locally (e.g: git, github desktop) and want to keep autofix
  1. Run git pull. This will merge the automated commit into your local copy of the PR branch.
  2. Continue working.
I do not want the automated commit
  1. Format your code locally, then commit it.
  2. Run git push --force to force push your branch. This will overwrite the automated commit on remote with your local one.
  3. Continue working.

If you don't do this, your following commits will be based on the old commit and will cause a merge conflict.

@AzmodiusX AzmodiusX changed the title from "perf: Map Multi-Threading" to "perf: Overwhelming Audit w/ Multi-Threading" Feb 22, 2026
AzmodiusX and others added 4 commits February 22, 2026 04:06
memset in map.cpp
rebuild_pq() called after every successful act_on_map(), not just on destruction
Extracted map::update_weather_transparency_lookup()
ITEM_PROCESS_RADIUS_SM now mapped to MAPSIZE, kept for testing or easy changes
parallel_for → parallel_for_chunked(..., 8) for monster planning
rate_target gains optional precalc_dist parameter
Pre-warm pass: calls mon->sees(u) for every plannable monster before parallel_for_chunked
active_items.empty() early-out
deferred + plan_lookup merged into a single plan_index map
Fix vehicle collisions
Fix heap corruption during vehmove
Removed stale comments
Documented non-deterministic RNG behavior under threading in some cases
Fixed performance losses in vehmove, scent_map::decay, and monMove
Removed process-items distance check entirely
Added sight cache
Added has_cargo_recharge check
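
The pre-warm pass plus `parallel_for_chunked(..., 8)` combination from the notes above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the `parallel_for_chunked` signature shown here, `monster_stub`, and `plan_all` are hypothetical, and the 1-D "sight" check is a stand-in for `mon->sees(u)`.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: call fn(i) for every i in [begin, end). Threads claim
// contiguous chunks of `chunk` indices from a shared atomic counter.
void parallel_for_chunked(std::size_t begin, std::size_t end, std::size_t chunk,
                          const std::function<void(std::size_t)> &fn) {
    assert(chunk > 0);
    if (begin >= end) {
        return;
    }
    std::atomic<std::size_t> next{begin};
    auto worker = [&] {
        for (;;) {
            const std::size_t lo = next.fetch_add(chunk);
            if (lo >= end) {
                return;
            }
            const std::size_t hi = std::min(lo + chunk, end);
            for (std::size_t i = lo; i < hi; ++i) {
                fn(i);
            }
        }
    };
    const std::size_t n_chunks = (end - begin + chunk - 1) / chunk;
    const std::size_t hw = std::max<std::size_t>(1, std::thread::hardware_concurrency());
    const std::size_t n_threads = std::min(hw, n_chunks);
    std::vector<std::thread> threads;
    for (std::size_t t = 1; t < n_threads; ++t) {
        threads.emplace_back(worker);
    }
    worker(); // the calling thread takes chunks too
    for (std::thread &t : threads) {
        t.join();
    }
}

// Stand-in for a monster; all fields are assumptions for this sketch.
struct monster_stub {
    int x = 0;
    bool sees_player = false; // written only during the serial pre-warm
    int planned_dx = 0;       // written only by the thread that owns this index
};

void plan_all(std::vector<monster_stub> &mons, int player_x, int sight_range) {
    // Phase 1 (serial): fill the sight cache so phase 2 only reads it.
    for (monster_stub &m : mons) {
        m.sees_player = std::abs(m.x - player_x) <= sight_range;
    }
    // Phase 2 (parallel): each index touches only its own monster.
    parallel_for_chunked(0, mons.size(), 8, [&](std::size_t i) {
        monster_stub &m = mons[i];
        m.planned_dx = m.sees_player ? (player_x > m.x) - (player_x < m.x) : 0;
    });
}
```

The point of the serial pre-warm is that any lazily filled cache gets populated before worker threads start, so the parallel phase never writes to shared state. Chunking keeps per-task overhead low when each monster's plan is cheap.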
@AzmodiusX AzmodiusX marked this pull request as ready for review February 22, 2026 23:06
@AzmodiusX (Collaborator, Author) commented

I see the failing tests, looking into it.
It's no surprise that vehmove is failing something.
Really glad for these tests right now, lots of edge cases.

@AzmodiusX AzmodiusX marked this pull request as draft February 23, 2026 22:27
Use monster budget to tier monsters based on distance.
Defers and simplifies monmove for monsters when counts grow
@AzmodiusX (Collaborator, Author) commented

I am cooking.

Also laid groundwork for future threading.
Deferred for fear of colliding with the sound rework and chunk loading.
@github-actions github-actions bot added the lua (PRs and issues related to Lua scripting) label Feb 24, 2026
autofix-ci bot and others added 3 commits February 24, 2026 05:09
Fixed macro step drifting monsters toward player_pos
--friendly moved from after the idle-path early returns to before them
shove_vehicle(dest, dest) fixed, now shove_vehicle(goal, dest)
Added effective_friendly = friendly > 0 ? friendly - 1 : friendly alongside the existing effective_wandf simulation
decide_action and execute_action: Changed lod_tier == 0 to lod_tier <= 1 in both the repath signal (decide) and the A* call site (execute)
Added monster::prewarm_sight(const Creature &)
plan_index is constructed fresh each call; plan_index.size() was always 0 at that point
Removed the outer turn_cached_sees(*this, g->u) guard before the player target block
monster_action_kind::special commented out as a Phase 3 placeholder (future implementation)
must_serialize field removed from monster_action_t. The only setter (action.must_serialize = true in the push case) removed, note left in comment for future.
Consolidated overly verbose comments (useful for development at the time)
Tier-1 monsters (20-60 tiles) can now run A* when genuinely stuck
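
As a rough sketch of the distance tiers and the `effective_friendly` expression mentioned in these notes: only the "tier 1 is 20-60 tiles" range, the `lod_tier <= 1` check, and the `effective_friendly` expression come from the text above. The function names, the boundary inclusivity, the assumption that tier 0 may always run A*, and the existence of a tier 2 beyond 60 tiles are guesses made for illustration.

```cpp
#include <cassert>

// ASSUMED thresholds: the notes only say tier 1 covers 20-60 tiles.
// Whether 20 and 60 are inclusive is a guess.
constexpr int TIER0_MAX_DIST = 20;
constexpr int TIER1_MAX_DIST = 60;

// Hypothetical mapping from distance-to-player (in tiles) to a LOD tier.
int lod_tier_for_distance(int dist) {
    assert(dist >= 0);
    if (dist < TIER0_MAX_DIST) {
        return 0; // near: full-detail processing
    }
    if (dist <= TIER1_MAX_DIST) {
        return 1; // mid-range: simplified processing
    }
    return 2;     // far: most deferred (assumed tier)
}

// Hypothetical gate matching "lod_tier <= 1": tier 0 is assumed to be always
// eligible for A*, tier 1 only when the monster is genuinely stuck.
bool may_run_astar(int lod_tier, bool stuck) {
    if (lod_tier == 0) {
        return true;
    }
    return lod_tier == 1 && stuck;
}

// The expression quoted in the notes: a positive friendly counter is
// simulated as already decremented by one; zero and negatives pass through.
int effective_friendly(int friendly) {
    return friendly > 0 ? friendly - 1 : friendly;
}
```

Simulating the decrement in `effective_friendly` lets planning code see the value the counter will have after the turn's `--friendly` without mutating the monster during a read-only planning phase.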
AzmodiusX and others added 2 commits February 24, 2026 00:04
Split sleep performance options for NPCs, with force NPC sleep functionality added
@AzmodiusX (Collaborator, Author) commented

I need to do my own testing and profiling some more, but based on what I've done so far, this should be working.
No guarantees it won't fail a test or 12, but it should be testable for logical breaks and such.

Cleared / clarified comments
NPC prewarm distance pre-cull
Faction-hostile monster pair prewarm
Removed a redundant hash-map lookup per monster per turn
Early return before cargo_parts build
falling_vehicles vector
@AzmodiusX AzmodiusX marked this pull request as ready for review February 24, 2026 11:15
@AzmodiusX (Collaborator, Author) commented

I marked it ready so it would run the tests while I slept, plz forgib
Also it is genuinely better tested now.
