Skip to content

Conversation

@ptheywood
Copy link
Member

@ptheywood ptheywood commented Nov 27, 2025

Fixes memory leaks in the visualiser, originally identified by @DavidFletcherShef in #148.

  • Delete/malloc mismatch in Text::setString
  • Missing free in Text::textureString::resize
  • Ensure FcConfigDestroy is called
    • This is a more eager than it could be
  • Call FcFini in ~Visualiser
  • Add a dtor for HUD::Item for a missing free
  • Fix an incorrect include path in TextureBuffer.h (which never gets triggered anyway)
  • Manually free CUDATextureBuffers in Visualiser::close
  • Initialise Entity::needsExport to false (matching debug build behaviour)
  • Initialise ImGuiPanel::first_render to 0
  • Opened issues for future enhancements identified during this PR (TextureBuffer.cu.cpp CUDA-only code no longer correct #150, Modernise / reduce manual memory management #151)

@ptheywood
Copy link
Member Author

WIP suppressions list to reduce noise in the valgrind output (this may suppress genuine errors in our code, but I'm fairly confident it doesn't)

{
   cudart
   Memcheck:Leak
   ...
   obj:*libcudart*
}
{
   cuda
   Memcheck:Leak
   ...
   obj:*libcuda*
}
{
   libnvidia-glcore
   Memcheck:Leak
   ...
   obj:*libnvidia-glcore*
}
{
   libGLX_nvidia_leak
   Memcheck:Leak
   ...
   obj:*libGLX_nvidia*
}
{
   libGLX_nvidia_ReallocZero
   Memcheck:ReallocZero
   ...
   obj:*libGLX_nvidia*
}
{
   libGLX_nvidia_BadSize
   Memcheck:BadSize
   ...
   obj:*libGLX_nvidia*
}
{
   libX11
   Memcheck:Leak
   ...
   obj:*libX11*
}

Which results in:

==307287== LEAK SUMMARY:
==307287==    definitely lost: 19,088 bytes in 86 blocks
==307287==    indirectly lost: 1,141,896 bytes in 35,596 blocks
==307287==      possibly lost: 0 bytes in 0 blocks
==307287==    still reachable: 372,610 bytes in 163 blocks
==307287==         suppressed: 116,311,329 bytes in 19,480 blocks
==307287== 
==307287== ERROR SUMMARY: 18 errors from 17 contexts (suppressed: 141 from 141)

...

--307287-- 
--307287-- used_suppression:   1278 cuda ../../cuda-valgrind.supp:8 suppressed: 116,105,196 bytes in 18,526 blocks
--307287-- used_suppression:     18 libnvidia-glcore ../../cuda-valgrind.supp:14 suppressed: 290,571 bytes in 35 blocks
--307287-- used_suppression:    108 libX11 ../../cuda-valgrind.supp:32 suppressed: 54,300 bytes in 917 blocks
--307287-- used_suppression:      2 libGLX_nvidia_leak ../../cuda-valgrind.supp:20 suppressed: 352 bytes in 2 blocks
--307287-- used_suppression:      1 libGLX_nvidia_realloczero ../../cuda-valgrind.supp:26
==307287== 
==307287== ERROR SUMMARY: 18 errors from 17 contexts (suppressed: 141 from 141)

Remaining erros look to be mostly imgui/fontconfig related with flamegpu functions in the stack trace (at a glance, I've not checked all 1765 loss records / 359 entries in the remainig valgrind log). A large number (of still reachables) may dissapear if we explciitly cleanup font config (just need to find the right place to call the appropriate method)

Some still-reachables in SDL2 that may be unavoidable/need suppressing.

@DavidFletcherShef
Copy link

I've copied the updated Text.cpp over to my model. It has fixed the issue I was seeing, the host memory during the simulation and afterwards while the window is open is now stable rather than increasing. If I can help by trying anything else out as you look at the remaining leaks let me know. Many thanks, David

This causes (some?) reiniitalisation of font config on each font search, but fixes (some) font config leaks
… merge.

This should probably be in a visualiser cleanup method, but one does not exist.
@ptheywood
Copy link
Member Author

I've copied the updated Text.cpp over to my model. It has fixed the issue I was seeing, the host memory during the simulation and afterwards while the window is open is now stable rather than increasing. If I can help by trying anything else out as you look at the remaining leaks let me know. Many thanks, David

Thanks for confirming the main leak is fixed, I'll let you know if there's anything else for you to check.


There are some reported dbus leaks via SDL2, but googling suggests these are won't fix upstream, so we can probably just suppress those, and other SDL_Init/SDL_GL_MakeCurrent allocations which do not seem to be cleaned up even though we do call SDL_Quit (although these suppressions may be a little strong / generic)

{
   libdbus_leak
   Memcheck:Leak
   ...
   obj:*libdbus*
}
{
   libSDL2_leak
   Memcheck:Leak
   ...
   obj:*libSDL2*
}

There's some leakage in the usage of fontconfig in Font::fontSearch, which can be fixed with calls to FcConfigDestroy and FcFini.
However, FcFini feels a bit strong to be calling on each invocation (And we don't have a nice place to call that for the visualiser currently).
I've committed these changes for now but we might want to drop the FcFini commit before merging.

There are also some leaks due to certain imgui and SDL cleanup methods not being called depedning on the current state of the UI window (due to multithreaded rendering). I.e. Visualiser::deallocateGLObjects does not always get called.

Manually closing the visualiser window when .join is used leads to those calls being made, further reducing the leaks identified by Valgrind (but requiring manual window closing).

With the extra suppressions, fixes and visualistion thread joining, the summary for my case is now:

==662819== LEAK SUMMARY:
==662819==    definitely lost: 288 bytes in 6 blocks
==662819==    indirectly lost: 0 bytes in 0 blocks
==662819==      possibly lost: 0 bytes in 0 blocks
==662819==    still reachable: 86,833 bytes in 38 blocks
==662819==         suppressed: 116,328,820 bytes in 19,527 blocks

The remaining definite loss is:

==662819==    at 0x60EAFD3: operator new(unsigned long) (vg_replace_malloc.c:487)
==662819==    by 0x429AFB5: flamegpu::visualiser::CUDATextureBuffer<float>* flamegpu::visualiser::mallocGLInteropTextureBuffer<float>(unsigned int, unsigned int) (cuda.cu:150)
==662819==    by 0x4260B55: flamegpu::visualiser::Visualiser::renderAgentStates() (Visualiser.cpp:551)
==662819==    by 0x4268FFB: flamegpu::visualiser::Visualiser::render() (Visualiser.cpp:451)
==662819==    by 0x4269997: flamegpu::visualiser::Visualiser::run() (Visualiser.cpp:310)
==662819==    by 0x133A0583: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.34)
==662819==    by 0x135DED63: start_thread (pthread_create.c:448)
==662819==    by 0x136721C3: clone (clone.S:100)
==662819== 

mallocGLInteropTextureBuffer is called 6 times for this example, but freeGLInteropTextureBuffer never gets called.

freeGLInteropTextureBuffer is called in the ~TextureBuffer, but guarded by ifdef __CUDACC__ however this is in TextureBuffer.cu.cpp which is compiled by gcc not nvcc on linux (everything goes through nvcc on windows).

That implementation probably needs to move to the header?


Of the remaining still reachables, a number can be suppressed (nvrtc, dl_open?), but I'll have to follow up with this another day. It may also be worth having a better look at the SDL2 suppressed leaks, and also running valgrind over our non-visualisation test suite.

Presumably this could be removed instead, seeing as TextureBuffer.h must never be seen by nvcc
@ptheywood
Copy link
Member Author

ptheywood commented Dec 1, 2025

With the following probably slightly too broad suppressions list, the boids_spatial3D example now reports:

==664154== LEAK SUMMARY:
==664154==    definitely lost: 0 bytes in 0 blocks
==664154==    indirectly lost: 0 bytes in 0 blocks
==664154==      possibly lost: 0 bytes in 0 blocks
==664154==    still reachable: 0 bytes in 0 blocks
==664154==         suppressed: 116,285,899 bytes in 19,526 blocks

and 2 errors in libraries

Invalid read of size 8
==664154==    at 0x62E442B: ??? (in /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0.3200.4)
==664154==    by 0x42698F0: flamegpu::visualiser::Visualiser::run() (Visualiser.cpp:282)

...

==664154== 2 errors in context 2 of 2:
==664154== Conditional jump or move depends on uninitialised value(s)
==664154==    at 0x4295279: flamegpu::visualiser::ImGuiPanel::drawPanel() (ImGuiPanel.cpp:37)
==664154==    by 0x42957A7: flamegpu::visualiser::ImGuiPanel::render(glm::mat<4, 4, float, (glm::qualifier)0> const*, glm::mat<4, 4, float, (glm::qualifier)0> const*, unsigned int) (ImGuiPanel.cpp:128)
==664154==    by 0x42A841A: flamegpu::visualiser::HUD::render() (HUD.cpp:85)
{
   cudart
   Memcheck:Leak
   ...
   obj:*libcudart*
}
{
   cuda
   Memcheck:Leak
   ...
   obj:*libcuda*
}
{
   libnvidia-glcore_leak
   Memcheck:Leak
   ...
   obj:*libnvidia-glcore*
}
{
   libnvidia-glcore_BadSize
   Memcheck:BadSize
   ...
   obj:*libnvidia-glcore*
}
{
   libGLX_nvidia_leak
   Memcheck:Leak
   ...
   obj:*libGLX_nvidia*
}
{
   libGLX_nvidia_ReallocZero
   Memcheck:ReallocZero
   ...
   obj:*libGLX_nvidia*
}
{
   libGLX_nvidia_BadSize
   Memcheck:BadSize
   ...
   obj:*libGLX_nvidia*
}
{
   libX11
   Memcheck:Leak
   ...
   obj:*libX11*
}
{
   libdbus_leak
   Memcheck:Leak
   ...
   obj:*libdbus*
}
{
   libSDL2_leak
   Memcheck:Leak
   ...
   obj:*libSDL2*
}
{
   libnvrtc_leak
   Memcheck:Leak
   ...
   obj:*libnvrtc*
}
{
   _dl_new_object
   Memcheck:Leak
   match-leak-kinds: reachable
   ...
   fun:_dl_new_object
   ...
}
{
   _dl_open
   Memcheck:Leak
   ...
   fun:_dl_open
   ...
}

Suppression usage is mostly in cuda/nvidia bits, but it may be worht cehcking the suppressed sdl2 leask more thoroughly, though perhaps this should be defferred to a future issue to get the in-render loop fixes merged in.

--664154-- used_suppression:   1271 cuda ../../cuda-valgrind.supp:8 suppressed: 116,103,388 bytes in 18,504 blocks
--664154-- used_suppression:      2 libnvrtc_leak ../../cuda-valgrind.supp:62 suppressed: 73,216 bytes in 2 blocks
--664154-- used_suppression:     11 libGLX_nvidia_leak ../../cuda-valgrind.supp:26 suppressed: 51,844 bytes in 19 blocks
--664154-- used_suppression:      3 libX11 ../../cuda-valgrind.supp:44 suppressed: 24,544 bytes in 3 blocks
--664154-- used_suppression:      7 _dl_open ../../cuda-valgrind.supp:76 suppressed: 13,617 bytes in 36 blocks
--664154-- used_suppression:    120 libSDL2_leak ../../cuda-valgrind.supp:56 suppressed: 31,751 bytes in 941 blocks
--664154-- used_suppression:     14 libdbus_leak ../../cuda-valgrind.supp:50 suppressed: 7,559 bytes in 20 blocks
--664154-- used_suppression:      1 dtv-addr-init /usr/libexec/valgrind/default.supp:1581 suppressed: 56 bytes in 1 blocks
--664154-- used_suppression:      1 libGLX_nvidia_BadSize ../../cuda-valgrind.supp:38
--664154-- used_suppression:      1 libGLX_nvidia_ReallocZero ../../cuda-valgrind.supp:32
==664154== 
==664154== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 142 from 142)

@ptheywood ptheywood marked this pull request as ready for review December 2, 2025 11:02
@ptheywood ptheywood requested review from Robadob and mondus December 2, 2025 11:02
Copy link
Member

@Robadob Robadob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid in principle, added a couple of comments on the things that might be worth reviewing closer.

Sorry about the delay, found it as an unread email in my inbox which had fell off my radar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants