fix: Complete memory management with dynamic optimization support (Fix/memory problems #10)
base: main
Conversation
…tion
- Add allocate_pinned_mat() and setup_pinned_matrices() for flexible memory management
- Support dynamic image size handling with ensure_even_dimensions()
- Add dynamic buffer reallocation in infer_on_dGPU()
- Add even dimension validation with helpful error messages
- Improve error handling with try-catch in main loop
- Maintain compatibility with both zero-copy and dGPU modes
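The try-catch hardening mentioned in the commit message could look roughly like the minimal sketch below; process_image() and image_paths are hypothetical placeholders for the app's real per-frame work, not code from this PR.

```cpp
#include <exception>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical per-image worker standing in for the app's real inference call.
void process_image(const std::string& path) { /* inference would go here */ }

int main() {
    const std::vector<std::string> image_paths = {"dog.png", "cat.png"};
    for (const auto& path : image_paths) {
        try {
            process_image(path);  // may throw on CUDA or OpenCV errors
        } catch (const std::exception& e) {
            // Report the failure and continue with the next image instead of aborting.
            std::cerr << "Skipping " << path << ": " << e.what() << std::endl;
        }
    }
    return 0;
}
```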
Pull request overview
This PR addresses critical memory leaks and adds dynamic memory optimization features to the SAM3 segmentation system. The changes include a comprehensive 9-step destructor cleanup, dynamic buffer reallocation for variable image sizes, and automatic even dimension handling.
Changes:
- Implemented complete destructor with proper CUDA resource cleanup (streams, device/host memory, registered pointers)
- Added dynamic memory management with allocate_pinned_mat() and setup_pinned_matrices() for flexible buffer allocation
- Introduced automatic even dimension validation and resizing to prevent inference errors (sketched below)
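As a rough illustration of the even-dimension handling (the real ensure_even_dimensions() in this PR may be implemented differently; the rounding policy here is an assumption), a resize-to-even guard could look like this:

```cpp
#include <opencv2/opencv.hpp>

// Sketch of an even-dimension guard: many inference pipelines require width and
// height to be even, so odd dimensions are rounded down and the image resized.
cv::Mat ensure_even_dimensions(const cv::Mat& img) {
    const int even_w = img.cols - (img.cols % 2);
    const int even_h = img.rows - (img.rows % 2);
    if (even_w == img.cols && even_h == img.rows) {
        return img;  // already even, return the input unchanged
    }
    cv::Mat resized;
    cv::resize(img, resized, cv::Size(even_w, even_h));
    return resized;
}
```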
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| cpp/include/sam3.cuh | Added member variables to track registered memory pointers, new public methods for pinned memory allocation, and PinnedMemoryDeleter struct |
| cpp/src/sam3/sam3_trt/sam3.cu | Implemented constructor initialization, new memory allocation methods, dynamic buffer reallocation in inference paths, dimension validation, and comprehensive 9-step destructor cleanup |
| cpp/src/sam3/sam3_apps/sam3_pcs_app.cpp | Added even dimension helpers, replaced raw memory management with RAII-based approach using shared_ptr, improved error handling, and fixed memory leak from removed raw_bytes |
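To give a sense of the kind of teardown the table describes, here is a minimal sketch of a destructor that releases CUDA resources in a safe order; the member names are assumptions and the real class tracks more state (registered pointers, multiple streams, engine bindings, and so on).

```cpp
#include <cuda_runtime.h>

struct ExampleContext {
    cudaStream_t stream = nullptr;
    void* device_buf = nullptr;       // allocated with cudaMalloc
    void* pinned_buf = nullptr;       // allocated with cudaHostAlloc
    void* registered_host = nullptr;  // registered with cudaHostRegister

    ~ExampleContext() {
        // Destructors must not throw, so return codes are ignored or only logged.
        if (stream) cudaStreamSynchronize(stream);  // drain pending work first
        if (device_buf) cudaFree(device_buf);
        if (pinned_buf) cudaFreeHost(pinned_buf);
        if (registered_host) cudaHostUnregister(registered_host);
        if (stream) cudaStreamDestroy(stream);
    }
};
```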
```cpp
struct PinnedMemoryDeleter {
    void operator()(void* ptr) const {
        if (ptr) {
            cudaFreeHost(ptr);
        }
    }
};
```
Copilot AI, Jan 10, 2026:
The PinnedMemoryDeleter struct is defined but never used in the codebase. The allocate_pinned_mat function uses an inline lambda deleter instead. Consider either using this struct in allocate_pinned_mat or removing it to reduce code clutter.
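For reference, wiring the existing struct into the allocation path instead of the inline lambda could look roughly like this; the function name and signature below are assumptions for illustration, not the PR's actual allocate_pinned_mat() API.

```cpp
#include <cstddef>
#include <cuda_runtime.h>
#include <memory>
#include <stdexcept>

struct PinnedMemoryDeleter {
    void operator()(void* ptr) const {
        if (ptr) {
            cudaFreeHost(ptr);
        }
    }
};

// Sketch only: returns page-locked host memory owned by a smart pointer that
// reuses PinnedMemoryDeleter, so the struct is no longer dead code.
std::unique_ptr<void, PinnedMemoryDeleter> allocate_pinned(std::size_t bytes) {
    void* ptr = nullptr;
    if (cudaHostAlloc(&ptr, bytes, cudaHostAllocDefault) != cudaSuccess) {
        throw std::runtime_error("cudaHostAlloc failed");
    }
    return std::unique_ptr<void, PinnedMemoryDeleter>(ptr);
}
```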
```cpp
if (opencv_input != nullptr)
{
    cudaFree(opencv_input);
    cudaFree((void*)gpu_result);
}
opencv_inbytes = current_inbytes;
cuda_check(cudaMalloc(&opencv_input, opencv_inbytes), " reallocating opencv input memory");
cuda_check(cudaMalloc((void**)&gpu_result, opencv_inbytes), " reallocating result memory");
```
Copilot AI, Jan 10, 2026:
There's a potential memory leak and state inconsistency issue in the reallocation logic. If cudaMalloc for opencv_input succeeds (line 219) but cudaMalloc for gpu_result fails (line 220), the pointers will be in an inconsistent state. The old memory is already freed, opencv_inbytes is updated, but gpu_result is nullptr while opencv_input points to newly allocated memory. Consider setting pointers to nullptr after freeing, or using a temporary variable pattern to ensure atomicity of the reallocation.
Suggested change:

```diff
-if (opencv_input != nullptr)
-{
-    cudaFree(opencv_input);
-    cudaFree((void*)gpu_result);
-}
-opencv_inbytes = current_inbytes;
-cuda_check(cudaMalloc(&opencv_input, opencv_inbytes), " reallocating opencv input memory");
-cuda_check(cudaMalloc((void**)&gpu_result, opencv_inbytes), " reallocating result memory");
+// Allocate new buffers first, using temporaries to keep state consistent
+void* new_opencv_input = nullptr;
+void* new_gpu_result = nullptr;
+cuda_check(cudaMalloc(&new_opencv_input, current_inbytes), " reallocating opencv input memory");
+cuda_check(cudaMalloc(&new_gpu_result, current_inbytes), " reallocating result memory");
+// Free old buffers after successful allocations
+if (opencv_input != nullptr)
+{
+    cudaFree(opencv_input);
+    opencv_input = nullptr;
+}
+if (gpu_result != nullptr)
+{
+    cudaFree((void*)gpu_result);
+    gpu_result = nullptr;
+}
+// Commit new state atomically
+opencv_input = new_opencv_input;
+gpu_result = static_cast<decltype(gpu_result)>(new_gpu_result);
+opencv_inbytes = current_inbytes;
```
```cpp
if (opencv_input != nullptr)
{
    cudaFree(opencv_input);
    cudaFree((void*)gpu_result);
}
cuda_check(cudaMalloc(&opencv_input, opencv_inbytes), " allocating opencv input memory on a dGPU system");
cuda_check(cudaMalloc((void**)&gpu_result, opencv_inbytes), " allocating result memory on a dGPU system");
```
Copilot AI, Jan 10, 2026:
There's a potential memory leak and state inconsistency issue in the reallocation logic. If cudaMalloc for opencv_input succeeds (line 66) but cudaMalloc for gpu_result fails (line 67), the pointers will be in an inconsistent state. The old memory is already freed, opencv_inbytes is updated, but gpu_result is nullptr while opencv_input points to newly allocated memory. Consider setting pointers to nullptr after freeing, or using a temporary variable pattern to ensure atomicity of the reallocation.
Suggested change:

```diff
-if (opencv_input != nullptr)
-{
-    cudaFree(opencv_input);
-    cudaFree((void*)gpu_result);
-}
-cuda_check(cudaMalloc(&opencv_input, opencv_inbytes), " allocating opencv input memory on a dGPU system");
-cuda_check(cudaMalloc((void**)&gpu_result, opencv_inbytes), " allocating result memory on a dGPU system");
+// Use temporary pointers so reallocation is atomic and leaves the object in a consistent state
+void* new_opencv_input = nullptr;
+void* new_gpu_result = nullptr;
+// Allocate new buffers first; cuda_check will handle any allocation errors
+cuda_check(cudaMalloc(&new_opencv_input, opencv_inbytes), " allocating opencv input memory on a dGPU system");
+cuda_check(cudaMalloc(&new_gpu_result, opencv_inbytes), " allocating result memory on a dGPU system");
+// Now that both allocations have succeeded, free any old buffers
+if (opencv_input != nullptr)
+{
+    cudaFree(opencv_input);
+    cudaFree((void*)gpu_result);
+}
+// Update member pointers to point to the newly allocated buffers
+opencv_input = new_opencv_input;
+gpu_result = new_gpu_result;
+// Initialize the buffers
```
```diff
 const float vis_alpha = 0.3;
 const float probability_threshold = 0.5;
-const SAM3_VISUALIZATION visualize = SAM3_VISUALIZATION::VIS_INSTANCE_SEGMENTATION;
+const SAM3_VISUALIZATION visualize = SAM3_VISUALIZATION::VIS_SEMANTIC_SEGMENTATION;
```
Copilot AI, Jan 10, 2026:
The visualization type was changed from VIS_INSTANCE_SEGMENTATION to VIS_SEMANTIC_SEGMENTATION. If this change is intentional for the purposes of this PR, consider documenting it in the PR description. Otherwise, this might be a leftover from testing that should be reverted.
Suggested change:

```diff
-const SAM3_VISUALIZATION visualize = SAM3_VISUALIZATION::VIS_SEMANTIC_SEGMENTATION;
+const SAM3_VISUALIZATION visualize = SAM3_VISUALIZATION::VIS_INSTANCE_SEGMENTATION;
```
```cpp
cudaFree(opencv_input);
cudaFree((void*)gpu_result);
```
Copilot AI, Jan 10, 2026:
The cudaFree calls should be wrapped with error checking. While cudaFree typically doesn't fail for valid pointers, it's a best practice to check for errors consistently, especially in cleanup code where errors could indicate corrupted state.
Suggested change:

```diff
-cudaFree(opencv_input);
-cudaFree((void*)gpu_result);
+cuda_check(cudaFree(opencv_input), " freeing opencv input memory on a dGPU system");
+cuda_check(cudaFree((void*)gpu_result), " freeing result memory on a dGPU system");
```
```cpp
cudaMemset(opencv_input, 0, opencv_inbytes);
cudaMemset((void *)gpu_result, 0, opencv_inbytes);
```
Copilot AI, Jan 10, 2026:
The cudaMemset calls should be wrapped with cuda_check for consistent error handling. If these operations fail, it could lead to undefined behavior when the uninitialized memory is used.
Suggested change:

```diff
-cudaMemset(opencv_input, 0, opencv_inbytes);
-cudaMemset((void *)gpu_result, 0, opencv_inbytes);
+cuda_check(cudaMemset(opencv_input, 0, opencv_inbytes), " zeroing opencv input memory on a dGPU system");
+cuda_check(cudaMemset((void *)gpu_result, 0, opencv_inbytes), " zeroing result memory on a dGPU system");
```
```cpp
cudaFree(opencv_input);
cudaFree((void*)gpu_result);
```
Copilot AI, Jan 10, 2026:
The cudaFree calls should be wrapped with cuda_check for consistent error handling. While cudaFree typically doesn't fail for valid pointers, it's a best practice to check for errors consistently.
Suggested change:

```diff
-cudaFree(opencv_input);
-cudaFree((void*)gpu_result);
+cuda_check(cudaFree(opencv_input), " freeing opencv input memory");
+cuda_check(cudaFree((void*)gpu_result), " freeing result memory");
```
@Shattered217 Can you share system traces or other evidence of a memory leak? The reason destructors are almost never used in this repo is that the driver takes care of cleaning up all resources. A huge amount of the CUDA backend is dedicated to ensuring that these issues don't happen. So, even without your PR, we don't actually encounter memory leaks when the application terminates.
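One lightweight way to gather such evidence (a generic sketch, not something taken from this PR) is to sample cudaMemGetInfo() before and after repeated inferences and watch whether free device memory keeps shrinking across iterations:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: call before and after batches of inferences; a steadily
// decreasing free value across iterations would suggest a device-memory leak.
void log_device_memory(const char* tag) {
    size_t free_bytes = 0, total_bytes = 0;
    if (cudaMemGetInfo(&free_bytes, &total_bytes) == cudaSuccess) {
        std::printf("[%s] free: %zu MiB / total: %zu MiB\n",
                    tag, free_bytes >> 20, total_bytes >> 20);
    }
}
```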
@dataplayer12 The core goal was to fix memory-related anomalies when inferring images of varying resolutions. Sorry, the AI assistant inadvertently "fixed" potentially non-existent memory leaks. I'll carefully review the code tomorrow, remove all changes related to the "memory leak fixes," and focus solely on resolution-adaptive optimization.
Problem
The program crashes when processing images with different resolutions.
Solution
- allocate_pinned_mat() and setup_pinned_matrices()
- ensure_even_dimensions()
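A minimal sketch of what the pinned-matrix helper might look like follows; the signature and behavior here are assumptions for illustration, not the exact API added in this PR.

```cpp
#include <cuda_runtime.h>
#include <memory>
#include <opencv2/core.hpp>
#include <stdexcept>

// Sketch only: allocates page-locked (pinned) host memory and wraps it in a
// cv::Mat so asynchronous host<->device copies can use it. The shared_ptr
// deleter releases both the Mat header and the pinned buffer.
std::shared_ptr<cv::Mat> allocate_pinned_mat(int rows, int cols, int type) {
    const size_t bytes = static_cast<size_t>(rows) * cols * CV_ELEM_SIZE(type);
    void* pinned = nullptr;
    if (cudaHostAlloc(&pinned, bytes, cudaHostAllocDefault) != cudaSuccess) {
        throw std::runtime_error("cudaHostAlloc failed");
    }
    cv::Mat* mat = new cv::Mat(rows, cols, type, pinned);
    return std::shared_ptr<cv::Mat>(mat, [pinned](cv::Mat* m) {
        delete m;              // release the Mat header (does not own the data)
        cudaFreeHost(pinned);  // then release the pinned buffer
    });
}
```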
Testing

Incorrect image:
Processing of the first image works normally, but if subsequent images have a different resolution, errors occur. The last image shows the erroneous result after processing the Samoyed.