Skip to content

Conversation

@JohanMabille
Copy link
Collaborator

No description provided.

@JohanMabille JohanMabille changed the title Stream Many small fixes on ArrowStream Nov 27, 2025
Copilot finished reviewing on behalf of JohanMabille November 27, 2025 18:18
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the arrow_array_stream_proxy class with improved move semantics support and better resource management. The changes add helper functions for moving ArrowArrayStream C structs, refactor the internal API to use references instead of pointers, and reorganize tests into a more structured format with comprehensive coverage of move semantics.

Key changes:

  • Added move_array_stream() helper functions to properly transfer ownership of C struct resources
  • Changed get_private_data() to return a reference instead of a pointer for cleaner API usage
  • Added owns_stream() method to check stream ownership and improved null checking in pop()

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/arrow_interface/arrow_array_stream.cpp Implements move_array_stream() helper functions for transferring ownership of ArrowArrayStream resources
src/arrow_interface/arrow_array_stream_proxy.cpp Updates constructor to use move_array_stream(), changes get_private_data() return type to reference, and adds owns_stream() method
include/sparrow/arrow_interface/arrow_array_stream_proxy.hpp Updates API signatures including reference return for get_private_data() and adds owns_stream() declaration
include/sparrow/arrow_interface/arrow_array_stream.hpp Declares new move_array_stream() helper functions and removes extraneous blank line
test/test_arrow_array_stream_proxy.cpp Reorganizes tests into subcases and adds comprehensive test coverage for move semantics, ownership checking, and edge cases

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +147 to +164
auto test_array = make_test_primitive_array<int32_t>(10);
ArrowArrayStream stream;
fill_arrow_array_stream(stream);
arrow_array_stream_proxy src(std::move(stream));
src.push(std::move(test_array));

ArrowArrayStream stream2;
fill_arrow_array_stream(stream2);
arrow_array_stream_proxy dst(std::move(stream2));
dst = std::move(src);

auto* src_str = src.export_stream();
REQUIRE_EQ(src_str->release, nullptr);
auto dst_arr = dst.pop();
REQUIRE(dst_arr.has_value());
delete src_str;
}

Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the previous test case, this has potential for memory leaks if assertions fail. The src_str pointer allocated on line 159 may not be deleted if REQUIRE_EQ on line 160 fails. Consider using RAII (e.g., std::unique_ptr) to ensure proper cleanup:

std::unique_ptr<ArrowArrayStream> src_str(src.export_stream());
REQUIRE_EQ(src_str->release, nullptr);

Copilot uses AI. Check for mistakes.
Comment on lines +166 to +182
auto test_array = make_test_primitive_array<int32_t>(10);
ArrowArrayStream stream;
fill_arrow_array_stream(stream);
arrow_array_stream_proxy src(&stream);
src.push(std::move(test_array));

ArrowArrayStream stream2;
fill_arrow_array_stream(stream2);
arrow_array_stream_proxy dst(&stream2);
dst = std::move(src);

auto* src_str = src.export_stream();
REQUIRE_EQ(src_str, nullptr);
auto dst_arr = dst.pop();
REQUIRE(dst_arr.has_value());
}
}
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case has memory leaks. Both stream and stream2 are stack-allocated and filled with fill_arrow_array_stream() which allocates private_data. When the test ends, these streams go out of scope without calling their release functions to free the private_data.

You should add cleanup code before the closing brace:

if (stream.release != nullptr)
{
    stream.release(&stream);
}
if (stream2.release != nullptr)
{
    stream2.release(&stream2);
}

Copilot uses AI. Check for mistakes.
ArrowArrayStream stream{};
fill_arrow_array_stream(stream);
arrow_array_stream_proxy proxy(&stream);
REQUIRE(!proxy.owns_stream());
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case has a memory leak. The stack-allocated stream is filled with fill_arrow_array_stream(stream) which allocates private_data. The stream is passed by pointer to the proxy (not owned), and when the test ends, the stack-allocated stream goes out of scope without calling its release function to free the private_data.

You should add cleanup code before the closing brace:

if (stream.release != nullptr)
{
    stream.release(&stream);
}
Suggested change
REQUIRE(!proxy.owns_stream());
REQUIRE(!proxy.owns_stream());
if (stream.release != nullptr)
{
stream.release(&stream);
}

Copilot uses AI. Check for mistakes.
fill_arrow_array_stream(stream);
arrow_array_stream_proxy src(&stream);
arrow_array_stream_proxy dst(std::move(src));
CHECK_THROWS_AS(src.pop(), std::runtime_error);
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case has a memory leak. The stack-allocated stream is filled with fill_arrow_array_stream(stream) which allocates private_data. After being passed to src by pointer and then moved to dst, the original stream goes out of scope without calling its release function to free the private_data.

You should add cleanup code before the closing brace:

if (stream.release != nullptr)
{
    stream.release(&stream);
}
Suggested change
CHECK_THROWS_AS(src.pop(), std::runtime_error);
CHECK_THROWS_AS(src.pop(), std::runtime_error);
if (stream.release != nullptr)
{
stream.release(&stream);
}

Copilot uses AI. Check for mistakes.
/**
* @brief Gets the private data (mutable version).
*
* @return Mutable pointer to the stream's private data.
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation states "Mutable pointer to the stream's private data" but the return type was changed from pointer to reference. Update the documentation to say "Mutable reference to the stream's private data".

Suggested change
* @return Mutable pointer to the stream's private data.
* @return Mutable reference to the stream's private data.

Copilot uses AI. Check for mistakes.
Comment on lines +131 to +133
SPARROW_API ArrowArrayStream move_array_stream(ArrowArrayStream&& source);
SPARROW_API ArrowArrayStream move_array_stream(ArrowArrayStream& source);
}
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The move_array_stream functions lack documentation. These functions appear to implement move semantics for ArrowArrayStream (a C struct without native move constructor). Consider adding documentation that explains:

  • What the function does (transfers ownership of the stream's resources)
  • The state of the source after the move (all pointers zeroed)
  • When this should be used instead of regular assignment
  • Any preconditions or postconditions

Example:

/**
 * @brief Move an ArrowArrayStream by transferring ownership of its resources.
 *
 * Performs a shallow copy of the stream structure and zeros out the source.
 * After this operation, the source stream is left in a released-like state
 * with all function pointers and data pointers set to null.
 *
 * @param source The source stream to move from.
 * @return A new ArrowArrayStream with ownership of the source's resources.
 *
 * @post source will have all pointers set to nullptr/null
 */
Suggested change
SPARROW_API ArrowArrayStream move_array_stream(ArrowArrayStream&& source);
SPARROW_API ArrowArrayStream move_array_stream(ArrowArrayStream& source);
}
/**
* @brief Move an ArrowArrayStream by transferring ownership of its resources.
*
* Performs a shallow copy of the stream structure and zeros out the source.
* After this operation, the source stream is left in a released-like state
* with all function pointers and data pointers set to nullptr or zero.
*
* This should be used when you need to transfer ownership of the stream's resources,
* rather than copying. Regular assignment does not transfer ownership and may lead to
* double-free or resource leaks.
*
* @param source The source stream to move from (rvalue reference).
* Must not be released or nullptr.
* @return A new ArrowArrayStream with ownership of the source's resources.
*
* @post source will have all pointers set to nullptr/null/zero.
*
* @see https://arrow.apache.org/docs/format/CStreamInterface.html
*/
SPARROW_API ArrowArrayStream move_array_stream(ArrowArrayStream&& source);
/**
* @brief Move an ArrowArrayStream by transferring ownership of its resources.
*
* Performs a shallow copy of the stream structure and zeros out the source.
* After this operation, the source stream is left in a released-like state
* with all function pointers and data pointers set to nullptr or zero.
*
* This should be used when you need to transfer ownership of the stream's resources,
* rather than copying. Regular assignment does not transfer ownership and may lead to
* double-free or resource leaks.
*
* @param source The source stream to move from (lvalue reference).
* Must not be released or nullptr.
* @return A new ArrowArrayStream with ownership of the source's resources.
*
* @post source will have all pointers set to nullptr/null/zero.
*
* @see https://arrow.apache.org/docs/format/CStreamInterface.html
*/
SPARROW_API ArrowArrayStream move_array_stream(ArrowArrayStream& source);

Copilot uses AI. Check for mistakes.
Comment on lines +131 to +141
{
ArrowArrayStream stream{};
fill_arrow_array_stream(stream);
arrow_array_stream_proxy src(&stream);
arrow_array_stream_proxy dst(std::move(src));

auto* src_str = src.export_stream();
auto* dst_str = dst.export_stream();
REQUIRE_EQ(src_str, nullptr);
REQUIRE_NE(dst_str, nullptr);
}
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case has a memory leak. The stack-allocated stream is filled with fill_arrow_array_stream(stream) which allocates private_data using new. The stream is passed by pointer to the proxy (not owned), and when the test ends, the stack-allocated stream goes out of scope without calling its release function to free the private_data.

You should add cleanup code before the closing brace:

if (stream.release != nullptr)
{
    stream.release(&stream);
}

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +129
{
ArrowArrayStream stream{};
fill_arrow_array_stream(stream);
arrow_array_stream_proxy src(std::move(stream));
arrow_array_stream_proxy dst(std::move(src));

auto* src_str = src.export_stream();
auto* dst_str = dst.export_stream();
REQUIRE_EQ(src_str->release, nullptr);
REQUIRE_NE(dst_str->release, nullptr);
delete src_str;
delete dst_str;
}
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test has potential for memory leaks if assertions fail. The src_str and dst_str pointers are allocated by export_stream() but may not be deleted if a REQUIRE_EQ or REQUIRE_NE assertion fails before reaching the delete statements. Consider using RAII (e.g., std::unique_ptr) to ensure proper cleanup even when tests fail.

Example:

std::unique_ptr<ArrowArrayStream> src_str(src.export_stream());
std::unique_ptr<ArrowArrayStream> dst_str(dst.export_stream());
REQUIRE_EQ(src_str->release, nullptr);
REQUIRE_NE(dst_str->release, nullptr);

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Nov 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.18%. Comparing base (a6fabde) to head (ff345bb).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #616      +/-   ##
==========================================
+ Coverage   88.82%   89.18%   +0.36%     
==========================================
  Files         107      107              
  Lines        8279     8288       +9     
==========================================
+ Hits         7354     7392      +38     
+ Misses        925      896      -29     
Flag Coverage Δ
unittests 89.18% <100.00%> (+0.36%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@Alex-PLACET Alex-PLACET left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok once the copilot comments fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants