Conversation

@METONLIULEI
Contributor

No description provided.

@wgtmac
Member

wgtmac commented May 23, 2025

@lidavidm @raulcd Could you help review this?

@wgtmac
Member

wgtmac commented May 23, 2025

[  0%] Building C object _deps/nanoarrow-build/CMakeFiles/nanoarrow.dir/src/nanoarrow/common/array.c.o
cd /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/nanoarrow-build && /usr/bin/cc -DNANOARROW_DEBUG -I/home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/nanoarrow-src/src -I/home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/nanoarrow-build/src -g -std=gnu99 -fPIC -fsanitize=address -fno-omit-frame-pointer -fsanitize=undefined -Wall -Werror -Wextra -Wpedantic -Wno-type-limits -Wmaybe-uninitialized -Wunused-result -Wconversion -Wno-sign-conversion -Wno-misleading-indentation -MD -MT _deps/nanoarrow-build/CMakeFiles/nanoarrow.dir/src/nanoarrow/common/array.c.o -MF CMakeFiles/nanoarrow.dir/src/nanoarrow/common/array.c.o.d -o CMakeFiles/nanoarrow.dir/src/nanoarrow/common/array.c.o -c /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/nanoarrow-src/src/nanoarrow/common/array.c

It seems that the sanitizer flags are leaking into third-party builds. We need to use target-based compile properties like below:

add_library(iceberg_sanitizer_flags INTERFACE)
set(SANITIZER_FLAGS "-fsanitize=address,undefined")
target_compile_options(iceberg_sanitizer_flags INTERFACE ${SANITIZER_FLAGS})
target_link_options(iceberg_sanitizer_flags INTERFACE ${SANITIZER_FLAGS})

target_link_libraries(xxx PRIVATE iceberg_sanitizer_flags)

Comment on lines 43 to 45
env:
CC: ${{ matrix.cc }}
CXX: ${{ matrix.cxx }}
Member

I don't see this defined in the matrix?

Member

@raulcd left a comment

Thanks for the PR. Some minor comments; in general LGTM.

concurrency:
group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
cancel-in-progress: true

Member

Could you add contents read permissions?

permissions:
  contents: read

CXX: ${{ matrix.cxx }}
run: |
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DICEBERG_ENABLE_ASAN=ON -DICEBERG_ENABLE_UBSAN=ON
Member

CMAKE_EXPORT_COMPILE_COMMANDS=ON already seems to be set by default in the project's CMake files; is it necessary to set it explicitly here?

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

- name: Run Tests
working-directory: build
env:
ASAN_OPTIONS: log_path=out.log:detect_leaks=1:symbolize=1:strict_string_checks=1:halt_on_error=0:detect_container_overflow=0
Member

Should we upload the output logs from the job? Or should we remove log_path=out.log if it is not used?

      - name: Save the test output
        if: always()
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
        with:
          name: test-output
          path: out.log

Contributor Author

All comments have been addressed. Thanks.

# specific language governing permissions and limitations
# under the License.

# Sanitize check for address and undefined.
Member

Suggested change
- # Sanitize check for address and undefined.

I think this is redundant

contents: read

jobs:
asan-test:
Member

Suggested change
- asan-test:
+ sanitizer-test:

Comment on lines 20 to 23
if(ICEBERG_ENABLE_ASAN OR ICEBERG_ENABLE_UBSAN)
add_library(iceberg_sanitizer_flags INTERFACE)
set(SANITIZER_FLAGS iceberg_sanitizer_flags)
endif()
Member

Suggested change
- if(ICEBERG_ENABLE_ASAN OR ICEBERG_ENABLE_UBSAN)
-   add_library(iceberg_sanitizer_flags INTERFACE)
-   set(SANITIZER_FLAGS iceberg_sanitizer_flags)
- endif()
+ add_library(iceberg_sanitizer_flags INTERFACE)

It is much simpler to make it available at all times even without any flag.
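
A minimal sketch of what this suggestion could look like, not the code that was merged. It assumes the two option names seen on the CI command line (ICEBERG_ENABLE_ASAN, ICEBERG_ENABLE_UBSAN) are declared as CMake options with OFF defaults; the exact flag split per option is also an assumption. The target always exists and simply carries no flags when both options are off, so consumers can link it unconditionally:

```cmake
# Sketch only: option defaults and the per-option flag split are assumptions.
option(ICEBERG_ENABLE_ASAN "Enable AddressSanitizer" OFF)
option(ICEBERG_ENABLE_UBSAN "Enable UndefinedBehaviorSanitizer" OFF)

# Always define the interface target; it is empty unless an option is ON.
add_library(iceberg_sanitizer_flags INTERFACE)

if(ICEBERG_ENABLE_ASAN)
  target_compile_options(iceberg_sanitizer_flags
                         INTERFACE -fsanitize=address -fno-omit-frame-pointer)
  target_link_options(iceberg_sanitizer_flags INTERFACE -fsanitize=address)
endif()

if(ICEBERG_ENABLE_UBSAN)
  target_compile_options(iceberg_sanitizer_flags INTERFACE -fsanitize=undefined)
  target_link_options(iceberg_sanitizer_flags INTERFACE -fsanitize=undefined)
endif()
```

With this shape, the `if(TARGET ...)` guards at the link sites become unnecessary, because linking an empty interface library has no effect on the build.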

Comment on lines 150 to 153
if(TARGET ${SANITIZER_FLAGS})
target_link_libraries(${LIB_NAME}_shared
PRIVATE "$<BUILD_INTERFACE:${SANITIZER_FLAGS}>")
endif()
Member

Suggested change
- if(TARGET ${SANITIZER_FLAGS})
-   target_link_libraries(${LIB_NAME}_shared
-                         PRIVATE "$<BUILD_INTERFACE:${SANITIZER_FLAGS}>")
- endif()
+ target_link_libraries(${LIB_NAME}_shared
+                       PUBLIC "$<BUILD_INTERFACE:iceberg_sanitizer_flags>")

Member

It should be visible to downstream targets (e.g. unit tests) as well.
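
To illustrate the point about visibility, here is a hedged sketch with hypothetical target and file names (`my_lib`, `my_test`, `my_lib.cc`, `my_test.cc`); none of them come from the PR. With PUBLIC, the interface target's compile options become part of the library's usage requirements, so a test executable that links the library is compiled with the same sanitizer flags without naming them itself:

```cmake
# Sketch only: my_lib / my_test and their sources are hypothetical names.
add_library(iceberg_sanitizer_flags INTERFACE)
target_compile_options(iceberg_sanitizer_flags INTERFACE -fsanitize=address,undefined)
target_link_options(iceberg_sanitizer_flags INTERFACE -fsanitize=address,undefined)

add_library(my_lib STATIC my_lib.cc)
# PUBLIC: my_lib is built with the flags, and targets linking my_lib inherit them.
# BUILD_INTERFACE keeps the helper target out of the installed/exported interface.
target_link_libraries(my_lib PUBLIC "$<BUILD_INTERFACE:iceberg_sanitizer_flags>")

add_executable(my_test my_test.cc)
# No explicit reference to iceberg_sanitizer_flags is needed here.
target_link_libraries(my_test PRIVATE my_lib)
```

Had the library linked the flags PRIVATE, `my_test` would not inherit the sanitizer compile options and would have to link `iceberg_sanitizer_flags` itself.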

Comment on lines 210 to 213
if(TARGET ${SANITIZER_FLAGS})
target_link_libraries(${LIB_NAME}_static
PRIVATE "$<BUILD_INTERFACE:${SANITIZER_FLAGS}>")
endif()
Member

Suggested change
- if(TARGET ${SANITIZER_FLAGS})
-   target_link_libraries(${LIB_NAME}_static
-                         PRIVATE "$<BUILD_INTERFACE:${SANITIZER_FLAGS}>")
- endif()
+ target_link_libraries(${LIB_NAME}_static
+                       PUBLIC "$<BUILD_INTERFACE:iceberg_sanitizer_flags>")

Member

@wgtmac left a comment

+1 Thanks!

@raulcd
Member

raulcd commented May 28, 2025

I am slightly confused. The log output from the job (see the artifact uploaded here:
https://github.com/apache/iceberg-cpp/actions/runs/15293677546?pr=107)
suggests there are several leaks, for example:

Indirect leak of 31 byte(s) in 1 object(s) allocated from:
    #0 0x7f1fcbcfe548 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:95
    #1 0x557a45d3e428 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/13/bits/new_allocator.h:151
    #2 0x557a45cf56e5 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/13/bits/allocator.h:198
    #3 0x557a45cf56e5 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/13/bits/alloc_traits.h:482
    #4 0x557a45cf56e5 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/13/bits/basic_string.h:126
    #5 0x557a45cf3657 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/13/bits/basic_string.tcc:159
    #6 0x557a45cf5196 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/include/c++/13/bits/basic_string.tcc:332
    #7 0x557a45cddb0c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace_aux(unsigned long, unsigned long, unsigned long, char) /usr/include/c++/13/bits/basic_string.tcc:468
    #8 0x557a45e0eefd in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(unsigned long, char) /usr/include/c++/13/bits/basic_string.h:1488
    #9 0x557a45e1ac3a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::resize(unsigned long, char) /usr/include/c++/13/bits/basic_string.tcc:405
    #10 0x557a45e1737f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::resize(unsigned long) /usr/include/c++/13/bits/basic_string.h:1114
    #11 0x557a465e9c49 in EncodeMetadata /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/vendoredarrow-src/cpp/src/arrow/c/bridge.cc:154
    #12 0x557a465eb370 in ExportMetadata /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/vendoredarrow-src/cpp/src/arrow/c/bridge.cc:296
    #13 0x557a465ea162 in ExportField /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/vendoredarrow-src/cpp/src/arrow/c/bridge.cc:189
    #14 0x557a465eb1b2 in ExportChildren /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/vendoredarrow-src/cpp/src/arrow/c/bridge.cc:283
    #15 0x557a465ea5cb in ExportSchema /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/vendoredarrow-src/cpp/src/arrow/c/bridge.cc:209
    #16 0x557a465ed9db in arrow::ExportSchema(arrow::Schema const&, ArrowSchema*) /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/vendoredarrow-src/cpp/src/arrow/c/bridge.cc:523
    #17 0x557a45c92944 in iceberg::FromArrowSchemaTest_StructType_Test::TestBody() /home/runner/work/iceberg-cpp/iceberg-cpp/test/arrow_test.cc:357
    #18 0x557a46fb1d22 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:2638
    #19 0x557a46faac32 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:2674
    #20 0x557a46f85b37 in testing::Test::Run() /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:2713
    #21 0x557a46f865f5 in testing::TestInfo::Run() /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:2859
    #22 0x557a46f86fbd in testing::TestSuite::Run() /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:3037
    #23 0x557a46f96fe3 in testing::internal::UnitTestImpl::RunAllTests() /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:5967
    #24 0x557a46fb31a7 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:2638
    #25 0x557a46fabeda in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:2674
    #26 0x557a46f954cf in testing::UnitTest::Run() /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest.cc:5546
    #27 0x557a45e51ea1 in RUN_ALL_TESTS() /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/include/gtest/gtest.h:2334
    #28 0x557a45e51e89 in main /home/runner/work/iceberg-cpp/iceberg-cpp/build/_deps/googletest-src/googletest/src/gtest_main.cc:64
    #29 0x7f1fcac2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
    #30 0x7f1fcac2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
    #31 0x557a45c6a584 in _start (/home/runner/work/iceberg-cpp/iceberg-cpp/build/test/arrow_test+0x12ac584) (BuildId: a1cb9affe9505874d1d4ac3b05b928803d7bf509)

Shouldn't the CI job fail?

I might be misinterpreting, but I would expect the job to fail if a leak is found (as seems to be the case from the logs).

- name: Run Tests
working-directory: build
env:
ASAN_OPTIONS: log_path=out.log:detect_leaks=1:symbolize=1:strict_string_checks=1:halt_on_error=0:detect_container_overflow=0
Member

Suggested change
- ASAN_OPTIONS: log_path=out.log:detect_leaks=1:symbolize=1:strict_string_checks=1:halt_on_error=0:detect_container_overflow=0
+ ASAN_OPTIONS: log_path=out.log:detect_leaks=1:symbolize=1:strict_string_checks=1:halt_on_error=1:detect_container_overflow=0

@raulcd Perhaps we need to enable halt_on_error to avoid the errors being swallowed by the log files.

Contributor Author

@raulcd Currently we set halt_on_error=0 so as not to block the tests, and we will fix all leaks later.

Member

I'll let you and @wgtmac decide, but I would rather merge a failing CI job in this case, knowing that it has to be fixed in a subsequent PR, than a false positive.

env:
ASAN_OPTIONS: log_path=out.log:detect_leaks=1:symbolize=1:strict_string_checks=1:halt_on_error=0:detect_container_overflow=0
LSAN_OPTIONS: suppressions=${{ github.workspace }}/.github/lsan-suppressions.txt
UBSAN_OPTIONS: log_path=out.log:print_stacktrace=1:suppressions=${{ github.workspace }}/.github/ubsan-suppressions.txt
Member

Suggested change
- UBSAN_OPTIONS: log_path=out.log:print_stacktrace=1:suppressions=${{ github.workspace }}/.github/ubsan-suppressions.txt
+ UBSAN_OPTIONS: log_path=out.log:halt_on_error=1:print_stacktrace=1:suppressions=${{ github.workspace }}/.github/ubsan-suppressions.txt

@wgtmac
Member

wgtmac commented May 29, 2025

I would suggest doing the following:

  1. Temporarily use halt_on_error=1 to confirm that the sanitizer CI breaks, which proves that this PR works.
  2. Check in the PR as is (the sanitizer CI failure will affect every PR until it is fixed) or disable the sanitizer CI to make all CIs green.
  3. Use a separate PR to fix all sanitizer issues.

@raulcd
Member

raulcd commented May 29, 2025

1. Temporarily use `halt_on_error=1` to confirm that the sanitizer CI breaks to prove this PR works.

That makes sense to me: validate that the PR actually fails the CI job, then open an issue to fix the sanitizer issues and turn on halt_on_error once the errors are fixed.

Thanks @wgtmac

@wgtmac
Member

wgtmac commented May 30, 2025

https://github.com/apache/iceberg-cpp/actions/runs/15329887114/job/43133746854?pr=107 confirmed that this PR works as expected.

@wgtmac
Member

wgtmac commented May 30, 2025

Now we can go with either approach:

  • Fix all issues together in this PR.
  • Temporarily set halt_on_error=0 to merge this PR and fix all issues in the next PR.

Please feel free to choose one @METONLIULEI

@METONLIULEI
Contributor Author

@wgtmac I have reset halt_on_error=0 now and will fix all issues in the next PR.

Member

@wgtmac left a comment

Thanks!

Please fix the PR title to ci: add asan and ubsan support

@METONLIULEI changed the title from "feat: add asan and ubsan support to cmake" to "ci: add asan and ubsan support" May 30, 2025
Member

@raulcd left a comment

Thanks for validating!

Contributor

@Fokko left a comment

This is too deep into C++ for me, but I trust @wgtmac's, @raulcd's and @lidavidm's expertise here. Thanks for adding this, @METONLIULEI.

@Fokko Fokko merged commit b4f7d5b into apache:main Jun 3, 2025
7 checks passed
@METONLIULEI METONLIULEI deleted the asan branch June 4, 2025 14:52
gty404 pushed a commit to gty404/iceberg-cpp that referenced this pull request Jun 14, 2025

5 participants