118 changes: 118 additions & 0 deletions BUILD_CHINA.md
@@ -0,0 +1,118 @@
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->

# Building in China

This guide helps developers in China who may experience network issues when downloading dependencies from GitHub or international mirrors.

## Using Custom Mirror URLs

If you experience download timeouts, you can override the default dependency URLs using environment variables:

```bash
export ICEBERG_ARROW_URL="<your-mirror-url>/apache-arrow-22.0.0.tar.gz"
export ICEBERG_NANOARROW_URL="<your-mirror-url>/apache-arrow-nanoarrow-0.7.0.tar.gz"
export ICEBERG_CROARING_URL="<your-mirror-url>/CRoaring-v4.3.11.tar.gz"
export ICEBERG_NLOHMANN_JSON_URL="<your-mirror-url>/json-v3.11.3.tar.xz"
export ICEBERG_SPDLOG_URL="<your-mirror-url>/spdlog-v1.15.3.tar.gz"
export ICEBERG_CPR_URL="<your-mirror-url>/cpr-1.12.0.tar.gz"

# For Avro (git repository):
export ICEBERG_AVRO_GIT_URL="<your-git-mirror>/avro.git"
# Or if you have a tarball:
export ICEBERG_AVRO_URL="<your-mirror-url>/avro.tar.gz"
```

Then build as usual:

```bash
cmake -S . -B build
cmake --build build
```
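
If every tarball is hosted under one base URL, the exports above can be generated from that base. The following is a minimal sketch, not part of this PR: `set_iceberg_mirror` and `mirror.example.com` are hypothetical, and the file names are assumed to match the placeholders used in this guide.

```bash
#!/usr/bin/env bash
# Export every tarball override from a single mirror base URL.
# Assumption: the mirror stores the archives under exactly these file names.
set_iceberg_mirror() {
  local base="${1%/}"   # drop one trailing slash, if present
  export ICEBERG_ARROW_URL="${base}/apache-arrow-22.0.0.tar.gz"
  export ICEBERG_NANOARROW_URL="${base}/apache-arrow-nanoarrow-0.7.0.tar.gz"
  export ICEBERG_CROARING_URL="${base}/CRoaring-v4.3.11.tar.gz"
  export ICEBERG_NLOHMANN_JSON_URL="${base}/json-v3.11.3.tar.xz"
  export ICEBERG_SPDLOG_URL="${base}/spdlog-v1.15.3.tar.gz"
  export ICEBERG_CPR_URL="${base}/cpr-1.12.0.tar.gz"
}

# Hypothetical mirror; replace it with one you can actually reach.
set_iceberg_mirror "https://mirror.example.com/iceberg-deps/"
echo "$ICEBERG_SPDLOG_URL"
```

The function must run in the same shell session that later invokes `cmake -S . -B build`, because the variables are read from the environment at configure time.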

## Alternative Solutions

1. **Use system packages**: Install the dependencies via your system package manager.
2. **Use a proxy**: Set the `https_proxy` environment variable.
3. **Pre-download**: Manually download the tarballs and point the matching `ICEBERG_*_URL` variable at the local file.
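
For the pre-download option, one approach that does not depend on any particular cache directory is to point each override at the local archive, since CMake's `URL` option also accepts a local file path. The following is a minimal sketch: `use_local_tarballs` is a hypothetical helper, and it assumes the tarballs were saved under the file names used in this guide.

```bash
#!/usr/bin/env bash
# Point each ICEBERG_*_URL override at a tarball that already exists locally.
# Pass an absolute directory so the path stays valid wherever CMake runs.
use_local_tarballs() {
  local dir="${1%/}"
  local pair var file
  for pair in \
    "ICEBERG_ARROW_URL=apache-arrow-22.0.0.tar.gz" \
    "ICEBERG_NANOARROW_URL=apache-arrow-nanoarrow-0.7.0.tar.gz" \
    "ICEBERG_CROARING_URL=CRoaring-v4.3.11.tar.gz" \
    "ICEBERG_NLOHMANN_JSON_URL=json-v3.11.3.tar.xz" \
    "ICEBERG_SPDLOG_URL=spdlog-v1.15.3.tar.gz" \
    "ICEBERG_CPR_URL=cpr-1.12.0.tar.gz"; do
    var="${pair%%=*}"    # text before the first '='
    file="${pair#*=}"    # text after the first '='
    if [ -f "${dir}/${file}" ]; then
      export "${var}=${dir}/${file}"
      echo "using local ${file}"
    else
      # Leave the variable alone so the default URL still applies.
      echo "missing ${file}; ${var} left unchanged" >&2
    fi
  done
}
```

Calling `use_local_tarballs "$HOME/iceberg-deps"` before configuring overrides only the dependencies whose archives are present; the rest keep their default download URLs.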

## Getting Help

If you continue experiencing build issues, please open an issue at https://github.com/apache/iceberg-cpp/issues with details about which dependency failed.

# Building in China

This guide helps developers in China build iceberg-cpp when network access to GitHub and other international sites is limited.

## Mirror Support

The build system automatically tries alternative download mirrors when the primary URL fails. All third-party dependencies have been configured with China-based mirrors.

### Available Mirrors

Dependencies are automatically downloaded from these mirror sites:

**Apache Projects (Arrow, Nanoarrow):**
- Tsinghua University: https://mirrors.tuna.tsinghua.edu.cn/apache/
- USTC: https://mirrors.ustc.edu.cn/apache/

**GitHub Projects (CRoaring, nlohmann-json, spdlog, cpr):**
- Gitee: https://gitee.com/mirrors/
- FastGit: https://hub.fastgit.xyz/

**Note**: Avro requires a git repository (unreleased version). Automatic mirror fallback is not available for git repositories, but you can specify a custom git mirror using the `ICEBERG_AVRO_GIT_URL` environment variable.

### Custom Mirror URLs

To override the default mirrors, set environment variables before running CMake:

```bash
export ICEBERG_ARROW_URL="https://mirrors.tuna.tsinghua.edu.cn/apache/arrow/arrow-22.0.0/apache-arrow-22.0.0.tar.gz"
export ICEBERG_NANOARROW_URL="https://mirrors.tuna.tsinghua.edu.cn/apache/arrow/apache-arrow-nanoarrow-0.7.0/apache-arrow-nanoarrow-0.7.0.tar.gz"
export ICEBERG_CROARING_URL="https://gitee.com/mirrors/CRoaring/repository/archive/v4.3.11.tar.gz"
export ICEBERG_NLOHMANN_JSON_URL="https://gitee.com/mirrors/JSON-for-Modern-CPP/releases/download/v3.11.3/json.tar.xz"
export ICEBERG_SPDLOG_URL="https://gitee.com/mirrors/spdlog/repository/archive/v1.15.3.tar.gz"
export ICEBERG_CPR_URL="https://gitee.com/mirrors/cpr/repository/archive/1.12.0.tar.gz"

# For Avro, you can use either a tarball URL or a git repository URL:
export ICEBERG_AVRO_URL="https://example.com/avro.tar.gz" # if you have a tarball
# OR
export ICEBERG_AVRO_GIT_URL="https://gitee.com/mirrors/avro.git" # for git mirror
```

Then build as usual:

```bash
cmake -S . -B build
cmake --build build
```
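
When both Avro variables are set, the order of the checks in `resolve_avro_dependency()` in this PR determines which one is used: the tarball URL is tested first, then the git URL, then the GitHub default. The shell sketch below only mirrors that branch order for illustration; `avro_source` is a hypothetical helper and not part of the build.

```bash
#!/usr/bin/env bash
# Report which Avro source the configure step would select, following the
# if / elseif / else order in resolve_avro_dependency().
# ${VAR+set} is non-empty whenever VAR is defined, even if it is empty,
# which matches CMake's `DEFINED ENV{VAR}` test.
avro_source() {
  if [ -n "${ICEBERG_AVRO_URL+set}" ]; then
    echo "tarball: ${ICEBERG_AVRO_URL}"
  elif [ -n "${ICEBERG_AVRO_GIT_URL+set}" ]; then
    echo "git: ${ICEBERG_AVRO_GIT_URL}"
  else
    echo "git: https://github.com/apache/avro.git"
  fi
}
```

In practice this means that exporting both variables, as a literal copy of the example above would do, selects the tarball; unset `ICEBERG_AVRO_URL` to use the git mirror instead.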

## Troubleshooting

**Download failures:**
- Try setting a specific mirror using environment variables
- Use a VPN or proxy: `export https_proxy=http://proxy:port`
- Pre-download the tarballs and point the matching `ICEBERG_*_URL` variable at the local file

**Slow downloads:**
- The build will automatically retry with different mirrors
- Consider using the Meson build system as an alternative

**Still having issues?**
Open an issue at https://github.com/apache/iceberg-cpp/issues with details about which dependency failed and the error message.
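
When reporting a failure, it can help to include the overrides that were in effect. The following is a minimal sketch; `show_iceberg_overrides` is a hypothetical helper that simply lists the `ICEBERG_*_URL` variables visible in the current shell.

```bash
#!/usr/bin/env bash
# Print every ICEBERG_*_URL override that the configure step would see.
show_iceberg_overrides() {
  local found
  found="$(env | grep -E '^ICEBERG_[A-Z_]+_URL=' | sort)"
  if [ -n "$found" ]; then
    printf '%s\n' "$found"
  else
    echo "no ICEBERG_*_URL overrides set"
  fi
}
```

Run it in the same shell session that invokes CMake, and paste the output into the issue alongside the error message.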
2 changes: 2 additions & 0 deletions README.md
@@ -31,6 +31,8 @@ C++ implementation of [Apache Iceberg™](https://iceberg.apache.org/).
- CMake 3.25 or higher
- C++23 compliant compiler

> **Note**: For developers in China experiencing network issues when downloading dependencies, see [BUILD_CHINA.md](BUILD_CHINA.md) for mirror configuration.
Collaborator

-1 from me. sorry about that :(

One quick question: do other Apache projects handle this the same way for China?

iceberg-cpp should build against system library dependencies, and I think that's probably the right direction.

Contributor @HuaHuaY, Nov 14, 2025

Agree with you. We have CMake options such as ICEBERG_ARROW_URL that allow users to customize URLs. If other projects use specific URLs for China, I would also support following that approach.

Author

Thanks for the review!
To clarify: this PR doesn’t introduce region-specific behavior or change defaults. It only provides optional mirror URLs, just like ICEBERG_ARROW_URL, to make the build workable for developers in China who frequently hit GitHub timeouts.

If preferred, I can simplify the PR so it only adds optional CMake variables + documentation, matching how other Apache projects handle this.
Please let me know and I’ll update it accordingly.

Member

Here is my -1 too. I don't think it's a good idea to add other mirrors in ARROW_SOURCE_URL either.

Author

After researching Apache best practices, I removed all hardcoded mirror URLs and kept only optional environment variables for custom mirrors, just like ICEBERG_ARROW_URL. I added documentation showing how users can set these variables if needed. No defaults changed; everything still defaults to the original URLs. This keeps the build flexible, respects Apache guidelines, and helps developers facing network issues. Please review and let me know if you need any changes.


## Build

### Build, Run Test and Install Core Libraries
91 changes: 75 additions & 16 deletions cmake_modules/IcebergThirdpartyToolchain.cmake
@@ -164,17 +164,42 @@ function(resolve_avro_dependency)
       OFF
       CACHE BOOL "" FORCE)

-  fetchcontent_declare(avro-cpp
-                       ${FC_DECLARE_COMMON_OPTIONS}
-                       # TODO: switch to Apache Avro 1.13.0 once released.
-                       GIT_REPOSITORY https://github.com/apache/avro.git
-                       GIT_TAG e6c308780e876b4c11a470b9900995947f7b0fb5
-                       SOURCE_SUBDIR
-                       lang/c++
-                       FIND_PACKAGE_ARGS
-                       NAMES
-                       avro-cpp
-                       CONFIG)
+  if(DEFINED ENV{ICEBERG_AVRO_URL})
+    # Support custom tarball URL
+    fetchcontent_declare(avro-cpp
+                         ${FC_DECLARE_COMMON_OPTIONS}
+                         URL $ENV{ICEBERG_AVRO_URL}
+                         SOURCE_SUBDIR
+                         lang/c++
+                         FIND_PACKAGE_ARGS
+                         NAMES
+                         avro-cpp
+                         CONFIG)
+  elseif(DEFINED ENV{ICEBERG_AVRO_GIT_URL})
+    # Support custom git URL for mirrors
+    fetchcontent_declare(avro-cpp
+                         ${FC_DECLARE_COMMON_OPTIONS}
+                         GIT_REPOSITORY $ENV{ICEBERG_AVRO_GIT_URL}
+                         GIT_TAG e6c308780e876b4c11a470b9900995947f7b0fb5
+                         SOURCE_SUBDIR
+                         lang/c++
+                         FIND_PACKAGE_ARGS
+                         NAMES
+                         avro-cpp
+                         CONFIG)
+  else()
+    # Default to GitHub - uses unreleased version
+    fetchcontent_declare(avro-cpp
+                         ${FC_DECLARE_COMMON_OPTIONS}
+                         GIT_REPOSITORY https://github.com/apache/avro.git
+                         GIT_TAG e6c308780e876b4c11a470b9900995947f7b0fb5
+                         SOURCE_SUBDIR
+                         lang/c++
+                         FIND_PACKAGE_ARGS
+                         NAMES
+                         avro-cpp
+                         CONFIG)
+  endif()

   fetchcontent_makeavailable(avro-cpp)

@@ -221,9 +246,17 @@ endfunction()
 function(resolve_nanoarrow_dependency)
   prepare_fetchcontent()

+  if(DEFINED ENV{ICEBERG_NANOARROW_URL})
+    set(NANOARROW_URL "$ENV{ICEBERG_NANOARROW_URL}")
+  else()
+    set(NANOARROW_URL
+        "https://dlcdn.apache.org/arrow/apache-arrow-nanoarrow-0.7.0/apache-arrow-nanoarrow-0.7.0.tar.gz"
+    )
+  endif()
+
   fetchcontent_declare(nanoarrow
                        ${FC_DECLARE_COMMON_OPTIONS}
-                       URL "https://dlcdn.apache.org/arrow/apache-arrow-nanoarrow-0.7.0/apache-arrow-nanoarrow-0.7.0.tar.gz"
+                       URL ${NANOARROW_URL}
                        FIND_PACKAGE_ARGS
                        NAMES
                        nanoarrow
@@ -270,9 +303,16 @@ function(resolve_croaring_dependency)
   set(ENABLE_ROARING_TESTS OFF)
   set(ENABLE_ROARING_MICROBENCHMARKS OFF)

+  if(DEFINED ENV{ICEBERG_CROARING_URL})
+    set(CROARING_URL "$ENV{ICEBERG_CROARING_URL}")
+  else()
+    set(CROARING_URL
+        "https://github.com/RoaringBitmap/CRoaring/archive/refs/tags/v4.3.11.tar.gz")
+  endif()
+
   fetchcontent_declare(croaring
                        ${FC_DECLARE_COMMON_OPTIONS}
-                       URL "https://github.com/RoaringBitmap/CRoaring/archive/refs/tags/v4.3.11.tar.gz"
+                       URL ${CROARING_URL}
                        FIND_PACKAGE_ARGS
                        NAMES
                        roaring
@@ -318,9 +358,16 @@ function(resolve_nlohmann_json_dependency)
       OFF
       CACHE BOOL "" FORCE)

+  if(DEFINED ENV{ICEBERG_NLOHMANN_JSON_URL})
+    set(NLOHMANN_JSON_URL "$ENV{ICEBERG_NLOHMANN_JSON_URL}")
+  else()
+    set(NLOHMANN_JSON_URL
+        "https://github.com/nlohmann/json/releases/download/v3.11.3/json.tar.xz")
+  endif()
+
   fetchcontent_declare(nlohmann_json
                        ${FC_DECLARE_COMMON_OPTIONS}
-                       URL "https://github.com/nlohmann/json/releases/download/v3.11.3/json.tar.xz"
+                       URL ${NLOHMANN_JSON_URL}
                        FIND_PACKAGE_ARGS
                        NAMES
                        nlohmann_json
@@ -378,9 +425,15 @@ function(resolve_spdlog_dependency)
       ON
       CACHE BOOL "" FORCE)

+  if(DEFINED ENV{ICEBERG_SPDLOG_URL})
+    set(SPDLOG_URL "$ENV{ICEBERG_SPDLOG_URL}")
+  else()
+    set(SPDLOG_URL "https://github.com/gabime/spdlog/archive/refs/tags/v1.15.3.tar.gz")
+  endif()
+
   fetchcontent_declare(spdlog
                        ${FC_DECLARE_COMMON_OPTIONS}
-                       URL "https://github.com/gabime/spdlog/archive/refs/tags/v1.15.3.tar.gz"
+                       URL ${SPDLOG_URL}
                        FIND_PACKAGE_ARGS
                        NAMES
                        spdlog
@@ -440,9 +493,15 @@ function(resolve_cpr_dependency)
   set(CPR_ENABLE_SSL ON)
   set(CPR_USE_SYSTEM_CURL ON)

+  if(DEFINED ENV{ICEBERG_CPR_URL})
+    set(CPR_URL "$ENV{ICEBERG_CPR_URL}")
+  else()
+    set(CPR_URL "https://github.com/libcpr/cpr/archive/refs/tags/1.12.0.tar.gz")
+  endif()
+
   fetchcontent_declare(cpr
                        ${FC_DECLARE_COMMON_OPTIONS}
-                       URL https://github.com/libcpr/cpr/archive/refs/tags/1.12.0.tar.gz
+                       URL ${CPR_URL}
                        FIND_PACKAGE_ARGS
                        NAMES
                        cpr