Commit 7802eb5

benvanik and claude authored
iree/async/: Proactor-based async I/O and causal frontier scheduling (#23527)
For the humans, from the human: this is a few weeks of deep feature branch work with a team of @claude's. It's gone through dozens of review cycles with mixed model teams and had quite a bit of stress testing. The design is convergent with subsequent work on the iree/net/ layer (which is built on top of this) as well as the remote HAL (which uses both). The future AMDGPU backend (and eventually all HAL drivers) will be natively built on iree/async/ for all their internal scheduling, enabling us to do distributed async execution of heterogeneous CPU/NVMe/NIC/NPU/GPU workloads. The existing iree/task/ system will be rebased on this soonish to replace its current polling infrastructure, and iree_loop_t will be upgraded to integrate better for wait operations. For now, this is a complete foundation across our platforms of interest and enough to unblock the AMDGPU and remote HAL efforts.

---

This PR introduces `iree/async/`, a completion-based (proactor pattern) async I/O layer that serves as the foundation for IREE's networking, storage, and distributed scheduling. It depends only on `iree/base/` and provides the substrate that HAL drivers, networking, task executors, and the VM runtime will build on.

### Why

ML inference at scale moves large tensors between GPUs, across networks, and through storage with latency budgets measured in microseconds. The data that matters (model weights, activations, KV caches) lives in GPU VRAM. Moving it between machines for distributed inference or to NVMe for checkpointing should not require the CPU to touch every byte. Modern hardware can already do this: NICs read directly from GPU VRAM (GPUDirect RDMA), NVMe controllers write to GPU memory (GPUDirect Storage), and GPUs access each other's memory across PCIe or NVLink. The software layer's job is to orchestrate these transfers, not participate in them.
The existing approach of layering vendor-specific libraries (NCCL, RCCL) with traditional reactor I/O (select/poll/epoll + read/write) cannot express the pipelines we need. A reactor tells you "this fd is ready" and then you make a separate syscall to do the I/O: every transition costs two syscalls and a copy through kernel buffers. You cannot express "wait for GPU completion, then send the result over TCP, then write to NVMe" as a single atomic submission. Layering multiple runtime systems means multiple threading models, multiple synchronization primitives, and multiple memory management systems, each adding latency at every boundary.

A completion-based proactor that handles all I/O through one submission/completion interface eliminates these boundaries. On io_uring, an entire pipeline (GPU fence wait, network send from registered GPU memory, disk write) can execute as linked SQEs in kernel space with zero userspace transitions between steps. The proactor is not in the data path; hardware and kernel handle the transfers directly.

### Causal frontier scheduling

Beyond the I/O layer, this PR introduces a causal dependency tracking system based on vector clock frontiers. Timeline semaphores (the bridge between GPU queues and async I/O) carry frontier metadata: sparse vectors of `(axis, epoch)` pairs where each axis identifies a causal source (a GPU queue, a collective operation, a host thread) and each epoch marks a position on that timeline.

When a GPU queue signals a semaphore, the signal carries the queue's current frontier: a compact summary of everything that happened before the signal. When another queue waits on that semaphore, it inherits the frontier through a merge (component-wise maximum). Causal knowledge propagates transitively: if queue C waits on a semaphore signaled by queue B, which previously waited on a semaphore signaled by queue A, queue C's frontier reflects A's work without any direct interaction.
This enables three capabilities that binary events and standalone timeline semaphores cannot provide:

**Wait elision**: When a queue's local frontier already dominates an operation's dependency frontier, the device wait is skipped entirely. Sequential single-queue workloads pay zero synchronization cost: every wait is elided because the queue's own epoch already implies all prerequisites.

**O(1) buffer reuse**: When a buffer is freed, the deallocating queue's current frontier becomes the buffer's death frontier. Another queue can safely reuse the buffer by checking frontier dominance: one comparison instead of tracking every operation that touched the buffer. A weight tensor read by hundreds of operations has one death frontier, not hundreds of per-operation reference counts.

**Remote pipeline scheduling**: A remote machine receiving a frontier can locally determine whether prerequisites are satisfied across all contributing queues (including queues on other machines it has never communicated with directly) without round-trips to the originating devices. Entire multi-stage, multi-device pipelines can be submitted atomically before any work begins, and hardware FIFO ordering ensures correct execution.

Collective operations (all-reduce, all-gather) compress N device axes into a single collective channel axis, so tensor parallelism across 8 GPUs costs one frontier entry regardless of device count.

The [async scheduling design docs](docs/website/docs/developers/design-docs/async-scheduling/) include an interactive visualizer that renders DAGs, frontier propagation, and semaphore state across configurable scenarios, from laptop (3 concurrent models) to datacenter (multi-node MI300X cluster with RDMA), with step-through execution showing exactly how frontiers flow through pipelines.
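
The merge and dominance rules described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the `frontier.h` API: the `Frontier` dict type, the `merge` and `dominates` helpers, the axis names, and the convention that a missing axis counts as epoch 0 are all hypothetical.

```python
from typing import Dict

# Hypothetical representation: a sparse map from axis name to epoch.
Frontier = Dict[str, int]


def merge(a: Frontier, b: Frontier) -> Frontier:
    """Component-wise maximum over the union of both frontiers' axes."""
    merged = dict(a)
    for axis, epoch in b.items():
        merged[axis] = max(merged.get(axis, 0), epoch)
    return merged


def dominates(local: Frontier, required: Frontier) -> bool:
    """True if `local` has reached every (axis, epoch) pair in `required`."""
    return all(local.get(axis, 0) >= epoch for axis, epoch in required.items())


# Queue A signals a semaphore at epoch 3.
frontier_a = {"queue_a": 3}
# Queue B (at its own epoch 5) waits on that semaphore and inherits A's frontier.
frontier_b = merge({"queue_b": 5}, frontier_a)
# Queue C waits on a semaphore signaled by B and inherits A's work transitively.
frontier_c = merge({"queue_c": 1}, frontier_b)

# Wait elision: C already knows about queue_a >= 2, so a device wait is unnecessary.
can_skip_wait = dominates(frontier_c, {"queue_a": 2})

# Buffer reuse: the freeing queue's frontier becomes the buffer's death frontier.
death_frontier = dict(frontier_b)
can_reuse_on_c = dominates(frontier_c, death_frontier)  # C has seen all of B's work
can_reuse_on_a = dominates(frontier_a, death_frontier)  # A has not seen B's work
```

In this toy model the reuse check is a single dominance test against one death frontier, however many operations touched the buffer, which is the property the buffer-reuse paragraph describes.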
<img width="1192" height="1986" alt="image" src="https://github.com/user-attachments/assets/88366b2e-aca1-4c2a-8f90-7e4afea1f4c9" />

### What's here

**Core API** (`proactor.h`, `operation.h`, `semaphore.h`, `frontier.h`): The proactor manages async operation submission and completion dispatch through a vtable-dispatched interface. Operations are caller-owned, intrusive structs: no proactor allocation on submit. Semaphores provide cross-layer timeline synchronization with frontier-carrying signals. All operations carry status with rich annotations and stack traces; there are no silent failures.

**Operation types**: Sockets (TCP/UDP/Unix, with accept, connect, recv, send, sendto, recvfrom, close), files (positioned pread/pwrite with open, read, write, close), events (cross-thread signaling), notifications (level-triggered epoch-based wakeup), timers, semaphore wait/signal, futex wait/wake, sequences (linked operation chains), and cross-proactor messages. Operations support multishot delivery (persistent accept/recv) and linked chaining (kernel-side sequences on io_uring, callback-emulated elsewhere).

**Sockets** (`socket.h`): Immutable configuration at creation (REUSE_ADDR, REUSE_PORT, NO_DELAY, KEEPALIVE, ZERO_COPY), then bind/listen synchronously, then all I/O is async. Imported sockets from existing file descriptors. Sticky failure state: once a socket encounters an error, subsequent operations complete immediately with the recorded failure.

**Memory registration** (`region.h`, `span.h`, `slab.h`): Registered memory regions for zero-copy I/O. Buffer registration pins memory and pre-computes backend handles so I/O operations reference memory by handle rather than re-mapping on every operation. Scatter-gather spans are non-owning value types; the proactor retains regions for in-flight operations automatically. Slab registration for fixed-size slot allocation with io_uring provided buffer ring integration.
**Relays** (`relay.h`): Declarative source-to-sink event dataflow. Connect a readable fd or notification epoch advance to an eventfd write or notification signal. On io_uring, certain source/sink combinations execute entirely in kernel space via linked SQEs.

**Device fence bridging**: Import sync_file fds from GPU drivers to advance async semaphores when GPU work completes. Export semaphore values as sync_file fds for GPU command buffers to wait on. The proactor bridges between kernel device synchronization and the async scheduling system, enabling ahead-of-time pipeline construction across GPU and I/O boundaries.

**Signal handling**: Process-wide signal subscription through the proactor: signalfd on Linux, self-pipe on other POSIX platforms. SIGINT, SIGTERM, SIGHUP, SIGQUIT, SIGUSR1, SIGUSR2 dispatched as callbacks from within poll().

### Platform backends

**io_uring** (Linux 5.1+): The primary production backend. Direct syscalls, no liburing dependency. Exploits fixed files and registered buffers (avoid per-op fd lookup and page pinning), provided buffer rings for kernel-selected multishot recv buffers, linked SQEs for zero-round-trip operation sequences, zero-copy send (SEND_ZC), MSG_RING for cross-proactor messaging, futex ops (6.7+) for kernel-side semaphore waits in link chains, and sync_file fd polling for device fence import. Submit fills SQEs under a spinlock from any thread; io_uring_enter is called only from the poll thread (SINGLE_ISSUER).

**POSIX** (Linux epoll, macOS/BSD kqueue, fallback poll()): Broad-coverage backend with pluggable event notification. Emulates linked operations, multishot, and other io_uring features with per-step poll round-trips: functionally equivalent API, same behavioral contract, higher per-step latency. The proactive scheduling API costs nothing extra on POSIX while enabling zero-round-trip execution on io_uring. Platform-default selection: epoll on Linux, kqueue on macOS/BSD, poll() elsewhere.
**IOCP** (Windows): I/O Completion Ports backend. Closer in behavior to io_uring than the POSIX backend: completion-based rather than readiness-based. Socket operations, timer queue, and the full operation type set.

All backends report capabilities at runtime (`query_capabilities()`). Callers discover what's available (multishot, fixed files, registered buffers, linked operations, zero-copy send, dmabuf, device fences, absolute timeouts, futex operations, cross-proactor messaging) and adapt their code paths. "Emulated" in the capability matrix means the API works but uses a software fallback rather than a kernel-optimized path.

### Testing

A conformance test suite (CTS) validates all backends against shared test suites. Tests are written once and run against every registered backend configuration: 5 io_uring configurations with different capability masks, plus per-platform and per-feature POSIX configurations, plus IOCP. Tag-based filtering ensures tests only run against backends that support the features they exercise.

Test suites cover core operations, socket I/O (TCP, UDP, Unix, multishot, zero-copy), file I/O, events, notifications, semaphores (async/sync/linked), relays, fences, cancellation, error propagation, and resource exhaustion. Benchmarks measure dispatch scalability, sequence overhead, relay fan-out, socket throughput, and event pool performance.

### Thread safety model

The proactor's event loop is caller-driven: `poll()` has single-thread ownership, callbacks fire on the poll thread. `submit()`, `cancel()`, `wake()`, and `send_message()` are thread-safe from any thread. Semaphore signal/query and event set are thread-safe. Notification signal is both thread-safe and async-signal-safe. A utility wrapper (`proactor_thread.h`) provides optional dedicated-thread operation for applications that want it.
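
The caller-driven model above (thread-safe `submit()`, single-owner `poll()`, callbacks on the poll thread) can be illustrated with a toy Python stand-in. `ToyProactor` and its methods are hypothetical and are not the `iree/async/` API; operations here "complete" the moment they are submitted, which a real backend would not do.

```python
import queue
import threading


class ToyProactor:
    """Hypothetical stand-in showing only the threading contract."""

    def __init__(self):
        self._pending = queue.Queue()  # thread-safe hand-off to the poll thread
        self._poll_thread = None

    def submit(self, callback, result):
        # Safe to call from any thread: only enqueues, never runs the callback.
        self._pending.put((callback, result))

    def poll(self, timeout=1.0):
        # The first caller becomes the owner; other threads must not poll.
        me = threading.get_ident()
        if self._poll_thread is None:
            self._poll_thread = me
        assert self._poll_thread == me, "poll() has single-thread ownership"
        try:
            callback, result = self._pending.get(timeout=timeout)
        except queue.Empty:
            return 0
        callback(result)  # callbacks fire on the poll thread
        return 1


proactor = ToyProactor()
poll_thread_id = threading.get_ident()
callback_threads = []
results = []


def on_complete(result):
    callback_threads.append(threading.get_ident())
    results.append(result)


# Four worker threads submit concurrently.
workers = [
    threading.Thread(target=proactor.submit, args=(on_complete, i)) for i in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# The caller drives the loop; every callback runs here, on the polling thread.
completed = 0
while completed < 4:
    completed += proactor.poll()
```

Because callbacks only ever run inside `poll()`, callback code in this sketch needs no locking of its own, which is one practical consequence of a single-owner poll loop.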
### Design docs

- [`runtime/src/iree/async/README.md`](runtime/src/iree/async/README.md): full API documentation with architecture diagrams, ownership rules, code examples, and the capability matrix
- [`docs/.../async-scheduling/`](docs/website/docs/developers/design-docs/async-scheduling/): causal frontier design document with interactive visualizer, multi-device scheduling scenarios (laptop through datacenter), and comparison with binary events and standalone timeline semaphores

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 243fe33 commit 7802eb5

244 files changed, +78474 -11 lines changed

build_tools/bazel/build_test_all.sh

Lines changed: 17 additions & 2 deletions
@@ -126,11 +126,23 @@ fi
 #
 # Note that somewhat contrary to its name `bazel test` will also build
 # any non-test targets specified.
-# We use `bazel query //...` piped to `bazel test` rather than the simpler
+#
+# We use `bazel cquery //...` piped to `bazel test` rather than the simpler
 # `bazel test //...` because the latter excludes targets tagged "manual". The
 # "manual" tag allows targets to be excluded from human wildcard builds, but we
 # want them built by CI unless they are excluded with tags.
 #
+# The cquery uses Starlark output to solve two problems:
+# - Clean labels: --output=label includes config hashes (e.g. "(a1b2c3)")
+#   that xargs splits into spurious target patterns causing parse errors.
+#   Starlark's str(target.label) produces clean canonical labels.
+# - Platform filtering: targets with target_compatible_with for a different
+#   platform (e.g. Windows-only IOCP targets on a Linux host) have
+#   IncompatiblePlatformProvider and are excluded by the Starlark expression.
+#   Without this, the cquery-to-xargs pipeline turns them into explicitly-
+#   listed targets, which Bazel treats as errors (unlike wildcard builds
+#   where incompatible targets are silently skipped).
+#
 # Explicitly list bazelrc so that builds are reproducible and get cache hits
 # when this script is invoked locally.
 #
@@ -172,6 +184,9 @@ BAZEL_TEST_CMD+=(
   --config=generic_clang_ci
 )
 
-"${BAZEL_STARTUP_CMD[@]}" query //... | \
+CQUERY_STARLARK='str(target.label) if "IncompatiblePlatformProvider" not in providers(target) else ""'
+"${BAZEL_STARTUP_CMD[@]}" cquery //... \
+    --output=starlark --starlark:expr="${CQUERY_STARLARK}" 2>/dev/null | \
+    grep -v '^$' | \
   xargs --max-args 1000000 --max-chars 1000000 --exit \
     "${BAZEL_TEST_CMD[@]}"

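The two problems named in the script comment can be reproduced with a small Python simulation. It is a sketch only: the target labels and provider lists below are made up for illustration, `starlark_expr` merely mirrors the shape of the `CQUERY_STARLARK` expression, and nothing here invokes Bazel.

```python
# Hypothetical `cquery --output=label` lines: each label is followed by a config hash.
label_output = [
    "//pkg/a:lib (a1b2c3)",
    "//pkg/b:windows_only (a1b2c3)",
    "//pkg/c:test (a1b2c3)",
]

# xargs splits its input on whitespace, so each hash becomes its own argument.
xargs_args = " ".join(label_output).split()
spurious = [arg for arg in xargs_args if arg.startswith("(")]


def starlark_expr(label, providers):
    """Mirrors the expression: clean label, or "" for incompatible targets."""
    return label if "IncompatiblePlatformProvider" not in providers else ""


# Hypothetical targets paired with the names of the providers they carry.
targets = [
    ("//pkg/a:lib", []),
    ("//pkg/b:windows_only", ["IncompatiblePlatformProvider"]),
    ("//pkg/c:test", []),
]
starlark_lines = [starlark_expr(label, providers) for label, providers in targets]

# Equivalent of `grep -v '^$'`: drop the empty lines left by filtered targets.
clean_targets = [line for line in starlark_lines if line != ""]
```

With label output, three of the six arguments handed to `bazel test` would be bare hashes; with the Starlark expression plus the empty-line filter, only the two compatible labels remain.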
build_tools/bazel/iree.bazelrc

Lines changed: 1 addition & 0 deletions
@@ -133,6 +133,7 @@ build:generic_clang --copt=-Wno-extern-c-compat # Matches upstream. Cannot impac
 build:generic_clang --copt=-Wno-invalid-offsetof # Technically UB but needed for intrusive ptrs
 build:generic_clang --copt=-Wno-unused-const-variable
 build:generic_clang --copt=-Wno-unused-function
+build:generic_clang --copt=-Wno-unused-lambda-capture
 build:generic_clang --copt=-Wno-unused-private-field
 build:generic_clang --copt=-Wno-pointer-sign
 build:generic_clang --copt=-Wno-char-subscripts

build_tools/bazel_to_cmake/bazel_to_cmake.py

Lines changed: 1 addition & 0 deletions
@@ -307,6 +307,7 @@ def convert_directory(directory_path, write_files, allow_partial_conversion, ver
             repo_cfg=repo_cfg,
             build_dir=directory_path,
             allow_partial_conversion=allow_partial_conversion,
+            repo_root=repo_root,
         )
     except (NameError, NotImplementedError) as e:
         log(

build_tools/bazel_to_cmake/bazel_to_cmake_converter.py

Lines changed: 57 additions & 3 deletions
@@ -100,10 +100,12 @@ def __init__(
         converter: "Converter",
         targets: bazel_to_cmake_targets.TargetConverter,
         build_dir: str,
+        repo_root: str = "",
     ):
         self._converter = converter
         self._targets = targets
         self._build_dir = build_dir
+        self._repo_root = repo_root
         self._custom_initialize()
 
     def _custom_initialize(self):
@@ -152,6 +154,29 @@ def _emit_platform_guard_begin(self, target_compatible_with):
         """Emits if(CMAKE_SYSTEM_NAME ...) for target_compatible_with."""
         if not target_compatible_with:
             return
+
+        # Handle PlatformSelect from select() in target_compatible_with.
+        # Example: select({"@platforms//os:linux": [], "@platforms//os:macos": [],
+        #     "//conditions:default": ["@platforms//:incompatible"]})
+        # Platforms with empty list are compatible; default with incompatible means
+        # "only build on the explicitly listed platforms".
+        if isinstance(target_compatible_with, PlatformSelect):
+            compatible_platforms = []
+            for label, value in target_compatible_with.conditions.items():
+                if label == "//conditions:default":
+                    continue
+                # Empty list means compatible on this platform.
+                if value == []:
+                    cmake_name = _PLATFORM_CMAKE_SYSTEM_NAME.get(label)
+                    if cmake_name:
+                        compatible_platforms.append(
+                            f'CMAKE_SYSTEM_NAME STREQUAL "{cmake_name}"'
+                        )
+            if compatible_platforms:
+                combined = " OR ".join(compatible_platforms)
+                self._converter.body += f"if({combined})\n"
+            return
+
         # target_compatible_with is a list of constraints (typically one).
         conditions = []
         for label in target_compatible_with:
@@ -169,6 +194,20 @@ def _emit_platform_guard_end(self, target_compatible_with):
         """Emits endif() to close a target_compatible_with guard."""
         if not target_compatible_with:
             return
+
+        # Handle PlatformSelect: check if any compatible platforms were found.
+        if isinstance(target_compatible_with, PlatformSelect):
+            has_compatible = any(
+                label != "//conditions:default"
+                and value == []
+                and label in _PLATFORM_CMAKE_SYSTEM_NAME
+                for label, value in target_compatible_with.conditions.items()
+            )
+            if has_compatible:
+                self._converter.body = self._converter.body.rstrip("\n") + "\n"
+                self._converter.body += f"endif()\n\n"
+            return
+
         # Only emit if all labels are recognized (same check as begin).
         if all(
             label in _PLATFORM_CMAKE_SYSTEM_NAME for label in target_compatible_with
@@ -307,7 +346,10 @@ def _normalize_label(self, src):
         src = src.lstrip("/").lstrip(":").replace(":", "/")
         if not pkg_root_relative_label:
             return src
-        elif os.path.exists(os.path.join(self._build_dir, src)):
+        # Repo-root-relative labels (//pkg:file) resolve from the repo root,
+        # not from the current package directory.
+        check_dir = self._repo_root if self._repo_root else self._build_dir
+        if os.path.exists(os.path.join(check_dir, src)):
             return f"${{PROJECT_SOURCE_DIR}}/{src}"
         else:
             return f"${{PROJECT_BINARY_DIR}}/{src}"
@@ -581,6 +623,7 @@ def cc_library(
         linkopts=None,
         includes=None,
         system_includes=None,
+        alwayslink=None,
         target_compatible_with=None,
         **kwargs,
     ):
@@ -599,6 +642,7 @@ def cc_library(
         data_block = self._convert_target_list_block("DATA", data)
         deps_block, platform_deps_block = self._convert_platform_select_deps(name, deps)
         testonly_block = self._convert_option_block("TESTONLY", testonly)
+        alwayslink_block = self._convert_option_block("ALWAYSLINK", alwayslink)
         includes_block = self._convert_includes_block(includes)
         system_includes_block = self._convert_string_list_block(
             "SYSTEM_INCLUDES", system_includes
@@ -618,6 +662,7 @@ def cc_library(
             f"{deps_block}"
             f"{defines_block}"
             f"{testonly_block}"
+            f"{alwayslink_block}"
             f"{includes_block}"
             f"{system_includes_block}"
             f" PUBLIC\n)\n\n"
@@ -1342,7 +1387,11 @@ def GetDict(obj):
 
 
 def convert_build_file(
-    build_file_code, repo_cfg, build_dir, allow_partial_conversion=False
+    build_file_code,
+    repo_cfg,
+    build_dir,
+    allow_partial_conversion=False,
+    repo_root="",
 ):
     converter = Converter()
     # Allow overrides of TargetConverter and BuildFileFunctions from repo cfg.
@@ -1352,7 +1401,12 @@ def convert_build_file(
     )(repo_map=repo_map)
     build_file_functions = getattr(
         repo_cfg, "CustomBuildFileFunctions", BuildFileFunctions
-    )(converter=converter, targets=target_converter, build_dir=build_dir)
+    )(
+        converter=converter,
+        targets=target_converter,
+        build_dir=build_dir,
+        repo_root=repo_root,
+    )
 
     exec(build_file_code, GetDict(build_file_functions))
     converted_text = converter.convert()
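
As a standalone illustration of the new `PlatformSelect` branch in `_emit_platform_guard_begin`, the sketch below rebuilds the guard string outside the converter. The `PlatformSelect` dataclass, the `platform_guard` helper, and the contents of the platform mapping are assumptions made for this example; the diff does not show the real `_PLATFORM_CMAKE_SYSTEM_NAME` table or the real `PlatformSelect` class.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed mapping for illustration; the real table is not visible in this diff.
PLATFORM_CMAKE_SYSTEM_NAME = {
    "@platforms//os:linux": "Linux",
    "@platforms//os:macos": "Darwin",
    "@platforms//os:windows": "Windows",
}


@dataclass
class PlatformSelect:
    """Hypothetical stand-in holding the select() dict of a BUILD file."""

    conditions: Dict[str, List[str]] = field(default_factory=dict)


def platform_guard(select: PlatformSelect) -> str:
    """Returns the opening CMake guard, or "" if no platform is recognized."""
    compatible = []
    for label, value in select.conditions.items():
        if label == "//conditions:default":
            continue
        # An empty constraint list marks the platform as compatible.
        if value == []:
            cmake_name = PLATFORM_CMAKE_SYSTEM_NAME.get(label)
            if cmake_name:
                compatible.append(f'CMAKE_SYSTEM_NAME STREQUAL "{cmake_name}"')
    if not compatible:
        return ""
    combined = " OR ".join(compatible)
    return f"if({combined})"


posix_only = PlatformSelect(
    conditions={
        "@platforms//os:linux": [],
        "@platforms//os:macos": [],
        "//conditions:default": ["@platforms//:incompatible"],
    }
)
guard = platform_guard(posix_only)
unknown_guard = platform_guard(PlatformSelect(conditions={"//some:custom_os": []}))
```

For the POSIX-only select, `guard` is a single `if(...)` joining the Linux and Darwin checks with `OR`; an unrecognized platform label yields no guard at all, matching the `has_compatible` check that decides whether `endif()` is emitted.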

build_tools/cmake/iree_cc_library.cmake

Lines changed: 104 additions & 3 deletions
@@ -95,7 +95,7 @@ endfunction()
 function(iree_cc_library)
   cmake_parse_arguments(
     _RULE
-    "PUBLIC;TESTONLY;SHARED;DISABLE_LLVM_LINK_LLVM_DYLIB"
+    "PUBLIC;TESTONLY;SHARED;DISABLE_LLVM_LINK_LLVM_DYLIB;ALWAYSLINK"
     "PACKAGE;NAME;WINDOWS_DEF_FILE"
     "HDRS;TEXTUAL_HDRS;SRCS;COPTS;DEFINES;LINKOPTS;DATA;DEPS;INCLUDES;SYSTEM_INCLUDES"
     ${ARGN}
@@ -154,7 +154,8 @@ function(iree_cc_library)
     list(APPEND _RULE_DEPS ${IREE_IMPLICIT_DEFS_CC_DEPS})
   endif()
 
-  if(NOT _RULE_IS_INTERFACE)
+  if(NOT _RULE_IS_INTERFACE AND NOT _RULE_ALWAYSLINK)
+    # Normal library: OBJECT for compilation, STATIC (or SHARED) for linking.
     add_library(${_OBJECTS_NAME} OBJECT)
     if(_RULE_SHARED OR BUILD_SHARED_LIBS)
       add_library(${_NAME} SHARED "$<TARGET_OBJECTS:${_OBJECTS_NAME}>")
@@ -273,10 +274,110 @@ function(iree_cc_library)
 
     # INTERFACE libraries can't have the CXX_STANDARD property set so only
     # set here.
+    set_property(TARGET ${_OBJECTS_NAME} PROPERTY CXX_STANDARD ${IREE_CXX_STANDARD})
+    set_property(TARGET ${_OBJECTS_NAME} PROPERTY CXX_STANDARD_REQUIRED ON)
+  elseif(NOT _RULE_IS_INTERFACE AND _RULE_ALWAYSLINK)
+    # ALWAYSLINK library: OBJECT for compilation, INTERFACE for propagation.
+    # The INTERFACE library propagates $<TARGET_OBJECTS:...> directly, ensuring
+    # all objects are always included in the final binary. STATIC archives let
+    # the linker strip unreferenced objects (dropping static init registrations);
+    # propagating objects directly bypasses the archive and eliminates that
+    # problem. This is the CMake equivalent of Bazel's alwayslink = True.
+    add_library(${_OBJECTS_NAME} OBJECT)
+    add_library(${_NAME} INTERFACE)
+
+    target_sources(${_OBJECTS_NAME}
+      PRIVATE
+        ${_RULE_SRCS}
+        ${_RULE_TEXTUAL_HDRS}
+        ${_RULE_HDRS}
+    )
+
+    set_property(TARGET ${_NAME} PROPERTY
+      INTERFACE_IREE_TRANSITIVE_OBJECTS "$<TARGET_OBJECTS:${_OBJECTS_NAME}>")
+    _iree_cc_library_add_object_deps(${_NAME} ${_RULE_DEPS})
+
+    # INTERFACE library propagates objects and link dependencies to consumers.
+    target_link_libraries(${_NAME}
+      INTERFACE
+        $<TARGET_OBJECTS:${_OBJECTS_NAME}>
+        ${_RULE_DEPS}
+        ${IREE_THREADS_DEPS}
+    )
+    target_include_directories(${_NAME}
+      INTERFACE
+        "$<BUILD_INTERFACE:${IREE_SOURCE_DIR}>"
+        "$<BUILD_INTERFACE:${IREE_BINARY_DIR}>"
+        ${_RULE_INCLUDES}
+    )
+    target_include_directories(${_NAME}
+      SYSTEM INTERFACE
+        ${_RULE_SYSTEM_INCLUDES}
+    )
+    target_link_options(${_NAME}
+      INTERFACE
+        ${IREE_DEFAULT_LINKOPTS}
+        ${_RULE_LINKOPTS}
+    )
+    target_compile_definitions(${_NAME}
+      INTERFACE
+        ${_RULE_DEFINES}
+    )
+
+    # OBJECT library needs compile-related properties for building the sources.
+    # Compile options go directly on the OBJECT library (INTERFACE libraries
+    # cannot have PRIVATE properties).
+    target_compile_options(${_OBJECTS_NAME}
+      PRIVATE
+        ${IREE_DEFAULT_COPTS}
+        ${_RULE_COPTS}
+    )
+
+    # Forward transitive compile properties from the INTERFACE library's
+    # dependency chain to the OBJECT library so sources see all transitive
+    # include directories and definitions.
+    target_include_directories(${_OBJECTS_NAME}
+      PUBLIC
+        $<TARGET_PROPERTY:${_NAME},INTERFACE_INCLUDE_DIRECTORIES>
+    )
+    target_include_directories(${_OBJECTS_NAME} SYSTEM
+      PUBLIC
+        $<TARGET_PROPERTY:${_NAME},INTERFACE_SYSTEM_INCLUDE_DIRECTORIES>
+    )
+    target_compile_definitions(${_OBJECTS_NAME}
+      PUBLIC
+        $<TARGET_PROPERTY:${_NAME},INTERFACE_COMPILE_DEFINITIONS>
+    )
+    # Forward deps to the OBJECT library for transitive compile definitions.
+    # We forward deps directly rather than $<TARGET_PROPERTY:INTERFACE_LINK_LIBRARIES>
+    # because the latter contains $<TARGET_OBJECTS:${_OBJECTS_NAME}> which would
+    # create a circular reference.
+    target_link_libraries(${_OBJECTS_NAME}
+      PUBLIC
+        ${_RULE_DEPS}
+    )
+
+    iree_add_data_dependencies(NAME ${_NAME} DATA ${_RULE_DATA})
+
+    if(BUILD_SHARED_LIBS AND IREE_SUPPORTS_VISIBILITY_DEFAULT)
+      target_compile_options(${_OBJECTS_NAME} PRIVATE
+        "-fvisibility=default"
+      )
+    endif()
+
+    if(_RULE_PUBLIC)
+      set_property(TARGET ${_OBJECTS_NAME} PROPERTY FOLDER ${IREE_IDE_FOLDER})
+    elseif(_RULE_TESTONLY)
+      set_property(TARGET ${_OBJECTS_NAME} PROPERTY FOLDER ${IREE_IDE_FOLDER}/test)
+    else()
+      set_property(TARGET ${_OBJECTS_NAME} PROPERTY FOLDER ${IREE_IDE_FOLDER}/internal)
+    endif()
+
     set_property(TARGET ${_OBJECTS_NAME} PROPERTY CXX_STANDARD ${IREE_CXX_STANDARD})
     set_property(TARGET ${_OBJECTS_NAME} PROPERTY CXX_STANDARD_REQUIRED ON)
   else()
-    # Generating header-only library.
+    # Generating header-only library (no sources, or ALWAYSLINK on header-only
+    # which is meaningless since there are no objects).
     add_library(${_NAME} INTERFACE)
     target_include_directories(${_NAME}
       INTERFACE
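
The reason the ALWAYSLINK branch bypasses STATIC archives can be shown with a toy linker model in Python. This is a deliberately simplified sketch, not how any real linker is implemented: the `Obj` type, the `link` function, and the object and symbol names are all hypothetical. It models only the rule that an archive member is pulled in when it defines a symbol that is still undefined, while objects passed directly are always kept.

```python
from typing import FrozenSet, List, NamedTuple


class Obj(NamedTuple):
    name: str
    defines: FrozenSet[str]
    references: FrozenSet[str]


def link(objects: List[Obj], archive: List[Obj]) -> List[str]:
    """Objects are always linked; archive members only if they resolve a symbol."""
    linked = list(objects)
    changed = True
    while changed:
        changed = False
        defined = set().union(*(o.defines for o in linked))
        undefined = set().union(*(o.references for o in linked)) - defined
        for member in archive:
            if member not in linked and member.defines & undefined:
                linked.append(member)
                changed = True
                break
    return [o.name for o in linked]


main = Obj("main.o", frozenset({"main"}), frozenset({"run"}))
runtime = Obj("runtime.o", frozenset({"run"}), frozenset())
# Only contains a static initializer that registers a backend; nothing calls into it.
registration = Obj(
    "backend_registration.o", frozenset({"register_backend_initializer"}), frozenset()
)

# STATIC archive: the registration object is never referenced, so it is dropped.
static_result = link([main], [runtime, registration])

# ALWAYSLINK: the registration object is passed directly, like $<TARGET_OBJECTS:...>.
alwayslink_result = link([main, registration], [runtime])
```

In the archive case the registration object never makes it into the binary, so its static initializer would never run; passing the object directly keeps it, which is the behavior the ALWAYSLINK comment in the diff describes.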

build_tools/cmake/iree_copts.cmake

Lines changed: 1 addition & 0 deletions
@@ -173,6 +173,7 @@ iree_select_compiler_opts(IREE_DEFAULT_COPTS
     "-Wno-invalid-offsetof" # Technically UB but needed for intrusive ptrs
     "-Wno-unused-const-variable"
     "-Wno-unused-function"
+    "-Wno-unused-lambda-capture"
     "-Wno-unused-private-field"
     "-Wno-pointer-sign"
     "-Wno-char-subscripts"
