-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[libclc] Fix commands in compile_to_bc are executed sequentially #130755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In libclc, we observe that compiling OpenCL source files to bitcode is executed sequentially on Windows, which increases debug build time by about an hour. add_custom_command may introduce additional implicit dependencies, see https://gitlab.kitware.com/cmake/cmake/-/issues/17097 This PR adds a target for each command and enables parallel builds of OpenCL source files.
|
@frasercrmck could you please review? thanks |
|
Am I right in thinking that CMake 3.27's |
done. |
frasercrmck
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't realise I hadn't published this review from last week!
| ${ARG_DEPENDENCIES} | ||
| DEPFILE ${ARG_OUTPUT}.d | ||
| ) | ||
| add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the (possibly) "temp" output (in the case of .ll files) should be given the honour of the ARG_TARGET target. The final output of this function should unconditionally have ARG_TARGET. That saves doing -as workarounds in other parts of the code. The TARGET name the user passes should be definitive, in other words.
Do we strictly need a target for the intermediate step too? If we do, we could have ${ARG_TARGET}${TMP_SUFFIX} here. Then non-IR files get ARG_TARGET target, and then in the IR block below we create ARG_TARGET for that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have simplified the code by removing ${ARG_TARGET}-as
The issue we observing is that .cl files are compiled sequentially. We don't see problem with .ll file, so we can leave it as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes that's much simpler.
I'm half wondering if this isn't only for enabling parallel execution but is technically required for correct behaviour. I know CMake can behave strangely when only depending upon files and not targets, which is that we're doing before this PR. Though even with this PR, we're still implicitly depending on files coming out of the llvm-as step. That's what makes me a bit nervous. This is still better than what we've got so maybe we can keep an eye on it and see if we have any parallel build issues stemming from .ll files, and we can adjust as needed.
I think I'll also post a follow-up which handles empty arguments. I was going to ask about handling an empty TARGET but noticed there's nothing for empty INPUT or OUTPUT so maybe that's best left handled later in one go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still better than what we've got so maybe we can keep an eye on it and see if we have any parallel build issues stemming from .ll files, and we can adjust as needed.
yeah, I agree
libclc/cmake/modules/AddLibclc.cmake
Outdated
| get_filename_component( file_ext ${file} EXT ) | ||
| if( ${file_ext} STREQUAL ".ll" ) | ||
| list( APPEND bytecode_ir_files ${output_file} ) | ||
| list( APPEND compile_tgts ${tgt}-as ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above - we wouldn't need this case if we could rely on a consistent target name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, removed
libclc/cmake/modules/AddLibclc.cmake
Outdated
| set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} ) | ||
| add_custom_target( ${builtins_comp_lib_tgt} | ||
| DEPENDS ${bytecode_files} | ||
| DEPENDS ${compile_tgts} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not need to depend on the targets and the files? CMake dependencies are strange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I'll add ${bytecode_files} depends back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, added back ${bytecode_files}
|
@frasercrmck should we check CMAKE_VERSION and use DEPENDS_EXPLICIT_ONLY if the version is 3.27 or higher? I suppose DEPENDS_EXPLICIT_ONLY would be more light-weighted. And we need to keep both DEPENDS_EXPLICIT_ONLY and adding new target path, until llvm uplifts cmake version to 3.27. |
frasercrmck
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my build this increases the number of targets ninja -t targets from 19K to 28K. That's building all targets. I'm not sure what else we could do about that and whether it's a problem.
| ${ARG_DEPENDENCIES} | ||
| DEPFILE ${ARG_OUTPUT}.d | ||
| ) | ||
| add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes that's much simpler.
I'm half wondering if this isn't only for enabling parallel execution but is technically required for correct behaviour. I know CMake can behave strangely when only depending upon files and not targets, which is that we're doing before this PR. Though even with this PR, we're still implicitly depending on files coming out of the llvm-as step. That's what makes me a bit nervous. This is still better than what we've got so maybe we can keep an eye on it and see if we have any parallel build issues stemming from .ll files, and we can adjust as needed.
I think I'll also post a follow-up which handles empty arguments. I was going to ask about handling an empty TARGET but noticed there's nothing for empty INPUT or OUTPUT so maybe that's best left handled later in one go.
yes, the number of targets increase significantly with this approach. But as far as I know, we don't have compilation correctness issue regarding the increase of the number of targets. |
In libclc, we observe that compiling OpenCL source files to bitcode is
executed sequentially on Windows, which increases debug build time by
about an hour.
add_custom_command may introduce additional implicit dependencies, see
https://gitlab.kitware.com/cmake/cmake/-/issues/17097
This PR adds a target for each command, enabling parallel builds of
OpenCL source files.
CMake 3.27 has fixed above issue with DEPENDS_EXPLICIT_ONLY. When LLVM
upgrades cmake vertion to 3.7, we can switch to DEPENDS_EXPLICIT_ONLY.