|
1 | | -Comgr v3.0 Release Notes |
| 1 | +Comgr v4.0 Release Notes |
2 | 2 | ======================== |
3 | 3 |
|
4 | 4 | This document contains the release notes for the Code Object Manager (Comgr), |
5 | | -part of the ROCm Software Stack, release v3.0. Here we describe the status of |
| 5 | +part of the ROCm Software Stack, release v4.0. Here we describe the status of |
6 | 6 | Comgr, including major improvements from the previous release and new feature |
7 | 7 |
|
8 | | -These are in-progress notes for the upcoming Comgr v3.0 release. |
| 8 | +These are in-progress notes for the upcoming Comgr v4.0 release. |
9 | 9 | Release notes for previous releases can be found in |
10 | 10 | [docs/historical](docs/historical). |
11 | 11 |
|
12 | 12 | Potentially Breaking Changes |
13 | 13 | ---------------------------- |
14 | 14 | These changes are ones which we think may surprise users when upgrading to |
15 | | -Comgr v3.0 because of the opportunity they pose for disruption to existing |
| 15 | +Comgr v4.0 because of the opportunity they pose for disruption to existing |
16 | 16 | code bases. |
17 | 17 |
|
18 | | -- Removed -h option from comgr-objdump: The -h option (short for -headers) is a |
19 | | -legal comgr-objdump option. However registering this as an LLVM option by Comgr |
20 | | -prevents other LLVM tools or instances from registering a -h option in the same |
21 | | -process, which is an issue because -h is a common short form for -help. |
22 | | -- Updated default code object version used when linking code object specific |
23 | | -device library from v4 to v5 |
24 | | -- Updated shared library name on Windows 64-bit to include Comgr major version |
25 | | -(libamd\_comgr.dll -> libamd\_comgr\_X.dll, where X is the major version) |
26 | | -- oclc\_daz\_opt\_on.bc and oclc\_daz\_opt\_off.bc, and the corresponding |
27 | | - variable \_\_oclc\_daz\_opt are no longer necessary. |
28 | | -- Updated default device library linking behavior for several actions. |
29 | | - Previously, linking was done for some actions and not others, and not |
30 | | - controllable by the user. Now, linking is not done by default, but can |
31 | | - optionally be enabled via the |
32 | | - amd\_comgr\_action\_info\_set\_device\_lib\_linking() API. Users relying |
33 | | - on enabled-by-default behavior should update to use the new API to avoid |
34 | | - changes in behavior. |
35 | | - |
36 | | - Note: This does not apply to the \*COMPILE\_SOURCE\_WITH\_DEVICE\_LIBS\_TO\_BC |
37 | | - action. This action is not affected by the |
38 | | - amd\_comgr\_action\_info\_set\_device\_lib\_linking() API. The new API will |
39 | | - allow us to deprecate and remove this action in favor of the |
40 | | - \*COMPILE\_SOURCE\_TO\_BC action. |
41 | 18 |
|
42 | 19 | New Features |
43 | 20 | ------------ |
44 | | -- Added support for linking code\_object\_v4/5 device library files. |
45 | | -- Enabled llvm dylib builds. When llvm dylibs are enabled, a new package |
46 | | -rocm-llvm-core will contain the required dylibs for Comgr. |
47 | | -- Moved build to C++17, allowing us to use more modern features in the |
48 | | -implementation and tests. |
49 | | -- Enabled thread-safe execution of Comgr by enclosing primary Comgr actions in |
50 | | -an std::scoped\_lock() |
51 | | -- Added support for bitcode and archive unbundling during linking via the new |
52 | | -llvm OffloadBundler API. |
53 | | -- Added support for code object v6 and generic targets. |
54 | | -- Added mechanism to bypass device library file system writes if Comgr is able |
55 | | -to locate a local device library directory via the clang-resource-dir |
56 | 21 |
|
57 | 22 | Bug Fixes |
58 | 23 | --------- |
59 | | -- Fixed symbolizer assertion for non-null terminated file-slice content, |
60 | | -by bypassing null-termination check in llvm::MemoryBuffer |
61 | | -- Fixed bug and add error checking for internal unbundling. Previously internal |
62 | | -unbundler would fail if files weren't already present in filesystem. |
63 | | -- Fixed issue where lookUpCodeObject() would fail if code object ISA strings |
64 | | -weren't listed in order. |
65 | | -- Added support for subdirectories in amd\_comgr\_set\_data\_name(). Previously |
66 | | -names with a "/" would generate a file-not-found error. |
67 | | -- Added amdgpu-internalize-symbols option to bitcode codegen action, which has |
68 | | -significant performance implications |
69 | | -- Fixed an issue where -nogpulib was always included in HIP compilations, which |
70 | | -prevented correct execution of |
71 | | -COMPILE\_SOURCE\_WITH\_DEVICE\_LIBS\_TO\_BC action. |
72 | | -- Fixed a multi-threading bug where programs would hang when calling Comgr APIs |
73 | | -like amd\_comgr\_iterate\_symbols() from multiple threads |
74 | | -- Fixed an issue where providing DataObjects with an empty name to the bitcode |
75 | | -linking action caused errors when AMD\_COMGR\_SAVE\_TEMPS was enabled, or when |
76 | | -linking bitcode bundles. |
77 | | -- Updated to use lld::lldMain() introduced in D110949 instead of the older |
78 | | -lld::elf::link in Comgr's linkWithLLD() |
79 | | -- Added -x assembler option to assembly compilation. Before, if an assembly file |
80 | | -did not end with a .s file extension, it was not handled properly by the Comgr |
81 | | -ASSEMBLE\_SOURCE\_TO\_RELOCATABLE action. |
82 | | -- Switched getline() from C++ to C-style to avoid issues with stdlibc++ and |
83 | | -pytorch |
84 | | -- Added new -relink-builtin-bitcode-postop LLVM option to device library. This |
85 | | -fixes an issue with the \*COMPILE\_SOURCE\_WITH\_DEVICE\_LIBRARIES\_TO\_BC where |
86 | | -OpenCL applications that leveraged AMDGPUSimplifyLibCalls optimizations would |
87 | | -need to re-link bitcodes separately to avoid errors at runtime. |
88 | | -- Correctly set directory to object file path when forwarding -save-temps for |
89 | | -HIP compilations with AMD\_COMGR\_SAVE\_TEMPS set |
90 | | -- Added new ['--skip-line-zero'](https://github.com/llvm/llvm-project/pull/82240) |
91 | | -LLVM option by default in comgr-symbolizer to support symbolization of instructions |
92 | | -having no source correspondence in the debug information. |
93 | 24 |
|
94 | 25 | New APIs |
95 | 26 | -------- |
96 | | -- amd\_comgr\_populate\_mangled\_names() (v2.5) |
97 | | -- amd\_comgr\_get\_mangled\_name() (v2.5) |
98 | | - - Support bitcode and executable name lowering. The first call populates a |
99 | | - list of mangled names for a given data object, while the second fetches a |
100 | | - name from a given object and index. |
101 | | -- amd\_comgr\_populate\_name\_expression\_map() (v2.6) |
102 | | -- amd\_comgr\_map\_name\_expression\_to\_symbol\_name() (v2.6) |
103 | | - - Support bitcode and code object name expression mapping. The first call |
104 | | - populates a map of name expressions for a given comgr data object, using |
105 | | - LLVM APIs to traverse the bitcode or code object. The second call returns |
106 | | - a value (mangled symbol name) from the map for a given key (unmangled |
107 | | - name expression). These calls assume that names of interest have been |
108 | | - enclosed the HIP runtime using a stub attribute containg the following |
109 | | - string in the name: "__amdgcn_name_expr". |
110 | | -- amd\_comgr\_map\_elf\_virtual\_address\_to\_code\_object\_offset() (v2.7) |
111 | | - - For a given executable and ELF virtual address, return a code object |
112 | | - offset. This API will benifet the ROCm debugger and profilier |
113 | | -- amd\_comgr\_action\_info\_set\_bundle\_entry\_ids() (v2.8) |
114 | | -- amd\_comgr\_action\_info\_get\_bundle\_entry\_id\_count() (v2.8) |
115 | | -- amd\_comgr\_action\_info\_get\_bundle\_entry\_id() (v2.8) |
116 | | - - A user can provide a set of bundle entry IDs, which are processed when |
117 | | - calling the AMD\_COMGR\_UNBUNDLE action |
118 | | -- amd\_comgr\_action\_info\_set\_device\_lib\_linking() (v2.9) |
119 | | - - By setting this ActionInfo property, a user can explicitly dictate if |
120 | | - device libraries should be linked for a given action. (Previouly, the |
121 | | - action type implicitly determined device library linking). |
122 | | - |
123 | 27 |
|
124 | 28 | Deprecated APIs |
125 | 29 | --------------- |
126 | 30 |
|
127 | 31 | Removed APIs |
128 | 32 | ------------ |
129 | | -- amd\_comgr\_action\_info\_set\_options() (v3.0) |
130 | | -- amd\_comgr\_action\_info\_get\_options() (v3.0) |
131 | | - - Use amd\_comgr\_action\_info\_set\_option\_list(), |
132 | | - amd\_comgr\_action\_info\_get\_option\_list\_count(), and |
133 | | - amd\_comgr\_action\_info\_get\_option\_list\_item() instead |
134 | 33 |
|
135 | 34 | New Comgr Actions and Data Types |
136 | 35 | -------------------------------- |
137 | | -- (Action) AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_TO\_RELOCATABLE |
138 | | - - This action performs compile-to-bitcode, linking device libraries, and |
139 | | -codegen-to-relocatable in a single step. By doing so, clients are able to defer more |
140 | | -of the flag handling to toolchain. Currently only supports HIP. |
141 | | -- (Data Type) AMD\_COMGR\_DATA\_KIND\_BC\_BUNDLE |
142 | | -- (Data Type) AMD\_COMGR\_DATA\_KIND\_AR\_BUNDLE |
143 | | - - These data kinds can now be passed to an AMD\_COMGR\_ACTION\_LINK\_BC\_TO\_BC |
144 | | -action, and Comgr will internally unbundle and link via the OffloadBundler and linkInModule APIs. |
145 | | -- (Language Type) AMD\_COMGR\_LANGUAGE\_LLVM\_IR |
146 | | - - This language can now be passed to AMD\_COMGR\_ACTION\_COMPILE\_\* actions |
147 | | - to enable compilation of LLVM IR (.ll or .bc) files. This is useful for MLIR |
148 | | - contexts. |
149 | | -- (Action) AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_TO\_EXECUTABLE |
150 | | - - This action allows compilation from source directly to executable, including |
151 | | - linking device libraries. |
152 | | -- (Action) AMD\_COMGR\_ACTION\_UNBUNDLE |
153 | | - - This accepts a set of bitcode bundles, object file bundles, and archive |
154 | | - bundles,and returns set of unbundled bitcode, object files, and archives, |
155 | | - selecting bundles based on the bundle entry IDs provided. |
156 | | -- (Data Type) AMD\_COMGR\_DATA\_KIND\_OBJ\_BUNDLE |
157 | | - - This data kind represents a clang-offload-bundle of object files, and can be |
158 | | - passed when calling the AMD\_COMGR\_ACTION\_UNBUNDLE action |
159 | | -- (Data Type) AMD\_COMGR\_DATA\_KIND\_SPIRV |
160 | | - - This data kind represents a SPIR-V binary file (.spv) |
161 | | -- (Action) AMD\_COMGR\_ACTION\_TRANSLATE\_SPIRV\_TO\_BC |
162 | | - - This accepts a set of SPIR-V (.spv) inputs, and returns a set of translated |
163 | | - bitcode (.bc) outputs |
164 | 36 |
|
165 | 37 | Deprecated Comgr Actions and Data Types |
166 | 38 | --------------------------------------- |
167 | 39 |
|
168 | 40 | Removed Comgr Actions and Data Types |
169 | 41 | ------------------------------------ |
170 | | -- (Action) AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_TO\_FATBIN |
171 | | - - This workaround has been removed in favor of |
172 | | - \*\_COMPILE\_SOURCE\_(WITH\_DEVICE\_LIBS\_)TO\_BC |
173 | | -- (Action) AMD\_COMGR\_ACTION\_OPTIMIZE\_BC\_TO\_BC |
174 | | - - This is a legacy action that was never implemented |
175 | | -- (Language) AMD\_COMGR\_LANGUAGE\_HC |
176 | | - - This is a legacy language that was never used |
177 | | -- (Action) AMD\_COMGR\_ACTION\_ADD\_DEVICE\_LIBRARIES |
178 | | - - This has been replaced with |
179 | | - AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_WITH\_DEVICE\_LIBS\_TO\_BC |
180 | 42 |
|
181 | 43 | Comgr Testing, Debugging, and Logging Updates |
182 | 44 | --------------------------------------------- |
183 | | -- Added support for C++ tests. Although Comgr APIs are C-compatible, we can now |
184 | | -use C++ features in testing (C++ threading APIs, etc.) |
185 | | -- Clean up test directory by moving sources to subdirectory |
186 | | -- Several tests updated to pass while verbose logs are redirected to stdout |
187 | | -- Log information reported when AMD\_COMGR\_EMIT\_VERBOSE\_LOGS updated to: |
188 | | - - Show both user-facing clang options used (Compilation Args) and internal |
189 | | - driver options (Driver Job Args) |
190 | | - - Show files linked by linkBitcodeToBitcode() |
191 | | -- Remove support for code object v2 compilation in tests and test CMAKE due to |
192 | | -deprecation of code object v2 in LLVM. However, we still test loading and |
193 | | -metadata querys for code object v2 objects. |
194 | | -- Remove support for code object v3 compilation in tests and test CMAKE due to |
195 | | -deprecation of code object v3 in LLVM. However, we still test loading and |
196 | | -metadata querys for code object v3 objects. |
197 | | -- Revamp symbolizer test to fail on errors, among other improvments |
198 | | -- Improve linking and unbundling log to correctly store temporary files in /tmp, |
199 | | -and to output clang-offload-bundler command to allow users to re-create Comgr |
200 | | -unbundling. |
201 | | -- Add git branch and commit hash for Comgr, and commit hash for LLVM to log |
202 | | -output for Comgr actions. This can help us debug issues more quickly in cases |
203 | | -where reporters provide Comgr logs. |
204 | | -- Fix multiple bugs with mangled names test |
205 | | -- Update default arch for test binaries from gfx830 to gfx900 |
206 | | -- Refactor nested kernel behavior into new test, as this behavior is less common |
207 | | -and shouldn't be featured in the baseline tests |
208 | | -- Add metadata parsing tests for code objects with multiple AMDGPU metadata note entries. |
209 | | -- Updated Comgr HIP test to not rely on HIP\_COMPILER being set, or a valid HIP |
210 | | -installation. We can test the functionality of Comgr HIP compilation without |
211 | | -directly relying on HIP |
212 | | -- Added framework for Comgr lit tests. These tests will allow us to easily |
213 | | -validate generated artifacts with command-line tools like llvm-dis, |
214 | | -llvm-objdump, etc. Moving forward, most new Comgr tests should be written as |
215 | | -lit tests, and tests in comgr/test should be transitioned to comgr/test-lit. |
216 | 45 | - Removed HIP\_PATH and ROCM\_PATH environment variables. These were used for |
217 | 46 | now-removed Comgr actions, such as \*COMPILE\_SOURCE\_TO\_FATBIN. |
218 | 47 |
|
219 | 48 | New Targets |
220 | 49 | ----------- |
221 | | - - gfx942 |
222 | | - - gfx950 |
223 | | - - gfx1036 |
224 | | - - gfx1150 |
225 | | - - gfx1151 |
226 | | - - gfx1152 |
227 | | - - gfx9-generic |
228 | | - - gfx9-4-generic |
229 | | - - gfx10-1-generic |
230 | | - - gfx10-3-generic |
231 | | - - gfx11-generic |
232 | | - - gfx12-generic |
233 | 50 |
|
234 | 51 | Removed Targets |
235 | 52 | --------------- |
|