-Comgr v3.0 Release Notes
+Comgr v4.0 Release Notes
 ========================

 This document contains the release notes for the Code Object Manager (Comgr),
-part of the ROCm Software Stack, release v3.0. Here we describe the status of
+part of the ROCm Software Stack, release v4.0. Here we describe the status of
 Comgr, including major improvements from the previous release and new feature

-These are in-progress notes for the upcoming Comgr v3.0 release.
+These are in-progress notes for the upcoming Comgr v4.0 release.
 Release notes for previous releases can be found in
 [docs/historical](docs/historical).

 Potentially Breaking Changes
 ----------------------------
 These changes are ones which we think may surprise users when upgrading to
-Comgr v3.0 because of the opportunity they pose for disruption to existing
+Comgr v4.0 because of the opportunity they pose for disruption to existing
 code bases.

-- Removed -h option from comgr-objdump: The -h option (short for -headers) is a
-legal comgr-objdump option. However registering this as an LLVM option by Comgr
-prevents other LLVM tools or instances from registering a -h option in the same
-process, which is an issue because -h is a common short form for -help.
-- Updated default code object version used when linking code object specific
-device library from v4 to v5
-- Updated shared library name on Windows 64-bit to include Comgr major version
-(libamd\_comgr.dll -> libamd\_comgr\_X.dll, where X is the major version)
-- oclc\_daz\_opt\_on.bc and oclc\_daz\_opt\_off.bc, and the corresponding
- variable \_\_oclc\_daz\_opt are no longer necessary.
-- Updated default device library linking behavior for several actions.
- Previously, linking was done for some actions and not others, and not
- controllable by the user. Now, linking is not done by default, but can
- optionally be enabled via the
- amd\_comgr\_action\_info\_set\_device\_lib\_linking() API. Users relying
- on enabled-by-default behavior should update to use the new API to avoid
- changes in behavior.
-
- Note: This does not apply to the \*COMPILE\_SOURCE\_WITH\_DEVICE\_LIBS\_TO\_BC
- action. This action is not affected by the
- amd\_comgr\_action\_info\_set\_device\_lib\_linking() API. The new API will
- allow us to deprecate and remove this action in favor of the
- \*COMPILE\_SOURCE\_TO\_BC action.

 New Features
 ------------
-- Added support for linking code\_object\_v4/5 device library files.
-- Enabled llvm dylib builds. When llvm dylibs are enabled, a new package
-rocm-llvm-core will contain the required dylibs for Comgr.
-- Moved build to C++17, allowing us to use more modern features in the
-implementation and tests.
-- Enabled thread-safe execution of Comgr by enclosing primary Comgr actions in
-an std::scoped\_lock()
-- Added support for bitcode and archive unbundling during linking via the new
-llvm OffloadBundler API.
-- Added support for code object v6 and generic targets.
-- Added mechanism to bypass device library file system writes if Comgr is able
-to locate a local device library directory via the clang-resource-dir

 Bug Fixes
 ---------
-- Fixed symbolizer assertion for non-null terminated file-slice content,
-by bypassing null-termination check in llvm::MemoryBuffer
-- Fixed bug and add error checking for internal unbundling. Previously internal
-unbundler would fail if files weren't already present in filesystem.
-- Fixed issue where lookUpCodeObject() would fail if code object ISA strings
-weren't listed in order.
-- Added support for subdirectories in amd\_comgr\_set\_data\_name(). Previously
-names with a "/" would generate a file-not-found error.
-- Added amdgpu-internalize-symbols option to bitcode codegen action, which has
-significant performance implications
-- Fixed an issue where -nogpulib was always included in HIP compilations, which
-prevented correct execution of
-COMPILE\_SOURCE\_WITH\_DEVICE\_LIBS\_TO\_BC action.
-- Fixed a multi-threading bug where programs would hang when calling Comgr APIs
-like amd\_comgr\_iterate\_symbols() from multiple threads
-- Fixed an issue where providing DataObjects with an empty name to the bitcode
-linking action caused errors when AMD\_COMGR\_SAVE\_TEMPS was enabled, or when
-linking bitcode bundles.
-- Updated to use lld::lldMain() introduced in D110949 instead of the older
-lld::elf::link in Comgr's linkWithLLD()
-- Added -x assembler option to assembly compilation. Before, if an assembly file
-did not end with a .s file extension, it was not handled properly by the Comgr
-ASSEMBLE\_SOURCE\_TO\_RELOCATABLE action.
-- Switched getline() from C++ to C-style to avoid issues with stdlibc++ and
-pytorch
-- Added new -relink-builtin-bitcode-postop LLVM option to device library. This
-fixes an issue with the \*COMPILE\_SOURCE\_WITH\_DEVICE\_LIBRARIES\_TO\_BC where
-OpenCL applications that leveraged AMDGPUSimplifyLibCalls optimizations would
-need to re-link bitcodes separately to avoid errors at runtime.
-- Correctly set directory to object file path when forwarding -save-temps for
-HIP compilations with AMD\_COMGR\_SAVE\_TEMPS set
-- Added new ['--skip-line-zero'](https://github.com/llvm/llvm-project/pull/82240)
-LLVM option by default in comgr-symbolizer to support symbolization of instructions
-having no source correspondence in the debug information.

 New APIs
 --------
-- amd\_comgr\_populate\_mangled\_names() (v2.5)
-- amd\_comgr\_get\_mangled\_name() (v2.5)
- - Support bitcode and executable name lowering. The first call populates a
- list of mangled names for a given data object, while the second fetches a
- name from a given object and index.
-- amd\_comgr\_populate\_name\_expression\_map() (v2.6)
-- amd\_comgr\_map\_name\_expression\_to\_symbol\_name() (v2.6)
- - Support bitcode and code object name expression mapping. The first call
- populates a map of name expressions for a given comgr data object, using
- LLVM APIs to traverse the bitcode or code object. The second call returns
- a value (mangled symbol name) from the map for a given key (unmangled
- name expression). These calls assume that names of interest have been
- enclosed the HIP runtime using a stub attribute containg the following
- string in the name: "__amdgcn_name_expr".
-- amd\_comgr\_map\_elf\_virtual\_address\_to\_code\_object\_offset() (v2.7)
- - For a given executable and ELF virtual address, return a code object
- offset. This API will benifet the ROCm debugger and profilier
-- amd\_comgr\_action\_info\_set\_bundle\_entry\_ids() (v2.8)
-- amd\_comgr\_action\_info\_get\_bundle\_entry\_id\_count() (v2.8)
-- amd\_comgr\_action\_info\_get\_bundle\_entry\_id() (v2.8)
- - A user can provide a set of bundle entry IDs, which are processed when
- calling the AMD\_COMGR\_UNBUNDLE action
-- amd\_comgr\_action\_info\_set\_device\_lib\_linking() (v2.9)
- - By setting this ActionInfo property, a user can explicitly dictate if
- device libraries should be linked for a given action. (Previouly, the
- action type implicitly determined device library linking).
-

 Deprecated APIs
 ---------------

 Removed APIs
 ------------
-- amd\_comgr\_action\_info\_set\_options() (v3.0)
-- amd\_comgr\_action\_info\_get\_options() (v3.0)
- - Use amd\_comgr\_action\_info\_set\_option\_list(),
- amd\_comgr\_action\_info\_get\_option\_list\_count(), and
- amd\_comgr\_action\_info\_get\_option\_list\_item() instead

 New Comgr Actions and Data Types
 --------------------------------
-- (Action) AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_TO\_RELOCATABLE
- - This action performs compile-to-bitcode, linking device libraries, and
-codegen-to-relocatable in a single step. By doing so, clients are able to defer more
-of the flag handling to toolchain. Currently only supports HIP.
-- (Data Type) AMD\_COMGR\_DATA\_KIND\_BC\_BUNDLE
-- (Data Type) AMD\_COMGR\_DATA\_KIND\_AR\_BUNDLE
- - These data kinds can now be passed to an AMD\_COMGR\_ACTION\_LINK\_BC\_TO\_BC
-action, and Comgr will internally unbundle and link via the OffloadBundler and linkInModule APIs.
-- (Language Type) AMD\_COMGR\_LANGUAGE\_LLVM\_IR
- - This language can now be passed to AMD\_COMGR\_ACTION\_COMPILE\_\* actions
- to enable compilation of LLVM IR (.ll or .bc) files. This is useful for MLIR
- contexts.
-- (Action) AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_TO\_EXECUTABLE
- - This action allows compilation from source directly to executable, including
- linking device libraries.
-- (Action) AMD\_COMGR\_ACTION\_UNBUNDLE
- - This accepts a set of bitcode bundles, object file bundles, and archive
- bundles,and returns set of unbundled bitcode, object files, and archives,
- selecting bundles based on the bundle entry IDs provided.
-- (Data Type) AMD\_COMGR\_DATA\_KIND\_OBJ\_BUNDLE
- - This data kind represents a clang-offload-bundle of object files, and can be
- passed when calling the AMD\_COMGR\_ACTION\_UNBUNDLE action
-- (Data Type) AMD\_COMGR\_DATA\_KIND\_SPIRV
- - This data kind represents a SPIR-V binary file (.spv)
-- (Action) AMD\_COMGR\_ACTION\_TRANSLATE\_SPIRV\_TO\_BC
- - This accepts a set of SPIR-V (.spv) inputs, and returns a set of translated
- bitcode (.bc) outputs

 Deprecated Comgr Actions and Data Types
 ---------------------------------------

 Removed Comgr Actions and Data Types
 ------------------------------------
-- (Action) AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_TO\_FATBIN
- - This workaround has been removed in favor of
- \*\_COMPILE\_SOURCE\_(WITH\_DEVICE\_LIBS\_)TO\_BC
-- (Action) AMD\_COMGR\_ACTION\_OPTIMIZE\_BC\_TO\_BC
- - This is a legacy action that was never implemented
-- (Language) AMD\_COMGR\_LANGUAGE\_HC
- - This is a legacy language that was never used
-- (Action) AMD\_COMGR\_ACTION\_ADD\_DEVICE\_LIBRARIES
- - This has been replaced with
- AMD\_COMGR\_ACTION\_COMPILE\_SOURCE\_WITH\_DEVICE\_LIBS\_TO\_BC

 Comgr Testing, Debugging, and Logging Updates
 ---------------------------------------------
-- Added support for C++ tests. Although Comgr APIs are C-compatible, we can now
-use C++ features in testing (C++ threading APIs, etc.)
-- Clean up test directory by moving sources to subdirectory
-- Several tests updated to pass while verbose logs are redirected to stdout
-- Log information reported when AMD\_COMGR\_EMIT\_VERBOSE\_LOGS updated to:
- - Show both user-facing clang options used (Compilation Args) and internal
- driver options (Driver Job Args)
- - Show files linked by linkBitcodeToBitcode()
-- Remove support for code object v2 compilation in tests and test CMAKE due to
-deprecation of code object v2 in LLVM. However, we still test loading and
-metadata querys for code object v2 objects.
-- Remove support for code object v3 compilation in tests and test CMAKE due to
-deprecation of code object v3 in LLVM. However, we still test loading and
-metadata querys for code object v3 objects.
-- Revamp symbolizer test to fail on errors, among other improvments
-- Improve linking and unbundling log to correctly store temporary files in /tmp,
-and to output clang-offload-bundler command to allow users to re-create Comgr
-unbundling.
-- Add git branch and commit hash for Comgr, and commit hash for LLVM to log
-output for Comgr actions. This can help us debug issues more quickly in cases
-where reporters provide Comgr logs.
-- Fix multiple bugs with mangled names test
-- Update default arch for test binaries from gfx830 to gfx900
-- Refactor nested kernel behavior into new test, as this behavior is less common
-and shouldn't be featured in the baseline tests
-- Add metadata parsing tests for code objects with multiple AMDGPU metadata note entries.
-- Updated Comgr HIP test to not rely on HIP\_COMPILER being set, or a valid HIP
-installation. We can test the functionality of Comgr HIP compilation without
-directly relying on HIP
-- Added framework for Comgr lit tests. These tests will allow us to easily
-validate generated artifacts with command-line tools like llvm-dis,
-llvm-objdump, etc. Moving forward, most new Comgr tests should be written as
-lit tests, and tests in comgr/test should be transitioned to comgr/test-lit.
 - Removed HIP\_PATH and ROCM\_PATH environment variables. These were used for
 now-removed Comgr actions, such as \*COMPILE\_SOURCE\_TO\_FATBIN.

 New Targets
 -----------
-- gfx942
-- gfx950
-- gfx1036
-- gfx1150
-- gfx1151
-- gfx1152
-- gfx9-generic
-- gfx9-4-generic
-- gfx10-1-generic
-- gfx10-3-generic
-- gfx11-generic
-- gfx12-generic

 Removed Targets
 ---------------