cmd/compile: consider using DWARF 5 #26379
Strongly in favor of doing this the moment we think we can get away with it. I'm happy to do the work for location lists. |
Un-proposaling this per discussion with @ianlancetaylor. |
For reference, the debug info for the linux_amd64/go1.11.1 version of cmd/compile is 15MB of which 8MB are in debug_loc and debug_ranges, specifically:
Rewriting debug_loc and debug_ranges in the new format specified by DWARF 5 will reduce the size of debug_ranges to 774460 B (31% of DWARF 4 size) and of debug_loc to 2032960 B (34% of DWARF 4 size). Edit: debug_ranges and debug_loc were accidentally swapped in the last sentence. |
Is compression still doing a better job at reducing binary size than DWARF 5, or can D5 also be compressed? I did just mostly finish a DWARF splitter for OSX that recreates the expected file in the expected place, and adds the expected UUID for matching the two files. I suspect I ought to be using a hash of the binary contents excluding debugging information, so that identical files will remain identical after UUIDs are added. |
The compression isn't actually part of the DWARF spec, so it's orthogonal to the DWARF version and can be used with DWARF 5. @aarzilli, thanks for doing that experiment. Can you get the numbers for DWARF 5 if it's also zlib compressed? |
After compression the size is basically the same, 98% (of compressed DWARF4) for debug_ranges and 77% (of compressed DWARF4) for debug_loc. The compression time however is reduced by 40%. |
Is that compression time+space comparing current low-effort (higher speed) versus future low-effort? Sorry to be so picky, it's just that we've made mistakes here before. |
I'm comparing compression using zlib.BestSpeed for both, like the linker. Since measuring how much time it actually takes inside the linker is hard, I measured how much time it takes to recompress the compressed section. Because I send the whole section to the compressor and the linker doesn't, I get slightly better compression than the linker. But the difference is small (around 1%), and since I'm doing it for both DWARF 4 and DWARF 5 the result should be valid. |
I parsed a DWARF 5 Linux kernel module; for the statement below, index 0 is not nil. It returns the real file path.
|
FWIW, GCC 11.1 was released on April 27, 2021. From https://gcc.gnu.org/gcc-11/changes.html:
Also, Clang 14 was released on March 25, 2022. From https://releases.llvm.org/14.0.0/tools/clang/docs/ReleaseNotes.html#dwarf-support-in-clang:
|
Hi all, I'm working on a set of patches to switch the Go compiler/linker over to DWARF 5, just FYI. |
Change https://go.dev/cl/633878 mentions this issue: |
Change https://go.dev/cl/633880 mentions this issue: |
Change https://go.dev/cl/635345 mentions this issue: |
Change https://go.dev/cl/635337 mentions this issue: |
Change https://go.dev/cl/633879 mentions this issue: |
Change https://go.dev/cl/633877 mentions this issue: |
Change https://go.dev/cl/634415 mentions this issue: |
Change https://go.dev/cl/635836 mentions this issue: |
Update: As reported by Alessandro, the binutils "objdump" command is complaining that the
Here's what's happening. As currently implemented, the Go toolchain generates a single monolithic The code binutils After it has finished printing the It is unfortunate that it works this way, since there is nothing in the DWARF standard that requires the "one header per compilation unit" model, and the hijacking or overloading of the Also worth noting that the Go toolchain uses a similar monolithic section approach for I'll need to think about what to do about this. |
The fix for the GNU binutils seems fairly easy. Should we just send them a patch? |
I agree it's worth doing this. I don't think it should block moving to DWARF 5 for Go, however. |
This patch enables the DWARF version 5 experiment by default for most platforms that support DWARF. Note that macOS is kept at version 4, due to problems with CGO builds; the "dsymutil" tool from older versions of Xcode (prior to V16) can't handle DWARF 5. Similarly, we keep DWARF 4 for GOOS=aix, where XCOFF doesn't appear to support the new section subtypes in DWARF 5. Updates #26379. Change-Id: I5edd600c611f03ce8e11be3ca18c1e6686ac74ef Reviewed-on: https://go-review.googlesource.com/c/go/+/637895 Reviewed-by: Cherry Mui <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: David Chase <[email protected]>
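The platform policy described in that commit message can be sketched as a small function. This is a simplified illustration only: the name dwarfVersionFor is hypothetical, it is not the toolchain's actual code, and it ignores platforms that do not emit DWARF at all.

```go
package main

import "fmt"

// dwarfVersionFor is a hypothetical helper mirroring the policy in the
// commit message: darwin and aix stay on DWARF 4, everything else that
// supports DWARF moves to DWARF 5. Platforms without DWARF support are
// not modeled here.
func dwarfVersionFor(goos string) int {
	switch goos {
	case "darwin", "aix":
		return 4
	default:
		return 5
	}
}

func main() {
	for _, goos := range []string{"linux", "darwin", "aix", "windows"} {
		fmt.Printf("GOOS=%s -> DWARF %d\n", goos, dwarfVersionFor(goos))
	}
}
```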
Change https://go.dev/cl/656895 mentions this issue: |
Change https://go.dev/cl/656836 mentions this issue: |
Fix a typo in the code that decides which GOOS values will support use of DWARF 5 ("darwin" was not spelled correctly). Updates #26379. Change-Id: I3a7906d708550fcedc3a8e89d0444bf12b9143f1 Reviewed-on: https://go-review.googlesource.com/c/go/+/656895 Auto-Submit: Ian Lance Taylor <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: David Chase <[email protected]>
Bump the required version of GDB up to 10 from 7.7 in the runtime GDB tests, so as to ensure that we have something that can handle DWARF 5 when running tests. In theory there is some DWARF 5 support on the version 9 release branch, but we get "Dwarf Error: DW_FORM_addrx" errors for some archs on builders where GDB 9.2 is installed. Updates #26379. Change-Id: I1b7b45f8e4dd1fafccf22f2dda0124458ecf7cba Reviewed-on: https://go-review.googlesource.com/c/go/+/656836 Auto-Submit: Ian Lance Taylor <[email protected]> Reviewed-by: David Chase <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]>
Change https://go.dev/cl/657175 mentions this issue: |
This patch changes the strategy we use in the compiler for handling range information for inlined subroutine bodies, fixing a bug in how this was handled for DWARF 5. The high and low PC values being emitted for DW_TAG_inlined_subroutine DIEs were incorrect, pointing to the start of functions instead of the proper location. The fix in this patch is to move to unconditionally using DW_AT_ranges for inlined subroutines, even those with only a single range.

Background: prior to this point, if a given inlined function body had a single contiguous range, we'd pick an abbrev entry for it with explicit DW_AT_low_pc and DW_AT_high_pc attributes. If the extent of the code for the inlined body was not contiguous (which can happen), we'd select an abbrev that used a DW_AT_ranges attribute instead.

This strategy (preferring explicit hi/lo PC attrs for a single-range func) made sense for DWARF 4, since in DWARF 4 the representation used in the .debug_ranges section was especially heavyweight (lots of space, lots of relocations), so having explicit hi/lo PC attrs was less expensive.

With DWARF 5, range info is written to the .debug_rnglists section, and the representation here is much more compact. Specifically, a single hi/lo range can be represented using a base address in addrx format (max of 4 bytes, but more likely 2 or 3) followed by the start and end points of the range in ULEB128 format. This combination is more compact spacewise than the explicit hi/lo values, and has fewer relocations (0 as opposed to 2).

Note: we should at some point consider applying this same strategy to lexical scopes, since we can probably reap some of the same benefits there as well.

Updates #26379. Fixes #72821.

Change-Id: Ifb65ecc6221601bad2ca3939f9b69964c1fafc7c Reviewed-on: https://go-review.googlesource.com/c/go/+/657175 Reviewed-by: Ian Lance Taylor <[email protected]> Reviewed-by: Cherry Mui <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Alessandro Arzilli <[email protected]>
Change https://go.dev/cl/657177 mentions this issue: |
Change https://go.dev/cl/657176 mentions this issue: |
Add a small fragment describing the move to DWARF 5 for this release, along with the name of the GOEXPERIMENT. Updates #26379. Change-Id: I3a30a71436133e2e0a5edf1ba0db84b9cc17cc5c Reviewed-on: https://go-review.googlesource.com/c/go/+/657176 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: David Chase <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]>
This patch extends the change in CL 657175 to apply the same abbrev selection strategy to single-range lexical scopes that we're now using for inlined routine bodies, when DWARF 5 is in effect. Ranges are more compact and use fewer relocations than explicit hi/lo PC values, so we might as well always use them. Updates #26379. Change-Id: Ieeaddf50e82acc4866010e29af32bcd1fb3b4f02 Reviewed-on: https://go-review.googlesource.com/c/go/+/657177 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
Update: DWARF 5 is now enabled at tip (for non-darwin and non-aix platforms) 🎉🎉🎉. I have fixed a few bugs, but so far things look good. |
Hi @thanm , thank you for the great work! Is there anything we still need to do for this issue? Or we can close this? Thanks. |
I think we can close the issue, thanks. I am still looking at issue #72810 which I think is some sort of book-keeping problem in the compiler, I hope to have a fix soon (I don't expect this to be a deal-breaker). Hopefully we can enable DWARF5 by default for darwin once the oldest supported xcode version includes the necessary support (e.g. V17). |
Change https://go.dev/cl/663235 mentions this issue: |
Closing out this issue since I now have a fix in flight for #72810. |
When the compiler builds a Go package with DWARF 5 generation enabled, it emits relocations into various generated DWARF symbols (ex: SDWARFFCN) that use the R_DWTXTADDR_* flavor of relocations. The specific size of this relocation is selected based on the total number of functions in the package -- if the package is tiny (just a couple funcs) we can use R_DWTXTADDR_U1 relocs (which target just a byte); if the package is larger we might need to use the 2-byte or 3-byte flavor of this reloc.

Prior to this patch, the strategy used to pick the right relocation size was flawed in that it didn't take into account packages with assembly code. For example, if you have a package P with 200 funcs written in Go source and 200 funcs written in assembly, you can't use the R_DWTXTADDR_U1 reloc flavor for indirect text references since the real function count for the package (asm + go) exceeds 255.

The new strategy (with this patch) is to have the compiler look at the "symabis" file to determine the count of assembly functions. For the assembler, rather than create additional plumbing to pass in the Go source func count, we just use a dummy (artificially high) function count so as to select a relocation that will be large enough.

Fixes #72810. Updates #26379.

Change-Id: I98d04f3c6aacca1dafe1f1610c99c77db290d1d8 Reviewed-on: https://go-review.googlesource.com/c/go/+/663235 Reviewed-by: Dmitri Shuralyov <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: David Chase <[email protected]>
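The size selection described above boils down to choosing the narrowest field that can hold the package's total function count. The sketch below illustrates that idea; textRefSize is a hypothetical name, the thresholds follow the "exceeds 255" example in the commit message, and the 4-byte fallback is an assumption that the message does not mention.

```go
package main

import "fmt"

// textRefSize (hypothetical) returns the number of bytes needed for an
// indirect text reference, given the Go and assembly function counts.
// The bug described in the commit message amounts to passing only
// goFuncs and forgetting asmFuncs.
func textRefSize(goFuncs, asmFuncs int) int {
	total := goFuncs + asmFuncs
	switch {
	case total <= 0xff:
		return 1 // corresponds to the R_DWTXTADDR_U1 flavor
	case total <= 0xffff:
		return 2
	case total <= 0xffffff:
		return 3
	default:
		return 4 // assumed fallback, not stated in the commit message
	}
}

func main() {
	fmt.Println(textRefSize(200, 0))   // Go funcs only: one byte suffices
	fmt.Println(textRefSize(200, 200)) // Go + asm: one byte is too small
}
```

With the example from the commit message, counting only the 200 Go functions picks a 1-byte reference, while the true total of 400 requires 2 bytes.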
This package abstracts object files for use in dyninst. It importantly retains access to sections for use outside of just the *dwarf.Data construction so that the same sections can be reused for location list parsing and other operations not directly supported by debug/dwarf. It also wraps the abstractions provided by delve's loclist package to make location lists easier to parse. This layer can also in the future (if we deem it relevant) model the differences between elf and mach-o (though that's possibly over-abstracting a bit). There's some TODOs left for supporting dwarf 5 location lists that are coming in the next Go release (see [0]). [0]: golang/go#26379
This package abstracts object files for use in dyninst. It importantly retains access to sections for use outside of just the *dwarf.Data construction so that the same sections can be reused for location list parsing and other operations not directly supported by debug/dwarf. It also wraps the abstractions provided by delve's loclist package to make location lists easier to parse. This layer can also in the future (if we deem it relevant) model the differences between elf and mach-o (though that's possibly over-abstracting a bit). There's some TODOs left for supporting dwarf 5 location lists that are coming in the next Go release (see [0]). [0]: golang/go#26379 This commit introduces the package with a ridiculous misspelling to work around gitignore rules.
DWARF 5 has several advantages over previous versions of DWARF. Notably,
- It supports position-independent representations, which significantly reduces the number of relocations in object files and hence the size of object files and the load on the linker. In the go binary, 49% of the 503,361 total relocations are in the DWARF.
- It supports much more compact location and range list formats. The location and range list sections are 6% of the 12MiB of the go binary, even when zlib compressed.
- It has an official language code for Go. :)
DWARF 5 is quite new, and I don't think the rest of the ecosystem is ready yet, but I wanted to get the idea floating. It is supported by the GNU and LLVM toolchains and some debuggers. Support was added in GCC 7.1 (May 2017) and GDB 8.0 (June 2017). It appears to be in the latest LLVM, which covers most of the Xcode tools, though I can't find when it was added.
It is currently not supported by LLDB or the macOS linker. We could potentially get around the macOS linker by leaving out the Go DWARF from the objects we pass to the system linker and then merging it in to the final binary (we already do a merge step). This is more feasible with DWARF5 because it's mostly position-independent, so we wouldn't need dsymutil to relocate it for us.
/cc @cherrymui @heschik @dr2chase @randall77 @ianlancetaylor