Are YOU experiencing shady output from your Go debugger? Try these things!

A couple of days ago I noticed some unreliable output when stepping through the SnailLife server. I use GoLand and Delve debugger, but in the course of looking into this I picked up some tips from trial and error plus from some very helpful people which I think might be useful regardless of the debugger you use.

First, the repro.

I started a client test in debug mode stopped at a breakpoint. The test spins up my server binary. I attached a debugger to the server and started stepping through the client breakpoints. Eventually my server breakpoint is hit and I step through the server from there.

Instead of seeing accurate variable information from the debugger, I was getting dodgy output. Variables look fine on one line and then turn nil the next. Structs which were not empty turn empty. Strings which were not empty turn empty as you step through. Check out the msg variable here - I get it using fmt.Sprintf("%v", jarReq). The msg variable gets an accurate looking value even though jarReq is displayed as nil.

msg variable

Then, stepping one more line forward, msg turns into an empty string:

msg variable is an empty string

Also note that if I step through to and past a line that runs jar := jarReq.Jar I do not get a nil pointer dereference panic and jar looks like it should, so jarReq is obviously not nil:

accurate jar value

One step forward and jar is now a zero value struct:

jar is nil

This had the signs of debugging an optimized binary all over it, but I thought I built with compiler optimizations and inlining disabled (-N -l in go build)

I originally posted about this on a Go forum hoping a fresh set of eyes might spot something I’m doing wrong, but no one had any obvious insights there and the suggestion was to file a Delve bug. I was kind of reluctant to file a Delve bug because I was almost confident that this was not a Delve issue. After digging around some more I resigned myself to filing a Delve bug.

As suspected, it was not a Delve bug. I never did get to the root of exactly what caused this as it required some random crap like a machine reboot and a Go update to 1.11 release to fix along with the stuff that was obviously wrong (which I go into below). So there was definitely some local weirdness going on (and no, it is extremely unlikely that this was some unreported Go issue related to rc1 which required an upgrade). Nevertheless, I learned some things that I was doing wrong and I also learned some tips to better troubleshoot this kind of issue going forward.

-gcflags=‘all=-N -l’, NOT -gcflags ‘-N -l’ (if you are on Go >= 1.10)

This was my biggest mistake. I thought I was running go build (and alternatively go install) with compiler optimizations disabled by running go build -gcflags '-N -l'. This is wrong. This command only disabled optimizations and inlining for the main package. To disable these for the entire program, you need to run go build -gcflags='all=-N -l'.

Update: A reader emailed and pointed out to me that this behaviour was implemented in Go 1.10 and the above advice is only relevant if you are on 1.10 or newer. If you are on older versions of Go, this does not apply. You can read more about the change here. I dug up the original discussion about this also, which can be found here

Remember to pass the compiler flags to go install

Luckily I did not make this error this time, but I did have a scare for a minute thinking I might have! When I started learning Go I had a misconception about go build and go install. I would always run go build && go install, thinking build will build and output the binary to my current location and install will basically just move it into my $GOPATH. This is not what happens. go install also builds, it just happens to output the binary into $GOPATH. So if one were to run go build with all the necessary flags to disable optimizations and then just run go install right after that, they’d be left with another - optimized - version of the binary. If you use go install, you need to also give it the same compiler flags that you would if you were using go build.

objdump –dwarf your-exe | grep DW_AT_producer

I was using binary size to roughly try and see if my binary was built with optimizations disabled or not. After I went through the magical machine-reboot-and-Go-update steps this did indeed work - the binary was now a couple of MB larger. But that’s not a very reliable test, really. Alessandro Arzilli of Delve pointed out this command, which I could use to confirm if my binary was built with the aforementioned -N -l flags or not. Running objdump --dwarf your-exe | grep DW_AT_producer should produce output in which every line is followed by -N -l, if the compilation had the expected result.

Of course just parroting back a command given to you isn’t where we want to be. Instead of using the command totally blindly it would be nice to know the general gist of what it actually does:

objdump displays information about specified object files. --dwarf displays contents of the file’s debug sections. DWARF (Heh, heh, get it? To go with ELF? :D) is a debugging data format which stands for “Debugging With Attributed Record Format” (although the name was apparently made up after the acronym!) It explained in more detail here and here, I recommend checking these out.

Go adds compilation flags to the DW_AT_producer attribute in the DWARF debug information. After a few minutes of digging around, I found this Go issue about recording the optimization level in the binary, filed and resolved in 2017. There were a few changes associated with this but here’s where they seem to add the actual information to the attribute: https://go-review.googlesource.com/c/go/+/71430/5/src/cmd/link/internal/ld/dwarf.go (check out the rest of the commit as well)

Better output from GoLand

Although the issue has nothing to do with GoLand itself, Florin Pățan of GoLand joined in to discuss the insufficient output one gets from GoLand when debugging an optimized binary with Delve. Delve apparently provides a warning when you debug an optimized binary, but GoLand does not consume this or report it to the user. Instead, you get the kinds of output I saw - nil and otherwise zero-value variables. This is pretty misleading. Florin provided and followed up on an issue already in existence to improve this on the JetBrains side: https://youtrack.jetbrains.com/issue/GO-5311

Conclusion

Overall, even though I didn’t end up with all the answers in the end, I learned some interesting things when looking into this problem. I also appreciated how members of both the Delve and JetBrains communities jumped on this issue so quickly to provide useful insights and follow up on potential IDE improvements here, even though this was obviously not directly a Delve or GoLand problem.

I guess I can remove all those fmt.Println()s that I fell back on to track down that original bug I was debugging now…

comments powered by Disqus