go/types, x/tools/go/types, x/tools/analysis: encoding used by objectpath is inconsistent for use by compiler & tools/analysis #44195
Comments
@amscanne do the types constructed from gcexportdata always contain all of the methods that are present when analyzing from source (such as unexported methods)? If not, it seems that simply sorting the methods wouldn't be sufficient. Instead it seems like objectpath might need the actual method name rather than just an index? |
Great question. I have no idea. I'll do a bit more digging. However, sorting does solve the issue in this case, so I suspect the answer might be yes here (although that does not mean it is yes everywhere). |
cc @timothy-king @guodongli-google as well, since y'all have been looking at analysis recently and may have thoughts here. |
I think I have a better understanding of this issue. The object file itself seems to contain rich type information (along with names). Based on my reading of the gcexportdata code, it seems like it is sorted in lexicographical order (presumably by the toolchain when emitting the object). Because of this, gcexportdata will always decode in the same lexicographical order. This is good, because it establishes a consistent ordering.

The problem is fundamentally with objectpath, which relies on this ordering. The objectpath package encodes methods in a way that relies on method ordering, as opposed to using names (which almost everything else does). Therefore, the objectpath encoding is susceptible to the method ordering on *types.Func objects. Note that *types.Interface objects are explicitly ordered during construction, so this indexing scheme works fine for interfaces -- just not methods.

The facts framework relies on objectpath to generate consistent keys for types. If two *types.Package objects are the same with the exception of method ordering on relevant *types.Func, then objectpath will generate two different paths for many objects (functions themselves, parameters, etc.). For analysis, there are two different sources for the *types.Package: the current package comes from the source code (parsing and type checking), while the *types.Package for dependencies comes from gcexportdata. So the facts are saved using the source method ordering, while they are loaded using the gcexportdata method ordering (lexicographical).

It was suggested in [1] that type checking will sort methods, but I don't believe that is true. I think that AST parsing and type checking are sensitive to the ordering of files being parsed. This must be the case because the gcexportdata import does have a consistent lexicographical ordering imposed, as noted at the top.

As some evidence of this, my current workaround is to sort all input files [2], which seems to permute the generated types sufficiently to match in my case (though this is extremely fragile and works only for now). I've added sanity checking for the binary vs AST-derived types, and this fails simply by removing the sort in this case. So there are two reasonable solutions,
Since (1) is unlikely to break anything (both uses of objectpath will get the exact same encoding, since gcexportdata will already be sorted and objectpath will index into that list), I think this makes more sense. [1] https://go-review.googlesource.com/c/go/+/290750 |
I've sent a change to use a "canonical" ordering for objectpath: |
As expected, the workaround is too brittle to work correctly. While everything works in one configuration, another configuration (tsan/race) yields type conflicts. I think the proper objectpath fix is the way forward. |
Package-scope declarations are ordered lexicographically, but methods aren't declared at package scope. They're attached to their receiver type.
How robust is objectpath expected to be about different build configurations? In general, changing build tags can arbitrarily change a type's definition, which I would think would necessarily invalidate objectpath strings. |
Here is a description of the reproducers for this issue that I am aware of:
We can make method ids in objectpath agnostic to the GoFile order of these two paths by sorting them. If "foo_generated.go" and "foo.go" contained |
Hi, just catching up. As I commented on https://golang.org/cl/331789, I'm a little concerned about changing objectpath serialization. I think we should do it, but want to go over the compatibility implications. Here's my analysis:
Does anyone else have observations or concerns with respect to backwards compatibility of objectpath encoding? I have very little context on this package, but based on my analysis I think it's a gray area. Since this is fixing a bug, I think we should proceed. We should probably also make it explicit that objectpath encoding may change. |
FWIW, the current encoding depends on implementation details of the export data format that I at least consider to be unstable. E.g., I've already landed CLs on the dev.typeparams branch that sort methods before exporting them, so that's going to change objectpath's method numbering anyway. (And the x/tools exporter already sorts them too, but using a different sort order.) So to the extent that objectpath needs to be stable, it needs to be decoupled from the export data ordering of methods anyway. If it's important to maintain historical sort order, it might be able to sort methods based on Pos. But I think sorting on Id is simpler / more robust. |
I had fairly similar concerns and went through roughly the same checklist. Most of my observations were left as comments on https://golang.org/cl/331789. I think we are okay.
At the moment objectpath is determined by the order files are parsed. This means if two tools disagree about the file order the method ids are unstable anyways. We have some evidence that this is happening. (See my previous comment for details.) My understanding of the objectpath documentation is that this was not the intention. So plausibly this is a bug in the implementation.
This would probably need to be similar to other interfaces like gcexportdata. This needs to be consistent while stored in the cache by the same tool while analyzing a different project. Personally I think we may just want to stop using numbers for identifying the methods and switch to method names in the encoding. That would give quicker (asymptotically at least) writing and lookup times (after https://golang.org/cl/331789 goes in), remove the file ordering concerns, and seems conceptually stable. It may also be robust enough that clearly documenting the conditions under which objectpath is stable is not worth it. The main cost that I can think of is additional memory/storage space.
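As a rough illustration of identifying a method by name instead of by index, the sketch below resolves a method with `types.LookupFieldOrMethod` from the standard library. The source text, the type `T`, and the helper `methodByName` are hypothetical; this is not how objectpath encodes or decodes paths.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// Hypothetical package with two methods on T.
const src = `package p
type T struct{}
func (T) Zeta() {}
func (T) Alpha() {}
`

// check parses and type-checks src.
func check() (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	return new(types.Config).Check("p", fset, []*ast.File{f}, nil)
}

// methodByName resolves a method of a package-level type by name, so the
// result does not depend on where the method sits in the method list.
func methodByName(pkg *types.Package, typeName, method string) (*types.Func, bool) {
	obj := pkg.Scope().Lookup(typeName)
	if obj == nil {
		return nil, false
	}
	found, _, _ := types.LookupFieldOrMethod(obj.Type(), true, pkg, method)
	fn, ok := found.(*types.Func) // false for fields and for missing names
	return fn, ok
}

func main() {
	pkg, err := check()
	if err != nil {
		panic(err)
	}
	if fn, ok := methodByName(pkg, "T", "Alpha"); ok {
		fmt.Println("resolved:", fn.FullName())
	}
}
```

The trade-off mentioned above still applies: a name is longer than a small integer, so keys built this way take more space.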
I have not looked into the details of how token.FileSet are created, but Pos order might have the same set of problems: a different order of files passed to the tool creates a different order of Files, which creates different Pos orders. |
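The concern about Pos can be checked with a small sketch against `go/token` alone. It adds two dummy files to a fresh `token.FileSet` in each order and reports the base offset each file receives. The file names and the size of 100 are arbitrary, and `bases` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"go/token"
)

// bases adds equally sized dummy files to a fresh FileSet in the given order
// and reports the base position assigned to each file.
func bases(order []string) map[string]int {
	fset := token.NewFileSet()
	out := make(map[string]int)
	for _, name := range order {
		f := fset.AddFile(name, -1, 100) // -1 means "use the next free base"
		out[name] = f.Base()
	}
	return out
}

func main() {
	fmt.Println(bases([]string{"foo.go", "foo_generated.go"}))
	fmt.Println(bases([]string{"foo_generated.go", "foo.go"}))
}
```

Whichever file is added first gets the smaller base, so every Pos inside it compares lower. A sort on Pos would therefore inherit the same dependence on file order.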
Change https://golang.org/cl/331789 mentions this issue: |
Change https://golang.org/cl/339689 mentions this issue: |
On Go tip (pre-1.18), http://golang.org/issue/44195 is making SA1019 mistake uses of reflect.Value.Len for reflect.Value.InterfaceData, which is deprecated. It is thus mistakenly raising deprecation errors on uses of reflect.Value.Len. Suppress these errors by disabling SA1019 entirely. This is a bit overkill, but it is unclear to me if we want hard errors on deprecation anyways. That can be reevaluated when http://golang.org/issue/44195 is fixed. The other staticcheck analyzers are moved to alphabetical order. Updates golang/go#44195 PiperOrigin-RevId: 390655918
What's the status of this? Have we decided on either of the two proposed fixes? With Go tip it appears almost guaranteed to run into this bug, see the various issues referring to this issue. |
The status is that we delayed https://go-review.googlesource.com/c/tools/+/331789/ until post release. Now that the tree is open for Go 1.18 we are no longer blocked. We now need to make a decision between this and the other option https://golang.org/cl/339689. If folks have feedback on the candidate solutions, it would help to get this soon. (And since I cannot leave well enough alone #47725 is related, but not necessary to make a decision on this.) |
Change https://golang.org/cl/343390 mentions this issue: |
…dquirks These sorts are only important for 'toolstash -cmp' testing of unified IR against -G=0 mode, but they were added before I added -d=unifiedquirks to allow altering small "don't care" output details like this. This CL should help mitigate issues with #44195 until package objectpath is updated and deployed. Change-Id: Ia3dcf359481ff7abad5ddfca8e673fd2bb30ae01 Reviewed-on: https://go-review.googlesource.com/c/go/+/343390 Trust: Matthew Dempsky <[email protected]> Run-TryBot: Matthew Dempsky <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Robert Griesemer <[email protected]>
Change https://golang.org/cl/354433 mentions this issue: |
This issue is derived from dominikh/go-tools#924.
What version of Go are you using (go version)?
1.16rc1
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
Linux, ARM64.
What did you do?
When using analyzers implemented using the analysis package, I hit upon a case where facts being exported and imported were being matched up with the wrong methods. (This issue was a method-specific one, and the proposed fix is also method specific.)
For example, instrumenting the facts package in Encode and Decode for serialization and deserialization of facts:
It turns out that the objectpath package is producing the name above, and this is what is used for keying the type in the exported fact data. During analysis, the type information for imported packages is sourced from the compiled artifact via gcexportdata, but the current package types are synthesized directly from the source files, and facts are derived from those types. However, this means that there is the possibility of facts being constructed using a different method ordering than what might appear in the compiler artifact, and therefore fact serialization may not key types correctly (and whoever is importing this fact is in for a bad time).
Since this logic is effectively built in to the compiler (which is often a binary package), this has the potential to cause issues for any analysis packages that may link against a different go/types package or perturb the ordering for NamedType.methods.
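The mismatch can be shown with a toy model that does not use go/types at all. A fact is saved under the index of a method in one ordering and read back against a different ordering of the same methods. The method names, both orderings, and the helpers `indexKey` and `sortedCopy` are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// indexKey returns the position of name in methods, or -1 if it is absent.
// It stands in for an index-based key.
func indexKey(methods []string, name string) int {
	for i, m := range methods {
		if m == name {
			return i
		}
	}
	return -1
}

// sortedCopy returns a lexicographically sorted copy of methods.
func sortedCopy(methods []string) []string {
	out := append([]string(nil), methods...)
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical order seen while type-checking from source.
	sourceOrder := []string{"Write", "Close", "Flush"}
	// Hypothetical order seen when the same type is decoded from export data.
	exportOrder := sortedCopy(sourceOrder)

	key := indexKey(sourceOrder, "Write")
	fmt.Printf("fact for Write saved under index %d\n", key)
	fmt.Printf("index %d resolves to %s on import\n", key, exportOrder[key])
}
```

In this model the fact recorded for one method lands on another as soon as the two sides disagree about order. If both sides index into the same sorted list, the key resolves back to the method it was saved for.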
Possible Fix
While I realize that this API is not stable, the issue is effectively mitigated by making the Method ordering stable for NamedTypes by performing a sort on the first call to Method (and ensuring that no calls to AddMethod happen after that point). This should eliminate the sensitivity issue (but won't fix breakages in the case of genuine binary incompatibility, which is fine).
A draft of this fix is posted here: https://go-review.googlesource.com/c/go/+/290750
(But of course, I could be way off in my diagnosis, or there could be a better solution.)