-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Add an AccessPath abstraction and formalize memory access #33121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Would it be helpful to have a high-level description of access paths in |
@varungandhi-apple yes, to the extent that I want to add constraints to SIL, that should be documented in SIL.rst. Independently, we should have documentation as a starting point for the central abstractions we use in the SIL optimizer. I haven't been doing that, but it looks like SIL programmers manual is the right place https://github.com/apple/swift/blob/master/docs/SILProgrammersManual.md. I'll add something there. All of this is secondary to good file-level comments though, which is where I think anything of substantial detail should go. We're sorely lacking adequate comments IMO |
@atrick Somehow I'm missing the essentials here. Introducing new abstractions always comes with a cost. It's very important for an abstraction like this to have a small and clean API. Is the API here the ==/!= operators? So that one can use it to check if two accesses are aliased or not? How does AccessPath relate to ProjectionPath? Should it be a replacement for ProjectionPath? If yes, then I'm strongly in favor of replacing ProjectionPath with AccessPath before this PR is merged. Because otherwise it will never be done and we already have too many utilities which are piling up in the code base. If no, it would be worth documenting what's the difference and when to use which utility. |
d4d85f9
to
b6d7e36
Compare
b6d7e36
to
4bd8dfd
Compare
@eeckstein @varungandhi-apple @airspeedswift Yes, the basic API for the data type is == and !=; it simply identifies a memory location within an access. Importantly, it's a 3-word value that functions as a hash key. Most of the important API comes from the AccessedStorage that it contains. However, as you can read in documentation, or from the class definition, other important APIs are: We currently have nothing like AccessPath is also a complete abstraction, unlike anything we have today. It handles all SIL patterns that I consider to be well-formed. This includes dynamic index offsets, negative index offsets, and arbitrary index chains. It also models all types of pointer casting and boxing/unboxing patterns that should be supported in SIL. The goal of this PR is not to add another utility. It's about creating a verified correct and complete model for SIL memory access. This needs to exist so we can make progress on several things that have come up in the past few months (at least if we don't want to make the current mess worse):
I don't think RLE/DSE should ever have been checked in with the current design for efficiency reasons, and I made that clear at the time. It's true that simply substituting AccessPath would reduce the space and time complexity, but others have been interested in working on it, and my current priority is not fixing compile-time and memory usage issues (the arbitrary thresholds that are in place limit that overhead). ProjectionPath is different in that it tries to fully abstract over the SIL instructions instead of simply augmenting SIL. I don't think that goal fits well with any of the current uses. So, I agree it should be straightforward to remove ProjectionPath over time. I believe the additional functionality of working with projections should be built alongside and on-top of AccessPath, rather than building it into AccessPath. For example, I implemented a simple projection cloner cleanly on top of AccessPath as a separate utility. The less urgent but more interesting need for AccessPath is not to replace ProjectionPath, but to replace all of the ad-hoc anlayses scattered throughout the code that rely on forming a complete transitive closure of non-escaping uses of memory locations. For example, if StackPromotion used this, then we wouldn't need all the complexity of use-points in EscapeAnalysis. Passes that badly need a standard, verified model of known-memory-uses are CopyForwarding, COWArrayOpt, ArrayPropertyOpt. However, my current interest is less about optimization and more about writing light-weight diagnostic passes. I don't think that the AccessPath should directly provide an alias analysis interface. I actually think it should remain minimal. The key is to define a simple model that can be fully verified, then build efficient, purposeful utilities on top of that. The problem with most of our existing utilites is that they aren't proper abstractions at all. They don't model any well-defined concepts and aren't complete or verified. They try to do everything for everyone without really knowing what they're doing. All passes have slightly different needs, so the utilities become misleading for most uses by hiding all of their assumptions behind an interface. It's very straightforward to provide an alias analysis interface on top of |
4bd8dfd
to
f9a407f
Compare
@swift-ci test |
Build failed |
Build failed |
f9a407f
to
f635c3d
Compare
Something important I forgot to mention: one of the original motivations for implementing AccessPathVerification is to get closer to enabling access marker verification by default. Currently exclusivity verification is under a special flag, only run by a special bot that no one looks at. |
f635c3d
to
26ce46b
Compare
...and avoid reallocation. This is immediately necessary for LICM, in addition to its current uses. I suspect this could be used by many passes that work with addresses. RLE/DSE should absolutely migrate to it.
Change ProjectionIndex for ref_tail_addr to std::numeric_limits<int>::max(); This is necessary to disambiguate the tail elements from ref_element_addr field zero.
To clarify and unify logic, improve precision, and behave consistently with other code that does the same thing.
Things that have come up recently but are somewhat blocked on this: - Moving AccessMarkerElimination down in the pipeline - SemanticARCOpts correctness and improvements - AliasAnalysis improvements - LICM performance regressions - RLE/DSE improvements Begin to formalize the model for valid memory access in SIL. Ignoring ownership, every access is a def-use chain in three parts: object root -> formal access base -> memory operation address AccessPath abstracts over this path and standardizes the identity of a memory access throughout the optimizer. This abstraction is the basis for a new AccessPathVerification. With that verification, we now have all the properties we need for the type of analysis requires for exclusivity enforcement, but now generalized for any memory analysis. This is suitable for an extremely lightweight analysis with no side data structures. We currently have a massive amount of ad-hoc memory analysis throughout SIL, which is incredibly unmaintainable, bug-prone, and not performance-robust. We can begin taking advantage of this verifably complete model to solve that problem. The properties this gives us are: Access analysis must be complete over memory operations: every memory operation needs a recognizable valid access. An access can be unidentified only to the extent that it is rooted in some non-address type and we can prove that it is at least *not* part of an access to a nominal class or global property. Pointer provenance is also required for future IRGen-level bitfield optimizations. Access analysis must be complete over address users: for an identified object root all memory accesses including subobjects must be discoverable. Access analysis must be symmetric: use-def and def-use analysis must be consistent. AccessPath is merely a wrapper around the existing accessed-storage utilities and IndexTrieNode. Existing passes already very succesfully use this approach, but in an ad-hoc way. With a general utility we can: - update passes to use this approach to identify memory access, reducing the space and time complexity of those algorithms. - implement an inexpensive on-the-fly, debug mode address lifetime analysis - implement a lightweight debug mode alias analysis - ultimately improve the power, efficiency, and maintainability of full alias analysis - make our type-based alias analysis sensistive to the access path
The main goal of this checker is to find the access scope. To do that, it needs to use `getAccessBegin`, which in turn needs to handle address_to_pointer. Use isAccessedStorageCast to handle more cases, do a small refactoring, and fix minor issues.
Add a flag: enable-accessed-storage-dump-uses
3c6148a
to
d559080
Compare
@swift-ci test |
@swift-ci benchmark |
@swift-ci test source compatibility |
These docs have gone through one round of review. Now I want to get this AccessPathVerification checked in ASAP. I have a set of new kinds of verification to land, but don't want to do more than one per week (starting with this one) to give them each plenty of exposure. |
Some of the functionality discussed in the docs will be landed in a follow-up PR to avoid introducing too many deep functional changes at once. |
Performance: -O
Code size: -OPerformance: -Osize
Code size: -OsizePerformance: -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
Build failed |
Build failed |
d559080
to
0fd6be8
Compare
Begin to formalize the model for valid memory access in SIL. Ignoring ownership, every access is a def-use chain in three parts:
object root -> formal access base -> memory operation address
AccessPath abstracts over this path and standardizes the identity of a
memory access throughout the optimizer. This abstraction is the basis for a new AccessPathVerification.
With that verification, we now have all the properties we need for the type of analysis requires for exclusivity enforcement, but now generalized for any memory analysis. This is suitable for an extremely lightweight analysis with no side data structures. We currently have a massive amount of ad-hoc memory analysis throughout SIL, which is incredibly unmaintainable, bug-prone, and not performance-robust. We can begin taking advantage of this verifably complete model to solve that problem.
The properties this gives us are:
Access analysis must be complete over memory operations: every memory operation needs a recognizable valid access. An access can be unidentified only to the extent that it is rooted in some non-address type and we can prove that it is at least not part of an access to a nominal class or global property. Pointer provenance is also required for future IRGen-level bitfield optimizations.
Access analysis must be complete over address users: for an identified object root all memory accesses including subobjects must be discoverable.
Access analysis must be symmetric: use-def and def-use analysis must
be consistent.
AccessPath is merely a wrapper around the existing accessed-storage utilities and IndexTrieNode. Existing passes already very succesfully use this approach, but in an ad-hoc way. With a general utility we can:
update passes to use this approach to identify memory access,
reducing the space and time complexity of those algorithms.
implement an inexpensive on-the-fly, debug mode address lifetime analysis
implement a lightweight debug mode alias analysis
ultimately improve the power, efficiency, and maintainability of
full alias analysis
make our type-based alias analysis sensistive to the access path