Commit e1ec7db
[LoopVectorize] Refine runtime memory check costs when there is an outer loop
When we generate runtime memory checks for an inner loop it's
possible that these checks are invariant in the outer loop and
so will get hoisted out. In such cases, the effective cost of
the checks should reduce to reflect the outer loop trip count.

This fixes a 25% performance regression introduced by commit
49b0e6d when building the SPEC2017 x264 benchmark with PGO, where
we decided the inner loop trip count wasn't high enough to warrant
the (incorrect) high cost of the runtime checks. Also, when
runtime memory checks consist entirely of diff checks these are
likely to be outer loop invariant.

1 parent ea50e94 commit e1ec7db
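The cost adjustment described in the commit message can be illustrated with a minimal standalone sketch. It is not the code from this commit: the function name `amortizedCheckCost`, its parameters, the fallback trip count of 2, and the lower bound of 1 are all assumptions chosen for illustration. The idea is that checks hoisted out of the inner loop run once per outer loop entry rather than once per outer iteration, so their cost is spread over the outer loop trip count.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): returns the cost of the runtime
// memory checks as seen by a single execution of the inner loop.
//
// checkCost            - cost of emitting the checks once
// invariantInOuterLoop - true if every check can be hoisted out of the outer loop
// outerTripCount       - estimated outer loop trip count, if known
std::uint64_t amortizedCheckCost(std::uint64_t checkCost,
                                 bool invariantInOuterLoop,
                                 std::optional<std::uint64_t> outerTripCount) {
  // Checks that depend on the outer loop run on every outer iteration,
  // so nothing can be amortized.
  if (!invariantInOuterLoop)
    return checkCost;

  // Assumption: with no trip count estimate, use a conservative guess of 2.
  // Clamp to at least 1 to avoid dividing by zero.
  std::uint64_t tripCount =
      std::max<std::uint64_t>(outerTripCount.value_or(2), 1);

  // Spread the one-off cost across the outer iterations, but never report
  // the checks as free.
  return std::max<std::uint64_t>(checkCost / tripCount, 1);
}
```

Under these assumptions, a check cost of 60 with an outer trip count of 10 is treated as a cost of 6 per inner loop execution, which makes it easier for an inner loop with a modest trip count to still be judged profitable to vectorize.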
File tree
2 files changed, +67 -12 lines changed
- llvm
  - lib/Transforms/Vectorize
  - test/Transforms/LoopVectorize/AArch64
Lines changed: 56 additions & 6 deletions
(Diff line content was not preserved; hunks are summarized by line range.)

| Original file lines | Diff lines | Deleted | Added |
|---|---|---|---|
| 1957-1962 | 1957-1964 | 0 | 2 |
| 2053-2058 | 2055-2063 | 0 | 3 |
| 2076-2091 | 2081-2141 | 2 | 47 |
| 2144-2151 | 2194-2201 | 2 | 2 |
| 2179-2186 | 2229-2236 | 2 | 2 |

Lines changed: 11 additions & 6 deletions
(Diff line content was not preserved; hunks are summarized by line range.)

| Original file lines | Diff lines | Deleted | Added |
|---|---|---|---|
| 32-38 | 32-39 | 1 | 2 |
| 68-74 | 69-76 | 1 | 2 |
| 104-110 | 106-113 | 1 | 2 |
| 140-146 | 143-150 | 1 | 2 |
| 176-183 | 180-188 | 2 | 3 |