Commit 5b65896
[libc++] Optimize ranges::copy{, _n} for vector<bool>::iterator (#121013)
This PR optimizes `std::ranges::copy` and `std::ranges::copy_n` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield speedups of up to **2000x** for aligned copies
and **60x** for unaligned copies. New tests have been added to validate
the optimized code paths.
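For context, the call pattern being optimized looks roughly like the sketch below. `copy_bits` and `copy_n_bits` are hypothetical helpers written for illustration; they are not part of libc++ or of this PR. The sketch assumes that "aligned" means the source and destination ranges start at the same bit offset within a storage word, which is how the benchmark names read. Writing through `vector<bool>::iterator` with the ranges algorithms relies on the const-assignable proxy reference that arrived with C++23, so the sketch falls back to `std::copy` and `std::copy_n` under older standards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using BitVec = std::vector<bool>;
using Diff   = BitVec::difference_type;

// Hypothetical helper: copies n bits of src, starting at bit src_off, into a
// zero-filled vector at bit dst_off, using the iterator-pair form.
// src is taken by non-const reference only so that both iterators are plain
// vector<bool>::iterator, the type named in the PR title.
inline BitVec copy_bits(BitVec& src, std::size_t src_off, std::size_t dst_off, std::size_t n) {
  assert(src_off + n <= src.size());
  BitVec dst(dst_off + n, false);
  auto first = src.begin() + static_cast<Diff>(src_off);
  auto last  = first + static_cast<Diff>(n);
  auto out   = dst.begin() + static_cast<Diff>(dst_off);
#if __cplusplus > 202002L
  std::ranges::copy(first, last, out);
#else
  std::copy(first, last, out);  // pre-C++23 fallback
#endif
  return dst;
}

// Same operation, expressed with the counted form.
inline BitVec copy_n_bits(BitVec& src, std::size_t src_off, std::size_t dst_off, std::size_t n) {
  assert(src_off + n <= src.size());
  BitVec dst(dst_off + n, false);
  auto first = src.begin() + static_cast<Diff>(src_off);
  auto out   = dst.begin() + static_cast<Diff>(dst_off);
#if __cplusplus > 202002L
  std::ranges::copy_n(first, static_cast<Diff>(n), out);
#else
  std::copy_n(first, static_cast<Diff>(n), out);  // pre-C++23 fallback
#endif
  return dst;
}
```

Under the stated assumption, `copy_bits(v, 0, 0, v.size())` exercises the aligned case, while `copy_bits(v, 5, 2, 100)` starts at different bit offsets in source and destination and so exercises the unaligned one.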
- Aligned source-destination bits
ranges::copy
```
--------------------------------------------------------------------------
Benchmark                             Before     After  Improvement
--------------------------------------------------------------------------
bm_ranges_copy_vb_aligned/8          10.8 ns   1.42 ns           8x
bm_ranges_copy_vb_aligned/64         88.5 ns   2.28 ns          39x
bm_ranges_copy_vb_aligned/512         709 ns   1.95 ns         364x
bm_ranges_copy_vb_aligned/4096       5568 ns   5.01 ns        1111x
bm_ranges_copy_vb_aligned/32768     44754 ns   38.7 ns        1156x
bm_ranges_copy_vb_aligned/65536     91092 ns   73.2 ns        1244x
bm_ranges_copy_vb_aligned/102400   139473 ns    127 ns        1098x
bm_ranges_copy_vb_aligned/106496   189004 ns   81.5 ns        2319x
bm_ranges_copy_vb_aligned/110592   153647 ns   71.1 ns        2161x
bm_ranges_copy_vb_aligned/114688   159261 ns   70.2 ns        2269x
bm_ranges_copy_vb_aligned/118784   181910 ns   73.5 ns        2475x
bm_ranges_copy_vb_aligned/122880   174117 ns   76.5 ns        2276x
bm_ranges_copy_vb_aligned/126976   176020 ns   82.0 ns        2147x
bm_ranges_copy_vb_aligned/131072   180757 ns    137 ns        1319x
bm_ranges_copy_vb_aligned/135168   190342 ns    158 ns        1205x
bm_ranges_copy_vb_aligned/139264   192831 ns    103 ns        1872x
bm_ranges_copy_vb_aligned/143360   199627 ns   89.4 ns        2233x
bm_ranges_copy_vb_aligned/147456   203881 ns   88.6 ns        2301x
bm_ranges_copy_vb_aligned/151552   213345 ns   88.4 ns        2413x
bm_ranges_copy_vb_aligned/155648   216892 ns   92.9 ns        2335x
bm_ranges_copy_vb_aligned/159744   222751 ns   96.4 ns        2311x
bm_ranges_copy_vb_aligned/163840   225995 ns    173 ns        1306x
bm_ranges_copy_vb_aligned/167936   235230 ns    202 ns        1165x
bm_ranges_copy_vb_aligned/172032   244093 ns    131 ns        1863x
bm_ranges_copy_vb_aligned/176128   244434 ns    111 ns        2202x
bm_ranges_copy_vb_aligned/180224   249570 ns    108 ns        2311x
bm_ranges_copy_vb_aligned/184320   254538 ns    108 ns        2357x
bm_ranges_copy_vb_aligned/188416   261817 ns    113 ns        2317x
bm_ranges_copy_vb_aligned/192512   269923 ns    125 ns        2159x
bm_ranges_copy_vb_aligned/196608   273494 ns    210 ns        1302x
bm_ranges_copy_vb_aligned/200704   280035 ns    269 ns        1041x
bm_ranges_copy_vb_aligned/204800   293102 ns    231 ns        1269x
```
ranges::copy_n
```
--------------------------------------------------------------------------
Benchmark                               Before     After  Improvement
--------------------------------------------------------------------------
bm_ranges_copy_n_vb_aligned/8          11.8 ns   0.89 ns          13x
bm_ranges_copy_n_vb_aligned/64         91.6 ns   2.06 ns          44x
bm_ranges_copy_n_vb_aligned/512         718 ns   2.45 ns         293x
bm_ranges_copy_n_vb_aligned/4096       5750 ns   5.02 ns        1145x
bm_ranges_copy_n_vb_aligned/32768     45824 ns   40.9 ns        1120x
bm_ranges_copy_n_vb_aligned/65536     92267 ns   73.8 ns        1250x
bm_ranges_copy_n_vb_aligned/102400   143267 ns    125 ns        1146x
bm_ranges_copy_n_vb_aligned/106496   148625 ns   82.4 ns        1804x
bm_ranges_copy_n_vb_aligned/110592   154817 ns   72.0 ns        2150x
bm_ranges_copy_n_vb_aligned/114688   157953 ns   70.4 ns        2244x
bm_ranges_copy_n_vb_aligned/118784   162374 ns   71.5 ns        2270x
bm_ranges_copy_n_vb_aligned/122880   168638 ns   72.9 ns        2313x
bm_ranges_copy_n_vb_aligned/126976   175596 ns   76.6 ns        2292x
bm_ranges_copy_n_vb_aligned/131072   181164 ns    135 ns        1342x
bm_ranges_copy_n_vb_aligned/135168   184697 ns    157 ns        1176x
bm_ranges_copy_n_vb_aligned/139264   191395 ns    104 ns        1840x
bm_ranges_copy_n_vb_aligned/143360   194954 ns   88.3 ns        2208x
bm_ranges_copy_n_vb_aligned/147456   208917 ns   86.1 ns        2426x
bm_ranges_copy_n_vb_aligned/151552   211101 ns   87.2 ns        2421x
bm_ranges_copy_n_vb_aligned/155648   213175 ns   89.0 ns        2395x
bm_ranges_copy_n_vb_aligned/159744   218988 ns   86.7 ns        2526x
bm_ranges_copy_n_vb_aligned/163840   225263 ns    156 ns        1444x
bm_ranges_copy_n_vb_aligned/167936   230725 ns    184 ns        1254x
bm_ranges_copy_n_vb_aligned/172032   235795 ns    119 ns        1981x
bm_ranges_copy_n_vb_aligned/176128   241145 ns    101 ns        2388x
bm_ranges_copy_n_vb_aligned/180224   250680 ns   99.5 ns        2519x
bm_ranges_copy_n_vb_aligned/184320   262954 ns   99.7 ns        2637x
bm_ranges_copy_n_vb_aligned/188416   258584 ns    103 ns        2510x
bm_ranges_copy_n_vb_aligned/192512   267190 ns    125 ns        2138x
bm_ranges_copy_n_vb_aligned/196608   270821 ns    213 ns        1271x
bm_ranges_copy_n_vb_aligned/200704   279532 ns    262 ns        1067x
bm_ranges_copy_n_vb_aligned/204800   283412 ns    222 ns        1277x
```
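One way to see why the aligned figures above can reach four-digit speedups: when source and destination share the same bit offset within a storage word, whole words can be assigned at once, with masking needed only for a partial first and last word. The following is a minimal sketch of that idea over a hypothetical packed bit array of 64-bit words. The names `Word`, `copy_bitwise`, and `copy_aligned` are invented for illustration, and this is not libc++'s actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uint64_t;                 // assumed storage word type
constexpr std::size_t kBits = 64;           // bits per storage word

inline bool get_bit(const std::vector<Word>& w, std::size_t i) {
  return ((w[i / kBits] >> (i % kBits)) & 1u) != 0;
}

inline void set_bit(std::vector<Word>& w, std::size_t i, bool v) {
  const Word mask = Word{1} << (i % kBits);
  if (v) w[i / kBits] |= mask; else w[i / kBits] &= ~mask;
}

// Baseline: one read-modify-write per bit, whatever the offsets are.
inline void copy_bitwise(const std::vector<Word>& src, std::size_t s,
                         std::vector<Word>& dst, std::size_t d, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) set_bit(dst, d + i, get_bit(src, s + i));
}

// Aligned fast path: requires s and d to have the same offset within a word.
inline void copy_aligned(const std::vector<Word>& src, std::size_t s,
                         std::vector<Word>& dst, std::size_t d, std::size_t n) {
  assert(s % kBits == d % kBits);
  std::size_t sw = s / kBits, dw = d / kBits;
  const std::size_t off = s % kBits;
  if (off != 0 && n > 0) {
    // Partial first word: copy bits [off, off + take) under a mask.
    const std::size_t take = std::min(n, kBits - off);  // always < 64 here
    const Word mask = ((Word{1} << take) - 1) << off;
    dst[dw] = (dst[dw] & ~mask) | (src[sw] & mask);
    ++sw; ++dw; n -= take;
  }
  // Whole words: 64 bits per assignment.
  for (; n >= kBits; n -= kBits) dst[dw++] = src[sw++];
  if (n > 0) {
    // Partial last word: copy the low n bits under a mask.
    const Word mask = (Word{1} << n) - 1;
    dst[dw] = (dst[dw] & ~mask) | (src[sw] & mask);
  }
}
```

In the sketch, the whole-word loop is a plain array copy that a compiler can lower to `memmove`-style code. When the offsets differ, each destination word has to be assembled from two source words with shifts, which is consistent with the smaller gains reported for the unaligned case.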
- Unaligned source-destination bits
```
--------------------------------------------------------------------------------
Benchmark                                    Before     After  Improvement
--------------------------------------------------------------------------------
bm_ranges_copy_vb_unaligned/8               12.8 ns   8.59 ns         1.5x
bm_ranges_copy_vb_unaligned/64              98.2 ns   8.24 ns          12x
bm_ranges_copy_vb_unaligned/512              755 ns   18.1 ns          42x
bm_ranges_copy_vb_unaligned/4096            6027 ns    102 ns          59x
bm_ranges_copy_vb_unaligned/32768          47663 ns    774 ns          62x
bm_ranges_copy_vb_unaligned/262144        378981 ns   6455 ns          59x
bm_ranges_copy_vb_unaligned/1048576      1520486 ns  25942 ns          59x
bm_ranges_copy_n_vb_unaligned/8             11.3 ns   8.22 ns         1.4x
bm_ranges_copy_n_vb_unaligned/64            97.3 ns   7.89 ns          12x
bm_ranges_copy_n_vb_unaligned/512            747 ns   18.1 ns          41x
bm_ranges_copy_n_vb_unaligned/4096          5932 ns   99.0 ns          60x
bm_ranges_copy_n_vb_unaligned/32768        47776 ns    749 ns          64x
bm_ranges_copy_n_vb_unaligned/262144      378802 ns   6576 ns          58x
bm_ranges_copy_n_vb_unaligned/1048576    1547234 ns  26229 ns          59x
```
1 parent 242aa8c, commit 5b65896

File tree: 9 files changed, +418 -186 lines changed
- libcxx
  - docs/ReleaseNotes
  - include
    - __algorithm
  - test
    - benchmarks/algorithms
    - std/algorithms/alg.modifying.operations/alg.copy