Battle plan:
- Quantify 3.2.0 perf
  - ... for selected scenarios:
    - 3rd-party unoptimized code (ComplexTable)
    - 3rd-party optimized code (PlainTable)
    - 1st-party optimized code (FastGrid)
  - ... providing useful breakdowns of:
    - ... time in .NET rendering logic (BuildRenderTree calls, diffing, etc.)
    - ... time in the JS DOM-updating logic
- Identify and implement optimizations, in each case quantifying the change:
  - Inside Blazor code
    - ... inside the Razor compiler, to reduce the amount of stuff we render at runtime
    - ... inside the .NET rendering logic
    - ... inside the JS DOM-updating logic
  - Inside Mono interpreter (have provided scenarios to Vlad for this effort)
  - In browser's WASM engine (this is something Vlad/Zoltan are leading and might not complete before 5.0 ships). Out of scope for this issue.
  - In 3rd-party code
    - ... by providing more guidance/docs. Covered below.
    - ... by providing performant base classes (virtualization, base grid). Covered in Virtualization support #24179.
  - As a result of switching to CoreFX for 5.0
- Consider backporting optimizations to 3.2.x if they yield big benefits without involving changes to user code
Update: Results of investigation
After a very extensive investigation, here are things I think we should do to complete this for 5.0:
- Write up the main findings from the investigation
  - Exactly how perf differs between 3.2.0 and 5.0 Preview 8
  - Why the perf numbers for FastGrid/PlainTable/ComplexTable are what they are. That is, provide breakdowns showing how use of each feature adds to the totals, explaining the differences in perf in ways that justify the perf advice we will give.
- Ensure we can keep collecting useful profiling data in the future (not just on my machine)
  - Add the ConsoleRunner project to the repo so we can easily run Blazor scenarios on the desktop interpreter
  - Begin a discussion with @BrzVlad and others responsible for the .NET interpreter about how we could merge in something like the trace collection logic, so others can get profiling data in the future without needing a custom build of the runtime. Filed as "Consider adding profile trace export to Mono interpreter" (runtime#40617).
- Implement optimizations inside Blazor
  - RenderTreeArrayBuilder optimization. This makes the core `RenderTreeBuilder` logic faster by reducing the amount of indirection and struct copying (including by making `RenderTreeFrame` mutable). This gives about a 10% gain on the FastGrid benchmark. Tracked in #24464.
  - In `RenderTreeDiffBuilder.InitializeNewAttributeFrame`, use culture-insensitive ordinal string comparison when looking for attribute names starting with `on`. This is ultra-trivial and gives a 2-3% gain on the PlainTable/ComplexTable benchmark. Tracked in #24465.
  - Optimize parameter writer dictionary lookup, for example using an object-identity dictionary instead of one that hashes string keys. This gives nearly a 10% gain on the ComplexTable benchmark. Tracked in #24466.
  - Optimize multiple-attributes overwrite detection by eliminating the string-hashing dictionary. For example, the SimpleStringIntDict prototype gives a 17% boost to the ComplexTable benchmark when the benchmark is changed to pass 3 catch-all params to each cell component. Tracked in #24467.
- Write up docs on how to write better-performing Blazor WebAssembly apps. For example:
  - How much overhead you should expect from each extra layer of components you add, and each extra parameter you pass. Hence you should consider inlining child components if you're rendering a large number of them.
  - Why it's so important to use `IsFixed` when cascading values to large numbers of receivers.
  - That `@attributes` is relatively expensive.
  - Why, for perf-critical components, you should consider implementing your own manual parameter assignment logic by overriding `SetParametersAsync`, and how to do it efficiently.
  - How and why to use `<Virtualize>`.
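As a rough illustration of the manual parameter assignment item above, here is a minimal sketch of a component that overrides `SetParametersAsync` and assigns its parameters with a `switch` instead of relying on the default reflection-based assignment. The `TableCell` component and its `RowIndex` and `Text` parameters are hypothetical and exist only for this example; whether this is worth doing depends on measurements for your own component.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Components;

// Hypothetical cell component; the name and parameters are illustrative only.
public class TableCell : ComponentBase
{
    [Parameter] public int RowIndex { get; set; }
    [Parameter] public string Text { get; set; }

    public override Task SetParametersAsync(ParameterView parameters)
    {
        // Assign each incoming value by name, avoiding the default
        // per-parameter writer lookup.
        foreach (var parameter in parameters)
        {
            switch (parameter.Name)
            {
                case nameof(RowIndex):
                    RowIndex = (int)parameter.Value;
                    break;
                case nameof(Text):
                    Text = (string)parameter.Value;
                    break;
                default:
                    throw new ArgumentException($"Unknown parameter: {parameter.Name}");
            }
        }

        // Pass an empty view so the base class does not assign the
        // parameters a second time but still runs the normal lifecycle.
        return base.SetParametersAsync(ParameterView.Empty);
    }
}
```

The trade-off is that every new parameter has to be added to the `switch` by hand, so this is probably only worthwhile for components rendered in very large numbers.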
The specific optimizations I've listed above aren't the only ones I'm aware of. I've tried many things based on the profiling data. These optimizations are the ones that have the biggest impact for the least cost.
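To make two of the listed ideas concrete, the following sketch shows an ordinal prefix check and a dictionary keyed by object identity. It is not the code from #24465 or #24466; `LookupSketches` and its members are hypothetical names, and `ReferenceEqualityComparer` assumes .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;

public static class LookupSketches
{
    // StartsWith(string) with no comparison argument is culture-sensitive.
    // StringComparison.Ordinal compares UTF-16 code units directly and is
    // case-sensitive, which skips the culture-specific collation work.
    public static bool HasOnPrefix(string attributeName) =>
        attributeName.StartsWith("on", StringComparison.Ordinal);

    // A dictionary keyed by object identity: lookups use reference equality
    // rather than hashing the contents of a string key.
    public static Dictionary<object, TValue> CreateIdentityDictionary<TValue>() =>
        new Dictionary<object, TValue>(ReferenceEqualityComparer.Instance);
}
```

An identity-keyed dictionary only finds an entry when the caller passes the same object instance that was used as the key, so it suits keys that are long-lived, shared instances.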