Several CMake subdirectories, including src/, clobber CMAKE_C_FLAGS. This discards compiler optimizations requested through the environment (for example via CFLAGS), significantly harming performance.
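One way to confirm the clobbering is to check whether a flag passed through the environment actually reaches the compiler command lines. The sketch below assumes a Makefile generator, where `make VERBOSE=1` prints each compile command. The helper name `flag_survives` is hypothetical and not part of the project.

```shell
# Succeeds only if every C compile command read from stdin carries the given
# flag. Prints offending commands. Fails if no compile command is seen.
flag_survives() {
    flag=$1
    awk -v flag="$flag" '
        / -c / && /\.c( |$)/ {            # lines that compile a .c source file
            seen++
            found = 0
            for (i = 1; i <= NF; i++) if ($i == flag) found = 1
            if (!found) { missing++; print "missing " flag ": " $0 }
        }
        END { exit (seen == 0 || missing > 0) }
    '
}
```

Possible usage from a fresh build directory (the directory layout is an assumption): `CFLAGS="-O2" cmake .. && make VERBOSE=1 | flag_survives -O2`. If a subdirectory overwrites the flags instead of appending to them, its compile commands show up as missing the flag.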
Test Case
I added a performance metric to debug output: lost packets per frame (lppf). Lower is better. A script compiled 0cbb86e and ran the following command for each listed configuration:
timeout 2m bin/freenect-glview 2>&1 | grep lppf
It then extracted the last entry for each stream. Each entry contains running totals of frames and lost packets, plus the cumulative lppf.
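The original script is not included here. A minimal sketch of the procedure described above might look like the following. The function names, the `build` directory, and the rebuild commands are assumptions. Only the `timeout ... | grep lppf` command and the log line format come from this report.

```shell
# Print the final lppf line seen for each stream, reading debug output on stdin.
last_lppf_per_stream() {
    grep lppf | awk '
        { last[$2] = $0 }                  # $2 is the stream id, e.g. "70]"
        END { for (id in last) print last[id] }
    ' | sort
}

# Rebuild with the given CFLAGS and run the two-minute test (hypothetical driver).
# Runs in a subshell so the caller's working directory is left unchanged.
run_config() (
    flags=$1
    rm -rf build && mkdir build && cd build || exit 1   # assumed build directory
    CFLAGS="$flags" cmake .. >/dev/null || exit 1
    make >/dev/null || exit 1
    echo "TEST CFLAGS=\"$flags\""
    timeout 2m bin/freenect-glview 2>&1 | last_lppf_per_stream
)
```

Each configuration would then be invoked as, for example, `run_config "-march=native -O3"`. The environment flags only take effect if the CMake files do not overwrite them.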
Results
TEST CFLAGS=""
[Stream 70] Lost 1512 total packets in 3132 frames (0.482759 lppf)
[Stream 80] Lost 1045 total packets in 3147 frames (0.332062 lppf)
TEST CFLAGS="-O2"
[Stream 70] Lost 161 total packets in 2723 frames (0.059126 lppf)
[Stream 80] Lost 87 total packets in 2717 frames (0.032021 lppf)
TEST CFLAGS="-O3"
[Stream 70] Lost 124 total packets in 2854 frames (0.043448 lppf)
[Stream 80] Lost 69 total packets in 2858 frames (0.024143 lppf)
TEST CFLAGS="-march=native"
[Stream 70] Lost 1590 total packets in 3125 frames (0.508800 lppf)
[Stream 80] Lost 1123 total packets in 3144 frames (0.357188 lppf)
TEST CFLAGS="-march=native -O3"
[Stream 70] Lost 131 total packets in 3498 frames (0.037450 lppf)
[Stream 80] Lost 59 total packets in 3494 frames (0.016886 lppf)
TEST CFLAGS="-mavx -O2"
[Stream 70] Lost 138 total packets in 3096 frames (0.044574 lppf)
[Stream 80] Lost 92 total packets in 3091 frames (0.029764 lppf)
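The largest gain comes from the optimization level: in these runs, -O2 and -O3 cut lppf by roughly an order of magnitude relative to the unoptimized build, while -march=native alone showed no improvement.
Further tests are necessary to determine the effect of CPU extensions.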