Skip to content

Conversation

@fneddy
Copy link

@fneddy fneddy commented Mar 19, 2025

Use vector extensions when compiling for s390x and binutils knows about them. At runtime, check whether kernel supports vector extensions (it has to be not just the CPU, but also the kernel) and choose between the regular and the vectorized implementations.

@fneddy fneddy mentioned this pull request Feb 27, 2025
10 tasks
Copy link
Contributor

@Vollstrecker Vollstrecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way better than the last (first) version. Although I would prefer to just have the option and an add_subdirectory/ to contrib/s390x and then everything else in contrib/s390x/CMakeLists.txt. set source_files_properties should work there also, but the sources would be needed to be added to the targets itself if you don't want to deal with PARENT_SCOPE (you don't want to), so the add_subdirectory call will need to go after the targets are created.

@fneddy
Copy link
Author

fneddy commented Mar 20, 2025

i decided to go with an object library for the target specific code. it seemed to be the cleanest way. however i had to add it to the install target for the static library, :/

@Vollstrecker
Copy link
Contributor

You added it to the link not the install, but yes this is the way to use this. If you want to use it like this, a cleaner way would be $<$<TARGET__EXISTS:tgt>:lib) directly in the target_link_libraries call, or moving the add_subdirectory down where you add the second one, then you can just call target_link_libraries in contrib/s390x/CMakeLists.txt again and add it there (with the usual if static and if shared).

@Vollstrecker
Copy link
Contributor

My bad, we're not linking against anything atm. But it looks much cleaner and we are prepared for linking to other stuff.

I'm happy.

@fneddy
Copy link
Author

fneddy commented Mar 20, 2025

i mean this line: install(TARGETS zlib_crc32_vx EXPORT zlibStaticExport)

background:

if i do target_link_libraries(zlibstatic PRIVATE zlib_crc32_vx) in the main CMakeLists.txt, then cmake complains with CMake Error: install(EXPORT "zlibStaticExport" ...) includes target "zlibstatic" which requires target "zlib_crc32_vx" that is not in any export set. i fixed this by the install line.

however, with target_link_libraries(zlibstatic PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_crc32_vx>) i can omit the install line in the contrib CMakeLists.txt

i don't really know what that install lines actually does, as this is a OBJECT library and nothing at all should be installed. should i remove it?

@Vollstrecker
Copy link
Contributor

It can be that on some plattforms object-libs are just static libs and then linking on static to the other fails (or better
is recorded for later. As s390 seems to support that, nothing is installed, on other plattforms this shouldn't even be mentioned as the lib isn't created.

The easiest way would be to move add_subdirectory to line 233(ish) and then calling target_sources for both libs instead of the object lib. Then they will be appended to the other sources. As all sources are compiled twice, 2 more files shouldn't be that big problem. If you set the compile_definitions along with the compile_options for the file, this could also be shorter.

@fneddy fneddy marked this pull request as ready for review March 21, 2025 10:31
@fneddy
Copy link
Author

fneddy commented Mar 21, 2025

Open points before merge

  • should there be a documentation for functable.h
  • is the location of functable.h ok, or should it be moved to toplevel?
  • is once.h ok?
  • is the integration in configure ok?

personally I think that the functable.h is a good method to hook in contrib code. I'll let the POWER guys know about it and i guess they'll jump on the same train for their implementation. Might be an idea to check if the ARM optimization should also be moved out of crc32.c into its own contrib directory.

@fneddy
Copy link
Author

fneddy commented Apr 7, 2025

from my side this code is ready for final review and to be merged.

@fneddy fneddy force-pushed the s390x_vec_crc32 branch 2 times, most recently from 3cafbb7 to 7ccadea Compare April 9, 2025 07:50
@Vollstrecker
Copy link
Contributor

I have to say with that foce-pushing it's hard to track your changes as it's always one big patch. i.e. CMakeLists.txt line 232 and 233 could be one and I can't tell if this came with the last commits or if I missed it before.

@fneddy
Copy link
Author

fneddy commented Apr 9, 2025

I am sorry i try to create a nice consistent history, therefor i force push on my branches.

The reason to split the dependency into two lines was a preparation for the accelerated deflate/inflate patch where i introduce the accelerator as another contrib dependendy .
Therefor i need the functable for all cases:

  • either you build with only zlib_crcvx
  • or with only zlib_dfltcc
  • or with both zlib_crcvx and zlib_dfltcc

it seemed easier and more logical to make the functable its own entity rather then handling all cases and taking care to not double the symbols.

@fneddy
Copy link
Author

fneddy commented Apr 9, 2025

the current state of the dfltcc patch(that builds upon this crcvx patch) is here.

I split it up into consistent commit that can be build and tested independently and hopefully are easier to review.

My current plan is to finish it up, do some extensive testing and then, if everything goes all right to open the PR.

@Vollstrecker
Copy link
Contributor

it seemed easier and more logical to make the functable its own entity rather then handling all cases and taking care to not double the symbols.

I don't judge about the split (I don't even read the code). It's just

target_link_libraries(zlib PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_s390x_functable>)
target_link_libraries(zlib PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_crc32_vx>)
target_link_libraries(zlib PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_dfltcc>)

vs.

target_link_libraries(zlib PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_s390x_functable>
$<TARGET_NAME_IF_EXISTS:zlib_crc32_vx>
$<TARGET_NAME_IF_EXISTS:zlib_dfltcc>)

But if it's just that functable is needed by the other 2 and needs to be present in the link, it's better to declare it as PUBLIC (maybe INTERFACE, didn't check) on these and remove it here completely.

I am sorry i try to create a nice consistent history, therefor i force push on my branches.

Acceptable goal, therefor squash-merger was invented. For reviewing work-in-progress it's a pain to keep track of what's changed. And yes, getting intercepted by reviews while not finished is also a pita.

@Neustradamus
Copy link

@fneddy: Good job!

Use vector extensions when compiling for s390x and binutils knows
about them. At runtime, check whether kernel supports vector
extensions (it has to be not just the CPU, but also the kernel) and
choose between the regular and the vectorized implementations.

Co-authored-by: Eduard Stefes <[email protected]>
@fneddy
Copy link
Author

fneddy commented Aug 22, 2025

it seems that the PR does not resonate well so here a new shoot. I minimized the changes in the old code.

@Vollstrecker
Copy link
Contributor

I thinks @madler is just waiting for you telling him you're really done and ready to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants