@kofemann kofemann commented Oct 8, 2025

We see that an application accessing an HDF5 file through the CLAM library puts an extremely high load on the metadata servers when the data resides on a cluster filesystem. The reason is that the file is opened and closed on every access made through the iterator.

This change is only an idea of how such an access pattern can be optimised: keep the file open and close it when the object is deleted. NOTE: The code is not tested and should be treated only as a PoC.
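
To illustrate the pattern described above, here is a minimal sketch, not the code from this PR. The class name `KeepOpenReader`, its `read_at` method, and the `open_count` counter are hypothetical and exist only for illustration. A plain binary file stands in for the HDF5 file so the sketch runs without extra dependencies; the same shape should apply to an HDF5 handle. The handle is opened on first access, reused for every later access, and closed when the object is closed or deleted.

```python
class KeepOpenReader:
    """Hypothetical reader that opens its file once and reuses the handle."""

    def __init__(self, path):
        self.path = path
        self._handle = None   # opened lazily on first access
        self.open_count = 0   # illustration only: counts real open() calls

    def _file(self):
        # Open the file only if no handle is held yet.
        if self._handle is None:
            self._handle = open(self.path, "rb")
            self.open_count += 1
        return self._handle

    def read_at(self, offset, size):
        # Every access reuses the same handle instead of open/close per call.
        f = self._file()
        f.seek(offset)
        return f.read(size)

    def close(self):
        # Safe to call more than once.
        handle = getattr(self, "_handle", None)
        if handle is not None:
            handle.close()
            self._handle = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def __del__(self):
        # Fallback for callers that never call close() explicitly.
        self.close()
```

Relying on `__del__` alone leaves the moment of closing up to the interpreter, so the sketch also offers an explicit `close()` and context-manager support; whether that suits CLAM's dataset classes is an open question.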
