YaRN : store rope scaling type as int32_t in memory #5285
Merged
I recently noticed the comment in 8f961ab directed at me. My first experience with C++ was on embedded devices, so sometimes I forget just how cheap memory is on PCs - we are loading multi-gigabyte models from disk, after all. There really is no need to store hyperparameters in 8-bit ints and bitfields - on the contrary, it makes the struct layout more confusing, and can hurt performance for frequently accessed values.
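To illustrate the layout change described above, here is a minimal sketch (field names taken from the PR; the surrounding struct is simplified, not the actual llama.cpp hparams struct):

```cpp
#include <cstdint>

// Before: hyperparameters packed into an 8-bit int and a bitfield.
// Harder to read, and bitfield members cannot have their address taken.
struct hparams_old {
    uint8_t rope_scaling_type;
    bool    rope_finetuned : 1;
};

// After: natural-width fields. A few bytes larger per struct, which is
// negligible next to multi-gigabyte model weights, and frequently read
// fields avoid the masking/shifting that bitfield access can require.
struct hparams_new {
    int32_t rope_scaling_type;
    bool    rope_finetuned;
};
```

The padding cost is a handful of bytes on a struct that exists once per loaded model, so the simpler layout is essentially free.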
This PR changes `rope_scaling_type` to be an `int32_t` in memory, and makes `rope_finetuned` a proper `bool` instead of a bitfield. As a more useful optimization, the name maps are stored as `const char *`, so the functions that use them operate directly on the string literals instead of on copies. `std::string` has an overload of `operator==` for comparing with `const char *`, so this can't hurt performance. And many call sites were calling `c_str()` anyway, so this simplifies the code a bit.
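A rough sketch of the name-map pattern described above (the enum values and map name here are illustrative, not the exact identifiers in the codebase):

```cpp
#include <map>
#include <string>

// Illustrative rope-scaling enum; the real project defines its own.
enum rope_scaling_type {
    ROPE_SCALING_NONE,
    ROPE_SCALING_LINEAR,
    ROPE_SCALING_YARN,
};

// Values are const char * pointing at string literals, so no
// std::string copies are made when the map is built or read.
static const std::map<rope_scaling_type, const char *> ROPE_SCALING_TYPES = {
    { ROPE_SCALING_NONE,   "none"   },
    { ROPE_SCALING_LINEAR, "linear" },
    { ROPE_SCALING_YARN,   "yarn"   },
};

// Map a name read from a model file back to its enum value.
static rope_scaling_type rope_scaling_from_string(const std::string & name) {
    for (const auto & kv : ROPE_SCALING_TYPES) {
        // std::string::operator==(const char *) compares in place,
        // without constructing a temporary std::string.
        if (name == kv.second) {
            return kv.first;
        }
    }
    return ROPE_SCALING_NONE;
}
```

Because the stored values are already `const char *`, callers that previously did `map.at(k).c_str()` can use `map.at(k)` directly.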