Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Before the latest updates I had been using environment variables to set base settings such as cache, n_threads, etc.
This no longer seems to work. I noticed the README was updated to show the use of command-line switches like this:
--model models/7B/ggml-model.bin
Is this how all base settings are supposed to be given now? The model switch above does in fact work.
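For context, the launch currently looks roughly like this (the `python3 -m llama_cpp.server` entry point and the `N_THREADS`/`--n_threads` values are assumptions used only for illustration; the `--model` switch is the one from the README):

```sh
# Old approach: base settings via environment variables (variable names assumed)
MODEL=models/7B/ggml-model.bin N_THREADS=8 python3 -m llama_cpp.server

# New approach shown in the README: settings via command-line switches
python3 -m llama_cpp.server --model models/7B/ggml-model.bin --n_threads 8
```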
Usually llama.cpp defaults to mmap on its own, but in the new versions I keep seeing "cannot mlock" errors.
I have tried both environment variables and command-line switches to disable mlock, but I cannot seem to do so.
I have attempted:
--use_mlock 0
--use_mmap 1
and a combination of both of these, together and separately.
The environment variables were not touched while these tests were running.
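To make the attempts concrete, the invocations looked roughly like this (the entry point is the same assumption as above; the switches are the ones listed):

```sh
# Attempts to disable mlock, run with the old environment variables unset
python3 -m llama_cpp.server --model models/7B/ggml-model.bin --use_mlock 0
python3 -m llama_cpp.server --model models/7B/ggml-model.bin --use_mmap 1
python3 -m llama_cpp.server --model models/7B/ggml-model.bin --use_mlock 0 --use_mmap 1
```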
Current Behavior
Instead of mlock being disabled, it appears to be enabled by default every time I run the server.