Base settings confusion (How to disable mlock) #171

Closed
4 tasks done
snxraven opened this issue May 8, 2023 · 10 comments
Labels
bug Something isn't working

Comments

@snxraven

snxraven commented May 8, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Before the latest updates, I had been using environment variables to set base settings for things like cache, n_threads, etc.

This does not seem to work anymore. I noticed the README was updated to show the use of switches like this:
--model models/7B/ggml-model.bin

Is this how all base settings are given now? The model switch above does in fact work.

Usually llama.cpp defaults to mmap by itself, but in the new versions I keep seeing the "cannot mlock" errors.

I have attempted both environment variables and switches to disable mlock, and I cannot seem to do so.

I have attempted:

--use_mlock 0

--use_mmap 1

a combination of both of these, together and separately.

The environment variables were not touched while testing.
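
For clarity, the kinds of invocations I tried look roughly like this (model path and values are illustrative):

# Old style: base settings via environment variables
USE_MLOCK=0 N_THREADS=8 python3 -m llama_cpp.server

# New style from the README: base settings via CLI switches
python3 -m llama_cpp.server --model models/7B/ggml-model.bin --use_mlock 0 --use_mmap 1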

Current Behavior

Instead of being disabled, mlock seems to be enabled by default every time I run the server.

@gjmulder
Contributor

gjmulder commented May 8, 2023

If we have changed to:

python3 -m llama_cpp.server --model models/7B/ggml-model.bin

Do we then need to also update the Dockerfiles to pass params similar to the following?

ENTRYPOINT ["python3", "-m", "llama_cpp.server"]
CMD []
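
If so, extra settings could presumably then be appended at docker run time, e.g. (image name and model path illustrative):

# Arguments after the image name become the CMD and are appended to the ENTRYPOINT
docker run -v "$(pwd)/models:/models" my-llama-cpp-server --model /models/7B/ggml-model.bin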

@abetlen abetlen added the bug Something isn't working label May 8, 2023
@abetlen
Owner

abetlen commented May 8, 2023

This is a bug; the intended behaviour is to allow both environment variables and CLI options and validate at the end.

@gjmulder for Dockerfiles the environment variables are still preferred because they play better with the various container deployment methods.
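
For example, something along these lines (image name and path illustrative):

# Settings passed as environment variables, picked up by the server at startup
docker run -e MODEL=/models/7B/ggml-model.bin -e USE_MLOCK=0 -v "$(pwd)/models:/models" my-llama-cpp-server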

@abetlen
Owner

abetlen commented May 8, 2023

@snxraven I just pushed a fix (v0.1.46) that falls back to the environment variables if the CLI option is missing; let me know if that works. You can also call --help now for a full list of CLI arguments and their default values.
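
For example, something like this should now work, with anything not given on the command line taken from the environment (values illustrative):

# Show all CLI options and their default values
python3 -m llama_cpp.server --help

# --model from the CLI; N_THREADS and USE_MLOCK falling back to the environment
N_THREADS=8 USE_MLOCK=0 python3 -m llama_cpp.server --model models/7B/ggml-model.bin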

@snxraven
Author

snxraven commented May 8, 2023

@abetlen I have tried the latest version; it does not seem to fall back for me and I still see the memlock errors:
[screenshot: memlock error output]

All of my previous settings within my Docker env are ignored in 0.1.46.

I confirmed this because cache and n_threads were also ignored.

The only one that does seem to have fallen back is the model; I did not include --model.

@gjmulder
Contributor

gjmulder commented May 8, 2023

MODEL does work, however.

Ditto on the MLOCK warning. I worked around it with the following change to my Dockerfile:

CMD bash -c "ulimit -l unlimited && python3 -m llama_cpp.server"

EDIT: Works for Ubuntu 20.04 via Docker.
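
Raising the limit from the host side may also work, without touching the CMD (image name illustrative, untested here):

# Set the container's locked-memory ulimit at run time instead of calling ulimit inside the container
docker run --ulimit memlock=-1:-1 -e MODEL=/models/7B/ggml-model.bin my-llama-cpp-server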

@abetlen
Owner

abetlen commented May 8, 2023

@snxraven you're right, I ended up ignoring environment variables for options that had defaults (which is why MODEL worked but e.g. N_THREADS didn't), because they were being set by argparse. That should be fixed with v0.1.47, so the order of precedence is CLI -> Environment Variable -> Default Value.
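
To illustrate the intended precedence (flag name assumed to mirror the setting name, values illustrative):

# CLI option wins over the environment variable
N_THREADS=4 python3 -m llama_cpp.server --n_threads 8    # runs with 8 threads

# Environment variable wins over the built-in default when no option is given
N_THREADS=4 python3 -m llama_cpp.server                   # runs with 4 threads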

@snxraven
Author

snxraven commented May 8, 2023

building and trying :)

@snxraven
Author

snxraven commented May 8, 2023

@abetlen The other settings like cache etc. all seem to be working now; the only remaining issue seems to be the mlock warning. Can this be disabled by default, or do I need to disable it another way using a switch?

llama.cpp defaults to mmap

@abetlen
Owner

abetlen commented May 8, 2023

@snxraven that's not a warning I can disable; if mlock is enabled, that warning will come up unless you increase the memlock limit. If you intend to use mlock, I would recommend increasing that limit. I remember looking it up, and I just had to edit /etc/security/limits.conf and add two lines:

*        hard    memlock        unlimited
*        soft    memlock        unlimited
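
After logging back in, the new limit can be checked with:

# Should print "unlimited" (or a value large enough to hold the model) once the limits.conf change is active
ulimit -l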

@snxraven
Author

snxraven commented May 8, 2023

The issue was not being able to disable mlock at all.

With all of your recent changes, setting:

USE_MLOCK=0

is working, so it seems this issue can now be closed :)
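
For anyone landing here later, the working setup is simply (model path illustrative):

# mlock disabled via the environment; the server falls back to this when no CLI option is given
USE_MLOCK=0 MODEL=models/7B/ggml-model.bin python3 -m llama_cpp.server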
