Bug: llama.cpp does not use XTC sampler when given temperature == 0 even if temperature is not in sampler sequence #9904

Closed
@justinsteven

Description

What happened?

It's possible I'm misunderstanding samplers and sampler parameters.

It's also possible this is a symptom of a larger problem, where "default" values for some samplers may prevent other samplers from being activated in llama-server.

Observed behaviour

There are several sampler parameters which, when given "do nothing" default values via llama-server's /completions API, seem to cause the XTC sampler to not be used.

Update: The above should be "Given a temperature of 0, even if temperature is not in the requested sampler sequence, the XTC sampler is not used"

The following JSON payload demonstrates the issue:

{
  "prompt": "<|im_start|>system\nYou are a creative story writer<|im_end|>\n<|im_start>user\nWrite a story about a wizard who is losing his ability to do magic, and tries everything to get it back.<|im_end|>\n<|im_start|>assistant\n",
  "n_predict": 512,
  "seed": 1,
  "xtc_probability": 0.5,
  "xtc_threshold": 0.1,
  "samplers": [
    "xtc"
  ],
  "top_k": 0,
  "tfs_z": 1,
  "top_p": 1,
  "min_p": 0,
  "temperature": 0
}
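For anyone who wants to replay this, here is a minimal Python sketch that builds the payload above and posts it to llama-server. The base URL `http://localhost:8080`, the helper names `build_payload` and `complete`, and the assumption that the reply is a JSON object with a `content` field are mine, not taken from this report, so adjust them to your setup.

```python
import json
import urllib.request

# Prompt copied verbatim from the payload in this report.
PROMPT = (
    "<|im_start|>system\nYou are a creative story writer<|im_end|>\n"
    "<|im_start>user\nWrite a story about a wizard who is losing his ability "
    "to do magic, and tries everything to get it back.<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_payload(temperature=None):
    """Return the XTC-only request body; add "temperature" only when given."""
    payload = {
        "prompt": PROMPT,
        "n_predict": 512,
        "seed": 1,
        "xtc_probability": 0.5,
        "xtc_threshold": 0.1,
        "samplers": ["xtc"],
    }
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


def complete(payload, base_url="http://localhost:8080"):
    """POST the payload to the server and return the generated text.

    base_url is an assumption; the "content" field is assumed to hold the text.
    """
    request = urllib.request.Request(
        base_url + "/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["content"]
```

Calling `complete(build_payload())` and `complete(build_payload(temperature=0))` against the same server gives two outputs to compare alongside the debug print described below.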

In my testing, this causes the XTC sampler to not be activated. The vibe was off, and the following hacky debugging that I added was not activating:

diff --git a/src/llama-sampling.cpp b/src/llama-sampling.cpp
index 2e655068..63e0d043 100644
--- a/src/llama-sampling.cpp
+++ b/src/llama-sampling.cpp
@@ -1084,6 +1084,7 @@ static void llama_sample_xtc_apply(struct llama_sampler * smpl, llama_token_data
         || cur_p->size < 2) {
         return;
     }
+    puts("ok");

     std::uniform_real_distribution<float> distribution(0.0f, 1.0f);
     float chance = distribution(ctx->rng);

Given the following simpler JSON payload, the hacky debugging was successfully activated:

{
  "prompt": "<|im_start|>system\nYou are a creative story writer<|im_end|>\n<|im_start>user\nWrite a story about a wizard who is losing his ability to do magic, and tries everything to get it back.<|im_end|>\n<|im_start|>assistant\n",
  "n_predict": 512,
  "seed": 1,
  "xtc_probability": 0.5,
  "xtc_threshold": 0.1,
  "samplers": [
    "xtc"
  ]
}

Furthermore, each of the parameters after my samplers array seems to individually cause XTC to not activate. For example, a temperature of 0 (without specifying any of top_k, tfs_z, top_p or min_p) is enough to cause XTC to not activate.

There may be other parameters, including sampler parameters, which cause XTC to not activate, but which I did not test.

(Update: I was wrong about this, it seems as though only temperature == 0 reproduces the issue)

This is problematic for clients such as SillyTavern, which seem to always send all samplers in the array but which rely on sending default parameters (e.g. 0 in the case of temperature) to cause them to be effectively disabled. Such a client will never be able to activate XTC if the user gives a temperature of 0 in the hopes of disabling the temperature sampler.
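To make the observation concrete, here is a toy Python model of it. This is not llama.cpp's code: `observed_chain`, `expected_chain`, and the step labels are hypothetical names I made up, and what the server actually does instead of XTC at temperature 0 is a guess (shown here as a plain most-likely-token pick).

```python
def observed_chain(samplers, temperature=None):
    """Toy model of what I observed: temperature == 0 drops the requested samplers.

    temperature=None means the parameter was not sent at all.
    The "pick_most_likely" step is an assumption, not something I confirmed.
    """
    if temperature == 0:
        return ["pick_most_likely"]
    return list(samplers) + ["pick_random"]


def expected_chain(samplers, temperature=None):
    """Toy model of what I expected: the requested samplers always run.

    temperature is accepted but ignored here, since in this report the
    temperature sampler is not in the requested sequence.
    """
    return list(samplers) + ["pick_random"]
```

In this model, `observed_chain(["xtc"], temperature=0)` contains no `"xtc"` step, while `expected_chain(["xtc"], temperature=0)` does, which is the gap a client sending a "disabled" temperature of 0 would fall into.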

Expected behaviour

If XTC is in the samplers array, and xtc_threshold and xtc_probability meet the criteria for XTC to be used, XTC should be used regardless of parameters for other samplers.

More generally, if any sampler is in the samplers array, and its parameters meet the criteria for it to be used, it should be used regardless of parameters for other samplers (?)
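For context on what "meet the criteria" means for XTC, here is a small Python sketch of the idea as I understand it: with chance xtc_probability, every candidate at or above xtc_threshold is dropped except the least likely of them. It is a simplified illustration, not copied from llama.cpp; the function name `xtc_filter` and the dict-based token representation are my own, and the result is not renormalised.

```python
import random


def xtc_filter(probs, xtc_probability, xtc_threshold, rng):
    """Toy XTC step over a dict mapping token -> probability.

    Returns a new dict with the surviving tokens.
    """
    # Nothing to do if XTC is disabled or there are fewer than two candidates
    # (the size check mirrors the `cur_p->size < 2` guard visible in the diff).
    if xtc_probability <= 0 or len(probs) < 2:
        return dict(probs)
    # Only apply XTC on a random fraction of sampling steps.
    if rng.random() > xtc_probability:
        return dict(probs)
    # Tokens at or above the threshold are the "top choices".
    above = [tok for tok, p in probs.items() if p >= xtc_threshold]
    if len(above) < 2:
        return dict(probs)
    # Keep only the least likely of the top choices, plus everything below.
    keep = min(above, key=lambda tok: probs[tok])
    return {tok: p for tok, p in probs.items() if tok not in above or tok == keep}
```

With the values from the payloads in this report (xtc_probability 0.5, xtc_threshold 0.1), this step would fire on roughly half of the sampling steps whenever at least two candidates reach 10% probability.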

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 3923 (becfd387)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

Labels: bug-unconfirmed, medium severity