Bug: llama.cpp does not use XTC sampler when given temperature == 0 even if temperature is not in sampler sequence #9904

### What happened?

It's possible I'm misunderstanding samplers and sampler parameters.

~~It's also possible this is a symptom of a larger problem, where "default" values for some samplers may cause other samplers to be not activated in `llama-server`.~~

# Observed behaviour

~~There are several sampler parameters which, when given "do nothing" default values via `llama-server`'s `/completions` API, seem to cause the XTC sampler to not be used.~~

Update: the above should read: "Given a temperature of 0, even if temperature is not in the requested sampler sequence, the XTC sampler is not used."

The following JSON payload demonstrates the issue:

```json
{
  "prompt": "<|im_start|>system\nYou are a creative story writer<|im_end|>\n<|im_start|>user\nWrite a story about a wizard who is losing his ability to do magic, and tries everything to get it back.<|im_end|>\n<|im_start|>assistant\n",
  "n_predict": 512,
  "seed": 1,
  "xtc_probability": 0.5,
  "xtc_threshold": 0.1,
  "samplers": [
    "xtc"
  ],
  "top_k": 0,
  "tfs_z": 1,
  "top_p": 1,
  "min_p": 0,
  "temperature": 0
}
```

In my testing, this causes the XTC sampler not to be activated. The output did not read as though XTC was being applied, and the following hacky debug print that I added never fired:

```diff
diff --git a/src/llama-sampling.cpp b/src/llama-sampling.cpp
index 2e655068..63e0d043 100644
--- a/src/llama-sampling.cpp
+++ b/src/llama-sampling.cpp
@@ -1084,6 +1084,7 @@ static void llama_sample_xtc_apply(struct llama_sampler * smpl, llama_token_data
         || cur_p->size < 2) {
         return;
     }
+    puts("ok");

     std::uniform_real_distribution<float> distribution(0.0f, 1.0f);
     float chance = distribution(ctx->rng);
```

Given the following simpler JSON payload, which omits `top_k`, `tfs_z`, `top_p`, `min_p` and `temperature`, the hacky debug print did fire:

```json
{
  "prompt": "<|im_start|>system\nYou are a creative story writer<|im_end|>\n<|im_start|>user\nWrite a story about a wizard who is losing his ability to do magic, and tries everything to get it back.<|im_end|>\n<|im_start|>assistant\n",
  "n_predict": 512,
  "seed": 1,
  "xtc_probability": 0.5,
  "xtc_threshold": 0.1,
  "samplers": [
    "xtc"
  ]
}
```
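For anyone who wants to reproduce this, here is a minimal sketch of how the two requests could be sent and compared. It assumes `llama-server` is reachable at `http://localhost:8080` (an assumption, adjust to your setup) and that the response JSON carries the generated text in a `content` field (also an assumption). The helper names `build_payload`, `post_completion` and `main` are made up for illustration.

```python
import json
import urllib.request

# Assumption: llama-server is reachable at this address; change it to match your setup.
SERVER_URL = "http://localhost:8080/completions"

PROMPT = (
    "<|im_start|>system\nYou are a creative story writer<|im_end|>\n"
    "<|im_start|>user\nWrite a story about a wizard who is losing his ability "
    "to do magic, and tries everything to get it back.<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_payload(temperature=None):
    """Build the XTC-only request; add `temperature` only when one is given."""
    payload = {
        "prompt": PROMPT,
        "n_predict": 512,
        "seed": 1,
        "xtc_probability": 0.5,
        "xtc_threshold": 0.1,
        "samplers": ["xtc"],
    }
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


def post_completion(payload, url=SERVER_URL):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    """Send both variants so the outputs (and any debug prints) can be compared."""
    for label, payload in [
        ("no temperature", build_payload()),
        ("temperature == 0", build_payload(temperature=0)),
    ]:
        result = post_completion(payload)
        print(f"--- {label} ---")
        # Assumption: the generated text is returned under "content".
        print(result.get("content"))
```

Calling `main()` against a running server sends both variants with the same seed. With the debug `puts("ok")` patch applied, only the request without `temperature` should produce "ok" lines in the server output if the issue reproduces.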

~~Furthermore, each of the things after my samplers array seem to individually cause XTC to not activate. For example, a `temperature` of 0 (without specifying any of `top_k`, `tfs_z`, `top_p` or `min_p`) is enough to cause XTC to not activate.~~

~~There may be other parameters, including sampler parameters, which cause XTC to not activate, but which I did not test.~~

(Update: I was wrong about this. It seems that only `temperature == 0` reproduces the issue.)

This is problematic for clients such as SillyTavern, which seem to always send all samplers in the array and rely on sending default parameters (e.g. 0 in the case of `temperature`) to effectively disable them. Such a client will ~~never be able to activate XTC~~ not activate XTC if the user gives a temperature of 0 in the hopes of disabling the temperature sampler.

# Expected behaviour

If XTC is in the samplers array, and `xtc_threshold` and `xtc_probability` meet the criteria for XTC to be used, XTC should be used regardless of parameters for other samplers.

More generally, if any sampler is in the samplers array, and its parameters meet the criteria for it to be used, it should be used regardless of parameters for other samplers (?)
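To make the expected rule concrete, here is a toy model of the two behaviours. It is not llama.cpp code: all function names are hypothetical, the XTC criteria (`xtc_probability > 0` and `xtc_threshold <= 0.5`) are my assumption about when XTC does anything, and the other checks just mirror the "do nothing" values from the first payload. The "observed" function only describes what the requests above appear to show, not how the server implements it.

```python
# Toy model only: hypothetical names, not llama.cpp's actual logic.

# Each check answers: "do this sampler's OWN parameters let it do anything?"
# Missing parameters are treated as "do nothing" to keep the sketch small.
OWN_PARAMETER_CHECKS = {
    # Assumption: XTC needs a non-zero probability and a threshold of at most 0.5.
    "xtc": lambda p: p.get("xtc_probability", 0.0) > 0.0
    and p.get("xtc_threshold", 1.0) <= 0.5,
    "top_k": lambda p: p.get("top_k", 0) > 0,
    "top_p": lambda p: p.get("top_p", 1.0) < 1.0,
    "min_p": lambda p: p.get("min_p", 0.0) > 0.0,
}


def expected_active_samplers(params):
    """Expected: a listed sampler runs whenever its own parameters allow it."""
    return [
        name
        for name in params.get("samplers", [])
        if OWN_PARAMETER_CHECKS.get(name, lambda p: False)(params)
    ]


def observed_active_samplers(params):
    """Observed: an explicit temperature of 0 (or less) disables every listed sampler."""
    temperature = params.get("temperature")
    if temperature is not None and temperature <= 0:
        return []
    return expected_active_samplers(params)


request = {"samplers": ["xtc"], "xtc_probability": 0.5, "xtc_threshold": 0.1}
print(expected_active_samplers({**request, "temperature": 0}))
print(observed_active_samplers({**request, "temperature": 0}))
```

The two functions differ only in the temperature check, which is the whole complaint: `temperature` is not in the `samplers` array, yet its value decides whether XTC runs.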

# Related

* #9742 
* https://github.com/SillyTavern/SillyTavern/issues/2992

### Name and Version

```
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 3923 (becfd387)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu
```

### What operating system are you seeing the problem on?

Linux

### Relevant log output

_No response_
