Closed
Labels
bug: Something isn't working
Description
LocalAI version:
1.25.0
Environment, CPU architecture, OS, and Version:
Linux REDACTED 4.18.0-147.5.1.6.h541.eulerosv2r9.x86_64 #1 SMP Wed Aug 4 02:30:13 UTC 2021 x86_64 GNU/Linux
Describe the bug
I set frequency_penalty or presence_penalty in the body of my /chat/completions request, but it is ignored. If the parameter is unset in the model .yaml, it is treated as 0; otherwise only the .yaml value is used. Tested on Airoboros 70b.
For this test, I set both to 1 in the request body:
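A minimal sketch of the kind of request body involved, assuming the standard OpenAI-compatible /chat/completions schema (the model name matches this issue's config; the message content is illustrative):

```python
import json

# Hypothetical request payload; both penalties set to 1 as in the test above.
payload = {
    "model": "airoboros-70b",
    "messages": [{"role": "user", "content": "Hello"}],
    "frequency_penalty": 1.0,
    "presence_penalty": 1.0,
}

# This is the JSON that would be POSTed to /v1/chat/completions.
print(json.dumps(payload, indent=2))
```

Despite both values being 1 here, the debug output below still shows FrequencyPenalty:0.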
The full "configuration read" log:
9:16AM DBG Configuration read: &{PredictionOptions:{Model:airoboros-l2-70b-2.1.ggmlv3.Q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:1.5 Maxtokens:10 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:airoboros-70b F16:true Threads:16 Debug:true Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:true Backend:llama-stable TemplateConfig:{Chat:airoboros-chat ChatMessage: Completion:airoboros-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:8 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:2048 NUMA:false LoraAdapter: LoraBase: NoMulMatQ:false} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0}}
Parameters:
9:16AM DBG Parameters: &{PredictionOptions:{Model:airoboros-l2-70b-2.1.ggmlv3.Q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:1.5 Maxtokens:10 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:airoboros-70b F16:true Threads:16 Debug:true Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:true Backend:llama-stable TemplateConfig:{Chat:airoboros-chat ChatMessage: Completion:airoboros-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:8 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:2048 NUMA:false LoraAdapter: LoraBase: NoMulMatQ:false} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0}}
To Reproduce
Set up a model and try to change presence_penalty (or frequency_penalty) via body parameters in a /chat/completions request.
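For reference, a hedged sketch of the model .yaml in question; the exact keys are assumptions inferred from the debug output's PredictionOptions fields, not verified against LocalAI documentation:

```yaml
# Hypothetical model config sketch (field names are assumptions).
name: airoboros-70b
parameters:
  model: airoboros-l2-70b-2.1.ggmlv3.Q4_0.bin
  temperature: 1.5
  top_p: 0.7
  top_k: 80
  # If frequency_penalty / presence_penalty are omitted here, the
  # request-body values appear to be ignored and 0 is used instead.
```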
Expected behavior
Logs
Additional context