Grammar support for Web Server #855

Merged
merged 1 commit into from Nov 1, 2023

Conversation

dthuerck
Contributor

Created this small PR, which just adds a grammar parameter to the completions endpoint, thereby closing #778. Of course this violates OpenAI's API definition, but the parameter is optional. Here's an example to test it with (use the json.gbnf grammar from llama.cpp):

import json
import requests

if __name__ == "__main__":

  prompt = "Please provide me with a valid JSON that encodes the following " \
    "Address: Testine Testerson, 420 Imaginary St, 12345 Star City, USA. " \
    "Make sure yu respect data types and only use one per field. "

  print("######")
  print("# WITHOUT GRAMMAR")
  print("######")

  # plain streaming completions request, without any grammar constraint
  res = requests.post(
    url="http://localhost:8000/v1/completions",
    json={
      "prompt" : f"<s>[INST]{prompt}[/INST]",
      "stop" : ["</s>"],
      "max_tokens" : 256,
      "stream" : True,
      "temperature" : 0.9
    },
    stream=True
  )

  # consume the SSE stream; each event line looks like "data: {...}"
  for line in res.iter_lines(decode_unicode=True):
    if line:
      try:
        line = line[6:]  # strip the leading "data: " prefix
        data = json.loads(line)
        print(data["choices"][0]["text"], end="", flush=True)
      except Exception:
        # skip non-JSON events such as the final "data: [DONE]"
        pass

  print("")

  print("######")
  print("# WITH GRAMMAR")
  print("######")

  # now repeat the request, this time constrained by llama.cpp's JSON grammar
  with open("json.gbnf") as f:
    grammar = f.read()

  res = requests.post(
    url="http://localhost:8000/v1/completions",
    json={
      "prompt" : f"<s>[INST]{prompt}[/INST]",
      "stop" : ["</s>"],
      "max_tokens" : 256,
      "stream" : True,
      "temperature" : 0.8,
      "grammar" : grammar
    },
    stream=True
  )

  for line in res.iter_lines(decode_unicode=True):
    if line:
      try:
        line = line[6:]  # strip the leading "data: " prefix
        data = json.loads(line)
        print(data["choices"][0]["text"], end="", flush=True)
      except Exception:
        # skip non-JSON events such as the final "data: [DONE]"
        pass
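
For a quicker sanity check, the same grammar-constrained request also works without streaming. A minimal non-streaming variant, appended to the script above (it reuses the prompt and grammar variables), might look like this:

  # non-streaming variant: the full completion comes back as a single JSON document
  res = requests.post(
    url="http://localhost:8000/v1/completions",
    json={
      "prompt" : f"<s>[INST]{prompt}[/INST]",
      "stop" : ["</s>"],
      "max_tokens" : 256,
      "temperature" : 0.8,
      "grammar" : grammar
    }
  )
  print(res.json()["choices"][0]["text"])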

abetlen merged commit 5f8f369 into abetlen:main on Nov 1, 2023

Freed-Wu commented Nov 2, 2023

@dthuerck you should close #788, not #778.


aabbi commented Feb 25, 2024

Thanks for adding this. The parameter can be passed via extra_body when using the openai client. Would it make sense to add this to the documentation somewhere? (I had to dig a little to figure out how to do it.)
Example usage:

from openai import OpenAI

llm_uri = "http://localhost:8000/v1"
llm_client = OpenAI(base_url=llm_uri, api_key="xx")
modelname = "llama-2-7b-chat.Q6_K.gguf"
with open("json.gbnf") as f:
    grammar = f.read()
# chat.completions expects a list of messages, not a bare prompt string
messages = [{"role": "user", "content": prompt}]  # prompt: same instruction string as in the PR example
resp = llm_client.chat.completions.create(messages=messages, model=modelname, extra_body={"grammar": grammar})
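
Since the PR wires the parameter into the completions endpoint, the same extra_body approach should also work with the client's plain completions API. An untested sketch along those lines (reusing llm_client, modelname, and grammar from above; the prompt string is just an example):

# sketch only: grammar passed via extra_body on the legacy completions endpoint
resp = llm_client.completions.create(
    model=modelname,
    prompt="Encode this address as JSON: Testine Testerson, 420 Imaginary St, 12345 Star City, USA",
    max_tokens=256,
    extra_body={"grammar": grammar},
)
print(resp.choices[0].text)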
