
Commit 145a984

[API] llm-vscode extension support (mlc-ai#1198)

This PR enables `llm-vscode` extension API support for Copilot-like code completion, following [HF's llm-ls LSP](https://github.com/huggingface/llm-ls). It is fully compatible with `CodeLlama` and `starcoder` on mlc-llm. huggingface/llm-vscode#103 improves the extension's user experience when used with the mlc-llm REST API. Thanks to @pacman100, who proposed this approach in his recent blog post: https://huggingface.co/blog/personal-copilot
Parent commit: 0e08845

2 files changed: +34 additions, −0 deletions

python/mlc_chat/interface/openai_api.py

Lines changed: 15 additions & 0 deletions
```diff
@@ -144,3 +144,18 @@ class EmbeddingsResponse(BaseModel):
     data: List[Dict[str, Any]]
     model: Optional[str] = None
     usage: UsageInfo
+
+
+class VisualStudioCodeCompletionParameters(BaseModel):
+    temperature: float = None
+    top_p: float = None
+    max_new_tokens: int = None
+
+
+class VisualStudioCodeCompletionRequest(BaseModel):
+    inputs: str
+    parameters: VisualStudioCodeCompletionParameters
+
+
+class VisualStudioCodeCompletionResponse(BaseModel):
+    generated_text: str
```
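For reference, here is a minimal sketch (not part of the commit) of the request payload these models define. It assumes the new classes are importable from `mlc_chat.interface.openai_api` and uses Pydantic v1's `.json()`:

```python
# Hypothetical illustration of the request schema added above.
from mlc_chat.interface.openai_api import (
    VisualStudioCodeCompletionParameters,
    VisualStudioCodeCompletionRequest,
)

request = VisualStudioCodeCompletionRequest(
    inputs="def fibonacci(n):",  # the code prefix sent by the editor
    parameters=VisualStudioCodeCompletionParameters(
        temperature=0.2, top_p=0.95, max_new_tokens=64
    ),
)
# Produces JSON like:
# {"inputs": "def fibonacci(n):",
#  "parameters": {"temperature": 0.2, "top_p": 0.95, "max_new_tokens": 64}}
print(request.json())  # Pydantic v1; use request.model_dump_json() on v2
```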

python/mlc_chat/rest.py

Lines changed: 19 additions & 0 deletions
```diff
@@ -31,6 +31,8 @@
     EmbeddingsRequest,
     EmbeddingsResponse,
     UsageInfo,
+    VisualStudioCodeCompletionRequest,
+    VisualStudioCodeCompletionResponse,
 )
 
 
@@ -364,6 +366,23 @@ async def read_stats_verbose():
     return session["chat_mod"].stats(verbose=True)
 
 
+@app.post("/v1/llm-vscode/completions")
+async def request_llm_vscode(request: VisualStudioCodeCompletionRequest):
+    """
+    Creates a vscode code completion for a given prompt.
+    Follows huggingface LSP (https://github.com/huggingface/llm-ls)
+    """
+    generation_config = GenerationConfig(
+        temperature=request.parameters.temperature,
+        top_p=request.parameters.top_p,
+        mean_gen_len=request.parameters.max_new_tokens,
+        max_gen_len=request.parameters.max_new_tokens,
+    )
+    msg = session["chat_mod"].generate(prompt=request.inputs, generation_config=generation_config)
+
+    return VisualStudioCodeCompletionResponse(generated_text=msg)
+
+
 ARGS = convert_args_to_argparser().parse_args()
 if __name__ == "__main__":
     uvicorn.run("mlc_chat.rest:app", host=ARGS.host, port=ARGS.port, reload=False, access_log=False)
```
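A hypothetical client call (not part of the commit) against the new endpoint; the host and port below are assumptions and should match whatever `ARGS.host` and `ARGS.port` the server was started with:

```python
# Sketch of a client request to the llm-vscode completion endpoint.
import requests

payload = {
    "inputs": "def fibonacci(n):",  # code prefix to complete
    "parameters": {"temperature": 0.2, "top_p": 0.95, "max_new_tokens": 64},
}
# Assumes the REST server is listening on 127.0.0.1:8000.
resp = requests.post(
    "http://127.0.0.1:8000/v1/llm-vscode/completions",
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])  # the suggested completion
```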
