Closed
Labels: enhancement (New feature or request)
Description
Is your feature request related to a problem? Please describe.
As a developer of an app that leverages LocalAI and Llama-2 for streaming completions, I want to give users the ability to "abort" or "cancel" the streaming response, so that my self-hosted instance is not wasting CPU/GPU cycles generating the rest of a stream that users will never see.
Describe the solution you'd like
Ideally, I'd like to use the Node.js OpenAI package API to abort the stream. As documented in https://github.com/openai/openai-node#streaming-responses , it should be possible either to invoke `stream.controller.abort()` or simply to `break` out of the `for await` loop.
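To illustrate why `break` alone is expected to cancel the request: leaving a `for await` loop early calls the iterator's `return()` method, which is the hook where a client library can abort its underlying HTTP request. A minimal sketch of that mechanism (no network involved; `makeStream` is a hypothetical stand-in, not part of openai-node):

```js
// Breaking out of `for await` invokes the iterator's return() method;
// a streaming client can abort its HTTP request from there.
function makeStream() {
  let cleanedUp = false;
  const iter = {
    i: 0,
    async next() {
      return { value: this.i++, done: false };
    },
    async return() {
      cleanedUp = true; // a real client would abort the request here
      return { value: undefined, done: true };
    },
    [Symbol.asyncIterator]() {
      return this;
    },
  };
  return { iter, wasCleanedUp: () => cleanedUp };
}

async function demo() {
  const { iter, wasCleanedUp } = makeStream();
  for await (const v of iter) {
    if (v >= 2) break; // early exit triggers return()
  }
  return wasCleanedUp();
}
```

Whether the server actually stops generating then depends on the backend honouring the dropped connection, which is what this issue is asking LocalAI to do.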
Describe alternatives you've considered
I've tried the following two approaches:
```js
import OpenAI from "openai";

const content = `
Please write JavaScript code that creates
a scatter plot with D3.js.
Use \`const\` and \`let\` instead of \`var\`.
Use the arrow function syntax.
## JavaScript code
`;

const openai = new OpenAI({
  apiKey: "",
  baseURL: "http://192.168.0.140:8080/v1",
});

const stream = await openai.chat.completions.create({
  model: "llama-2-7b-chat.ggmlv3.q4_0.bin",
  messages: [{ role: "user", content }],
  stream: true,
});

let keepGoing = true;

setTimeout(() => {
  // Approach A: This appears to do nothing.
  stream.controller.abort();
  // Approach B: This stops the client from iterating,
  // but the server keeps computing the response.
  keepGoing = false;
}, 10 * 1000);

for await (const part of stream) {
  // The final chunk's delta has no `content`, so guard against undefined.
  process.stdout.write(part.choices[0].delta.content ?? "");
  if (!keepGoing) {
    break;
  }
}
```

Additional context
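For comparison, here is the behaviour Approach A should produce once abort is honoured: consumption stops on the iteration after `abort()` fires. A self-contained sketch (no network; `fakeStream` is a hypothetical stand-in for the chat-completions stream):

```js
// Sketch of the expected abort semantics. `fakeStream` stands in for the
// chat-completions stream; nothing here touches the network.
async function* fakeStream(signal) {
  for (let i = 0; i < 100; i++) {
    if (signal.aborted) return; // a well-behaved stream stops once aborted
    yield { choices: [{ delta: { content: `chunk ${i} ` } }] };
  }
}

async function consumeWithAbort() {
  const controller = new AbortController();
  let received = 0;
  for await (const part of fakeStream(controller.signal)) {
    received += 1;
    if (received === 3) controller.abort(); // comparable to stream.controller.abort()
  }
  return received; // the abort is observed before a fourth chunk is yielded
}
```

With LocalAI today, the equivalent of this abort is not propagated, so generation continues server-side.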