Ability to abort streaming completion #974

@curran

Description

Is your feature request related to a problem? Please describe.

As a developer of an app that leverages LocalAI and Llama-2 for streaming completions, I want to give users the ability to "abort" or "cancel" the streaming response, so that my self-hosted instance is not spinning CPU / GPU cycles generating the rest of the stream that users won't even see.

Describe the solution you'd like

Ideally, I'd like to use the Node.js OpenAI package API to abort the stream. As documented in https://github.com/openai/openai-node#streaming-responses, it should be possible to invoke

stream.controller.abort()

or simply break out of the async iteration loop.
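For context, a self-contained sketch (no network; the stream here is a fake, illustrative async generator, not part of the openai package) of why break is expected to work: breaking out of a for await loop calls the iterator's return() method, which runs the generator's finally block. That is the hook a streaming client can use to close the underlying HTTP connection.

```javascript
// Illustrative only: a fake token stream standing in for the
// chat.completions stream. Breaking out of the consuming loop
// triggers the generator's finally block via iterator return().
let closed = false;

async function* fakeTokenStream() {
  try {
    for (let i = 0; i < 100; i++) {
      yield `token-${i} `;
    }
  } finally {
    // Runs when the consumer breaks out of the loop early;
    // a real client would close its HTTP connection here.
    closed = true;
  }
}

const received = [];
for await (const token of fakeTokenStream()) {
  received.push(token);
  if (received.length === 3) break; // consumer aborts after 3 chunks
}

console.log(closed);          // true
console.log(received.length); // 3
```

So the client-side mechanics work in plain JavaScript; the question is whether the server stops generating once the connection is dropped.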

Describe alternatives you've considered

I've tried the following two approaches.

import OpenAI from "openai";

const content = `
Please write JavaScript code that creates
a scatter plot with D3.js.

Use \`const\` and \`let\` instead of \`var\`.
Use the arrow function syntax.

## JavaScript code
`;

const openai = new OpenAI({
  apiKey: "",
  baseURL: "http://192.168.0.140:8080/v1",
});

const stream = await openai.chat.completions.create({
  model: "llama-2-7b-chat.ggmlv3.q4_0.bin",
  messages: [{ role: "user", content }],
  stream: true,
});

let keepGoing = true;
setTimeout(() => {

  // Approach A: This appears to do nothing.
  stream.controller.abort();

  // Approach B:
  // This stops the client from iterating,
  // but the server keeps computing the response
  keepGoing = false;
}, 10 * 1000);


for await (const part of stream) {
  if (!keepGoing) {
    break;
  }
  // The final chunk may carry no content, so guard against undefined.
  process.stdout.write(part.choices[0].delta.content ?? "");
}
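A third variant I'd expect to work, assuming the per-request options documented in the openai-node README (the second argument to create accepts an AbortSignal as signal): wire a timeout to an AbortController and pass its signal with the request. The sketch below is runnable offline; the actual create call is shown as a comment since it needs a live server, and the timeout is simulated by calling abort() directly.

```javascript
// Hedged sketch: wiring a timeout-driven AbortController to a
// streaming request. The commented-out create() call assumes
// openai-node's per-request options ({ signal }); everything
// executable here is standard AbortController behavior.
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 10 * 1000);

// const stream = await openai.chat.completions.create(
//   { model, messages, stream: true },
//   { signal: controller.signal },
// );

let aborted = false;
controller.signal.addEventListener("abort", () => {
  aborted = true;
});

controller.abort(); // simulate the timeout firing early
clearTimeout(timer);

console.log(aborted);                   // true
console.log(controller.signal.aborted); // true
```

Whichever client mechanism is used, the server still needs to notice the dropped connection and stop inference, which is what this issue is asking for.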

Additional context

Labels

enhancement (New feature or request)