
Add method for dumping context and loading it. #39

@saul-jb


Feature Description

As far as I can tell, there is currently no way to persist context. I would like to be able to dump the internal state of the model to a file or some other form of persistent storage so I can load a chat later and keep the context history.

This would solve the following use cases:

  • Continue a chat after a crash or restart.
  • Run multiple chats on a single model.

The Solution

Having the ability to access (dump) the internal context data as a stream or buffer would allow one to save it to a file and then load it from there later to continue the chat.
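To make the idea concrete, here is a rough sketch of what such an API could look like. The `saveState()` / `loadState()` methods and their Buffer-based signatures are assumptions for illustration only, not part of the current node-llama-cpp API, and the model path is a placeholder:

```typescript
import fs from "node:fs/promises";
import {LlamaModel, LlamaContext} from "node-llama-cpp";

const model = new LlamaModel({modelPath: "models/model.gguf"});
const context = new LlamaContext({model});

// ...chat with the model using `context`...

// Hypothetical: dump the internal context state to a Buffer and persist it.
const state = await context.saveState();
await fs.writeFile("chat-state.bin", state);

// Later (even after a crash or restart): restore the state into a fresh
// context and continue the chat with the full history intact.
const restoredContext = new LlamaContext({model});
await restoredContext.loadState(await fs.readFile("chat-state.bin"));
```

A buffer/stream-based dump like this would also make the second use case simple: keep one dumped state per chat and load whichever one is needed into a context on demand.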

Considered Alternatives

It might be possible to save the entire (text) history and re-evaluate it on the next load, but this seems quite inefficient and would take a long time.
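Roughly what that alternative would look like is sketched below; the `conversationHistory` option shown is a hypothetical illustration rather than the library's current API. The cost is that the whole saved transcript has to be re-evaluated by the model before the chat can continue:

```typescript
import fs from "node:fs/promises";
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

// Persist only the plain-text transcript of the chat.
const history = [
    {prompt: "Hi there, how are you?", response: "I'm doing well, thanks!"}
];
await fs.writeFile("chat-history.json", JSON.stringify(history));

// On the next run, rebuild the session from the transcript. Every saved
// token has to be run back through the model to rebuild the context,
// which is the slow part this feature request wants to avoid.
const model = new LlamaModel({modelPath: "models/model.gguf"});
const context = new LlamaContext({model});
const session = new LlamaChatSession({
    context,
    conversationHistory: JSON.parse(
        await fs.readFile("chat-history.json", "utf8")
    )
});
```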

Other solutions to solve the above use cases would be welcome.

Additional Context

Does anyone know if llama.cpp exposes such a feature, which would make it trivial to add to this project?

This project, https://github.com/kuvaus/LlamaGPTJ-chat, has a feature like this, and I believe it uses llama.cpp, so I am thinking there should be a way to do this.

Related Features to This Feature Request

  • Metal support
  • CUDA support
  • Grammar

Are you willing to resolve this issue by submitting a Pull Request?

No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.
