
Custom Sampling Pipelines #348


Merged

Conversation

@martindevans (Member) commented Dec 8, 2023

Introduced an entirely new way to sample models.

The current sampling mechanism is hardcoded (see here) and can only be configured by tweaking values in `InferenceParams`.

This means a couple of things are not possible:

  • Re-ordering sampling stages (for example, another PR suggests applying temperature twice)
  • Developing entirely new sampling mechanisms (e.g. MinP could not have been developed using LLamaSharp).

This PR introduces:

  • A low-level interface for entirely customised sampling: `ISamplingPipeline`. If the `SamplingPipeline` property in the inference params is non-null, this pipeline will be used.
  • `BaseSamplingPipeline` is an abstract implementation of `ISamplingPipeline` which makes it a little easier to implement.
  • `DefaultSamplingPipeline` is a demo implementation that performs standard sampling.

You can see an example of it in use here: https://github.com/SciSharp/LLamaSharp/pull/348/files#diff-16c496b3d63b9606d57c5d2a5592059223a9e4bc7ab29416ea775d5356888df7R37
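For illustration, a minimal sketch of opting in to the new system (the namespaces below are assumptions; see the linked diff for the real usage):

```csharp
using LLama.Common;
using LLama.Sampling;  // namespace is an assumption; see the linked diff

// When SamplingPipeline is non-null, the executors sample through it
// instead of the hardcoded path configured by the other parameters.
var inferenceParams = new InferenceParams
{
    SamplingPipeline = new DefaultSamplingPipeline()
};
```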

Commit notes from this PR:

 - Added a `Sample` method to `LLamaContext` which uses a custom pipeline
 - Modified all executors to use the custom pipeline if it exists
 - Added `BaseSamplingPipeline`, which provides a base implementation of `ISamplingPipeline`
 - Added `DefaultSamplingPipeline`, which mimics normal llama.cpp sampling
@AsakusaRinne (Collaborator) left a comment

Great work! Greedy sampling is also a common sampling method, and it would be good to provide it in LLamaSharp. It's not required in this PR, just a note. :)
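For illustration, greedy sampling could be expressed as a custom pipeline roughly like this (a sketch only: the `Sample`/`Reset` member shape below is an assumption about the `ISamplingPipeline` interface introduced by this PR; see the linked diff for the real signature):

```csharp
using System;
using LLama.Native;
using LLama.Sampling;

// Hypothetical sketch of greedy sampling as a custom pipeline.
// The method signature is an assumption about the ISamplingPipeline
// shape introduced in this PR.
public sealed class GreedySamplingPipeline : ISamplingPipeline
{
    public int Sample(SafeLLamaContextHandle ctx, Span<float> logits, ReadOnlySpan<int> lastTokens)
    {
        // Greedy sampling: always pick the single highest-logit token.
        var best = 0;
        for (var i = 1; i < logits.Length; i++)
        {
            if (logits[i] > logits[best])
                best = i;
        }
        return best;
    }

    public void Reset() { }

    public void Dispose() { }
}
```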

@martindevans (Member, Author) replied

That reminds me of something related that I've been thinking about and might come back to in a future PR. At the moment the LLamaContext basically has several different sampling modes, depending on which parameters you pass:

  • Pure Greedy (Temp <= 0)
  • MiroStat
  • MiroStat2
  • TopK + TailFree + LocallyTypical + TopP + MinP + Temperature + Sampling

So using this new pipeline system we could explicitly split it into four separate pipelines, construct them inside LLamaContext, and use them as necessary. That would split some of the code out of LLamaContext into a more reusable form.
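A hedged sketch of that four-way split (only `DefaultSamplingPipeline` exists in this PR; `GreedySamplingPipeline` is the sketch from the earlier comment, and the Mirostat pipeline names are invented here for illustration):

```csharp
using LLama.Sampling;

// Hypothetical mode selection: one dedicated pipeline per sampling mode,
// constructed inside LLamaContext instead of one hardcoded code path.
// Only DefaultSamplingPipeline exists in this PR; the other class names
// are illustrative assumptions.
static ISamplingPipeline SelectPipeline(float temperature, int mirostat)
{
    if (temperature <= 0)
        return new GreedySamplingPipeline();     // pure greedy (Temp <= 0)
    if (mirostat == 1)
        return new MirostatSamplingPipeline();   // hypothetical
    if (mirostat == 2)
        return new Mirostat2SamplingPipeline();  // hypothetical
    return new DefaultSamplingPipeline();        // TopK + TailFree + ... + Temperature
}
```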

@martindevans martindevans merged commit d87d654 into SciSharp:master Dec 11, 2023
@martindevans martindevans deleted the new_object_based_sampling_pipeline branch December 11, 2023 21:40