What would you like to be added:
Add the PostResponse extension point to the prefix plugin, and use it to update the prefix cache with the response text.
Why is this needed:
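Model servers (vLLM at least) cache the generated tokens as well as the prompt tokens. Upon receiving the response, the prefix plugin should therefore also add the response to the prefix indexer. The indexer then reflects more accurately what each server has cached, which matters for follow-up requests that include the previous response in their prompt.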
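
A minimal sketch of the idea, in Python for brevity rather than in the plugin's own language. Everything here is an assumption made for illustration: the names `PrefixIndexer`, `PrefixPlugin`, `post_schedule`, `post_response`, and `BLOCK_SIZE` are hypothetical and do not mirror the real plugin API, and the indexer hashes fixed-size character blocks instead of whatever the real indexer hashes.

```python
import hashlib
from collections import defaultdict

# Hypothetical block size in characters, kept small for the demo.
BLOCK_SIZE = 8


def block_hashes(text, block_size=BLOCK_SIZE):
    """Chain-hash the full blocks of text, so each hash identifies a whole prefix."""
    hashes = []
    prev = ""
    for start in range(0, len(text) - block_size + 1, block_size):
        block = text[start:start + block_size]
        prev = hashlib.sha256((prev + block).encode("utf-8")).hexdigest()
        hashes.append(prev)
    return hashes


class PrefixIndexer:
    """Toy indexer: maps a prefix block hash to the servers believed to cache it."""

    def __init__(self):
        self._servers_by_hash = defaultdict(set)

    def add(self, text, server):
        for h in block_hashes(text):
            self._servers_by_hash[h].add(server)

    def match_length(self, text, server):
        """Number of leading blocks of text that the server is believed to cache."""
        matched = 0
        for h in block_hashes(text):
            if server not in self._servers_by_hash.get(h, ()):
                break
            matched += 1
        return matched


class PrefixPlugin:
    """Toy plugin with a hypothetical post-response hook."""

    def __init__(self):
        self.indexer = PrefixIndexer()

    def post_schedule(self, prompt, server):
        # Existing behavior (assumed): record the prompt once a server is picked.
        self.indexer.add(prompt, server)

    def post_response(self, prompt, response_text, server):
        # Proposed behavior: also record prompt + response, because the model
        # server caches the generated tokens too.
        self.indexer.add(prompt + response_text, server)
```

With a 16-character prompt, a 16-character response, and a follow-up request made of prompt + response + new text, the matched block count for the serving pod rises from 2 (prompt only) to 4 once `post_response` has run. A real implementation would also have to account for how the response is re-embedded in the next prompt (for example by a chat template), which this sketch ignores.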