Get outputs from intermediate layers #4224

Closed
BDHU opened this issue Nov 26, 2023 · 4 comments

@BDHU

BDHU commented Nov 26, 2023

I want to analyze the outputs of the intermediate layers of Llama 2 7B during the forward pass. Any hints on where to look in the code? The whole forward pass seems to be based on a graph, and it's not clear to me where the layer boundaries are. Any help is appreciated!

@ggerganov
Member

It's very difficult to do that at this point. We will try to improve this in #2783 (no ETA)

@BDHU
Author

BDHU commented Nov 26, 2023

I see. Now suppose I'm only interested in the execution latency and not the accuracy: can we simply remove some layers inside llama_model_load in llama.cpp? I tried to manually remove layers using model.layers.erase(std::next(model.layers.begin())); and to reduce n_gpu_layers so that the model only executes certain layers. I do this after llm_load_tensors is invoked, but it seems to have no impact on tokens/sec even when I remove 30 layers, which is strange to me. Is it possible to simply remove some layers before the graph is constructed?
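
Roughly what I tried, as a sketch (the exact surrounding code in my local copy may differ; it runs right after llm_load_tensors returns):

    // drop the second transformer layer from the loaded model
    model.layers.erase(std::next(model.layers.begin()));
    // ...repeated for the other layers I want to drop, with n_gpu_layers
    // lowered by the same amount in the model params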

@ggerganov
Member

The easiest thing to do is something like this:

diff --git a/llama.cpp b/llama.cpp
index 9fb7244b..f6d8ea2d 100644
--- a/llama.cpp
+++ b/llama.cpp
@@ -3867,6 +3867,8 @@ struct llm_build_context {
         }
 
         for (int il = 0; il < n_layer; ++il) {
+            if (il > 5 && il < 20) continue; // skip layers [6, 20)
+
             struct ggml_tensor * inpSA = inpL;
 
             // norm

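With this change the skipped layers are simply never added to the compute graph: inpL passes straight through them, so the generated text will be garbage, but the measured tokens/sec reflects only the layers that remain. Adjust the condition to skip whatever range you want to time.
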
@BDHU
Author

BDHU commented Nov 26, 2023

Ah thanks! It works!
