[Usage]: How to access the mlp layer using the current version of vllm (0.4.0) #8278

@waterluck

Your current environment


Description:

I am currently updating code that was written based on an older version of vllm (version 0.2.7). In the previous implementation, I accessed the mlp layer using the following code snippet:

obj = model.llm_engine.driver_worker.model_runner.model.model.layers[i].mlp

However, after updating to the latest version of vllm, this line now raises the following error:

AttributeError: 'LLMEngine' object has no attribute 'driver_worker'

It seems that the architecture of vllm has changed in the newer version, and I am unsure how to access the mlp layer now.
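
From skimming the newer source, it looks like the engine now delegates to a model executor, so my best guess is that the chain gained a model_executor hop, but I have not been able to confirm that this is the supported way:

# My guess for 0.4.x: the driver worker seems to live under a model executor now
# (unverified; the model_executor hop is my assumption and may differ per backend).
obj = model.llm_engine.model_executor.driver_worker.model_runner.model.model.layers[i].mlp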

Below is the relevant part of the code where I use this method:

from types import MethodType

import torch
import torch.nn.functional as F
from vllm import LLM, SamplingParams

# args, is_llama, mask_langs, langs, load_dataset, and sampling_params are
# defined elsewhere in my script.
model = LLM(model=args.model, tensor_parallel_size=torch.cuda.device_count(), enforce_eager=True)

if args.activation_mask:
    activation_masks = torch.load(args.activation_mask)

for activation_mask, mask_lang in zip(activation_masks, mask_langs):
    if activation_mask:
        # Build a forward() that zeroes out the masked activation channels.
        def factory(mask):
            def llama_forward(self, x):
                gate_up, _ = self.gate_up_proj(x)
                i = gate_up.size(-1)
                activation = F.silu(gate_up[:, :, : i // 2])
                activation.index_fill_(2, mask, 0)
                x = activation * gate_up[:, :, i // 2 :]
                x, _ = self.down_proj(x)
                return x

            def bloom_forward(self, x: torch.Tensor):
                x, _ = self.dense_h_to_4h(x)
                x = self.gelu_impl(x)
                x.index_fill_(2, mask, 0)
                x, _ = self.dense_4h_to_h(x)
                return x

            if is_llama:
                return llama_forward
            else:
                return bloom_forward

        # Monkey-patch each layer's mlp.forward with the masked variant.
        for i, layer_mask in enumerate(activation_mask):
            if is_llama:
                obj = model.llm_engine.driver_worker.model_runner.model.model.layers[i].mlp
            else:
                obj = model.llm_engine.driver_worker.model_runner.model.transformer.h[i].mlp
            obj.forward = MethodType(factory(layer_mask.to('cuda')), obj)

    for lang in langs:
        texts, sampling_params = load_dataset(lang, sampling_params)
        outputs = model.generate(texts, sampling_params)
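
The closest I have come to finding the new location is inspecting the engine's attributes at runtime (exploratory only; none of these names are documented API):

# Dump the engine's public attributes to see where the worker/model moved.
engine = model.llm_engine
print([a for a in dir(engine) if not a.startswith("_")])
executor = getattr(engine, "model_executor", None)  # my guess at the new hop
if executor is not None:
    print([a for a in dir(executor) if not a.startswith("_")])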

Questions:

  1. What is the correct method to access the mlp layer in the new version of vllm?
  2. Has there been a change in how the model architecture is structured in the new versions? If so, could you please guide me on how to adjust the above code to work with the updated architecture?

Any guidance would be appreciated. Thanks!

How would you like to use vllm

I don't know how to adapt this code to the new version of vllm.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
