IP-Adapter FaceID Plus: usage questions

https://github.com/huggingface/diffusers/blob/9ef43f38d43217f690e222a4ce0239c6a24af981/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L492

## error msg:
    pipe.unet.encoder_hid_proj.image_projection_layers[0].clip_embeds = clip_embeds.to(dtype=torch.float16)
    AttributeError: 'list' object has no attribute 'to'
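
For readers hitting the same traceback, here is a minimal sketch of what the message implies. `FakeTensor` is a hypothetical stand-in so the sketch runs without torch or model weights, and the "one list entry per loaded adapter" layout is an assumption inferred from the error text, not confirmed behavior of the library.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without torch."""

    def to(self, dtype=None):
        return self


# Assumption: the value assigned to clip_embeds is a list holding one
# tensor per loaded IP-Adapter, rather than a bare tensor.
returned = [FakeTensor()]

try:
    returned.to(dtype="float16")  # a list has no .to() method
except AttributeError as exc:
    message = str(exc)  # same wording as the traceback above

# Indexing the list first reaches the object that does have .to()
converted = returned[0].to(dtype="float16")
```

If that assumption holds, indexing the returned list before calling `.to(...)` would avoid this particular error.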

Hi!
I'm having some problems using IP-Adapter FaceID Plus. Could you help me with the questions below? Thank you very much.

1. First question: what should I pass as the `ip_adapter_image` parameter of the `prepare_ip_adapter_image_embeds` function?
2. Second question: the code below differs between the merged PR comment linked here and the example in the `ip_adapter.md` file. What problems does this mismatch cause?
Merge link:
 https://github.com/huggingface/diffusers/pull/7186#issuecomment-1986961595
Code that differs:
      ```
      ref_images_embeds = torch.stack(ref_images_embeds, dim=0).unsqueeze(0)
      neg_ref_images_embeds = torch.zeros_like(ref_images_embeds)
      id_embeds = torch.cat([neg_ref_images_embeds, ref_images_embeds]).to(dtype=torch.float16, device="cuda")
      ```
@yiyixuxu @fabiorigano 
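
To make the shapes in the snippet above concrete, here is a small sketch that runs it on dummy data. It assumes each face embedding is a 512-dimensional vector that has already been unsqueezed to `(1, 512)`, as `extract_face_features` does in the script below. The random tensor and the CPU device are stand-ins, not what the real pipeline uses.

```python
import torch

# Dummy stand-in for one insightface `normed_embedding` (assumed 512-dim),
# unsqueezed to shape (1, 512) as in the issue's script.
ref_images_embeds = [torch.randn(512).unsqueeze(0)]

# Same three lines as the snippet in question.
ref_images_embeds = torch.stack(ref_images_embeds, dim=0).unsqueeze(0)
neg_ref_images_embeds = torch.zeros_like(ref_images_embeds)
# Kept on CPU so the sketch runs anywhere; the snippet moves it to "cuda".
id_embeds = torch.cat([neg_ref_images_embeds, ref_images_embeds]).to(dtype=torch.float16)

shape = tuple(id_embeds.shape)
```

Under these assumptions `shape` is `(2, 1, 1, 512)`: the first half along dimension 0 is the all-zero negative embedding and the second half is the reference embedding.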

## os:
diffusers==0.28.0.dev0

## this is my code:

```
# @FileName：StableDiffusionIpAdapterFaceIDTest.py
# @Description：
# @Author：dyh
# @Time：2024/4/24 11:45
# @Website：www.xxx.com
# @Version：V1.0
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline
from insightface.app import FaceAnalysis
from transformers import CLIPVisionModelWithProjection

model_path = '../../../aidazuo/models/Stable-diffusion/stable-diffusion-v1-5'
clip_path = '../../../aidazuo/models/CLIP-ViT-H-14-laion2B-s32B-b79K'
ip_adapter_path = '../../../aidazuo/models/IP-Adapter-FaceID'
ip_img_path = '../../../aidazuo/jupyter-script/test-img/vermeer.png'


def extract_face_features(image_lst: list, input_size: tuple):
    # Extract Face features using insightface
    ref_images = []
    app = FaceAnalysis(name="buffalo_l",
                       root=ip_adapter_path,
                       providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])

    app.prepare(ctx_id=0, det_size=input_size)
    for img in image_lst:
        # PIL images are RGB and insightface expects BGR; the channel swap is the same either way
        image = cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)
        faces = app.get(image)
        image = torch.from_numpy(faces[0].normed_embedding)
        ref_images.append(image.unsqueeze(0))
    ref_images = torch.cat(ref_images, dim=0)

    return ref_images


ip_adapter_img = Image.open(ip_img_path)

image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    clip_path,
    torch_dtype=torch.float16,
    use_safetensors=True
)

pipe = StableDiffusionPipeline.from_pretrained(
    model_path,
    variant="fp16",
    safety_checker=None,
    image_encoder=image_encoder,
    torch_dtype=torch.float16).to("cuda")

adapter_file_lst = ["ip-adapter-faceid-plus_sd15.bin"]
adapter_weight_lst = [0.5]

pipe.load_ip_adapter(ip_adapter_path, subfolder=None, weight_name=adapter_file_lst)
pipe.set_ip_adapter_scale(adapter_weight_lst)

face_id_embeds = extract_face_features([ip_adapter_img], ip_adapter_img.size)

clip_embeds = pipe.prepare_ip_adapter_image_embeds(ip_adapter_image=[ip_adapter_img],
                                                   ip_adapter_image_embeds=None,
                                                   device='cuda',
                                                   num_images_per_prompt=1,
                                                   do_classifier_free_guidance=True)

# The AttributeError from the traceback is raised on the next line: clip_embeds is a list here
pipe.unet.encoder_hid_proj.image_projection_layers[0].clip_embeds = clip_embeds.to(dtype=torch.float16)
pipe.unet.encoder_hid_proj.image_projection_layers[0].shortcut = False  # True if Plus v2

generator = torch.manual_seed(33)
images = pipe(
    prompt='a beautiful girl',
    ip_adapter_image_embeds=clip_embeds,
    negative_prompt="",
    num_inference_steps=30,
    num_images_per_prompt=1,
    generator=generator,
    width=512,
    height=512).images

print(images)
```
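
One possible way to restructure the last part of the script is sketched below. `generate_with_faceid_plus` is a hypothetical helper, not a diffusers API. It assumes that `prepare_ip_adapter_image_embeds` returns one entry per loaded adapter, and that the pipeline call should receive the face-ID embeddings (negative and positive halves concatenated) rather than the CLIP embeddings. Both points are assumptions to check against the `ip_adapter.md` example.

```python
def generate_with_faceid_plus(pipe, ip_adapter_image, id_embeds, prompt, device, dtype):
    """Hypothetical helper: wire CLIP and face-ID embeddings into a loaded pipeline.

    pipe             -- pipeline with a single FaceID Plus adapter already loaded
    ip_adapter_image -- reference image input, passed through unchanged
    id_embeds        -- face-ID tensor with negative and positive halves concatenated
    """
    # Assumption: the result has one entry per adapter, so take entry 0.
    clip_embeds = pipe.prepare_ip_adapter_image_embeds(
        ip_adapter_image=ip_adapter_image,
        ip_adapter_image_embeds=None,
        device=device,
        num_images_per_prompt=1,
        do_classifier_free_guidance=True,
    )[0]

    projection = pipe.unet.encoder_hid_proj.image_projection_layers[0]
    projection.clip_embeds = clip_embeds.to(dtype=dtype)
    projection.shortcut = False  # True if Plus v2, as in the script above

    # Assumption: the call takes the face-ID embeddings, one list entry per adapter.
    return pipe(
        prompt=prompt,
        ip_adapter_image_embeds=[id_embeds],
        negative_prompt="",
        num_inference_steps=30,
        num_images_per_prompt=1,
    ).images
```

With the script above this could be called as `generate_with_faceid_plus(pipe, [ip_adapter_img], id_embeds, 'a beautiful girl', 'cuda', torch.float16)`, where `id_embeds` would be built from `face_id_embeds` in the same way as the snippet in the second question.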




IP-Adapter FaceID PLus How to use questions #7766