Refactor: (clip.cpp) identify and regroup pre-processing strategies #13077

### Background Description

Currently, `clip_image_preprocess` still looks quite messy.

From a graphic designer's perspective, this function is essentially "Photoshop in C++": its main purpose is to preprocess a given image before sending it to the transformer. Preprocessing involves cropping, resizing, and padding the image.

Currently, there are a few strategies for preprocessing an image:
- Resize to a fixed (square) size and add padding if the aspect ratio is not square (used by llava 1.5, gemma 3, GLM)  
  Note: llava 1.5 uses a grayish color for padding, while the rest use black
- Allow a dynamic resolution / aspect ratio, but limit the maximum size (used by qwen2vl, pixtral)  
  The image still needs to be resized to the nearest multiple of the patch size
- Crop the image into slices, aka llava-uhd (used by llava 1.6, minicpm-v)
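
To make the difference between the strategies concrete, here is a rough sketch of the size arithmetic behind each one. None of these names exist in `clip.cpp`: `img_size`, `img_rect`, `fit_in_square`, `fit_dynamic` and `slice_grid` are hypothetical helpers. How a model picks its slice grid is not covered, so the grid is simply passed in.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper types, not the structs used in clip.cpp.
struct img_size { int width;  int height; };
struct img_rect { int x; int y; int width; int height; };

// Strategy 1: scale the image to fit inside a target x target square while
// keeping its aspect ratio. The rest of the square would be filled with padding.
inline img_size fit_in_square(img_size in, int target) {
    const float scale = std::min((float) target / in.width, (float) target / in.height);
    return {
        std::max(1, (int) std::round(in.width  * scale)),
        std::max(1, (int) std::round(in.height * scale)),
    };
}

// Round x to the nearest multiple of m, but never below m.
inline int round_to_multiple(int x, int m) {
    return std::max(m, ((x + m / 2) / m) * m);
}

// Strategy 2: keep the aspect ratio, cap the longest side at max_side, then
// align both sides to the patch size.
// Assumption: max_side is itself a multiple of patch, so rounding cannot exceed it.
inline img_size fit_dynamic(img_size in, int max_side, int patch) {
    const int   longest = std::max(in.width, in.height);
    const float scale   = longest > max_side ? (float) max_side / longest : 1.0f;
    return {
        round_to_multiple((int) std::round(in.width  * scale), patch),
        round_to_multiple((int) std::round(in.height * scale), patch),
    };
}

// Strategy 3: cut the image into a cols x rows grid of slices.
inline std::vector<img_rect> slice_grid(img_size in, int cols, int rows) {
    std::vector<img_rect> out;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            const int x0 = in.width  * c / cols;
            const int x1 = in.width  * (c + 1) / cols;
            const int y0 = in.height * r / rows;
            const int y1 = in.height * (r + 1) / rows;
            out.push_back({x0, y0, x1 - x0, y1 - y0});
        }
    }
    return out;
}
```

Only the geometry is shown here; the actual resampling, padding color and normalization would live in the strategy-specific functions.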

### Possible Refactor Approaches

Make an enum, split each strategy into a dedicated function, and give the functions descriptive names.
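
One possible shape for this, as a skeleton only. Every name here (`preprocess_strategy`, `preprocess_params`, the `preprocess_*` functions) is a placeholder, and the functions return a description string instead of pixel data so the dispatch can be read in isolation.

```cpp
#include <array>
#include <cstdint>
#include <string>

// All names below are hypothetical and only illustrate the proposed structure.
enum class preprocess_strategy {
    FIXED_SIZE_PAD,      // resize to a fixed square and pad
    DYNAMIC_RESOLUTION,  // keep the ratio, cap the size, align to the patch size
    SLICES,              // llava-uhd style slicing
};

struct preprocess_params {
    preprocess_strategy strategy = preprocess_strategy::FIXED_SIZE_PAD;
    int image_size = 0;                            // square side or maximum side
    int patch_size = 0;
    std::array<uint8_t, 3> pad_color = {0, 0, 0};  // black unless the model overrides it
};

// One dedicated function per strategy. The bodies are stubs.
inline std::string preprocess_fixed_size_pad(const preprocess_params & p) {
    return "fixed " + std::to_string(p.image_size) + "x" + std::to_string(p.image_size);
}

inline std::string preprocess_dynamic_resolution(const preprocess_params & p) {
    return "dynamic, max side " + std::to_string(p.image_size);
}

inline std::string preprocess_slices(const preprocess_params &) {
    return "slices";
}

// The entry point only selects a strategy; it contains no image logic itself.
inline std::string preprocess(const preprocess_params & p) {
    switch (p.strategy) {
        case preprocess_strategy::FIXED_SIZE_PAD:     return preprocess_fixed_size_pad(p);
        case preprocess_strategy::DYNAMIC_RESOLUTION: return preprocess_dynamic_resolution(p);
        case preprocess_strategy::SLICES:             return preprocess_slices(p);
    }
    return "unknown";
}
```

With this layout, per-model differences such as the padding color become plain fields in the params struct rather than branches inside one large function.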
