* [Community Pipeline] Add Marigold Monocular Depth Estimation
- add single-file pipeline
- update README
* fix format - add one blank line
* format script with ruff
* use direct image link in example code
---------
Co-authored-by: Sayak Paul <[email protected]>
| Marigold Monocular Depth Estimation | A universal monocular depth estimator, utilizing Stable Diffusion, delivering sharp predictions in the wild. (See the [project page](https://marigoldmonodepth.github.io) and [full codebase](https://github.com/prs-eth/marigold) for more details.) | [Marigold Depth Estimation](#marigold-depth-estimation) | [Hugging Face Space](https://huggingface.co/spaces/toshas/marigold) [Open In Colab](https://colab.research.google.com/drive/12G8reD13DdpMie5ZQlaFNo2WCGeNUH-u?usp=sharing) | [Bingxin Ke](https://github.com/markkua) and [Anton Obukhov](https://github.com/toshas) |
| LLM-grounded Diffusion (LMD+) | LMD greatly improves the prompt following ability of text-to-image generation models by introducing an LLM as a front-end prompt parser and layout planner. [Project page.](https://llm-grounded-diffusion.github.io/) [See our full codebase (also with diffusers).](https://github.com/TonyLianLong/LLM-groundedDiffusion) | [LLM-grounded Diffusion (LMD+)](#llm-grounded-diffusion) | [Huggingface Demo](https://huggingface.co/spaces/longlian/llm-grounded-diffusion) [Open In Colab](https://colab.research.google.com/drive/1SXzMSeAB-LJYISb2yrUOdypLz4OYWUKj) | [Long (Tony) Lian](https://tonylian.com/) |
| CLIP Guided Stable Diffusion | Doing CLIP guidance for text-to-image generation with Stable Diffusion | [CLIP Guided Stable Diffusion](#clip-guided-stable-diffusion) | [Open In Colab](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/CLIP_Guided_Stable_diffusion_with_diffusers.ipynb) | [Suraj Patil](https://github.com/patil-suraj/) |
| One Step U-Net (Dummy) | Example showcasing how to use Community Pipelines (see https://github.com/huggingface/diffusers/issues/841) | [One Step U-Net](#one-step-unet) | - | [Patrick von Platen](https://github.com/patrickvonplaten/) |
Marigold is a universal monocular depth estimator that delivers accurate and sharp predictions in the wild. Based on Stable Diffusion, it is trained exclusively with synthetic depth data and excels in zero-shot adaptation to real-world imagery. This pipeline is an official implementation of the inference process. More details can be found on our [project page](https://marigoldmonodepth.github.io) and [full codebase](https://github.com/prs-eth/marigold) (also implemented with diffusers).
This depth estimation pipeline processes a single input image through multiple diffusion denoising stages to estimate depth maps. These maps are subsequently merged to produce the final output. Below is an example code snippet, including optional arguments:
```python
import numpy as np
import torch
from PIL import Image

from diffusers import DiffusionPipeline
from diffusers.utils import load_image

pipe = DiffusionPipeline.from_pretrained(
    "Bingxin/Marigold",
    custom_pipeline="marigold_depth_estimation"
    # torch_dtype=torch.float16,  # (optional) Run with half-precision (16-bit float).
)
pipe.to("cuda")

image: Image.Image = load_image("path/to/image.jpg")  # local path or direct image URL

pipeline_output = pipe(
    image,                      # Input image.
    # denoising_steps=10,       # (optional) Number of denoising steps of each inference pass. Default: 10.
    # ensemble_size=10,         # (optional) Number of inference passes in the ensemble. Default: 10.
    # processing_res=768,       # (optional) Maximum resolution of processing. If set to 0: will not resize at all. Defaults to 768.
    # match_input_res=True,     # (optional) Resize depth prediction to match input resolution.
    # batch_size=0,             # (optional) Inference batch size, no bigger than `num_ensemble`. If set to 0, the script will automatically decide the proper batch size. Defaults to 0.
    # color_map="Spectral",     # (optional) Colormap used to colorize the depth map. Defaults to "Spectral".
    # show_progress_bar=True,   # (optional) If true, will show progress bars of the inference progress.
)

depth: np.ndarray = pipeline_output.depth_np                # Predicted depth map.
depth_colored: Image.Image = pipeline_output.depth_colored  # Colorized depth map.
depth_colored.save("./depth_colored.png")
```
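As the parameter comments indicate, a larger `ensemble_size` merges more inference passes into the final prediction at the cost of runtime, while `denoising_steps` controls the cost of each individual pass.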
LMD and LMD+ greatly improve the prompt understanding ability of text-to-image generation models by introducing an LLM as a front-end prompt parser and layout planner. They improve spatial reasoning, the understanding of negation, attribute binding, generative numeracy, etc. in a unified manner without explicitly aiming for each. LMD is completely training-free (i.e., it uses the SD model off-the-shelf). LMD+ takes in additional adapters for better control. This is a reproduction of the LMD+ model used in our work. [Project page.](https://llm-grounded-diffusion.github.io/) [See our full codebase (also with diffusers).](https://github.com/TonyLianLong/LLM-groundedDiffusion)
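For orientation, below is a minimal sketch of invoking the community pipeline. The checkpoint name `longlian/lmd_plus`, the `phrases`/`boxes` call arguments, and the hard-coded layout are assumptions for illustration; consult the linked codebase for the exact interface:

```python
import torch
from diffusers import DiffusionPipeline

# Checkpoint name below is an assumption for this sketch; the community
# pipeline is loaded via the `custom_pipeline` argument.
pipe = DiffusionPipeline.from_pretrained(
    "longlian/lmd_plus",
    custom_pipeline="llm_grounded_diffusion",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# In the full LMD workflow, an LLM parses the prompt into grounded phrases
# and layout boxes; they are hard-coded here to keep the sketch self-contained.
prompt = "a waterfall and a modern high speed train in a beautiful forest with fall foliage"
phrases = ["a waterfall", "a modern high speed train"]
boxes = [[0.05, 0.10, 0.45, 0.90], [0.55, 0.35, 0.95, 0.80]]  # normalized [x0, y0, x1, y1]

images = pipe(
    prompt=prompt,
    phrases=phrases,
    boxes=boxes,
    num_inference_steps=50,
).images
images[0].save("lmd_plus_example.png")
```

In practice, the phrases and bounding boxes come from the LLM's layout plan rather than being written by hand.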