Update README.md
README.md CHANGED
@@ -1,3 +1,19 @@
+---
+language:
+- en
+- zh
+library_name: transformers
+tags:
+- image-restoration
+- diffusion
+- computer-vision
+- flux
+- pytorch
+license: other
+license_name: flux-1-dev
+license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
+---
+
<div align="center">
<h1>🎨 LucidFlux:<br/>Caption-Free Universal Image Restoration with a Large-Scale Diffusion Transformer</h1>

@@ -27,8 +43,71 @@ Let us know if this works!
---

## 🌟 What is LucidFlux?
+
+<!-- <div align="center">
+<img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/demo/demo2.png" alt="What is LucidFlux - Quick Prompt Demo" width="1200"/>
+<br>
+</div> -->
+
LucidFlux is a framework for high-fidelity image restoration across a wide range of degradations that requires no textual captions. By combining a Flux-based DiT backbone with a Light-weight Condition Module and SigLIP semantic alignment, LucidFlux provides caption-free guidance while preserving structural and semantic consistency, achieving superior restoration quality.

+<!-- ## 🚀 Quick Start
+
+### 🔧 Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/ephemeral182/LucidFlux.git
+cd LucidFlux
+
+# Create conda environment
+conda create -n postercraft python=3.11
+conda activate postercraft
+
+# Install dependencies
+pip install -r requirements.txt
+``` -->
+
+<!-- ### 🚀 Quick Generation
+
+Generate high-quality aesthetic posters from your prompt with `BF16` precision:
+
+```bash
+python inference.py \
+  --prompt "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes" \
+  --enable_recap \
+  --num_inference_steps 28 \
+  --guidance_scale 3.5 \
+  --seed 42 \
+  --pipeline_path "black-forest-labs/FLUX.1-dev" \
+  --custom_transformer_path "LucidFlux/LucidFlux-v1_RL" \
+  --qwen_model_path "Qwen/Qwen3-8B"
+```
+
+If you are running on a GPU with limited memory, you can use `inference_offload.py` to offload some components to the CPU:
+
+```bash
+python inference_offload.py \
+  --prompt "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes" \
+  --enable_recap \
+  --num_inference_steps 28 \
+  --guidance_scale 3.5 \
+  --seed 42 \
+  --pipeline_path "black-forest-labs/FLUX.1-dev" \
+  --custom_transformer_path "LucidFlux/LucidFlux-v1_RL" \
+  --qwen_model_path "Qwen/Qwen3-8B"
+``` -->
+<!--
+### 💻 Gradio Web UI
+
+We provide a Gradio web UI for LucidFlux.
+
+```bash
+python demo_gradio.py
+``` -->
+
+

## 🏆 Performance Benchmarks

<div align="center">
@@ -195,6 +274,10 @@ LucidFlux is a framework designed to perform high-fidelity image restoration acr
</tbody>
</table>

+
+
+<!-- <img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/user_study/hpc.png" alt="User Study Results" width="1200"/> -->
+
</div>

---
@@ -247,24 +330,25 @@ LucidFlux is a framework designed to perform high-fidelity image restoration acr
<td width="200"><b>LQ</b></td>
<td width="200"><b>HYPIR</b></td>
<td width="200"><b>Topaz</b></td>
+<td width="200"><b>SeeDream 4.0</b></td>
<td width="200"><b>Gemini-NanoBanana</b></td>
<td width="200"><b>GPT-4o</b></td>
<td width="200"><b>Ours</b></td>
</tr>
-<tr align="center"><td colspan="
-<tr align="center"><td colspan="
-<tr align="center"><td colspan="
-<tr align="center"><td colspan="
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_061.jpg" width="1400"></td></tr>
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_094.jpg" width="1400"></td></tr>
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_205.jpg" width="1400"></td></tr>
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_209.jpg" width="1400"></td></tr>
</table>

<details>
<summary>Show more examples</summary>

<table>
-<tr align="center"><td colspan="
-<tr align="center"><td colspan="
-<tr align="center"><td colspan="
-<tr align="center"><td colspan="
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_062.jpg" width="1400"></td></tr>
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_160.jpg" width="1400"></td></tr>
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_111.jpg" width="1400"></td></tr>
+<tr align="center"><td colspan="7"><img src="https://raw.githubusercontent.com/W2GenAI-Lab/LucidFlux/main/images/commercial_comparison/commercial_123.jpg" width="1400"></td></tr>
</table>

</details>
@@ -310,46 +394,33 @@ pip install -r requirements.txt
```

### Inference
-- **Flux.1 dev** → [🤗 FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
-Then update the model path in the `configs` for `flux-dev` in `src/flux/util.py` to your local FLUX.1-dev model path.
+Prepare the models in two steps, then run a single command.
+
+1) Log in to Hugging Face (required for the gated FLUX.1-dev); skip this step if you are already logged in.
+
+```bash
+python -m tools.hf_login --token "$HF_TOKEN"
+```
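If you prefer the stock Hugging Face CLI to the repo's `tools.hf_login` wrapper, the standard `huggingface_hub` commands below should do the same job; this assumes the CLI extra is installed and that your account has accepted the FLUX.1-dev license on the model page:

```bash
# Alternative to tools.hf_login, using the standard huggingface_hub CLI.
# Assumes: pip install -U "huggingface_hub[cli]"
huggingface-cli login --token "$HF_TOKEN"
huggingface-cli whoami   # prints your username if the login worked
```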
-Then set `siglip_ckpt` to the corresponding local path.
-```bash
-inference.sh
-input_folder=input_images_folder
-checkpoint_path=path/to/lucidflux.pth
-swin_ir_ckpt=path/to/swinir.ckpt
-siglip_ckpt=path/to/siglip.ckpt
+2) Download the required weights to fixed paths and export the environment variables.
+
+```bash
+# FLUX.1-dev (flow + ae), the SwinIR prior, T5, CLIP, SigLIP, and the LucidFlux checkpoint go to ./weights
+python -m tools.download_weights --dest weights
+
+# Exports FLUX_DEV_FLOW/FLUX_DEV_AE to your shell
+source weights/env.sh
+```
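After sourcing `weights/env.sh`, it is worth checking that the exported variables point at real files; the variable names come from the comment above, while the exact layout under `weights/` depends on what `tools.download_weights` fetched:

```bash
# Sanity check: both variables should expand to existing files.
ls -lh "$FLUX_DEV_FLOW" "$FLUX_DEV_AE"
ls weights/
```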
+Run inference (uses fixed relative paths):
+
+```bash
+bash inference.sh
-python inference.py \
-  --checkpoint ${checkpoint_path} \
-  --swinir_pretrained ${swin_ir_ckpt} \
-  --control_image ${input_folder} \
-  --siglip_ckpt ${siglip_ckpt} \
-  --prompt "restore this image into high-quality, clean, high-resolution result" \
-  --output_dir ${result_dir}/ \
-  --width 1024 --height 1024 --num_steps 50 \
```

+You can also obtain results of LucidFlux on RealSR and RealLQ250 from Hugging Face: [**LucidFlux**](https://huggingface.co/W2GenAI/LucidFlux).

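For reference, `inference.sh` can also be bypassed by calling the script directly; the sketch below reuses the flags documented in the earlier revision of this section (removed above), with placeholder paths you must point at your local checkpoints and an assumed `results/` output directory:

```bash
# Direct invocation; flags mirror the removed snippet above.
# path/to/... and results/ are placeholders, not fixed repo paths.
python inference.py \
  --checkpoint path/to/lucidflux.pth \
  --swinir_pretrained path/to/swinir.ckpt \
  --control_image input_images_folder \
  --siglip_ckpt path/to/siglip.ckpt \
  --prompt "restore this image into high-quality, clean, high-resolution result" \
  --output_dir results/ \
  --width 1024 --height 1024 --num_steps 50
```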
## 🪪 License

@@ -383,4 +454,4 @@ For any questions or inquiries, please reach out to us:
</details>


-</div>
+</div>
|