update model path
@@ -1,64 +1,86 @@
# HiDream-I1

`HiDream-I1` is a series of state-of-the-art open-source image generation models featuring a 16 billion parameter rectified flow transformer with Mixture of Experts architecture, designed to create high-quality images from text prompts.

## Project Updates
- ```2025/4/7```: We've open-sourced the text-to-image model **HiDream-I1**.

## Installation
Please make sure you have installed [Flash Attention](https://github.com/Dao-AILab/flash-attention). We recommend CUDA version 12.4 for the manual installation.
```
pip install -r requirements.txt
```

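
Before launching inference, it can help to confirm that the required packages are importable. The following is a minimal sketch of such a check, not part of the repository: the helper name `missing_modules` is hypothetical, and it assumes that Flash Attention installs under the module name `flash_attn` and that PyTorch (`torch`) is among the requirements.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names: `flash_attn` for Flash Attention, `torch` for PyTorch.
    missing = missing_modules(["flash_attn", "torch"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules found.")
```

The check only looks the modules up and does not import them, so it stays fast and does not initialize CUDA.
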
## Models

We offer both the full version and distilled models. For more information about each model, follow the link in the HuggingFace repo column.

| Name            | Script                                             | Inference Steps | HuggingFace repo |
| --------------- | -------------------------------------------------- | --------------- | ---------------- |
| HiDream-I1-Full | [inference.py](./inference.py)                     | 50              | 🤗 [HiDream-I1-Full](https://huggingface.co/HiDream-ai/HiDream-I1-Full) |
| HiDream-I1-Dev  | [inference_distilled.py](./inference_distilled.py) | 28              | 🤗 [HiDream-I1-Dev](https://huggingface.co/HiDream-ai/HiDream-I1-Dev) |
| HiDream-I1-Fast | [inference_distilled.py](./inference_distilled.py) | 16              | 🤗 [HiDream-I1-Fast](https://huggingface.co/HiDream-ai/HiDream-I1-Fast) |

## Quick Start
After installing the dependencies, run the inference scripts to generate images:

```bash
# Full model inference
python ./inference.py

# Distilled dev model inference
INFERENCE_STEP=28 PRETRAINED_MODEL_NAME_OR_PATH=HiDream-ai/HiDream-I1-Dev python inference_distilled.py

# Distilled fast model inference
INFERENCE_STEP=16 PRETRAINED_MODEL_NAME_OR_PATH=HiDream-ai/HiDream-I1-Fast python inference_distilled.py
```
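
If you prefer to launch these commands from Python, for example to loop over variants, the same environment variables can be set programmatically. This is only a sketch under stated assumptions: `VARIANTS`, `build_invocation`, and `run_variant` are hypothetical names, the script paths and variable values are copied from the commands above, and the scripts are assumed to be run from the repository root.

```python
import os
import subprocess
import sys

# Settings copied from the documented commands: (script, inference steps, model repo).
# The full-model command sets no overrides, so both values are None for it.
VARIANTS = {
    "full": ("./inference.py", None, None),
    "dev": ("inference_distilled.py", 28, "HiDream-ai/HiDream-I1-Dev"),
    "fast": ("inference_distilled.py", 16, "HiDream-ai/HiDream-I1-Fast"),
}


def build_invocation(variant, base_env=None):
    """Return (argv, env) for one variant without running anything."""
    script, steps, repo = VARIANTS[variant]
    env = dict(os.environ if base_env is None else base_env)
    if steps is not None:
        env["INFERENCE_STEP"] = str(steps)
    if repo is not None:
        env["PRETRAINED_MODEL_NAME_OR_PATH"] = repo
    return [sys.executable, script], env


def run_variant(variant):
    """Launch the script for one variant; raises if it exits with an error."""
    argv, env = build_invocation(variant)
    subprocess.run(argv, env=env, check=True)
```

Calling `run_variant("fast")` would then behave like the third shell command. Keeping `build_invocation` separate makes it possible to inspect the command and environment before anything is executed.
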
> **Note:** The inference scripts automatically download the `meta-llama/Meta-Llama-3.1-8B-Instruct` model files. If you encounter network issues, download these files ahead of time and place them in the appropriate cache directory to avoid download failures during inference.

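
One possible way to fetch the Llama files ahead of time is `snapshot_download` from the `huggingface_hub` package, which stores a repository snapshot in the local Hugging Face cache. The sketch below is an assumption-laden illustration rather than the project's own procedure: the helper name `prefetch` is hypothetical, it assumes the inference scripts read from the default Hugging Face cache, and since the Llama repository is gated you may first need to accept its license and log in or pass a token.

```python
LLAMA_REPO = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # repo named in the note above


def prefetch(repo_id=LLAMA_REPO, download=None, token=None):
    """Download a model snapshot into the local cache and return its path.

    `download` can be replaced with a stand-in function, which makes the
    helper usable in offline tests.
    """
    if download is None:
        # Imported lazily so this module loads even without huggingface_hub installed.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    return download(repo_id=repo_id, token=token)


if __name__ == "__main__":
    print("Snapshot stored at:", prefetch())
```

With `token=None`, `huggingface_hub` falls back to any credentials already saved on the machine.
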
## Evaluation Metrics

### DPG-Bench
| Model          | Overall   | Global | Entity | Attribute | Relation | Other |
| -------------- | --------- | ------ | ------ | --------- | -------- | ----- |
| PixArt-alpha   | 71.11     | 74.97  | 79.32  | 78.60     | 82.57    | 76.96 |
| SDXL           | 74.65     | 83.27  | 82.43  | 80.91     | 86.76    | 80.41 |
| DALL-E 3       | 83.50     | 90.97  | 89.61  | 88.39     | 90.58    | 89.83 |
| Flux.1-dev     | 83.79     | 85.80  | 86.79  | 89.98     | 90.04    | 89.90 |
| SD3-Medium     | 84.08     | 87.90  | 91.01  | 88.83     | 80.70    | 88.68 |
| Janus-Pro-7B   | 84.19     | 86.90  | 88.90  | 89.40     | 89.32    | 89.48 |
| CogView4-6B    | 85.13     | 83.85  | 90.35  | 91.17     | 91.14    | 87.29 |
| **HiDream-I1** | **85.89** | 76.44  | 90.22  | 89.48     | 93.74    | 91.83 |

### GenEval

| Model          | Overall  | Single Obj. | Two Obj. | Counting | Colors | Position | Color attribution |
| -------------- | -------- | ----------- | -------- | -------- | ------ | -------- | ----------------- |
| SDXL           | 0.55     | 0.98        | 0.74     | 0.39     | 0.85   | 0.15     | 0.23              |
| PixArt-alpha   | 0.48     | 0.98        | 0.50     | 0.44     | 0.80   | 0.08     | 0.07              |
| Flux.1-dev     | 0.66     | 0.98        | 0.79     | 0.73     | 0.77   | 0.22     | 0.45              |
| DALL-E 3       | 0.67     | 0.96        | 0.87     | 0.47     | 0.83   | 0.43     | 0.45              |
| CogView4-6B    | 0.73     | 0.99        | 0.86     | 0.66     | 0.79   | 0.48     | 0.58              |
| SD3-Medium     | 0.74     | 0.99        | 0.94     | 0.72     | 0.89   | 0.33     | 0.60              |
| Janus-Pro-7B   | 0.80     | 0.99        | 0.89     | 0.59     | 0.90   | 0.79     | 0.66              |
| **HiDream-I1** | **0.83** | 1.00        | 0.98     | 0.79     | 0.91   | 0.60     | 0.72              |

### HPSv2.1 benchmark

| Model                 | Averaged  | Animation | Concept-art | Painting | Photo |
| --------------------- | --------- | --------- | ----------- | -------- | ----- |
| Stable Diffusion v2.0 | 26.38     | 27.09     | 26.02       | 25.68    | 26.73 |
| Midjourney V6         | 30.29     | 32.02     | 30.29       | 29.74    | 29.10 |
| SDXL                  | 30.64     | 32.84     | 31.36       | 30.86    | 27.48 |
| DALL-E 3              | 31.44     | 32.39     | 31.09       | 31.18    | 31.09 |
| SD3                   | 31.53     | 32.60     | 31.82       | 32.06    | 29.62 |
| Midjourney V5         | 32.33     | 34.05     | 32.47       | 32.24    | 30.56 |
| CogView4-6B           | 32.31     | 33.23     | 32.60       | 32.89    | 30.52 |
| Flux.1-dev            | 32.47     | 33.87     | 32.27       | 32.62    | 31.11 |
| Stable Cascade        | 32.95     | 34.58     | 33.13       | 33.29    | 30.78 |
| **HiDream-I1**        | **33.82** | 35.05     | 33.74       | 33.88    | 32.61 |

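
In the HPSv2.1 table, the Averaged column appears to be the arithmetic mean of the four category scores. The sketch below checks that reading against the published numbers; the names `HPS_ROWS`, `mean_gap`, and `inconsistent_rows` are hypothetical, and the 0.01 tolerance is an assumption chosen to absorb two-decimal rounding.

```python
# Rows copied from the HPSv2.1 table:
# (averaged, animation, concept-art, painting, photo).
HPS_ROWS = {
    "Stable Diffusion v2.0": (26.38, 27.09, 26.02, 25.68, 26.73),
    "Midjourney V6": (30.29, 32.02, 30.29, 29.74, 29.10),
    "SDXL": (30.64, 32.84, 31.36, 30.86, 27.48),
    "DALL-E 3": (31.44, 32.39, 31.09, 31.18, 31.09),
    "SD3": (31.53, 32.60, 31.82, 32.06, 29.62),
    "Midjourney V5": (32.33, 34.05, 32.47, 32.24, 30.56),
    "CogView4-6B": (32.31, 33.23, 32.60, 32.89, 30.52),
    "Flux.1-dev": (32.47, 33.87, 32.27, 32.62, 31.11),
    "Stable Cascade": (32.95, 34.58, 33.13, 33.29, 30.78),
    "HiDream-I1": (33.82, 35.05, 33.74, 33.88, 32.61),
}


def mean_gap(row):
    """Absolute difference between the reported average and the category mean."""
    reported, *categories = row
    return abs(reported - sum(categories) / len(categories))


def inconsistent_rows(rows, tolerance=0.01):
    """Names of rows whose reported average differs from the mean by >= tolerance."""
    return [name for name, row in rows.items() if mean_gap(row) >= tolerance]


if __name__ == "__main__":
    print("Rows that disagree with the mean:", inconsistent_rows(HPS_ROWS))
```

For example, the HiDream-I1 row gives (35.05 + 33.74 + 33.88 + 32.61) / 4 = 33.82, matching the reported value.
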
## License