Update README.md

# Key Features

- **Open Source**: Full [model weights](https://huggingface.co/rhymes-ai/Allegro-TI2V) and [code](https://github.com/rhymes-ai/Allegro) available to the community, Apache 2.0!
- **Versatile Content Creation**: Capable of generating a wide range of content, from close-ups of humans and animals to diverse dynamic scenes.
- **Text-Image-to-Video Generation**: Generate videos from user-provided prompts and images. Supported input types include:
  - Generating subsequent video content from a user prompt and a first-frame image.

# Quick start

1. **Download the [Allegro GitHub code](https://github.com/rhymes-ai/Allegro).**

2. **Install the necessary requirements.**

   - Ensure Python >= 3.10, PyTorch >= 2.4, and CUDA >= 12.4. For details, see [requirements.txt](https://github.com/rhymes-ai/Allegro/blob/main/requirements.txt).
   - It is recommended to use Anaconda to create a new environment (Python >= 3.10) to run the following example; a quick version check is sketched below.
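
   A minimal check, run inside the new environment, to confirm the installed versions meet the thresholds above (the thresholds come from this README; the rest is standard PyTorch):

   ```python
   import torch

   # Expect PyTorch >= 2.4 built against CUDA >= 12.4.
   print("PyTorch:", torch.__version__)            # e.g. "2.4.0"
   print("CUDA:", torch.version.cuda)              # e.g. "12.4"
   print("GPU available:", torch.cuda.is_available())
   ```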

3. **Download the [Allegro-TI2V model weights](https://huggingface.co/rhymes-ai/Allegro-TI2V).**

4. **Run inference.**

   ```bash
   --seed 1427329220
   ```

   The output video resolution is fixed at 720 × 1280. Input images with different resolutions will be automatically cropped and resized to fit; a sketch of this preprocessing follows the argument table below.

   | Argument | Description |
   |------------------------|---------------------------------------------------------------------------------------------|
   | `--last_frame`         | [Optional] If provided, the model will generate intermediate video content based on the specified first and last frame images. |
   | `--enable_cpu_offload` | [Optional] Offload the model to the CPU to reduce GPU memory cost (about 9.3 GB, compared to 27.5 GB without offloading), at the price of significantly longer inference time. |
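
   As noted above, inputs at other resolutions are automatically cropped and resized to 720 × 1280. A minimal sketch of that kind of preprocessing using Pillow; the resize-then-center-crop strategy and the file names are assumptions for illustration, not the repository's exact logic:

   ```python
   from PIL import Image

   TARGET_W, TARGET_H = 1280, 720  # fixed output resolution (width x height)

   def fit_to_target(path: str) -> Image.Image:
       """Resize so the image covers 1280x720, then center-crop the excess."""
       img = Image.open(path).convert("RGB")
       scale = max(TARGET_W / img.width, TARGET_H / img.height)
       img = img.resize((round(img.width * scale), round(img.height * scale)))
       left = (img.width - TARGET_W) // 2
       top = (img.height - TARGET_H) // 2
       return img.crop((left, top, left + TARGET_W, top + TARGET_H))

   first_frame = fit_to_target("first_frame.png")  # hypothetical input file
   first_frame.save("first_frame_720x1280.png")
   ```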

5. **(Optional) Interpolate the video to 30 FPS.**

   - It is recommended to use [EMA-VFI](https://github.com/MCG-NJU/EMAVFI) to interpolate the video from 15 FPS to 30 FPS.
   - For better visual quality, you can use imageio to save the video, as sketched below.
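
   A minimal sketch of saving frames with imageio at 30 FPS; the frame array and output file name are placeholders, and writing MP4 assumes the `imageio[ffmpeg]` extra is installed:

   ```python
   import imageio.v2 as imageio
   import numpy as np

   # `frames` stands in for the interpolated output: HxWx3 uint8 arrays at 720x1280.
   frames = [np.zeros((720, 1280, 3), dtype=np.uint8) for _ in range(60)]

   # With the ffmpeg plugin, `quality` ranges 0-10; higher preserves more detail.
   imageio.mimwrite("output_30fps.mp4", frames, fps=30, quality=9)
   ```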