RhymesAI committed
Commit e0aa850 (verified) · Parent: 6cad2b3

Update README.md

Files changed (1): README.md (+8 −14)
README.md CHANGED

@@ -19,7 +19,7 @@ library_name: diffusers

# Key Feature

- - **Open Source**: Full [model weights](https://huggingface.co/rhymes-ai/Allegro) and [code](https://github.com/rhymes-ai/Allegro) available to the community, Apache 2.0!
+ - **Open Source**: Full [model weights](https://huggingface.co/rhymes-ai/Allegro-TI2V) and [code](https://github.com/rhymes-ai/Allegro) available to the community, Apache 2.0!
- **Versatile Content Creation**: Capable of generating a wide range of content, from close-ups of humans and animals to diverse dynamic scenes.
- **Text-Image-to-Video Generation**: Generate videos from user-provided prompts and images. Supported input types include:
  - Generating subsequent video content from a user prompt and first frame image.

@@ -87,17 +87,13 @@ library_name: diffusers

# Quick start

- 1. **Download the Allegro GitHub code.**
+ 1. **Download the [Allegro GitHub code](https://github.com/rhymes-ai/Allegro).**

2. **Install the necessary requirements.**
- 1. Ensure the following dependencies are met:
- - Python >= 3.10
- - PyTorch >= 2.4
- - CUDA >= 12.4
- For details, see `requirements.txt`.
- 2. It is recommended to use Anaconda to create a new environment (Python >= 3.10) for running the example.
+ - Ensure Python >= 3.10, PyTorch >= 2.4, CUDA >= 12.4. For details, see [requirements.txt](https://github.com/rhymes-ai/Allegro/blob/main/requirements.txt).
+ - It is recommended to use Anaconda to create a new environment (Python >= 3.10) to run the following example.

- 3. **Download the Allegro-TI2V model weights.**
+ 3. **Download the [Allegro-TI2V model weights](https://huggingface.co/rhymes-ai/Allegro-TI2V).**

4. **Run inference.**
   ```bash

@@ -113,9 +109,7 @@ library_name: diffusers

   --seed 1427329220
   ```

- The output video resolution is fixed at **720 × 1280**. Input images with different resolutions will be automatically cropped and resized to fit.
-
- ### Arguments and Descriptions
+ The output video resolution is fixed at 720 × 1280. Input images with different resolutions will be automatically cropped and resized to fit.

| Argument | Description |
|----------------------|---------------------------------------------------------------------------------------------------|

@@ -124,10 +118,10 @@ library_name: diffusers

| `--last_frame` | [Optional] If provided, the model will generate intermediate video content based on the specified first and last frame images. |
| `--enable_cpu_offload` | [Optional] Offload the model into CPU for less GPU memory cost (about 9.3G, compared to 27.5G if CPU offload is not enabled), but the inference time will increase significantly. |

- ### (Optional) Interpolate the video to 30 FPS
+ 5. **(Optional) Interpolate the video to 30 FPS**

- It is recommended to use [EMA-VFI](https://github.com/MCG-NJU/EMAVFI) to interpolate the video from 15 FPS to 30 FPS.
- - For better visual quality, you can use `imageio` to save the video.
+ - For better visual quality, you can use imageio to save the video.
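Step 3 of the updated quick start can also be scripted. Below is a minimal sketch using `huggingface_hub`; the `rhymes-ai/Allegro-TI2V` repo id comes from the README, while the `local_dir` path is an arbitrary choice (a git clone of the Hub repo works just as well).

```python
# Minimal sketch: fetch the Allegro-TI2V weights from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; the local_dir path is arbitrary.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="rhymes-ai/Allegro-TI2V",
    local_dir="./Allegro-TI2V",
)
```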
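The note that inputs are automatically cropped and resized to the fixed 720 × 1280 output suggests an aspect-preserving fit. Below is an illustrative sketch of one common approach (resize to cover, then center crop) using Pillow; the file names are hypothetical and the repository's actual preprocessing may differ.

```python
# Illustrative only: resize-then-center-crop fit to 720 x 1280 (H x W).
# The Allegro repo's actual input preprocessing may differ.
import math
from PIL import Image

TARGET_H, TARGET_W = 720, 1280

def fit_to_target(img: Image.Image) -> Image.Image:
    # Scale so the image fully covers the target while keeping aspect ratio.
    scale = max(TARGET_W / img.width, TARGET_H / img.height)
    resized = img.resize(
        (math.ceil(img.width * scale), math.ceil(img.height * scale)),
        Image.LANCZOS,
    )
    # Crop the overshoot symmetrically around the center.
    left = (resized.width - TARGET_W) // 2
    top = (resized.height - TARGET_H) // 2
    return resized.crop((left, top, left + TARGET_W, top + TARGET_H))

fit_to_target(Image.open("first_frame.png")).save("first_frame_fitted.png")
```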
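For step 5, EMA-VFI does the actual 15 FPS to 30 FPS interpolation; the sketch below only shows the surrounding read/write plumbing with imageio, using naive frame duplication as a stand-in for genuinely interpolated frames. File names are hypothetical, and mp4 support requires the imageio-ffmpeg plugin.

```python
# Sketch of imageio read/write plumbing around frame-rate doubling.
# Duplicating each frame is a stand-in; EMA-VFI would instead insert real
# interpolated frames between pairs. Requires: pip install imageio[ffmpeg]
import imageio

reader = imageio.get_reader("allegro_output_15fps.mp4")
writer = imageio.get_writer("allegro_output_30fps.mp4", fps=30)

for frame in reader:
    writer.append_data(frame)  # original frame
    writer.append_data(frame)  # stand-in for an interpolated frame

writer.close()
reader.close()
```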