Commit 0e6dd2c by daiqi (verified) · 1 Parent(s): d14a57d

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
  # Reducio-VAE Model Card
 
  <!-- Provide a quick summary of what the model is/does. -->
- This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling \\(4096\times\\) downsampling.
+ This model is a 3D VAE that encodes video into a compact latent space conditioned on a content frame. It compresses a video by a factor of \\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\), enabling 4096x downsampling.
  It is part of the [Reducio-DiT](https://arxiv.org/abs/xxxx), which is a video generation method. Codebase available [here](https://github.com/microsoft/Reducio-VAE).
 
 
@@ -53,7 +53,7 @@ Metrics on 1K Pexels validation set and UCF-101:
 
  |Method|Downsample Factor|\\(\|z\|\\)|PSNR |SSIM |LPIPS |rFVD (Pexels)|rFVD (UCF-101)|
  |---------|---------------------|------------------|------------|--------------------|--------------|----------------|------------|
- |SD2.1-VAE|\\(1\times8\times8\\)|4|29.23|0.82|0.09|25.96|21.00|
+ |SD2.1-VAE|\\(\frac{T}{4}\times\frac{H}{32}\times\frac{W}{32}\\)|4|29.23|0.82|0.09|25.96|21.00|
  |SDXL-VAE|\\(1\times8\times8\\)|16|30.54|0.85|0.08|19.87|23.68|
  |OmniTokenizer|\\(4\times8\times8\\)|8|27.11|0.89|0.07|23.88|30.52|
  |OpenSora-1.2|\\(4\times8\times8\\)|16|30.72|0.85|0.11|60.88|67.52|
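
The 4096x figure in the changed summary line follows from the stated factors: 4 along time and 32 along each spatial axis give 4 × 32 × 32 = 4096 fewer latent positions than input pixels positions. The sketch below only checks that arithmetic. The helper names and the 16 × 256 × 256 clip size are hypothetical, and it counts grid positions only, ignoring latent channels and the content frame the model is conditioned on.

```python
def latent_shape(frames, height, width, t_factor=4, s_factor=32):
    """Latent grid (T, H, W) under T/4 x H/32 x W/32 compression.

    Hypothetical helper for illustration; not part of the Reducio-VAE codebase.
    """
    if frames % t_factor or height % s_factor or width % s_factor:
        raise ValueError("dimensions must be divisible by the downsample factors")
    return frames // t_factor, height // s_factor, width // s_factor


def downsample_ratio(t_factor=4, s_factor=32):
    """Ratio of input positions to latent positions."""
    return t_factor * s_factor * s_factor


# Assumed example clip: 16 frames at 256x256 (not taken from the model card).
video = (16, 256, 256)
latent = latent_shape(*video)

print(latent)              # (4, 8, 8)
print(downsample_ratio())  # 4096
```

By the same count, a \\(4\times8\times8\\) tokenizer from the table reduces positions by 4 × 8 × 8 = 256.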