This is a finetuned version of the LTX Video 0.9 VAE that attempts to reduce the checkerboard artifacts common in the original model. Most of the finetuning was done on the decoder only, to keep the latent space from changing. After that, the encoder alone was trained for a limited time with the decoder frozen, to further reduce artifacts while changing the latent space only minimally.
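The two-stage schedule described above can be sketched in PyTorch as follows. This is only an illustration of the freezing pattern, not the actual training code: `TinyVAE`, `set_trainable`, `train_step`, the MSE loss, and the learning rates are all hypothetical stand-ins chosen for brevity.

```python
import torch
from torch import nn


class TinyVAE(nn.Module):
    """Hypothetical stand-in model; NOT the LTX Video VAE architecture."""

    def __init__(self):
        super().__init__()
        # Strided convolution downsamples 16x16 -> 8x8.
        self.encoder = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
        self.decoder = nn.Sequential(
            nn.Conv2d(8, 12, kernel_size=3, padding=1),
            nn.PixelShuffle(2),  # 12 channels at 8x8 -> 3 channels at 16x16
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def set_trainable(module, trainable):
    """Enable or disable gradients for every parameter of a module."""
    for p in module.parameters():
        p.requires_grad_(trainable)


def train_step(vae, optimizer, batch):
    """One reconstruction step; the MSE loss is an assumption for the sketch."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(vae(batch), batch)
    loss.backward()
    optimizer.step()
    return loss.item()


def snapshot(module):
    """Copy the current parameter values so they can be compared later."""
    return [p.detach().clone() for p in module.parameters()]


def same(before, module):
    return all(torch.equal(a, b) for a, b in zip(before, module.parameters()))


torch.manual_seed(0)
vae = TinyVAE()
batch = torch.rand(2, 3, 16, 16)

# Stage 1: train the decoder only, so the latent space cannot change.
set_trainable(vae.encoder, False)
set_trainable(vae.decoder, True)
encoder_before = snapshot(vae.encoder)
opt = torch.optim.AdamW(vae.decoder.parameters(), lr=1e-3)
train_step(vae, opt, batch)
encoder_unchanged = same(encoder_before, vae.encoder)

# Stage 2: train the encoder only, with the decoder frozen.
set_trainable(vae.encoder, True)
set_trainable(vae.decoder, False)
decoder_before = snapshot(vae.decoder)
opt = torch.optim.AdamW(vae.encoder.parameters(), lr=1e-4)
train_step(vae, opt, batch)
decoder_unchanged = same(decoder_before, vae.decoder)

print(encoder_unchanged, decoder_unchanged)
```

In stage 2 the gradient still flows through the frozen decoder to reach the encoder; only the decoder's own weights are excluded from the update.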

The finetuning was partially successful: it reduced the strength of the artifacts but did not eliminate them entirely. Unfortunately, they may be very difficult to eliminate completely, due to the use of strided convolutions in the encoder and pixel shuffle upscaling in the decoder. (See this article for an explanation.)
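As a rough illustration of why pixel shuffle upscaling is prone to this kind of pattern, the pure-Python sketch below interleaves four feature maps into one image at twice the resolution. Each position inside a 2x2 output block comes from a different channel, so a small mismatch between channels shows up as a grid with a period of two pixels. The `pixel_shuffle` helper and the offset values are made up for the example; it is a toy model of the mechanism, not an analysis of the LTX Video decoder.

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size HxW into one (H*r)x(W*r) map.

    Follows the index convention of torch.nn.PixelShuffle for a single
    output channel: out[h*r + i][w*r + j] = channels[i*r + j][h][w].
    """
    assert len(channels) == r * r
    height, width = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (width * r) for _ in range(height * r)]
    for i in range(r):
        for j in range(r):
            for h in range(height):
                for w in range(width):
                    out[h * r + i][w * r + j] = channels[i * r + j][h][w]
    return out


# Four spatially flat 2x2 feature maps whose values differ slightly per
# channel (hypothetical numbers standing in for mismatched filters).
offsets = [0.50, 0.52, 0.48, 0.50]
channels = [[[v] * 2 for _ in range(2)] for v in offsets]

image = pixel_shuffle(channels, 2)
for row in image:
    print(row)
```

Even though every input feature map is perfectly flat, the output alternates between values along both axes, which is the repeating structure seen as a checkerboard.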

Two versions of the model are included: one with the finetuned decoder and the original encoder, and one with the same finetuned decoder plus the finetuned encoder. Using the finetuned encoder will slightly change the results of videos generated with i2v, but it remains compatible with the diffusion model.

The license is unchanged from the original. I will update the license and release the training code once Lightricks follows through on its promise to release the model under a commercially permissive license.

Comparison video (download)
