Hi
In our latest paper we show that the GAN loss used by almost all latent diffusion models to train their autoencoders is not required and can instead be replaced with a diffusion loss. Our autoencoder is trained end-to-end and achieves higher compression and better generation quality.
I am excited to share it with you. Let me know what you think.
Cheers
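To make the idea concrete, here is a toy numpy sketch of what "replace the GAN loss with a diffusion loss" can look like on the decoder side. This is my own illustration, not the paper's actual model: the linear encoder, the `predict_noise` placeholder, the single fixed noise level, and all shapes are made up for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w_enc):
    # Toy linear "encoder": image -> low-dimensional latent.
    return x @ w_enc

def diffusion_decoder_loss(x, z, predict_noise, sigma=0.5):
    # Noise-prediction (denoising) objective: corrupt the image and ask
    # the decoder to predict the injected noise, conditioned on latent z.
    eps = rng.standard_normal(x.shape)
    x_noisy = x + sigma * eps
    eps_hat = predict_noise(x_noisy, z)
    return float(np.mean((eps_hat - eps) ** 2))

# Toy data: 8 "images" of 16 pixels each, compressed to 4-dim latents.
x = rng.standard_normal((8, 16))
w_enc = rng.standard_normal((16, 4)) / 4.0
z = encode(x, w_enc)

# Placeholder noise predictor (a real one would be a conditional network);
# predicting zero noise gives a loss near E[eps^2] = 1.
loss = diffusion_decoder_loss(x, z, lambda x_noisy, z: np.zeros_like(x_noisy))
print(f"diffusion loss: {loss:.3f}")
```

Unlike a GAN loss, this objective needs no discriminator, so the whole autoencoder can be trained end-to-end with a single regression-style loss.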
I just saw https://hanlab.mit.edu/projects/hart
It seems to be another autoencoder (autoregressive) + diffusion approach.
This is very interesting. Unlike us (who focus on the decoder), they focus on changing the representation itself so that they can achieve better generation. Thanks for the link.
They use an autoencoder/autoregressive model to predict the big picture, and diffusion for the details, similar to yours. The difference is that they use discrete tokens.
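A minimal numpy sketch of that split, i.e. discrete tokens carrying the big picture while a residual is left for diffusion to fill in. The uniform scalar quantizer here is my own stand-in for HART's learned tokenizer, purely to show the decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(x, levels=4):
    # Uniform scalar quantizer: map each value to the nearest of `levels`
    # bin centers. Returns discrete token ids and their reconstructions.
    lo, hi = x.min(), x.max()
    idx = np.rint((x - lo) / (hi - lo) * (levels - 1)).astype(int)
    centers = lo + idx / (levels - 1) * (hi - lo)
    return idx, centers

x = rng.standard_normal((4, 16))   # toy "images"
tokens, coarse = quantize(x)       # an AR model would predict `tokens`
residual = x - coarse              # a diffusion model would add this detail

# The residual is bounded by half a quantization step, so the diffusion
# stage only has to capture fine detail, not global structure.
step = (x.max() - x.min()) / 3     # bin width for levels - 1 = 3
print(f"max residual: {np.abs(residual).max():.3f}, half step: {step / 2:.3f}")
```

The point of the decomposition is that the discrete tokens are cheap to model autoregressively, while the bounded residual is an easy target for a small diffusion model.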