We present Diffusion in Style, a simple method to adapt Stable Diffusion to any desired style using only a small set of target images. It is based on the key observation that the style of the images generated by Stable Diffusion is tied to the initial latent tensor. Without adapting this initial latent tensor to the style, fine-tuning is slow, expensive, and impractical, especially when only a few target style images are available. In contrast, fine-tuning is much easier if this initial latent tensor is also adapted. Diffusion in Style is orders of magnitude more sample-efficient and faster. It also generates more pleasing images than existing approaches, as shown by qualitative and quantitative comparisons.
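The abstract does not say how the initial latent tensor is adapted. One plausible reading, given here only as a minimal sketch, is to replace the standard normal initial noise with noise whose element-wise mean and standard deviation are estimated from the latents of the target style images. The function names `fit_style_noise` and `sample_initial_latent`, the flattened latent layout, and the synthetic stand-in data are all assumptions made for illustration and are not taken from the paper.

```python
import random
import statistics


def fit_style_noise(style_latents, eps=1e-6):
    """Estimate element-wise mean and std from a list of flattened latents.

    Assumption: each latent is a flat list of floats of equal length,
    e.g. an encoded style image flattened from (C, H, W).
    """
    columns = list(zip(*style_latents))  # one tuple per latent element
    mu = [statistics.fmean(col) for col in columns]
    # eps keeps sigma strictly positive even if an element is constant
    sigma = [statistics.pstdev(col) + eps for col in columns]
    return mu, sigma


def sample_initial_latent(mu, sigma, rng):
    """Draw an initial latent from N(mu, sigma^2) instead of N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)

# Hypothetical stand-in for encoded style images: 200 latents of 16 elements,
# drawn with mean 0.5 and std 2.0 so they differ from standard normal noise.
style_latents = [
    [0.5 + 2.0 * rng.gauss(0.0, 1.0) for _ in range(16)]
    for _ in range(200)
]

mu, sigma = fit_style_noise(style_latents)
z0 = sample_initial_latent(mu, sigma, rng)
```

In a real pipeline, `z0` would be reshaped to the latent shape the denoiser expects and used as the starting point of sampling. Whether the statistics should be element-wise, per-channel, or global is a design choice this sketch does not settle.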
Rakesh Chawla, Andrea Rizzi, Matthias Finger, Federica Legger, Matteo Galli, Sun Hee Kim, Jian Zhao, João Miguel das Neves Duarte, Tagir Aushev, Hua Zhang, Alexis Kalogeropoulos, Yixing Chen, Tian Cheng, Ioannis Papadopoulos, Gabriele Grosso, Valérie Scheurer, Meng Xiao, Qian Wang, Michele Bianco, Varun Sharma, Joao Varela, Sourav Sen, Ashish Sharma, Seungkyu Ha, David Vannerom, Csaba Hajdu, Sanjeev Kumar, Sebastiana Gianì, Kun Shi, Abhisek Datta, Siyuan Wang, Anton Petrov, Jian Wang, Yi Zhang, Muhammad Ansar Iqbal, Yong Yang, Xin Sun, Muhammad Ahmad, Donghyun Kim, Matthias Wolf, Anna Mascellani, Paolo Ronchese
Alessandro Mapelli, Radoslav Marchevski, Alina Kleimenova
Rachid Guerraoui, Anne-Marie Kermarrec, Sadegh Farhadkhani, Rafael Pereira Pires, Rishi Sharma, Marinus Abraham de Vos