ReVersion: Diffusion-Based Relation Inversion from Images

Authors: Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C. K. Chan, Ziwei Liu


Diffusion models gain increasing popularity for their generative capabilities. Recently, there have been surging needs to generate customized images by inverting diffusion models from exemplar images. However, existing inversion methods mainly focus on capturing object appearances. How to invert object relations, another important pillar in the visual world, remains unexplored. In this work, we propose ReVersion for the Relation Inversion task, which aims to learn a specific relation (represented as "relation prompt") from exemplar images. Specifically, we learn a relation prompt from a frozen pre-trained text-to-image diffusion model. The learned relation prompt can then be applied to generate relation-specific images with new objects, backgrounds, and styles. Our key insight is the "preposition prior" – real-world relation prompts can be sparsely activated upon a set of basis prepositional words. Specifically, we propose a novel relation-steering contrastive learning scheme to impose two critical properties of the relation prompt: 1) The relation prompt should capture the interaction between objects, enforced by the preposition prior. 2) The relation prompt should be disentangled away from object appearances. We further devise relation-focal importance sampling to emphasize high-level interactions over low-level appearances (e.g., texture, color). To comprehensively evaluate this new task, we contribute ReVersion Benchmark, which provides various exemplar images with diverse relations. Extensive experiments validate the superiority of our approach over existing methods across a wide range of visual relations.
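The relation-steering idea described above can be illustrated with a minimal InfoNCE-style contrastive loss: the learned relation embedding is pulled toward a basis of prepositional-word embeddings (positives) and pushed away from object/appearance-word embeddings (negatives). The function name `steering_contrastive_loss`, the toy 2-D embeddings, and the temperature value below are illustrative assumptions, not the authors' implementation, which operates on text-encoder token embeddings inside the diffusion model.

```python
import numpy as np

def steering_contrastive_loss(relation_emb, positive_embs, negative_embs,
                              temperature=0.07):
    """InfoNCE-style sketch of relation steering (illustrative, not the
    paper's exact loss): positives are basis preposition embeddings,
    negatives are object/appearance word embeddings."""
    def cos(a, b):
        # cosine similarity between two embedding vectors
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    pos = np.array([cos(relation_emb, p) for p in positive_embs]) / temperature
    neg = np.array([cos(relation_emb, n) for n in negative_embs]) / temperature
    logits = np.concatenate([pos, neg])
    # Positives in the numerator, all candidates in the denominator:
    # minimizing this pulls the relation embedding toward the preposition
    # basis and away from appearance words.
    return -(np.log(np.sum(np.exp(pos))) - np.log(np.sum(np.exp(logits))))

# Toy usage: an embedding aligned with the preposition basis yields a
# lower loss than one aligned with the appearance negatives.
pos = [np.array([1.0, 0.0]), np.array([0.8, 0.2])]
neg = [np.array([0.0, 1.0]), np.array([0.1, 0.9])]
loss_aligned = steering_contrastive_loss(np.array([1.0, 0.0]), pos, neg)
loss_misaligned = steering_contrastive_loss(np.array([0.0, 1.0]), pos, neg)
```

In the actual method this objective would be combined with the standard diffusion denoising loss (with the relation-focal timestep sampling mentioned above) while the rest of the model stays frozen.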


