PadInv

Qingyan Bai*¹ Yinghao Xu*² Jiapeng Zhu³ Weihao Xia⁴ Yujiu Yang¹ Yujun Shen⁵

¹ Tsinghua University ² CUHK ³ HKUST ⁴ UCL ⁵ ByteDance Inc.

Overview

This work targets high-fidelity GAN inversion with better recovered spatial details. To this end, we propose to use the padding space of the generator to complement the native latent space. Concretely, we replace the constant padding (usually zeros) used in convolution layers with instance-aware coefficients. In this way, the inductive bias assumed in the pre-trained model can be adapted to fit each individual image. We demonstrate that this space extension allows more flexible image manipulation, such as separate control of face contour and facial details, and enables a novel way of editing in which users can customize their own manipulations highly efficiently.
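To make the padding replacement concrete, here is a minimal sketch in plain Python of the difference between constant padding and instance-aware padding on a single-channel feature map. It is an illustration under assumptions, not the released implementation: the function names `pad_constant` and `pad_instance_aware` are hypothetical, and the `coefficients` argument stands in for whatever per-image values the method supplies (how they are obtained is not shown here).

```python
# Sketch only: contrasts constant padding with instance-aware padding.
# Function names and the `coefficients` layout are assumptions for illustration.

def pad_constant(feature, pad=1, value=0.0):
    """Surround `feature` (a list of rows) with a border of one fixed constant."""
    width = len(feature[0]) + 2 * pad
    padded = [[value] * width for _ in range(pad)]            # top border
    for row in feature:
        padded.append([value] * pad + list(row) + [value] * pad)
    padded.extend([value] * width for _ in range(pad))        # bottom border
    return padded


def pad_instance_aware(feature, coefficients, pad=1):
    """Surround `feature` with a border taken from per-image `coefficients`.

    `coefficients` has the padded size (H + 2*pad, W + 2*pad). Only its border
    entries are used; the interior is overwritten by `feature`.
    """
    height, width = len(feature), len(feature[0])
    if len(coefficients) != height + 2 * pad or any(
        len(row) != width + 2 * pad for row in coefficients
    ):
        raise ValueError("coefficients must match the padded feature size")
    padded = [list(row) for row in coefficients]              # copy, keep input intact
    for i in range(height):
        for j in range(width):
            padded[i + pad][j + pad] = feature[i][j]
    return padded
```

With constant padding every image sees the same border, whereas in the second function the border differs per image, which is the sense in which the padding becomes an additional space to invert into.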

Results

Our method allows (a) high-fidelity GAN inversion with better spatial details and enables two novel applications: (b) face blending and (c) customizing manipulations with one image pair.

More inversion results on human faces, outdoor churches, and indoor bedrooms.

More face blending results by borrowing face contour from one image and facial details from another.

More customized editing results, where each manipulation is defined by a single image pair. Such a pair can be created highly efficiently, either with a graphics editor (like Photoshop) or by simple copy-and-paste.

BibTeX

@inproceedings{bai2022high,
  title={High-fidelity GAN inversion with padding space},
  author={Bai, Qingyan and Xu, Yinghao and Zhu, Jiapeng and Xia, Weihao and Yang, Yujiu and Shen, Yujun},
  booktitle={European Conference on Computer Vision},
  pages={36--53},
  year={2022},
  organization={Springer}
}

Related Work

Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy. Positional Encoding as Spatial Inductive Bias in GANs. CVPR, 2021.
Comment: Argues that positional encoding is indispensable for generating high-fidelity images and that zero padding alone is not sufficient.

Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or. Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation. CVPR, 2021.
Comment: Proposes an encoder for image translation tasks.

Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou. In-Domain GAN Inversion for Real Image Editing. ECCV, 2020.
Comment: Proposes an in-domain GAN inversion approach for real image editing.

Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or. Designing an Encoder for StyleGAN Image Manipulation. TOG, 2021.
Comment: Proposes an encoder with better editability by encouraging the latent codes to follow the native latent distribution.

ECCV 2022