MegaPortraits: One-shot Megapixel Neural Head Avatars

ACMM 2022


In this work, we advance the neural head avatar technology to the megapixel resolution while focusing on the particularly challenging task of cross-driving synthesis, i.e., when the appearance of the driving image is substantially different from the animated source image. We propose a set of new neural architectures and training methods that can leverage both medium-resolution video data and high-resolution image data to achieve the desired levels of rendered image quality and generalization to novel views and motion. We show that suggested architectures and methods produce convincing high-resolution neural avatars, outperforming the competitors in the cross-driving scenario. Lastly, we show how a trained high- resolution neural avatar model can be distilled into a lightweight student model which runs in real-time and locks the identities of neural avatars to several dozens of pre-defined source images. Real-time operation and identity lock are essential for many practical applications head avatar systems.

Main scheme

We propose a system for the one-shot creation of high-resolution human avatars, called megapixel portraits or MegaPortraits for short. Our model is trained in two stages. Optionally, we propose an additional distillation stage for faster inference.

Our training setup is relatively standard. We sample two random frames from our dataset at each step: the source frame and the driver frame. Our model imposes the motion of the driving frame (i.e., the head pose and the facial expression) onto the appearance of the source frame to produce an output image. The main learning signal is obtained from the training episodes where the source and the driver frames come from the same video, and hence our model’s prediction is trained to match the driver frame.


                author = {Drobyshev, Nikita and Chelishev, Jenya and Khakhulin, Taras and Ivakhnenko, Aleksei and Lempitsky, Victor and Zakharov, Egor},
                title = {MegaPortraits: One-shot Megapixel Neural Head Avatars},
                journal   = {Proceedings of the 30th ACM International Conference on Multimedia},
                year      = {2022},