MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video

Samsung AI Center
arXiv pdf | arXiv abs | No plans for the code :(

From a short monocular video, we create fullbody avatars that support novel camera views and body poses.
MoRF avatars run at 30+ FPS at 640x640 px on modern mobile devices with Qualcomm chips (e.g. Android phones).

Demo mobile app

Method TL;DR

Mesh fitting
Given a short monocular video, we first fit an SMPL-X mesh to it. We build on SMPLify-X and add 1) a body shape shared across frames, 2) a silhouette loss, 3) a temporal loss, and 4) a heuristic to remove badly fitted frames.
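The extra fitting terms can be illustrated with a minimal sketch. This page does not give the exact form of the losses or of the frame-removal heuristic, so everything below is an assumption made for illustration: `temporal_loss` penalizes pose changes between consecutive frames, `silhouette_loss` is one minus the IoU between a rendered mesh mask and a segmentation mask, and `keep_well_fitted` drops frames whose IoU falls below a hypothetical threshold `min_iou`. All function names are hypothetical and are not taken from SMPLify-X or from our code.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # binary mask as nested rows of 0/1


def temporal_loss(poses: Sequence[Sequence[float]]) -> float:
    """Mean squared difference between pose vectors of consecutive frames.

    Assumed form: the page only says that a temporal loss is used.
    """
    if len(poses) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(poses, poses[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev, cur)) / len(cur)
    return total / (len(poses) - 1)


def silhouette_iou(rendered: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal size."""
    inter = 0
    union = 0
    for row_r, row_t in zip(rendered, target):
        for r, t in zip(row_r, row_t):
            inter += 1 if (r and t) else 0
            union += 1 if (r or t) else 0
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def silhouette_loss(rendered: Mask, target: Mask) -> float:
    """Assumed form: 1 - IoU, so a perfect overlap gives zero loss."""
    return 1.0 - silhouette_iou(rendered, target)


def keep_well_fitted(ious: Sequence[float], min_iou: float = 0.8) -> List[int]:
    """Indices of frames whose fit is good enough to keep.

    Thresholding on IoU is a stand-in for the actual heuristic,
    and 0.8 is a hypothetical value.
    """
    return [i for i, iou in enumerate(ious) if iou >= min_iou]
```

For example, `keep_well_fitted([0.9, 0.5, 0.85])` keeps frames 0 and 2. The shared body shape would correspond to optimizing a single shape vector for the whole video while poses stay per-frame; that part is not shown here.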

Rendering network training
Given the SMPL-X fits and segmentation masks, we train a DNR-based rendering network. Our novelty: for each training frame we overfit a warping field in UV space (shown as (b) in the main scheme figure) to compensate for mesh fit misalignment between frames.
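As a rough illustration of what a UV-space warp does, the sketch below samples a texture at a UV coordinate shifted by a per-frame offset. It is a simplified stand-in, not our implementation: the texture is single-channel instead of a multi-channel neural texture, the offset is a fixed number instead of a learned per-frame field, and the names `bilinear_sample` and `sample_warped` are hypothetical.

```python
from typing import List, Tuple

Texture = List[List[float]]  # H x W, single channel for brevity


def bilinear_sample(tex: Texture, u: float, v: float) -> float:
    """Sample tex at continuous UV coordinates in [0, 1] x [0, 1]."""
    h, w = len(tex), len(tex[0])
    # Clamp so that warped coordinates never leave the texture.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    x = u * (w - 1)
    y = v * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = tex[y0][x0] * (1 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1 - fx) + tex[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def sample_warped(tex: Texture,
                  uv: Tuple[float, float],
                  offset: Tuple[float, float]) -> float:
    """Look up the texture at uv shifted by a per-frame warp offset.

    In training, the offset would be a learnable value for each UV
    location and each frame; here it is passed in as a constant.
    """
    du, dv = offset
    return bilinear_sample(tex, uv[0] + du, uv[1] + dv)
```

With a zero offset the lookup is the usual texture sampling. A non-zero offset lets the same texel land on a slightly different surface point in a given frame, which is how a per-frame warp can absorb small mesh fit errors without blurring the shared texture.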

Neural texture warping effect


Failure cases


  title={MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video}, 
  author={Larionov, Alexey and Ustinova, Evgeniya and Sidorenko, Mikhail and Svitov, David and Zakharkin, Ilya and Lempitsky, Victor and Bashirov, Renat},
  journal={arXiv preprint arXiv:2303.10275},