MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video

Alexey Larionov, Evgeniya Ustinova, Mikhail Sidorenko, David Svitov, Ilya Zakharkin, Victor Lempitsky, Renat Bashirov

Samsung AI Center

arXiv pdf | arXiv abs | No plans to release the code :(

From short monocular videos, we create fullbody avatars that support novel camera and body poses.
MoRF avatars render at 30+ FPS at 640x640 px on modern mobile devices with Qualcomm chipsets (e.g. Android phones).

Demo mobile app

Method TL;DR

Mesh fitting
Given a short monocular video, we first fit an SMPL-X mesh to it. We build on SMPLify-X and add 1) a shared body shape, 2) a silhouette loss, 3) a temporal loss, and 4) a heuristic to remove badly fitted frames.
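The four additions to SMPLify-X listed above can be illustrated with a minimal sketch. The page does not give the loss formulas, so everything here is an assumption made for illustration: the mean-squared forms of the losses, the weights `w_sil` and `w_temp`, the median-based rejection rule, and the names `SequenceFit`, `sequence_loss`, and `filter_bad_frames`. Masks are plain nested lists, and rendering the mesh into a mask is left out.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class SequenceFit:
    """Hypothetical container: one shape vector shared by all frames, one pose per frame."""
    shared_betas: List[float]
    poses: List[List[float]]


def silhouette_loss(rendered_mask, target_mask):
    """Mean squared difference between two same-sized masks (assumed form)."""
    total, count = 0.0, 0
    for r_row, t_row in zip(rendered_mask, target_mask):
        for r, t in zip(r_row, t_row):
            total += (r - t) ** 2
            count += 1
    return total / count


def temporal_loss(poses):
    """Mean squared change of pose parameters between consecutive frames (assumed form)."""
    total, count = 0.0, 0
    for prev, cur in zip(poses, poses[1:]):
        for a, b in zip(prev, cur):
            total += (b - a) ** 2
            count += 1
    return total / count if count else 0.0


def sequence_loss(fit, rendered_masks, target_masks, w_sil=1.0, w_temp=0.1):
    """Weighted sum of the average silhouette loss and the temporal loss."""
    sil = [silhouette_loss(r, t) for r, t in zip(rendered_masks, target_masks)]
    return w_sil * sum(sil) / len(sil) + w_temp * temporal_loss(fit.poses)


def filter_bad_frames(per_frame_losses, factor=3.0):
    """Keep indices of frames whose loss is at most `factor` times the median loss."""
    threshold = factor * median(per_frame_losses)
    return [i for i, loss in enumerate(per_frame_losses) if loss <= threshold]
```

In this sketch the shared body shape is expressed only through the data layout: `shared_betas` is stored once, while `poses` holds one entry per frame, so an optimizer over `SequenceFit` cannot give different frames different body shapes.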

Rendering network training
Given the SMPL-X fits and segmentation masks, we train a DNR-based rendering network. Our novelty is that, for each training frame, we overfit a warping field in UV space (shown as (b) in the main scheme figure) to compensate for the misalignment of mesh fits between frames.
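To make the idea of a per-frame warping field in UV space more concrete, here is a minimal sketch of resampling a texture with a field of 2D offsets. It is an illustration under stated assumptions, not the authors' implementation: the texture is a single-channel nested list rather than a multi-channel neural texture, offsets are in texel units, sampling is bilinear with clamping at the borders, and the names `bilinear_sample` and `warp_texture` are hypothetical. In training, one such offset field per frame would be optimized together with the network, which this sketch does not show.

```python
def bilinear_sample(texture, u, v):
    """Sample a 2D texture at continuous texel coordinates (u = x, v = y)."""
    h, w = len(texture), len(texture[0])
    # Clamp coordinates so that sampling never leaves the texture.
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = u - x0, v - y0
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_texture(texture, warp_field):
    """Return a texture where output[y][x] is read from (x + dx, y + dy).

    `warp_field[y][x]` is a (dx, dy) offset for that texel; an all-zero
    field leaves the texture unchanged.
    """
    h, w = len(texture), len(texture[0])
    return [
        [
            bilinear_sample(texture, x + warp_field[y][x][0], y + warp_field[y][x][1])
            for x in range(w)
        ]
        for y in range(h)
    ]
```

A small offset field shifts where each texel is read from, which is how a per-frame field can absorb a slightly misplaced mesh fit without changing the texture shared by all frames.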

Neural texture warping effect

Suppmat

Failure cases

Only SMPL-X-like geometry is rendered well; loose clothing and other geometric details are not.

Not enough data & surface area to learn

No prior for areas not seen during training

BibTeX

@article{morf2023larionov,
  title={MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video}, 
  author={Larionov, Alexey and Ustinova, Evgeniya and Sidorenko, Mikhail and Svitov, David and Zakharkin, Ilya and Lempitsky, Victor and Bashirov, Renat},
  journal={arXiv preprint arXiv:2303.10275},
  year={2023}
}