HeadEvolver: Text to Head Avatars via Locally Learnable Mesh Deformation

Duotun Wang*¹, Hengyu Meng*¹,³, Zeyu Cai*¹, Zhijing Shao¹, Qianxi Liu¹, Lin Wang¹,⁴, Mingming Fan¹,⁴, Ying Shan², Xiaohang Zhan², Zeyu Wang¹,⁴

The Hong Kong University of Science and Technology (Guangzhou)¹, Tencent AI Lab², South China University of Technology³, The Hong Kong University of Science and Technology⁴

Abstract

We present HeadEvolver, a novel framework for generating stylized head avatars from text guidance. HeadEvolver applies locally learnable mesh deformation to a template head mesh, producing high-quality digital assets that support detail-preserving editing and animation. Global deformation through Jacobians lacks fine-grained, semantic-aware control over local shape. To address this, we introduce a trainable parameter as a weighting factor for the Jacobian at each triangle, which adaptively changes local shapes while maintaining global correspondences and facial features. Moreover, to keep the resulting shape and appearance coherent across viewpoints, we use pretrained image diffusion models with differentiable rendering, together with regularization terms, to refine the deformation under text guidance. Extensive experiments demonstrate that our method generates diverse head avatars with an articulated mesh that can be edited seamlessly in 3D graphics software, facilitating downstream applications such as more efficient animation with inherited blend shapes and semantic consistency.
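
To make the idea of a per-triangle weighting factor concrete, here is a minimal Python sketch. It is not the formulation used by HeadEvolver: the blending rule (interpolating each triangle's Jacobian toward the identity by a scalar weight) and the names `blend_jacobian` and `weighted_jacobians` are assumptions chosen purely for illustration. Recovering vertex positions from the Jacobians and optimizing the weights under diffusion guidance are omitted.

```python
from typing import List

Matrix = List[List[float]]

# 3x3 identity: a triangle whose Jacobian is the identity is left undeformed.
IDENTITY: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def blend_jacobian(jacobian: Matrix, weight: float) -> Matrix:
    """Blend one triangle's 3x3 Jacobian toward the identity.

    ASSUMPTION: linear blending is a hypothetical choice for illustration.
    weight = 0 keeps the template shape; weight = 1 applies the full Jacobian.
    """
    return [
        [weight * jacobian[r][c] + (1.0 - weight) * IDENTITY[r][c] for c in range(3)]
        for r in range(3)
    ]


def weighted_jacobians(jacobians: Matrix and List[Matrix], weights: List[float]) -> List[Matrix]:
    """Apply one scalar weight per triangle to a list of per-triangle Jacobians."""
    if len(jacobians) != len(weights):
        raise ValueError("expected exactly one weight per triangle")
    return [blend_jacobian(j, w) for j, w in zip(jacobians, weights)]


# Three triangles that all share the same Jacobian (uniform scaling by 2),
# but with different weights, so each deforms by a different amount.
scale_by_two: Matrix = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
result = weighted_jacobians([scale_by_two] * 3, [0.0, 1.0, 0.5])
```

In this toy setup the first triangle stays at the identity, the second takes the full scaling, and the third lands halfway between, which is the kind of local control a per-triangle weight provides while the underlying mesh connectivity stays fixed.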

