Jan 30, 2026

MUHAMMAD GHIFARY

Facial modeling has come a long way since the early days of simple 3D scans. Today, the FLAME (Faces Learned with an Articulated Model and Expressions) model stands as a cornerstone in computer vision, providing a powerful, differentiable, and highly expressive framework for human head modeling (Li et al. 2017).

In this article, we'll dive deep into the mechanics of FLAME, explore recent breakthroughs, and look at how we can leverage Apple's MLX framework to run these models at lightning speeds on Apple Silicon.

What is FLAME?

FLAME is a Linear Blend Skinning (LBS) model that captures a wide variety of human head shapes and expressions. Unlike older models that focused solely on the face, FLAME models the entire head, including the neck, jaw, and eyeballs.

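At its heart, FLAME represents a 3D mesh $M(\vec{\beta}, \vec{\theta}, \vec{\psi})$ as a function of the shape parameters $\vec{\beta}$, the pose parameters $\vec{\theta}$, and the expression parameters $\vec{\psi}$.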

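The final position of a vertex $v$ is calculated using the LBS formula:

$$ v_{\mathrm{final}} = \sum_{j=1}^{K} w_{j} \, G_j(\vec{\theta}, \mathbf{J}) \left( v_{\mathrm{template}} + B_s(\vec{\beta}) + B_p(\vec{\theta}) + B_e(\vec{\psi}) \right) $$

where $K$ is the number of joints, $w_j$ is the skinning weight of joint $j$ for this vertex, $G_j(\vec{\theta}, \mathbf{J})$ is the rigid transformation of joint $j$ given the pose and the joint locations $\mathbf{J}$, $v_{\mathrm{template}}$ is the vertex position in the template mesh, and $B_s$, $B_p$, and $B_e$ are the shape, pose-corrective, and expression blendshape offsets for that vertex.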
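To make the skinning sum concrete, here is a minimal NumPy sketch of the same computation on a toy two-joint "mesh". It is an illustration under stated assumptions, not the official FLAME code: the function names (`rodrigues`, `joint_transforms`, `lbs`), the toy joints, and the weights are all hypothetical. The blendshape offsets are passed in as a single precomputed array rather than evaluated from real FLAME bases, and joint rotations are assumed to be given in axis-angle form.

```python
import numpy as np


def rodrigues(axis_angle):
    """Convert one axis-angle vector (3,) into a 3x3 rotation matrix."""
    angle = np.linalg.norm(axis_angle)
    if angle < 1e-8:
        return np.eye(3)
    k = axis_angle / angle
    skew = np.array([[0.0, -k[2], k[1]],
                     [k[2], 0.0, -k[0]],
                     [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * skew + (1.0 - np.cos(angle)) * (skew @ skew)


def joint_transforms(theta, joints, parents):
    """Build one 4x4 transform G_j per joint.

    theta:   (K, 3) axis-angle rotation per joint
    joints:  (K, 3) rest-pose joint locations
    parents: parent index per joint, -1 for the root; parents precede children
    """
    num_joints = len(parents)
    world = np.zeros((num_joints, 4, 4))
    for j in range(num_joints):
        local = np.eye(4)
        local[:3, :3] = rodrigues(theta[j])
        # Translation is relative to the parent joint (or the origin for the root).
        origin = joints[parents[j]] if parents[j] >= 0 else np.zeros(3)
        local[:3, 3] = joints[j] - origin
        world[j] = local if parents[j] < 0 else world[parents[j]] @ local
    # Remove the rest pose so that a zero rotation leaves vertices unchanged.
    G = world.copy()
    for j in range(num_joints):
        G[j, :3, 3] -= world[j, :3, :3] @ joints[j]
    return G


def lbs(template, offsets, weights, G):
    """Apply the weighted sum of joint transforms to every vertex.

    template: (N, 3) template vertices
    offsets:  (N, 3) summed blendshape offsets (shape + pose + expression)
    weights:  (N, K) skinning weights, each row summing to 1
    G:        (K, 4, 4) joint transforms from joint_transforms()
    """
    shaped = template + offsets
    homo = np.concatenate([shaped, np.ones((len(shaped), 1))], axis=1)  # (N, 4)
    blended = np.einsum("nk,kab->nab", weights, G)   # per-vertex blended transform
    posed = np.einsum("nab,nb->na", blended, homo)   # apply it in homogeneous coords
    return posed[:, :3]


# Toy setup: a root joint at the origin and a child joint one unit above it.
joints = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
parents = [-1, 0]
theta = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, np.pi / 2]])  # rotate child 90 deg about z

template = np.array([[0.0, 0.5, 0.0], [0.0, 2.0, 0.0]])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])  # vertex 0 follows root, vertex 1 follows child
offsets = np.zeros_like(template)             # no blendshape contribution in this toy case

posed = lbs(template, offsets, weights, joint_transforms(theta, joints, parents))
print(posed)
```

In this toy case the first vertex stays where it is, while the second swings around the child joint to approximately (-1, 1, 0). Blending the 4x4 transforms before applying them is equivalent to the weighted sum of transformed vertices in the formula, because the transform acts linearly on the homogeneous vertex.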

Recent Advancements

FLAME is not just a static model; it has become a foundation for much of modern facial research.