| title | venue | paper | code | dataset | keywords |
|---|---|---|---|---|---|
| EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model | SIGGRAPH(22) | paper | - | - | emotion |
| Expressive Talking Head Generation with Granular Audio-Visual Control | CVPR(22) | paper | - | - | - |
| Deep Learning for Visual Speech Analysis: A Survey | - | paper | - | - | survey |
| StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN | - | paper | code | - | stylegan |
| Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation | - | paper | code(coming soon) | - | NeRF |
| Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation | - | paper | - | - | - |
| One-shot talking face generation from single-speaker audio-visual correlation learning | AAAI(22) | paper | code | - | - |
| SyncTalkFace: Talking Face Generation with Precise Lip-syncing via Audio-Lip Memory | AAAI(22) | paper(temp) | - | LRW, LRS2, BBC News | - |
| DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering | - | paper | - | - | NeRF |
| Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos | - | paper | - | - | - |
| Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions | - | paper | - | - | - |
| DialogueNeRF: Towards Realistic Avatar Face-to-face Conversation Video Generation | - | paper | - | - | - |
| Talking Head Generation Driven by Speech-Related Facial Action Units and Audio-Based on Multimodal Representation Fusion | - | paper | - | - | - |

| title | venue | paper | code | dataset |
|---|---|---|---|---|
| Parallel and High-Fidelity Text-to-Lip Generation | paper | |||
| [Survey] Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis | - | paper | - | - |
| FaceFormer: Speech-Driven 3D Facial Animation with Transformers | CVPR(22) | paper | code | - |
| Voice2Mesh: Cross-Modal 3D Face Model Generation from Voices | - | paper | code | - |
| FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning | ICCV(21) | paper | code | - |
| Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis | - | paper | code | - |
| Audio-Driven Emotional Video Portraits | CVPR(21) | paper | code | MEAD, LRW |
| LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization | CVPR(21) | paper | - | - |
| Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation | CVPR(21) | paper | code | VoxCeleb2, LRW |
| Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset | CVPR(21) | paper | code | HDTF |
| MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement | ICCV(21) | paper | code(coming soon) | - |
| AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | ICCV(21) | paper | code | - |
| Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation | AAAI(21) | paper | code(coming soon) | Mocap dataset |
| Visual Speech Enhancement Without A Real Visual Stream | - | paper | - | - |
| Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary | - | paper | code | - |
| Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion | IJCAI(21) | paper | code | VoxCeleb, GRID, LRW |
| 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head | - | paper | - | - |
| AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person | - | paper | - | VoxCeleb2, Obama |

| title | venue | paper | code | dataset |
|---|---|---|---|---|
| [Survey] What comprises a good talking-head video generation?: A survey and benchmark | - | paper | code | - |
| One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing | CVPR(21) | paper | code | - |
| Speech Driven Talking Face Generation from a Single Image and an Emotion Condition | - | paper | code | CREMA-D |
| A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild | ACMMM(20) | paper | code | LRS2 |
| Talking-head Generation with Rhythmic Head Motion | ECCV(20) | paper | code | CREMA, GRID, VoxCeleb, LRS3 |
| MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation | ECCV(20) | paper | code | VoxCeleb2, AffectNet |
| Neural Voice Puppetry: Audio-driven Facial Reenactment | ECCV(20) | paper | - | - |
| Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars | ECCV(20) | paper | code | - |
| HeadGAN: Video-and-Audio-Driven Talking Head Synthesis | - | paper | - | VoxCeleb2 |
| MakeItTalk: Speaker-Aware Talking Head Animation | - | paper | code, code | VoxCeleb2, VCTK |
| Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose | - | paper | code | ImageNet, FaceWarehouse, LRW |
| Photorealistic Lip Sync with Adversarial Temporal Convolutional Networks | - | paper | - | - |
| Speech-Driven Facial Animation Using Polynomial Fusion of Features | - | paper | - | LRW |
| Animating Face using Disentangled Audio Representations | WACV | paper | - | - |
| Everybody’s Talkin’: Let Me Talk as You Want | - | paper | - | - |
| Multimodal Inputs Driven Talking Face Generation With Spatial-Temporal Dependency | - | paper | - | - |

| title | venue | paper | code | dataset |
|---|---|---|---|---|
| Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss | CVPR(19) | paper | code | VGG Face, LRW |

- PSNR (peak signal-to-noise ratio)
- SSIM (structural similarity index measure)
- LMD (landmark distance error)
- LRA (lip-reading accuracy)
- FID (Fréchet inception distance)
- LSE-D (Lip Sync Error - Distance)
- LSE-C (Lip Sync Error - Confidence)
- LPIPS (Learned Perceptual Image Patch Similarity)
- NIQE (Natural Image Quality Evaluator)
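Two of the simpler metrics above, PSNR and LMD, can be computed directly from frames and landmark sets. A minimal NumPy sketch (the function names `psnr` and `landmark_distance` are ours for illustration, not from any paper's released code):

```python
import numpy as np

def psnr(ref: np.ndarray, gen: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between a reference and a generated frame."""
    mse = np.mean((ref.astype(np.float64) - gen.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_val ** 2 / mse)

def landmark_distance(ref_lms: np.ndarray, gen_lms: np.ndarray) -> float:
    """LMD: mean Euclidean distance between corresponding (N, 2) landmark arrays."""
    return float(np.mean(np.linalg.norm(ref_lms - gen_lms, axis=-1)))
```

Note that papers differ in preprocessing (e.g. whether LMD is computed on mouth landmarks only, and whether landmarks are normalized by face size), so reported numbers are only comparable within a single evaluation protocol.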