Awesome Video Saliency Prediction
[2017] Spatio-Temporal Saliency Networks for Dynamic Saliency Prediction, paper , github
[2018] Revisiting Video Saliency: A Large-scale Benchmark and a New Model, paper , github1 , github2
[2018] Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network, paper , github
[2018] DeepVS: A Deep Learning Based Video Saliency Prediction Approach, paper , github
[2018] Temporal Saliency Adaptation in Egocentric Videos, paper , github
[2019] Temporal Recurrences for Video Saliency Prediction, paper , github
[2019] Simple vs complex temporal recurrences for video saliency prediction, paper , github
[2019] TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection, paper , github
[2019] Video Saliency Prediction Using Spatiotemporal Residual Attentive Networks, paper , github
[2020] Unified Image and Video Saliency Modeling, paper , github
[2020] A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction, paper , github
[2020] 3DSal: An Efficient 3D-CNN Architecture for Video Saliency Prediction, paper , github
[2020] DeepCT: A novel deep complex-valued network with learnable transform for video saliency prediction, paper
[2021] Video saliency prediction via spatio-temporal reasoning, paper
[2021] GASP: Gated Attention For Saliency Prediction, paper , github
[2021] Video Saliency Prediction Using Enhanced Spatiotemporal Alignment Network, paper , github
[2021] Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction, paper , github
[2021] Noise-Aware Video Saliency Prediction, paper , github
[2021] SalED: Saliency prediction with a pithy encoder-decoder architecture sensing local and global information, paper , github
[2021] STA3D: Spatiotemporally attentive 3D network for video saliency prediction, paper
[2021] A Gated Fusion Network for Dynamic Saliency Prediction, paper
[2022] ECANet: Explicit cyclic attention-based network for video saliency prediction, paper
[2022] An efficient saliency prediction model for Unmanned Aerial Vehicle video, paper , github
[2023] Accurate video saliency prediction via hierarchical fusion and temporal recurrence, paper
[2023] Visual saliency assistance mechanism based on visually impaired navigation systems, paper
[2023] GFNet: gated fusion network for video saliency prediction, paper
[2023] Transformer-Based Multi-Scale Feature Integration Network for Video Saliency Prediction, paper , github
[2023] Spatio-Temporal Self-Attention Network for Video Saliency Prediction, paper , github
[2023] Multi-Scale Spatiotemporal Feature Fusion Network for Video Saliency Prediction, paper
[2023] TinyHD: Efficient Video Saliency Prediction with Heterogeneous Decoders using Hierarchical Maps Distillation, paper , github
[2023] UniST: Towards Unifying Saliency Transformer for Video Saliency Prediction and Detection, paper
[2024] Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding, paper
[2024] SalFoM: Dynamic Saliency Prediction with Video Foundation Models, paper
[2024] Transformer-based multi-level attention integration network for video saliency prediction, paper
[2024] OFF-ViNet: Optical Flow-Based Feature Warping ViNet for Video Saliency Prediction Considering Future Prediction, paper
[2024] The Visual Saliency Transformer Goes Temporal: TempVST for Video Saliency Prediction, paper
[2025] TM2SP: A Transformer-based Multi-Level Spatiotemporal Feature Pyramid Network for Video Saliency Prediction, paper
[2025] Hierarchical spatiotemporal Feature Interaction Network for video saliency prediction, paper
[2025] Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues, paper
[2025] TFS-Net: Temporal first simulation network for video saliency prediction, paper
[2025] RecSal-Net: Recursive Saliency Network for video saliency prediction, paper , github
[2025] Combining spatio-temporal attention and multi-level feature fusion for video saliency prediction, paper
[2025] PredVSD: Video saliency prediction based on conditional diffusion model, paper
[2025] Video saliency prediction via single feature enhancement and temporal recurrence, paper
Audio-Visual Saliency Prediction
[2019] DAVE: A Deep Audio-Visual Embedding for Dynamic Saliency Prediction, paper , github
[2020] Audiovisual saliency prediction via deep learning, paper
[2020] STAViS: Spatio-Temporal AudioVisual Saliency Network, paper , github
[2020] Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model, paper , github
[2020] A Multimodal Saliency Model for Videos With High Audio-Visual Correspondence, paper
[2021] Joint learning of visual-audio saliency prediction and sound source localization on multi-face videos, paper
[2021] ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction, paper , github
[2021] Deep Audio-Visual Fusion Neural Network for Saliency Estimation, paper
[2021] Temporal-Spatial Feature Pyramid for Video Saliency Detection, paper
[2021] A Novel Lightweight Audio-visual Saliency Model for Videos, paper
[2022] Dual Domain-Adversarial Learning for Audio-Visual Saliency Prediction, paper
[2022] Audio–visual collaborative representation learning for Dynamic Saliency Prediction, paper
[2023] CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective, paper
[2024] Audio-Visual Saliency Prediction with Multisensory Perception and Integration, paper , github
[2024] From Discrete Representation to Continuous Modeling: A Novel Audio-Visual Saliency Prediction Model With Implicit Neural Representations, paper
[2024] DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction, paper , github
[2024] Relevance-guided Audio Visual Fusion for Video Saliency Prediction, paper
[2025] Text-Audio-Visual-conditioned Diffusion Model for Video Saliency Prediction, paper
[2025] DTFSal: Audio-Visual Dynamic Token Fusion for Video Saliency Prediction, paper
Datasets
[2018] DeepVS: A Deep Learning Based Video Saliency Prediction Approach, paper , github
[2018] Revisiting Video Saliency: A Large-scale Benchmark and a New Model, paper , github
[2020] MVVA Dataset: Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model, paper , github1 , github2
[2024] Saliency Prediction on Mobile Videos: A Fixation Mapping-Based Dataset and A Transformer Approach, paper , github
[2024] Audio-visual saliency prediction for movie viewing in immersive environments: Dataset and benchmarks, paper
[2024] Video saliency prediction for First-Person View UAV videos: Dataset and benchmark, paper
[2024] Saliency Prediction of Sports Videos: A Large-Scale Database and a Self-Adaptive Approach, paper