Awesome Vision for Time Series (Vision4TS) Papers

This repository tracks the latest papers on Vision Models for Time Series Analysis and serves as the official repository for Harnessing Vision Models for Time Series Analysis: A Survey. It is actively maintained by D2I Group@UH. We will update the repository and the survey regularly.

🌟 [News] Our survey paper has been accepted to the IJCAI 2025 survey track!

🌟 You're welcome to suggest new Vision4TS papers by contacting jni7 [at] uh [dot] edu!

🌟 Please consider citing our survey paper if you find it helpful :), and feel free to share this repository with others!

🏆 Contribution | 📌 Taxonomy | ⚙️ Package | 🔗 Citation

Contribution

Time series analysis has progressed from traditional autoregressive models, through deep learning models, to recent Transformers and Large Language Models (LLMs). Efforts to leverage vision models for time series analysis have been made along the way, but they are less visible to the community because research in this domain is dominated by sequence modeling. However, the discrepancy between continuous time series and the discrete token space of LLMs, together with the challenge of explicitly modeling correlations among variates in multivariate time series, has shifted some research attention to the equally successful Large Vision Models (LVMs) and Vision Language Models (VLMs). To fill this gap in the existing literature, this survey discusses the advantages of vision models over LLMs in time series analysis and provides a comprehensive, in-depth overview of existing methods.

A taxonomy is proposed as a dual view of Time Series to Image Transformation and Imaged Time Series Modeling. For the former, the primary methods for imaging univariate time series (UTS) or multivariate time series (MTS) are described along with their pros and cons. For the latter, existing methods are classified into conventional vision models, Large Vision Models (LVMs), and Large Multimodal Models (LMMs).


Figure 1: The general process of leveraging vision models for time series analysis


Figure 2: Image Transformation of Time Series


Figure 3: Illustration of different modeling strategies on imaged time series

The overall structure of our survey follows the general process of applying vision models to time series analysis, as delineated in Figure 1. Based on the proposed dual-view taxonomy, the survey reviews the primary time series imaging methods (Figure 2) and the modeling solutions for imaged time series (Figure 3), followed by a discussion of the pre- and post-processing involved in this framework and of future directions in this promising field.

Package

This package provides common visualization methods for time series, including Line Plot, Heatmap, Spectrogram (STFT, Wavelet Transform, Filterbank), Gramian Angular Field (GAF), and Recurrence Plot (RP). We have uploaded our code package to PyPI; run the following command to install it.

pip install time2img

Our code is compatible with all common benchmarks found in Google Drive. You can run the example to reproduce our illustration of the different time series imaging methods (Figure 2 in the paper).
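
To give a feel for what two of the imaging methods listed above compute, here is a minimal NumPy sketch of a Gramian Angular Field and a Recurrence Plot for a univariate series. It is an illustration only and does not use or mirror the `time2img` API: the function names (`minmax_scale`, `gramian_angular_field`, `recurrence_plot`), the `threshold` parameter, and the choice of an embedding dimension of 1 for the recurrence plot are all assumptions made for this example.

```python
import numpy as np


def minmax_scale(x, lo=-1.0, hi=1.0):
    """Rescale a 1-D series to [lo, hi]; a constant series maps to the midpoint."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.full_like(x, (lo + hi) / 2.0)
    return lo + (hi - lo) * (x - x.min()) / span


def gramian_angular_field(x, kind="summation"):
    """Return a T x T Gramian Angular Field for a length-T series.

    The series is scaled to [-1, 1], mapped to angles phi = arccos(x),
    and the image holds cos(phi_i + phi_j) (summation) or
    sin(phi_i - phi_j) (difference).
    """
    scaled = np.clip(minmax_scale(x), -1.0, 1.0)  # guard against rounding
    phi = np.arccos(scaled)
    if kind == "summation":
        return np.cos(phi[:, None] + phi[None, :])
    if kind == "difference":
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("kind must be 'summation' or 'difference'")


def recurrence_plot(x, threshold=0.1):
    """Return a binary T x T recurrence plot (embedding dimension 1).

    Pixel (i, j) is 1 when |x_i - x_j| <= threshold, else 0.
    """
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    return (dist <= threshold).astype(np.uint8)


# Toy input: two periods of a sine wave, 64 samples.
series = np.sin(np.linspace(0, 4 * np.pi, 64))
gasf = gramian_angular_field(series, kind="summation")
rp = recurrence_plot(series, threshold=0.1)
print(gasf.shape, rp.shape)
```

Both functions return a square array whose side equals the series length, so long series are usually downsampled or aggregated before imaging; that step is omitted here.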

List of Vision4TS Papers

Surveys

[2025] [IJCAI] Harnessing Vision Models for Time Series Analysis: A Survey [paper][code]
[2025] [JILSA] Unsupervised Time-Series Signal Analysis with Autoencoders and Vision Transformers: A Review of Architectures and Applications [paper]

Tutorials

[2025] [KDD] Multi-Model Time Series Analysis: Data, Methods, and Applications [website]

Papers

🗓️ 2026 ---

[2026] [AAAI] Harnessing Vision-Language Models for Time Series Anomaly Detection [paper]
[2026] [AAAI] OccamVTS: Distilling Vision Models to 1% Parameters for Time Series Forecasting [paper][code]

🗓️ 2025 ---

[2025] [Arxiv] SVTime: Small Time Series Forecasting Models Informed by "Physics" of Large Vision Model Forecasters [paper]
[2025] [Arxiv] ViFusionTST: Deep Fusion of Time-Series Image Representations from Load Signals for Early Bed-Exit Prediction [paper]
[2025] [Arxiv] Time Series Representations for Classification Lie Hidden in Pretrained Vision Transformers [paper]
[2025] [Arxiv] MLLM4TS: Leveraging Vision and Multimodal Language Models for General Time-Series Analysis [paper]
[2025] [CAIE] TSSI: Time Series as Screenshot Images for multivariate time series classification using convolutional neural networks [paper]
[2025] [Arxiv] Vision-Enhanced Time Series Forecasting via Latent Diffusion Models [paper]
[2025] [Arxiv] VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones [paper][code]
[2025] [Arxiv] From Images to Signals: Are Large Vision Models Useful for Time Series Analysis? [paper]
[2025] [NeurIPS] A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking [paper][arxiv][code]
[2025] [NeurIPS] Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting [paper][arxiv][code]
[2025] [NeurIPS] GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [paper][arxiv][code]
[2025] [NeurIPS Workshop] TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning [paper][arxiv][code]
[2025] [ICLR] TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis [paper][arxiv]
[2025] [Arxiv] Can Multimodal LLMs Perform Time Series Anomaly Detection? [paper][code]
[2025] [ICML] Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting [paper][code]
[2025] [ICML] VisionTS: Visual masked autoencoders are free-lunch zero-shot time series forecasters [paper][code]

🗓️ 2024 ---

[2024] [KDD] CAFO: Feature-centric explanation on time series classification [paper][code]
[2024] [NeurIPS] Utilizing image transforms and diffusion models for generative modeling of short and long time series [paper][arxiv][code]
[2024] [NeurIPS] Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking [paper][arxiv][code]
[2024] [NeurIPS Workshop] Vision language models are few-shot audio spectrogram classifiers [paper][arxiv]
[2024] [ICANN] Fusion of image representations for time series classification with deep learning [paper][code]
[2024] [ICPR] ViT2 - Pre-training Vision Transformers for Visual Times Series Forecasting [paper][code]
[2024] [IEEE TKDE] Hierarchical context representation and self-adaptive thresholding for multivariate anomaly detection [paper]
[2024] [IEEE Sensors Journal] Multisensor data fusion and time series to image encoding for hardness recognition [paper]
[2024] [Finance Research Letters] Quantum-Enhanced Forecasting: Leveraging Quantum Gramian Angular Field and CNNs for Stock Return Predictions [paper][arxiv]
[2024] [Eng. Appl. Artif. Intell.] EEG channel selection using Gramian Angular Fields and spectrograms for energy data visualization [paper]
[2024] [Arxiv] Training-free time-series anomaly detection: Leveraging image foundation models [paper]
[2024] [Arxiv] On the feasibility of vision-language models for time-series classification [paper][code]
[2024] [Arxiv] See it, think it, sorted: Large multimodal models are few-shot time series anomaly analyzers [paper]
[2024] [Arxiv] Plots unlock time-series understanding in multimodal models [paper]
[2024] [Arxiv] ViTime: A visual intelligence-based foundation model for time series forecasting [paper][code]
[2024] [Arxiv] TimEHR: Image-based time series generation for electronic health records [paper][code]

🗓️ 2023 ---

[2023] [ICLR] TimesNet: Temporal 2d-variation modeling for general time series analysis [paper][code]
[2023] [ICLR] BrainBERT: Self-supervised representation learning for intracranial recordings [paper][arxiv][code]
[2023] [NeurIPS] Time series as images: Vision transformer for irregularly sampled time series [paper][code]
[2023] [NeurIPS Workshop] Insight miner: A time series analysis dataset for cross-domain alignment with natural language [paper]
[2023] [ICASSP] AST-SED: An effective sound event detection method based on audio spectrogram transformer [paper]
[2023] [ICAIF] From pixels to predictions: Spectrogram and vision transformer for better time series forecasting [paper]
[2023] [BigDataService] ECG classification using Deep CNN and Gramian Angular Field [paper][arxiv]
[2023] [ASPAI] Classification of time series as images using deep convolutional neural networks: application to glitches in gravitational wave data [paper]
[2023] [Neural Netw.] Image-based time series forecasting: A deep convolutional neural network approach [paper]
[2023] [Electr. Power Syst. Res.] The use of deep learning and 2-D wavelet scalograms for power quality disturbances classification [paper]
[2023] [Arxiv] Leveraging vision-language models for granular market change prediction [paper]
[2023] [Arxiv] Your time series is worth a binary image: machine vision assisted deep framework for time series forecasting [paper][code]

🗓️ 2022 ---

[2022] [AAAI] SSAST: Self-supervised audio spectrogram transformer [paper][code]
[2022] [Interspeech] MAE-AST: Masked autoencoding audio spectrogram transformer [paper][code]
[2022] [EMBC] Encoding Cardiopulmonary Exercise Testing Time Series as Images for Classification using Convolutional Neural Network [paper][arxiv][code]
[2022] [AIME] TTS-GAN: A transformer-based time-series generative adversarial network [paper][code]
[2022] [Neural Process. Lett.] Time Series Classification Based on Image Transformation Using Feature Fusion Strategy [paper]

🗓️ 2021 ---

[2021] [Interspeech] AST: Audio spectrogram transformer [paper][code]
[2021] [ICAIF] Visual time series forecasting: an image-driven approach [paper]
[2021] [ICAIF] Deep video prediction for time series forecasting [paper]
[2021] [IEEE ACM Trans. Audio Speech Lang. Process.] TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech [paper][arxiv]

🗓️ 2020 ---

[2020] [ICAIF] Trading via image classification [paper]
[2020] [Expert Syst. Appl.] Forecasting with Time Series Imaging [paper][code]
[2020] [IEEE Access] Human Activity Recognition Based on Gramian Angular Field and Deep Convolutional Neural Network [paper]
[2020] [Energy] A novel ensemble method for hourly residential electricity consumption forecasting by imaging time series [paper]
[2020] [IEEE CAA J. Autom. Sinica] Deep learning and time series-to-image encoding for financial forecasting [paper]

🗓️ Before 2020 ---

[2019] [AAAI] A Deep Neural Network for unsupervised anomaly detection and diagnosis in multivariate time series data [paper][code]
[2019] [Arxiv] Multivariate time series classification using dilated convolutional neural network [paper][code]
[2017] [ICMV] Classification of Time-series Images using Deep Convolutional Neural Networks [paper]
[2017] [Sensors] Learning Traffic as Images: A Deep Convolutional Neural Network for Large-scale Transportation Network Speed Prediction [paper]
[2015] [IJCAI] Imaging Time-series to Improve Classification and Imputation [paper]
[2015] [AAAI Workshop] Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks [paper]
[2014] [ICPR] Extracting Texture Features for Time Series Classification [paper]
[2013] [ICDM] Time Series Classification Using Compression Distance of Recurrence Plots [paper]
[2005] [SDM] Time-series Bitmaps: a Practical Visualization Tool for Working with Large Time Series Databases [paper]

Taxonomy

TS-Recover denotes recovering time series from predicted images. $*$: the method has been used to model the individual UTSs of an MTS. $^{\natural}$: a new pre-trained model was proposed in the work. $^{\flat}$: when pre-trained models were unused, Fine-tune refers to training a task-specific model from scratch.

Method	TS-Type	Imaging	Multimodal	Model	Pre-trained	Fine-tune	Prompt	TS-Recover	Task	Domain	Code
Kumar et al., 2005	UTS	TS-Bitmap	✘	Multiple	✘	✘	✘	✘	Multiple	General	✘
Silva et al., 2013	UTS	RP	✘	K-NN	✘	✘	✘	✘	Classification	General	✘
Souza et al., 2014	UTS	RP	✘	SVM	✘	$✔^\flat$	✘	✘	Classification	General	✘
Wang and Oates, 2015a	UTS	GAF	✘	CNN	✘	$✔^\flat$	✘	$✔$	Classification	General	✘
Wang and Oates, 2015b	UTS	GAF	✘	CNN	✘	$✔^\flat$	✘	$✔$	Classification & Imputation	General	✘
Ma et al., 2017	MTS	Heatmap	✘	CNN	✘	$✔^\flat$	✘	$✔$	Forecasting	Traffic	✘
Hatami et al., 2018	UTS	RP	✘	CNN	✘	$✔^\flat$	✘	✘	Classification	General	✘
Yazdanbakhsh and Dick, 2019	MTS	Heatmap	✘	CNN	✘	$✔^\flat$	✘	✘	Classification	General	✔
MSCRED	MTS	Other	✘	ConvLSTM	✘	$✔^\flat$	✘	✘	Anomaly	General	✔
Li et al., 2020	UTS	RP	✘	CNN	$✔$	$✔$	✘	✘	Forecasting	General	✔
Cohen et al., 2020	UTS	LinePlot	✘	Ensemble	✘	$✔^\flat$	✘	✘	Classification	Finance	✘
Barra et al., 2020	UTS	GAF	✘	CNN	✘	$✔^\flat$	✘	✘	Classification	Finance	✘
VisualAE	UTS	LinePlot	✘	CNN	✘	$✔^\flat$	✘	$✔$	Forecasting	Finance	✘
Zeng et al., 2021	MTS	Heatmap	✘	CNN, LSTM	✘	$✔^\flat$	✘	$✔$	Forecasting	Finance	✘
AST	UTS	Spectrogram	✘	DeiT	$✔$	$✔$	✘	✘	Classification	Audio	✔
TTS-GAN	MTS	Heatmap	✘	ViT	✘	$✔^\flat$	✘	$✔$	Ts-Generation	Health	✔
SSAST	UTS	Spectrogram	✘	ViT	$✔^\natural$	$✔$	✘	✘	Classification	Audio	✔
MAE-AST	UTS	Spectrogram	✘	MAE	$✔^\natural$	$✔$	✘	✘	Classification	Audio	✔
AST-SED	UTS	Spectrogram	✘	SSAST, GRU	$✔$	$✔$	✘	✘	EventDetection	Audio	✘
Jin et al., 2023	UTS	LinePlot	✘	CNN	$✔$	$✔$	✘	✘	Classification	Physics	✘
ForCNN	UTS	LinePlot	✘	CNN	✘	$✔^\flat$	✘	✘	Forecasting	General	✘
Vit-num-spec	UTS	Spectrogram	✘	ViT	✘	$✔^\flat$	✘	✘	Forecasting	Finance	✘
ViTST	MTS	LinePlot	✘	Swin	$✔$	$✔$	✘	✘	Classification	General	✔
MV-DTSA	UTS*	LinePlot	✘	CNN	✘	$✔^\flat$	✘	$✔$	Forecasting	General	✔
TimesNet	MTS	Heatmap	✘	CNN	✘	$✔^\flat$	✘	$✔$	Multiple	General	✔
ITF-TAD	UTS	Spectrogram	✘	CNN	$✔$	✘	✘	✘	Anomaly	General	✘
Kaewrakmuk et al., 2024	UTS	GAF	✘	CNN	$✔$	$✔$	✘	✘	Classification	Sensing	✘
HCR-AdaAD	MTS	RP	✘	CNN, GNN	✘	$✔^\flat$	✘	✘	Anomaly	General	✘
FIRTS	UTS	Other	✘	CNN	✘	$✔^\flat$	✘	✘	Classification	General	✔
CAFO	MTS	RP	✘	CNN, ViT	✘	$✔^\flat$	✘	✘	Explanation	General	✔
ViTime	UTS*	LinePlot	✘	ViT	$✔^\natural$	$✔$	✘	$✔$	Forecasting	General	✔
ImagenTime	MTS	Other	✘	CNN	✘	$✔^\flat$	✘	$✔$	Ts-Generation	General	✔
TimEHR	MTS	Heatmap	✘	CNN	✘	$✔^\flat$	✘	$✔$	Ts-Generation	Health	✔
VisionTS	UTS*	Heatmap	✘	MAE	$✔$	$✔$	✘	$✔$	Forecasting	General	✔
InsightMiner	UTS	LinePlot	$✔$	LLaVA	$✔$	$✔$	$✔$	✘	Txt-Generation	General	✘
Wimmer and Rekabsaz, 2023	MTS	LinePlot	$✔$	CLIP, LSTM	$✔$	$✔$	✘	✘	Classification	Finance	✘
Dixit et al., 2024	UTS	Spectrogram	$✔$	GPT4o, Gemini & Claude3	$✔$	✘	$✔$	✘	Classification	Audio	✘
Daswani et al., 2024	MTS	LinePlot	$✔$	GPT4o, Gemini	$✔$	✘	$✔$	✘	Multiple	General	✘
TAMA	UTS	LinePlot	$✔$	GPT4o	$✔$	✘	$✔$	✘	Anomaly	General	✘
Prithyani et al., 2024	MTS	LinePlot	$✔$	LLaVA	$✔$	$✔$	$✔$	✘	Classification	General	✔
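
The TS-Recover column is easiest to picture with an imaging method that is exactly invertible. The sketch below shows one simple case, assuming a Heatmap built by folding a univariate series into a 2-D array by a known period; it is not the procedure of any specific method in the table. The function names (`fold_to_heatmap`, `recover_from_heatmap`), the NaN padding, and the assumption that the period is given are all choices made for this illustration. Real pipelines typically also estimate the period, normalize pixel values, and resize the image, which makes recovery approximate rather than exact.

```python
import numpy as np


def fold_to_heatmap(x, period):
    """Fold a 1-D series into a (period, n_segments) array.

    Each column holds one consecutive segment of length `period`;
    the last column is padded with NaN if the length is not a multiple.
    """
    if period <= 0:
        raise ValueError("period must be a positive integer")
    x = np.asarray(x, dtype=float)
    n_segments = int(np.ceil(len(x) / period))
    padded = np.full(n_segments * period, np.nan)
    padded[: len(x)] = x
    return padded.reshape(n_segments, period).T


def recover_from_heatmap(image, length):
    """Invert fold_to_heatmap: read columns in order and drop the padding."""
    return np.asarray(image).T.reshape(-1)[:length]


# Toy input: 10 samples folded with an assumed period of 4.
series = np.arange(10, dtype=float)
image = fold_to_heatmap(series, period=4)
restored = recover_from_heatmap(image, length=len(series))
print(image.shape, np.array_equal(restored, series))
```

In a forecasting setting, the same inverse step would be applied to the predicted region of the image to read the forecast back as a series.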

Citation

@inproceedings{ni2025harnessing,
  title={Harnessing Vision Models for Time Series Analysis: A Survey},
  author={Ni, Jingchao and Zhao, Ziming and Shen, ChengAo and Tong, Hanghang and Song, Dongjin and Cheng, Wei and Luo, Dongsheng and Chen, Haifeng},
  booktitle={IJCAI},
  year={2025}
}
