
Commit cf33822 (merge of 2f6ac19 and e7b0dca): add TRACE

File tree: 10 files changed (+76, -0 lines)
8 binary image files added under app/projects/trace/assets/ (figures and tables for the TRACE page)

app/projects/trace/page.mdx

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
import { Authors, Badges } from '@/components/utils'
# TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval
<Authors
authors="Jialin Chen, Yale University; Ziyu Zhao, McGill University; Gaukhar Nurbek, University of Texas Rio Grande Valley; Aosong Feng, Yale University; Ali Maatouk, Yale University; Leandros Tassiulas, Yale University; Yifeng Gao, University of Texas Rio Grande Valley; Rex Ying, Yale University"
/>
<Badges
venue="NeurIPS 2025"
github="https://github.com/Graph-and-Geometric-Learning/TRACE-Multimodal-TSEncoder"
arxiv="https://arxiv.org/abs/2506.09114"
pdf="https://arxiv.org/pdf/2506.09114"
/>
## Introduction
Time-series data is central to domains like healthcare, weather, and energy, yet it rarely exists alone. In real-world settings, it is often paired with rich textual context such as clinical notes or weather reports. This combination calls for models that can jointly understand time-series signals and text.
As shown in the figure below, a flash flood report describing heavy rainfall and strong winds can help retrieve historical time-series patterns with similar dynamics, supporting tasks like forecasting and disaster alerts. Existing approaches, however, remain limited: they often ignore the textual context and struggle to align time-series and language representations effectively.
![A Use Case of Text-to-Timeseries Retrieval|scale=0.7](./assets/use_case.png)
## Method
We introduce TRACE, a Time-series Retriever with Aligned Context Embedding. TRACE is the first multimodal retriever that learns semantically grounded time-series embeddings through fine-grained, dual-level alignment. It uses a masked autoencoder with Channel Identity Tokens (CITs) to capture channel-specific behavior, and it employs hierarchical hard negative mining to align time-series and textual representations effectively.
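The full architecture lives in the linked repository; the snippet below is only a minimal, hypothetical PyTorch sketch of the channel-identity idea as described above: each channel receives a learned identity embedding that is added to its patch tokens before a shared Transformer reconstructs masked patches. The class, parameter, and shape choices are illustrative assumptions, not the released implementation.

```python
# Hypothetical sketch (not the released code): Channel Identity Tokens (CITs)
# added to per-channel patch embeddings, then masked reconstruction.
import torch
import torch.nn as nn


class ChannelAwareMaskedEncoder(nn.Module):
    def __init__(self, n_channels, patch_len, d_model=128,
                 n_layers=4, n_heads=4, mask_ratio=0.4):
        super().__init__()
        self.patch_len = patch_len
        self.mask_ratio = mask_ratio
        self.patch_embed = nn.Linear(patch_len, d_model)
        # One learned identity embedding per channel (the "CIT" in our reading).
        self.channel_id = nn.Parameter(0.02 * torch.randn(n_channels, d_model))
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.reconstruct = nn.Linear(d_model, patch_len)

    def forward(self, x):
        # x: (batch, n_channels, length), with length divisible by patch_len
        b, c, length = x.shape
        patches = x.reshape(b, c, length // self.patch_len, self.patch_len)
        tokens = self.patch_embed(patches)                   # (b, c, p, d)
        tokens = tokens + self.channel_id[None, :, None, :]  # inject channel identity
        # Mask a random subset of patch tokens and swap in the mask token.
        mask = torch.rand(b, c, tokens.shape[2], device=x.device) < self.mask_ratio
        tokens = torch.where(mask[..., None], self.mask_token, tokens)
        # Flatten (channel, patch) into one sequence so attention can mix channels.
        h = self.encoder(tokens.reshape(b, c * tokens.shape[2], -1))
        h = h.reshape(b, c, -1, tokens.shape[-1])
        recon = self.reconstruct(h)                          # (b, c, p, patch_len)
        loss = ((recon - patches) ** 2)[mask].mean()         # MSE on masked patches only
        return loss, h
```

In this sketch, `ChannelAwareMaskedEncoder(n_channels=7, patch_len=16)(torch.randn(8, 7, 96))` returns the masked-reconstruction loss and per-channel patch embeddings that later stages can pool and align with text.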
TRACE serves two purposes:
1. As a general-purpose retriever, it enhances foundation models via retrieval-augmented generation (RAG).
2. As a standalone encoder, it achieves state-of-the-art performance on forecasting and classification benchmarks.
![Overview of TRACE|scale=0.7](./assets/cover_fig.png)
As shown in the figure below, TRACE first learns robust time-series representations through masked reconstruction with channel-aware attention. It then aligns each time-series channel with its corresponding text using fine-grained contrastive learning. Building on this, TRACE introduces a retrieval-augmented generation strategy that fetches relevant context for downstream tasks. This modular design delivers strong standalone performance while integrating seamlessly with existing time-series foundation models.
![Architecture of TRACE|scale=0.7](./assets/architecture.png)
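As a rough sketch of the alignment stage (our reading, not the official code): a symmetric InfoNCE-style loss pulls each time-series embedding toward its paired text embedding, and mined hard negatives are simply appended to the in-batch negatives. In TRACE this would apply at both the sequence level and the channel level ("dual-level" alignment); the sketch shows a single level and leaves the mining strategy abstract.

```python
# Hedged sketch of contrastive alignment between time-series and text embeddings;
# the hard-negative mining hierarchy used by TRACE is not reproduced here.
import torch
import torch.nn.functional as F


def alignment_loss(ts_emb, text_emb, hard_neg_text=None, temperature=0.07):
    """ts_emb, text_emb: (B, D) paired embeddings; hard_neg_text: optional (B, K, D)
    mined hard-negative text embeddings for each time-series query."""
    ts = F.normalize(ts_emb, dim=-1)
    tx = F.normalize(text_emb, dim=-1)
    logits = ts @ tx.t() / temperature                       # (B, B) in-batch pairs
    if hard_neg_text is not None:
        hn = F.normalize(hard_neg_text, dim=-1)
        hard = torch.einsum("bd,bkd->bk", ts, hn) / temperature  # (B, K) extra negatives
        logits = torch.cat([logits, hard], dim=1)
    labels = torch.arange(ts.size(0), device=ts.device)      # positives on the diagonal
    loss_ts2txt = F.cross_entropy(logits, labels)
    loss_txt2ts = F.cross_entropy(tx @ ts.t() / temperature, labels)
    return 0.5 * (loss_ts2txt + loss_txt2ts)
```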
## Results
We evaluate TRACE from three perspectives:
(1) its performance in cross-modal and time-series retrieval compared to strong baselines,
(2) its effectiveness as a retriever in retrieval-augmented forecasting pipelines, and
(3) its generalization as a standalone encoder for forecasting and classification.
### Cross-modal Retrieval
To assess retrieval performance, we replace TRACE’s encoder with several strong time-series foundation models that generate fixed-length embeddings. Each encoder is fine-tuned end-to-end with a lightweight projection layer and a contrastive learning objective for fair comparison.
As shown in Table 1, TRACE achieves state-of-the-art results, with nearly 90% top-1 label matching and 44% top-1 modality matching. Its retrieval accuracy surpasses the classification performance of all models trained from scratch, underscoring the effectiveness of alignment-based supervision. Among baselines, Moment performs best, but TRACE’s fine-grained embeddings enable more precise cross-modal retrieval and semantic matching.
![Table 1: Retrieval results on 2,000 bidirectional Text–Timeseries query pairs. “Random” indicates a non-informative retriever that ranks candidates uniformly at random. |scale=0.7](./assets/table-1.png)
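For intuition, the snippet below sketches how such a cross-modal evaluation can be scored: embed the text queries and candidate time series, rank candidates by cosine similarity, and count how often the top-1 result is the exact paired series versus one sharing the query's event label. The precise definitions of "modality matching" and "label matching" in Table 1 follow the paper; this is only our assumed reading.

```python
# Illustrative scoring loop (not the paper's evaluation script).
import torch
import torch.nn.functional as F


def text_to_ts_retrieval_scores(text_emb, ts_emb, labels, k=1):
    """text_emb, ts_emb: (N, D), where row i of each is a ground-truth pair;
    labels: (N,) integer event labels. Returns top-k pair and label hit rates."""
    q = F.normalize(text_emb, dim=-1)
    c = F.normalize(ts_emb, dim=-1)
    sim = q @ c.t()                                     # (N, N) query-candidate similarity
    topk = sim.topk(k, dim=1).indices                   # (N, k) retrieved candidates
    gt = torch.arange(q.size(0), device=q.device).unsqueeze(1)
    pair_hit = (topk == gt).any(dim=1).float().mean()   # exact paired series retrieved
    label_hit = (labels[topk] == labels.unsqueeze(1)).any(dim=1).float().mean()
    return pair_hit.item(), label_hit.item()
```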
### Timeseries-to-Timeseries Retrieval
We evaluate TRACE on a time-series-to-time-series retrieval task, where the goal is to retrieve the most semantically similar series for each query.
Table 2 shows that TRACE outperforms all baselines (ED, DTW, SAX-VSM, and CTSR) on the key metrics: Precision@1, Precision@5, and Mean Reciprocal Rank (MRR), while also achieving the lowest retrieval latency.
The performance gap highlights a key difference. Methods like SAX-VSM and CTSR struggle to capture deeper temporal and semantic patterns. TRACE's alignment-aware training, by contrast, delivers accurate and efficient retrieval across multivariate signals while remaining scalable.
![Table 2: TS-to-TS Retrieval performance comparison. Evaluation is conducted over 1000 randomly sampled weather time-series queries. |scale=0.3](./assets/table-2.png)
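The metrics in Table 2 are standard; the short sketch below shows how Precision@k and MRR are computed, assuming each query comes with binary relevance judgements over its ranked candidates (the ranking itself would come from the retriever being evaluated).

```python
# Standard retrieval metrics, written out for reference.
import numpy as np


def precision_at_k(relevance, k):
    """relevance: (n_queries, n_candidates) binary array, each row sorted by the
    retriever's score (best candidate first)."""
    return float(np.asarray(relevance)[:, :k].mean())


def mean_reciprocal_rank(relevance):
    """Average reciprocal rank of the first relevant candidate (0 if none)."""
    rr = []
    for row in np.asarray(relevance):
        hits = np.flatnonzero(row)
        rr.append(1.0 / (hits[0] + 1) if hits.size else 0.0)
    return float(np.mean(rr))
```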
### Retrieval-augmented Time Series Forecasting
We use TRACE to retrieve the most relevant time-series and text pairs from the dataset based on embedding similarity and supply them to downstream forecasters as additional context.
Table 3 shows that retrieval augmentation improves forecasting performance across all models. The biggest gains come from combining time series with text (TS+Text), especially for decoder-only models such as Timer-XL and Time-MoE.
Interestingly, TRACE itself shows minimal improvement when moving from TS-only to TS+Text retrieval. This is not a weakness: it indicates that TRACE's embeddings are already well aligned across modalities, so the retrieved text adds little information that its multimodal embedding space has not already captured.
This makes TRACE effective as a lightweight, general-purpose retriever for RAG pipelines.
![Table 3: Forecasting performance on Weather dataset for next 24 steps under different retrieval-augmented generation settings.|scale=0.4](./assets/table-3.png)
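A hedged sketch of how such a retrieval-augmented forecasting loop can be wired up is shown below. The encoder, forecaster, and corpus layout are placeholders, and concatenating retrieved series along the time axis is just one plausible way to expose the context; the conditioning actually used for each baseline forecaster may differ.

```python
# Hedged sketch of retrieval-augmented forecasting; all names are placeholders,
# not the released API.
import torch
import torch.nn.functional as F


def rag_forecast(query_window, encoder, forecaster, corpus_emb, corpus_series, k=3):
    """query_window: (C, L) series; corpus_emb: (N, D) precomputed embeddings of the
    stored (series, text) pairs; corpus_series: (N, C, L) the stored series."""
    with torch.no_grad():
        q = F.normalize(encoder(query_window.unsqueeze(0)), dim=-1)   # (1, D)
        sims = q @ F.normalize(corpus_emb, dim=-1).t()                # (1, N)
        idx = sims.topk(k, dim=1).indices.squeeze(0)                  # top-k neighbours
    # One plausible conditioning: prepend retrieved series to the query along time.
    context = torch.cat([corpus_series[i] for i in idx] + [query_window], dim=-1)
    return forecaster(context.unsqueeze(0))                           # e.g. (1, C, horizon)
```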
### Standalone Time Series Encoder
We evaluate TRACE on forecasting and classification tasks, comparing it against traditional models trained from scratch and existing time-series foundation models.
The classification results (Table 4) reveal an interesting pattern: fine-tuned foundation models actually perform worse than simpler train-from-scratch models. The likely reason is over-generalization; their embeddings become too broad and lose the domain-specific signal needed for accurate classification. TRACE takes a different approach and achieves significantly higher accuracy and F1 scores than the baselines, both with and without retrieval-augmented generation (RAG). This suggests that TRACE maintains discriminative structure while preserving semantic alignment.
![Table 4: Weather Event Classification Results.|scale=0.2](./assets/table-4.png)
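As a minimal illustration of the standalone-encoder setting, the sketch below freezes a pretrained encoder and trains only a light classification head on pooled embeddings (a linear probe). The actual TRACE fine-tuning recipe and pooling may differ; the names and shapes here are assumptions.

```python
# Linear probe on a frozen pretrained time-series encoder (illustrative only).
import torch
import torch.nn as nn


class FrozenEncoderClassifier(nn.Module):
    def __init__(self, encoder, d_model, n_classes):
        super().__init__()
        self.encoder = encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad_(False)          # only the classification head is trained
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                    # x: (batch, channels, length)
        with torch.no_grad():
            h = self.encoder(x)              # assumed to return (batch, tokens, d_model)
        return self.head(h.mean(dim=1))      # mean-pool tokens, then classify
```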
Table 5 shows TRACE outperforming the baselines across datasets, particularly on longer prediction horizons where other models struggle. Traditional approaches show inconsistent performance as the forecasting window extends. TRACE's cross-modal design appears to be the key difference: it provides better semantic understanding and more context-aware predictions.
![Table 5: Forecasting results (MAE and MSE) of full-shot models and time series foundation models on multi-variate (M) and univariate (U) datasets. Red: the best, Blue: the 2nd best.|scale=0.7](./assets/table-5.png)

config/publications.ts

Lines changed: 11 additions & 0 deletions
@@ -20,6 +20,17 @@ export interface Publication {
}

export const publications: Publication[] = [
  {
    title: "TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval",
    authors: "Jialin Chen, Ziyu Zhao, Gaukhar Nurbek, Aosong Feng, Ali Maatouk, Leandros Tassiulas, Yifeng Gao, Rex Ying",
    venue: "NeurIPS 2025",
    page: "trace",
    code: "https://github.com/Graph-and-Geometric-Learning/TRACE-Multimodal-TSEncoder",
    paper: "https://arxiv.org/pdf/2506.09114",
    abstract: "We address the challenge of time-series retrieval, which remains largely underexplored as existing methods lack semantic grounding, struggle with heterogeneous modalities, and have limited capacity for multi-channel signals. We propose TRACE, a multimodal retriever that grounds time-series embeddings in aligned textual context.",
    impact: "TRACE enables fine-grained channel-level alignment and uses hard negative mining for semantically meaningful retrieval across flexible modes (Text-to-Timeseries and Timeseries-to-Text). Beyond retrieval, it functions as a standalone encoder that achieves state-of-the-art performance on forecasting and classification tasks.",
    tags: [Tag.MultiModalFoundationModel],
  },
  {
    title: "Non-Markovian Discrete Diffusion with Causal Language Models",
    authors: "Yangtian Zhang, Sizhuang He, Daniel Levine, Lawrence Zhao, David Zhang, Syed A. Rizvi, Shiyang Zhang, Emanuele Zappala, Rex Ying, David van Dijk",
