Open
Labels
keep: Keep this issue away from being stale.
question: Further information is requested.
Description
Issue description
Hi, thank you for your great work and for sharing the implementation of TimeMixer++!
While reading the code, I noticed a possible discrepancy between the paper and the current implementation of the forecast function in timemixerpp/backbone.py.
In the paper, the output is defined as:
$$\text{output} = \text{Ensemble}([ \text{Head}_m(x^L_m) ]_{m=0}^M)$$
That is, the final output should be an ensemble (e.g., averaging or weighted sum) of all scale-specific predictions.
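For instance, a plain averaging ensemble (one possible reading of the Ensemble operator; the paper may instead intend a weighted or learned combination) would reduce to:
$$\text{output} = \frac{1}{M+1} \sum_{m=0}^{M} \text{Head}_m(x^L_m)$$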
However, in the current code:
dec_out_list = []
for i, enc_out in zip(range(len(x_list)), enc_out_list):
    dec_out = self.predict_layers[i](enc_out.permute(0, 2, 1)).permute(0, 2, 1)
    dec_out = self.projection_layer(dec_out)
    dec_out = dec_out.reshape(B, self.n_pred_features, -1).permute(0, 2, 1).contiguous()
    dec_out_list.append(dec_out)
dec_out = self.revin_layers[0](dec_out, mode="denorm") if self.use_norm else dec_out
return dec_out
It seems that only the last scale's output (the dec_out from the final loop iteration) is denormalized and returned, while dec_out_list is populated but never read, so the ensemble operation across all scales is missing.
Could you please clarify:
- Was this simplification intentional (e.g., because the ensemble gives a negligible performance gain)?
- Or should the implementation actually ensemble all dec_out_list items (e.g., a mean or weighted sum, as in the sketch below) to match the paper's description?
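If the second option is what the paper intends, a minimal sketch of what the tail of forecast() could look like is shown below, replacing only the last two lines of the snippet above. It assumes an unweighted mean over dec_out_list and that every entry already has the same prediction shape after the per-scale projection; a learned weighted sum would be a straightforward variant, and this is only an illustration, not necessarily the authors' intended implementation:

```python
# Hypothetical replacement for the final two lines of forecast():
# ensemble all scale-specific predictions instead of keeping only the
# last dec_out. Assumes every tensor in dec_out_list has the same
# shape [B, pred_len, n_pred_features].
dec_out = torch.stack(dec_out_list, dim=0).mean(dim=0)

# Denormalize the ensembled output the same way the current code does.
dec_out = self.revin_layers[0](dec_out, mode="denorm") if self.use_norm else dec_out
return dec_out
```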
Thanks again for the great work and detailed codebase!
Your contribution
Already starred