Why Do Transformers Suck at Time Series Forecasting?
Why Deep Learning’s favorite model loses out to simple univariate models
When they were first gaining attention, the world lost its mind about Transformers in Time Series Forecasting. Unfortunately, Transformers never quite lived up to the hype. So, what went wrong?
To quote the authors of “TSMixer: An All-MLP Architecture for Time Series Forecasting”:
“The natural intuition is that multivariate models, such as those based on Transformer architectures, should be more effective than univariate models due to their ability to leverage cross-variate information. However, Zeng et al. (2023) revealed that this is not always the case — Transformer-based models can indeed be significantly worse than simple univariate temporal linear models on many commonly used forecasting benchmarks. The multivariate models seem to suffer from overfitting especially when the target time series is not correlated with other covariates.”
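To make "simple univariate temporal linear model" concrete, here is a minimal sketch: one linear map from the last `lookback` values of a single series to its next `horizon` values, fit by least squares. This is an illustration only, not the exact model or training setup from Zeng et al.; the toy sine-wave data, the window sizes, and the helper names (`make_windows`, `fit_linear`, `predict_linear`) are all assumptions made for the example.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (past window, future window) pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def fit_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W (weights plus one bias row)."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W

def predict_linear(W, X):
    """Apply the fitted linear map to new lookback windows."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    return X1 @ W

# Toy data (assumption): a noisy sine wave with a period of 24 steps.
rng = np.random.default_rng(0)
t = np.arange(600)
series = np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(len(t))

lookback, horizon = 48, 12
X, Y = make_windows(series, lookback, horizon)
split = int(0.8 * len(X))  # chronological split: earlier windows train, later ones test

W = fit_linear(X[:split], Y[:split])
pred = predict_linear(W, X[split:])
mse_linear = float(np.mean((pred - Y[split:]) ** 2))

# Naive reference point: repeat the last observed value across the horizon.
naive = np.repeat(X[split:, -1:], horizon, axis=1)
mse_naive = float(np.mean((naive - Y[split:]) ** 2))

print(f"linear MSE: {mse_linear:.4f}, naive MSE: {mse_naive:.4f}")
```

The whole model is a single weight matrix of shape `(lookback + 1, horizon)`. It sees only the target series, so there are no cross-variate parameters that could overfit to uncorrelated covariates.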
The problems for Transformers don’t end there. Zeng et al., the authors of “Are Transformers Effective for Time Series Forecasting?”, demonstrated that Transformer models could be beaten by a very simple linear model. When analyzing why Transformers failed, they pointed to multi-head self-attention as a potential reason:
“More importantly, the main working power of the Transformer architecture is from its multi-head self-attention mechanism, which has a remarkable capability of extracting semantic correlations between paired elements in a long sequence (e.g., words in texts or 2D patches in images), and this procedure is permutation-invariant, i.e., regardless of the order. However, for time series analysis, we are mainly interested in modeling the temporal dynamics among a continuous set of points, wherein the order itself often plays the most crucial role.”
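The order-insensitivity described in the quote can be checked numerically. The sketch below is a simplified, single-head self-attention layer in NumPy with random weights, no positional encoding, and no causal mask (all assumptions made for illustration; real Transformer forecasters add positional encodings to inject some order information). Shuffling the time steps of the input only shuffles the rows of the output in the same way, and any order-agnostic summary of the output, such as the mean over time, does not change at all.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention, no positional encoding."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
T, d = 6, 4                        # 6 time steps, 4 features (toy sizes)
X = rng.standard_normal((T, d))    # stand-in for an embedded time series
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

perm = rng.permutation(T)          # shuffle the time order
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention(X[perm], Wq, Wk, Wv)

# Shuffling the input just shuffles the output rows the same way.
equivariant = np.allclose(out_perm, out[perm])

# An order-agnostic summary (mean over time) is identical for both orderings.
pooled_invariant = np.allclose(out_perm.mean(axis=0), out.mean(axis=0))

print(equivariant, pooled_invariant)
```

In other words, the attention operation itself carries no notion of "before" and "after". Whatever temporal ordering the model uses has to come from elsewhere, which is a weak fit for a task where order is most of the signal.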
To learn more about their research and Transformers in TSF tasks, I would suggest reading the article “Are Transformers effective for TSF”.
For more details, sign up for my free AI Newsletter, AI Made Simple: https://artificialintelligencemadesimple.substack.com/
That is it for this piece. I appreciate your time. As always, if you’re interested in working with me or checking out my other work, my links will be at the end of this email/post. If you like my writing, I would really appreciate an anonymous testimonial. You can drop it here. And if you found value in this write-up, I would appreciate you sharing it with more people. It is word-of-mouth referrals like yours that help me grow.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- https://artificialintelligencemadesimple.substack.com/
My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819