Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

@article{Zhou2020InformerBE,
  title={Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting},
  author={Haoyi Zhou and Shanghang Zhang and Jieqi Peng and Shuai Zhang and Jianxin Li and Hui Xiong and Wancai Zhang},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.07436},
  url={https://api.semanticscholar.org/CorpusID:229156802}
}
An efficient Transformer-based model for LSTF, named Informer, with three distinctive characteristics, including a ProbSparse self-attention mechanism that achieves O(L log L) time complexity and memory usage while retaining comparable performance on sequence dependency alignment.
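For intuition, here is a minimal PyTorch sketch of the query-sparsity idea behind ProbSparse attention: queries whose attention distribution is far from uniform keep full attention, while the remaining queries fall back to the mean of the values. The full score matrix is computed here for clarity, so this toy version does not actually reach O(L log L); names, sizes, and the sampling-free scoring are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of the ProbSparse idea: score each query by how far its
# attention distribution is from uniform, keep only the top-u queries for
# full attention, and let the rest fall back to the mean of V.
# The dense score computation is a simplification for clarity.
import torch
import torch.nn.functional as F

def probsparse_attention(Q, K, V, u):
    # Q, K, V: (batch, length, d_model)
    d = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d ** 0.5            # (B, L, L)
    # Sparsity measurement: max score minus mean score per query.
    M = scores.max(dim=-1).values - scores.mean(dim=-1)    # (B, L)
    top_idx = M.topk(u, dim=-1).indices                     # (B, u)
    # Default output: every query receives the mean of V.
    out = V.mean(dim=1, keepdim=True).expand_as(V).clone()
    # Top-u queries get genuine attention-weighted values.
    b_idx = torch.arange(Q.size(0)).unsqueeze(-1)
    top_scores = scores[b_idx, top_idx]                     # (B, u, L)
    out[b_idx, top_idx] = F.softmax(top_scores, dim=-1) @ V
    return out

Q = K = V = torch.randn(2, 96, 64)
print(probsparse_attention(Q, K, V, u=16).shape)  # torch.Size([2, 96, 64])
```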

AGCNT: Adaptive Graph Convolutional Network for Transformer-based Long Sequence Time-Series Forecasting

A transformer-based model, named AGCNT, which is efficient and can capture the correlation between the sequences in the multivariate LSTF task without causing the memory bottleneck, and outperforms state-of-the-art baselines on large-scale datasets.

Halveformer: A Novel Architecture Combined with Linear Models for Long Sequences Time Series Forecasting

This paper proposes a novel architecture named Halveformer, which combines linear models with the encoder to enhance both model performance and efficiency and demonstrates that Halveformer significantly outperforms existing advanced methods.

Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution

An efficient Transformer-based model, named Conformer, is proposed, which differentiates itself from existing methods for LTTF in three aspects and outperforms the state-of-the-art methods on LTTF and generates reliable prediction results with uncertainty quantification.

Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting

A novel approach is introduced that encapsulates conceptual relationships among variables within a well-defined knowledge graph, forming dynamic and learnable KGEs for seamless integration into the transformer architecture, which improves the accuracy of multivariate LSTF by capturing complex temporal and relational dynamics across multiple domains.

InParformer: Evolutionary Decomposition Transformers with Interactive Parallel Attention for Long-Term Time Series Forecasting

A novel Transformer-based forecasting model named InParformer with an Interactive Parallel Attention (InPar Attention) mechanism is proposed to learn long-range dependencies comprehensively in both frequency and time domains.

Enformer: Encoder-Based Sparse Periodic Self-Attention Time-Series Forecasting

It is shown that reasonable improvements to the Transformer structure for time-series prediction can reduce the amount of computation while preserving accuracy.

Long Sequence Time-Series Forecasting via Gated Convolution and Temporal Attention Mechanism

This work improves Informer with a gated convolution and temporal attention mechanism, named GCTAM, and demonstrates that the method outperforms Informer on multiple real-world datasets.

Segformer: Segment-Based Transformer with Decomposition for Long-Term Series Forecasting

A Transformer-based model, Segformer, which extracts multiple components with clear dependencies and coordinates the modeling process through multi-component decomposition blocks and collaboration blocks, offering an efficient solution to the long-term dependency modeling problem of time series.
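As an illustration of the kind of decomposition block such models build on, the sketch below splits a series into trend and seasonal/remainder components with a moving average; it is a generic construction with assumed shapes and kernel size, not Segformer's exact decomposition or collaboration blocks.

```python
# Illustrative moving-average decomposition of a series into trend and
# seasonal/remainder components, a generic building block of many
# decomposition-based forecasters.
import torch
import torch.nn.functional as F

def decompose(x, kernel_size=25):
    # x: (batch, length, channels)
    pad = (kernel_size - 1) // 2
    # Replicate-pad the ends so the moving average keeps the sequence length.
    front = x[:, :1, :].repeat(1, pad, 1)
    back = x[:, -1:, :].repeat(1, kernel_size - 1 - pad, 1)
    padded = torch.cat([front, x, back], dim=1)
    trend = F.avg_pool1d(padded.transpose(1, 2), kernel_size, 1).transpose(1, 2)
    seasonal = x - trend
    return seasonal, trend

x = torch.randn(4, 96, 7)
seasonal, trend = decompose(x)
print(seasonal.shape, trend.shape)  # torch.Size([4, 96, 7]) twice
```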

Does Long-Term Series Forecasting Need Complex Attention and Extra Long Inputs?

A lightweight Period-Attention mechanism (Periodformer), which renovates the aggregation of long-term subseries via explicit periodicity and short-term subseries via built-in proximity, and reduces the average search time while finding better hyperparameters.

Grouped self-attention mechanism for a memory-efficient Transformer

The two proposed modules, Grouped Self-Attention (GSA) and Compressed Cross-Attention (CCA), achieve a computational space and time complexity of order $O(l)$ for a sequence length $l$ under small hyperparameter limitations, and can capture locality while considering global information.
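A minimal sketch of the grouped-attention idea, assuming fixed-size groups: attention is restricted to positions within the same group, so cost grows linearly with sequence length for a fixed group size. Names and shapes are illustrative, not the paper's exact GSA/CCA formulation.

```python
# Grouped self-attention sketch: split the sequence into fixed-size groups
# and attend only within each group, giving cost linear in sequence length
# for a fixed group size.
import torch
import torch.nn.functional as F

def grouped_self_attention(x, group_size=16):
    # x: (batch, length, d_model); length must be divisible by group_size here.
    B, L, D = x.shape
    g = x.view(B, L // group_size, group_size, D)        # (B, G, S, D)
    scores = g @ g.transpose(-2, -1) / D ** 0.5          # (B, G, S, S)
    out = F.softmax(scores, dim=-1) @ g                   # (B, G, S, D)
    return out.reshape(B, L, D)

x = torch.randn(2, 128, 64)
print(grouped_self_attention(x).shape)  # torch.Size([2, 128, 64])
```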
...

Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

First, convolutional self-attention is proposed, producing queries and keys with causal convolution so that local context can be better incorporated into the attention mechanism; second, the LogSparse Transformer is proposed, improving forecasting accuracy for time series with fine granularity and strong long-term dependencies under a constrained memory budget.
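The sketch below illustrates only the convolutional self-attention component: queries and keys come from a causal 1-D convolution, so each position summarizes its local past before standard (dense, not LogSparse) attention is applied. Hyperparameters and layer names are placeholders, not the paper's exact module.

```python
# Convolutional self-attention sketch: causal 1-D convolutions produce the
# queries and keys so each position's query/key incorporates local context.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSelfAttention(nn.Module):
    def __init__(self, d_model=64, kernel_size=3):
        super().__init__()
        # Left-pad so the convolution is causal (no access to future steps).
        self.pad = kernel_size - 1
        self.q_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        # x: (batch, length, d_model)
        xc = F.pad(x.transpose(1, 2), (self.pad, 0))      # (B, D, L + pad)
        Q = self.q_conv(xc).transpose(1, 2)                # (B, L, D)
        K = self.k_conv(xc).transpose(1, 2)
        V = self.v_proj(x)
        scores = Q @ K.transpose(-2, -1) / Q.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ V

attn = ConvSelfAttention()
print(attn(torch.randn(2, 96, 64)).shape)  # torch.Size([2, 96, 64])
```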

A Memory-Network Based Solution for Multivariate Time-Series Forecasting

A deep-learning-based model for time series forecasting named Memory Time-series Network (MTNet), inspired by the Memory Network proposed for the question-answering task, which consists of a large memory component, three separate encoders, and an autoregressive component trained jointly.

Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks

A novel deep learning framework, namely the Long- and Short-term Time-series Network (LSTNet), to address the open challenge of multivariate time series forecasting, using a Convolutional Neural Network to extract short-term local dependency patterns among variables and a Recurrent Neural Network to discover long-term patterns in time series trends.
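A rough sketch of that idea follows, with illustrative layer sizes and without the paper's recurrent-skip and autoregressive components: a 1-D convolution extracts short-term local patterns across variables, and a GRU models longer-term dependencies on top of them.

```python
# Miniature LSTNet-style model: Conv1d for short-term local patterns,
# GRU for longer-term temporal dependencies, linear head for a
# one-step multivariate forecast. Sizes are illustrative only.
import torch
import torch.nn as nn

class MiniLSTNet(nn.Module):
    def __init__(self, n_vars=7, conv_channels=32, hidden=64, kernel_size=6):
        super().__init__()
        self.conv = nn.Conv1d(n_vars, conv_channels, kernel_size)
        self.gru = nn.GRU(conv_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_vars)

    def forward(self, x):
        # x: (batch, length, n_vars)
        h = torch.relu(self.conv(x.transpose(1, 2)))      # (B, C, L')
        _, last = self.gru(h.transpose(1, 2))              # last: (1, B, hidden)
        return self.head(last.squeeze(0))                  # (B, n_vars)

model = MiniLSTNet()
print(model(torch.randn(8, 168, 7)).shape)  # torch.Size([8, 7])
```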

A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction

A dual-stage attention-based recurrent neural network (DA-RNN) that addresses the long-term temporal dependencies of the nonlinear autoregressive exogenous (NARX) model and can outperform state-of-the-art methods for time series prediction.

ARMDN: Associative and Recurrent Mixture Density Networks for eRetail Demand Forecasting

A neural network architecture called AR-MDN is proposed that simultaneously models associative factors, time-series trends, and the variance in demand, yielding a significant improvement in forecasting accuracy compared with existing alternatives.

Learning Longer-term Dependencies in RNNs with Auxiliary Losses

This paper proposes a simple method that improves the ability to capture long term dependencies in RNNs by adding an unsupervised auxiliary loss to the original objective, making truncated backpropagation feasible for long sequences and also improving full BPTT.

Long-term Forecasting using Higher Order Tensor RNNs

This work theoretically establishes the approximation guarantees and the variance bound for HOT-RNN for general sequence inputs, and demonstrates 5% ~ 12% improvements for long-term prediction over general RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world time series data.

Long-term Forecasting using Tensor-Train RNNs

Tensor-Train RNNs (TT-RNNs), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics, which decompose the higher-order structure using the tensor-train (TT) decomposition to reduce the number of parameters while preserving model performance.
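As a toy illustration of where the parameter saving comes from, the snippet below reconstructs a small higher-order tensor from a chain of tensor-train cores; the shapes and ranks are arbitrary examples, not TT-RNN's actual construction.

```python
# Tensor-train (TT) toy example: a higher-order tensor is stored as a chain
# of small 3-way cores and reconstructed by contracting adjacent rank indices,
# replacing the product of mode sizes with a sum of small core sizes.
import numpy as np

def tt_reconstruct(cores):
    # cores[k] has shape (r_k, n_k, r_{k+1}) with r_0 = r_K = 1.
    result = cores[0]
    for core in cores[1:]:
        # Contract the trailing rank index with the next core's leading rank index.
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.squeeze(axis=(0, -1))

ranks = [1, 3, 3, 1]
modes = [4, 5, 6]
cores = [np.random.randn(ranks[k], modes[k], ranks[k + 1]) for k in range(3)]
full = tt_reconstruct(cores)
print(full.shape)                                    # (4, 5, 6)
print(sum(c.size for c in cores), "TT parameters vs", full.size, "dense entries")
```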

CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation

This paper is the first to adapt the self-attention mechanism for multivariate, geo-tagged time series data with a novel approach called Cross-Dimensional Self-Attention (CDSA) to process each dimension sequentially, yet in an order-independent manner.
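A sketch of the cross-dimensional idea, under the simplifying assumption that the same scaled dot-product attention is applied separately across the time axis and the variable axis and the two views are summed; the paper's exact CDSA formulation differs, so treat this as illustration only.

```python
# Cross-dimensional attention sketch for multivariate series: attend once
# across time steps and once across variables, then combine the two views.
import torch
import torch.nn.functional as F

def attend(x):
    # Self-attention over the second-to-last axis of x: (..., n, d).
    scores = x @ x.transpose(-2, -1) / x.size(-1) ** 0.5
    return F.softmax(scores, dim=-1) @ x

def cross_dimensional_attention(x):
    # x: (batch, time, variables, d_model)
    over_time = attend(x.transpose(1, 2)).transpose(1, 2)  # attend across time steps
    over_vars = attend(x)                                   # attend across variables
    return over_time + over_vars

x = torch.randn(2, 48, 7, 32)
print(cross_dimensional_attention(x).shape)  # torch.Size([2, 48, 7, 32])
```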
...