Prompt-TSF: A Power Load Forecasting Model Based on Prompt Learning and Two-Stage Fine-Tuning

Authors

  • Hongxiang Cai, State Grid Zhejiang Electric Power Co., Ltd., Ninghai County Power Supply Company
  • Zhiyong Wang, State Grid Blockchain Technology (Beijing) Co., Ltd.
  • Yangping Tang, State Grid Zhejiang Electric Power Co., Ltd., Ninghai County Power Supply Company
  • Jin Ao, State Grid Blockchain Technology (Beijing) Co., Ltd.
  • Qian Li, State Grid Blockchain Technology (Beijing) Co., Ltd.

DOI:

https://doi.org/10.4108/ew.12109

Keywords:

Power Load Forecasting, Large Language Models, Prompt Learning, Two-Stage Fine-Tuning, Time Series Analysis

Abstract

Accurate power load forecasting is vital for grid stability. Traditional models struggle with complex nonlinearities, while adapting Large Language Models (LLMs) to time-series data faces modal alignment challenges. This paper proposes Prompt-TSF, a framework that combines prompt learning with two-stage fine-tuning for multivariate forecasting. We use a high-frequency dataset of load and weather variables from Quanzhou, China. A custom tool transforms the numerical data into natural-language prompts, bridging the modal gap and enriching context. A two-stage fine-tuning strategy then trains the LLM first on domain knowledge and output formatting, and second on the predictive regression task itself. Experiments demonstrate that Prompt-TSF significantly outperforms traditional forecasting methods.
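The paper's prompt-construction tool is not reproduced on this page. As a rough illustration of the idea the abstract describes, the sketch below serializes a window of load and weather readings into a natural-language prompt; the function name, prompt wording, and variable set are hypothetical, not the authors' actual template.

```python
# Hypothetical sketch of serializing a multivariate window into a prompt.
# The actual Prompt-TSF tool, template, and variables are not shown in the
# abstract; everything below is an illustrative assumption.
from typing import List

def build_prompt(timestamps: List[str], load_mw: List[float],
                 temp_c: List[float], horizon: int) -> str:
    """Render recent load/weather readings as a natural-language prompt."""
    lines = [
        f"At {t}, the load was {l:.1f} MW and the temperature was {c:.1f} C."
        for t, l, c in zip(timestamps, load_mw, temp_c)
    ]
    history = "\n".join(lines)
    question = (f"Based on these readings, predict the load in MW for the "
                f"next {horizon} time steps. Answer with numbers only.")
    return f"{history}\n{question}"

if __name__ == "__main__":
    print(build_prompt(
        ["2023-07-01 10:00", "2023-07-01 10:15", "2023-07-01 10:30"],
        [812.4, 825.9, 840.2],
        [31.2, 31.8, 32.1],
        horizon=4,
    ))
```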

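Likewise, the two-stage fine-tuning could be realized in many ways. Below is a minimal sketch assuming a HuggingFace causal LM with LoRA adapters; the backbone (gpt2 here), the toy corpora, and all hyperparameters are illustrative assumptions rather than the paper's recipe. Stage 1 adapts the model to domain text and answer formatting; stage 2 trains it on prompt-to-numeric-answer pairs.

```python
# Hypothetical two-stage fine-tuning sketch with LoRA adapters. The
# backbone, corpora, and hyperparameters are assumptions for illustration.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # gpt2 has no pad token by default
model = get_peft_model(
    AutoModelForCausalLM.from_pretrained("gpt2"),
    LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"],
               task_type="CAUSAL_LM"),
)

def run_stage(texts, output_dir, epochs):
    """Run one causal-LM fine-tuning pass over a list of training texts."""
    ds = Dataset.from_dict({"text": texts}).map(
        lambda batch: tok(batch["text"], truncation=True, max_length=256),
        batched=True, remove_columns=["text"])
    Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir,
                               num_train_epochs=epochs,
                               per_device_train_batch_size=2,
                               report_to="none"),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    ).train()

# Stage 1: domain knowledge and output formatting (toy corpus).
run_stage(["Grid load rises with afternoon temperature peaks.",
           "Answer format: a comma-separated list of MW values."],
          "stage1_out", epochs=1)
# Stage 2: the forecasting regression itself, phrased as text.
run_stage(["At 10:00 the load was 812.4 MW ... Predict 4 steps: "
           "846.1, 851.7, 855.0, 853.2"],
          "stage2_out", epochs=1)
```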


Published

15-04-2026

How to Cite

Cai H, Wang Z, Tang Y, Ao J, Li Q. Prompt-TSF: A Power Load Forecasting Model Based on Prompt Learning and Two-Stage Fine-Tuning. EAI Endorsed Trans Energy Web [Internet]. 2026 Apr. 15 [cited 2026 Apr. 15];12. Available from: https://publications.eai.eu/index.php/ew/article/view/12109