Prompt-TSF: A Power Load Forecasting Model Based on Prompt Learning and Two-Stage Fine-Tuning
DOI: https://doi.org/10.4108/ew.12109
Keywords: Power Load Forecasting, Large Language Models, Prompt Learning, Two-Stage Fine-Tuning, Time Series Analysis
Abstract
Accurate power load forecasting is vital for grid stability. Traditional models struggle to capture complex nonlinear dynamics, while adapting Large Language Models (LLMs) to time-series data is hindered by the modal gap between numeric sequences and text. This paper proposes Prompt-TSF, a framework that combines prompt learning with two-stage fine-tuning for multivariate load forecasting. We use a high-frequency dataset of load and weather variables from Quanzhou, China. A custom tool transforms the numerical data into natural language prompts, bridging the modal gap and enriching the input with contextual information. A two-stage fine-tuning strategy then trains the LLM first on domain knowledge and output formatting, and second on the specific predictive regression task. Experiments demonstrate that Prompt-TSF significantly outperforms traditional forecasting methods.
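To make the pipeline concrete, the two sketches below illustrate the ideas described in the abstract. Both are minimal Python illustrations built on assumptions: the paper's actual tool, prompt templates, data resolution, and hyperparameters are not specified here, so every name and value in the code is hypothetical.

First, a sketch of the prompt-construction step: serializing a window of numeric load and weather readings into a natural-language prompt an LLM can consume. The Reading record, the build_prompt helper, and the template wording are invented for illustration, not the authors' tool.

    # A minimal sketch of the prompt-construction idea; all names and
    # wording here are hypothetical illustrations.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Reading:
        """One timestamped observation of load and weather variables."""
        timestamp: str   # e.g. "2023-07-01 12:00"
        load_mw: float   # measured power load in megawatts
        temp_c: float    # air temperature in degrees Celsius
        humidity: float  # relative humidity in percent

    def build_prompt(history: List[Reading], horizon: int) -> str:
        """Serialize a numeric history window into a natural-language
        prompt, bridging the modal gap between series data and text."""
        lines = [
            "You are a power load forecasting assistant for Quanzhou, China.",
            "Recent observations:",
        ]
        for r in history:
            lines.append(
                f"- {r.timestamp}: load {r.load_mw:.1f} MW, "
                f"temperature {r.temp_c:.1f} C, humidity {r.humidity:.0f}%"
            )
        lines.append(
            f"Predict the load in MW for the next {horizon} steps, "
            "as a comma-separated list of numbers."
        )
        return "\n".join(lines)

    if __name__ == "__main__":
        window = [
            Reading("2023-07-01 12:00", 812.4, 31.2, 74),
            Reading("2023-07-01 12:15", 824.9, 31.5, 73),
        ]
        print(build_prompt(window, horizon=4))

Second, a toy sketch of the two-stage fine-tuning schedule: stage one adapts the model on domain-knowledge and formatting data, stage two trains it on the forecasting regression itself. The tiny stand-in network, random tensors, epoch counts, and the lower stage-two learning rate are assumptions, not the authors' settings.

    # A toy PyTorch sketch of the two-stage schedule; the stand-in
    # network, random data, and hyperparameters are all hypothetical.
    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    def run_stage(model: nn.Module, loader: DataLoader,
                  lr: float, epochs: int) -> None:
        """Run one fine-tuning stage with its own optimizer and rate."""
        opt = torch.optim.AdamW(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                opt.step()

    # Tiny stand-in for the LLM backbone (illustration only).
    model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

    # Stage-1 data would teach domain knowledge and output formatting;
    # stage-2 data would be the regression task. Random tensors stand
    # in for both here.
    stage1 = DataLoader(TensorDataset(torch.randn(64, 8),
                                      torch.randn(64, 1)), batch_size=16)
    stage2 = DataLoader(TensorDataset(torch.randn(64, 8),
                                      torch.randn(64, 1)), batch_size=16)

    run_stage(model, stage1, lr=1e-3, epochs=2)  # stage 1: broad adaptation
    run_stage(model, stage2, lr=1e-4, epochs=2)  # stage 2: task refinement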
License
Copyright (c) 2026 Hongxiang Cai, Zhiyong Wang, Yangping Tang, Jin Ao, Qian Li

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.