Efficient Key Frame Extraction from Videos Using Convolutional Neural Networks and Clustering Techniques

Anjali H Kugate; Bhimambika Y Balannanavar; R.H Goudar; Vijayalaxmi N Rathod; Dhananjaya G M; Anjanabhargavi Kulkarni; Geeta Hukkeri; Rohit B. Kaliwal

doi:10.4108/eetcasa.5131

Authors

Anjali H Kugate Visvesvaraya Technological University https://orcid.org/0009-0001-7923-3135
Bhimambika Y Balannanavar Visvesvaraya Technological University https://orcid.org/0009-0006-5546-5544
R.H Goudar Visvesvaraya Technological University https://orcid.org/0000-0002-4590-7744
Vijayalaxmi N Rathod Visvesvaraya Technological University
Dhananjaya G M Visvesvaraya Technological University https://orcid.org/0000-0002-8492-335X
Anjanabhargavi Kulkarni Visvesvaraya Technological University https://orcid.org/0000-0001-8455-0082
Geeta Hukkeri Manipal Academy of Higher Education https://orcid.org/0000-0001-9511-8578
Rohit B. Kaliwal Visvesvaraya Technological University

Keywords:

Video summarization, Key Extraction, Edge Detection, Motion Analysis, Key frames

Abstract

Video is one of the most reliable sources of information, and in recent years both online and offline video consumption have increased to an unprecedented degree. A main difficulty in extracting information from video is that, unlike an image, where information can be gleaned from a single frame, a video must be watched in its entirety for the viewer to understand its context. In this work, we combine several algorithmic techniques, including deep neural networks and local features, with a variety of clustering techniques to find an efficient method of extracting interesting key frames that summarize a video. Video summarization plays a major role in video indexing, browsing, compression, analysis, and many other domains. A key frame is an important frame that can be used to summarize a video, and key frame extraction, which pulls such significant frames out of the video, is one of the fundamental elements of video structure analysis. Our proposed model uses convolutional neural networks for static video summarization and key frame extraction from videos.
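As an illustration of the general idea of clustering frame features to pick key frames, the following is a minimal sketch and not the authors' implementation. It assumes each frame has already been reduced to a feature vector (for example by a convolutional neural network), uses plain k-means as one possible clustering technique, and returns the frame closest to each cluster centre. The function name `select_key_frames` and its parameters are hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def select_key_frames(features, k, iterations=20):
    """Return sorted indices of up to k representative frames.

    features: list of equal-length feature vectors, one per frame
              (assumed to come from a CNN or any other descriptor).
    """
    n = len(features)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")

    # Deterministic initialisation: seed centroids with evenly spaced frames.
    centroids = [list(features[(i * n) // k]) for i in range(k)]

    for _ in range(max(1, iterations)):
        # Assignment step: attach every frame to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, vec in enumerate(features):
            nearest = min(range(k), key=lambda c: dist(vec, centroids[c]))
            clusters[nearest].append(idx)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if not members:  # an empty cluster keeps its old centroid
                new_centroids.append(centroids[c])
                continue
            dims = zip(*(features[i] for i in members))
            new_centroids.append([sum(d) / len(members) for d in dims])

        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids

    # Key frame per non-empty cluster: the member closest to its centroid.
    key_frames = [
        min(members, key=lambda i: dist(features[i], centroids[c]))
        for c, members in enumerate(clusters)
        if members
    ]
    return sorted(key_frames)


if __name__ == "__main__":
    # Toy input: six frames forming two visually distinct "scenes".
    toy = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0],
           [5.0, 5.0], [5.1, 5.0], [5.2, 5.0]]
    print(select_key_frames(toy, k=2))  # [1, 4]
```

Choosing the frame nearest to each centroid, rather than the centroid itself, guarantees that every key frame is an actual frame of the video. In practice the number of clusters `k` and the feature extractor would both need to be tuned to the footage.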
