Developing a hyperparameter optimization method for classification of code snippets and questions of Stack Overflow: HyperSCC
DOI: https://doi.org/10.4108/eai.27-5-2022.174084

Keywords: Multi-label classification, hyperparameter optimization, programming language prediction

Abstract
Although various machine learning and text mining techniques exist to identify the programming language of complete code files, multi-label prediction for code snippets has not been addressed by the research community. This work aims to devise a tuner for multi-label programming language prediction of Stack Overflow posts. To that end, a Hyper Source Code Classifier (HyperSCC) is devised along with rule-based automatic labeling, taking the bottlenecks of multi-label classification into account. The proposed method is evaluated on seven multi-label predictors to conduct an extensive analysis, and it is further compared with three competitive alternatives in terms of one-label programming language prediction. HyperSCC outperformed the other methods in terms of the F1 score. Preprocessing yields a substantial (50%) reduction in training time when ensemble multi-label predictors are employed. In one-label programming language prediction, the Gradient Boosting Machine (gbm) achieves the highest accuracy (0.99) in predicting R posts, which contain many distinctive, label-determining words. The findings support the hypothesis that multi-label predictors can be strengthened with sophisticated feature selection and labeling approaches.
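The abstract does not show HyperSCC itself; the sketch below is only a minimal illustration of the general pipeline it describes, namely multi-label programming language prediction over Stack Overflow-style snippets with a hyperparameter search. It assumes scikit-learn, TF-IDF features, a one-vs-rest logistic regression, and a toy labeled dataset, none of which come from the paper.

```python
# Minimal sketch (NOT the paper's HyperSCC): multi-label language prediction
# with a small hyperparameter grid search, using scikit-learn as a stand-in.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy snippets with (possibly multiple) language labels per post.
posts = [
    "def add(a, b): return a + b",
    "library(ggplot2); ggplot(df, aes(x, y)) + geom_point()",
    "SELECT name FROM users WHERE id = 1",
    "import pandas as pd; pd.read_sql('SELECT * FROM t', conn)",
    "for i in range(10): print(i)",
    "x <- c(1, 2, 3); mean(x)",
    "INSERT INTO logs (msg) VALUES ('ok')",
    "dbGetQuery(con, 'SELECT count(*) FROM t')",
]
labels = [["python"], ["r"], ["sql"], ["python", "sql"],
          ["python"], ["r"], ["sql"], ["r", "sql"]]

# Binarize label sets: each post maps to a 0/1 vector over languages.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)

# One-vs-rest turns the multi-label task into one binary problem per language.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(token_pattern=r"[A-Za-z_]+")),
    ("clf", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])

# Tiny illustrative search space; the paper's tuner explores its own space.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__estimator__C": [0.1, 1.0, 10.0],
}

# Micro-averaged F1 mirrors the F1-based comparison mentioned in the abstract.
search = GridSearchCV(pipeline, param_grid, scoring="f1_micro", cv=2)
search.fit(posts, y)
print("best params:", search.best_params_)
print("predicted labels:", mlb.inverse_transform(search.predict(posts)))
```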