Fuzzy Systems and its Applications

An online streaming feature selection method based on the Choquet fuzzy integral

Document Type: Original Article

Authors
1 Department of Computer Engineering, Faculty of Engineering, Yazd University, Yazd, Iran
2 Department of Computer Engineering, Faculty of Engineering, Yazd University
3 Department of Computer Engineering, Faculty of Engineering, Lorestan University, Khorramabad, Iran
Abstract
Feature selection is a data preprocessing technique applied to high-dimensional data sets before machine learning and data mining algorithms are run. It aims to find a minimal, optimal subset of the feature set that retains informative features and excludes redundant ones. Many current feature selection methods require the entire feature set up front; if a new feature is added later, the algorithm must be rerun from the beginning. In many real-world applications, however, it is impossible to obtain all the features in advance, or impractical to wait for them. Online feature selection methods address such settings, in which the full feature space is not available at the outset. This paper presents an online feature selection method based on the Choquet fuzzy integral. The method first evaluates each feature arriving in the stream against several filter criteria. It then combines the results with the Choquet operator and decides whether to keep or discard the feature. In the evaluation, the proposed algorithm is compared with six online feature selection methods from two categories. On five real-world datasets, the proposed method achieves an improvement of about two percent over comparable methods in classification accuracy and F-score. In addition, because its calculations are simple, features are evaluated in a short time.
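The aggregation step described in the abstract can be illustrated with a minimal sketch of the discrete Choquet integral. This is not the authors' implementation: the abstract does not specify the filter criteria, the fuzzy measure, or the decision rule, so the criterion names, the cardinality-based measure `cardinality_measure`, and the fixed `threshold` below are all assumptions chosen for illustration only.

```python
from itertools import combinations


def cardinality_measure(criteria, q=1.0):
    """Hypothetical fuzzy measure mu(A) = (|A| / n) ** q (an assumption,
    not taken from the paper). It is monotone, with mu(empty) = 0 and
    mu(all criteria) = 1; q = 1 makes the Choquet integral a plain mean."""
    n = len(criteria)
    return {
        frozenset(subset): (size / n) ** q
        for size in range(n + 1)
        for subset in combinations(criteria, size)
    }


def choquet_integral(scores, measure):
    """Discrete Choquet integral of `scores` (criterion -> value in [0, 1])
    with respect to `measure` (frozenset of criteria -> weight in [0, 1])."""
    ordered = sorted(scores, key=scores.get)  # criteria by ascending score
    total = 0.0
    previous = 0.0
    for i, name in enumerate(ordered):
        # Coalition of all criteria scoring at least as high as `name`.
        coalition = frozenset(ordered[i:])
        total += (scores[name] - previous) * measure[coalition]
        previous = scores[name]
    return total


def select_stream(feature_scores, measure, threshold=0.5):
    """Keep each arriving feature whose aggregated score reaches `threshold`.
    The threshold rule is an assumed stand-in for the paper's decision step."""
    selected = []
    for feature, scores in feature_scores:
        if choquet_integral(scores, measure) >= threshold:
            selected.append(feature)
    return selected


# Illustrative stream: the scores are made up, not results from the paper.
criteria = ["relevance", "non_redundancy", "stability"]
measure = cardinality_measure(criteria)
stream = [
    ("f1", {"relevance": 0.9, "non_redundancy": 0.8, "stability": 0.7}),
    ("f2", {"relevance": 0.2, "non_redundancy": 0.3, "stability": 0.4}),
]
selected = select_stream(stream, measure)
```

Each feature is scored as soon as it arrives and never revisited, which is what lets the procedure run without the full feature space. A non-additive measure (for example `q` different from 1) would let the aggregation reward or penalise agreement among criteria, which is the usual motivation for a Choquet integral over a weighted average.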
Volume 5, Issue 1 - Serial Number 10
June 2022
Pages 161-185

  • Receive Date 27 February 2022
  • Revise Date 05 May 2022
  • Accept Date 23 September 2022