A feature selection method based on hesitant fuzzy sets for multi-label learning

Document Type: Original Article

Abstract

Feature selection has become an essential step in machine learning as high-dimensional data grow more common, and practical dimensionality reduction methods, feature selection chief among them, are increasingly needed. Much of this data is multi-label: each instance in a data set can belong to more than one class label. This paper proposes a feature selection method based on hesitant fuzzy sets to reduce the dimensionality of multi-label data. The method combines three criteria that measure the correlation between features and labels with three criteria that measure the similarity between features, treating each criterion as an expert in feature evaluation. The correlation and similarity criteria are combined using the concept of information energy in hesitant fuzzy sets. To demonstrate the effectiveness of the proposed method, it is compared with recent multi-label feature selection methods in terms of classification accuracy, Hamming loss, and execution time.
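As a rough illustration of the aggregation idea, the Python sketch below treats the scores that several criteria assign to one feature as a hesitant fuzzy element and computes its information energy, taken here as the mean of the squared membership degrees. The score matrices, the function names, and the final rule that subtracts similarity energy from correlation energy are assumptions made for illustration; the abstract does not specify the individual criteria or the exact combination rule.

```python
import numpy as np


def information_energy(memberships):
    """Information energy of each hesitant fuzzy element (one per row).

    Each row holds the membership degrees in [0, 1] that the criteria
    ("experts") assign to one feature; the energy is the mean of their squares.
    """
    h = np.asarray(memberships, dtype=float)
    if h.ndim != 2:
        raise ValueError("expected an array of shape (n_features, n_criteria)")
    if np.any(h < 0) or np.any(h > 1):
        raise ValueError("membership degrees must lie in [0, 1]")
    return np.mean(h ** 2, axis=1)


def rank_features(correlation, similarity):
    """Rank features from best to worst.

    ASSUMPTION: the score "correlation energy minus similarity energy" is a
    hypothetical combination rule, not the one used in the paper.
    """
    score = information_energy(correlation) - information_energy(similarity)
    order = np.argsort(-score, kind="stable")  # highest score first
    return order, score


# Hypothetical, already-normalized scores: 3 features x 3 criteria.
correlation = np.array([[0.9, 0.8, 0.7],   # feature-label correlation
                        [0.2, 0.1, 0.3],
                        [0.5, 0.5, 0.5]])
similarity = np.array([[0.6, 0.6, 0.6],    # feature-feature similarity
                       [0.1, 0.1, 0.1],
                       [0.2, 0.4, 0.0]])

order, score = rank_features(correlation, similarity)
print(order)  # [0 2 1]
```

Squaring the membership degrees rewards features on which the criteria agree strongly, so a feature rated highly by all three correlation criteria outranks one with the same average but more scattered ratings only when the combination rule is chosen accordingly; other rules (for example a ratio instead of a difference) are equally plausible.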
