1. Feature selection in a neighborhood decision information system with application to single cell RNA data classification.
- Author
- Zhang, Jie; Zhang, Gangqiang; Li, Zhaowen; Qu, Liangdong; Wen, Ching-Feng
- Subjects
FEATURE selection, ENTROPY (Information theory), INFORMATION storage & retrieval systems, NEIGHBORHOODS, RNA
- Abstract
A neighborhood information system (NIS) handles an information system (IS) by means of neighborhoods, which sometimes gives it advantages over an IS. A neighborhood decision information system (NDIS) is an NIS with decision attributes. Single-cell RNA (scRNA) data are characterized by high dimensionality, small sample size, unbalanced distribution, heavy noise and high redundancy, so selecting suitable and effective genes has become an important research topic. This paper studies feature selection in an NDIS and considers its application to scRNA data classification. We first give the distance between information values on each attribute in an NDIS. Then, we present tolerance relations on the object set of an NDIS based on this distance. Next, we define the rough approximations in an NDIS by means of the presented tolerance relations. Furthermore, we put forward the notions of δ-dependence degree, δ-information entropy, δ-conditional information entropy and δ-joint information entropy in an NDIS. Based on Kryszkiewicz's idea, we introduce the δ-generalized decision and consider feature selection in a consistent NDIS by decision. Finally, we study feature selection in a consistent NDIS by using dependence degree and information entropy, and design the relevant algorithms. Experimental results on several scRNA datasets demonstrate that the designed algorithms possess excellent performance.
• We present tolerance relations on the object set of an NDIS.
• We define the rough approximations in an NDIS.
• We put forward the notions of δ-dependence degree, δ-information entropy, δ-conditional information entropy and δ-joint information entropy in an NDIS.
• We introduce the δ-generalized decision and consider feature selection in a consistent NDIS by decision.
• We propose three feature selection methods in an NDIS by using dependence degree and information entropy and design the relevant algorithms.
[ABSTRACT FROM AUTHOR]
- Published
- 2021
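
The abstract outlines a pipeline: a per-attribute distance, tolerance relations built from it, rough approximations, a δ-dependence degree, and feature selection driven by that degree. The record does not give the paper's actual definitions, so the sketch below is only an illustration under stated assumptions: it uses the generic neighborhood rough set construction (absolute difference as the per-attribute distance, a positive region made of objects whose δ-neighborhood carries a single decision value) and a simple greedy forward search. The function names `neighborhoods`, `dependence_degree` and `greedy_select`, the toy data and the threshold 0.2 are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def neighborhoods(X: Sequence[Sequence[float]], attrs: Sequence[int],
                  delta: float) -> List[Set[int]]:
    """For each object i, collect the objects within delta of i on every
    attribute in attrs (assumed distance: absolute difference)."""
    n = len(X)
    return [
        {j for j in range(n)
         if all(abs(X[i][a] - X[j][a]) <= delta for a in attrs)}
        for i in range(n)
    ]


def dependence_degree(X: Sequence[Sequence[float]], y: Sequence[int],
                      attrs: Sequence[int], delta: float) -> float:
    """Share of objects whose delta-neighborhood has one decision value,
    i.e. the size of the positive region divided by the number of objects."""
    nbrs = neighborhoods(X, attrs, delta)
    positive = sum(
        1 for i, nb in enumerate(nbrs) if all(y[j] == y[i] for j in nb)
    )
    return positive / len(X)


def greedy_select(X: Sequence[Sequence[float]], y: Sequence[int],
                  delta: float) -> List[int]:
    """Add attributes one at a time until the selected subset reaches the
    dependence degree of the full attribute set (ties go to the lower index)."""
    all_attrs = list(range(len(X[0])))
    target = dependence_degree(X, y, all_attrs, delta)
    selected: List[int] = []
    current = dependence_degree(X, y, selected, delta)
    while current < target:
        candidates = [a for a in all_attrs if a not in selected]
        best = max(
            candidates,
            key=lambda a: (dependence_degree(X, y, selected + [a], delta), -a),
        )
        selected.append(best)
        current = dependence_degree(X, y, selected, delta)
    return selected


# Toy data (hypothetical): attribute 0 separates the two classes,
# attribute 1 does not.
X = [[0.10, 0.90], [0.15, 0.20], [0.80, 0.85], [0.85, 0.25]]
y = [0, 0, 1, 1]
print(greedy_select(X, y, delta=0.2))  # [0]
```

On this toy table the search stops after picking attribute 0, because that attribute alone already gives the same dependence degree as both attributes together. The entropy-based variants mentioned in the abstract would replace `dependence_degree` with an entropy measure; they are not sketched here because the record does not define them.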