1. A Semi-supervised Stacked Autoencoder Approach for Network Traffic Classification
- Author
Ons Aouedi, Dhruvjyoti Bagadthey, Kandaraj Piamrat
Affiliations: Laboratoire des Sciences du Numérique de Nantes (LS2N), a joint unit of Université de Nantes (UN), École Centrale de Nantes (ECN), Centre National de la Recherche Scientifique (CNRS), and IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique); Université de Nantes - Faculté des Sciences et des Techniques; Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST); Institut Mines-Télécom [Paris] (IMT); Indian Institute of Technology Madras (IIT Madras)
- Subjects
Computer science, [INFO.INFO-NI]Computer Science [cs]/Networking and Internet Architecture [cs.NI], Traffic classification, Machine learning, Deep learning, Artificial intelligence, Feature extraction, Semi-supervised learning, Supervised learning, Unsupervised learning, Autoencoder, Stacked Autoencoder, Stacked Denoising Autoencoder, Dropout, Robustness (computer science)
- Abstract
Accepted at IEEE ICNP HDR-Nets workshop 2020. Network traffic classification is an important task in modern communications. Several approaches have been proposed to better differentiate among applications. However, most of them rely on supervised learning, which uses only labeled data. In practice, many datasets are only partially labeled, and the unlabeled portion, which can also carry informative characteristics, is ignored. To address this issue, we propose a semi-supervised approach based on deep learning. We use deep learning because it can take both labeled and unlabeled data into account and can integrate feature extraction and classification into a single model. Specifically, we propose an approach based on a stacked sparse autoencoder (SSAE) combined with denoising and dropout techniques, which improve the robustness of the extracted features and prevent over-fitting during training. The results show better performance than traditional models while keeping the whole procedure automated.
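The ingredients named in the abstract (denoising, dropout, a sparsity constraint, and the use of both labeled and unlabeled samples) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not state which noise type, sparsity penalty, or training schedule the paper uses. The sketch assumes masking noise, inverted dropout, a KL-divergence sparsity penalty, and a single joint loss, and all function names, the target activation `rho`, and the weight `beta` are hypothetical.

```python
import math
import random

def corrupt(x, noise_rate, rng):
    """Masking noise (assumed): zero each input feature with probability noise_rate."""
    return [0.0 if rng.random() < noise_rate else v for v in x]

def dropout(h, drop_rate, rng):
    """Inverted dropout: zero each unit with probability drop_rate, rescale survivors."""
    if drop_rate <= 0.0:
        return list(h)
    keep = 1.0 - drop_rate
    return [0.0 if rng.random() < drop_rate else v / keep for v in h]

def kl_sparsity(rho, rho_hat):
    """KL penalty (assumed) between target activation rho and mean activations rho_hat."""
    eps = 1e-8
    total = 0.0
    for r in rho_hat:
        r = min(max(r, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += rho * math.log(rho / r) + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
    return total

def mse(x, x_hat):
    """Mean squared reconstruction error for one sample."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(max(probs[label], 1e-12))

def semi_supervised_loss(samples, rho=0.05, beta=0.1):
    """Joint loss (assumed form): reconstruction and sparsity on every sample,
    classification loss only on samples whose 'label' is not None."""
    n = len(samples)
    recon = sum(mse(s["x"], s["x_hat"]) for s in samples) / n
    n_hidden = len(samples[0]["hidden"])
    rho_hat = [sum(s["hidden"][j] for s in samples) / n for j in range(n_hidden)]
    labeled = [s for s in samples if s["label"] is not None]
    clf = sum(cross_entropy(s["probs"], s["label"]) for s in labeled) / len(labeled) if labeled else 0.0
    return recon + beta * kl_sparsity(rho, rho_hat) + clf
```

In this sketch an unlabeled sample still contributes through the reconstruction and sparsity terms, which is the sense in which unlabeled traffic is not ignored. An alternative that is equally consistent with the abstract is unsupervised pretraining on all data followed by supervised fine-tuning on the labeled subset.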
- Published
- 2020