Abstract:
Deep packet inspection methods have grown more sophisticated as network technology has advanced. Many packet inspection techniques have been developed to assess the state of a network, and machine learning methods have recently been applied to these systems. The goal is to identify the types of traffic passing through the network. In this thesis, deep packet inspection methods are proposed to detect malicious traffic and to identify the applications running on the network. Time-series and flow-based methods are proposed for these tasks, and novel feature sets are constructed to support them. A greedy algorithm that finds an upper bound on the distance between probability distributions of different sizes is used in the feature extraction process. The extracted features fall into two categories: statistical features and payload-based features. Statistical features are derived from packet header values such as IP addresses, while novel payload-based features are extracted from the payload portion of packets. The feature sets are used with decision tree models in a supervised learning setting to perform detection. The proposed approaches are applied to network intrusion detection and network application classification. For network intrusion detection, performance is evaluated on several well-known, publicly available intrusion detection data sets covering different types of attacks. For network application classification, a data set of real-world network traces from popular applications is used. Simulation results show that the proposed flow-based approaches perform well on both tasks.