Introduction
The Internet of Things (IoT) is a term defined in numerous professional and scientific sources. The concept was first articulated in 1999 by Kevin Ashton, co-founder and executive director of the Auto-ID Center. As it developed and its applications grew, the IoT concept was defined by numerous standardization bodies, organizations, and associations in the field of information and communication technologies (ICT), as well as by numerous researchers. The IoT concept can be viewed as extending existing human-application interaction with a new dimension of integration and communication represented by objects. This potential enables its implementation and application in various areas covering society, the environment, and industry.
According to the forecasts presented in Statista (2018a), approximately 31 billion IoT devices were available globally at the end of 2020, and by 2025 there will be 75 billion. Of these devices, 41%, or 12.86 billion, will be installed within the smart home (SH) concept (Statista, 2018b). The constraints of IoT devices in general, and therefore of SHIoT (smart home IoT) devices, are described in Cvitić et al. (2016). They include hardware constraints, demands for high autonomy, and low production costs, which reduce the possibility of implementing advanced protection methods and increase exposure to the many threats shown in Ali and Awad (2018). These device limitations increase the risk of cyberattacks on IoT devices and of IoT devices being used to launch attacks on other targets.
Traffic generated by SHIoT devices, or MTC (Machine Type Communication) traffic, differs from conventional HTC (Human Type Communication) traffic, as shown in the survey by Al-Shammari et al. (2018). Although SHIoT devices are characterized by heterogeneity, MTC traffic is homogeneous compared with HTC traffic, meaning that devices of the same or similar purpose behave approximately equally and generate similar traffic (Laner et al., 2013). Current research focuses mainly on creating behavioral patterns (fingerprints) specific to an individual device, or on classifying devices by functionality or purpose. Such an approach is neither adequate nor efficient in dynamic conditions such as IoT, where new devices with new features, functionalities, and purposes are developed and brought to market daily. Such an environment requires more generic classes, independent of the devices' semantic characteristics and based solely on the features of the network traffic they generate.
The underlying hypothesis of this research is that SHIoT devices can be differentiated by traffic flow characteristics, such as the ratio of received to sent data, and that such features can be used to define classes of IoT devices. Such an approach is vital for the future development of cyberattack detection and mitigation systems tailored to the IoT concept. Classes defined in this way are independent of semantic categorization and functionality-based classification, and therefore remain applicable to IoT devices developed in the future. This makes it possible to develop novel machine-learning-based traffic anomaly detection systems, because a normal behavior profile can be defined for each class of IoT device and used as a foundation for detecting the anomalous behavior of individual devices.
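As a rough illustration of the kind of feature this hypothesis relies on, the sketch below computes the coefficient of variation of a device's upload/download ratio, the statistic the fourth section uses to define device classes. It is a minimal sketch under stated assumptions, not the procedure used in this research: the function names, the sample flows, the direction of the ratio (sent over received), and the use of the population standard deviation are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev


def upload_download_ratios(flows):
    """Return sent/received byte ratios for flows that received any data.

    `flows` is a list of (bytes_sent, bytes_received) pairs. Flows with
    zero received bytes are skipped to avoid division by zero (an
    assumption made for this sketch).
    """
    return [sent / received for sent, received in flows if received > 0]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    if not values:
        raise ValueError("need at least one value")
    mu = mean(values)
    if mu == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(values) / mu


# Hypothetical per-flow (bytes_sent, bytes_received) pairs for two devices.
flows_by_device = {
    "sensor": [(100, 50), (100, 50), (100, 50)],     # perfectly regular traffic
    "camera": [(100, 100), (300, 100), (500, 100)],  # more variable traffic
}

cv_by_device = {
    name: coefficient_of_variation(upload_download_ratios(flows))
    for name, flows in flows_by_device.items()
}
```

In this toy data the device with identical flows has a coefficient of variation of zero, while the device with a varying ratio has a larger one. A class definition could then group devices by ranges of this value, although the boundaries between classes are not specified here.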
The rest of this paper is organized as follows. The subsection "Related research" reviews current research, its shortcomings, and the positioning of our research relative to previous findings. The subsection "Research methodology" explains the methodology and methods used in the research. The second section gives an overview of the smart home environment, the communication technologies used, and device heterogeneity; it also addresses some of the most important cybersecurity challenges related to the IoT concept. The third section presents the data collection process, including a description of the devices used, descriptive statistics of the collected data, and an explanation of the feature extraction process. In the fourth section, classes of IoT devices are defined based on the coefficient of variation of each device's upload and download traffic ratio. The fifth section discusses the presented approach to IoT device class definition and feature calculation. The sixth and final section gives the conclusion, final remarks, and directions for future research based on the findings of this research.