Deep Learning Models for Detecting Software Defects

Kalpana Singh, Alankrita Aggarwal
Copyright: © 2024 | Pages: 16
DOI: 10.4018/979-8-3693-1503-3.ch009

Abstract

As the use of automated, software-based applications grows, software defect classification must itself be automated to ensure software dependability. This chapter proposes an automatic, expert-based approach to software defect classification. Defect report categorization is the task of assigning defects to predetermined categories. Recently, numerous machine learning (ML) techniques have been introduced to classify defects into different categories. The proposed classification model creates new word embeddings from defect reports using deep learning (DL) models, specifically the long short-term memory (LSTM) network, recurrent neural network (RNN), convolutional neural network (CNN), and multilayer perceptron (MLP). The outcomes are compared with pre-trained word embeddings from Google's word2vec in terms of recall, accuracy, and precision. The experimental results show that LSTM outperforms the other models used in the investigation. The maximum accuracy that LSTM attains on the Redmine dataset is 70%.
Chapter Preview

1. Introduction

Software dependability and defect classification are vital because of the growing dependence on software-based systems. Scholars have begun to focus on how software processes affect software defects. To examine the reasons behind software faults and improve the development process, software defects need to be categorised. Categorisation also helps prevent software defects and improve the organization's software maturity and software quality (Abdu, A. et al., 2022; Druzhkov, P. N. & Kustikova, V. D., 2016; Gao, J. et al., 2019). By definition, a software bug is “a deficiency in a work product that does not meet its specifications and must be replaced or repaired” (Shi, M., Wang, K., & Li, C., 2019; Wang, H., & Yuan, L., 2022; Xiao, Y. et al., 2019; Zhou, P. et al., 2019). A software defect report is a document that records anomalies observed and reported by users of a software product. Each defect has several attributes, such as Id, Status, Description, Severity, Resolution, and Priority. Bug reports are monitored via bug tracking systems. Redmine Bug Tracker is an online project management tool that assists software developers, businesses, and professionals by storing defect data (Lopes, F. et al., 2020). A defect follows a life cycle that begins with the defect being categorised so that it can be validated. An expert is then assigned to analyse the defect report and fix the defect. The experts also categorise the defect. Classifying software defects is therefore essential; yet, as the volume of defect reports increases, manual classification becomes more difficult and complex.
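The attributes listed above can be pictured as a simple record. The following is a minimal sketch only: the field types and the sample values are assumptions made for illustration and are not taken from the Redmine data used in the chapter.

```python
from dataclasses import dataclass


@dataclass
class DefectReport:
    """One defect report, using the attributes named in the text.

    The field types are assumptions; a real tracker may store them differently.
    """
    id: int
    status: str
    description: str   # free text; the input to the text classifier
    severity: str
    resolution: str
    priority: str


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    report = DefectReport(
        id=101,
        status="New",
        description="Login page crashes when the form is submitted.",
        severity="Major",
        resolution="Unresolved",
        priority="High",
    )
    print(report.description)
```

Of these fields, only the free-text description is passed to the text classification models discussed next.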

Researchers have proposed methods for automating defect classification (Aizawa, A., 2000). The proposed techniques treat the textual descriptions in defect reports as supervised and semi-supervised text classification tasks. In a text classification problem, text is preprocessed and transformed into real-valued vectors before it is fed into machine learning (ML) algorithms. A model is trained on these vectors and then tested on previously unseen defect reports. The proposed methods used either the BoW (Bag of Words) model or the TF-IDF (Term Frequency-Inverse Document Frequency) model to represent words as real-valued vectors (Aizawa, A., 2000). Neither model captures the meaning of the text or of individual words. We used deep learning (DL) models, such as the CNN (Convolutional Neural Network) and the LSTM (Long Short-Term Memory) network, to automatically classify defects in a report. LSTM networks can retain information about long sequences or patterns over many time steps and thereby mitigate the vanishing gradient problem. The information is updated across time steps using input, forget, and output gates. A CNN is a feed-forward neural network with two or more hidden layers, known as convolution and pooling layers (Deng, J. et al., 2014).
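The BoW and TF-IDF representations mentioned above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the cited work: the tokenizer, the function names, and the specific TF-IDF variant (term frequency normalized by document length, with idf = log(N / df)) are all choices made here for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, count vectors): one raw term-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / doc length
    and idf = log(N / document frequency)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, vectors


if __name__ == "__main__":
    # Hypothetical defect descriptions, for illustration only.
    reports = [
        "Login page crashes on submit",
        "Page crashes when saving",
        "Typo in the help text",
    ]
    vocab, vectors = tf_idf(reports)
    for term, weight in zip(vocab, vectors[0]):
        print(f"{term}: {weight:.3f}")
```

Both representations depend only on which words occur and how often, so two reports that use the same words in a different order receive identical vectors. That is the limitation that learned word embeddings and sequence models such as LSTM are meant to address.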
