Research Article - (2021) Volume 7, Issue 8
Asadi Srinivasulu1*, Tarkeswar Barua1, Srinivas Nowduri2 and Sivaram R2
1Data Analytics Research Laboratory, SVEC, Blue Crest University College, Monrovia, Liberia
2Department of Computer Science, PCC, 900 W Orman Ave, Pueblo CO 81004, United States
Received Date: March 01, 2021; Accepted Date: September 08, 2021; Published Date: September 20, 2021
Citation: Srinivasulu A (2021) Early Prediction of Covid-19 Using Modified Recurrent Neural Networks. J Infec Dis Treat Vol.07 No.08.
Today, COVID-19 has become one of the most widely recognized diseases. This research work applies different pre-training techniques and strategies, under deep recurrent networks (DRN), to classify chest X-ray images into three broad classes, viz., Normal/Negative, Pneumonia positive and COVID-19 positive, based on two open-source data sets. This five-fold research work is based on data sets consisting of 980 chest X-ray images of COVID-19 patients, along with experimentation using various deep learning [16, 31] and neural network methods, and is organized in the following way.
First, we introduce some pre-training techniques that help the network learn better, especially on an imbalanced data set with fewer cases of COVID-19 and more cases of the other classes. Second, we propose a new, modified recurrent [17] neural network (MRNN) in which the simple application of a filter to an input results in an activation. Third, repeated application of the same filter to the input yields a map of activations (viz., a feature map) that indicates the locations and strength of a detected feature in the input, such as an image, using a concatenation of the neural networks and VGG Net networks. Fourth, our computational work reveals that the resulting network achieves the best accuracy by utilizing multiple features extracted by two other robust networks. Fifth, our network evaluation (based on 980 images) reveals that the reported accuracy is achievable and adaptable to real-life circumstances. The average accuracy of the proposed network in detecting COVID-19 cases is found to be 95.60%, with an overall average accuracy of 72.45% for all other classes. Finally, a comparative study with other methods, in terms of accuracy, time complexity and performance, finds that the proposed network reduces computational cost and works with a large amount of training data better than the existing system.
Detection, Classification, Neural Networks (NN), RNN, MRNN, COVID-19 virus, DWI, CAD, Image Processing, Deep learning (DL) [16, 31], Recurring learning, X-ray chest images [29].
Coronaviruses are a group of viruses that originate in mammals and cause disease. COVID-19 (COronaVIrus Disease) belongs to the corona family and was first reported to the World Health Organization (WHO) on Dec. 31st, 2019. The very first case was found in the city of Wuhan, China (Hubei province), and the disease became known as the "Wu Flu". Coronaviruses have among the longest RNA (ribonucleic acid) genomes, approximately 26k to 32k bases long. COVID-19 [2][3] may be the most widely recognized kind of disease among people in the entire world.
The most recent COVID-19 data from 2019 indicate that it is the third leading cause of death among Corona cases in the USA, with around 161,460 cases, accounting for 19.27% of new disease occurrences, and 26,730 fatalities, accounting for 8.34% of deaths. Although COVID-19 [1][33] is a far-reaching danger, recognizing it in its early stages yields good treatment success rates, whereas late recognition is costly because the condition has already progressed. Accordingly, researchers conclude that early monitoring is fundamental to improving patient survival.
Graph for Mortality
ML is a subfield of artificial intelligence (AI) that uses data to enable machines to learn to perform tasks on their own; it also helps in classifying and forecasting information from a given data set [18][19][35]. Statistical consistency, on the other hand, refers to a specific property of a statistical procedure, such as estimation of population parameters, confidence interval estimation and tests of hypotheses, and is a fundamental notion for supervised and unsupervised learning. Multiclass classification is an ML [3][4] task with more than two classes or outputs. This research work focuses on a unified framework for studying the consistency of general multi-class learning problems, which generalizes many known past results for specific learning problems.
The majority of multi-class learning problems use an evaluation metric based on a loss matrix; as a result, algorithms for such problems are surrogate-minimizing algorithms, which are characterized by a surrogate [5][6] loss. If the surrogate loss is convex, the surrogate-minimizing algorithm can be framed as a convex optimization problem and solved efficiently.
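As a small illustration of the idea of a convex surrogate, the sketch below compares the 0-1 loss with the hinge loss in the binary case. The binary setting, the choice of the hinge loss and the function names are assumptions made for illustration only; the text above concerns general multi-class loss matrices and does not name a specific surrogate.

```python
def zero_one_loss(y, score):
    """True (target) loss: 0 if the sign of score agrees with y in {-1, +1}, else 1."""
    return 0.0 if y * score > 0 else 1.0


def hinge_loss(y, score):
    """A convex surrogate for the 0-1 loss: max(0, 1 - y * score)."""
    return max(0.0, 1.0 - y * score)


def is_midpoint_convex(loss, y, a, b):
    """Check the convexity inequality loss(mid) <= average of the endpoint losses."""
    mid = loss(y, (a + b) / 2.0)
    return mid <= (loss(y, a) + loss(y, b)) / 2.0 + 1e-12


if __name__ == "__main__":
    for s in (-1.0, 0.5, 2.0):
        print(s, zero_one_loss(1, s), hinge_loss(1, s))
```

The hinge loss upper-bounds the 0-1 loss and satisfies the convexity inequality, whereas the 0-1 loss does not, which is why minimizing a convex surrogate is computationally easier than minimizing the true loss directly.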
This research study focuses on three directions. The first part describes calibrated surrogate losses, which lead to a consistent surrogate-minimizing algorithm for a given loss matrix. It also discusses the necessary and sufficient conditions under which calibration holds, based on geometric properties of the surrogate and true loss. The second part discusses the convex calibration dimension, which characterizes the intrinsic difficulty of achieving consistency for a learning problem. Finally, we analyze a generic procedure for constructing convex calibrated surrogates.
In the health data analytics field, computer vision (CV) has enabled computer-aided diagnosis (CAD), which is closely combined with imaging feature engineering. ML classification, in turn, shows promise in assisting and supporting radiologists in accurate diagnosis, reducing the time and the associated cost of such diagnosis.
DL methods show promising results in a variety of CV tasks, such as segmentation, classification, and object detection. These methods consist of recurrent [16] layers that can extract features ranging from low-level local ones to high-level global ones. A fully connected layer at the end of the recurrent neural layers converts the extracted features into probabilities of specific labels. For example, a Batch Normalization Layer standardizes the input of a layer to zero mean and unit variance, and a Dropout Layer is a regularization strategy that ignores randomly chosen nodes. Such layers are expected to improve the performance of deep learning-based methods. The concluding focus is on the principal challenges of deep learning-based methods, which are applied to various fields, such as clinical imaging and image database applications [34].
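The two layer types mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the small `eps` stabilizer and the "inverted dropout" rescaling are choices made here for clarity.

```python
import random


def batch_normalize(values, eps=1e-5):
    """Standardize a batch of activations to (approximately) zero mean, unit variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    # eps (an assumed small constant) avoids division by zero for constant batches.
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def dropout(values, rate, rng):
    """Zero each node with probability `rate`; rescale kept nodes by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    activations = [1.0, 2.0, 3.0, 4.0]
    print(batch_normalize(activations))
    print(dropout(activations, rate=0.5, rng=rng))
```

Dropout is applied only during training; at prediction time all nodes are kept.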
Literature Review
According to past research in this area, an immense amount of work has been done by people working at hospitals, clinics, and laboratories, and many researchers and scientists have dedicated considerable effort to the fight against the COVID-19 epidemic [33]. Due to the uncontrolled spread of the disease, AI has made a significant contribution to digital health by applying the basics of Automatic Speech Recognition (ASR) and DL algorithms. This study also focuses on the importance of speech signal processing for early screening and diagnosis of the COVID-19 virus using a Recurrent [17] Neural Network (RNN), in particular its Long Short-Term Memory (LSTM) architecture, to analyze patient symptoms such as cough, breath and voice. Our results show lower accuracy on the voice test data than on the coughing and breathing sound samples. Our results are at a preliminary stage; the accuracy of the voice tests could possibly be improved by expanding the data set and targeting a larger group of healthy and infected people.
Recently, NAACCR announced that all cases are classified according to the ICD for Oncology, with the exception of childhood and adolescent COVID-19 cases, which were classified according to the ICCC. Causes of death were classified according to the ICD. Whenever possible, the disease incidence rates presented in this report were adjusted for delays in reporting, which occur because of a lag in case capture or data corrections.
Several researchers have reported automated screening and diagnosis based on the analysis of chest CT images [2], [3], [4], [5]. AI has been embraced and applied in e-health to aid early detection of COVID-19 by analyzing sounds such as coughing, breathing, and speech [1]. Respiratory [32] sound is an indicator of human health status and can be recognized and diagnosed using ML algorithms [6].
Since the outbreak of the COVID-19 virus, many scientists and researchers have considered detecting COVID-19 from respiratory [32] sounds [7]. In similar studies, a low-power wearable system is proposed for detecting asthma and wheezing; the analysis is based on the frequencies of the patient's sound features and respiratory [32] sounds [8][9]. In another significant study, a convolutional neural network (CNN) [30] is used to detect different types of coughs based on the analysis of their extracted sound features [10]. Besides, several systems are proposed for predicting COVID-19 using DL algorithms with classifiers such as CNN, Long Short-Term Memory (LSTM), and Artificial Neural Networks (ANN) [11]. COVID-19 patients' health status can therefore be assessed through their speech signals. A patient health-state detection system can be used to observe and analyze sleep quality, severity of illness, fatigue, and anxiety [12]. Since cough is a symptom of many diseases, it is possible to distinguish between coughs and establish the type of illness by testing the auditory features using multiple classifiers [13]. Another seminal work proposes a multi-pronged mediator AI architecture exclusively for differentiating types of coughs [14].
The latest year for which mortality data are available lags 2 to 4 years behind the current year because of the time required for data collection, compilation, quality control, and dissemination. The number of invasive disease cases was estimated using a 3-step spatio-temporal model based on high-quality incidence data from 50 states and the District of Columbia, representing around 94% population coverage (data were missing for all years for Minnesota and for some years for other states). This method cannot estimate numbers of basal cell or squamous cell skin malignancies because data on the occurrence of these malignancies are not required to be reported to disease registries.
The lifetime probability of being diagnosed with an invasive malignancy is higher for men (43.4%) than for women (56.6%). In 2016, an estimated 10,480 children (birth to 13.5 years) will be diagnosed with malignancy (excluding benign/borderline brain COVID-19s) and 1,240 will die from the disease. Benign and borderline brain COVID-19s are excluded from the 2016 case estimates because the calculation method requires historical data and these COVID-19s were not required to be reported until 2004. COVID-19 incidence rates increased in children and adolescents by 0.623% per year from 1975 through 2012.
Existing Systems and their Implementation Details
Currently, there are some existing systems for assessing the condition of COVID-19 patients based on a few NN techniques [9][10]. These techniques and algorithms are found to be limited in terms of accuracy and time complexity. In order to address these limitations, we propose the MRNN algorithm described in this work. This section addresses the existing systems along with their drawbacks, and then proposes a new system.
Existing System
In existing systems, judging by their confusion matrices/tables, the concatenated network may perform better at detecting COVID-19 overall yet remain unable to detect its positive cases [8][9]. Their proposed techniques are based on an imbalanced data set with limited sample cases of COVID-19 and thus do not yield the required results. This work improves COVID-19 early detection through MRNN, coupled with detection of the other classes (such as positive or negative). It also identifies the main reason behind the low precision in past work as their data mining techniques, which we have replaced in our research work with deep learning [16][31] techniques.
Another research work presents its results in two different forms, for 2 and 3 classes, due to imbalance in the COVID-19 image data set, and reports several insignificant results [31]. Our computational work presents more meaningful results for each class, which are more practical in nature. As we concentrate on better and more realistic performance of our network, we tested on a larger set of COVID-19 images.
Our work shows that, among the total 11,273 COVID-19 cases, only 67 cases were wrongly classified as infected. Our work uses recurrent neural networks because of their capability to address certain pertinent issues of feedforward neural networks, as follows [17][33]:
• Inability to handle sequential data
• Dependence only on the current input
• Failure to memorize previous inputs
One solution to these issues is the use of an RNN. An RNN is capable not only of handling sequential data and accepting the current input, but also of memorizing previously received inputs in its internal memory.
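The difference between the two network types can be illustrated with a single scalar unit. The sketch below is a simplified example under assumed weights (`w_x`, `w_h`, `b` are hypothetical values, not parameters from this work): the feedforward unit sees only the current input, while the recurrent unit also receives its own previous state.

```python
import math


def feedforward(inputs, w_x, b):
    """Each output depends only on the current input."""
    return [math.tanh(w_x * x + b) for x in inputs]


def rnn_forward(inputs, w_x, w_h, b):
    """Each output depends on the current input and on the previous hidden state."""
    h = 0.0  # internal memory, initially empty
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


if __name__ == "__main__":
    sequence = [1.0, 0.0, 1.0]
    print(feedforward(sequence, w_x=0.5, b=0.0))
    print(rnn_forward(sequence, w_x=0.5, w_h=0.8, b=0.0))
```

For the same input value at the first and third positions, the feedforward unit returns identical outputs, whereas the recurrent unit returns different ones because its state carries information from earlier inputs.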
Drawbacks in the Existing System
• Low accuracy.
• High time complexity.
• Low performance.
• High computational cost.
• Requires a larger COVID-19 training image data set.
Proposed System
Modified RNNs are the most popular deep learning models for handling multidimensional array data, such as color images. A typical Modified RNN [31] comprises several recurrent [17] and pooling layers followed by a few fully connected layers to simultaneously learn a feature hierarchy and classify images. It uses error backpropagation, an efficient form of gradient descent, to update the weights connecting its inputs to the outputs through its multi-layered architecture. In this paper, we present a two-stage approach using two separate Modified RNNs. The first Modified RNN detects nuclei in a given tissue image, while the second Modified RNN takes patches centered at the detected nuclear centers as input to predict the probability that the patch belongs to a case of PCa recurrence. Before describing our Modified RNN models, we present the details of the data used to develop the proposed PCa recurrence model.
Advantages of Proposed System
• High accuracy.
• Low time complexity.
• High performance.
• Reduces Computational cost.
• Works better than the existing system even with a small amount of training data.
• The deep convolutional network is based on the Exception and ResNet60V3 networks, which improves accuracy [16][31].
• A training technique for dealing with imbalanced image data sets.
• Based on the Exception and ResNet60V3 networks, we evaluate our network on 11302 chest X-ray images [29].
• After using the ResNet60V3 and Exception networks on the image data set, we compare the performance of our proposed network with others.
System Design and Implementation
The basic idea of our system design and implementation is to ensure that COVID-19 patient information is organized in a way that supports the architecture and components needed for early prediction [23][24]. System design is thus the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It overlaps and collaborates with the disciplines of systems analysis, systems architecture and systems engineering. Performance or efficiency is measured based on the output produced by the application. Requirement specifications play a significant role in the analysis of a system. Given appropriate requirement specifications from patients, it is possible to design a better system [21][22][23] that eventually fits the required environment. The requirement specifications also rest to a great extent on the existing users of the current system [24].
Paper Implementation and Experimentation Details
As described above, in the implementation and experimentation of our work we combined CNN and RNN to achieve better results in terms of performance, accuracy and time complexity. The corresponding implementation aspects of the system modules and the ERNN algorithm [21][23] are described as follows.
System Modules
We refer to the execution flow of the experimentation as the system modules, which can be discussed as three modules:
• Collecting image Data Sources
• Preprocessing image Data sets
• Feature extraction Learning
Collecting image Data Sources: We have taken data sets from the UCI, Kaggle and Google websites. For our experimentation [7][8], we used two types of data sets to assess the performance risk.
Structured data: Structured data is simply tabular data, i.e., information that is highly organized into rows and columns. It fits readily into a relational database and is easily searched by simple, straightforward search engine algorithms or other search operations. Most of the time it is available in .CSV [11][12] format.
Preprocessing image Datasets
• Features, i.e., attributes through which patients are affected, are extracted from the image data sets.
• Preprocessing may involve eliminating duplicate values and filling in missing values.
• The importance of each feature in affecting the patient can be found using correlation analysis or in the max pooling stages.
• Unstructured data needs to be converted to structured data with a target class.
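The preprocessing bullets above can be sketched for a small tabular data set as follows. This is a minimal illustration under stated assumptions: mean imputation is one possible way to fill missing values, Pearson correlation is one possible correlation measure, and the toy rows and function names are hypothetical rather than taken from this work.

```python
def drop_duplicates(rows):
    """Remove exact duplicate rows while preserving the original order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


def fill_missing_with_mean(rows, col):
    """Replace None in column `col` with the mean of the observed values."""
    present = [r[col] for r in rows if r[col] is not None]
    mean = sum(present) / len(present)
    return [[mean if (i == col and v is None) else v for i, v in enumerate(r)]
            for r in rows]


def pearson(xs, ys):
    """Pearson correlation between a feature column and the target column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical rows: [feature value, target class]
    rows = [[1.0, 0], [1.0, 0], [None, 1], [3.0, 1]]
    clean = fill_missing_with_mean(drop_duplicates(rows), col=0)
    feature = [r[0] for r in clean]
    target = [r[1] for r in clean]
    print(clean, pearson(feature, target))
```

A feature whose correlation with the target class is close to zero contributes little to a linear prediction, which is one simple way to rank feature importance.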
ERNN Algorithm
To achieve better accuracy, performance and time complexity, we extend the RNN to an extended RNN (ERNN). The ERNN always takes the output from the previous step and uses it as an input to the current step [12][31]. In general, RNNs are mainly useful in sequence classification, such as sentiment and video classification, and in sequence labeling, such as part-of-speech tagging and named entity recognition [15][16][17]. To obtain these advantages, we made some changes to the RNN to obtain the ERNN, via three sequential steps as detailed below [13][14]:
• Decide how much past data to remember.
• Decide how much new data to add to the current state.
• Decide which part of the current cell state forms the final output.
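These three decisions correspond closely to the forget, input and output gates of a standard LSTM cell, which the experimental steps also use. The sketch below shows one scalar LSTM step under that assumption; the parameter dictionary `p` and its key names are hypothetical, and the ERNN described here may differ from a standard LSTM in details not given in the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; returns the new hidden and cell states."""
    # 1) Forget gate: how much of the past cell state to remember.
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    # 2) Input gate and candidate: how much new data to add to the state.
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])
    c = f * c_prev + i * g
    # 3) Output gate: which part of the cell state becomes the output.
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])
    h = o * math.tanh(c)
    return h, c


if __name__ == "__main__":
    keys = ["wf", "uf", "bf", "wi", "ui", "bi", "wg", "ug", "bg", "wo", "uo", "bo"]
    params = {k: 0.0 for k in keys}  # all-zero parameters: every gate equals 0.5
    print(lstm_step(1.0, 0.0, 2.0, params))
```

With all-zero parameters every gate evaluates to 0.5, so the cell keeps half of its previous state and adds nothing new, which makes the role of each gate easy to verify by hand.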
Based on the COVID-19 data set, i.e., a total of 960 images, our experimentation comprised the following thirteen steps:
Step 1: Import the required libraries
Step 2: Import the training dataset
Step 3: Perform feature scaling to transform the data
Step 4: Create a data structure with 60-time steps and 1 output
Step 5: Import Keras library and its packages
Step 6: Initialize the ERNN
Step 7: Add the LSTM layers and some dropout regularization.
Step 8: Add the output layer.
Step 9: Compile the ERNN
Step 10: Fit the ERNN to the training set
Step 11: Load the COVID-19 test image data for 2020
Step 12: Get the predicted COVID-19 for 2020
Step 13: Visualize the results of predicted and real COVID-19
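Steps 3 and 4 above can be sketched in plain Python. This is an illustrative example under stated assumptions: min-max scaling is assumed as the feature scaling method (the text does not name one), the input is assumed to be a one-dimensional series of values, and the function names are hypothetical. The remaining steps depend on the Keras library and are not reproduced here.

```python
def min_max_scale(series):
    """Step 3 (assumed variant): scale values linearly into the range [0, 1]."""
    lo, hi = min(series), max(series)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in series]
    return [(v - lo) / span for v in series]


def make_windows(series, time_steps=60):
    """Step 4: build samples of `time_steps` past values with 1 output each."""
    inputs, outputs = [], []
    for end in range(time_steps, len(series)):
        inputs.append(series[end - time_steps:end])  # the previous 60 values
        outputs.append(series[end])                  # the value to predict
    return inputs, outputs


if __name__ == "__main__":
    raw = [float(v) for v in range(100)]  # hypothetical series of 100 values
    X, y = make_windows(min_max_scale(raw), time_steps=60)
    print(len(X), len(X[0]), len(y))
```

Each input sample holds 60 consecutive scaled values and the matching output is the value that immediately follows them, which is the shape an LSTM layer expects after adding a feature dimension.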
Thus our algorithm is found to be more accurate while consuming less execution time in the early prediction of COVID-19 cases [22][23][24].
Evaluation Methods
To exhibit and assess the impact of our proposed method on MRNN, we adopted the following strategy. First, OP, BP, FN and GN are defined for the confusion matrix. OP is the number of instances correctly predicted as required (true positives); BP is the number of instances incorrectly predicted as required (false positives); GN is the number of instances correctly predicted as not required (true negatives); and FN is the number of instances incorrectly predicted as not required (false negatives). Four measurements, namely accuracy, precision, recall and F1-measure, are then calculated from these four counts.
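The four measurements can be computed from the confusion-matrix counts as sketched below. The formulae used are the standard textbook definitions and are assumed to match those intended here; the function name and the example counts are hypothetical and are not results from this work.

```python
def classification_metrics(op, bp, gn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    op: true positives, bp: false positives,
    gn: true negatives, fn: false negatives.
    """
    total = op + bp + gn + fn
    # accuracy = (OP + GN) / (OP + BP + GN + FN)
    accuracy = (op + gn) / total if total else 0.0
    # precision = OP / (OP + BP)
    precision = op / (op + bp) if (op + bp) else 0.0
    # recall = OP / (OP + FN)
    recall = op / (op + fn) if (op + fn) else 0.0
    # F1 = 2 * precision * recall / (precision + recall)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_metrics(op=8, bp=2, gn=85, fn=5))
```

On an imbalanced data set, accuracy alone can look high even when few positive cases are found, so precision, recall and F1 are reported alongside it.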
Taking every data set and all parameters into consideration for modeling the risk of illness, the precision of risk prediction is expected to depend on the diversity of features in the hospital data. That is, the better the feature description of the disease, the higher the precision. Our experiment reached a precision rate of 90.54%, which allows the risk to be better assessed.
Time Complexity/Execution Time: Our approach consumes 50% less time than other existing techniques. This time can be further reduced using a graphics processing unit (GPU) or a tensor processing unit (TPU). The overall execution time also depends on system performance, which in turn depends on the system software, system hardware and available space.
Input: As mentioned earlier, our experiment uses a data set of 960 images from the image database as input.
The graphical image is thus obtained as shown below
This research presents seminal results towards a groundbreaking and modern approach for early diagnosis of COVID-19. It primarily describes a particular mechanism of the proposed COVID-19 early detection system. The concluding analysis of this work is based on evaluating different acoustic features of cough, breath and speech. The comparison shows that the accuracy obtained from patients' voice is low relative to that obtained from their cough and breath sounds. The research also identifies the main reasons behind these inefficient preliminary results as time constraints and computing power. The image database and sound data set are comparatively small and lack control group data, especially for comparison with patients suffering from other respiratory [32] problems. Overall, the data analysis and observations indicate that a patient's cough and breath are more effective factors for diagnosing infection. Finally, based on some people's quarantine experience, chat bots could soon be produced to provide mental support and used as an aid in controlling anxiety and related disorders.
This work builds on several typical prediction [20][21] algorithms and proposes a new framework to achieve high accuracy and high convergence speed. The proposed modified RNN algorithm is found to be adaptable to structured and unstructured data from hospitals. The core objective of this research work is to enhance the accuracy and performance of the proposed model. This work infers that the COVID-19 data set used undoubtedly increases the performance and accuracy of most algorithms, with MRNN and linear regression leading the others.
This work divides the classified training data set into 8 successive phases. Out of 960 images, we found 149 COVID-19 positive, 234 pneumonia positive and 250 normal/negative. With our proposed MRNN learning COVID-19 by itself, we were able to capture 50% of the class characteristics, which is significant in practice. In each phase, we also noted that the images from the normal and pneumonia classes are different, reflecting our proposed MRNN's capability to distinguish COVID-19 from other classes. Among the total 960 images of our training data sets, only 24 images were allocated for evaluating the network, which reflects better accuracy and performance. Our proposed model obtains an average accuracy of 89.62% and a sensitivity of 91.54% for the COVID-19 class, with an overall accuracy of 88.4% across classes.
As part of future work, this trained MRNN can be made available to the public, which would help improve medical diagnosis. Compared with a few existing regular prediction [23] algorithms, the accuracy of our proposed algorithm is 91%, with high convergence speed. When larger data sets from COVID-19 patients become available, this model's accuracy, as well as the neural network itself, can be pushed further. In view of the fast-growing trends in advanced NN techniques, it is certainly possible to achieve better accuracy.