An intelligent prediction model for infectious diseases’ outbreaks: an ensemble of machine learning, big climate data and indigenous knowledge
Authors
Phoobane, Maingosinathi Paulina
Publisher
Central University of Technology
Abstract
Climate change is real. It affects all spheres of fauna and flora and markedly increases human vulnerability to infectious diseases. Infectious diseases are the major causes of death in low-income countries, with Sub-Saharan Africa accounting for a large proportion of these fatalities. One of the challenges in fighting infectious diseases in developing countries like South Africa is the lack of effective surveillance systems for predicting and monitoring them. An effective, relevant early warning system (EWS) for infectious disease outbreaks is essential to mitigate their impact through vaccination and other interventions. Although several studies have developed early warning systems or models to predict and/or monitor infectious disease outbreaks such as malaria, these studies are predominantly based on a single knowledge system, focus primarily on scientific approaches, and do not account for an African context. Integrating these approaches with indigenous knowledge (IK) systems could enhance current early warning systems and make them more effective, relevant and understandable to the communities they are intended to serve. The objective of this research study is to develop an intelligent people-centred model for predicting infectious disease outbreaks using IK, big climate data and historical malaria incidence in the Vhembe district of South Africa. Vhembe is a district in the province of Limpopo that is identified as prone to malaria outbreaks and other climate-driven shocks.
The lens that guided this research study is the EWS framework termed Malaria Outbreak Early Warning System (MOEWS). MOEWS encompasses the collection and analysis of weather and IK data to predict malaria outbreaks. Malaria outbreak risk knowledge encompasses collecting, understanding and integrating historical weather, malaria and IK data with drought indices. Machine learning (ML) models were used to predict malaria outbreaks using the collected data.
Both classification and regression ML algorithms were explored to predict malaria using structured data: weather, malaria and drought indices. Data pre-processing, including normalization, outlier handling, oversampling and feature selection, was performed before modelling the dataset. ML models, developed in Jupyter Notebook using Python, were trained and tested using historical weather, drought indices and malaria data. The Multilayer Perceptron (MLP) classifier showed the best performance. The MLP model achieved an accuracy of 93%, a precision of 95%, a recall of 100%, and an F1 score of 96%, indicating a robust predictive capability. As a result, the MLP model was selected for malaria prediction. The model was subsequently deployed to make malaria predictions using real-time data on weather conditions and reported malaria incidences. A lag of one to two months was applied between the predictors and the malaria outbreak.
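The lagging of predictors and the reported evaluation metrics can be illustrated with a minimal Python sketch. It is not the thesis code: the function names `make_lagged_pairs` and `classification_metrics`, the rainfall values and the labels are all hypothetical, and only the standard library is used.

```python
from typing import Dict, List, Sequence, Tuple


def make_lagged_pairs(
    predictor: Sequence[float], outbreak: Sequence[int], lag: int
) -> List[Tuple[float, int]]:
    """Pair the predictor at month t with the outbreak label at month t + lag."""
    if lag < 1:
        raise ValueError("lag must be at least one month")
    n = min(len(predictor), len(outbreak))
    return [(predictor[t], outbreak[t + lag]) for t in range(n - lag)]


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = outbreak)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical monthly rainfall (mm) and outbreak labels, for illustration only.
rainfall = [10.0, 80.0, 120.0, 30.0, 5.0, 90.0]
outbreak = [0, 0, 1, 1, 0, 0]
pairs = make_lagged_pairs(rainfall, outbreak, lag=1)

# Hypothetical predictions scored against hypothetical true labels.
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 1, 1, 0, 1])
```

A recall of 100% means that no actual outbreak was missed (no false negatives), which matters for an early warning system where a missed outbreak is typically costlier than a false alarm.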
The results suggest that the MLP is robust in identifying patterns and predicting malaria outbreaks. Its deployment with real-time data on weather conditions and reported malaria incidences highlights its practical application in forecasting malaria outbreaks. This implies that the model has the potential to serve as a reliable tool in early warning systems, enabling proactive interventions and informed decision-making for malaria control and prevention.
For the malaria outbreak forecast using indigenous knowledge, IK indicators collected from local people in Vhembe were formatted and represented as concepts. The causal relationships among the concepts, and between the concepts and malaria, were determined and formally represented as adjacency matrices. The adjacency matrices were used to develop a fuzzy cognitive map (FCM) for each of the four seasons of the year, illustrating the concepts and their causal relationships. The importance of each IK indicator within each seasonal FCM was established by analysing the density, in-degree, out-degree and centrality measures. The findings highlight that in autumn, "autumn heavy rainfalls" and "dirty water in containers/small pools" are major predictors of malaria outbreaks, while "summer heavy rainfalls", "dirty water in containers" and "summer temperature" are key indicators in summer. In contrast, the main predictors of malaria in winter are "fig ('Muhuyu') trees not shedding leaves" and "sight of insects/locusts", while "spring rainfalls" and "'Mofafa' grass having many ticks" were found to be important for malaria prediction in spring. These insights imply that malaria forecasting systems can benefit from incorporating seasonally relevant IK indicators to improve accuracy and contextual relevance. The autumn FCM model was used to forecast seasonal malaria outbreaks from real-time indigenous knowledge indicators reported by selected local experts via the mobile MOEWS App.
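The structural measures named above can be computed directly from an adjacency matrix. The sketch below is only illustrative: the three concepts are taken from the autumn findings, but the weights are hypothetical, and the definitions (degrees as sums of absolute weights, centrality as in-degree plus out-degree, density as the share of possible directed links that are present) follow one common FCM convention that the thesis may or may not use.

```python
from typing import Dict, List


def fcm_indices(
    concepts: List[str], weights: List[List[float]]
) -> Dict[str, object]:
    """Structural indices of an FCM whose weights[i][j] is the effect of i on j."""
    n = len(concepts)
    # Out-degree: total absolute strength of the links leaving each concept.
    out_degree = [sum(abs(w) for w in weights[i]) for i in range(n)]
    # In-degree: total absolute strength of the links entering each concept.
    in_degree = [sum(abs(weights[i][j]) for i in range(n)) for j in range(n)]
    # Centrality: overall involvement of a concept in the map.
    centrality = [o + i for o, i in zip(out_degree, in_degree)]
    # Density: existing links over possible directed links (assumes no self-loops).
    links = sum(1 for i in range(n) for j in range(n) if weights[i][j] != 0)
    density = links / (n * (n - 1))
    return {
        "out_degree": dict(zip(concepts, out_degree)),
        "in_degree": dict(zip(concepts, in_degree)),
        "centrality": dict(zip(concepts, centrality)),
        "density": density,
    }


# Concept names come from the autumn findings; the weights are hypothetical.
concepts = [
    "autumn heavy rainfalls",
    "dirty water in containers/small pools",
    "malaria outbreak",
]
weights = [
    [0.0, 0.6, 0.8],  # rainfall -> dirty water, rainfall -> malaria
    [0.0, 0.0, 0.7],  # dirty water -> malaria
    [0.0, 0.0, 0.0],  # malaria outbreak has no outgoing links here
]
indices = fcm_indices(concepts, weights)
```

In this toy map, a concept with a high out-degree and zero in-degree acts purely as a driver, while the outbreak concept, with a high in-degree and zero out-degree, acts purely as a receiver.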
Based on the malaria IK indicators reported by the IK expert for autumn, the malaria outbreak concept had a value of 0.7856, indicating that a low malaria incidence ranging from 400 to 800 cases could be expected. Malaria alerts generated from the IK-based and machine learning-based forecasts were disseminated through the MOEWS mobile App, social media platforms and a web portal. The MOEWS App plays a crucial role in this process, providing a direct and immediate connection between the forecast and the community. The alerts are intended to aid policymakers in developing mitigation strategies and to inform decision-making processes related to malaria. This ensures that policymakers are well-informed and can respond swiftly to potential outbreaks.
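How an FCM turns reported indicators into a value for the outbreak concept can be sketched as follows. This is an assumption-laden illustration, not the thesis's procedure: it uses a widely used update rule (each concept's next value is a sigmoid of its current value plus the weighted sum of its causes), holds the reported indicators fixed ("clamped") during iteration, and reuses hypothetical weights. It is not expected to reproduce the reported value of 0.7856, and the mapping from a concept value to a case range is not modelled.

```python
import math
from typing import List, Set


def sigmoid(x: float, lam: float = 1.0) -> float:
    """Squash an activation into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def fcm_run(
    state: List[float],
    weights: List[List[float]],
    clamped: Set[int],
    lam: float = 1.0,
    tol: float = 1e-6,
    max_iter: int = 200,
) -> List[float]:
    """Iterate the FCM until the state stops changing (or max_iter is hit).

    weights[i][j] is the causal effect of concept i on concept j.
    Concepts whose index is in `clamped` keep their reported value.
    """
    current = list(state)
    n = len(current)
    for _ in range(max_iter):
        nxt = []
        for j in range(n):
            if j in clamped:
                nxt.append(current[j])
                continue
            total = current[j] + sum(current[i] * weights[i][j] for i in range(n))
            nxt.append(sigmoid(total, lam))
        if max(abs(a - b) for a, b in zip(nxt, current)) < tol:
            return nxt
        current = nxt
    return current


# Hypothetical weights: rainfall -> dirty water, rainfall -> malaria,
# dirty water -> malaria. Index 2 is the malaria outbreak concept.
weights = [
    [0.0, 0.6, 0.8],
    [0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0],
]

# Hypothetical expert reports: indicators strongly present vs. absent.
high = fcm_run([1.0, 1.0, 0.0], weights, clamped={0, 1})
low = fcm_run([0.0, 0.0, 0.0], weights, clamped={0, 1})
```

Under these assumptions, stronger reported indicators yield a higher steady-state value for the outbreak concept, which is the quantity an alerting rule would then translate into an expected incidence category.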
© Central
Description
Doctor of Philosophy in Information Technology
