Using Public Health Datasets to Predict One’s Ability to Pay for Pre-Exposure Prophylaxis (PrEP) Services in Uganda
DOI:
https://doi.org/10.69660/jcsda.02012504
Keywords:
Pre-Exposure Prophylaxis (PrEP), Machine Learning, Predictive Modeling, Artificial Intelligence (AI)
Abstract
In Uganda, the uptake of pre-exposure prophylaxis (PrEP) as a preventive measure against HIV infection is notably low, despite its proven effectiveness, particularly among high-risk populations (UPHIA, 2020). Although PrEP has historically been available at no cost in government facilities, the recent decrease in HIV medication costs and the shift towards private-sector involvement necessitate a reliable assessment of individuals’ ability to pay for PrEP. The growing volume of HIV-related data presents an opportunity to apply artificial intelligence (AI) and machine learning (ML) techniques to identify high-risk sub-populations that are both eligible for and willing to pay for PrEP services. This retrospective study analyzed three datasets: the Uganda Demographic Health Survey, the Uganda Population HIV/AIDS Impact Assessment survey, and a private dataset from the Rocket Health Telemedicine Clinic. The study population included individuals aged 18 years and above who had accessed a private health facility for sexual reproductive health services or products. Statistical methods, including the Chi-square test and Spearman’s correlation test, were used to identify features significantly associated with the ability to pay for PrEP. The datasets were aggregated, cleaned, and then split into 70% for training and 30% for testing and validation. An ensemble of machine learning classification models was trained using Python and the PyCaret library. The AdaBoost classifier demonstrated superior predictive power, with a recall of 99% and an AUC of 100%, indicating robust prediction capabilities on this dataset. The model achieved a high training score of 99%, suggesting an excellent fit to the training data. Further analysis revealed that age, gender, employment status, and socioeconomic status were the most influential predictors of the ability to pay for PrEP services.
A web application interface was developed using the Streamlit library, allowing individuals and programs to upload data and make predictions about the likelihood of individuals paying for PrEP. The developed tool leverages publicly available data to identify populations capable of paying for PrEP services, fostering a collaborative effort towards achieving better health outcomes and ensuring the sustainability of HIV prevention services.
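The modelling workflow summarized above (statistical feature screening, a 70/30 split, and an AdaBoost classifier scored on recall and AUC) can be illustrated with a minimal sketch. It rests on several assumptions. It uses scikit-learn and SciPy directly rather than the PyCaret library named in the abstract. The data are synthetic. The column names (`age`, `employed`, `wealth_quintile`, `noise`, `can_pay`), the helper `screen_features`, and the 0.05 significance threshold are hypothetical and do not come from the study. The metrics it prints describe only the synthetic data and say nothing about the reported results.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency, spearmanr
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in data; every column name here is hypothetical.
df = pd.DataFrame({
    "age": rng.integers(18, 60, n),
    "employed": rng.integers(0, 2, n),
    "wealth_quintile": rng.integers(1, 6, n),
    "noise": rng.integers(0, 2, n),  # unrelated to the label by construction
})
# Hypothetical label: ability to pay driven by employment and wealth.
score = (0.8 * df["employed"]
         + 0.5 * (df["wealth_quintile"] - 3)
         + rng.normal(0, 0.7, n))
df["can_pay"] = (score > 0.4).astype(int)


def screen_features(data, target, categorical, ordinal, alpha=0.05):
    """Keep features whose association with the target has p < alpha."""
    selected = []
    for col in categorical:
        # Chi-square test of independence on the contingency table.
        _, p, _, _ = chi2_contingency(pd.crosstab(data[col], data[target]))
        if p < alpha:
            selected.append(col)
    for col in ordinal:
        # Spearman rank correlation for ordinal or continuous features.
        _, p = spearmanr(data[col], data[target])
        if p < alpha:
            selected.append(col)
    return selected


features = screen_features(
    df, "can_pay",
    categorical=["employed", "noise"],
    ordinal=["age", "wealth_quintile"],
)

# 70% training, 30% held out, stratified on the label.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["can_pay"],
    test_size=0.30, stratify=df["can_pay"], random_state=0,
)

model = AdaBoostClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

recall = recall_score(y_test, model.predict(X_test))
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"selected features: {features}")
print(f"held-out recall: {recall:.2f}, AUC: {auc:.2f}")
```

Scoring on the held-out 30% rather than on the training data matters here: a training score close to 100% on its own does not distinguish a well-fitting model from an overfitted one.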