Using a Machine Learning Algorithm to Predict Online Patient Portal Utilization: A Patient Engagement Study
Objective: Online patient portal utilization is low in the U.S. This study aimed to use a machine learning approach to predict access to online medical records through a patient portal.
Methods: This is a cross-sectional, predictive machine learning study of Health Information National Trends Survey (HINTS) datasets (Cycles 1 and 2; 2017-2018 samples). Survey respondents were U.S. adults (≥18 years old). The primary outcome was a binary variable indicating whether the respondent had accessed online medical records in the previous 12 months. We analyzed a subset of independent variables using k-means clustering with replicate samples. A cross-validated random forest-based algorithm was used to select features using a split of the Cycle 1 training sample. A logistic regression and an evolved decision tree were trained on the rest of the Cycle 1 training sample. The Cycle 1 test sample and Cycle 2 data were used to benchmark algorithm performance.
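The feature-selection and training steps described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic (not HINTS data), scikit-learn's RFECV wrapped around a random forest stands in for the unspecified cross-validated random forest-based selector, the split proportions and hyperparameters are arbitrary, and the evolved decision tree is omitted because scikit-learn has no direct equivalent.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one survey cycle (assumption: 600 respondents,
# 12 candidate predictors, binary outcome = accessed records or not).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=4, n_redundant=0, random_state=0
)

# Hold out a test sample from the cycle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Split the training sample: one part for feature selection,
# the rest for fitting the predictive model.
X_sel, X_fit, y_sel, y_fit = train_test_split(
    X_train, y_train, test_size=0.5, stratify=y_train, random_state=0
)

# Cross-validated, random forest-based feature selection
# (recursive feature elimination is an assumed choice here).
selector = RFECV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    step=1,
    cv=3,
    scoring="roc_auc",
    min_features_to_select=2,
)
selector.fit(X_sel, y_sel)

# Train a logistic regression on the remaining training data,
# restricted to the selected features.
model = LogisticRegression(max_iter=1000)
model.fit(selector.transform(X_fit), y_fit)

# Benchmark on the held-out test sample.
test_auc = roc_auc_score(
    y_test, model.predict_proba(selector.transform(X_test))[:, 1]
)
print(f"features kept: {selector.n_features_}, test AUC: {test_auc:.3f}")
```

Selecting features on a split that is separate from the one used for model fitting keeps the selection step from leaking information into the fitted model. A second survey cycle could be scored the same way by passing its predictors through selector.transform before calling model.predict_proba.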
Results: Lack of access to online systems was less often a barrier to accessing online medical records in 2018 (14%) than in 2017 (26%). Patients accessed medical records to refill medicines and message primary care providers more frequently in 2018 (45%) than in 2017 (25%).
Discussion: Privacy concerns, portal knowledge, and conversations between primary care providers and patients predict portal access.
Conclusion: The methods described here may be used to personalize patient engagement approaches during new patient registration.