As new resourceful interventions are introduced, predicting diabetes progression and end-organ failure becomes a necessity. Risk stratification can be advanced by developing new biomarkers or by using artificial intelligence algorithms to mine readily available but under-utilized clinical data. Such data reside in Electronic Medical Records (EMR) and constitute what is commonly termed big data.
We identified 554,110 diabetic patients and 645,077 pre-diabetic individuals (NHS). All had at least two glucose tests and/or one HgbA1C test, a diagnostic code and/or a listed use of a hypoglycemic drug.
Two models were trained. The first model was trained to identify pre-diabetic individuals likely to progress to diabetes within 1 year of the index date. The second model was trained to identify diabetic patients likely to present with microalbumin above 300 mg/g or eGFR below 45 within 1 year.
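The outcome definition for the second model can be sketched as a labeling rule. This is a minimal illustration only, not the authors' pipeline: the `LabResult` record layout, the signal names `"microalbumin"` and `"egfr"`, and the reading of "within 1 year" as 365 days after the index date are all assumptions made here for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Thresholds quoted in the abstract; the eGFR unit is not stated there.
MICROALBUMIN_THRESHOLD_MG_G = 300.0
EGFR_THRESHOLD = 45.0
# Assumption: "within 1 year" is taken as 365 days after the index date.
HORIZON = timedelta(days=365)


@dataclass
class LabResult:
    """Hypothetical record layout for a single lab measurement."""
    name: str        # assumed signal names: "microalbumin" or "egfr"
    taken_on: date
    value: float


def renal_outcome_label(index_date: date, labs: List[LabResult]) -> int:
    """Return 1 if any qualifying result falls after the index date and
    no later than index date + HORIZON, otherwise 0."""
    window_end = index_date + HORIZON
    for lab in labs:
        # Ignore results on or before the index date and beyond the horizon.
        if not (index_date < lab.taken_on <= window_end):
            continue
        if lab.name == "microalbumin" and lab.value > MICROALBUMIN_THRESHOLD_MG_G:
            return 1
        if lab.name == "egfr" and lab.value < EGFR_THRESHOLD:
            return 1
    return 0
```

The comparisons are strict ("above 300", "below 45"), following the wording of the abstract; a result exactly at a threshold is not counted as an outcome in this sketch.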
The performance of the first model (which incorporated tens of signals to create over 900 features, including historical lab results) was compared with that of a logistic regression model (incorporating sex, age, glucose and HgbA1C). At any given sensitivity, the first model outperformed the logistic regression model, with a 50-100% increase in PPV. As expected, the major contributors to this performance were glucose, HgbA1C, BMI, age and sex. Minor contributors included HDL, triglycerides, ALT, WBC, RBC, GGT and drugs.
The performance of the second model was likewise compared with that of a logistic regression model (incorporating sex, age, eGFR, creatinine and urinalysis). At any given sensitivity, the second model outperformed the logistic regression model, with a 35-70% increase in PPV. As expected, the major contributors to this performance were creatinine, eGFR, urinalysis, age and sex. Minor contributors included HDL, triglycerides, albumin, WBC, BMI, glucose, HgbA1C, Hgb, and drugs.
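A comparison of PPV at matched sensitivity, as reported for both models, could be computed along the following lines. This is a sketch under stated assumptions rather than the evaluation code used in the study: the function names are hypothetical, the reported percentage increase is assumed to be a relative increase in PPV, and the operating point is taken as the highest score threshold at which the target sensitivity is reached.

```python
from typing import Sequence


def ppv_at_sensitivity(scores: Sequence[float],
                       labels: Sequence[int],
                       target_sensitivity: float) -> float:
    """PPV at the highest score threshold whose sensitivity reaches the target.

    labels are 1 for patients who had the outcome and 0 otherwise.
    """
    if not 0.0 < target_sensitivity <= 1.0:
        raise ValueError("target_sensitivity must be in (0, 1]")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive case is required")

    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    flagged = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Patients with tied scores are flagged together.
        while i < len(ranked) and ranked[i][0] == threshold:
            true_pos += ranked[i][1]
            flagged += 1
            i += 1
        if true_pos / total_pos >= target_sensitivity:
            return true_pos / flagged
    # Unreachable for valid input: flagging everyone gives sensitivity 1.0.
    raise RuntimeError("target sensitivity was not reached")


def relative_ppv_increase(ppv_model: float, ppv_baseline: float) -> float:
    """Relative PPV gain of a model over a baseline, as a fraction (0.5 = 50%)."""
    if ppv_baseline <= 0.0:
        raise ValueError("baseline PPV must be positive")
    return (ppv_model - ppv_baseline) / ppv_baseline
```

Both models would be scored on the same held-out patients, and `ppv_at_sensitivity` evaluated for each at the same list of target sensitivities, so that the PPV values being compared refer to the same fraction of true cases captured.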
ML-based tools make it possible to decrease the number needed to treat and to capture more of those at risk, thus reducing morbidity and mortality and promoting cost-effective plans.
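The link between a higher PPV and fewer patients needing attention can be made concrete with a little arithmetic. The sketch below covers only the number of patients flagged per true case, which is 1/PPV; a number needed to treat in the strict sense also depends on treatment effect, which is not reported here. The function names are hypothetical, and the PPV increases are assumed to be relative increases at a fixed sensitivity on the same population.

```python
def flagged_per_true_case(ppv: float) -> float:
    """Average number of flagged patients per patient who truly has the outcome."""
    if not 0.0 < ppv <= 1.0:
        raise ValueError("ppv must be in (0, 1]")
    return 1.0 / ppv


def relative_reduction_in_flagged(relative_ppv_increase: float) -> float:
    """Fractional drop in flagged patients per true case when PPV rises by a
    relative amount r: 1 - 1 / (1 + r) = r / (1 + r)."""
    if relative_ppv_increase < 0.0:
        raise ValueError("relative_ppv_increase must be non-negative")
    return relative_ppv_increase / (1.0 + relative_ppv_increase)
```

Under these assumptions, a 50% relative increase in PPV corresponds to one third fewer flagged patients per true case, and a 100% increase to half as many.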
Presented by: Ran Goshen