Thoughts On Provider-Based Predictive Modeling In Value-Based Contracts

Analytics to predict future medical costs of individuals and populations are limited by the characteristics of the three types of available data: abstracted, clinical and claims.

  • Abstracted data may come from hospital notices of discharge, admission, emergency department visit, or skilled nursing facility transfer.
  • Clinical data usually comes from an electronic medical record (EMR), biometric feeds, lab feeds, pharmacy feeds, or health assessments (completed by either the patient or the patient's care manager).
  • Claims data generally comes from medical or pharmacy benefit managers/payers.

Important attributes of data used in health care analytics include sensitivity (ability to detect all conditions), specificity (ability to identify conditions accurately), timeliness and availability.  Each of the three data sources used for predictive modeling has both strengths and weaknesses.  Abstracted data is sensitive, non-specific, timely and generally available.  Clinical data is sensitive, specific, timely and variably available (it may be incomplete, unstructured, or unavailable from providers outside the EMR system in use).  Claims data is insensitive, non-specific, untimely and always available.  The variations in strong and weak attributes among these three data sources suggest that, when combined, the three should form a more effective basis of analysis for the prediction of future medical outcomes and costs.

Miles Snowden, MD, MPH, CEBS, Chief Medical Officer, OptumHealth
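
To make the sensitivity and specificity attributes concrete, here is a minimal Python sketch that scores one data source against a known reference. It assumes the standard screening-test definitions (sensitivity as the share of true cases that are flagged, specificity as the share of non-cases that are left unflagged), which is one reading of the attributes described above. The function name, member IDs and flags are all hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(flagged, truly_positive, population):
    """Return (sensitivity, specificity) for a set of flagged member IDs.

    Assumes the standard screening-test definitions:
      sensitivity = true positives / all truly positive members
      specificity = true negatives / all truly negative members
    """
    population = set(population)
    flagged = set(flagged) & population
    positives = set(truly_positive) & population
    negatives = population - positives

    true_pos = len(flagged & positives)   # cases the source caught
    true_neg = len(negatives - flagged)   # non-cases the source left alone

    sens = true_pos / len(positives) if positives else float("nan")
    spec = true_neg / len(negatives) if negatives else float("nan")
    return sens, spec


# Hypothetical toy population: 10 members, 4 of whom truly have the condition.
population = set(range(1, 11))
truly_positive = {1, 2, 3, 4}

# Hypothetical flags from a single data source: two correct, one false alarm.
flagged_by_source = {1, 2, 7}

sens, spec = sensitivity_specificity(flagged_by_source, truly_positive, population)
```

In this toy case the source catches half of the true cases and raises one false alarm among six non-cases. In practice the reference set of true cases is rarely known with certainty, so figures like these are estimates at best.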

An example of the power of combining three disparate data sources for predictive modeling can be seen in diabetes management.  Experience in our own programs suggests that when we identify diabetics by claims data and engage them in our chronic disease management programs, we improve their compliance with consensus, evidence-based medicine by about 20%.  However, our internal experience also shows that about 30% of individuals identified as diabetic through clinical data (EMR) were not identified as diabetic in claims data.  About 2/3 of the diabetics identified only by clinical data were found through abnormal lab results (Hgb A1C) and about 1/3 through prescription data (glycemic agents).  Finally, our internal experience also shows that when abstracted data (hospital notice of discharge) allows us to promptly engage diabetics post-discharge, we decrease 30-day readmission rates by about 25%.
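
The identification gap described above can be sketched with simple set operations. The following Python example is an illustration only, not a description of our production logic: the function name, the member IDs and the sample sets are hypothetical, and the toy data was deliberately constructed so that its proportions echo the approximate figures quoted in the text (about 30% of clinically identified diabetics missing from claims, split roughly 2/3 labs and 1/3 prescriptions).

```python
def identify_diabetics(claims_ids, lab_ids, rx_ids):
    """Combine claims, lab and prescription flags into one diabetic roster.

    All inputs are iterables of member IDs flagged as diabetic by that source.
    """
    claims, labs, rx = set(claims_ids), set(lab_ids), set(rx_ids)
    clinical = labs | rx                 # anyone flagged by either clinical feed
    clinical_only = clinical - claims    # clinically flagged but absent from claims
    share_missed = len(clinical_only) / len(clinical) if clinical else 0.0
    return {
        "combined": claims | clinical,
        "clinical_only": clinical_only,
        "found_by_labs_only": clinical_only & labs,
        "found_by_rx_only": clinical_only & rx,
        "share_of_clinical_missed_by_claims": share_missed,
    }


# Hypothetical member IDs, invented to mirror the proportions in the text.
claims_flagged = {1, 2, 3, 4, 5, 6, 7, 20, 21}
lab_flagged = {1, 2, 3, 4, 8, 9}   # e.g. abnormal Hgb A1C results
rx_flagged = {5, 6, 7, 10}         # e.g. glycemic agent prescriptions

result = identify_diabetics(claims_flagged, lab_flagged, rx_flagged)
```

Here members 8, 9 and 10 would never have been engaged by a claims-only program, which is the practical argument for taking the union of the sources rather than relying on any single one.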

Predictive modeling from the analysis of combined abstracted, clinical and claims data holds great promise and will have the best opportunity for success in a physician-centric model (i.e., in reimbursement arrangements that shift accountability and opportunity to physicians).  Perfect medical prediction remains elusive, however, as it will require augmenting health status analysis with data on patient knowledge of condition (assessment-based) and patient health behaviors (observation-based).  As we migrate more of our health need identification and intervention initiatives from payers to providers, we should continue to see advancements in our ability to predict and mitigate future medical events.


–Miles Snowden, Chief Medical Officer, Optum

One thought on “Thoughts On Provider-Based Predictive Modeling In Value-Based Contracts”

  1. Hi! Interesting article to read as I am an informatics student. Are you able to get claims data from different EHRs? For example, can your software pick up diabetes or Hgb A1C from Epic, Cerner, and McKesson? I’d love to hear about the interoperability of the systems to get the claims data to the level of the case managers, as I am currently a case manager.
