Baseball and Big Data


Can healthcare learn something from the national pastime?

As a CMO and former CMIO, I’ve spent the better part of my professional life looking at numbers and trying to wrestle with the enigma that has come to be known as Population Health. As a 2017 article in HealthIT Analytics so aptly titled it, “population health is a big idea with big potential and a nearly unlimited range of possibilities, which can be both a good thing and a bad thing for healthcare organizations trying to figure out where to start.”

Baseball’s Been Berry, Berry Good To Me

To clear my head and get my arms around any challenge, I usually turn on the TV and look at a ballgame, any ballgame, that can get my mind off medicine and foster a little creative inspiration. It was baseball that actually got me thinking about a new approach to population health.

It was sometime in 2015 when my good friend, Larry Schor, and I were caught up in the frenzy of the National League pennant race. Larry, a Cubs fan, and I, a Mets fan, were all too accustomed to hapless Octobers spent watching rival teams in World Series contention. But 2015 was different – the Chicago Cubs, tormented for decades by the Curse of the Billy Goat, were in the hunt. They finished the season with the third-best record in the majors and made it into the postseason as the second wild card team. Meanwhile, my New York Mets would lock up the National League East title and torment me all the way to the World Series.

Larry and I were beyond excited and found ourselves using baseball metaphors while talking shop about population health. We shared the same frustrations with buzzword topics like big data and predictive modeling, and we realized that improvements in healthcare delivery will never happen if we wait for the next big machine or platform. Frustrated by the slow pace of change in healthcare, we began to suspect that there must be some wisdom in the work of Billy Beane and his Moneyball magic.

Beane, the Mets’ first-round draft pick in 1980, became an upstart scout in the A’s front office in 1990, choosing to forgo another run at “The Show.” Sandy Alderson, the A’s GM at the time (and Mets GM by 2010), assigned Beane to develop a means of identifying undervalued players through sabermetrics, a statistical means of finding diamonds in the rough. Bill James, the modern pioneer of this method, noticed that batting average was rather limited in predicting the impact a player may have on the game. Hits divided by total at-bats might impress, but the figure doesn’t say much about the player’s primary goal – reaching base. Thus, the on-base percentage was derived, revealing players who expertly reached base via bases on balls and hit-by-pitches as well as hits. Beane honed his analytics-driven methods to build a roster of undervalued, productive players through the 1990s. During Beane’s tenure, the A’s reached the playoffs from 2000 through 2003 and again in 2006.
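For readers who like to see the arithmetic, here is a minimal Python sketch of the two statistics. The function names and the two stat lines are made up for illustration, and the sacrifice-fly term in the denominator comes from the standard on-base percentage formula rather than from anything described above.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sacrifice_flies: int = 0) -> float:
    """Standard formula: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = at_bats + walks + hit_by_pitch + sacrifice_flies
    if denominator <= 0:
        raise ValueError("plate appearances must be positive")
    return (hits + walks + hit_by_pitch) / denominator


# Two invented hitters (not real players): one hits for average,
# the other hits less often but draws far more walks.
free_swinger = {"hits": 150, "walks": 20, "hit_by_pitch": 0, "at_bats": 500}
patient_hitter = {"hits": 130, "walks": 90, "hit_by_pitch": 10, "at_bats": 500}

for label, line in (("free swinger", free_swinger),
                    ("patient hitter", patient_hitter)):
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{label}: AVG {avg:.3f}, OBP {obp:.3f}")
```

In this invented comparison the patient hitter has the lower batting average but the higher on-base percentage, which is the kind of gap a traditional scouting report could miss.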

Beane looked for high potential in undervalued players by identifying data points and comparators that were not ordinarily found in a scouting report. Rather than relying on “gut instinct,” he let the data drive decisions for strategic execution.

The Healthcare IT Implications Were Obvious

Back to pop health – Larry and I quickly realized that we have always managed patients with well-known data points. Yet, like those who preceded Billy Beane and his sabermetrics approach, we were treating patients sometimes by practice guidelines and more often by “gut instinct,” a critical skill, but one that should not govern ultimate decisions.

We developed a model that seemed more appropriately labeled as “small data” analytics, one emulating baseball concepts: the anticipation and remediation of risk represent a team’s offense, and adjustments to its care management strategies represent its defense. (Healthcare providers are the players here; patients are fans.)

Take, for example, a hypothetical four-provider practice. Let’s assume that these providers have patient cohorts of similar size and complexity, though with an uneven split of diabetic patients. Repurposing baseball’s box score, we derived a set of measures to create “situation awareness,” providing a composite of all that is necessary to discern the current state and trending patterns of care delivery.

By using traditional metrics, e.g. change in HgbA1c over time, the model we proposed helps assess differences in care and invites practice partners to learn from one another.

  • POP (Population well-controlled): The population health management team’s win/loss standing; patient population under control divided by the total patients
  • CPB (Critical Patients Behind): The PHM team’s win/loss standing; patient population uncontrolled divided by the total patients, thus requiring clinical intervention
  • RB (Risk Burden): The primary care physician’s risk burden percentage; total uncontrolled divided by total diabetic patients
  • AC (Attained/Maintained Control): The PCP’s on-base percentage; total maintained control divided by total patients (no significant change over time)
  • RM (Risk Managed): The PCP’s slugging percentage; total improved control divided by total patients (% of patients who improved over time)
  • ERR (Error): The PCP’s error percentage; total reduced control divided by total patients (% of patients who worsened over time)
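
The measures above are simple ratios, so they can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production model: the class name, field names, and counts are hypothetical, and the sketch reads “total patients” literally as a single denominator for every measure except RB, which uses the diabetic count.

```python
from dataclasses import dataclass


@dataclass
class ProviderCounts:
    """Hypothetical raw counts for one provider (illustrative only)."""
    total_patients: int     # denominator the list calls "total patients"
    diabetic_patients: int  # denominator used only for RB
    controlled: int         # patients currently under control
    uncontrolled: int       # patients currently not under control
    maintained: int         # no significant change in control over time
    improved: int           # control improved over time
    worsened: int           # control worsened over time


def box_score(c: ProviderCounts) -> dict:
    """Return each measure as a fraction between 0 and 1."""
    if c.total_patients <= 0 or c.diabetic_patients <= 0:
        raise ValueError("denominators must be positive")
    return {
        "POP": c.controlled / c.total_patients,
        "CPB": c.uncontrolled / c.total_patients,
        "RB": c.uncontrolled / c.diabetic_patients,
        "AC": c.maintained / c.total_patients,
        "RM": c.improved / c.total_patients,
        "ERR": c.worsened / c.total_patients,
    }


# Made-up provider: 200 patients, 80 of them diabetic.
example = ProviderCounts(
    total_patients=200, diabetic_patients=80,
    controlled=150, uncontrolled=50,
    maintained=120, improved=60, worsened=20,
)
scores = box_score(example)
for name, value in scores.items():
    print(f"{name}: {value:.1%}")
```

Running the same function over each provider’s counts yields one row per provider, which is all a box score is.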


At a glance, we immediately see that Dr. Harding is struggling. One explanation might be the simple fact that Harding has a large cohort of diabetics, though it is no larger than Johnson’s or Wilson’s. Harding’s low AC score suggests issues with adherence, health literacy, or possibly the time spent ensuring that patients understand their condition and the meaning of A1c control. A quick look at Lincoln’s stats shows a somewhat smaller cohort, most of whom maintain their current state of diabetic control, with a significant proportion (27%) who have improved and only a few (7%) who have regressed over time.

What lessons can we draw from these box scores? First and foremost is the snapshot, not just of current state (RB), but of change over time (AC, RM, ERR). Given the limited resources of physician practices and health systems alike, small data analyses like this one can provide meaningful situation awareness without sophisticated algorithms. The findings not only demonstrate quantitative differences in outcomes but also encourage colleagues to collaborate and share best practices, and they help identify the severity of disease and the need for additional services, e.g. diabetes educators, care managers, group sessions.

Population Health Management is a spectacular challenge vulnerable to changing definitions and competition for resources.  Moving beyond the static quality metric towards a more dynamic view of patient cohorts is just one approach that can change the outcome of the game.

Baseball and Big Data is the subject of the Data Book Podcast with Jack Murtha and Tom Castles of Healthcare Analytics News.


Dr. Neil Kudler is the Chief Medical Officer at VertitechIT and the former Chief Medical Information Officer for Baystate Health. He is a long-suffering fan of the New York Mets who is determined to stay positive about his team’s chances, at least until the All-Star break.
