
Decode Health’s AI Platform Flags Imminent High-Cost Risk in MS
As GenAI captures a growing share of attention in healthcare, much of the conversation centers on models that create content rather than tools that quietly improve decisions at the point of care. Our new study in Communications Medicine presents a different, complementary path forward: explainable predictive AI that turns existing healthcare data into early warnings for costly, avoidable events in complex diseases.
This work complements our biomarker discovery programs and highlights Decode Health’s second major focus area: clinical decision support and population risk prediction that can be integrated into live workflows.
Read the full article: https://bit.ly/3XW4Rpn
Why this matters now
Health systems and life sciences companies are being asked to “do something with AI,” yet many initiatives never progress beyond pilots. Meanwhile, chronic diseases like multiple sclerosis (MS) generate outsized and unpredictable costs.
In our study, people with MS made up about 0.2% of a 267,000-member commercial population, yet they accounted for over 2.5% of total spending. Their average monthly cost was roughly $4,400 per person, compared to about $660 for non-MS members, and their monthly expenditures were much more variable.
If you can identify who is about to enter a very high-cost state, you can design targeted interventions, deploy digital tools, and measure the impact. That is exactly the problem this work addresses.
Headline results from our MS study
Using de-identified commercial claims:
- We trained and evaluated machine learning models to predict which MS patients would be in the top 10% of healthcare spending over the next four months.
- In a simulated real-world evaluation period, our models captured about 76% of the spending generated by the true top 10% of MS patients. Two common historical approaches, based on recent spending, captured about 44% and 37%, respectively.
- The models were especially effective at identifying new entrants to the high-cost group – patients whose risk was increasing but who had not yet shown up as “expensive” in previous months. These individuals generated very high inpatient and outpatient costs, consistent with acute, often preventable events.
In short, the system did not just re-identify yesterday’s high-cost patients. It surfaced tomorrow’s.
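To make the "share of spending captured" comparison concrete, here is a minimal sketch of one way such a metric can be computed. It assumes the metric is the actual spend of the members a method flags, divided by the actual spend of the true top group. That definition, the function name `spend_capture`, and the toy numbers are illustrative assumptions, not the study's code or data.

```python
def spend_capture(actual_spend, scores, fraction=0.10):
    """Actual spend of the top-scored members relative to the true top group.

    Assumed definition for illustration: flag the highest-scoring `fraction`
    of members, then divide their actual spend by the actual spend of the
    members who truly were the highest spenders.
    """
    n = len(actual_spend)
    k = max(1, round(n * fraction))  # size of the flagged group
    # Members a scoring method would flag: the k highest scores.
    flagged = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    # Members who truly ended up as the top spenders.
    true_top = sorted(range(n), key=lambda i: actual_spend[i], reverse=True)[:k]
    captured = sum(actual_spend[i] for i in flagged)
    best_possible = sum(actual_spend[i] for i in true_top)
    return captured / best_possible


# Toy future-window spend for ten hypothetical members (made-up numbers).
actual = [100, 200, 5000, 300, 8000, 150, 400, 250, 120, 90]
# Hypothetical model scores: ranks member 4 first and member 6 second.
model_scores = [1, 2, 3, 4, 10, 1, 9, 2, 1, 1]

perfect = spend_capture(actual, actual, fraction=0.2)
model = spend_capture(actual, model_scores, fraction=0.2)
```

In this toy case a ranking by the actual future spend captures everything by construction, while the hypothetical model captures 8,400 of the 13,000 generated by the true top two members. A baseline such as "rank by last month's spend" can be scored the same way by passing prior spend as `scores`.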
Two complementary programs at Decode Health
Decode Health operates in two tightly linked domains:
1. Biomarker and molecular signature discovery
Using multi-omics and other data types to identify signatures that detect undiagnosed conditions or explain disease biology and treatment response.
2. Clinical decision support and predictive risk modeling
Using real-world data, such as claims and EHRs, to forecast risk, utilization, and disease progression, enabling teams to intervene earlier.
This MS work is an example of the second domain. It demonstrates how the same platform supporting biomarker programs can also generate actionable signals directly from routine healthcare data.
What this platform actually does
Longitudinal risk stratification from routine data
We convert raw claims into a member-month view of risk. The platform tracks how people move between spending quartiles, flags members whose risk is rising, and predicts who is likely to enter the highest-spending group within a defined future window. In MS, this rising-risk group was only about a quarter of members at any time, yet it accounted for roughly one-third to nearly half of total spend.
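As a rough illustration of the member-month idea, the sketch below assigns each member a spending quartile within each month and counts moves between quartiles from one month to the next. The record layout `(member_id, month_index, spend)`, the integer month index, and the function names are assumptions made for this example; the study's actual data model may differ.

```python
from collections import Counter, defaultdict


def spend_quartiles(month_spend):
    """Map member -> quartile (1 = lowest, 4 = highest) within one month."""
    ranked = sorted(month_spend, key=month_spend.get)  # members by ascending spend
    n = len(ranked)
    return {member: 1 + (4 * pos) // n for pos, member in enumerate(ranked)}


def quartile_transitions(records):
    """Count (quartile this month, quartile next month) moves.

    records: iterable of (member_id, month_index, spend) tuples, where
    month_index is a consecutive integer (an assumption of this sketch).
    """
    by_month = defaultdict(dict)
    for member, month, spend in records:
        by_month[month][member] = spend
    quartiles = {month: spend_quartiles(s) for month, s in by_month.items()}

    moves = Counter()
    for month in sorted(quartiles):
        following = quartiles.get(month + 1)
        if following is None:
            continue  # no next month to compare against
        for member, q in quartiles[month].items():
            if member in following:
                moves[(q, following[member])] += 1
    return moves


# Four hypothetical members over two months (made-up spend values).
records = [
    ("a", 0, 10), ("b", 0, 20), ("c", 0, 30), ("d", 0, 40),
    ("a", 1, 10), ("b", 1, 500), ("c", 1, 30), ("d", 1, 40),
]
moves = quartile_transitions(records)
# Upward moves are one simple way to express "rising risk" in this toy setup.
rising = sum(count for (q_from, q_to), count in moves.items() if q_to > q_from)
```

Here member "b" jumps from the second quartile to the fourth, which is the kind of shift a rising-risk view is meant to surface before the member appears on a static high-cost list.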
Explainable drivers at the code and encounter level
Using permutation-based feature impact, the models identify which diagnosis, procedure, pharmacy, and encounter codes increase or decrease risk. Disease-modifying therapies are important, but they are not the only signals. High-impact features also include complex hospital visits, imaging, rehabilitation, and comorbidities such as depression and hypertension.
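For readers unfamiliar with permutation-based feature impact, here is a minimal, model-agnostic sketch of the general technique: shuffle one feature column at a time and measure how much the prediction error grows. The toy model, feature meanings, and function names are hypothetical and stand in for whatever model and tooling the study used.

```python
import random


def mean_abs_error(model, rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_impact(model, rows, targets, n_features, repeats=20, seed=0):
    """Average increase in error when one feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    impacts = []
    for j in range(n_features):
        total = 0.0
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            total += mean_abs_error(model, shuffled, targets) - baseline
        impacts.append(total / repeats)
    return impacts


# Hypothetical features per member: (had a hospital visit, imaging count, noise).
rows = [
    (1, 0, 5), (0, 1, 3), (1, 2, 8), (0, 3, 1),
    (1, 1, 7), (0, 0, 2), (1, 3, 4), (0, 2, 6),
]


def toy_model(r):
    # Stand-in for a trained model; it ignores the third feature entirely.
    return 3000 * r[0] + 200 * r[1]


targets = [toy_model(r) for r in rows]  # toy targets the model fits exactly
impacts = permutation_impact(toy_model, rows, targets, n_features=3)
```

A feature the model never uses (the third column here) gets an impact of zero, while features the model leans on produce a larger error increase when shuffled. Note that this basic version reports the size of a feature's influence, not whether it raises or lowers risk.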
Deployment-ready risk workflow
The study outlines a tiered system in which claims-based AI identifies members at imminent risk, a multidisciplinary team examines the key factors behind each prediction, and care teams choose targeted outreach methods, such as telehealth visits, medication reviews, or expedited neurology consults.
How we built it under the hood
Behind these outputs is a reusable analytics engine:
- Robust data engineering that integrates medical, facility, pharmacy, and provider information into a clean, longitudinal dataset at the member-month level.
- Competitive model search across more than 130 regression and classification algorithms, including gradient boosting, random forests, generalized linear models, support vector machines, k-nearest neighbors, and neural networks.
- Two ways of framing risk: (1) Expected four-month spend per member. (2) Probability of entering the top-spending decile during that window. Both formulations were tested. For this use case, regression models that predicted expected future spend best reflected real downstream costs and outperformed the classification models, so they form the basis of the paper’s results.
- Rigorous validation using cross-validation, a time-segregated holdout set, and a separate evaluation window to simulate live deployment while protecting against data leakage.
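The two risk framings and the time-segregated validation can be sketched as follows. This is a simplified illustration under stated assumptions: months are consecutive integers, the outcome window is the four months after an index month, and index months whose outcome window crosses a split boundary are dropped. The function names and the boundary rule are hypothetical, not the paper's exact procedure.

```python
def four_month_spend(spend_by_month, index_month, horizon=4):
    """Regression target: total spend in the `horizon` months after index_month."""
    return sum(spend_by_month.get(index_month + k, 0.0) for k in range(1, horizon + 1))


def top_decile_labels(future_spend):
    """Classification target: 1 if a member's future spend is in the top 10%.

    Ties at the cutoff are all labeled 1, so the group can exceed 10%.
    """
    k = max(1, round(len(future_spend) * 0.10))
    cutoff = sorted(future_spend.values(), reverse=True)[k - 1]
    return {member: int(spend >= cutoff) for member, spend in future_spend.items()}


def assign_split(index_month, train_end, holdout_end, horizon=4):
    """Place an index month in train, holdout, or evaluation by calendar order."""
    outcome_end = index_month + horizon
    if outcome_end <= train_end:
        return "train"
    if index_month > train_end and outcome_end <= holdout_end:
        return "holdout"
    if index_month > holdout_end:
        return "evaluation"
    return None  # outcome window straddles a boundary; dropped to limit leakage


# Hypothetical monthly spend for one member, keyed by month index.
spend = {1: 100.0, 2: 0.0, 3: 250.0, 4: 50.0, 5: 1000.0, 6: 10.0}
target = four_month_spend(spend, index_month=1)  # months 2 through 5

# Ten hypothetical members with made-up future spend.
future = {f"m{i}": i * 100.0 for i in range(1, 11)}
labels = top_decile_labels(future)

splits = {m: assign_split(m, train_end=12, holdout_end=20) for m in (8, 9, 13, 17, 21)}
```

Splitting by calendar time rather than at random means the model is always scored on months it could not have seen during training, which is closer to how it would run in a live setting.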
Decode Health capabilities that create strategic opportunity
Several aspects of this work are reusable across diseases and data sources:
- A scalable analytics platform that can incorporate new cohorts, conditions, and geographies without rebuilding the pipeline from scratch.
- Rising-risk segmentation that focuses on member-month shifts toward higher spending, which aligns better with intervention strategies than static “high-cost list” approaches.
- Operational explainability that links risk to specific care patterns and comorbidities instead of opaque model scores.
- Documented and transparent methods aligned with TRIPOD-AI reporting, which streamlines the process from research to regulated, production-grade tools.
These ingredients are the same ones needed to scale clinical decision support, real-world data analytics, and companion digital tools across therapeutic areas.
Where we focus first, and who else can use it
Decode Health’s primary focus is partnering with life sciences, biopharma, and med tech companies that want to:
- Enrich clinical trials and real-world evidence programs with claims-based phenotypes and rising-risk signals.
- Design precision access, adherence, and patient support initiatives that target people on the brink of deterioration.
- Combine molecular and digital biomarkers with predictive utilization signals to create more holistic decision tools.
Because the platform works directly on real-world utilization data, the same outputs are relevant for health plans and risk-bearing provider organizations that need to prioritize outreach, administer value-based contracts, and proactively manage high-cost therapies.
The common thread is the ability to see risk earlier and act on it in a transparent way.
Explore the study and connect with us
The full article is open access and provides detailed methods, statistics, and model comparisons for teams that want to dive deeper. If you’d like to see how this approach could apply to your assets, populations, or data, we’re happy to schedule a focused discussion.
















