
Multi-Analytical Integration of Public RNA-seq Data Reveals Blood-Based Sepsis Severity Biomarkers for Predictive Modeling
November 2025 Poster Presentation
Cheryl L. Sesler1, Lukasz S. Wylezinski1,2, Guzel I. Shaginurova1, Elena V. Grigorenko1, Franklin R. Cockerill, III1,3,4, Charles F. Spurlock, III1,2,5,6
1 Decode Health, Nashville, TN
2 Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
3 Department of Medicine, Rush University Medical Center, Chicago, IL
4 Trusted Health Advisors, Scottsdale, AZ
5 Wagner School of Public Service, New York University, New York, NY
6 Thomas F. Frist, Jr. College of Medicine, Belmont University, Nashville, TN
Introduction. RNA sequencing (RNA-seq) combined with advanced analytics is accelerating biomarker discovery, yet blood tests that reliably measure sepsis severity remain limited. Sepsis is a dysregulated host response to infection that often presents with nonspecific symptoms, which delays treatment and raises mortality. We assessed whether integrating classical differential expression (DE) analysis with multivariate feature selection (FS) can identify robust RNA signatures of sepsis severity from public datasets, yielding a concise feature set for downstream machine learning (ML) modeling.
Methods. PAXgene whole blood RNA-seq data were obtained from four NCBI studies (total n = 622). Nine predefined comparisons addressed infection status, Sequential Organ Failure Assessment (SOFA) score, septic shock, and survival. For each comparison, we identified DE genes (adjusted P < 0.05, |log₂FC| ≥ 1.0, mean expression ≥ 10) and FS genes supported by two supervised algorithms: recursive feature elimination and chi-square testing. Genes were ranked by recurrence across comparisons, and functional relevance was assessed by pathway enrichment analysis and literature review. The resulting gene signatures formed the candidate feature space for patient classification.
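The DE filter and recurrence ranking described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the gene names, input values, and function names are hypothetical, and treating "supported by two supervised algorithms" as the intersection of the two selections is an assumption. Only the three DE thresholds come from the text.

```python
from collections import Counter

# Thresholds as stated in the Methods.
PADJ_MAX = 0.05        # adjusted P < 0.05
ABS_LOG2FC_MIN = 1.0   # |log2FC| >= 1.0
MEAN_EXPR_MIN = 10.0   # mean expression >= 10


def is_de(row):
    """Return True if one gene-level result passes all three DE thresholds."""
    return (
        row["padj"] < PADJ_MAX
        and abs(row["log2fc"]) >= ABS_LOG2FC_MIN
        and row["mean_expr"] >= MEAN_EXPR_MIN
    )


def de_genes(results):
    """Collect the DE genes for a single comparison."""
    return {row["gene"] for row in results if is_de(row)}


def fs_genes(rfe_selected, chi2_selected):
    """Assumption: a gene counts as FS-supported only if both methods select it."""
    return set(rfe_selected) & set(chi2_selected)


def rank_by_recurrence(gene_sets):
    """Rank genes by the number of comparisons in which they appear."""
    counts = Counter(gene for genes in gene_sets for gene in genes)
    # Most recurrent first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-comparison results (placeholder genes and values).
comparison_1 = [
    {"gene": "GENE_A", "padj": 0.001, "log2fc": 2.3, "mean_expr": 150.0},
    {"gene": "GENE_B", "padj": 0.200, "log2fc": 3.0, "mean_expr": 80.0},   # fails adjusted P
    {"gene": "GENE_C", "padj": 0.010, "log2fc": -1.4, "mean_expr": 40.0},
    {"gene": "GENE_D", "padj": 0.010, "log2fc": 1.8, "mean_expr": 4.0},    # fails expression
]
comparison_2 = [
    {"gene": "GENE_A", "padj": 0.030, "log2fc": -1.0, "mean_expr": 10.0},  # passes at the boundaries
    {"gene": "GENE_C", "padj": 0.040, "log2fc": 0.6, "mean_expr": 55.0},   # fails fold change
]

de_sets = [de_genes(comparison_1), de_genes(comparison_2)]
ranking = rank_by_recurrence(de_sets)
consensus = fs_genes(["GENE_A", "GENE_C", "GENE_E"], ["GENE_C", "GENE_E", "GENE_F"])

print(ranking)
print(sorted(consensus))
```

The same `rank_by_recurrence` helper could be applied to the per-comparison FS sets, so DE and FS genes are ranked on a common scale before the sets are compared.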
Results. Three recurrent gene sets were identified: 262 DE genes, 50 FS genes, and 345 genes across both methods (DE + FS). All sets were enriched for neutrophil degranulation, a bactericidal pathway also involved in organ injury, and for disrupted fibronectin matrix organization, which is consistent with the loss of vascular integrity in sepsis. The FS and DE + FS sets uniquely highlighted abnormal hemostasis, particularly platelet degranulation, linking coagulation to poor outcomes. Only seven genes were common to all sets, and four of these consistently correlated with increasing sepsis severity, underscoring their potential as markers of disease progression. ML models trained on these gene sets achieved area under the curve (AUC) values up to 0.947, indicating strong predictive performance for classifying sepsis severity.
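For readers less familiar with the metric, the sketch below shows one way an AUC can be computed, assuming it refers to the area under the receiver operating characteristic curve for a binary outcome (the abstract does not specify). It uses the pairwise-ranking definition: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. The labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive case outscores a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = severe, 0 = non-severe; scores are model outputs.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported maximum of 0.947 should be read.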
Conclusions. Integrating multivariate FS with DE analysis of well-curated public RNA-seq studies revealed coherent, biologically plausible blood signatures of sepsis, including pathways and genes missed by univariate methods alone. This multi-analytical framework generates focused gene panels that support diagnostic and prognostic modeling, advancing the precision management of sepsis.