SeriesFusion
Science, curated & edited by AI

Six months of hospital records can predict patient readmission better than three years of data, because older clinical notes add more noise than signal.

When machine learning models draw on clinical notes, predictive performance peaks surprisingly early. Structured data such as lab results benefits from longer histories, but the signal from text plateaus after just 3 to 6 months, and adding older notes introduces more noise than signal. This finding challenges the common assumption that larger, multi-year datasets are always better for medical AI. For note-based models, hospitals may get better results by focusing on recent, high-quality documentation than by digging through years of archives.

Original Paper

Temporal Data Requirement for Predicting Unplanned Hospital Readmissions

Ramin Mohammadi, Vahab Vahdat, Sarthak Jain, Amir T. Namin, Ramya Palacholla, Sagar Kamarthi

arXiv  ·  2605.00738

With the proliferation of Electronic Health Records (EHRs), a critical challenge in building predictive models is determining the optimal historical data time window to maximize accuracy. This study investigates the impact of various observation windows ranging from the day of surgery to three years prior on predicting 30-day readmission following hip and knee arthroplasties. The dataset encompasses both structured encounter records (over 4 million) and unstructured clinical notes (80,000) from