
REPORT | SleepFM — One Night’s Sleep, 100+ Disease Signals: A Foundation Model for Predictive Sleep Medicine

SleepFM (Stanford Medicine, Nature Medicine, 2026) is a large foundation model trained on ~585,000 hours of polysomnography (PSG) from ~65,000 people. It predicts risk across ~130 clinical categories from a single night of sleep data. The best reported results include Parkinson’s disease (C‑index ≈ 0.89) and dementia (C‑index ≈ 0.85). The work argues that standard sleep scoring uses only a small fraction of the PSG signal and that deep models can extract rich, clinically relevant signatures. The authors nonetheless emphasize the model's limitations and the need for external validation before deployment.

Study design and dataset
  • Model type: large neural foundation model for sleep (SleepFM) using single‑night PSG as input.
  • Training data: ~585,000 hours of PSG from ~65,000 individuals drawn from clinical sleep centers and research cohorts.
  • Labels: electronic health record (EHR) diagnoses, registry data, and clinical phenotypes spanning metabolic, cardiovascular, oncologic, neurological, pregnancy, and psychiatric categories (~130 outcomes).
  • Evaluation: retrospective prediction tasks with held‑out test sets; performance measured with the concordance index (C‑index) and the area under the ROC curve (AUROC) for many outcomes.
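
Since the headline numbers in this report are C‑index values, a small worked example may help in reading them. The sketch below computes Harrell's concordance index in plain Python for right‑censored follow‑up data. It is an illustration only, not the authors' evaluation code: the function name, the toy follow‑up times, and the choice to skip tied times are all assumptions made here.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  -- follow-up time per subject (time to event or to censoring)
    events -- 1 if the diagnosis was observed, 0 if the subject was censored
    risks  -- model risk score per subject (higher = earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): pairs with tied times are skipped
        # 'a' is the subject with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four subjects, the third one censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 4 of 5 comparable pairs are concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. A C‑index of about 0.89 therefore means that in roughly 89% of comparable pairs the person diagnosed earlier received the higher risk score.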
Methods (high level)
  • Input processing: multi‑channel PSG (EEG, EOG, EMG, respiratory channels, SpO2, ECG where available) preprocessed and tokenized/time‑embedded for model input.
  • Model architecture: large transformer/convolutional hybrid pretrained on self‑supervised sleep reconstruction tasks, then fine‑tuned for supervised outcome prediction across multiple labels (multi‑task learning).
  • Training strategy: scale‑up with pretraining on unlabeled PSG followed by multi‑label supervised fine‑tuning; class imbalance handled via loss weighting and data augmentation.
  • Interpretability: saliency maps, channel importance, temporal attention patterns to highlight predictive epochs and features.
  • Robustness checks: internal cross‑validation, stratified analyses by age/sex, and subgroup performance reporting.
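
The report says class imbalance was handled partly through loss weighting but gives no formula. One common form is sketched below as an assumption, not as the SleepFM training code: a multi‑label binary cross‑entropy in which each outcome's positive examples are up‑weighted by the ratio of negatives to positives. The function names and toy labels are hypothetical.

```python
import math


def positive_weights(labels):
    """Per-outcome weight n_negative / n_positive (assumed weighting scheme).

    labels -- list of rows, one per recording; each row holds 0/1 per outcome.
    """
    n_rows = len(labels)
    weights = []
    for k in range(len(labels[0])):
        n_pos = sum(row[k] for row in labels)
        n_neg = n_rows - n_pos
        # Fall back to 1.0 when an outcome has no positive example at all
        weights.append(n_neg / n_pos if n_pos else 1.0)
    return weights


def weighted_bce(probs, labels, pos_weight, eps=1e-12):
    """Mean binary cross-entropy over all (recording, outcome) cells,
    with the positive term of each outcome scaled by its weight."""
    total = 0.0
    count = 0
    for p_row, y_row in zip(probs, labels):
        for p, y, w in zip(p_row, y_row, pos_weight):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count


# Toy example: four recordings, two rare outcomes (one positive each).
toy_labels = [[1, 0], [0, 0], [0, 1], [0, 0]]
toy_probs = [[0.5, 0.5]] * 4
weights = positive_weights(toy_labels)
print(weights, weighted_bce(toy_probs, toy_labels, weights))
```

With these weights a missed positive for a rare diagnosis costs as much as several misclassified negatives, which keeps rare outcomes from being ignored during multi‑task fine‑tuning.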
Key results
  • Breadth: SleepFM produced significant predictive signals for ~130 clinical categories from single‑night PSG, many of them unexpected (e.g., some cancers, pregnancy complications, and systemic diseases).
  • Strong performers: Parkinson’s disease (C‑index ≈ 0.89) and dementia (C‑index ≈ 0.85); several cardiovascular and metabolic outcomes showed moderate predictive ability.
  • Added value: the model outperformed conventional sleep metrics (apnea‑hypopnea index [AHI], sleep stage percentages) and simple handcrafted features on most tasks.
  • Explainability: Attention and saliency analyses often pointed to sleep fragmentation, autonomic signatures (ECG/HRV), respiratory/oxygenation patterns, and specific EEG features as drivers of predictions.
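
To show how little of the recording the conventional metrics use, the sketch below computes the two baselines named above from a scored night: the apnea‑hypopnea index (respiratory events per hour of sleep) and the share of sleep time spent in each stage. The function names, stage labels, and toy values are assumptions for illustration and do not come from the paper.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, total_sleep_minutes):
    """AHI = (apneas + hypopneas) per hour of sleep."""
    if total_sleep_minutes <= 0:
        raise ValueError("total sleep time must be positive")
    return (n_apneas + n_hypopneas) / (total_sleep_minutes / 60.0)


def stage_percentages(hypnogram):
    """Percentage of total sleep time per stage.

    hypnogram -- one stage label per scored epoch; 'W' (wake) is excluded
                 from the denominator. Labels here are hypothetical.
    """
    sleep_epochs = [stage for stage in hypnogram if stage != "W"]
    if not sleep_epochs:
        raise ValueError("hypnogram contains no sleep epochs")
    counts = {}
    for stage in sleep_epochs:
        counts[stage] = counts.get(stage, 0) + 1
    return {stage: 100.0 * n / len(sleep_epochs) for stage, n in counts.items()}


# Toy night: 30 apneas and 60 hypopneas over 6 hours of sleep.
print(apnea_hypopnea_index(30, 60, 360))
print(stage_percentages(["W", "N1", "N2", "N2", "REM"]))
```

Both summaries reduce hours of multi‑channel signal to a handful of numbers, which is the gap the report says a learned representation can fill.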
Interpretation and implications
  • Signal richness: Single-night PSG contains latent biomarkers beyond traditional scoring that correlate with a wide range of health states.
  • Potential uses: triage/referral from sleep labs, risk‑stratification for neurodegenerative disease monitoring, augmenting EHR phenotyping, hypothesis generation for mechanistic studies.
  • Caution: retrospective EHR labels can include misclassification; high discrimination does not imply causality or clinical utility without prospective confirmation.
Limitations and caveats (authors’ emphasis)
  • Sample bias: data sourced from clinical/referred populations (not a representative community cohort), limiting population generalizability.
  • Label quality: many outcomes derived from EHR codes with variable accuracy and timing relative to PSG.
  • Single‑night constraint: the model predicts from one night of data, so night‑to‑night variability may reduce reliability for some conditions.
  • Confounding and spurious correlations: model may rely on signals correlated with care patterns (medications, monitoring) rather than disease biology.
  • External validation required: generalizability across sites, devices, demographics, and non‑clinical PSG settings needs prospective testing.
  • Ethical and regulatory: privacy, transparency, bias mitigation, and downstream clinical workflows need to be addressed before deployment.
Clinical and research recommendations
  • Immediate next steps: independent external validation on population cohorts, prospective prognostic studies, and head‑to‑head comparisons with multi‑night monitoring approaches.
  • For clinicians/researchers: treat SleepFM outputs as hypothesis‑generating, not yet a replacement for diagnostic workflows; consider integrating them into observational studies to discover mechanistic links between sleep signatures and disease.
  • For developers/regulators: prioritize model explainability, audit datasets for biases, and test real‑world impact on decision‑making and patient outcomes before clinical use.

 

SleepFM suggests that a single night of polysomnography contains far more predictive information than standard scoring captures, which could change how sleep medicine and clinical phenotyping are practiced. However, robust external validation, careful attention to bias and causality, and clinical‑utility trials are essential before clinical implementation.

Citations

  • SleepFM team (Stanford Medicine). SleepFM: A foundation model for sleep-based disease prediction. Nature Medicine. 2026 Feb. doi:10.1038/s41591-026-xxxx-x
  • Stanford Medicine news release (Jan/Feb 2026).
