Elucidata Ranks Among the Top 10 in the Broad Institute’s Autoimmune Disease ML Challenge

We’re proud to announce that Elucidata’s team has secured 7th place globally and is among the Top 10 Prize Awardees in the Autoimmune Disease Machine Learning Challenge organized by the Broad Institute in collaboration with CrunchDAO. This achievement underscores our commitment to pushing the boundaries of computational biology and AI-powered research in immunology and digital pathology.

Rethinking Molecular Insights from Histopathology

For decades, hematoxylin and eosin (H&E) stained tissue slides have been the gold standard in diagnosing complex diseases such as Inflammatory Bowel Disease (IBD). While visually rich, these slides lack molecular depth. The Broad ML Challenge posed a provocative question: Can deep learning models infer spatially resolved gene expression profiles directly from whole-slide images, without relying on expensive molecular assays?

This challenge set an ambitious goal: predicting the spatial expression of 2,000 unseen genes using multi-modal data such as H&E images, spatial transcriptomics, and single-cell RNA-seq. With sparse and noisy data that mimics real-world complexity, this wasn’t just a modeling exercise; it was a test of scientific intuition, adaptability, and innovation.

Our Approach

Led by Elucidata’s research engineers and data scientists, Nobal Dhruw, Gaurang Mahajan, and Rajdeep Mondal, the team ran extensive modeling experiments. The approach was divided into two phases:

1. Image-to-Gene Prediction via Contrastive Learning: We trained a CLIP-like model to align H&E-derived image features with gene expression vectors using h-Optimus, a histopathology-optimized image encoder. Multi-scale tiles (32px to 512px) helped capture both cellular morphology and tissue context.

  • Spearman’s correlation improved from 0.28 (ResNet50) to 0.41 (h-Optimus + contrastive learning).
  • Embeddings enabled zero-shot generalization to unseen samples.

2. Out-of-Distribution (OOD) Gene Prediction: To estimate expression of unmeasured genes, we matched predicted 460-gene profiles to similar cells in a reference scRNA-seq atlas using k-nearest neighbors (kNN). We further improved accuracy by predicting cell type from H&E patches and assigning the average scRNA-seq gene profile for that type, which boosted correlation to 0.48.
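
For readers who want a concrete picture, the two phases can be sketched in a few lines of plain Python. This is a minimal illustration, not the team's actual code: the function names (clip_style_loss, knn_impute), the temperature of 0.07, cosine similarity as the matching metric, and simple averaging over neighbors are all assumptions made for the example. It also assumes the image and gene expression embeddings have already been projected into a shared space by the respective encoders.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def clip_style_loss(image_emb, gene_emb, temperature=0.07):
    """Phase 1 (sketch): symmetric contrastive loss over paired embeddings.

    image_emb[i] and gene_emb[i] are assumed to describe the same tissue spot.
    """
    img = [l2_normalize(v) for v in image_emb]
    gen = [l2_normalize(v) for v in gene_emb]
    n = len(img)
    # logits[i][j]: scaled cosine similarity between tile i and profile j
    logits = [[dot(img[i], gen[j]) / temperature for j in range(n)]
              for i in range(n)]
    # Image-to-gene direction: tile i should score its own profile highest
    loss_i2g = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Gene-to-image direction: same idea on the transposed matrix
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_g2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (loss_i2g + loss_g2i)

def knn_impute(predicted_shared, ref_shared, ref_unmeasured, k=3):
    """Phase 2 (sketch): estimate unmeasured genes from the k nearest cells.

    predicted_shared: predicted expression over the shared gene panel
    ref_shared:       reference cells restricted to the same panel
    ref_unmeasured:   the same reference cells over the genes to impute
    """
    query = l2_normalize(predicted_shared)
    sims = [dot(query, l2_normalize(cell)) for cell in ref_shared]
    nearest = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    n_genes = len(ref_unmeasured[0])
    return [sum(ref_unmeasured[i][g] for i in nearest) / len(nearest)
            for g in range(n_genes)]

# Toy usage: three reference cells, two shared genes, two genes to impute
ref_shared = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ref_unmeasured = [[10.0, 0.0], [8.0, 2.0], [0.0, 10.0]]
imputed = knn_impute([1.0, 0.0], ref_shared, ref_unmeasured, k=2)
```

In the toy usage, the query resembles the first two reference cells, so the imputed values are the average of their unmeasured-gene profiles. The cell-type refinement described above would replace the neighbor average with a per-cell-type mean profile.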

Finishing in the top 10 affirmed the team’s deep domain knowledge and methodical problem-solving, essential ingredients for impactful AI in life sciences.

Why This Matters

This outcome reinforces a belief we hold deeply at Elucidata: the future of diagnostics and drug discovery lies at the intersection of multi-modal data and purpose-built AI systems. The current limitations in model interpretability and biological validation aren’t barriers; they’re opportunities for innovation.

We see this not just as a win in a competition, but as a milestone in a longer journey toward:

  • Enhancing Diagnostic Tools: Enabling pathologists to extract molecular insights directly from routine H&E slides, improving diagnostic accuracy.
  • Enabling Personalized Medicine: Assisting clinicians in treatment selection based on predicted gene expression profiles.
  • Reducing Costs: Significantly lowering the reliance on expensive spatial transcriptomic experiments while retaining high-resolution molecular information.
  • Advancing AI in Spatial Biology: Establishing a framework for integrating deep learning models with multi-modal omics data, setting the stage for future breakthroughs in digital pathology and spatial genomics.

What’s Next

The exploration sparked by this challenge continues at Elucidata. We remain committed to building scalable, AI-native systems that make biomedical data more actionable, advancing discovery, accelerating development, and driving better decisions across the life sciences ecosystem.

About Elucidata

Elucidata is a data-first AI company that accelerates life sciences R&D by converting fragmented biomedical datasets into harmonized, AI-ready assets. Its proprietary Polly platform integrates EHRs, genomics, imaging, and clinical trial data for seamless downstream analysis and model development. Headquartered in San Francisco with offices across the U.S. and India, Elucidata supports over 70 pharma and diagnostics clients, and has contributed to 16 drug programs progressing toward FDA approvals. Recognized by the National Cancer Institute and Fast Company for its innovation in biotech, Elucidata is enabling a new era of data-driven discovery.

Explore our work at elucidata.io.
