Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation

¹Yale University, ²United Imaging Intelligence
MICCAI 2025 (Early accept)

Motivation

  • Ultrasound image segmentation remains a long-standing challenge in medical imaging because ultrasound images have a low signal-to-noise ratio, indistinct and ambiguous anatomical boundaries, and high anatomical variability across patients.
  • Existing ultrasound segmentation methods often struggle to adapt to new tasks and rely on costly manual annotations.
  • Current real-time segmentation approaches fail to match state-of-the-art performance.

Method

Main Framework

We adapt Hiera to extract multi-scale features, interleaved with DINOv2 features and decoded by a hierarchical decoder. Red blocks denote trainable parameters.

  • Adapters: We introduce a lightweight adapter positioned after the skip connection in Hiera’s multi-scale attention block built on MViTv2.
  • Feature interleaving: To enhance semantic representation, we incorporate an auxiliary DINOv2 encoder and merge its features with the Hiera features by interleaving them slice by slice along the channel dimension.
  • Hierarchical decoder: We propose a hierarchical decoder that progressively fuses coarse-to-fine representations in a UNet-like style.
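
The three components above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a residual bottleneck adapter with a ReLU, two feature maps that have already been brought to the same shape before interleaving, and a decoder that only upsamples and concatenates (the learned convolutions of a real UNet-like decoder are omitted). All function and parameter names are hypothetical.

```python
import numpy as np


def bottleneck_adapter(x, w_down, w_up):
    """Residual bottleneck adapter on token features x of shape (N, C).

    Assumed form: x + relu(x @ w_down) @ w_up. If w_up is initialized
    to zeros, the adapter starts out as the identity mapping.
    """
    hidden = np.maximum(x @ w_down, 0.0)  # down-project to r << C, then ReLU
    return x + hidden @ w_up              # up-project and add the residual


def interleave_channels(feat_a, feat_b):
    """Interleave two (C, H, W) feature maps along the channel axis.

    Output channel order is a0, b0, a1, b1, ... with shape (2C, H, W).
    Assumes both maps already share the same shape.
    """
    if feat_a.shape != feat_b.shape:
        raise ValueError("feature maps must share the same shape")
    c, h, w = feat_a.shape
    stacked = np.stack([feat_a, feat_b], axis=1)  # (C, 2, H, W)
    return stacked.reshape(2 * c, h, w)


def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)


def decode_coarse_to_fine(features):
    """Fuse feature maps ordered fine -> coarse in a UNet-like fashion.

    Each step upsamples the running coarse map and concatenates it with
    the next finer map along channels. Learned layers are omitted.
    """
    fused = features[-1]
    for skip in reversed(features[:-1]):
        fused = np.concatenate([upsample2x(fused), skip], axis=0)
    return fused
```

With two maps of shape (C, H, W), `interleave_channels` returns a (2C, H, W) map whose even channels come from the first input and odd channels from the second, in contrast to plain concatenation, which would keep each encoder's channels in one contiguous group.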

Results

Data efficiency and adaptability on cardiac ultrasound

Our approach remains highly effective under limited supervision, significantly outperforming baselines when trained with only 1% and 10% of the training data.


Cross-dataset generalization on thyroid ultrasound

Our method generalizes well when trained on TN3K and tested on DDTI, and it outperforms existing state-of-the-art methods on other thyroid ultrasound datasets.

Citation

@article{zhang2025adapting,
  title={Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation},
  author={Zhang, Xiaoran and Chen, Eric Z and Zhao, Lin and Chen, Xiao and Liu, Yikang and Maihe, Boris and Duncan, James S and Chen, Terrence and Sun, Shanhui},
  journal={arXiv preprint arXiv:2503.24368},
  year={2025}
}