Evaluation Metrics

The submission will be evaluated with the following metrics, reported separately for three lesion-burden brackets (<1%, 1%–5%, >5%). Evaluation codes are available here.

  • Dice
  • MASD (Mean Average Surface Distance)
  • NSD (Normalized Surface Distance)

For definitions of the above evaluation metrics, please refer to the paper below:

Reinke A, Tizabi MD, Sudre CH, et al. Common limitations of image processing metrics: A picture story. arXiv preprint arXiv:2104.05642, 2021.
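As a quick reference for the overlap metric, the Dice coefficient can be computed from a pair of binary masks. The sketch below is illustrative only and is not the official evaluation code; the function name and the convention of scoring two empty masks as a perfect match are our assumptions:

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient for binary masks (1.0 = perfect overlap)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom
```

For example, two 4×4 masks that each cover half the grid and overlap on a 2×2 corner give a Dice of 0.5.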

We have released our evaluation Docker image here.

Ranking

  • Compute the Dice, MASD, and NSD values for each case.
  • Rank each team separately for Dice, MASD, and NSD over all cases (higher Dice and NSD are better; lower MASD is better).
  • Average the three metric ranks to obtain each team's final rank.
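The ranking steps above can be sketched as follows. This is a minimal illustration, not the official ranking code: the team names, the per-team aggregation over cases by mean, and the tie handling (first-come ordering) are all our assumptions:

```python
import numpy as np

def rank_teams(scores, higher_better):
    """Rank teams per metric (1 = best), then average ranks across metrics.

    scores: {metric: {team: [per-case values]}}
    higher_better: {metric: True if larger values are better}
    """
    teams = sorted(next(iter(scores.values())).keys())
    mean_ranks = {t: 0.0 for t in teams}
    for metric, per_team in scores.items():
        # Aggregate each team's per-case scores (mean, as an assumption).
        agg = {t: float(np.mean(v)) for t, v in per_team.items()}
        order = sorted(teams, key=lambda t: agg[t],
                       reverse=higher_better[metric])
        for rank, team in enumerate(order, start=1):
            mean_ranks[team] += rank / len(scores)
    return mean_ranks

# Hypothetical two-team example:
scores = {
    "Dice": {"A": [0.90, 0.80], "B": [0.70, 0.60]},
    "MASD": {"A": [1.0, 1.2], "B": [2.0, 2.5]},   # lower is better
    "NSD":  {"A": [0.95], "B": [0.80]},
}
higher_better = {"Dice": True, "MASD": False, "NSD": True}
print(rank_teams(scores, higher_better))  # team A ranks first on all metrics
```

Averaging per-metric ranks (rather than raw scores) keeps the three metrics on a common scale even though MASD is a distance and Dice/NSD are bounded overlap measures.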

Leaderboard

We will maintain two separate leaderboards. The first ranks methods trained exclusively on the provided training dataset; the second ranks methods that use pretrained models or other publicly available datasets.