Evaluation Metrics

The submission will be evaluated with the following metrics, reported separately for three lesion-burden brackets (<1%, 1%–5%, >5%). Evaluation codes are available here.

  • Dice
  • MASD (Mean Average Surface Distance)
  • NSD (Normalized Surface Distance)

For definitions of the above evaluation metrics, please refer to the paper below:

Reinke A, Tizabi MD, Sudre CH, et al. Common limitations of image processing metrics: A picture story. arXiv preprint arXiv:2104.05642, 2021.
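As a quick reference for the overlap metric, the Dice coefficient can be computed from a pair of binary masks. The sketch below is illustrative only and is not the official evaluation code; the function name and the convention of scoring two empty masks as a perfect match are our assumptions:

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient for binary masks (1.0 = perfect overlap)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom
```

For example, two 4×4 masks that each cover half the grid and overlap on a 2×2 corner give a Dice of 0.5.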

We have released our evaluation Docker image here.

Ranking

  • Compute the Dice, MASD, and NSD values for each case.
  • Rank each team separately for Dice, MASD, and NSD over all cases (higher Dice and NSD are better; lower MASD is better).
  • Average the three metric ranks to obtain each team's final rank.
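The ranking steps above can be sketched as follows. This is a minimal illustration, not the official ranking code: the team names, the per-team aggregation over cases by mean, and the tie handling (first-come ordering) are all our assumptions:

```python
import numpy as np

def rank_teams(scores, higher_better):
    """Rank teams per metric (1 = best), then average ranks across metrics.

    scores: {metric: {team: [per-case values]}}
    higher_better: {metric: True if larger values are better}
    """
    teams = sorted(next(iter(scores.values())).keys())
    mean_ranks = {t: 0.0 for t in teams}
    for metric, per_team in scores.items():
        # Aggregate each team's per-case scores (mean, as an assumption).
        agg = {t: float(np.mean(v)) for t, v in per_team.items()}
        order = sorted(teams, key=lambda t: agg[t],
                       reverse=higher_better[metric])
        for rank, team in enumerate(order, start=1):
            mean_ranks[team] += rank / len(scores)
    return mean_ranks

# Hypothetical two-team example:
scores = {
    "Dice": {"A": [0.90, 0.80], "B": [0.70, 0.60]},
    "MASD": {"A": [1.0, 1.2], "B": [2.0, 2.5]},   # lower is better
    "NSD":  {"A": [0.95], "B": [0.80]},
}
higher_better = {"Dice": True, "MASD": False, "NSD": True}
print(rank_teams(scores, higher_better))  # team A ranks first on all metrics
```

Averaging per-metric ranks (rather than raw scores) keeps the three metrics on a common scale even though MASD is a distance and Dice/NSD are bounded overlap measures.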

Leaderboard

We will maintain two separate leaderboards. The first ranks methods trained exclusively on the provided training dataset; the second ranks methods that use pretrained models or other publicly available datasets.