New Benchmarks for Semantic Segmentation Models

Mapillary Research ranks no. 1 for semantic segmentation of street scenes on the Cityscapes and Mapillary Vistas leaderboards.

Semantic segmentation on a Mapillary Vistas image

Mapillary’s semantic segmentation models are based on the most recent deep learning research. Our technology allows us to train models from scratch. The results of our work have now set new benchmarks for two of the most renowned and challenging datasets for semantic segmentation of street scenes.

With our most recent algorithmic advances, we rank #1 on the well-known Cityscapes benchmark with a class-level Intersection-over-Union (IoU) score of 82.0%. This metric, also known as the Jaccard index, measures the overlap between the predicted and ground-truth pixel sets for each class: it rewards correct pixel classifications while penalizing incorrect ones.
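To make the metric concrete, here is a minimal sketch of class-level IoU in plain Python. For each class it counts true positives (TP), false positives (FP), and false negatives (FN) over all pixels and computes TP / (TP + FP + FN); the class-level score is the mean over classes. The function names and the toy labels are illustrative assumptions, not the official Cityscapes evaluation code.

```python
def per_class_iou(gt, pred, num_classes):
    """IoU per class from flat lists of ground-truth and predicted labels."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1          # pixel correctly assigned to class g
        else:
            fn[g] += 1          # class g missed this pixel
            fp[p] += 1          # class p wrongly claimed this pixel
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # A class absent from both ground truth and prediction has no IoU.
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average IoU over the classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


# Toy example: six pixels, three classes (hypothetical labels).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt, pred, num_classes=3)
print(ious, mean_iou(ious))
```

In this toy case the per-class scores are 1/3, 2/3, and 1/2, so the mean is 0.5. Note that a single misclassified pixel hurts two classes at once: it is a false negative for the true class and a false positive for the predicted one.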

Our approach also notably improves over the second-best entry in the leaderboard for the class-level iIoU metric (65.9% vs. 62.1%, a difference of 3.8 percentage points). iIoU is an instance-level variant of IoU that is adjusted for the bias toward object instances that cover a large image area.
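The adjustment can be sketched as follows, based on the public description of the Cityscapes benchmark: true-positive and false-negative pixels are weighted by the ratio of the class's average instance size to the size of the ground-truth instance they belong to, while false positives stay unweighted because they belong to no ground-truth instance. This is a simplified illustration, not the official evaluation script; the function name, the flat-list input format, and the `avg_instance_size` parameter (which the sketch assumes is supplied by the caller) are assumptions.

```python
def class_iiou(gt_labels, gt_instances, pred_labels, cls, avg_instance_size):
    """Instance-weighted IoU for one class (simplified sketch).

    gt_labels / pred_labels: flat lists of class ids per pixel.
    gt_instances: flat list of ground-truth instance ids per pixel.
    avg_instance_size: assumed average pixel count of an instance of `cls`.
    """
    # Pixel count of every ground-truth instance of the class.
    sizes = {}
    for g, inst in zip(gt_labels, gt_instances):
        if g == cls:
            sizes[inst] = sizes.get(inst, 0) + 1

    itp = 0.0   # weighted true positives
    ifn = 0.0   # weighted false negatives
    fp = 0      # unweighted false positives
    for g, inst, p in zip(gt_labels, gt_instances, pred_labels):
        if g == cls:
            weight = avg_instance_size / sizes[inst]  # small instances weigh more
            if p == cls:
                itp += weight
            else:
                ifn += weight
        elif p == cls:
            fp += 1

    denom = itp + fp + ifn
    return itp / denom if denom else None


# Toy example: one 4-pixel instance found, one 1-pixel instance missed.
gt_labels = [1, 1, 1, 1, 1, 0]
gt_instances = [10, 10, 10, 10, 11, 0]
pred_labels = [1, 1, 1, 1, 0, 0]
print(class_iiou(gt_labels, gt_instances, pred_labels, cls=1, avg_instance_size=2.5))
```

With these toy labels, plain IoU for class 1 would be 4/5 = 0.8 because the large instance dominates the pixel count. The weighted version gives both instances equal influence and yields 0.5, since one of the two instances was missed entirely.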

Additionally, we have achieved the best result on Mapillary Vistas. Compared to Cityscapes, this dataset has five times as many images and more complex object categories (65 vs. 19 classes), contains high-resolution images (up to 22 MP) from all over the world, and varies more in weather conditions, time of day, and season.

With a single evaluation pass of our model, our result on Mapillary Vistas (IoU 53.37%) surpasses the winning entry of the 2017 Large-Scale Scene Understanding (LSUN'17) workshop (52.99%), which requires 24 passes per image and is thus roughly 24x slower than ours.

These results use our most recent In-Place Activated BatchNorm method, a very general approach to making better use of GPU memory during training. As semantic segmentation is a memory-demanding task, we look forward to seeing how the community will use our method (code provided here) to improve other applications in deep learning research.

/Peter & Mapillary Research

Semantic segmentation on a Cityscapes image
