Authors:
Saw Mya Nandar
Addresses:
Department of Computer Systems and Technologies, University of Computer Studies, Yangon, Yangon Region, Myanmar.
In computer vision, semantic segmentation is a fundamental problem that requires assigning a semantic label to every pixel in an image. Although pixel-wise labelling is considered the gold standard because of its fine-grained resolution, it is extremely expensive in terms of annotation and processing resources. Patch-wise labelling has emerged as a potentially useful compromise between annotation efficiency and segmentation accuracy. This study provides a comprehensive comparative analysis of patch-wise and pixel-wise labelling strategies for semantic segmentation across several datasets and architectures. We investigate the trade-offs among label granularity, computational cost, model performance, and generalisation ability. Experiments are carried out with state-of-the-art segmentation networks, including U-Net, DeepLabV3+, and Swin Transformer, on benchmark datasets such as Cityscapes and PASCAL VOC. Our findings reveal the conditions under which patch-wise labelling can serve as a strong substitute for pixel-wise approaches when supervision is weak or resources are limited. The paper also considers the implications for annotation, model architecture, and the deployment of segmentation systems in the real world.
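As a rough illustration of the label-granularity trade-off described above, the following sketch coarsens a pixel-wise label mask into patch-wise labels. The function name `pixel_to_patch_labels`, the majority-vote rule, and the non-overlapping square patches are assumptions made for illustration only; the abstract does not state how patch labels are derived in the study.

```python
import numpy as np


def pixel_to_patch_labels(mask: np.ndarray, patch: int, num_classes: int) -> np.ndarray:
    """Coarsen an (H, W) integer label mask to (H // patch, W // patch).

    Assumption: each non-overlapping patch takes the most frequent class
    among its pixels; ties resolve to the lowest class index.
    """
    h, w = mask.shape
    if h % patch or w % patch:
        raise ValueError("mask height and width must be divisible by the patch size")
    rows, cols = h // patch, w // patch
    # Rearrange into (rows, cols, patch * patch): one flat vector of pixels per patch.
    blocks = mask.reshape(rows, patch, cols, patch).swapaxes(1, 2)
    blocks = blocks.reshape(rows, cols, patch * patch)
    # Count how many pixels of each class fall inside every patch.
    counts = (blocks[..., None] == np.arange(num_classes)).sum(axis=2)
    return counts.argmax(axis=-1)


# Toy 4x4 mask with three classes, coarsened with 2x2 patches.
pixel_mask = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [2, 2, 0, 0],
    [2, 0, 0, 0],
])
patch_mask = pixel_to_patch_labels(pixel_mask, patch=2, num_classes=3)
```

In this toy case the annotator would supply 4 patch labels instead of 16 pixel labels, at the cost of losing the mixed-class detail inside the top-left and bottom-left patches.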
Keywords: Patch-Wise and Pixel-Wise; Semantic Segmentation; PASCAL and VOC; Segmentation Systems; Deep Learning; Annotation Process; Boundary Precision.
Received on: 12/10/2024, Revised on: 15/12/2024, Accepted on: 22/01/2025, Published on: 03/06/2025
DOI: 10.69888/FTSCL.2025.000425
FMDB Transactions on Sustainable Computer Letters, 2025 Vol. 3 No. 2, Pages: 97-104