IMAGE SEGMENTATION AND CLASSIFICATION USING SLIC SUPERPIXEL IN A FOREST ENVIRONMENT
DOI:
https://doi.org/10.31891/2307-5732-2024-345-6-9
Keywords:
unmanned aerial vehicles, convolutional neural network, forest segmentation, remote sensing, semantic segmentation
Abstract
Unmanned aerial vehicles (UAVs) have become an indispensable tool for collecting high-precision geospatial data owing to their affordability, mobility, and ability to capture imagery at extremely high resolution. Unlike traditional remote sensing systems, such as satellites or airborne platforms, UAVs allow localized areas to be observed with a high frequency of repeated data collection. This opens up new possibilities for monitoring environmental changes, analyzing land use, and identifying individual objects such as trees.
The study focuses on the automated identification of trees within a forest area represented by an orthophoto map created from 255 high-resolution images. The main data processing method is superpixel segmentation using the Simple Linear Iterative Clustering (SLIC) algorithm. This approach groups pixels into compact regions (superpixels) with homogeneous texture or color properties, which greatly simplifies subsequent classification. Three segmentation configurations were studied: 2000, 3000, and 4000 superpixels. The algorithm parameters, a smoothing scale of σ = 5 and a compactness of 10, were selected to ensure optimal segmentation quality.
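To make the role of the superpixel count and the compactness parameter more concrete, the following minimal sketch computes two quantities from the standard SLIC formulation: the approximate spacing S = sqrt(N / K) between initial cluster centers, and the combined distance D = sqrt(d_c² + (d_s / S)² · m²), where m is the compactness. The orthophoto dimensions and the function names are hypothetical, since the abstract states neither the image size nor the implementation used.

```python
import math


def slic_grid_interval(height, width, n_superpixels):
    """Approximate spacing S between initial SLIC cluster centers: sqrt(N / K)."""
    return math.sqrt(height * width / n_superpixels)


def slic_distance(color_a, color_b, pos_a, pos_b, interval, compactness=10.0):
    """Combined SLIC distance D = sqrt(d_c^2 + (d_s / S)^2 * m^2).

    d_c is the Euclidean distance in color space, d_s the Euclidean
    distance in image coordinates, S the grid interval, m the compactness.
    """
    d_color = math.dist(color_a, color_b)
    d_space = math.dist(pos_a, pos_b)
    return math.sqrt(d_color ** 2 + (d_space / interval) ** 2 * compactness ** 2)


# Hypothetical orthophoto size in pixels (assumption; not given in the abstract).
HEIGHT, WIDTH = 6000, 8000

# Grid interval for each of the three configurations named in the abstract.
intervals = {k: slic_grid_interval(HEIGHT, WIDTH, k) for k in (2000, 3000, 4000)}

for k, s in intervals.items():
    print(f"{k} superpixels: grid interval of about {s:.1f} px")
```

A larger compactness weights spatial proximity more heavily and yields more regular, square-like superpixels, while a smaller value lets superpixels follow color boundaries more closely. For reference, scikit-image exposes the same three settings as the `n_segments`, `compactness`, and `sigma` arguments of `skimage.segmentation.slic`; whether the authors used that library is not stated.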
Superpixel classification was performed using the ResNet-50 deep convolutional neural network. The model was pre-trained on a large image dataset to recognize common textures and shapes, after which its weights were adapted to classify trees and background using the new dataset of superpixel images. This combination of pre-training and transfer learning significantly improved classification accuracy.
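The abstract does not describe how individual superpixels were turned into images for the classifier. One plausible approach, shown here purely as an assumption, is to crop the bounding box of each superpixel from the orthophoto and pass the resulting patch to the network. The sketch below uses plain nested lists and hypothetical helper names to illustrate the idea on a toy label map.

```python
def superpixel_bounding_boxes(labels):
    """Map each superpixel label to (row_min, col_min, row_max, col_max), inclusive."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label not in boxes:
                boxes[label] = (r, c, r, c)
            else:
                r0, c0, r1, c1 = boxes[label]
                boxes[label] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    return boxes


def crop(image, box):
    """Return the rectangular patch of `image` covered by an inclusive bounding box."""
    r0, c0, r1, c1 = box
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


# Toy label map with three superpixels (illustrative values only).
toy_labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 1],
]

# Toy single-channel image of the same shape.
toy_image = [
    [10, 11, 12, 13],
    [14, 15, 16, 17],
    [18, 19, 20, 21],
]

boxes = superpixel_bounding_boxes(toy_labels)
patches = {label: crop(toy_image, box) for label, box in boxes.items()}
```

In a full pipeline, each patch would typically be resized to the input size expected by the network and labeled as tree or background before training; those steps are omitted here.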
The experiments showed that the best performance was achieved with 3000 superpixels, where the classification accuracy reached 87%. This indicates an optimal balance between segmentation detail and the model's ability to recognize objects accurately. Using fewer superpixels resulted in the loss of fine details, while a larger number increased computational costs and degraded the results because objects became insufficiently distinct.