SEQUENTIAL LEARNING METHOD WITH DECOUPLED SUB-MODELS FOR IMPROVING THE EFFICIENCY OF HIERARCHICAL APPAREL IMAGE CLASSIFICATION
DOI:
https://doi.org/10.31891/2307-5732-2025-359-67
Keywords:
limited data, e-commerce, clothing recognition, deep learning, hierarchical classification, ResNet50
Abstract
This paper presents a sequential training method for specialized models in hierarchical fashion image classification based on a modified ResNet-50 architecture. Unlike the traditional approach with a single model, a single input, and combined outputs, we developed three separate models, trained sequentially, for classifying categories, subcategories, and attributes. The key innovation lies in training the subcategory model first with full ResNet-50 optimization, then freezing the learned features and reusing them in the subsequent category and attribute models. This approach allows each model to specialize in its own classification task while maintaining computational efficiency. Experiments were conducted on the Fashion Product Images (Small) dataset, restricted to the "Apparel" and "Footwear" categories, which together comprise 24,996 images. The dataset is challenging because of class imbalance: some subcategories contain fewer than 100 samples. On the full dataset, subcategory classification accuracy improves from 78.8% to 83.9%. The method is also robust to accuracy degradation under limited data: with a 10-fold reduction in training samples, subcategory classification accuracy increases from 32.4% to 70.2%, a roughly 2.2-fold improvement. The developed method reduces total model size by about 45% (24.8M vs 45.6M parameters) and converges faster (21-24 epochs versus 40). Each specialized model employs a simplified MLP architecture with a 512-256 neuron configuration, a dropout rate of 0.3, and an L2 regularization coefficient of 0.0005. The practical value of the method lies in enabling efficient classification under limited computational resources and limited training data, conditions that are typical of specialized e-commerce domains.
Attribute classification accuracy decreased from 98.6% to 92% (6.6 percentage points). This trade-off is acceptable given the substantial improvements in the other metrics and the method's superior performance in data-scarce scenarios, which make it particularly suitable for real-world e-commerce applications where large annotated datasets are difficult to obtain.
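The headline ratios quoted in the abstract can be rechecked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are illustrative and are not taken from the paper.

```python
# Figures as reported in the abstract; variable names are illustrative.
baseline_params_m = 45.6   # single multi-output model, millions of parameters
proposed_params_m = 24.8   # three sequentially trained models, millions of parameters

# Relative reduction in total model size.
size_reduction = 1 - proposed_params_m / baseline_params_m

# Subcategory accuracy (%) after a 10-fold reduction in training samples.
baseline_acc_small = 32.4
proposed_acc_small = 70.2
improvement_factor = proposed_acc_small / baseline_acc_small

# Accuracy changes on the full dataset, in percentage points.
subcategory_gain_pp = 83.9 - 78.8
attribute_drop_pp = 98.6 - 92.0

print(f"model size reduction: {size_reduction:.1%}")
print(f"low-data improvement factor: {improvement_factor:.2f}x")
print(f"subcategory gain: {subcategory_gain_pp:.1f} pp")
print(f"attribute drop: {attribute_drop_pp:.1f} pp")
```

The size reduction evaluates to about 45.6% and the low-data factor to about 2.17, consistent with the rounded "45%" and "2.2x" in the text.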
License
Copyright (c) 2025 РОМАН ТИМКІВ, ПАВЛО ГОРУН (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.