
Generating Labeled Samples Based on Improved cDCGAN for Hyperspectral Data Augmentation: A Case Study of Drought Stress Identification of Strawberry Leaves

Fengle Zhu, Jian Wang, Ping Lv, Xin Qiao, Mengzhu He, Yong He, Zhangfeng Zhao

Computers and Electronics in Agriculture (2024)

Abstract
Deep learning is increasingly used to analyze hyperspectral imaging (HSI) data, and its strong modeling performance depends on large, high-quality labeled datasets. However, acquiring large-scale annotated spectral data is time-consuming and costly. To address this challenge, this study proposed an improved conditional Deep Convolutional Generative Adversarial Network (cDCGAN) model to generate labeled HSI samples for data augmentation. The identification of drought-stressed strawberry leaves was taken as the case study. Seven small-sample training datasets were constructed with sample sizes of 6, 10, 20, 30, 40, 50, 70 and 100, respectively. Different cDCGAN architectures were tested by varying the network depth and label concatenation pattern. An Indicator of Generated Data Quality (IGDQ) was proposed to evaluate the quality of the generated spectra and thereby identify the optimal cDCGAN architecture. Then, for each of the seven limited-sample training datasets, high-quality pseudo spectral data were generated using the proposed cDCGAN and merged into the original training dataset for data augmentation. A Residual Network (ResNet) classifier was trained both before and after data augmentation. Conventional machine learning classifiers, including Support Vector Machine (SVM) and Decision Tree (DT), were also constructed. Results showed that the accuracy of ResNet, SVM, and DT improved by an average of 6.9%, 3.4%, and 3.1%, respectively, after data augmentation. Moreover, the minimal sample size achieving effective data augmentation could be as low as 20: its augmented datasets achieved accuracy comparable or even superior to that of the original training dataset with 100 samples. The various aspects affecting the quality of generated spectral data were also discussed, including different model frameworks (cDCGAN and cWGAN) and architectures.
The overall results demonstrated that the proposed cDCGAN model achieved satisfactory results on the small-sample datasets of drought-stressed strawberry leaves. This method has great potential for the common scenario of imbalanced or small-sample datasets in the domain of plant science.
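
The augmentation workflow described in the abstract (condition a generator on a class label, generate labeled pseudo spectra, merge them into the small training set) can be sketched in a few lines. This is a minimal illustration only, not the authors' model: `toy_generator` is a hypothetical stand-in for a trained cDCGAN generator, the one-hot input concatenation is just one possible label concatenation pattern, and the class count, band count, and noise dimension are assumed values chosen for the example.

```python
import random

def one_hot(label, n_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]

def conditioned_input(noise, label, n_classes):
    """Concatenate a noise vector with the one-hot label (input-level conditioning)."""
    return list(noise) + one_hot(label, n_classes)

def toy_generator(cond, n_classes, n_bands):
    """Hypothetical stand-in for a trained generator; NOT a GAN.
    Splits the conditioned input back into noise and label and emits a fake spectrum."""
    noise, onehot = cond[:-n_classes], cond[-n_classes:]
    label = onehot.index(1.0)
    spectrum = [0.5 + 0.1 * label + 0.01 * noise[i % len(noise)] for i in range(n_bands)]
    return spectrum, label

def augment(real, generated):
    """Merge real and generated (spectrum, label) pairs into one training set."""
    return list(real) + list(generated)

# Assumed toy settings: 2 classes, 8 spectral bands, 4-dimensional noise.
n_classes, n_bands, noise_dim = 2, 8, 4
rng = random.Random(0)

# A tiny "real" training set with one sample per class.
real = [([0.5] * n_bands, 0), ([0.6] * n_bands, 1)]

# Generate three labeled pseudo spectra per class.
generated = []
for label in range(n_classes):
    for _ in range(3):
        noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        cond = conditioned_input(noise, label, n_classes)
        generated.append(toy_generator(cond, n_classes, n_bands))

augmented = augment(real, generated)
```

In this sketch the label travels with every generated sample, which is what lets the pseudo spectra be added to a supervised training set without manual annotation.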
Keywords
cDCGAN, Spectral generation, Data augmentation, Hyperspectral data, Drought stressed leaves