Sensors (Basel). 2020 Dec 7;20(23):6980.
doi: 10.3390/s20236980.

Nondestructive Classification of Soybean Seed Varieties by Hyperspectral Imaging and Ensemble Machine Learning Algorithms

Yanlin Wei et al. Sensors (Basel). 2020.

Abstract

Soybean processing and planting require a reliable, rapid, and accurate technique for identifying soybean varieties. Traditional chemical analysis methods for soybean variety identification (e.g., mass spectrometry and high-performance liquid chromatography) are destructive and time-consuming. In this paper, a robust and accurate method for nondestructive soybean classification is developed using hyperspectral imaging and ensemble machine learning algorithms. Image acquisition, preprocessing, and feature selection are used to obtain different types of soybean hyperspectral features. Based on these features, an ensemble classifier, the random subspace linear discriminant (RSLD) algorithm, is used to classify soybean seeds. Compared with the linear discriminant (LD) and linear support vector machine (LSVM) methods, the RSLD algorithm proves more stable and reliable. When classifying soybeans into 10, 15, 20, and 25 varieties, the RSLD method achieves the highest classification accuracy. When 155 features are used to classify 15 varieties of soybeans, the classification accuracy of the RSLD method reaches 99.2%, while the accuracies of the LD and LSVM methods are 98.6% and 69.7%, respectively. Therefore, the RSLD ensemble classification algorithm maintains high classification accuracy across different numbers of varieties and different sets of classification features.
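
The random subspace idea behind RSLD can be illustrated with a minimal sketch: each learner is a linear discriminant trained on a random subset of the spectral features, and the learners' outputs are combined. The code below is not the authors' implementation. The function names, the synthetic data, the small ridge term, the choice of 30 learners, and the combination rule (averaging posterior probabilities) are all assumptions made for illustration; only the use of 30 predictors per learner echoes a setting mentioned in the figure captions.

```python
import numpy as np

def fit_lda(X, y):
    """Fit a linear discriminant with a pooled covariance matrix."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (len(X) - len(classes))
    cov += 1e-6 * np.eye(X.shape[1])      # small ridge for stability (assumption)
    W = np.linalg.solve(cov, means.T).T   # one weight row per class
    b = -0.5 * np.sum(W * means, axis=1) + np.log(priors)
    return classes, W, b

def lda_posteriors(model, X):
    """Class posteriors: softmax of the linear discriminant scores."""
    _, W, b = model
    scores = X @ W.T + b
    scores -= scores.max(axis=1, keepdims=True)
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

def fit_rsld(X, y, n_learners=30, n_predictors=30, seed=0):
    """Train each learner on a random feature subset (a random subspace)."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_learners):
        cols = rng.choice(X.shape[1], size=n_predictors, replace=False)
        ensemble.append((cols, fit_lda(X[:, cols], y)))
    return ensemble

def predict_rsld(ensemble, X):
    """Average the learners' posteriors and pick the most probable class."""
    classes = ensemble[0][1][0]
    total = sum(lda_posteriors(model, X[:, cols]) for cols, model in ensemble)
    return classes[np.argmax(total, axis=1)]

# Synthetic stand-in for seed spectra: 3 "varieties", 60 "band features".
rng = np.random.default_rng(0)
class_means = rng.normal(0.0, 1.0, size=(3, 60))
y_train = np.repeat(np.arange(3), 40)
y_test = np.repeat(np.arange(3), 20)
X_train = class_means[y_train] + rng.normal(0.0, 1.0, size=(120, 60))
X_test = class_means[y_test] + rng.normal(0.0, 1.0, size=(60, 60))

ensemble = fit_rsld(X_train, y_train)
accuracy = float(np.mean(predict_rsld(ensemble, X_test) == y_test))
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because each learner sees only a subset of the features, its covariance estimate involves fewer parameters than a single discriminant fitted on all bands, which is one commonly cited reason random subspace ensembles can be more stable when features are numerous and correlated.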

Keywords: correlation coefficient matrix; ensemble machine learning algorithms; hyperspectral imaging; random subspace linear discriminant.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Hyperspectral imaging system.
Figure 2
Image processing and spectrum extraction. (I) The reflectance image of soybean seeds at 640.08 nm; (II) The image binarized by the Otsu threshold method; (III) Extraction of the soybean edges and the reflectance “hot spot” by the opening operation and the threshold method; (IV) The region of interest of the soybean seeds; (V) The hyperspectral image of the soybean seeds; (VI) The hyperspectral reflectance of individual soybean seeds.
Figure 3
Flowchart of the classification by the random subspace linear discriminant (RSLD) ensemble classifier.
Figure 4
The mean spectrum for each of the 25 soybean seed varieties.
Figure 5
(a) The correlation matrix of 462 hyperspectral bands for JY204 soybean seed; (b) Average correlations within diagonal blocks.
Figure 6
Ten-fold classification error of the RSLD algorithm as a function of (a) the number of predictors and (b) the number of learners in the ensemble using 30 predictors, when 40, 80, 120, 155, and 185 band features are selected according to the correlation matrix to classify 25 soybean varieties.
Figure 7
The validation accuracies for 10 (a), 15 (b), 20 (c), and 25 (d) soybean varieties classified by the LSVM, LD, and RSLD algorithms using the 40, 80, 120, 155, and 185 band features selected according to the correlation matrix and all 462 spectral band features.
Figure 8
The average classification accuracies of the LSVM, LD, and RSLD algorithms versus the number of band features.
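
The segmentation step described in the Figure 2 caption (binarizing one reflectance band with the Otsu threshold, then averaging the spectrum over the seed pixels) can be sketched roughly as follows. This is an illustrative sketch on a synthetic cube, not the authors' pipeline: the function name, the cube dimensions, the reflectance values, and the band index used for thresholding are assumptions, and the morphological opening, hot-spot removal, and per-seed separation mentioned in the caption are omitted.

```python
import numpy as np

def otsu_threshold(image, n_bins=256):
    """Return the threshold that maximizes the between-class variance."""
    hist, edges = np.histogram(image, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist)                    # pixel count at or below each bin
    w1 = w0[-1] - w0                        # pixel count above each bin
    s0 = np.cumsum(hist * centers)          # running intensity sum
    m0 = s0 / np.maximum(w0, 1)             # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    # Use the upper edge of the best bin so that bin falls in the lower class.
    return edges[np.argmax(between) + 1]

# Synthetic cube (rows, columns, bands): dark background, one bright "seed".
rng = np.random.default_rng(0)
cube = rng.normal(0.05, 0.01, size=(20, 20, 5))
cube[5:15, 5:15, :] = rng.normal(0.6, 0.02, size=(10, 10, 5))

band = cube[:, :, 2]                  # band chosen for segmentation (assumption)
threshold = otsu_threshold(band)
mask = band > threshold               # binarized image: True on seed pixels
mean_spectrum = cube[mask].mean(axis=0)  # average reflectance per band

print("seed pixels:", int(mask.sum()))
print("mean spectrum:", np.round(mean_spectrum, 3))
```

With several seeds in one image, a connected-component labeling step would be needed before averaging so that each seed yields its own spectrum.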
