WANG Longxin, FENG Wenning, CUI Fuyun, et al. Sensory quality prediction of tobacco leaves based on RFECV-RF-Boosting[J]. Journal of Light Industry.
Sensory quality prediction of tobacco leaves based on RFECV-RF-Boosting
1. College of Tobacco Science, Henan Agricultural University, Zhengzhou 450002, China;
2. Technology Center, China Tobacco Hebei Industrial Co., Ltd., Shijiazhuang 050051, China;
3. Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou 450001, China
Corresponding author: FENG Wenning, fengwn@126.com
Received Date: 2025-07-09
Accepted Date: 2025-09-30
Available Online: 2026-05-09
Abstract
【Objective】 To address the subjectivity of tobacco leaf sensory quality evaluation and the difficulty of acquiring evaluation data, and to achieve precise quantitative prediction of tobacco leaf sensory quality based on digital analysis. 【Methods】 A total of 264 tobacco leaf samples from four typical producing regions in China (Henan, Hunan, Yunnan, and Guizhou) were selected for chemical composition analysis and sensory quality evaluation. After redundant variables were removed through correlation analysis of the chemical indicators, RFECV-RF (recursive feature elimination with cross-validation, using a random forest estimator) was used to select the optimal feature subset for each sensory attribute. Three classic boosting algorithms (XGBoost, CatBoost, and LightGBM) were applied, with hyperparameters optimized using five-fold cross-validation within the Optuna framework, to build prediction models for nine sensory attributes. 【Results】 1) Correlation analysis removed four chemical indices (total sugar, sugar-to-nicotine ratio, potassium-to-chlorine ratio, and palmitic acid) and retained 25 chemical composition indices, including reducing sugar and nicotine, for subsequent modeling. 2) RFECV-RF feature selection identified the optimal feature subset for each sensory attribute and further showed that total nitrogen, reducing sugar, potassium, and nicotine were the key chemical constituents affecting tobacco leaf sensory quality. For every attribute except “impact”, the cross-validated root mean square error (RMSE) of the selected-feature model was lower than that of the full-feature model, indicating that feature selection reduced model complexity and improved prediction accuracy. 3) Under the optimal algorithm for each sensory attribute, the coefficient of determination (R²) ranged from 0.7113 to 0.8940, the RMSE ranged from 0.0845 to 0.1404, and the mean absolute percentage error (MAPE) ranged from 1.06% to 1.70%, indicating good and stable predictive performance.
【Conclusion】 The prediction model framework enables high-precision quantification of tobacco leaf sensory quality. The research results provide a reference for the digital formulation design and quality control of cigarette products.
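The feature-selection step described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `RFECV` wrapped around a `RandomForestRegressor`, and it uses synthetic data from `make_regression` as a stand-in for the 264 samples and 25 retained chemical indices. The estimator settings, the scoring choice, and the `cv_rmse` helper are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in (assumption): 264 samples x 25 features, mimicking the
# dataset dimensions reported in the abstract. Not the real tobacco data.
X, y = make_regression(
    n_samples=264, n_features=25, n_informative=6, noise=10.0, random_state=0
)

# Five-fold cross-validation, as in the abstract; shuffling is an assumption.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Random forest used both to rank features and to score candidate subsets.
rf = RandomForestRegressor(n_estimators=50, random_state=0)

# Recursively drop the least important feature (step=1) and keep the subset
# with the best cross-validated score (negative RMSE, so higher is better).
selector = RFECV(
    estimator=rf,
    step=1,
    cv=cv,
    scoring="neg_root_mean_squared_error",
    min_features_to_select=1,
)
selector.fit(X, y)

# Boolean mask of retained features -> reduced design matrix.
X_selected = X[:, selector.support_]


def cv_rmse(features: np.ndarray) -> float:
    """Mean five-fold cross-validated RMSE of the random forest (hypothetical helper)."""
    scores = cross_val_score(
        rf, features, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    return float(-scores.mean())


# Compare the full-feature model with the selected-feature model.
rmse_full = cv_rmse(X)
rmse_selected = cv_rmse(X_selected)

print(f"features kept: {selector.n_features_} of {X.shape[1]}")
print(f"CV RMSE, all features:      {rmse_full:.3f}")
print(f"CV RMSE, selected features: {rmse_selected:.3f}")
```

In a study like this one, the selection would presumably be repeated once per sensory attribute, with each attribute's scores as `y`, and the resulting subset would then be passed to the boosting models for hyperparameter tuning.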