Prediction of tobacco leaf sensory quality based on RFECV-RF and boosting algorithms
Abstract: Objective: This study aimed to address the strong subjectivity and difficult data acquisition that affect sensory evaluation of tobacco leaves, and to achieve accurate quantitative prediction of tobacco leaf sensory quality from chemical composition data. Methods: A total of 264 tobacco leaf samples from four regions with typical style characteristics (Henan, Hunan, Yunnan, and Guizhou) were analyzed for chemical composition and evaluated for sensory quality. After redundant indicators were removed by correlation analysis among the chemical indicators, recursive feature elimination with cross-validation based on random forest (RFECV-RF) was used to select the optimal feature subset for each sensory attribute. Three classic gradient boosting algorithms, namely extreme gradient boosting (XGBoost), categorical boosting (CatBoost), and light gradient boosting machine (LightGBM), were then used to build prediction models for nine sensory attributes, with hyperparameters optimized by five-fold cross-validation. Results: 1) Correlation analysis removed four chemical indicators (total sugar, sugar-to-nicotine ratio, potassium-to-chlorine ratio, and palmitic acid), leaving 25 indicators, including reducing sugars and nicotine, for subsequent modeling. 2) RFECV-RF selected the best feature subset for each sensory attribute and identified total nitrogen, reducing sugars, potassium, and nicotine as the key chemical components influencing tobacco leaf sensory quality. For every attribute except "strength," the cross-validated root mean square error (RMSE) obtained with the optimal feature subset was lower than that of the full-feature model, so feature selection reduced model complexity while improving prediction accuracy for those attributes. 3) Under the optimal algorithm, the coefficient of determination (R²) for the sensory attributes ranged from 0.7113 to 0.8940, the RMSE from 0.0845 to 0.1404, and the mean absolute percentage error (MAPE) from 1.06% to 1.70%, indicating good and stable predictive performance. Conclusion: The proposed modeling framework enables high-precision quantitative prediction of tobacco leaf sensory quality. These results provide a reference for digital formulation design and quality control of cigarette products.

Keywords: tobacco leaf chemical composition; sensory quality; boosting algorithms; machine learning; feature selection
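The three evaluation metrics reported in the abstract can be written out explicitly. The sketch below uses the standard textbook definitions of RMSE, R², and MAPE in plain Python; the abstract does not state how the authors computed them, so the function names and the sample scores are illustrative assumptions and not values from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero y_true)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical sensory scores and model predictions (made-up values,
# used only to exercise the formulas; they are not data from the study).
y_true = [6.0, 6.5, 7.0, 7.5, 8.0]
y_pred = [6.1, 6.4, 7.0, 7.6, 7.9]

print(f"RMSE = {rmse(y_true, y_pred):.4f}")
print(f"R2   = {r2(y_true, y_pred):.4f}")
print(f"MAPE = {mape(y_true, y_pred):.2f}%")
```

RMSE is expressed in the same units as the sensory score, whereas MAPE is relative to the score itself, which is why the two can look very different in magnitude for the same model.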