Chinese Journal of Pharmacovigilance, 2025, Vol. 22, Issue 12: 1410-1417.
DOI: 10.19803/j.1672-8629.20250512

• Safety and Rational Drug Use •

Prediction of Severe Adverse Drug Reactions Based on Spontaneous Reporting Data-Driven Machine Learning Models

LIU Xi1, LI Chen1,2, TIAN Yuan1, CHEN Mengli1,2*

  1. Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing 100853, China;
  2. PLA ADR Monitoring Center, Beijing 100853, China
  • Received: 2025-08-01 Published: 2025-12-19
  • Corresponding author: *CHEN Mengli, female, PhD, research fellow, specializing in hospital pharmacy and rational drug use. E-mail: hellolily301cn@126.com
  • About the author: LIU Xi, male, master's student, specializing in clinical pharmacy and machine learning modeling.
  • Funding:
    Research on a machine learning-based risk model for early warning and monitoring of renal function impairment in elderly patients with polypharmacy (24BJZ37)

Abstract: Objective To construct an intelligent model for predicting severe adverse drug reactions (ADRs) from spontaneous ADR reporting data, in order to improve the efficiency of pharmacovigilance, support earlier identification of high-risk ADRs, and optimize the allocation of healthcare resources. Methods In a retrospective analysis, 4 144 spontaneous ADR reports were included. The DeepSeek large language model (LLM) was used to standardize diagnostic information and drug names. Twenty-four clinical features were selected, and feature engineering was applied to age, time variables, and unordered categorical data. Ten machine learning algorithms, including Bernoulli Naive Bayes (BNB) and Random Forest (RF), were systematically compared, and the sampling techniques SMOTE, ADASYN, and TomekLinks were evaluated for handling class imbalance (severe ADRs accounted for 10.4% of reports). The area under the precision-recall curve (AUPRC) was the primary performance metric, with the area under the curve (AUC) and sensitivity (true positive rate, TPR) as secondary metrics. Feature contributions were interpreted with SHAP values. Results BNB combined with TomekLinks performed best, achieving AUC=0.921, AUPRC=0.757, and TPR=0.626 on the internal validation set, and AUPRC=0.711 and AUC=0.901 in external validation, which suggests good generalization. SHAP analysis identified "resulting in hospitalization or prolonged hospitalization", "age", and "immune abnormalities and infections" as notable positive contributors to a severe-ADR prediction, whereas "no obvious impact on the pre-existing disease", "gastrointestinal, skin and skin appendage damage", and "outcome: recovered/improved" were negative contributors. Conclusion The DeepSeek LLM was applied to the standardization of structured ADR data. For high-dimensional sparse features, the undersampling technique TomekLinks outperformed the oversampling methods. BNB, a classic classifier based on Bayes' theorem, still performed well against the other algorithms evaluated. Limitations include potential bias from single-center data and the small number of severe ADR cases. Future work should integrate multi-center data, drug molecular features, and natural language processing techniques to build a more accurate early warning system.

Key words: Adverse Drug Reaction, Severity, Prediction Model, Machine Learning, Bernoulli Naive Bayes, DeepSeek
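
The pipeline named in the abstract (Tomek-link undersampling, a Bernoulli Naive Bayes classifier, and AUPRC as the headline metric) can be illustrated with a small from-scratch sketch. This is not the authors' code: the synthetic data, the feature layout, the Hamming distance, the tie-breaking rule, and all function and class names (tomek_link_undersample, TinyBernoulliNB, average_precision) are assumptions made for illustration. In practice one would more likely use scikit-learn's BernoulliNB and imbalanced-learn's TomekLinks.

```python
import math
import random


def hamming(a, b):
    """Number of positions at which two binary feature vectors differ."""
    return sum(u != v for u, v in zip(a, b))


def tomek_link_undersample(X, y, majority=0):
    """Remove majority-class samples that take part in a Tomek link.

    A Tomek link is a pair of mutual nearest neighbours with different labels.
    Distance ties are broken by the lowest index (a simplification).
    """
    n = len(X)
    nearest = [
        min((j for j in range(n) if j != i), key=lambda j: (hamming(X[i], X[j]), j))
        for i in range(n)
    ]
    drop = {
        i for i, j in enumerate(nearest)
        if nearest[j] == i and y[i] != y[j] and y[i] == majority
    }
    X_kept = [x for i, x in enumerate(X) if i not in drop]
    y_kept = [label for i, label in enumerate(y) if i not in drop]
    return X_kept, y_kept


class TinyBernoulliNB:
    """Bernoulli Naive Bayes for 0/1 features with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_prior, self.theta = {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Smoothed P(feature = 1 | class), one value per feature column.
            self.theta[c] = [
                (sum(col) + self.alpha) / (len(rows) + 2 * self.alpha)
                for col in zip(*rows)
            ]
        return self

    def prob_positive(self, x):
        """Posterior probability of class 1 (here: 'severe ADR')."""
        log_joint = {
            c: self.log_prior[c]
            + sum(math.log(p if v else 1.0 - p) for v, p in zip(x, self.theta[c]))
            for c in self.classes
        }
        top = max(log_joint.values())  # subtract the max for numerical stability
        total = sum(math.exp(v - top) for v in log_joint.values())
        return math.exp(log_joint[1] - top) / total


def average_precision(y_true, scores):
    """Step-wise estimate of AUPRC: mean precision at each positive's rank.

    Tied scores are not grouped, so results can depend on the order of ties.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / sum(y_true)


def demo(seed=0):
    rng = random.Random(seed)
    y = [1 if i % 10 == 0 else 0 for i in range(200)]  # 10% positives
    # Hypothetical binary features: feature 0 is more common in positives,
    # the remaining five are pure noise.
    X = [
        [int(rng.random() < (0.8 if label else 0.2))]
        + [rng.randint(0, 1) for _ in range(5)]
        for label in y
    ]
    X_res, y_res = tomek_link_undersample(X, y)
    model = TinyBernoulliNB().fit(X_res, y_res)
    scores = [model.prob_positive(x) for x in X]
    print("samples kept:", len(X_res), "of", len(X))
    print("AUPRC on training data (optimistic):", round(average_precision(y, scores), 3))


if __name__ == "__main__":
    demo()
```

The demo scores the same records it was trained on, so its AUPRC is optimistic; a held-out split, as in the internal and external validation described in the abstract, would be needed for a meaningful estimate. Note also that with a positive rate near 10%, a random ranking would give an AUPRC of roughly 0.1, which is the baseline against which values such as 0.757 should be read.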

CLC number: