Chinese Journal of Pharmacovigilance ›› 2025, Vol. 22 ›› Issue (9): 1040-1044.
DOI: 10.19803/j.1672-8629.20250291


An Automatic Generation Platform for Adverse Drug Reaction Reports from Literature Sources Based on the Generative Artificial Intelligence Large Language Model

LIU Qin1, ZHANG Jiayi1, WU Xuan1,*, SOANES Nigel2   

  1. AstraZeneca Investment (China) Ltd., Shanghai 201203, China;
  2. ASTRAZENECA UK LIMITED, Cambridge CB2 0AA, United Kingdom
  • Received: 2025-05-09   Published: 2025-09-22

Abstract: Objective To explore how generative artificial intelligence can automate the generation of adverse drug reaction (ADR) reports from medical literature, and thereby improve the efficiency and accuracy of the report processing workflow. Methods A mature foundation large language model from the industry was combined with table parsing, optical character recognition (OCR) and related technologies, and was trained for the specific scenario of ADR report generation on an annotated training set of 6,925 items drawn from about 700 published Chinese-language articles downloaded from CNKI and the WanFang Database. The resulting model could interpret academic publications and automatically generate individual case safety reports (ICSRs). The platform was optimized through multiple rounds of algorithm iteration, assisted by manual review. The results were evaluated using three algorithm performance indicators (recall, precision and F1 score) and the report processing time. Results After four rounds of algorithm iteration, the recall, precision and F1 score of the generative AI model for ADR reports reached 97.1%, 90.1% and 93.5%, respectively, all meeting the project's acceptance criteria. The processing time per ICSR fell from 80 minutes with the conventional manual method to 45 minutes, a 43.8% reduction in time that corresponds to a 77.8% increase in processing efficiency. Conclusion Generative artificial intelligence offers new tools for the intelligent identification and efficient generation of literature-sourced ADR reports and can play a significant role in driving the automation and intelligent transformation of pharmacovigilance. However, the platform remains limited in handling complex formatting and abnormal data. Continuous optimization of datasets and algorithms is required, together with strict adherence to ethical and regulatory requirements to ensure data compliance.
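The figures reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' evaluation code. It assumes the standard definition of F1 as the harmonic mean of precision and recall, and it assumes that the 77.8% figure refers to the gain in reports processed per unit time (80/45 − 1) rather than to the time saved. The function names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def time_reduction(before_min: float, after_min: float) -> float:
    """Fraction of processing time saved per report."""
    return (before_min - after_min) / before_min


def throughput_gain(before_min: float, after_min: float) -> float:
    """Relative increase in reports processed per unit time (assumed reading of 77.8%)."""
    return before_min / after_min - 1


# Values reported in the abstract after four rounds of iteration.
recall, precision = 0.971, 0.901
f1 = f1_score(precision, recall)

print(f"F1 score:        {f1:.1%}")
print(f"Time reduction:  {time_reduction(80, 45):.2%}")
print(f"Throughput gain: {throughput_gain(80, 45):.1%}")
```

Under these assumptions the script reproduces the reported F1 score of 93.5% and the 77.8% efficiency gain, and shows that the time saved per report is 43.75%.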

Key words: Adverse Drug Reactions, Individual Case Safety Reports, Generative Artificial Intelligence, Large Language Models, Machine Learning
