With the development of medical big data, real-world studies (RWS) have received increasing attention in recent years and show promising prospects. However, challenges in implementing RWS remain and have prompted extensive discussion among scholars. The most urgent issue at present is the unstructured nature of real-world data (RWD). This study used a rule-based information extraction method built on regular expressions to extract structured information from the admission records, pathology reports, surgical records, and imaging records of bladder cancer patients treated at Zhongnan Hospital of Wuhan University in recent years. Extraction performance was evaluated using accuracy and recall as indicators, with the aim of providing a reference for subsequent research.
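To make the approach concrete, the following is a minimal Python sketch of rule-based extraction with regular expressions followed by an accuracy and recall check. It is an illustration only: the field names, the patterns, the English sample reports, and the helper functions `extract` and `evaluate` are all hypothetical, and the study's actual rules (written for Chinese clinical text) are not given in the abstract. The metric definitions used here are also an assumption: accuracy is taken as the share of all fields whose extracted value equals the manually annotated value, and recall as the share of annotated non-empty fields that were extracted correctly.

```python
import re

# Hypothetical rules for three fields of a pathology report.
# These patterns are illustrative and are not the rules used in the study.
PATTERNS = {
    "tumor_size_cm": re.compile(r"tumor size[:\s]*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE),
    "grade": re.compile(r"\b(high|low)[-\s]grade\b"),
    "t_stage": re.compile(r"\bp?(T(?:is|a|[0-4][ab]?))\b"),
}


def extract(text):
    """Apply every rule to the text; keep the first match or None per field."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


def evaluate(predictions, gold):
    """Compare extracted records with manual annotations.

    Assumed definitions (not taken from the paper):
      accuracy = fields where extracted == annotated / all fields
      recall   = correctly extracted non-empty fields / annotated non-empty fields
    """
    total = correct = present = recalled = 0
    for pred, ref in zip(predictions, gold):
        for field, ref_value in ref.items():
            hit = pred.get(field) == ref_value
            total += 1
            correct += hit
            if ref_value is not None:
                present += 1
                recalled += hit
    accuracy = correct / total if total else 0.0
    recall = recalled / present if present else 0.0
    return accuracy, recall


if __name__ == "__main__":
    # Made-up sample reports and annotations, for demonstration only.
    reports = [
        "Pathology: high-grade urothelial carcinoma, pT2a. Tumor size: 2.5 cm.",
        "Papillary urothelial carcinoma, low-grade, stage Ta.",
    ]
    annotations = [
        {"tumor_size_cm": "2.5", "grade": "high", "t_stage": "T2a"},
        {"tumor_size_cm": None, "grade": "low", "t_stage": "T1"},
    ]
    extracted = [extract(r) for r in reports]
    print(extracted)
    print(evaluate(extracted, annotations))
```

Keeping one compiled pattern per field makes each rule easy to audit and revise when evaluation against annotated records reveals a miss, which is the usual appeal of rule-based extraction for semi-structured clinical text.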
Vol. 34, 2024, No. 3
Research on real-world knowledge mining and knowledge graph completion (III): structured information extraction from real-world data of bladder cancer based on regular expressions
Published on Mar. 29, 2024