Objective To evaluate the value of different machine learning models for predicting 5-year survival in patients with rectal squamous cell carcinoma (rSCC).
Methods Data on patients diagnosed with rSCC between 2004 and 2015 were extracted from the SEER database using SEER*Stat software. Patients were randomly divided into a training set and a validation set in a 7:3 ratio. Models were constructed on the training set using the extreme gradient boosting (XGBoost), random forest, support vector machine, and k-nearest neighbor algorithms. The predictive ability of the models was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curves. For the best-performing model, the SHAP (SHapley Additive exPlanations) algorithm was used to quantify the contribution of each variable.
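The random 7:3 split and the AUC metric described in the Methods can be illustrated with a minimal standard-library sketch. It is not the study's code: the function names `split_7_3` and `auc`, the seed, and the choice to round the training size up (so that 833 records split into 584 and 249) are assumptions made for illustration.

```python
import math
import random

def split_7_3(records, seed=0):
    """Randomly split records into training (about 70%) and validation sets.

    Assumption: the training size is rounded up; the study does not state
    how its split was rounded or whether it was stratified.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = math.ceil(len(shuffled) * 7 / 10)
    return shuffled[:cut], shuffled[cut:]

def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Placeholder patient identifiers, not real SEER records.
train, valid = split_7_3(range(833))
print(len(train), len(valid))
```

The pairwise AUC above is quadratic in the number of cases, which is acceptable for a validation set of a few hundred patients; a rank-based formulation would be preferable for larger cohorts.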
Results A total of 833 patients with rSCC were included, 584 in the training set and 249 in the validation set. The machine learning models were constructed from 10 variables: age, sex, race, marital status, T stage, N stage, M stage, surgery, chemotherapy, and radiotherapy. In the validation set, the XGBoost model performed best [AUC = 0.758, 95% CI (0.696, 0.820)], demonstrating moderate predictive ability and good calibration. The decision curves indicated high clinical value. SHAP analysis showed that T stage contributed most to the XGBoost model's predictions, while radiotherapy contributed least.
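To make the SHAP ranking more concrete, the sketch below computes exact Shapley values by brute force for a toy scoring function. Everything in it is hypothetical: `toy_model`, its weights, and the three binary features are invented for illustration and are not the study's XGBoost model or data. The SHAP library uses efficient algorithms for tree models rather than enumerating orderings, and global importance is usually summarized across many patients rather than one.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: the average marginal contribution of each
    feature over all orderings in which features are switched from
    their baseline value to the patient's value."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        prev = model(current)
        for name in order:
            current[name] = x[name]
            new = model(current)
            totals[name] += new - prev
            prev = new
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(f):
    """Hypothetical risk score with invented weights (illustration only)."""
    return (0.4 * f["t_stage"] + 0.2 * f["n_stage"]
            + 0.05 * f["radiotherapy"]
            + 0.1 * f["t_stage"] * f["n_stage"])

# Hypothetical patient and reference point, both with binary-coded features.
patient = {"t_stage": 1, "n_stage": 1, "radiotherapy": 1}
baseline = {"t_stage": 0, "n_stage": 0, "radiotherapy": 0}

phi = shapley_values(toy_model, patient, baseline)
ranking = sorted(phi, key=lambda k: abs(phi[k]), reverse=True)
print(ranking)
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that lets SHAP values be read as an additive breakdown of a single prediction.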
Conclusion This study constructed an interpretable XGBoost model that can assist physicians in assessing the 5-year survival and treatment efficacy of rSCC patients and in developing personalized treatment plans.
22.Yu W, Lu Y, Shou H, et al. A 5-year survival status prognosis of nonmetastatic cervical cancer patients through machine learning algorithms[J]. Cancer Med, 2023, 12(6): 6867-6876. DOI: 10.1002/cam4.5477.
23.Jiang J, Pan H, Li M, et al. Predictive model for the 5-year survival status of osteosarcoma patients based on the SEER database and XGBoost algorithm[J]. Sci Rep, 2021, 11(1): 5542. DOI: 10.1038/s41598-021-85223-4.
24.孟祥勇, 秦嘉怡, 陈文生. 基于机器学习的早期胃癌淋巴结转移预测模型构建与验证[J]. 陆军军医大学学报, 2024, 46(21): 2432-2442. [Meng XY, Qin JY, Chen WS. Construction and validation of a prediction model for lymph node metastasis in early gastric cancer based on machine learning[J]. Journal of Army Medical University, 2024, 46(21): 2432-2442.] DOI: 10.16016/j.2097-0927.202403126.
25.田园, 林志浩, 李瑞, 等. 基于组合优化的机器学习模型预测胃癌术后感染性并发症的诊断性研究[J]. 中国循证医学杂志, 2024, 24(9): 993-1003. [Tian Y, Lin ZH, Li R, et al. Diagnostic study of machine learning model based on combinatorial optimization to predict postoperative infectious complications of gastric cancer[J]. Chinese Journal of Evidence-Based Medicine, 2024, 24(9): 993-1003.] DOI: 10.7507/1672-2531.202310069.