Application Test of the AI-Automatic Diagnostic System for Ki-67 in Breast Cancer
Abstract:
Objective To compare different methods of artificial intelligence (AI)-assisted Ki-67 scoring in clinical cases of invasive ductal carcinoma (IDC) of the breast.
Methods A total of 100 clinically diagnosed IDC cases were collected, including HE-stained slides, immunohistochemical Ki-67-stained slides, and the diagnostic results. The slides were scanned into whole slide images (WSIs), which were then scored with AI by two methods. The first was fully automatic AI counting, in which a Ki-67 automatic diagnosis scoring system counted across the whole WSI. The second was semi-automatic AI counting, in which the areas to be counted were selected manually and an intelligent microscope then performed the counting automatically. The pathologists' diagnostic results were taken as the results of purely manual counting. The Ki-67 scores obtained by manual counting, semi-automatic AI counting, and fully automatic AI counting were compared pairwise, and each pair was classified into one of three levels of difference: ≤10%, >10% to <30%, and ≥30%. The intra-class correlation coefficient (ICC) was used to evaluate the correlation.
Results Fully automatic AI counting of Ki-67 took 5 to 8 minutes per case, semi-automatic AI counting took 2 to 3 minutes per case, and manual counting took 1 to 3 minutes per case. Between the two AI counting methods, the Ki-67 scores of all cases differed by no more than 10% (100% of cases), with an ICC of 0.992. Between manual counting and semi-automatic AI counting, the difference was ≤10% in 60 cases (60%), >10% to <30% in 37 cases (37%), and ≥30% in only 3 cases (3%), with an ICC of 0.724. Between manual counting and fully automatic AI counting, the difference was ≤10% in 78 cases (78%), >10% to <30% in 17 cases (17%), and ≥30% in 5 cases (5%), with an ICC of 0.720. The ICC values showed little difference between the two AI counting methods, indicating good repeatability, whereas the repeatability between AI counting and manual counting was acceptable but not ideal.
Conclusion Fully automatic AI counting has the advantage of requiring less manpower, as the pathologist only needs to verify the diagnostic results at the end. The semi-automatic method, however, is better suited to the diagnostic habits of clinical pathologists and takes less time overall than the fully automatic method. Furthermore, although AI counting is highly repeatable, it cannot fully replace pathologists and should instead be regarded as a powerful auxiliary tool.
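The three-level classification of pairwise differences described in the Methods can be illustrated with a minimal Python sketch. It assumes that the "difference" is the absolute difference between two Ki-67 scores in percentage points; the function names and band labels are hypothetical and are not taken from the study's software.

```python
from collections import Counter

# Band labels follow the three levels named in the abstract.
BANDS = ("<=10%", ">10% to <30%", ">=30%")


def difference_band(score_a, score_b):
    """Classify the absolute difference between two Ki-67 scores.

    Scores are percentages (0-100). Treating the difference as
    absolute percentage points is an assumption of this sketch.
    """
    diff = abs(score_a - score_b)
    if diff <= 10:
        return BANDS[0]
    if diff < 30:
        return BANDS[1]
    return BANDS[2]


def tally_bands(pairs):
    """Count how many (score_a, score_b) pairs fall into each band."""
    counts = Counter(difference_band(a, b) for a, b in pairs)
    return {band: counts.get(band, 0) for band in BANDS}
```

Applied to the 100 paired scores of any two counting methods, `tally_bands` would yield the kind of per-band case counts reported in the Results.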
Figure 1. The workflow for the Ki-67 automatic diagnosis system
The system of Ki-67 automatic diagnosis developed by our team was used for automatic counting of the WSI. A: Automatic identification of IDC area; B: Registration of HE and Ki-67 WSI; C: Ki-67 automatic counting in IDC area. IDC: Invasive ductal carcinoma of the breast, WSI: Whole slide image.
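As a rough illustration of the final counting step (panel C), a Ki-67 score is conventionally expressed as the percentage of positively stained tumor nuclei among all tumor nuclei counted. The Python sketch below assumes this conventional definition; the exact rule used by the authors' system is not described here, and the function and parameter names are hypothetical.

```python
def ki67_index(positive_nuclei, negative_nuclei):
    """Return the Ki-67 score as a percentage of positive tumor nuclei.

    positive_nuclei: number of Ki-67-positive tumor nuclei counted
    negative_nuclei: number of Ki-67-negative tumor nuclei counted
    """
    total = positive_nuclei + negative_nuclei
    if total == 0:
        # No nuclei were counted, so the score is undefined.
        raise ValueError("no tumor nuclei counted")
    return 100.0 * positive_nuclei / total
```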
Table 1. Consistency evaluation of the three counting methods
| Index | Semi-automatic AI vs. manual counting (n=100) | Automatic AI vs. manual | Semi-automatic AI vs. automatic AI (n=100) |
|---|---|---|---|
| Difference between groups | | | |
| Values differ by ≤10% (cases) | 60 | 78 | 100 |
| Values differ by >10% to <30% (cases) | 37 | 17 | 0 |
| Values differ by ≥30% (cases) | 3 | 5 | 0 |
| ICC | 0.724 | 0.720 | 0.992 |

The intra-class correlation coefficient (ICC) can be used to evaluate the repeatability and consistency of different measurement methods or evaluators on the same quantitative measurements. Its value ranges from 0 to 1, with ICC<0.4 indicating poor repeatability and ICC>0.75 indicating good repeatability.
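For readers who want to reproduce this kind of agreement analysis, the following Python sketch computes one common form of the ICC and applies the thresholds given in the table note. The ICC form used in the study is not stated, so the choice of the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption; the function names and the "intermediate" label for values between the two thresholds are likewise hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per case.

    Each row holds one Ki-67 score per counting method.
    Requires at least two cases and two methods.
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of counting methods
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-case mean square
    msc = ss_cols / (k - 1)               # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def repeatability_label(icc):
    """Apply the thresholds from the table note (<0.4 poor, >0.75 good)."""
    if icc < 0.4:
        return "poor"
    if icc > 0.75:
        return "good"
    return "intermediate"
```

Under these thresholds, the reported values of 0.724 and 0.720 fall in the intermediate range, while 0.992 is labeled good.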