Real-World Application Test of the AI-Automatic Diagnostic System for Ki-67 in Breast Cancer

邓杨; 李凤玲; 秦航宇; 周燕燕; 周琪琪; 梅娟; 李丽; 刘洪红; 王一哲; 步宏; 包骥

doi:10.12182/20210460202

Abstract:
Objective To study different methods of artificial intelligence (AI)-assisted Ki-67 scoring for clinical invasive ductal carcinoma (IDC) of the breast and to compare their results.
Methods A total of 100 real-world clinical IDC cases were collected, including HE-stained slides, immunohistochemical Ki-67-stained slides, and the diagnostic results. The slides were scanned into whole slide images (WSIs), which were then scored with AI. Two AI scoring methods were used. The first was fully automatic AI counting, in which a Ki-67 automatic diagnostic scoring system counted the entire WSI. The second was semi-automatic AI counting, in which the counting areas were selected manually and an intelligent microscope then performed the count automatically. The pathologists' diagnostic results were taken as the results of purely manual counting. The Ki-67 scores obtained by manual counting (pathological diagnosis results), semi-automatic AI counting, and fully automatic AI counting were compared pairwise and classified into three levels of difference: ≤10%, >10% to <30%, and ≥30%. The intra-class correlation coefficient (ICC) was used to evaluate the correlation.
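The three-level classification described in the Methods can be sketched in a few lines of Python. This is only an illustration: it assumes the "difference" is the absolute gap between two Ki-67 scores expressed in percentage points, which the abstract does not state explicitly. The function names and the example scores are hypothetical and are not study data.

```python
from collections import Counter


def difference_tier(score_a, score_b):
    """Classify the absolute gap between two Ki-67 scores (both in %).

    Assumption: the gap is measured in percentage points.
    """
    diff = abs(score_a - score_b)
    if diff <= 10:
        return "<=10%"
    if diff < 30:
        return ">10% to <30%"
    return ">=30%"


def tally_tiers(scores_a, scores_b):
    """Count how many paired cases fall into each difference tier."""
    return Counter(difference_tier(a, b) for a, b in zip(scores_a, scores_b))


# Hypothetical paired scores for four cases (not taken from the study).
manual_scores = [10, 30, 60, 20]
ai_scores = [15, 45, 25, 20]
tiers = tally_tiers(manual_scores, ai_scores)
print(tiers)
```

Running the same tally for each of the three method pairs would give the per-tier case counts reported in the Results.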
Results Fully automatic AI counting of Ki-67 took 5–8 min per case, semi-automatic AI counting took 2–3 min per case, and manual counting took 1–3 min per case. When the two AI counting methods were compared, the Ki-67 scores differed by no more than 10% in all cases (100% of the total), and the ICC was 0.992. Between manual counting and semi-automatic AI counting, the difference was ≤10% in 60 cases (60%), >10% to <30% in 37 cases (37%), and ≥30% in only 3 cases (3%), with an ICC of 0.724. Between manual counting and fully automatic AI counting, the difference was ≤10% in 78 cases (78%), >10% to <30% in 17 cases (17%), and ≥30% in 5 cases (5%), with an ICC of 0.720. The ICC values showed little difference between the two AI counting methods, indicating good repeatability, whereas the repeatability between AI counting and manual counting was acceptable but not ideal.
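The abstract does not say which form of the ICC was used. As a rough sketch, the following Python computes one common variant, ICC(2,1) (two-way random effects, absolute agreement, single measurement), from the two-way ANOVA mean squares. The choice of variant, the function name `icc_2_1`, and the example scores are assumptions made for illustration only.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per case and one column per method.

    Assumption: two-way random effects, absolute agreement, single measurement.
    """
    n = len(ratings)      # number of cases
    k = len(ratings[0])   # number of scoring methods
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for cases (rows), methods (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical Ki-67 scores (%): each row is a case, columns are two methods.
example = [[10, 15], [30, 45], [60, 25], [20, 20], [80, 75]]
print(round(icc_2_1(example), 3))
```

Under this variant, a systematic offset between two methods lowers the ICC even when their scores rise and fall together. A consistency-type ICC would not penalize such an offset, so the variant should be stated whenever ICC values are compared.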
Conclusion Fully automatic AI counting has the advantage of requiring less manpower, since the pathologist only needs to verify the diagnostic results at the end. The semi-automatic method, however, is better suited to the diagnostic habits of clinical pathologists and takes less time overall than the fully automatic method. Furthermore, despite its high repeatability, AI counting cannot fully replace pathologists and should instead be regarded as a powerful auxiliary tool.
