Deep learning application in the oesophageal endoscopy
Editorial Commentary

Shigeru Kiryu1, Hiroyuki Akai2, Koichiro Yasaka2

1Department of Radiology, International University of Health, School of Medicine, Narita, Japan; 2Department of Radiology, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan

Correspondence to: Shigeru Kiryu, MD, PhD. Department of Radiology, International University of Health, School of medicine, 4-3, Kouzunomori, Narita, Chiba 286-8686, Japan. Email:

Provenance: This is an invited article commissioned by the Editorial Board member Dr. Xiao Li, MD (Department of Urology, Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research & Nanjing Medical University Affiliated Cancer Hospital, Nanjing, China).

Comment on: Nakagawa K, Ishihara R, Aoyama K, et al. Classification for invasion depth of esophageal squamous cell carcinoma using a deep neural network compared with experienced endoscopists. Gastrointest Endosc 2019;90:407-14.

Received: 30 September 2019; Accepted: 10 October 2019; Published: 13 December 2019.

doi: 10.21037/jmai.2019.10.01

This editorial is in response to the article on deep learning application in oesophageal endoscopy by Nakagawa et al. (1). The role of deep learning in classifying the invasion depth of oesophageal carcinoma and the future prospects of deep learning in endoscopy are described.

Artificial intelligence (AI) has attracted attention in many fields, not only medicine. Deep learning with a convolutional neural network (CNN) has been a primary strategy in recent advances in AI. Conventional machine learning requires advanced knowledge of specific imaging features, whereas CNN learns automatically from imaging features extracted via the convolutional process (2). Deep CNN has been developed to handle images for self-driving car technology, facial recognition, and other uses. In medicine, deep CNN has had a major impact on radiology and pathology, both of which deal with images, since the early stages of AI research (3,4). Deep CNN has also been used in gastrointestinal endoscopy, where studies have examined the detection of Helicobacter pylori gastritis (5) and gastric cancer (6) and the classification of polyps at colonoscopy (7,8).

Oesophageal cancer is the sixth leading cause of cancer-related death (9). Although the main treatment of oesophageal cancer is oesophagectomy, the procedure is invasive, and the physical burden on patients is enormous (10,11). Endoscopic resection is effective for early oesophageal cancer, but it is crucial to determine the indications precisely because deeper invasion of the oesophageal wall by the lesion increases the likelihood of metastasis (11). Oesophageal cancer invading from the epithelium to 200 µm into the submucosa (EP-SM1) has a low risk of metastasis, whereas deeper oesophageal cancer (SM2/3) has a higher risk (12,13). Therefore, endoscopic resection is indicated for lesions up to SM1, and EP-SM1 must be differentiated from SM2/3. However, endoscopy is an operator-dependent examination, and human diagnosis of oesophageal wall invasion is not considered sufficiently accurate.

To improve the diagnostic evaluation of oesophageal wall invasion using endoscopy, Nakagawa et al. applied a deep learning technique based on CNN (1). They acquired 14,338 endoscopic images (8,660 non-magnified and 5,678 magnified endoscopic images) from 804 superficial oesophageal squamous cell carcinomas with pathological proof of cancer invasion depth as a training dataset and 914 endoscopic images (405 non-magnified and 509 magnified endoscopic images) from 155 patients as a validation dataset.

The images were converted into Joint Photographic Experts Group (JPEG) format and resized to 300 × 300 pixels. The training dataset was fed to the deep learning process using the Single Shot MultiBox Detector CNN architecture to create a model to differentiate EP-SM1 from SM2/3. The trained deep CNN model was applied to the validation dataset and its diagnostic performance was assessed. The authors also compared the performance of their deep CNN model with that of 16 board-certified specialists. In this unique study, they found that the deep CNN model and experienced endoscopists had comparable diagnostic ability in the validation dataset, with sensitivities of 90.1% and 89.8%, specificities of 95.8% and 88.3%, and accuracies of 91.0% and 89.6%, respectively. The deep CNN model took 29 s to assess the validation dataset, whereas the endoscopists took an average of 115 min.
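
Sensitivity, specificity, and accuracy all follow from a 2 × 2 confusion matrix. The following is a minimal sketch of that arithmetic only: the counts are hypothetical and are not taken from Nakagawa et al., and which class is treated as "positive" is an assumption of the example.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of positive cases correctly identified
    specificity = tn / (tn + fp)  # fraction of negative cases correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all cases correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not the study's data).
sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=19, fp=1)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Because accuracy pools both classes, it is pulled toward the metric of the more frequent class, which is worth keeping in mind when the two invasion-depth groups are unevenly represented.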

In this retrospective study, a Single Shot MultiBox Detector was used to detect lesions in endoscopic images. Therefore, the lesion does not need to be cropped manually, which is an excellent way to avoid operator bias. The purpose of this study was to classify the wall invasion of oesophageal cancer; however, the detection of lesions during the examination is also a vital role for deep learning because endoscopy is performed within a limited time. Other researchers have assessed the ability of CNN to detect lesions in endoscopic images (14-16). It would be of interest to determine whether the present method can detect lesions in a similar dataset.

Endoscopists diagnose oesophageal cancer wall invasion based on specific endoscopic findings, such as protrusion, depression, and hardness (1). CNN makes the diagnosis by extracting image features. On the other hand, although some research has clarified which anatomical structures CNN uses for the diagnosis (17-19), it is difficult to define the imaging features that CNN focuses on, such as the pattern of texture and heterogeneity in the lesion. CNN has the potential to surpass human diagnosis and may make diagnoses using imaging findings that humans may not consider (20). Therefore, it is important to clarify what imaging features CNN focuses on during the diagnostic process because this may contribute to the development of medical knowledge.
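
One of the visualization approaches cited above, Grad-CAM (17), weights each feature map of a convolutional layer by the spatial average of the gradient of the class score and keeps only the positive part of the weighted sum. The sketch below illustrates only this final combination step in plain Python; it assumes that the feature maps and gradients have already been extracted from a trained network, and the function name and toy numbers are hypothetical.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a Grad-CAM heat map.

    activations[k][i][j]: value of feature map k at position (i, j).
    gradients[k][i][j]:   gradient of the class score with respect to that value.
    Both are assumed to have been extracted from the network beforehand.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weight = average of the gradients over all spatial positions.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]
    heat_map = []
    for i in range(height):
        row = []
        for j in range(width):
            value = sum(w * fmap[i][j] for w, fmap in zip(weights, activations))
            row.append(max(value, 0.0))  # ReLU keeps only positive evidence
        heat_map.append(row)
    return heat_map


# Toy example with two 2 x 2 feature maps and made-up numbers.
toy_activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heat_map = grad_cam(toy_activations, toy_gradients)
```

In practice the resulting map is upsampled and overlaid on the input image, which shows where the network looked but not which texture or heterogeneity pattern it used.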

In recent years, radiomics research has been developed to evaluate image features, such as texture and heterogeneity, which are difficult to represent with general indicators such as size and signal value. Radiomics studies using computed tomography (CT), positron emission tomography, and magnetic resonance imaging (MRI) have also been conducted in the gastrointestinal tract and have been used to predict the prognosis and therapeutic response of lesions (21). Radiomics research has the potential to reveal which image features CNN focuses on; for now, however, the radiomics features that CNN focuses on remain a black box.
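
As a simple illustration of why heterogeneity is not captured by a general indicator such as mean signal value, the sketch below computes three first-order features for two regions. It is an assumption-laden toy: the pixel values are made up, and real radiomics pipelines use many more features, including texture matrices, than the three shown here.

```python
import math
from collections import Counter


def first_order_features(pixels):
    """Return mean, variance and Shannon entropy (bits) of integer intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    counts = Counter(pixels)  # intensity histogram
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return mean, variance, entropy


# Two made-up regions with the same mean intensity but different heterogeneity.
homogeneous = [100] * 8
heterogeneous = [50, 150, 50, 150, 100, 100, 0, 200]

mean_a, var_a, ent_a = first_order_features(homogeneous)
mean_b, var_b, ent_b = first_order_features(heterogeneous)
```

Both regions have the same mean, so only the variance and entropy separate them.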

In this study, endoscopic images were converted to JPEG format and resized to 300 × 300 pixels; thus, images smaller than the originals were processed by the deep CNN. Because the deep CNN algorithm consists of several layers and handles a large number of parameters during the training phase, it is necessary to reduce the data size of the input image. With advances in hardware and programming of the algorithm, CNN may extract diagnostically more useful imaging features from higher-quality original images, and its diagnostic performance may improve.
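
The effect of resizing on input size can be made concrete with a short sketch. Two points are assumptions: the original capture resolution is not stated in the source, so 1,280 × 1,080 pixels is used as a hypothetical value, and nearest-neighbour sampling is chosen only for simplicity, since the interpolation method used in the study is not described.

```python
def resize_nearest(image, out_height, out_width):
    """Downsample a 2-D list of pixel values by nearest-neighbour sampling."""
    in_height = len(image)
    in_width = len(image[0])
    return [
        [
            image[row * in_height // out_height][col * in_width // out_width]
            for col in range(out_width)
        ]
        for row in range(out_height)
    ]


def input_values(height, width, channels=3):
    """Number of values the network receives for one colour image."""
    return height * width * channels


# Hypothetical original size; the source does not state the capture resolution.
original = input_values(1080, 1280)
resized = input_values(300, 300)
reduction = original / resized

# Tiny demonstration: a 4 x 4 grid reduced to 2 x 2.
grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
small = resize_nearest(grid, 2, 2)
```

Under these assumptions the network receives roughly one fifteenth of the original values per image, which indicates how much fine detail may be discarded before training.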

This study examined both magnified and non-magnified endoscopic images. The diagnosis of cancer invasion depth using non-magnified endoscopic images is based on subjective imaging findings, which cause interobserver variability. The diagnostic performance of the deep CNN with magnified endoscopic images was no better than that with non-magnified endoscopic images in this study, and the authors attributed this to the small training dataset. The size of the training dataset strongly correlates with the diagnostic performance of the AI model, and a larger dataset is desired. The use of publicly available datasets or collaborative collections of datasets at multiple facilities could increase the size of training datasets (22). Recently, a generative adversarial network (GAN) has been applied to enlarge the dataset (22). Given a training set, a GAN learns to generate new data and creates fake images that are difficult to distinguish from real images. Using a GAN may solve the problem of training dataset size.

In conclusion, Nakagawa et al. showed the usefulness of AI in the diagnosis of oesophageal wall invasion by cancer using endoscopic images. The problem of interobserver variability, which often occurs with endoscopic diagnosis, was not seen with AI, and an accurate diagnosis was possible. In clinical endoscopy, it is necessary to make a diagnosis quickly, unlike with modalities such as CT and MRI. By overcoming this limitation, AI can play a more important role in endoscopic diagnosis.




Conflicts of Interest: The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.


  1. Nakagawa K, Ishihara R, Aoyama K, et al. Classification for invasion depth of esophageal squamous cell carcinoma using a deep neural network compared with experienced endoscopists. Gastrointest Endosc 2019;90:407-14.
  2. Suzuki K. Overview of deep learning in medical imaging. Radiol Phys Technol 2017;10:257-73.
  3. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. J Pathol Inform 2016;7:29.
  4. Yasaka K, Akai H, Kunimatsu A, et al. Deep learning with convolutional neural network in radiology. Jpn J Radiol 2018;36:257-72.
  5. Shichijo S, Nomura S, Aoyama K, et al. Application of Convolutional Neural Networks in the Diagnosis of Helicobacter pylori Infection Based on Endoscopic Images. EBioMedicine 2017;25:106-11.
  6. Hirasawa T, Aoyama K, Tanimoto T, et al. Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer 2018;21:653-60.
  7. Komeda Y, Handa H, Watanabe T, et al. Computer-Aided Diagnosis Based on Convolutional Neural Network System for Colorectal Polyp Classification: Preliminary Experience. Oncology 2017;93:30-4.
  8. Byrne MF, Chapados N, Soudan F, et al. Real-time differentiation of adenomatous and hyperplastic diminutive colorectal polyps during analysis of unaltered videos of standard colonoscopy using a deep learning model. Gut 2019;68:94-100.
  9. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: Sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer 2015;136:E359-86.
  10. Birkmeyer JD, Siewers AE, Finlayson EV, et al. Hospital volume and surgical mortality in the United States. N Engl J Med 2002;346:1128-37.
  11. Chang AC, Ji H, Birkmeyer NJ, et al. Outcomes after transhiatal and transthoracic esophagectomy for cancer. Ann Thorac Surg 2008;85:424-9.
  12. Yamashina T, Ishihara R, Nagai K, et al. Long-term outcome and metastatic risk after endoscopic resection of superficial esophageal squamous cell carcinoma. Am J Gastroenterol 2013;108:544-51.
  13. Akutsu Y, Uesato M, Shuto K, et al. The overall prevalence of metastasis in T1 esophageal squamous cell carcinoma: a retrospective analysis of 295 patients. Ann Surg 2013;257:1032-8.
  14. Ghatwary N, Zolgharni M, Ye X. Early esophageal adenocarcinoma detection using deep learning methods. Int J Comput Assist Radiol Surg 2019;14:611-21.
  15. Ebigbo A, Mendel R, Probst A, et al. Computer-aided diagnosis using deep learning in the evaluation of early oesophageal adenocarcinoma. Gut 2019;68:1143-5.
  16. Cai SL, Li B, Tan WM, et al. Using a deep learning system in endoscopy for screening of early esophageal squamous cell carcinoma (with video). Gastrointest Endosc 2019. [Epub ahead of print].
  17. Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Accessed 2 Feb 2019. Available online:
  18. Samek W, Binder A, Montavon G, et al. Evaluating the Visualization of What a Deep Neural Network Has Learned. IEEE Trans Neural Netw Learn Syst 2017;28:2660-73.
  19. Philbrick KA, Yoshida K, Inoue D, et al. What Does Deep Learning See? Insights From a Classifier Trained to Predict Contrast Enhancement Phase From CT Images. AJR Am J Roentgenol 2018;211:1184-93.
  20. Kiryu S, Yasaka K, Akai H, et al. Deep learning to differentiate parkinsonian disorders separately using single midsagittal MR imaging: a proof of concept study. Eur Radiol 2019;29:6891-9.
  21. Sah BR, Owczarczyk K, Siddique M, et al. Radiomics in esophageal and gastric cancer. Abdom Radiol (NY) 2019;44:2048-58.
  22. Soffer S, Ben-Cohen A, Shimon O, et al. Convolutional Neural Networks for Radiologic Images: A Radiologist’s Guide. Radiology 2019;290:590-606.
Cite this article as: Kiryu S, Akai H, Yasaka K. Deep learning application in the oesophageal endoscopy. J Med Artif Intell 2019;2:22.