Towards Real-Time Detection of Squamous PreCancers from Oesophageal Endoscopic Videos

Xiaohong Gao, Barbara Braden, Stephen Taylor, Wei Pang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This study investigates the feasibility of applying state-of-the-art deep learning techniques to detect precancerous stages of squamous cell carcinoma (SCC) in real time, addressing two challenges: diagnosing SCC from subtle appearance changes, and video processing speed. Two deep learning models are implemented: one determines whether a video frame contains artefacts, and the other detects, segments and classifies the artefact-free frames. For detection of SCC, both Mask R-CNN and YOLOv3 architectures are implemented. In addition, to ensure that one bounding box is detected for each region of interest rather than multiple duplicated boxes, a faster non-maximum suppression (NMS) technique is applied on top of the predictions. As a result, the developed system can process videos at 16-20 frames per second. Three classes of SCC are distinguished: ‘suspicious’, ‘high grade’ and ‘cancer’. For videos with a resolution of 1920x1080 pixels, the average processing time when applying YOLOv3 is in the range of 0.064-0.101 seconds per frame, i.e. 10-15 frames per second, running under the Windows 10 operating system with one GPU (GeForce GTX 1060). The average accuracies for classification and detection are 85% and 74% respectively. Since YOLOv3 only provides bounding boxes, Mask R-CNN is also evaluated in order to delineate lesioned regions. It achieves a better detection result, with 77% accuracy, while its classification accuracy of 84% is similar to that of YOLOv3. However, its processing speed is more than 10 times slower, averaging 1.2 seconds per frame, owing to the creation of masks. The segmentation accuracy of Mask R-CNN is 63%. These results are based on data sets of 350 images. Further improvement is therefore needed in future work by collecting, annotating or augmenting more data.
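The abstract mentions two mechanical steps that are easy to illustrate: suppressing duplicated bounding boxes and converting per-frame processing time into frames per second. The following is a minimal sketch of the standard greedy NMS procedure, not the "faster" variant used by the authors, whose details are not given here. The box coordinates, confidence scores, the 0.5 IoU threshold and the function names `iou`, `nms` and `frames_per_second` are all illustrative assumptions.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap
    an already kept box by more than iou_threshold (assumed value)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def frames_per_second(seconds_per_frame: float) -> float:
    """Throughput implied by an average per-frame processing time."""
    return 1.0 / seconds_per_frame


if __name__ == "__main__":
    # Hypothetical detections: two near-duplicates and one separate region.
    boxes = [(100, 100, 200, 200), (105, 105, 205, 205), (400, 400, 500, 500)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
    # Per-frame times quoted in the abstract for YOLOv3.
    print(frames_per_second(0.064), frames_per_second(0.101))
```

Applied to the per-frame times quoted above, `frames_per_second` gives roughly 15.6 and 9.9, which matches the stated range of 10-15 frames per second.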
Original language: English
Title of host publication: 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)
Publication status: Accepted/In press - 7 Oct 2019


  • oesophagus endoscopy
  • pre-cancer detection
  • deep learning
  • segmentation
  • real-time video processing

