What Modality Matters? Exploiting Highly Relevant Features for Video Advertisement Insertion

Onn Keat Chong, Hui-Ngo Goh, John See

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Video advertising is a thriving industry that has recently turned its attention to the use of intelligent algorithms for automating tasks. In advertisement insertion, the integration of contextual relevance is essential in influencing the viewer’s experience. Despite the wide spectrum of audio-visual semantic modalities available, there is a lack of research that systematically analyzes their individual and complementary strengths. In this paper, we propose an ad-insertion framework that maximizes the contextual relevance between advertisement and content video by employing high-level multi-modal semantic features. Prediction vectors are derived via clip-level and image-level extractors and are then matched to yield relevance scores. We also establish a new user study methodology that produces gold standard annotations based on multiple expert selections. Through comprehensive human-centered approaches and analysis, we demonstrate that automatic ad-insertion can be improved by exploiting effective combinations of semantic modalities.
Original language: English
Title of host publication: 2023 IEEE International Conference on Image Processing (ICIP)
Number of pages: 5
ISBN (Electronic): 9781728198354
Publication status: Published - 11 Sept 2023

Keywords


  • advertisement insertion
  • feature extraction
  • human-centered computing
  • video advertising

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition

