A Unified Framework for Jointly Compressing Visual and Semantic Data

Shizhan Liu, Weiyao Lin, Yihang Chen, Yufeng Zhang, Wenrui Dai, John See, Hongkai Xiong

Research output: Contribution to journal › Article › peer-review



The rapid advancement of multimedia and imaging technologies has produced increasingly diverse visual and semantic data. A wide range of applications, such as remote-assisted driving, require the combined storage and transmission of various visual and semantic data. However, existing works insufficiently exploit the redundancy between different types of data. In this paper, we propose a unified framework to jointly compress a diverse spectrum of visual and semantic data, including images, point clouds, segmentation maps, object attributes, and relations. We develop a unifying process that embeds the representations of these data into a joint embedding graph according to their categories, which enables flexible handling of joint compression tasks for various visual and semantic data. To fully leverage the redundancy between different data types, we further introduce an embedding-based adaptive joint encoding process and a Semantic Adaptation Module that efficiently encode diverse data based on the learned embeddings in the joint embedding graph. Experiments on the Cityscapes, MSCOCO, and KITTI datasets demonstrate the superiority of our framework, highlighting promising steps toward scalable multimedia processing.
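To make the "joint embedding graph" idea concrete, the following is a minimal illustrative sketch, not the paper's actual implementation: each data item (image, point cloud, segmentation map, object attribute, or relation) becomes a typed node holding an embedding vector, and edges link items that describe the same scene so a downstream encoder can exploit cross-modal redundancy. All class and method names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Node:
    # Category tag, e.g. "image", "point_cloud", "segmentation",
    # "attribute", or "relation" (the data types listed in the abstract).
    category: str
    # Learned representation of the data item (toy low-dimensional vector).
    embedding: list

class JointEmbeddingGraph:
    """Hypothetical container: typed embedding nodes plus undirected edges
    between items of the same scene, over which a joint encoder could share
    information."""

    def __init__(self):
        self.nodes = []
        self.edges = []  # undirected edges stored as (i, j) index pairs

    def add(self, category, embedding):
        """Insert a node and return its index."""
        self.nodes.append(Node(category, embedding))
        return len(self.nodes) - 1

    def connect(self, i, j):
        """Link two items whose content is expected to be redundant."""
        self.edges.append((i, j))

    def neighbors(self, i):
        """Indices of all nodes connected to node i."""
        return [b if a == i else a for a, b in self.edges if i in (a, b)]

# Toy usage: an image and its segmentation map of the same scene.
g = JointEmbeddingGraph()
img = g.add("image", [0.2, 0.7])
seg = g.add("segmentation", [0.1, 0.6])
g.connect(img, seg)
print(g.neighbors(img))  # → [1]
```

In a real system the embeddings would come from learned per-modality encoders, and the graph structure would drive the adaptive joint encoding; this sketch only shows the data-organization idea.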
Original language: English
Journal: ACM Transactions on Multimedia Computing, Communications and Applications
Early online date: 28 Mar 2024
Publication status: E-pub ahead of print - 28 Mar 2024


  • Computer Networks and Communications
  • Hardware and Architecture


