Abstract
It is challenging to train a generalizable deep learning classifier with limited training images. Existing few-shot learning approaches try to improve classification performance largely by transferring prior knowledge from upstream large-sample tasks to the current small-sample task. Besides upstream image datasets, prior knowledge may also be obtained from signals of other modalities. In this study, we propose a novel learning framework that utilizes prior knowledge from audio signals to help train an image classifier. In the framework, a pre-trained and fixed audio encoder transforms the audio signal of each class label into a class-specific audio prototype. By attracting image representations to the corresponding audio prototypes during training of the image classifier, within-class image representations become more tightly clustered, while representations of images from different classes move farther apart. To the best of our knowledge, this is the first work that utilizes audio-based prior knowledge to help train an image classifier with limited training images. The proposed framework is compatible with existing learning approaches and can be flexibly combined with them. Extensive empirical evaluations on both natural and medical image datasets demonstrate that the proposed learning framework significantly outperforms existing methods in image classification with limited training images, thus establishing a new state of the art. The source code will be released publicly.
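
The attraction of image representations to fixed audio prototypes can be illustrated with a minimal sketch. The abstract does not specify the loss, so the form below (cross-entropy over temperature-scaled cosine similarities) is an assumption, as are the function names, the `temperature` parameter, and the toy two-dimensional prototypes; it is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prototype_attraction_loss(image_embedding, audio_prototypes, label, temperature=0.1):
    """Assumed loss form: cross-entropy over similarities to fixed class prototypes.

    The loss is small when the image embedding is close to the prototype of its
    own class and far from the prototypes of the other classes.
    """
    logits = [cosine(image_embedding, p) / temperature for p in audio_prototypes]
    # Numerically stable log-sum-exp over the class logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Hypothetical prototypes, standing in for the outputs of a frozen audio encoder
# applied to the audio of each class label.
audio_prototypes = [[1.0, 0.0], [0.0, 1.0]]

# An image embedding near its own class prototype incurs a lower loss than one
# that sits near the other class's prototype.
near = prototype_attraction_loss([0.9, 0.1], audio_prototypes, label=0)
far = prototype_attraction_loss([0.1, 0.9], audio_prototypes, label=0)
```

Under this assumed loss, minimizing it with respect to the image encoder alone (the prototypes stay fixed) pulls same-class image embeddings toward a shared target and pushes them away from the other classes' targets, which matches the clustering behavior the abstract describes.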
Original language | English |
---|---|
Title of host publication | 49th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Publisher | IEEE |
Pages | 4975-4979 |
Number of pages | 5 |
ISBN (Electronic) | 9798350344851 |
Publication status | Published - 18 Mar 2024 |
Event | 49th IEEE International Conference on Acoustics, Speech, and Signal Processing 2024, COEX, Seoul, Korea, Republic of; Duration: 14 Apr 2024 → 19 Apr 2024; https://2024.ieeeicassp.org/ |
Conference
Conference | 49th IEEE International Conference on Acoustics, Speech, and Signal Processing 2024 |
---|---|
Abbreviated title | ICASSP 2024 |
Country/Territory | Korea, Republic of |
City | Seoul |
Period | 14/04/24 → 19/04/24 |
Internet address | https://2024.ieeeicassp.org/ |
Keywords
- Audio modality
- Few-shot learning
- Image classification
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering