Abstract
With the advent of deep learning, convolutional neural networks have largely solved many imaging problems. However, it remains to be seen whether the image 'bottleneck' can be unplugged by harnessing complementary sources of data. In this paper, we present a new approach to image aesthetic evaluation that learns visual and textual features simultaneously. Our network extracts visual features by appending global average pooling blocks to multiple inception modules (MultiGAP), while textual features are learned from associated user comments with a recurrent neural network. Experimental results show that the proposed method achieves state-of-the-art performance on the AVA and AVA-Comments datasets. We also demonstrate the capability of our approach in visualizing aesthetic activations.
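The two-branch idea in the abstract can be sketched in a few lines: global average pooling (GAP) collapses each inception module's feature map to a channel vector, the per-module vectors are concatenated into the MultiGAP visual descriptor, and a recurrent network summarizes the comment's word embeddings into a textual descriptor. The sketch below is a minimal NumPy illustration of that data flow, not the paper's implementation; all shapes, channel counts, and the plain Elman-style RNN are hypothetical choices for illustration.

```python
import numpy as np

def global_average_pool(fmap):
    """Collapse a (C, H, W) feature map to a C-dim vector (GAP)."""
    return fmap.mean(axis=(1, 2))

def multigap_features(feature_maps):
    """Concatenate GAP vectors taken from several inception modules."""
    return np.concatenate([global_average_pool(f) for f in feature_maps])

def rnn_text_features(embeddings, W_h, W_x, b):
    """Run a plain (Elman) RNN over word embeddings; return the final
    hidden state as the textual feature vector."""
    h = np.zeros(W_h.shape[0])
    for x in embeddings:
        h = np.tanh(W_h @ h + W_x @ x + b)
    return h

rng = np.random.default_rng(0)

# Three inception-module outputs with made-up channel counts / sizes.
fmaps = [rng.standard_normal((256, 14, 14)),
         rng.standard_normal((288, 14, 14)),
         rng.standard_normal((320, 7, 7))]
visual = multigap_features(fmaps)           # 256 + 288 + 320 = 864 dims

# A user comment as ten 50-dim word embeddings; 64-dim hidden state.
comment = rng.standard_normal((10, 50))
W_h = rng.standard_normal((64, 64)) * 0.1   # hidden-to-hidden weights
W_x = rng.standard_normal((64, 50)) * 0.1   # input-to-hidden weights
b = np.zeros(64)
textual = rnn_text_features(comment, W_h, W_x, b)

# Fuse the two branches into one joint descriptor for classification.
joint = np.concatenate([visual, textual])
print(visual.shape, textual.shape, joint.shape)  # (864,) (64,) (928,)
```

In the paper the joint representation would feed a classifier that predicts high versus low aesthetic quality; here the fusion is shown simply as concatenation.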
Original language | English |
---|---|
Title of host publication | 2017 IEEE International Conference on Image Processing (ICIP) |
Publisher | IEEE |
Pages | 1722-1726 |
Number of pages | 5 |
ISBN (Electronic) | 9781509021758 |
DOIs | |
Publication status | Published - 22 Feb 2018 |
Event | 24th IEEE International Conference on Image Processing 2017 - Beijing, China (17 Sept 2017 → 20 Sept 2017) |
Conference
Conference | 24th IEEE International Conference on Image Processing 2017 |
---|---|
Abbreviated title | ICIP 2017 |
Country/Territory | China |
City | Beijing |
Period | 17/09/17 → 20/09/17 |
Keywords
- Aesthetic visualization
- CNN
- Deep neural network
- Image aesthetics evaluation
- Textual features
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Signal Processing