Ponzetto, Wessler, Weiland, Kopf, Effelsberg & Stuckenschmidt: Automatic Classification of Iconic Images Based on a Multimodal Model. An Interdisciplinary Project

The term “iconic image” refers here to images produced to create privileged associations between a particular visual representation and a referent. They are highly recognizable to media users and typically induce negative or positive emotions that affect viewers’ attitudes and actions. In our work we focus on iconic images in the topical area of climate change. Previous research (see, for example, O’Neill & Nicholson-Cole, 2009) has identified recurring iconic representations such as the polar bear on a drifting ice floe (for the problem) or wind turbines in an untouched landscape under a clear sky (for possible solutions).
In this paper we present first results of an interdisciplinary endeavor involving media and communication researchers and computer scientists specializing in text and image analysis. We aim to produce a model that computationally captures the phenomenon of iconic images and that rests on solid theoretical ground. Our methodology focuses on the automatic detection of iconicity in context on the basis of multimedia features derived from both images and their surrounding text. The method consists of two steps:
(1) Data acquisition. We semi-automatically create a dataset of iconic images by relying on existing Web resources. We start with manually created examples of iconic images, e.g., from the educational section of the National Geographic website. Each example consists of a set of images (e.g., a polar bear on an ice floe), an abstract topical label (like “global warming”) as well as a rich textual context made up of a caption and an article. We then use captions and text to automatically generate queries (“ice bear ice floe melting”) and send these to online resources such as Flickr (a popular image hosting website), Google, or Wikipedia in order to retrieve similar iconic images.
(2) Classification task. Given a set of iconic images, their textual contexts and abstract topical labels, we train statistical classifiers (e.g., k-NN, Support Vector Machines) to automatically assign new, unseen images to a closed set of classes in three tasks: (i) binary detection of iconic vs. non-iconic images; (ii) topical classification (e.g., global warming); (iii) topical sub-class labeling (e.g., icons capturing global warming impact vs. causes vs. solutions). To this end, we use both visual and textual features. Visual features consist of SIFT descriptors and object detection via contours, color, and texture. Textual features comprise the words, concepts and topics found in the textual context of the image.
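The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the function names (build_query, fuse, knn_predict), the stop-word list, the double weighting of caption terms, and the two-dimensional placeholder standing in for real visual descriptors are all hypothetical. The sketch uses k-NN over concatenated visual and bag-of-words features, which is only one simple way to combine the two modalities; the abstract does not say how the features are combined.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOPWORDS = {"a", "an", "and", "as", "for", "in", "is", "its", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_query(caption, article, max_terms=4):
    """Step 1 (sketch): turn a caption and article into a keyword query.

    Terms are ranked by frequency; caption terms count twice (an assumed
    weighting, chosen because captions describe the image directly).
    """
    counts = Counter(tokenize(caption) * 2 + tokenize(article))
    # Sort by descending count, then alphabetically, for a deterministic result.
    ranked = sorted(counts, key=lambda term: (-counts[term], term))
    return " ".join(ranked[:max_terms])


def fuse(visual, text, vocabulary):
    """Early fusion: concatenate a visual vector with bag-of-words counts."""
    counts = Counter(tokenize(text))
    return list(visual) + [float(counts[word]) for word in vocabulary]


def knn_predict(train, query, k=3):
    """Step 2 (sketch): majority label among the k nearest training vectors.

    `train` is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(build_query(
        "A polar bear on a melting ice floe",
        "The ice is melting as the climate warms. The bear depends on the ice.",
    ))

    vocabulary = ["bear", "ice", "melting", "turbine", "wind", "car", "city"]
    # Toy data: the 2-D "visual" vectors are placeholders, not SIFT descriptors.
    examples = [
        ([0.90, 0.10], "polar bear on melting ice", "iconic"),
        ([0.80, 0.20], "wind turbine under a clear sky", "iconic"),
        ([0.85, 0.15], "bear ice floe", "iconic"),
        ([0.10, 0.90], "car parked in the city", "non-iconic"),
        ([0.20, 0.80], "city street with a car", "non-iconic"),
        ([0.15, 0.85], "city skyline", "non-iconic"),
    ]
    train = [(fuse(v, t, vocabulary), label) for v, t, label in examples]
    new_image = fuse([0.88, 0.12], "a polar bear drifting on an ice floe", vocabulary)
    print(knn_predict(train, new_image))
```

The same fused vectors could be passed to any other statistical classifier, such as a Support Vector Machine, and the label set could be swapped for topical classes or sub-classes without changing the structure of the sketch.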
This iconic image classification represents the first step towards a full-fledged methodology to automatically capture the phenomenon of iconic images in context. Our long-term vision is to cover all three aspects of iconic images: content, usage, and effects. The content aspect is covered by the predictive classification task described above. Usage will be analyzed with crowdsourcing techniques focusing on intercultural similarities and differences in recognizing iconic images of climate change around the globe. Effects will be studied by way of experiments focusing on the qualities of iconic images and their affective and behavioral consequences.

O’Neill, S., & Nicholson-Cole, S. (2009). Fear won’t do it: promoting positive engagement with climate change through imagery and icons. Science Communication, 355-379.
