Image understanding: Visual dictionaries and Semantic segmentation

Sinem Aslan

University of Milano-Bicocca

19 January 2018, 2:00 pm


Image or scene understanding can be defined as describing the content of an image: the objects it contains, their locations and the relations between them, or the events it depicts. General scene understanding can be approached through object-level category recognition, which exploits visual dictionaries, or through pixel-level category recognition, which is known as semantic segmentation. In the first part of this talk I will present a recently published model-based technique for visual dictionary construction, named Symbolic Patch Dictionary (SymPaD), which utilizes primitive appearance models. In SymPaD, dictionary building starts from a core of shape primitives, which have commonalities with the shape models envisaged by the earliest to the latest proponents of the idea, i.e., from Marr to Griffin. We then enrich the dictionary by using a detailed parametrization of the shape space and by applying nonlinear dyadic compositions. Compared with existing model-driven schemes, our method is able to represent and characterize images in various image understanding applications with competitive, and often better, performance.

In the second part of this talk I will focus on pixel-level category recognition, namely semantic segmentation. Semantic segmentation has been one of the key problems in computer vision in recent years. The main goal is to assign a category label to each pixel in an image, which paves the way towards complete scene understanding. Convinced by the high accuracy of Deep Convolutional Neural Networks (DCNNs) in classification tasks, many researchers have been drawn to employing them for semantic segmentation. After giving a general overview of a state-of-the-art technique for semantic segmentation, I will discuss the challenges of employing deep learning for this task by focusing on a niche application, and address possible solutions.


Sinem Aslan has been a postdoctoral researcher in the Imaging and Vision Laboratory (IVL) at the Department of Informatics, Systems and Communication of the University of Milano-Bicocca since March 2017. At IVL, she has been working on food image segmentation for automatic dietary assessment applications. She obtained her Ph.D. degree (October 2016) from the International Computer Institute, Ege University, Turkey, under the supervision of Prof. Bülent Sankur and Prof. E. Turhan Tunalı. In her thesis, she mainly investigated model-based visual dictionary techniques for image understanding applications. During her Ph.D. studies, she held a visiting position in the BUSIM laboratory at Boğaziçi University, Turkey, for two semesters. Prior to that, she received her M.Sc. degree (2007) from the International Computer Institute, Ege University, and her B.Sc. degree from the Department of Electronics Engineering, Ankara University. Beyond her research background, she has 13 years of experience as a teaching assistant in Turkey. She is a reviewer for various international journals, such as IEEE Transactions on Image Processing, Springer Multimedia Tools and Applications, Springer Pattern Analysis and Applications, and IET Image Processing, and for the conference IEEE SIU.