The recent successes of deep learning are largely attributed to supervised training of networks with large numbers of parameters on large datasets. In computer vision, supervised training of convolutional networks with very large labeled datasets provides state-of-the-art solutions in many applications, such as object recognition, image captioning, and question answering. However, creating large labeled datasets is a costly and time-consuming task. We explore alternative approaches to supervised learning when labeled data is limited or unavailable.

The first part of the talk will focus on semi-supervised learning with convolutional neural networks. Techniques such as randomized data augmentation, dropout, and random max-pooling provide better generalization and stability for classifiers. We propose an unsupervised loss function that takes advantage of the stochastic nature of these methods and minimizes the difference between the predictions of multiple passes of a training sample through the network (a rough sketch of this idea appears below).

In the second part of the talk, we will present a neighborhood similarity layer which induces appearance invariance in a network when used in conjunction with convolutional layers, and its applications in domain adaptation. The proposed layer transforms its input feature map using the feature vector at each pixel as a frame of reference, i.e., a center of attention, for its surrounding neighborhood (a sketch of such a layer also appears below). We demonstrate how the proposed layer leads to significantly improved cross-domain accuracy in object recognition and semantic segmentation.

Host: Sunil Thulasidasan
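As a rough illustration of the unsupervised loss described in the first part, the PyTorch sketch below passes the same unlabeled batch through a network several times, lets stochastic components (dropout, randomized augmentation, random pooling) produce different predictions, and penalizes the disagreement between passes. The function name `stability_loss`, the number of passes, and the squared-difference penalty are illustrative assumptions, not the talk's exact formulation.

```python
import torch
import torch.nn.functional as F

def stability_loss(model, x, n_passes=3):
    """Sketch of a stability-style unsupervised loss: run the same unlabeled
    batch through the network several times and penalize disagreement between
    the resulting predictions. Assumes `model` contains stochastic layers
    (e.g., dropout) and is in train mode, so repeated passes differ."""
    preds = [F.softmax(model(x), dim=1) for _ in range(n_passes)]
    loss = 0.0
    # Sum squared differences between every pair of stochastic predictions.
    for i in range(n_passes):
        for j in range(i + 1, n_passes):
            loss = loss + ((preds[i] - preds[j]) ** 2).sum(dim=1).mean()
    return loss
```

In practice a loss like this would be added, with some weight, to the usual supervised cross-entropy term computed on the labeled subset of the data.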
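For the second part, the following is a minimal sketch of what a neighborhood similarity layer might look like, assuming cosine similarity between each pixel's feature vector and the features in its k x k surrounding window; the class name, window size, and choice of similarity measure are assumptions for illustration, not necessarily the formulation presented in the talk.

```python
import torch
import torch.nn.functional as F

class NeighborhoodSimilarity(torch.nn.Module):
    """Sketch of a neighborhood similarity layer: at each pixel, replace the
    feature vector with its similarities to the feature vectors in a k x k
    surrounding window, so the output encodes local feature geometry rather
    than absolute appearance."""
    def __init__(self, k=5):
        super().__init__()
        self.k = k

    def forward(self, x):
        b, c, h, w = x.shape
        xn = F.normalize(x, dim=1)   # unit-norm features -> dot product is cosine similarity
        pad = self.k // 2
        # Extract the k*k neighborhood around every pixel: (b, c, k*k, h*w).
        patches = F.unfold(xn, self.k, padding=pad).view(b, c, self.k * self.k, h * w)
        center = xn.view(b, c, 1, h * w)
        # Cosine similarity between each center pixel and each of its neighbors.
        sim = (patches * center).sum(dim=1)   # (b, k*k, h*w)
        return sim.view(b, self.k * self.k, h, w)
```

Because the output depends only on how each feature relates to its neighbors, not on the features' absolute values, a layer of this form is one plausible way to obtain the appearance invariance the abstract describes.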