Sunday, May 8, 2016

Advanced Computer Vision Final Projects

Highlighted project:
Avery Allen and Wenchen Li, Generative Adversarial Denoising Autoencoder for Face Completion. [webpage]

All projects:
Cusuh Ham, Sketch-Based Image Synthesis. [webpage]
John Turner and Siddharth Raja, O'FaMACap dataset (Obama Face&Mouth Image/Audio/Caption) and LSTM-based lipreader. [webpage]
Carl Saldanha, Visual Question Generation. [webpage]
Varun Agrawal and Palash Shastri, Deep Learning on the Yelp Image Dataset. [webpage]
Vasavi Gajarla and Aditi Gupta, Emotion Detection and Sentiment Analysis of Images. [pdf]
Avinash Bhaskaran and Anusha Sridhar Rao, Structure from Motion using Uncalibrated Cameras. [pdf]
Huda Alamri and Julia Deeb, Diving Deeper into IM2GPS. [pdf]
Jonathan Suit, Generating Facial Expressions. [pdf]
Punarva Katte and Prabhudev Prakash, Billboard Content Recognition for Driver Assistance Systems. [pdf]
Sam Seifert, Autocomplete Sketch Tool. [pdf]
Shantanu Deshpande and Naman Goyal, Sketch Based Image Retrieval. [pdf]
Stefano Fenu and Carden Bagwell, Image Colorization using Residual Networks. [pdf]

Saturday, May 7, 2016

Final projects

Hi class, I intend to post the final projects on the class webpage and blog. If you don't want your project posted then let me know.

Tuesday, April 26, 2016

Talks of Interest - Devi Parikh and Dhruv Batra

We have two visitors whose work has come up in this class. Devi Parikh will give a talk in TSRB in the second floor GVU cafe at 11:00 today (Tuesday). Dhruv Batra will give a talk at the same location on Wednesday. Topics include the Visual Question Answering task that we've addressed in class. Please attend if you can.

Monday, April 25, 2016

Final Presentations - Friday, April 29, 8am

We will have final project presentations this Friday during the final exam slot. Please aim for the same 6-minute presentation length that has been recommended all semester.

As the course syllabus says, "Students will also produce a conference-formatted write-up of their project. Projects will be published on this web page". This will not be due on Friday, but instead on Wednesday, May 4th. Also, you have the option of producing either a conference-formatted pdf (download something like the CVPR author toolkit) or a web page with a similar level of detail. The level of detail should be that of a "short paper", i.e. about 4 pages with figures and references. It's OK if the writeup is longer.

Wednesday, April 20, 2016

Fri, April 22 - Sketchy Database

The Sketchy Database: Learning to Retrieve Badly Drawn Bunnies. Patsorn Sangkloy, Nathan Burnell, Cusuh Ham, James Hays. Siggraph 2016

This is our last paper. The camera-ready deadline is tomorrow, so the paper will be posted on Thursday.

Edit: Here is the paper. Since I'm making this available so late don't worry about the summaries or questions. Feel free to post if you do have comments, though.

Monday, April 18, 2016

Wed, April 20 - LSDA

LSDA: Large Scale Detection Through Adaptation. Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko. 2014.
arXiv

Varun will spend some time discussing this paper first:
Rich feature hierarchies for accurate object detection and semantic segmentation. Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. 2014.
arXiv

Sunday, April 17, 2016

Mon, April 18 - Adversarial Networks

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. Alec Radford, Luke Metz, Soumith Chintala. 2015.

project page, arXiv

Wednesday, April 13, 2016

Fri, April 15 - no class

Use this free period to work on your semester projects. On Friday it is exactly two weeks until your final presentations during our final exam slot.

Monday, April 11, 2016

Wed, April 13 - A Neural Algorithm of Artistic Style

A Neural Algorithm of Artistic Style. Leon A. Gatys, Alexander S. Ecker, Matthias Bethge. 2015.

implementation, arXiv

Friday, April 8, 2016

Mon, April 11 - Learning to Generate Chairs

Learning to Generate Chairs, Tables and Cars with Convolutional Networks. Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox. CVPR 2015.

arXiv

Wednesday, April 6, 2016

Fri, April 8 - Dense Semantic Correspondence

Dense Semantic Correspondence Where Every Pixel is a Classifier. Hilton Bristow, Jack Valmadre, Simon Lucey. ICCV 2015.

arXiv

Monday, April 4, 2016

Wed, April 6 - Visual Madlibs

Visual Madlibs: Fill in the blank Description Generation and Question Answering. Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg. ICCV, 2015.

project page, pdf

Another note -- attendance hasn't been great and many people are arriving to class late. It's vital to have people present for discussions. I do take attendance and as the syllabus says it is part of your grade.

Friday, April 1, 2016

Mon, April 4 - VQA

VQA: Visual Question Answering. S. Antol*, A. Agrawal*, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. ICCV, 2015.

project page, arXiv

Wednesday, March 30, 2016

Fri, April 1st - Exploring Nearest Neighbor Approaches for Image Captioning

Exploring Nearest Neighbor Approaches for Image Captioning. Jacob Devlin, Saurabh Gupta, Ross Girshick, Margaret Mitchell, C Lawrence Zitnick. arXiv, 2015.

arXiv

Naman is also going to discuss a non-baseline version of image captioning: Deep Visual-Semantic Alignments for Generating Image Descriptions.

Wednesday, March 23, 2016

Mon, March 28 - Project status updates

As shown on the course schedule, Monday and Wednesday are our second round of project status updates. The expectation is that you can demonstrate the core technical component of your semester project. You should have real results, although those results may still have issues or you may be using only a portion of the data that you plan to use eventually. A good presentation might demonstrate an implementation of what you proposed earlier, but highlight some unexpected problems revealed by your experiments.

The presentations will be in the reverse order of our first project presentations.

Wednesday, March 16, 2016

Fri, Mar 18 - AverageExplorer

AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections. Jun-Yan Zhu, Yong Jae Lee, Alexei Efros. Siggraph 2014.

project page

Monday, March 14, 2016

Wed, Mar 16 - Learning Visual Similarity for Product Design with Convolutional Neural Networks

Learning Visual Similarity for Product Design with Convolutional Neural Networks. Sean Bell, Kavita Bala. Siggraph 2015.

author page, pdf

Friday, March 11, 2016

Mon, Mar 14 - Unsupervised Visual Representation Learning by Context Prediction

Unsupervised Visual Representation Learning by Context Prediction. Carl Doersch, Abhinav Gupta, Alexei A. Efros. ICCV 2015.

project page

Wednesday, March 9, 2016

Fri, Mar 11 - ResNet

Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. MS COCO detection challenge winner 2015.

arXiv

Monday, March 7, 2016

Wed, Mar 9 - Fully Convolutional Networks for Semantic Segmentation

Fully Convolutional Networks for Semantic Segmentation. Jonathan Long, Evan Shelhamer, Trevor Darrell. CVPR 2015.

arXiv

Friday, March 4, 2016

Mon, Mar 7 - Fast R-CNN

Fast R-CNN. Ross Girshick. ICCV 2015.

arXiv, code

Wednesday, March 2, 2016

Fri, Mar 4 - Diagnosing error in object detectors

Diagnosing error in object detectors. Derek Hoiem, Yodsawalai Chodpathumwan, and Qieyun Dai. ECCV 2012.

project page

Monday, February 29, 2016

Wed, Mar 2 - DeepBox

DeepBox: Learning Objectness with Convolutional Networks. Weicheng Kuo, Bharath Hariharan, Jitendra Malik. ICCV 2015.

arXiv

Friday, February 26, 2016

Mon, Feb 29th - Selective Search

Selective Search for Object Recognition. J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders. IJCV 2013.

project page

Wednesday, February 24, 2016

Fri, Feb 26 - Understanding Deep Image Representations by Inverting Them

Understanding Deep Image Representations by Inverting Them. Aravindh Mahendran, Andrea Vedaldi. CVPR 2015.

arXiv

Monday, February 22, 2016

Wed, Feb 24 - Object Detectors Emerge in Deep Scene CNNs

Object Detectors Emerge in Deep Scene CNNs. Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba. ICLR, 2015.

project page, arXiv

Friday, February 19, 2016

Mon, Feb 22 - Places database

Learning Deep Features for Scene Recognition using Places Database. B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. NIPS 2014.

project page, pdf, demo

Wednesday, February 17, 2016

Fri, Feb 19 - Deep Neural Decision Forests

Deep Neural Decision Forests. Peter Kontschieder, Madalina Fiterau, Antonio Criminisi, and Samuel Rota Bulo. ICCV 2015.

Project page

Monday, February 15, 2016

Miscellaneous stuff

This is a reminder that as you start to get your hands dirty with data and code (often deep learning), TA Nam Vo is someone you can turn to for help. His office hours are Friday 2-4pm, CCB 308L

Also, I was asked to post a link to Tomasz Malisiewicz's blog as I've mentioned some of the things he wrote: http://www.computervisionblog.com/

Wed, Feb 17 - PatchMatch

PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing. Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman. Siggraph 2009.

project page

Friday, February 12, 2016

Mon, Feb 15 - Going Deeper with Convolutions

Going Deeper with Convolutions. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. 2014.

arXiv

Thursday, February 11, 2016

Fri, Feb 12 - What Makes Paris Look Like Paris?

What makes Paris look like Paris? Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. Siggraph 2012.

project page

Friday, February 5, 2016

Next two classes and TA hours

Reminder that on Monday and Wednesday you are giving 6-minute presentations on your semester projects.

Now that you've nailed down a project topic, if you need help then TA Nam Vo will have office hours Friday 2-4pm, CCB 308L.

Wednesday, February 3, 2016

Fri, Feb 5 - How do Humans Sketch Objects?

How do humans sketch objects? Mathias Eitz, James Hays, and Marc Alexa. Siggraph 2012.

project page

Monday, February 1, 2016

Wed, Feb 3 - Sketch2Photo

Sketch2Photo: Internet Image Montage. Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu. ACM Transactions on Graphics (SIGGRAPH Asia 2009).

project page

Friday, January 29, 2016

Mon, Feb 1 - ImageNet

ImageNet: A Large-Scale Hierarchical Image Database. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. IEEE Computer Vision and Pattern Recognition (CVPR), 2009

pdf, project page

Wednesday, January 27, 2016

Fri, Jan 29 - AlexNet

ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton. NIPS 2012.

pdf

Monday, January 25, 2016

Wed, Jan 27 - Deep Learning Tutorial

This class will be somewhat unusual in that we won't discuss a particular paper. We'll try to make sure that we understand deep learning well enough to follow the papers for the rest of the semester.

CVPR 2014 Tutorial on Deep Learning. Graham Taylor, Marc'Aurelio Ranzato, and Honglak Lee. Read only the first two sets of slides, labeled Introduction and Supervised Learning.

CVPR 2014 tutorial

Thursday, January 21, 2016

No class Friday

Campus will close at noon so we will not have class on Friday. We will talk about "Learning to predict where humans look" on Monday.

Mon, Jan 25: Learning to predict where humans look

This topic is delayed until Monday because of inclement weather.

Paper assignments are up, and a tentative early schedule.

Learning to predict where humans look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009.

Project Page

Monday, January 18, 2016

Wed, Jan 20: Photo Clip Art

Hi Class, I'm going to go ahead and lead another discussion on Wednesday because I don't think I would be giving enough notice for someone else to present.

The image generation topics seemed popular but nobody selected this paper. It's a good paper to read before we get to the later papers.

Photo Clip Art. Jean-Francois Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn and Antonio Criminisi. ACM Transactions on Graphics (SIGGRAPH 2007).

project page

Wednesday, January 13, 2016

Fri, Jan 15: MS COCO

Microsoft COCO: Common Objects in Context. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollar, and C. Lawrence Zitnick. ECCV 2014.

Monday, January 11, 2016

Wed, Jan 13 paper: Scene Completion

Scene Completion Using Millions of Photographs. James Hays, Alexei A. Efros. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3.

project page.

This is the first paper for which you'll post reading summaries. Here is the description of these summaries from the class website: Students will be expected to read one paper for each class. For each assigned paper, students must write a two or three sentence summary and identify at least one question or topic of interest for class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries must be posted to the class blog http://cs7476.blogspot.com/ by 11:59pm the day before each class. Feel free to reply to other comments on the blog and help each other understand confusing aspects of the papers. The blog discussion will be the starting point for the class discussion. If you are presenting, you don't need to post a summary to the blog.

Simply click on the comment link below this to post your short summary and one or more questions / discussion topics.

Partner Search

Hi Class. If you don't know who you want to work with, feel free to reply to this thread and perhaps say a bit about what project topics you had in mind, if any. E.g. "I'm James and I'm very interested in object proposals or crowdsourcing strategies. Let me know if you want to chat about working together on a project".

Welcome to CS 7476 Advanced Computer Vision

This is the blog where you will post discussion topics and questions for each class. We still need to decide which topics we want to cover. Many suggestions are available on the course website: http://www.cc.gatech.edu/~hays/7476/

Please answer the poll, as well. It closes on the 18th of January.