CSE 591: Recognizing People, Objects, and Actions

Instructor: Tamara Berg  (tlberg -at- cs.sunysb.edu)
Office: 1411 Computer Science
Lectures: Tues/Thurs 11:20-12:40pm Rm P112, Physics
Office Hours: Tues/Thurs 1:00-2:00pm, and by appointment
Course Webpage: http://tamaraberg.com/teaching/Spring_12/cse591


Recognition is one of the core pursuits of computer vision. In recognition one attempts to attach semantics to visual data such as images or video. Object recognition is an important subtopic where one builds models to recognize object categories or instances. Other subtopics include: activity recognition -- building descriptions of what people are doing from visual data, face recognition -- attaching identities to pictures or video of faces, and detection -- localizing all instances of a particular category in an image. This course will look at both historical and current methods for recognizing objects, people, actions, and scenes in images and video. Students will have a chance to define their own problems and work on solutions through a course project.

  • Objects - single instance or category based
  • People - faces, pedestrians, pose, and actions
  • Scenes - whole image features, recognition in context, surfaces
  • Recognition using vision + other modalities
  • Human-centric recognition outputs

Tentative Schedule

Date | Topic | Readings | Presenter | Assignments
Jan 24 | Intro & Overview of Course - Slides | - | Tamara | Get access to Matlab, do a tutorial.
Jan 26 | Computer Vision Review - Slides | Sections 1.1-1.2 | Tamara | Choose a paper to present from the reading list, HW1 released
Jan 31 | Computer Vision Review (cont), see slides from Jan 26 | - | Tamara | -
Feb 2 | Bag of Features Models - Slides | Object Recognition from Local Scale-Invariant Features | Tamara | -
Feb 7 | Bag of Features Models - Slides | Visual Categorization with Bags of Keypoints | Tamara | HW2 released
Feb 9 | BoF (cont) & Spatial Models - Slides | Shape Matching and Object Recognition Using Low Distortion Correspondence | Tamara, Kalyan | -
Feb 14 | To Categorize or Not to Categorize - Slides | Principles of Categorization; Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships | |
Feb 16 | Recognizing Attributes - Slides1 | Attribute and Simile Classifiers for Face Verification; Relative Attributes | Tamara, Chen | -
Feb 21 | Faces | Face Recognition using Eigenfaces; Robust Real-time Face Detection | Chaitanya, Vinay | HW3 released
Feb 23 | Faces (cont) | - | - | -
Feb 28 | Intro to Pose & Actions - Slides | - | Tamara | -
March 1 | Pedestrian Detection | Histograms of Oriented Gradients for Human Detection; Pedestrian Detection in Crowded Scenes | Hailin, Nihar | -
March 6 | Project Proposals | - | - | Prepare a 5 minute proposal presentation
March 8 | Pose Estimation in Images | Recovering Human Body Configurations: Combining Segmentation and Recognition; Poselets: Body Part Detectors trained Using 3D Human Pose Annotations | Keerthi, Rajan | -
March 13 | Action Recognition | Recognizing Action at a Distance; Learning Realistic Human Actions from Movies | |
March 15 | Intro to Scenes - Slides | Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories | Tamara | -
March 20 | Scene Interpretation | Automatic Photo Pop-Up; Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics | |
March 22 | Project Updates | - | - | Prepare a 5 minute update presentation
March 27 | Project Help Day | - | - | -
March 29 | Scenes & What We can Do in Them | Recovering the Spatial Layout of Cluttered Rooms; From 3D Scene Geometry to Human Workspace | Feifei, Fan | -
April 3 | Spring Break | - | - | -
April 5 | Spring Break | - | - | -
April 10 | Words & Pictures | - | Tamara | -
April 12 | Catch up Day | - | - | -
April 17 | No Class - Traveling | - | - | -
April 19 | Project Updates | - | - | Prepare a 5 minute update presentation
April 24 | Generating Descriptions of Images | Baby Talk: Understanding and Generating Simple Image Descriptions; Im2Text: Describing Images Using 1 Million Captioned Photographs | |
April 26 | Predicting Aesthetic Quality | High Level Describable Attributes for Predicting Aesthetics and Interestingness; Assessing the aesthetic quality of photographs using generic image descriptors | |
May 1 | Final Project Presentations | - | Shobha & Rajan, Keerthi & Nihar, Chaitanya & Kalyan, Hanyu & Chen | Prepare a 15 minute final presentation
May 3 | Final Project Presentations | - | FeiFei & Hailin, Swastika & Rohit & Vinay | Prepare a 15 minute final presentation
May 8 | Final Project Write-up | - | - | Project Write-up Due


There will be 3 simple homeworks during the first 1-2 months of the course to get students acquainted with computer vision and recognition. Over the final part of the course, students will develop and present a project related to recognition. Students will also be responsible for leading one class paper discussion. A few short quizzes on the assigned papers will also be given.

Grading will consist of: Assignments (35%), Project (35%), Paper presentation (10%), Paper quizzes (10%), Participation (10%).
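
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final grade is a plain weighted sum; the names WEIGHTS and final_score and the example scores are hypothetical and not part of the course materials, and the actual grading procedure may differ.

```python
# Weights (percent of final grade) copied from the grading breakdown above.
# The dictionary keys are illustrative names, not official course terminology.
WEIGHTS = {
    "assignments": 35,
    "project": 35,
    "paper_presentation": 10,
    "paper_quizzes": 10,
    "participation": 10,
}


def final_score(scores):
    """Combine per-component scores (assumed 0-100 each) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Each component contributes its score times its percentage weight.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Made-up scores for a hypothetical student.
example = {
    "assignments": 90,
    "project": 80,
    "paper_presentation": 100,
    "paper_quizzes": 70,
    "participation": 100,
}
print(final_score(example))
```

Because assignments and the project together carry 70% of the weight, a 10-point change in either one moves the total by 3.5 points, while the same change in any of the other three components moves it by 1 point.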

No prior experience in computer vision is required to take this course. Homeworks should be done individually, but projects may be done individually or in pairs. Homeworks will be completed in Matlab.

Submit all homeworks and presentations to: cse591@gmail.com

Useful links

Student Matlab licenses can be purchased from MathWorks for $99 - Link.
Matlab tutorial by Hany Farid and Eero Simoncelli - Link
A more comprehensive Matlab tutorial by David Griffiths - Link

LabelMe - Link
Tiny Images - Link
Code for downloading Flickr images - Link

Computing Features
SIFT features - Link
Scale Invariant Interest Points - Link
Affine Covariant Regions - Link
Shape Contexts - Link
Gist - Link

Other Useful Software
Various Code from INRIA - Link
Various Code from Oxford - Link
Various useful machine learning tools - Link

Reference Books
Forsyth, D. A. and Ponce, J. Computer Vision: A Modern Approach, Prentice Hall, 2003.
Hartley, R. and Zisserman, A. Multiple View Geometry in Computer Vision, Academic Press, 2002.
Palmer, S. E. Vision Science: Photons to Phenomenology, MIT Press, 1999.