Research
My main research area is Digital Media, specifically focused on organizing large collections of images with associated text through the development of techniques in Natural Language Processing and Computer Vision. Today billions of images with associated text are available in web pages, captioned photographs from news sources, video with speech or closed captioning, and others. In order to organize, search and exploit these enormous collections we have developed methods that combine information from both the visual and textual sources effectively. Past projects include: automatically identifying people in news photographs, classifying images from the web, and finding iconic images in consumer photo collections. I am also generally interested in bringing together people and expertise from various areas of Digital Media including digital art, music, and cultural studies.
Teaching
Spring 2010 - CSE/ISE 364 Advanced Multimedia.
Bio
I graduated with a Ph.D. from the Computer Science Department at UC Berkeley in the spring of 2007 under the supervision of Professor David Forsyth and was a member of the Berkeley Computer Vision Group. I spent 2007-2008 as a post-doc at Yahoo! Research developing various digital media projects, including the automatic annotation of consumer photographs. I am currently an Assistant Professor at Stony Brook University and am looking for excited, motivated graduate students. Please email me if you are interested in joining my group.
Students
Former Students
Projects
We have built a set of classifiers to recognize several animal categories:
Alligator, Ant, Bear, Beaver, Dolphin, Frog, Giraffe, Leopard, Monkey and Penguin.
Using Google Web Search, we identify a pool of candidate images for a given query.
These images are then re-ranked by our system using information extracted from both
the surrounding text and the images themselves. This gives us quite a good pool
of images for each class. We also demonstrate that we can extend this pool of images
quite easily using a set of related queries for the monkey class.
We produce a startlingly good set of results for complex web data.
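As a rough illustration of the re-ranking step, the sketch below orders candidates by a weighted average of a text cue and an image cue. It is a minimal sketch under stated assumptions, not our actual system: the `Candidate` type, the two precomputed scores in [0, 1], the `text_weight` parameter, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    text_score: float   # hypothetical relevance from the surrounding text, in [0, 1]
    image_score: float  # hypothetical relevance from the image itself, in [0, 1]


def rerank(candidates, text_weight=0.5):
    """Return candidates sorted best-first by a weighted average of both cues.

    The weighted average is an assumption made for illustration; any other
    way of combining the two cues could be substituted here.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")

    def combined(c):
        return text_weight * c.text_score + (1.0 - text_weight) * c.image_score

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical candidate pool returned by a web search for one query.
pool = [
    Candidate("a.jpg", text_score=0.9, image_score=0.2),
    Candidate("b.jpg", text_score=0.6, image_score=0.8),
    Candidate("c.jpg", text_score=0.1, image_score=0.3),
]
ranked_urls = [c.url for c in rerank(pool)]
```

With equal weights, `b.jpg` moves ahead of `a.jpg` because its image cue is much stronger, even though its text cue is weaker.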
| |
|
We define an iconic image for an object category (e.g. the Eiffel Tower) as an image with a large, clearly
delineated instance of the object in a characteristic aspect. We show that for a variety of
objects such iconic images exist and argue that these are the images most relevant to that category.
Given a large set of images noisily labeled with a common theme, say a Flickr tag, we show how to
rank these images according to how well they represent a visual category. We also generate a binary
segmentation for each image indicating roughly where the subject is located. The segmentation
procedure is learned from data on a small set of iconic images from a few training categories and
then applied to several other test categories. We rank the segmented test images according to shape
and appearance similarity against a set of 5 hand-labeled images per category. We compute
three rankings of the data: a random ranking of the images within the category, a ranking
using similarity over the whole image, and a ranking using similarity applied only within the
subject of the photograph. We then evaluate the rankings qualitatively and with a user study.
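The three rankings compared above can be sketched as follows. This is a toy illustration under loud assumptions, not our method: images are small grids of intensities in [0, 1], the segmentation is a given 0/1 mask, and similarity is a plain histogram intersection over intensity (an appearance cue only; the shape cue is omitted). All function names and the example data are hypothetical.

```python
import random


def histogram(pixels, bins=4):
    """Normalized intensity histogram of pixel values in [0, 1]."""
    counts = [0] * bins
    for p in pixels:
        counts[min(int(p * bins), bins - 1)] += 1
    total = sum(counts) or 1  # avoid division by zero for an empty mask
    return [c / total for c in counts]


def pixels_of(image, mask=None):
    """Flatten an image; if a mask is given, keep only pixels where it is 1."""
    if mask is None:
        return [p for row in image for p in row]
    return [p for irow, mrow in zip(image, mask)
            for p, m in zip(irow, mrow) if m]


def score(image, mask, exemplars, use_mask):
    """Best histogram intersection between an image and any exemplar."""
    h = histogram(pixels_of(image, mask if use_mask else None))
    best = 0.0
    for ex_image, ex_mask in exemplars:
        ex_h = histogram(pixels_of(ex_image, ex_mask if use_mask else None))
        best = max(best, sum(min(a, b) for a, b in zip(h, ex_h)))
    return best


def three_rankings(items, exemplars, seed=0):
    """Return (random, whole-image, subject-only) rankings of item names."""
    random_rank = [name for name, _, _ in items]
    random.Random(seed).shuffle(random_rank)
    whole = sorted(items, reverse=True,
                   key=lambda it: score(it[1], it[2], exemplars, False))
    subject = sorted(items, reverse=True,
                     key=lambda it: score(it[1], it[2], exemplars, True))
    return random_rank, [n for n, _, _ in whole], [n for n, _, _ in subject]


# Hypothetical data: the subject occupies the top row of each 2x2 image.
top_row = [[1, 1], [0, 0]]
exemplars = [([[0.9, 0.9], [0.1, 0.1]], top_row)]  # bright subject
items = [
    ("A", [[0.9, 0.9], [0.5, 0.5]], top_row),  # bright subject, other background
    ("B", [[0.1, 0.1], [0.9, 0.1]], top_row),  # dark subject, similar overall
]
random_rank, whole_rank, subject_rank = three_rankings(items, exemplars)
```

In this toy data the whole-image ranking prefers `B`, whose overall intensities resemble the exemplar, while the subject-only ranking prefers `A`, whose segmented subject matches it.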