HW2 Mining Image Labels from Web Text Descriptions for Classification
Due March 6, 11:59pm
In this homework you will train classifiers for color-based visual attributes.
Training images will be automatically labeled by mining the text descriptions associated
with web shopping images. You will then use these classifiers to retrieve images
displaying each attribute from a collection of testing images.
We will again use shopping images, this time the bag portion of the dataset --
Part 1 - Mining Image Labels from Descriptions (15 points)
We will be training binary attribute classifiers for 5 color terms ("black",
"brown", "red", "silver", and "gold"). In this part of the homework you will
automatically collect training and testing images from the bag dataset by
utilizing their existing associated text descriptions.
- Write code that determines, for a given attribute term, whether that
term is present in the associated image description. Note: your code should match both upper- and lower-case
versions of each attribute term and should include processing to remove
- The set of images whose descriptions contain exactly one of the
attribute terms will form your training set. The rest of the images (with
descriptions containing multiple or no attribute terms) will form your testing set.
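
The labeling rule above can be sketched as follows. This is a minimal illustration, not a required design: the function names and the dictionary input format are hypothetical, and whole-word matching (so that "red" does not fire on "covered") is an assumption about how terms should be matched.

```python
import re

ATTRIBUTE_TERMS = ["black", "brown", "red", "silver", "gold"]

def terms_in_description(description, terms=ATTRIBUTE_TERMS):
    """Return the attribute terms that occur as whole words, ignoring case."""
    # Lowercase, then keep only alphabetic runs so punctuation is dropped.
    words = set(re.findall(r"[a-z]+", description.lower()))
    return [term for term in terms if term in words]

def split_by_term_count(descriptions):
    """descriptions: dict mapping a (hypothetical) image id to its text.

    Returns (train, test): train maps image id -> its single attribute term,
    test lists the ids whose descriptions have zero or several terms.
    """
    train, test = {}, []
    for image_id, text in descriptions.items():
        found = terms_in_description(text)
        if len(found) == 1:
            train[image_id] = found[0]
        else:
            test.append(image_id)
    return train, test
```

For example, split_by_term_count({"a": "Black tote", "b": "Red and gold clutch", "c": "Canvas bag"}) puts "a" in the training set with label "black" and sends "b" and "c" to the testing set.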
Part 2 - Computing Image Descriptors (10 points)
- Compute a hue-saturation-value color histogram descriptor for each image in your training and
testing sets, using 10 bins for each color dimension (h, s, and v).
This will be the image descriptor used in the remainder of the homework. Remember
to normalize your histograms so that they sum to 1. Note that you can remove
grayscale images from consideration since they won't have valid color histograms.
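
One possible reading of the descriptor is a joint 10x10x10 histogram over (h, s, v), flattened to 1000 values. That interpretation is an assumption; concatenating three separate 10-bin histograms is another plausible reading. The sketch below uses only the standard library, and the function names and the pixel format (a list of 8-bit RGB triples) are hypothetical.

```python
import colorsys

BINS = 10

def is_grayscale(pixels):
    """True when every pixel has equal R, G, and B values."""
    return all(r == g == b for r, g, b in pixels)

def hsv_histogram(pixels, bins=BINS):
    """Joint HSV histogram of 8-bit RGB pixels, normalized to sum to 1."""
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        hi = min(int(h * bins), bins - 1)
        si = min(int(s * bins), bins - 1)
        vi = min(int(v * bins), bins - 1)
        hist[(hi * bins + si) * bins + vi] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]
```

The pixel list can come from any image loader that yields RGB triples. Images for which is_grayscale returns True would be skipped before the histogram is computed.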
Part 3 - Training Classifiers (25 points)
In this part of the homework you will train RBF kernel SVMs to
recognize images displaying color-based visual attributes (note that you should train 5 binary SVMs, 1 for each attribute
term). Positive examples for training an
attribute, e.g. "black", will consist of those images in your training set that have the attribute
in their text description. Negative examples will be the rest of the images in your training set.
- You should use the LibSVM package located here as
your SVM implementation (description of LibSVM here, MATLAB and Python interfaces available).
- Because your SVM will be sensitive to choice of parameter values, split
your training set into two parts: 70% as a temporary training set, and 30% as a
tuning set. Use the tuning set to select good values for the SVM parameters -- C and g.
To do this you can simply search over a range
of reasonable values for C and g (experiment to find a good range).
Report best parameter settings and tuning accuracies in your write-up.
- Once you have found good parameter values (for each attribute), use the
entire training set to train final SVM models (for each
attribute). Use the '-b 1' option so that your SVM will compute probability values.
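
The tuning procedure above can be sketched as a 70/30 split followed by a grid search over C and g. The helper names are hypothetical, and the import path libsvm.svmutil is an assumption that depends on how the LibSVM Python interface is installed (the copy bundled with the LibSVM source is imported as svmutil instead). Labels are assumed to be +1 for positive and -1 for negative examples.

```python
import random
from itertools import product

def split_train_tune(labels, features, train_fraction=0.7, seed=0):
    """Shuffle once, then split into temporary-training and tuning parts."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train_idx, tune_idx = indices[:cut], indices[cut:]
    return ([labels[i] for i in train_idx], [features[i] for i in train_idx],
            [labels[i] for i in tune_idx], [features[i] for i in tune_idx])

def grid_search(evaluate, c_values, g_values):
    """evaluate(c, g) returns tuning accuracy; return the best (c, g, accuracy)."""
    best = None
    for c, g in product(c_values, g_values):
        accuracy = evaluate(c, g)
        if best is None or accuracy > best[2]:
            best = (c, g, accuracy)
    return best

def libsvm_tuning_accuracy(y_train, x_train, y_tune, x_tune, c, g):
    """Train an RBF C-SVM on the temporary set and score it on the tuning set."""
    from libsvm.svmutil import svm_train, svm_predict  # assumed import path
    model = svm_train(y_train, x_train, f"-s 0 -t 2 -c {c} -g {g} -q")
    _, (accuracy, _, _), _ = svm_predict(y_tune, x_tune, model, "-q")
    return accuracy
```

A typical use would split once per attribute, call grid_search with a lambda that forwards to libsvm_tuning_accuracy over, say, powers of ten for both parameters, and then retrain on the full training set with the chosen values and '-b 1' added to the option string.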
Part 4 - Classifying/Retrieving Images without Attribute Annotations (20 points)
Here we will retrieve images displaying visual attributes from your testing
set (described in Part 1).
- Classify each image in your testing set using the '-b 1' option
to get a probability value.
- For each attribute, rank the test images according to their probability.
Create a web page for each attribute showing the most likely 200 images in
ranked order. This ranking should make sense if you've implemented everything correctly.
- Compute precision@k curves for each attribute for k=1...200. Note that since you don't have
labels for the test set, you will have to calculate these precision values by looking at the results
and determining whether each ranked image displays the attribute or not.
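
The ranking, results page, and precision@k steps above could look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, image ids are assumed to be usable as image paths or URLs, and the relevance list holds your own manual judgments of the ranked images. The probabilities are assumed to be the positive-class values, which with LibSVM means picking the column that corresponds to the positive label.

```python
from html import escape

def rank_by_probability(image_ids, probabilities, top_k=200):
    """Pair ids with probabilities and keep the top_k most probable."""
    ranked = sorted(zip(image_ids, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

def results_page(attribute, ranked):
    """Build a simple HTML page listing the ranked images in order."""
    items = "\n".join(
        f'<li><img src="{escape(image_id)}" height="120" alt=""> {prob:.3f}</li>'
        for image_id, prob in ranked
    )
    return (f"<html><body><h1>Top results for {escape(attribute)}</h1>\n"
            f"<ol>\n{items}\n</ol></body></html>")

def precision_at_k(relevance):
    """relevance: manual yes/no judgments in ranked order.

    Returns precision@k for k = 1 .. len(relevance), where precision@k is
    the fraction of the first k ranked images that display the attribute.
    """
    precisions, hits = [], 0
    for k, is_relevant in enumerate(relevance, start=1):
        hits += 1 if is_relevant else 0
        precisions.append(hits / k)
    return precisions
```

For instance, judgments of yes, no, yes, yes for the first four images give precision values of 1.0, 0.5, 2/3, and 0.75.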
Part 5 - Freestyle (30 points)
- Implement several extensions of your choice.
- For example: try additional image descriptors, different classifiers, or download some new query images from the web and see what happens, etc.
What to turn in
Hand in via email to firstname.lastname@example.org:
- A write-up including a description of what you implemented, your best parameter
values for each attribute and tuning set accuracies, and precision@k curves for each attribute.
- 5 web pages showing top 200 results for each attribute. You should post these online
and include the page urls in your write-up along with a screen capture of the top results.
- A description of your freestyle experiments and results.
- Commented code.
- A ReadMe documenting your code and how to run it.