Life is a game, take it seriously


How things work: Amazon Flow App Algorithm

In App, Computer Vision on January 29, 2012 at 6:49 pm

by Gooly (Li Yang Ku)

A9 Flow App

Amazon and Google are now the top players in the area of image query. Amazon's laboratory A9 acquired the image-query company SnapTell in 2009 and released the smartphone app Flow in mid 2011. Google has long been in the image search business since it acquired Neven Vision in 2006, and released Google Goggles in early 2011.

Amazon's Flow is an app that lets users obtain product information by pointing their phone camera at a product. The idea is to let consumers buy products on Amazon while standing in a rival's physical store, and also to report the product's local price back to Amazon. Many shop owners consider this controversial business idea immoral, but it might be unavoidable.

Google Goggles is a more general smartphone app that does image recognition and image query. I'll leave Google Goggles to the next post and focus on Flow for now.

The Flow app is made by A9's visual search group, which I believe is basically the SnapTell group. Before Amazon acquired SnapTell, SnapTell already had a visual search app. It even used a similar logo (see below).

A9's official website reveals very little about the technology; we can only guess from the following flowchart. Apparently they are waiting for patents to be approved and don't want to reveal any details.

To get a sense of what this is all about, we have to dig deeper and see who the people behind this app are. From SnapTell.com we know that the people who probably influenced this app most are Gautam Bhargava (CEO), Rajeev Motwani (Stanford professor), and G.D. Ramkumar (CTO). If you google their publications, none of them worked in the computer vision area: Gautam Bhargava worked on databases, Rajeev Motwani teaches theoretical computer science, and G.D. Ramkumar seems to have worked on geometric algorithms. Therefore the innovative part of this app very likely lies in the database part.

From the first part of the chart, the points look a lot like SIFT-like features. According to vision wang's blog post, the ASG (Accumulated Signed Gradient) algorithm is believed to be a SIFT-like feature. Judging from the name, I guess it might accumulate several gradient values around a point into a vector descriptor, or it might simply mean that the algorithm accumulates several descriptors into one vector. The app works in real time, so I doubt anything too complicated was implemented.
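Purely to illustrate that guess (this is not the actual ASG algorithm; the image file name, keypoint location, and window size below are all made up), a toy version of "accumulating signed gradients around a point into one descriptor" could look like this in Matlab:

% toy sketch only: accumulate signed gradients in a window around a point
img = double(rgb2gray(imread('product.jpg')));  % any image of a product
[gx, gy] = gradient(img);                       % signed gradients
x = 120; y = 80; r = 8;                         % hypothetical keypoint and window radius
winX = gx(y-r:y+r, x-r:x+r);
winY = gy(y-r:y+r, x-r:x+r);
desc = [sum(winX, 1), sum(winY, 1)];            % accumulate into one vector descriptor
desc = desc / norm(desc);                       % normalize, as most descriptors do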

However, according to a 2008 white paper I found on the web (see the flowchart below), text is used in addition to the image: text inside the polygon surrounded by the feature points could be used as additional information for visual search.

I would say there might not be anything amazing in the part I mentioned above; the core technology should be in how they query the image database. While I am not an expert on databases, I guess the database is a tree-like structure, something derived from the patent "Method and apparatus for classification of high dimensional data", written by G.D. Ramkumar before he co-founded SnapTell.

The problem Flow is trying to solve is actually very hard and complicated, and apparently they haven't solved it. I have only succeeded in matching books and a Coke can so far. For the app to work well on other non-planar objects, more effort needs to go into the first step.


Object matching method made in the 20th century

In Computer Vision, Matlab on January 15, 2012 at 8:33 pm

written by gooly

I just submitted some Matlab code for object matching, using an old but simple method mentioned in the paper:

Lowe, D.G. 1999. Object recognition from local scale-invariant features.
In International Conference on Computer Vision, Corfu,
Greece, pp. 1150–1157.

This is the original, famous SIFT paper. Most people know SIFT points for their robustness and scale and rotation invariance, but many might not notice that an object matching method is also described in the paper.

This Matlab code is based on that method but uses SURF points instead of SIFT. To run the Matlab code you have to download the SURFmex library first. http://www.maths.lth.se/matematiklth/personal/petter/surfmex.php
Remember to include the SURFmex library by right-clicking its folder in Matlab and adding the folder and its subfolders to the path.
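Equivalently, you can add it from the command window (adjust the path to wherever you unpacked SURFmex):

% add the SURFmex library and all of its subfolders to the Matlab path
addpath(genpath('surfmex'));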

You can then run Demo.m to see the matching result.

Demo.m first calls createTargetModel, whose inputs are a target image and a second image marking the contour of the target within that same image. createTargetModel then gathers the information needed for object matching and outputs it as targetModel.

matchTarget is then called with the targetModel and the test image as input. The contour of the target in the test image will then be shown.
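In other words, the usage boils down to something like the following (the image file names here are just placeholders):

targetImg  = imread('target.jpg');          % image of the object to find
contourImg = imread('target_contour.jpg');  % same image with the target contour marked
testImg    = imread('test.jpg');            % scene image to search in

targetModel = createTargetModel(targetImg, contourImg);
matchTarget(targetModel, testImg);          % shows the detected contour in the test image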

The algorithm works as follows. First, the SURF points of the target image are extracted and stored. In matchTarget.m the SURF points of the test image are also computed, and each of them is matched to the most similar SURF point in the model. Using the scale and orientation of the SURF point descriptors, each matched SURF point pair implies a transformation (position, scale, and orientation) from the target image to the test image.
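The matching step itself is just a nearest-neighbor search over descriptor vectors. A minimal sketch of the idea (modelDesc and testDesc below are stand-ins for the descriptor matrices, one SURF descriptor per column):

% nearest-neighbor matching of test descriptors against model descriptors
modelDesc = rand(64, 200);   % stand-in: one 64-d SURF descriptor per column
testDesc  = rand(64, 150);
nTest = size(testDesc, 2);
match = zeros(1, nTest);
for i = 1:nTest
    d = modelDesc - repmat(testDesc(:, i), 1, size(modelDesc, 2));
    [~, match(i)] = min(sum(d .^ 2, 1));   % index of the most similar model point
end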

Therefore a single pair of correctly matched SURF points can determine the position, scale, and orientation of the target in the test image. However, most of the matched pairs aren't correct, so we let all of the pairs cast votes on the correct position, scale, and orientation of the target in the test image.
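This voting is essentially a coarse Hough transform over (x, y, scale, orientation). A rough sketch with made-up bin sizes (predX, predY, predS, predO stand in for the pose predicted by each matched pair):

% pose voting: bin each pair's predicted pose and keep the busiest bin
n = 300;                                    % stand-in predictions, one per matched pair
predX = rand(1, n) * 640;  predY = rand(1, n) * 480;
predS = 2 .^ (4 * rand(1, n) - 2);  predO = 2 * pi * rand(1, n);

binX = floor(predX / 32);                   % 32-pixel location bins
binY = floor(predY / 32);
binS = round(log2(predS));                  % one bin per octave of scale
binO = mod(round(predO / (pi / 6)), 12);    % 30-degree orientation bins

[~, ~, idx] = unique([binX; binY; binS; binO]', 'rows');
votes = accumarray(idx, 1);
[~, best] = max(votes);
inliers = find(idx == best);                % pairs that agree on the winning pose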

The result with the highest number of votes is then refined: a rotation matrix and a translation vector are calculated from the SURF point pairs that contributed to it.
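One standard way to do such a refinement is a least-squares similarity transform (Procrustes style) over the winning point pairs; a small sketch, with P and Q standing in for the matched model and test points:

% estimate a similarity transform (rotation R, scale s, translation t)
% from matched 2-d point pairs; P (model) and Q (test), one point per column
P = rand(2, 20);  Q = rand(2, 20);
muP = mean(P, 2);  muQ = mean(Q, 2);
Pc = P - repmat(muP, 1, size(P, 2));
Qc = Q - repmat(muQ, 1, size(Q, 2));
[U, S, V] = svd(Qc * Pc');                  % cross-covariance of centered points
D = diag([1, sign(det(U * V'))]);           % guard against reflections
R = U * D * V';                             % rotation matrix
s = trace(S * D) / sum(sum(Pc .^ 2));       % least-squares scale
t = muQ - s * R * muP;                      % translation vector
% a model point p then maps into the test image as s * R * p + t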