Life is a game, take it seriously


SURF On Images: feature point matching

In Computer Vision, Matlab on April 25, 2011 at 4:06 pm

written by Gooly

SURF (Speeded-Up Robust Features) is a scale- and rotation-invariant interest point detector and descriptor. It belongs to the family tree of the widely used SIFT feature. Since the start of the 21st century, these SIFT-like features have been commonly used in applications such as stereo vision, object recognition, and image stitching. One good example is the famous “Building Rome in a Day” project, in which Microsoft Research took part.

The feature finding process usually consists of two steps. The first step finds the interest points in the image that might contain meaningful structure; this is usually done by comparing the Difference of Gaussian (DoG) response at each location in the image across different scales. A major orientation is also calculated when a point is accepted as a feature point. The second step constructs a scale-invariant descriptor at each interest point found in the first step. To achieve rotation invariance, we align a rectangle to the major orientation. The size of the rectangle is proportional to the scale at which the interest point was detected. The rectangle is then divided into a 4 by 4 grid. Different kinds of information, such as the gradient or the absolute value of the gradient, are then extracted from each of these sub-squares and assembled into the interest point descriptor.
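To make the first step concrete, here is a minimal sketch of a single DoG response in plain MATLAB. It is only an illustration: the synthetic image, the values of sigma and k, and all variable names are my own assumptions, and a real detector compares responses across many scales and neighboring positions rather than looking at one scale.

```matlab
% Sketch: one Difference of Gaussian (DoG) response, base MATLAB only.
% The test image and the scale values are illustrative assumptions.
img = zeros(64);
img(28:36, 28:36) = 1;            % synthetic image with one bright blob

sigma = 2;                        % base scale (assumed)
k = sqrt(2);                      % ratio between neighboring scales (assumed)

% Normalized 1-D Gaussian kernels, truncated at about 3 sigma.
x1 = -ceil(3*sigma):ceil(3*sigma);
g1 = exp(-x1.^2 / (2*sigma^2));
g1 = g1 / sum(g1);

x2 = -ceil(3*k*sigma):ceil(3*k*sigma);
g2 = exp(-x2.^2 / (2*(k*sigma)^2));
g2 = g2 / sum(g2);

% Separable blur: conv2(u, v, A) filters the columns with u and the rows with v.
blur1 = conv2(g1, g1, img, 'same');
blur2 = conv2(g2, g2, img, 'same');

% DoG response at this scale; strong positive or negative values mark blob-like structure.
dog = blur2 - blur1;
```

Stacking such responses for a range of sigma values and looking for local extrema over position and scale is the idea behind the interest point search.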

The SURF feature is a sped-up version of SIFT, which approximates the Gaussian filtering with box filters and evaluates them with the integral image trick. The integral image method is very similar to the one used in Viola and Jones’ famous AdaBoost face detector. An integral image, despite its pretty name, is just an image in which each pixel value is the sum of all the original pixel values to the left of and above it. The advantage is that once the integral image has been computed, the sum over any rectangular block takes only 4 lookups, and the difference between two adjacent blocks takes only 6. With this advantage, finding SURF features can be several times faster than finding traditional SIFT features.
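The integral image trick is easy to check in a few lines of MATLAB. The sketch below is not taken from SURFmex or OpenCV; the names ii, boxSum, and the test image are illustrative assumptions. It shows that a block sum needs four lookups and that the difference between two side-by-side blocks needs six.

```matlab
% Sketch: integral image, constant-time block sum, and a two-block difference.
img = magic(6);                   % small test image (assumed)

% Pad with a zero row and column so that ii(r+1, c+1) = sum(sum(img(1:r, 1:c))).
ii = zeros(size(img) + 1);
ii(2:end, 2:end) = cumsum(cumsum(img, 1), 2);

% Sum over img(r1:r2, c1:c2) from four lookups.
boxSum = @(r1, c1, r2, c2) ii(r2+1, c2+1) - ii(r1, c2+1) - ii(r2+1, c1) + ii(r1, c1);

s = boxSum(2, 3, 4, 5);
direct = sum(sum(img(2:4, 3:5)));        % same value, computed the slow way

% Difference (right block minus left block) of two adjacent blocks:
% left = img(r1:r2, c1:cm), right = img(r1:r2, cm+1:c2). Only six lookups.
r1 = 2; r2 = 5; c1 = 1; cm = 3; c2 = 6;
d = ii(r2+1, c2+1) - ii(r1, c2+1) ...
    - 2*ii(r2+1, cm+1) + 2*ii(r1, cm+1) ...
    + ii(r2+1, c1) - ii(r1, c1);
dDirect = sum(sum(img(r1:r2, cm+1:c2))) - sum(sum(img(r1:r2, c1:cm)));
```

The cost of each lookup does not depend on the block size, which is why box filters of any scale cost the same to evaluate.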

SIFT-like features have become a basic component of general computer vision courses, and also a good starting point for vision research. There is no need to reinvent the wheel, since various libraries are available online for free. OpenCV has both SIFT and SURF implementations. However, since Intel abandoned the project, the 2.x series is poorly documented and lacks books to study from. Fortunately, there are other choices for Matlab users. The VLFeat vision library provides a nice SIFT library and a simple tutorial. SURFmex also provides an interface from Matlab to the OpenCV SURF library. I will show how SURFmex can be used in the following posts.

Book It: On Intelligence

In Book It, Computer Vision, Neural Science on April 19, 2011 at 8:49 pm

written by Gooly

On Intelligence is written by Jeff Hawkins, the Kobe Bryant of neural science: some people may not like him, but he scores. If you are a big fan of neural networks or the human brain, this is definitely a must-read.

Jeff Hawkins, a brain theory enthusiast who founded the Palm PDA company when he hit a bottleneck in his brain research, is also the founder of the Redwood Neuroscience Institute and of Numenta, a company based on neural science. In this book, Hawkins points out the imbalance between theory and experiment in brain research and the stupidity of claiming that we don’t understand the brain yet because we lack data, making some neural scientists look stupid.

The central theory of this book is Hawkins’ claim that intelligence is prediction, which he compares to the discovery of the spherical Earth, Darwin’s theory of evolution, and plate tectonics. This might seem like quite an exaggeration at first: we all know humans can predict, and that intelligence helps make good predictions, so why would anyone compare this rather intuitive theory to those great discoveries? However, as a good speaker and author, Hawkins is persuasive. After reading this book, I have to admit I almost believed what he claimed.

This book was published in 2004, quite old compared to other rapidly changing technologies, but since brain theory moves relatively slowly compared to the computer industry, it is definitely still worth reading.

Setting aside whether Hawkins’ theory is correct or not:
“It’s already 2011, where are our intelligent robots!?”

Paper Talk: Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories

In Computer Vision, Paper Talk on April 16, 2011 at 9:35 am

written by Gooly

“The triangle is a foundation to an offense.”
Bill Cartwright, 3 times NBA Champion

Whatever that means, the triangle is definitely the foundation of this paper. Combining SIFT points into a chain of triangles allows us to use dynamic programming. The DP algorithm works as follows: after finding several triangles, at each iteration we add a new node to the triangle it fits best, creating a new triangle. See figure below.

Since for each node we store the best-fit triangle it can combine with, at the next iteration, when we want to add the best n5 (see above graph), we only have to consider the best fit among all candidates for n5, all the n4 from the last iteration, and the n3 that each n4 picked. For a model with m nodes and an image with n nodes to match, this is a drop from roughly O(n^m) to O(m*n^2).

The fitness of a triangle is a probability defined by both the SURF appearance of its points and their location and orientation compared to the model.

The paper also provides an unsupervised way to learn the model by DP (which is probably the emphasis of the paper).

Some of the paper’s results are shown below.

A Need to Change in Survival Strategy: the merge of world wide web and real world

In Serious Stuffs on April 15, 2011 at 9:57 am

written by Gooly

Since the big bang of the world-wide web, our world has been split into the real world and the virtual world. We all live in the real world physically, but spiritually we could be “completely” in the virtual world. In the good old 2000s, there was a simple way to group people from different professions along the two socially active axes, like the one below. However, the rapid change in social networks in recent years makes this no longer true.

If you ran a regression on the graph above, you would very likely get a negative slope, since with limited time and passion one could hardly be extremely active on both axes.

Back in the good old 2000s, you would probably do pretty well just by working hard in one world. This is due to the following success equation of that time.

As you can see, what you do in one world doesn’t affect your achievement in the other world. For example, even if you defeated the evil lord after a 10-hour war and stepped on others to win the glory of leading your whole tribe in the virtual world, you would still need to take the midterm tomorrow, and your professor wouldn’t even praise your epic accomplishment. This is why parents used to be the biggest enemy of online games. They simply believed that the parameter B in the equation is significantly smaller than A. Since parents have definitely heard of the famous quote “Everything is Optimization” by Stephen P. Boyd, they would know that the best way to gain the most is to put all the effort into the real world. I have to say they are probably right under this equation, but we’ll see that this is no longer true.

After the rapid change in the virtual world due to the rise of various social networks, the real world and the virtual world are merging. It’s harder to hide behind nicknames on the web, since sites remember who you actually are. What happens in the virtual world directly affects your real world; saying something wrong on the web will very likely hit you hard in the real world. The old equation no longer reflects reality. This gives us the following, more adequate success equation for the current world.

Now we have 3 parameters instead of 2 (A, B, and C are usually positive, and most people have C > 1), which makes life even more complicated. To optimize success, we first have to estimate A, B, and C, and then put the right amount of effort into both worlds. However, A, B, and C differ between people and change with time, so we have to do a series of experiments and probably adjust the weights over time.

But this is not the main point. What is significant is that the equation changed from addition to multiplication. What you do on the web will largely affect your total result. You can no longer be extremely successful just by working hard in the real world; the internet does matter. Politicians, doctors, and lawyers all have to jump into the virtual world to get the most; if they don’t, their colleagues will.

This might also mean that in the near future, your professor will probably appreciate you for stopping an evil empire from expanding into his tribe.

MATLAB: Read all images in a folder; everything starts here

In Computer Vision, Matlab on April 13, 2011 at 3:26 pm

written by Gooly

In everyday work, it’s often the case that you need to deal with a large, but not extremely huge, image database with non-sequential file names. Instead of reading each image when you need it, you would probably prefer to read them all into one matrix at once, or even store the result in a mat file so that you can access it much faster next time. The MATLAB function dir is a good way to do this; see the example below.

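function X = ReadImgs(Folder,ImgType)
    % Read every image in Folder that matches ImgType (e.g. '*.jpg') into one array.
    % All images must have the same size and the same number of channels.
    Imgs = dir(fullfile(Folder, ImgType));
    NumImgs = numel(Imgs);
    img = double(imread(fullfile(Folder, Imgs(1).name)));
    X = zeros([NumImgs size(img)]);      % first dimension indexes the images
    for i = 1:NumImgs
        img = double(imread(fullfile(Folder, Imgs(i).name)));
        if size(img,3) == 1
            X(i,:,:) = img;              % grayscale image
        else
            X(i,:,:,:) = img;            % color image
        end
    end
end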

To read in all the jpeg files in image_folder simply do the following,

ImageData = ReadImgs('image_folder','*.jpg');

Notice that this function only deals with images of the same size, which is usually the case. To handle images of different sizes, you can use imresize inside the function, or other methods, depending on the task you are actually working on.

When dealing with a huge image database (2GB and up), you would probably prefer to store the images in separate mat files, open only the one you want to use, and close it afterwards.
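One possible way to do this is sketched below. It is only an illustration under my own assumptions: the chunk size, the file naming scheme, the output folder, and the random stand-in data are all hypothetical, and in practice X would come from a function like ReadImgs.

```matlab
% Sketch: split an image stack into chunks, one mat file per chunk.
X = rand(10, 8, 8);                        % stand-in for 10 grayscale 8x8 images
chunkSize = 4;                             % images per file (assumed)
outDir = fullfile(tempdir, 'img_chunks');  % hypothetical output folder
if ~exist(outDir, 'dir')
    mkdir(outDir);
end

numChunks = ceil(size(X, 1) / chunkSize);
for c = 1:numChunks
    first = (c - 1) * chunkSize + 1;
    last = min(c * chunkSize, size(X, 1));
    chunk = X(first:last, :, :);
    % Each file stores one variable named 'chunk'.
    save(fullfile(outDir, sprintf('chunk_%03d.mat', c)), 'chunk');
end

% Later: load only the chunk that is needed (here the second one, images 5 to 8).
S = load(fullfile(outDir, 'chunk_002.mat'));
imgs = S.chunk;
```

Keeping each chunk well below 2GB also avoids the size limit of the default mat file format; a single variable larger than that needs the '-v7.3' option of save.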

Note that, as Thiruvikraman Kandhadai mentioned in the comments, this code won’t run if you have both RGB and grayscale images in the folder. Save them into two separate matrices if you really need to.


Life is a game, take it seriously

In Serious Stuffs on April 13, 2011 at 4:02 am

This page is here so I can modify it some time later pretending I said it the first day.

Computer Vision is tough, why not have some serious fun?

In Computer Vision on April 12, 2011 at 9:32 pm

This page is here in case I have more stuff I need to pretend was said the first day.