Life is a game, take it seriously


Installing Asus Xtion on Ubuntu

In Computer Vision on April 22, 2012 at 11:56 am

written by Gooly (Li Yang Ku)

Research using Kinect-like sensors is currently hot. Kinect and the Asus Xtion are both active point cloud sensors supported by the OpenNI framework developed by PrimeSense, an Israeli company. If you are not trying to play Xbox, I would highly recommend considering the Asus Xtion. The Xtion has several advantages:

  1. it is much smaller
  2. it is easier to mount
  3. it doesn’t need an additional USB plug for power

The lab our lab is working with has already bought eight of them. The only problem is that they went out of stock fast, and the price started rising.

(Note: OpenNI updated its website in 2013, but you can still download the old files at http://www.openni.org/openni-sdk/openni-sdk-history-2/ )

To install the sensor on Ubuntu you only need a few steps:

  1. Download OpenNI.
  2. Download Sensor Driver.
  3. Download Nite.
  4. Extract each downloaded archive and execute the install script in each folder. (Note that NITE has to be installed last.)
  5. Run NiViewer under the sample folder in the OpenNI folder to test the camera. You can right-click to change settings.

http://openni.org/Documentation/Tutorial/smpl_simple_view.html includes some useful samples to start working from. You can get a better-formatted version if you install OpenNI on Windows.

If you receive the error
“a timeout has occurred when waiting for new data”
when setting the RGB image size from 320*240 (QVGA) to 640*480 (VGA) or higher, you might need to update OpenNI and the sensor driver to a newer version. At least that solved my problem.

If you haven’t heard of PCL, http://pointclouds.org/ provides a large set of libraries for handling point clouds, as well as a visualizer for viewing them in 3D. If you use PCL, it already provides an interface, “OpenNIGrabber”, to retrieve plain or color point clouds.
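If you are curious what the grabber does conceptually, a depth image becomes a point cloud by back-projecting each pixel through a pinhole camera model. Here is a minimal numpy sketch of that idea; the intrinsics (fx, fy, cx, cy) below are made-up illustrative values, not the Xtion’s actual calibration, and this is not PCL’s API:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an N x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# toy 2x2 depth image with one missing (zero) reading
depth = np.array([[1.0, 2.0], [0.0, 1.0]])
cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=0.5, cy=0.5)
print(cloud.shape)  # (3, 3): three valid pixels, three coordinates each
```

The real grabber also has to align the depth and RGB images before producing a color point cloud, which PCL handles for you.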

Note that some of the OpenNI functions don’t support multi-threading. I occasionally got crashes when running in a multi-threaded environment.

You can test whether the drivers are installed correctly by running NiViewer under /OpenNi/Samples/bin/x64-release/NiViewer. I had some issues when the RGB camera started at resolution 640*480; the color image showed up as noise. This could be solved by always starting the RGB camera at 320*240 and then immediately switching to another resolution.


Viewing Computer Vision from a Bigger Picture

In Computer Vision, Machine Learning on April 5, 2012 at 10:38 pm

written by gooly (Li Yang Ku)

It’s easy to get lost when you are doing computer vision research, especially when you are deep in the code, tweaking parameters, trying to improve results while keeping everything balanced. When you find yourself doing this for more than half a day, it’s probably a good time to lean back and look at the big picture.

For most computer vision problems, say object recognition, the quest is actually just trying to put data into different bins. Say we have a 200 by 200 grey image; we can look at it as a point in a 200*200 = 40000 dimension space. The problem now is how to classify these points into different bins. In the paper “Origins of Scaling in Natural Images”, Ruderman showed that natural images share common frequency spectra. This suggests that natural images lie in a much smaller subset of this immense space. If we are able to map this high dimension point into a lower dimension while only throwing away uninformative data, we would be able to classify it more easily.
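As a concrete sketch of that framing, the snippet below flattens images into 40000-dimensional points and projects them onto a handful of principal directions (PCA via SVD). The data is random noise standing in for real images, so the numbers are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 fake 200x200 grey images, each flattened into a 40000-dim point
images = rng.normal(size=(50, 200 * 200))

# project onto the top 10 principal directions (SVD-based PCA)
mean = images.mean(axis=0)
centered = images - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
low_dim = centered @ vt[:10].T

print(images.shape)   # (50, 40000)
print(low_dim.shape)  # (50, 10)
```

On real natural images, unlike random noise, most of the variance concentrates in a few directions, which is exactly why such a projection can keep the informative part of the data.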

Most of the vision work resides in this part: taking high dimension data and turning it into lower dimension data. SIFT points, HOG, SURF, and countless other research efforts are doing just this, trying to find the lower dimension data that tells the most.
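As a toy illustration of this descriptor idea, the sketch below reduces a whole image to a single histogram of gradient orientations. This is not real SIFT or HOG (those work on local patches with much more machinery), just the simplest possible example of trading 40000 pixels for a few informative numbers:

```python
import numpy as np

def orientation_histogram(img, bins=8):
    """Toy descriptor: histogram of gradient orientations, weighted by magnitude."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)  # orientations in [-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

img = np.zeros((200, 200))
img[:, 100:] = 1.0  # a vertical edge produces horizontal gradients
desc = orientation_histogram(img)
print(desc.shape)  # a 200*200 image reduced to 8 numbers
```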

And then we head to the second step, where we have to classify this lower dimension data. It could be as simple as nearest neighbor, a probability comparison against your trained model, or any machine learning algorithm such as AdaBoost, SVM, a neural network, etc. This step classifies all these points into different categories.
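The nearest neighbor case is simple enough to sketch in a few lines. The feature vectors and labels below are invented toy data, not the output of any real descriptor:

```python
import numpy as np

def nearest_neighbor_classify(query, points, labels):
    """Assign the query the label of its closest training point (Euclidean)."""
    dists = np.linalg.norm(points - query, axis=1)
    return labels[np.argmin(dists)]

# two toy clusters in a 3-dim "feature space"
points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.1],
                   [5.0, 5.0, 5.0], [5.1, 4.9, 5.0]])
labels = np.array(["cup", "cup", "chair", "chair"])

print(nearest_neighbor_classify(np.array([0.2, 0.1, 0.0]), points, labels))  # cup
```

More sophisticated classifiers mostly differ in how they carve the feature space into regions, but the job is the same: points go into bins.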

So back to where you were tweaking magic parameters: what you are actually doing is probably slightly changing the subspace your images are mapped to, or throwing the points into bins slightly differently. So take it easy; if it only works after you tweak it a lot, you are probably mapping to the wrong space or throwing points the wrong way.