Life is a game, take it seriously

Archive for the ‘Point Cloud Library’ Category

Book it: OpenNI Cookbook

In Book It, Computer Vision, Kinect, Point Cloud Library on November 20, 2013 at 7:53 pm

by Li Yang Ku (Gooly)

OpenNI Cookbook


I was recently asked to help review a technical book, “OpenNI Cookbook,” about the OpenNI library for Kinect-like sensors. This is the kind of book that would be helpful if you just started developing OpenNI applications in Windows. Although I did all my OpenNI research in Linux, that was mostly because I needed it to work with robots that use ROS (Robot Operating System), which was only supported on Ubuntu. OpenNI has always been more stable and better supported on Windows than on Linux. However, if you plan to use PCL (Point Cloud Library) with OpenNI, you might still want to consider Linux.

OpenNI Skeleton Tracking

The book covers topics from basic to advanced applications, such as getting the raw sensor data, hand tracking, and skeleton tracking. It also contains sections on things people don’t usually talk about but that are crucial for actual software development, such as listening to device connect and disconnect events. The code in this book uses the OpenNI2 library, which is the latest version of OpenNI. Note that although OpenNI is open source, the NITE library used in the book for hand tracking and human tracking isn’t (though it is free under certain licenses).
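To give a taste of the connect/disconnect events mentioned above, here is a minimal sketch using the OpenNI2 listener interfaces (the class name `PlugListener` and the printed messages are my own; the listener base classes and registration calls are OpenNI2’s):

```cpp
#include <cstdio>
#include <OpenNI.h>

using namespace openni;

// A listener that prints a message whenever a sensor is plugged in or removed.
class PlugListener : public OpenNI::DeviceConnectedListener,
                     public OpenNI::DeviceDisconnectedListener
{
public:
    virtual void onDeviceConnected(const DeviceInfo* info)
    {
        printf("Device connected: %s\n", info->getUri());
    }
    virtual void onDeviceDisconnected(const DeviceInfo* info)
    {
        printf("Device disconnected: %s\n", info->getUri());
    }
};

int main()
{
    if (OpenNI::initialize() != STATUS_OK)
        return 1;

    PlugListener listener;
    OpenNI::addDeviceConnectedListener(&listener);
    OpenNI::addDeviceDisconnectedListener(&listener);

    // ... run your application; the callbacks fire as devices come and go ...

    OpenNI::removeDeviceConnectedListener(&listener);
    OpenNI::removeDeviceDisconnectedListener(&listener);
    OpenNI::shutdown();
    return 0;
}
```

The nice part is that this works without polling: you register once at startup and react to hot-plug events as they happen.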

You can also buy the book on Amazon.




RVIZ: a good reason to implement a vision system in ROS

In Computer Vision, Point Cloud Library, Robotics on November 18, 2012 at 2:33 pm

by Gooly (Li Yang Ku)

It might seem illogical to implement a vision system in ROS (Robot Operating System) if you are working on pure vision; however, after messing with ROS and PCL for a year I can see the advantages of doing so. To clarify, we started using ROS only because we needed it to communicate with Robonaut 2, but the RVIZ package in ROS is truly helpful, to the point that I would recommend it even if no robots are involved.

(Keynote speech about Robonaut 2 and ROS from the brilliant guy I work for)


RVIZ is a ROS package that visualizes robots, point clouds, etc. Although PCL does provide a point cloud visualizer, it only offers the most basic visualization functions; it is really not comparable with what RVIZ can give you.

  1. RVIZ is perfect for figuring out what went wrong in a vision system. The list on the left has a check box for each item, so you can show or hide any visual information instantly.
  2. RVIZ provides 3D visualization that you can navigate with just your mouse. At first I preferred the kind of navigation found in Microsoft Robotics Studio or Counter-Strike, but once you get used to it, it is pretty handy. Since I already have 2 keyboards and 2 mice, it’s quite convenient to move around with my left mouse without taking my right hand off my right mouse.
  3. The best part of RVIZ is the interactive marker. This is the part where you can be really creative. It makes selecting a certain area in 3D relatively easy, so you can adjust your vision system manually while it is still running, such as selecting a certain area as your workspace and ignoring other regions.
  4. You can have multiple vision processes showing data in the same RVIZ. You simply have to publish the point cloud or shape you want to show using the ROS publishing mechanism. Visualizing is relatively painless once you get used to it.
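As a rough sketch of point 4, publishing a point cloud that RVIZ can display looks roughly like this with roscpp and pcl_conversions (the topic name “my_cloud” and frame “base_link” are placeholders, not anything RVIZ requires; in RVIZ you would add a PointCloud2 display subscribed to that topic):

```cpp
#include <ros/ros.h>
#include <sensor_msgs/PointCloud2.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl_conversions/pcl_conversions.h>

int main(int argc, char** argv)
{
    ros::init(argc, argv, "cloud_publisher");
    ros::NodeHandle nh;

    // RVIZ can subscribe to any sensor_msgs/PointCloud2 topic.
    ros::Publisher pub = nh.advertise<sensor_msgs::PointCloud2>("my_cloud", 1);

    // Build (or receive) a PCL cloud; one example point here.
    pcl::PointCloud<pcl::PointXYZ> cloud;
    cloud.push_back(pcl::PointXYZ(0.0f, 0.0f, 1.0f));

    sensor_msgs::PointCloud2 msg;
    pcl::toROSMsg(cloud, msg);
    msg.header.frame_id = "base_link";  // must match a frame RVIZ knows about

    ros::Rate rate(1);
    while (ros::ok())
    {
        msg.header.stamp = ros::Time::now();
        pub.publish(msg);  // every running process can publish to its own topic
        rate.sleep();
    }
    return 0;
}
```

Each of your vision processes can be a node like this publishing on its own topic, and RVIZ overlays them all in one view.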

Try not to view ROS as an operating system like Windows or Linux. It is more like the internet, where RVIZ is just one service, like Google Maps, and you can write your own app that queries the map as long as you use the communication protocol provided by ROS.

xtion kinect on Ubuntu 12.04

In Computer Vision, Point Cloud Library on September 3, 2012 at 3:01 pm

by Gooly (Li Yang Ku)

 (3/9/2014 Update: OpenNI released OpenNI2 last year. I still use the old OpenNI for most of my ROS applications, but you can learn how to use OpenNI2 in my newer post. You can still download the old OpenNI under 

After upgrading to Ubuntu 12.04 I ran into some problems with Kinect and xtion. (I used to have the stable version working on Ubuntu 11; see link.) After trying several different driver combinations, I made some progress but could only get one kind of sensor working at a time.

(Update: the version number below doesn’t exist anymore, please download OpenNI SDK v1.5.7.10, OpenNI-Compliant Sensor Driver v5.1.6.6, NiTE v1.5.2.23 instead)

For Kinect to work:

  1. Install OpenNI (unstable version v from
  2. Install the hardware binaries from ps-engine, either with “apt-get install” or through Synaptic. The order is important: always install OpenNI first.
  3. If it’s not working, check whether XnSensorServer is running, and kill it if it is.

For xtion to work:

  1. Install OpenNI (unstable version v from
  2. Install the hardware binaries (stable version v 5.1.041) from . The order is important: always install OpenNI first.

I checked the results with NiViewer and the PCL OpenNIGrabber; try switching USB ports if you still have problems. I actually have one computer that works with both sensors at the same time, but my other computer doesn’t. Let me know if you get the same result. Thanks.

Recording 3D video (oni) files that align the rgb image with the depth image

In Computer Vision, Point Cloud Library on July 15, 2012 at 10:51 am

by Gooly (Li Yang Ku)

Kinect and xtion-like devices provide an easy way to capture 3D depth images or videos. The OpenNI interface that is compatible with these devices comes with a handy tool, “NiViewer,” that captures 3D video into an oni file. The tool is located under “Samples/Bin/x64-Release/NiViewer” on Linux, and should be in the start menu if you use Windows. After starting the tool, you can right click to show the menu; clicking “Start Capture” and then “Stop” should generate an oni video file in the same folder.

However, the rgb video and the depth video in the oni file are recorded separately and are not aligned. (It turns out they should be aligned if you press 9, but that didn’t work on my machine; see the comments.) This is due to the difference in camera position between the IR sensor and the rgb sensor. OpenNI does provide functions to adjust the depth image to match the rgb image (note that it is not doable the other way around). By adding some code to NiViewer, you should be able to record depth video that is aligned with the rgb image.

First open the file “src/NiViewer/Capture.cpp”; change the code that adds the depth node under the “captureFrame()” function to the following.

nRetVal = g_Capture.pRecorder->AddNodeToRecording(*getDepthGenerator(), g_Capture.nodes[CAPTURE_DEPTH_NODE].captureFormat);
START_CAPTURE_CHECK_RC(nRetVal, "add depth node");
g_Capture.nodes[CAPTURE_DEPTH_NODE].bRecording = TRUE;
DepthGenerator* depth = getDepthGenerator();
// Try software registration first (used by Kinect).
depth->SetIntProperty("RegistrationType", 1);
nRetVal = depth->GetAlternativeViewPointCap().SetViewPoint(*getImageGenerator());
if (XN_STATUS_OK != nRetVal)
{
	// Fall back to hardware registration (used by xtion).
	depth->SetIntProperty("RegistrationType", 2);
	nRetVal = depth->GetAlternativeViewPointCap().SetViewPoint(*getImageGenerator());
	if (XN_STATUS_OK != nRetVal)
		displayMessage("Getting and setting AlternativeViewPoint failed");
}
g_Capture.nodes[CAPTURE_DEPTH_NODE].pGenerator = depth;

Then run make in a terminal under the NiViewer folder; the new NiViewer binary should be generated.
Kinect and xtion sensors use different methods to generate this alternative-view depth image: Kinect does the adjustment in software, while xtion does it in hardware. This is why two different “RegistrationType” values are tried in the code above.

Object Recognition with Point Cloud Library

In Computer Vision, Point Cloud Library on June 10, 2012 at 3:35 pm

by Gooly (Li Yang Ku)

If you can afford 3D, why settle for 2D? This is probably true for both movies and object recognition. With 3D sensors you get a lot of advantages in object recognition, and thanks to the Xbox 360, 3D sensors are now affordable.

Object recognition in 3D easily avoids many of the segmentation problems in 2D, where occlusions only happen from certain angles. It also provides 3D shape information, which differentiates a cup from a picture of a cup. And since you can easily combine 3D and 2D, 3D object recognition is always at least as good as 2D recognition.

The Point Cloud Library (PCL) is the place to start. A naive method would be to first cluster points using Euclidean Cluster Extraction. This gives you a clean 3D segmentation. Note that points in an organized point cloud have the same order as the rgb image, so you can segment your rgb image accordingly.
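For reference, a minimal sketch of Euclidean Cluster Extraction in PCL might look like this (the tolerance and size thresholds are illustrative values you would tune for your sensor and scene, not recommendations):

```cpp
#include <vector>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>
#include <pcl/segmentation/extract_clusters.h>

// cloud: the input point cloud from your sensor, assumed already loaded.
void clusterCloud(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud,
                  std::vector<pcl::PointIndices>& cluster_indices)
{
    // KdTree used for the neighbor searches inside the clustering.
    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
    tree->setInputCloud(cloud);

    pcl::EuclideanClusterExtraction<pcl::PointXYZ> ec;
    ec.setClusterTolerance(0.02);   // 2 cm between points in the same cluster
    ec.setMinClusterSize(100);      // discard tiny noise clusters
    ec.setMaxClusterSize(25000);    // discard a whole-scene "cluster"
    ec.setSearchMethod(tree);
    ec.setInputCloud(cloud);

    // Each PointIndices entry holds the indices of one segmented object; since
    // an organized cloud shares its ordering with the rgb image, the same
    // indices also segment the image.
    ec.extract(cluster_indices);
}
```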

Then you can collect any 2D features, such as SIFT, SURF, or HOG, from each segmented region.

Furthermore, you can collect 3D features such as
or 3D SIFT from each segmented point cloud.

By combining all the features in a way that is invariant to the density of cloud points and to the number of SURF or SIFT points (you could probably use histograms to achieve that), you can acquire a high dimensional vector representing each object.

At test time, a nearest neighbor search over these vectors should then indicate the most likely object.
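The histogram trick and the nearest neighbor step can be sketched in plain C++. This is just an illustration of the idea, not any particular PCL or OpenCV API: normalizing by the total count makes the descriptor invariant to how many points or keypoints a segment happens to contain, and the closest stored descriptor wins.

```cpp
#include <vector>
#include <cmath>

// Build a normalized histogram over pre-binned feature indices. Dividing by
// the total count makes the descriptor invariant to point/keypoint density.
std::vector<double> normalizedHistogram(const std::vector<int>& binIndices, int numBins)
{
    std::vector<double> hist(numBins, 0.0);
    for (size_t i = 0; i < binIndices.size(); ++i)
        hist[binIndices[i]] += 1.0;
    if (!binIndices.empty())
        for (int b = 0; b < numBins; ++b)
            hist[b] /= binIndices.size();
    return hist;
}

// Euclidean distance between two descriptors of equal length.
double descriptorDistance(const std::vector<double>& a, const std::vector<double>& b)
{
    double d = 0.0;
    for (size_t i = 0; i < a.size(); ++i)
        d += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(d);
}

// Nearest neighbor: index of the stored model descriptor closest to the query.
int nearestNeighbor(const std::vector<std::vector<double> >& models,
                    const std::vector<double>& query)
{
    int best = -1;
    double bestDist = 1e300;
    for (size_t i = 0; i < models.size(); ++i)
    {
        double d = descriptorDistance(models[i], query);
        if (d < bestDist) { bestDist = d; best = static_cast<int>(i); }
    }
    return best;
}
```

A sparse view and a dense view of the same object produce the same normalized histogram, which is exactly the invariance the paragraph above is after.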

One more good thing about PCL is that it’s under the BSD License, so you can extend it into a commercial product for free. Ooh la la.