Tuesday, October 1, 2013

Discussion on NARF features...

Thanks to Bastian Steder for his kind replies to my pestering emails :)


Paper about NARF:

- B. Steder, R. B. Rusu, K. Konolige, and W. Burgard. Point Feature Extraction on 3D Range Scans Taking into Account Object Boundaries. In Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), 2011.

or his PhD thesis:

- Bastian Steder. Feature-Based 3D Perception for Mobile Robots. PhD thesis, University of Freiburg, 2013.

You can find both on his homepage.


The rest of the article is the reply I got from Bastian to my questions.

> I was going through the paper on NARF keypoints, and it uses a range image
> representation of the point cloud for keypoint extraction.

> 1. Does it mean that NARF keypoints can only be extracted on point
> clouds that can be converted to range images?

> 2. Can unorganized point clouds be converted to range images?

> 3. Is the statement below true?

> "The range image that is visualized in the PCL tutorial uses a spherical
> projection for better viewing only. Essentially, a range image is a depth
> image where pixel positions contain depth information. The range image can
> also be viewed as a normal planar image, but points outside a 180-degree
> field of view cannot be displayed."



It is correct that NARFs can only be extracted from range images. But a 
range image can be created both from organized and unorganized point 
clouds, as long as you can provide a viewpoint.
I am not sure if I understand the statement in 3 correctly, but there 
are two RangeImage classes. RangeImage uses a spherical projection and 
can therefore represent full 360-degree views. RangeImagePlanar uses a 
pin-hole camera model with a projection plane and can therefore only 
represent scenes with less than a 180-degree field of view.
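
To make the two classes concrete, here is a minimal sketch based on the standard PCL range image tutorials (PCL 1.x API; the angular resolution, image size and camera intrinsics below are placeholder values I chose for illustration, not recommendations):

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/common/angles.h>
#include <pcl/range_image/range_image.h>
#include <pcl/range_image/range_image_planar.h>

void buildRangeImages (const pcl::PointCloud<pcl::PointXYZ>& cloud)
{
  // Viewpoint: here the cloud is assumed to be in the sensor frame, so the
  // sensor sits at the origin with identity orientation.
  Eigen::Affine3f sensor_pose = Eigen::Affine3f::Identity ();

  // Spherical projection: can cover the full 360 x 180 degree sphere.
  pcl::RangeImage range_image;
  range_image.createFromPointCloud (cloud,
                                    pcl::deg2rad (0.5f),    // angular resolution
                                    pcl::deg2rad (360.0f),  // max horizontal angle
                                    pcl::deg2rad (180.0f),  // max vertical angle
                                    sensor_pose,
                                    pcl::RangeImage::CAMERA_FRAME,
                                    0.0f,   // noise level
                                    0.0f,   // minimum range
                                    1);     // border size

  // Planar (pin-hole) projection: limited to less than 180 degrees field of
  // view. The image size, principal point and focal lengths are placeholder
  // values roughly matching a Kinect-like depth camera.
  pcl::RangeImagePlanar range_image_planar;
  range_image_planar.createFromPointCloudWithFixedSize (cloud,
                                                        640, 480,       // image width, height
                                                        319.5f, 239.5f, // principal point
                                                        525.0f, 525.0f, // focal lengths
                                                        sensor_pose);
}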



> Goal: One thing I found is that NARF features are not that stable on
> Kinect data, so I was wondering if I can make them more stable.


I noticed that too, but was not able to fix it in the original NARF method. The problem is that the borders of objects in the Kinect data 'flicker' a lot and are not stable. This leads to big differences in keypoint positions between frames.
I am working on a new keypoint method that should work better in this context, but it is not finished and therefore also not published yet.


> I guess that you had lidar point clouds in mind while developing NARFs.
> But sometimes point clouds from a lidar like the Velodyne can be very
> sparse.
> 1. So how do sparse point clouds affect the stability of NARFs and the
> number of features extracted?


Sparsity itself is typically not a problem. I used the method on Velodyne data before. The only thing you have to keep in mind is that you need to choose a large enough support size that there are actually enough points for the estimation process.
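
For reference, the support size is exposed as the support_size field of the NARF keypoint detector's parameters in PCL. A rough sketch along the lines of the PCL NARF keypoint tutorial (the 0.5 m value is only an illustrative choice for sparse outdoor scans, not a recommendation):

#include <pcl/range_image/range_image.h>
#include <pcl/features/range_image_border_extractor.h>
#include <pcl/keypoints/narf_keypoint.h>

// 'range_image' is assumed to have been created from the scan as shown earlier.
void detectNarfKeypoints (const pcl::RangeImage& range_image)
{
  pcl::RangeImageBorderExtractor border_extractor;
  pcl::NarfKeypoint detector (&border_extractor);
  detector.setRangeImage (&range_image);

  // The support size is the diameter (in meters) of the sphere around each
  // point that is used for the estimation. For sparse scans it has to be
  // large enough to contain a reasonable number of points.
  detector.getParameters ().support_size = 0.5f;

  pcl::PointCloud<int> keypoint_indices;
  detector.compute (keypoint_indices);
}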


> 2. Could you give any suggestions on improving the algorithm, and mention
> any drawbacks you found in the current method from your experience?


As mentioned above, the method does not work so well on Kinect-like sensors, mainly because of the unstable borders. If you can somehow make the border positions more stable, that might help, but I do not have a good idea how to do that.


> 3. Currently I am in the beginning stage of understanding your
> algorithm. I was not able to find the math behind converting a point cloud
> to a range image. Could you please point me to where I can find the theory
> on converting a point cloud to a range image?


Converting a point cloud into a range image is typically done using a simple z-buffer. This means that all points are converted into pixel coordinates and ranges and then every pixel is the minimum of the ranges that fall into it.
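
To illustrate the z-buffer idea in code, here is a simplified, self-contained sketch of the principle only; it is not PCL's actual implementation (which additionally handles interpolation, image cropping and border information), and the point struct and function name are made up for the example:

#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Point3D { float x, y, z; };

// Simplified spherical z-buffer: project each point (given relative to the
// viewpoint) to an (azimuth, elevation) pixel and keep the smallest range
// that falls into that pixel.
std::vector<float> pointsToRangeImage (const std::vector<Point3D>& points,
                                       float angular_resolution,  // radians per pixel
                                       int& width, int& height)
{
  width  = static_cast<int> (std::ceil (2.0f * M_PI / angular_resolution));  // 360 deg
  height = static_cast<int> (std::ceil (M_PI / angular_resolution));         // 180 deg
  std::vector<float> ranges (width * height,
                             std::numeric_limits<float>::infinity ());  // "empty" pixels

  for (const Point3D& p : points)
  {
    float range = std::sqrt (p.x*p.x + p.y*p.y + p.z*p.z);
    if (range <= 0.0f)
      continue;
    float azimuth   = std::atan2 (p.y, p.x);     // in [-pi, pi]
    float elevation = std::asin  (p.z / range);  // in [-pi/2, pi/2]
    int col = static_cast<int> ((azimuth   + static_cast<float> (M_PI))        / angular_resolution);
    int row = static_cast<int> ((elevation + static_cast<float> (M_PI) / 2.0f) / angular_resolution);
    if (col < 0 || col >= width || row < 0 || row >= height)
      continue;
    float& pixel = ranges[row * width + col];
    pixel = std::min (pixel, range);  // z-buffer: the closest point wins
  }
  return ranges;
}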


> 4. I went through the tutorial on creating a range image from a point
> cloud in PCL. We have to set a parameter "angular resolution". Does this
> parameter change the number of pixels in the range image generated from
> the point cloud? If yes, then how can I get a range image with all the
> information I have in the point cloud and not miss any data? Will setting
> a small value interpolate any data?



Yes, this conversion step does not keep the original number of points. This is a drawback you have to live with if you want to use this method. Range images naturally encode a certain perspective and therefore do not encode 3D structures that are occluded by other structures given that perspective.
Yes, the angular resolution parameter changes the number of pixels. The method in PCL includes some interpolation, but if the resolution of your cloud is much lower than that of the range image, you will have empty pixels in there.
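
A simple way to see how much data survives at a given angular resolution is to count the non-empty pixels; a small helper continuing the earlier sketches (isValid() is part of pcl::RangeImage and is false for pixels that hold no finite range):

#include <pcl/range_image/range_image.h>

// Count how many pixels of a generated range image actually hold a finite
// range measurement; the rest are empty (unobserved or far range).
int countValidPixels (const pcl::RangeImage& range_image)
{
  int valid_pixels = 0;
  for (int y = 0; y < static_cast<int> (range_image.height); ++y)
    for (int x = 0; x < static_cast<int> (range_image.width); ++x)
      if (range_image.isValid (x, y))
        ++valid_pixels;
  return valid_pixels;
}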



> 5. One more prerequisite is "viewpoint" information to generate a range
> image from a point cloud. If I am planning to use NARFs for pose
> estimation by feature matching, then I actually don't know the current
> "pose" or "viewpoint"; that is what I am trying to estimate in the pose
> estimation process. So does it mean that NARFs cannot be used for
> pose estimation by feature matching? Please correct me if I am missing
> something fundamental :(



Pose estimation is basically what I use NARFs for. :-)
The viewpoint I am talking about is the sensor position in the coordinate frame of your scan. If the scan is in the coordinate frame of the sensor (which is often the case), this viewpoint is typically just (0,0,0). If, for example, you converted the scan into the coordinate system of your robot, it is the position of the sensor on the robot.
But you can also choose another viewpoint to simulate another perspective.
I would propose to just try the method on one of your scans and visualize the resulting range image.
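
Following that suggestion, here is a minimal sketch along the lines of the PCL NARF keypoint tutorial. It derives the viewpoint from the sensor pose stored in the cloud (sensor_origin_ / sensor_orientation_, which default to the origin and identity when the cloud is in the sensor frame) and visualizes the resulting range image; the resolution is again just a placeholder value:

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/common/angles.h>
#include <pcl/range_image/range_image.h>
#include <pcl/visualization/range_image_visualizer.h>

void showRangeImageOfScan (const pcl::PointCloud<pcl::PointXYZ>& cloud)
{
  // The viewpoint: sensor position/orientation stored in the cloud itself.
  // For a cloud given in the sensor frame this is the identity, i.e. (0,0,0).
  Eigen::Affine3f sensor_pose =
      Eigen::Affine3f (Eigen::Translation3f (cloud.sensor_origin_[0],
                                             cloud.sensor_origin_[1],
                                             cloud.sensor_origin_[2])) *
      Eigen::Affine3f (cloud.sensor_orientation_);

  pcl::RangeImage range_image;
  range_image.createFromPointCloud (cloud, pcl::deg2rad (0.5f),
                                    pcl::deg2rad (360.0f), pcl::deg2rad (180.0f),
                                    sensor_pose, pcl::RangeImage::CAMERA_FRAME,
                                    0.0f, 0.0f, 1);

  // Show the range image so you can check what the conversion actually kept.
  pcl::visualization::RangeImageVisualizer viewer ("Range image");
  viewer.showRangeImage (range_image);
  while (!viewer.wasStopped ())
    viewer.spinOnce ();
}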


Hope that this article is useful :-)


