Lehrstuhl für Informatik VII - Robotik und Telematik

    IAS-13 Tutorial

    Data Structures for Large 3D Point Cloud Processing

    Tutorial at the 13th International Conference on Intelligent Autonomous Systems


    Full Day Tutorial.
    Please bring your own laptop computer and install our Virtual machine before.
    Please download the virtual machine as .vdi or as .zip

    You can download the slides:

    1. 3DTK - The 3D Toolkit .pdf
    2. Introduction .pdf
    3. Basic Data Structures .pdf
    4. Registration .pdf
    5. Mobile Mapping .pdf
    6. Meshing on Large Point Clouds .pdf
    7. Normals .pdf
    8. Planes .pdf

    Among other videos, the following clips have been shown during the tutorial:

    1. Stadtwirt Bremen
    2. Kaisersaal Residenz Würzburg in 3D
    3. Radiohead House of Cards
    4. Video Crash
    5. Video NoCrash
    6. Animation of Globally Consistent Scan Matching (church Horn, Austria)
    7. Animation of Globally Consistent Scan Matching (market place Horn, Austria)
    8. Animation of Globally Consistent Scan Matching (church Horn Austria, top view)
    9. Urban 6DoF SLAM
    10. Automatic Markerless 3D Scanning in Ostia Antica
    11. Computing Precise Maximum Likelihood 3D Point Clouds from Mobile Laser Scanning
    12. Indoor Mapping Using a Backpack Mounted 3D Range Sensor
    13. Geometry Reconstruction

    Recently, 3D point cloud processing became popular in the robotics community due to the appearance of the Microsoft kinect camera and other low cost sensors. The kinect is a structured light laser scanner that obtains a colored 3D point cloud also called RGB-D image, with more than 300000 points at a frame rate of 30Hz. The optimal range of the kinect camera is 1.2 to 3.5m and is well suited for indoor robotics in office or kitchen-like environments. Besides the boost of 3D point cloud processing through the kinect, the field of professional 3D laser scanning has advanced.

    Light Detection and Ranging (LiDAR) is a technology for three-dimensional measurement of object surfaces. Aerial LiDAR has been used for over a decade to acquire highly reliable and accurate measurements of the earth's surface. In the past few years, terrestrial LiDAR systems were produced by a small number of manufacturers. When paired with classical surveying, terrestrial LiDAR delivers accurately referenced geo-data. Typical laser scanners gage up to 1 million 3D points per second of the surrounding with millimeter accuracy.

    The objective of the tutorial is to present data structures used for state of the art 3D scanning technology. Efficient data processing is a key issue of processing of large scale 3D point clouds. Scenes scanned with LiDARs contain often millions to billions of 3D points. The goal of the tutorial is to give an overview of existing techniques and enable field roboticists to use recent methods and implementations, such as 3DTK – The 3D Toolkit and the Las Vegas Reconstruction Toolkit. The focus of this tutorial lies on point cloud data structures and their implementation in C/C++, i.e., we discuss, range images, octrees, k-d trees in detail. In addition, data structures for the marching cubes algorithm and for meshing methods will play a central role. We create reference material for the participants for subtopics like 3D point cloud registration and SLAM, calibration, filtering, segmentation, meshing, and large scale surface reconstruction.

    To achieve the objectives and to gain hands-on experiences on the problems occurring, when trying to process large-scale 3D point clouds, the tutorial consists of presentations, software demonstrations and software trials. To this end, participants have to bring their Linux, MacOS or Windows laptops to run the provided virtual machine.


    Prof. Dr. Andreas Nüchter
    University of Würzburg
    Am Hubland, 97074 Würzburg, Germany

    Dr. Thomas Wiemann
    University of Osnabrück
    Albrechtstr. 28, 49069 Osnabrück, Germany

    List of Topics

    • Introduction and Overview of Laser Scanning Technologies

    A general review of range sensor systems is given in the introduction including triangulation and LiDAR systems.  State of the art in terrestrial large volume data acquisition is to use high resolution LiDAR systems. These sensors emit a focused laser beam in a certain direction and determine the distance to an object by measuring the reflected light. By measuring the time difference between the emitted and measured signal, the distance to an object surface can be calculated. Laser range finders are distinguished by the method used to determine the object's distance.

    Pulsed wave systems emit a short laser flash and measure the time until the reflected signal reaches the sensor. By the constant speed of light, the distance is calculated (time-of-flight method). Since the speed of light is as fast as 300000km/s, a time resolution of just about a few pico seconds (10^-12s) is necessary to reach an accuracy of about 10mm. Depending on the intensity of the laser light, very high maximum ranges (up to 1000m) can be achieved. Besides pulsed laser scanners, systems using continuous light waves exist. They determine the running time of the laser light by measuring the phase shift between the emitted and detected signal. Since the phase shift is only unambiguous in the interval between 0 and 2pi, their maximum range is limited, depending on the wavelength of the laser light. Typical values are about 80m. The following table shows typical technical specifications.

    • Basic Data Structures and Point Cloud Filtering

    In this section we cover the problem of storing the data. Due to large environments, the high amount of data, and the desired precision (millimeter scale), grid based approaches, that are commonly used for planar indoor environments in robotics do not work well. We present different types of range images, an octree (cf. next Figure) with low memory footprint and k-d trees. As the choice of the data structure depends on the algorithms, we identify the algorithmic requirements, and how and which data structures support the tasks.

    • Precise Registration and the SLAM Problem

    After a precise 3D scanner captured its environment the data has to be put into a common coordinate system. Registration aligns this data. Computing precise registrations means solving the SLAM problem. For general 3D point clouds, one has to work with six degrees of freedom. Roughly the approaches can be categorized as follows:

    1. Marker-based registration, which requires the placement of artificial markers in the scene.
    2. Feature-based registration, which uses corresponding natural  features of various types, such as intensity feature points, geometric feature points, feature surfaces, etc.
    3. Geometry-based registration, where the geometry of the full point cloud is used without a-priori correspondences. The Iterative Closest Point (ICP) algorithm is the most prominent representative of this category.

    We present state of the art GraphSLAM methods for high-precise registrations. The figure below shows a scene taken at Domshof Bremen. Left: Visualization of the 3D point cloud. Right: Bird’s eye view. Above:Initial alginment based on rough estimates. Middle: Result after a few iterations of the registration algorithm. Below: Final high precise alginment.

    • Segmentation and Normal Estimation

    Three core components of human perception are grouping, constancy, and contrast effects. Segmentation in robot vision approaches this natural way of observing the world by splitting the point clouds in components with constant attributes, and grouping them together. On range images, a few existing image based segmentation methods can be applied. Pure point cloud segmentation methods rely only on geometry information. As the local geometry of a point cloud is described by surface normals, we presents methods for computing these normals efficiently.

    • Mesh Generation and Polygonal Robot Map Generation

    Three dimensional environment representations play an important role in modern robotic applications. They can be used as maps for localization and obstacle avoidance as well as environment models in robotic simulators or for visualization in HRI contexts, e.g., tele-operation in rescue scenarios. For mapping purposes, when building high resolution maps of large environments, the huge amount of data points poses a problem. A common approach to overcome the drawbacks of raw point cloud data is to compute polygonal surface representations of the scanned environments. Polygonal maps are compact, thus memory efficient, and, being continuous surface descriptions, offer a way to handle the discretization problem of point cloud data.

    In the context of mobile robotics, polygonal environment maps offer great potential for applications ranging from usage in simulators, virtual environment modeling for tele-operation to robot localization by generating virtual scans via ray tracing. Integrating color information from camera images or point cloud data directly adds an additional modality to the representation and offers the possibility to fuse a compact geometric representation with local color information. However, creating polygonal environment maps based on laser scan data manually is a tedious job, hence we will present methods to automatically compute polygonal maps for robotic applications that are implemented into the Las Vegas Surface Reconstruction Toolkit. We will demonstrate several use-cases for such maps like using them as environment maps in Gazebo or for generation of synthetic point cloud data for localization purposes using ray tracing. An example of an automatically generated textured polygonal map is presented in the following Figure.

    • Semantic 3D Mapping

    A recent trend in the robotics community is semantic perception, mapping and exploration, which is driven by scanning scenes with RGB-D sensors. A semantic map for a mobile robot is a map that contains, in addition to spatial information about the environment, assignments of mapped features to entities of known classes.  Further knowledge about these entities, independent of the map contents, is available for reasoning in some knowledge base with an associated reasoning engine. In this part of the tutorial we explore, how background knowledge gives a boost to model-based object recognition in large-scale 3D laser data.