1. Introduction:
Recent hardware advances have reduced the cost of laser range scanners to the point where these previously cost-prohibitive devices are seeing widespread use in a number of different fields. Unfortunately, from the perspective of current software, these advances have also enabled the creation of scans composed of millions of individual distance readings in a matter of minutes. Such vast quantities of data easily overwhelm systems that were not explicitly designed to handle them. A number of new techniques for merging multiple range images together and extracting some form of polygonal surface model have been developed to address these difficulties. This paper examines one particular class of such techniques, called volumetric reconstruction, in detail and proposes several key optimizations that increase computational efficiency and enhance output quality. These improvements center on the use of Euclidean distance measurements during distance field construction. This paper suggests how to use these distances selectively in order to minimize the additional memory and computational costs that they incur. It presents an efficient algorithm for calculating the Euclidean distance between a point and a range surface. Finally, it proposes a new interpolation approach, based on distance vectors, that improves model quality in regions of high curvature. Although these improvements can be applied individually, they are designed to work together in order to provide a higher quality reconstruction than traditional methods without sacrificing overall efficiency.
2. Background Information:
Range Data:
Laser range scanners produce range data sets (also referred to as range scans) – sets of depth measurements sampled on a regular lattice. Tessellating adjacent data points of a range scan into regular triangular arrangements produces a range surface model of the scanned object. This process is called triangulation, and for a single range scan it can be performed extremely efficiently because adjacency relationships between data points are known in advance due to their regular arrangement. One additional complication occurs when a part of the scanned surface is hidden from the scanner's view. Triangulation over these occluded regions is often undesirable and can be avoided by setting a threshold on the maximum difference in range measurements between adjacent data points. This will produce holes in the resulting surface model, which may introduce further problems, depending on the application domain. Although this simple approach to surface reconstruction is often sufficient, a surface model composed of multiple range scans from different points of view is frequently needed. For example, it is impossible to construct a complete three-dimensional surface model of something even as simple as a sphere from a single range scan taken from a fixed point of view. The hemisphere that falls into the scanner's field of view will inevitably occlude the opposite one.
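As a concrete illustration of this triangulation step, consider the following minimal sketch in Python. It assumes a two-dimensional NumPy array of range readings and a hypothetical max_jump threshold; both names are illustrative, not part of any published method.

    import numpy as np

    def triangulate_range_scan(depth, max_jump):
        """Tessellate a regular lattice of range readings into triangles,
        skipping any triangle whose readings differ by more than max_jump
        (a likely occlusion boundary, which is left as a hole)."""
        rows, cols = depth.shape
        triangles = []
        for i in range(rows - 1):
            for j in range(cols - 1):
                # Two triangles per grid cell; adjacency is known in advance
                # from the regular arrangement of the samples.
                for tri in (((i, j), (i, j + 1), (i + 1, j)),
                            ((i, j + 1), (i + 1, j + 1), (i + 1, j))):
                    d = [depth[p] for p in tri]
                    if max(d) - min(d) <= max_jump:
                        triangles.append(tri)
        return triangles

    # Example: a 3x3 scan with one depth discontinuity along its right edge.
    scan = np.array([[1.0, 1.0, 9.0],
                     [1.0, 1.1, 9.0],
                     [1.0, 1.1, 9.1]])
    mesh = triangulate_range_scan(scan, max_jump=0.5)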
The process of incorporating, or registering, multiple range scans together for the purpose of surface reconstruction presents a non-trivial challenge. Adjacency relationships between data points in the same scan are obvious. However, adjacency relationships between data points in different scans are not known in advance. Several methods for determining these relationships exist. For example, they can be readily calculated if the genus of the scanned object is known. But any solution that assumes that the scanned surface is always homeomorphic to a particular type is, by nature, extremely limited in scope.
Some other approaches to registration/reconstruction take the opposite route and attempt to compute connectivity information directly from the point cloud formed by merging all the data points into the same reference frame. (Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDonald, and Werner Stuetzle, 1992) (Chandrajit Bajaj, Fausto Bernardini, and Guoliang Xu, 1995) (Nina Amenta, Sunghee Choi, Tamal Dey, and Naveen Leekha, 2000) (Jean-Daniel Boissonnat and Frédéric Cazals, 2000) (Tamal Dey, Joachim Giesen, and James Hudson, 2001) Although very general in nature, these methods suffer from extreme combinatorial complexity, which makes them unsuitable for all but the most modest of tasks.
Currently, only those approaches that leverage the connectivity information inherent in individual scan images remain sufficiently general and efficient for widespread application. (Greg Turk and Mark Levoy, 1994) (Mark Soucy and Denis Laurendeau, 1995) (Brian Curless and Mark Levoy, 1996)
Beyond the primary task of figuring out how the individual scans fit together to form the surface model, registration/reconstruction algorithms must overcome a number of additional difficulties, as outlined by Curless and Levoy (1996). Multiple scans may partially or completely overlap. Surface patches that contain this redundant information must be processed correctly. Error distributions for range scan data are often heavily influenced by the incidence angle of the laser beam to the surface. Generally, the most accurate readings are produced when the laser beam is parallel to the surface normal at a data point, i.e., at normal incidence. Therefore, redundant data sets must be identified, properly weighted in accordance with their degree-of-confidence metrics, and then interpolated together to achieve the best possible output quality-to-size ratio. Overlap between range scans is not the only source of redundancy in a data set. Redundancy may also be present in surface regions where no scan overlap occurs. Laser range scanners sample on a regular grid with no consideration for the nature of the scanned surface. The scan resolution is not dynamically adjusted according to the local feature size. This can lead to severe over-sampling in regions of relatively little interest. This type of redundancy must also be detected and eliminated. Physical limitations of scanning technology can prevent complete scan coverage of an object even if multiple scans are composed together. Some surface patches will remain occluded regardless of the scanner position. The holes formed in the surface by these occlusions must be handled in a plausible manner.
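One common realization of such a degree-of-confidence metric, used for instance by Curless and Levoy, weights each reading by the cosine of the angle between the surface normal and the direction back toward the scanner. A minimal sketch, with illustrative names:

    import numpy as np

    def confidence_weight(normal, toward_scanner):
        """Weight a range reading by the cosine of the angle between the
        surface normal and the direction back toward the scanner; readings
        taken at grazing incidence receive weights near zero."""
        n = np.asarray(normal, dtype=float)
        v = np.asarray(toward_scanner, dtype=float)
        cos_angle = np.dot(n, v) / (np.linalg.norm(n) * np.linalg.norm(v))
        return max(0.0, float(cos_angle))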
Beyond these basic requirements, a number of additional considerations should influence the design of a registration/reconstruction algorithm. Modern scanning hardware can produce a scan composed of millions of data readings in a matter of minutes. Furthermore, a number of such scans may be required to fully model a complex object. Incremental processing is frequently the only option for registering a data set of this size. Instead of all the data being processed at once, individual range scans are registered one at a time. Each intermediate step produces an entirely valid representation that can be processed by the reconstruction algorithm for a partial result. This approach has two primary advantages. At any moment in time, only a fraction of the entire data set is loaded and operated on, which correspondingly requires only a fraction of the resources that processing the entire set at once would. The overall process can also be parallelized easily for additional efficiency. Although incremental processing has many attractive features, it imposes an order-invariance limitation on the actual registration algorithm: the sequence in which the scans are composed must not have a significant impact on the overall output.
Distance Fields:
Registration of multiple range images can be viewed as the process of approximating a set of separate surface functions with a single, higher-order surface function. These surfaces may be either in parametric or in implicit form. Each representation type has its particular advantages. Parametric representations, such as polygonal meshes, allow efficient enumeration of surface points while implicit representations, such as distance fields, allow efficient surface membership tests for arbitrary points. Because implicit functions provide information about arbitrary points, and not just those that are explicitly a part of the surface, they are more suitable for establishing surface patch connectivity information during registration.
A distance field of a surface is an implicit function that specifies the distance from any point in space to that surface. The sign of a distance value is used to distinguish between the inside and the outside of the surface while its magnitude can represent different notions of distance. Some of the common distance field representations use line-of-sight, projected, and Euclidean distances. The relative merits of these different distance definitions are discussed in detail below. Different types of distance fields have been used for modeling in a number of different application domains. (Hoppe et al., 1992) (Curless and Levoy, 1996) (Mark Wheeler, Yoichi Sato, and Katsushi Ikeuchi, 1998)
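To make the definition concrete, here is a minimal sketch for the simplest possible case, a sphere, where the signed Euclidean distance is available in closed form:

    import numpy as np

    def sphere_signed_distance(point, center, radius):
        """Signed Euclidean distance from an arbitrary point to a sphere:
        negative inside, positive outside, zero exactly on the surface."""
        offset = np.asarray(point, dtype=float) - np.asarray(center, dtype=float)
        return float(np.linalg.norm(offset)) - radius

    # The sign gives a constant-time surface membership test for any query point.
    print(sphere_signed_distance((0.5, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0))  # -0.5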
Distance fields can be used for volumetric registration of range data. (Curless and Levoy, 1996) This approach starts by converting a set of range images into a set of range surfaces. Each parametric surface is then transformed into a corresponding distance field. This is accomplished by draping a regular, three-dimensional grid over the polygonal mesh and sampling distance values to the surface at its nodes. These readings are weighted in accordance with the error metrics specific to the corresponding scan. Distance measurements to surface patches whose normal vectors have a high incidence angle with the scanner's line of sight are down-weighted. The individual distance fields are then composed into a collective distance field that covers the entire scanned surface. Weighted interpolation is used to reconcile conflicting information in the regions where the partial fields overlap.
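The weighted interpolation can be realized as a running weighted average, in the spirit of the cumulative update rule described by Curless and Levoy. A minimal sketch, assuming grid-shaped NumPy arrays: D and W hold the accumulated distances and weights, d and w hold one new scan's samples (all names are illustrative):

    import numpy as np

    def merge_scan(D, W, d, w):
        """Fold one scan's sampled distances d (with per-sample weights w)
        into the cumulative field D and its accumulated weights W. The
        running weighted average is commutative and associative, which is
        what makes incremental, order-invariant registration possible."""
        W_new = W + w
        D_new = D.copy()
        mask = W_new > 0                    # skip nodes no scan has reached
        D_new[mask] = (W[mask] * D[mask] + w[mask] * d[mask]) / W_new[mask]
        return D_new, W_new

Because each scan contributes through the same symmetric update, scans can be merged one at a time and in any order, which is precisely the incremental, order-invariant behavior described earlier.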
Although effective, this approach suffers from a number of serious limitations. The sampling density of the distance field has a significant impact on the quality of reconstruction. Accurate representation of small details, sharp corners, and other similar surface features may require high sampling densities. Otherwise, detrimental artifacts such as aliasing will form in these regions. Typically, only a small fraction of the overall volume may be affected, but if regular sampling is used, the entire distance field must be recomputed at a higher resolution in order to accommodate these high-frequency signals. This practice is wasteful because a lower sampling density may be sufficient for most of the surface. Because the representation size of a regular three-dimensional grid is a cubic function of its resolution, the practice is also prohibitively expensive in terms of storage and processing time.
Adaptive Distance Fields:
The concept of adaptive distance fields was developed by Frisken, Perry, Rockwood, and Jones (2000) to efficiently address the limitations of traditional distance fields. Adaptive distance fields rely on some form of adaptive sampling to dynamically adjust representation quality in different regions of the field as needed. The resolution at which a particular region is sampled corresponds to the relative level of interest or difficulty that the region presents to the reconstruction algorithm. This pattern of behavior can be achieved by sampling the distance field on an irregular grid formed by the vertices of an octree data structure instead of a regular three-dimensional grid.
Octrees:
Octrees are simple and versatile data structures with numerous applications in computer graphics and computational geometry. They provide a convenient way of indexing objects in a three-dimensional coordinate space. The entire space is modeled as an iso-cuboid cell that serves as the tree's root. Data items are inserted into the root until some partitioning predicate is met. The root cell is then split into eight identical child cells and its data items are assigned to the children on the basis of their spatial coordinates. The partitioning process repeats until no additional data items remain or until maximum resolution is reached. Non-point data items that would occupy several adjacent child cells can either be subdivided in such a way that each division can be inserted into a unique child, or they can remain at the lowest level of the hierarchy that does not mandate such subdivision. The former approach results in faster query times while the latter conserves memory.
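A minimal point octree illustrating this behavior. The partitioning predicate used here, a fixed item count per cell, is one arbitrary choice among many, and all constants are illustrative:

    class OctreeNode:
        MAX_ITEMS = 8       # partitioning predicate: split past this count
        MAX_DEPTH = 10      # maximum resolution

        def __init__(self, center, half_size, depth=0):
            self.center, self.half_size, self.depth = center, half_size, depth
            self.items = []
            self.children = None

        def insert(self, point):
            if self.children is not None:
                self.children[self._octant(point)].insert(point)
            else:
                self.items.append(point)
                if len(self.items) > self.MAX_ITEMS and self.depth < self.MAX_DEPTH:
                    self._split()

        def _octant(self, p):
            # Child index from the point's position relative to the centre.
            return (int(p[0] >= self.center[0])
                    | (int(p[1] >= self.center[1]) << 1)
                    | (int(p[2] >= self.center[2]) << 2))

        def _split(self):
            # Split into eight identical child cells and push the items down.
            h = self.half_size / 2.0
            self.children = [
                OctreeNode(tuple(self.center[a] + (h if (i >> a) & 1 else -h)
                                 for a in range(3)), h, self.depth + 1)
                for i in range(8)]
            for p in self.items:
                self.children[self._octant(p)].insert(p)
            self.items = []

    root = OctreeNode(center=(0.0, 0.0, 0.0), half_size=1.0)
    root.insert((0.25, -0.5, 0.75))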
Jane Wilhelms and Allen Van Gelder (1992) observed that the basic approach of always splitting the cell into eight equal parts produces poor branch-to-leaf ratios in cases where resolutions along each axis either differ by significant amounts or are not powers of two. Under these conditions, resolution along one or two of the smaller axes can be quickly exhausted, forcing cells to partition into fewer than eight children. Such splits are inefficient, as they reduce the affected octree branches to quad- or even binary trees at the lower levels of the hierarchy. The branch-on-need partitioning strategy was proposed to address this limitation. At the cost of a few additional calculations, each split is positioned in a way that keeps the overall branch-to-leaf ratio of the octree as close to optimal as possible. This yields a compact representation that handles queries more efficiently.
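One way to realize this placement, sketched per axis under the assumption of an axis of n samples: give the lower child the largest power of two smaller than n, so that most of the tree remains a full, regular octree.

    def branch_on_need_split(n):
        """Split point for an axis of n cells: the lower child takes the
        largest power of two smaller than n. For n = 5 this yields 4 + 1
        rather than 3 + 2, keeping the lower branch a full subtree."""
        if n <= 1:
            return 0        # axis exhausted; do not split along it
        p = 1
        while p * 2 < n:
            p *= 2
        return p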
Adaptive Distance Fields (cont.):
The simplest octree-based adaptive distance field relies on a tri-color classification of tree cells. (Wheeler et al., 1998) Each cell is categorized as interior, exterior, or boundary with respect to the surface. At each level of the octree hierarchy, all boundary cells are automatically subdivided. Subdivision stops when it encounters an interior or an exterior cell or when the maximum resolution is reached. This strategy is effective because typical surfaces occupy only a small fraction of their bounding volumes. The remaining regions of the distance field are represented with sub-maximum resolution cells.
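A sketch of the tri-color refinement loop, reusing the OctreeNode from the earlier sketch and a user-supplied classify function (for instance, one that tests signed distances at the cell corners); both names are illustrative:

    def refine_tricolor(cell, classify, max_depth):
        """Recursively subdivide only the cells the surface passes through.
        classify(cell) must return 'interior', 'exterior', or 'boundary'."""
        if classify(cell) != 'boundary' or cell.depth >= max_depth:
            return          # interior and exterior cells stay coarse
        cell._split()       # eight children, as in the octree sketch above
        for child in cell.children:
            refine_tricolor(child, classify, max_depth)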
Figure 1: A tri-color quadtree (light) of a surface (dark). Regions around the surface are split to maximum resolution.
In a tri-color octree, each cell bases its decision to split on a binary input signal – the presence or absence of a surface patch. This approach may be further refined to take into account the local feature size of the surface patch. (Ronald Perry and Sarah Frisken, 2001) (Ryusuke Sagawa, Ko Nishino, and Katsushi Ikeuchi, 2001) Although the precise definition of local feature size varies from author to author, the concept is based on the general observation that planar surfaces can be reconstructed with a high degree of accuracy from a small set of distance readings. (Sarah Gibson, 1998) Therefore, cells that contain flat surface patches do not need to be partitioned further. Even high-curvature surfaces that are reasonably smooth can be accurately reconstructed with a small number of subdivisions. Typically, the only surfaces that require a high number of subdivisions are those that contain sharp corners or other similar discontinuities.
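One common formalization of this idea, after Frisken et al.: a cell is flat enough to leave unsplit when the distance interpolated from its eight corner samples predicts the true distance at interior test points within a tolerance. A minimal sketch that tests only the cell centre, where trilinear interpolation reduces to the mean of the corner values (real implementations typically test several interior points):

    import numpy as np

    def needs_split(corner_dists, center_dist, tol):
        """corner_dists: the eight sampled distances at the cell corners.
        Trilinear interpolation at the cell centre is simply their mean;
        if it disagrees with the true centre distance by more than tol,
        the surface inside the cell is too curved and the cell must split."""
        predicted = float(np.mean(corner_dists))
        return abs(predicted - center_dist) > tol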
Figure 2: A true adaptive quadtree (light) of a surface (dark). Regions of high curvature at the corners are split to maximum resolution.