
Efficient Point-Based Model Generation and Rendering

Project partners: CG-TÜ, CG-UL, VIS-S, VIS-KN, MPI

Rendering large, highly detailed scenes can be accelerated by orders of magnitude with point-rendering techniques such as those developed at the University of Tübingen. This makes interactive rendering of extremely large scenes possible; at high rendering quality, however, interactive frame rates have so far only been achieved on relatively small displays. This subproject investigates techniques that extend known point-rendering methods so that they can also be used on large and very large displays.

Frames from animations rendered with point-based multiresolution techniques: a scene with 10^15 triangles at 10 frames per second (left) and one with 40 million triangles at 8 frames per second

Purely point-based model generation has proven only partially practical for high-resolution displays, so a hybrid data structure is now used that allows the rendering to be adapted to the situation. This appears to be a promising approach for the applications now planned. A parallel octree algorithm enables efficient storage and processing of scenes whose size is limited only by background storage.

Long-term goals

Point-based graphics has become a major trend in the computer graphics community. Real-world scene acquisition systems such as laser scanners produce huge amounts of point data. This data is typically incomplete, very noisy, and contains a large number of outliers, and the data sets are often too big to fit into main memory. This project addresses the challenge of visualizing such data sets in real time. Software for visualizing large laser-scanned environments on gigapixel displays requires non-traditional approaches. The main goals of this subproject are: 1) development of a GPU-friendly data representation suitable for real-time rendering on gigapixel displays; 2) development of software capable of pre-processing raw data and rendering it in real time. The software needs the following features: 1) the ability to handle data sets limited only by background storage size; 2) parallel pre-processing that fully exploits all cluster nodes driving the display; 3) real-time rendering of acquired data sets at very high resolution. In addition, perception-driven rendering algorithms will be developed and a head-tracking system will be implemented.

State of the art

The problem of visualizing large point clouds was targeted in [Wand-2004]. The author developed a software framework for out-of-core point cloud visualization. Thanks to a multiresolution representation, the speed of the rendering algorithm was independent of the data size; however, the direct point rendering used in this algorithm produces aliasing artifacts, and the rendering speed depends on the number of screen-projected points. For gigapixel displays such an approach is not desirable when real-time rendering has to be maintained. In [Schnabel-2007] and [Wahl-2005], the authors describe shape detection in point clouds. Their method extracts basic shapes such as spheres, planes, cylinders, and tori from a noisy point cloud. Applications of this method include reverse engineering, meshing, surface compression, simplification, and hybrid rendering. Out-of-core algorithms for triangle meshes are described in [Correa-2002], [Baxter-2002], and [Varadhan-2002]. Surface reconstruction from point data is described in [Hoppe-1994], [Jenke-2006], and [Jenke-2007]. Unfortunately, current state-of-the-art point cloud triangulation algorithms cannot be applied successfully in this project, because the data we have to deal with contains a huge number of outliers and a significant amount of noise, is incomplete, and often exceeds RAM size. Methods for adding detail to geometry using displacement mapping are described in [Hirche-2004], [Babound-2005], and [Policarpo-2005]. Texture compression is a very important part of this project. In [Stroem-2005] the authors achieve better quality than the standard S3TC method supported by current graphics hardware at a compression ratio of 4 bits per pixel, which is still not compact enough for the purposes of this project. In [Beers-1996] the authors describe a method based on vector quantization; it was implemented in this project but, due to its poor quality, cannot be used here.

Status of our own work

The Powerwall, a tiled display system, was built during this project. It consists of sixteen 30" monitors arranged in a 4x4 array and a cluster of 16 powerful PCs. Each monitor is driven by a dedicated computer in order to achieve maximum rendering performance; the total resolution of the display is 64 megapixels. As mentioned before, the speed of direct point cloud rendering is limited by the number of points projected onto the screen. To overcome this limitation, the point cloud has to be converted into a GPU-friendly representation suitable for high-resolution displays. A completely new software system is being written to meet the project requirements; this section describes the current software system from an algorithmic point of view. To handle out-of-core data sets, new algorithms and data structures have been implemented. The most important part of our rendering software is an out-of-core octree (OCTree). According to [Correa-2002], [Baxter-2002], and [Varadhan-2002], this data structure is the most suitable for out-of-core rendering. The OCTree is a spatial subdivision structure used for the following purposes: 1) viewport culling, applied per display to increase rendering speed and avoid unnecessary data transfers; 2) multi-resolution rendering, where inner nodes of the OCTree contain different levels of detail that can be drawn faster. Our software can build OCTrees for data sets of any size using only a limited amount of RAM. For huge data sets the number of disk IO operations can become high; to solve this problem, parallel OCTree construction has been implemented. Each computer builds only its part of the tree, and merging the trees is mostly sub-tree linking with a very small number of additional splits. The figures show partial and merged OCTrees.

Parallel OCTree construction: (a) and (b) – partial OCTrees for 2 machines, (c) – merged OCTree.
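The build-and-merge idea described above can be sketched in a few lines. This is a hypothetical illustration, not the project's actual code: the sparse child map, the leaf capacity, and the merge routine are all assumptions made for the sketch.

```python
# Sketch of parallel octree construction: each machine builds a tree over
# its own subset of the points (with identical root bounds), and merging
# mostly links disjoint subtrees. All names/parameters are illustrative.

class Node:
    def __init__(self):
        self.points = []    # points stored in this leaf
        self.children = {}  # octant index (0..7) -> Node, sparse

def octant(center, p):
    """Index 0..7 of the octant of `center` that contains point `p`."""
    return (p[0] >= center[0]) + 2 * (p[1] >= center[1]) + 4 * (p[2] >= center[2])

def build(points, center, half, cap=4, depth=0, max_depth=10):
    """Build an octree over `points` inside the cube (center, half)."""
    node = Node()
    if len(points) <= cap or depth >= max_depth:
        node.points = list(points)
        return node
    buckets = {}
    for p in points:
        buckets.setdefault(octant(center, p), []).append(p)
    q = half / 2.0
    for i, pts in buckets.items():
        child_center = (center[0] + (q if i & 1 else -q),
                        center[1] + (q if i & 2 else -q),
                        center[2] + (q if i & 4 else -q))
        node.children[i] = build(pts, child_center, q, cap, depth + 1, max_depth)
    return node

def merge(a, b):
    """Merge partial tree `b` into `a`: disjoint subtrees are simply
    linked; cells populated by both trees are merged recursively (the
    rare 'additional splits' mentioned in the text)."""
    if a is None:
        return b
    if b is None:
        return a
    a.points.extend(b.points)  # a real system would re-split mixed nodes
    for i, child in b.children.items():
        a.children[i] = merge(a.children.get(i), child)
    return a
```

A production version would re-split nodes that end up holding both points and children; the sketch keeps them mixed for brevity.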

Data obtained directly from laser scanners is not suitable for direct rendering. Rendering the data as points would require filling the whole screen (at least 4 Mpixels), which cannot be done in real time on current graphics hardware; splatting techniques do not help in this case either.

Our software converts point data into a representation that is better suited to rendering on GPUs driving a high-resolution display. The system detects GPU-friendly shapes in the data set using the RANSAC algorithm, which can detect primitives such as planes, spheres, cones, and cylinders; only planes are supported by the current version of the software. On the GPU a texel is processed much faster than a vertex, in practice by a factor of about 20. This fact is exploited to render textured primitives much faster than the vertex cloud forming them. Primitive detection is parallelized by dividing the set of OCTree cells among the cluster nodes. To speed up primitive detection, normal vectors for all points are approximated using PCA.
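As an illustration of the plane-detection step, the following is a minimal RANSAC plane fit. The tolerance, iteration count, and data layout are assumptions for the sketch; the project's actual implementation (with PCA normals and per-cell parallelization) is considerably more involved.

```python
import random

def fit_plane(p0, p1, p2):
    """Plane through three points: unit normal n and offset d, n.x = d."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm == 0:            # degenerate (collinear) sample
        return None
    n = [c / norm for c in n]
    return n, sum(n[i] * p0[i] for i in range(3))

def ransac_plane(points, eps=0.05, iters=200, seed=1):
    """Return (plane, inliers) for the best plane found by RANSAC."""
    rng = random.Random(seed)
    best, best_in = None, []
    for _ in range(iters):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [p for p in points
                   if abs(sum(n[i] * p[i] for i in range(3)) - d) <= eps]
        if len(inliers) > len(best_in):
            best, best_in = plane, inliers
    return best, best_in
```

The extracted inliers would then be replaced by a textured primitive; the leftover points stay in the point cloud.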

The remaining points that could not be assigned to any primitive are rendered as vertices in OpenGL. For some scenes few primitives can be detected and most of the points are rendered as vertices. This slows down rendering, but a multi-resolution rendering algorithm is applied to the remaining points, which reduces the number of screen-projected points. The multi-resolution representation is used for RANSAC shapes as well. Figure 3.2 shows point clouds after processing with our RANSAC algorithm; Figure 3.3 compares direct point rendering (15 fps) with RANSAC primitive rendering (70-300 fps). Typical rendering speed for our test scenes is 60-500 fps.
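The effect of multi-resolution rendering on the point budget can be illustrated with a simple screen-space-error criterion. This is a generic sketch, not the project's actual heuristic: the pinhole projection model and the one-pixel error threshold are assumptions.

```python
import math

def lod_level(cell_size, distance, fov_y, screen_h, max_level, err_px=1.0):
    """Coarsest octree level whose point spacing projects to <= err_px pixels.

    cell_size: world-space edge length of the octree cell
    distance:  distance from the camera to the cell
    fov_y:     vertical field of view in radians; screen_h in pixels
    """
    # pixels per world unit at this distance (pinhole camera model)
    px_per_unit = screen_h / (2.0 * distance * math.tan(fov_y / 2.0))
    for level in range(max_level + 1):
        spacing = cell_size / (2.0 ** level)  # point spacing at this level
        if spacing * px_per_unit <= err_px:
            return level
    return max_level
```

Distant cells are assigned a low (coarse) level with few points, nearby cells a high one, so the total number of screen-projected points stays roughly bounded regardless of scene size.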

For each plane a texture is synthesized by projecting points onto the surface. The textures are held in texture atlases to increase rendering speed; textures in the atlases are grouped spatially to avoid frequent texture switching and to avoid reloading data from disk in the out-of-core case. Texture atlases occupy a huge amount of video memory and need to be compressed. Standard OpenGL S3TC compression gives a bit rate of 4 bits per pixel, but for out-of-core rendering this still requires a large amount of data transfer from disk to video memory. To reduce this transfer we have developed a special form of texture compression that lowers the bit rate to 1-1.5 bits per pixel; the decompression code is implemented as a fast pixel shader. The texture compression is described later in this document.
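Placing the per-plane textures into an atlas can be sketched with a simple shelf-packing scheme. This is an illustrative assumption, not the project's actual packer; in particular, the spatial grouping mentioned above would require sorting the textures by scene position before packing.

```python
def pack_atlas(sizes, atlas_w):
    """Shelf packing: place rectangles left to right, opening a new shelf
    when a rectangle no longer fits on the current one.

    sizes: list of (w, h) rectangles with w <= atlas_w
    Returns (positions, atlas_height) with one (x, y) per rectangle."""
    positions = []
    x = y = shelf_h = 0
    for w, h in sizes:
        if x + w > atlas_w:          # current shelf full: start a new one
            y += shelf_h
            x, shelf_h = 0, 0
        positions.append((x, y))
        x += w
        shelf_h = max(shelf_h, h)    # shelf is as tall as its tallest texture
    return positions, y + shelf_h
```

Keeping spatially adjacent planes on the same shelf means one atlas page, and hence one texture bind and one disk read, serves a whole neighborhood of the scene.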

To render huge scenes, an out-of-core rendering algorithm has been implemented. The implementation is based on two caches, one for geometry and one for textures, and OCTree cells are loaded from disk on demand. There are two threads: one for rendering and one for reading from disk (the IO thread). Two heuristics predict the visible set for the next frame. Since the user can look around arbitrarily, an extended viewport is defined: the IO thread loads data for the primary viewport first and then, if there is time left, fetches data contained in the extended viewport. This pre-loads OCTree cells that will become visible when the user looks around horizontally or vertically. The second heuristic predicts when the level of detail of an OCTree cell is about to change and pre-fetches the required data when the user moves the camera forward or backward. For tiled displays, the main viewport is split and every computer renders to its own viewport.
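The two-cache, prefetching design can be sketched as follows. The LRU eviction policy, the cell identifiers, and the `fetch_order` helper are assumptions made for the sketch, not the project's actual cache implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache for octree cells (geometry or textures)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)      # mark as recently used
            return self.items[key]
        return None                          # cache miss: IO thread must load

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used

def fetch_order(main_viewport, extended_viewport, cached):
    """IO-thread priority: cells visible now come first, then cells that
    are only in the extended viewport (likely visible after the user
    turns the camera)."""
    order = [c for c in main_viewport if c not in cached]
    order += [c for c in extended_viewport
              if c not in main_viewport and c not in cached]
    return order
```

The rendering thread only ever asks the caches; all disk traffic happens on the IO thread in the order `fetch_order` produces.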

Displacement mapping is used to add detail to the shapes detected by the RANSAC algorithm. Since RANSAC extracts all points that lie within a certain distance of a shape and converts them to a primitive, some detail might get lost. Per-pixel displacement mapping uses height-map textures to offset texel values from the shape. Figure 3.5 compares rendering with and without displacement mapping. Our implementation is based on GPU ray-casting through height-map textures.

RANSAC reconstruction from point clouds

Direct point rendering at 15 fps (a) and RANSAC rendering at 70-300 fps

Example texture atlas

Scene rendered with RANSAC planes (a) and with RANSAC planes plus per-pixel displacement mapping (b)
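The height-map ray-casting behind the displacement mapping can be illustrated in one dimension. The linear-search step count and the depth convention (depth 0 at the primitive's surface, 1 at maximum displacement) are assumptions for this sketch.

```python
def relief_intersect(height, u0, du, steps=64):
    """Linear search for the hit of a view ray with a height field, as in
    per-pixel relief/displacement mapping, reduced to 1D for clarity.

    height(u): displacement in [0, 1] at texture coordinate u
    u0:        texture coordinate where the ray enters the primitive
    du:        change of u per unit of depth along the ray
    Returns the texture coordinate of the intersection."""
    step = 1.0 / steps
    depth, u = 0.0, u0
    for _ in range(steps):
        if depth >= 1.0 - height(u):   # ray is at or below the surface: hit
            return u
        depth += step
        u += du * step
    return u  # no hit within the layer: return the deepest sample
```

The GPU version performs the same march in a pixel shader over a 2D height-map texture, typically followed by a short binary-search refinement of the hit point.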

Since our research showed that the fastest rendering is achieved with textured polygons, and the texture data can be practically unlimited in size, a general-purpose texture compression scheme is being developed. Beyond this project, many other computer graphics applications, such as geographic information systems and flight simulators, require huge amounts of pixel data to be rendered in real time. Most image compression methods (such as JPEG) are not suitable for real-time rendering because they do not provide random access to pixel data.

Existing texture compression schemes that provide random access to pixels have a low compression ratio and do not exploit self-similarities within images to improve quality. Our texture compression/decompression scheme is able to handle extreme amounts of image data and display them in real time. The method used in this project is based on residual vector quantization, which allows data to be decompressed at a very high rate; compression is a time-consuming pre-processing step. During this sub-project a complete software framework has been developed. The decompression code is implemented as a pixel shader, which allows us to decompress huge amounts of image data in real time; decompression speeds achieved by our software on GeForce 8800 cards reach 800 million pixels per second. During rendering, image data is decompressed on the fly and applied to any kind of geometry. The current code supports mip-mapping and uses hardware texture filtering.
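Residual vector quantization itself is simple to state: each stage quantizes the residual left by the previous stages, and decoding is just a sum of one codeword per stage. The sketch below uses tiny hand-made codebooks as an assumption; the real system learns codebooks from the image data and runs the decoder in a pixel shader.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))

def rvq_encode(codebooks, v):
    """Residual VQ: each stage quantizes what the previous stages missed."""
    indices, residual = [], list(v)
    for cb in codebooks:
        i = nearest(cb, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, cb[i])]
    return indices

def rvq_decode(codebooks, indices):
    """Decoding sums one codeword per stage -- cheap enough to run
    per texel in a pixel shader."""
    out = [0.0] * len(codebooks[0][0])
    for cb, i in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, cb[i])]
    return out
```

With small per-stage codebooks, each pixel block is stored as a handful of indices, which is how bit rates well below S3TC's 4 bits per pixel become possible while decoding stays a few table lookups and additions.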

The compression algorithm was implemented as a multi-threaded, distributed program; it can utilize as many computers as are available, and all CPUs within each node. To process really huge images, the NEC SX-8 supercomputer at HLRS in Stuttgart is used in this project. As a demonstration, we have compressed a 6-gigapixel aerial image to about 750 megabytes (1 bit per pixel) and use it as a texture in real-time rendering; the user can zoom, pan, and rotate the image in real time. Such fast display of pictures this large was not possible with any existing image viewer.

Image of a part of Germany (6400 square kilometers with 1 meter resolution)


[Wand-2004] M. Wand, "Point-Based Multi-Resolution Rendering", PhD thesis, 2004

[Hoppe-1994] H. Hoppe, "Surface Reconstruction from Unorganized Points", PhD thesis, 1994

[Baxter-2002] W. V. Baxter III, A. Sud, N. K. Govindaraju, D. Manocha, "GigaWalk: Interactive Walkthrough of Complex Environments", 2002

[Correa-2002] W. T. Correa, J. T. Klosowski, C. T. Silva, "iWalk: Interactive Out-Of-Core Rendering of Large Models", 2002

[Varadhan-2002] G. Varadhan, D. Manocha, "Out-of-Core Rendering of Massive Geometric Environments", 2002

[Schnabel-2007] R. Schnabel, R. Wahl, R. Klein, "Efficient RANSAC for Point-Cloud Shape Detection", 2007

[Wahl-2005] R. Wahl, M. Guthe, R. Klein, "Identifying Planes in Point Clouds for Efficient Hybrid Rendering", 2005

[Jenke-2006] P. Jenke, M. Wand, M. Bokeloh, A. Schilling, W. Straßer, "Bayesian Point Cloud Reconstruction", 2006

[Jenke-2007] P. Jenke, M. Wand, W. Straßer, "Efficient Surface Reconstruction for Piecewise Smooth Objects", 2007

[Hirche-2004] J. Hirche, A. Ehlert, S. Guthe, M. Doggett, "Hardware Accelerated Per-Pixel Displacement Mapping", 2004

[Babound-2005] L. Babound, X. Decoret, "Rendering Geometry with Relief Textures", 2005

[Policarpo-2005] F. Policarpo, M. Oliveira, J. L. D. Comba, "Real-Time Relief Mapping on Arbitrary Polygonal Surfaces", 2005

[Stroem-2005] J. Stroem, T. Akenine-Moeller, "iPACKMAN: High-Quality, Low-Complexity Texture Compression for Mobile Phones", 2005

[Beers-1996] A. C. Beers, M. Agrawala, N. Chaddha, "Rendering from Compressed Textures", 1996

Other research achievements (patents/talks/exhibitions etc.)

08.11.2007: Talk at VMV 2007 as part of the BW-FIT colloquium (Gigapixel session)

Other activities within the project network

The Display Wall is being used in Gigapixel subproject 4 (HR-Video für Großdisplays). Concerning attention-controlled rendering, we are collaborating with subproject 6 (Aufmerksamkeitsgesteuerte Darstellungsalgorithmen).


Prof. Dr. Andreas Schilling (CG-TÜ)
Arbeitsbereich Graphisch-Interaktive Systeme
Universität Tübingen
Sand 14, 72076 Tübingen
Tel: 07071/29-75462
Fax: 07071/29-5466