This video describes the fundamentals of digital fringe projection techniques, which provide dense 3D measurements of dynamically changing surfaces. It also demonstrates the design and operation of a high-speed binary defocusing system based on these techniques.
Digital fringe projection (DFP) techniques provide dense 3D measurements of dynamically changing surfaces. Like the human eyes and brain, DFP uses triangulation between matching points in two views of the same scene at different angles to compute depth. However, unlike a stereo-based method, DFP uses a digital video projector to replace one of the cameras1. The projector rapidly projects known sinusoidal patterns onto the subject, and the surface of the subject distorts these patterns in the camera’s field of view. Three distorted patterns (fringe images) from the camera can be used to compute the depth using triangulation.
Compared with other 3D measurement methods, DFP systems tend to be faster, less expensive, more flexible, and easier to develop. DFP systems can also achieve the same measurement resolution as the camera. For these reasons, DFP and other digital structured light techniques have recently been the focus of intense research (as summarized in refs. 1-5). Taking advantage of DFP, the graphics processing unit, and optimized algorithms, we have developed a system capable of 30 Hz 3D video data acquisition, reconstruction, and display for over 300,000 measurement points per frame6,7. Binary defocusing DFP methods can achieve even greater speeds8.
Diverse applications can benefit from DFP techniques. Our collaborators have used our systems for facial function analysis9, facial animation10, cardiac mechanics studies11, and fluid surface measurements, but many other potential applications exist. This video will teach the fundamentals of DFP techniques and illustrate the design and operation of a binary defocusing DFP system.
Digital fringe projection (DFP) techniques are based upon correlation and triangulation between two views of the same scene at different angles, the same principle employed by the human eyes and brain to achieve stereo vision. However, unlike a stereo-based method, DFP uses a digital video projector to replace one of the cameras1. The projector rapidly projects a known sinusoidal pattern onto the object, and the object’s surface distorts this pattern as seen from the camera. Three such distorted patterns (fringe images), phase-shifted relative to each other, can be analyzed to retrieve the depth via triangulation. The use of a known pattern eliminates the difficult computational problem of identifying correspondence points, allowing the capture of depth measurements at the camera resolution. For example, with a 576 x 576 camera, the technique can capture 331,776 points. This allows DFP systems to measure very fine details such as the movement of facial muscles in human emotions.
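The phase retrieval from three fringe images can be sketched per pixel as follows. This is a minimal illustration that assumes the commonly used three-step algorithm with phase shifts of -2π/3, 0, and +2π/3; the text above does not state the shift values, and the function names and the simulated intensities are hypothetical.

```python
import math

def wrapped_phase(i1, i2, i3):
    """Wrapped phase in (-pi, pi] from three intensities whose fringe phase
    is shifted by -2*pi/3, 0, and +2*pi/3 (assumed shift values)."""
    return math.atan2(math.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)

def texture(i1, i2, i3):
    """Average intensity: the fringes cancel, leaving a 2D texture value."""
    return (i1 + i2 + i3) / 3.0

def simulate_pixel(phase, average=128.0, modulation=100.0):
    """Ideal intensities at one pixel for a given fringe phase (test data only)."""
    shift = 2.0 * math.pi / 3.0
    return tuple(average + modulation * math.cos(phase + k * shift)
                 for k in (-1, 0, 1))

# Round trip on a synthetic pixel: the recovered phase matches the input.
true_phase = 1.0
i1, i2, i3 = simulate_pixel(true_phase)
recovered = wrapped_phase(i1, i2, i3)
```

Because the arctangent only returns values within one 2π interval, a complete pipeline would still need to unwrap the phase before converting it to depth.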
3D optical imaging techniques for static or quasi-static events have been extensively studied over the past few decades and have seen great success in video game design, animation, movies, music videos, virtual reality, telesurgery, and many engineering disciplines5. Though numerous 3D profilometry techniques exist, they can be classified into two categories: surface contact methods and surface noncontact methods. Both the coordinate measurement machine (CMM) and the atomic force microscope (AFM) require contact with the measured surface to obtain 3D profiles at high accuracy. This requirement severely restricts the speed of contact methods: they cannot reach kHz measurement speeds with thousands of points per scan.
Surface noncontact techniques typically utilize optical triangulation methods (e.g. stereo vision, spacetime stereo, structured light). By actively projecting known patterns onto the objects, structured light techniques can be used to measure surfaces without strong local texture variations1. Fringe analysis is a special group of structured light techniques that uses sinusoidal structured patterns (also known as fringe patterns). Because these patterns have intensities that vary continuously from point to point in a known manner, they boost the structured light techniques from projector-pixel resolution to camera-pixel resolution12. In the recent past, fringe analysis techniques were instrumental in achieving high-resolution 3D imaging.
The digital fringe projection (DFP) technique uses digital video projectors to generate sinusoidal fringe patterns. This technique has the merits of lower cost, higher speed, and simplicity of development, and it has been a very active research area within the past decade. Recent developments in DFP and similar digital structured light techniques are summarized in refs. 1-5. For high-speed applications, a digital-light-processing (DLP) projector is preferable due to its fundamental operation mechanism. The speed and flexibility of this technique have allowed us to acquire 3D video at 40 Hz (ref. 13) and later at 60 Hz (refs. 6,7).
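As an illustration of the computer-generated patterns such a projector displays, the sketch below builds one row of each of three phase-shifted 8-bit sinusoidal fringe patterns. The fringe period, the image width, and the 2π/3 phase shift are assumptions chosen for the example, not values from the text.

```python
import math

def fringe_row(width, period_px, phase_shift):
    """One row of an 8-bit sinusoidal fringe pattern (integer values 0..255)."""
    return [
        round(127.5 + 127.5 * math.cos(2.0 * math.pi * x / period_px + phase_shift))
        for x in range(width)
    ]

def three_step_rows(width, period_px):
    """Rows of three patterns shifted by -2*pi/3, 0, +2*pi/3 (assumed shifts)."""
    shift = 2.0 * math.pi / 3.0
    return [fringe_row(width, period_px, k * shift) for k in (-1, 0, 1)]

# Hypothetical geometry: 36-pixel-wide rows with an 18-pixel fringe period.
rows = three_step_rows(width=36, period_px=18)
```

Repeating each row down the full image height gives vertical fringes; the three resulting images are what a conventional DFP system would send to the projector.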
Nevertheless, a fundamental speed limit exists for the traditional DFP technique. A DLP projector can only swap color images (8 bits per channel) at its maximum refresh rate (typically 120 Hz). Since the traditional fringe patterns are 8-bit grayscale images, we can encode three of them into one color image as the red, green, and blue color channels. The projector will swap each channel (and therefore each fringe pattern) at three times the refresh rate (typically 360 Hz). However, since each 3D video frame requires three fringe patterns, the maximum rate of 3D video capture is still only the refresh rate (120 Hz)3,14. To break past this hardware limitation, we have invented a modified DFP technique that uses binary defocusing8. Instead of 8-bit grayscale fringe patterns, this technique uses computer-generated 1-bit binary structured patterns. These patterns are defocused using the projector lens to become pseudo-sinusoidal patterns for DFP. Because DLP projectors can display binary images orders of magnitude faster than 8-bit grayscale images, the binary defocusing technology permits 3D video imaging at tens of kilohertz with the same resolution as the conventional DFP techniques15.
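The rate arithmetic in this paragraph can be checked with a few lines of code. The function names are hypothetical, and the 2,000 Hz binary pattern rate in the last line is an assumed example value, not a figure stated in the text.

```python
def max_3d_rate_hz(refresh_rate_hz, patterns_per_color_image=3,
                   patterns_per_3d_frame=3):
    """3D frame rate when fringe patterns are packed into the R, G, B channels."""
    pattern_rate_hz = refresh_rate_hz * patterns_per_color_image
    return pattern_rate_hz / patterns_per_3d_frame

def rate_from_pattern_rate(pattern_rate_hz, patterns_per_3d_frame=3):
    """3D frame rate for any projector/camera pattern rate."""
    return pattern_rate_hz / patterns_per_3d_frame

conventional_hz = max_3d_rate_hz(120)            # 360 patterns/s over 3 per frame
binary_example_hz = rate_from_pattern_rate(2000)  # assumed 2,000 Hz pattern rate
```

With the typical 120 Hz refresh rate, the conventional limit comes out at 120 Hz, as stated above. The assumed 2,000 Hz pattern rate would correspond to roughly 667 3D frames per second.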
The overall goal of the following protocol is to demonstrate the basic implementation and operation of a binary defocusing three-step phase-shifting DFP system. First, the protocol will cover the selection and integration of the necessary components. Then, it will discuss the simplest, most readily accessible method of calibration for the system; more complex calibration methods are available in the literature for specific applications16,17. The protocol will then focus on the procedure for 3D video capture with the system and the process for converting the fringe images into visualized 3D measurements. Finally, we will present some representative results from our real-time and high-speed systems.
1. System Configuration
A schematic of the system is shown in Figure 1.
2. System Calibration
This reference plane calibration is the simplest and most readily accessible method of calibration for the system. Therefore, it is the best for getting started. More accurate calibration methods are available in the literature for specific sinusoidal16 and binary defocusing17 applications. For maximum accuracy, calibration should be performed just before data capture. After calibration, the camera and projector should not be displaced relative to each other.
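One way a reference plane calibration can be used is sketched below. It assumes the common linear approximation in which depth is proportional to the difference between the unwrapped phase on the object and the unwrapped phase on the flat reference plane; the text does not give the conversion, so the scale factor, the step-object calibration, and all names and numbers here are hypothetical.

```python
def calibrate_scale(known_height_mm, phase_on_object, phase_on_reference):
    """Scale factor (mm per radian) from one object of known height.
    Assumes depth is proportional to the phase difference (an approximation)."""
    return known_height_mm / (phase_on_object - phase_on_reference)

def depth_row(phase_object, phase_reference, scale_mm_per_rad):
    """Depth for one image row from unwrapped object and reference phases."""
    return [scale_mm_per_rad * (po - pr)
            for po, pr in zip(phase_object, phase_reference)]

# Hypothetical numbers: a 10 mm step produces a 2.0 rad phase difference.
scale = calibrate_scale(10.0, phase_on_object=5.0, phase_on_reference=3.0)
depths = depth_row([3.0, 3.5, 4.0], [3.0, 3.0, 3.0], scale)
```

The linear model is only a starting point; the more accurate calibration methods cited above replace it with a full camera and projector model.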
3. Data Acquisition
4. Data Analysis and Visualization
With software optimized for speed such as our in-house GUI, this step can take place during data capture. Real-time processing allows the user to see immediately whether the resulting data are suitable for the application and to adjust if necessary. However, post-processing can be more flexible and more accurate. Post-processing is also much simpler to implement and is the best place to begin.
Figure 1 shows the schematic of the system. The high-speed binary defocusing system in this video consists of a Logic PD DLP LightCommander projector and a Phantom v9.1 CMOS camera.
Figure 2 presents a single frame of a human face captured by our real-time 3D system. This system uses a 640 x 480 camera. Because the projected sinusoidal pattern is known, every camera pixel yields a measurement, giving 640 x 480 = 307,200 measurements per frame, enough resolution to record very fine details.
Figure 3 shows an example of measuring human facial expressions in 3D at 60 Hz. Here, four frames selected from a video sequence clearly demonstrate the capability of the real-time system to capture dynamic changes in finely detailed geometry.
Figure 4 demonstrates our live visualization software used in conjunction with our real-time binary defocusing 3D video system. The captured 3D video of the subject is displayed in real time on the computer monitor to his right. This software was written in C++ using the OpenGL library, GLSL, and Qt. The computer used is a Lenovo laptop.
Figure 5 shows 3D frames from live rabbit heart measurement with our newly developed superfast binary defocusing system. This system can record 3D frames at 667 Hz with an image resolution of 576 x 576. A superfast rate is required to measure the heart surface without motion-induced artifacts. The heart measurement research is in collaboration with Prof. Igor Efimov at Washington University-St. Louis (see ref. 11 for further details); note that the rabbit was humanely killed and that the images were taken while the heart was still beating.
Figure 1. Layout of the 3D video imaging system. In this system, a high-speed DLP projector projects three binary dithered phase-shifted images in rapid succession onto the subject. A high-speed CMOS camera is used to capture the three fringe images one by one for computation of the depth.
Figure 2. 3D measurements of a human face at a resolution of 640 x 480, revealing fine details. From left to right: the simultaneously captured texture perfectly aligned with the geometry, a shaded view of the geometry, a wireframe view depicting the density of the points, a close-up view of the nose area, and a close-up view of the eye region.
Figure 3. Four selected frames from 3D video of the formation of a facial expression. The video was captured at 60 Hz with a resolution of 640 x 480. These frames highlight the geometric changes in the woman’s face as she moves from a neutral expression to a smile.
Figure 4. Live 3D video capturing, processing, and rendering. The 3D measurements are displayed in real time on the computer screen to the subject’s right.
Figure 5. Capturing a live rabbit heart with our superfast 3D video imaging system. The heart is beating at approximately 200 beats/min. The 3D capture rate was 166 Hz with an image resolution of 576 x 576. See ref. 11 for further details.
This high-resolution, real-time to superfast 3D video imaging technology is a platform technology that could potentially benefit numerous and diverse scientific fields ranging from biological science to engineering practice. Biomedical applications include precision measurements of facial movements and organ surfaces. Other applications include 3D automated quality control with detection of warped surface features; 3D enhanced videoconferencing; detailed digitization of facial features for movies and videogames; dense and rapid deformation measurements for the design and analysis of structures; and fluid surface characterization. Many biological and engineering applications (e.g. beating rabbit hearts, fluid shockwaves) require the superfast imaging rates of a binary defocusing system to correctly resolve features without aliasing artifacts.
Nevertheless, many challenges remain to the widespread adoption of this technology. Conventional DFP technology requires the projector to display 8-bit grayscale sinusoidal fringe patterns. The speed of this technique is limited by the projector’s refresh rate (typically 120 Hz). This speed is sufficient for capturing relatively slow motion such as facial expressions. However, numerous applications require faster capture rates.
Binary defocusing technology has relaxed this speed limitation, and we have successfully created a superfast 3D video imaging system. However, this system has two drawbacks. First, it requires an expensive projector such as the DLP Discovery platform and a costly high-speed video camera such as the Vision Research Phantom v9.1. Second, because it generates sinusoidal patterns by defocusing square binary patterns, the binary defocusing technique has difficulty producing sinusoidal fringes of the same quality as the traditional DFP technique, and its depth measurement range is reduced (for further explanation, see ref. 23). Recent investigation indicates that dithered binary sinusoidal patterns can significantly alleviate the limitations on depth measurement range19. Future research will focus on overcoming the remaining issues while preserving the merits of binary defocusing.
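The fringe-quality trade-off can be illustrated with a small simulation. It is a sketch under stated assumptions: lens defocus is modeled as a Gaussian blur, the phase is computed with the standard three-step formula for 2π/3 shifts, and the 24-pixel period and blur widths are arbitrary example values; none of these come from the text.

```python
import math

PERIOD_PX = 24  # hypothetical fringe period, divisible by 3

def theta(x):
    """Ideal fringe phase at pixel x (half-pixel offset avoids exact zeros)."""
    return 2.0 * math.pi * (x + 0.5) / PERIOD_PX

def binary_pattern(shift_index):
    """One period of a square binary pattern, shifted by shift_index * 2*pi/3."""
    alpha = 2.0 * math.pi / 3.0
    return [1.0 if math.cos(theta(x) + shift_index * alpha) > 0.0 else 0.0
            for x in range(PERIOD_PX)]

def gaussian_blur_periodic(signal, sigma_px):
    """Periodic Gaussian blur, used here as a stand-in for lens defocus."""
    radius = int(math.ceil(4.0 * sigma_px))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma_px) ** 2) for k in offsets]
    total = sum(weights)
    n = len(signal)
    return [sum(w * signal[(x + k) % n] for k, w in zip(offsets, weights)) / total
            for x in range(n)]

def max_phase_error(sigma_px):
    """Largest absolute three-step phase error (radians) over one period."""
    i1, i2, i3 = (gaussian_blur_periodic(binary_pattern(k), sigma_px)
                  for k in (-1, 0, 1))
    worst = 0.0
    for x in range(PERIOD_PX):
        phi = math.atan2(math.sqrt(3.0) * (i1[x] - i3[x]),
                         2.0 * i2[x] - i1[x] - i3[x])
        d = phi - theta(x)
        worst = max(worst, abs(math.atan2(math.sin(d), math.cos(d))))
    return worst

slightly_defocused = max_phase_error(1.0)
strongly_defocused = max_phase_error(3.0)
```

In this model the phase error shrinks as the blur grows, because the blur suppresses the higher harmonics of the square wave. Stronger blur also lowers fringe contrast, which is one way to see why the usable amount of defocus, and hence the depth range, is limited.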
Another challenge is compressing and storing the large amount of data generated by high-speed, high-resolution 3D video imaging systems. Uncompressed 3D videos are drastically larger than uncompressed 2D videos. For instance, for a 3D video recorded at 30 Hz for 1 min at a resolution of 640 x 480, the .OBJ file size could be over 50 GB, making it extremely difficult to store. Since little progress has been made in the 3D video compression field, we will continue to focus on this in the future.
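A rough size estimate shows where such numbers come from. The sketch assumes one OBJ vertex line per camera pixel and two triangles per pixel quad; the bytes-per-line values are hypothetical averages for plain-text OBJ output and are not taken from the text.

```python
def obj_video_size_bytes(width, height, fps, seconds,
                         bytes_per_vertex_line=40, bytes_per_face_line=24):
    """Estimated total size of per-frame OBJ meshes (assumed line lengths)."""
    frames = fps * seconds
    vertices = width * height
    faces = 2 * (width - 1) * (height - 1)  # two triangles per pixel quad
    per_frame = vertices * bytes_per_vertex_line + faces * bytes_per_face_line
    return frames * per_frame

def raw_2d_video_size_bytes(width, height, fps, seconds, bytes_per_pixel=1):
    """Size of uncompressed 8-bit grayscale 2D video for comparison."""
    return width * height * bytes_per_pixel * fps * seconds

size_3d = obj_video_size_bytes(640, 480, 30, 60)      # 48,563,798,400 bytes
size_2d = raw_2d_video_size_bytes(640, 480, 30, 60)   # 552,960,000 bytes
```

Under these assumptions, one minute of 3D video comes to about 48.6 GB, close to the figure quoted above and nearly 90 times the size of the raw 2D video. Storing normals or texture coordinates in the OBJ files would add to this.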
The authors have nothing to disclose.
This research was a cumulative effort that began more than 10 years ago when Dr. Zhang was a graduate student at Stony Brook University. The current and previous students in our team at Iowa State University have contributed tremendously toward advancing this technology to where it is today. This work was partially sponsored by the National Science Foundation under project number CMMI 1150711 and by the William and Virginia Binger Foundation.