Monday, May 24, 2010

Primer: The Principles Of 3D Video And Blu-ray 3D






Today we're partnering up with the experts at CyberLink to introduce the principles underlying 3D video, how it is created, and how it's displayed. Our main interest is Blu-ray 3D, so we'll be exploring the tech your 3D-enabled home theater might include.



Tom Vaughan is the director of business development for CyberLink, developers of the leading Blu-ray player software, PowerDVD. He is responsible for marketing, strategic relationships, and new business development in the US. When the DVD format first emerged, Tom was responsible for developing the DVD authoring and mastering processes, managing the production of some of the first commercial DVDs in the US. Tom holds a B.S. in Electrical and Computer Engineering and an M.B.A. from Drexel University.



What Is 3D?



3D is an abbreviation for “three-dimensional.” Objects in the real world can be measured in three dimensions; for example, by measuring the length, width, and height of an object. When we look at objects in the real world, we can see the width and height of an object (the two-dimensional view of the object), but we can also perceive the depth and distance of the object.



We see the world with our two eyes. Because each eye is in a slightly different location, each sees a slightly different perspective of whatever we are looking at. We don’t normally think about these two different views, but if you close one eye at a time, you will see the image that each eye sees. Notice how different nearby objects appear from the view of each eye.



Although each eye sees a different image, we don’t perceive two images. In a process called stereopsis, our brain combines the view from each eye into a single picture, and the combined image includes three-dimensional objects and depth perception. The word “stereopsis” comes from the Greek words stereos, meaning “solid,” and opsis, meaning “sight.” Stereopsis was first described in 1838 by Charles Wheatstone, but scientists and artists have been fascinated with three-dimensional perception for many centuries.



While most of the population can see 3D, a small percentage of the population (estimates range from 3 to 15%) suffers from some stereoscopic vision impairment. Depending on the quality of the 3D presentation, this population will see no 3D effect or limited 3D depth perception. There are a number of possible causes for this, from decreased vision in one eye, to the loss of the ability to point both eyes inward towards nearby objects.



Humans (and most predators) have two eyes in the front of their head. This “binocular vision” improves depth perception, letting a hunter estimate the distance to its prey.



In addition to stereoscopic vision, depth perception also comes from a number of monocular depth cues (depth perception cues that can come from only one eye, or more precisely, that come from the 2D version of the picture that you see). These cues are important to good 3D video, as your brain will expect your stereoscopic perception to closely match your 2D perception of the scene you are viewing.



Monocular cues include:



Your memory of the shape and size of different objects: combined with the relative size of the image you see, this lets you perceive the distance to that object. For example, in the photo below, if you are familiar with the size of the bricks that the squirrel is standing on, you can quickly perceive the size of the squirrel, and your distance to the squirrel.



Perspective: Objects at greater distances appear smaller than near objects. Parallel lines appear to converge as distance increases. This effect is obvious as you stand on a straight road or path and look down the road, or when you look up at a tall building.



Occlusion (interposition): If we see two objects, where the first object is blocking part of a second object, we recognize that the first object is closer. In the photo below, you can tell that the tree in the center is closer than the building because it is blocking your ability to see part of the building. Occlusion helps us estimate the relative distance of objects in the photo.



Shadows and highlights: These help us to see objects that are raised above or recessed into a surface. In the photo above, we can see that there are bumps on the tree trunk, thanks to the shadows and highlights.



To create the illusion of “being there,” a camera needs to record separately the view that each eye would see, giving our brains the same vision of the scene that we would have if we were looking at it with our own eyes. 3D cameras have two lenses, spaced several inches apart, aligned in parallel. Some 3D systems use a single camera with two lenses, and some use two cameras, each with its own lens, mounted in a 3D camera rig.



By recording and later displaying a separate image of the scene for each eye, 3D film and video systems can recreate the scene in a way that closely matches what we would see if our eyes were in the same place that the camera was when it recorded the scene.



The average “interocular distance” (spacing of the eyes) is about 2.5 inches. The corresponding spacing between the two lenses is an important variable for 3D camera systems: the wider the spacing between the lenses, the greater the 3D effect. Cameras set up with a lens spacing of 2.5 inches are said to be orthostereoscopic. This setup attempts to accurately replicate human vision.



Another important parameter is the angle of convergence. 3D camera lenses that are aligned in parallel will result in a picture where all objects appear to be in front of the TV screen (or display). Objects at an infinite distance will appear to be on the screen. To create a stronger 3D effect, camera lenses can be angled (converged) slightly inward. With this setup, objects at the distance where the optical axes of both lenses converge will later appear to be on the screen. Closer objects will appear in front of the screen, and farther objects will appear to be behind the screen. Cameras like the Panasonic AG-3DA1 (shown above) feature lenses that allow for the angle of convergence to be adjusted, to align to a distance that the videographer prefers.
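The effect of lens spacing and convergence described above can be sketched with a simplified small-angle model. This is only an illustration under stated assumptions, not how any particular camera works: the function names, the focal_length parameter, and the formula parallax = focal_length × lens_spacing × (1/convergence_distance − 1/object_distance) are introduced here and do not come from the article. Distances and lens spacing share one unit (inches, for example).

```python
import math


def sensor_parallax(object_distance, lens_spacing, focal_length,
                    convergence_distance=math.inf):
    """Signed horizontal parallax between the two recorded images.

    Assumed small-angle model (hypothetical, for illustration only).
    Parallel lenses are modeled as convergence at infinity.
    Negative: object appears in front of the screen.
    Zero:     object appears on the screen.
    Positive: object appears behind the screen.
    """
    if object_distance <= 0 or convergence_distance <= 0:
        raise ValueError("distances must be positive")
    return focal_length * lens_spacing * (
        1.0 / convergence_distance - 1.0 / object_distance)


def apparent_position(parallax, tolerance=1e-12):
    """Translate a signed parallax value into the article's wording."""
    if abs(parallax) <= tolerance:
        return "on screen"
    return "behind screen" if parallax > 0 else "in front of screen"


# Parallel lenses: everything is in front of the screen except infinity.
print(apparent_position(sensor_parallax(100, 2.5, 1.0)))
print(apparent_position(sensor_parallax(math.inf, 2.5, 1.0)))

# Lenses converged at 120 inches: nearer objects pop out, farther recede.
for distance in (60, 120, 240):
    p = sensor_parallax(distance, 2.5, 1.0, convergence_distance=120)
    print(distance, apparent_position(p))
```

In this model, doubling lens_spacing doubles the parallax at every distance, which is one way to read the statement that wider lens spacing gives a greater 3D effect.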



Animated Movies



3D animated movies are movies that are created using 3D object modeling software. This genre of movies was pioneered by Pixar with the movie Toy Story. Characters and scenery in the movie are generated as three-dimensional models. Of course, these movies are normally rendered to standard two-dimensional frames.



Modern computer games are created in a very similar fashion, but they are rendered in real-time as you play the game.



A big advantage of 3D animation is that it can also be rendered and viewed in 3D. To create a 3D version of the movie, the movie is rendered in two separate passes (one for each eye). For the second pass, the studio simply moves the virtual camera perspective 2.5 inches to one side, creating the video for the second eye. Though each frame of video can take hours to render (due to complexity), the cost of rendering a second perspective of a movie is small compared to the overall cost of creating the movie. For a good movie, the additional cost of creating a 3D version through a second rendering pass is modest compared to the benefits.
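The two-pass approach can be sketched in a few lines. This is a minimal illustration, not any studio's pipeline: render_view is a hypothetical callback standing in for a real renderer, and the camera is reduced to a position plus a "right" direction vector, both assumptions made for the example.

```python
import math


def render_stereo_frame(render_view, camera_position, right_vector,
                        eye_offset_inches=2.5):
    """Render one frame twice: once as-is, once with the camera shifted.

    render_view is a hypothetical function that takes a camera position
    (x, y, z) and returns a rendered image. The second pass moves the
    virtual camera eye_offset_inches along the camera's right direction.
    """
    length = math.sqrt(sum(c * c for c in right_vector))
    if length == 0:
        raise ValueError("right_vector must be non-zero")
    shifted = tuple(p + eye_offset_inches * c / length
                    for p, c in zip(camera_position, right_vector))
    first_eye = render_view(tuple(camera_position))
    second_eye = render_view(shifted)
    return first_eye, second_eye


# Stand-in "renderer" that just reports where the camera was placed.
first, second = render_stereo_frame(lambda position: position,
                                    (0.0, 5.0, 0.0), (2.0, 0.0, 0.0))
print(first, second)
```

Because the scene models, lighting, and animation are shared by both passes, only the rendering work is repeated, which is why the second view is cheap relative to producing the movie.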



A 3D display must be able to display two separate video images on the same screen. There are several methods that are used to accomplish this. Each display method must be paired with the corresponding 3D glasses technology designed to assure that each eye only sees the video meant for that eye.



Polarized Displays and Polarized Glasses



Modern TVs and displays emit light from each pixel in some combination of red, green, and blue wavelengths. The light emitted by a TV or display can be filtered, such that all of the light coming from a row of pixels has the same electromagnetic orientation. Though the light travels in a straight line from the display pixel to your eye, it may be filtered so that it has one of two circular polarization states (left-hand or right-hand).



For example, imagine that a beam of light is traveling along the center of the spiral graphic below. The arrows pointing outward from the axis of the direction of travel represent the changing direction of the orientation of the electric field of the light beam (though we don’t think of light in the same way we think of radio waves, light is another type of electromagnetic wave). If you align the thumb of your left hand along the center axis of the spiral graph below (the direction of travel of the light), you will be able to close your fingers into a fist in the direction that the electric field rotates around this beam of light. Light with this circular polarization is said to be left-handed.



The graphic below shows the direction that the electric field rotates around a beam of circularly-polarized light with right-handed orientation.



Circularly-polarized light with one orientation can pass through a polarizing filter with the same orientation, but will be blocked by a polarizing filter with the opposite orientation. In this way, half of the pixels of a 3D display can be used to display the video for one eye, while the other half display the picture for the other eye.



3D displays can be manufactured with polarization filters, which are aligned with the rows of pixels on the display. This allows half the pixels on the display to be dedicated to displaying the picture for one eye, and the other half of the pixels for the other eye. Note that the effective resolution provided by a polarized display to each eye is half of the full display resolution.



Row interleave polarized 3D display. Odd horizontal rows of pixels are used for one eye, even rows for the other eye. Red and blue are used to indicate left and right eye images.



To play back a stereoscopic 3D video program, such as a Blu-ray 3D title, on a polarized display, the left and right video frames are combined into row-interleaved frames. The display is designed to show odd rows of pixels to one eye, and even rows of pixels to the other eye.
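As a rough sketch of that conversion, the function below builds one display frame from a left-eye and a right-eye frame by alternating rows. Frames are modeled as plain lists of rows, and the choice of which eye gets the first row is an assumption made for the example; the article does not say which eye maps to odd or even rows.

```python
def row_interleave(left_frame, right_frame):
    """Combine two equally sized frames into one row-interleaved frame.

    Rows are counted from 0 here. Rows 0, 2, 4, ... come from the left
    frame and rows 1, 3, 5, ... from the right frame (an assumed
    convention). Each eye therefore keeps only half of its rows.
    """
    if len(left_frame) != len(right_frame):
        raise ValueError("frames must have the same number of rows")
    return [left_frame[y] if y % 2 == 0 else right_frame[y]
            for y in range(len(left_frame))]


left = ["L0", "L1", "L2", "L3"]
right = ["R0", "R1", "R2", "R3"]
print(row_interleave(left, right))  # ['L0', 'R1', 'L2', 'R3']
```

The output makes the resolution trade-off visible: of the four rows in each source frame, only two survive for each eye.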



With polarized 3D glasses, each eye will only see the part of the image meant for that eye. In the image above, red and blue indicate the different circular polarization on the lens for each eye. Though two images appear on the display at the same time, the glasses ensure that each eye sees only its own image. The human visual system combines the two images into a single 3D image.



Polarized displays are one of the least-expensive ways to display a 3D video, and polarized glasses are inexpensive. However, polarized displays aren’t always able to filter the light perfectly, such that 100% of the light meant for each eye has the correct orientation. Similarly, polarized 3D glasses aren’t always able to block 100% of the light that is meant for the other eye. This problem, where one signal bleeds into another signal traveling along the same transmission path, is known generically as crosstalk. For 3D display systems, crosstalk leads to double images (fuzzy, unsharp images). The image quality of a 3D polarized display decreases noticeably if the viewer is not directly in front of (perpendicular to) the display.



Blu-ray 3D is a new movie format developed by the member companies of the Blu-ray Disc Association (BDA). Blu-ray 3D movies are expected to be released in early 2010, providing an extremely high-quality format for enjoying 3D movies at home.



The physical format for Blu-ray 3D is identical to all other forms of Blu-ray disc. The logical format is based on the current Blu-ray audio/video format, but has been extended to provide for stereo 3D video and 3D menus. Earlier Blu-ray players will not be able to play Blu-ray 3D titles. While set-top Blu-ray players will need to be replaced, PC-based Blu-ray player software can be upgraded. Blu-ray 3D player software will require a Blu-ray drive that is capable of 2x or faster read speeds. Fortunately, all but the first generation of BD-ROM and BD-R drives are 2x or faster.



Blu-ray players capable of playing Blu-ray 3D are backward-compatible, supporting standard (two-dimensional) Blu-ray movies. In addition, the Blu-ray 3D format allows for Blu-ray 3D titles to be created in such a way that they can be played by a legacy Blu-ray player as a standard 2D Blu-ray movie. Blu-ray 3D players can be configured to operate in either 2D or 3D (stereoscopic) mode, allowing consumers to upgrade their player and disc collection before they upgrade their TV or display to 3D.



Blu-ray 3D movie titles will contain two full Blu-ray quality video streams, one for each eye. Decoding a Blu-ray 3D title is comparable to decoding two standard Blu-ray movies at the same time. While it would be reasonable to expect that the video file size and bit rate would double (since the number of decoded frames doubles), there are some efficiencies in a 3D video that can be taken advantage of. Since each eye is seeing a slightly different perspective of the same scene, there are many similarities in the frames of video for the left and right eyes. The video encoding experts in the Moving Picture Experts Group (MPEG) have taken advantage of this fact to reduce the overall bit rate and file sizes for stereoscopic 3D. A new video codec, called Multiview Video Coding (MVC), was developed as an extension of Advanced Video Coding (AVC, also known as H.264). Blu-ray 3D uses MVC video encoding, which provides for very high picture quality with an overhead (versus standard Blu-ray) of 50%. While the peak bit rate for standard Blu-ray movies is 40 Mb/s, the peak bit rate for Blu-ray 3D is 60 Mb/s.
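The arithmetic behind those figures is easy to check. The short sketch below uses only the numbers quoted above (40 Mb/s, 50% overhead); the function name and the comparison against two fully independent streams are added here for illustration.

```python
def stereo_peak_bitrate(base_mbps, overhead_fraction):
    """Peak bit rate for a stereo title, given the 2D peak and the
    extra fraction needed for the second eye's view."""
    return base_mbps * (1.0 + overhead_fraction)


standard_peak = 40.0                                     # Mb/s, 2D Blu-ray
mvc_peak = stereo_peak_bitrate(standard_peak, 0.5)       # 50% overhead
two_streams = stereo_peak_bitrate(standard_peak, 1.0)    # naive doubling

print(mvc_peak)                      # 60.0
print(two_streams)                   # 80.0
print(1.0 - mvc_peak / two_streams)  # 0.25
```

In other words, at peak rate the quoted overhead works out to a quarter less data than simply storing two independent streams.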



Blu-ray 3D MVC is encoded as a primary video stream (for one eye, or for 2D playback) and a dependent video stream for the other eye. The dependent video stream references the objects in each frame of the primary video stream, encoding only the differences.
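To see why encoding only the differences saves space, consider the toy example below. It is emphatically not MVC: a real encoder predicts blocks from the other view and from neighboring frames, while this sketch just subtracts pixel values in one row. The function names and sample values are made up for illustration.

```python
def encode_dependent(primary_row, dependent_row):
    """Store the dependent view as per-pixel differences from the primary."""
    if len(primary_row) != len(dependent_row):
        raise ValueError("rows must be the same length")
    return [d - p for p, d in zip(primary_row, dependent_row)]


def decode_dependent(primary_row, residual):
    """Rebuild the dependent view from the primary view plus differences."""
    if len(primary_row) != len(residual):
        raise ValueError("rows must be the same length")
    return [p + r for p, r in zip(primary_row, residual)]


primary = [10, 10, 50, 50, 50, 10, 10, 10]     # one row, one eye
dependent = [10, 10, 50, 50, 50, 50, 10, 10]   # other eye, nearly the same

residual = encode_dependent(primary, dependent)
print(residual)                                # [0, 0, 0, 0, 0, 40, 0, 0]
print(decode_dependent(primary, residual) == dependent)  # True
```

Because the two views are so similar, the residual is mostly zeros, and long runs of zeros compress far better than a second full picture. A player that only supports 2D can ignore the residual entirely and show the primary view.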



Blu-ray 3D has enhanced graphics capabilities, allowing for 3D menus and subtitles positioned in 3D video. Menu and subtitle graphics and text can be defined to appear on a plane that is offset from the screen. This plane can be defined to be either closer to or farther away from the viewer. This depth offset is accomplished by shifting the text or graphics horizontally by an equal and opposite amount over the video stream for each eye.
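The equal-and-opposite shift can be sketched as follows. The function name, the pixel units, and the sign convention (a positive offset brings the graphic toward the viewer) are assumptions made for this example, not details of the Blu-ray 3D specification.

```python
def subtitle_positions(x, offset_pixels):
    """Return the (left_eye_x, right_eye_x) position of a graphic.

    Assumed convention: a positive offset moves the graphic right in the
    left-eye image and left in the right-eye image, so it appears in
    front of the screen. A negative offset pushes it behind the screen,
    and zero leaves it on the screen plane.
    """
    return x + offset_pixels, x - offset_pixels


print(subtitle_positions(960, 10))   # (970, 950): in front of the screen
print(subtitle_positions(960, 0))    # (960, 960): on the screen
print(subtitle_positions(960, -10))  # (950, 970): behind the screen
```

Since the whole plane shifts by one value, a single number per subtitle or menu is enough to place it at a chosen depth relative to the video.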



Upgrading to Blu-ray 3D



To enjoy Blu-ray 3D titles, consumers must upgrade their PC or their home theater system. There are several components that are needed:



* A 3D-capable display (TV, desktop display, or notebook PC display)

* 3D glasses compatible with your display

* A PC with Blu-ray 3D player software, or a (set-top) Blu-ray 3D player





In order to choose the right solution, there are some important things to consider for each of these components.



Blu-ray 3D TVs or Displays



The Blu-ray 3D format does not specify the 3D display technology. This allows consumers to choose the 3D display technology that best meets their needs. At the high end, consumers will likely select true 120 Hz frame-sequential displays that use LC active shutter glasses. Less expensive systems can be configured using polarizing displays and glasses.



Blu-ray 3D Players



Blu-ray 3D players can be implemented on a PC using Blu-ray player software, or as a dedicated hardware solution, otherwise known as a set-top Blu-ray player. Sony's PlayStation 3 (PS3) game consoles, for example, are expected to get a firmware upgrade in the summer of 2010, providing support for Blu-ray 3D. Several set-top Blu-ray 3D players have been announced, and some are already available.



Blu-ray 3D on a PC



Another way to enjoy Blu-ray 3D is to purchase Blu-ray 3D player software like CyberLink's PowerDVD 10 Ultra. PCs can be connected to a 3D-compatible display, and later, to a 3D-capable TV. In other words, a PC with Blu-ray player software is a true Blu-ray player, capable of all of the same functions as a set-top Blu-ray player. In addition, a Blu-ray 3D-capable PC offers many capabilities that fixed-function hardware devices don’t:



* Enjoy 3D Games; over 400 game titles can be played in 3D

* Access and enjoy Internet video from any Web site, including 3D video

* Play 2D and 3D video files from almost any source (DV, HDV, AVCHD, AVI, WMV, MOV, etc.)

* View 2D and 3D photos

* Support for cable or satellite TV content through solutions such as DirecTV2PC

* Support for premium, protected video (Amazon, iTunes, etc.)

* Video enhancement, such as CyberLink TrueTheater HD, TrueTheater Motion, and TrueTheater Lighting

* Access and play music, video, or browse photos on your home entertainment system

* Use other 3D software, such as CAD, 3D animation, or 3D solid object modeling software





Blu-ray 3D capability will be available in every PC form factor, including:



* Notebook PCs (with true 120 Hz sequential-frame displays)

* Desktop PCs and displays

* Home Theater PCs



Brightness



Because 3D display systems block alternate pixels, rows, or frames of video from each eye (depending on the type of 3D display you have chosen), less than half the light from the display reaches each eye. In addition, to minimize crosstalk on frame-sequential display systems, active shutter glasses block both eyes during the transition period between the display of each video frame. For all of these reasons, it is helpful to choose a 3D display with high brightness levels.



It is also important to avoid any reflections on the screen of your TV or display, as these reflections will be seen at a fixed depth (the distance from your eye to the display), making it a bit harder for your eyes to naturally focus on whatever you are interested in.



Because of both concerns (brightness and reflections), you will find that 3D video is best viewed in a dark room.



Accommodation Disparity



Although objects may appear to be in front of or behind the display, they are not really there. Because the image is really coming from a flat screen, to see the 3D video clearly, the muscles in your eyes must keep your eye lens focused to the distance of the screen. The fact that the 3D video is really only in focus on a flat plane creates a disparity between one visual cue (accommodation) and the other visual cues.



When your eyes try to focus on 3D objects that appear to be close to you, your eyes will naturally converge inward while trying to accommodate for viewing a nearby object. Unlike the real world, all objects in a 3D video will only be in focus on the display. If you try to focus on objects that appear to be right in front of your nose, you will be disappointed, as you instead lose focus.



Fortunately, it seems that most people are able to adjust to this disparity without much difficulty, letting them relax and enjoy a 3D video without losing focus.



Blur Disparity



In the real world, our eyes focus on the objects at which we are looking. Objects that are nearer or farther appear out of focus. Because a 3D video is presented on a flat screen, the blur gradient that we experience in the real world will not be seen in a 3D video. If the 3D video is shot with a wide depth of field, the majority of the scene will be in focus, allowing the audience to see any part of the scene clearly when they focus on the screen.



If the director or cinematographer chooses to use a narrow depth of field, scenes may be shot with the subject in focus and other areas out of focus. While this technique can approximate the blur gradient we experience in the real world, it has the drawback of causing objects that we would normally be able to focus on to be out of focus, and impossible to focus on.



Blur disparity is an unavoidable issue, regardless of how a 3D video is shot or rendered. Studies have suggested that blur disparity and accommodation disparity tend to provide a cue to the brain that although it is seeing a stereoscopic view of a three-dimensional scene (real or computer-generated), the actual 3D video presentation is on a flat screen.



Eye Strain



We naturally view our world in 3D, and so a good 3D production makes it easy to suspend disbelief and feel that we are actually “on scene,” live and in person. However, the viewer will naturally try to focus to different distances, depending on the apparent distance of subjects and scenery in the 3D video. When you not only scan your eyes from side to side, but focus in and out, your eye muscles get a bigger workout than you would get from watching a video in 2D. Once you are able to adjust to the brave new world of 3D video, you will find yourself relaxing and enjoying, instead of trying to actively focus on objects near and far.



Motion Sickness



Motion sickness is normally caused by a disagreement in your brain between what you see and the motion that you feel (by your inner ear, which gives you your sense of balance). Motion sickness can also be caused by a disagreement within the visual system of your brain. If a 3D video is shot, displayed, or viewed poorly, the 3D depth perception of the objects in the scene may conflict with the 2D depth information that we perceive. These conflicts can cause the viewer to suffer similar symptoms to common motion sickness (fatigue, headache, dizziness, or in the worst case, nausea).



3D producers know how to minimize the potential for problems by:



* keeping subjects in the 3D comfort zone, at roughly the distance of the convergence point of the camera (at least most of the time)

* avoiding focusing on objects that are extremely close to the camera (your eye will try to focus on the object as if it is close to you, when your eye needs to focus to the distance of the display)

* avoiding zooming in and out (which changes the scale of the 3D space)

* avoiding excessive camera motion (for example, flying through a jungle; the audience has suspended disbelief and forgotten that they are watching a movie, so the subconscious part of their brains is more prone to be concerned when their eyes are telling them that they are flying through the jungle but the sense of balance from their ears is telling them that they are sitting still)

* keeping near subjects away from the edge of the frame (where the picture for one eye could leave the frame)

* being sure that all content is 3D (producers cannot use flat 2D backgrounds or effects in a 3D production)

* minimizing the use of a narrow depth of field (which leaves parts of the scene out of focus, causing problems for viewers who attempt to focus on these objects)





Fortunately, experienced 3D producers know how to avoid these problems.



Consumers can minimize the potential for problems by:



* choosing a high-quality 3D display and 3D glasses solution (minimizing ghost images caused by crosstalk)

* minimizing reflections on their TV or display (reflections are 2D)

* viewing 3D content from the center of the direction that the display is facing (or from the center of a 3D theater – keeping the relative distance of all parts of the scene centered and in proportion)



3D video is similar in many ways to surround sound. Just as surround sound adds depth, placing you in the middle of the performance, 3D video places you, the viewer, in the action.



Just as we see the world with two eyes, we hear the world with two ears. Binaural hearing lets us sense the direction from which sounds are coming. Your brain processes the sounds detected by both of your ears. Without thinking, your brain senses the difference in the time that the sound arrives at each ear and the difference in the volume that each ear detects, and from these you get a sense of the direction that the sound came from. Our three-dimensional vision is similar to our three-dimensional hearing. We don’t have to think about the differences in the picture that each eye sees; we just sense the relative distance of everything we see.



When sound recording was first developed, each recording contained only a single channel of audio. Monaural recordings were later improved with two-channel stereo recordings. Two channels of audio provide an added dimension with respect to the apparent location of the source of each sound that is mixed into the recording. This “sound stage” allows recording engineers to arrange the relative position of instruments in a band, from left to right. When played back through a stereo amplifier and stereo speakers, the listener can hear a difference in relative volume coming from each speaker, and the relative timing of the sound for each instrument or voice. These differences give our brains an audible clue that helps us sense where the sound is coming from.



With a stereo recording, all of the sound appears to come from one general direction: the direction of the speakers. This is fine for reproducing music, as we are used to music coming from the direction we are facing when we attend a live music performance. Movie producers want their audience to feel that they are at the location where the movie is happening (in the room, or at the scene). To give the audience the feel of “being there,” multi-channel surround sound was developed.



Just as surround sound systems provide a more immersive experience, 3D video adds an important third dimension to movies and television.



Done well, 3D video provides an experience that feels real and natural. Clearly, audiences are excited by the experience, and they have shown a strong preference for seeing 3D movies in 3D cinemas.