There has been a good deal of talk about the IIF/ACES system, but there seems to be a lot of misunderstanding as to exactly what it is. A lot of the early talk centered around the proposed file format to contain ACES information, but the file format is only a very small part of what the system is intended to be, and one of the least significant.
The IIF/ACES system has been developed by the Academy of Motion Picture Arts & Sciences technology committee. It is intended to provide a way to achieve more accurate color (and much more range to manipulate that color) from all cameras, both electronic and film, in a simpler way, and to allow that information to be exchanged among many vendors during the post production process while retaining consistency and accuracy. IIF stands for “Image Interchange Format,” which refers primarily to the proposed file format used to carry ACES data, itself based on ILM’s OpenEXR format. ACES stands for “Academy Color Encoding Specification,” which refers to the color space used by the process. That color space is the key to the proposed system.
When dealing with images in post production today, most processes rely on the camera manufacturer to provide a representative image of what was shot as a starting point for further manipulation via color grading, visual effects, or other post processes. In almost every case, that image is what we might call “display referred,” which means it is created with reference to the display it’s going to be viewed on. This is particularly obvious with HDTV images, which are usually targeted to a standard display specification known as Rec709. Regardless of the capture method, color is usually manipulated in this color space, which means that the colors and values that are possible are limited to those the display allows. This makes sense, but more often than not, the camera is capable of capturing far more information than a Rec709 display can show, and that information is lost in the process. There have been various attempts at retaining more of it. These include Panavision’s, Sony’s, and Arri’s use of logarithmic encoding curves, which allow more information to fit within a Rec709 limitation by “compressing” the values into a lower contrast container. Cameras that can deliver sensor RAW data, such as Red, Phantom, and Alexa, present a slightly better situation in that all of the sensor’s captured range is supplied to post production for manipulation, although it is also necessary to process these values in order to see a coherent RGB image. In all of these cases, however, the color grading process is usually working in a specific display-targeted color environment. ACES attempts to change this in various ways, but its basic intent is to provide a much larger container to represent real world color values, allowing a more accurate representation of the original scene.
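To make the “compressing” idea concrete, here is a small sketch of how a logarithmic encoding curve squeezes a wide range of linear scene values into a limited 0-to-1 container. The curve below is purely hypothetical (it is not Panavision’s, Sony’s, or Arri’s actual math); the point is only that equal exposure steps end up evenly spaced in the encoded range:

```python
import math

def log_encode(linear, black=0.01, white=8.0):
    """Hypothetical log curve: maps linear scene values in
    [black, white] into a 0..1 container. Illustrative only,
    not any vendor's actual encoding."""
    linear = max(linear, black)
    return math.log(linear / black) / math.log(white / black)

# Each doubling of scene light (one stop) moves the encoded
# value by the same amount, so a huge linear range fits in 0..1:
for label, linear in [("mid grey", 0.18), ("+1 stop", 0.36), ("+3 stops", 1.44)]:
    print(f"{label:9s} linear={linear:5.2f} -> encoded={log_encode(linear):.3f}")
```

Because a stop of exposure is a fixed offset in the encoded domain, shadow and highlight detail that a straight Rec709 encode would clip can survive the trip through a standard-range container.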
ACES then takes the result, renders an image using a “standard” renderer (we’ll talk about that later), and uses transforms to yield a version of that image for whatever display device is being used. How it does this is the key to the system.
First, ACES defines a theoretically unlimited color space, one in which any value possible in the visible spectrum can be described. Second, ACES is not based on “display referred” color; it is based on what we would call “scene referred” or, in some cases, “image referred” values. This means that when encoding to the ACES color space, the values are “reverse engineered” to represent what was actually shot, regardless of the camera used to shoot it. This requires the use of linear light values to represent the way real world physics works rather than the way our eyes work. By doing an extensive series of characterizations of the cameras that are used, along with their optical systems, a very accurate picture of the real world light values in the original captured scene can be obtained. This is what is used to create what is known as the Input Device Transform, or IDT, for the particular camera being characterized, and that in turn is used to create the ACES representation. That representation can then be manipulated within the ACES color space without loss, as that space is basically unlimited. An image is created from that information by using what is called the Reference Rendering Transform, or RRT. The RRT is designed to represent an “idealized” image from the ACES data that can be viewed on any device by passing it through an Output Device Transform, or ODT. The combination of the RRT and the ODT is the heart of the ACES output system, and is roughly analogous to the use of a rendered image followed by a display LUT in a typical film-targeted digital intermediate system. The RRT has been developed over a number of years, and though originally based on film colorimetry, as that seemed to represent the most pleasing image, it has been modified numerous times based on a lot of thought and testing as to what best represents real world photography in a way that is pleasing to human perception.
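The pipeline described above (IDT into ACES, grading in scene-linear space, then RRT and ODT out to a display) can be sketched as a chain of functions. The transforms below are trivial stand-ins, not the Academy’s actual IDT/RRT/ODT math; only the order of operations is the point:

```python
# Schematic sketch of the ACES chain: all functions here are
# hypothetical stand-ins for illustration, NOT the real transforms.

def idt(camera_code_value):
    """Input Device Transform (stand-in): per-camera mapping from
    encoded code values back to scene-referred linear light."""
    return camera_code_value ** 2.2  # hypothetical decode

def grade(aces_value, exposure=1.5):
    """Creative grading happens on scene-linear ACES values, so an
    exposure change is a simple multiply, as in real-world light."""
    return aces_value * exposure

def rrt(aces_value):
    """Reference Rendering Transform (stand-in): renders scene-linear
    data toward a single 'idealized' image, rolling off highlights."""
    return aces_value / (1.0 + aces_value)  # toy tone curve

def odt_rec709(rendered):
    """Output Device Transform (stand-in): encodes the rendered
    image for one particular display, here a Rec709-style monitor."""
    return rendered ** (1.0 / 2.4)  # hypothetical display encode

# One pixel through the whole chain, camera to display:
pixel = 0.5  # camera code value
display_value = odt_rec709(rrt(grade(idt(pixel))))
print(f"display value: {display_value:.3f}")
```

Note that only the IDT is camera-specific and only the ODT is display-specific; the grade and the RRT in the middle are shared, which is what lets many vendors exchange the same ACES data consistently.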
The result of the RRT is then passed through the ODT for display on either a projection or an electronic display, depending on the particular situation. For most colorists who have been involved in testing the ACES implementation, it provides a much more idealized “starting point” than has been previously possible, with better greyscale representation, better saturation, and more accurate color. With ACES, true “one light” dailies become reasonably accurate and sensible, and final grading becomes easier and more creative.
Because this is one of the more technically involved topics I’ve posted on here, I’m going to cut this introduction a bit short and continue with some observations in another post. Suffice it to say that having been involved in ACES testing, I’m a believer in the system and its intent, and will continue to be involved in helping to test and improve it in the future.