Complete Camera Convergence

We are headed towards a point in technology where spaces are visually replicated in three dimensions in real time, transmitted, and viewed interactively. Photography as we know it will be considered merely a mode of visually and temporally framing a multidimensionality, whether actual or modeled. It will be to representation what a freeze frame is to video today. Currently, we still think of photography as a specific medium with its own culture and mode of production, one that overlaps on the fringes with film, video, scanning, 3d imaging and mapping. Eventually, however, photography will be a consciously limited subcategory within a completely converged system of optic reproduction.

Imagine a typical (and inevitable) application: an athletic game (soccer or football) is taking place, and viewers at home and in the stadium watch a live broadcast. But they can see it in three dimensions, even have it displayed as a holographic object in front of them. They can also change its angle of view and its level of enlargement. The viewer can zoom in on a particular player and watch the action from any angle, all in real time. What will be broadcast will not be a dimensionally limited sequence of images, but a real-time, camera-based 3d model. The entire model will be stored locally by viewers as it changes through time, so any interaction with it remains open-ended. Even a recording of the event will be accessible from all angles and points in time. Consequently, replaying the event at a different speed or from a different angle will be a personal choice. Dimensionality will be democratized. Image framing will be democratized.

The actual recording will be done by video cameras: some stationary, some mobile, some crowd-sourced (any member of the audience with a camera can upload a particular angle). There may also be some form of 3d scanning, using optical or other wavelengths, as an additional way to measure distances and dimensionality. All the data will be combined into a single model in motion and then broadcast.
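To make that combination step less abstract, here is a minimal sketch of one way such fusion might work, assuming each camera's intrinsics, pose, and a per-pixel depth estimate are already known. All names and numbers below are made up for illustration; this is not a description of any actual broadcast system.

```python
import numpy as np

def backproject_to_world(depth, K, R, t, stride=8):
    """Back-project a per-pixel depth map into world-space 3d points.

    depth : (H, W) array of depths in meters (0 = unknown)
    K     : (3, 3) camera intrinsics
    R, t  : camera pose mapping world -> camera coordinates (x_cam = R @ x_world + t)
    """
    h, w = depth.shape
    K_inv = np.linalg.inv(K)
    points = []
    for v in range(0, h, stride):
        for u in range(0, w, stride):
            d = depth[v, u]
            if d <= 0:                              # skip pixels with no depth estimate
                continue
            ray = K_inv @ np.array([u, v, 1.0])     # pixel -> camera-space ray
            x_cam = ray * d                         # point in camera coordinates
            x_world = R.T @ (x_cam - t)             # camera -> world coordinates
            points.append(x_world)
    return np.array(points)

# Fuse several cameras into one shared model (toy example with random depth data).
cameras = []
for i in range(3):
    K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
    R = np.eye(3)                                   # in practice each camera has its own pose
    t = np.array([i * 1.0, 0.0, 0.0])
    depth = np.random.uniform(2.0, 10.0, size=(480, 640))
    cameras.append((depth, K, R, t))

model = np.vstack([backproject_to_world(*cam) for cam in cameras])
print(model.shape)   # every camera's points merged into one world-space point cloud
```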

The model will likely have different resolutions in different areas, and some areas (such as niches or spaces out of view) will be missing altogether. I imagine there will be more sophisticated algorithms by then to fill in the unknown areas based on intelligent conjecture. We can also expect heavy use of augmented reality, the overlaying of graphics, text, and social media, since the model already contains geo-locational data that makes superimposition easy.
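Because every element in the model sits at a known position, superimposing a graphic largely reduces to projecting that position into whatever virtual camera the viewer has chosen. A minimal sketch of that projection step, with hypothetical intrinsics, pose, and player position:

```python
import numpy as np

def project_to_view(x_world, K, R, t):
    """Project a world-space point into pixel coordinates of a virtual camera.

    x_world : (3,) point anchored in the model (e.g. a player's position)
    K       : (3, 3) intrinsics of the viewer's chosen virtual camera
    R, t    : pose of that camera (x_cam = R @ x_world + t)
    Returns (u, v) pixel coordinates, or None if the point is behind the camera.
    """
    x_cam = R @ x_world + t
    if x_cam[2] <= 0:
        return None                       # not visible from this viewpoint
    uvw = K @ x_cam
    return uvw[0] / uvw[2], uvw[1] / uvw[2]

# Hypothetical example: pin a name tag just above a player at a known field position.
K = np.array([[1000.0, 0, 960], [0, 1000.0, 540], [0, 0, 1]])   # 1920x1080 virtual camera
R = np.eye(3)
t = np.array([0.0, 0.0, 30.0])            # camera 30 m back from the field origin
player_position = np.array([5.0, -2.0, 0.0])
label_anchor = player_position + np.array([0.0, 1.9, 0.0])

pixel = project_to_view(label_anchor, K, R, t)
print(pixel)   # where to draw the overlay in the viewer's current frame
```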

All this is inevitable because the necessary technologies already exist and the trends are pointing towards spatialisation. The only bottlenecks are, as usual, processing speed and data bandwidth. If we can assume that technology will keep accelerating as before, then all the pieces are destined to merge, and probably fairly soon.

We are already getting used to some of these ideas in consumer cameras. Many cameras can now take both still and video images, each of which is still considered a distinct entity. A video frame is lower in resolution so that it can be captured faster. Better processors are doing away with that distinction. A video will be merely an image sequence; a photo will be a freeze-frame, technically speaking.
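Once stills and video share one pipeline, "taking a photo" amounts to pulling a single frame out of the stream. A minimal sketch using OpenCV, with a placeholder file name and timestamp:

```python
import cv2

# A "photo" as a freeze-frame: grab one frame out of a recorded video stream.
capture = cv2.VideoCapture("match_recording.mp4")
capture.set(cv2.CAP_PROP_POS_MSEC, 63_000)    # seek to the moment of interest (63 s in)

ok, frame = capture.read()                    # decode exactly one frame at that point
if ok:
    cv2.imwrite("freeze_frame.png", frame)    # the extracted still, i.e. the "photo"
capture.release()
```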

Photos are also becoming dimensional for consumers: many cameras have built-in options for panoramas, which can be viewed as if they were surrounding spaces, or include two lenses that allow for 3d photos and video (with a single-perspective 3d result, not yet a virtual model of space).

3d recording and display are advancing rapidly, turning what used to be a gimmicky fad into renewed commercial “added value” and a necessity for Hollywood. TV and phone 3d displays that do not require glasses are already on the market. Video holography is also progressing rapidly, but may need a little more time to break through the processing bottleneck.

Video games usually lag behind Hollywood productions in realism, but only because games require real-time rendering, while Hollywood can take however long it needs to render a single frame. Usually, what took hours or days to render a few years ago gets reduced to a fraction of a second, and can then be applied to gaming.

Lagging further behind is the speed of input. Both films and video games use similar methods of input: mathematical modeling (synthesis), or photography and live video mapped onto geometric surfaces. But photography is also used to create 3d models directly, instead of just becoming a surface. Using multiple cameras positioned optimally, software can interpolate dimensionality. This method is being used very effectively by NASA and other space programs to create models of outer space or the surfaces of planets. The data is gathered very gradually by telescopes and satellites and then creatively and mathematically combined to create “fly-throughs” and landscape “photographs.” Apple's new (and beautiful but rather useless and comically flawed) 3d mapping app uses the same principles: photography as the generator of space, as opposed to merely the content of surfaces projected onto mathematical models, which is how Google's 3d maps are produced. Eventually, the input and its recombination into models could be accelerated to the point of being live.
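The geometric core of photography as a generator of space is triangulation: if the same point can be identified in two images taken from known camera positions, its location in space can be recovered. Here is a minimal sketch of the standard linear (DLT) triangulation, with made-up camera matrices rather than any actual NASA or Apple pipeline:

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Recover a 3d point from its pixel coordinates in two calibrated views.

    P1, P2   : (3, 4) projection matrices of the two cameras
    uv1, uv2 : (u, v) pixel coordinates of the same point in each image
    Uses the standard linear (DLT) method: stack the projection constraints
    and take the null space via SVD.
    """
    u1, v1 = uv1
    u2, v2 = uv2
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]                    # back from homogeneous coordinates

# Two hypothetical cameras one meter apart, both looking down the z axis.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

point = np.array([0.3, -0.2, 5.0, 1.0])    # a point 5 m in front of the cameras
uv1 = (P1 @ point)[:2] / (P1 @ point)[2]
uv2 = (P2 @ point)[:2] / (P2 @ point)[2]

print(triangulate(P1, P2, uv1, uv2))       # recovers approximately [0.3, -0.2, 5.0]
```

Multi-view stereo systems, such as the community-photo-collection project linked at the end, apply this kind of geometry at a much larger scale, across thousands of photographs and millions of matched points.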

Notice how we have to use quotation marks to distinguish a rendered viewpoint from an “actual” photograph or sequence. The veracity of a scientific, model-based photograph is actually not inferior, because the original data comes from measurements. It is not a photograph of just a model; it is a photo of a model based on photographs (and other data). In many ways such an image contains more information, i.e. more “truth.” What is missing, perhaps, is the original momentary intent, the conscious framing in time, the soul of the photographer. But even that can be argued. As the latest images from the European space program show, there is such a thing as the soul of the “post-photographer.” He or she wasn't at the location at the time, but has experienced it virtually and internally, has been affected by it emotionally, and this results in images that also reveal humanity.

It seems the idea of photography will become something similar to what in film production is called the “director of photography.” It will denote the framing in time and space, the design of the image, not a residue of the moment of being there, feeling it, and taking the shot. The latter notion of photography could become something not necessarily retrograde, but certainly specialized. Perhaps it will be called moment-based photography (momentography), or some such.

In the meantime it is worth considering what such a convergence could mean for historical documentation. We are already recording the present in ways that could one day be reconstructed into spatial models. Just as we can colorize old black-and-white films and turn traditional movies into 3d (up to a point), it will be easier in the future to render spaces from optical recordings. So it is actually urgent that we think about how we record the present for posterity, to prepare it for proper dimensioning by future historians.

Links:
European Space Agency images of Mars, composited from multiple images into models, then reframed:
http://www.esa.int/esaMI/Mars_Express/SEMWF0474OD_1.html
http://www.esa.int/esa-mmg/mmg.pl?b=b&type=I&mission=Mars%20Express&start=1

Holographic video in development:
http://web.mit.edu/newsoffice/2011/video-holography-0124.html
http://www.youtube.com/watch?v=Y-P1zZAcPuw

Microsoft Photosynth uses public images to create dimensional spaces/views of the original subjects:
http://photosynth.net

Multi-View Stereo for Community Photo Collections
http://grail.cs.washington.edu/projects/mvscpc/
