Terminator-vision and the complex questions behind “augmented reality”

With info overlaid on our vision, "cool" doesn’t equal "useful" or even "safe."

I need your clothes, your boots, and your motorcycle. Because according to my augmented reality scanner, you match my criteria.

For folks in my generation—those born in the late 70s or early 80s—the definition of "virtual reality" is informed by decades of popular entertainment and includes at least a few strong Lawnmower Man images. Virtual reality, as it’s been sold to us by the combined forces of Hollywood and consumer electronics companies, is the experience one gets when one straps on a head-mounted display and slips into a computer-created world. And even though most of the world’s images of VR come from the hilariously terrible first wave of VR popularity in the 1990s, mainstream companies like Oculus are close to actually making VR happen in a way that isn’t inconvenient, overly expensive, or dumb.

But VR has a twin: augmented reality. If VR means slapping on a head-mounted display and waving VR-gloved hands around like a crazy person, augmented reality ("AR") is maybe more immediately useful; it’s most easily defined as the overlaying of extra information onto your perception of the world. Think of what the Terminator sees—a view of the world with extra data popping up all over, giving additional context to the things that you see.

Augmented reality isn’t anything new. Military pilots (and some commercial pilots), for example, use heads-up displays that project information onto a reflective screen directly in front of them, allowing them to see where the horizon is even when it’s cloaked by clouds, darkness, or bad weather. Similar heads-up display technology is becoming common in higher-end automobiles, too: automotive HUDs can mirror a speedometer and tachometer onto your windscreen or even show navigation information.

A military aircraft HUD (in this case, from an <a href="http://arstechnica.com/gadgets/2014/12/mach-2-hair-on-fire-ars-flies-the-navys-fa-18-sim-into-the-danger-zone/">F/A-18F simulator</a>) showing dense symbology.
Lee Hutchinson

But as with so many other technological advances, AR isn’t without its complications and dangers. We humans are highly visual creatures, and the entire mental process by which we perceive the world and make decisions based on visual input is precise and finely tuned. While the idea of "tweaking" the visual field by adding symbology or colors or data sounds like it might be nothing but upside, a very real danger remains. The ultimate sin in perceptual modification is distraction.

The topic is explored in depth by Eric E. Sabelman and Roger Lam in their excellent IEEE Spectrum piece on the dangers of augmented reality, the thesis of which is that when it comes to screwing around with humans’ visual field, less is more—at least where fast thinking and decision-making matter.

Look around you

The diffusion of AR into the consumer space is largely tied to advances in mobile technology. Most people in the developed world now carry smartphones packed with sensors and fast CPUs and GPUs. Indeed, putting aside for a moment dedicated AR devices like Google Glass and Microsoft HoloLens, the smartphone has become an excellent, capable augmented reality platform for some use cases.

Take amateur astronomy, for example. Locating a particular star in the sky at night typically requires the would-be stargazer to have some knowledge of at least one celestial coordinate system in order to know where in the vast bowl of the sky to look. Most commonly, you would use the equatorial coordinate system to trace the star’s right ascension and declination (and you also need to know where the vernal equinox is in order to start at the right place).

An equirectangular plot of stars in the celestial sphere with an apparent magnitude of 5 or brighter, showing declination versus right ascension. This is complicated.

But smartphone apps (like Sky Guide) can greatly simplify this process, taking math out of the equation and transforming stargazing into a purely visual experience. The app takes advantage of the phone’s rich set of sensors and knows exactly where you are and where the phone is being pointed. Aim the phone at the sky and the screen shows you the stars that are directly in your field of view, handily labeled and with the constellations traced in.
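To give a sense of the math such an app hides, here is a minimal Python sketch that turns a star's right ascension and declination into the altitude and azimuth you would actually point a phone at. It is not Sky Guide's implementation, which the article doesn't describe. The function names are hypothetical, the sidereal time formula is a common linear approximation that treats UTC as UT1, and precession, nutation, and atmospheric refraction are all ignored.

```python
import math
from datetime import datetime, timezone

J2000_JD = 2451545.0  # Julian date of 2000-01-01 12:00


def local_sidereal_time_deg(when_utc: datetime, lon_east_deg: float) -> float:
    """Approximate local sidereal time in degrees (needs a timezone-aware datetime)."""
    julian_date = when_utc.timestamp() / 86400.0 + 2440587.5
    days = julian_date - J2000_JD
    # Linear approximation of Greenwich mean sidereal time; higher-order terms dropped.
    gmst = 280.46061837 + 360.98564736629 * days
    return (gmst + lon_east_deg) % 360.0


def alt_az_from_hour_angle(hour_angle_deg: float, dec_deg: float, lat_deg: float):
    """Return (altitude, azimuth) in degrees; azimuth is measured from north through east."""
    h, dec, lat = map(math.radians, (hour_angle_deg, dec_deg, lat_deg))
    sin_alt = math.sin(dec) * math.sin(lat) + math.cos(dec) * math.cos(lat) * math.cos(h)
    alt = math.asin(max(-1.0, min(1.0, sin_alt)))  # clamp to absorb rounding error
    # East and north components of the star's direction on the local horizon plane.
    east = -math.cos(dec) * math.sin(h)
    north = math.sin(dec) * math.cos(lat) - math.cos(dec) * math.sin(lat) * math.cos(h)
    az = math.degrees(math.atan2(east, north)) % 360.0
    return math.degrees(alt), az


def alt_az(ra_deg, dec_deg, lat_deg, lon_east_deg, when_utc):
    """Where to look for a star, given its coordinates, your location, and the time."""
    hour_angle = local_sidereal_time_deg(when_utc, lon_east_deg) - ra_deg
    return alt_az_from_hour_angle(hour_angle, dec_deg, lat_deg)


if __name__ == "__main__":
    # Vega (roughly RA 279.23 deg, Dec +38.78 deg) as seen from an assumed
    # observer at latitude 29.76 N, longitude 95.37 W, right now.
    altitude, azimuth = alt_az(279.23, 38.78, 29.76, -95.37, datetime.now(timezone.utc))
    print(f"altitude {altitude:.1f} deg, azimuth {azimuth:.1f} deg")
```

A negative altitude means the star is below the horizon. A real app would get the latitude, longitude, and pointing direction from the phone's GPS, compass, and motion sensors rather than from hard-coded values.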

Or, for an example with perhaps more day-to-day utility, consider apps like Word Lens that can perform real-time visual translation on printed words in a dizzying variety of places. If you’re lost in a country where you don’t speak the language and you need to read the street signs, for example, you can point your smartphone at them and see the sign’s contents translated to your native language.

Pick up the pace

These kinds of AR applications are neat and practical, and they draw oohs and aahs when you show them off, but they’re also not particularly time-dependent. In other words, they’re not using AR to supplement time-critical motor decisions. That gets a little more complicated.

For gaming, even for fast-twitch gaming, AR can be a good fit. The demos most recently shown off by Microsoft at E3, using HoloLens with Minecraft and other games, are excellent examples of how AR can be an additive component to the gameplay experience. Video games, especially first-person games, have for a long time been a poor man’s version of augmented reality, with interfaces cluttered up with ammo counters and hit point stats and objective markers. To some extent, replicating that kind of symbology in real life seems desirable.

Video games have been doing a version of "augmented reality" forever. Those stats at the bottom? AR!
id Software

But it doesn’t necessarily work out that way. As the Spectrum article explains, some visual triggers can be perceived almost automatically, while some require active thought to understand. An info-rich worldview with colorful high-resolution graphics and text might seem like a desirable thing to have, but the faster you need to react to something, the worse an idea it becomes.

For example, fast forward to a world where AR is cheap and ubiquitous and we’re all wearing smart contact lenses with built-in imaging. When you’re shopping at the grocery store, an AR system that pops up a "FRESH" or "BAD" overlay on fruits and vegetables might be helpful. It might also shade the part of the fruit or veggie that might be bruised or overripe. You’re probably not making quick life-or-death decisions at the grocery store (unless you’re in a hurry to get home and watch season 21 of Orange Is the New Black, anyway). You might look like a zombie, standing there holding two oranges and staring off into space for minutes at a time, but the rich visual overlay could help you finally pick a good piece of fruit.

Things change on the drive home from the grocery store (assuming Google self-driving cars haven’t taken over all transportation duties at this point in our speculative future). A rich visual overlay with the paths of every other vehicle traced out in glowing color, distances to other vehicles highlighted in color-coded numbers, your speed and destination, and a giant pulsing arrow showing your proper direction of travel is almost certainly a very bad idea in this context. Even a light visual augmentation (say, a floating number above every other car showing its distance to you in meters) is unnecessary and distracting.

A driving HUD with a projected path and lots of words is great if you're a robot motorcycle (like this still from <em>Terminator: Salvation</em>), but it turns out that humans need to spend extra time reading the words and <em>thinking</em> about info-dense displays, and this hinders fast decision-making.
Warner Bros. Pictures

Some stats-obsessed geeks might argue that slight visual augmentation like this is helpful, and it might be under slow-moving circumstances, but translating numbers into gut feel and perception isn’t an automatic process. When a car swerves into your lane, you can react to it in less than a second because it triggers impulses that don’t require higher thought. You don’t think, for example, "Oh my, I believe that vehicle has crossed into my lane. Let’s see. I need to lower my left arm by some amount to turn the wheel to the left in order to move the car to the left, and I should also apply some pressure to the brake with my right foot." You just do it, relying on that same combination of reflex and conditioning that lets you jerk your hand away from a hot stove.

Injecting additional cognitive steps into that decision-making process doesn’t help—it hinders, suppressing reflex and getting your conscious mind wrapped up in areas where it shouldn’t necessarily be treading.
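To put rough numbers on that cost, here is a small Python sketch that computes how far a car travels while the driver is still reading an overlay. The speed and the delay values are illustrative assumptions, not measurements from the Spectrum piece, and the function name is hypothetical.

```python
def extra_distance_m(speed_kmh: float, extra_delay_s: float) -> float:
    """Distance in meters covered during an added delay at a constant speed."""
    meters_per_second = speed_kmh * 1000.0 / 3600.0
    return meters_per_second * extra_delay_s


if __name__ == "__main__":
    # Assumed delays for reading and interpreting an overlay; not measured values.
    for delay_s in (0.25, 0.5, 1.0):
        distance = extra_distance_m(100.0, delay_s)
        print(f"{delay_s:.2f} s of extra thinking at 100 km/h = {distance:.1f} m traveled")
```

At 100 km/h a car covers roughly 28 meters every second, so even a fraction of a second spent parsing numbers translates into several car lengths of road.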

This principle is followed with aircraft heads-up displays. Military heads-up displays, through a combination of technological limitations (military HUD symbology is decades old) and human interface research, use an extremely limited set of symbols and focus that information in a very narrow area directly in the middle of the field of view (rather than, say, on a peripheral screen like the one Google Glass uses). Along with that, pilots undergo extensive, exhaustive training on how to incorporate that symbology directly into their decision-making as they fly.

But how many drivers would accept driving with an arcane set of symbology in front of them—symbology that required at minimum several hours of training to fully grok?
