Gesture recognition was supposed to be the way we would all interact with computers someday. But things aren't looking too bright for the futuristic interface, with two significant products recently waving goodbye to the technology.
Nest, the company purchased a year ago by Google for $3.2 billion, was riding high with its $249 Learning Thermostat. Then it introduced the Protect smoke and carbon monoxide detector—with gesture recognition. The idea was to allow owners to shut off false alarms with a wave of a hand, perfect for a gadget that's usually mounted on a ceiling out of reach.
Unfortunately, determining intentionality from a hand wave is tricky (was that a hello, or a brush-off?). The Protect could be shut off unintentionally, leaving owners vulnerable in the event of a fire. So the company recalled the product and took it off the market for roughly two months. Now it's back, minus the gesture recognition.
The other surprise was Microsoft's decision to offer its Xbox One gaming console without the motion-sensing Kinect controller, something the tech giant said it would never do. Without Kinect, the Xbox One is $100 cheaper ($399 versus $499), but it's an unusual change in direction given how well some games work with the system, tracking body movements and using facial recognition.
So why is gesture recognition failing and flailing?
The first time I witnessed gesture recognition was back in 1997, when IBM demonstrated something it called natural computing, a combination of speech recognition and gesture recognition software. A giant globe spun on a screen while a researcher waved his arms to move it and shrink it. It was the inspiration for those scenes of Tom Cruise pinching and squeezing the air in “Minority Report,” and it all seemed very impressive at the time.
The trick, of course, was that the computer had no distractions. It was looking in a controlled area for specific movements and sounds. But that's not very practical in the real world. A wave of an arm could be a warning, a threat, or just a languorous stretch; it all depends on the context, which is difficult for a machine to discern. And while researchers continue to experiment with human-to-machine alternatives to the keyboard and mouse, gestures are turning out to be less convenient than, say, touchscreens, which are, after all, just keyboards in a different form.
One problem is that human gestures are not universal. A particular finger hoisted into the air means one thing in one country and something completely different in another. One person's victory sign is another person's peace sign.
Gestures are also not the easiest way to communicate. Sitting on the couch and waving at your smart TV to change channels may sound like a good idea, until you consider that you may also have a (non-alcoholic) beverage in one hand and a kale-based snack in the other.
In the U.S., we've also been conditioned not to gesture. Didn't your mother always tell you it's not polite to point? So to control a TV, game, or other gadget, one has to learn a new language of hand movements. It often makes me feel like I'm performing some strange cyber-age form of tai chi (not to mention the strange looks everyone gives me).
It's true that Nest and Microsoft are not completely abandoning gesture technology. If Nest finds a software fix for the feature, it can push it to Internet-connected Protect smoke detectors. And Kinect on the Xbox is still a fun feature that works extremely well with exercise and dance titles.
Furthermore, several companies continue to work on the technology for everything from smartphones to cars. Looking for a competitive edge, Intel has been considering gesture tech for tablets (it would certainly be more sanitary). Mercedes-Benz has been playing with the technology to replace all the buttons and controls in cars. But the problems for on-board computers are legion: Is the driver telling the car to turn right, or just scratching his nose? It's much easier, from a technical perspective, to keep a car from crashing into another vehicle than it is for the car to understand human movements.
So the perfect system for humans to communicate with machines continues to elude us, at least for now. Of course, that shouldn't be too surprising. Just think of how difficult it is for us to understand each other.