Kinect – What’s the future of gesture recognition in the ten-foot experience?

According to Microsoft, the popularity of Kinect has far exceeded expectations. The company made the following statement in a conference call:

“Kinect, in particular, exceeded our expectation. Kinect is the fastest-selling consumer electronics device in history. It’s just our first step in delivering on the opportunity to fundamentally change the way people interact with technology. Kinect exceeded all expectations with 8 million stand-alone and bundled sensors sold in just 60 days.”

This is an amazing feat, and it shows just how keen consumers are for gesture-based gaming. But how soon will this translate into a desire for gesture-based control of other ten-foot experiences, such as your TV or set-top box interface? And what does this mean for the traditional remote control?


Not so futuristic after all

Whenever I think about the future of gesture-based controls, I get a little excited, simply because it all seems so futuristic. Gesture-based controls turn up in plenty of science fiction films, most famously the crazy interface in Minority Report. But that interface was designed and conceptualised by MIT’s John Underkoffler, and it is closer to reality than we think. Check out Oblong and the amazing G-Speak interface:



TV companies like LG have already been experimenting with alternative remote control devices: their Magic Motion controller is a gesture-based remote that you point at the TV to highlight and select items on screen. It’s not too dissimilar from a Wii controller, though I’m sure it offers more accurate control. Sony have a similar device coming out soon as well.


But this is still reliant on a remote-style device. True gesture control sans remote is not a new concept in TV – in fact, companies have been R&D’ing this for years – but it has yet to reach the mainstream:


Toshiba had previously released a laptop (the Qosmio G55) with gesture control, but that only goes to show how much better suited gesture control is to a ten-foot experience. In 2008, Toshiba demoed their version of gesture control, though it is more a type of control where your hand (or rather, fist) acts like a giant mouse.


Hitachi also demoed some of their great R&D at CES 2009:



Check out this video by Canesta, a company which makes technology that enables human gestures to power devices (interestingly, acquired by Microsoft late last year). They’re behind some of the great work in Kinect, the PlayStation EyeToy accessory and the Hitachi TVs.



And it’s clear early adopters are keen for the same kind of thing. Two developers, John Simons and Joel Griffin Dodd, have recently released free public beta code for KinEmote, which turns a Kinect into a media controller. In the example below it is being used to control Boxee.





Finally free from D-Pad Navigation?

Though I don’t think the Canesta or Hitachi interfaces necessarily represent “good” user experience, they definitely show how different an interface might be when its design is free from the constraints of “D-Pad” remote-style navigation. Traditional D-Pad navigation restricts an interface in many ways, and there are many design considerations and paradigms that work best for these styles of interface.

It will be interesting to see the unique challenges of designing for a gesture-controlled interface. Consider that you’d have to work with a preset vocabulary of gestures, which limits interaction possibilities in much the same way as the programmed buttons on a remote control (and what does this mean for the Red Button?!). It will be even more challenging (and complicated) to design systems that cater to both styles of interaction. That is a likely scenario while consumers transition from one method to the other. More likely still, TV will never rid itself of the basic remote control, and gesture recognition will simply supplement it.


Will the popularity of Kinect bring the concept of gesture-based control into the mainstream? Is it something people are likely to adopt in the near future? I’m starting to think maybe gesture recognition in TV land is not so far-fetched after all.