SXSWi: The Future of Touch User Interface Design

Session: Sunday 13th March 2011
Presenters: Amish Patel (@amishpatel) and Kay Hofmeester (@kayhof) from Microsoft

Hashtag: #futureoftouch

Get the slides and videos, courtesy of @amishpatel:

I’ll admit I went into this session wondering whether it would be another portfolio slide deck or company plug. My fears were soon put to rest by this well-thought-out presentation from Microsoft. I was pleasantly surprised.


This session focused on the language of touch and where we are today, with some insights into what we need to do next to evolve it further. Here are my notes (and videos) on the session:
The history of input languages

Touch is a language that still needs to be developed, so they ask: how do we shape this language so that it lets us see into the future?
Patel and Hofmeester take us through their study of the history of language:

We’ve been creating languages for technology for decades. They compare how input, output and users have changed:
We started with the command line, which involved text commands: This is a single channel input, single channel output, and involves a single user.
This evolved into the GUI era, with a combination of text and an on-screen pointer: This is a single channel input, multichannel output, but still involves a single user.
And now we look at touch, which is a direct touch input, with multichannel output, and is potentially multiuser.
Patel and Hofmeester ask: What happens when you take an input method and map it to an old language? They believe that when you pair a new input method with the old language, it just doesn’t work.
The rather humorous example they use is a video of a person using the Power Glove:

Touch needs its own language.

How do languages develop?

Input languages develop in stages. But what are the stages, what do we have to think about?

They propose there are “three stages of media”

  1. New technology is introduced – at this level, it’s typically a prototype, it’s not fully developed
  2. Copy old language – at this stage, we tend to copy or mimic an old language
  3. New language is developed – eventually, a new language is developed for the media.

Example: Film language development
1877: The Horse in Motion uses film to show how horses run. It’s still very much experimental: the use of a new technology.
1895: They began to capture performance with film, and project it elsewhere, essentially, copying the old language of theatre into another theatre.
1941: Citizen Kane and other movies. It was here that we began to see film techniques that are still in use today: the camera started to move around, they went outside the theatre, and a language for film developed.

Consider the stages with input methods
The Mouse:
1968: Douglas Engelbart publicly demonstrated the mouse he had invented, and at this stage, it was still a technical experiment.
1977: The mouse began to be adopted, but it was originally used with the command line interface – using the old language with a new input method.
It wasn’t until the Xerox PARC Alto came out with the first GUI that the pointer was introduced, and in 1984 the interface really started to become available to the general public.

It’s interesting to note that touch has been around for a long time.
In 1972, the PLATO IV was a computer terminal with a touch screen.

Nothing much happened with touch for a while, as the mouse took off instead. It wasn’t until about ten years ago that touch became more widely available, mostly in kiosks and POS systems. Funnily enough, it’s the same technology as in the 1972 screen.

So what stage is touch in?

Patel and Hofmeester believe that touch is in stage 2. And here’s why:

  • Touch is still emulating the mouse, as a touch is still a single point of contact with an x y coordinate.
  • Touch is still emulating the hardware keyboard.
  • Touch is still using the paradigm of scrolling and scroll bars.

Why? Because everything we know is still a desktop GUI concept. Scrolling was invented for the GUI, and we are still using it. We pan through long pages, and slowly find things. Is this a touch concept or a GUI concept? And where do we go from here?

Some interesting things leading the way:
Combining gestures with keyboard:

Copying real life

They mention that perhaps flipping a page is the solution to the scrolling problem, but question whether this is still just mapping an idea from real life. If you copy real life, they point out, you are limiting your new input method to real life.

We are still trying to get to stage three.

To get to stage three:

  1. Body aware
  2. Multitouch
  3. Multimodal

Body aware
Touch has many aspects, and one of them is posture: how are you approaching the device? Hofmeester stresses the need to take human posture into account. Our language today doesn’t account for most human factors, such as the angle of the elbow or the direction of a swipe. He uses the example of left-handed users, for whom certain touch movements are awkward. The device doesn’t know you’re left-handed, so it doesn’t adapt.

Patel and Hofmeester question: Why are we converting a finger into a tiny x y coordinate? The language needs to understand that I’m touching, rather than just converting my touch into the existing mouse/keyboard paradigm. The computer is typically clueless about the form factor of the human hand, and when it doesn’t respond in the way that we want, the user becomes frustrated.
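
To make the "tiny x y coordinate" point concrete, here is a small Python sketch of my own (it was not shown in the session). It assumes a hypothetical Contact type that describes a finger as an ellipse, and checks which targets sit under the whole finger rather than under a single point. All names and numbers are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical model of a finger contact as an ellipse, not a point."""
    x: float          # centre of the contact patch
    y: float
    major: float      # semi-axis along the finger, in pixels
    minor: float      # semi-axis across the finger, in pixels
    angle_deg: float  # rotation of the major axis from the x axis


def covers(c: Contact, px: float, py: float) -> bool:
    """True if the point (px, py) lies inside the contact ellipse."""
    t = math.radians(c.angle_deg)
    dx, dy = px - c.x, py - c.y
    # Rotate the point into the ellipse's own frame.
    u = dx * math.cos(t) + dy * math.sin(t)
    v = -dx * math.sin(t) + dy * math.cos(t)
    return (u / c.major) ** 2 + (v / c.minor) ** 2 <= 1.0


def targets_under(c: Contact, targets: dict) -> list:
    """Names of targets whose centre falls under the finger."""
    return [name for name, (tx, ty) in targets.items() if covers(c, tx, ty)]


finger = Contact(x=100, y=100, major=12, minor=8, angle_deg=0)
buttons = {"ok": (109, 100), "cancel": (100, 110), "help": (200, 200)}
print(targets_under(finger, buttons))
```

A system that only kept the centre point (100, 100) would hit neither button. With the ellipse, the same touch location covers a different button depending on how the finger is angled, which is exactly the kind of information a mouse-style coordinate throws away.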

We also have to think about how the system works with multitouch. What’s interesting is that we need to drop the concept of a single focus: with multitouch, there are now multiple areas of focus.

Some input methods are better at some things, and others are better at other things.
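
As a rough illustration of what "multiple areas of focus" could mean in practice, here is a minimal sketch of my own, not anything the presenters showed. It assumes each finger gets an id from the system, and the hypothetical FocusMap class keeps one focus per finger instead of one focus for the whole screen.

```python
class FocusMap:
    """Hypothetical per-touch focus tracker: one focused target per finger."""

    def __init__(self):
        self._focus = {}  # touch id -> name of the target it landed on

    def touch_down(self, touch_id, target):
        # Each new finger claims its own focus; others are left untouched.
        self._focus[touch_id] = target

    def touch_up(self, touch_id):
        # Lifting a finger releases only that finger's focus.
        self._focus.pop(touch_id, None)

    def route(self, touch_id, event):
        """Send an event to whatever this particular finger is focused on."""
        target = self._focus.get(touch_id)
        return (target, event) if target is not None else None

    def focused(self):
        return set(self._focus.values())


focus = FocusMap()
focus.touch_down(1, "slider")   # left hand holds a slider
focus.touch_down(2, "canvas")   # right hand draws on the canvas
print(focus.focused())
print(focus.route(1, "move"))
```

Both the slider and the canvas are "in focus" at the same time, and a move from finger 1 never reaches the canvas. In a single-focus GUI model, the second touch would have stolen focus from the first.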

Every input method is best for something, and worst for something else - Bill Buxton

Patel and Hofmeester propose that we should look at combining methods of interaction, and provide some examples:

Touch and pen

It’s very hard to draw a straight line using touch – it’s just part of the form factor of the human body. It’s hard to be very precise with touch, but with a tool, you can become very precise.

Touch and speech

Touch is very good at pointing at things. Speech is good for issuing commands. There could be some interesting ways to combine these methods, such as selecting something by touch, and issuing a command using speech.
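
Here is one way the "select by touch, command by speech" idea could be sketched. This is my own toy example under stated assumptions: the Selection type, the fuse function and the two-second window are all hypothetical, and a real system would get its input from an actual touch stack and speech recogniser.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Selection:
    """What the user last touched, and when (seconds)."""
    target: str
    time: float


def fuse(selection: Optional[Selection], command: str, spoken_at: float,
         window: float = 2.0) -> Optional[Tuple[str, str]]:
    """Pair a spoken command with a touch selection made shortly before it.

    Returns (command, target), or None if nothing was selected or the
    selection is too old to plausibly belong to this command.
    """
    if selection is None:
        return None
    delay = spoken_at - selection.time
    if delay < 0 or delay > window:
        return None
    return (command, selection.target)


touched = Selection(target="photo_3", time=10.0)
print(fuse(touched, "delete", spoken_at=11.2))
print(fuse(touched, "delete", spoken_at=15.0))
```

The first call pairs "delete" with the photo that was touched 1.2 seconds earlier; the second returns None because the selection is stale. The time window is the design choice worth arguing about: too short and the command gets dropped, too long and it lands on something the user has stopped thinking about.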

Touch and air
On this point, they discuss the issue of proximity – computers only realize you are there when you touch the mouse or screen. Potentially, you can use air and gestures for a computer to sense your proximity, and behave accordingly.

Is touch the end of the yellow brick road? Or is it really the natural user interface that we are aiming for?
A natural user interface involves:

  • Touch, voice, air movement gestures
  • Sensor input
  • Multichannel in and out
  • Multiuser

From audience questions:

On affordance

There is an issue of learning the language – the problem of affordance. With touch and air, commands are hidden. As designers, we need to provide feedback, and we have to have some kind of teaching mechanism.
This is a very broad issue and topic – something they are working on – but they see it as the critical mass.

On interacting with content and not the device

The goal is to be interacting directly with the content, the information, not the devices. But input methods are a fact. They will always be there. If we define the language right, other people won’t know it’s there.

Overall this was a very insightful session that highlighted in my mind the fact that the current touch interface designs are just the beginning, and as an industry, we need to explore and define the language to take it into stage 3.