Tuesday, 29 October 2013

The Future of User Input (aka Gesture Is Dead)

I travel a lot for my job. Occasionally I find myself travelling with Sayyed Shah (friend, fellow developer and long-time co-worker since day 1 at Neutrino), who in his kindness gives me a lift most of the way home.
On one such journey, our conversation drifted to User Input, and it served as the catalyst for this post. Sayyed deserves as much credit as me for eliciting the thoughts in this article from my head.



Gesture, as it is now, is essentially a movement syntax overlaid on a touch screen. In its first generation, this syntax was essentially mouse input translated into a single-finger language (there are others, but they're more crass...) - simplistic, but ultimately effective.
Move on from there and we have a multi-touch expansion. It is this that should mark the first nail in the coffin of gesture; consider your hand, both of them if you wish, and then consider how many of your multi-touch gestures involve more than just 2 touch points. Multi-touch brought a vocabulary expansion, for the most part, of x2.  This is probably because, for most devices, placing all 10 fingers on the screen generally means you can't see what's under them (or at least not in any usable way).
So what does this mean for touch based gesture? Ultimately it is a constrained "language", unless the devices get bigger...
Now.. a large touch screen... Hmm say... 32 inch screen.  That would mean that you can use your whole hand... Both hands even, but mobility may have a tendency to suffer...

OK, we'll suffer the mobility hit - we have a massive screen, with our fingers tapping, swiping (smudging, ewww.. how'd chocolate get on my finger? Damn muffins)..... How long do you use your device for in one session? Personally, I've spent 10 minutes so far writing this on my Desire HD, and my thumbs are starting to develop slightly numb spots on the areas that are constantly hitting the screen. Touch is not a particularly comfy technology - glass is hard and somewhat unforgiving on the impact front. I wonder how long it will be before joint-related issues in fingers are linked to smartphone and tablet use through constantly tapping a solid surface - at least a keyboard has springs.

So back to our immobile, 32 inch touch screen.  It's kinda bright.  So bright in fact that I think I need to step back a little. Oh but wait... Now I can't reach the damn thing!
I know... What we need now is

#ta-dah!#

In steps motion tracking hardware such as Kinect, or the PlayStation alternative if you are an MS hater, or the Wii if you're one of "those"; in steps gesture MK III.

No one wants to look like a knob.

With gesture 3 we can be at a distance from the device and still interact with it. The vocabulary is expanded again, properly this time, using your whole body - imagine the syntax you could create with such a range of input! You can wave at your device (be it a telly, Xbox or whatever) and it will react in a favourable manner (hopefully), and there lies the second and, in my view, final nail in the gesture coffin (it's a concept.. how big a coffin does it need?) - be it in the office, your house, bedroom or the bus / train, no-one wants to look like a knob.
Yes, Samsung, I can outstretch my arm, pinch, waggle and whatever else to watch a recording from last week, but will I feel comfy while doing it? In short, no, I won't.
Minority Report has a lot to answer for. It's what all this gesture business is trying to emulate, but it is missing 1 key element: the Minority Report gestures relied on there being no physical surface to interact with. It was also fairly dim on the brightness front, in a room with no source of natural light if memory serves (only seen it the once).
So the pet hate aside, what would gesture need to be feasible?
I would like to see a gesture 3.5 that brought in a redeeming feature; I don't want to flap my arm to change a channel (and possibly guide a plane to landing on a carrier deck), I want a minimalist, nonchalant flick of a wrist or finger. I want discretion.

"Nobody move! I want to watch this!"

The problem now is that the syntax system needs to be really sensitive and flexible at the same time. It needs to fit in with the individual's preference for a relaxed gesture "accent".
But all this is now. Gesture is "dead". We need to look forward to the next input medium.

Voice? Because that worked well, didn't it?... OK, so maybe now applications like Siri and the Android equivalents are better at speech recognition, but there is still too much variance in language and accent to make it reliable. This also brings us back to discretion. Do you (if you have it) sit there in public talking to Siri? Maybe when you first got it you sat around in the office doing it for a joke, maybe in the car with no-one else present, but on the train? Voice is not discreet enough for mainstream uptake. Even Iron Man only seems to talk to Jarvis when he's alone. And he's Iron Man..!
Even wearable tech like Google Glass, with the tapping on the frame... It's too obvious for it to fit comfortably into people's lives. Smartphone touch is discreet: you can hide away, check email, text.... blog even, and all you are doing is exerting minimal movement to achieve it.

So what's next then, smart-arse?
Fringe.

No not the T.V. show, though the concept shares similarities.   There are 4 levels of progression for tech in my view:

  • Redundant
  • Accepted
  • Emerging
  • Fringe

Redundant is old hat, replaced by the next thing that people are willing to accept. Acceptance takes time, and as acceptance grows the tech emerges to the point of mass acceptance.
Fringe is the stuff that makes people give you a funny look when you mention it. Fringe is the stuff that sci-fi grabs onto and takes to its logical worst case scenario and then throws into popular culture, which in turn alienates it more.
I'm thinking...

Duh-duh-da-da-da-duh..

Mind control. Or at least the concept that brain patterns can be measured and then used to influence input.

The infamous "brain read/write" device

Now the first thing you probably thought was either:
  • Invasive surgery for technology implantation, or
  • Lots of red and blue wires woven into a cap that looks anything but discreet, attached to ribbon cables and a BBC Micro.
OK, maybe not the BBC Micro bit. Unless you're Greg.

The invasive surgery response is sci-fi pop culture kicking in... You don't need to crack skulls to measure brain activity.

The geeky wires response is currently valid and is also part of the fringe effect - there is nothing that shows the capacity for minimal impact.  There is also the sci-fi effect, though I think to a lesser extent.


(I guess you could argue that this might place gesture MK3 in the fringe category, but I'm not so sure. Perhaps hardware vendors have prematurely forced full gesture onto the public on the grounds that "if we make out it is normal, people will accept it as the norm". As people may have heard me quote from Timothy Lister / Tom DeMarco: "constant reassertion of a statement does not make it fact". Perhaps that is for a different post on technology for profit instead of suitability...)

One thing Google Glass may be showing us is that technology invites us to wear things we don't need to wear (i.e. glasses). If the bulk of current brain measurement devices could be shrunk to be less obvious elements of glasses arms... Would that be acceptably discreet? If so, all of a sudden Brain Influenced Input doesn't seem so impractical. All you need then is the software to interpret and you can't "see" software at all....
As a first pass, the software might be as simplistic as a learning remote or a macro system of sorts, using crude and distinct patterns as triggers, but as the technology evolves, I dare say that it would be possible to distinguish between the mental image of a cat and that of a dog as our capacity to read / interpret the signals that the brain generates increases.
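
To make the "learning remote" idea a little more concrete, here is a rough Python sketch of how such a first pass might hang together. Everything in it is an assumption on my part: the LearningRemote class, the command names and the three-number "readings" are all made up for illustration, and no real brain measurement hardware or library is involved. It simply records a few sample patterns per command, averages them into a template, and fires a command only when a new reading looks similar enough to one of those templates.

```python
import math


class LearningRemote:
    """Toy 'learning remote' (hypothetical): maps crude, distinct signal
    patterns to commands. Readings are plain lists of numbers standing in
    for whatever a real sensor might produce."""

    def __init__(self, threshold=0.9):
        # Minimum similarity (0..1) before a reading triggers a command.
        self.threshold = threshold
        self.templates = {}  # command name -> averaged pattern

    def learn(self, command, samples):
        """Store the element-wise average of the samples as the template."""
        count = len(samples)
        self.templates[command] = [sum(col) / count for col in zip(*samples)]

    @staticmethod
    def _similarity(a, b):
        """Cosine similarity between two equal-length patterns."""
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def recognise(self, reading):
        """Return the best-matching command, or None if nothing is close enough."""
        best, best_score = None, 0.0
        for command, template in self.templates.items():
            score = self._similarity(reading, template)
            if score > best_score:
                best, best_score = command, score
        return best if best_score >= self.threshold else None


# Made-up training data: two deliberately distinct patterns.
remote = LearningRemote()
remote.learn("channel_up", [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]])
remote.learn("channel_down", [[0.0, 0.1, 1.0], [0.1, 0.2, 0.9]])

print(remote.recognise([0.95, 0.15, 0.05]))  # close to the "channel_up" template
print(remote.recognise([0.5, 0.5, 0.5]))     # ambiguous, so no command fires
```

The threshold is the interesting knob here: set it high and the remote ignores anything ambiguous rather than guessing, which is probably what you want from an input you can't see yourself making.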

The discretion of thought is of course unlimited - it is the perfect input medium, and coupled with other emerging tech like Google Glass as the display / input proxy to the device...



This is my vision of the future. Non-invasive, discreet, functional, practical. 4 qualities you can probably apply to all your favourite tech that you are using now and have in your home, tech that 50 years ago would have generated funny looks if you had suggested it.