Augmented Reality is the overlaying of digital information onto the physical environment. Sci-fi has often portrayed A.R. as interactive, floating, transparent computer screens projected into the air – or, in its most extreme form, standing inside an entirely computer-generated world. A.R. in the here and now, however, has never taken off. Remember the Nintendo Power Glove, the Virtual Boy, and the numerous failed PC peripherals and software that attempted to provide you with a more physical interface to your computer?
The 2D human interface we currently use to operate our computers is sufficiently efficient. Past attempts to sell a 3D interface for the PC failed because they never replaced the mouse – an inherently 2D device. Walking down a virtual hallway to find a file is not as fast as pointing and clicking in a file browser.
If the designers of these failed products had been given the task of inventing a successor to the train before cars existed, they would have decided to take the train off its tracks so that you could go in any direction you wanted, but they would not have changed a single aspect of the train itself. It would still be steam powered, and it would still have the turning circle of a small planet.
In their attempts to create a new system for interacting with PCs, these products failed to change the PC itself. Those who tried to replace the mouse failed because they couldn't change the software: MS Windows, MS Office and the rest are designed entirely around mouse input.
Disruptive innovations can only succeed once all other options have been expended. A third-party peripheral manufacturer made a tilt-sensitive PlayStation controller (but with Dual Shock) back in 1996. So why is Sony now saying that the SIXAXIS is such an important aspect of the PS3? The technology was available over 10 years ago.
In the previous (but still mostly current) generation of consoles (Xbox/GameCube/PS2), all three major console manufacturers to some extent failed to live up to the 'next gen' hype, because in the end all they had to sell was better graphics. There was no significant paradigm shift, unlike the PlayStation, Saturn and N64, which all stepped from 2D games into 3D gaming. It is only now that every option has been expended that innovation can come through. To repeat the same mistake of offering only 'better graphics' is to offer nothing new.
The Nintendo Wii represents the first successful '3D' interface to a computer. It is not simply a matter of X, Y & Z; the Wii also understands acceleration, force, tilt and roll. What this gives us is a wide and natural range of gestures for input, something the mouse is unable to express.
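To make that concrete: tilt and roll fall straight out of the gravity vector that a three-axis accelerometer reports. Here is a minimal sketch in Python – not actual Wiimote code; the function and sample values are purely illustrative – of how a controller held still can know its own orientation:

```python
import math

def tilt_from_accelerometer(ax, ay, az):
    """Estimate pitch and roll (in degrees) from a 3-axis accelerometer
    reading in g, assuming the controller is held still enough that
    gravity dominates the measurement."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    return math.degrees(pitch), math.degrees(roll)

# Controller lying flat: gravity is entirely on the Z axis.
print(tilt_from_accelerometer(0.0, 0.0, 1.0))      # -> (0.0, 0.0)
# Controller rolled 45 degrees to the side.
print(tilt_from_accelerometer(0.0, 0.707, 0.707))  # -> (0.0, ~45.0)
```

A mouse can never report this: it only knows how far it has been dragged across a flat surface.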
The PC has not undergone any major transition, retaining the same mouse interface since Xerox PARC in the '70s. We have sat and dreamt about A.R. for years, and now that it's in our laps, we have been so wrapped up in the dreams that we've not noticed what the arrival of this technology means for the much wider computer industry.
In order for computers to evolve to the next generation, they will need to dispense with the mouse. Pen-based and voice input can provide simpler, quicker access, usable by a wider range of people. Both of these interfaces have failed to catch on properly so far because they have been tacked onto a mouse-designed system. The Nintendo Wii did not have to tack Wii controls on top of traditional controls – wireless controllers have existed for ages. Nintendo started from scratch with a fresh interface designed only for the Wiimote. PCs will not evolve until they can do the same. Tablets don't need 'XP Tablet Edition'; they need a 'Tablet OS' before they will ever take off.
The Apple iPhone also represents a disruptive innovation in the market, bringing A.R. to the user. You can interact with your data by touching it; it has made every handset since the invention of the mobile phone suddenly look positively stone-age.
As we've seen with the Wii and the iPhone, new operating systems need to be developed to make A.R. a reality. The PC industry will not move on if companies are not prepared to ditch the mouse fully. Within 10 years' time, the 2D desktop will look as stone-age as using punch cards. Companies who make only half-baked attempts at ditching the mouse will eventually be ditched by consumers.
Having more natural/direct controllers is one thing, but it seems to me that integration of graphics is vital for something to be really called AR, i.e. augmenting a view of the real world with computer graphics. With the Wii and the iPhone, everything you see is still on a separate screen. It's cool, but it doesn't fit the definition of 'augmented reality' in my opinion.
True A.R. will come eventually, I think, but to get there the mouse will have to go first. A screen-in-the-air would be a pain to use if you had to stand there trying to interact with the data with a trackball in your hand instead of reaching out and 'grabbing' the data.
I think that the Wii and the iPhone are A.R., just primitive A.R. The way the Wii lets you swing a golf club in the game – giving you the feeling of a golf club by putting something in your hands that moves in direct correlation with the virtual world – is a form of A.R. imo. Things have a long way to go, but it's all exciting!
the wiimote can act as a mouse. hell, i think it's used as such in most of the wii menus.
Sure, the wiimote can be used as a mouse, but let’s be honest: as a mouse it sucks, it’s far less precise than a real mouse.
> it seems to me that integration of graphics is vital for
> something to be really called AR
Indeed. We need glasses or contact lenses with integrated displays, preferably with depth of field, lest our eyes deteriorate far faster than they already do.
Then we need head/eye tracking. And I mean seriously good tracking. I've been inside the VR cube at KTH in Sweden, and with the tracker it had, it felt like my eyes were on rubber bands: when I moved my head, the display was updated some 100-400 milliseconds later. This causes nausea in most people. We need to be talking microseconds instead of milliseconds here, especially if the display is on contact lenses, because of the saccadic eye movements.
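To put rough numbers on that (back-of-the-envelope figures, not measurements): the error you perceive is roughly tracker latency times angular velocity. A quick Python sketch, assuming a fast head turn of about 300 degrees per second – saccades peak higher still:

```python
def lag_error_degrees(latency_ms, angular_velocity_dps=300.0):
    """How far (in degrees) the scene has fallen behind the eye
    by the time the display catches up."""
    return angular_velocity_dps * (latency_ms / 1000.0)

for latency_ms in (400, 100, 10, 1):
    print(f"{latency_ms:>4} ms lag -> {lag_error_degrees(latency_ms):5.1f} degrees behind")
# 400 ms -> 120.0, 100 ms -> 30.0, 10 ms -> 3.0, 1 ms -> 0.3
```

Even a single millisecond leaves the image a third of a degree behind during fast movement, which is why display-on-the-eye tracking has to be so much better than what the VR cube had.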
It would also be nice to have some kind of tracking of external objects, making computer graphics overlays possible, but I guess that’s even farther down the road.
Integrating most comments in this thread, I conclude that:
-True AR involves both "augmented input" and "augmented output" to achieve augmented interactivity.
-Augmented input could include motion sensing, touch sensing, true voice recognition (understanding of natural speech – think Star Trek computers), eye tracking and other alternatives.
-Augmented output could include an improved desktop metaphor (3D or not), information superimposed on visual reality (glasses or contact lenses with integrated displays, though I think glasses are way cooler :>), speech and other alternatives.
It looks like the current state of AR is more advanced in the input area (motion and touch at least), but as Kroc points out nicely, software (mainly the OS) has to evolve a lot to embrace this kind of interactivity and use it for increased productivity. And as Yamin said, the efficiency of keyboards is difficult to beat in this context, especially in places where other people are working.
In order to truly develop AR, some technological advances have to be made, and a true benefit for users has to derive from them. I think AR will eventually develop; how fast, only time will tell.
I cannot agree that the iPhone or the Wii can be seen as augmenting reality.
True augmented reality will happen when we have technology like that in Gibson's "Neuromancer" – simstims, jacking in – now that is cool.
I burst out laughing when I read the part about the iPhone being an A.R. device because it has a touch screen. What a joke.
Touchscreens have been around for a long damn time… I remember my first encounter with one back in '86. They've been in use on portable devices such as Palm Pilots for quite some time. This is not revolutionary at all, nor does it represent augmented reality.
Comes from a good cup of java (not Java). YMMV
Hmm something like this?
http://www.youtube.com/watch?v=g8Eycccww6k
Or like this
http://fastcompany.com/video/general/perceptivepixel.html (think iPhone on a 100″ screen)
"In order for computers to evolve to the next generation, they will need to dispense with the mouse. Pen-based and voice input can provide simpler, quicker access, usable by a wider range of people."
Let us for a minute assume we perfect pen and voice recognition. Is it really better than what we have now?
I personally type much faster than I write with a pen, and I find it way easier too. I'd also rather not sit there dictating 50 pages just to write an essay.
The mouse is also a very good design. A pen is unquestionably better for graphic art work, and I've used convertible tablet PCs where I can get the benefits of both worlds. Do I find myself more drawn to pen mode? No, because for many tasks I need a keyboard (see above), and the mouse is a much better companion to a keyboard than the pen: it doesn't need to touch the screen, and it is a much quicker switch from keyboard to mouse than from keyboard to pen to screen.
A common example of this is the voice-menu systems over the telephone. It used to be that the voice would say "press 1 for x, press 2 for y, press 3 for z", and you would press a key. Now it's "tell me what you'd like to do", and you have to speak to the machine. Especially at work, it's far more convenient to just let me press numbers than to interrupt everyone around me by answering voice prompts out loud.
By all means, A.R. is getting better, and I welcome any new changes. But they have to actually have a benefit. Off the top of my head, the only thing more efficient than typing would be some kind of direct connection to the brain.
Voice has been slow to arrive because the computer has not yet been able to understand us fluently, and no OS has been developed that does anything more than try to replace the mouse pointer with the voice.
For example, what’s quicker?
Saying "Computer, play the 25 highest rated songs that were last played on Tuesday please",
or saying the clicks out loud – “Start, All Programs, iTunes… File, New Smart Playlist…”
And to reiterate your phone-menu point: current systems do no more than replace clicks or button presses; they are not /real/ speech. In the future you could expect to just say "I'd like to talk to the finance department please" and instantly be taken to the right place, without silly menus.
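The difference is that real speech maps onto a query over your data rather than onto a chain of menu clicks. A toy Python sketch of what that one utterance boils down to (the library and field names here are invented for illustration; this is not any real iTunes API):

```python
from datetime import date

# A toy music library: (title, rating, last_played).
library = [
    ("Song A", 5, date(2007, 3, 6)),
    ("Song B", 3, date(2007, 3, 6)),
    ("Song C", 4, date(2007, 3, 5)),
]

def highest_rated_played_on(tracks, day, limit=25):
    """'Play the 25 highest rated songs that were last played on
    Tuesday' is just a filter plus a sort -- one sentence, no menus."""
    played_that_day = [t for t in tracks if t[2] == day]
    return sorted(played_that_day, key=lambda t: t[1], reverse=True)[:limit]

tuesday = date(2007, 3, 6)  # a Tuesday
for title, rating, _ in highest_rated_played_on(library, tuesday):
    print(rating, title)
```

The hard part is not running the query; it is turning natural speech into it reliably, which is exactly where current voice systems fall down.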
> “Computer, play the 25 highest rated songs that were last
> played on Tuesday please"
It's good that you're polite to your computer, lest it start playing "Step by Step" (by New Kids on the Block) on repeat instead.
> Saying "Computer, play the 25 highest rated songs that were last played on Tuesday please",
> or saying the clicks out loud – "Start, All Programs, iTunes… File, New Smart Playlist…"
What’s actually easier, nicer and maybe even faster is clicking with a mouse through the thing, expecially if you already did the smart playlist.
(I don’t know how it works on iTunes, but on Amarok it would be so).
Voice interfaces? No thanks (see my other comment about it).
3D may be great for demos and games, but let's not forget that human eyes are 2D input devices. Our brains may reconstruct 3D information via a number of cues, but even that reconstruction is only partial (e.g. walls or objects may occlude my vision).
Books, signs and graphs all encode their information in 2D because that is the only way to present all the information at a single glance. Eyes can move about a 2D interface faster than legs can walk about a 3D interface. (Information that is inherently 3D – e.g. architectural models – notwithstanding.)
One qualification on this is that most GUIs include the ability for objects to overlap or stack on top of each other. Good interface design often revolves around how to use these sorts of tricks to hide information when the user doesn't want it (e.g. a menu collapsed, or a window behind another one) but still show it when the user does want it (e.g. the menu expands, etc.). The information is still 2D, but it is a managed 2D with the occlusion and hiding that you might have in a 3D world. I believe the future of user interfaces lies not in 3D but in developing these (2+epsilon)D interfaces to make it easier for the user to navigate information and applications. I for one would like to be able to just think at a menu to have it open, or to switch between the window I'm reading from and the one I'm writing notes in.
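To make the epsilon concrete, here is a toy Python model of that "managed 2D" (all names hypothetical, just an illustration of the idea): everything stays flat, but a managed z-order decides what occludes what, and the "think at a window" action reduces to a single raise operation.

```python
class ManagedDesktop:
    """Toy '(2+epsilon)D' interface: flat windows plus a z-order."""

    def __init__(self):
        self.windows = []  # bottom-to-top: (name, x, y, width, height)

    def open(self, name, x, y, w, h):
        self.windows.append((name, x, y, w, h))

    def visible_at(self, px, py):
        """The epsilon at work: the topmost window containing the
        point is the only one you see there."""
        for name, x, y, w, h in reversed(self.windows):
            if x <= px < x + w and y <= py < y + h:
                return name
        return None

    def raise_window(self, name):
        """What 'thinking at' a window would ultimately trigger."""
        for i, win in enumerate(self.windows):
            if win[0] == name:
                self.windows.append(self.windows.pop(i))
                return

desk = ManagedDesktop()
desk.open("notes", 0, 0, 50, 50)
desk.open("reader", 25, 25, 50, 50)
print(desk.visible_at(30, 30))  # reader -- it occludes notes
desk.raise_window("notes")
print(desk.visible_at(30, 30))  # notes -- same 2D space, new z-order
```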
Neal Stephenson got it right back in '92; if only someone would get around to building the technology. I want my metaverse, damn it.
It’s called SecondLife and it’s boring.
ARQuake – that's great AR for you:
http://wearables.unisa.edu.au/projects/ARQuake/www/
and don't forget the Tinmith backpack:
http://www.tinmith.net/backpack.htm
that's some fun AR
"In order for computers to evolve to the next generation, they will need to dispense with the mouse. Pen-based and voice input can provide simpler, quicker access, usable by a wider range of people."
Apart from causing the odd wrist syndrome, the mouse is a damn good usability concept. It is easy to implement as a control on the computer side, and it is extremely logical to use.
Voice interfaces are actually much worse. Do you imagine a train/plane full of people speaking to their laptops? An office full of people speaking to their desktops? What if the PC responds to other talk in the room instead of yours? You can use some clever filtering/recognition, OK: but what if it responds to *you* talking with someone else? Do I have to begin every sentence with some "My Computer please" header? What about voice strain? And what would be the advantage of such a mess, coupled with the increased development difficulty and costs?
On a side note, touchscreen interfaces are also subpar. See http://catb.org/jargon/html/G/gorilla-arm.html in the Jargon File for an explanation.
I agree about the gorilla-arm thing; our arms are much better adapted to performing small movements while resting on or being supported by a desk. But I think touch recognition can bring some enhanced interactivity. Maybe a better alternative would be some kind of big touchpad integrating the keyboard and mouse roles, or something similar to a laser keyboard that tracks finger position and gestures.
–“Voice interfaces are actually much worse. Do you imagine a train/plane full of people speaking to their laptops? An office full of people speaking to their desktops?”–
Not really different from people talking to each other, if the technology is implemented properly. I would also want a tactile trigger to let the computer know I was talking to it and not to a flight attendant, but if you do it right, speaking conversationally to anything would be pretty close to ideal. Not because it's efficient, of course, but because it's easily taught to humans and puts the work firmly back on the robots, where it belongs.
that's the one part of this opinion piece i agree with.
hell, i sometimes wonder if a good tablet os may even make for a good pc os, as i sometimes think the desktop metaphor has become way too literal…
mind you, more often than not the underlying os does not have to change; it's just the gui that has to.
linux, windows, mac/bsd – it can all be used; one just needs to rethink the gui, the part that the user interacts with on a daily basis.
on that note, i find it interesting that the tablet/umpc part of vista now comes with a special gui for launching apps and some other stuff. more like their htpc gui than a desktop gui, iirc.
hell, even the htpc gui works better because it's more focused. movies? all found under the movies area. pictures sorted where you expect them to be, and so on.
much more focused than the desktop gui with its general-purpose file manager.
sure, xp and vista have included folders that are specifically there to house specific kinds of files. but in vista, are they auto-updated search folders?
hmm, i need to hammer my gui thoughts out into a coherent whole: a linux kernel (or maybe something else, but linux is an ok start), a database-assisted file system, a file/object/action-centered gui with a plugin/kpart-style backend, and dropping overlapping windows in favor of something that tiles windows automatically for those times when one needs to move or compare stuff.
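the auto-updated search folder part is easy to sketch, at least: treat the folder as a saved query over a metadata database instead of a directory. a toy python/sqlite example (schema and names invented for illustration, not how vista or any real system does it):

```python
import sqlite3

# A database-assisted 'file system': files are rows of metadata.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT, kind TEXT, modified TEXT)")
db.executemany("INSERT INTO files VALUES (?, ?, ?)", [
    ("/home/me/trip.jpg",   "picture",  "2007-03-01"),
    ("/home/me/report.odt", "document", "2007-03-07"),
    ("/home/me/beach.jpg",  "picture",  "2007-03-05"),
])

# An auto-updated search folder is just a saved query: it can never
# go stale, because it is re-run every time the folder is opened.
SMART_FOLDERS = {
    "Pictures": "SELECT path FROM files WHERE kind = 'picture'",
}

def open_folder(name):
    return [row[0] for row in db.execute(SMART_FOLDERS[name])]

print(open_folder("Pictures"))  # two pictures
db.execute("INSERT INTO files VALUES ('/home/me/new.jpg', 'picture', '2007-03-08')")
print(open_folder("Pictures"))  # three -- no manual sorting needed
```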
That would be an interesting and handy approach. I think Linux is a good place to experiment with new GUI concepts. The problem is finding the right people to face such a challenge!
well, i may have to learn to program to ever see it make any headway beyond a few drawings and ideas…
I know – "if you want something done, do it yourself" – but you know, ideas have power. The best programmer in the world can't create anything really brilliant without innovative ideas. Look at Microsoft: lots of money and good programmers, but a lack of new ideas. Big projects need a lot of people working on them: programmers, designers, testers… and people thinking about goals, objectives and so on. So not being a programmer isn't so bad :>)
Ideas need to be spread to be useful, so communicate them. Maybe you'll start a huge project, maybe a small one, maybe you'll inspire somebody, or just start a nice discussion like the one in this thread – which is, by itself, a good thing. Bottom line: speaking/writing is gratis :>)
that it is, but i fear some corp will grab my ideas and run off with them before i get any kind of project going. and then sue my ass off later on…
Right. So, for your project, you need lawyers. Lots of them. A few programmers and designers won't hurt either, but mainly lawyers.
Good luck :>)
While I agree with the article that for AR to be useful it needs a proper input device, I fail to see which input device could be used:
– a wiimote: it’s not very precise.
- voice input: not reliable, especially in noisy environments, and annoying for bystanders.
- pen, touch: those usually tire the hand fast.
Plus the presentation device is quite tricky: many displays will induce sea-sickness in users.
Frankly, I think some form of sign language will be the lingua franca of the AR era.
And voice and gesture only seem less efficient because people typically don't create and use new words and movements to communicate. Yet we make new words all the time when programming a computer by keyboard. It could even be argued that creating new words is at the heart of what programming is.
The biggest technical reason AR is not quite here yet is that latency is still an enormous problem. The system just doesn't respond to human action fast enough. Head trackers are too slow, translating complicated movement takes too long, and Immersion's 3D stylus jumps and stutters when moved too quickly (at least the last time I used it). Even video over FireWire has a very observable delay.
Until that nut is cracked, AR will not have really arrived.
I'm reminded of the Star Trek movie where Scotty picks up the mouse and yells into it.
In an office, voice tech could never work – too many people – but at home, from the couch, it could be cool.
Cool UIs don't really bring productivity unless you can connect easily to data. Past and present point to Apple being unwilling to roll with corporate environments.