In short, Amazon is building the operating system of the home – its name is Alexa – and it has all of the qualities of an operating system you might expect:
- All kinds of hardware manufacturers are lining up to build Alexa-enabled devices, and will inevitably compete with each other to improve quality and lower prices.
- Even more devices and appliances are plugging into Alexa’s easy-to-use and flexible framework, creating the conditions for a moat: appliances are a lot more expensive than software, and much longer lasting, which means everyone who buys something that works with Alexa is much less likely to switch.
It’s definitely an interesting case to make – and Ben Thompson does it well – but I still have a very, very hard time seeing voice-driven interfaces as anything but a gimmick at this point in time. Every point I made about this subject in the summer of 2016 still stands today – limited functionality, terrible speech recognition, inability to deal with dialects and accents, and the complete and utter lack of support for people who live multilingual lives.
I can’t hammer this last point home often enough: not a single one of the voice-driven interfaces we have today – Alexa, Siri, Google Now, Google Assistant, Cortana, whatever – supports multilingual use. Some of them may allow you to go deep into a menu structure to change the input language (while some, like smartwatches, even require a full wipe and reset), but that’s not a solution to the problem of switching language, sometimes even several times a minute, something multilingual people have to do dozens of times every day. And again – there are literally hundreds of millions of people who lead multilingual lives.
Heck, Alexa is only available in English and German!
If voice-driven interfaces are really as important as people make them out to be, they’ve got at least a decade of development ahead of them before they become actually useful and usable for the vast majority of the world.
I have no idea why multilingual use is shunned. Being an English-speaking American, it doesn’t affect me. But Google Translate seems to work well, so if Google can do it, the brainiacs that Amazon employs/outsources should be able to sort it out.
I recently got an Echo for Xmas. I like it; it is convenient. The app is really nice for seeing a history of what Alexa did, or was asked to do.
Personally, I don’t think I’m going to dive into the whole home automation thing. At least not the part where I’m speaking to Alexa to turn on room lights. I did turn off the voice ordering feature. I don’t want my kids ordering 200 pounds of Nestlé chocolate bars because they think it’s funny.
I mainly use the streaming music feature and a recipe skill, and ask it the weather and sports scores.
I personally have an Alexa and, of course, Siri, and I understand they’re both ‘English speakers’, so I need to speak to them in English. This works fine, so it’s really not a huge deal.
I don’t get what your issue is – you speak English to those who only speak English – why can’t you speak English/German/whatever to a machine which only speaks that language?
Anyway, there are at least 250 million consumers (not just people, but people who buy stuff) in the US who are going to be able to use an English-only voice interface. That is a huge market that will drive these innovations, much as the English-speaking internet drove internet innovations.
PS. English is my 5th language.
Try asking Siri to play a song with an English title in your local language. Funny things ensue, but never the correct action.
If I set it to English:
“Siri, send a message to my friend and tell her ‘ik ben rond zes uur bij jullie; moeten we nog boodschappen doen voor het eten?’” (the Dutch roughly means “I’ll be at your place around six; do we still need to get groceries for dinner?”)
Siri doesn’t know what the fuck is going on.
If I set it to Dutch:
“Siri, stuur een e-mail naar David ‘Hi David, I’ll be on a weekend trip, so can you take care of OSAlert for the next few days? Thanks, and say hi to Beth & the kids!’” (the Dutch part, “stuur een e-mail naar David”, means “send an e-mail to David”)
Siri doesn’t know what the fuck is going on.
Whatever I set it to – English or Dutch – it will be useless to me for about 50% of my daily computer activity. And if you think this is a rare kind of thing – it isn’t. Hundreds of millions of people live multilingual lives like this, and technology has no idea how to handle this stuff.
Pretty much what you said: most of my e-mails are in English, my SMS messages are in Finnish, my WhatsApp messages are in both languages (sometimes the messages themselves contain both languages), and since Finland itself is bilingual in the first place (Finnish and Swedish), that brings its own issues too!
Just setting a voice assistant to a single language isn’t going to fly; it’s a complete non-starter for someone who lives a multilingual life.
Wow. You don’t want much from voice recognition, do you?
I’ve been playing with voice recognition since OS/2 Warp 4 (which, for 1996, was pretty good).
But the examples you give are, to put it mildly, incredibly difficult to parse. Oh, once it’s in text, sure, Google Translate parsed out your first message to:
Siri, send a message to my friend and tell her “I’m about six hours with you; we have to go shopping for dinner?”
Which is (mostly) accurate, I assume– so obviously, being able to translate a multilingual sentence isn’t outside our current technology. Translating a spoken version of that to text, however, has some pitfalls.
When Siri hears “ik ben rond”, how does it know that’s Dutch? The sound is very close to “ich bin rohn”, which might mean “I am raw” in German.
I’m sure that to you, as a native speaker of Dutch, the two phrases sound wildly dissimilar, but to a computer that has to handle a wide range of variations, it’s going to be a tossup.
The best you could hope for would be:
“Siri, send a message to my friend and tell her in Dutch, “ik ben rond zes ….”.
That might be doable with today’s technology. But auto-recognizing language switching on the fly, with the various dialects and pronunciations? We’re still a number of years from that.
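To make the “once it’s in text, it’s tractable” point concrete, here is a deliberately naive sketch in Python. It has nothing to do with how Siri, Alexa or Google actually work, and the tiny word lists are invented for illustration; it just shows that text-side language identification is the easier half of the problem, and that a code-switched sentence muddies even that.

```python
# Toy illustration (not how any real assistant works): once speech is already
# text, even a crude word-list scorer can separate Dutch from German or
# English. The hard part debated above happens earlier, when the recognizer
# has to pick an acoustic model before any words exist at all.
# The word lists below are hand-picked for this sketch, not real data.

COMMON_WORDS = {
    "nl": {"ik", "ben", "rond", "zes", "uur", "bij", "jullie", "we", "nog"},
    "de": {"ich", "bin", "rund", "sechs", "uhr", "bei", "euch", "wir", "noch"},
    "en": {"i", "am", "around", "six", "at", "your", "we", "still", "tell", "her"},
}

def guess_language(phrase: str) -> dict:
    """Score a phrase per language: fraction of its words found in each list."""
    words = phrase.lower().split()
    return {
        lang: round(sum(w in vocab for w in words) / len(words), 2)
        for lang, vocab in COMMON_WORDS.items()
    }

if __name__ == "__main__":
    # Pure Dutch: a clear winner once it is text.
    print(guess_language("ik ben rond zes uur bij jullie"))
    # Code-switched: the evidence splits across languages, which is exactly
    # the case multilingual users hit every day.
    print(guess_language("tell her ik ben rond zes uur bij jullie"))
```

The scores are exactly the kind of “maybe” answer mentioned further down in this thread: a real recognizer has to act on soft evidence like this before a single word has even been transcribed.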
This has been one of my complaints about voice recognition for twenty years– By the time you put in the various trimmings to make the computer understand exactly what you want, it would have been faster to type it.
Then again, I can hit 80+ WPM on a keyboard– Especially if it’s a good mechanical keyboard like laptops and tablets don’t have.
“ik ben rond” sounds like “ick ben rrond”
“ich bin rohn” sounds like “ihgh bin rroan”
They sound as different as “background” and “bank ground” do.
However, if you type both phrases into google translate, and have it speak them in German and Dutch, it’s close enough that an untrained voice recognition program could have a fit.
Just the variation in English within the USA between, say, Boston, Alabama and Texas could give a computer a fit.
While I haven’t been there, I’ve been assured by multiple people from Germany that there is a noticeable difference in pronunciation between regions.
All of this has to be accounted for by software that’s used to either “true” or “false”– “maybe” isn’t well understood by computers.
Relax Thom, we only recently achieved wide support for multilingual smartphone keyboards, give it some time, it’s not like it’s the 21st century. Oh, wait…
My *human* assistant wouldn’t be able to do the things you just suggested are required of machine assistants, because she doesn’t speak those languages (she speaks 3, but we only have English in common).
That does not make my human assistant useless, very much the contrary. Are you really going to tell me that if you had the option of an English-speaking human assistant or no assistant at all, you would rather choose not having one?
It’s the same with machine assistants. I walk around my house all day asking Siri to do simple stuff, and she/it does it without any difficulty, and I can only imagine Siri/Alexa/Google/whatever will improve rapidly.
The only way I would use voice activation is if I were incapacitated and in bed. From the bedside I could ask to turn on the light or the TV, or issue commands to raise/lower the bed mattress or the temperature in the room.
And there is nothing that I need so far that requires me to purchase an IoT item.
I have seen fridges with IoT panels. These panels use software, and that software has a maintenance contract. Who benefits? My existing fridge lasted 20 years before we decided to add a second fridge for more storage.
“and the complete and utter lack of support for people who live multilingual lives.”
Silicon Valley is still very localist.
Localist? I suppose that’s one way to put it. I’d say that it’s market-driven, and most of the US market doesn’t give a fig about bilingual support. http://cis.org/record-one-in-five-us-residents-speaks-language-othe… That sounds like a lot, but it actually means that 80% of US residents DON’T speak another language at home, and of the 20% that do, a large chunk are capable of speaking English well enough to use Alexa or something similar.
Of course, that doesn’t apply to the rest of the world outside the US, and Amazon is most definitely a global company. Expect to see language support expand and evolve quite rapidly if Alexa continues to grow and doesn’t become just another passing fad in the electronics world. As has been pointed out, most of the software already exists in things like Google Translate. I’d imagine multilingual support could be implemented in a matter of a few months IF the demand for it becomes apparent. I’d wager it’s already on the “coming features” list.
Being a multilingual speaker with a medium-strong dialect actually makes monolingual dialect-sensitive voice interfaces more familiar: I’m constantly choosing language and dialect based on who (or what) I’m talking to anyways.
It’s so ingrained that I even do it to animals when I pet-sit them.
For me, the much bigger issue is features and recall: I don’t think I’ve ever gotten a useful response out of Alexa if I didn’t already know she could answer it. As such, I almost only use her to turn the bedroom lights on and off.
Meanwhile, I’ll ask Google anything (half the time in semi-natural language), because even if I don’t get a structured answer, I’ll get useful web results.
Full disclosure: Until recently, I worked on improving recall for Google.
I’m Dutch and don’t really care whether or not my native tongue survives much longer. I also don’t care that there are languages that have more native speakers than English. I don’t care that there are languages that have a more consistent grammatical structure, and fewer exceptions. I am fluent in German and read French, Latin and some Greek but really don’t see the need anymore.
English is now the language of commerce, science and popular culture, and it is good enough. We can import the missing words like Schadenfreude and let other languages fade away. Let’s teach everyone the world over how to speak English and give everyone a chance to participate on an equal footing, rather than teach our machines to perpetuate these historical curiosities.
Ah well. Not in my lifetime. But a man can dream.
Interesting proposal. Even the dialects, variations and nuances of English itself are blending together faster, I’d say mainly from communicating with others on the internet. For instance, I hear more UK English slang and sayings in American English all the time.
Disregarding the cultural pride aspect of an individual language, I’d like to hear a debate on setting a one world language, to learn what both sides would present as arguments for and against.
Universal Galactic Standard.
We’re talking some Tower-of-Babel-type blasphemy, but hey I totally am in with you on this dream of universal communication.
Also, I moved to NL from the USA, so I’ve been getting more perspective on Thom’s words for the past year.
Yes you can, but I don’t see what you have to gain from everyone speaking only one language, over a situation where everyone speaks their native, local language and English as a lingua franca.
That said, having seven billion people natively speak the same language is unattainable. Even in the US you see different accents emerging, and in 1,000 years’ time those will be different languages, resulting in a situation where everyone speaks the “standard” English language plus their local one, comparable to the current situation in many (if not all) European and Asian countries.
Obviously what I advocate is to put the lingua franca first and to have ‘mother tongue’ as an optional cultural enrichment program. And I personally don’t mind Australian, South African, Singlish, Ebonics or Minnesotan. I think that stuff is charming. Many concerned parents do, however.
The most popular programming language in the world is Java.
I suggest we abolish all other programming languages and just use Java, since it’s the most popular. Let’s teach everyone the world over how to program in Java and give everyone a chance to participate on an equal footing, rather than teach our programmers to perpetuate these historical curiosities.
See how idiotic that sounds?
I’m sure there are a lot of Java programmers out there who think this sounds GREAT!
Luckily for us, Oracle is working really hard to make sure that never happens.
Yeah, lets drop English too and all start speaking Java, that would be interesting!
I don’t, really, but what I do see is that you feel you need to resort to an analogy. That wouldn’t be necessary if there were a self-evident truth in the point of view you’re trying to present.
Let’s briefly entertain this deeply flawed comparison, however. I posit that all children born north of the Danube river should learn Assembler before anything else. But all children born to its south must instead begin their programming in BASIC. If your parents happen to live west of the Atlantic ocean, your first coding steps will be in Java. Move another ocean further west and your cultural roots will mandate the use of C++. At least until you get to the Middle East, where you will start to encounter nomadic tribes that still predominantly favor LISP. Special blessings to the seed of Oceania, on whom is bestowed the honor of keeping the Cobol tradition alive and well.
Come on. That is not idiocy. That is beautiful diversity. It should be preserved forever.
But it IS a self-evident truth. I can do things and express things in Dutch that I cannot express in English, and vice versa. Or I can express things in Dutch with only one or a few words that would require several pages in English – and vice versa. Same for the other languages I’m familiar with, like German and French.
It’s exactly the same as programming languages, which is why the analogy is so fucking apt. It’s easier to do certain things in C# than it is to do in Obj-C, and vice versa.
Language is the expression of culture, and only those who believe culture is worthless would advocate we all just use one and the same language (setting aside the fact that’s literally impossible due to how quickly language speakers diverge from the set norm). Languages also affect the development and working of the brain, something we don’t fully understand just yet.
The amount of knowledge that would disappear if we all just started using English and forgot all the other languages is staggering. Even proposing something like that is so incredibly idiotic I can barely grasp that people who have the ability to use a computer would propose such a thing.
I think your argument is a very personal and emotional one, which is certainly understandable given your occupation as a translator. And that is reflected in both the vagueness in describing the actual value of what you’re trying to preserve, and your use of hyperbole.
I think it’s more that you just lack experience with different languages and different cultures, and therefore presume that they are all the same as yours.
You state that we should all just learn English, but that makes no sense – since Chinese is a lot more popular than English, we should clearly all learn Chinese. That would be most efficient – it would mean the fewest number of learners and the largest number of people who already speak it.
So, when are you learning Chinese?
Yeah, I already mentioned how I feel about the popularity argument in my initial post. Its appearance sadly seems inevitable. So again: Commerce, science and popular culture. But there is certainly an opportunity in learning Chinese at this point in time, particularly for a Westerner, and as such it is actually one of 3 languages that my children are focusing on. After English, of course, and followed by some German as well. I wish it weren’t necessary but I am a practical person, not an ideologue.
Chinese, German… Maybe you don’t even like poetry. I’m OK with you following that path. Sorry if I feel a little sad for you.
TRANSLATION doesn’t exist. It is a Wish. A Hope. We only have AVATARS.
Sometimes, to convey the message to another Culture, you have to deliberately IGNORE the words.
When you get a new Language, you also get a new Soul.
http://i.imgur.com/7cZNPyb.jpg
Google Translate couldn’t!
[neither translates well to my charset!]
Beerfloat,
In many ways I would technically be right, and yet many people would take offense to this happening for personal reasons. It’s not hyperbole, it’s that we each value different things. Others will feel a loss even if I do not. For what it’s worth, I personally value diversity in and of itself and believe the loss of diversity is both dangerous and shortsighted.
OK, now the absolute main contender for the Ig Nobel Prize in Literature has been revealed. Hope you make good use of your sure prize. :p
Just read this article: http://www.thebookoflife.org/why-germans-can-say-things-no-one-else…
Hi,
The link in the article to the actual news item is broken. I think it was supposed to be:
https://stratechery.com/2017/amazons-operating-system/
– Brendan
I finally succumbed to making my home a smart one.
Frankly, it all revolves around Alexa and I don’t buy any device that isn’t compatible.
Lights, Harmony, Thermostat, all compatible.
The “problem” is that each set of devices seems to need its own hub. If the next Echo can become that hub, then manufacturers can work towards that standard rather than increasing the cost of entry to using their tech.
——-
A simple example of the setup I have: when I leave the house for work (GPS on the phone), it turns the heaters down, turns off the TV and other entertainment devices, and plays some music on Spotify for the dog.
The words “Alexa, turn on tv” set my lights, stop any music and switch on the relevant devices with the correct outputs.
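As a rough illustration of the trigger-to-routine pattern described above, here is a small Python sketch. The device functions are invented placeholders, not any vendor’s real API; the actual setup goes through each vendor’s hub, skill or app rather than local calls like these.

```python
# Rough sketch of the trigger -> routine pattern. The device functions are
# purely hypothetical placeholders, not Alexa/Harmony/Hue/Spotify APIs.

def set_thermostat(celsius: float) -> None:
    print(f"thermostat -> {celsius} C")

def switch_tv(on: bool) -> None:
    print(f"tv -> {'on' if on else 'off'}")

def set_lights(scene: str) -> None:
    print(f"lights -> scene '{scene}'")

def play_music(playlist: str) -> None:
    print(f"music -> playing '{playlist}'")

def stop_music() -> None:
    print("music -> stopped")

# Each trigger maps to a list of actions, so adding a routine is just data.
SCENES = {
    "leave_home": [  # e.g. fired by a phone geofence event
        lambda: set_thermostat(16.0),
        lambda: switch_tv(False),
        lambda: play_music("calm playlist for the dog"),
    ],
    "turn_on_tv": [  # e.g. fired by the voice command "Alexa, turn on tv"
        lambda: stop_music(),
        lambda: set_lights("movie"),
        lambda: switch_tv(True),
    ],
}

def run_scene(name: str) -> None:
    for action in SCENES[name]:
        action()

if __name__ == "__main__":
    run_scene("leave_home")
    run_scene("turn_on_tv")
```

The appeal of a single hub is that the triggers and actions all live in one place like this, instead of being spread across five vendor apps.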
These cases may not seem like much and are frankly quite trivial in my setup, but then again, I have visited my mother before, who hadn’t watched TV for a week because she accidentally pressed a button on one of the 5 remotes and was unable to make things work again.
With only a little more advancement this tech could really help those who struggle with technology or are simply physically unable. The arthritis in my mother’s hands is unlikely to get better, but her voice is as clear as ever.
edit: sorry, wrong thread. Never mind.
If YOU, Thom, were to create a product, are you telling me that the FIRST version you shipped wouldn’t be made in the language you think would be easiest for you to create and support? If you try to tell me otherwise, I would say that you are lying.
Nobody goes for the most complicated version the first time. You have to get it working. And getting it working ALWAYS means the easiest version you can make that will either make enough money, or get enough people using your product that you feel confident you will ultimately be making money. To say otherwise is a fool’s game.
Later, once you get the simple version into production, only THEN do you tackle the second-easiest version, the one that is just a little more complicated, such as a different variant of your language. In the case of “English” speakers, which usually means “American English” speakers, the next versions would be the variants of English with the next biggest markets. For American English speakers that would be UK English, then maybe Canadian and Australian English, unless you would make bigger profits with a German, French, Chinese or Japanese version and you’ve been able to hire enough people who speak/read/write that language fluently.
Once you hit all the low-hanging fruit, ONLY THEN do you take on the languages with smaller numbers of speakers. Somewhere down the line you will EVENTUALLY support multilingual speakers.
Anything else would be a lie.
By the way, you don’t like that these devices are American English first and bi/multilingual much later? That is a business opportunity for people like you: you and people you know could get together and build your own device, starting out with multilingual support. If there are enough people who speak your combination of languages, there should be enough money in it to make it worthwhile.
Why are you so angry for me pointing out voice-driven interfaces need a lot more work before they can become as widespread and important as Silicon Valley makes them out to be?
I don’t understand why you direct so much anger towards me just for stating something obvious that we all know. Care to explain what about stating the obvious makes you so angry?
You are reading anger that isn’t there. And you are stating exactly what I stated. The software takes time to mature. And the fact that it is starting out speaking only one language, or one language at a time is absolutely logical. Anything else is illogical.
Regardless whether they are able to sort out the language shortcomings, these sort of devices will never be introduced into my home. I have just barely escaped vendor lock-in on the computer – I’m certainly not going to introduce it into every appliance in my house!
If I ever want to do home automation (which is possible, after the security situation is cleaned up), I will only use open standards, and preferably open-source software. The system will not “learn” my “habits” and do things I don’t instruct it to do. And it will definitely not “phone home” about what happens in the house!
Pro-Competition,
If only there were more people like you, but unfortunately there aren’t and it severely limits our choices. Even if 2% of us staunchly reject these things, it’s hardly going to catch the eyes of manufacturers. It bothers me to the core that the “internet of things” is evolving to be proprietary and vendor locked, yet so long as there aren’t substantial numbers of consumers protesting it, that’s what they’ll deliver. Everything gets tethered to proprietary corporate systems for the purposes of controlling us, monetizing us, and vendor lock. Like you, I’m absolutely sick of it and yet there’s very little I can do to stop it.
I wanted a wifi-controlled thermostat, but so many of them can’t even be controlled locally. You have to connect to a proprietary service over the internet (like Google’s “Nest”) in order to control your own thermostat in your own home. <insert expletives here> This is so ef’n moronic, just like the print subsystem on Android that won’t print without sending the data to Google. Engineers at Google should have been fired over such stupid engineering; obviously they had orders to do it this way so that users would be more dependent on Google.
BTW I ended up buying a “radio thermostat”. The device still connects to their proprietary servers, but at least it has a local REST API that I can use instead. The official app is a bit buggy. The local physical interface is kind of “meh”. The device lacks a local web-based thermostat interface, which is disappointing, but you can write your own with the API. The local API has zero authentication, which is dumb. It hasn’t been updated in many years. Not a great review, but it’s what was left after not compromising on it being locally programmable.
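For anyone wondering what “write your own with the API” looks like in practice, here is a minimal Python sketch against a local thermostat endpoint. The /tstat path and the temp/t_heat field names follow the Radio Thermostat documentation as I recall it, and the IP address is made up, so treat all of it as an assumption and check your model’s API guide.

```python
# Minimal sketch of local thermostat control over HTTP, assuming the device
# exposes a /tstat JSON endpoint (roughly how Radio Thermostat units document
# theirs). Field names and the address are assumptions; verify for your model.

import requests

THERMOSTAT_IP = "192.168.1.50"  # hypothetical local address
BASE_URL = f"http://{THERMOSTAT_IP}/tstat"

def read_state() -> dict:
    """Fetch the current state (temperature, mode, setpoints) as JSON."""
    resp = requests.get(BASE_URL, timeout=5)
    resp.raise_for_status()
    return resp.json()

def set_heat_target(fahrenheit: float) -> None:
    """Set the heating setpoint. Note: no authentication, as lamented above."""
    resp = requests.post(BASE_URL, json={"t_heat": fahrenheit}, timeout=5)
    resp.raise_for_status()

if __name__ == "__main__":
    print("current temperature:", read_state().get("temp"))
    set_heat_target(68)
```

The nice thing about a local API like this is that it keeps working even if the vendor’s cloud service goes away.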
I considered building something with a Raspberry Pi, but the existing thermostat was broken and I didn’t have the time or inclination to start from scratch. I have low hopes for the future as manufacturers continue to drop self-sufficiency.
This is all a cunning plan to make us all speak English !!
Song in title comes to mind
Random example:
http://www.theregister.co.uk/2017/01/07/tv_anchor_says_alexa_buy_me…
I put them and YouTube down to a global conspiracy to bring illiteracy back. Why train to absorb information fast from text when you can do it verrry slooowly from a talking head?
I much prefer to read my information too. But talking heads are useful if I am working in the kitchen or driving, or in any other situation where I cannot divert my attention to reading.
It is also fun to see the TV weather reporter standing outside in a parka in the middle of a snowstorm.
Or paddling in “deep” waters…
https://www.youtube.com/watch?v=cgm3_jzcNm4
Alexa is in 3rd grade. Why no 20 languages, Alexa?
She’ll grow up.