When the mouse and the point-and-click interface arrived with the computers and operating systems of the mid-1980s, they drastically changed the way we interacted with technology. Though the command-line interface (CLI) still has its place, today we rely heavily upon the mouse and the point-and-click interface, as we have for the last 30+ years. Over time, the mouse became wireless (making it look a lot less like a “mouse”) and the window systems became more sophisticated. Faster computers and networks made video possible, and cheap, fast mass-storage devices allowed us to hold all our photos, songs and movies. The point-and-click interface, though, has remained largely the same.
About a decade ago, a new interface became available and caught on quickly. The capacitive touch interface, first on smartphones, later on tablets, and now on some laptop computers, has become the dominant interface for those with mobile devices.
Will speech and natural language interaction be the next game-changer, as the mouse was in the 1980s, and touch was 20 years later? Apple, Amazon, Google, and Microsoft all seem to think so. All have introduced products and capabilities that allow speech as input and audio as output, meaning you can talk to your technology and it talks back.
The first major system on the scene was Apple’s Siri, released in the fall of 2011 for use on iPhones. Amazon Echo, Microsoft Cortana and Google Assistant followed over the next several years. While Apple’s Siri was first among the big players, the others each introduced interesting new capabilities. Google could leverage what it knew from your calendar and other tools to bring more value as an assistant. Amazon introduced standalone, stationary devices that could be extended with voice-accessible “skills,” including “smart home” controls. Most of all, each new voice interaction system improved on language interaction to the point where the earliest systems like Siri soon seemed far less impressive.
The Good, the Bad, the Ugly
There are pretty clearly some disadvantages to consider, too. First, talking to technology still looks and feels a little strange. When I pass people on the street carrying on a phone conversation via a Bluetooth earpiece or in-ear headphones, a small part of me can’t help noticing that they look like they are talking to themselves (and how can I be really sure that they aren’t?). And some people have noticed that kids growing up with Alexa in their house or Siri on their devices talk a little loudly and abruptly, even when talking to people. Second, an office full of people talking to their technology may be noisy and disruptive, and may even be error-prone, since each device can pick up its neighbors’ voices. Third, there are some tasks that probably just work better in a point-and-click interface. A voice interface makes it more difficult to use visual information (like a map) and to select specifics from that information (like a street or address). Finally, as a practical matter, the technology is still far from perfect. Voice assistants still misunderstand us a little too often to be a primary interface.
Maybe the next leap forward in user interfaces will come from Augmented Reality (AR) and Virtual Reality (VR). A number of niche vendors are producing interesting AR and VR products, and the big vendors will surely respond with some of their own. That is an interesting blog topic for another day.
What do you think about user interfaces? And what will be the dominant interface of the next 5 years? Please post a comment and let us know what you think.