There is a huge amount of hype surrounding Siri, Apple’s new voice assistant for the iPhone. I’ll leave it to the numerous gadget sites to do a thorough hands-on review of its capabilities. Instead, I’d like to take a historical view of user interfaces and explain why Siri represents a profoundly different and improved capability. And while I am a frequent critic of expert prediction, I’ll do my best to project the ways in which Siri and other advanced interfaces could be used in the future.
The history of human-computer interfaces has been a slow and continual march towards ease of use and accessibility. The earliest electronic computers, emerging in the 1940s, could only be accessed by skilled scientists who programmed them via a series of switches and cables. The 1950s saw the rise of paper punch cards as a means of programming and output for computers. They were still used primarily by a limited population of technicians and operational personnel. The 1960s saw the introduction of teletype machines. These were effectively electronic typewriters hooked to computers. They established a paradigm of typed input and response that is still dominant today. They were, however, highly inflexible, limited to single-line input and printed responses. The 1970s saw the introduction of video display terminals (VDTs), which replaced the teletype as the dominant interface. The video screen had greater flexibility than paper and introduced the concept of fields to guide user input. Ultimately, graphics and color capabilities were added to VDTs, providing an even richer experience.
The introduction of the personal computer in the early 1980s didn’t immediately change the user interface. Most systems of that time held onto the character-based interfaces popularized by the VDTs. However, a major change was brewing. The introduction of PCs meant that a larger, consumer-based audience was now using computers for a broad array of tasks. People wanted a way to use these systems without needing to be technical experts. The character-based command line interface was unintuitive, cumbersome and limiting.
A major leap forward in user interfaces came from two different sources. First, in 1984, Apple introduced the Macintosh. Shortly thereafter, in 1985, Microsoft introduced the initial version of Windows. While the former was a computer and the latter a piece of software, both ushered in an era of user interface that predominates to this day. They each featured a mouse for pointing and clicking, a graphical interface and drag-and-drop capabilities.
The introduction of the original iPhone in 2007 represented the next major leap forward for user interfaces. Although touch screens had seen limited use in ATMs, kiosks and early tablets, there was no widespread acceptance of their value. The iPhone, featuring a new operating system, iOS, quickly changed that. It featured a virtual keyboard and multi-touch gestures, and it introduced the concepts of pinching to zoom and panning to scroll.
Which brings us to Siri, the latest user interface advancement. The concepts behind Siri were originally developed as part of a research program called CALO run by SRI International. CALO stood for Cognitive Assistant that Learns and Organizes. The work was funded by the defense research agency DARPA. In 2008, several entrepreneurs, in conjunction with SRI’s venture unit, looked to commercialize the technology developed under CALO. They initially developed Siri as an app for the iPhone’s iOS operating system. In 2010, Apple acquired Siri and went to work incorporating its features natively into iOS. Then, in October of 2011, Apple released Siri with iOS 5, making it available on the new iPhone 4S.
So what’s all the buzz about Siri? Why do people feel that it is a revolutionary new capability as a user interface? To understand this, some background on Siri is needed. Siri differs from previous interfaces in the following ways:
- Most obviously, it uses voice as a means of input and response.
- It is a natural language interface. It has a basic understanding of grammar and sentence structure that permits a user to speak naturally as opposed to the specific phrase requirements of previous voice interfaces.
- It is context-driven. That is, it understands time, place and intent, so it can carry on a purpose-driven dialogue and accomplish the task the user wants done.
- It is service-oriented. That is, it links to a number of service providers to get information, send a message or conduct a transaction.
Taken together, these features lead to one profound outcome: Siri hides the underlying mechanics of the computer, allowing the user to simply specify the task they are looking to accomplish. All previous interfaces forced the user to understand syntax, command sequences, menu structures and data locations. Siri allows the user to make a request in natural English without needing to understand the underlying complexities the computer must handle to accomplish the task.
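For readers who think in code, here is a toy sketch of the idea described above: a spoken request is matched to an intent, missing details are filled in from context, and the task is handed to a service. This is purely illustrative and is not how Siri is actually built. Every name in it (`Context`, `weather_service`, `message_service`, `INTENTS`, `handle`) is hypothetical, and simple regular expressions stand in for real natural language understanding.

```python
import re
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical stand-in for what the assistant knows about the user."""
    location: str


def weather_service(slots, ctx):
    # Fall back to the user's current location when no place was spoken.
    place = slots.get("place") or ctx.location
    return f"Looking up the weather for {place}"


def message_service(slots, ctx):
    return f"Sending '{slots['body']}' to {slots['recipient']}"


# Each entry pairs a pattern that recognizes an intent with the service
# that fulfils it. Real systems use far richer language models than this.
INTENTS = [
    (re.compile(r"\bweather\b(?: in (?P<place>[\w ]+))?", re.I), weather_service),
    (re.compile(r"\b(?:text|tell) (?P<recipient>\w+) (?:that )?(?P<body>.+)", re.I),
     message_service),
]


def handle(utterance, ctx):
    """Route a plain-English request to the first service whose intent matches."""
    for pattern, service in INTENTS:
        match = pattern.search(utterance)
        if match:
            return service(match.groupdict(), ctx)
    return "Sorry, I did not understand that."


if __name__ == "__main__":
    ctx = Context(location="Boston")
    print(handle("What's the weather like today?", ctx))
    print(handle("Tell Anna that I am running late", ctx))
```

The point of the sketch is that the caller never names a command, a menu or a data location. The request "What's the weather like today?" carries no city, yet the hypothetical `Context` supplies one, which is the kind of gap-filling that separates a context-driven assistant from a rigid voice command system.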
Now that I’m done singing Siri’s praises, let’s take a broader look at its immediate impact as well as its limitations. As a relatively new piece of software, Siri certainly has its issues. Its voice recognition, while good, is still prone to comical errors. It is not an “open” piece of software, with Apple controlling its service partnerships. It is still fairly limited in terms of functionality. It is not a full replacement for touch as the iPhone’s user interface.
But there is certainly reason to believe that Siri represents the future for iPhones and for user interfaces in general. It will unquestionably continue to improve in terms of voice recognition, understanding of context and inclusion of new services. If Apple opens up the interface to third parties, there could be a huge rush of innovative services leveraging Siri’s architecture.
Here are a number of innovative ways that Siri could be leveraged in the future:
- Inclusion with OS X for desktop and laptop users
- Hands-free operation through headsets for vertical markets such as field service personnel, mechanics and emergency responders
- Software could automatically analyze a website and generate code to make it “Siri aware”
- Siri could be a front end to corporate databases and enterprise software platforms
- Integration with automotive entertainment systems
The evolution of user interfaces has been a continual march towards ease of use, allowing an ever broader, non-technical population to interact with computers without needing to know their inner workings. Siri is a bold step towards accomplishing the long-sought-after dream of an electronic personal assistant. I expect that voice and natural language dialogues will continue to play a major role in the user experience for the foreseeable future.