Tag Archives: hands-free

Voice recognition was done first and best by humans

Back in 2008 I theorized that it would be just a few years before voice commands revolutionized marketing and commerce. Not necessarily for everyone, mind you, but most significantly for people who wouldn’t dream of using a keyboard, or even a smartphone!

My post, Leaping the chasm to a plugged-in construction site, predicted that voice recognition wasn’t far off, and that it would be the only way many professionals could benefit from digital networking and cloud computing, ranging from the “safety glasses and hard hats set,” to offshore oil technicians (were you listening, BP?), and even to surgeons.

One Million Years BC was a very cheesy movie about life before history. Original voice was mostly simple words and grunts. Heavy breathing was also involved -- at least, I'm imagining, by certain audience members.

In the beginning, even before we had a written language with which to record history, our original form of communication was voice. The problem with voice, however, was that once the words were spoken, they were gone forever. HarQen was launched at a time of technology convergence, when original voice can be turned into an asset.

I wrote that as an outsider to the digital voice space. After spending time “inside,” with my friends and co-workers at HarQen, I’ve realized that voice recognition isn’t the only way to make a big difference for these types of phone users. I’ve discovered that you can derive value simply from people talking into their phones and having these snippets turned into sharable assets.

In other words, I hadn’t considered original voice. Original voice can be thought of as voice “captured, stored and shared,” pretty much as-is.

HarQen believes The Original Voice Matters. I recently talked about their view that voice is the “original rich media” at Ungeeked Elite. Here’s a post from last week, on the VoiceScreener blog, that helps to explain why the best voice recognition software still resides between our ears, and how HarQen is using voice asset management to give clients an impressive competitive advantage.

So I was wrong. But I’m even more excited now than I was then. I cannot wait to see what happens when voice asset management is commonly adopted. Although it might not be powered directly by voice recognition, there may be a plugged-in construction site after all, using speech in the way it was used in the days when the only construction sites were in barely habitable caves!

iPhone voice recognition app presages a new mobile interface

A newly launched iPhone application allows Google searches through voice alone. This brings us closer to the day when non-computing types can work and play in a Web 2.0 world. Imagine: If this future comes to pass, the productivity gains in many industries would be huge.

More significant to us marketers, large swaths of the workforce will no longer consider the computing world to be hostile, or at the very least impenetrable. As I speculated two years ago, many workers simply will not make portable computing a habit until it is easy enough to do through speech alone.

You might consider this Part II of a two-part post. Last week I reported on Powerset, Microsoft’s acquisition in semantic search. Now, here is an exciting stride in the voice-recognition half of the hands-free computing equation.

Below is how the New York Times characterized the voice recognition arms race (at least, the race for the juicy prize of mobile search dominance):

Both Yahoo and Microsoft already offer voice services for cellphones. The Microsoft Tellme service returns information in specific categories like directions, maps and movies. Yahoo’s oneSearch with Voice is more flexible but does not appear to be as accurate as Google’s offering. The Google system is far from perfect, and it can return queries that appear as gibberish. Google executives declined to estimate how often the service gets it right, but they said they believed it was easily accurate enough to be useful to people who wanted to avoid tapping out their queries on the iPhone’s touch-screen keyboard.

The service can be used to get restaurant recommendations and driving directions, look up contacts in the iPhone’s address book or just settle arguments in bars. The query “What is the best pizza restaurant in Noe Valley?” returns a list of three restaurants in that San Francisco neighborhood, each with starred reviews from Google users and links to click for phone numbers and directions.

The emphasis above is mine. Here’s a demo of the new Google app for the iPhone:

This is going to get very interesting, very fast.

As Raj Reddy, an artificial intelligence researcher at Carnegie Mellon University, said in the NY Times piece: “Whatever [Google] introduces now, it will greatly increase in accuracy in three or six months.”

The semantic search problem, when solved, will help computers understand what people are saying based on their wording and a phrase’s context. On the other hand, voice recognition requires something at least as daunting: penetrating regional accents. The most visible flaw in this first full week of the iPhone app’s release is that it is baffled by British accents.