Friday, April 04, 2008

Writing With Speech

by Marc Zeedar macopinion@designwrite.com


Edited

As a writer, speech recognition software has always fascinated me. It sounds like the Holy Grail of computing: you speak and it types. Unfortunately, the reality has always fallen far short of expectations.

A few years ago I tried MacSpeech's iListen software, but was left extremely unimpressed. I felt a bit deceived -- it was promoted as not needing much training, but even after hours of training it still made far too many errors to be useful. Worse, it required a special microphone which was a hassle to connect and use.

At Macworld this year MacSpeech announced a new product called Dictate which promises to bring Dragon's NaturallySpeaking from the PC to the Macintosh. Supposedly this is a much better speech recognition engine and I would have better results. I was eager to try it so I purchased the upgrade.

The results are impressive. Dictate is definitely much better than iListen. With very little training -- less than five minutes -- and using the built-in microphone of my MacBook laptop, I am able to achieve decent results.

However, decent is not perfect. Not perfect means you have to edit the resulting text by hand. That's not the end of the world, but it seems to me it spoils the point of using dictation software in the first place. The main problem is that the errors Dictate inserts are subtle. For instance you don't have spelling errors, but you might have a missing word, an extra word, an incorrect word, or missing letters such as plurals or ED at the end of a word. Those kinds of errors can be difficult to spot and will probably slip through an editing process more easily than your own typos.

Another problem is that dictation is not speech. There are parts of dictation that are learnable -- such as speaking clearly, speaking at a consistent pace, and speaking punctuation that normally would be implied -- but much more difficult is actually having your thoughts organized in your head enough to be able to speak them.

I've been writing for over 25 years and I don't need to think about it much -- I just write and the words just flow. I also tend to do a lot of rewriting even as I'm writing. I'll write part of the sentence and then correct the sentence while I'm typing it. The slowness of typing actually acts as an editing process. With dictation, I have to think clearly about what I'm trying to say because rewriting is difficult with dictation. It actually feels slower and less natural for me. However, that could just be a learning issue. I'm sure with time I'd get used to dictation and it would be more comfortable.

When it works, Dictate works really well. It's very fast on my MacBook and can keep up with me even when I talk rapidly. It's great to not have to worry about how to spell complicated words. For certain applications and situations I can see Dictate being not only useful and practical but extremely valuable and perhaps better than typing.

Unfortunately, Dictate works best with certain kinds of writing. Perhaps this is just a training issue, but when I attempted to use Dictate to write a short story, it struggled. Names of characters, for example, were not always interpreted correctly. At this early stage of testing, I'm not sure how to add things to Dictate's dictionary. Certain aspects of fiction seem to me less appropriate for dictation software. I also find it difficult to think fiction out in my head before I write it, whereas for an article or e-mail, I do tend to think out my response before I write it.

Things like business correspondence strike me as much more practical for dictation. Perhaps this is because Dictate is tuned this way, but I think it's also because more formal communications are simpler and less ambiguous. For example, in my fiction I was using words like "whirled" -- which Dictate insisted on interpreting as the planet, not the motion. This software is supposed to be sophisticated enough to distinguish ambiguities like homonyms via context, but fiction has a much broader context than something like business correspondence.

Another interesting aspect of dictation is correcting via speech. To me this seems awkward and I had not thought it very important. The current version of Dictate is limited in what you can correct via speech (more so than the PC version). However, as I used Dictate, I realized that being able to correct via speech is very important. It requires a context switch to correct stuff by hand and it's much easier to concentrate on speaking alone. It's similar to switching between the keyboard and the mouse -- it's annoying to have to take your hands off the keyboard and use the mouse for certain functions. Similarly, it's annoying to have to stop speaking and use the keyboard to make corrections that Dictate won't let you do via voice.

Dictate has certain commands like "forget that" that will delete the last phrase interpreted. Unfortunately, sometimes it writes out the words and other times it misinterprets the command. A bigger problem, however, is that because it deletes the entire phrase it makes it hard to correct a single incorrect word at the beginning of the phrase. Sometimes I've tried to correct a single word and then it misinterprets other parts of the phrase when I repeat it. So it's like it fixes one problem only to insert other problems. One solution is to speak in shorter chunks of text, but that in itself causes other problems, because Dictate works much better if you speak in full sentences -- that's because it has more context to work with. Some of these issues I'm sure are just due to my inexperience. Perhaps I'll have less trouble as I get used to the software.

For people who have difficulty typing, either through inability or injury, Dictate is definitely terrific software. It might have flaws, but it's better than not communicating at all. Someone who really needs the software would also be more likely to tweak it to its maximum potential. For myself, I'm not sure exactly how I'll use it.

I'd love it if I could use it to dictate items on my portable digital recorder and have the software type them in for me later, but I'm not sure if that would work. The digital recorder itself adds noise, and if I'm recording something in my car I'd have engine noise and other distractions that would confuse the speech recognition. I'm also not sure how to train Dictate to work with the portable digital recorder. Since the training process is interactive, how can I pre-record a training session?

All that said, I am impressed enough with Dictate that I think I will continue to play with it and see if I can't find a place for it in my workflow. For certain kinds of writing I think it could be really valuable. For instance, how about a daily thought journal? It might be too much work to write it out, but I could speak it.

I dictated this entire article and I'm including both versions -- edited and unedited -- so you can compare them and see what kinds of errors Dictate makes. As you will see, most of the errors are minor, but they still require a fine-tooth comb to catch them all. Whether or not using dictation software is worth the trouble is up to you. But keep in mind I am not using this under ideal conditions: I only trained Dictate for a few minutes, I am an amateur dictator, and I am not using a headset microphone. In that regard, the accuracy is amazing. I'd bet with practice and a better microphone I could get near a hundred percent accuracy. However, I think I would prefer to not fuss with a microphone and put up with less accuracy -- but your preference may be different.

If you're interested in speech recognition software, I would encourage you to check out MacSpeech's Dictate. If you test drove earlier speech recognition software and were frustrated like I was, you should definitely check out Dictate as it is noticeably better.

Unedited

As a writer, speech recognition software has always fascinated me. It sounds like the holy Grail of computing: you speak and it types. Unfortunately, the reality is always fallen far short of expectations.

After years ago I tried MacSpeech is I listen software, but was left extremely unimpressed. I felt a bit deceived -- it was promoted as not meeting much training, but even after hours of training it still made far too many errors to be useful. Worse, it required a special microphone which was a hassle to connect and use.

At MacWorld this year MacSpeech announced a new product called Dictate which promises to bring dragons naturally speaking from the PC to the Macintosh. Supposedly this is a much better speech recognition engine and I would have better results. I was eager to try it so I purchase the upgrade.

The results are impressive. Dictate is definitely much better than I listen. With very little training -- less than five minutes -- and using the built-in microphone of my Mac book laptop, I am able to achieve decent results.

However, decent is not perfect. Not perfect means you have to edit the resulting text by hand. That's not the end of the world, but it seems to me it spoils the point of using dictation software in the first place. The main problem is that the errors dictate inserts are subtle. For instance you don't have spelling errors, but you might have a missing word, an extra word, An incorrect word, or missing letters such as plurals or ED at the end of a word. Those kinds of errors can be difficult to spot and will probably slept through an editing process more easily than your own typos.

Another problem is that dictation is not speech. There are parts of dictation that are learnable -- such as speaking clearly, speaking at a consistent pace, and speaking punctuation that normally would be implied -- but much more difficult is actually having your thoughts organized in your head enough to be able to speak them.

I've been writing for over 25 years and I don't need to think about it much -- I just write and the words just slow. I also tend to do a lot of rewriting even as I'm writing. I'll write part of the sentence and then correct the sentence while I'm typing it. The slowness of typing actually acts as an editing process. With dictation, I have to think clearly about what I'm trying to say because rewriting is difficult with dictation. It actually feels slower and less natural for me. However, that could just be a learning issue. I'm sure with time I get used to dictation and it would be more comfortable.

When it works, Dictate works really well. It's very fast on my Mac book and can keep up with me even when I talk rapidly. It's great to not have to worry about how to spell complicated words. For certain applications and situations I can see Dictate being not only useful and practical but extremely valuable and perhaps better than typing.

Unfortunately, Dictate works best with certain kinds of writing. Perhaps this is just a training issue, but when I attempted to use dictate to write a short story, it struggled. Names of characters, for example, were not always interpreted correctly. At this early stage of testing, I'm not sure how to add things to dictates dictionary. Certain aspects of fiction seemed to me less appropriate for dictation software. I also find it difficult to think fiction out in my head before I write it, whereas for an article or e-mail, they do tend to think out my response before I write it.

Things like business correspondence strike me as much more practical for dictation. Perhaps this is because Dictate is tuned this way, but I think it's also because more formal communications are simpler and less ambiguous. For example, in my fiction I was using words like "world" -- which dictate insisted on interpreting as the planet  not the motion. This software is supposed to be sophisticated enough to distinguish ambiguities like homonyms via context, but fiction has a much broader context than something like business correspondence.

Another interesting aspect of dictation is correcting via speech. To me this seems awkward and I had not thought it very important. The current version of Dictate is limited in what you can correct via speech (more so than the PC version). However, as I use  dictate, I realized that being able to correct via speech is very important. It requires a context switch to correct stuff by hand and it's much easier to concentrate on speaking alone. It's similar to switching between the keyboard and the mouse -- it's annoying to have to take your hands off the keyboard and use the mouse for certain functions. Similarly, it's annoying to have to stop speaking and use the keyboard to make corrections that dictate won't let you do ViaVoice.

Dictate has certain commands like "forget that" that will delete the last phrase interpreted. Unfortunately, sometimes it writes out the words and other times it misinterprets the command. A bigger problem, however, is that because it deletes the entire phrase it makes it hard to correct a single incorrect word at the beginning of the phrase. Sometimes I've tried to correct a single word and then it misinterprets other parts of the phrase when I repeat it. So it's like it fixes one problem only to insert other problems. One solution is to speak in shorter chunks of text, but that in itself causes other problems, because Dictate works much better if you speak in full sentences -- that's because it has more context to work with. Some of these issues I'm sure are just due to my inexperience. Perhaps I'll have less trouble as I get used to the software.

For people who have difficulty typing, either through inability or injury, Dictate is definitely terrific software. It might have flaws, but it's better than not communicating at all. Someone who really needs the software would also be more likely to tweak it to its maximum potential. For myself, I'm not sure exactly how I'll use it.

I do love it if I could use it to dictate items on my portable digital recorder and half the software type them in for me later, but I'm not sure if that would work. The digital recorder itself adds noise, and if I'm recording something in my car I'd have engine noise and other distractions that would confuse the speech recognition. I'm also not sure how to train dictate to work with the portable digital recorder. Since the training process is interactive, how can I pre record a training session?

All that said, I am impressed enough with Dictate that I think I will continue to play with it and see if I can't find a place for it in my workflow. For certain kinds of writing I think it could be really valuable. For instance, how about a daily thought Journal? It might be too much work to ride it out, but I could speak it.

I dictated this entire article and I'm including both versions -- edited and unedited -- so you can compare them and see what kinds of errors dictate makes. As you will see, most of the errors are minor, but they still require a fine tooth comb to catch the mall. Whether or not using dictation software is worth the trouble is up to you. But keep in mind I am not using this under ideal conditions: I only trained dictate  or a few minutes, I am an amateur dictator, and I am not using a headset microphone. In that regard, the accuracy is amazing. I'd bet with practice and a better microphone I could get near hundred percent accuracy. However, I think I would prefer to not fussed with a microphone and put up with less accuracy -- but your preference may be different.

If you're interested in speech recognition software, I would encourage you to check out MacSpeech is Dictate. If you test drove earlier speech recognition software and were frustrated like I was, you should definitely check out Dictate as it is noticeably better.
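
If you would rather count the differences between the two versions than hunt for them by eye, a word-level diff is one way to do it. The following is only a rough sketch using Python's standard difflib module; the function names and the three error labels (incorrect, missing, extra) are my own choices for illustration and have nothing to do with how Dictate itself works. It ignores capitalization and punctuation, so it will not catch every kind of error described above.

```python
import difflib
import re


def words(text):
    # Lowercase the text and keep only word-like tokens, so that
    # punctuation and capitalization differences are ignored.
    return re.findall(r"[a-z0-9']+", text.lower())


def classify_errors(edited, unedited):
    """Count word-level differences between corrected text and raw dictation.

    Returns a (counts, total_words_in_edited) tuple.
    """
    ref, hyp = words(edited), words(unedited)
    counts = {"incorrect": 0, "missing": 0, "extra": 0}
    matcher = difflib.SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        n_ref, n_hyp = i2 - i1, j2 - j1
        if tag == "replace":
            # Pair words one-to-one as substitutions; any leftover words
            # on either side count as missing or extra.
            counts["incorrect"] += min(n_ref, n_hyp)
            counts["missing"] += max(n_ref - n_hyp, 0)
            counts["extra"] += max(n_hyp - n_ref, 0)
        elif tag == "delete":
            counts["missing"] += n_ref
        elif tag == "insert":
            counts["extra"] += n_hyp
    return counts, len(ref)


if __name__ == "__main__":
    # One sentence pair taken from the two versions of this article.
    edited = "I just write and the words just flow."
    unedited = "I just write and the words just slow."
    counts, total = classify_errors(edited, unedited)
    print(counts, "out of", total, "words")
```

Pasting the full edited and unedited texts into the two variables would give an approximate error count for the whole article.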


For Your Amusement
When I accidentally left Dictate running while watching TV, the microphone picked up random sounds and tried to interpret them. When I returned to my laptop, I was greeted with a page of gibberish! It's rather amusing:

Her and her son is a very on their a the ready to upload a new old moon is equipment and you as a bit of a time and a of of a new on a new and a new is is a little new to the the is a war when in a well known bit and make the bee a to a news is wanted is big name in a fit of a with the and

A new and and and and and burden of an and is in or with Clinton in a phone and phone a guy in a said he and his win and a news for you at a now that you've got a sustained, and longing of and all in a new angle acid when they will and I know my-we know that everything and that he's worried that it isn't then I'd say in her name down within the defense to a movement that will and a new and curtains and Clinton in a in SF for a news and bolts up in a new and I'll bet you a bundle that a new page in a new kind of figured you and only you and you alone. There is no return a phone news and the rights and and on him and puts them in a single and on a new is awesome and and only the in a now I had it a good thing to a certain goods as it has been a good phone and see him in a new meaning with him new meaning when one evening when you cease this is not a school all and this is a ruse that in a room room room I needn't worry update their long will the new and then and in him is in a ruling is a there was quote we know I'm all this in his personal in their senior and nice as you seem is stricken in rare in well come listing no subject sooner asked to come" yeah yeah yeah yeah see it that's what I think to I think it's very on a song is certainly yeah yeah yeah yeah yeah you are in a scenery are being sent on a is group as it is in your you you you didn't mean to say that he knew who could know who didn't see it when not let alone this year I I'm useless at your cause is the author of was a great picture with him on a. And you new to you if you often he is


Posted by Charles in • Less Tangible