[Audience note: The audience is a national professional association of medical office administrators. Each has a basic working knowledge of PCs, but little or no familiarity with the technology behind voice recognition software, with its limitations, or with the various products on the market.]


Voice Recognition Software: Comparison and Recommendations


Use of voice recognition software is under consideration by medical office administrators nationally. Administrators have long searched for alternatives to the expense, error rate, and record-completion delays associated with conventional transcription. It is no wonder that, with the recent advances in voice recognition software, medical transcriptionists are looking at this emerging technology as a powerful way of accomplishing essential record-keeping tasks.

This report investigates four of the leading voice recognition applications to determine whether this technology has become a practical option and which application is the best choice. To make this report and any further study of the software easier to follow, an introduction to voice recognition technology comes first.

Introduction to Voice Recognition Technology

Several different voice recognition products currently exist in the marketplace, and viable choices are greater in number than they were only a few years ago. Rapid changes have been fueled by the ever-increasing power and plummeting prices of desktop systems. Though room for improvement still exists, accuracy has advanced tremendously in a stunningly short time.

Brief history. The first software-only dictation product for PCs, Dragon Systems' DragonDictate for Windows 1.0, was released in 1994 and used discrete speech recognition technology. Discrete speech is a slow, unnatural means of dictation that requires a pause after every word [11]. Two years later, IBM introduced the first continuous speech recognition software, MedSpeak/Radiology. Continuous speech technology allows users to speak naturally and conversationally, relieving much of the tedium of discrete speech dictation [11]. These early systems often had five-figure price tags and required very expensive PCs.

Dragon Systems made an enormous stride in June 1997, when it released NaturallySpeaking, the first general-purpose continuous speech software program. Much more affordable than earlier programs, it brought continuous speech recognition to a much wider range of users. Two months later, IBM released its competing continuous speech software, ViaVoice [10].

Stringent demands. Much is demanded of speech recognition programs. Accuracy is critical, and speed is essential to any effective program. Added to these challenges is the enormous variance that exists among individual human speech patterns, pitch, rate, and inflection. These variations are an extraordinary test of the flexibility of any program. Voice recognition follows these steps:

  1. Spoken words enter a microphone.

  2. Audio is processed by the computer's sound card.

  3. The software discriminates between lower-frequency vowels and higher-frequency consonants and compares the results with phonemes, the smallest building blocks of speech. The software then compares results to groups of phonemes, and then to actual words, determining the most likely match.

  4. Contextual information is processed at the same time to predict more accurately which words are most likely to come next, for example, to choose correctly among homonyms such as merry, marry, and Mary.

  5. Selected words are arranged in the most probable sentence combinations.

  6. The sentence is transferred to a word processing application [11].
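The contextual step (step 4) can be illustrated with a very small sketch. This is a toy example only, not a description of how any of the products reviewed here actually work: the word-pair counts are invented for illustration, and the function name pick_candidate is hypothetical. The idea is simply that, among several sound-alike candidates, the program prefers the word that most often follows the word just dictated.

```python
# Toy illustration of step 4: choosing among homonyms by context.
# The word-pair counts below are invented for this example; a real
# product would derive such statistics from very large text samples.
BIGRAM_COUNTS = {
    ("will", "marry"): 40,
    ("will", "merry"): 1,
    ("will", "mary"): 3,
    ("a", "merry"): 25,
    ("a", "mary"): 2,
    ("named", "mary"): 30,
}

HOMONYMS = ["merry", "marry", "Mary"]


def pick_candidate(previous_word, candidates):
    """Return the candidate seen most often after previous_word."""
    prev = previous_word.lower()
    return max(
        candidates,
        key=lambda word: BIGRAM_COUNTS.get((prev, word.lower()), 0),
    )


print(pick_candidate("will", HOMONYMS))   # marry
print(pick_candidate("a", HOMONYMS))      # merry
print(pick_candidate("named", HOMONYMS))  # Mary
```

With these made-up counts, the same three sound-alike words resolve differently depending on the preceding word, which is the kind of decision the contextual step has to make for every word dictated.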

Power devourers. With all of the complex selections and tremendous flexibility demanded of voice recognition software, it is small wonder that considerable computer muscle is required to run these programs. To take fullest advantage of current speech recognition programs, a PC with a minimum of a 300 MHz Pentium II processor is recommended. A separate 16-bit SoundBlaster-compatible card is also advisable, because the sound cards that are bundled as part of a PC's motherboard can produce inferior results with voice recognition software [4].

Realistic reminders. The technology has advanced impressively over the last year, with programs variously offering smarter speech recognition engines, larger active vocabularies, integration with the most popular word-processing programs, and improved accuracy. This report sorts through these programs to find the most accurate one and the best value available, and determines whether the accuracy is acceptable at this time [4]. It is essential to remember the following:

Requirements for the Purchase of Voice Recognition Software

Based upon stated preferences and system specifications, the following conditions have been established:

Points of Comparison

The different voice recognition software programs compared are Dragon Systems' NaturallySpeaking 3.0 Preferred Edition, IBM ViaVoice 98 Executive, L&H Voice Xpress Plus, and Philips FreeSpeech98. Discussion of Dragon Systems' NaturallySpeaking will also include its Medical Suite.

Eight categories of comparison are used to evaluate these competing programs: (1) accuracy; (2) minimum system requirements; (3) capacity to manage a specialized medical vocabulary and medical records; (4) integration with Microsoft Word; (5) ease and speed of installation, customization, and use; (6) industry ratings and awards; (7) inclusion of microphones; and (8) cost.

Accuracy. Accuracy is the single most significant consideration; without it, the program is useless. Dragon Systems' NaturallySpeaking 3.0 scored highest on all of the accuracy tests performed by PC Magazine and was unequivocally selected as the Editors' Choice. In their tests, the average accuracy was 91% and at times was considerably greater [1].

Average accuracy for L&H Voice Xpress was 87% [2]. Accuracy for IBM's ViaVoice tested at 85% [14], and Philips FreeSpeech98 was 80% [15].

At first glance, these percentages, particularly the top two, may not seem significantly different. Consider, however, that for every 1,000 words, an accuracy rate of 87% means that 130 words must be corrected. An accuracy rate of 91% represents an average of 90 errors per 1,000 words, while an 80% rate means that 200 out of every 1,000 words must be corrected.
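The arithmetic behind these figures can be checked with a short calculation. The sketch below uses the accuracy percentages reported above for the four programs; the function name corrections_per_thousand is simply a label chosen for this illustration.

```python
def corrections_per_thousand(accuracy_percent, words=1000):
    """Expected number of misrecognized words in a passage of `words` words."""
    return round(words * (100 - accuracy_percent) / 100)


# Average accuracy figures as reported in the reviews cited in this section.
REPORTED_ACCURACY = {
    "Dragon NaturallySpeaking": 91,
    "L&H Voice Xpress": 87,
    "IBM ViaVoice": 85,
    "Philips FreeSpeech98": 80,
}

for product, accuracy in REPORTED_ACCURACY.items():
    errors = corrections_per_thousand(accuracy)
    print(f"{product}: {errors} corrections per 1,000 words")
```

Run as written, this reports 90, 130, 150, and 200 corrections per 1,000 words respectively, so a four-point difference in accuracy between the top two programs means 40 fewer corrections in every 1,000 words dictated.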

Thousands of words are dictated daily in this practice. Time is scarce and precious. Medicolegal conditions mandate that records must be exhaustively thorough and accurate. Under these rigorous circumstances, with every percentage point counting heavily, Dragon Systems' NaturallySpeaking yields the highest accuracy.

Minimum system requirements. All four programs run on Pentium-powered PCs running Windows 95, 98, or NT 4.0 and require 16-bit SoundBlaster-compatible sound cards. Random access memory (RAM) requirements are higher for all of these programs when they are run under Windows NT [5].

Table 1. Comparison of Minimum System Requirements

Software       CPU                   RAM     Hard Disk Space   L2 Cache
Dragon         Pentium/133 MHz       32 MB   180 MB            none
IBM ViaVoice   Pentium/166 MHz-MMX   32 MB   180 MB            256 KB
L&H            Pentium/166 MHz-MMX   40 MB   130 MB            none
Philips        Pentium/166 MHz       32 MB   150 MB            none

It is important to recall that, as noted earlier, significantly greater system resources than these minimums are recommended to optimize performance. Given the existing system's sufficient resources, none of these programs should present a problem.
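For readers who want to check a particular PC against the minimums in Table 1, the comparison can be sketched as follows. The requirement figures come from Table 1; the two example machines (office_pc and older_pc) are hypothetical, since this report does not list the specifications of any actual system, and comparing clock speed alone is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    cpu_mhz: int
    needs_mmx: bool
    ram_mb: int
    disk_mb: int
    l2_cache_kb: int  # 0 means no L2 cache is required


@dataclass
class Machine:
    cpu_mhz: int
    has_mmx: bool
    ram_mb: int
    free_disk_mb: int
    l2_cache_kb: int


# Minimum requirements as listed in Table 1.
MINIMUMS = {
    "Dragon": Requirements(133, False, 32, 180, 0),
    "IBM ViaVoice": Requirements(166, True, 32, 180, 256),
    "L&H": Requirements(166, True, 40, 130, 0),
    "Philips": Requirements(166, False, 32, 150, 0),
}


def meets(machine, req):
    """True if the machine satisfies every minimum requirement."""
    return (
        machine.cpu_mhz >= req.cpu_mhz
        and (machine.has_mmx or not req.needs_mmx)
        and machine.ram_mb >= req.ram_mb
        and machine.free_disk_mb >= req.disk_mb
        and machine.l2_cache_kb >= req.l2_cache_kb
    )


def supported_programs(machine):
    """Names of the programs whose minimums the machine meets."""
    return [name for name, req in MINIMUMS.items() if meets(machine, req)]


# Hypothetical machines, for illustration only.
office_pc = Machine(cpu_mhz=300, has_mmx=True, ram_mb=64,
                    free_disk_mb=500, l2_cache_kb=512)
older_pc = Machine(cpu_mhz=133, has_mmx=False, ram_mb=32,
                   free_disk_mb=200, l2_cache_kb=0)

print(supported_programs(office_pc))
print(supported_programs(older_pc))
```

Under these assumptions, the hypothetical 300 MHz machine meets the minimums for all four programs, while the hypothetical 133 MHz machine meets only Dragon's. Meeting a minimum does not guarantee good performance; the recommended configuration described earlier is considerably more powerful.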

Capacity to manage a customizable, specialized medical vocabulary. Medicine in general, and each medical specialty in particular, has its own complex, specialized vocabulary.

Two of the four companies offer a product that provides medical terminology. IBM's emergency room and radiology add-on software is not applicable to the dictation needs of obstetric and gynecologic practices, for example. Dragon Systems' NaturallySpeaking Medical Suite offers the same voice recognition technology as the previously mentioned NaturallySpeaking Preferred Edition, with the addition of extensive customizable medical terminology that can be tailored to other specialty practices.

Integration with Microsoft Word. All four programs integrate with Word97 and can therefore be used with existing word processing software [5].

Ease and speed of installation, customization and use. Each of the four programs uses "wizards" to install and configure hardware, and all programs support macros for frequently used phrases.

Installation of all of the programs appears straightforward, and the initial basic "training" is not excessively time-consuming for any of the products. While all provide macros, the medical customization features of Dragon Systems' product are considerably greater. Though these features initially require more time and document input, they increase accuracy, and for this reason Dragon's software is recommended in this comparison.

Industry ratings and awards. Only one of these products, Dragon Systems' NaturallySpeaking, lists awards on its web site. None of the other three products mentions any awards on its site, nor did multiple web searches for the products turn up any awards or industry recognition.

Dragon Systems' web site lists over fifty awards, some of which are listed here:

While industry recognition and journalistic evaluations are not the only considerations, Dragon Systems boasts an impressive list of awards and ratings by prestigious periodicals.

Inclusion of microphones. As previously noted, a microphone is necessary for capture of spoken words.

None of these is a make-or-break detail, but Dragon Systems has a slight edge with the reviews provided by PC Magazine.

Cost. Highly significant price differences exist among these programs.

Summary

From business, medical, and legal perspectives, the creation and maintenance of accurate, complete records are crucial. The primary downsides to such thorough record-keeping include (1) the time required for dictation, (2) the costs and inherent hassles of finding and hiring a competent medical transcriptionist, (3) the necessary delays between dictation and actual availability of the transcribed records, and (4) the time needed to proof and correct the transcriptionist's output.

To date, the weakest link in speech recognition technology has been accuracy. This is fast changing, and current software programs have significantly improved within the last year. Can a voice recognition software program eliminate some of the problems occurring in conventional medical transcription? The following conclusions help answer this question and lead to the recommendation that follows:

  1. All of the programs specify system requirements that are well within the parameters of the existing system.

  2. All of the programs integrate with the existing word processing software, Microsoft Word97.

  3. All of the programs can reasonably be installed by the average user.

  4. Dragon Systems' NaturallySpeaking Medical Suite is by far the most expensive voice recognition program. While it costs $1,243, including one year of technical support, the other three programs are all under $200, exclusive of support.

  5. Philips does not include a microphone with its software as do the other three software companies, but purchase of one does not increase the total cost appreciably. Dragon Systems' microphone is considered more comfortable than the other microphones tested by PC Magazine.

  6. Dragon Systems' NaturallySpeaking has accumulated a lengthy list of awards; no awards were found for the other three programs.

  7. Dragon Systems' NaturallySpeaking Medical Suite with Add-On Vocabularies is easily customizable to specific needs of different practices for specialized medical vocabulary and medical forms.

  8. Dragon Systems' NaturallySpeaking technology is the most accurate of the four programs tested.

  9. Although Dragon Systems' NaturallySpeaking is the most expensive, it offers the best function while the other options considered are barely adequate.

  10. The best choice of the four applications considered is Dragon Systems' NaturallySpeaking.

Recommendation

Dragon Systems' NaturallySpeaking Medical Suite is strongly recommended for its superior accuracy, powerful customization features, and industry recognition and awards. No other product comes close, and its strong advantages justify its higher price. Once the program has been customized, and the user has dictated for several weeks and become familiar with the software, acceptably accurate transcription and instantly available medical records should be possible with NaturallySpeaking Medical Suite, solving some of the record-keeping problems faced by this medical practice.

Literature Cited

All references are found online:
  1. Alwang, Greg. "Editors' Choice." PC Magazine Online. October 20, 1998. http://www.zdnet.com/pcmag/features/speech98/edchoice.html (23 October 1998).

  2. Alwang, Greg. "L&H Voice Xpress Plus 1.01." PC Magazine Online. October 20, 1998. http://www.zdnet.com/pcmag/features/speech98/rev3.html (23 October 1998).

  3. Alwang, Greg. "Performance Tests." PC Magazine Online. October 20, 1998. http://www.zdnet.com/pcmag/features/speech98/perftest.html (23 October 1998).

  4. Alwang, Greg. "Speech Recognition: Finding Its Voice." ZDNN. October 2, 1998. http://www.zdnet.com/zdnn/stories/zdnn_display/0,3440,350879,00.html (23 October 1998).

  5. Alwang, Greg. "Summary of Features." PC Magazine Online. October 20, 1998. http://www.zdnet.com/pcmag/features/speech98/features.html (23 October 1998).

  6. Berkeley Voice Solutions. "Products and Services." http://www.pcvoice.com/products.html (21 October 1998).

  7. Dragon Systems, Inc. "Dragon NaturallySpeaking Awards." http://www.dragonsys.com/news/awards.html (21 October 1998).

  8. Dragon Systems, Inc. "Dragon NaturallySpeaking Medical Suite." http://www.dragonsys.com/products/medical.html (21 October 1998).

  9. Lernout & Hauspie. "L&H Online Store." http://www.storefront.zbr.com/LHS-store/ (21 October 1998).

  10. Munro, Jay. "Speech Technology Timeline." PC Magazine Online. March 10, 1998. http://www.zdnet.com/pcmag/features/speech/sb1.html (23 October 1998).

  11. Munro, Jay. "Watch What You Say." PC Magazine Online. March 10, 1998. http://www.zdnet.com/pcmag/features/speech/intro1.html (23 October 1998).

  12. Philips. "Philips Speech Processing." http://www.speech.be.philips.com/ (21 October 1998).

  13. Provantage. "IBM VoiceType Dictation Vocabularies." http://www.provantage.com/FP_09907.htm (21 October 1998).

  14. Stinson, Craig. "IBM ViaVoice 98 Executive." PC Magazine Online. October 20, 1998. http://www.zdnet.com/pcmag/features/speech98/rev2.html (23 October 1998).

  15. Stinson, Craig. "Philips FreeSpeech98." PC Magazine Online. October 20, 1998. http://www.zdnet.com/pcmag/features/speech98/rev4.html (23 October 1998).



Information and programs provided by hcexres@io.com.