Matches in SemOpenAlex for { <https://semopenalex.org/work/W2613665621> ?p ?o ?g. }
Showing items 1 to 55 of 55 with 100 items per page.
- W2613665621 endingPage "481" @default.
- W2613665621 startingPage "467" @default.
- W2613665621 abstract "Research in speech recognition and synthesis over the past several decades has brought speech technology to a point where it is being used in real-world applications. However, despite the progress, the perception remains that the current technology is not flexible enough to allow easy voice communication with machines. The focus of speech research is now on producing systems that are accurate and robust but that do not impose unnecessary constraints on the user. This chapter takes a critical look at the shortcomings of the current speech recognition and synthesis algorithms, discusses the technical challenges facing research, and examines the new directions that research in speech recognition and synthesis must take in order to form the basis of new solutions suitable for supporting a wide range of applications. After many years of research, speech recognition and synthesis systems have started moving from the controlled environments of research laboratories to applications in the real world. Voice-processing technology has matured to such a point that many of us wonder why the performance of automatic systems does not approach the quality of human performance and how soon this goal can be reached. Rapid advances in very-large-scale integrated (VLSI) circuit capabilities are creating a revolution in the world of computers and communications. These advances are creating an increasing demand for sophisticated products and services that are easy to use. Automatic speech recognition and synthesis are considered to be the key technologies that will provide the easy-to-use interface to machines. The past two decades of research have produced a stream of increasingly sophisticated solutions in speech recognition and synthesis (1). Despite this progress, the perception remains that the current technology is not flexible enough to allow easy voice communication with machines. 
This chapter reviews the present status of this important technology, including its limitations, and discusses the range of applications that can be supported by our present knowledge. But as we look into the future and ask which speech recognition and synthesis capabilities will be available about 10 years from now, it is important also to discuss the technical challenges we face in realizing our vision of the future and the directions in which new research should proceed to meet these challenges. We will examine these issues in this paper and take a critical look at the shortcomings of the current speech recognition and synthesis algorithms. Much of the technical knowledge that supports the current speech-processing technology was created in a period when our ability to implement technical solutions on real-time hardware was limited. These limitations are quickly disappearing, and we look to a future at the end of this decade when a single VLSI chip will have a billion transistors to support much higher processing speeds and more ample storage than is now available. The speech recognition and synthesis algorithms available at present work in limited scenarios. With the availability of fast processors and a large memory, tremendous opportunity exists to push speech recognition technology to a level where it can support a much wider range of applications. Speech databases with utterances recorded from many speakers in a variety of environments have been important in achieving the progress that has been realized so far. But on the negative side, these databases have encouraged speech researchers to rely on trial-and-error methods, leading to solutions that are narrow and that apply to specific applications but do not generalize to other situations. 
These methods, although fruitful in the early development of the technology, are now a hindrance as we become much more ambitious in seeking solutions to bigger problems. The time has come to set the next stage for the development of speech technology, and it is important to realize that a solid base of scientific understanding is absolutely necessary if we want to move significantly beyond where we are today. The 1990s will be a decade of rising expectations for speech technology, and speech research will expand to cover many areas, from traditional speech recognition and synthesis to speech understanding and language translation. In some areas we will be just scratching the surface and defining the important issues. But in many others the research community will have to come up with solutions to important and difficult problems in a timely fashion. This paper cannot discuss all the possible new research directions but will be limited to examining the most important problems that must be solved during this decade. CURRENT CAPABILITIES Voice communication from one person to another appears to be so easy and simple. Although speech technology has reached a point where it can be useful in certain applications, the prospect of a machine understanding speech with the same flexibility as humans do is still far away. The interest in using speech interface to machines stems from our desire to make machines easy to use. Using human performance as a benchmark for the machine tells us how far we are from that goal. For clean speech, automatic speech recognition algorithms work reasonably well (2, 3) with isolated words or words spoken in grammatical sentences, and the performance is continuing to improve. Fig. 1 shows the word error rate for various test materials and the steady decrease in the error rate achieved from 1980 to 1992. This performance level is not very different from that obtained in intelligibility tests with human listeners. 
The performance of automatic methods, however, degrades significantly in the presence of noise (or distortion) (4) and for conversational speech." @default.
- W2613665621 created "2017-05-19" @default.
- W2613665621 creator A5012446143 @default.
- W2613665621 date "1994-11-01" @default.
- W2613665621 modified "2023-09-26" @default.
- W2613665621 title "Speech technology in 2001: new research directions" @default.
- W2613665621 hasPublicationYear "1994" @default.
- W2613665621 type Work @default.
- W2613665621 sameAs 2613665621 @default.
- W2613665621 citedByCount "0" @default.
- W2613665621 crossrefType "book" @default.
- W2613665621 hasAuthorship W2613665621A5012446143 @default.
- W2613665621 hasConcept C14999030 @default.
- W2613665621 hasConcept C154945302 @default.
- W2613665621 hasConcept C2524010 @default.
- W2613665621 hasConcept C28719098 @default.
- W2613665621 hasConcept C33923547 @default.
- W2613665621 hasConcept C41008148 @default.
- W2613665621 hasConcept C504749915 @default.
- W2613665621 hasConceptScore W2613665621C14999030 @default.
- W2613665621 hasConceptScore W2613665621C154945302 @default.
- W2613665621 hasConceptScore W2613665621C2524010 @default.
- W2613665621 hasConceptScore W2613665621C28719098 @default.
- W2613665621 hasConceptScore W2613665621C33923547 @default.
- W2613665621 hasConceptScore W2613665621C41008148 @default.
- W2613665621 hasConceptScore W2613665621C504749915 @default.
- W2613665621 hasLocation W26136656211 @default.
- W2613665621 hasOpenAccess W2613665621 @default.
- W2613665621 hasPrimaryLocation W26136656211 @default.
- W2613665621 hasRelatedWork W11471960 @default.
- W2613665621 hasRelatedWork W176481470 @default.
- W2613665621 hasRelatedWork W2018485074 @default.
- W2613665621 hasRelatedWork W2040972462 @default.
- W2613665621 hasRelatedWork W2042565708 @default.
- W2613665621 hasRelatedWork W2051350553 @default.
- W2613665621 hasRelatedWork W2096619464 @default.
- W2613665621 hasRelatedWork W2187351902 @default.
- W2613665621 hasRelatedWork W2647605865 @default.
- W2613665621 hasRelatedWork W2733769665 @default.
- W2613665621 hasRelatedWork W2839578436 @default.
- W2613665621 hasRelatedWork W290723265 @default.
- W2613665621 hasRelatedWork W303563536 @default.
- W2613665621 hasRelatedWork W3126400779 @default.
- W2613665621 hasRelatedWork W3167067495 @default.
- W2613665621 hasRelatedWork W396658282 @default.
- W2613665621 hasRelatedWork W2112902345 @default.
- W2613665621 hasRelatedWork W2189233133 @default.
- W2613665621 hasRelatedWork W2549932868 @default.
- W2613665621 hasRelatedWork W319472833 @default.
- W2613665621 isParatext "false" @default.
- W2613665621 isRetracted "false" @default.
- W2613665621 magId "2613665621" @default.
- W2613665621 workType "book" @default.