Matches in SemOpenAlex for { <https://semopenalex.org/work/W125614575> ?p ?o ?g. }
Showing items 1 to 72 of 72 with 100 items per page.
- W125614575 abstract "In most speech recognition systems today, acoustic modeling and lexical modeling are viewed as separable problems. Currently the most popular approach is to manually define canonical word pronunciations in terms of phonetic units and let the acoustic models capture differences between actual spoken and canonical pronunciations implicitly with Gaussian mixture models. As a result, these models can be very broad, particularly for casual spontaneous speech. An alternative approach, explored in this thesis, is to learn a unit inventory and pronunciation dictionary from training data using a maximum likelihood objective function. In particular, this thesis addresses previously unsolved problems in automatic unit design with three main contributions. First, to make design of a large unit inventory practical, a new approach is described that combines the problems of unit selection and lexicon design. The design of the units is acoustically driven but constrained to guarantee a matched, limited-complexity pronunciation model. Instead of using an acoustic unit training algorithm followed by separate pronunciation model design, the algorithm proposed here incorporates a pronunciation constraint within the unit design algorithm. The resulting unit inventory, unit models and lexicon are matched since they are designed by a single joint design step. The second problem addressed involves synthesizing models for unobserved contexts, needed to model contextual variation at word boundaries. As in phone-based systems, decision tree clustering is used, but this requires classes or sets of units that have a similar influence in context. The solution is to learn these classes from data by a parallel context clustering process. Third, the ability to generalize at the word level, i.e., to handle words not observed in the training data, is provided by a hybrid system design algorithm.
In the hybrid system, automatically derived units are designed for the most frequent words, and phonetic units are designed for all words in the vocabulary. Using an estimation step, the word models constructed by the independent automatic and phonetic units are evaluated and the most likely model is included in the lexicon. The new automatic unit design algorithm showed improved performance over phonetic units in experiments on a medium-vocabulary (1,000-word) task (Resource Management) for both small and large unit inventory systems, outperforming an alternative approach to automatic unit design reported on this task. The algorithm for learning context conditioning groups is successful in that the performance of a system derived by decision tree clustering is equivalent to that of the best unconstrained clustering system and an additional gain is observed when modeling contextual effects across word boundaries. Finally, when automatically derived units were used in experiments on a large-vocabulary (20,000-word) conversational speech task (Switchboard), the recognition accuracy improved over the phonetic unit baseline. In summary, the joint unit and lexicon design algorithm gives higher recognition performance or can be configured to give similar performance at lower cost (lower system complexity) than phone-based units for applications where several examples of each vocabulary word can be provided." @default.
- W125614575 created "2016-06-24" @default.
- W125614575 creator A5049614700 @default.
- W125614575 creator A5087215613 @default.
- W125614575 date "1999-01-01" @default.
- W125614575 modified "2023-09-26" @default.
- W125614575 title "Speech recognition system design based on automatically derived units" @default.
- W125614575 hasPublicationYear "1999" @default.
- W125614575 type Work @default.
- W125614575 sameAs 125614575 @default.
- W125614575 citedByCount "14" @default.
- W125614575 countsByYear W1256145752021 @default.
- W125614575 crossrefType "journal-article" @default.
- W125614575 hasAuthorship W125614575A5049614700 @default.
- W125614575 hasAuthorship W125614575A5087215613 @default.
- W125614575 hasConcept C138885662 @default.
- W125614575 hasConcept C151730666 @default.
- W125614575 hasConcept C154945302 @default.
- W125614575 hasConcept C204321447 @default.
- W125614575 hasConcept C2524010 @default.
- W125614575 hasConcept C2776036281 @default.
- W125614575 hasConcept C2778121359 @default.
- W125614575 hasConcept C2779343474 @default.
- W125614575 hasConcept C2780844864 @default.
- W125614575 hasConcept C28490314 @default.
- W125614575 hasConcept C33923547 @default.
- W125614575 hasConcept C41008148 @default.
- W125614575 hasConcept C41895202 @default.
- W125614575 hasConcept C73555534 @default.
- W125614575 hasConcept C86803240 @default.
- W125614575 hasConceptScore W125614575C138885662 @default.
- W125614575 hasConceptScore W125614575C151730666 @default.
- W125614575 hasConceptScore W125614575C154945302 @default.
- W125614575 hasConceptScore W125614575C204321447 @default.
- W125614575 hasConceptScore W125614575C2524010 @default.
- W125614575 hasConceptScore W125614575C2776036281 @default.
- W125614575 hasConceptScore W125614575C2778121359 @default.
- W125614575 hasConceptScore W125614575C2779343474 @default.
- W125614575 hasConceptScore W125614575C2780844864 @default.
- W125614575 hasConceptScore W125614575C28490314 @default.
- W125614575 hasConceptScore W125614575C33923547 @default.
- W125614575 hasConceptScore W125614575C41008148 @default.
- W125614575 hasConceptScore W125614575C41895202 @default.
- W125614575 hasConceptScore W125614575C73555534 @default.
- W125614575 hasConceptScore W125614575C86803240 @default.
- W125614575 hasLocation W1256145751 @default.
- W125614575 hasOpenAccess W125614575 @default.
- W125614575 hasPrimaryLocation W1256145751 @default.
- W125614575 hasRelatedWork W1603053412 @default.
- W125614575 hasRelatedWork W1611060092 @default.
- W125614575 hasRelatedWork W1637441004 @default.
- W125614575 hasRelatedWork W1762672039 @default.
- W125614575 hasRelatedWork W1950396994 @default.
- W125614575 hasRelatedWork W1968149043 @default.
- W125614575 hasRelatedWork W1983912746 @default.
- W125614575 hasRelatedWork W2021892355 @default.
- W125614575 hasRelatedWork W2033350476 @default.
- W125614575 hasRelatedWork W2116738958 @default.
- W125614575 hasRelatedWork W2125142492 @default.
- W125614575 hasRelatedWork W2144709899 @default.
- W125614575 hasRelatedWork W2149559596 @default.
- W125614575 hasRelatedWork W2170353620 @default.
- W125614575 hasRelatedWork W2744194203 @default.
- W125614575 hasRelatedWork W2944691285 @default.
- W125614575 hasRelatedWork W2952613254 @default.
- W125614575 hasRelatedWork W2963070863 @default.
- W125614575 hasRelatedWork W3025870691 @default.
- W125614575 hasRelatedWork W2184468823 @default.
- W125614575 isParatext "false" @default.
- W125614575 isRetracted "false" @default.
- W125614575 magId "125614575" @default.
- W125614575 workType "article" @default.