RESEARCH
Acoustic & Speech
We investigate various speech signal processing schemes for acoustic modeling so that more robust speech recognition can be achieved. Our aim is to perform state-of-the-art research providing effective means for achieving:
Contents
2. Acoustic Modeling of Speech Recognition Unit
3. Statistical Language Modeling
An automatic speech recognition system is composed of feature extraction, acoustic modeling, language modeling, and search. We estimate the parameters of the acoustic models from training data and estimate the language model from text corpora. We then decode the speech signal into a recognized word sequence using the acoustic models, the language model, and a word network.
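The way these components combine is commonly written as a maximum a posteriori decision rule. The formula below is the standard textbook formulation, not something stated on this page; the symbols O (the sequence of feature vectors) and W (a candidate word sequence) are notation assumed here for illustration.

```latex
\hat{W} = \arg\max_{W} P(W \mid O) = \arg\max_{W} \, P(O \mid W) \, P(W)
```

In this reading, P(O | W) is supplied by the acoustic models, P(W) by the language model, and the maximization over W is the search carried out on the word network.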
2. Acoustic Modeling of Speech Recognition Unit
An acoustic model describes how the speech signal is expressed. Currently, the most frequently used acoustic model is the HMM (Hidden Markov Model). Each HMM models the temporal and spectral variation of a speech recognition unit. We estimate the parameters of the acoustic models from training data.
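As a rough illustration of how an HMM scores a sequence of features, the sketch below evaluates a hypothetical 3-state left-to-right HMM with one-dimensional Gaussian emissions using the forward algorithm in the log domain. The topology, transition probabilities, means, and variances are all made-up values, and real systems typically use multi-dimensional features and mixture densities.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


# Hypothetical left-to-right topology: each state either repeats
# (self-loop) or advances to the next state; no backward jumps.
LOG_TRANS = [
    [math.log(0.6), math.log(0.4), -math.inf],
    [-math.inf, math.log(0.7), math.log(0.3)],
    [-math.inf, -math.inf, 0.0],
]
MEANS = [0.0, 3.0, 6.0]  # assumed emission means, one per state
VARS = [1.0, 1.0, 1.0]   # assumed emission variances, one per state


def forward_log_likelihood(obs):
    """Return log P(obs | model), starting in state 0 and ending in any state."""
    n = len(MEANS)
    alpha = [-math.inf] * n
    alpha[0] = log_gauss(obs[0], MEANS[0], VARS[0])
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + LOG_TRANS[i][j] for i in range(n)])
            + log_gauss(x, MEANS[j], VARS[j])
            for j in range(n)
        ]
    return logsumexp(alpha)


ll_matching = forward_log_likelihood([0.1, 2.9, 6.2])
ll_reversed = forward_log_likelihood([6.2, 2.9, 0.1])
```

A sequence that follows the state order (`ll_matching`) scores higher than the same values in reverse (`ll_reversed`), which is the sense in which the model captures temporal as well as spectral variation.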
3. Statistical Language Modeling
Statistical language models capture the probabilistic relationships among the words in a sequence, which are derived directly from text corpora. As n-gram language models, we mainly use bigram or trigram models.
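A minimal sketch of bigram estimation from a corpus follows. It uses plain maximum-likelihood counts without smoothing, which is an assumption made for brevity; practical systems normally add smoothing or back-off. The sentence markers `<s>` and `</s>` and the toy corpus are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Return a function prob(word, prev) giving the MLE estimate of P(word | prev)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1

    def prob(word, prev):
        # Unseen contexts get probability 0 here; smoothing would change this.
        if context_counts[prev] == 0:
            return 0.0
        return bigram_counts[(prev, word)] / context_counts[prev]

    return prob


corpus = ["call home", "call the office", "go home"]
p = train_bigram(corpus)
p_home_given_call = p("home", "call")
```

With this toy corpus, "call" is followed once by "home" and once by "the", so `p_home_given_call` is 0.5. A trigram model would condition on the two preceding words instead of one.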
We use two kinds of word networks: the linear lexicon and the lexical tree. A linear lexicon arranges the words in parallel and is used for small-vocabulary recognition. A lexical tree shares the common initial portions of the listed pronunciations among words and is used for large-vocabulary recognition.
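The difference between the two networks can be sketched with a small prefix tree over phone sequences. The lexicon, phone symbols, and node layout below are hypothetical and chosen only to show how shared pronunciation prefixes reduce the number of arcs compared with a linear lexicon.

```python
def build_lexical_tree(lexicon):
    """Build a prefix tree over phone sequences.

    Each node is a dict with 'children' (phone -> node) and 'words'
    (the words whose pronunciation ends at that node).
    """
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node["children"].setdefault(
                phone, {"children": {}, "words": []}
            )
        node["words"].append(word)
    return root


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node["children"].values())


# Hypothetical toy lexicon: word -> phone sequence.
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "spin": ["s", "p", "ih", "n"],
}

tree = build_lexical_tree(LEXICON)
linear_arcs = sum(len(phones) for phones in LEXICON.values())
tree_arcs = count_nodes(tree) - 1  # every non-root node has one incoming arc
```

Here the linear lexicon needs one arc per phone of every word, while the tree stores the shared prefix "s p" only once, so `tree_arcs` is smaller than `linear_arcs`. The saving grows with vocabulary size, which is why the tree suits large-vocabulary recognition.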
Lexical decoding of continuous speech finds the word sequence with the highest score among all possible word sequences, given the observation sequence, the acoustic model, and the language model, using the word network. For evaluation (recognition), Viterbi decoding and the forward-backward algorithm are used.
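The core of Viterbi decoding can be sketched on a small discrete HMM. This is a generic illustration, not the decoder described here: the two states, the two observation symbols, and all probabilities are made up, and a real recognizer would run the same recursion over the states of the word network with language model scores applied at word boundaries.

```python
import math


def viterbi(obs, states, log_init, log_trans, log_emit):
    """Return (best_log_score, best_state_path) for an observation sequence."""
    score = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []
    for o in obs[1:]:
        new_score, ptr = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            best_prev = max(states, key=lambda r: score[r] + log_trans[r][s])
            new_score[s] = (
                score[best_prev] + log_trans[best_prev][s] + log_emit[s][o]
            )
            ptr[s] = best_prev
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


def logs(table):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical model parameters.
STATES = ["A", "B"]
LOG_INIT = logs({"A": 0.6, "B": 0.4})
LOG_TRANS = {"A": logs({"A": 0.7, "B": 0.3}), "B": logs({"A": 0.4, "B": 0.6})}
LOG_EMIT = {"A": logs({"x": 0.9, "y": 0.1}), "B": logs({"x": 0.2, "y": 0.8})}

best_score, best_path = viterbi(
    ["x", "x", "y", "y"], STATES, LOG_INIT, LOG_TRANS, LOG_EMIT
)
```

Unlike the forward algorithm, which sums over all state paths, Viterbi keeps only the best predecessor at each step, so it returns a single best path (`best_path`) together with its log score.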
6.1 Voice Navigation
6.2 Keyword Recognition
6.3 LVCSR Demo
6.4 LVCSR Demo (English)