A Comparative Study on Acoustic Model Training for Automatic Mispronunciation Detection
WANG Jianming (王建明), HUANG Hao (黄浩), WANG Xianhui (王羡慧)
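Abstract:
In mispronunciation detection systems, training the acoustic model with the maximum likelihood estimation (MLE) criterion or the minimum phone error (MPE) criterion used in conventional speech recognition usually does not yield the best performance as measured by the F1 score. Based on an analysis of the parameter update formulas of the maximum likelihood and minimum phone error criteria, this paper proposes a discriminative training criterion that maximizes the F1 function, and optimizes the acoustic model parameters by constructing a weak-sense auxiliary function. The comparison shows that the F1-maximizing discriminative training criterion effectively increases the detection F1 score on both training and test data, and that precision and recall also improve markedly on both sets.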
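As a reminder of the evaluation metric the abstract refers to, the sketch below computes precision, recall, and F1 for phone-level detection decisions. It is only an illustration of the standard definitions, not the paper's training procedure. The function name `detection_f1` and the convention that `True` marks a mispronounced phone are assumptions made here for the example.

```python
def detection_f1(reference, hypothesis):
    """Return (precision, recall, f1) for binary detection decisions.

    Assumed convention (not from the paper): each element is a bool,
    True meaning the phone is labeled (reference) or detected
    (hypothesis) as mispronounced.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have equal length")

    pairs = list(zip(reference, hypothesis))
    tp = sum(1 for r, h in pairs if r and h)        # true detections
    fp = sum(1 for r, h in pairs if not r and h)    # false alarms
    fn = sum(1 for r, h in pairs if r and not h)    # missed errors

    # Guard against empty denominators by returning 0.0 in that case.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    ref = [True, True, False, False, True]
    hyp = [True, False, True, False, True]
    print(detection_f1(ref, hyp))
```

Because F1 is the harmonic mean of precision and recall, a criterion that raises F1 directly has to balance false alarms against missed errors, which is the motivation the abstract gives for not relying on MLE or MPE alone.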
Keywords: maximum likelihood estimation; minimum phone error; F1 maximization; auxiliary function
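Foundation: National Natural Science Foundation of China (60965002); Scientific Research Program of the Higher Education Institutions of Xinjiang, Cultivation Fund (XJEDU2008S15); Xinjiang University Doctoral Research Start-up Fund (BS090143)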