Tone Model Integration Using Tree-Based Weight Parameter Tying in Mandarin Speech Recognition
Huang Hao, Li Binghu, Wushour Silamu
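Abstract:
Tone integration is an important task in Mandarin speech recognition. In second-pass decoding, integrating tone models with discriminatively trained weight factors has proven effective, and context-dependent score weighting has also been applied to model combination. A drawback of context-dependent model combination is that it introduces a large number of trainable parameters, which makes weight training prone to overfitting. To address this problem, we propose clustering the context-dependent weight parameters with an acoustic decision tree, in which the question at each node is selected from the question set by minimizing the expected error rate on the training data. We also propose question-set pruning to speed up tree construction. Mandarin continuous speech recognition experiments show that, compared with manually selected context-dependent weight parameters, the proposed method markedly reduces the error rate while using far fewer parameters.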
Keywords: tone integration; minimum phone error; decision tree; Mandarin speech recognition; discriminative model combination; context dependency
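Foundation: Supported by the National Natural Science Foundation of China (60965002); the Scientific Research Program of Xinjiang Higher Education Institutions, Cultivation Fund (XJEDU2008S15); and the Xinjiang University Doctoral Research Start-up Fund (BS090143)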
Authors: Huang Hao, Li Binghu, Wushour Silamu
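The tying procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The names `grow_tree`, `make_variance_cost`, the `(left tone, current tone)` context layout, the three questions, and the toy target weights are all hypothetical. The abstract selects questions by minimizing the expected error rate on training data; here that objective is a pluggable `cost` callable, and the toy cost (within-leaf squared deviation of made-up per-context weights) is only a stand-in for it. The pruning rule shown, dropping any question whose best gain in a round does not exceed a threshold, is one possible heuristic and may differ from the pruning used in the paper.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Context = Tuple[int, int]                       # hypothetical: (left tone, current tone)
Question = Tuple[str, Callable[[Context], bool]]
Leaves = List[FrozenSet[Context]]


def grow_tree(contexts: Iterable[Context],
              questions: List[Question],
              cost: Callable[[Leaves], float],
              min_gain: float = 1e-6,
              prune_below: float = 0.0) -> Leaves:
    """Greedy top-down tying: each leaf shares one combination weight."""
    leaves: Leaves = [frozenset(contexts)]
    active = list(questions)
    while active:
        base = cost(leaves)
        best_gain, best_leaves = min_gain, None
        kept = []
        for name, ask in active:
            q_best = 0.0                        # best gain of this question on any leaf
            for i, leaf in enumerate(leaves):
                yes = frozenset(c for c in leaf if ask(c))
                no = leaf - yes
                if not yes or not no:           # question does not split this leaf
                    continue
                cand = leaves[:i] + [yes, no] + leaves[i + 1:]
                gain = base - cost(cand)
                q_best = max(q_best, gain)
                if gain > best_gain:
                    best_gain, best_leaves = gain, cand
            if q_best > prune_below:            # assumed pruning heuristic
                kept.append((name, ask))
        if best_leaves is None:                 # no split is worth making
            break
        leaves, active = best_leaves, kept
    return leaves


def make_variance_cost(target: Dict[Context, float]) -> Callable[[Leaves], float]:
    """Toy stand-in for expected error: squared deviation from each leaf mean."""
    def cost(leaves: Leaves) -> float:
        total = 0.0
        for leaf in leaves:
            vals = [target[c] for c in leaf]
            mean = sum(vals) / len(vals)
            total += sum((v - mean) ** 2 for v in vals)
        return total
    return cost


# Toy data: 16 contexts whose made-up ideal weight depends only on the current tone.
contexts = [(left, cur) for left in range(1, 5) for cur in range(1, 5)]
target = {c: (1.0 if c[1] == 3 else 0.5) for c in contexts}
questions: List[Question] = [
    ("current tone is 3", lambda c: c[1] == 3),
    ("left tone is 1", lambda c: c[0] == 1),
    ("current tone is 1 or 2", lambda c: c[1] in (1, 2)),
]

leaves = grow_tree(contexts, questions, make_variance_cost(target))
print(sorted(len(leaf) for leaf in leaves))     # [4, 12]
```

In this toy setting the 16 context-dependent weights collapse to two tied weights, one per leaf. Pruning a question after a single unproductive round trades some accuracy for speed, since a question that is useless at one stage could become useful after later splits.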