Uygur Keyword Image Secondary Retrieval Based on Gray Histogram and Improved Hu Invariant Moment
宋志平,朱亚俐,徐学斌,吾尔尼沙·买买提,库尔班·吾布力
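Abstract:
Uygur script has strongly connected characters and non-closed structures, which makes keyword image retrieval on Uygur documents very difficult. To improve the efficiency of Uygur document image retrieval, a keyword image retrieval algorithm based on the gray histogram and improved Hu invariant moments is proposed. The algorithm retrieves word images in two passes: a coarse retrieval and a secondary retrieval. In the coarse retrieval stage, gray histogram features are extracted from the segmented word images and coarsely matched against the word database; with recall preserved, part of the irrelevant word images are filtered out to form a candidate word set. Fine matching is then carried out on the result of the coarse matching, using improved Hu invariant moments to describe the contour features of the keyword image. The improved method unifies eccentricity, region moments, and structural moments within the Hu invariant moments and can describe the contour information of an image effectively. In experiments on a database of 115 plain-text Uygur document images, the average retrieval precision is 78.36% and the average recall is 81.68%.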
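Keywords: Uygur; gray histogram; Hu invariant moments; coarse matching; secondary retrieval
Foundation: National Natural Science Foundation of China, Key Program (61862061; 61563052; 61363064); Youth Fund Project of the Science and Technology Department of Xinjiang Uygur Autonomous Region (2021D01C119)
Authors: 宋志平, 朱亚俐, 徐学斌, 吾尔尼沙·买买提, 库尔班·吾布力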
DOI: 10.13568/j.cnki.651094.651316.2021.04.10.0003
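The two-pass retrieval summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the improved Hu invariant moments, so the sketch uses the seven classic Hu invariants plus one common moment-based eccentricity as a stand-in. The function names, the fixed ink threshold of 128, the histogram-intersection similarity, the coarse threshold of 0.8, and the Euclidean feature distance are all assumptions made for illustration.

```python
import math

def gray_histogram(img, bins=16):
    """Normalized gray-level histogram of a 2D list of integers in 0..255."""
    hist = [0] * bins
    for row in img:
        for v in row:
            hist[v * bins // 256] += 1
    total = sum(hist)
    return [h / total for h in hist]

def hist_intersection(h1, h2):
    """Histogram intersection; 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def shape_features(img, ink_threshold=128):
    """Seven classic Hu invariants plus a moment-based eccentricity.

    Assumption: pixels darker than ink_threshold are treated as ink.
    This is a stand-in for the paper's improved moments, not a copy of them.
    """
    pts = [(x, y) for y, row in enumerate(img)
           for x, v in enumerate(row) if v < ink_threshold]
    if not pts:
        return [0.0] * 8
    m00 = len(pts)
    xc = sum(x for x, _ in pts) / m00
    yc = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Central moment, normalized for scale invariance.
        mu = sum((x - xc) ** p * (y - yc) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    hu = [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]
    # Eccentricity from second-order moments (one common definition, in [0, 1]).
    ecc = hu[1] / hu[0] ** 2 if hu[0] else 0.0
    return hu + [ecc]

def feature_distance(f1, f2):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def two_stage_retrieve(query, database, coarse_threshold=0.8, top_k=5):
    """Coarse histogram filter, then ranking of the survivors by shape."""
    q_hist = gray_histogram(query)
    candidates = [(name, img) for name, img in database
                  if hist_intersection(q_hist, gray_histogram(img)) >= coarse_threshold]
    q_feat = shape_features(query)
    ranked = sorted(candidates,
                    key=lambda item: feature_distance(q_feat, shape_features(item[1])))
    return [name for name, _ in ranked[:top_k]]
```

Here `database` is assumed to be a list of `(name, image)` pairs, with each image a 2D list of gray levels. The coarse pass is cheap and only discards images whose gray-level distribution is clearly different; the second pass compares translation- and scale-invariant shape descriptors on the remaining candidates. In practice the threshold would need tuning so that recall is not sacrificed in the first pass.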