[1] CHAO W H, LI Z J. A Graph-based bilingual corpus selection approach for SMT[C]// Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation. Singapore: Waseda University Press, 2011: 120-129. [2] CUI L, ZHANG D D, LIU S J, et al. Collective corpus weighting and phrase scoring for SMT using graph-based random walk[C]// The 2nd Conference on Natural Language Processing & Chinese Computing. Chongqing, China, 2013: 176-187. [3] ECK M, VOGEL S, WAIBEL A. Low cost portability for statistical machine translation based on n-gram coverage[C]// International Workshop on Spoken Language Translation. Pittsburgh, USA: IWSLT Press, 2005: 61-67. [4] MANDAL A, VERGYRI D, WANG W, et al. Efficient data selection for machine translation[C]// Spoken Language Technology Workshop. Goa, India: IEEE Press, 2008: 261-264. [5] SKADIA I, BRLTIS E. English-Latvian SMT: knowledge or data? [C]// Proceedings of the 17th NODALIDA Conference Processing, http://beta.visl.sdu.dk/~eckhard/nodalida/paper_57.pdf, 2009: 242-245. [6] HAN X W, LI H Z, ZHAO T J. Train the machine with what it can learn: Corpus selection for SMT[C]// Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: From Parallel to Non-Parallel Corpora. Suntec, Singapore: ACM Press, 2009: 27-33. [7] 王志洋,吕雅娟,刘群. 面向形态丰富语言的多粒度翻译融合[J]. 中文信息学报. 2011, 25(4): 75-81. WANG Z Y, LV Y J, LIU Q. System combination with multiple granularities for morphologically rich language translation[J]. Journal of Chinese Information Processing, 2011, 25(4): 75-81. [8] 米莉万·雪合来提, 刘凯,吐尔根·依布拉音. 基于维语尔语词干词缀粒度的汉维机器翻译[J]. 中文信息学报, 2015, 29(3): 201-206. MILIWAN·XUEHELAITI, LIU KAI, TURGUN·IBRAHIM. Chinese-Uyghur machine translation based on smallest translation units of stem and suffixes[J]. Journal of Chinese Information Processing, 2015, 29(3):201-206. [9] HAN J W, JI H, SUN Y Z. Successful data mining methods for NLP[C]// Proceedings of the Tutorials of the 53rd Annual Meeting of the ACL and the 7th International Joint Conference on Natural Language Processing. Beijing, China: ACL Press, 2015: 1-4. [10] LIU L, HONG Y, LIU H, et al. Effective selection of translation model training data[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, USA: IEEE Press, 2014: 569-573. [11] HILDEBRAND A S, ECK M, VOGEL S, et al. Adaptation of the translation model for statistical machine translation based on information retrieval[C]// Proceedings of the 10th Annual Conference on European Association for Machine Translation. San Diego, USA: ACM Press, 2005: 133-142. [12] 黄瑾,吕雅娟,刘群. 基于信息检索方法的统计翻译系统训练数据选择与优化[J]. 中文信息学报, 2008, 22(2): 40-46. HUANG Jin, LV Yajun, lIU Qun. The statistical translation system based on information retrieval method selection and optimization of training data[J]. Journal of Chinese Information Processing, 2008, 22(2): 40-46. [13] 姚树杰, 肖桐, 朱靖波. 基于句对质量和覆盖度的统计机器翻译训练语料选取[J]. 中文信息学报, 2011, 25(1): 72-77. YAO Shujie, XIAO Tong, ZHU Jingbo. Selection of SMT training data based on sentence pair quality and coverage[J]. Journal of Chinese Information Processing, 2011, 25(1): 72-77. [14] 王星, 涂兆鹏, 谢军, 等. 一种基于分类的平行语料选取方法[J]. 中文信息学报, 2013, 27(6): 144-150. WANG Xing, TU Zhaopeng, XIE Jun, etal. Selection of parallel corpus based on classification[J]. Journal of Chinese Information Processing, 2013, 27(6): 144-150. [15] KIRCHHOFF K, BILMES J. Submodularity for data selection in statistical machine translation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha, Qatar: ACL Press, 2014: 131-141. |