Paper Title
Learning the Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese
Paper Authors
Paper Abstract
Coordinate compounds (CCs) and elaborate expressions (EEs) are coordinate constructions common in languages of East and Southeast Asia. Mortensen (2006) claims that (1) the linear ordering of EEs and CCs in Hmong, Lahu, and Chinese can be predicted via phonological hierarchies and (2) these phonological hierarchies lack a clear phonetic rationale. These claims are significant because morphosyntax has often been seen as in a feed-forward relationship with phonology, and phonological generalizations have often been assumed to be phonetically "natural". We investigate whether the ordering of CCs and EEs can be learned empirically and whether computational models (classifiers and sequence labeling models) learn unnatural hierarchies similar to those posited by Mortensen (2006). We find that decision trees and SVMs learn to predict the order of CCs/EEs on the basis of phonology, with DTs learning hierarchies strikingly similar to those proposed by Mortensen. However, we also find that a neural sequence labeling model is able to learn the ordering of elaborate expressions in Hmong very effectively without using any phonological information. We argue that EE ordering can be learned through two independent routes: phonology and lexical distribution, presenting a more nuanced picture than previous work. [ISO 639-3:hmn, lhu, cmn]