Paper title

The optimality of syntactic dependency distances

Authors

Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban, Lluís Alemany-Puig

Abstract

It is often stated that human languages, like other biological systems, are shaped by cost-cutting pressures, but to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies, and the space is defined by the linear order of the words in the sentence. We introduce a new score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions, i.e. that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a new hierarchical ranking of languages by their degree of optimization. The new score has implications for various fields of language research (dependency linguistics, typology, historical linguistics, clinical linguistics and cognitive science). Finally, the principles behind the design of the score have implications for network science.
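The spatial-network framing in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula of the score, so the normalization below (observed distance compared against a random-ordering baseline and a brute-force minimum) is an assumption for illustration only, not necessarily the authors' definition. The toy sentence, its dependency arcs, and all function names are hypothetical.

```python
from itertools import permutations

def total_dependency_distance(edges, position):
    """Sum of |position[head] - position[dependent]| over all arcs."""
    return sum(abs(position[h] - position[d]) for h, d in edges)

def random_baseline(n):
    """Expected total distance of a tree with n words under a uniformly
    random word order: (n - 1) arcs, each with expected length (n + 1) / 3."""
    return (n * n - 1) / 3

def minimum_distance(edges, n):
    """Brute-force minimum over all n! orderings (short sentences only).
    Each permutation maps word index -> position in the sentence."""
    return min(
        total_dependency_distance(edges, perm)
        for perm in permutations(range(n))
    )

def optimality_score(edges, n):
    """Assumed normalization: 1 when the observed order is minimal,
    0 when it matches the random baseline, negative when it is worse."""
    observed = total_dependency_distance(edges, list(range(n)))
    baseline = random_baseline(n)
    minimum = minimum_distance(edges, n)
    if baseline == minimum:
        raise ValueError("score undefined: baseline equals minimum")
    return (baseline - observed) / (baseline - minimum)

# Hypothetical toy sentence; vertices are word positions 0..4.
words = ["John", "saw", "a", "tall", "tree"]
# Arcs as (head, dependent): saw->John, saw->tree, tree->a, tree->tall.
edges = [(1, 0), (1, 4), (4, 2), (4, 3)]

n = len(words)
observed = total_dependency_distance(edges, list(range(n)))
minimum = minimum_distance(edges, n)
score = optimality_score(edges, n)
print(observed, random_baseline(n), minimum, score)
```

The brute-force search is only practical for sentences of roughly ten words or fewer; real treebank sentences would need a dedicated minimum linear arrangement algorithm for trees.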
