Paper Title
Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns
Authors
Abstract
It is reasonable to hypothesize that the divergence patterns formulated by historical linguists and typologists reflect constraints on human languages, and are thus consistent with Second Language Acquisition (SLA) in a certain way. In this paper, we validate this hypothesis on ten Indo-European languages. We formalize the delexicalized transfer as interpretable tree-to-string and tree-to-tree patterns which can be automatically induced from web data by applying neural syntactic parsing and grammar induction technologies. This allows us to quantitatively probe cross-linguistic transfer and extend inquiries of SLA. We extend existing works which utilize mixed features and support the agreement between delexicalized cross-linguistic transfer and the phylogenetic structure resulting from the historical-comparative paradigm.