Automated Evaluation of Standardized Dementia Screening Tests

Paper title

Automated Evaluation of Standardized Dementia Screening Tests

Authors

Franziska Braun, Markus Förstel, Bastian Oppermann, Andreas Erzigkeit, Thomas Hillemacher, Hartmut Lehfeld, Korbinian Riedhammer

Abstract

For dementia screening and monitoring, standardized tests play a key role in clinical routine since they aim at minimizing subjectivity by measuring performance on a variety of cognitive tasks. In this paper, we report on a study that consists of a semi-standardized history taking followed by two standardized neuropsychological tests, namely the SKT and the CERAD-NB. The tests include basic tasks such as naming objects and learning word lists, but also widely used tools such as the MMSE. Most of the tasks are performed verbally and should thus be suitable for automated scoring based on transcripts. For the first batch of 30 patients, we analyze the correlation between expert manual evaluations and automatic evaluations based on manual and automatic transcriptions. For both SKT and CERAD-NB, we observe high to perfect correlations using manual transcripts; for certain tasks with lower correlation, the automatic scoring is stricter than the human reference since it is limited to the audio. Using automatic transcriptions, correlations drop as expected and are related to recognition accuracy; however, we still observe high correlations of up to 0.98 (SKT) and 0.85 (CERAD-NB). We show that using word alternatives helps to mitigate recognition errors and subsequently improves correlation with expert scores.
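
The abstract's two core ideas, scoring a verbal task from a transcript and comparing automatic scores with expert scores, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `score_word_list` and `pearson`, the representation of word alternatives as extra transcript hypotheses, the whitespace tokenization, and the choice of the Pearson coefficient are all assumptions made for illustration.

```python
from typing import Iterable, Sequence


def score_word_list(target_words: Iterable[str], hypotheses: Iterable[str]) -> int:
    """Count target words found in any transcript hypothesis.

    Assumption: each hypothesis is a whitespace-tokenized transcript without
    punctuation. Passing only the 1-best transcript gives the strict score;
    passing additional hypotheses models the use of word alternatives.
    """
    targets = {w.lower() for w in target_words}
    heard = set()
    for hyp in hypotheses:
        heard.update(tok.lower() for tok in hyp.split())
    return len(targets & heard)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (var_x * var_y) ** 0.5
```

With the hypothetical target list `["apple", "chair", "river"]`, a 1-best transcript `"apple share river"` scores 2, while adding the alternative hypothesis `"apple chair river"` recovers the misrecognized word and scores 3. Taking the union over hypotheses is deliberately lenient, so it can also credit words the patient never said; a real system would need to weigh that trade-off.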
