Paper title

Performance and utility trade-off in interpretable sleep staging

Authors

Al-Hussaini, Irfan; Mitchell, Cassie S.

Abstract

Recent advances in deep learning have led to the development of models approaching the human level of accuracy. However, healthcare remains an area where such models have not been widely adopted. The safety-critical nature of healthcare results in a natural reticence to put these black-box deep learning models into practice. This paper explores interpretable methods for a clinical decision support system called sleep staging, an essential step in diagnosing sleep disorders. Clinical sleep staging is an arduous process requiring manual annotation of each 30 s of sleep using physiological signals such as the electroencephalogram (EEG). Recent work has shown that sleep staging using simple models and an exhaustive set of features can perform nearly as well as deep learning approaches, but only for some specific datasets. Moreover, the utility of those features from a clinical standpoint is ambiguous. On the other hand, the proposed framework, NormIntSleep, demonstrates exceptional performance across different datasets by representing deep learning embeddings using normalized features. NormIntSleep performs 4.5% better than the exhaustive feature-based approach and 1.5% better than other representation learning approaches. An empirical comparison of the utility of these models' interpretations shows that alignment with clinical expectations improves when a small amount of performance is traded off. NormIntSleep paired with a clinically meaningful set of features can best balance this trade-off by providing reliable, clinically relevant interpretation with robust performance.
