Classification Under Uncertainty: Data Analysis for Diagnostic Antibody Testing

Paper title

Classification Under Uncertainty: Data Analysis for Diagnostic Antibody Testing

Authors

Patrone, Paul N., Kearsley, Anthony J.

Abstract

Formulating accurate and robust classification strategies is a key challenge of developing diagnostic and antibody tests. Methods that do not explicitly account for disease prevalence and uncertainty therein can lead to significant classification errors. We present a novel method that leverages optimal decision theory to address this problem. As a preliminary step, we develop an analysis that uses an assumed prevalence and conditional probability models of diagnostic measurement outcomes to define optimal (in the sense of minimizing rates of false positives and false negatives) classification domains. Critically, we demonstrate how this strategy can be generalized to a setting in which the prevalence is unknown by either: (i) defining a third class of hold-out samples that require further testing; or (ii) using an adaptive algorithm to estimate prevalence prior to defining classification domains. We also provide examples for a recently published SARS-CoV-2 serology test and discuss how measurement uncertainty (e.g., associated with instrumentation) can be incorporated into the analysis. We find that our new strategy decreases classification error by up to a decade (a factor of ten) relative to more traditional methods based on confidence intervals. Moreover, it establishes a theoretical foundation for generalizing techniques such as receiver operating characteristics (ROC) by connecting them to the broader field of optimization.
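
As a rough illustration of the preliminary step described in the abstract, the sketch below labels a measurement positive when the prevalence-weighted positive density exceeds the prevalence-weighted negative density. This assumes the error being minimized is the prevalence-weighted sum of false-positive and false-negative rates, which is one standard reading of such a rule. The Gaussian models, their parameters, and all function names are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical conditional models of a scalar measurement (assumed, not from the paper).
def pos_pdf(x):
    """Assumed density of measurements from positive samples."""
    return gaussian_pdf(x, 3.0, 1.0)


def neg_pdf(x):
    """Assumed density of measurements from negative samples."""
    return gaussian_pdf(x, 0.0, 1.0)


def classify(x, prevalence):
    """Return True (positive) when q * P(x) > (1 - q) * N(x), with q the prevalence."""
    return prevalence * pos_pdf(x) > (1.0 - prevalence) * neg_pdf(x)


def boundary(prevalence):
    """Decision boundary for the two unit-variance Gaussians assumed above.

    Solving q * P(x) = (1 - q) * N(x) for means 3 and 0 gives
    x = 1.5 + ln((1 - q) / q) / 3.
    """
    return 1.5 + math.log((1.0 - prevalence) / prevalence) / 3.0


if __name__ == "__main__":
    for q in (0.5, 0.05):
        print(f"prevalence={q}: boundary at x={boundary(q):.3f}, "
              f"x=2.0 classified positive: {classify(2.0, q)}")
```

Under these assumed models, lowering the prevalence moves the boundary toward larger measurement values, so a borderline reading that counts as positive at 50% prevalence is classified negative at 5%. This is the kind of prevalence dependence that a fixed cutoff does not capture.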
