Hypothesis Testing in Sequentially Sampled Data: AdapRT to Maximize Power Beyond iid Sampling

Paper title

Hypothesis Testing in Sequentially Sampled Data: AdapRT to Maximize Power Beyond iid Sampling

Authors

Dae Woong Ham, Jiaze Qiu

Abstract

Testing whether a variable of interest affects the outcome is one of the most fundamental problems in statistics and is often the main scientific question of interest. To tackle this problem, the conditional randomization test (CRT) is widely used to test the independence of variable(s) of interest (X) and an outcome (Y), holding other variable(s) (Z) fixed. The CRT uses randomization-based (design-based) inference that relies solely on the iid sampling of (X, Z) to produce exact finite-sample p-values, which can be constructed using any test statistic. We propose a new method, the adaptive randomization test (ART), that tackles the independence problem while allowing the data to be adaptively sampled. We first showcase the ART in a particular multi-arm bandit problem known as the normal-mean model. Under this setting, we theoretically characterize the power of both the iid sampling procedure and the adaptive sampling procedure, and we find empirically that the ART can uniformly outperform the CRT that pulls all arms independently with equal probability. Surprisingly, we also find that the ART can be more powerful than even the CRT that uses an oracle iid sampling procedure when the signal is relatively strong. We believe that the proposed adaptive procedure is successful because it takes arms that may initially look like "fake" signals due to random chance and stabilizes them closer to "null" signals. We additionally apply the ART to a popular factorial survey design setting known as conjoint analysis. We find similar results through simulations and a recent application concerning the role of gender discrimination in political candidate evaluation.
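As an illustration of the iid baseline the abstract refers to, the following is a minimal sketch of a randomization test in a normal-mean bandit where every arm is pulled independently with equal probability. It is not the authors' implementation and does not implement the adaptive procedure (ART), whose details are not given in the abstract. The function names, the test statistic (largest deviation of an arm mean from the overall mean), and the simulated data are all assumptions chosen for illustration, and there are no conditioning variables Z in this toy setting.

```python
import numpy as np

def arm_mean_spread(x, y, n_arms):
    """Hypothetical test statistic: largest |arm mean - overall mean| over pulled arms."""
    counts = np.bincount(x, minlength=n_arms)
    sums = np.bincount(x, weights=y, minlength=n_arms)
    pulled = counts > 0  # skip arms that were never pulled
    means = sums[pulled] / counts[pulled]
    return float(np.max(np.abs(means - y.mean())))

def randomization_p_value(x, y, n_arms, n_resamples=999, seed=0):
    """Randomization p-value assuming arms were drawn iid uniformly (an assumption)."""
    rng = np.random.default_rng(seed)
    observed = arm_mean_spread(x, y, n_arms)
    exceed = 0
    for _ in range(n_resamples):
        # Redraw the arm assignments from the known sampling design,
        # independently of the outcomes y, as holds under the null.
        x_tilde = rng.integers(0, n_arms, size=len(x))
        if arm_mean_spread(x_tilde, y, n_arms) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the resamples.
    return (1 + exceed) / (1 + n_resamples)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n_arms, n_pulls = 4, 400
    mu = np.array([0.0, 0.0, 0.0, 3.0])   # illustrative arm means, one strong signal
    x = rng.integers(0, n_arms, size=n_pulls)
    y = rng.normal(loc=mu[x], scale=1.0)
    print(randomization_p_value(x, y, n_arms))
```

Because the resampled assignments come from the same design that generated the data, the p-value does not rely on any model for Y; with adaptive sampling, the resampling step would instead have to follow the adaptive design, which this sketch does not attempt.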
