A Genetic Algorithm-based Framework for Learning Statistical Power Manifold

Paper Title

A Genetic Algorithm-based Framework for Learning Statistical Power Manifold

Authors

Umrawal, Abhishek K.; Lane, Sean P.; Hennes, Erin P.

Abstract

Statistical power is a measure of the replicability of a categorical hypothesis test. Formally, it is the probability of detecting an effect when a true effect is present in the population. Optimizing statistical power as a function of the parameters of a hypothesis test is therefore desirable. For most hypothesis tests, however, the explicit functional form of statistical power in terms of individual model parameters is unknown, although power for a given set of parameter values can be calculated using simulated experiments. These simulated experiments are usually computationally expensive, so developing the entire statistical power manifold through simulation can be very time-consuming. We propose a novel genetic algorithm-based framework for learning statistical power manifolds. For a multiple linear regression $F$-test, we show that the proposed framework learns the statistical power manifold much faster than a brute-force approach, because the number of queries to the power oracle is significantly reduced. We also show that the quality of the learned manifold improves as the number of genetic algorithm iterations increases. Such tools are useful for evaluating statistical power trade-offs when researchers have little information regarding a priori best guesses of primary effect sizes of interest, or regarding how sampling variability in non-primary effects impacts power for primary ones.
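To make the idea of a simulation-based "power oracle" concrete, the following is a minimal sketch of how power for the overall $F$-test in a multiple linear regression might be estimated by Monte Carlo. It is not the authors' implementation, and it does not show the genetic algorithm itself. The names `solve`, `f_statistic`, and `power_oracle` are hypothetical, and the design choices are assumptions made only for illustration: independent standard-normal predictors, Gaussian errors, and a critical value estimated by simulating under the null hypothesis rather than taken from the $F$ distribution (so that only the Python standard library is needed).

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def f_statistic(betas, n, sigma, rng):
    """Simulate one data set and return the overall regression F statistic.

    Assumed model (illustrative): y = sum_j betas[j] * x_j + e, with
    x_j ~ N(0, 1) independent and e ~ N(0, sigma^2); an intercept is fitted.
    """
    p = len(betas)
    rows = [[1.0] + [rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(b * x for b, x in zip(betas, row[1:])) + rng.gauss(0, sigma)
         for row in rows]
    # Ordinary least squares via the normal equations (X'X) coef = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p + 1)]
           for i in range(p + 1)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p + 1)]
    coef = solve(xtx, xty)
    fitted = [sum(c * x for c, x in zip(coef, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)
    return (ss_reg / p) / (ss_res / (n - p - 1))


def power_oracle(betas, n, sigma=1.0, alpha=0.05, reps=500, seed=0):
    """Estimate power as the rejection rate over `reps` simulated experiments."""
    rng = random.Random(seed)
    p = len(betas)
    # Monte Carlo critical value: empirical (1 - alpha) quantile under the null.
    null_stats = sorted(f_statistic([0.0] * p, n, sigma, rng) for _ in range(reps))
    crit = null_stats[min(reps - 1, round((1 - alpha) * reps))]
    rejections = sum(f_statistic(betas, n, sigma, rng) > crit for _ in range(reps))
    return rejections / reps


if __name__ == "__main__":
    # One oracle query: estimated power at a single point in parameter space.
    print(power_oracle([0.3, 0.0], n=30))
```

Each call to `power_oracle` runs `2 * reps` simulated regressions, which illustrates why evaluating power on a dense grid of parameter values by brute force becomes expensive and why reducing the number of oracle queries matters.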
