Homogeneity Tests of Covariance and Change-Points Identification for High-Dimensional Functional Data

Paper Title

Homogeneity Tests of Covariance and Change-Points Identification for High-Dimensional Functional Data

Authors

Santo, Shawn; Zhong, Ping-Shou

Abstract

We consider inference problems for high-dimensional (HD) functional data with a dense number (T) of repeated measurements taken for a large number (p) of variables from a small number (n) of experimental units. The spatial and temporal dependence, high dimensionality, and the dense number of repeated measurements all make theoretical studies and computation challenging. This paper has two aims. The first is to address the theoretical and computational challenges in detecting and identifying change points among covariance matrices from HD functional data. The second is to provide computationally efficient and tuning-free tools with guaranteed stochastic error control. The change point detection procedure is developed in the form of testing the homogeneity of covariance matrices. The weak convergence of the stochastic process formed by the test statistics is established under the "large p, large T and small n" setting. Under a mild set of conditions, our change point identification estimator is proven to be consistent for change points at any location in a sequence. Its rate of convergence depends on the data dimension, sample size, number of repeated measurements, and signal-to-noise ratio. We also show that our proposed computation algorithms can significantly reduce the computation time and are applicable to real-world data such as fMRI data with a large number of HD repeated measurements. Simulation results demonstrate both the finite-sample performance and the computational effectiveness of our proposed procedures. We observe that the empirical size of the test is well controlled at the nominal level, and that the locations of multiple change points can be identified accurately. An application to fMRI data demonstrates that our proposed methods can identify event boundaries in the preface of the movie Sherlock. Our proposed procedures are implemented in an R package TechPhD.
