Paper Title
Clustering US States by Time Series of COVID-19 New Case Counts with Non-negative Matrix Factorization
Paper Authors
Abstract
The spreading patterns of COVID-19 differ considerably across US states under different quarantine measures and reopening policies. We propose clustering the US states into distinct communities based on their daily new confirmed case counts, using a non-negative matrix factorization (NMF) followed by k-means clustering on the coefficients of the NMF basis. A cross-validation method was employed to select the rank of the NMF. Applying the method to the entire study period, from March 22 to July 25, we clustered the 49 continental states (including the District of Columbia) into 7 groups, two of which each contained a single state. To investigate how the clustering results change over time, the same method was applied successively to time periods in one-week increments, starting from the period of March 22 to March 28. The results suggest a change point in the clustering in the week starting on May 30, which might be explained by the combined impact of quarantine measures and reopening policies.
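The two-stage procedure described in the abstract (NMF on the case-count matrix, then k-means on the resulting coefficients) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the toy case counts, the state labels A to D, the multiplicative-update NMF solver, the fixed rank of 2, and k = 2 are all hypothetical choices made for the example. The abstract does not name a solver, and the paper selects the rank by cross-validation, which is omitted here.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(X, rank, iters=500, seed=0, eps=1e-9):
    """Approximate X (states x days) by W @ H with W, H >= 0.

    Uses multiplicative updates for the squared error (an assumed solver).
    Rows of H are basis time series; rows of W are per-state coefficients.
    """
    rng = random.Random(seed)
    n, t = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(t)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Toy daily new-case counts (made up): A and B peak early, C and D rise late.
states = ["A", "B", "C", "D"]
X = [
    [10, 8, 4, 2, 0, 0],
    [11, 9, 4, 2, 0, 0],
    [0, 0, 2, 4, 8, 10],
    [0, 0, 2, 4, 9, 11],
]

W, H = nmf(X, rank=2)      # rank fixed by hand here, not cross-validated
labels = kmeans(W, k=2)    # cluster states by their NMF coefficients

clusters = {}
for state, lab in zip(states, labels):
    clusters.setdefault(lab, []).append(state)
print(sorted(clusters.values()))
```

In this toy setting, states with similar epidemic curves receive similar coefficient vectors in W, so k-means groups them together. For real data, a library implementation with rank selection would likely be preferable to this hand-rolled solver.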