Paper Title
Empirical Measure Large Deviations for Reinforced Chains on Finite Spaces
Paper Authors
Paper Abstract
Let $A$ be a transition probability kernel on a finite state space $\Delta^o = \{1, \ldots, d\}$ such that $A(x,y) > 0$ for all $x, y \in \Delta^o$. Consider a reinforced chain given as a sequence $\{X_n, \; n \in \mathbb{N}_0\}$ of $\Delta^o$-valued random variables, defined recursively according to $$L^n = \frac{1}{n}\sum_{i=0}^{n-1} \delta_{X_i}, \;\; P(X_{n+1} \in \cdot \mid X_0, \ldots, X_n) = L^n A(\cdot).$$ We establish a large deviation principle for $\{L^n\}$. The rate function takes a strikingly different form from the Donsker-Varadhan rate function associated with the empirical measure of the Markov chain with transition kernel $A$: it is described in terms of a novel deterministic infinite horizon discounted cost control problem with an associated linear controlled dynamics and a nonlinear running cost involving the relative entropy function. Proofs are based on an analysis of the time-reversal of controlled dynamics in representations for log-transforms of exponential moments, and on weak convergence methods.
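To make the dynamics concrete, here is a minimal Python sketch of the recursion stated in the abstract: the next state $X_{n+1}$ is drawn from the mixture $L^n A(\cdot) = \sum_x L^n(x)\, A(x, \cdot)$, where $L^n$ is the empirical measure of $X_0, \ldots, X_{n-1}$. The function name, the 0-based state labels, and the choice $X_1 \sim A(X_0, \cdot)$ for the first transition (which the recursion itself leaves unspecified) are our own conventions, not from the paper.

```python
import numpy as np

def simulate_reinforced_chain(A, n_steps, x0=0, seed=None):
    """Sample X_0, ..., X_{n_steps} of the reinforced chain on {0, ..., d-1}
    and return the trajectory together with the empirical measure L^{n_steps}."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    traj = [x0]
    counts = np.zeros(d)                      # visit counts of X_0, ..., X_{n-1}
    # The recursion P(X_{n+1} in . | X_0, ..., X_n) = L^n A(.) starts at n = 1;
    # as a convention (an assumption, not in the abstract) take X_1 ~ A(X_0, .).
    traj.append(rng.choice(d, p=A[x0]))
    counts[x0] += 1                           # L^1 = delta_{X_0}
    for n in range(1, n_steps):
        Ln = counts / n                       # L^n, empirical measure of X_0..X_{n-1}
        traj.append(rng.choice(d, p=Ln @ A))  # X_{n+1} ~ L^n A(.)
        counts[traj[-2]] += 1                 # X_n now enters L^{n+1}
    return np.array(traj), counts / n_steps

# Usage: a two-state kernel with all entries positive, as the abstract requires.
A = np.array([[0.5, 0.5],
              [0.2, 0.8]])
traj, L = simulate_reinforced_chain(A, n_steps=10_000, seed=0)
print(L)  # realization of L^n, the object whose large deviations the paper studies
```

Note that, unlike a Markov chain with kernel $A$, the sampling distribution here depends on the whole history only through $L^n$, which is why the large deviation analysis of $\{L^n\}$ departs from the Donsker-Varadhan theory.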