缺少项目发现问题的流算法

论文标题

缺少项目发现问题的流算法

Streaming algorithms for the missing item finding problem

论文作者

Stoeckl, Manuel

论文摘要

数据流上的许多问题已经在两个极端的困难中进行了研究：要么在静态设置中允许随机算法（在最坏情况下，它们应以有限的概率误导）；或仅需要确定性和无误算法时。最近的一些作品考虑了对抗性设置，其中即使在自适应对手提供的数据流中，随机流算法也必须成功，该数据流可以看到算法的中间输出。为了更好地了解这些模型之间的差异，我们研究了一个称为“缺少项目查找”的流媒体任务。在此问题中，对于$ r <n $，一个数据流$ a_1，\ ldots，$ [n] $（可能带有重复）的元素的a_r $，并且必须在[n]中输出一些$ x \ in [n] $，而这些$ x \ in [n] $不等于任何$ a_i $。我们证明，对于$ r = n^{θ（1）} $和$Δ= 1/\ mathrm {poly}（n）$，在静态设置中解决此问题的随机算法所需的空间，具有错误$δ$ $δ$ us $δ$ us $θ（\ mathrm {polylog}（polylog}（n）（n））$;对于具有错误$δ$的对抗设置中的算法，$θ（（1 + r^2 / n）\ mathrm {polylog}（n）（n））$;对于确定性算法，$θ（r / \ mathrm {polylog}（n））$。由于我们的对抗性鲁棒算法依赖于免费访问$ o（r \ log n）$随机位的字符串，因此我们研究了流媒体算法的“随机启动”模型，其中所有随机位均包含在空间成本中。在这里，我们在空间使用情况下找到了有条件的下限，这取决于伪确定算法所需的空间来解决该问题。我们还证明了$ω（r / \ mathrm {polylog}（n））$下限，该算法的空间为$ <1/2^{\ mathrm {polylog}（n）} $ erory所需的空间，而“白盒”对手可以看到“白框”对手，但可以随机预测其内部的预期，但预测了未来的未来决定。

Many problems on data streams have been studied at two extremes of difficulty: either allowing randomized algorithms, in the static setting (where they should err with bounded probability on the worst case stream); or when only deterministic and infallible algorithms are required. Some recent works have considered the adversarial setting, in which a randomized streaming algorithm must succeed even on data streams provided by an adaptive adversary that can see the intermediate outputs of the algorithm. In order to better understand the differences between these models, we study a streaming task called "Missing Item Finding". In this problem, for $r < n$, one is given a data stream $a_1,\ldots,a_r$ of elements in $[n]$, (possibly with repetitions), and must output some $x \in [n]$ which does not equal any of the $a_i$. We prove that, for $r = n^{Θ(1)}$ and $δ= 1/\mathrm{poly}(n)$, the space required for randomized algorithms that solve this problem in the static setting with error $δ$ is $Θ(\mathrm{polylog}(n))$; for algorithms in the adversarial setting with error $δ$, $Θ((1 + r^2 / n) \mathrm{polylog}(n))$; and for deterministic algorithms, $Θ(r / \mathrm{polylog}(n))$. Because our adversarially robust algorithm relies on free access to a string of $O(r \log n)$ random bits, we investigate a "random start" model of streaming algorithms where all random bits used are included in the space cost. Here we find a conditional lower bound on the space usage, which depends on the space that would be needed for a pseudo-deterministic algorithm to solve the problem. We also prove an $Ω(r / \mathrm{polylog}(n))$ lower bound for the space needed by a streaming algorithm with $< 1/2^{\mathrm{polylog}(n)}$ error against "white-box" adversaries that can see the internal state of the algorithm, but not predict its future random decisions.

下载PDF全文

下载文献需遵守相关版权规定

论文标题