Paper Title
Private Eye: On the Limits of Textual Screen Peeking via Eyeglass Reflections in Video Conferencing
Paper Authors
Paper Abstract
Using mathematical modeling and human subjects experiments, this research explores the extent to which emerging webcams might leak recognizable textual and graphical information gleaned from eyeglass reflections captured by webcams. The primary goal of our work is to measure, compute, and predict the factors, limits, and thresholds of recognizability as webcam technology evolves in the future. Our work explores and characterizes the viable threat models based on optical attacks using multi-frame super-resolution techniques on sequences of video frames. Our models and experimental results in a controlled lab setting show it is possible to reconstruct and recognize with over 75% accuracy on-screen texts that have heights as small as 10 mm with a 720p webcam. We further apply this threat model to web textual content with varying attacker capabilities to find thresholds at which text becomes recognizable. Our user study with 20 participants suggests present-day 720p webcams are sufficient for adversaries to reconstruct textual content on big-font websites. Our models further show that the evolution towards 4K cameras will tip the threshold of text leakage to reconstruction of most header texts on popular websites. Besides textual targets, a case study on recognizing a closed-world dataset of Alexa top 100 websites with 720p webcams shows a maximum recognition accuracy of 94% with 10 participants even without using machine-learning models. Our research proposes near-term mitigations including a software prototype that users can use to blur the eyeglass areas of their video streams. For possible long-term defenses, we advocate an individual reflection testing procedure to assess threats under various settings, and justify the importance of following the principle of least privilege for privacy-sensitive scenarios.
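The multi-frame super-resolution step mentioned in the abstract can be illustrated with a minimal shift-and-add sketch: several low-resolution frames, each offset by a known sub-pixel shift, are mapped onto a finer grid and averaged. This is not the paper's actual pipeline (which would also need frame registration and deconvolution); the function name, the assumption of known shifts, and the nearest-neighbor placement are all simplifications for illustration.

```python
import numpy as np

def multiframe_superres(frames, shifts, scale):
    """Naive shift-and-add multi-frame super-resolution (illustrative only).

    frames: list of 2-D low-res arrays, all of shape (h, w)
    shifts: per-frame sub-pixel offsets (dy, dx), in low-res pixel units
    scale:  integer upscaling factor of the high-res grid
    """
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))   # accumulated intensities
    cnt = np.zeros_like(acc)                 # samples landing in each cell
    for frame, (dy, dx) in zip(frames, shifts):
        # Map each low-res sample to its nearest cell on the high-res grid,
        # taking the frame's sub-pixel shift into account.
        ys = np.clip(np.round((np.arange(h) + dy) * scale).astype(int),
                     0, h * scale - 1)
        xs = np.clip(np.round((np.arange(w) + dx) * scale).astype(int),
                     0, w * scale - 1)
        acc[np.ix_(ys, xs)] += frame
        cnt[np.ix_(ys, xs)] += 1
    cnt[cnt == 0] = 1  # leave unvisited high-res cells at zero
    return acc / cnt

# Usage sketch: four identically valued frames with quarter-pixel shifts
# fill a 2x-denser grid than any single frame could.
frames = [np.ones((4, 4)) for _ in range(4)]
shifts = [(0, 0), (0, 0.5), (0.5, 0), (0.5, 0.5)]
hires = multiframe_superres(frames, shifts, scale=2)
```

The intuition matching the threat model is that each video frame samples the reflected text at a slightly different sub-pixel position (due to natural head motion), so combining many frames recovers detail that no individual low-resolution frame contains.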