Exploring unsupervised top tagging using Bayesian inference

Paper title

Exploring unsupervised top tagging using Bayesian inference

Authors

Ezequiel Alvarez, Manuel Szewc, Alejandro Szynkman, Santiago A. Tanco, Tatiana Tarutina

Abstract

Recognizing hadronically decaying top-quark jets in a sample of jets, or even estimating their total fraction in the sample, is an important step in many LHC searches for both Standard Model and Beyond Standard Model physics. Although outstanding top-tagger algorithms exist, their construction and expected performance rely on Monte Carlo simulations, which may induce biases. For these reasons we develop two simple unsupervised top-tagger algorithms based on Bayesian inference on a mixture model. In one of them the observed variable is a new geometrically based observable, $\tilde{A}_{3}$; in the other we use the more traditional $N$-subjettiness ratio $\tau_{3}/\tau_{2}$, which yields better performance. As expected, the unsupervised taggers perform below existing supervised taggers, reaching an expected Area Under Curve (AUC) of $\sim 0.80$-$0.81$ and accuracies of about 69%-75% over the full range of sample purity. However, these performances are more robust to possible biases in the Monte Carlo than those of their supervised counterparts. Our findings are a step towards exploring and considering simpler and unbiased taggers.
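
To make the idea of Bayesian inference on a mixture model more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a single observable per jet (standing in for something like $\tau_{3}/\tau_{2}$), models the "top" and "QCD" classes as Gaussians of known width, and uses a Gibbs sampler to infer the top fraction and the per-jet posterior probability of being a top jet without using truth labels. The Gaussian shapes, the prior means and widths, the toy sample parameters, and the function names `gibbs_two_component` and `make_toy_sample` are all illustrative assumptions; the abstract does not specify the likelihood or priors used in the paper.

```python
import math
import random


def gibbs_two_component(x, sigma=0.1, prior_mean_top=0.4, prior_mean_qcd=0.7,
                        prior_sd=0.2, n_iter=300, burn_in=100, seed=0):
    """Gibbs sampler for a toy two-Gaussian mixture with known width sigma.

    Class 1 plays the role of "top", class 0 of "QCD" (assumed toy model).
    Returns the posterior-mean top fraction and per-jet top probabilities.
    """
    rng = random.Random(seed)
    prior_mean = [prior_mean_qcd, prior_mean_top]
    mu = list(prior_mean)   # current class means, started at the prior means
    frac = 0.5              # current top fraction
    prob_sum = [0.0] * len(x)
    frac_sum = 0.0
    kept = 0
    for it in range(n_iter):
        # 1) Sample class labels given the current fraction and means.
        counts = [0, 0]
        sums = [0.0, 0.0]
        for i, xi in enumerate(x):
            w1 = frac * math.exp(-0.5 * ((xi - mu[1]) / sigma) ** 2)
            w0 = (1.0 - frac) * math.exp(-0.5 * ((xi - mu[0]) / sigma) ** 2)
            p1 = w1 / (w1 + w0) if (w1 + w0) > 0.0 else 0.5
            z = 1 if rng.random() < p1 else 0
            counts[z] += 1
            sums[z] += xi
            if it >= burn_in:
                prob_sum[i] += p1
        # 2) Sample the fraction from its Beta posterior (flat Beta(1, 1) prior).
        frac = rng.betavariate(1 + counts[1], 1 + counts[0])
        # 3) Sample each class mean from its conjugate Gaussian posterior.
        for k in (0, 1):
            prec = 1.0 / prior_sd ** 2 + counts[k] / sigma ** 2
            mean = (prior_mean[k] / prior_sd ** 2 + sums[k] / sigma ** 2) / prec
            mu[k] = rng.gauss(mean, 1.0 / math.sqrt(prec))
        if it >= burn_in:
            frac_sum += frac
            kept += 1
    return frac_sum / kept, [p / kept for p in prob_sum]


def make_toy_sample(n=2000, true_frac=0.3, seed=1):
    """Draw a labelled toy sample; the labels are used only for checking."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < true_frac else 0 for _ in range(n)]
    x = [rng.gauss(0.45 if z else 0.75, 0.1) for z in labels]
    return x, labels


if __name__ == "__main__":
    x, labels = make_toy_sample()
    frac, probs = gibbs_two_component(x)
    accuracy = sum((p > 0.5) == bool(z) for p, z in zip(probs, labels)) / len(x)
    print(f"inferred top fraction: {frac:.3f}, accuracy: {accuracy:.3f}")
```

The weakly informative priors on the two means (top below QCD) are what keep the two classes from swapping roles in this sketch. The toy classes are much better separated than real jets, so the accuracy printed here is not comparable to the figures quoted in the abstract.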
