论文标题

分类变量的黑盒优化的傅立叶表示形式

Fourier Representations for Black-Box Optimization over Categorical Variables

论文作者

Dadkhahi, Hamid, Rios, Jesus, Shanmugam, Karthikeyan, Das, Payel

论文摘要

在纯粹的分类变量上定义的现实世界黑框函数的优化是一个活跃的研究领域。特别是,具有特定功能或结构特性的生物序列的优化和设计对医学,材料科学和生物技术具有深远的影响。独立搜索算法,例如模拟退火(SA)和蒙特卡洛树搜索(MCT),通常用于此类优化问题。为了提高此类算法的性能和样本效率,我们建议将现有方法与替代模型结合使用,以对纯粹的分类变量进行黑框评估。为此,我们提出了两种不同的表示,一个群体理论的傅立叶扩展和一个删节的单热编码布尔傅里叶扩展。为了学习此类表示形式,我们考虑两种不同的设置来更新我们的替代模型。首先,我们利用了对抗性的在线回归设置,在该设置中,每次评估黑匣子时,每个表示的傅立叶字符都被视为专家,并且通过指数重量更新规则进行更新。其次,我们考虑通过汤普森采样选择查询的贝叶斯环境,并通过稀疏的贝叶斯回归模型(超过我们提出的代表)更新后验,并具有正规的马蹄铁。对合成基准测试以及现实世界中RNA序列优化和设计问题的数值实验证明了所提出方法的代表性,与最先进的方法相比,这些方法实现了竞争性或优越的性能,同时提高了计算成本和/或样本效率。

Optimization of real-world black-box functions defined over purely categorical variables is an active area of research. In particular, optimization and design of biological sequences with specific functional or structural properties have a profound impact in medicine, materials science, and biotechnology. Standalone search algorithms, such as simulated annealing (SA) and Monte Carlo tree search (MCTS), are typically used for such optimization problems. In order to improve the performance and sample efficiency of such algorithms, we propose to use existing methods in conjunction with a surrogate model for the black-box evaluations over purely categorical variables. To this end, we present two different representations, a group-theoretic Fourier expansion and an abridged one-hot encoded Boolean Fourier expansion. To learn such representations, we consider two different settings to update our surrogate model. First, we utilize an adversarial online regression setting where Fourier characters of each representation are considered as experts and their respective coefficients are updated via an exponential weight update rule each time the black box is evaluated. Second, we consider a Bayesian setting where queries are selected via Thompson sampling and the posterior is updated via a sparse Bayesian regression model (over our proposed representation) with a regularized horseshoe prior. Numerical experiments over synthetic benchmarks as well as real-world RNA sequence optimization and design problems demonstrate the representational power of the proposed methods, which achieve competitive or superior performance compared to state-of-the-art counterparts, while improving the computation cost and/or sample efficiency, substantially.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源