Title
Gradient boosting decision trees classification of blazars of uncertain type in the fourth Fermi-LAT catalog
Authors
Abstract
The deepest all-sky survey available in the $\gamma$-ray band, the last release of the Fermi-LAT catalogue (4FGL-DR3), is based on data accumulated over 12 years and contains more than 6600 sources. The largest population among these sources is the blazar subclass (3743 objects), $60.1\%$ of which are classified as BL Lacertae objects (BL Lacs) or Flat Spectrum Radio Quasars (FSRQs), while the rest are listed as blazar candidates of uncertain type (BCUs) because they lack a firm optical classification. The goal of this study is to classify BCUs using different machine learning algorithms trained on the spectral and temporal properties of already classified BL Lacs and FSRQs. Artificial Neural Networks, \textit{XGBoost}, and LightGBM algorithms are employed to construct predictive models for BCU classification. Using 18 input parameters of 2219 BL Lacs and FSRQs, we train (80\% of the sample) and test (20\%) these algorithms and find that the LightGBM model, a state-of-the-art classification algorithm based on gradient boosting decision trees, provides the highest performance. Based on our best model, we classify 825 BCUs as BL Lac candidates and 405 as FSRQ candidates; 190 remain without a clear prediction, but the percentage of BCUs in 4FGL is reduced to 5.1\%. The $\gamma$-ray photon index, synchrotron peak frequency, and high energy peak frequency of a large sample are used to investigate the relationship between FSRQs and BL Lacs (LBLs, IBLs, and HBLs).
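To make the workflow in the abstract concrete, the following is a minimal sketch of gradient boosting with decision trees for a two-class problem, using an 80/20 train/test split. It is not the authors' pipeline and does not use LightGBM: it is a pure-Python toy that boosts depth-1 trees (stumps) on the log-loss gradient. The data are synthetic, with two features rather than the 18 catalogue parameters, and the helper names (`fit_stump`, `train_gbdt`, `predict_proba`, `label_for`) as well as the 0.8 probability cut-off for an "uncertain" outcome are assumptions made for illustration; the abstract does not state the threshold actually used.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) of the
    depth-1 tree that best fits the residuals in the least-squares sense.
    Assumes at least one feature has two distinct values."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def train_gbdt(X, y, n_rounds=20, lr=0.5):
    """Boost stumps on the negative gradient of the log-loss, y - p."""
    p = sum(y) / len(y)
    f0 = math.log(p / (1.0 - p))  # initial score: log-odds of the class prior
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lval, rval = fit_stump(X, residuals)
        stumps.append((j, thr, lval, rval))
        scores = [s + lr * (lval if row[j] <= thr else rval)
                  for row, s in zip(X, scores)]
    return f0, lr, stumps


def predict_proba(model, row):
    """Probability of class 1 for a single feature vector."""
    f0, lr, stumps = model
    score = f0 + sum(lr * (lval if row[j] <= thr else rval)
                     for j, thr, lval, rval in stumps)
    return sigmoid(score)


def label_for(prob, cut=0.8):
    """Hypothetical decision rule: the 0.8 cut-off is an assumption."""
    if prob >= cut:
        return "FSRQ candidate"
    if prob <= 1.0 - cut:
        return "BL Lac candidate"
    return "uncertain"


# Synthetic stand-in data: two Gaussian clusters, NOT catalogue values.
random.seed(0)
data = []
for label, centre in ((0, -1.5), (1, 1.5)):
    for _ in range(75):
        features = [random.gauss(centre, 1.0), random.gauss(centre, 1.0)]
        data.append((features, label))
random.shuffle(data)

n_train = len(data) * 8 // 10  # 80% for training, 20% for testing
train, test = data[:n_train], data[n_train:]
X_train, y_train = [x for x, _ in train], [t for _, t in train]
X_test, y_test = [x for x, _ in test], [t for _, t in test]

model = train_gbdt(X_train, y_train)
probs = [predict_proba(model, x) for x in X_test]
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y_test)) / len(y_test)
print("test accuracy:", accuracy)
print("first test labels:", [label_for(p) for p in probs[:5]])
```

In practice one would replace the toy booster with a library implementation such as LightGBM or XGBoost, which add regularisation, second-order leaf values, and much faster split finding; the sketch only shows the underlying idea of fitting successive trees to the gradient of the loss.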