Paper Title

Leveraging Pre-trained Models for Failure Analysis Triplets Generation

Paper Authors

Kenneth Ezukwoke, Anis Hoayek, Mireille Batton-Hubert, Xavier Boucher, Pascal Gounet, Jerome Adrian

Paper Abstract

Pre-trained language models have recently gained traction in the Natural Language Processing (NLP) domain for text summarization, generation, and question-answering tasks. This stems from the innovations introduced in Transformer models and their overwhelming performance compared with recurrent neural network models such as Long Short-Term Memory (LSTM). In this paper, we leverage the attention mechanism of pre-trained causal language models such as the Transformer model for the downstream task of generating Failure Analysis Triplets (FATs) - a sequence of steps for analyzing defective components in the semiconductor industry. We compare different Transformer models for this generative task and observe that Generative Pre-trained Transformer 2 (GPT2) outperforms the other Transformer models on the failure analysis triplet generation (FATG) task. In particular, we observe that GPT2 (trained on 1.5B parameters) outperforms pre-trained BERT, BART and GPT3 by a large margin on ROUGE. Furthermore, we introduce the Levenshtein Sequential Evaluation metric (LESE) for better evaluation of the structured FAT data and show that it aligns more closely with human judgment than existing metrics.
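
For a concrete picture of the two ideas summarized in the abstract, the sketch below illustrates (a) conditioning a pre-trained causal language model (GPT2) on a failure description to generate a sequence of analysis steps, and (b) a word-level Levenshtein score in the spirit of LESE. It is a minimal sketch assuming the Hugging Face `transformers` library; the prompt layout, the `gpt2` checkpoint, the generation settings, and the `lese_like_score` normalization are illustrative assumptions, not the paper's actual fine-tuning setup or LESE definition.

```python
# Minimal sketch, assuming the Hugging Face transformers library.
# The prompt layout and scoring below are illustrative; the paper's exact
# preprocessing, fine-tuning, and LESE formulation are not reproduced here.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # the paper fine-tunes on failure analysis reports

# Hypothetical prompt: a failure description followed by a cue for the analysis steps.
prompt = "Failure description: short circuit after thermal cycling. Analysis steps:"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

output_ids = model.generate(
    input_ids,
    max_length=128,
    num_beams=4,
    no_repeat_ngram_size=3,
    early_stopping=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))


def levenshtein(a, b):
    """Edit distance (insert/delete/substitute) between two token sequences."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, start=1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]


def lese_like_score(hypothesis, reference):
    """Illustrative Levenshtein-based sequence score in [0, 1]; not the paper's LESE."""
    hyp, ref = hypothesis.split(), reference.split()
    return 1.0 - levenshtein(hyp, ref) / max(len(hyp), len(ref), 1)
```

A higher `lese_like_score` means the generated step sequence needs fewer word-level edits to match the reference, which is the intuition behind evaluating structured FATs with an edit-distance-based metric rather than with n-gram overlap alone.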
