Paper Title
Learning Visual-Semantic Embeddings for Reporting Abnormal Findings on Chest X-rays
Paper Authors
Paper Abstract
Automatic medical image report generation has drawn growing attention due to its potential to alleviate radiologists' workload. Existing work on report generation often trains encoder-decoder networks to generate complete reports. However, such models are affected by data bias (e.g., label imbalance) and face common issues inherent in text generation models (e.g., repetition). In this work, we focus on reporting abnormal findings on radiology images; instead of training on complete radiology reports, we propose a method to identify abnormal findings from the reports and to group them with unsupervised clustering and minimal rules. We formulate the task as cross-modal retrieval and propose Conditional Visual-Semantic Embeddings to align images and fine-grained abnormal findings in a joint embedding space. We demonstrate that our method is able to retrieve abnormal findings and outperforms existing generation models on both clinical correctness and text generation metrics.
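As a rough illustration of the cross-modal retrieval setup the abstract describes, the sketch below projects image features and abnormal-finding sentence embeddings into a joint space and trains them with a hinge-based triplet ranking loss over in-batch negatives, a standard visual-semantic embedding objective. The encoder choices, feature dimensions, and loss here are assumptions for illustration, not the paper's exact Conditional Visual-Semantic Embedding formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    """Project image and sentence features into a shared embedding space."""
    def __init__(self, img_dim=2048, txt_dim=768, emb_dim=512):
        super().__init__()
        # Assumed inputs: pooled CNN image features and sentence-encoder
        # embeddings of abnormal findings (dimensions are illustrative).
        self.img_proj = nn.Linear(img_dim, emb_dim)
        self.txt_proj = nn.Linear(txt_dim, emb_dim)

    def forward(self, img_feats, txt_feats):
        # L2-normalize so the dot product below is cosine similarity.
        v = F.normalize(self.img_proj(img_feats), dim=-1)
        t = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return v, t

def triplet_ranking_loss(v, t, margin=0.2):
    """Hinge-based ranking loss with in-batch negatives (VSE-style sketch)."""
    sim = v @ t.t()               # (B, B) image-text cosine similarities
    pos = sim.diag().view(-1, 1)  # similarities of matched pairs
    # Penalize negatives that score within `margin` of the positive pair.
    cost_t = (margin + sim - pos).clamp(min=0)      # image -> text direction
    cost_v = (margin + sim - pos.t()).clamp(min=0)  # text -> image direction
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    cost_t = cost_t.masked_fill(mask, 0)  # exclude the positive pairs
    cost_v = cost_v.masked_fill(mask, 0)
    return cost_t.mean() + cost_v.mean()

# Usage: at inference, retrieve abnormal findings for an image by ranking
# all candidate finding embeddings by cosine similarity to the image.
model = JointEmbedding()
img = torch.randn(8, 2048)  # dummy pooled image features
txt = torch.randn(8, 768)   # dummy embeddings of abnormal-finding sentences
v, t = model(img, txt)
loss = triplet_ranking_loss(v, t)
loss.backward()
```

Framing the task as retrieval over a fixed pool of grouped abnormal findings sidesteps the repetition and label-imbalance issues that free-form decoders face, since the model only ranks known finding sentences rather than generating text token by token.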