Paper Title
Real-time Data Ingestion at the Keck Observatory Archive (KOA)
Paper Authors
Paper Abstract
Since February of this year, KOA has been preparing, transferring, and ingesting data in near-real time as they are acquired; in most cases, data are available to observers through KOA within one minute of acquisition. Real-time ingestion will be complete for all active instruments by the end of Summer 2022. The observatory is supporting the development of modern Python data reduction pipelines which, when delivered, will automatically create science-ready data sets at the end of each night for ingestion into the archive. This presentation will describe the infrastructure developed to support real-time data ingestion, itself part of a larger initiative at the Observatory to modernize end-to-end operations. During telescope operations, the software at WMKO is executed automatically when a newly acquired file is recognized through monitoring a keyword-based observatory control system; this system is used at Keck to execute virtually all observatory functions. The monitor uses callbacks built into the control system to begin preparing each file individually for transmission to the archive, so scheduling scripts and file-system triggers are unnecessary. An HTTP-based system called from the Flask micro-framework enables file transfers between WMKO and NExScI and triggers data ingestion at NExScI. The ingestion system at NExScI is a compact (4 KLOC), highly fault-tolerant, Python-based system. It uses a shared file system to transfer data from WMKO to NExScI. The ingestion code is instrument-agnostic, with instrument parameters read from configuration files. It replaces an unwieldy (50 KLOC) C-based system that had been in use since 2004.
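The callback-driven, configuration-based design described in the abstract can be illustrated with a small sketch. This is not the KOA or WMKO code: the `KeywordStore` class is a toy stand-in for a keyword-based control system, and the keyword name `LASTFILE`, the `archive_dir` paths, and every function name are assumptions made up for the example. It shows only the general idea: a callback fires when a "new file" keyword changes, and one instrument-agnostic handler takes its parameters from a configuration file.

```python
import configparser
from collections import defaultdict

# Hypothetical per-instrument settings. The real KOA configuration
# format, keyword names, and paths are not given in the abstract.
CONFIG_TEXT = """
[HIRES]
file_keyword = LASTFILE
archive_dir = /koa/hires

[NIRC2]
file_keyword = LASTFILE
archive_dir = /koa/nirc2
"""


class KeywordStore:
    """Toy stand-in for a keyword-based control system with callbacks."""

    def __init__(self):
        self._values = {}
        self._callbacks = defaultdict(list)

    def subscribe(self, service, keyword, callback):
        """Register a callback for changes to one keyword of one service."""
        self._callbacks[(service, keyword)].append(callback)

    def write(self, service, keyword, value):
        """Store a keyword value and notify every subscriber of that keyword."""
        self._values[(service, keyword)] = value
        for callback in self._callbacks[(service, keyword)]:
            callback(service, keyword, value)


def load_instruments(text):
    """Parse instrument parameters from INI-style configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def make_handler(instrument, params, queue):
    """Build one callback; all instrument specifics come from params."""

    def on_new_file(service, keyword, value):
        filename = value.rsplit("/", 1)[-1]
        queue.append({
            "instrument": instrument,
            "source": value,
            "destination": f"{params['archive_dir']}/{filename}",
        })

    return on_new_file


def start_monitor(store, instruments, queue):
    """Subscribe one handler per configured instrument; no polling loop."""
    for name, params in instruments.items():
        handler = make_handler(name, params, queue)
        store.subscribe(name, params["file_keyword"], handler)
```

In this sketch, calling `start_monitor(store, load_instruments(CONFIG_TEXT), queue)` and then `store.write("HIRES", "LASTFILE", "/data/hires/hires0001.fits")` appends one transfer request to `queue`. Supporting another instrument would mean adding a configuration section rather than changing code, which is the sense in which the handler is instrument-agnostic.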