Paper Title
MEDFORD: A human and machine readable metadata markup language
Authors
Abstract
Reproducibility of research is essential for science. However, given how modern computational biology research is conducted, it is easy to lose track of small but critical details. Key details, such as the specific version of the software used or the iteration of a genome, can easily be lost in the shuffle or never noted at all. Much work is being done on the database and storage side to ensure that there is a place to store experiment-specific details, but current mechanisms for recording those details are cumbersome for scientists to use. We propose a new metadata description language, named MEDFORD, in which scientists can record all details relevant to their research. Human-readable, easily editable, and templatable, MEDFORD serves as a collection point for all notes that a researcher could find relevant to their research, whether for internal use or for future replication. MEDFORD has been applied to coral research, documenting work ranging from RNA-seq analyses to photo collections.