Formulae Differences Commence a Database for Interlaboratory Studies of Natural Organic Matter
Sarycheva Anastasia, Perminova Irina V., Nikolaev Evgeny N., Zherebker Alexander
Environmental Science and Technology, 2023, doi: 10.1021/acs.est.2c08002
Direct comparison of high-resolution mass spectrometry (HRMS) data acquired with different instruments or parameters remains problematic because the lists of molecular species derived from HRMS appear distinct even for the same sample. This inconsistency is caused by inherent inaccuracies associated with instrumental limitations and sample conditions. Hence, experimental data may not faithfully reflect the corresponding sample. We propose a method that classifies HRMS data based on the differences in the number of elements between each pair of molecular formulae within the formulae list, which preserves the essence of the given sample. The novel metric, formulae difference chains expected length (FDCEL), allowed samples measured by different instruments to be compared and classified. We also demonstrate a web application and a prototype of a uniform database for HRMS data that can serve as a benchmark for future biogeochemical and environmental applications. The FDCEL metric was successfully employed both for spectrum quality control and for the examination of samples of various nature.
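The pairwise step described in the abstract (differences in the number of elements between each pair of molecular formulae) can be illustrated with a minimal Python sketch. It covers only that step and does not compute FDCEL, whose definition is not given here. The function names, the simple formula parser, the sign normalization, and the example formulae are all assumptions made for illustration and are not taken from the paper or its web application.

```python
import re
from collections import Counter
from itertools import combinations


def parse_formula(formula):
    """Parse a simple formula such as 'C10H12O5' into element counts.

    Hypothetical helper: handles only plain element/count strings,
    with no brackets, charges, or isotope labels.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def formula_difference(a, b):
    """Return the nonzero per-element count differences (b minus a)."""
    ca, cb = parse_formula(a), parse_formula(b)
    elements = sorted(set(ca) | set(cb))
    return {el: cb[el] - ca[el] for el in elements if cb[el] != ca[el]}


def difference_counts(formulae):
    """Tally how often each elemental difference occurs over all pairs.

    Assumption: a difference and its negation are treated as the same
    difference, so the sign is normalized to make the first entry positive.
    """
    tally = Counter()
    for a, b in combinations(formulae, 2):
        items = sorted(formula_difference(a, b).items())
        if items and items[0][1] < 0:
            items = [(el, -n) for el, n in items]
        tally[tuple(items)] += 1
    return tally


# Made-up formulae list forming a CH2 homologous series.
example = ["C10H12O5", "C11H14O5", "C12H16O5"]
result = difference_counts(example)
for diff, count in result.items():
    print(dict(diff), count)
```

In this made-up list, the CH2 difference occurs twice and the C2H4 difference once. How such differences are linked into chains and reduced to an expected length is specific to the paper's method and is not reproduced in this sketch.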