Batch Effect

From Metabolomics Society Wiki

Introduction

Data from the large-scale LC-MS experiments typically performed in untargeted metabolomics studies contain several distinguishable sources of variance. Apart from the inherent biological variation, there is instrumental variation originating from minor changes in, e.g., the injection volume; ideally this can be described as a normal white noise process with zero mean and constant variance. In addition, gradual changes in the instrumental response during the measurement of a batch of samples (i.e. the intra-batch effect) are frequently observed, caused by contamination of the inlet interface or by drifts in ionization efficiency or column performance. The intra-batch effect decreases the power to detect biological responses and hinders data interpretation and the joint analysis of data from several batches. Its magnitude depends on several factors, including sample clean-up and the robustness of the LC column, ionization source and detector. The appearance of intra-batch effects is a priori unpredictable and usually unavoidable.

To tackle this, different approaches for post-acquisition signal correction have been proposed. They model the instrumental drift from the variation in the response observed in quality control (QC) samples dispersed evenly throughout the batch. QC samples are, in theory, identical to the biological samples under study, with a similar metabolite and sample matrix composition. Two types of QC sample are available: pooled QC samples, prepared by mixing small aliquots of each biological sample to be studied, and commercially available biofluids composed of multiple biological samples. Examples of available correction algorithms that model the variation in the response of pooled QC samples are:

Intensity of an internal standard (Phenylalanine D5) before and after QC-SVR correction
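
The general idea of QC-based correction, estimating the drift from the QC responses and dividing it out of every sample, can be illustrated with a minimal Python sketch. It is not QC-SVR or any other published algorithm: the drift is simply interpolated linearly between neighbouring QC injections, and the function names, the toy batch and the 2% per-injection decay are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import median


def interpolate_drift(qc_order, qc_intensity, order):
    """Piecewise-linear estimate of the QC response at a given injection order."""
    # Outside the QC range, hold the nearest QC value constant.
    if order <= qc_order[0]:
        return qc_intensity[0]
    if order >= qc_order[-1]:
        return qc_intensity[-1]
    hi = bisect_right(qc_order, order)
    lo = hi - 1
    x0, x1 = qc_order[lo], qc_order[hi]
    y0, y1 = qc_intensity[lo], qc_intensity[hi]
    return y0 + (y1 - y0) * (order - x0) / (x1 - x0)


def correct_intra_batch(orders, intensities, is_qc):
    """Divide each intensity by the estimated drift and rescale to the QC median."""
    qc = sorted((o, y) for o, y, q in zip(orders, intensities, is_qc) if q)
    qc_order = [o for o, _ in qc]
    qc_intensity = [y for _, y in qc]
    reference = median(qc_intensity)
    return [
        y * reference / interpolate_drift(qc_order, qc_intensity, o)
        for o, y in zip(orders, intensities)
    ]


# Synthetic batch (assumption): one feature with a constant true signal whose
# measured response decays linearly by 2% per injection.
orders = list(range(1, 12))
is_qc = [o % 5 == 1 for o in orders]  # QC injections at positions 1, 6 and 11
measured = [100.0 * (1.0 - 0.02 * (o - 1)) for o in orders]
corrected = correct_intra_batch(orders, measured, is_qc)
```

In this toy case the drift is exactly linear, so the corrected intensities are flat at the QC median. Real drift is rarely this well behaved, which is why published methods fit smoother or more flexible models (such as support vector regression in QC-SVR) to the QC responses, feature by feature.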

R packages for batch correction