Data extraction

Moderator: Elwin VERHEIJ (TNO,

Background: The application of LC-MS, GC-MS and NMR to metabolomics yields very rich and complex raw data, often for large numbers of samples. Data extraction, i.e. the translation of raw data and signals into accurate, concise, clean data, is a daunting task. The increasing popularity of high-resolution mass spectrometry (e.g. FT and Orbitrap systems), GCxGC-MS and high-throughput systems (e.g. UPLC) generates even larger amounts of complex data. Several data extraction strategies exist for the various analytical techniques, e.g. binning, peak picking and deconvolution for NMR, and target processing, peak picking, deconvolution, etc. for LC-MS/GC-MS, using software provided by instrument vendors or independent companies, or homemade tools (public domain or proprietary). All data extraction methods and software packages have their pros and cons with respect to critical issues such as throughput and data accuracy/quality. Metabolomics collaboration is hindered by the wide variety of data extraction strategies and tools in use, especially because N data extraction tools (and a multitude of user-defined settings) applied to the same raw data result in at least N different clean data sets, and ultimately in at least N different statistical models.

Goal: Improve the possibilities for nutritional metabolomics collaboration by sharing experiences with different data extraction methods and proposing one or more standard or reference methods.

Approach: To make this workshop a success, we invite experts to contribute and present their views on data extraction as described above. Topics discussed in the session on data extraction will include:
Result: Criteria to be defined for what constitutes a well characterized raw data set. Based on these criteria, data set(s) should be selected for testing the different extraction methods of potential interest. Decision on a round robin/benchmarking study.

Recent publications: The pdf files for the session data extraction can be found at:

The slides shown during this session can be found in the attachments.