Detecting Per- and Polyfluoroalkyl Substances from Unidentified Chemical Compounds

National Health and Medical Research Council

This research addresses concerns about per- and polyfluoroalkyl substances (PFAS), which can pose significant health risks at certain concentrations. High-resolution mass spectrometry (HRMS) used for non-target analysis (NTA) has shown promise in detecting these substances. However, manually sifting through NTA data to identify compounds is time-consuming and demands expert knowledge. To overcome this, the study proposes a machine learning algorithm capable of swiftly detecting PFAS and their derivatives among unidentified compounds.

The algorithm's sole inputs are the compound's mass and its fragment masses, which are critical attributes that can typically be measured. These are transformed and serve as inputs for a random forest classifier, a machine learning algorithm that combines the outputs of many decision trees to reach a single result. Preliminary findings indicate that the classifier is highly reliable.
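The project summary does not say how the masses are transformed. As a rough illustration only, the sketch below assumes two simple, hypothetical features: the mass defect of the compound, and the number of fragment pairs separated by one CF2 unit, a spacing often associated with fluorinated chains. The function names, the tolerance, and the choice of features are assumptions and are not taken from the study.

```python
# Hypothetical feature transformation; NOT the study's actual method.
CF2_MASS = 49.996806  # monoisotopic mass of CF2: 12.000000 + 2 * 18.998403


def mass_defect(mass):
    """Return the offset of a mass from its nearest integer."""
    return mass - round(mass)


def count_cf2_gaps(fragment_masses, tol=0.002):
    """Count fragment pairs whose mass difference matches one CF2 unit.

    `tol` (in Da) is an assumed matching tolerance.
    """
    frags = sorted(fragment_masses)
    return sum(
        1
        for i, low in enumerate(frags)
        for high in frags[i + 1:]
        if abs((high - low) - CF2_MASS) <= tol
    )


def to_features(compound_mass, fragment_masses):
    """Build one feature row that could be passed to a classifier."""
    return [
        compound_mass,
        mass_defect(compound_mass),
        len(fragment_masses),
        count_cf2_gaps(fragment_masses),
    ]


# Illustrative values only, not measured data.
example_fragments = [118.9926, 168.9894, 218.9862]
features = to_features(412.9664, example_fragments)
```

A row like `features` could then be supplied to any random forest implementation. The real feature set used in the project may differ substantially.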

The algorithm provides a confidence level indicating whether a compound is a PFAS, a PFAS derivative, or not related to PFAS. Because the analysis is rapid and requires no user expertise, it allows the evaluation of previously overlooked compounds and guides analysts toward compounds of interest.
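One common way a random forest yields such a confidence level is to report the share of trees that vote for each class. The sketch below assumes this simple vote-fraction scheme; the class labels and function name are hypothetical, and the study may compute confidence differently.

```python
from collections import Counter

# Assumed class labels, based on the three outcomes described above.
CLASSES = ("PFAS", "PFAS derivative", "not PFAS")


def vote_confidence(tree_votes):
    """Return the fraction of trees voting for each class."""
    counts = Counter(tree_votes)
    total = len(tree_votes)
    return {label: counts[label] / total for label in CLASSES}


# Illustrative votes from a hypothetical forest of 10 trees.
votes = ["PFAS"] * 7 + ["PFAS derivative"] * 2 + ["not PFAS"]
confidence = vote_confidence(votes)
best_label = max(confidence, key=confidence.get)
```

Here `best_label` is the class with the largest share of votes, and `confidence` gives the analyst a sense of how decisive that result is. Some libraries, such as scikit-learn, average per-tree class probabilities rather than counting hard votes.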

Project members

Mathieu Feraud

PhD Candidate

Dr Jake O’Brien

Senior Research Fellow

Dr Pradeep Dewapriya

Research Fellow

A/Prof Sarit Kaserzon

Co-Theme Leader, Environmental Health Risk Assessment

Prof Kevin Thomas

QAEHS Director and Theme Leader, Environmental Health Toxicology