Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning
Laser-induced breakdown spectroscopy is a technique that performs fast measurements in ambient air without any limitation on the targeted elements. It consists of focusing a laser beam on the surface of a sample to form a plasma and analyzing the radiation emitted as the plasma cools. It is a versatile procedure used in various scenarios, such as nuclear decommissioning or fundamental physics experiments, for qualitative and quantitative spectral analyses. For the latter, the objective is usually to build a model relating experimental spectra to the concentration of the species of interest. This relies on a calibration set of known samples and can be done with a variety of supervised techniques. However, in the most straightforward implementation, models do not estimate to what extent an unknown sample is well represented by the calibration set. Hence, we do not know, in general, how reliable the prediction is. To address this, we build robust calibration models using deep convolutional multitask learning architectures that predict the concentration of the analyte alongside additional spectral information as auxiliary outputs. Because experimental training samples are scarce, we introduce a simulation-based data augmentation process to synthesize an arbitrary number of training spectra that are statistically representative of the experimental data. The secondary predictions are finally used to validate the model’s trustworthiness by taking advantage of the mutual dependencies of the parameters of the multitask neural networks: a statistical analysis of the outputs can be performed directly through a comparison with ground truth quantities. Such an end-to-end pipeline is effective at detecting anomalies and out-of-distribution samples without the need for a separate elemental analysis.
Results on different types of materials, such as cement samples and alloys, show improved robustness (understood as homoscedasticity) and trueness of the predictions, especially when the spectra are affected by noise and strong spectral interference.
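
The idea of using auxiliary outputs as a trust check can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the method evaluated in this work: the toy Gaussian line simulator, the choice of total integrated intensity as the auxiliary quantity, the line positions and amplitudes, and in particular the linear low-rank regressor, which stands in for the deep convolutional multitask network. The names `simulate_spectrum`, `auxiliary_truth`, and `MultitaskStandIn` are hypothetical. The sketch trains on simulated spectra, predicts both the concentration and the auxiliary quantity from shared features, and scores a new sample by how far the predicted auxiliary value lies from the value measured directly on that spectrum.

```python
import numpy as np

CHANNELS = np.arange(256)


def gaussian_line(center, width):
    """Unit-height Gaussian emission line on the channel axis."""
    return np.exp(-0.5 * ((CHANNELS - center) / width) ** 2)


def simulate_spectrum(conc, rng, interference=0.0, noise=0.01):
    """Toy spectrum (assumed model): an analyte line scaled by concentration,
    a matrix line of varying strength, and an optional interfering line."""
    matrix = rng.uniform(0.4, 0.6)
    signal = (conc * gaussian_line(80, 3.0)
              + matrix * gaussian_line(160, 4.0)
              + interference * gaussian_line(120, 3.0))
    return signal + rng.normal(0.0, noise, CHANNELS.size)


def auxiliary_truth(spectra):
    """Auxiliary target measurable on the spectrum itself (assumed choice):
    the total integrated intensity."""
    return np.asarray(spectra).sum(axis=-1)


class MultitaskStandIn:
    """Linear stand-in for a multitask network: shared low-rank features
    feeding two outputs (concentration and auxiliary quantity)."""

    def fit(self, spectra, conc, n_features=2):
        self.mean = spectra.mean(axis=0)
        # Shared features: leading principal directions of the training spectra.
        _, _, vt = np.linalg.svd(spectra - self.mean, full_matrices=False)
        self.basis = vt[:n_features].T
        design = self._design(spectra)
        targets = np.column_stack([conc, auxiliary_truth(spectra)])
        self.weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
        # Spread of the auxiliary residuals on the calibration set.
        residual = design @ self.weights - targets
        self.aux_sigma = residual[:, 1].std()
        return self

    def _design(self, spectra):
        scores = (np.atleast_2d(spectra) - self.mean) @ self.basis
        return np.column_stack([np.ones(len(scores)), scores])

    def predict(self, spectrum):
        """Return the predicted concentration and a trust score: the absolute
        auxiliary mismatch in units of the calibration residual spread."""
        conc_hat, aux_hat = (self._design(spectrum) @ self.weights)[0]
        z = (aux_hat - auxiliary_truth(spectrum)) / self.aux_sigma
        return conc_hat, abs(z)


rng = np.random.default_rng(0)
train_conc = rng.uniform(0.1, 1.0, 200)
train = np.stack([simulate_spectrum(c, rng) for c in train_conc])
model = MultitaskStandIn().fit(train, train_conc)

# One sample resembling the calibration set, one with an unseen interfering line.
clean_conc, clean_z = model.predict(simulate_spectrum(0.5, rng))
odd_conc, odd_z = model.predict(simulate_spectrum(0.5, rng, interference=0.8))
print(f"in-distribution: conc={clean_conc:.3f}, trust score={clean_z:.1f}")
print(f"with interference: conc={odd_conc:.3f}, trust score={odd_z:.1f}")
```

In this toy setting the interfering line is absent from the training simulations, so the shared features cannot account for it and the predicted auxiliary value disagrees with the measured one. A sample would be flagged when its score exceeds a threshold chosen on calibration data; the threshold and the auxiliary quantity are design choices that a real pipeline would need to validate.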