# About BIFROST

BIFROST is a three-year project funded by the A*MIDEX excellence initiative, at the interface of analytical chemistry and applied mathematics. It concerns the development of data processing and data acquisition schemes that improve the power of chemical analysis of complex mixtures, with the aim of achieving a qualitative and quantitative decomposition of instrumental responses, loosely called spectra hereafter. This will be achieved by addressing bottlenecks in both disciplines, as detailed below.

The challenge lies in designing stable algorithms that yield high-purity source representations in the presence of signal distortions and instabilities and, more importantly, of a wide dynamic range of molecular concentrations. This last aspect is crucial for several reasons. From the chemical point of view, the most abundant compounds do not always carry the relevant information about the state of a sample; biomarkers and contaminants are typical examples. The most intricate case is the one in which the spectrum presents severe overlap, so that the more intense signals are likely to obscure those of minor species. An obvious avenue of investigation for improving the detection of less abundant compounds is therefore to seek analytical techniques with increased resolution. This can be achieved, for instance, by physically separating the sample into simpler portions (in the limit, pure compounds). Multidimensional (nD) analyses are one extensively researched response to this need. Possibly the richest selection of nD experiments is found in the field of Nuclear Magnetic Resonance (NMR), with tens of methods available. Other techniques, such as chromatography and mass spectrometry (MS), also rely more and more on nD combinations (sometimes called hyphenation). Intriguingly, signal processing, namely covariance analysis, has been used with some success to create the equivalent of nD spectra by relying on variations of the signal intensity along a series of samples.
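
As an illustration of the covariance idea mentioned above, the following is a minimal sketch on synthetic data, not the specific method used in the literature or by the project. The helper names (`gaussian_peak`, `covariance_map`), the peak positions and the concentration values are all assumptions made for the example. Peaks that belong to the same compound vary together along the series of samples, so they produce a large off-diagonal entry in the covariance map, much like a cross-peak in a 2D spectrum.

```python
import numpy as np

def gaussian_peak(x, center, width):
    """Hypothetical line shape: a unit-height Gaussian peak."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def covariance_map(X):
    """Sample covariance between spectral points.

    X has shape (n_samples, n_points): one 1D spectrum per row.
    Returns an (n_points, n_points) symmetric matrix.
    """
    Xc = X - X.mean(axis=0, keepdims=True)  # remove the mean spectrum
    return Xc.T @ Xc / (X.shape[0] - 1)

# Synthetic mixture: compound A has peaks at points 20 and 60,
# compound B has peaks at points 40 and 80 (assumed values).
x = np.arange(100)
spec_a = gaussian_peak(x, 20, 1.5) + gaussian_peak(x, 60, 1.5)
spec_b = gaussian_peak(x, 40, 1.5) + gaussian_peak(x, 80, 1.5)

# Concentrations of A and B along a series of six samples (assumed values).
conc_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
conc_b = np.array([3.0, 1.0, 2.0, 3.0, 1.0, 2.0])
X = np.outer(conc_a, spec_a) + np.outer(conc_b, spec_b)

cov = covariance_map(X)
same_compound = cov[20, 60]   # both peaks belong to A
cross_compound = cov[20, 40]  # one peak from A, one from B
print(same_compound, cross_compound)
```

In this toy setting the entry linking the two peaks of compound A is much larger in magnitude than the entry linking a peak of A to a peak of B, which is what allows connectivity between peaks to be read from the map.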

From the mathematical viewpoint, unmixing is a blind source separation problem, which can also be seen as an instance of the dictionary learning problem, a topic that currently receives considerable attention. Dictionary learning generally leads to difficult non-convex optimization problems, for which very few provably convergent and stable algorithms exist. Developing such algorithms for spectroscopic unmixing is in itself a very challenging goal. The NMF (Nonnegative Matrix Factorization) based approach developed earlier by the consortium has been shown to yield good separation results in some specific situations, but it has not yet been extended to nD experiments and it lacks theoretical convergence guarantees. Alternative strategies must therefore be investigated. A main goal of this project is to develop mathematical and signal processing approaches that stay as close as possible to signal acquisition, e.g. by avoiding "black box" pre-processing methods and software. The difficulty is to integrate such pre-processing steps into the unmixing problem itself. Another goal is to integrate prior knowledge into the unmixing, e.g. knowledge about the spectra of some of the compounds (for example biomarkers or contaminants).
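
To make the unmixing formulation concrete, here is a minimal sketch of NMF applied to a synthetic mixture. It uses the classical multiplicative updates for the Frobenius norm, which is an assumption chosen for brevity and is not the consortium's algorithm. The model is X ≈ C S, where the rows of X are the measured spectra, C holds nonnegative concentrations and the rows of S are nonnegative source spectra. All names (`nmf_unmix`, `relative_error`, `gaussian_peak`) and all numerical values are hypothetical.

```python
import numpy as np

def gaussian_peak(x, center, width):
    """Hypothetical line shape: a unit-height Gaussian peak."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def nmf_unmix(X, n_sources, n_iter=200, eps=1e-12, seed=0):
    """Factor a nonnegative matrix X (n_samples, n_points) as C @ S.

    Uses multiplicative updates for the squared Frobenius error.
    Returns C (n_samples, n_sources) and S (n_sources, n_points).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_points = X.shape
    C = rng.random((n_samples, n_sources)) + eps
    S = rng.random((n_sources, n_points)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative and does not
        # increase the reconstruction error.
        S *= (C.T @ X) / (C.T @ C @ S + eps)
        C *= (X @ S.T) / (C @ S @ S.T + eps)
    return C, S

def relative_error(X, C, S):
    """Relative Frobenius reconstruction error of the factorization."""
    return np.linalg.norm(X - C @ S) / np.linalg.norm(X)

# Two overlapping source spectra (assumed positions and widths).
x = np.linspace(0.0, 10.0, 200)
true_S = np.vstack([gaussian_peak(x, 4.0, 0.5), gaussian_peak(x, 5.0, 0.5)])

# Five samples with a wide dynamic range: the second compound is a
# minor species, one to two orders of magnitude less concentrated.
true_C = np.array([[1.0, 0.01],
                   [0.8, 0.05],
                   [0.5, 0.02],
                   [0.9, 0.10],
                   [0.3, 0.03]])
X = true_C @ true_S

C, S = nmf_unmix(X, n_sources=2)
print("relative reconstruction error:", relative_error(X, C, S))
```

A small reconstruction error does not by itself mean that the sources were recovered: the factorization is non-unique (at least up to scaling and permutation of the sources), and the result depends on the random initialization. This is one way to see why the lack of convergence guarantees and the wide dynamic range discussed above matter in practice.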