Rostakova Z., Rosipal R.

Determination of the number of components in the PARAFAC model with a nonnegative tensor structure: A simulated EEG data study.

Neural Computing and Applications, 34:14793-14805, 2022. doi:10.1007/s00521-022-07318-x.

Parallel factor analysis (PARAFAC) is a powerful tool for detecting latent components in higher-order arrays (tensors). The number of latent components is an essential input parameter and must be set in advance. However, none of the component number selection methods proposed in the literature has become a rule of thumb. Existing studies have compared the performance of these methods on simulated data with a simplified structure, and the results obtained have been shown not to generalize directly to real data. Using a real head model and cortical activation, our study demonstrates the advantages and disadvantages of twelve different methods applied to well-controlled, nontrivial, and nonnegative simulated data that resemble real electroencephalogram (EEG) properties as closely as possible. Different noise levels and deviations from the optimal structure are considered. Moreover, we validate a new method for component number selection, which we have already applied successfully to real EEG tasks. We also demonstrate that the existing approaches must be adapted whenever a nonnegative data structure is assumed. We identified four methods that produce promising but not ideal results on nontrivial simulated data and show superior performance in EEG analysis practice. Nevertheless, component number selection in PARAFAC is a complex and unresolved problem, and the assumption of a nonnegative data structure makes it more challenging. Although several methods have shown promising results, the issue remains open, and new approaches are needed.